GNU bug report logs -
#45925
27.1; *Summary* buffer vs. raw utf-8 headers
Reported by: 積丹尼 Dan Jacobson <jidanni <at> jidanni.org>
Date: Sun, 17 Jan 2021 05:37:02 UTC
Severity: minor
Tags: fixed
Found in version 27.1
Fixed in version 28.1
Done: Lars Ingebrigtsen <larsi <at> gnus.org>
Bug is archived. No further changes may be made.
Report forwarded to bug-gnu-emacs <at> gnu.org, bugs <at> gnus.org: bug#45925; Package emacs,gnus. (Sun, 17 Jan 2021 05:37:02 GMT)
Acknowledgement sent to 積丹尼 Dan Jacobson <jidanni <at> jidanni.org>: New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org, bugs <at> gnus.org. (Sun, 17 Jan 2021 05:37:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org:
Try this simple experiment:
$ echo Subject: 一二三|procmail
$ echo Subject: 一二三|iconv -t big5|procmail
$ emacs -f gnus
In the *Article* buffer, both look like
Subject: 一二三
In the *Summary* buffer so does the big5 version.
Alas, the utf-8 version looks like
c\x80\xd3....
(Yes, these are illegal raw headers. But Gnus is supposed to be
accommodating. And it does... but oddly not for the majority (UTF-8) case.)
Important settings:
value of $LC_COLLATE: C
value of $LC_CTYPE: zh_TW.UTF-8
value of $LC_MESSAGES: C
value of $LANG: zh_TW.UTF-8
value of $XMODIFIERS: @im=ibus
locale-coding-system: utf-8-unix
(Might be related to bug#45724.)
(https://www.jidanni.org/comp/configuration/ has my dot files.)
Information forwarded to bug-gnu-emacs <at> gnu.org, bugs <at> gnus.org: bug#45925; Package emacs,gnus. (Tue, 19 Jan 2021 01:05:02 GMT)
Message #8 received at 45925 <at> debbugs.gnu.org:
Note that with emacs-version "27.1", certain old messages that have been
sitting in my *Summary* buffer for years suddenly have their Subject
garbled. (They still look fine in the *Article* buffer.)
Information forwarded to bug-gnu-emacs <at> gnu.org, bugs <at> gnus.org: bug#45925; Package emacs,gnus. (Tue, 19 Jan 2021 05:31:02 GMT)
Message #11 received at 45925 <at> debbugs.gnu.org:
積丹尼 Dan Jacobson <jidanni <at> jidanni.org> writes:
> Try this simple experiment:
> $ echo Subject: 一二三|procmail
> $ echo Subject: 一二三|iconv -t big5|procmail
I don't have procmail installed, so I'm not sure what these do -- are
you sending a mail (to yourself?) here? Do you have a recipe to
reproduce this problem without the use of procmail?
> $ emacs -f gnus
>
> In the *Article* buffer, both look like
> Subject: 一二三
> In the *Summary* buffer so does the big5 version.
> Alas, the utf-8 version looks like
> c\x80\xd3....
>
> (Yes, these are illegal raw headers. But Gnus is supposed to be
> accommodating. And it does... but oddly not for the majority (UTF-8) case.)
[...]
> (Might be related to bug#45724.)
Is this still with nnml? If so, could you find the resulting lines in
the .overview files in the nnml directory and post them here? (Perhaps
after gzipping them to avoid Emacs helpfully re-encoding the lines.)
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no
Information forwarded to bug-gnu-emacs <at> gnu.org, bugs <at> gnus.org: bug#45925; Package emacs,gnus. (Wed, 20 Jan 2021 06:57:02 GMT)
Message #14 received at 45925 <at> debbugs.gnu.org:
>>>>> "LI" == Lars Ingebrigtsen <larsi <at> gnus.org> writes:
LI> 積丹尼 Dan Jacobson <jidanni <at> jidanni.org> writes:
>> Try this simple experiment:
>> $ echo Subject: 一二三|procmail
>> $ echo Subject: 一二三|iconv -t big5|procmail
LI> I don't have procmail installed, so I'm not sure what these do -- are
LI> you sending a mail (to yourself?) here? Do you have a recipe to
LI> reproduce this problem without the use of procmail?
$ echo Subject: 一二三 > ~/Maildir/new/Z
$ file ~/Maildir/new/Z
~/Maildir/new/Z: UTF-8 Unicode text
>> $ emacs -f gnus
>>
>> In the *Article* buffer, both look like
>> Subject: 一二三
>> In the *Summary* buffer so does the big5 version.
>> Alas, the utf-8 version looks like
>> c\x80\xd3....
>>
>> (Yes, these are illegal raw headers. But Gnus is supposed to be
>> accommodating. And it does... but oddly not for the majority (UTF-8) case.)
LI> [...]
>> (Might be related to bug#45724.)
LI> Is this still with nnml? If so, could you find the resulting lines in
LI> the .overview files in the nnml directory and post them here? (Perhaps
LI> after gzipping them to avoid Emacs helpfully re-encoding the lines.)
Yes, nnml.
The headers get appended raw to .overview.
Thus .overview contains a mix of ASCII, big5, and UTF-8, all in the same file.
$ echo Subject: 一二三|iconv -t big5 > ~/Maildir/new/B5
$ echo Subject: 一二三 > ~/Maildir/new/UT
$ emacs -f gnus
$ tail -n 2 Mail/mail/misc/.overview|qprint -e
37397 =A4@=A4G=A4T (nobody) <87a6t4gnpx.5.fsf <at> totally-fudged-out-mess=
age-id> 0 0 Xref: jidanni5 mail.misc:37397=09
37398 =E4=B8=80=E4=BA=8C=E4=B8=89 (nobody) <878s8ognpx.5.fsf <at> totally-=
fudged-out-message-id> 0 0 Xref: jidanni5 mail.misc:37398=09
Anyway: *Summary* oddly can only deal with raw big5, not raw UTF-8.
However *Article* can deal with both.
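The two .overview lines above can be checked offline. The following Python sketch is not part of the original report and says nothing about how Gnus itself reads the file; it only undoes the quoted-printable escaping that qprint -e applied and confirms that the first Subject is raw Big5 and the second raw UTF-8 for the same three characters. The variable names are illustrative.

```python
import quopri

# Subject fields copied from the qprint -e output above.
big5_qp = b"=A4@=A4G=A4T"
utf8_qp = b"=E4=B8=80=E4=BA=8C=E4=B8=89"

# Undo the quoted-printable escaping to recover the raw header bytes.
big5_raw = quopri.decodestring(big5_qp)
utf8_raw = quopri.decodestring(utf8_qp)
print(big5_raw)
print(utf8_raw)

# Each byte string decodes to the same text under its own charset.
same_text = big5_raw.decode("big5") == utf8_raw.decode("utf-8")
print(same_text)

# The Big5 bytes are not valid UTF-8, so a strict UTF-8 decode fails.
try:
    big5_raw.decode("utf-8")
    big5_is_utf8 = True
except UnicodeDecodeError:
    big5_is_utf8 = False
print(big5_is_utf8)
```

So the file really does hold two different byte encodings of one Subject, and no single charset decodes both lines.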
Information forwarded to bug-gnu-emacs <at> gnu.org, bugs <at> gnus.org: bug#45925; Package emacs,gnus. (Wed, 20 Jan 2021 16:33:01 GMT)
Message #17 received at 45925 <at> debbugs.gnu.org:
積丹尼 Dan Jacobson <jidanni <at> jidanni.org> writes:
> LI> I don't have procmail installed, so I'm not sure what these do -- are
> LI> you sending a mail (to yourself?) here? Do you have a recipe to
> LI> reproduce this problem without the use of procmail?
>
> $ echo Subject: 一二三 > ~/Maildir/new/Z
> $ file ~/Maildir/new/Z
> ~/Maildir/new/Z: UTF-8 Unicode text
I thought this was about nnml? Is ~/Maildir/new/Z your nnml directory?
> LI> Is this still with nnml? If so, could you find the resulting lines in
> LI> the .overview files in the nnml directory and post them here? (Perhaps
> LI> after gzipping them to avoid Emacs helpfully re-encoding the lines.)
>
> Yes, nnml.
>
> The headers get appended raw to .overview.
>
> Thus .overview contains a mix of ASCII, big5, and UTF-8, all in the same file.
>
> $ echo Subject: 一二三|iconv -t big5 > ~/Maildir/new/B5
> $ echo Subject: 一二三 > ~/Maildir/new/UT
> $ emacs -f gnus
> $ tail -n 2 Mail/mail/misc/.overview|qprint -e
> 37397 =A4@=A4G=A4T (nobody) <87a6t4gnpx.5.fsf <at> totally-fudged-out-mess=
> age-id> 0 0 Xref: jidanni5 mail.misc:37397=09
> 37398 =E4=B8=80=E4=BA=8C=E4=B8=89 (nobody) <878s8ognpx.5.fsf <at> totally-=
> fudged-out-message-id> 0 0 Xref: jidanni5 mail.misc:37398=09
There was just ASCII in the part you posted. Could you gzip it, as I
asked you to?
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no
Information forwarded to bug-gnu-emacs <at> gnu.org, bugs <at> gnus.org: bug#45925; Package emacs,gnus. (Thu, 21 Jan 2021 20:11:02 GMT)
Message #20 received at 45925 <at> debbugs.gnu.org:
LI> I thought this was about nnml? Is ~/Maildir/new/Z your nnml directory?
No. I just created a file in the directory Gnus fetches its mail from when I
hit "g" (which runs the command gnus-group-get-new-news).
Anyway, all you need to do to reproduce this bug, is to have somebody
send you a mail with raw UTF-8 in the Subject header.
LI> There was just ASCII in the part you posted. Could you gzip it, as I
LI> asked you to?
$ perl -nwle 'print if /\P{ASCII}/' Mail/mail/misc/.overview > /tmp/h
$ gzip /tmp/h
[h.gz (application/gzip, attachment)]
Here you will see a mix of raw UTF-8, raw big5, all in the same file.
The raw big5 works fine, but the raw UTF-8 looks garbled, in the summary
buffer. In the article buffer, all look fine.
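To make the situation concrete, here is a hedged Python sketch of decoding such a mixed file line by line as a byte stream. It is not Gnus's algorithm: the function name decode_overview_line and the fallback order (strict UTF-8 first, then a caller-supplied list of legacy charsets, then Latin-1) are assumptions chosen for illustration, with Big5 standing in for a locale-specific legacy charset.

```python
from typing import Sequence, Tuple


def decode_overview_line(raw: bytes,
                         fallbacks: Sequence[str] = ("big5",)) -> Tuple[str, str]:
    """Guess the charset of one raw line and return (text, charset).

    Hypothetical helper: strict UTF-8 is tried first (pure ASCII also
    lands here), then each fallback charset in order, and finally
    Latin-1, which accepts any byte sequence.
    """
    try:
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        pass
    for charset in fallbacks:
        try:
            return raw.decode(charset), charset
        except UnicodeDecodeError:
            continue
    return raw.decode("latin-1"), "latin-1"


# Raw Subject bytes as they appear in the two .overview lines of this report.
samples = [b"\xa4@\xa4G\xa4T", b"\xe4\xb8\x80\xe4\xba\x8c\xe4\xb8\x89"]
for raw in samples:
    text, charset = decode_overview_line(raw)
    print(charset, len(text))
```

A guess of this kind is only a heuristic: many byte sequences are valid in more than one legacy charset, so the order of the fallbacks determines the result.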
Here are all my config files:
https://www.jidanni.org/comp/configuration/.emacs
https://www.jidanni.org/comp/configuration/.gnus.el
https://www.jidanni.org/comp/configuration/.emacs-custom.el
https://www.jidanni.org/comp/configuration/.emacs-w3m
Information forwarded to bug-gnu-emacs <at> gnu.org, bugs <at> gnus.org: bug#45925; Package emacs,gnus. (Thu, 21 Jan 2021 20:24:02 GMT)
Message #23 received at 45925 <at> debbugs.gnu.org:
> From: 積丹尼 Dan Jacobson
> <jidanni <at> jidanni.org>
> Date: Fri, 22 Jan 2021 03:55:44 +0800
> Cc: 45925 <at> debbugs.gnu.org
>
> Here you will see a mix of raw UTF-8, raw big5, all in the same file.
> The raw big5 works fine, but the raw UTF-8 looks garbled, in the summary
> buffer. In the article buffer, all look fine.
Why do you expect mixed-encoding stuff to work in Emacs? Emacs only
supports a single encoding of any chunk of text it gets, be it a file
or an email message.
Files such as this one are simply not supported.
Information forwarded to bug-gnu-emacs <at> gnu.org, bugs <at> gnus.org: bug#45925; Package emacs,gnus. (Thu, 21 Jan 2021 20:55:01 GMT)
Message #26 received at 45925 <at> debbugs.gnu.org:
>>>>> "EZ" == Eli Zaretskii <eliz <at> gnu.org> writes:
EZ> Why do you expect mixed-encoding stuff to work in Emacs? Emacs only
EZ> supports a single encoding of any chunk of text it gets, be it a file
EZ> or an email message.
EZ> Files such as this one are simply not supported.
So, Gnus should not just randomly slap raw lines into the same file.
That is the root of all problems!
Information forwarded to bug-gnu-emacs <at> gnu.org, bugs <at> gnus.org: bug#45925; Package emacs,gnus. (Fri, 22 Jan 2021 18:07:02 GMT)
Message #29 received at 45925 <at> debbugs.gnu.org:
Eli Zaretskii <eliz <at> gnu.org> writes:
> Why do you expect mixed-encoding stuff to work in Emacs? Emacs only
> supports a single encoding of any chunk of text it gets, be it a file
> or an email message.
>
> Files such as this one are simply not supported.
Sure they are. It's not a text file; it's an octet stream.
But as Dan points out, Gnus doesn't handle these invalid mails
optimally, and doing some RFC2047-encoding to the headers before writing
the .overview file will help a bit here, so I've now done that in Emacs
28.
(Gnus will still display some of these headers "wrong" in the summary
buffer, and display them "right" in the article buffer, because Gnus has
to guess at what the charset is, and it does further guessing in the
article buffer than in the summary buffer, for reasons of efficiency.)
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no
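As a rough illustration of what RFC 2047 encoding of a header looks like, the Python sketch below turns a raw non-ASCII Subject into an ASCII-only encoded-word using the standard email.header module. This is not the Emacs 28 change itself, which lives in Gnus's Emacs Lisp code; the helper name encode_subject is hypothetical, and the sketch assumes the charset of the raw bytes is already known, which is exactly the part Gnus has to guess.

```python
from email.header import Header, decode_header


def encode_subject(raw: bytes, charset: str) -> str:
    """Hypothetical helper: RFC 2047-encode raw header bytes.

    Assumes `charset` correctly names the encoding of `raw`.
    """
    text = raw.decode(charset)
    return Header(text, charset).encode()


# Raw UTF-8 Subject bytes from the report.
encoded = encode_subject(b"\xe4\xb8\x80\xe4\xba\x8c\xe4\xb8\x89", "utf-8")
print(encoded)

# The encoded-word names its own charset, so a reader no longer has to guess.
decoded_bytes, decoded_charset = decode_header(encoded)[0]
round_trip_ok = decoded_bytes.decode(decoded_charset) == "\u4e00\u4e8c\u4e09"
print(round_trip_ok)
```

Once each header carries its charset in this form, a file that collects headers from many messages can stay pure ASCII even when the original messages used different encodings.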
Added tag(s) fixed. Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Fri, 22 Jan 2021 18:09:01 GMT)
bug marked as fixed in version 28.1, send any further explanations to 45925 <at> debbugs.gnu.org and 積丹尼 Dan Jacobson <jidanni <at> jidanni.org>. Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Fri, 22 Jan 2021 18:09:02 GMT)
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sat, 20 Feb 2021 12:24:06 GMT)