GNU bug report logs - #36852
27.0.50; ietf-drums-parse-address doesn't handle non-ascii properly

Package: emacs

Reported by: Štěpán Němec <stepnem <at> gmail.com>

Date: Tue, 30 Jul 2019 09:17:02 UTC

Severity: normal

Tags: fixed

Found in version 27.0.50

Fixed in version 27.1

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.



Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#36852; Package emacs. (Tue, 30 Jul 2019 09:17:03 GMT)

Acknowledgement sent to Štěpán Němec <stepnem <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Tue, 30 Jul 2019 09:17:03 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Štěpán Němec <stepnem <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: 27.0.50; ietf-drums-parse-address doesn't handle non-ascii properly
Date: Tue, 30 Jul 2019 11:16:53 +0200

ietf-drums-parse-address (AKA mail-header-parse-address) uses
ietf-drums-atext-token to parse display-name, but the regexp range only
contains ASCII characters, so e.g. as used in debbugs-gnu-show-reports,
the following happens:

  (mail-header-parse-address
   (decode-coding-string "Áaááá Ůůůůů <aaa <at> example.net>" 'utf-8))

  ;;=> ("aaa <at> example.net" . "aááá")

It actually only cares about the first char of a word:

  (let ((ietf-drums-atext-token "-ÁŮ^a-zA-Z0-9!#$%&'*+/=?_`{|}~"))
    (mail-header-parse-address
     (decode-coding-string "Áaááá Ůůůůů <aaa <at> example.net>" 'utf-8)))

  ;;=> ("aaa <at> example.net" . "Áaááá Ůůůůů")

I'm not quite sure what the proper fix is, as the ASCII-only thing seems
to be intentional. Maybe it's just not supposed to be used the way it is
used in debbugs-gnu.el?
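
The first-character test described above can be sketched roughly as follows. This is an illustration only: `my/atext` and `my/atext-start-p` are hypothetical names, and the character set is copied from the `let` form in the report with the two added letters removed. It is assumed, not verified here, to match the value in ietf-drums.el.

```elisp
;; Hypothetical names; the set below is assumed to mirror
;; `ietf-drums-atext-token' as quoted in the report.
(defvar my/atext "-^a-zA-Z0-9!#$%&'*+/=?_`{|}~"
  "ASCII-only character set, written as the inside of a regexp bracket.")

(defun my/atext-start-p (word)
  "Return t if the first character of WORD is in `my/atext'."
  (and (string-match-p (concat "\\`[" my/atext "]") word) t))

;; Only the first character is tested, so a word that merely starts
;; with an ASCII letter passes even if the rest of it is non-ASCII.
(mapcar #'my/atext-start-p '("Áaááá" "aááá" "Ůůůůů"))
;;=> (nil t nil)
```

This lines up with the result in the report: the word starting with "a" is kept whole, while words starting with "Á" or "Ů" are not accepted at their first character.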




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#36852; Package emacs. (Tue, 30 Jul 2019 09:54:02 GMT)

Message #8 received at 36852 <at> debbugs.gnu.org:

From: Robert Pluim <rpluim <at> gmail.com>
To: Štěpán Němec <stepnem <at> gmail.com>
Cc: 36852 <at> debbugs.gnu.org
Subject: Re: bug#36852: 27.0.50; ietf-drums-parse-address doesn't handle non-ascii properly
Date: Tue, 30 Jul 2019 11:53:36 +0200

>>>>> On Tue, 30 Jul 2019 11:16:53 +0200, Štěpán Němec <stepnem <at> gmail.com> said:

    Štěpán> ietf-drums-parse-address (AKA mail-header-parse-address) uses
    Štěpán> ietf-drums-atext-token to parse display-name, but the regexp range only
    Štěpán> contains ASCII characters, so e.g. as used in debbugs-gnu-show-reports,
    Štěpán> the following happens:

    Štěpán>   (mail-header-parse-address
    Štěpán>    (decode-coding-string "Áaááá Ůůůůů <aaa <at> example.net>" 'utf-8))

    Štěpán>   ;;=> ("aaa <at> example.net" . "aááá")

    Štěpán> It actually only cares about the first char of a word:

    Štěpán>   (let ((ietf-drums-atext-token "-ÁŮ^a-zA-Z0-9!#$%&'*+/=?_`{|}~"))
    Štěpán>     (mail-header-parse-address
    Štěpán>      (decode-coding-string "Áaááá Ůůůůů <aaa <at> example.net>" 'utf-8)))

    Štěpán>   ;;=> ("aaa <at> example.net" . "Áaááá Ůůůůů")

    Štěpán> I'm not quite sure what the proper fix is, as the ASCII-only thing seems
    Štěpán> to be intentional. Maybe it's just not supposed to be used the way it is
    Štěpán> used in debbugs-gnu.el?

Mail headers are defined to be ascii-only, although as Iʼve just
discovered, gmail undoes Gnus' perfectly formatted RFC 2047 encoding
and replaces it with UTF-8 characters. Bad Google, bad.

Perhaps mail-header-parse-address could just discard the complete
display string if it finds a non-ascii char? That would at least
prevent it from propagating.

Robert
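
The suggestion above could be sketched as a small wrapper around mail-header-parse-address. This is only an illustration of the idea, not a change that was applied to Emacs; `my/parse-address-ascii-only` is a hypothetical name, and the example address is written with a literal @ (which this tracker page displays as <at>).

```elisp
(require 'mail-parse)  ; provides `mail-header-parse-address'

(defun my/parse-address-ascii-only (string)
  "Parse STRING like `mail-header-parse-address'.
If STRING contains any non-ASCII character, keep the mailbox but
drop the display name.  Hypothetical helper, for illustration only."
  (let ((parsed (mail-header-parse-address string)))
    (if (and parsed (string-match-p "[^[:ascii:]]" string))
        ;; Discard the complete display name rather than a fragment.
        (cons (car parsed) nil)
      parsed)))

(my/parse-address-ascii-only "Áaááá Ůůůůů <aaa@example.net>")
```

Note that this sketch checks the whole header value, so a non-ASCII character anywhere in it, including the mailbox part, causes the name to be dropped.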




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#36852; Package emacs. (Sun, 15 Sep 2019 12:01:01 GMT)

Message #11 received at 36852 <at> debbugs.gnu.org:

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Štěpán Němec <stepnem <at> gmail.com>
Cc: 36852 <at> debbugs.gnu.org
Subject: Re: bug#36852: 27.0.50; ietf-drums-parse-address doesn't handle non-ascii properly
Date: Sun, 15 Sep 2019 14:00:28 +0200

Štěpán Němec <stepnem <at> gmail.com> writes:

> ietf-drums-parse-address (AKA mail-header-parse-address) uses
> ietf-drums-atext-token to parse display-name, but the regexp range only
> contains ASCII characters, so e.g. as used in debbugs-gnu-show-reports,
> the following happens:
>
>   (mail-header-parse-address
>    (decode-coding-string "Áaááá Ůůůůů <aaa <at> example.net>" 'utf-8))
>
>   ;;=> ("aaa <at> example.net" . "aááá")

That's not a valid email address, so perhaps `ietf-drums-parse-address'
should return a blank string as the name here...  On the other hand,
when that function is called on something that's not an email address
(which is what debbugs-gnu does here), it should probably be free to
return whatever.

> I'm not quite sure what the proper fix is, as the ASCII-only thing seems
> to be intentional. Maybe it's just not supposed to be used the way it is
> used in debbugs-gnu.el?

Indeed.  I've now changed debbugs-gnu to split the "OCTETS
<MORE-OCTETS>" string returned by the debbugs web server correctly.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no
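
The actual change to debbugs-gnu is not shown in this report. As a rough illustration of splitting a "NAME <ADDRESS>" string without going through the mail-header tokenizer, something like the following would do; `my/split-name-and-address` is a hypothetical helper, not the code that was committed, and the example address uses a literal @ (displayed as <at> on this page).

```elisp
(defun my/split-name-and-address (string)
  "Split STRING of the form \"NAME <ADDRESS>\" into (ADDRESS . NAME).
Return nil if STRING does not have that shape.  Hypothetical helper."
  (when (string-match
         "\\`[ \t]*\\(.*?\\)[ \t]*<\\([^<>]*\\)>[ \t]*\\'" string)
    ;; Group 1 is the name (any characters), group 2 the bracketed part.
    (cons (match-string 2 string)
          (match-string 1 string))))

(my/split-name-and-address "Áaááá Ůůůůů <aaa@example.net>")
;;=> ("aaa@example.net" . "Áaááá Ůůůůů")
```

Because the name part is matched with `.` rather than an ASCII-only token class, non-ASCII names pass through unchanged.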




Added tag(s) fixed. Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Sun, 15 Sep 2019 12:01:02 GMT)

bug marked as fixed in version 27.1, send any further explanations to 36852 <at> debbugs.gnu.org and Štěpán Němec <stepnem <at> gmail.com>. Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Sun, 15 Sep 2019 12:01:02 GMT)

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Mon, 14 Oct 2019 11:24:13 GMT)

This bug report was last modified 4 years and 192 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.