GNU bug report logs - #42382
26.3; url-http handling of Location redirection headers containing whitespace

Previous Next

Package: emacs;

Reported by: Daniele Nicolodi <daniele <at> grinta.net>

Date: Wed, 15 Jul 2020 23:12:02 UTC

Severity: normal

Tags: fixed

Found in version 26.3

Fixed in version 28.1

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 42382 in the body.
You can then email your comments to 42382 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#42382; Package emacs. (Wed, 15 Jul 2020 23:12:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Daniele Nicolodi <daniele <at> grinta.net>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Wed, 15 Jul 2020 23:12:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Daniele Nicolodi <daniele <at> grinta.net>
To: bug-gnu-emacs <at> gnu.org
Subject: 26.3; url-http handling of Location redirection headers containing
 whitespace
Date: Wed, 15 Jul 2020 14:40:36 -0600
[Message part 1 (text/plain, inline)]
url-http.el interprets HTTP responses in url-http-parse-headers. This
function contains the following code:

	 (when redirect-uri
	   ;; Clean off any whitespace and/or <...> cruft.
	   (if (string-match "\\([^ \t]+\\)[ \t]" redirect-uri)
	       (setq redirect-uri (match-string 1 redirect-uri)))
	   (if (string-match "^<\\(.*\\)>$" redirect-uri)
	       (setq redirect-uri (match-string 1 redirect-uri)))

which truncates the value of the Location header at the first whitespace
character and removes surrounding angle brackets quoting.

In RFC 7231 the Location header is defined to carry a URI-reference.
According to RFC 3986 it should be percent-encoded and thus should not
contain spaces. However, there are HTTP server implementation (notably
nginx) that do not do that. While this is a bug in those HTTP server
implementations, I think Emacs should follow what most other HTTP client
implementatios (all the ones I tested) and use the content of the
Location header unmodified. Stripping of angle bracket quotes is
unnecessary as they are not valid according to the RFCs.

Also, accordingly to the RFCs, the location header may contain a
relative location. Thus the comment that suggest that such a response is
a bug in the server should be reworded.

The attached patches implement the proposed changes.

Thank you.
[0001-url-http-Fix-handling-of-redirect-locations.patch (text/plain, attachment)]
[0002-url-http-Do-not-suggest-a-broken-HTTP-server-impleme.patch (text/plain, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#42382; Package emacs. (Thu, 16 Jul 2020 16:09:02 GMT) Full text and rfc822 format available.

Message #8 received at 42382 <at> debbugs.gnu.org (full text, mbox):

From: Robert Pluim <rpluim <at> gmail.com>
To: Daniele Nicolodi <daniele <at> grinta.net>
Cc: 42382 <at> debbugs.gnu.org
Subject: Re: bug#42382: 26.3; url-http handling of Location redirection
 headers containing whitespace
Date: Thu, 16 Jul 2020 18:08:37 +0200
>>>>> On Wed, 15 Jul 2020 14:40:36 -0600, Daniele Nicolodi <daniele <at> grinta.net> said:

    Daniele> In RFC 7231 the Location header is defined to carry a URI-reference.
    Daniele> According to RFC 3986 it should be percent-encoded and thus should not
    Daniele> contain spaces. However, there are HTTP server implementation (notably
    Daniele> nginx) that do not do that. While this is a bug in those HTTP server
    Daniele> implementations, I think Emacs should follow what most other HTTP client
    Daniele> implementatios (all the ones I tested) and use the content of the
    Daniele> Location header unmodified. Stripping of angle bracket quotes is
    Daniele> unnecessary as they are not valid according to the RFCs.

Nor is embedded whitespace in the URI :-)

Are you sure this won't break anything? ie are you sure there are 0
server implementations out there that send angle brackets?

Iʼd be conservative, and just replace the truncation on whitespace
with percent-encoding of said whitespace.

    Daniele> Also, accordingly to the RFCs, the location header may contain a
    Daniele> relative location. Thus the comment that suggest that such a response is
    Daniele> a bug in the server should be reworded.

    Daniele> The attached patches implement the proposed changes.

The second patch is small enough that I think you can combine the two.

Robert




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#42382; Package emacs. (Thu, 16 Jul 2020 16:52:01 GMT) Full text and rfc822 format available.

Message #11 received at 42382 <at> debbugs.gnu.org (full text, mbox):

From: Daniele Nicolodi <daniele <at> grinta.net>
To: Robert Pluim <rpluim <at> gmail.com>
Cc: 42382 <at> debbugs.gnu.org
Subject: Re: bug#42382: 26.3; url-http handling of Location redirection
 headers containing whitespace
Date: Thu, 16 Jul 2020 10:30:49 -0600
On 16-07-2020 10:08, Robert Pluim wrote:
>>>>>> On Wed, 15 Jul 2020 14:40:36 -0600, Daniele Nicolodi <daniele <at> grinta.net> said:
> 
>     Daniele> In RFC 7231 the Location header is defined to carry a URI-reference.
>     Daniele> According to RFC 3986 it should be percent-encoded and thus should not
>     Daniele> contain spaces. However, there are HTTP server implementation (notably
>     Daniele> nginx) that do not do that. While this is a bug in those HTTP server
>     Daniele> implementations, I think Emacs should follow what most other HTTP client
>     Daniele> implementatios (all the ones I tested) and use the content of the
>     Daniele> Location header unmodified. Stripping of angle bracket quotes is
>     Daniele> unnecessary as they are not valid according to the RFCs.
> 
> Nor is embedded whitespace in the URI :-)

I don't understand this remark. Truncating at the first whitespace
character (current behavior) is a valid arbitrary decision for an
RFC-invalid URI-reference value. However, it is different from what all
other HTTP clients implement and it results in practical problems.

> Are you sure this won't break anything? ie are you sure there are 0
> server implementations out there that send angle brackets?

I don't see any reason why there should be angle brackets around the
value of a Location header and the current code or changelog or commit
messages does not provide a justification or a case where these have
been encountered. No other HTTP client I looked at does something like
this. I think there are many HTTP client implementations out there that
are more widely used and tested for interoperability than url-http.

> Iʼd be conservative, and just replace the truncation on whitespace
> with percent-encoding of said whitespace.

Why is percent-encoding better? The URI-reference value should not be
interpreted in any way, simply passed along. Again, all other HTTP
clients I looked at do not do this, or other manipulation of the header.

>     Daniele> Also, accordingly to the RFCs, the location header may contain a
>     Daniele> relative location. Thus the comment that suggest that such a response is
>     Daniele> a bug in the server should be reworded.
> 
>     Daniele> The attached patches implement the proposed changes.
> 
> The second patch is small enough that I think you can combine the two.

They are divided to provide justification for the changes in the commit
messages.

Thank you.

Daniele




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#42382; Package emacs. (Thu, 16 Jul 2020 17:11:02 GMT) Full text and rfc822 format available.

Message #14 received at 42382 <at> debbugs.gnu.org (full text, mbox):

From: Robert Pluim <rpluim <at> gmail.com>
To: Daniele Nicolodi <daniele <at> grinta.net>
Cc: 42382 <at> debbugs.gnu.org
Subject: Re: bug#42382: 26.3; url-http handling of Location redirection
 headers containing whitespace
Date: Thu, 16 Jul 2020 19:10:02 +0200
>>>>> On Thu, 16 Jul 2020 10:30:49 -0600, Daniele Nicolodi <daniele <at> grinta.net> said:

    Daniele> On 16-07-2020 10:08, Robert Pluim wrote:
    >>>>>>> On Wed, 15 Jul 2020 14:40:36 -0600, Daniele Nicolodi <daniele <at> grinta.net> said:
    >> 
    Daniele> In RFC 7231 the Location header is defined to carry a URI-reference.
    Daniele> According to RFC 3986 it should be percent-encoded and thus should not
    Daniele> contain spaces. However, there are HTTP server implementation (notably
    Daniele> nginx) that do not do that. While this is a bug in those HTTP server
    Daniele> implementations, I think Emacs should follow what most other HTTP client
    Daniele> implementatios (all the ones I tested) and use the content of the
    Daniele> Location header unmodified. Stripping of angle bracket quotes is
    Daniele> unnecessary as they are not valid according to the RFCs.
    >> 
    >> Nor is embedded whitespace in the URI :-)

    Daniele> I don't understand this remark. Truncating at the first whitespace
    Daniele> character (current behavior) is a valid arbitrary decision for an
    Daniele> RFC-invalid URI-reference value. However, it is different from what all
    Daniele> other HTTP clients implement and it results in practical problems.

You stated that angle quotes were invalid, and proposed to remove the
support for their presence. I was merely pointing out that spaces are
equally invalid, and you propose to accomodate them. If one, why not
the other?

    >> Are you sure this won't break anything? ie are you sure there are 0
    >> server implementations out there that send angle brackets?

    Daniele> I don't see any reason why there should be angle brackets around the
    Daniele> value of a Location header and the current code or changelog or commit
    Daniele> messages does not provide a justification or a case where these have
    Daniele> been encountered. No other HTTP client I looked at does something like
    Daniele> this. I think there are many HTTP client implementations out there that
    Daniele> are more widely used and tested for interoperability than url-http.

    >> Iʼd be conservative, and just replace the truncation on whitespace
    >> with percent-encoding of said whitespace.

    Daniele> Why is percent-encoding better? The URI-reference value should not be
    Daniele> interpreted in any way, simply passed along. Again, all other HTTP
    Daniele> clients I looked at do not do this, or other manipulation of the header.

Because then it become a valid URI, and other parts of emacs that
donʼt know how to treat literal spaces in a URI wonʼt break. But Iʼll
agree that itʼs conjecture on my part.

Robert




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#42382; Package emacs. (Fri, 17 Jul 2020 00:02:01 GMT) Full text and rfc822 format available.

Message #17 received at 42382 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Robert Pluim <rpluim <at> gmail.com>
Cc: 42382 <at> debbugs.gnu.org, Daniele Nicolodi <daniele <at> grinta.net>
Subject: Re: bug#42382: 26.3; url-http handling of Location redirection
 headers containing whitespace
Date: Fri, 17 Jul 2020 02:01:36 +0200
Robert Pluim <rpluim <at> gmail.com> writes:

> You stated that angle quotes were invalid, and proposed to remove the
> support for their presence. I was merely pointing out that spaces are
> equally invalid, and you propose to accomodate them. If one, why not
> the other?

I think it makes sense to remove the support for the angle brackets --
I've never seen those in use anywhere.  If they turn out to be a thing,
we can add that back.

On the other hand, spaces in Location headers have been observed in the
wild, so supporting that seems like a good idea to me.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#42382; Package emacs. (Fri, 17 Jul 2020 15:56:02 GMT) Full text and rfc822 format available.

Message #20 received at 42382 <at> debbugs.gnu.org (full text, mbox):

From: Daniele Nicolodi <daniele <at> grinta.net>
To: Robert Pluim <rpluim <at> gmail.com>
Cc: 42382 <at> debbugs.gnu.org
Subject: Re: bug#42382: 26.3; url-http handling of Location redirection
 headers containing whitespace
Date: Fri, 17 Jul 2020 09:47:45 -0600
On 16/07/2020 11:10, Robert Pluim wrote:
> You stated that angle quotes were invalid, and proposed to remove the
> support for their presence. I was merely pointing out that spaces are
> equally invalid, and you propose to accomodate them. If one, why not
> the other?

Technically, I am not proposing to accommodate anything. I am proposing
to use the Location header as returned by the server, without trying to
guess what the server really meant.

If you want to look at it in another way, I am proposing to modify
url-http to follow what all other HTTP client implementations do. The
standard does not specify what to do in case of malformed header values,
thus adopting the most common approach allows for greater
interoperability. The fact that this is also the simplest thing to do is
also very attractive.

Finally, as Lars points out, angle bracket enclosed URIs in Location
headers have not been seen anywhere. Thus stripping them out is fixing
an imaginary problem, and we can come up with an infinite list of
imaginary problems that need to be "fixed". Non fixing any of them is
the reasonable thing to do.

>     Daniele> Why is percent-encoding better? The URI-reference value should not be
>     Daniele> interpreted in any way, simply passed along. Again, all other HTTP
>     Daniele> clients I looked at do not do this, or other manipulation of the header.
> 
> Because then it become a valid URI, and other parts of emacs that
> donʼt know how to treat literal spaces in a URI wonʼt break. But Iʼll
> agree that itʼs conjecture on my part.

The code I would like to fix does not expose the values of the headers
in any way, and it works just fine with non-escaped spaces, thus there
is no need to fix the percent-encoding.

Trying to fix the value of the URI-location value is a very slippery
slope. We can fix non-escaped spaces, but what about other non-escaped
characters? Should we escape them as well or leave them alone? If we
leave them alone, there is little value in escaping spaces (the
invariable "correctly percent-escaped URIs are returned, SOMETIMES" is
weaker than "the content of the Location header is return as is,
ALWAYS"). If we try to fix the escaping, how can we know that when we
get a URI containing a % character, this is a valid escape sequence an
not a literal % that needs to be escaped? We are back to the "correct
answer most of the times", which is not very helpful.

Thank you.

Cheers,
Dan




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#42382; Package emacs. (Sun, 19 Jul 2020 19:18:01 GMT) Full text and rfc822 format available.

Message #23 received at 42382 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Daniele Nicolodi <daniele <at> grinta.net>
Cc: 42382 <at> debbugs.gnu.org
Subject: Re: bug#42382: 26.3; url-http handling of Location redirection
 headers containing whitespace
Date: Sun, 19 Jul 2020 21:17:36 +0200
Daniele Nicolodi <daniele <at> grinta.net> writes:

> Also, accordingly to the RFCs, the location header may contain a
> relative location. Thus the comment that suggest that such a response is
> a bug in the server should be reworded.
>
> The attached patches implement the proposed changes.

Thanks; I've now applied the patches to Emacs 28.1.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Added tag(s) fixed. Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Sun, 19 Jul 2020 19:18:02 GMT) Full text and rfc822 format available.

bug marked as fixed in version 28.1, send any further explanations to 42382 <at> debbugs.gnu.org and Daniele Nicolodi <daniele <at> grinta.net> Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Sun, 19 Jul 2020 19:18:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#42382; Package emacs. (Mon, 20 Jul 2020 14:47:02 GMT) Full text and rfc822 format available.

Message #30 received at 42382 <at> debbugs.gnu.org (full text, mbox):

From: Daniele Nicolodi <daniele <at> grinta.net>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: 42382 <at> debbugs.gnu.org
Subject: Re: bug#42382: 26.3; url-http handling of Location redirection
 headers containing whitespace
Date: Mon, 20 Jul 2020 08:46:12 -0600
On 19-07-2020 13:17, Lars Ingebrigtsen wrote:
> Daniele Nicolodi <daniele <at> grinta.net> writes:
> 
>> Also, accordingly to the RFCs, the location header may contain a
>> relative location. Thus the comment that suggest that such a response is
>> a bug in the server should be reworded.
>>
>> The attached patches implement the proposed changes.
> 
> Thanks; I've now applied the patches to Emacs 28.1.

Thanks Lars.

Cheers,
Dan





bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 18 Aug 2020 11:24:08 GMT) Full text and rfc822 format available.

This bug report was last modified 3 years and 245 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.