GNU bug report logs -
#42382
26.3; url-http handling of Location redirection headers containing whitespace
Previous Next
Reported by: Daniele Nicolodi <daniele <at> grinta.net>
Date: Wed, 15 Jul 2020 23:12:02 UTC
Severity: normal
Tags: fixed
Found in version 26.3
Fixed in version 28.1
Done: Lars Ingebrigtsen <larsi <at> gnus.org>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 42382 in the body.
You can then email your comments to 42382 AT debbugs.gnu.org in the normal way.
Toggle the display of automated, internal messages from the tracker.
Report forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#42382
; Package
emacs
.
(Wed, 15 Jul 2020 23:12:02 GMT)
Full text and
rfc822 format available.
Acknowledgement sent
to
Daniele Nicolodi <daniele <at> grinta.net>
:
New bug report received and forwarded. Copy sent to
bug-gnu-emacs <at> gnu.org
.
(Wed, 15 Jul 2020 23:12:02 GMT)
Full text and
rfc822 format available.
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
url-http.el interprets HTTP responses in url-http-parse-headers. This
function contains the following code:
(when redirect-uri
;; Clean off any whitespace and/or <...> cruft.
(if (string-match "\\([^ \t]+\\)[ \t]" redirect-uri)
(setq redirect-uri (match-string 1 redirect-uri)))
(if (string-match "^<\\(.*\\)>$" redirect-uri)
(setq redirect-uri (match-string 1 redirect-uri)))
which truncates the value of the Location header at the first whitespace
character and removes surrounding angle brackets quoting.
In RFC 7231 the Location header is defined to carry a URI-reference.
According to RFC 3986 it should be percent-encoded and thus should not
contain spaces. However, there are HTTP server implementation (notably
nginx) that do not do that. While this is a bug in those HTTP server
implementations, I think Emacs should follow what most other HTTP client
implementatios (all the ones I tested) and use the content of the
Location header unmodified. Stripping of angle bracket quotes is
unnecessary as they are not valid according to the RFCs.
Also, accordingly to the RFCs, the location header may contain a
relative location. Thus the comment that suggest that such a response is
a bug in the server should be reworded.
The attached patches implement the proposed changes.
Thank you.
[0001-url-http-Fix-handling-of-redirect-locations.patch (text/plain, attachment)]
[0002-url-http-Do-not-suggest-a-broken-HTTP-server-impleme.patch (text/plain, attachment)]
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#42382
; Package
emacs
.
(Thu, 16 Jul 2020 16:09:02 GMT)
Full text and
rfc822 format available.
Message #8 received at 42382 <at> debbugs.gnu.org (full text, mbox):
>>>>> On Wed, 15 Jul 2020 14:40:36 -0600, Daniele Nicolodi <daniele <at> grinta.net> said:
Daniele> In RFC 7231 the Location header is defined to carry a URI-reference.
Daniele> According to RFC 3986 it should be percent-encoded and thus should not
Daniele> contain spaces. However, there are HTTP server implementation (notably
Daniele> nginx) that do not do that. While this is a bug in those HTTP server
Daniele> implementations, I think Emacs should follow what most other HTTP client
Daniele> implementatios (all the ones I tested) and use the content of the
Daniele> Location header unmodified. Stripping of angle bracket quotes is
Daniele> unnecessary as they are not valid according to the RFCs.
Nor is embedded whitespace in the URI :-)
Are you sure this won't break anything? ie are you sure there are 0
server implementations out there that send angle brackets?
Iʼd be conservative, and just replace the truncation on whitespace
with percent-encoding of said whitespace.
Daniele> Also, accordingly to the RFCs, the location header may contain a
Daniele> relative location. Thus the comment that suggest that such a response is
Daniele> a bug in the server should be reworded.
Daniele> The attached patches implement the proposed changes.
The second patch is small enough that I think you can combine the two.
Robert
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#42382
; Package
emacs
.
(Thu, 16 Jul 2020 16:52:01 GMT)
Full text and
rfc822 format available.
Message #11 received at 42382 <at> debbugs.gnu.org (full text, mbox):
On 16-07-2020 10:08, Robert Pluim wrote:
>>>>>> On Wed, 15 Jul 2020 14:40:36 -0600, Daniele Nicolodi <daniele <at> grinta.net> said:
>
> Daniele> In RFC 7231 the Location header is defined to carry a URI-reference.
> Daniele> According to RFC 3986 it should be percent-encoded and thus should not
> Daniele> contain spaces. However, there are HTTP server implementation (notably
> Daniele> nginx) that do not do that. While this is a bug in those HTTP server
> Daniele> implementations, I think Emacs should follow what most other HTTP client
> Daniele> implementatios (all the ones I tested) and use the content of the
> Daniele> Location header unmodified. Stripping of angle bracket quotes is
> Daniele> unnecessary as they are not valid according to the RFCs.
>
> Nor is embedded whitespace in the URI :-)
I don't understand this remark. Truncating at the first whitespace
character (current behavior) is a valid arbitrary decision for an
RFC-invalid URI-reference value. However, it is different from what all
other HTTP clients implement and it results in practical problems.
> Are you sure this won't break anything? ie are you sure there are 0
> server implementations out there that send angle brackets?
I don't see any reason why there should be angle brackets around the
value of a Location header and the current code or changelog or commit
messages does not provide a justification or a case where these have
been encountered. No other HTTP client I looked at does something like
this. I think there are many HTTP client implementations out there that
are more widely used and tested for interoperability than url-http.
> Iʼd be conservative, and just replace the truncation on whitespace
> with percent-encoding of said whitespace.
Why is percent-encoding better? The URI-reference value should not be
interpreted in any way, simply passed along. Again, all other HTTP
clients I looked at do not do this, or other manipulation of the header.
> Daniele> Also, accordingly to the RFCs, the location header may contain a
> Daniele> relative location. Thus the comment that suggest that such a response is
> Daniele> a bug in the server should be reworded.
>
> Daniele> The attached patches implement the proposed changes.
>
> The second patch is small enough that I think you can combine the two.
They are divided to provide justification for the changes in the commit
messages.
Thank you.
Daniele
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#42382
; Package
emacs
.
(Thu, 16 Jul 2020 17:11:02 GMT)
Full text and
rfc822 format available.
Message #14 received at 42382 <at> debbugs.gnu.org (full text, mbox):
>>>>> On Thu, 16 Jul 2020 10:30:49 -0600, Daniele Nicolodi <daniele <at> grinta.net> said:
Daniele> On 16-07-2020 10:08, Robert Pluim wrote:
>>>>>>> On Wed, 15 Jul 2020 14:40:36 -0600, Daniele Nicolodi <daniele <at> grinta.net> said:
>>
Daniele> In RFC 7231 the Location header is defined to carry a URI-reference.
Daniele> According to RFC 3986 it should be percent-encoded and thus should not
Daniele> contain spaces. However, there are HTTP server implementation (notably
Daniele> nginx) that do not do that. While this is a bug in those HTTP server
Daniele> implementations, I think Emacs should follow what most other HTTP client
Daniele> implementatios (all the ones I tested) and use the content of the
Daniele> Location header unmodified. Stripping of angle bracket quotes is
Daniele> unnecessary as they are not valid according to the RFCs.
>>
>> Nor is embedded whitespace in the URI :-)
Daniele> I don't understand this remark. Truncating at the first whitespace
Daniele> character (current behavior) is a valid arbitrary decision for an
Daniele> RFC-invalid URI-reference value. However, it is different from what all
Daniele> other HTTP clients implement and it results in practical problems.
You stated that angle quotes were invalid, and proposed to remove the
support for their presence. I was merely pointing out that spaces are
equally invalid, and you propose to accomodate them. If one, why not
the other?
>> Are you sure this won't break anything? ie are you sure there are 0
>> server implementations out there that send angle brackets?
Daniele> I don't see any reason why there should be angle brackets around the
Daniele> value of a Location header and the current code or changelog or commit
Daniele> messages does not provide a justification or a case where these have
Daniele> been encountered. No other HTTP client I looked at does something like
Daniele> this. I think there are many HTTP client implementations out there that
Daniele> are more widely used and tested for interoperability than url-http.
>> Iʼd be conservative, and just replace the truncation on whitespace
>> with percent-encoding of said whitespace.
Daniele> Why is percent-encoding better? The URI-reference value should not be
Daniele> interpreted in any way, simply passed along. Again, all other HTTP
Daniele> clients I looked at do not do this, or other manipulation of the header.
Because then it become a valid URI, and other parts of emacs that
donʼt know how to treat literal spaces in a URI wonʼt break. But Iʼll
agree that itʼs conjecture on my part.
Robert
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#42382
; Package
emacs
.
(Fri, 17 Jul 2020 00:02:01 GMT)
Full text and
rfc822 format available.
Message #17 received at 42382 <at> debbugs.gnu.org (full text, mbox):
Robert Pluim <rpluim <at> gmail.com> writes:
> You stated that angle quotes were invalid, and proposed to remove the
> support for their presence. I was merely pointing out that spaces are
> equally invalid, and you propose to accomodate them. If one, why not
> the other?
I think it makes sense to remove the support for the angle brackets --
I've never seen those in use anywhere. If they turn out to be a thing,
we can add that back.
On the other hand, spaces in Location headers have been observed in the
wild, so supporting that seems like a good idea to me.
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#42382
; Package
emacs
.
(Fri, 17 Jul 2020 15:56:02 GMT)
Full text and
rfc822 format available.
Message #20 received at 42382 <at> debbugs.gnu.org (full text, mbox):
On 16/07/2020 11:10, Robert Pluim wrote:
> You stated that angle quotes were invalid, and proposed to remove the
> support for their presence. I was merely pointing out that spaces are
> equally invalid, and you propose to accomodate them. If one, why not
> the other?
Technically, I am not proposing to accommodate anything. I am proposing
to use the Location header as returned by the server, without trying to
guess what the server really meant.
If you want to look at it in another way, I am proposing to modify
url-http to follow what all other HTTP client implementations do. The
standard does not specify what to do in case of malformed header values,
thus adopting the most common approach allows for greater
interoperability. The fact that this is also the simplest thing to do is
also very attractive.
Finally, as Lars points out, angle bracket enclosed URIs in Location
headers have not been seen anywhere. Thus stripping them out is fixing
an imaginary problem, and we can come up with an infinite list of
imaginary problems that need to be "fixed". Non fixing any of them is
the reasonable thing to do.
> Daniele> Why is percent-encoding better? The URI-reference value should not be
> Daniele> interpreted in any way, simply passed along. Again, all other HTTP
> Daniele> clients I looked at do not do this, or other manipulation of the header.
>
> Because then it become a valid URI, and other parts of emacs that
> donʼt know how to treat literal spaces in a URI wonʼt break. But Iʼll
> agree that itʼs conjecture on my part.
The code I would like to fix does not expose the values of the headers
in any way, and it works just fine with non-escaped spaces, thus there
is no need to fix the percent-encoding.
Trying to fix the value of the URI-location value is a very slippery
slope. We can fix non-escaped spaces, but what about other non-escaped
characters? Should we escape them as well or leave them alone? If we
leave them alone, there is little value in escaping spaces (the
invariable "correctly percent-escaped URIs are returned, SOMETIMES" is
weaker than "the content of the Location header is return as is,
ALWAYS"). If we try to fix the escaping, how can we know that when we
get a URI containing a % character, this is a valid escape sequence an
not a literal % that needs to be escaped? We are back to the "correct
answer most of the times", which is not very helpful.
Thank you.
Cheers,
Dan
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#42382
; Package
emacs
.
(Sun, 19 Jul 2020 19:18:01 GMT)
Full text and
rfc822 format available.
Message #23 received at 42382 <at> debbugs.gnu.org (full text, mbox):
Daniele Nicolodi <daniele <at> grinta.net> writes:
> Also, accordingly to the RFCs, the location header may contain a
> relative location. Thus the comment that suggest that such a response is
> a bug in the server should be reworded.
>
> The attached patches implement the proposed changes.
Thanks; I've now applied the patches to Emacs 28.1.
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no
Added tag(s) fixed.
Request was from
Lars Ingebrigtsen <larsi <at> gnus.org>
to
control <at> debbugs.gnu.org
.
(Sun, 19 Jul 2020 19:18:02 GMT)
Full text and
rfc822 format available.
bug marked as fixed in version 28.1, send any further explanations to
42382 <at> debbugs.gnu.org and Daniele Nicolodi <daniele <at> grinta.net>
Request was from
Lars Ingebrigtsen <larsi <at> gnus.org>
to
control <at> debbugs.gnu.org
.
(Sun, 19 Jul 2020 19:18:02 GMT)
Full text and
rfc822 format available.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#42382
; Package
emacs
.
(Mon, 20 Jul 2020 14:47:02 GMT)
Full text and
rfc822 format available.
Message #30 received at 42382 <at> debbugs.gnu.org (full text, mbox):
On 19-07-2020 13:17, Lars Ingebrigtsen wrote:
> Daniele Nicolodi <daniele <at> grinta.net> writes:
>
>> Also, accordingly to the RFCs, the location header may contain a
>> relative location. Thus the comment that suggest that such a response is
>> a bug in the server should be reworded.
>>
>> The attached patches implement the proposed changes.
>
> Thanks; I've now applied the patches to Emacs 28.1.
Thanks Lars.
Cheers,
Dan
bug archived.
Request was from
Debbugs Internal Request <help-debbugs <at> gnu.org>
to
internal_control <at> debbugs.gnu.org
.
(Tue, 18 Aug 2020 11:24:08 GMT)
Full text and
rfc822 format available.
This bug report was last modified 3 years and 245 days ago.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.