GNU bug report logs - #35658
27.0.50; Problems with chunked/gzipped content in url-http


Package: emacs;

Reported by: Lars Ingebrigtsen <larsi <at> gnus.org>

Date: Thu, 9 May 2019 18:28:01 UTC

Severity: normal

Tags: fixed

Found in version 27.0.50

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 35658 in the body.
You can then email your comments to 35658 AT debbugs.gnu.org in the normal way.



Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#35658; Package emacs. (Thu, 09 May 2019 18:28:01 GMT) Full text and rfc822 format available.

Acknowledgement sent to Lars Ingebrigtsen <larsi <at> gnus.org>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Thu, 09 May 2019 18:28:01 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: bug-gnu-emacs <at> gnu.org
Subject: 27.0.50; Problems with chunked/gzipped content in url-http
Date: Thu, 09 May 2019 14:26:49 -0400

To reproduce, say:

(url-retrieve-synchronously "https://www.facebook.com/verkstedetbar/events/?ref=page_internal")

This now appears to spin until Facebook closes the connection on their
side.

Looking at the data, it appears that Facebook outputs chunked gzipped
data, which is pretty normal.  What's not normal is that what's in the
http buffer is

HTTP/1.1 200 OK
Content-Encoding: gzip
[...]
Date: Thu, 09 May 2019 17:45:53 GMT
Transfer-Encoding: chunked
Connection: keep-alive

3^_\213^H^@^@^@^@^@^@...

(I've transliterated the binary data into what it looks like in the
buffer.)

I.e., there's no chunking header before the start of the gzip data.  Somehow,
either Emacs is removing the leading

4af^M

and replacing it with "3", or Facebook is outputting something odd.  I
tend to believe it's the former, since curl (and other programs)
understand the chunked data perfectly.

It's most strange...

If we switch gzip http off by saying

(setq url-mime-encoding-string nil)

then the chunked headers are correct and everything works.
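
A narrower variant of this workaround, as a sketch only: let-binding
url-mime-encoding-string around a single request turns gzip off for
that fetch rather than globally.  The helper name my/fetch-without-gzip
is made up for illustration, and the sketch assumes the request headers
are built while the binding is still in effect, which should hold for a
synchronous fetch.

```elisp
(require 'url)

(defun my/fetch-without-gzip (url)
  "Fetch URL synchronously without advertising gzip support.
Hypothetical helper, not part of url.el.  Returns the response
buffer, or nil if the request failed."
  ;; With this variable nil, url-http sends no Accept-Encoding
  ;; header, so the server should reply with uncompressed data.
  (let ((url-mime-encoding-string nil))
    (url-retrieve-synchronously url)))
```

This only sidesteps the problem; the chunked data still has to be
parsed, it just no longer starts with binary gzip bytes.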



In GNU Emacs 27.0.50 (build 2, x86_64-pc-linux-gnu, GTK+ Version 3.22.30)
 of 2019-05-02 built on sandy
Repository revision: d4fa998c3142b5ae13664295dcf2136397b05f5a
Repository branch: master
Windowing system distributor 'The X.Org Foundation', version 11.0.11906000
System Description: Ubuntu 18.04.2 LTS


-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#35658; Package emacs. (Fri, 10 May 2019 19:22:01 GMT) Full text and rfc822 format available.

Message #8 received at 35658 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: 35658 <at> debbugs.gnu.org
Subject: Re: bug#35658: 27.0.50; Problems with chunked/gzipped content in url-http
Date: Fri, 10 May 2019 15:21:39 -0400

I've poked at this some more, and it does seem like url-http is doing
something odd to the data.  When I try the same in the with-fetched-url
branch, I get the expected:

Transfer-Encoding: chunked
Connection: keep-alive

327
^_\213^H^@^@^@...

So some thing is doing ... something to the data as it arrives from
Facebook.  url-http is working in a multibyte buffer, so I guess there
are many possible reasons for this happening...

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#35658; Package emacs. (Wed, 15 May 2019 04:58:01 GMT) Full text and rfc822 format available.

Message #11 received at 35658 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: 35658 <at> debbugs.gnu.org
Subject: Re: bug#35658: 27.0.50; Problems with chunked/gzipped content in url-http
Date: Wed, 15 May 2019 06:56:57 +0200

Lars Ingebrigtsen <larsi <at> gnus.org> writes:

> So some thing is doing ... something to the data as it arrives from
> Facebook.  url-http is working in a multibyte buffer, so I guess there
> are many possible reasons for this happening...

Nothing so complicated -- if I remove the contents of
url-http-chunked-encoding-after-change-function, the text is all there,
so it's doing something very wonky.

What seems to happen is that Facebook will output the first character of
the chunked header first, and then the rest afterwards.  This somehow
makes that function delete parts of the header before it's gotten the
complete header, which again means that it'll never find the real size
of the data.
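
To illustrate the failure mode (this is a sketch, not the code in
url-http.el and not necessarily the fix that was pushed): a chunk
header can only be trusted once the whole line, including its
terminating CRLF, has arrived.  The helper name my/chunk-header-size is
hypothetical.

```elisp
(defun my/chunk-header-size (data)
  "Return the chunk size announced at the start of DATA, or nil.
DATA is the text received so far.  Nil means the header line is
still incomplete and the caller should wait for more input.
Hypothetical helper for illustration; not the code in url-http.el."
  ;; Require the whole line, terminating CRLF included, before
  ;; trusting the hex digits: "3" alone may be the start of "327".
  ;; Anything between the digits and the CRLF (a chunk extension)
  ;; is skipped.
  (when (string-match "\\`\\([[:xdigit:]]+\\)[^\r\n]*\r\n" data)
    (string-to-number (match-string 1 data) 16)))

(my/chunk-header-size "3")         ; nil, header still incomplete
(my/chunk-header-size "327\r\n")   ; 807, i.e. hex 327
```

A parser written this way leaves a lone "3" untouched until the
remaining "27" and the CRLF show up, instead of acting on the partial
header.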

And it's a function that's virtually impossible to debug in any sensible
manner, because everything happens async and the function depends on
many buffer-local variables...

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#35658; Package emacs. (Wed, 15 May 2019 05:14:01 GMT) Full text and rfc822 format available.

Message #14 received at 35658 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: 35658 <at> debbugs.gnu.org
Subject: Re: bug#35658: 27.0.50; Problems with chunked/gzipped content in url-http
Date: Wed, 15 May 2019 07:13:00 +0200

Lars Ingebrigtsen <larsi <at> gnus.org> writes:

> What seems to happen is that Facebook will output the first character of
> the chunked header first, and then the rest afterwards. 

Yup.  Pushed a fix.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Added tag(s) fixed. Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Wed, 15 May 2019 05:14:02 GMT) Full text and rfc822 format available.

bug closed, send any further explanations to 35658 <at> debbugs.gnu.org and Lars Ingebrigtsen <larsi <at> gnus.org> Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Wed, 15 May 2019 05:14:02 GMT) Full text and rfc822 format available.

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Wed, 12 Jun 2019 11:24:07 GMT) Full text and rfc822 format available.

This bug report was last modified 4 years and 320 days ago.
