GNU bug report logs - #23750
25.0.95; bug in url-retrieve or json.el
Reported by: Leo Liu <sdl.web <at> gmail.com>
Date: Sun, 12 Jun 2016 02:24:02 UTC
Severity: normal
Found in version 25.0.95
Done: Dmitry Gutov <dgutov <at> yandex.ru>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 23750 in the body.
You can then email your comments to 23750 AT debbugs.gnu.org in the normal way.
Report forwarded to bug-gnu-emacs <at> gnu.org: bug#23750; Package emacs. (Sun, 12 Jun 2016 02:24:02 GMT)
Acknowledgement sent to Leo Liu <sdl.web <at> gmail.com>: New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Sun, 12 Jun 2016 02:24:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
I have been trying to debug an issue in TernJs¹ on and off for a few
months now and it seems the cause is some nasty bug in Emacs 25. Could
someone follow the steps detailed in
https://github.com/ternjs/tern/issues/719 to reproduce the issue?
I have verified that the bug is not in Tern but in Emacs, i.e. under some
circumstances Emacs's URL package strips some chars from the request body,
which, in this case, leads to unbalanced parentheses in the JSON doc.
Leo
Footnotes:
¹ https://github.com/ternjs/tern/issues/719
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#23750; Package emacs. (Mon, 13 Jun 2016 15:03:03 GMT)
Message #8 received at 23750 <at> debbugs.gnu.org (full text, mbox):
On 06/12/2016 05:22 AM, Leo Liu wrote:
> ¹ https://github.com/ternjs/tern/issues/719
Investigation shows that the problem occurs when url-http-data is
multibyte and (length url-http-data) differs from (length
(string-as-unibyte url-http-data)), because we send a wrong value in
Content-length.
Changing url-http-create-request like this will make the problem more
obvious for anyone else that hits it, patch attached.
Stefan, did you have a particular situation in mind where this might be
bad, when you wrote the FIXME?
[url-http-unibyte.diff (text/x-patch, attachment)]
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#23750; Package emacs. (Mon, 13 Jun 2016 19:11:01 GMT)
Message #11 received at 23750 <at> debbugs.gnu.org (full text, mbox):
>> ¹ https://github.com/ternjs/tern/issues/719
> Investigation shows that the problem occurs when url-http-data is multibyte
> and (length url-http-data) differs from (length (string-as-unibyte
> url-http-data)), because we send a wrong value in Content-length.
> Changing url-http-create-request like this will make the problem more
> obvious for anyone else that hits it, patch attached.
> Stefan, did you have a particular situation in mind where this might be bad,
> when you wrote the FIXME?
No, nothing in particular. Just that `string-as-unibyte` is generally
synonymous with "the author is confused about how coding systems work",
aka "trouble".
Stefan
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#23750; Package emacs. (Mon, 13 Jun 2016 19:27:01 GMT)
Message #14 received at 23750 <at> debbugs.gnu.org (full text, mbox):
On 06/13/2016 08:55 PM, Stefan Monnier wrote:
> No, nothing in particular. Just that `string-as-unibyte` is generally
> synonymous with "the author is confused about how coding systems work",
> aka "trouble".
You were also the author in this case. The same commit added both the
use of string-as-unibyte and the FIXME comment.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#23750; Package emacs. (Tue, 14 Jun 2016 00:31:01 GMT)
Message #17 received at 23750 <at> debbugs.gnu.org (full text, mbox):
>> No, nothing in particular. Just that `string-as-unibyte` is generally
>> synonymous with "the author is confused about how coding systems work",
>> aka "trouble".
> You were also the author in this case. The same commit added both the use of
> string-as-unibyte and the FIXME comment.
Can't remember why I did so. My best guess is that I tried to mimic
some earlier behavior.
Stefan
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#23750; Package emacs. (Sun, 19 Jun 2016 18:16:02 GMT)
Message #20 received at 23750 <at> debbugs.gnu.org (full text, mbox):
On 06/14/2016 03:30 AM, Stefan Monnier wrote:
> Can't remember why I did so. My best guess is that I tried to mimic
> some earlier behavior.
OK, thanks anyway. I've pushed the patch to master as
2ede29575fa22eb7c265117d7511cff9fe02c606.
Eli, could we have it in emacs-25 as well? It's not critical, but it should
make life easier for our users by flagging problems with the usage of
url-http earlier, in a more appropriate place, with an error, rather
than leaving it up to them to deduce why their HTTP server truncates
the request body.
While the truncation bug itself is quite old, it's been exacerbated in
Emacs 25 by my own work to make json.el faster: one side-effect is that
it doesn't \u-quote multibyte characters anymore, or at least not all of
them.
FWIW, I've been running with it applied to emacs-25 for the past week
with no problems.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#23750; Package emacs. (Sun, 19 Jun 2016 18:27:02 GMT)
Message #23 received at 23750 <at> debbugs.gnu.org (full text, mbox):
> Cc: 23750 <at> debbugs.gnu.org, Leo Liu <sdl.web <at> gmail.com>,
> Eli Zaretskii <eliz <at> gnu.org>
> From: Dmitry Gutov <dgutov <at> yandex.ru>
> Date: Sun, 19 Jun 2016 21:14:55 +0300
>
> On 06/14/2016 03:30 AM, Stefan Monnier wrote:
>
> > Can't remember why I did so. My best guess is that I tried to mimic
> > some earlier behavior.
>
> OK, thanks anyway. I've pushed the patch to master as
> 2ede29575fa22eb7c265117d7511cff9fe02c606.
>
> Eli, could we have it in emacs-25 as well? It's not critical, but it should
> make life easier for our users by flagging problems with the usage of
> url-http earlier, in a more appropriate place, with an error, rather
> than leaving it up to them to deduce why their HTTP server truncates
> the request body.
I'd need a very detailed description of the bug, and why this
particular solution was used. IME, neither string-to-unibyte nor
string-as-unibyte should ever be used in applications, their use is
more often than not a sign of some basic misunderstanding of text
encoding. For starters, how come 8-bit bytes wind up in that
function, and what do they stand for?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#23750; Package emacs. (Sun, 19 Jun 2016 18:31:02 GMT)
Message #26 received at 23750 <at> debbugs.gnu.org (full text, mbox):
>>>>> Eli Zaretskii <eliz <at> gnu.org> writes:
>> Eli, could we have it in emacs-25 as well? It's not critical, but it should
>> make life easier for our users by flagging problems with the usage of
>> url-http earlier, in a more appropriate place, with an error, rather than
>> leaving it up to them to deduce why their HTTP server truncates the
>> request body.
Bear in mind that 25.2 can be released as soon after as we want it to. If
anything is "optional" at this point in time, it should be deferred.
We shouldn't try to race anything into the release, just because we think
users will then have to live with some minor inferior behavior for a long time
after. The description above certainly does not sound like something that
needs to happen for 25.1.
--
John Wiegley GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com 60E1 46C4 BD1A 7AC1 4BA2
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#23750; Package emacs. (Sun, 19 Jun 2016 18:37:01 GMT)
Message #29 received at 23750 <at> debbugs.gnu.org (full text, mbox):
On 06/19/2016 09:25 PM, Eli Zaretskii wrote:
> I'd need a very detailed description of the bug, and why this
> particular solution was used.
This particular bug came from this:
"Content-length: " (number-to-string (length url-http-data))
Which gives wrong value when url-http-data is multibyte (it should be
length in bytes). So then, the HTTP server on the other side saw the
wrong body length and truncated the body when reading the request. Or
something along these lines.
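To illustrate the mismatch, here is a minimal sketch (not taken from the patch; the helper name `my-body-sizes` and the sample JSON body are made up for illustration). It compares the character count that `length` reports with the byte counts of the same string:

```elisp
;; Hypothetical helper, for illustration only.
(defun my-body-sizes (body)
  "Return (CHARS INTERNAL-BYTES UTF8-BYTES) for the string BODY."
  (list (length body)                                  ; characters
        (string-bytes body)                            ; bytes in Emacs's internal representation
        (length (encode-coding-string body 'utf-8))))  ; bytes once encoded as UTF-8

;; The \u00e9 escape forces a multibyte string containing one non-ASCII character.
(my-body-sizes "{\"name\": \"caf\u00e9\"}")
;; => (16 17 17)
```

A Content-length of 16 for a body that occupies 17 bytes on the wire is one byte short, which is consistent with the server dropping the closing brace.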
> IME, neither string-to-unibyte nor
> string-as-unibyte should ever be used in applications, their use is
> more often than not a sign of some basic misunderstanding of text
> encoding. For starters, how come 8-bit bytes wind up in that
> function, and what do they stand for?
Some 8-bit encoding of the HTTP request body.
Anyway, yes, the hope is that the programmer uses something like
encode-coding-string to produce that value (and picks the encoding, and
indicates it in the appropriate HTTP header). Then string-to-unibyte
will simply be a no-op. But we need to catch the case when they don't,
and this seems to be the easiest way to do this.
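A minimal sketch of that caller-side approach, assuming the URL package's `url-request-method`, `url-request-extra-headers` and `url-request-data` variables; the function names `my-json-body` and `my-post-json` are hypothetical:

```elisp
(require 'url)
(require 'json)

;; Hypothetical helper: serialize first, then encode explicitly.
(defun my-json-body (object)
  "Return OBJECT serialized as JSON and encoded as a unibyte UTF-8 string."
  (encode-coding-string (json-encode object) 'utf-8))

;; Hypothetical caller: the encoding chosen above is also declared in the header.
(defun my-post-json (url object)
  "POST OBJECT as UTF-8 encoded JSON to URL and return the response buffer."
  (let ((url-request-method "POST")
        (url-request-extra-headers
         '(("Content-Type" . "application/json; charset=utf-8")))
        (url-request-data (my-json-body object)))
    (url-retrieve-synchronously url)))
```

Because `encode-coding-string` returns a unibyte string, the number of characters and the number of bytes in the body coincide.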
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#23750; Package emacs. (Sun, 19 Jun 2016 18:46:01 GMT)
Message #32 received at 23750 <at> debbugs.gnu.org (full text, mbox):
On 06/19/2016 09:30 PM, John Wiegley wrote:
> Bear in mind that 25.2 can be released as soon after as we want it to. If
> anything is "optional" at this point in time, it should be deferred.
Let's apply the few outstanding patches and release 25.2 the next day, then?
Traditionally, releases are separated by at least several months, even
ones with no big changes.
> We shouldn't try to race anything into the release, just because we think
> users will then have to live with some minor inferior behavior for a long time
> after. The description above certainly does not sound like something that
> needs to happen for 25.1.
Just to be clear: the patch doesn't change the behavior of any working
code. It just catches a particular kind of bug earlier than it would
manifest through a cryptic behavior.
Behavior which is non-trivial to debug, and thus adds to the already
non-trivial effort required of a person writing advanced language
support code (using an external daemon talking over HTTP is fairly
common for this these days).
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#23750; Package emacs. (Sun, 19 Jun 2016 19:57:01 GMT)
Message #35 received at 23750 <at> debbugs.gnu.org (full text, mbox):
>>>>> Dmitry Gutov <dgutov <at> yandex.ru> writes:
> Just to be clear: the patch doesn't change the behavior of any working code.
> It just catches a particular kind of bug earlier than it would manifest
> through a cryptic behavior.
>
> Behavior which is non-trivial to debug, and thus adds to the already
> non-trivial effort required of a person writing advanced language support
> code (using an external daemon talking over HTTP is fairly common for this
> these days).
I get that. But right now, if it doesn't *have* to happen, it should wait.
We're thinking about cutting the release candidate in just a few days, pending
one issue that Eli is looking into. Any change -- and I mean _any_ change --
has the potential to introduce unforeseen effects that could delay us further.
--
John Wiegley GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com 60E1 46C4 BD1A 7AC1 4BA2
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#23750; Package emacs. (Sun, 19 Jun 2016 20:06:01 GMT)
Message #38 received at 23750 <at> debbugs.gnu.org (full text, mbox):
On 06/19/2016 10:56 PM, John Wiegley wrote:
> We're thinking about cutting the release candidate in just a few days, pending
> one issue that Eli is looking into. Any change -- and I mean _any_ change --
> has the potential to introduce unforeseen effects that could delay us further.
By how much?
Even if that change causes problems (which is unlikely), we'd only have
to revert it, and, unless other issues have come in the meantime, we
could build and release Emacs 25.1 right then, more or less.
It's not like a regression there has a significant potential to obscure
other problems. We've tested the current state of the URL package pretty
well by now anyway.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#23750; Package emacs. (Sun, 19 Jun 2016 21:08:01 GMT)
Message #41 received at 23750 <at> debbugs.gnu.org (full text, mbox):
>>>>> Dmitry Gutov <dgutov <at> yandex.ru> writes:
> By how much?
>
> Even if that change causes problems (which is unlikely), we'd only have to
> revert it, and, unless other issues have come in the meantime, we could
> build and release Emacs 25.1 right then, more or less.
A day comes when a line has to be drawn in the sand, otherwise we could nickel
and dime ourselves into the next century. That line is drawn; the time for
25.1 is at hand. Let's start thinking about 25.2 as we think about these types
of improvements, and how we might accelerate its release so it happens in 1-2
months time. There can be many 25.x's, without disrupting the feature work
happening on master.
--
John Wiegley GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com 60E1 46C4 BD1A 7AC1 4BA2
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#23750; Package emacs. (Mon, 20 Jun 2016 00:16:02 GMT)
Message #44 received at submit <at> debbugs.gnu.org (full text, mbox):
On 2016-06-19 21:36 +0300, Dmitry Gutov wrote:
> This particular bug came from this:
>
> "Content-length: " (number-to-string (length url-http-data))
>
> Which gives wrong value when url-http-data is multibyte (it should be
> length in bytes). So then, the HTTP server on the other side saw the
> wrong body length and truncated the body when reading the request.
As Dmitry mentioned earlier, json-encode in 25.1 produces multibyte
strings, which makes it easier to hit this bug when consuming JSON APIs.
There are three suspects: 1) the JSON API server, 2) json.el, 3) URL.
It took me a while to realise it's URL's fault; IOW, the bug isn't easy
to debug. This is somewhat related to changes brought in by 25.1.
Leo
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#23750; Package emacs. (Mon, 20 Jun 2016 01:27:01 GMT)
Message #47 received at 23750 <at> debbugs.gnu.org (full text, mbox):
John Wiegley wrote:
> We're thinking about cutting the release candidate in just a few days
Please see admin/release-process for some tasks that should happen
before that.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#23750; Package emacs. (Mon, 20 Jun 2016 01:29:02 GMT)
Message #50 received at 23750 <at> debbugs.gnu.org (full text, mbox):
John Wiegley wrote:
> There can be many 25.x's, without disrupting the feature work
> happening on master.
Then why is master STILL advertising itself as the forerunner to 25.2?
Why are we closing a bunch of bugs as "fixed in 25.2" if they won't be
fixed till 26.1?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#23750; Package emacs. (Mon, 20 Jun 2016 02:42:02 GMT)
Message #53 received at 23750 <at> debbugs.gnu.org (full text, mbox):
> Cc: 23750 <at> debbugs.gnu.org, monnier <at> IRO.UMontreal.CA, sdl.web <at> gmail.com
> From: Dmitry Gutov <dgutov <at> yandex.ru>
> Date: Sun, 19 Jun 2016 21:36:25 +0300
>
> This particular bug came from this:
>
> "Content-length: " (number-to-string (length url-http-data))
>
> Which gives wrong value when url-http-data is multibyte (it should be
> length in bytes). So then, the HTTP server on the other side saw the
> wrong body length and truncated the body when reading the request. Or
> something along these lines.
So this is not a bug in Emacs, but a diagnostic facility to let bugs
in applications be discovered?
> > IME, neither string-to-unibyte nor
> > string-as-unibyte should ever be used in applications, their use is
> > more often than not a sign of some basic misunderstanding of text
> > encoding. For starters, how come 8-bit bytes wind up in that
> > function, and what do they stand for?
>
> Some 8-bit encoding of the HTTP request body.
>
> Anyway, yes, the hope is that the programmer uses something like
> encode-coding-string to produce that value (and picks the encoding, and
> indicates it in the appropriate HTTP header). Then string-to-unibyte
> will simply be a no-op. But we need to catch the case when they don't,
> and this seems to be the easiest way to do this.
If this is what you need, why not simply test the payload for being a
unibyte string? There is a function, multibyte-string-p, for that.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#23750; Package emacs. (Mon, 20 Jun 2016 02:52:02 GMT)
Message #56 received at 23750 <at> debbugs.gnu.org (full text, mbox):
On 06/20/2016 05:40 AM, Eli Zaretskii wrote:
> So this is not a bug in Emacs, but a diagnostic facility to let bugs
> in applications be discovered?
It's a bug. Accepting invalid input and behaving badly with it is
definitely a bug.
> If this is what you need, why not simply test the payload for being a
> unibyte string? There is a function, multibyte-string-p, for that.
There are a lot of variables to test (see the comment above the
mapconcat call).
I'm fine either way, but my patch changes two characters, and yours will
be longer. And you'll have to come up with the error message(s).
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#23750; Package emacs. (Mon, 20 Jun 2016 02:59:02 GMT)
Message #59 received at 23750 <at> debbugs.gnu.org (full text, mbox):
On 06/19/2016 10:56 PM, John Wiegley wrote:
> We're thinking about cutting the release candidate in just a few days, pending
> one issue that Eli is looking into.
Do you mean bug#23779? I wouldn't call it critical (judging by the
number of years it went unreported), and it's not a regression, so it
doesn't make a lot of sense to fix it without taking care of the bug
that resulted in it being reported (bug#23769).
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#23750; Package emacs. (Mon, 20 Jun 2016 04:23:01 GMT)
Message #62 received at 23750 <at> debbugs.gnu.org (full text, mbox):
>>>>> Glenn Morris <rgm <at> gnu.org> writes:
> Then why is master STILL advertising itself as the forerunner to 25.2? Why
> are we closing a bunch of bugs as "fixed in 25.2" if they won't be fixed
> till 26.1?
I guess to avoid having the reported version number in bug reports keep
jumping around? Master is really working toward 26.1 at this point.
Once we start working on 25.2, we should cherry-pick over all the fixes for
bugs that are marked "fixed in 25.2". Otherwise, they should be marked "fixed in
26.1".
--
John Wiegley GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com 60E1 46C4 BD1A 7AC1 4BA2
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#23750; Package emacs. (Mon, 20 Jun 2016 13:17:02 GMT)
Message #65 received at 23750 <at> debbugs.gnu.org (full text, mbox):
John Wiegley <jwiegley <at> gmail.com> writes:
>>>>>> Glenn Morris <rgm <at> gnu.org> writes:
>
>> Then why is master STILL advertising itself as the forerunner to 25.2? Why
>> are we closing a bunch of bugs as "fixed in 25.2" if they won't be fixed
>> till 26.1?
>
> I guess to avoid having the reported version number in bug reports keep
> jumping around? Master is really working toward 26.1 at this point.
>
> Once we start working on 25.2, we should cherry-pick over all the fixes for
> bugs that are marked "fixed in 25.2". Otherwise, they should be marked "fixed in
> 26.1".
Most bugs fixed in master are marked "fixed in 25.2" (since that is what
master is announcing itself as being the forerunner to), so that doesn't
make much sense, I'm afraid.
Which is what Glenn is telling us, once again. I really don't
understand why master hasn't been changed to say that it's the
forerunner to 26.1.
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#23750; Package emacs. (Mon, 20 Jun 2016 14:40:01 GMT)
Message #68 received at 23750 <at> debbugs.gnu.org (full text, mbox):
> Cc: 23750 <at> debbugs.gnu.org, monnier <at> IRO.UMontreal.CA, sdl.web <at> gmail.com
> From: Dmitry Gutov <dgutov <at> yandex.ru>
> Date: Mon, 20 Jun 2016 05:51:06 +0300
This all sounds like my response is not welcome, but in that case why
did you ask the question?
Anyway:
> So this is not a bug in Emacs, but a diagnostic facility to let bugs
> in applications be discovered?
>
> It's a bug. Accepting invalid input and behaving badly with it is definitely a bug.
No, the bug is where the invalid input is generated in the first
place. Each API has its contract; if you violate the contract, you
invoke undefined behavior.
> If this is what you need, why not simply test the payload for being a
> unibyte string? There is a function, multibyte-string-p, for that.
>
> There are a lot of variables to test (see the comment above the mapconcat call).
Looks like mapc will be able to deal with that. Or just use concat,
and test the result with multibyte-string-p before sending. Or encode
it with UTF-8, if it is not unibyte already.
Btw, I don't think the comment which explains why we started using
mapconcat is accurate these days. It was written before the move to
Unicode in Emacs 23, but we stopped converting raw bytes into Latin-1
characters in Emacs 23 and later. So maybe we should just go back to
using concat (with erroring out, if the result is multibyte, and/or
maybe with replacing 'length' with 'string-bytes').
Bottom line: like I said, there should be no reason to use
string-*-unibyte in modern Emacs code on the url-http level or higher
(maybe not at all). Its use is a sign of some basic misunderstanding,
or a bug elsewhere, or remnant of old problems that no longer exist.
So I think we should reconsider the solution on master as well.
> I'm fine either way, but my patch changes two characters, and yours will be longer.
I don't think the quality of a change should be judged by the number
of characters in the patch. That is a very strange criterion, to say
the least. It would mean, for example, that changes with comments are
worse than changes without comments, or that saving newlines in C code
(which makes the code less readable) is a virtue.
> And you'll have to come up with the error message(s).
Are you saying you like the error message from string-to-unibyte?
Cannot convert 123th character to unibyte
Doesn't really strike me as something that a user or an average
developer will understand. I thought you wanted something more
human-readable, like
Invalid multibyte text in HTTP request %s
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#23750; Package emacs. (Mon, 20 Jun 2016 14:41:01 GMT)
Message #71 received at 23750 <at> debbugs.gnu.org (full text, mbox):
> From: Leo Liu <sdl.web <at> gmail.com>
> Date: Mon, 20 Jun 2016 08:15:26 +0800
>
> > This particular bug came from this:
> >
> > "Content-length: " (number-to-string (length url-http-data))
> >
> > Which gives wrong value when url-http-data is multibyte (it should be
> > length in bytes). So then, the HTTP server on the other side saw the
> > wrong body length and truncated the body when reading the request.
>
> As Dmitry mentioned earlier json-encode in 25.1 produces multibyte
> strings and makes it easier to hit this bug when consuming JSON API's.
> There are three parties that are suspicious: 1) JSON API server 2)
> JSON.el 3) URL. It took me a while to realise it's URL's fault IOW the
> bug isn't easy to debug. This is somewhat related to changes brought in
> by 25.1.
I understand that url-http expects unibyte strings. So my suggestion
is to test that, and signal an error if the requirement is violated,
with an error message text that could be understood by users and
developers.
Alternatively, we could encode multibyte strings in UTF-8, if we want
to attempt to silently cope with such strings.
In any case, using string-*-unibyte functions for that is not needed,
and I'm quite sure their use in this case is a left-over from an era
long gone.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#23750; Package emacs. (Mon, 20 Jun 2016 14:44:01 GMT)
Message #74 received at 23750 <at> debbugs.gnu.org (full text, mbox):
> From: John Wiegley <jwiegley <at> gmail.com>
> Date: Sun, 19 Jun 2016 21:22:25 -0700
> Cc: 23750 <at> debbugs.gnu.org, Dmitry Gutov <dgutov <at> yandex.ru>, sdl.web <at> gmail.com,
> monnier <at> IRO.UMontreal.CA
>
> Once we start working on 25.2, we should cherry-pick over all the fixes for
> bugs that are marked "fixed in 25.2".
I don't think this is practical. The only practical way is to cut a
new release branch off master.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#23750; Package emacs. (Mon, 20 Jun 2016 14:55:01 GMT)
Message #77 received at 23750 <at> debbugs.gnu.org (full text, mbox):
On 06/20/2016 05:38 PM, Eli Zaretskii wrote:
> This all sounds like my response is not welcome, but in that case why
> did you ask the question?
I was kind of hoping for "yes, let's get it into 25.1!"? :)
> No, the bug is where the invalid input is generated in the first
> place. Each API has its contract; if you violate the contract, you
> invoke undefined behavior.
It's a bug in the API, or bad API, if you will. It needs stricter
contract, and the submitted patch added it.
Or to look at it another way, the current contract allows url-http-data
to be multibyte, because the requirement to the contrary is not
documented anywhere that I can see. The variable is simply undocumented.
>> If this is what you need, why not simply test the payload for being a
>> unibyte string? There is a function, multibyte-string-p, for that.
>>
>> There are a lot of variables to test (see the comment above the mapconcat call).
>
> Looks like mapc will be able to deal with that. Or just use concat,
> and test the result with multibyte-string-p before sending. Or encode
> it with UTF-8, if it is not unibyte already.
I don't know if we want to be so permissive that we'll encode to UTF-8
silently.
> Btw, I don't think the comment which explains why we started using
> mapconcat is accurate these days. It was written before the move to
> Unicode in Emacs 23, but we stopped converting raw bytes into Latin-1
> characters in Emacs 23 and later. So maybe we should just go back to
> using concat (with erroring out, if the result is multibyte, and/or
> maybe with replacing 'length' with 'string-bytes').
Better error out: the payload's encoding is something only the caller
should be concerned with. Unless we're fine with the users assuming that
Emacs's internal encoding is close enough to UTF-8.
> Bottom line: like I said, there should be no reason to use
> string-*-unibyte in modern Emacs code on the url-http level or higher
> (maybe not at all). Its use is a sign of some basic misunderstanding,
> or a bug elsewhere, or remnant of old problems that no longer exist.
> So I think we should reconsider the solution on master as well.
I don't mind. Would you advocate for having this fix on emacs-25 if I
implement it the way you described?
>> And you'll have to come up with the error message(s).
>
> Are you saying you like the error message from string-to-unibyte?
>
> Cannot convert 123th character to unibyte
It's an order of magnitude better than what was before (no error and
silent corruption), but yes, there is room for improvement.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#23750; Package emacs. (Mon, 20 Jun 2016 15:05:01 GMT)
Message #80 received at 23750 <at> debbugs.gnu.org (full text, mbox):
> Cc: 23750 <at> debbugs.gnu.org, monnier <at> IRO.UMontreal.CA, sdl.web <at> gmail.com
> From: Dmitry Gutov <dgutov <at> yandex.ru>
> Date: Mon, 20 Jun 2016 17:54:23 +0300
>
> On 06/20/2016 05:38 PM, Eli Zaretskii wrote:
>
> > This all sounds like my response is not welcome, but in that case why
> > did you ask the question?
>
> I was kind of hoping for "yes, let's get it into 25.1!"? :)
I'm not that kind of guy, as you know ;-)
> > Bottom line: like I said, there should be no reason to use
> > string-*-unibyte in modern Emacs code on the url-http level or higher
> > (maybe not at all). Its use is a sign of some basic misunderstanding,
> > or a bug elsewhere, or remnant of old problems that no longer exist.
> > So I think we should reconsider the solution on master as well.
>
> I don't mind. Would you advocate for having this fix on emacs-25 if I
> implement it the way you described?
A single test and an error message is safe enough to go to emacs-25,
yes.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#23750; Package emacs. (Mon, 20 Jun 2016 17:17:01 GMT)
Message #83 received at 23750 <at> debbugs.gnu.org (full text, mbox):
On 06/20/2016 05:38 PM, Eli Zaretskii wrote:
> Or just use concat,
> and test the result with multibyte-string-p before sending.
Actually, here's a reason why we might prefer not to replace
string-as/to-unibyte with multibyte-string-p: string-to-unibyte works
fine if the string's contents only contain ASCII/8-bit characters, even
if the string itself is multibyte. But multibyte-string-p returns t
for such strings anyway.
So doing like you suggest might make some (arguably not well-written)
programs fail, which otherwise could function fine, provided they only
operate on ASCII strings. And having a multibyte string with ASCII-only
contents is fairly common when the string is produced with
buffer-substring from a source code buffer.
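For illustration, a small sketch of that situation (the helper name `my-buffer-text` is made up): text copied out of an ordinary buffer has the multibyte representation even when it contains only ASCII, yet it converts to unibyte without error.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-buffer-text (text)
  "Insert TEXT into a temporary buffer and return the buffer contents."
  (with-temp-buffer
    (insert text)
    (buffer-string)))

(let ((s (my-buffer-text "var x = 1;")))
  (list (multibyte-string-p s)                       ; multibyte representation
        (= (length s) (string-bytes s))              ; still one byte per character
        (multibyte-string-p (string-to-unibyte s)))) ; conversion succeeds
;; => (t t nil)
```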
While it might be good to discourage this kind of programming practice
(that doesn't handle non-ASCII text properly), it seems like this would
be better for master rather than the impending release.
WDYT?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#23750; Package emacs. (Mon, 20 Jun 2016 20:20:01 GMT)
Message #86 received at 23750 <at> debbugs.gnu.org (full text, mbox):
> Cc: 23750 <at> debbugs.gnu.org, monnier <at> IRO.UMontreal.CA, sdl.web <at> gmail.com
> From: Dmitry Gutov <dgutov <at> yandex.ru>
> Date: Mon, 20 Jun 2016 20:16:37 +0300
>
> On 06/20/2016 05:38 PM, Eli Zaretskii wrote:
>
> > Or just use concat,
> > and test the result with multibyte-string-p before sending.
>
> Actually, here's a reason why we might prefer not to replace
> string-as/to-unibyte with multibyte-string-p: string-to-unibyte works
> fine if the string's contents only contain ASCII/8-bit characters, even
> if the string itself is multibyte. But multibyte-string-p returns t
> for such strings anyway.
We can replace the call to multibyte-string-p with a comparison of
what 'length' and 'string-bytes' return. That should overcome this
issue.
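A sketch of such a check, under the assumption that the whole request has already been assembled into one string; the function name `my-check-request` and the message wording are illustrative, not necessarily what ends up in url-http.el:

```elisp
;; Hypothetical helper, for illustration only.
(defun my-check-request (request)
  "Return REQUEST if each of its characters occupies one byte.
Otherwise signal an error that names the offending request."
  (unless (= (length request) (string-bytes request))
    (error "Multibyte text in HTTP request: %s" request))
  request)
```

With this test an ASCII-only string passes whether or not its representation is multibyte, while a string containing a non-ASCII character is rejected with a readable message.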
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#23750; Package emacs. (Mon, 20 Jun 2016 20:28:02 GMT)
Message #89 received at 23750 <at> debbugs.gnu.org (full text, mbox):
On 06/20/2016 11:17 PM, Eli Zaretskii wrote:
> We can replace the call to multibyte-string-p with a comparison of
> what 'length' and 'string-bytes' return. That should overcome this
> issue.
Why not just call string-to-unibyte? Do you expect different results?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#23750; Package emacs. (Tue, 21 Jun 2016 02:32:01 GMT)
Message #92 received at 23750 <at> debbugs.gnu.org (full text, mbox):
> Cc: 23750 <at> debbugs.gnu.org, monnier <at> IRO.UMontreal.CA, sdl.web <at> gmail.com
> From: Dmitry Gutov <dgutov <at> yandex.ru>
> Date: Mon, 20 Jun 2016 23:27:01 +0300
>
> On 06/20/2016 11:17 PM, Eli Zaretskii wrote:
>
> > We can replace the call to multibyte-string-p with a comparison of
> > what 'length' and 'string-bytes' return. That should overcome this
> > issue.
>
> Why not just call string-to-unibyte?
Because (a) I don't want to see that function in our sources, ever,
and (b) you don't have any control on the error message it produces,
which is not appropriate for application-level checks.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#23750; Package emacs. (Tue, 21 Jun 2016 13:53:01 GMT)
Message #95 received at 23750 <at> debbugs.gnu.org (full text, mbox):
On 06/21/2016 05:30 AM, Eli Zaretskii wrote:
> Because (a) I don't want to see that function in our sources, ever,
> and (b) you don't have any control on the error message it produces,
> which is not appropriate for application-level checks.
Please take a look at the attachment. OK to install?
I recall John saying we shouldn't push any more changes to emacs-25.
[url-http-multibyte.diff (text/x-patch, attachment)]
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#23750; Package emacs. (Tue, 21 Jun 2016 15:21:01 GMT)
Message #98 received at 23750 <at> debbugs.gnu.org (full text, mbox):
> Cc: 23750 <at> debbugs.gnu.org, monnier <at> IRO.UMontreal.CA, sdl.web <at> gmail.com
> From: Dmitry Gutov <dgutov <at> yandex.ru>
> Date: Tue, 21 Jun 2016 16:51:59 +0300
>
> > Because (a) I don't want to see that function in our sources, ever,
> > and (b) you don't have any control on the error message it produces,
> > which is not appropriate for application-level checks.
>
> Please take a look at the attachment. OK to install?
Yes, but let's wait for John.
> I recall John saying we shouldn't push any more changes to emacs-25.
He did? John, this change is IMO safe for emacs-25. Is it OK to
push there?
Thanks.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#23750; Package emacs. (Wed, 22 Jun 2016 01:09:02 GMT)
Message #101 received at 23750 <at> debbugs.gnu.org (full text, mbox):
>>>>> Eli Zaretskii <eliz <at> gnu.org> writes:
> He did? John, this change is IMO safe for emacs-25. Is it OK to push there?
If you think it's safe, Eli, then I'm good with it.
--
John Wiegley GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com 60E1 46C4 BD1A 7AC1 4BA2
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#23750; Package emacs. (Wed, 22 Jun 2016 02:38:01 GMT)
Message #104 received at 23750 <at> debbugs.gnu.org (full text, mbox):
> From: John Wiegley <jwiegley <at> gmail.com>
> Cc: Dmitry Gutov <dgutov <at> yandex.ru>, 23750 <at> debbugs.gnu.org, monnier <at> IRO.UMontreal.CA, sdl.web <at> gmail.com
> Date: Tue, 21 Jun 2016 18:08:44 -0700
>
> >>>>> Eli Zaretskii <eliz <at> gnu.org> writes:
>
> > He did? John, this change is IMO safe for emacs-25. Is it OK to push there?
>
> If you think it's safe, Eli, then I'm good with it.
OK, thanks.
Reply sent to Dmitry Gutov <dgutov <at> yandex.ru>: You have taken responsibility. (Wed, 22 Jun 2016 18:22:02 GMT)
Notification sent to Leo Liu <sdl.web <at> gmail.com>: bug acknowledged by developer. (Wed, 22 Jun 2016 18:22:02 GMT)
Message #109 received at 23750-done <at> debbugs.gnu.org (full text, mbox):
On 06/22/2016 04:08 AM, John Wiegley wrote:
> If you think it's safe, Eli, then I'm good with it.
Thanks!
Pushed, and closing.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#23750; Package emacs. (Thu, 23 Jun 2016 17:16:02 GMT)
Message #112 received at 23750 <at> debbugs.gnu.org (full text, mbox):
John Wiegley wrote:
>> Then why is master STILL advertising itself as the forerunner to 25.2? Why
>> are we closing a bunch of bugs as "fixed in 25.2" if they won't be fixed
>> till 26.1?
>
> I guess to avoid having the reported version number in bug reports keep
> jumping around? Master is really working toward 26.1 at this point.
This doesn't make any sense to me. (And why are you guessing? Isn't
there a plan?)
> Once we start working on 25.2, we should cherry-pick over all the fixes for
> bugs that are marked "fixed in 25.2". Otherwise, they should be marked "fixed in
> 26.1".
I don't think that will work well, but good luck with it.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#23750; Package emacs. (Fri, 01 Jul 2016 20:52:02 GMT)
Message #115 received at 23750 <at> debbugs.gnu.org (full text, mbox):
>>>>> Lars Ingebrigtsen <larsi <at> gnus.org> writes:
> Most bugs fixed in master are marked "fixed in 25.2" (since that is what
> master is announcing itself as being the forerunner to), so that doesn't
> make much sense, I'm afraid.
>
> Which is what Glenn is telling us, once again. I really don't understand why
> master hasn't been changed to say that it's the forerunner to 26.1.
The last time we had our long discussion about what the various branches mean,
the conclusion was that emacs-25 is for the next release, and master is for
all other work.
Most people did NOT want master to be toward the next release (25.2), as that
leaves nowhere for changes meant for 26 only.
However, this also leaves nowhere for fixes to go that are only for 25.2. But
since no additional branches were desired, the compromise was that both types
of changes will go into master, and we will backport certain changes into
emacs-25 toward 25.2 after the release.
Marking a bug as "fixed in 25.2" seems wrong to me, because it implies a
guarantee that the fix will get cherry picked into emacs-25 after 25.1 is
released, although I highly doubt this will happen for every such fix. There
is just too much work to be done.
What we should do is mark every commit intended for 25.2 in a way that lets us
find them all automatically after the release, with a link to the bugs they
fix so that we can safely state "fixed in 25.2". Since this hasn't happened, I
imagine it will be a very manual process, and will be missing several of those
fixes.
This is why I personally argued for 3 branches, but it's not what the people
doing the real work wanted, so this is what we have.
After 25.1, we'll just have to see what happens to emacs-25 and to the
bug-tracker. I imagine several of the "fixed in 25.2" bugs will need to be
adjusted to "fixed in 26.1".
--
John Wiegley GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com 60E1 46C4 BD1A 7AC1 4BA2
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sat, 30 Jul 2016 11:24:03 GMT)