GNU bug report logs - #30408
24.5; (format "%x" large-number) produces incorrect results

Package: emacs

Reported by: David Sitsky <david.sitsky <at> gmail.com>

Date: Sat, 10 Feb 2018 07:03:02 UTC

Severity: wishlist

Found in version 24.5

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 30408 in the body.
You can then email your comments to 30408 AT debbugs.gnu.org in the normal way.

Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#30408; Package emacs. (Sat, 10 Feb 2018 07:03:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to David Sitsky <david.sitsky <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Sat, 10 Feb 2018 07:03:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: David Sitsky <david.sitsky <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: 24.5; (format "%x" large-number) produces incorrect results
Date: Sat, 10 Feb 2018 17:22:53 +1100

[Message part 1 (text/plain, inline)]

I wrote this originally on
https://emacs.stackexchange.com/questions/38710/why-does-format-x-some-large-number-produces-incorrect-results
and a poster recommended I mention this here.

I wanted the hexadecimal string for a large integer such as below:

(format "%x" 2738188573457603759)

This returns 2600000000f95c00, which is incorrect; it should be
2600000000f95caf.

The value of most-positive-fixnum on my box is 0x1fffffffffffffff which is
less than the number I'm supplying above.

As a user I'm a bit baffled about what is happening. The manual indicates
that integers outside this range are converted to floating-point numbers,
which is a concern for precision, and I suspect this is what is biting me
here.

I should have known there was an issue with this number, since I normally
evaluate such numbers directly using eval-last-sexp, and this time it didn't
show the octal/hex variants. :)

I wonder why Emacs Lisp doesn't support bignums by default, so precision
would not be an issue?
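
The floating-point suspicion can be checked with exact integer arithmetic. The following is a minimal sketch in Python, chosen only because its integers are arbitrary precision and its float is an IEEE 754 double. It assumes the reader converted the out-of-range literal to a C double; the helper name hex_via_double is made up for illustration.

```python
# Sketch: reproduce the reported hex string by rounding through a double.
# Assumption: the Lisp reader turned the out-of-range literal into a C double.

MOST_POSITIVE_FIXNUM = 0x1FFFFFFFFFFFFFFF  # value reported above for this build


def hex_via_double(n: int) -> str:
    """Format n in hex after a lossy round trip through a 64-bit float."""
    return format(int(float(n)), "x")


n = 2738188573457603759
print(n > MOST_POSITIVE_FIXNUM)  # True: the literal does not fit in a fixnum
print(format(n, "x"))            # 2600000000f95caf (exact value)
print(hex_via_double(n))         # 2600000000f95c00 (value after rounding)
```

A double carries 53 significant bits and this value needs 62, so the low 9 bits (0x0af) are rounded away, which matches the string reported above.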

[Message part 2 (text/html, inline)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30408; Package emacs. (Mon, 12 Feb 2018 02:50:02 GMT) Full text and rfc822 format available.

Message #8 received at 30408 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: David Sitsky <david.sitsky <at> gmail.com>
Cc: 30408 <at> debbugs.gnu.org
Subject: Re: 24.5; (format "%x" large-number) produces incorrect results
Date: Sun, 11 Feb 2018 18:49:33 -0800

> I wonder why Emacs Lisp doesn't support bignums by default, so precision
> would not be an issue?

Nobody has gotten around to implementing it. It'd be nice if someone did. It is 
a wishlist item, so I'll mark this bug report as wishlist priority.

Severity set to 'wishlist' from 'normal' Request was from Paul Eggert <eggert <at> cs.ucla.edu> to control <at> debbugs.gnu.org. (Mon, 12 Feb 2018 02:51:01 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30408; Package emacs. (Mon, 12 Feb 2018 05:05:01 GMT) Full text and rfc822 format available.

Message #13 received at 30408 <at> debbugs.gnu.org (full text, mbox):

From: Drew Adams <drew.adams <at> oracle.com>
To: Paul Eggert <eggert <at> cs.ucla.edu>, David Sitsky <david.sitsky <at> gmail.com>
Cc: 30408 <at> debbugs.gnu.org
Subject: RE: bug#30408: 24.5; (format "%x" large-number) produces incorrect results
Date: Sun, 11 Feb 2018 20:56:32 -0800 (PST)

> > I wonder why Emacs Lisp doesn't support bignums by default, so
> > precision would not be an issue?
> 
> Nobody has gotten around to implementing it. It'd be nice if someone did.
> It is a wishlist item, so I'll mark this bug report as wishlist priority.

It's probably a duplicate bug.  This has been requested
in the past - perhaps more than once.  And there has
been some discussion of it.  I don't have a reference,
however.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30408; Package emacs. (Sun, 18 Feb 2018 01:09:02 GMT) Full text and rfc822 format available.

Message #16 received at 30408 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Drew Adams <drew.adams <at> oracle.com>, David Sitsky <david.sitsky <at> gmail.com>
Cc: 30408 <at> debbugs.gnu.org
Subject: Re: bug#30408: 24.5; (format "%x" large-number) produces incorrect results
Date: Sat, 17 Feb 2018 17:08:22 -0800

[Message part 1 (text/plain, inline)]

This kind of bug has bitten me before, so I think it's worthwhile for Emacs to 
defend against it better. Proposed patch attached. Although this patch doesn't 
address the major problem here (which is that Emacs lacks bignums), it does 
cause Emacs to respond better to large numbers, by not losing information when 
it is reading or printing integers.

With this patch, one cannot evaluate (format "%x" 2738188573457603759) because 
the Lisp reader signals an error when it sees the unrepresentable integer 
2738188573457603759, instead of silently substituting a different number. 
Another example: (format "%d" 18446744073709551616) now returns 
"18446744073709551616" instead of the quite-wrong "9223372036854775807".

[0001-Avoid-losing-info-when-converting-integers.patch (text/x-patch, attachment)]
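
One plausible reading of the old "%d" result is that 18446744073709551616 (2**64) was read as a float and then clamped to the largest signed 64-bit integer when printed. The sketch below models that reading in Python. Both function names are hypothetical, and this is an illustration of the arithmetic, not the Emacs implementation.

```python
# Hypothetical model of the two "%d" behaviors described in the message.
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1


def clamped_percent_d(x: float) -> str:
    """Assumed old behavior: saturate into signed 64-bit range, then print."""
    return str(max(INT64_MIN, min(INT64_MAX, int(x))))


def lossless_percent_d(x: float) -> str:
    """Behavior described for the patch: print the float's exact integer value."""
    return str(int(x))


x = float(18446744073709551616)  # 2**64 is exactly representable as a double
print(clamped_percent_d(x))      # 9223372036854775807
print(lossless_percent_d(x))     # 18446744073709551616
```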

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30408; Package emacs. (Sun, 18 Feb 2018 17:15:01 GMT) Full text and rfc822 format available.

Message #19 received at 30408 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: 30408 <at> debbugs.gnu.org
Cc: Emacs-devel <at> gnu.org
Subject: Re: Checking for loss of information on integer conversion
Date: Sun, 18 Feb 2018 19:14:33 +0200

> From: Paul Eggert <eggert <at> cs.ucla.edu>
> Date: Sat, 17 Feb 2018 17:27:37 -0800
> 
> Second, although Emacs still reads large integers like 18446744073709551616 as 
> if they were floating-point, it now signals an error if information is lost in 
> the process. For example, the number 18446744073709551615 now causes the reader 
> to signal an error, since it cannot be represented exactly either as a fixnum or 
> as a floating-point number. If you want inexact representation, you can append 
> ".0" or "e0" to the integer.

I don't think I like this particular effect of the proposed changes.
At the very least there should be an easy way of avoiding the error,
when the number is not under the control of a Lisp program.  E.g., we
represent file sizes as floats if the value overflows an Emacs
integer, but we definitely don't want to risk signaling errors due to
that, e.g. in the likes of ls-lisp.el (and in general any program that
calls file-attributes).

More generally, why signaling an error by default in this case is a
good idea?  Emacs Lisp is not used to write software that controls
aircraft and spaceships, and probably never will, so why shouldn't we
let the programmer request this feature when they need it?  That would
be similar to behavior of equivalent constructs in C programs, where
the inexact exception is AFAIK masked by default.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30408; Package emacs. (Sun, 18 Feb 2018 20:05:02 GMT) Full text and rfc822 format available.

Message #22 received at 30408 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 30408 <at> debbugs.gnu.org, emacs-devel <at> gnu.org
Subject: Re: Checking for loss of information on integer conversion
Date: Sun, 18 Feb 2018 12:04:20 -0800

[Message part 1 (text/plain, inline)]

Eli Zaretskii wrote:

> Emacs Lisp is not used to write software that controls
> aircraft and spaceships

Actually, I maintain Emacs Lisp code that controls timestamps used in aircraft 
and spaceships. I'm not saying that Emacs itself runs the aircraft and 
spaceships, but it definitely is used to develop software and data used there. 
As luck would have it, I'm currently engaged in an email thread about time 
transfer between Earth and Mars (yes, this is really a thing and people are 
trying to do it with millisecond precision) that is related to a project where I 
regularly use Emacs Lisp. See the thread containing this message:

https://mm.icann.org/pipermail/tz/2018-February/026257.html

> More generally, why signaling an error by default in this case is a
> good idea? ...  That would
> be similar to behavior of equivalent constructs in C programs

Sure, and C compilers typically issue diagnostics for situations similar to 
what's in Bug#30408. For example, for this C program:

int a = 18446744073709553664;

GCC issues a diagnostic, whereas for the similar Emacs Lisp program:

(setq b 18446744073709553664)

Emacs silently substitutes a number that is off by 2048. It's the latter 
behavior that causes the sort of problem seen in Bug#30408.

When people write a floating-point number they naturally expect it to have some 
fuzz. But when they write an integer they expect it to be represented exactly, 
and not to be rounded.  Emacs already reports an overflow error for the 
following code that attempts to use the same mathematical value:

(setq c #x10000000000000800)

so it's not like it would be a huge change to do something similar for decimal 
integers.
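
The "off by 2048" figure can be reproduced by rounding through a double. This is a small Python sketch, assuming IEEE 754 doubles with round-half-to-even; survives_double is an illustrative name, not an Emacs function.

```python
def survives_double(n: int) -> bool:
    """True if n is unchanged by a round trip through a 64-bit float."""
    return int(float(n)) == n


decimal_literal = 18446744073709553664
hex_literal = 0x10000000000000800  # the same mathematical value, 2**64 + 2048

print(decimal_literal == hex_literal)                 # True
print(decimal_literal - int(float(decimal_literal)))  # 2048
print(survives_double(2**64))                         # True
print(survives_double(2**64 - 1))                     # False
```

From 2**64 upward adjacent doubles are 4096 apart, so 2**64 + 2048 is an exact tie and rounds to the even neighbor, 2**64.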

When Emacs was originally developed, its integers were typically 28 bits (not 
counting sign) and floating-point numbers could typically represent integers 
exactly up to 53 bits (not counting sign), so the old Emacs behavior was 
somewhat defensible: although it didn't do bignums, at least it could represent 
integers nearly twice as wide as fixnums. However, nowadays Emacs integers 
typically have more precision than floating point numbers, and the old Emacs 
behavior is more likely to lead to counterintuitive results such as those 
described in Bug#30408.
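
The width argument can be made concrete. This Python sketch checks which fixnum widths fit inside the range where doubles represent every integer exactly. The 28-bit and 53-bit figures come from the paragraph above; the 61-bit figure is an assumption taken from the most-positive-fixnum value reported at the start of this thread, and both helper names are illustrative.

```python
DOUBLE_EXACT_BITS = 53  # integers up to 2**53 are all exactly representable


def fixnums_all_exact(fixnum_bits: int) -> bool:
    """True if every integer of that width (sign excluded) fits in a double."""
    return fixnum_bits <= DOUBLE_EXACT_BITS


def round_trips(n: int) -> bool:
    """True if n is unchanged by conversion to a 64-bit float and back."""
    return int(float(n)) == n


print(fixnums_all_exact(28))   # True: old fixnums were a subset of exact doubles
print(fixnums_all_exact(61))   # False: 64-bit fixnums exceed double precision
print(round_trips(2**53))      # True
print(round_trips(2**53 + 1))  # False: first integer a double cannot hold
print(round_trips(2**61 - 1))  # False: most-positive-fixnum itself is rounded
```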

On thinking about it in the light of your comments, I suppose it's confusing 
that the proposal used a new signal 'inexact', whereas it should just signal 
overflow. After all, that's what string_to_number already does for out-of-range 
hexadecimal integers. That issue is easily fixed. Revised patch attached.

[0001-Avoid-losing-info-when-converting-integers.patch (text/x-patch, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30408; Package emacs. (Sun, 18 Feb 2018 20:25:02 GMT) Full text and rfc822 format available.

Message #25 received at 30408 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 30408 <at> debbugs.gnu.org, emacs-devel <at> gnu.org
Subject: Re: Checking for loss of information on integer conversion
Date: Sun, 18 Feb 2018 22:24:34 +0200

> From: Paul Eggert <eggert <at> cs.ucla.edu>
> Cc: emacs-devel <at> gnu.org, 30408 <at> debbugs.gnu.org
> Date: Sun, 18 Feb 2018 12:04:20 -0800
> 
> > Emacs Lisp is not used to write software that controls
> > aircraft and spaceships
> 
> Actually, I maintain Emacs Lisp code that controls timestamps used in aircraft 
> and spaceships. I'm not saying that Emacs itself runs the aircraft and 
> spaceships, but it definitely is used to develop software and data used there. 
> As luck would have it, I'm currently engaged in an email thread about time 
> transfer between Earth and Mars (yes, this is really a thing and people are 
> trying to do it with millisecond precision) that is related to a project where I 
> regularly use Emacs Lisp. See the thread containing this message:

Interesting, but not really relevant to the issue at hand, IMO.  I was
talking about real-time control, not off-line calculations.  And I did
propose to have this feature as opt-in, so the kind of calculations
that transfer me to Mars could still be held safely and accurately.

> > More generally, why signaling an error by default in this case is a
> > good idea? ...  That would
> > be similar to behavior of equivalent constructs in C programs
> 
> Sure, and C compilers typically issue diagnostics for situations similar to 
> what's in Bug#30408. For example, for this C program:
> 
> int a = 18446744073709553664;
> 
> GCC issues a diagnostic, whereas for the similar Emacs Lisp program:
> 
> (setq b 18446744073709553664)
> 
> Emacs silently substitutes a number that is off by 2048.

I'm okay with flagging such constants during byte compilation.  I was
talking only about run-time diagnostics, not compile-time diagnostics.

> When people write a floating-point number they naturally expect it to have some 
> fuzz. But when they write an integer they expect it to be represented exactly, 
> and not to be rounded.

That is true, but Emacs has behaved the way it does today for many years,
and I'm worried about the breakage such a significant behavior change
could cause, including in our own code.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30408; Package emacs. (Sun, 18 Feb 2018 21:53:01 GMT) Full text and rfc822 format available.

Message #28 received at 30408 <at> debbugs.gnu.org (full text, mbox):

From: Drew Adams <drew.adams <at> oracle.com>
To: Paul Eggert <eggert <at> cs.ucla.edu>, Eli Zaretskii <eliz <at> gnu.org>
Cc: 30408 <at> debbugs.gnu.org, emacs-devel <at> gnu.org
Subject: RE: Checking for loss of information on integer conversion
Date: Sun, 18 Feb 2018 13:52:24 -0800 (PST)

Do you really need to send this thread to both the bug
list and emacs-devel?

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30408; Package emacs. (Fri, 09 Mar 2018 05:01:02 GMT) Full text and rfc822 format available.

Message #31 received at 30408 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 30408 <at> debbugs.gnu.org
Subject: Re: Checking for loss of information on integer conversion
Date: Thu, 8 Mar 2018 21:00:42 -0800

[Message part 1 (text/plain, inline)]

Since the qualms expressed on this topic had to do with converting strings to 
integers, I installed into master the noncontroversial part affecting conversion 
of integers to strings (see attached patch; it also fixes some minor glitches in 
the previous proposal). I'll think about the string-to-integer conversion a bit 
more and propose an updated patch for that.

[0001-Avoid-losing-info-when-formatting-integers.patch (text/x-patch, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30408; Package emacs. (Fri, 09 Mar 2018 08:24:01 GMT) Full text and rfc822 format available.

Message #34 received at 30408 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 30408 <at> debbugs.gnu.org
Subject: Re: Checking for loss of information on integer conversion
Date: Fri, 09 Mar 2018 10:22:58 +0200

> Cc: 30408 <at> debbugs.gnu.org
> From: Paul Eggert <eggert <at> cs.ucla.edu>
> Date: Thu, 8 Mar 2018 21:00:42 -0800
> 
> Since the qualms expressed on this topic had to do with converting strings to 
> integers, I installed into master the noncontroversial part affecting conversion 
> of integers to strings (see attached patch; it also fixes some minor glitches in 
> the previous proposal). I'll think about the string-to-integer conversion a bit 
> more and propose an updated patch for that.

Thanks.  May I suggest to add a couple of tests for this feature?

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30408; Package emacs. (Wed, 21 Mar 2018 19:14:01 GMT) Full text and rfc822 format available.

Message #37 received at 30408 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 30408 <at> debbugs.gnu.org
Subject: Re: Checking for loss of information on integer conversion
Date: Wed, 21 Mar 2018 12:13:49 -0700

[Message part 1 (text/plain, inline)]

On 03/09/2018 12:22 AM, Eli Zaretskii wrote:
> May I suggest to add a couple of tests for this feature?

Sure, I installed the attached.

[0001-Add-tests-for-Bug-30408.patch (text/x-patch, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30408; Package emacs. (Wed, 21 Mar 2018 19:30:02 GMT) Full text and rfc822 format available.

Message #40 received at 30408 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 30408 <at> debbugs.gnu.org
Subject: Re: Checking for loss of information on integer conversion
Date: Wed, 21 Mar 2018 21:29:30 +0200

> Cc: 30408 <at> debbugs.gnu.org
> From: Paul Eggert <eggert <at> cs.ucla.edu>
> Date: Wed, 21 Mar 2018 12:13:49 -0700
> 
> On 03/09/2018 12:22 AM, Eli Zaretskii wrote:
> > May I suggest to add a couple of tests for this feature?
> 
> Sure, I installed the attached.

Thanks!

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30408; Package emacs. (Tue, 27 Mar 2018 23:20:02 GMT) Full text and rfc822 format available.

Message #43 received at 30408 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Eli Zaretskii <eliz <at> gnu.org>, 30408 <at> debbugs.gnu.org
Subject: Re: Checking for loss of information on integer conversion
Date: Tue, 27 Mar 2018 16:19:21 -0700

[Message part 1 (text/plain, inline)]

Here's a patch that I hope addresses the main problem. The basic idea is 
to avoid the confusion exemplified in Bug#30408 by changing Emacs so 
that it ordinarily signals an error if it reads a program that contains 
an integer literal that is out of fixnum range. However, if the 
out-of-range literal is followed by '.' then Emacs continues to silently 
convert it to floating-point; this is intended as an escape hatch for 
any programs that need the old behavior (I expect this'll be rare). 
Thus, on 32-bit Emacs, plain '536870912' in a program causes Emacs to 
signal an overflow while loading the program, whereas '536870912.' is 
treated as a floating-point number as before. (On 64-bit Emacs, the same 
two literals are both integers, as before.)
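
The rule described above can be modeled in a few lines. This is a toy Python sketch, not the Emacs reader: the function name and the value_bits parameter are hypothetical, and the widths used (29 value bits for a 32-bit build, 61 for a 64-bit build) are assumptions consistent with the literals in this message.

```python
def read_integer_literal(token: str, value_bits: int):
    """Toy model: in-range literals stay integers; out-of-range ones
    overflow unless a trailing '.' requests conversion to float."""
    most_positive = 2**value_bits - 1
    most_negative = -(2**value_bits)
    has_dot = token.endswith(".")
    value = int(token[:-1] if has_dot else token)
    if most_negative <= value <= most_positive:
        return value             # fits in a fixnum, with or without the '.'
    if has_dot:
        return float(value)      # escape hatch: silent conversion to float
    raise OverflowError(f"integer literal out of fixnum range: {token}")


print(read_integer_literal("536870912", 61))   # 536870912 (integer)
print(read_integer_literal("536870912.", 61))  # 536870912 (integer)
print(read_integer_literal("536870912.", 29))  # 536870912.0 (float)
try:
    read_integer_literal("536870912", 29)
except OverflowError as err:
    print("overflow:", err)
```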

Unlike my previous proposal, this patch does not affect the behavior of 
string-to-integer. As I understand it, that was a primary source of 
qualms about the previous proposal.

I've tested this on both 32- and 64-bit Emacs on master. This patch has 
helped me to find a couple of integer portability bugs which I already 
fixed on master.

[0001-Lisp-reader-now-checks-for-integer-overflow.patch (text/x-patch, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30408; Package emacs. (Thu, 29 Mar 2018 11:12:01 GMT) Full text and rfc822 format available.

Message #46 received at 30408 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 30408 <at> debbugs.gnu.org
Subject: Re: Checking for loss of information on integer conversion
Date: Thu, 29 Mar 2018 14:11:10 +0300

> From: Paul Eggert <eggert <at> cs.ucla.edu>
> Date: Tue, 27 Mar 2018 16:19:21 -0700
> 
> Here's a patch that I hope addresses the main problem. The basic idea is 
> to avoid the confusion exemplified in Bug#30408 by changing Emacs so 
> that it ordinarily signals an error if it reads a program that contains 
> an integer literal that is out of fixnum range. However, if the 
> out-of-range literal is followed by '.' then Emacs continues to silently 
> convert it to floating-point; this is intended as an escape hatch for 
> any programs that need the old behavior (I expect this'll be rare). 

I'd suggest, for a good measure, to have a variable which would force
the conversion to floats, avoiding an error even without the trailing
period.  We can later remove that variable, or make it a no-op, if the
danger of breaking existing code turns out low or non-existent.

Reply sent to Paul Eggert <eggert <at> cs.ucla.edu>:
You have taken responsibility. (Thu, 29 Mar 2018 18:10:01 GMT) Full text and rfc822 format available.

Notification sent to David Sitsky <david.sitsky <at> gmail.com>:
bug acknowledged by developer. (Thu, 29 Mar 2018 18:10:01 GMT) Full text and rfc822 format available.

Message #51 received at 30408-done <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 30408-done <at> debbugs.gnu.org, David Sitsky <david.sitsky <at> gmail.com>
Subject: Re: Checking for loss of information on integer conversion
Date: Thu, 29 Mar 2018 11:09:45 -0700

[Message part 1 (text/plain, inline)]

On 03/29/2018 04:11 AM, Eli Zaretskii wrote:
> I'd suggest, for a good measure, to have a variable which would force
> the conversion to floats, avoiding an error even without the trailing
> period.  We can later remove that variable, or make it a no-op, if the
> danger of breaking existing code turns out low or non-existent.

OK, I did that, by installing the attached into master, after installing 
the proposed patch.

As a result, unless the user sets the new variable 
read-integer-overflow-as-float, the Lisp reader now rejects the program 
(format "%x" 2738188573457603759) by signaling an overflow error. As 
this was the basis of the original bug report, I'm marking the bug as done.
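
The effect of the new variable on the original example can be sketched as follows. This is a Python model under stated assumptions, not Emacs code: the overflow_as_float parameter stands in for read-integer-overflow-as-float, read_integer is a made-up name, and the fixnum limit is the 64-bit value reported at the start of this thread.

```python
MOST_POSITIVE_FIXNUM = 2**61 - 1  # 0x1fffffffffffffff, as reported above


def read_integer(token: str, overflow_as_float: bool = False):
    """Toy model: reject out-of-range integers unless the flag is set."""
    value = int(token)
    if -MOST_POSITIVE_FIXNUM - 1 <= value <= MOST_POSITIVE_FIXNUM:
        return value
    if overflow_as_float:
        return float(value)  # old behavior: silent, possibly lossy
    raise OverflowError(f"integer out of fixnum range: {token}")


literal = "2738188573457603759"
try:
    read_integer(literal)  # default: the program is rejected
except OverflowError as err:
    print("overflow:", err)

# With the flag set, the old lossy result comes back.
print(format(int(read_integer(literal, overflow_as_float=True)), "x"))
# 2600000000f95c00
```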

[0001-New-experimental-variable-read-integer-overflow-as-f.patch (text/x-patch, attachment)]

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Fri, 27 Apr 2018 11:24:03 GMT) Full text and rfc822 format available.

bug unarchived. Request was from Bastien <bzg <at> gnu.org> to control <at> debbugs.gnu.org. (Sun, 06 May 2018 06:59:01 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30408; Package emacs. (Sun, 06 May 2018 09:58:01 GMT) Full text and rfc822 format available.

Message #58 received at 30408 <at> debbugs.gnu.org (full text, mbox):

From: Bastien <bzg <at> gnu.org>
To: 30408 <at> debbugs.gnu.org
Subject: More context in `read-integer-overflow-as-float'?
Date: Sun, 06 May 2018 08:29:10 +0200

As suggested in `read-integer-overflow-as-float' docstring:

  Non-nil means ‘read’ quietly treats an out-of-range integer as
  floating point.  Nil (the default) means signal an overflow unless
  the integer ends in ‘.’.  This variable is experimental; email
  30408 <at> debbugs.gnu.org if you need it.

(Note that the last sentence is a bit ambiguous: does "if you need it"
refer to sending an email or to the variable?)

Apparently I need (setq read-integer-overflow-as-float t) since I've
been hit by bugs here.  But what are the consequences of setting this
to `t', aside from silencing a few errors?  What is the usefulness of
not setting it to `t'?  What is the experiment about?  Can I get rid
of this setting when the experiment is over?

Since `read-integer-overflow-as-float' is the entry point for those
who are not aware of the experiment, some guidance in the docstring
might be useful.

Thanks,

-- 
 Bastien

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 03 Jun 2018 11:24:03 GMT) Full text and rfc822 format available.

This bug report was last modified 7 years and 32 days ago.

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.
