GNU bug report logs - #8435
misuse of error ("...%d...", ...) on 64-bit hosts

Package: emacs;

Reported by: Paul Eggert <eggert <at> cs.ucla.edu>

Date: Wed, 6 Apr 2011 20:00:02 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 8435 in the body.
You can then email your comments to 8435 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox

Report forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#8435; Package emacs. (Wed, 06 Apr 2011 20:00:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Paul Eggert <eggert <at> cs.ucla.edu>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Wed, 06 Apr 2011 20:00:03 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: bug-gnu-emacs <at> gnu.org
Subject: misuse of error ("...%d...", ...) on 64-bit hosts
Date: Wed, 06 Apr 2011 12:59:09 -0700

In the Emacs trunk, src/dispnew.c contains this:

    error ("Device %d is not a termcap terminal device", t->id);

t->id is of type 'int', but the "error" routine formats %d as if it
were of type EMACS_INT.  This works on a typical 32-bit machine, but
on a 64-bit machine these two types have different representations,
and the above call relies on undefined behavior: it might work and it
might not.

The above bug can easily be fixed by casting t->id to EMACS_INT, but
other instances of the problem are not so easy.  For example,
src/term.c has this:

    maybe_fatal (must_succeed, terminal,
                 "Screen size %dx%d is too small",
                 "Screen size %dx%d is too small",
                 FrameCols (tty), FrameRows (tty));

where FrameCols and FrameRows return 'int'.  Here, if MUST_SUCCEED is
true, maybe_fatal calls 'printf' and works; but if MUST_SUCCEED is
false, maybe_fatal calls 'error' and might not work on a 64-bit machine.

I found these bugs by code inspection, and expect that there are
others like them.  Part of the problem is that it's confusing that
'error' treats format strings differently from 'printf'.  And partly
the problem is that there is currently no reliable way to catch common
programming mistakes like this.

I plan to fix this problem systematically, as follows.

  * Provide a convenient way to format EMACS_INT values using
    printf-like functions.

  * Change 'error' and similar functions so that they use
    printf-compatible format strings, and change their callers to
    format EMACS_INT values accordingly.

  * Mark 'error'-like functions with ATTRIBUTE_FORMAT_PRINTF, so that
    we can easily find other bugs like the above.

Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#8435; Package emacs. (Thu, 07 Apr 2011 07:34:01 GMT) Full text and rfc822 format available.

Message #8 received at 8435 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: 8435 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>
Subject: Re: bug#8435: misuse of error ("...%d...", ...) on 64-bit hosts
Date: Thu, 07 Apr 2011 00:33:44 -0700

[Message part 1 (text/plain, inline)]

Attached is the patch I'd like to install, after more testing.

This patch affects the Windows build by removing src/doprnt.c.

This patch assumes that vsnprintf works.  This is true of the oldest,
cruftiest host I could get my hands on (a Solaris 8 box; Sun itself
stopped fixing Solaris 8 more than two years ago).  However, if Emacs
is still supposed to run on even-older (roughly, pre-1999) platforms
that lack vsnprintf then I can add the gnulib vsnprintf module, which
will provide a vsnprintf replacement for these ancient hosts.  I
assume that vsnprintf works on Windows, so the presence or absence of
the vsnprintf module shouldn't matter for Windows.

[patch.txt.gz (application/x-gzip, attachment)]

Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#8435; Package emacs. (Thu, 07 Apr 2011 12:31:02 GMT) Full text and rfc822 format available.

Message #11 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Andy Moreton <andrewjmoreton <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: Re: bug#8435: misuse of error ("...%d...", ...) on 64-bit hosts
Date: Thu, 07 Apr 2011 13:29:39 +0100

On Thu 07 Apr 2011, Paul Eggert wrote:

> Attached is the patch I'd like to install, after more testing.
>
> This patch affects the Windows build by removing src/doprnt.c.
>
> This patch assumes that vsnprintf works.  This is true of the oldest,
> cruftiest host I could get my hands on (a Solaris 8 box; Sun itself
> stopped fixing Solaris 8 more than two years ago).  However, if Emacs
> is still supposed to run on even-older (roughly, pre-1999) platforms
> that lack vsnprintf then I can add the gnulib vsnprintf module, which
> will provide a vsnprintf replacement for these ancient hosts.  I
> assume that vsnprintf works on Windows, so the presence or absence of
> the vsnprintf module shouldn't matter for Windows.

You define a pEd macro to be inserted in the format string to print an
appropriately integer sized type. Would it not be better to use a C99
print format macro name from <inttypes.h> ?

It may be necesary to provide the macros where <inttypes.h> is not
available, but at least the macro names are well known.

    AndyM

Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#8435; Package emacs. (Thu, 07 Apr 2011 18:35:01 GMT) Full text and rfc822 format available.

Message #14 received at 8435 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 8435 <at> debbugs.gnu.org
Subject: Re: bug#8435: misuse of error ("...%d...", ...) on 64-bit hosts
Date: Thu, 07 Apr 2011 21:31:55 +0300

> Date: Thu, 07 Apr 2011 00:33:44 -0700
> From: Paul Eggert <eggert <at> cs.ucla.edu>
> 
> This patch assumes that vsnprintf works.

Bother: using `vsnprintf' here could cause problems when `error' is
used to format messages with non-ASCII characters.  `doprnt' was
careful in these cases, e.g., it would take care of truncating
multibyte strings only on character boundaries.  It also supported %c
format for converting a multibyte character to its integer
representation.  Can we trust `vsnprintf' to not misbehave in these
cases, and DTRT with the entire repertory of characters supported by
Emacs?

There's also an issue with old pre-C99 implementations of `vsnprintf'
which returned -1 when the buffer was too small.  However, I think
catering for those would be a trivial change in `verror', so that's
not a big deal.

> I assume that vsnprintf works on Windows, so the presence or absence
> of the vsnprintf module shouldn't matter for Windows.

Yes, the MinGW build will have no problem.  The MSVC build will need
the above-mentioned change in `verror', because the Microsoft
runtime's version of `vsnprintf' is not compatible with ISO C99.  But
that can be fixed later, if needed.

Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#8435; Package emacs. (Thu, 07 Apr 2011 20:44:01 GMT) Full text and rfc822 format available.

Message #17 received at 8435 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 8435 <at> debbugs.gnu.org
Subject: Re: bug#8435: misuse of error ("...%d...", ...) on 64-bit hosts
Date: Thu, 07 Apr 2011 13:43:39 -0700

On 04/07/2011 11:31 AM, Eli Zaretskii wrote:
> `doprnt' was careful in these cases, e.g., it would take care of truncating
> multibyte strings only on character boundaries.

Thanks for the review.  This problem should be OK, as the only place
where truncation can occur is in the call from vmessage to doprnt,
and the patched arranges for vmessage to do that truncation.

> It also supported %c format for converting a multibyte character to
> its integer representation.

This problem should be OK too.  I manually inspected all the places
that used the %c format.  All but one of them used %c only to convert
unibyte characters, so they're OK.  The only exception was in
charset_iso_charset_parameter, and I modified that function to convert
the multibyte character before passing it as a string to 'error'.

> There's also an issue with old pre-C99 implementations of `vsnprintf'
> which returned -1 when the buffer was too small.  However, I think
> catering for those would be a trivial change in `verror', so that's
> not a big deal.

Yes, thanks, I'll look into this.

Another issue has come up in further static analysis.  The vsnprintf
API does not work for strings longer than INT_MAX bytes.  For
vmessage's use of vsnprintf this is OK, since (for other reasons) a
frame title can't be that long.  However, for verror this is an
arbitrary limitation on typical 64-bit hosts.  I'll look into this,
and plan to propose a further patch to handle it.  Gnulib already
handles this situation, with its vasnprintf module, and most likely
I'll end up using that, or a variant of that.  This will most likely
affect the Windows port too.

Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#8435; Package emacs. (Fri, 08 Apr 2011 09:02:01 GMT) Full text and rfc822 format available.

Message #20 received at 8435 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 8435 <at> debbugs.gnu.org
Subject: Re: bug#8435: misuse of error ("...%d...", ...) on 64-bit hosts
Date: Fri, 08 Apr 2011 11:58:44 +0300

> Date: Thu, 07 Apr 2011 13:43:39 -0700
> From: Paul Eggert <eggert <at> cs.ucla.edu>
> CC: 8435 <at> debbugs.gnu.org
> 
> > It also supported %c format for converting a multibyte character to
> > its integer representation.
> 
> This problem should be OK too.  I manually inspected all the places
> that used the %c format.  All but one of them used %c only to convert
> unibyte characters, so they're OK.  The only exception was in
> charset_iso_charset_parameter, and I modified that function to convert
> the multibyte character before passing it as a string to 'error'.

But we are losing a valuable feature this way, I think.  From now on,
any code that needs to use %c for displaying a character codepoint
will need to convert it manually before calling the message functions.

Taking a step back, this issue is about possible bugs when int and
EMACS_INT data types are mixed up in the calls to functions that could
either call doprnt or printf and its ilk.  So I think it would be
better to fix these problems as follows:

  . Introduce a printf format conversion specifier for converting an
    EMACS_INT data type.

  . Fix all the direct and indirect callers of doprnt to use this new
    specifier when the argument is an EMACS_INT.

  . Fix doprnt to avoid overflow when EMACS_INT is a 64-bit type, if
    it could overflow.  (I don't see such a danger, but maybe I
    overlook something.)

You already did the first two.  So I think what my suggestion boils
down to fixing doprnt (if needed) instead of introducing vsnprintf,
and then all the additional problems caused by that introduction will
be gone.

So could you please tell what are the downsides of keeping doprnt
instead of introducing vsnprintf?

> Another issue has come up in further static analysis.  The vsnprintf
> API does not work for strings longer than INT_MAX bytes.  For
> vmessage's use of vsnprintf this is OK, since (for other reasons) a
> frame title can't be that long.  However, for verror this is an
> arbitrary limitation on typical 64-bit hosts.  I'll look into this,
> and plan to propose a further patch to handle it.

I don't think we have any reason to support strings longer than
INT_MAX in these functions.  They are used to display messages in the
echo area/minibuffer, so they can hardly be close to INT_MAX anyway.
We could simply document that and move on.  For bullet-proof code, we
could even check the length and truncate the string before passing it
to verror or its subroutines.

I also don't think we should remove message_nolog, even if it's
currently unused.  It's a useful function.  If someone feels badly
about having dead code, we could #ifdef it away, although I don't
think it matters for such a short function.

Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#8435; Package emacs. (Fri, 08 Apr 2011 23:35:01 GMT) Full text and rfc822 format available.

Message #23 received at 8435 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 8435 <at> debbugs.gnu.org
Subject: Re: bug#8435: misuse of error ("...%d...", ...) on 64-bit hosts
Date: Fri, 08 Apr 2011 16:34:12 -0700

On 04/08/2011 01:58 AM, Eli Zaretskii wrote:

> So I think it would be better to fix these problems as follows:
>   ...
>   . Fix doprnt to avoid overflow when EMACS_INT is a 64-bit type, if
>     it could overflow.  (I don't see such a danger, but maybe I
>     overlook something.)

That wouldn't work, not because doprnt overflows with EMACS_INT, but
because doprnt doesn't work with ordinary 'int': it treats all integer
arguments as if they were EMACS_INT, and this relies on unportable
va_arg behavior.

It's true that doprnt also has overflow problems on 64-bit hosts.
For example, it overruns a buffer when formatting a string whose
length doesn't fit in 'int'.  But that's a separate issue.

No doubt these problems could be worked around with sufficient
hacking, but why bother?  The main reason doprnt exists is that
vsnprintf didn't exist back when doprnt was written, so we had to
write it ourselves.  But now that we can rely on vsnprintf, let's use
it rather than continuing to maintain our reinvented buggy wheel.

> From now on, any code that needs to use %c for displaying a
> character codepoint will need to convert it manually before calling
> the message functions.

Yes, that's true.  It's not a problem now, since there's only one
occurrence of this situation in the Emacs source code, and it is easy
to treat it as a one-off.  If it turns into a problem, we can easily
address it, by replacing this sort of code:

    int codepoint = (whatever);
    error ("Invalid FINAL-CHAR %c, it should be `0'..`~'", codepoint);

with something like this:

    int codepoint = (whatever);
    error ("Invalid FINAL-CHAR %s, it should be `0'..`~'", cvt (codepoint));

where 'cvt' converts a codepoint to a string suitable for 'error'.

> I don't think we have any reason to support strings longer than
> INT_MAX in these functions.  They are used to display messages in the
> echo area/minibuffer, so they can hardly be close to INT_MAX anyway.
> We could simply document that and move on.  For bullet-proof code, we
> could even check the length and truncate the string before passing it
> to verror or its subroutines.

OK, thanks, then here's a further patch to do something along those
lines.  It reliably reports "memory full" when the resulting error
string is longer than INT_MAX; that's better than crashing, which is
what doprnt currently does.

* eval.c: Port to Windows vsnprintf (Bug#8435).
Include <limits.h>.
(SIZE_MAX): Define if the headers do not.
(verror): Do not give up if vsnprintf returns a negative count.
Instead, grow the buffer.  This ports to Windows vsnprintf, which
does not conform to C99.  Problem reported by Eli Zaretskii.
Also, simplify the allocation scheme, by avoiding the need for
calling realloc, and removing the ALLOCATED variable.
=== modified file 'src/eval.c'
--- src/eval.c	2011-04-07 05:19:50 +0000
+++ src/eval.c	2011-04-08 23:08:55 +0000
@@ -18,6 +18,7 @@


 #include <config.h>
+#include <limits.h>
 #include <setjmp.h>
 #include "lisp.h"
 #include "blockinput.h"
@@ -30,6 +31,10 @@
 #include "xterm.h"
 #endif

+#ifndef SIZE_MAX
+# define SIZE_MAX ((size_t) -1)
+#endif
+
 /* This definition is duplicated in alloc.c and keyboard.c.  */
 /* Putting it in lisp.h makes cc bomb out!  */

@@ -1978,36 +1983,37 @@
 {
   char buf[4000];
   size_t size = sizeof buf;
-  size_t size_max = (size_t) -1;
+  size_t size_max =
+    min (MOST_POSITIVE_FIXNUM, min (INT_MAX, SIZE_MAX - 1)) + 1;
   char *buffer = buf;
-  int allocated = 0;
   int used;
   Lisp_Object string;

   while (1)
     {
       used = vsnprintf (buffer, size, m, ap);
+
       if (used < 0)
-	used = 0;
-      if (used < size)
+	{
+	  /* Non-C99 vsnprintf, such as w32, returns -1 when SIZE is too small.
+	     Guess a larger USED to work around the incompatibility.  */
+	  used = (size <= size_max / 2 ? 2 * size
+		  : size < size_max ? size_max - 1
+		  : size_max);
+	}
+      else if (used < size)
 	break;
-      if (size <= size_max / 2)
-	size *= 2;
-      else if (size < size_max)
-	size = size_max;
-      else
+      if (size_max <= used)
 	memory_full ();
-      if (allocated)
-	buffer = (char *) xrealloc (buffer, size);
-      else
-	{
-	  buffer = (char *) xmalloc (size);
-	  allocated = 1;
-	}
+      size = used + 1;
+
+      if (buffer != buf)
+	xfree (buffer);
+      buffer = (char *) xmalloc (size);
     }

   string = make_string (buffer, used);
-  if (allocated)
+  if (buffer != buf)
     xfree (buffer);

   xsignal1 (Qerror, string);

Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#8435; Package emacs. (Fri, 08 Apr 2011 23:39:02 GMT) Full text and rfc822 format available.

Message #26 received at 8435 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 8435 <at> debbugs.gnu.org
Subject: Re: bug#8435: misuse of error ("...%d...", ...) on 64-bit hosts
Date: Fri, 08 Apr 2011 16:37:57 -0700

On 04/08/2011 01:58 AM, Eli Zaretskii wrote:

> I also don't think we should remove message_nolog, even if it's
> currently unused.  It's a useful function.  If someone feels badly
> about having dead code, we could #ifdef it away

That would be OK too.  Or we could simply retrieve it from the
repository if we ever need it again.

Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#8435; Package emacs. (Sat, 09 Apr 2011 07:23:02 GMT) Full text and rfc822 format available.

Message #29 received at 8435 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 8435 <at> debbugs.gnu.org
Subject: Re: bug#8435: misuse of error ("...%d...", ...) on 64-bit hosts
Date: Sat, 09 Apr 2011 10:20:42 +0300

> Date: Fri, 08 Apr 2011 16:34:12 -0700
> From: Paul Eggert <eggert <at> cs.ucla.edu>
> CC: 8435 <at> debbugs.gnu.org
> 
> On 04/08/2011 01:58 AM, Eli Zaretskii wrote:
> 
> > So I think it would be better to fix these problems as follows:
> >   ...
> >   . Fix doprnt to avoid overflow when EMACS_INT is a 64-bit type, if
> >     it could overflow.  (I don't see such a danger, but maybe I
> >     overlook something.)
> 
> That wouldn't work, not because doprnt overflows with EMACS_INT, but
> because doprnt doesn't work with ordinary 'int': it treats all integer
> arguments as if they were EMACS_INT, and this relies on unportable
> va_arg behavior.

Then let's change doprnt to support an `int'.  With the new descriptor
you introduced for EMACS_INT, it shouldn't be a problem to leave %d
and %u for `int' data types.  Are there any problems with this
approach?

> No doubt these problems could be worked around with sufficient
> hacking, but why bother?  The main reason doprnt exists is that
> vsnprintf didn't exist back when doprnt was written, so we had to
> write it ourselves.  But now that we can rely on vsnprintf, let's use
> it rather than continuing to maintain our reinvented buggy wheel.

That would be okay if vsnprintf was a drop-in replacement.  But as we
see, it isn't: doprnt provides a few features that vsnprintf does not,
and adding that support in other places has disadvantages that I
mentioned in my previous message.  So I think we should consider also
the alternative of fixing doprnt instead.  If it proves to be
reasonably easy, I think it's preferable, since that localizes the
changes.

Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#8435; Package emacs. (Sat, 09 Apr 2011 18:25:02 GMT) Full text and rfc822 format available.

Message #32 received at 8435 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 8435 <at> debbugs.gnu.org
Subject: Re: bug#8435: misuse of error ("...%d...", ...) on 64-bit hosts
Date: Sat, 09 Apr 2011 11:24:08 -0700

On 04/09/2011 12:20 AM, Eli Zaretskii wrote:
>> Date: Fri, 08 Apr 2011 16:34:12 -0700
>> From: Paul Eggert <eggert <at> cs.ucla.edu>

> Then let's change doprnt to support an `int'...
> Are there any problems with this approach?

Yes: it'd be more work to do now, and will leave us with
more code to maintain afterwards.

> adding that support in other places has disadvantages that I
> mentioned in my previous message.

The disadvantages are small compared to the advantages.
Most of the changes in the patch are needed regardless
of whether doprnt is kept or discarded.  The part about
discarding doprnt shortens Emacs's code overall: it removes
283 lines (doprnt itself, plus scaffolding) and adds 17 lines
(callers adjusting to the minor differences between doprnt
and vsnprintf).  This is a clear win.

Part of the motivation here is that doprnt contains too many bugs.
I've mentioned two or three, but here's another:

                  while ('0' <= fmt[1] && fmt[1] <= '9')
                    {
                      if (n * 10 + fmt[1] - '0' < n)
                        error ("Format width or precision too large");
                      n = n * 10 + fmt[1] - '0';
                      *string++ = *++fmt;
                    }

That overflow check is clearly wrong: it will miss many
overflows.  This doprnt bug, like the others, could be fixed
by investing more time, but it's wasteful to spend time
maintaining a buggy near-copy of vsnprintf.  It's better
software engineering practice to use vsnprintf instead.

Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#8435; Package emacs. (Sat, 09 Apr 2011 18:35:02 GMT) Full text and rfc822 format available.

Message #35 received at 8435 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 8435 <at> debbugs.gnu.org
Subject: Re: bug#8435: misuse of error ("...%d...", ...) on 64-bit hosts
Date: Sat, 09 Apr 2011 21:32:00 +0300

> Date: Sat, 09 Apr 2011 11:24:08 -0700
> From: Paul Eggert <eggert <at> cs.ucla.edu>
> CC: 8435 <at> debbugs.gnu.org
> 
> > adding that support in other places has disadvantages that I
> > mentioned in my previous message.
> 
> The disadvantages are small compared to the advantages.

Sorry, but I disagree.  Instead of a single function that solved a
couple of problems internally, we now have to solve those problems in
users of that function, and we have to _remember_ that those problems
might be solved.

> Most of the changes in the patch are needed regardless
> of whether doprnt is kept or discarded.

I have no objections to those parts, as I wrote.

> The part about
> discarding doprnt shortens Emacs's code overall: it removes
> 283 lines (doprnt itself, plus scaffolding) and adds 17 lines
> (callers adjusting to the minor differences between doprnt
> and vsnprintf).  This is a clear win.

I don't think line count is a compelling argument in favor of the
change.

> Part of the motivation here is that doprnt contains too many bugs.

It cannot have too many real bugs, because it worked for so many years
with little or no trouble.

> it's wasteful to spend time maintaining a buggy near-copy of
> vsnprintf.  It's better software engineering practice to use
> vsnprintf instead.

Sorry, I don't agree.

Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#8435; Package emacs. (Sat, 09 Apr 2011 19:32:02 GMT) Full text and rfc822 format available.

Message #38 received at 8435 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 8435 <at> debbugs.gnu.org
Subject: Re: bug#8435: misuse of error ("...%d...", ...) on 64-bit hosts
Date: Sat, 09 Apr 2011 22:28:50 +0300

> Date: Sat, 09 Apr 2011 11:24:08 -0700
> From: Paul Eggert <eggert <at> cs.ucla.edu>
> CC: 8435 <at> debbugs.gnu.org
> 
> it's wasteful to spend time maintaining a buggy near-copy of
> vsnprintf.  It's better software engineering practice to use
> vsnprintf instead.

How about if we take vsnprintf from some library, e.g. glibc or
gnulib, and add to it code to support the two features that were at
the time added to doprnt (i.e., avoid splitting a multi-byte
character, and convert a multi-byte sequence to a wide character when
%c conversion is used)?

Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#8435; Package emacs. (Sat, 09 Apr 2011 19:41:02 GMT) Full text and rfc822 format available.

Message #41 received at 8435 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 8435 <at> debbugs.gnu.org
Subject: Re: bug#8435: misuse of error ("...%d...", ...) on 64-bit hosts
Date: Sat, 09 Apr 2011 12:39:59 -0700

On 04/09/2011 11:32 AM, Eli Zaretskii wrote:
> I don't think line count is a compelling argument in favor of the
> change.

It's certainly good evidence that the change will simplify
Emacs maintainance overall.  It's not often that we can fix
multiple bugs like this by removing over 250 lines of source code.

> we now have to solve those problems in users of that function

Yes, that's a disadvantage, but it's a minor one; only a
17 lines of code are affected.  And there is a real advantage
to sticking with a stable, widely-used standard interface like
vsnprintf, as this makes it easier on code readers who are not
expert in Emacs internals.

As Emacs evolves, if we find that more lines of code are affected,
then the tradeoffs will change.  If that happens, it shouldn't
be hard to come up with a doprnt replacement that uses
vsnprintf internally and that also handles multibyte character
truncation and non-8-bit codepoints.  But given Emacs's current
use of doprnt, this would be overkill and would add complexity:
it would save 17 lines in doprnt's callers but require considerably
more than 17 lines to implement.

Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#8435; Package emacs. (Sat, 09 Apr 2011 19:44:02 GMT) Full text and rfc822 format available.

Message #44 received at 8435 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 8435 <at> debbugs.gnu.org
Subject: Re: bug#8435: misuse of error ("...%d...", ...) on 64-bit hosts
Date: Sat, 09 Apr 2011 12:43:07 -0700

On 04/09/2011 12:28 PM, Eli Zaretskii wrote:
> How about if we take vsnprintf from some library, e.g. glibc or
> gnulib, and add to it code to support the two features that were at
> the time added to doprnt

Something like that might work, yes, though I'd rather not
be the guy who did it, as the code is pretty hairy.  Gnulib
would be a better choice, since any such changes along these
lines would be unlikely to be bought back by glibc.

Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#8435; Package emacs. (Sat, 09 Apr 2011 20:24:02 GMT) Full text and rfc822 format available.

Message #47 received at 8435 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 8435 <at> debbugs.gnu.org
Subject: Re: bug#8435: misuse of error ("...%d...", ...) on 64-bit hosts
Date: Sat, 09 Apr 2011 23:21:02 +0300

> Date: Sat, 09 Apr 2011 12:39:59 -0700
> From: Paul Eggert <eggert <at> cs.ucla.edu>
> CC: 8435 <at> debbugs.gnu.org
> 
> As Emacs evolves, if we find that more lines of code are affected,
> then the tradeoffs will change.  If that happens, it shouldn't
> be hard to come up with a doprnt replacement that uses
> vsnprintf internally and that also handles multibyte character
> truncation and non-8-bit codepoints.

I'd prefer that we do this now.  Leaving the problem to lurk for
future maintainers to deal with it would mean maintenance burden
and/or subtle bugs waiting to bite.

Emacs is a large and complex program, with many areas of its code
understandable only by a tiny few, sometimes by a single individual.
Leaving such dark corners means trading somebody else's future efforts
for our current convenience.  I just spent the best part of my weekend
hunting `int's that should have been EMACS_INT's (which, btw, is the
only _real_ problem with overflow that I know about), and I would hate
to leave similar problems to those who will come after us.

Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#8435; Package emacs. (Sun, 10 Apr 2011 03:53:01 GMT) Full text and rfc822 format available.

Message #50 received at 8435 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: Paul Eggert <eggert <at> cs.ucla.edu>, 8435 <at> debbugs.gnu.org
Subject: Re: bug#8435: misuse of error ("...%d...", ...) on 64-bit hosts
Date: Sun, 10 Apr 2011 00:52:09 -0300

> I'd prefer that we do this now.  Leaving the problem to lurk for
> future maintainers to deal with it would mean maintenance burden
> and/or subtle bugs waiting to bite.

If the problematic case can be detected easily, that's good enough.
I don't understand the problem enough (haven't dug into it yet) to know
if that's the case.


        Stefan

Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#8435; Package emacs. (Sun, 10 Apr 2011 05:20:02 GMT) Full text and rfc822 format available.

Message #53 received at 8435 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: eggert <at> cs.ucla.edu, 8435 <at> debbugs.gnu.org
Subject: Re: bug#8435: misuse of error ("...%d...", ...) on 64-bit hosts
Date: Sun, 10 Apr 2011 01:19:25 -0400

> From: Stefan Monnier <monnier <at> iro.umontreal.ca>
> Cc: Paul Eggert <eggert <at> cs.ucla.edu>,  8435 <at> debbugs.gnu.org
> Date: Sun, 10 Apr 2011 00:52:09 -0300
> 
> If the problematic case can be detected easily, that's good enough.

What would you suggest to do if we detect it?  To recap, this is in
the context of `message' or `error' called to produce a message about
something.

> I don't understand the problem enough (haven't dug into it yet) to know
> if that's the case.

There are two issues:

 . %c used to display a non-ASCII character, when the data points to a
   multibyte sequence.  doprnt converts the data to a wide character,
   when it displays the message, vsnprintf does not.

 . If the buffer supplied to doprnt is too small, it takes care to
   truncate the text only on character boundaries, whereas vsnprintf
   will not.

Reply sent to Paul Eggert <eggert <at> cs.ucla.edu>:
You have taken responsibility. (Sun, 10 Apr 2011 17:04:02 GMT) Full text and rfc822 format available.

Notification sent to Paul Eggert <eggert <at> cs.ucla.edu>:
bug acknowledged by developer. (Sun, 10 Apr 2011 17:04:02 GMT) Full text and rfc822 format available.

Message #58 received at 8435-done <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 8435-done <at> debbugs.gnu.org
Subject: Re: bug#8435: misuse of error ("...%d...", ...) on 64-bit hosts
Date: Sun, 10 Apr 2011 10:03:21 -0700

On 04/09/2011 01:21 PM, Eli Zaretskii wrote:
>> Date: Sat, 09 Apr 2011 12:39:59 -0700
>> From: Paul Eggert <eggert <at> cs.ucla.edu>
>>
>> As Emacs evolves, if we find that more lines of code are affected,
>> then the tradeoffs will change.  If that happens, it shouldn't
>> be hard to come up with a doprnt replacement that uses
>> vsnprintf internally and that also handles multibyte character
>> truncation and non-8-bit codepoints.
> 
> I'd prefer that we do this now.

OK, please feel free to do that.  To help move this along, I
resurrected src/doprnt.c in my patch, and merged it into
the trunk, along with all the other patches I've been testing
that have to do with GCC 4.6.0's static checks.  Currently
Emacs is not using src/doprnt.c but it shouldn't be hard to
refactor the code to use doprnt again if that's what you prefer.
This should result in some simplification of vsnprintf's two
callers verror and vmessage.  I still don't think it's worth
the hassle, given Emacs's current usage (but I guess you've
been warned :-).

If you take this project on, you need to fix the 64-bit related
problems in doprnt.  For example, it's not safe to copy
a string length into an 'int'.  I've mentioned other bugs
in this area, and I'm sure there are others that I haven't
mentioned (I gave up on doprnt before fully analyzing it).

I'm going to mark this bug as "done", since the bug itself
is fixed now, and we're now talking about refactoring the fix.

Message #59 received at 8435-done <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 8435-done <at> debbugs.gnu.org
Subject: Re: bug#8435: misuse of error ("...%d...", ...) on 64-bit hosts
Date: Sun, 10 Apr 2011 21:02:34 +0300

> Date: Sun, 10 Apr 2011 10:03:21 -0700
> From: Paul Eggert <eggert <at> cs.ucla.edu>
> CC: 8435-done <at> debbugs.gnu.org
> 
> On 04/09/2011 01:21 PM, Eli Zaretskii wrote:
> >> Date: Sat, 09 Apr 2011 12:39:59 -0700
> >> From: Paul Eggert <eggert <at> cs.ucla.edu>
> >>
> >> As Emacs evolves, if we find that more lines of code are affected,
> >> then the tradeoffs will change.  If that happens, it shouldn't
> >> be hard to come up with a doprnt replacement that uses
> >> vsnprintf internally and that also handles multibyte character
> >> truncation and non-8-bit codepoints.
> > 
> > I'd prefer that we do this now.
> 
> OK, please feel free to do that.

Then why did you went along and committed the changes which assume
that doprnt will be replaced by vsnprintf from the library?  That just
makes the above job harder.  Surely, that wasn't your intent, but what
was it then?

Message #60 received at 8435-done <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 8435-done <at> debbugs.gnu.org
Subject: Re: bug#8435: misuse of error ("...%d...", ...) on 64-bit hosts
Date: Sun, 10 Apr 2011 11:26:08 -0700

On 04/10/2011 11:02 AM, Eli Zaretskii wrote:
> Then why did you went along and committed the changes which assume
> that doprnt will be replaced by vsnprintf from the library?  That just
> makes the above job harder.

No, the job is easier with the changes.

Most of the changes are to areas that are unaffected by the
proposed refactoring.  These changes are necessary regardless
of whether we keep or discard doprnt, and are needed to fix the bug.
It would not be possible to install these bug fixes without also
either using vsnprintf (code that already exists, and that works)
or rewriting doprnt.  And it would not be possible to rewrite doprnt
without having these changes; otherwise, the rewritten doprnt
would be incompatible with the rest of Emacs and could crash
on 64-bit hosts.  So regardless of whether we keep or discard
doprnt, this commit is a step forward.

Message #61 received at 8435-done <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 8435-done <at> debbugs.gnu.org
Subject: Re: bug#8435: misuse of error ("...%d...", ...) on 64-bit hosts
Date: Sun, 10 Apr 2011 23:47:21 +0300

> Date: Sun, 10 Apr 2011 10:03:21 -0700
> From: Paul Eggert <eggert <at> cs.ucla.edu>
> CC: 8435-done <at> debbugs.gnu.org
> 
> I'm going to mark this bug as "done"

When you commit a changeset that fixes a bug, please use the --fixes
switch to bzr (see admin/notes/bugtracker for the details and setup),
so that the metadata of the commit includes the reference to the bug
you fixed.

TIA

Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#8435; Package emacs. (Mon, 11 Apr 2011 01:45:02 GMT) Full text and rfc822 format available.

Message #64 received at 8435 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: eggert <at> cs.ucla.edu, 8435 <at> debbugs.gnu.org
Subject: Re: bug#8435: misuse of error ("...%d...", ...) on 64-bit hosts
Date: Sun, 10 Apr 2011 22:44:13 -0300

>> If the problematic case can be detected easily, that's good enough.
> What would you suggest to do if we detect it?  To recap, this is in
> the context of `message' or `error' called to produce a message about
> something.

IIRC those cases correspond to programming bugs (since the change
shifts the burden to the caller), so what needs to be done in those
cases is to signal an "internal error in foo.c:nnn".

> There are two issues:

>  . %c used to display a non-ASCII character, when the data points to a
>    multibyte sequence.  doprnt converts the data to a wide character,
>    when it displays the message, vsnprintf does not.

>  . If the buffer supplied to doprnt is too small, it takes care to
>    truncate the text only on character boundaries, whereas vsnprintf
>    will not.

It sounds like both cases can be caught more easily than they can be
fixed, right?


        Stefan

Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#8435; Package emacs. (Mon, 11 Apr 2011 02:59:02 GMT) Full text and rfc822 format available.

Message #67 received at 8435 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: eggert <at> cs.ucla.edu, 8435 <at> debbugs.gnu.org
Subject: Re: bug#8435: misuse of error ("...%d...", ...) on 64-bit hosts
Date: Mon, 11 Apr 2011 05:54:47 +0300

> From: Stefan Monnier <monnier <at> iro.umontreal.ca>
> Cc: eggert <at> cs.ucla.edu,  8435 <at> debbugs.gnu.org
> Date: Sun, 10 Apr 2011 22:44:13 -0300
> 
> >> If the problematic case can be detected easily, that's good enough.
> > What would you suggest to do if we detect it?  To recap, this is in
> > the context of `message' or `error' called to produce a message about
> > something.
> 
> IIRC those cases correspond to programming bugs (since the change
> shifts the burden to the caller), so what needs to be done in those
> cases is to signal an "internal error in foo.c:nnn".
> 
> > There are two issues:
> 
> >  . %c used to display a non-ASCII character, when the data points to a
> >    multibyte sequence.  doprnt converts the data to a wide character,
> >    when it displays the message, vsnprintf does not.
> 
> >  . If the buffer supplied to doprnt is too small, it takes care to
> >    truncate the text only on character boundaries, whereas vsnprintf
> >    will not.
> 
> It sounds like both cases can be caught more easily than they can be
> fixed, right?

I think both ways are comparable (if I understand correctly what you
mean by "caught").

Message #68 received at 8435-done <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 8435-done <at> debbugs.gnu.org
Subject: Re: bug#8435: misuse of error ("...%d...", ...) on 64-bit hosts
Date: Sat, 23 Apr 2011 13:39:42 +0300

> Date: Sun, 10 Apr 2011 10:03:21 -0700
> From: Paul Eggert <eggert <at> cs.ucla.edu>
> CC: 8435-done <at> debbugs.gnu.org
> 
> On 04/09/2011 01:21 PM, Eli Zaretskii wrote:
> >> Date: Sat, 09 Apr 2011 12:39:59 -0700
> >> From: Paul Eggert <eggert <at> cs.ucla.edu>
> >>
> >> As Emacs evolves, if we find that more lines of code are affected,
> >> then the tradeoffs will change.  If that happens, it shouldn't
> >> be hard to come up with a doprnt replacement that uses
> >> vsnprintf internally and that also handles multibyte character
> >> truncation and non-8-bit codepoints.
> > 
> > I'd prefer that we do this now.
> 
> OK, please feel free to do that.

Done.

> If you take this project on, you need to fix the 64-bit related
> problems in doprnt.

I think I did; feel free to reopen this bug or file a new one, if
there are left-overs.

Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#8435; Package emacs. (Mon, 25 Apr 2011 01:00:03 GMT) Full text and rfc822 format available.

Message #71 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Daniel Colascione <dan.colascione <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Cc: Eli Zaretskii <eliz <at> gnu.org>, 8435 <at> debbugs.gnu.org
Subject: Re: bug#8435: misuse of error ("...%d...", ...) on 64-bit hosts
Date: Sun, 24 Apr 2011 17:59:29 -0700

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 4/8/11 4:37 PM, Paul Eggert wrote:
> On 04/08/2011 01:58 AM, Eli Zaretskii wrote:
> 
>> I also don't think we should remove message_nolog, even if it's
>> currently unused.  It's a useful function.  If someone feels badly
>> about having dead code, we could #ifdef it away
> 
> That would be OK too.  Or we could simply retrieve it from the
> repository if we ever need it again.

The problem with using the change history as a code library that way is
that nobody knows functions like message_nolog are available.  Better,
IMHO, it leave it in the code.  Even if we compile it in, a good linker
ought to remove it, yes?
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (Darwin)

iEYEARECAAYFAk20x3AACgkQ17c2LVA10VsY8gCg6EOSQwfP6c9H1YbHP8SQ/egm
k7QAoKtnfklc5oMFMIo74oaGJEuxq6od
=WOuZ
-----END PGP SIGNATURE-----

Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#8435; Package emacs. (Mon, 25 Apr 2011 01:00:04 GMT) Full text and rfc822 format available.

Message #75 received at 8435-done <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 8435-done <at> debbugs.gnu.org
Subject: Re: bug#8435: misuse of error ("...%d...", ...) on 64-bit hosts
Date: Sun, 24 Apr 2011 23:09:56 -0700

On 04/23/11 03:39, Eli Zaretskii wrote:
> feel free to reopen this bug or file a new one, if
> there are left-overs.

OK, thanks, I filed a new bug report for that, Bug#8545.

Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#8435; Package emacs. (Mon, 25 Apr 2011 06:45:03 GMT) Full text and rfc822 format available.

Message #78 received at 8435 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Daniel Colascione <dan.colascione <at> gmail.com>
Cc: eggert <at> cs.ucla.edu, 8435 <at> debbugs.gnu.org
Subject: Re: bug#8435: misuse of error ("...%d...", ...) on 64-bit hosts
Date: Mon, 25 Apr 2011 09:42:52 +0300

> Date: Sun, 24 Apr 2011 17:59:29 -0700
> From: Daniel Colascione <dan.colascione <at> gmail.com>
> CC: Eli Zaretskii <eliz <at> gnu.org>, 8435 <at> debbugs.gnu.org
> 
> On 4/8/11 4:37 PM, Paul Eggert wrote:
> > On 04/08/2011 01:58 AM, Eli Zaretskii wrote:
> > 
> >> I also don't think we should remove message_nolog, even if it's
> >> currently unused.  It's a useful function.  If someone feels badly
> >> about having dead code, we could #ifdef it away
> > 
> > That would be OK too.  Or we could simply retrieve it from the
> > repository if we ever need it again.
> 
> The problem with using the change history as a code library that way is
> that nobody knows functions like message_nolog are available.  Better,
> IMHO, it leave it in the code.  Even if we compile it in, a good linker
> ought to remove it, yes?

The function was not removed.  Paul left it in the code, #ifdef'ed
away, as I suggested.

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Mon, 23 May 2011 11:24:04 GMT) Full text and rfc822 format available.

This bug report was last modified 12 years and 364 days ago.

Previous Next

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.

GNU bug report logs - #8435 misuse of error ("...%d...", ...) on 64-bit hosts

GNU bug report logs - #8435
misuse of error ("...%d...", ...) on 64-bit hosts