GNU bug report logs - #8546
fix for Emacs pseudovector incompatibility with GCC 4.6.0

Previous Next

Package: emacs;

Reported by: Paul Eggert <eggert <at> cs.ucla.edu>

Date: Mon, 25 Apr 2011 07:43:02 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 8546 in the body.
You can then email your comments to 8546 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#8546; Package emacs. (Mon, 25 Apr 2011 07:43:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Paul Eggert <eggert <at> cs.ucla.edu>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Mon, 25 Apr 2011 07:43:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: bug-gnu-emacs <at> gnu.org
Subject: fix for Emacs pseudovector incompatibility with GCC 4.6.0
Date: Mon, 25 Apr 2011 00:41:58 -0700
[Message part 1 (text/plain, inline)]
Emacs's pseudovector implementation occasionally runs afoul of
optimizations introduced by version 4.6.0 of GCC, and this can break
Emacs.  Luckily there's a fix, which I would like to install after some more
testing.  I'm publishing it here now for a heads-up.

Another possible fix would be to disable the GCC optimizations.
However, this would no doubt make Emacs slower overall, and anyway the
optimizations are valid according to the rules of C which means that
non-GCC compilers may well be doing them too.  It's better to alter
Emacs to avoid the problem, since this is not too much trouble.

Here are more details about the problem.

Building Emacs 23.3 (as well as the Emacs trunk) on Ubuntu 10.10 x86
and GCC 4.6.0, with Emacs configured by "configure
--enable-checking=all", fails with the following symptoms:

  `/bin/pwd`/temacs --batch --load loadup bootstrap
  
  /home/eggert/junk/emacs-23.3/src/buffer.c:5177: Emacs fatal error: assertion failed: (XVECTOR (Vbuffer_defaults)->size & (PSEUDOVECTOR_FLAG | PVEC_TYPE_MASK)) == (PSEUDOVECTOR_FLAG | (PVEC_BUFFER))
  Aborted

I tracked this down to an incompatibility between Emacs and GCC 4.6.0
on the x86 with -O2 optimization.  GCC does a type-based aliasing
optimization, and reorders stores and loads to locations that cannot
possibly be the same location if the types are as the program says
they are.  Unfortunately, Emacs's pseudovector implementation dissembles
about the types and therefore runs afoul of the optimization.
Possibly there are subtle bugs induced by this, even when
--enable-checking is not used, but --enable-checking makes the problem
obvious.

Here's the source code, in buffer.c's init_buffer_once:

  reset_buffer_local_variables (&buffer_local_symbols, 1);
  /* Prevent GC from getting confused.  */
  buffer_defaults.text = &buffer_defaults.own_text;
  buffer_local_symbols.text = &buffer_local_symbols.own_text;
  BUF_INTERVALS (&buffer_defaults) = 0;
  BUF_INTERVALS (&buffer_local_symbols) = 0;
  XSETPVECTYPE (&buffer_defaults, PVEC_BUFFER);
  XSETBUFFER (Vbuffer_defaults, &buffer_defaults);

Here are the relevant definitions in lisp.h:

  #define XSETPVECTYPE(v,code) ((v)->size |= PSEUDOVECTOR_FLAG | (code))
  #define XSETBUFFER(a, b) (XSETPSEUDOVECTOR (a, b, PVEC_BUFFER))
  #define XSETPSEUDOVECTOR(a, b, code) \
    (XSETVECTOR (a, b),							\
     eassert ((XVECTOR (a)->size & (PSEUDOVECTOR_FLAG | PVEC_TYPE_MASK))	\
	      == (PSEUDOVECTOR_FLAG | (code))))

Here's the generated x86 code, with the problem highlighted:

		movl	$buffer_local_symbols, %eax
		movl	$1, %edx
		call	reset_buffer_local_variables
		movl	$buffer_defaults, %eax
		orl	$5, %eax
		movl	%eax, Vbuffer_defaults
		andl	$-8, %eax
	0=>	movl	(%eax), %eax
	1=>	orl	$1073872896, buffer_defaults
		movl	$buffer_defaults+8, buffer_defaults+76
		andl	$1082129920, %eax
	2=>	cmpl	$1073872896, %eax
		movl	$buffer_local_symbols+8, buffer_local_symbols+76
		movl	$0, buffer_defaults+64
		movl	$0, buffer_local_symbols+64
		je	.L2396
		movl	suppress_checking, %eax
		testl	%eax, %eax
		je	.L2398
	.L2396:

The code marked (1) implements the expansion of XSETPVECTYPE, and sets
buffer_defaults.size to 0x40020000, the mark for a buffer.  The code
marked (2) is part of the expansion of the eassert, and it checks that
XVECTOR (Vbuffer_defaults)->size has the proper flag and code.  But
(2) is relying on a *cached* copy of the size, which was loaded in (0),
and (0) precedes (1).  So, the assertion fails.

The patch is attached.  It's against my copy of Emacs, which has a few
other fixes that I haven't had time to merge to the trunk yet.  But it
should give a good feel for what's involved.

[pseudovec.txt.gz (application/x-gzip, attachment)]

Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#8546; Package emacs. (Mon, 25 Apr 2011 10:24:02 GMT) Full text and rfc822 format available.

Message #8 received at 8546 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 8546 <at> debbugs.gnu.org
Subject: Re: bug#8546: fix for Emacs pseudovector incompatibility with GCC
	4.6.0
Date: Mon, 25 Apr 2011 13:23:13 +0300
> Date: Mon, 25 Apr 2011 00:41:58 -0700
> From: Paul Eggert <eggert <at> cs.ucla.edu>
> 
>   #define XSETPSEUDOVECTOR(a, b, code) \
>     (XSETVECTOR (a, b),							\
>      eassert ((XVECTOR (a)->size & (PSEUDOVECTOR_FLAG | PVEC_TYPE_MASK))	\
> 	      == (PSEUDOVECTOR_FLAG | (code))))
> 
> Here's the generated x86 code, with the problem highlighted:
> 
> 		movl	$buffer_local_symbols, %eax
> 		movl	$1, %edx
> 		call	reset_buffer_local_variables
> 		movl	$buffer_defaults, %eax
> 		orl	$5, %eax
> 		movl	%eax, Vbuffer_defaults
> 		andl	$-8, %eax
> 	0=>	movl	(%eax), %eax
> 	1=>	orl	$1073872896, buffer_defaults
> 		movl	$buffer_defaults+8, buffer_defaults+76
> 		andl	$1082129920, %eax
> 	2=>	cmpl	$1073872896, %eax
> 		movl	$buffer_local_symbols+8, buffer_local_symbols+76
> 		movl	$0, buffer_defaults+64
> 		movl	$0, buffer_local_symbols+64
> 		je	.L2396
> 		movl	suppress_checking, %eax
> 		testl	%eax, %eax
> 		je	.L2398
> 	.L2396:
> 
> The code marked (1) implements the expansion of XSETPVECTYPE, and sets
> buffer_defaults.size to 0x40020000, the mark for a buffer.  The code
> marked (2) is part of the expansion of the eassert, and it checks that
> XVECTOR (Vbuffer_defaults)->size has the proper flag and code.  But
> (2) is relying on a *cached* copy of the size, which was loaded in (0),
> and (0) precedes (1).  So, the assertion fails.

Could you please tell more what is it in the Emacs macros that
triggers this problem?  You say that "Emacs's pseudovector
implementation dissembles about the types", but could you please point
out where in the code this happens?  I'm afraid I don't see it, but
then I'm no expert on compiler optimizations and on how they interact
with C types.

TIA




Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#8546; Package emacs. (Mon, 25 Apr 2011 14:06:01 GMT) Full text and rfc822 format available.

Message #11 received at 8546 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 8546 <at> debbugs.gnu.org
Subject: Re: bug#8546: fix for Emacs pseudovector incompatibility with GCC
	4.6.0
Date: Mon, 25 Apr 2011 11:05:20 -0300
> The patch is attached.  It's against my copy of Emacs, which has a few
> other fixes that I haven't had time to merge to the trunk yet.  But it
> should give a good feel for what's involved.

[ Please don't compress patches for review.  57KB is not that large.  ]

Thanks for tackling this problem.  A few questions/comments on your
patch (which I haven't reviewed completely):

   +struct vector_header

I'd call it vectorlike_header.
   
   +  {
   +    EMACS_UINT size;
   +    union {
   +      struct buffer *buffer;
   +      struct Lisp_Vector *vector;
   +    } next;
   +  };

Why do you need to handle buffers specially here?  That sounds wrong.

   +#define XVECTOR_SIZE(a) (XVECTOR (a)->header.size + 0)

why not use ASIZE?

   +#define XVECTOR_HEADER_SIZE(a) (((struct vector_header *) XPNTR (a))->size + 0)

why do we need this variant with this weird set of type casts?

   +	* lread.c (defsubr): Use XSETTYPED_PVECTYPE, since Lisp_Subr is a
   +	special case.

Why does Lisp_Subr need to be a special case (IIUC this applies to
XSETTYPED_PSEUDOVECTOR and TYPED_PSEUDOVECTORP as well).

   +#define XSETPVECTYPESIZE(v, code, sizeval) \
   +  ((v)->header.size = PSEUDOVECTOR_FLAG | (code) | (sizeval))

Sounds good.


        Stefan




Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#8546; Package emacs. (Mon, 25 Apr 2011 19:31:02 GMT) Full text and rfc822 format available.

Message #14 received at 8546 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 8546 <at> debbugs.gnu.org
Subject: Re: bug#8546: fix for Emacs pseudovector incompatibility with GCC
	4.6.0
Date: Mon, 25 Apr 2011 12:30:29 -0700
On 04/25/11 03:23, Eli Zaretskii wrote:
> Could you please tell more what is it in the Emacs macros that
> triggers this problem?  You say that "Emacs's pseudovector
> implementation dissembles about the types", but could you please point
> out where in the code this happens?

Sure.  First, let me briefly describe the optimization, which
is allowed by the C standard.  Here's an idealized example:

	struct a { int size; ...; };
	struct b { int size; ...; };
	struct a *p = ...;
        struct b *q = ...;
	p->size = 0;
	q->size = 1;
	return p->size;

ISO C allows a compiler to optimize the last statement
to "return 0;".  If P and Q point to the
same memory location, storing through P and loading through
Q results in undefined behavior, because P and Q are incompatible
types.  When the behavior is undefined, the compiler is allowed
to generate whatever code it likes, including the optimized code.

Most of Emacs is OK with this optimization.  However, the pseudovector
code currently does stuff like this when checking is enabled
(I am giving the preprocessor output of XSETPVECTYPE followed by
XSETBUFFER, and a simplified version of it assuming the x86
to make things clearer):

	struct buffer *b = ...;
	b->size |= 0x4020000;
	Lisp_Object o = (Lisp_Object) b;
	struct Lisp_Vector *v = (struct Lisp_Vector *) o;
	if ((v->size & 0x4020000) != 0x4020000)
	  abort ();

It's the conversion of struct buffer * to struct Lisp_Vector * that
results in undefined behavior, for the same reason that the
earlier example does: the code is reading a word via a struct Lisp_Vector *
pointer, which means that the compiler is free to (and GCC does) delay
the earlier store of that word until after the check, which means the code
aborts.




Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#8546; Package emacs. (Mon, 25 Apr 2011 23:14:02 GMT) Full text and rfc822 format available.

Message #17 received at 8546 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 8546 <at> debbugs.gnu.org
Subject: Re: bug#8546: fix for Emacs pseudovector incompatibility with GCC
	4.6.0
Date: Mon, 25 Apr 2011 16:12:51 -0700
[Message part 1 (text/plain, inline)]
On 04/25/11 07:05, Stefan Monnier wrote:
>    +struct vector_header
> 
> I'd call it vectorlike_header.

OK, I'll do that.

>    +#define XVECTOR_SIZE(a) (XVECTOR (a)->header.size + 0)
> 
> why not use ASIZE?

No good reason.  Thanks, I'll do that too.

>    +  {
>    +    EMACS_UINT size;
>    +    union {
>    +      struct buffer *buffer;
>    +      struct Lisp_Vector *vector;
>    +    } next;
>    +  };
> 
> Why do you need to handle buffers specially here?  That sounds wrong.

Purely as a convenience.  The code always uses the 'next' pointer as a
struct buffer * (in alloc.c, buffer.c, data.c), or as a struct
Lisp_Vector * (in alloc.c, fns.c).  As an alternative, we could
replace the above with

  {
    EMACS_UINT size;
    struct vectorlike_header *next;
  };

and then replace uses like this:

  for (b = all_buffers; b && b != po; b = b->header.next.buffer)

with uses like this:

  for (b = all_buffers; b && b != po; b = (struct buffer *) b->header.next)

I thought that the union made the code clearer and I know you
normally dislike casts, but if you prefer the style with casts it'd be
easy to do that too.

>    +#define XVECTOR_HEADER_SIZE(a) (((struct vector_header *) XPNTR (a))->size + 0)
> 
> why do we need this variant with this weird set of type casts?

I'll remove it.  It is used in only one place, in
XSETTYPED_PSEUDOVECTOR, where the idea is a key part of the
antialiasing fix.  But there's no need to break it out as a separate
macro, so I'll fold it into XSETTYPED_PSEUDOVECTOR.

>    +	* lread.c (defsubr): Use XSETTYPED_PVECTYPE, since Lisp_Subr is a
>    +	special case.
> 
> Why does Lisp_Subr need to be a special case (IIUC this applies to
> XSETTYPED_PSEUDOVECTOR and TYPED_PSEUDOVECTORP as well).

struct Lisp_Subr has a "size" field but no "next" field.  I didn't
change its layout to contain a struct vectorlike_header field, as that
would have added an extra word that isn't needed.  It would be safer,
from a C standards point of view, to spend the extra word and make
struct Lisp_Subr be like the other vector-like objects, but in
practice I doubt whether any practical optimizing compiler would break
that part of the code; so I kept it as a special case.

If you prefer the simpler and cleaner (but less space-efficient)
variant, where struct Lisp_Subr has a "next" field like all the other
vector-like data structures, that would be easy to do.

Attached is a revised patch with the above comments taken into account.

[pseudovec-2.txt (text/plain, attachment)]

Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#8546; Package emacs. (Tue, 26 Apr 2011 12:47:01 GMT) Full text and rfc822 format available.

Message #20 received at 8546 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 8546 <at> debbugs.gnu.org
Subject: Re: bug#8546: fix for Emacs pseudovector incompatibility with GCC
	4.6.0
Date: Tue, 26 Apr 2011 09:46:11 -0300
>> +  {
>> +    EMACS_UINT size;
>> +    union {
>> +      struct buffer *buffer;
>> +      struct Lisp_Vector *vector;
>> +    } next;
>> +  };
>>
>> Why do you need to handle buffers specially here?  That sounds wrong.

> Purely as a convenience.  The code always uses the 'next' pointer as a
> struct buffer * (in alloc.c, buffer.c, data.c), or as a struct
> Lisp_Vector * (in alloc.c, fns.c).  As an alternative, we could

Ah, that makes sense (and deserves a brief comment).  Makes me wonder,
tho: why do we need the struct vectorlike_header?

IIUC the core part of your fix is to make all accesses to `size' use the
same type (i.e. Lisp_Vector), right?  Or does the use of the struct
help somehow?

While I understand the problem you're trying to fix, I don't know the
details of the C standard in sufficient detail to know exactly where's
the boundary between safe and unsafe.  And your patch should have
a comment next to the core part of the fix explaining which part is
important and needs to be preserved.

> I thought that the union made the code clearer and I know you

Yes, that's fine.  It just seemed odd to single out buffers.

>> Why does Lisp_Subr need to be a special case (IIUC this applies to
>> XSETTYPED_PSEUDOVECTOR and TYPED_PSEUDOVECTORP as well).

> struct Lisp_Subr has a "size" field but no "next" field.

Ah, yes, that makes sense as well (and deserves another brief comment).


        Stefan




Reply sent to Paul Eggert <eggert <at> cs.ucla.edu>:
You have taken responsibility. (Tue, 26 Apr 2011 20:07:03 GMT) Full text and rfc822 format available.

Notification sent to Paul Eggert <eggert <at> cs.ucla.edu>:
bug acknowledged by developer. (Tue, 26 Apr 2011 20:07:03 GMT) Full text and rfc822 format available.

Message #25 received at 8546-done <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 8546-done <at> debbugs.gnu.org, 8525-done <at> debbugs.gnu.org
Subject: Re: bug#8546: fix for Emacs pseudovector incompatibility with GCC
	4.6.0
Date: Tue, 26 Apr 2011 13:06:35 -0700
On 04/26/11 05:46, Stefan Monnier wrote:
 
> Ah, yes, that makes sense as well (and deserves another brief comment).

OK, thanks, I added comments for that, and for the other areas where
you requested comments, and merged it into the trunk.  I'll mark this
bug as done.  This merge also fixes bug 8525, which I'll also mark.

As mentioned in bug 8525, this fix assumes strtoumax and therefore
may require the Windows build to supply a 2-line inttypes.h if
Windows doesn't already have inttypes.h.  For details please see
<http://debbugs.gnu.org/cgi/bugreport.cgi?bug=8525#14>.




Message #26 received at 8546-done <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 8546-done <at> debbugs.gnu.org, 8525-done <at> debbugs.gnu.org
Subject: Re: bug#8546: fix for Emacs pseudovector incompatibility with GCC
	4.6.0
Date: Tue, 26 Apr 2011 22:11:19 -0300
> OK, thanks, I added comments for that, and for the other areas where
> you requested comments, and merged it into the trunk.  I'll mark this
> bug as done.  This merge also fixes bug 8525, which I'll also mark.

So IIUC your answer to my question "why do we need the struct
vectorlike_header?" is

 /* Header of vector-like objects.  This documents the layout constraints on
    vectors and pseudovectors other than struct Lisp_Subr.  It also prevents
    compilers from being fooled by Emacs's type punning: the XSETPSEUDOVECTOR
    and PSEUDOVECTORP macros cast their pointers to struct vectorlike_header *,
    because when two such pointers potentially alias, a compiler won't
    incorrectly reorder loads and stores to their size fields.  See
    <http://debbugs.gnu.org/cgi/bugreport.cgi?bug=8546>.  */

But that doesn't explain why we need "struct vectorlike_header".
Since the macros could cast to "EMACS_UINT*" instead to access to size
field and that give us the same result.


        Stefan




Message #27 received at 8546-done <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 8546-done <at> debbugs.gnu.org
Subject: Re: bug#8546: fix for Emacs pseudovector incompatibility with GCC
	4.6.0
Date: Tue, 26 Apr 2011 21:30:16 -0700
On 04/26/11 18:11, Stefan Monnier wrote:
> But that doesn't explain why we need "struct vectorlike_header".
> Since the macros could cast to "EMACS_UINT*" instead to access to size
> field and that give us the same result.

The C standard allows implementations where 'EMACS_UINT *' uses a
different representation from 'struct buffer *'.  On a word-based
machine, for example, 'EMACS_INT *' might have a one-word representation,
but 'struct buffer *' would have to be byte-addressible (as all
struct pointers must "smell the same" in C) and therefore might have
two words.  On such a machine, ((EMACS_UINT) p) would return a
different integer than ((EMACS_UINT) &p->size).

In practice, though, I doubt whether Emacs will ever be ported
to a word-based machine, so this objection is mostly theoretical,
and if you prefer the macros to cast to "EMACS_UINT *" instead
I could easily write a patch to do that.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Wed, 25 May 2011 11:24:04 GMT) Full text and rfc822 format available.

This bug report was last modified 12 years and 362 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.