GNU bug report logs -
#8546
fix for Emacs pseudovector incompatibility with GCC 4.6.0
Previous Next
Reported by: Paul Eggert <eggert <at> cs.ucla.edu>
Date: Mon, 25 Apr 2011 07:43:02 UTC
Severity: normal
Done: Paul Eggert <eggert <at> cs.ucla.edu>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 8546 in the body.
You can then email your comments to 8546 AT debbugs.gnu.org in the normal way.
Toggle the display of automated, internal messages from the tracker.
Report forwarded
to
owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org
:
bug#8546
; Package
emacs
.
(Mon, 25 Apr 2011 07:43:02 GMT)
Full text and
rfc822 format available.
Acknowledgement sent
to
Paul Eggert <eggert <at> cs.ucla.edu>
:
New bug report received and forwarded. Copy sent to
bug-gnu-emacs <at> gnu.org
.
(Mon, 25 Apr 2011 07:43:02 GMT)
Full text and
rfc822 format available.
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
Emacs's pseudovector implementation occasionally runs afoul of
optimizations introduced by version 4.6.0 of GCC, and this can break
Emacs. Luckily there's a fix, which I would like to install after some more
testing. I'm publishing it here now for a heads-up.
Another possible fix would be to disable the GCC optimizations.
However, this would no doubt make Emacs slower overall, and anyway the
optimizations are valid according to the rules of C which means that
non-GCC compilers may well be doing them too. It's better to alter
Emacs to avoid the problem, since this is not too much trouble.
Here are more details about the problem.
Building Emacs 23.3 (as well as the Emacs trunk) on Ubuntu 10.10 x86
and GCC 4.6.0, with Emacs configured by "configure
--enable-checking=all", fails with the following symptoms:
`/bin/pwd`/temacs --batch --load loadup bootstrap
/home/eggert/junk/emacs-23.3/src/buffer.c:5177: Emacs fatal error: assertion failed: (XVECTOR (Vbuffer_defaults)->size & (PSEUDOVECTOR_FLAG | PVEC_TYPE_MASK)) == (PSEUDOVECTOR_FLAG | (PVEC_BUFFER))
Aborted
I tracked this down to an incompatibility between Emacs and GCC 4.6.0
on the x86 with -O2 optimization. GCC does a type-based aliasing
optimization, and reorders stores and loads to locations that cannot
possibly be the same location if the types are as the program says
they are. Unfortunately, Emacs's pseudovector implementation dissembles
about the types and therefore runs afoul of the optimization.
Possibly there are subtle bugs induced by this, even when
--enable-checking is not used, but --enable-checking makes the problem
obvious.
Here's the source code, in buffer.c's init_buffer_once:
reset_buffer_local_variables (&buffer_local_symbols, 1);
/* Prevent GC from getting confused. */
buffer_defaults.text = &buffer_defaults.own_text;
buffer_local_symbols.text = &buffer_local_symbols.own_text;
BUF_INTERVALS (&buffer_defaults) = 0;
BUF_INTERVALS (&buffer_local_symbols) = 0;
XSETPVECTYPE (&buffer_defaults, PVEC_BUFFER);
XSETBUFFER (Vbuffer_defaults, &buffer_defaults);
Here are the relevant definitions in lisp.h:
#define XSETPVECTYPE(v,code) ((v)->size |= PSEUDOVECTOR_FLAG | (code))
#define XSETBUFFER(a, b) (XSETPSEUDOVECTOR (a, b, PVEC_BUFFER))
#define XSETPSEUDOVECTOR(a, b, code) \
(XSETVECTOR (a, b), \
eassert ((XVECTOR (a)->size & (PSEUDOVECTOR_FLAG | PVEC_TYPE_MASK)) \
== (PSEUDOVECTOR_FLAG | (code))))
Here's the generated x86 code, with the problem highlighted:
movl $buffer_local_symbols, %eax
movl $1, %edx
call reset_buffer_local_variables
movl $buffer_defaults, %eax
orl $5, %eax
movl %eax, Vbuffer_defaults
andl $-8, %eax
0=> movl (%eax), %eax
1=> orl $1073872896, buffer_defaults
movl $buffer_defaults+8, buffer_defaults+76
andl $1082129920, %eax
2=> cmpl $1073872896, %eax
movl $buffer_local_symbols+8, buffer_local_symbols+76
movl $0, buffer_defaults+64
movl $0, buffer_local_symbols+64
je .L2396
movl suppress_checking, %eax
testl %eax, %eax
je .L2398
.L2396:
The code marked (1) implements the expansion of XSETPVECTYPE, and sets
buffer_defaults.size to 0x40020000, the mark for a buffer. The code
marked (2) is part of the expansion of the eassert, and it checks that
XVECTOR (Vbuffer_defaults)->size has the proper flag and code. But
(2) is relying on a *cached* copy of the size, which was loaded in (0),
and (0) precedes (1). So, the assertion fails.
The patch is attached. It's against my copy of Emacs, which has a few
other fixes that I haven't had time to merge to the trunk yet. But it
should give a good feel for what's involved.
[pseudovec.txt.gz (application/x-gzip, attachment)]
Information forwarded
to
owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org
:
bug#8546
; Package
emacs
.
(Mon, 25 Apr 2011 10:24:02 GMT)
Full text and
rfc822 format available.
Message #8 received at 8546 <at> debbugs.gnu.org (full text, mbox):
> Date: Mon, 25 Apr 2011 00:41:58 -0700
> From: Paul Eggert <eggert <at> cs.ucla.edu>
>
> #define XSETPSEUDOVECTOR(a, b, code) \
> (XSETVECTOR (a, b), \
> eassert ((XVECTOR (a)->size & (PSEUDOVECTOR_FLAG | PVEC_TYPE_MASK)) \
> == (PSEUDOVECTOR_FLAG | (code))))
>
> Here's the generated x86 code, with the problem highlighted:
>
> movl $buffer_local_symbols, %eax
> movl $1, %edx
> call reset_buffer_local_variables
> movl $buffer_defaults, %eax
> orl $5, %eax
> movl %eax, Vbuffer_defaults
> andl $-8, %eax
> 0=> movl (%eax), %eax
> 1=> orl $1073872896, buffer_defaults
> movl $buffer_defaults+8, buffer_defaults+76
> andl $1082129920, %eax
> 2=> cmpl $1073872896, %eax
> movl $buffer_local_symbols+8, buffer_local_symbols+76
> movl $0, buffer_defaults+64
> movl $0, buffer_local_symbols+64
> je .L2396
> movl suppress_checking, %eax
> testl %eax, %eax
> je .L2398
> .L2396:
>
> The code marked (1) implements the expansion of XSETPVECTYPE, and sets
> buffer_defaults.size to 0x40020000, the mark for a buffer. The code
> marked (2) is part of the expansion of the eassert, and it checks that
> XVECTOR (Vbuffer_defaults)->size has the proper flag and code. But
> (2) is relying on a *cached* copy of the size, which was loaded in (0),
> and (0) precedes (1). So, the assertion fails.
Could you please tell more what is it in the Emacs macros that
triggers this problem? You say that "Emacs's pseudovector
implementation dissembles about the types", but could you please point
out where in the code this happens? I'm afraid I don't see it, but
then I'm no expert on compiler optimizations and on how they interact
with C types.
TIA
Information forwarded
to
owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org
:
bug#8546
; Package
emacs
.
(Mon, 25 Apr 2011 14:06:01 GMT)
Full text and
rfc822 format available.
Message #11 received at 8546 <at> debbugs.gnu.org (full text, mbox):
> The patch is attached. It's against my copy of Emacs, which has a few
> other fixes that I haven't had time to merge to the trunk yet. But it
> should give a good feel for what's involved.
[ Please don't compress patches for review. 57KB is not that large. ]
Thanks for tackling this problem. A few questions/comments on your
patch (which I haven't reviewed completely):
+struct vector_header
I'd call it vectorlike_header.
+ {
+ EMACS_UINT size;
+ union {
+ struct buffer *buffer;
+ struct Lisp_Vector *vector;
+ } next;
+ };
Why do you need to handle buffers specially here? That sounds wrong.
+#define XVECTOR_SIZE(a) (XVECTOR (a)->header.size + 0)
why not use ASIZE?
+#define XVECTOR_HEADER_SIZE(a) (((struct vector_header *) XPNTR (a))->size + 0)
why do we need this variant with this weird set of type casts?
+ * lread.c (defsubr): Use XSETTYPED_PVECTYPE, since Lisp_Subr is a
+ special case.
Why does Lisp_Subr need to be a special case (IIUC this applies to
XSETTYPED_PSEUDOVECTOR and TYPED_PSEUDOVECTORP as well).
+#define XSETPVECTYPESIZE(v, code, sizeval) \
+ ((v)->header.size = PSEUDOVECTOR_FLAG | (code) | (sizeval))
Sounds good.
Stefan
Information forwarded
to
owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org
:
bug#8546
; Package
emacs
.
(Mon, 25 Apr 2011 19:31:02 GMT)
Full text and
rfc822 format available.
Message #14 received at 8546 <at> debbugs.gnu.org (full text, mbox):
On 04/25/11 03:23, Eli Zaretskii wrote:
> Could you please tell more what is it in the Emacs macros that
> triggers this problem? You say that "Emacs's pseudovector
> implementation dissembles about the types", but could you please point
> out where in the code this happens?
Sure. First, let me briefly describe the optimization, which
is allowed by the C standard. Here's an idealized example:
struct a { int size; ...; };
struct b { int size; ...; };
struct a *p = ...;
struct b *q = ...;
p->size = 0;
q->size = 1;
return p->size;
ISO C allows a compiler to optimize the last statement
to "return 0;". If P and Q point to the
same memory location, storing through P and loading through
Q results in undefined behavior, because P and Q are incompatible
types. When the behavior is undefined, the compiler is allowed
to generate whatever code it likes, including the optimized code.
Most of Emacs is OK with this optimization. However, the pseudovector
code currently does stuff like this when checking is enabled
(I am giving the preprocessor output of XSETPVECTYPE followed by
XSETBUFFER, and a simplified version of it assuming the x86
to make things clearer):
struct buffer *b = ...;
b->size |= 0x4020000;
Lisp_Object o = (Lisp_Object) b;
struct Lisp_Vector *v = (struct Lisp_Vector *) o;
if ((v->size & 0x4020000) != 0x4020000)
abort ();
It's the conversion of struct buffer * to struct Lisp_Vector * that
results in undefined behavior, for the same reason that the
earlier example does: the code is reading a word via a struct Lisp_Vector *
pointer, which means that the compiler is free to (and GCC does) delay
the earlier store of that word until after the check, which means the code
aborts.
Information forwarded
to
owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org
:
bug#8546
; Package
emacs
.
(Mon, 25 Apr 2011 23:14:02 GMT)
Full text and
rfc822 format available.
Message #17 received at 8546 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
On 04/25/11 07:05, Stefan Monnier wrote:
> +struct vector_header
>
> I'd call it vectorlike_header.
OK, I'll do that.
> +#define XVECTOR_SIZE(a) (XVECTOR (a)->header.size + 0)
>
> why not use ASIZE?
No good reason. Thanks, I'll do that too.
> + {
> + EMACS_UINT size;
> + union {
> + struct buffer *buffer;
> + struct Lisp_Vector *vector;
> + } next;
> + };
>
> Why do you need to handle buffers specially here? That sounds wrong.
Purely as a convenience. The code always uses the 'next' pointer as a
struct buffer * (in alloc.c, buffer.c, data.c), or as a struct
Lisp_Vector * (in alloc.c, fns.c). As an alternative, we could
replace the above with
{
EMACS_UINT size;
struct vectorlike_header *next;
};
and then replace uses like this:
for (b = all_buffers; b && b != po; b = b->header.next.buffer)
with uses like this:
for (b = all_buffers; b && b != po; b = (struct buffer *) b->header.next)
I thought that the union made the code clearer and I know you
normally dislike casts, but if you prefer the style with casts it'd be
easy to do that too.
> +#define XVECTOR_HEADER_SIZE(a) (((struct vector_header *) XPNTR (a))->size + 0)
>
> why do we need this variant with this weird set of type casts?
I'll remove it. It is used in only one place, in
XSETTYPED_PSEUDOVECTOR, where the idea is a key part of the
antialiasing fix. But there's no need to break it out as a separate
macro, so I'll fold it into XSETTYPED_PSEUDOVECTOR.
> + * lread.c (defsubr): Use XSETTYPED_PVECTYPE, since Lisp_Subr is a
> + special case.
>
> Why does Lisp_Subr need to be a special case (IIUC this applies to
> XSETTYPED_PSEUDOVECTOR and TYPED_PSEUDOVECTORP as well).
struct Lisp_Subr has a "size" field but no "next" field. I didn't
change its layout to contain a struct vectorlike_header field, as that
would have added an extra word that isn't needed. It would be safer,
from a C standards point of view, to spend the extra word and make
struct Lisp_Subr be like the other vector-like objects, but in
practice I doubt whether any practical optimizing compiler would break
that part of the code; so I kept it as a special case.
If you prefer the simpler and cleaner (but less space-efficient)
variant, where struct Lisp_Subr has a "next" field like all the other
vector-like data structures, that would be easy to do.
Attached is a revised patch with the above comments taken into account.
[pseudovec-2.txt (text/plain, attachment)]
Information forwarded
to
owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org
:
bug#8546
; Package
emacs
.
(Tue, 26 Apr 2011 12:47:01 GMT)
Full text and
rfc822 format available.
Message #20 received at 8546 <at> debbugs.gnu.org (full text, mbox):
>> + {
>> + EMACS_UINT size;
>> + union {
>> + struct buffer *buffer;
>> + struct Lisp_Vector *vector;
>> + } next;
>> + };
>>
>> Why do you need to handle buffers specially here? That sounds wrong.
> Purely as a convenience. The code always uses the 'next' pointer as a
> struct buffer * (in alloc.c, buffer.c, data.c), or as a struct
> Lisp_Vector * (in alloc.c, fns.c). As an alternative, we could
Ah, that makes sense (and deserves a brief comment). Makes me wonder,
tho: why do we need the struct vectorlike_header?
IIUC the core part of your fix is to make all accesses to `size' use the
same type (i.e. Lisp_Vector), right? Or does the use of the struct
help somehow?
While I understand the problem you're trying to fix, I don't know the
details of the C standard in sufficient detail to know exactly where's
the boundary between safe and unsafe. And your patch should have
a comment next to the core part of the fix explaining which part is
important and needs to be preserved.
> I thought that the union made the code clearer and I know you
Yes, that's fine. It just seemed odd to single out buffers.
>> Why does Lisp_Subr need to be a special case (IIUC this applies to
>> XSETTYPED_PSEUDOVECTOR and TYPED_PSEUDOVECTORP as well).
> struct Lisp_Subr has a "size" field but no "next" field.
Ah, yes, that makes sense as well (and deserves another brief comment).
Stefan
Reply sent
to
Paul Eggert <eggert <at> cs.ucla.edu>
:
You have taken responsibility.
(Tue, 26 Apr 2011 20:07:03 GMT)
Full text and
rfc822 format available.
Notification sent
to
Paul Eggert <eggert <at> cs.ucla.edu>
:
bug acknowledged by developer.
(Tue, 26 Apr 2011 20:07:03 GMT)
Full text and
rfc822 format available.
Message #25 received at 8546-done <at> debbugs.gnu.org (full text, mbox):
On 04/26/11 05:46, Stefan Monnier wrote:
> Ah, yes, that makes sense as well (and deserves another brief comment).
OK, thanks, I added comments for that, and for the other areas where
you requested comments, and merged it into the trunk. I'll mark this
bug as done. This merge also fixes bug 8525, which I'll also mark.
As mentioned in bug 8525, this fix assumes strtoumax and therefore
may require the Windows build to supply a 2-line inttypes.h if
Windows doesn't already have inttypes.h. For details please see
<http://debbugs.gnu.org/cgi/bugreport.cgi?bug=8525#14>.
Message #26 received at 8546-done <at> debbugs.gnu.org (full text, mbox):
> OK, thanks, I added comments for that, and for the other areas where
> you requested comments, and merged it into the trunk. I'll mark this
> bug as done. This merge also fixes bug 8525, which I'll also mark.
So IIUC your answer to my question "why do we need the struct
vectorlike_header?" is
/* Header of vector-like objects. This documents the layout constraints on
vectors and pseudovectors other than struct Lisp_Subr. It also prevents
compilers from being fooled by Emacs's type punning: the XSETPSEUDOVECTOR
and PSEUDOVECTORP macros cast their pointers to struct vectorlike_header *,
because when two such pointers potentially alias, a compiler won't
incorrectly reorder loads and stores to their size fields. See
<http://debbugs.gnu.org/cgi/bugreport.cgi?bug=8546>. */
But that doesn't explain why we need "struct vectorlike_header".
Since the macros could cast to "EMACS_UINT*" instead to access to size
field and that give us the same result.
Stefan
Message #27 received at 8546-done <at> debbugs.gnu.org (full text, mbox):
On 04/26/11 18:11, Stefan Monnier wrote:
> But that doesn't explain why we need "struct vectorlike_header".
> Since the macros could cast to "EMACS_UINT*" instead to access to size
> field and that give us the same result.
The C standard allows implementations where 'EMACS_UINT *' uses a
different representation from 'struct buffer *'. On a word-based
machine, for example, 'EMACS_INT *' might have a one-word representation,
but 'struct buffer *' would have to be byte-addressible (as all
struct pointers must "smell the same" in C) and therefore might have
two words. On such a machine, ((EMACS_UINT) p) would return a
different integer than ((EMACS_UINT) &p->size).
In practice, though, I doubt whether Emacs will ever be ported
to a word-based machine, so this objection is mostly theoretical,
and if you prefer the macros to cast to "EMACS_UINT *" instead
I could easily write a patch to do that.
bug archived.
Request was from
Debbugs Internal Request <help-debbugs <at> gnu.org>
to
internal_control <at> debbugs.gnu.org
.
(Wed, 25 May 2011 11:24:04 GMT)
Full text and
rfc822 format available.
This bug report was last modified 13 years and 357 days ago.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.