GNU bug report logs - #34757
Invalid bytecode from byte compiler

Previous Next

Package: emacs;

Reported by: chuntaro <chuntaro <at> sakura-games.jp>

Date: Tue, 5 Mar 2019 15:30:02 UTC

Severity: normal

Done: Stefan Monnier <monnier <at> iro.umontreal.ca>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 34757 in the body.
You can then email your comments to 34757 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#34757; Package emacs. (Tue, 05 Mar 2019 15:30:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to chuntaro <chuntaro <at> sakura-games.jp>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Tue, 05 Mar 2019 15:30:03 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: chuntaro <chuntaro <at> sakura-games.jp>
To: bug-gnu-emacs <at> gnu.org
Subject: Invalid bytecode from byte compiler
Date: Tue, 5 Mar 2019 17:01:00 +0900
Invalid bytecode is output and error occurs when executed.

Byte compile the following code.

bug.el
----------------------------------------------------------------
;;; -*- lexical-binding: t; -*-

(let ((a 2))
  (print 1)
  (setq a 1))
----------------------------------------------------------------

$ emacs --batch -f batch-byte-compile bug.el

bug.elc
----------------------------------------------------------------
;ELC
...
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;


(funcall 2 'print 1)
1
----------------------------------------------------------------

$ emacs --batch -l bug.elc
Invalid function: 2

It occurs in versions 25.2, 26.1 and HEAD.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#34757; Package emacs. (Fri, 08 Mar 2019 13:19:02 GMT) Full text and rfc822 format available.

Message #8 received at 34757 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: chuntaro <chuntaro <at> sakura-games.jp>
Cc: 34757 <at> debbugs.gnu.org
Subject: Re: bug#34757: Invalid bytecode from byte compiler
Date: Fri, 08 Mar 2019 15:18:31 +0200
> From: chuntaro <chuntaro <at> sakura-games.jp>
> Date: Tue, 5 Mar 2019 17:01:00 +0900
> 
> Invalid bytecode is output and error occurs when executed.
> 
> Byte compile the following code.
> 
> bug.el
> ----------------------------------------------------------------
> ;;; -*- lexical-binding: t; -*-
> 
> (let ((a 2))
>   (print 1)
>   (setq a 1))
> ----------------------------------------------------------------
> 
> $ emacs --batch -f batch-byte-compile bug.el
> 
> bug.elc
> ----------------------------------------------------------------
> ;ELC
> ...
> ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
> ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
> 
> 
> (funcall 2 'print 1)
> 1
> ----------------------------------------------------------------
> 
> $ emacs --batch -l bug.elc
> Invalid function: 2

Why is that a problem?  The Lisp code is invalid, and the error
message says that much.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#34757; Package emacs. (Fri, 08 Mar 2019 13:51:01 GMT) Full text and rfc822 format available.

Message #11 received at 34757 <at> debbugs.gnu.org (full text, mbox):

From: Michael Heerdegen <michael_heerdegen <at> web.de>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 34757 <at> debbugs.gnu.org, chuntaro <chuntaro <at> sakura-games.jp>
Subject: Re: bug#34757: Invalid bytecode from byte compiler
Date: Fri, 08 Mar 2019 14:50:33 +0100
Eli Zaretskii <eliz <at> gnu.org> writes:

> > From: chuntaro <chuntaro <at> sakura-games.jp>
> > Date: Tue, 5 Mar 2019 17:01:00 +0900
> >
> > Invalid bytecode is output and error occurs when executed.
> >
> > Byte compile the following code.
> >
> > bug.el
> > ----------------------------------------------------------------
> > ;;; -*- lexical-binding: t; -*-
> >
> > (let ((a 2))
> >   (print 1)
> >   (setq a 1))
> > ----------------------------------------------------------------
> >
> > $ emacs --batch -f batch-byte-compile bug.el
> >
> > bug.elc
> > ----------------------------------------------------------------
> > ;ELC
> > ...
> > ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
> > ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
> >
> >
> > (funcall 2 'print 1)
> > 1
> > ----------------------------------------------------------------
> >
> > $ emacs --batch -l bug.elc
> > Invalid function: 2
>
> Why is that a problem?  The Lisp code is invalid, and the error
> message says that much.

Please take a closer look: The Lisp code in the compiled file is
invalid, but the code in the .el isn't.

Michael.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#34757; Package emacs. (Fri, 08 Mar 2019 14:37:02 GMT) Full text and rfc822 format available.

Message #14 received at 34757 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Michael Heerdegen <michael_heerdegen <at> web.de>
Cc: 34757 <at> debbugs.gnu.org, chuntaro <at> sakura-games.jp
Subject: Re: bug#34757: Invalid bytecode from byte compiler
Date: Fri, 08 Mar 2019 16:36:04 +0200
> From: Michael Heerdegen <michael_heerdegen <at> web.de>
> Cc: chuntaro <chuntaro <at> sakura-games.jp>,  34757 <at> debbugs.gnu.org
> Date: Fri, 08 Mar 2019 14:50:33 +0100
> 
> Please take a closer look: The Lisp code in the compiled file is
> invalid, but the code in the .el isn't.

Right, sorry for my confusion.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#34757; Package emacs. (Fri, 08 Mar 2019 21:15:01 GMT) Full text and rfc822 format available.

Message #17 received at 34757 <at> debbugs.gnu.org (full text, mbox):

From: Pip Cet <pipcet <at> gmail.com>
To: chuntaro <chuntaro <at> sakura-games.jp>
Cc: 34757 <at> debbugs.gnu.org
Subject: Re: bug#34757: Invalid bytecode from byte compiler
Date: Fri, 8 Mar 2019 21:13:37 +0000
[Message part 1 (text/plain, inline)]
On Tue, Mar 5, 2019 at 3:30 PM chuntaro <chuntaro <at> sakura-games.jp> wrote:
>
> Invalid bytecode is output and error occurs when executed.

If I'm looking at this correctly, the problem is that the byte code
which is generated in an intermediate step:

0       constant  2
1       constant  print
2       constant  1
3       call      1
4       discard
5       constant  3
6       return

is considered a "trivial function" by byte-compile-out-toplevel, which
assumes that all values on the stack are used by the call.

We could fix byte-compile-out-toplevel to properly analyze how many
stack arguments the call takes, but this patch simply treats forms
like this as nontrivial:

diff --git a/lisp/emacs-lisp/bytecomp.el b/lisp/emacs-lisp/bytecomp.el
index 0b8f8824b4c..4e54e08ce14 100644
--- a/lisp/emacs-lisp/bytecomp.el
+++ b/lisp/emacs-lisp/bytecomp.el
@@ -3025,6 +3025,7 @@ byte-compile-out-toplevel
                        (or (null (cdr rest))
                            (and (memq output-type '(file progn t))
                                 (cdr (cdr rest))
+                                (eql (length body) (cdr (car rest)))
                                 (eq (car (nth 1 rest)) 'byte-discard)
                                 (progn (setq rest (cdr rest)) t))))
                   (setq maycall nil)    ; Only allow one real function call.
[emacs-34757.diff (text/x-patch, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#34757; Package emacs. (Fri, 15 Mar 2019 08:09:01 GMT) Full text and rfc822 format available.

Message #20 received at 34757 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Pip Cet <pipcet <at> gmail.com>, Stefan Monnier <monnier <at> iro.umontreal.ca> 
Cc: 34757 <at> debbugs.gnu.org, chuntaro <at> sakura-games.jp
Subject: Re: bug#34757: Invalid bytecode from byte compiler
Date: Fri, 15 Mar 2019 10:08:12 +0200
> From: Pip Cet <pipcet <at> gmail.com>
> Date: Fri, 8 Mar 2019 21:13:37 +0000
> Cc: 34757 <at> debbugs.gnu.org
> 
> On Tue, Mar 5, 2019 at 3:30 PM chuntaro <chuntaro <at> sakura-games.jp> wrote:
> >
> > Invalid bytecode is output and error occurs when executed.
> 
> If I'm looking at this correctly, the problem is that the byte code
> which is generated in an intermediate step:
> 
> 0       constant  2
> 1       constant  print
> 2       constant  1
> 3       call      1
> 4       discard
> 5       constant  3
> 6       return
> 
> is considered a "trivial function" by byte-compile-out-toplevel, which
> assumes that all values on the stack are used by the call.
> 
> We could fix byte-compile-out-toplevel to properly analyze how many
> stack arguments the call takes, but this patch simply treats forms
> like this as nontrivial:
> 
> diff --git a/lisp/emacs-lisp/bytecomp.el b/lisp/emacs-lisp/bytecomp.el
> index 0b8f8824b4c..4e54e08ce14 100644
> --- a/lisp/emacs-lisp/bytecomp.el
> +++ b/lisp/emacs-lisp/bytecomp.el
> @@ -3025,6 +3025,7 @@ byte-compile-out-toplevel
>                         (or (null (cdr rest))
>                             (and (memq output-type '(file progn t))
>                                  (cdr (cdr rest))
> +                                (eql (length body) (cdr (car rest)))
>                                  (eq (car (nth 1 rest)) 'byte-discard)
>                                  (progn (setq rest (cdr rest)) t))))
>                    (setq maycall nil)    ; Only allow one real function call.

Stefan, any comments?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#34757; Package emacs. (Fri, 15 Mar 2019 14:09:02 GMT) Full text and rfc822 format available.

Message #23 received at 34757 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> IRO.UMontreal.CA>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: chuntaro <at> sakura-games.jp, Pip Cet <pipcet <at> gmail.com>, 34757 <at> debbugs.gnu.org
Subject: Re: bug#34757: Invalid bytecode from byte compiler
Date: Fri, 15 Mar 2019 10:08:48 -0400
>> diff --git a/lisp/emacs-lisp/bytecomp.el b/lisp/emacs-lisp/bytecomp.el
>> index 0b8f8824b4c..4e54e08ce14 100644
>> --- a/lisp/emacs-lisp/bytecomp.el
>> +++ b/lisp/emacs-lisp/bytecomp.el
>> @@ -3025,6 +3025,7 @@ byte-compile-out-toplevel
>>                         (or (null (cdr rest))
>>                             (and (memq output-type '(file progn t))
>>                                  (cdr (cdr rest))
>> +                                (eql (length body) (cdr (car rest)))
>>                                  (eq (car (nth 1 rest)) 'byte-discard)
>>                                  (progn (setq rest (cdr rest)) t))))
>>                    (setq maycall nil)    ; Only allow one real function call.
>
> Stefan, any comments?

Looks good to me, thanks.

I'm personally running with the following patch instead, which is much
more risky and hasn't been sufficiently tested yet.


        Stefan


diff --git a/lisp/emacs-lisp/bytecomp.el b/lisp/emacs-lisp/bytecomp.el
index f46cab2c17..573b0489d4 100644
*** a/lisp/emacs-lisp/bytecomp.el
--- b/lisp/emacs-lisp/bytecomp.el
***************
*** 2990,3059 ****
        (setq byte-compile-output
  	    (byte-optimize-lapcode byte-compile-output)))
  
!   ;; Decompile trivial functions:
!   ;; only constants and variables, or a single funcall except in lambdas.
!   ;; Except for Lisp_Compiled objects, forms like (foo "hi")
!   ;; are still quicker than (byte-code "..." [foo "hi"] 2).
!   ;; Note that even (quote foo) must be parsed just as any subr by the
!   ;; interpreter, so quote should be compiled into byte-code in some contexts.
!   ;; What to leave uncompiled:
!   ;;	lambda	-> never.  we used to leave it uncompiled if the body was
!   ;;		   a single atom, but that causes confusion if the docstring
!   ;;		   uses the (file . pos) syntax.  Besides, now that we have
!   ;;		   the Lisp_Compiled type, the compiled form is faster.
!   ;;	eval	-> atom, quote or (function atom atom atom)
!   ;;	progn	-> as <<same-as-eval>> or (progn <<same-as-eval>> atom)
!   ;;	file	-> as progn, but takes both quotes and atoms, and longer forms.
!   (let (rest
! 	(maycall (not (eq output-type 'lambda))) ; t if we may make a funcall.
! 	tmp body)
!     (cond
!      ;; #### This should be split out into byte-compile-nontrivial-function-p.
!      ((or (eq output-type 'lambda)
! 	  (nthcdr (if (eq output-type 'file) 50 8) byte-compile-output)
! 	  (assq 'TAG byte-compile-output) ; Not necessary, but speeds up a bit.
! 	  (not (setq tmp (assq 'byte-return byte-compile-output)))
! 	  (progn
! 	    (setq rest (nreverse
! 			(cdr (memq tmp (reverse byte-compile-output)))))
! 	    (while
!                 (cond
!                  ((memq (car (car rest)) '(byte-varref byte-constant))
!                   (setq tmp (car (cdr (car rest))))
!                   (if (if (eq (car (car rest)) 'byte-constant)
!                           (or (consp tmp)
!                               (and (symbolp tmp)
!                                    (not (macroexp--const-symbol-p tmp)))))
!                       (if maycall
!                           (setq body (cons (list 'quote tmp) body)))
!                     (setq body (cons tmp body))))
!                  ((and maycall
!                        ;; Allow a funcall if at most one atom follows it.
!                        (null (nthcdr 3 rest))
!                        (setq tmp (get (car (car rest)) 'byte-opcode-invert))
!                        (or (null (cdr rest))
!                            (and (memq output-type '(file progn t))
!                                 (cdr (cdr rest))
!                                 (eq (car (nth 1 rest)) 'byte-discard)
!                                 (progn (setq rest (cdr rest)) t))))
!                   (setq maycall nil)	; Only allow one real function call.
!                   (setq body (nreverse body))
!                   (setq body (list
!                               (if (and (eq tmp 'funcall)
!                                        (eq (car-safe (car body)) 'quote)
! 				       (symbolp (nth 1 (car body))))
!                                   (cons (nth 1 (car body)) (cdr body))
!                                 (cons tmp body))))
!                   (or (eq output-type 'file)
!                       (not (delq nil (mapcar 'consp (cdr (car body))))))))
! 	      (setq rest (cdr rest)))
! 	    rest))
!       (let ((byte-compile-vector (byte-compile-constants-vector)))
! 	(list 'byte-code (byte-compile-lapcode byte-compile-output)
! 	      byte-compile-vector byte-compile-maxdepth)))
!      ;; it's a trivial function
!      ((cdr body) (cons 'progn (nreverse body)))
!      ((car body)))))
  
  ;; Given BODY, compile it and return a new body.
  (defun byte-compile-top-level-body (body &optional for-effect)
--- 2993,3013 ----
        (setq byte-compile-output
  	    (byte-optimize-lapcode byte-compile-output)))
  
!   (cond
!    ((and (not (eq output-type 'lambda)) ;;The caller really wants (byte-code ...)
!          (null (cddr byte-compile-output))
!          (eq (car (nth 0 byte-compile-output)) 'byte-constant)
!          (eq (car (nth 1 byte-compile-output)) 'byte-return))
!     ;; Special case for trivial code returning a function.
!     ;; This is so that when compiling #'(lambda ...) we return
!     ;; a #[...] byte-code object instead of a (byte-code "...")
!     ;; expression that returns this object.  It's not indispensable,
!     ;; but it's cleaner.
!     (macroexp-quote (cadr (nth 0 byte-compile-output))))
!    (t
!     (let ((byte-compile-vector (byte-compile-constants-vector)))
!       (list 'byte-code (byte-compile-lapcode byte-compile-output)
! 	    byte-compile-vector byte-compile-maxdepth)))))
  
  ;; Given BODY, compile it and return a new body.
  (defun byte-compile-top-level-body (body &optional for-effect)




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#34757; Package emacs. (Fri, 15 Mar 2019 19:42:01 GMT) Full text and rfc822 format available.

Message #26 received at 34757 <at> debbugs.gnu.org (full text, mbox):

From: Pip Cet <pipcet <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Eli Zaretskii <eliz <at> gnu.org>, chuntaro <at> sakura-games.jp,
 34757 <at> debbugs.gnu.org
Subject: Re: bug#34757: Invalid bytecode from byte compiler
Date: Fri, 15 Mar 2019 19:40:48 +0000
On Fri, Mar 15, 2019 at 2:08 PM Stefan Monnier <monnier <at> iro.umontreal.ca> wrote:
> >> diff --git a/lisp/emacs-lisp/bytecomp.el b/lisp/emacs-lisp/bytecomp.el
> >> index 0b8f8824b4c..4e54e08ce14 100644
> >> --- a/lisp/emacs-lisp/bytecomp.el
> >> +++ b/lisp/emacs-lisp/bytecomp.el
> >> @@ -3025,6 +3025,7 @@ byte-compile-out-toplevel
> >>                         (or (null (cdr rest))
> >>                             (and (memq output-type '(file progn t))
> >>                                  (cdr (cdr rest))
> >> +                                (eql (length body) (cdr (car rest)))
> >>                                  (eq (car (nth 1 rest)) 'byte-discard)
> >>                                  (progn (setq rest (cdr rest)) t))))
> >>                    (setq maycall nil)    ; Only allow one real function call.
> >
> > Stefan, any comments?
>
> Looks good to me, thanks.
>
> I'm personally running with the following patch instead, which is much
> more risky and hasn't been sufficiently tested yet.

Just to be sure I understand correctly, you would like to remove the
decompilation of trivial function calls entirely?

That seems good to me.

It seems the special case is necessary to avoid compilation errors,
and that it's used for (byte-compile 3), so I think the comment could
be improved a little.

I'm not sure this case can actually happen without doing something
silly, but (byte-compile '(internal-get-closed-var 0)) throws an error
with Stefan's patch, because the byte code is (byte-constant . 0)
(byte-return).




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#34757; Package emacs. (Fri, 15 Mar 2019 20:31:02 GMT) Full text and rfc822 format available.

Message #29 received at 34757 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> IRO.UMontreal.CA>
To: Pip Cet <pipcet <at> gmail.com>
Cc: Eli Zaretskii <eliz <at> gnu.org>, chuntaro <at> sakura-games.jp,
 34757 <at> debbugs.gnu.org
Subject: Re: bug#34757: Invalid bytecode from byte compiler
Date: Fri, 15 Mar 2019 16:30:10 -0400
> Just to be sure I understand correctly, you would like to remove the
> decompilation of trivial function calls entirely?

Yes, tho the main motivation was to try and figure out what the
decompilation is useful for.

This only affects "open code" (i.e. not the content of functions, which
are already never decompiled), so the impact should be minor and it
removes a rather complicated and brittle chunk of code whose purpose is
not clear.  It was originally introduced when we didn't have
byte-compiled function objects, so its main purpose was one of
performance, to avoid pessimizing the code by replacing trivial function
calls with more costly (byte-code "...") expressions but nowadays such
(byte-code "...") expressions only occur basically at the top-level of
.elc files where such pessimization would be unnoticeable in terms
of performance.

It does impact the readability of .elc files, OTOH, so I'm not
completely happy with the result when considering the few cases where
I was happy to be able to make sense of a .elc file to better understand
the source of a problem (after all, that's why I wrote the
elisp-byte-code-mode).

> It seems the special case is necessary to avoid compilation errors,

I haven't found it to be really necessary, no.

> and that it's used for (byte-compile 3), so I think the comment could
> be improved a little.

(byte-compile 3) seems to work correctly here even without the
special case.  It returns (byte-code "\300\207" [3] 1) which is indeed
a correct expression that evaluates to 3 (just like the argument to
`byte-compile` was an expression whose evaluation returns 3).

Let's not forget that what `byte-compile` tries to do is to preserve the
invariant that

    (eval EXP) ≃ (eval (byte-compile EXP))

This said, if you remove the special case, you will bump into
a corner-case bug in `byte-compile` which happens because that function
incorrectly considers that `byte-compile-top-level` returns a value
whereas in reality it returns an expression (just like `byte-compile`):
the decompilation happens to turn expressions that return constant
values (like byte-compiled functions) into their value (as an
optimization which relies on the fact that those objects are
self-evaluating), so if you remove it, you then bump into this bug of
byte-compile.  The patch below would fix this bug.

But indeed the decompilation of constants is handy since that's what
people expect from `byte-compile`.  When I (byte-compile '(lambda (x)
(foo))) I expect to receive a byte-compiled function, and not
a byte-code expression which when evaluated will return that
byte-compiled function.

> I'm not sure this case can actually happen without doing something
> silly, but (byte-compile '(internal-get-closed-var 0)) throws an error
> with Stefan's patch, because the byte code is (byte-constant . 0)
> (byte-return).

This source code is arguably invalid, so it's not a real problem, but
indeed we should burp in that case.  I can't remember enough of how
internal-get-closed-var is handled to know where'd the bug be, offhand.


        Stefan


diff --git a/lisp/emacs-lisp/bytecomp.el b/lisp/emacs-lisp/bytecomp.el
index f46cab2c17..ae17553d0c 100644
--- a/lisp/emacs-lisp/bytecomp.el
+++ b/lisp/emacs-lisp/bytecomp.el
@@ -2674,7 +2674,11 @@ byte-compile
          (setq fun (byte-compile-top-level fun nil 'eval)))
         (if macro (push 'macro fun))
         (if (symbolp form)
-            (fset form fun)
+            ;; byte-compile returns an *expression* equivalent to the
+            ;; `fun' expression, so we need to evaluate it, tho normally
+            ;; this is not needed because the expression is just a constant
+            ;; byte-code object, which is self-evaluating.
+            (fset form (eval fun t))
           fun)))))))
 
 (defun byte-compile-sexp (sexp)




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#34757; Package emacs. (Sat, 16 Mar 2019 16:53:01 GMT) Full text and rfc822 format available.

Message #32 received at 34757 <at> debbugs.gnu.org (full text, mbox):

From: Pip Cet <pipcet <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Eli Zaretskii <eliz <at> gnu.org>, chuntaro <at> sakura-games.jp,
 34757 <at> debbugs.gnu.org
Subject: Re: bug#34757: Invalid bytecode from byte compiler
Date: Sat, 16 Mar 2019 16:51:13 +0000
On Fri, Mar 15, 2019 at 8:30 PM Stefan Monnier <monnier <at> iro.umontreal.ca> wrote:
>
> > Just to be sure I understand correctly, you would like to remove the
> > decompilation of trivial function calls entirely?
>
> Yes, tho the main motivation was to try and figure out what the
> decompilation is useful for.

Thanks for explaining!

> This only affects "open code" (i.e. not the content of functions, which
> are already never decompiled), so the impact should be minor and it
> removes a rather complicated and brittle chunk of code whose purpose is
> not clear.  It was originally introduced when we didn't have
> byte-compiled function objects, so its main purpose was one of
> performance, to avoid pessimizing the code by replacing trivial function
> calls with more costly (byte-code "...") expressions but nowadays such
> (byte-code "...") expressions only occur basically at the top-level of
> .elc files where such pessimization would be unnoticeable in terms
> of performance.

I agree completely, for what it's worth.

> It does impact the readability of .elc files, OTOH, so I'm not
> completely happy with the result when considering the few cases where
> I was happy to be able to make sense of a .elc file to better understand
> the source of a problem (after all, that's why I wrote the
> elisp-byte-code-mode).

I can speak only for myself, but I think the byte-compiler "magically"
deciding to turn (rare) top-level non-defuns into plain code rather
than byte code is confusing. It's much better with your patches.

> > It seems the special case is necessary to avoid compilation errors,
>
> I haven't found it to be really necessary, no.

Well, you fixed it with the second patch.

> > and that it's used for (byte-compile 3), so I think the comment could
> > be improved a little.
>
> (byte-compile 3) seems to work correctly here even without the
> special case.  It returns (byte-code "\300\207" [3] 1) which is indeed
> a correct expression that evaluates to 3 (just like the argument to
> `byte-compile` was an expression whose evaluation returns 3).

No argument here. The case branch affects what happens when
(byte-compile 3) is called, but isn't necessary for the result to be
correct, and the comment can be misread to imply it's irrelevant in
this case.

> Let's not forget that what `byte-compile` tries to do is to preserve the
> invariant that
>
>     (eval EXP) ≃ (eval (byte-compile EXP))

I think byte-compile does different things for different arguments:
the behavior for symbols and other expressions is simply different.

> This said, if you remove the special case, you will bump into
> a corner-case bug in `byte-compile` which happens because that function
> incorrectly considers that `byte-compile-top-level` returns a value
> whereas in reality it returns an expression (just like `byte-compile`):
> the decompilation happens to turn expressions that return constant
> values (like byte-compiled functions) into their value (as an
> optimization which relies on the fact that those objects are
> self-evaluating), so if you remove it, you then bump into this bug of
> byte-compile.  The patch below would fix this bug.

I don't think that was a bug, but it was an unfortunate wrinkle in the
(undocumented) API of byte-compile-top-level.

> But indeed the decompilation of constants is handy since that's what
> people expect from `byte-compile`.  When I (byte-compile '(lambda (x)
> (foo))) I expect to receive a byte-compiled function, and not
> a byte-code expression which when evaluated will return that
> byte-compiled function.

I think it's more than handy: it's how I'd read the current
documentation, and how I'd expect a function called byte-compile to
behave.

> > I'm not sure this case can actually happen without doing something
> > silly, but (byte-compile '(internal-get-closed-var 0)) throws an error
> > with Stefan's patch, because the byte code is (byte-constant . 0)
> > (byte-return).
>
> This source code is arguably invalid, so it's not a real problem, but

The source code is invalid, but the LAP code is valid-looking, and I
can't conclude it cannot be generated by valid source code being
passed to `byte-compile' somehow.

> diff --git a/lisp/emacs-lisp/bytecomp.el b/lisp/emacs-lisp/bytecomp.el
> index f46cab2c17..ae17553d0c 100644
> --- a/lisp/emacs-lisp/bytecomp.el
> +++ b/lisp/emacs-lisp/bytecomp.el
> @@ -2674,7 +2674,11 @@ byte-compile
>           (setq fun (byte-compile-top-level fun nil 'eval)))
>          (if macro (push 'macro fun))
>          (if (symbolp form)
> -            (fset form fun)
> +            ;; byte-compile returns an *expression* equivalent to the

I think this would be clearer if it read "byte-compile-top-level
returns an *expression*..."




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#34757; Package emacs. (Thu, 13 Jun 2019 11:45:02 GMT) Full text and rfc822 format available.

Message #35 received at 34757 <at> debbugs.gnu.org (full text, mbox):

From: Pip Cet <pipcet <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Eli Zaretskii <eliz <at> gnu.org>, chuntaro <at> sakura-games.jp,
 34757 <at> debbugs.gnu.org
Subject: Re: bug#34757: Invalid bytecode from byte compiler
Date: Thu, 13 Jun 2019 11:44:08 +0000
This bug still appears to be present. Maybe it's time to apply
Stefan's patch and see whether anything breaks?

On Sat, Mar 16, 2019 at 4:51 PM Pip Cet <pipcet <at> gmail.com> wrote:
>
> On Fri, Mar 15, 2019 at 8:30 PM Stefan Monnier <monnier <at> iro.umontreal.ca> wrote:
> >
> > > Just to be sure I understand correctly, you would like to remove the
> > > decompilation of trivial function calls entirely?
> >
> > Yes, tho the main motivation was to try and figure out what the
> > decompilation is useful for.
>
> Thanks for explaining!
>
> > This only affects "open code" (i.e. not the content of functions, which
> > are already never decompiled), so the impact should be minor and it
> > removes a rather complicated and brittle chunk of code whose purpose is
> > not clear.  It was originally introduced when we didn't have
> > byte-compiled function objects, so its main purpose was one of
> > performance, to avoid pessimizing the code by replacing trivial function
> > calls with more costly (byte-code "...") expressions but nowadays such
> > (byte-code "...") expressions only occur basically at the top-level of
> > .elc files where such pessimization would be unnoticeable in terms
> > of performance.
>
> I agree completely, for what it's worth.
>
> > It does impact the readability of .elc files, OTOH, so I'm not
> > completely happy with the result when considering the few cases where
> > I was happy to be able to make sense of a .elc file to better understand
> > the source of a problem (after all, that's why I wrote the
> > elisp-byte-code-mode).
>
> I can speak only for myself, but I think the byte-compiler "magically"
> deciding to turn (rare) top-level non-defuns into plain code rather
> than byte code is confusing. It's much better with your patches.
>
> > > It seems the special case is necessary to avoid compilation errors,
> >
> > I haven't found it to be really necessary, no.
>
> Well, you fixed it with the second patch.
>
> > > and that it's used for (byte-compile 3), so I think the comment could
> > > be improved a little.
> >
> > (byte-compile 3) seems to work correctly here even without the
> > special case.  It returns (byte-code "\300\207" [3] 1) which is indeed
> > a correct expression that evaluates to 3 (just like the argument to
> > `byte-compile` was an expression whose evaluation returns 3).
>
> No argument here. The case branch affects what happens when
> (byte-compile 3) is called, but isn't necessary for the result to be
> correct, and the comment can be misread to imply it's irrelevant in
> this case.
>
> > Let's not forget that what `byte-compile` tries to do is to preserve the
> > invariant that
> >
> >     (eval EXP) ≃ (eval (byte-compile EXP))
>
> I think byte-compile does different things for different arguments:
> the behavior for symbols and other expressions is simply different.
>
> > This said, if you remove the special case, you will bump into
> > a corner-case bug in `byte-compile` which happens because that function
> > incorrectly considers that `byte-compile-top-level` returns a value
> > whereas in reality it returns an expression (just like `byte-compile`):
> > the decompilation happens to turn expressions that return constant
> > values (like byte-compiled functions) into their value (as an
> > optimization which relies on the fact that those objects are
> > self-evaluating), so if you remove it, you then bump into this bug of
> > byte-compile.  The patch below would fix this bug.
>
> I don't think that was a bug, but it was an unfortunate wrinkle in the
> (undocumented) API of byte-compile-top-level.
>
> > But indeed the decompilation of constants is handy since that's what
> > people expect from `byte-compile`.  When I (byte-compile '(lambda (x)
> > (foo))) I expect to receive a byte-compiled function, and not
> > a byte-code expression which when evaluated will return that
> > byte-compiled function.
>
> I think it's more than handy: it's how I'd read the current
> documentation, and how I'd expect a function called byte-compile to
> behave.
>
> > > I'm not sure this case can actually happen without doing something
> > > silly, but (byte-compile '(internal-get-closed-var 0)) throws an error
> > > with Stefan's patch, because the byte code is (byte-constant . 0)
> > > (byte-return).
> >
> > This source code is arguably invalid, so it's not a real problem, but
>
> The source code is invalid, but the LAP code is valid-looking, and I
> can't conclude it cannot be generated by valid source code being
> passed to `byte-compile' somehow.
>
> > diff --git a/lisp/emacs-lisp/bytecomp.el b/lisp/emacs-lisp/bytecomp.el
> > index f46cab2c17..ae17553d0c 100644
> > --- a/lisp/emacs-lisp/bytecomp.el
> > +++ b/lisp/emacs-lisp/bytecomp.el
> > @@ -2674,7 +2674,11 @@ byte-compile
> >           (setq fun (byte-compile-top-level fun nil 'eval)))
> >          (if macro (push 'macro fun))
> >          (if (symbolp form)
> > -            (fset form fun)
> > +            ;; byte-compile returns an *expression* equivalent to the
>
> I think this would be clearer if it read "byte-compile-top-level
> returns an *expression*..."




Reply sent to Stefan Monnier <monnier <at> iro.umontreal.ca>:
You have taken responsibility. (Sat, 27 Jul 2019 21:31:02 GMT) Full text and rfc822 format available.

Notification sent to chuntaro <chuntaro <at> sakura-games.jp>:
bug acknowledged by developer. (Sat, 27 Jul 2019 21:31:03 GMT) Full text and rfc822 format available.

Message #40 received at 34757-done <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Pip Cet <pipcet <at> gmail.com>
Cc: 34757-done <at> debbugs.gnu.org, chuntaro <chuntaro <at> sakura-games.jp>
Subject: Re: bug#34757: Invalid bytecode from byte compiler
Date: Sat, 27 Jul 2019 17:30:03 -0400
> We could fix byte-compile-out-toplevel to properly analyze how many
> stack arguments the call takes, but this patch simply treats forms
> like this as nontrivial:
>
> diff --git a/lisp/emacs-lisp/bytecomp.el b/lisp/emacs-lisp/bytecomp.el
> index 0b8f8824b4c..4e54e08ce14 100644
> --- a/lisp/emacs-lisp/bytecomp.el
> +++ b/lisp/emacs-lisp/bytecomp.el
> @@ -3025,6 +3025,7 @@ byte-compile-out-toplevel
>                         (or (null (cdr rest))
>                             (and (memq output-type '(file progn t))
>                                  (cdr (cdr rest))
> +                                (eql (length body) (cdr (car rest)))
>                                  (eq (car (nth 1 rest)) 'byte-discard)
>                                  (progn (setq rest (cdr rest)) t))))
>                    (setq maycall nil)    ; Only allow one real function call.

I installed this fix for now.


        Stefan





bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 25 Aug 2019 11:24:09 GMT) Full text and rfc822 format available.

This bug report was last modified 4 years and 217 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.