GNU bug report logs - #56048
[PATCH] bindat (strz): Null terminate fixed-length strings if there is room

Previous Next

Package: emacs;

Reported by: Richard Hansen <rhansen <at> rhansen.org>

Date: Fri, 17 Jun 2022 22:54:02 UTC

Severity: wishlist

Tags: patch

Done: Eli Zaretskii <eliz <at> gnu.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 56048 in the body.
You can then email your comments to 56048 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to monnier <at> iro.umontreal.ca, bug-gnu-emacs <at> gnu.org:
bug#56048; Package emacs. (Fri, 17 Jun 2022 22:54:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Richard Hansen <rhansen <at> rhansen.org>:
New bug report received and forwarded. Copy sent to monnier <at> iro.umontreal.ca, bug-gnu-emacs <at> gnu.org. (Fri, 17 Jun 2022 22:54:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Richard Hansen <rhansen <at> rhansen.org>
To: bug-gnu-emacs <at> gnu.org
Subject: [PATCH] bindat (strz): Null terminate fixed-length strings if there
 is room
Date: Fri, 17 Jun 2022 18:52:51 -0400
[Message part 1 (text/plain, inline)]
X-Debbugs-CC: monnier <at> iro.umontreal.ca

Two patches attached:

Patch 1:

    ; bindat (strz): Move all pack logic to pack function

Patch 2:

    bindat (strz): Null terminate fixed-length strings if there is room

    * lisp/emacs-lisp/bindat.el (bindat--pack-strz): For fixed-length strz
    fields, explicitly write a null terminator after the packed string if
    there is room.
    * doc/lispref/processes.texi (Bindat Types): Update documentation.
    * test/lisp/emacs-lisp/bindat-tests.el (bindat-test--str-strz-prealloc):
    Update tests.
[0001-bindat-strz-Move-all-pack-logic-to-pack-function.patch (text/x-patch, attachment)]
[0002-bindat-strz-Null-terminate-fixed-length-strings-if-t.patch (text/x-patch, attachment)]
[OpenPGP_signature (application/pgp-signature, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#56048; Package emacs. (Sat, 18 Jun 2022 03:04:02 GMT) Full text and rfc822 format available.

Message #8 received at 56048 <at> debbugs.gnu.org (full text, mbox):

From: Richard Hansen <rhansen <at> rhansen.org>
To: 56048 <at> debbugs.gnu.org
Cc: monnier <at> iro.umontreal.ca
Subject: Re: [PATCH] bindat (strz): Null terminate fixed-length strings if
 there is room
Date: Fri, 17 Jun 2022 23:02:57 -0400
[Message part 1 (text/plain, inline)]
Attached are new revisions of the patches. The only differences are the comments were filled at column 70 instead of 80, and the commit message mentions the bug number.
[v2-0001-bindat-strz-Move-all-pack-logic-to-pack-function.patch (text/x-patch, attachment)]
[v2-0002-bindat-strz-Null-terminate-fixed-length-strings-i.patch (text/x-patch, attachment)]
[OpenPGP_signature (application/pgp-signature, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#56048; Package emacs. (Sat, 18 Jun 2022 06:34:02 GMT) Full text and rfc822 format available.

Message #11 received at 56048 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Richard Hansen <rhansen <at> rhansen.org>
Cc: 56048 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca
Subject: Re: bug#56048: [PATCH] bindat (strz): Null terminate fixed-length
 strings if there is room
Date: Sat, 18 Jun 2022 09:32:51 +0300
> Cc: monnier <at> iro.umontreal.ca
> Date: Fri, 17 Jun 2022 23:02:57 -0400
> From: Richard Hansen <rhansen <at> rhansen.org>
> 
> Attached are new revisions of the patches. The only differences are the comments were filled at column 70 instead of 80, and the commit message mentions the bug number.

While waiting for Stefan to tell whether he has any comments, a couple
of nits:

> Subject: [PATCH v2 1/2] ; bindat (strz): Move all pack logic to pack function

What is the motivation/rationale for this refactoring?

> +When packing, a null terminator is written after the packed string if
> +the length of the input string is less than @var{len}.

Since "length of a string" is highly ambiguous in Emacs, please always
make a point of saying "@var{len} bytes" explicitly.  Byte length is
something rarely seen or used in Emacs, so people must be informed
about that each time.

Thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#56048; Package emacs. (Sat, 18 Jun 2022 22:58:01 GMT) Full text and rfc822 format available.

Message #14 received at 56048 <at> debbugs.gnu.org (full text, mbox):

From: Richard Hansen <rhansen <at> rhansen.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 56048 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca
Subject: Re: bug#56048: [PATCH] bindat (strz): Null terminate fixed-length
 strings if there is room
Date: Sat, 18 Jun 2022 18:57:23 -0400
[Message part 1 (text/plain, inline)]
On 2022-06-18 02:32, Eli Zaretskii wrote:
>> Subject: [PATCH v2 1/2] ; bindat (strz): Move all pack logic to pack function
> 
> What is the motivation/rationale for this refactoring?

The attached revision updates the commit message to explain:

    ; bindat (strz): Move all pack logic to pack function

    Motivation/rationale:
      * Improve code readability.  Now `bindat--pack-strz` is used for all
        `strz` packing, not just variable-length `strz` packing.
      * Make it easier to change the behavior of fixed-length `strz`
        packing without also affecting the behavior of `str` packing.  (A
        future commit will modify `strz` to write a null terminator if
        there is room.)

> 
>> +When packing, a null terminator is written after the packed string if
>> +the length of the input string is less than @var{len}.
> 
> Since "length of a string" is highly ambiguous in Emacs, please always
> make a point of saying "@var{len} bytes" explicitly.  Byte length is
> something rarely seen or used in Emacs, so people must be informed
> about that each time.

Done; see attached revision.

Thanks,
Richard
[v3-0001-bindat-strz-Move-all-pack-logic-to-pack-function.patch (text/x-patch, attachment)]
[v3-0002-bindat-strz-Null-terminate-fixed-length-strings-i.patch (text/x-patch, attachment)]
[OpenPGP_signature (application/pgp-signature, attachment)]

Severity set to 'wishlist' from 'normal' Request was from Stefan Kangas <stefan <at> marxist.se> to control <at> debbugs.gnu.org. (Sun, 19 Jun 2022 13:54:03 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#56048; Package emacs. (Tue, 21 Jun 2022 20:45:02 GMT) Full text and rfc822 format available.

Message #19 received at 56048 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Richard Hansen <rhansen <at> rhansen.org>
Cc: 56048 <at> debbugs.gnu.org
Subject: Re: [PATCH] bindat (strz): Null terminate fixed-length strings if
 there is room
Date: Tue, 21 Jun 2022 16:44:06 -0400
Thanks, Richard.  Looks good to me.


        Stefan


Richard Hansen [2022-06-17 23:02:57] wrote:

> Attached are new revisions of the patches. The only differences are the
> comments were filled at column 70 instead of 80, and the commit message
> mentions the bug number.
>
> From 6096bc8bceada87a5c54e4eb500812a0b72ffd44 Mon Sep 17 00:00:00 2001
> From: Richard Hansen <rhansen <at> rhansen.org>
> Date: Sun, 29 May 2022 21:23:57 -0400
> Subject: [PATCH v2 1/2] ; bindat (strz): Move all pack logic to pack function
>
> ---
>  lisp/emacs-lisp/bindat.el | 49 ++++++++++++++++++---------------------
>  1 file changed, 23 insertions(+), 26 deletions(-)
>
> diff --git a/lisp/emacs-lisp/bindat.el b/lisp/emacs-lisp/bindat.el
> index 46e2a4901c..4a642bb9c5 100644
> --- a/lisp/emacs-lisp/bindat.el
> +++ b/lisp/emacs-lisp/bindat.el
> @@ -440,20 +440,27 @@ bindat--pack-str
>        (aset bindat-raw (+ bindat-idx i) (aref v i)))
>      (setq bindat-idx (+ bindat-idx len))))
>  
> -(defun bindat--pack-strz (v)
> +(defun bindat--pack-strz (len v)
>    (let* ((v (string-to-unibyte v))
> -         (len (length v)))
> -    (dotimes (i len)
> -      (when (= (aref v i) 0)
> -        ;; Alternatively we could pretend that this was the end of
> -        ;; the string and stop packing, but then bindat-length would
> -        ;; need to scan the input string looking for a null byte.
> -        (error "Null byte encountered in input strz string"))
> -      (aset bindat-raw (+ bindat-idx i) (aref v i)))
> -    ;; Explicitly write a null terminator in case the user provided a
> -    ;; pre-allocated string to bindat-pack that wasn't zeroed first.
> -    (aset bindat-raw (+ bindat-idx len) 0)
> -    (setq bindat-idx (+ bindat-idx len 1))))
> +         (vlen (length v)))
> +    (if len
> +        ;; When len is specified, behave the same as the str type
> +        ;; since we don't actually add the terminating zero anyway
> +        ;; (because we rely on the fact that `bindat-raw' was
> +        ;; presumably initialized with all-zeroes before we started).
> +        (bindat--pack-str len v)
> +      (dotimes (i vlen)
> +        (when (= (aref v i) 0)
> +          ;; Alternatively we could pretend that this was the end of
> +          ;; the string and stop packing, but then bindat-length would
> +          ;; need to scan the input string looking for a null byte.
> +          (error "Null byte encountered in input strz string"))
> +        (aset bindat-raw (+ bindat-idx i) (aref v i)))
> +      ;; Explicitly write a null terminator in case the user provided
> +      ;; a pre-allocated string to `bindat-pack' that wasn't already
> +      ;; zeroed.
> +      (aset bindat-raw (+ bindat-idx vlen) 0)
> +      (setq bindat-idx (+ bindat-idx vlen 1)))))
>  
>  (defun bindat--pack-bits (len v)
>    (let ((bnum (1- (* 8 len))) j m)
> @@ -482,7 +489,8 @@ bindat--pack-item
>     ('u24r (bindat--pack-u24r v))
>     ('u32r (bindat--pack-u32r v))
>     ('bits (bindat--pack-bits len v))
> -   ((or 'str 'strz) (bindat--pack-str len v))
> +   ('str (bindat--pack-str len v))
> +   ('strz (bindat--pack-strz len v))
>     ('vec
>      (let ((l (length v)) (vlen 1))
>        (if (consp vectype)
> @@ -699,18 +707,7 @@ bindat--type
>                              ((numberp len) len)
>                              ;; General expression support.
>                              (t `(or ,len (1+ (length ,val)))))))
> -    (`(pack . ,args)
> -     ;; When len is specified, behave the same as the str type since we don't
> -     ;; actually add the terminating zero anyway (because we rely on the fact
> -     ;; that `bindat-raw' was presumably initialized with all-zeroes before we
> -     ;; started).
> -     (cond ; Same optimizations as 'length above.
> -      ((null len) `(bindat--pack-strz . ,args))
> -      ((numberp len) `(bindat--pack-str ,len . ,args))
> -      (t (macroexp-let2 nil len len
> -           `(if ,len
> -                (bindat--pack-str ,len . ,args)
> -              (bindat--pack-strz . ,args))))))))
> +    (`(pack . ,args) `(bindat--pack-strz ,len . ,args))))
>  
>  (cl-defmethod bindat--type (op (_ (eql 'bits))  len)
>    (bindat--pcase op
> -- 
> 2.36.1
>
>
> From 9ebda68264adca6f60f780d44995c4213d2c12c2 Mon Sep 17 00:00:00 2001
> From: Richard Hansen <rhansen <at> rhansen.org>
> Date: Thu, 16 Jun 2022 15:21:57 -0400
> Subject: [PATCH v2 2/2] bindat (strz): Null terminate fixed-length strings if
>  there is room
>
> * lisp/emacs-lisp/bindat.el (bindat--pack-strz): For fixed-length strz
> fields, explicitly write a null terminator after the packed string if
> there is room (bug#56048).
> * doc/lispref/processes.texi (Bindat Types): Update documentation.
> * test/lisp/emacs-lisp/bindat-tests.el (bindat-test--str-strz-prealloc):
> Update tests.
> ---
>  doc/lispref/processes.texi           | 30 ++++++++++++++--------------
>  lisp/emacs-lisp/bindat.el            | 13 ++++++------
>  test/lisp/emacs-lisp/bindat-tests.el | 12 +++++------
>  3 files changed, 27 insertions(+), 28 deletions(-)
>
> diff --git a/doc/lispref/processes.texi b/doc/lispref/processes.texi
> index b9200aedde..d038d52d44 100644
> --- a/doc/lispref/processes.texi
> +++ b/doc/lispref/processes.texi
> @@ -3509,23 +3509,23 @@ Bindat Types
>  (but excluding) the null byte that terminated the input string.
>  
>  If @var{len} is provided, @code{strz} behaves the same as @code{str},
> -but with one difference: when unpacking, the first null byte
> -encountered in the packed string is interpreted as the terminating
> -byte, and it and all subsequent bytes are excluded from the result of
> -the unpacking.
> +but with a couple of differences:
> +
> +@itemize @bullet
> +@item
> +When packing, a null terminator is written after the packed string if
> +the length of the input string is less than @var{len}.
> +
> +@item
> +When unpacking, the first null byte encountered in the packed string
> +is interpreted as the terminating byte, and it and all subsequent
> +bytes are excluded from the result of the unpacking.
> +@end itemize
>  
>  @quotation Caution
> -The packed output will not be null-terminated unless one of the
> -following is true:
> -@itemize
> -@item
> -The input string is shorter than @var{len} bytes and either no pre-allocated
> -string was provided to @code{bindat-pack} or the appropriate byte in
> -the pre-allocated string was already null.
> -@item
> -The input string contains a null byte within the first @var{len}
> -bytes.
> -@end itemize
> +The packed output will not be null-terminated unless the input string
> +is shorter than @var{len} bytes or it contains a null byte within the
> +first @var{len} bytes.
>  @end quotation
>  
>  @item vec @var{len} [@var{type}]
> diff --git a/lisp/emacs-lisp/bindat.el b/lisp/emacs-lisp/bindat.el
> index 4a642bb9c5..0ecac3d52a 100644
> --- a/lisp/emacs-lisp/bindat.el
> +++ b/lisp/emacs-lisp/bindat.el
> @@ -443,11 +443,14 @@ bindat--pack-str
>  (defun bindat--pack-strz (len v)
>    (let* ((v (string-to-unibyte v))
>           (vlen (length v)))
> +    ;; Explicitly write a null terminator (if there's room) in case
> +    ;; the user provided a pre-allocated string to `bindat-pack' that
> +    ;; wasn't already zeroed.
> +    (when (or (null len) (< vlen len))
> +      (aset bindat-raw (+ bindat-idx vlen) 0))
>      (if len
>          ;; When len is specified, behave the same as the str type
> -        ;; since we don't actually add the terminating zero anyway
> -        ;; (because we rely on the fact that `bindat-raw' was
> -        ;; presumably initialized with all-zeroes before we started).
> +        ;; (except for the null terminator possibly written above).
>          (bindat--pack-str len v)
>        (dotimes (i vlen)
>          (when (= (aref v i) 0)
> @@ -456,10 +459,6 @@ bindat--pack-strz
>            ;; need to scan the input string looking for a null byte.
>            (error "Null byte encountered in input strz string"))
>          (aset bindat-raw (+ bindat-idx i) (aref v i)))
> -      ;; Explicitly write a null terminator in case the user provided
> -      ;; a pre-allocated string to `bindat-pack' that wasn't already
> -      ;; zeroed.
> -      (aset bindat-raw (+ bindat-idx vlen) 0)
>        (setq bindat-idx (+ bindat-idx vlen 1)))))
>  
>  (defun bindat--pack-bits (len v)
> diff --git a/test/lisp/emacs-lisp/bindat-tests.el b/test/lisp/emacs-lisp/bindat-tests.el
> index cc223ad14e..0c03c51e2e 100644
> --- a/test/lisp/emacs-lisp/bindat-tests.el
> +++ b/test/lisp/emacs-lisp/bindat-tests.el
> @@ -172,14 +172,14 @@ bindat-test--str-strz-prealloc
>                  ((((x str 2)) ((x . "a"))) . "ax")
>                  ((((x str 2)) ((x . "ab"))) . "ab")
>                  ((((x str 2)) ((x . "abc"))) . "ab")
> -                ((,(bindat-type strz 1) "") . "xx")
> -                ((,(bindat-type strz 2) "") . "xx")
> -                ((,(bindat-type strz 2) "a") . "ax")
> +                ((,(bindat-type strz 1) "") . "\0x")
> +                ((,(bindat-type strz 2) "") . "\0x")
> +                ((,(bindat-type strz 2) "a") . "a\0")
>                  ((,(bindat-type strz 2) "ab") . "ab")
>                  ((,(bindat-type strz 2) "abc") . "ab")
> -                ((((x strz 1)) ((x . ""))) . "xx")
> -                ((((x strz 2)) ((x . ""))) . "xx")
> -                ((((x strz 2)) ((x . "a"))) . "ax")
> +                ((((x strz 1)) ((x . ""))) . "\0x")
> +                ((((x strz 2)) ((x . ""))) . "\0x")
> +                ((((x strz 2)) ((x . "a"))) . "a\0")
>                  ((((x strz 2)) ((x . "ab"))) . "ab")
>                  ((((x strz 2)) ((x . "abc"))) . "ab")
>                  ((,(bindat-type strz) "") . "\0x")





Reply sent to Eli Zaretskii <eliz <at> gnu.org>:
You have taken responsibility. (Wed, 22 Jun 2022 13:58:02 GMT) Full text and rfc822 format available.

Notification sent to Richard Hansen <rhansen <at> rhansen.org>:
bug acknowledged by developer. (Wed, 22 Jun 2022 13:58:02 GMT) Full text and rfc822 format available.

Message #24 received at 56048-done <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 56048-done <at> debbugs.gnu.org, rhansen <at> rhansen.org
Subject: Re: bug#56048: [PATCH] bindat (strz): Null terminate fixed-length
 strings if there is room
Date: Wed, 22 Jun 2022 16:57:19 +0300
> Cc: 56048 <at> debbugs.gnu.org
> Date: Tue, 21 Jun 2022 16:44:06 -0400
> From:  Stefan Monnier via "Bug reports for GNU Emacs,
>  the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org>
> 
> Thanks, Richard.  Looks good to me.

Installed on master.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 21 Jul 2022 11:24:08 GMT) Full text and rfc822 format available.

This bug report was last modified 1 year and 272 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.