GNU bug report logs - #32592
s with i modifier seems to work incorrectly

Previous Next

Package: sed;

Reported by: Saito Takaaki <tails.saito <at> gmail.com>

Date: Thu, 30 Aug 2018 14:44:01 UTC

Severity: normal

Tags: fixed

Done: Assaf Gordon <assafgordon <at> gmail.com>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 32592 in the body.
You can then email your comments to 32592 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-sed <at> gnu.org:
bug#32592; Package sed. (Thu, 30 Aug 2018 14:44:01 GMT) Full text and rfc822 format available.

Acknowledgement sent to Saito Takaaki <tails.saito <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-sed <at> gnu.org. (Thu, 30 Aug 2018 14:44:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Saito Takaaki <tails.saito <at> gmail.com>
To: bug-sed <at> gnu.org
Subject: s with i modifier seems to work incorrectly
Date: Thu, 30 Aug 2018 17:31:29 +0900
Greetings,

I guess the "s" command, when "i" flag is supplied,
does not work correctly in some cases.

Assuming the following Bash command as an example:

    echo abcdefghijkl | sed 'h;G;s/\(a.*\).*\1/(\1)/i'

the expected output is:

    (abcdefghijkl)

(The command would output it as expected if the final "i"
flag were not supplied.)
However, the actual output is [1][2]:

    (abcdefg)hijkl

or [3]:

    (abcdefgh)ijkl

[1] sed (GNU sed) 4.4 Packaged by Cygwin (4.4-1)
  on Cygwin/Windows 10 (32bit)

[2] GNU sed version 4.2.1 on Debian wheezy/sid (32bit)

[3] sed (GNU sed) 4.4.2 on CentOS 7 (64bit)

I'm sorry I haven't tested that with the latest sed 4.5.

Thank you very much for your attention.

-- 
Takaaki Saito




Information forwarded to bug-sed <at> gnu.org:
bug#32592; Package sed. (Thu, 30 Aug 2018 18:29:02 GMT) Full text and rfc822 format available.

Message #8 received at submit <at> debbugs.gnu.org (full text, mbox):

From: bill-auger <bill-auger <at> peers.community>
To: bug-sed <at> gnu.org
Subject: Re: bug#32592: s with i modifier seems to work incorrectly
Date: Thu, 30 Aug 2018 14:27:23 +0000
On Thu, 30 Aug 2018 17:31:29 +0900
Saito Takaaki <tails.saito <at> gmail.com> wrote:
> Assuming the following Bash command as an example:
>     echo abcdefghijkl | sed 'h;G;s/\(a.*\).*\1/(\1)/i'
> the expected output is:
>     (abcdefghijkl)
> However, the actual output is [1][2]:
>     (abcdefg)hijkl

np with GNU sed 4.5 on parabola (archlinux)

$ echo abcdefghijkl | sed 'h;G;s/\(a.*\).*\1/(\1)/i'
(abcdefghijkl)

$ sed --version
sed (GNU sed) 4.5




Information forwarded to bug-sed <at> gnu.org:
bug#32592; Package sed. (Tue, 04 Sep 2018 23:18:01 GMT) Full text and rfc822 format available.

Message #11 received at 32592 <at> debbugs.gnu.org (full text, mbox):

From: Assaf Gordon <assafgordon <at> gmail.com>
To: Saito Takaaki <tails.saito <at> gmail.com>, 32592 <at> debbugs.gnu.org,
 bill-auger <bill-auger <at> peers.community>, Eric Blake <eblake <at> redhat.com>,
 Jim Meyering <jim <at> meyering.net>
Subject: Re: bug#32592: s with i modifier seems to work incorrectly
Date: Tue, 4 Sep 2018 17:16:56 -0600
Hello,

On 30/08/18 02:31 AM, Saito Takaaki wrote:
> I guess the "s" command, when "i" flag is supplied, does not work
> correctly in some cases.
> 
> Assuming the following Bash command as an example:
> 
> echo abcdefghijkl | sed 'h;G;s/\(a.*\).*\1/(\1)/i'

Thank you for reporting this issue with specific details
to easily reproduce it, and thanks to you Bill for a quick test on the
latest sed version.

I can confirm that this is an old bug in sed (technically: in gnulib)
which was fixed sometime between version 4.3 and 4.4 .

Specifically, this commit fixed the issue (by updating gnulib):
https://git.savannah.gnu.org/cgit/sed.git/commit/?id=44d99bf5c98ea77de0addf55ad7fe281396de996

In gnulib, these are updated changes and they include several
fixes to the regex/dfa code:

https://git.savannah.gnu.org/cgit/gnulib.git/log/?qt=range&q=a3fd683d..85bd3ab6

I have not tried to pin-point the exact change which fixed the issue.

> [1] sed (GNU sed) 4.4 Packaged by Cygwin (4.4-1) on Cygwin/Windows 10
> (32bit)

I don't have access to 32bit cygwin, but on 64bit cygwin (windows 7) 
with sed-4.4-1 I do not experience the bug - can you confirm it still 
happens even in sed-4.4 ?

> [2] GNU sed version 4.2.1 on Debian wheezy/sid (32bit)
> 
> [3] sed (GNU sed) 4.4.2 on CentOS 7 (64bit)
(I assume you meant sed-4.2.2 on CentOS)

Given that sed-4.2.2 is almost six years old,
and the fixed version (sed-4.4) is also already a year and a half old,
I'm inclined to close this bug as "wontfix".

CentOS-7 (and by proxy, RHEL-7) are likely the only officially supported
distributions that still use the old sed version 4.2.2.

Eric, Jim - what do you think?

regards,
 - assaf




Information forwarded to bug-sed <at> gnu.org:
bug#32592; Package sed. (Wed, 05 Sep 2018 01:04:01 GMT) Full text and rfc822 format available.

Message #14 received at 32592 <at> debbugs.gnu.org (full text, mbox):

From: Saito Takaaki <tails.saito <at> gmail.com>
To: 32592 <at> debbugs.gnu.org
Cc: bill-auger <at> peers.community, assafgordon <at> gmail.com, eblake <at> redhat.com,
 jim <at> meyering.net
Subject: Re: bug#32592: s with i modifier seems to work incorrectly
Date: Wed, 5 Sep 2018 10:02:58 +0900
Hello assaf, thank you very much for the detailed explanation.

> > [3] sed (GNU sed) 4.4.2 on CentOS 7 (64bit)
> (I assume you meant sed-4.2.2 on CentOS)

You are right.  I'm sorry for my critical mistake.

> I don't have access to 32bit cygwin, but on 64bit cygwin (windows 7)
> with sed-4.4-1 I do not experience the bug - can you confirm it still
> happens even in sed-4.4 ?

Here is the result of the command in question and the first three lines
of sed --version on 32bit cygwin on Windows 10.
--------
$ echo abcdefghijkl | sed 'h;G;s/\(a.*\).*\1/(\1)/i'
(abcdefg)hijkl
$ sed --version  | head -3
sed (GNU sed) 4.4
Packaged by Cygwin (4.4-1)
Copyright (C) 2017 Free Software Foundation, Inc.
--------

Additionally, I tried the command with sed 4.4 on ideone and found
it does not happen.
https://ideone.com/pYoF5y

However, a friend showed me a more complex case which is
problematic even with sed 4.4 on ideone.  The last two lines of the
output (for the identical input lines) are  particularly interesting.
https://ideone.com/Sq5xJX

I hope this helps even a bit.
Thank you very much.

-- 
Takaaki Saito




Information forwarded to bug-sed <at> gnu.org:
bug#32592; Package sed. (Wed, 05 Sep 2018 07:33:01 GMT) Full text and rfc822 format available.

Message #17 received at 32592 <at> debbugs.gnu.org (full text, mbox):

From: Assaf Gordon <assafgordon <at> gmail.com>
To: Saito Takaaki <tails.saito <at> gmail.com>, 32592 <at> debbugs.gnu.org,
 bug-gnulib <at> gnu.org
Cc: bill-auger <bill-auger <at> peers.community>, Eric Blake <eblake <at> redhat.com>,
 Jim Meyering <jim <at> meyering.net>
Subject: bug#32592: heap-use-after-free in regex module (was: s with i
 modifier seems to work incorrectly)
Date: Wed, 5 Sep 2018 01:32:27 -0600
(adding gnulib)

On 04/09/18 07:02 PM, Saito Takaaki wrote:
[... discussing a sed bug ...]
> However, a friend showed me a more complex case which is
> problematic even with sed 4.4 on ideone.  The last two lines of the
> output (for the identical input lines) are  particularly interesting.
> https://ideone.com/Sq5xJX
> 
> I hope this helps even a bit.

Thank you for persisting with this bug.

The linked snippet you provided exposed a heap-use-after-free bug
in gnulib's regex module (possibly in glibc as well).

A simple way to reproduce with latest sed:

  cd sed
  ./bootstrap
  ./configure --with-included-regex
  make
  echo 'abcdefghijklmns!!!!!!!!!!' \
     | valgrind ./sed/sed -E 'h;G;s/((.).+(.))(.*\n.*\1)/\2-\3\4/i'

Results in a use-after-free relating to the back-references (valgrind
output below). There's some interplay with the input length - if the 
exclamation marks are removed, the bug is not triggered.
The bug does not trigger without the case-insensitive flag (s///i).

This is easier to trigger with gnulib (hence --with-included-regex)
but happens also with glibc's regex module.

This could also mean that the bug you previously reported and I surmised
was fixed is not fixed at all - could be that it was just much harder to
trigger with later sed versions.

I'm still learning the code so don't have a fix yet.

comments welcomed,
 - assaf

=========================

==13408== Memcheck, a memory error detector
==13408== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==13408== Using Valgrind-3.12.0.SVN and LibVEX; rerun with -h for 
copyright info
==13408== Command: ./sed/sed -E h;G;s/((.).+(.))(.*\\n.*\\1)/\\2-\\3\\4/i
==13408==
==13408== Invalid read of size 1
==13408==    at 0x123857: get_subexp (regexec.c:2747)
==13408==    by 0x123857: transit_state_bkref.isra.32 (regexec.c:2561)
==13408==    by 0x123BDC: merge_state_with_log (regexec.c:2345)
==13408==    by 0x1248B8: check_matching (regexec.c:1135)
==13408==    by 0x1248B8: re_search_internal (regexec.c:802)
==13408==    by 0x12921E: re_search_stub (regexec.c:424)
==13408==    by 0x12995F: rpl_re_search (regexec.c:289)
==13408==    by 0x111C84: match_regex (regexp.c:358)
==13408==    by 0x110205: do_subst (execute.c:1015)
==13408==    by 0x110205: execute_program (execute.c:1536)
==13408==    by 0x11145A: process_files (execute.c:1673)
==13408==    by 0x10B23B: main (sed.c:360)
==13408==  Address 0x56096d0 is 16 bytes inside a block of size 42 free'd
==13408==    at 0x4C2DDCF: realloc (vg_replace_malloc.c:785)
==13408==    by 0x11BF43: re_string_realloc_buffers (regex_internal.c:167)
==13408==    by 0x11CA8C: extend_buffers (regexec.c:4057)
==13408==    by 0x11CBBA: clean_state_log_if_needed (regexec.c:1697)
==13408==    by 0x123967: get_subexp (regexec.c:2778)
==13408==    by 0x123967: transit_state_bkref.isra.32 (regexec.c:2561)
==13408==    by 0x123BDC: merge_state_with_log (regexec.c:2345)
==13408==    by 0x1248B8: check_matching (regexec.c:1135)
==13408==    by 0x1248B8: re_search_internal (regexec.c:802)
==13408==    by 0x12921E: re_search_stub (regexec.c:424)
==13408==    by 0x12995F: rpl_re_search (regexec.c:289)
==13408==    by 0x111C84: match_regex (regexp.c:358)
==13408==    by 0x110205: do_subst (execute.c:1015)
==13408==    by 0x110205: execute_program (execute.c:1536)
==13408==    by 0x11145A: process_files (execute.c:1673)
==13408==  Block was alloc'd at
==13408==    at 0x4C2DDCF: realloc (vg_replace_malloc.c:785)
==13408==    by 0x11BF43: re_string_realloc_buffers (regex_internal.c:167)
==13408==    by 0x11CA8C: extend_buffers (regexec.c:4057)
==13408==    by 0x124A1A: check_matching (regexec.c:1125)
==13408==    by 0x124A1A: re_search_internal (regexec.c:802)
==13408==    by 0x12921E: re_search_stub (regexec.c:424)
==13408==    by 0x12995F: rpl_re_search (regexec.c:289)
==13408==    by 0x111C84: match_regex (regexp.c:358)
==13408==    by 0x110205: do_subst (execute.c:1015)
==13408==    by 0x110205: execute_program (execute.c:1536)
==13408==    by 0x11145A: process_files (execute.c:1673)
==13408==    by 0x10B23B: main (sed.c:360)
==13408==
==13408== Invalid read of size 1
==13408==    at 0x12385C: get_subexp (regexec.c:2747)
==13408==    by 0x12385C: transit_state_bkref.isra.32 (regexec.c:2561)
==13408==    by 0x123BDC: merge_state_with_log (regexec.c:2345)
==13408==    by 0x1248B8: check_matching (regexec.c:1135)
==13408==    by 0x1248B8: re_search_internal (regexec.c:802)
==13408==    by 0x12921E: re_search_stub (regexec.c:424)
==13408==    by 0x12995F: rpl_re_search (regexec.c:289)
==13408==    by 0x111C84: match_regex (regexp.c:358)
==13408==    by 0x110205: do_subst (execute.c:1015)
==13408==    by 0x110205: execute_program (execute.c:1536)
==13408==    by 0x11145A: process_files (execute.c:1673)
==13408==    by 0x10B23B: main (sed.c:360)
==13408==  Address 0x56096ea is 0 bytes after a block of size 42 free'd
==13408==    at 0x4C2DDCF: realloc (vg_replace_malloc.c:785)
==13408==    by 0x11BF43: re_string_realloc_buffers (regex_internal.c:167)
==13408==    by 0x11CA8C: extend_buffers (regexec.c:4057)
==13408==    by 0x11CBBA: clean_state_log_if_needed (regexec.c:1697)
==13408==    by 0x123967: get_subexp (regexec.c:2778)
==13408==    by 0x123967: transit_state_bkref.isra.32 (regexec.c:2561)
==13408==    by 0x123BDC: merge_state_with_log (regexec.c:2345)
==13408==    by 0x1248B8: check_matching (regexec.c:1135)
==13408==    by 0x1248B8: re_search_internal (regexec.c:802)
==13408==    by 0x12921E: re_search_stub (regexec.c:424)
==13408==    by 0x12995F: rpl_re_search (regexec.c:289)
==13408==    by 0x111C84: match_regex (regexp.c:358)
==13408==    by 0x110205: do_subst (execute.c:1015)
==13408==    by 0x110205: execute_program (execute.c:1536)
==13408==    by 0x11145A: process_files (execute.c:1673)
==13408==  Block was alloc'd at
==13408==    at 0x4C2DDCF: realloc (vg_replace_malloc.c:785)
==13408==    by 0x11BF43: re_string_realloc_buffers (regex_internal.c:167)
==13408==    by 0x11CA8C: extend_buffers (regexec.c:4057)
==13408==    by 0x124A1A: check_matching (regexec.c:1125)
==13408==    by 0x124A1A: re_search_internal (regexec.c:802)
==13408==    by 0x12921E: re_search_stub (regexec.c:424)
==13408==    by 0x12995F: rpl_re_search (regexec.c:289)
==13408==    by 0x111C84: match_regex (regexp.c:358)
==13408==    by 0x110205: do_subst (execute.c:1015)
==13408==    by 0x110205: execute_program (execute.c:1536)
==13408==    by 0x11145A: process_files (execute.c:1673)
==13408==    by 0x10B23B: main (sed.c:360)
==13408==
a-!!!!!!!!!!
abcdefghijklmns!!!!!!!!!!
==13408==
==13408== HEAP SUMMARY:
==13408==     in use at exit: 1,840 bytes in 5 blocks
==13408==   total heap usage: 1,131 allocs, 1,126 frees, 205,127 bytes 
allocated
==13408==
==13408== LEAK SUMMARY:
==13408==    definitely lost: 0 bytes in 0 blocks
==13408==    indirectly lost: 0 bytes in 0 blocks
==13408==      possibly lost: 0 bytes in 0 blocks
==13408==    still reachable: 1,840 bytes in 5 blocks
==13408==         suppressed: 0 bytes in 0 blocks
==13408== Rerun with --leak-check=full to see details of leaked memory
==13408==
==13408== For counts of detected and suppressed errors, rerun with: -v
==13408== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)




Information forwarded to bug-sed <at> gnu.org:
bug#32592; Package sed. (Wed, 05 Sep 2018 13:24:01 GMT) Full text and rfc822 format available.

Message #20 received at 32592 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Assaf Gordon <assafgordon <at> gmail.com>
Cc: bill-auger <at> peers.community, Eric Blake <eblake <at> redhat.com>,
 32592 <at> debbugs.gnu.org, tails.saito <at> gmail.com,
 "bug-gnulib <at> gnu.org List" <bug-gnulib <at> gnu.org>
Subject: Re: bug#32592: heap-use-after-free in regex module (was: s with i
 modifier seems to work incorrectly)
Date: Wed, 5 Sep 2018 06:23:21 -0700
On Wed, Sep 5, 2018 at 12:32 AM Assaf Gordon <assafgordon <at> gmail.com> wrote:
>
> (adding gnulib)
>
> On 04/09/18 07:02 PM, Saito Takaaki wrote:
> [... discussing a sed bug ...]
> > However, a friend showed me a more complex case which is
> > problematic even with sed 4.4 on ideone.  The last two lines of the
> > output (for the identical input lines) are  particularly interesting.
> > https://ideone.com/Sq5xJX
> >
> > I hope this helps even a bit.
>
> Thank you for persisting with this bug.
>
> The linked snippet you provided exposed a heap-use-after-free bug
> in gnulib's regex module (possibly in glibc as well).
>
> A simple way to reproduce with latest sed:
>
>    cd sed
>    ./bootstrap
>    ./configure --with-included-regex
>    make
>    echo 'abcdefghijklmns!!!!!!!!!!' \
>       | valgrind ./sed/sed -E 'h;G;s/((.).+(.))(.*\n.*\1)/\2-\3\4/i'
>
> Results in a use-after-free relating to the back-references (valgrind
> output below). There's some interplay with the input length - if the
> exclamation marks are removed, the bug is not triggered.
> The bug does not trigger without the case-insensitive flag (s///i).
>
> This is easier to trigger with gnulib (hence --with-included-regex)
> but happens also with glibc's regex module.
>
> This could also mean that the bug you previously reported and I surmised
> was fixed is not fixed at all - could be that it was just much harder to
> trigger with later sed versions.
>
> I'm still learning the code so don't have a fix yet.

Wow, another!?! Thanks for pursuing!




Information forwarded to bug-sed <at> gnu.org:
bug#32592; Package sed. (Thu, 06 Sep 2018 00:05:02 GMT) Full text and rfc822 format available.

Message #23 received at 32592 <at> debbugs.gnu.org (full text, mbox):

From: Assaf Gordon <assafgordon <at> gmail.com>
To: Jim Meyering <jim <at> meyering.net>
Cc: bill-auger <at> peers.community, Eric Blake <eblake <at> redhat.com>,
 32592 <at> debbugs.gnu.org, tails.saito <at> gmail.com,
 "bug-gnulib <at> gnu.org List" <bug-gnulib <at> gnu.org>
Subject: Re: bug#32592: heap-use-after-free in regex module
Date: Wed, 5 Sep 2018 18:04:03 -0600
[Message part 1 (text/plain, inline)]
Hello,

> On Wed, Sep 5, 2018 at 12:32 AM Assaf Gordon <assafgordon <at> gmail.com> wrote: >>>> On 04/09/18 07:02 PM, Saito Takaaki wrote:>>> 
https://ideone.com/Sq5xJX>>>>>> I hope this helps even a bit.>>>> The 
linked snippet you provided exposed a heap-use-after-free bug
>> in gnulib's regex module (possibly in glibc as well).

Please find the attached patch as a suggested fix.

Comments and review very welcomed,
 - assaf
[0001-regex-fix-heap-use-after-free-error.patch (text/x-patch, attachment)]

Information forwarded to bug-sed <at> gnu.org:
bug#32592; Package sed. (Thu, 06 Sep 2018 01:09:02 GMT) Full text and rfc822 format available.

Message #26 received at 32592 <at> debbugs.gnu.org (full text, mbox):

From: Assaf Gordon <assafgordon <at> gmail.com>
To: Jim Meyering <jim <at> meyering.net>
Cc: bill-auger <at> peers.community, Eric Blake <eblake <at> redhat.com>,
 32592 <at> debbugs.gnu.org, tails.saito <at> gmail.com,
 "bug-gnulib <at> gnu.org List" <bug-gnulib <at> gnu.org>
Subject: Re: bug#32592: heap-use-after-free in regex module
Date: Wed, 5 Sep 2018 19:08:24 -0600
[Message part 1 (text/plain, inline)]
Hello,
Assuming the gnulib bugfix is valid (in my previous email),
I suggest adding the following test to sed (after updating gnulib).

comments welcomed,
 - assaf


[0001-sed-fix-heap-use-after-free-bug-in-s-command.patch (text/x-patch, attachment)]

Information forwarded to bug-sed <at> gnu.org:
bug#32592; Package sed. (Thu, 06 Sep 2018 01:29:02 GMT) Full text and rfc822 format available.

Message #29 received at 32592 <at> debbugs.gnu.org (full text, mbox):

From: Assaf Gordon <assafgordon <at> gmail.com>
To: Jim Meyering <jim <at> meyering.net>
Cc: bill-auger <at> peers.community, Eric Blake <eblake <at> redhat.com>,
 32592 <at> debbugs.gnu.org, tails.saito <at> gmail.com,
 "bug-gnulib <at> gnu.org List" <bug-gnulib <at> gnu.org>
Subject: Re: bug#32592: heap-use-after-free in regex module
Date: Wed, 5 Sep 2018 19:28:14 -0600
[Message part 1 (text/plain, inline)]
Bruno alerted me off-list:

On 05/09/18 07:19 PM, Bruno Haible wrote:
> Is the ChangeLog entry up-to-date?
>
> +	* regexec.c (get_subexp): Update 'buf' after call to get_subexp_sub.
> +	Additionally, check for allocation errors and bail out if needed.
>
> I don't see a code change for
> "check for allocation errors and bail out if needed".

Thanks!

I initially had a check for REG_NOERROR there, but removed it.

Attached an updated patch without the outdated comment.

-assaf


[0001-regex-fix-heap-use-after-free-error.patch (text/x-patch, attachment)]

Information forwarded to bug-sed <at> gnu.org:
bug#32592; Package sed. (Thu, 06 Sep 2018 04:34:02 GMT) Full text and rfc822 format available.

Message #32 received at 32592 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Assaf Gordon <assafgordon <at> gmail.com>
Cc: bill-auger <at> peers.community, Eric Blake <eblake <at> redhat.com>,
 32592 <at> debbugs.gnu.org, tails.saito <at> gmail.com,
 "bug-gnulib <at> gnu.org List" <bug-gnulib <at> gnu.org>
Subject: Re: bug#32592: heap-use-after-free in regex module
Date: Wed, 5 Sep 2018 21:32:58 -0700
[Message part 1 (text/plain, inline)]
On Wed, Sep 5, 2018 at 6:08 PM Assaf Gordon <assafgordon <at> gmail.com> wrote:
> Assuming the gnulib bugfix is valid (in my previous email),
> I suggest adding the following test to sed (after updating gnulib).

Thank you, Assaf.
Only tiny suggestions:
[sed-test-tweak.diff (application/octet-stream, attachment)]

Information forwarded to bug-sed <at> gnu.org:
bug#32592; Package sed. (Thu, 06 Sep 2018 04:47:02 GMT) Full text and rfc822 format available.

Message #35 received at 32592 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Assaf Gordon <assafgordon <at> gmail.com>
Cc: bill-auger <at> peers.community, Eric Blake <eblake <at> redhat.com>,
 32592 <at> debbugs.gnu.org, tails.saito <at> gmail.com,
 "bug-gnulib <at> gnu.org List" <bug-gnulib <at> gnu.org>
Subject: Re: bug#32592: heap-use-after-free in regex module
Date: Wed, 5 Sep 2018 21:45:57 -0700
On Wed, Sep 5, 2018 at 6:28 PM Assaf Gordon <assafgordon <at> gmail.com> wrote:
>
> Bruno alerted me off-list:
>
> On 05/09/18 07:19 PM, Bruno Haible wrote:
>  > Is the ChangeLog entry up-to-date?
>  >
>  > +    * regexec.c (get_subexp): Update 'buf' after call to get_subexp_sub.
>  > +    Additionally, check for allocation errors and bail out if needed.
>  >
>  > I don't see a code change for
>  > "check for allocation errors and bail out if needed".
>
> Thanks!
>
> I initially had a check for REG_NOERROR there, but removed it.
>
> Attached an updated patch without the outdated comment.

Very nice work!

Your change looks fine: set "buf" to account for potentially-moved
allocation, just as is done on three other lines above.
However, I couldn't help but notice this nonsense right after the line
you inserted:

          if (err == REG_NOMATCH)
            continue;
        }

That is an "if (...) continue;" just before the closing brace of a
for-loop. Those two lines constitute a no-op and should be removed,
though not as part of your change.




Information forwarded to bug-sed <at> gnu.org:
bug#32592; Package sed. (Thu, 06 Sep 2018 07:19:03 GMT) Full text and rfc822 format available.

Message #38 received at 32592 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Jim Meyering <jim <at> meyering.net>, Assaf Gordon <assafgordon <at> gmail.com>
Cc: bill-auger <at> peers.community, "bug-gnulib <at> gnu.org List" <bug-gnulib <at> gnu.org>,
 32592 <at> debbugs.gnu.org, tails.saito <at> gmail.com, Eric Blake <eblake <at> redhat.com>
Subject: Re: bug#32592: heap-use-after-free in regex module
Date: Thu, 6 Sep 2018 00:18:18 -0700
Jim Meyering wrote:
> I couldn't help but notice this nonsense right after the line
> you inserted:
> 
>            if (err == REG_NOMATCH)
>              continue;
>          }
> 
> That is an "if (...) continue;" just before the closing brace of a
> for-loop. Those two lines constitute a no-op and should be removed,
> though not as part of your change.

Actually I think the abovementioned code should be kept, and the nonsense comes 
from the fact that some code is missing after the "if". When err != REG_NOMATCH 
&& err != REG_NOERROR, the function should exit the loop and return immediately, 
because there is a memory allocation error in a subroutine.

What a coincidence that we would find two bugs right next to each other, huh?...

I filed a bug report against glibc, and unless there's an objection I would like 
to fix both bugs in glibc and propagate the fix into gnulib. Please see the 
glibc bug here:

https://sourceware.org/bugzilla/show_bug.cgi?id=23609




Information forwarded to bug-sed <at> gnu.org:
bug#32592; Package sed. (Thu, 06 Sep 2018 08:03:02 GMT) Full text and rfc822 format available.

Message #41 received at 32592 <at> debbugs.gnu.org (full text, mbox):

From: Assaf Gordon <assafgordon <at> gmail.com>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: "bug-gnulib <at> gnu.org List" <bug-gnulib <at> gnu.org>,
 Jim Meyering <jim <at> meyering.net>, bill-auger <bill-auger <at> peers.community>,
 32592 <at> debbugs.gnu.org, Saito Takaaki <tails.saito <at> gmail.com>,
 Eric Blake <eblake <at> redhat.com>
Subject: Re: bug#32592: heap-use-after-free in regex module
Date: Thu, 6 Sep 2018 02:02:08 -0600
Thank you all for the review and comments.

On Thu, Sep 6, 2018 at 1:18 AM, Paul Eggert <eggert <at> cs.ucla.edu> wrote:
> What a coincidence that we would find two bugs right next to each other,
> huh?...
>
> I filed a bug report against glibc, and unless there's an objection I would
> like to fix both bugs in glibc and propagate the fix into gnulib. Please see
> the glibc bug here:
>
> https://sourceware.org/bugzilla/show_bug.cgi?id=23609

Speaking of coincidences,
I just found this use-after-free bug was already reported (but not fixed)
back in 2015: https://sourceware.org/bugzilla/show_bug.cgi?id=18040 .


regards,
 - assaf




Information forwarded to bug-sed <at> gnu.org:
bug#32592; Package sed. (Thu, 06 Sep 2018 08:25:02 GMT) Full text and rfc822 format available.

Message #44 received at 32592 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Assaf Gordon <assafgordon <at> gmail.com>
Cc: "bug-gnulib <at> gnu.org List" <bug-gnulib <at> gnu.org>,
 Jim Meyering <jim <at> meyering.net>, bill-auger <bill-auger <at> peers.community>,
 32592 <at> debbugs.gnu.org, Saito Takaaki <tails.saito <at> gmail.com>,
 Eric Blake <eblake <at> redhat.com>
Subject: Re: bug#32592: heap-use-after-free in regex module
Date: Thu, 6 Sep 2018 01:24:35 -0700
Assaf Gordon wrote:
> Speaking of coincidences,
> I just found this use-after-free bug was already reported (but not fixed)
> back in 2015:https://sourceware.org/bugzilla/show_bug.cgi?id=18040  .

Thanks, I had looked for a duplicate bug report before filing glibc bug 23609 
but did not find that one. I have added notes to glibc bugs 18040 and 23609 
suggesting that they be merged (which is apparently not something I can do via 
the web UI).




Information forwarded to bug-sed <at> gnu.org:
bug#32592; Package sed. (Thu, 06 Sep 2018 13:43:02 GMT) Full text and rfc822 format available.

Message #47 received at 32592 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: Assaf Gordon <assafgordon <at> gmail.com>,
 "bug-gnulib <at> gnu.org List" <bug-gnulib <at> gnu.org>, bill-auger <at> peers.community,
 32592 <at> debbugs.gnu.org, tails.saito <at> gmail.com, Eric Blake <eblake <at> redhat.com>
Subject: Re: bug#32592: heap-use-after-free in regex module
Date: Thu, 6 Sep 2018 06:41:41 -0700
On Thu, Sep 6, 2018 at 12:18 AM Paul Eggert <eggert <at> cs.ucla.edu> wrote:
> Jim Meyering wrote:
> > I couldn't help but notice this nonsense right after the line
> > you inserted:
> >
> >            if (err == REG_NOMATCH)
> >              continue;
> >          }
> >
> > That is an "if (...) continue;" just before the closing brace of a
> > for-loop. Those two lines constitute a no-op and should be removed,
> > though not as part of your change.
>
> Actually I think the abovementioned code should be kept, and the nonsense comes
> from the fact that some code is missing after the "if". When err != REG_NOMATCH
> && err != REG_NOERROR, the function should exit the loop and return immediately,
> because there is a memory allocation error in a subroutine.
>
> What a coincidence that we would find two bugs right next to each other, huh?...

Indeed. Glad you realized that.




Added tag(s) fixed. Request was from Assaf Gordon <assafgordon <at> gmail.com> to control <at> debbugs.gnu.org. (Mon, 08 Oct 2018 23:50:02 GMT) Full text and rfc822 format available.

bug closed, send any further explanations to 32592 <at> debbugs.gnu.org and Saito Takaaki <tails.saito <at> gmail.com> Request was from Assaf Gordon <assafgordon <at> gmail.com> to control <at> debbugs.gnu.org. (Mon, 08 Oct 2018 23:50:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-sed <at> gnu.org:
bug#32592; Package sed. (Mon, 08 Oct 2018 23:50:03 GMT) Full text and rfc822 format available.

Message #54 received at 32592-done <at> debbugs.gnu.org (full text, mbox):

From: Assaf Gordon <assafgordon <at> gmail.com>
To: 32592-done <at> debbugs.gnu.org
Subject: Re: bug#32592: heap-use-after-free in regex module
Date: Mon, 8 Oct 2018 17:49:06 -0600
tags 32592 fixed
close 32592
thanks

This has been fixed in gnulib and pulled into sed, so closing.





bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 06 Nov 2018 12:24:07 GMT) Full text and rfc822 format available.

This bug report was last modified 5 years and 165 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.