GNU bug report logs - #77462
"/s" instability? I think this is a bug.
To reply to this bug, email your comments to 77462 AT debbugs.gnu.org.
Report forwarded to bug-sed <at> gnu.org: bug#77462; Package sed. (Wed, 02 Apr 2025 15:21:02 GMT)
Acknowledgement sent to "gnudborgonly <at> s-epost.no" <gnudborgonly <at> s-epost.no>: New bug report received and forwarded. Copy sent to bug-sed <at> gnu.org. (Wed, 02 Apr 2025 15:21:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org:
This seems to qualify as a bug:
The sed version included in my Linux seems to be unstable when using
the '\s' and/or '\S' regex extensions:
Example:
id <at> pc:~$ echo '[ subCA2 ]' | sed -n 's|^\[\s\+\([^\s]\+[0-9]\+\)\s\+\]\s*$|\1|p'
id <at> pc:~$ echo '[ subCA2 ]' | sed -n 's:^\[\s\+\([\S]\+[0-9]\+\)\s\+\]\s*$:\1:p'
id <at> pc:~$ echo '[ rootCA1 ]' | sed -n 's|^\[\s\+\([^\s]\+[0-9]\+\)\s\+\]\s*$|\1|p'
rootCA1
If I replace '\s' with '[ \t]' (and '\S' with '[^ \t]') things work as
expected:
id <at> pc:~$ echo '[ subCA2 ]' | sed -n 's:^\[[ \t]\+\([^ \t]\+[0-9]\+\)[ \t]\+\][ \t]*$:\1:p'
subCA2
id <at> pc:~$ echo '[ rootCA2 ]' | sed -n 's:^\[[ \t]\+\([^ \t]\+[0-9]\+\)[ \t]\+\][ \t]*$:\1:p'
rootCA2
-----------------------------------------------------------------------
My Linux version:
Debian 12.10 as of 2025-04-02, fully updated, running a terminal session
in an Xfce4 desktop environment.
My sed version:
id <at> pc:~$ sed --version
sed (GNU sed) 4.9
Packaged by Debian
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by Jay Fenlason, Tom Lord, Ken Pizzini,
Paolo Bonzini, Jim Meyering, and Assaf Gordon.
This sed program was built with SELinux support.
SELinux is disabled on this system.
GNU sed home page: <https://www.gnu.org/software/sed/>.
General help using GNU software: <https://www.gnu.org/gethelp/>.
E-mail bug reports to: <bug-sed <at> gnu.org>.
id <at> pc:~$
-----------------------------------------------------------------------
Regards,
Vidar Hanto
Norway
Information forwarded to bug-sed <at> gnu.org: bug#77462; Package sed. (Wed, 02 Apr 2025 20:08:01 GMT)
Message #8 received at 77462 <at> debbugs.gnu.org:
On Wed, Apr 2, 2025 at 8:41 AM gnudborgonly <at> s-epost.no via
<bug-sed <at> gnu.org> wrote:
> This seems to qualify as a bug:
Thanks for the report.
You can fix your usage by not putting "[...]" around those uses of "\s" or "\S".
> The sed version included in my Linux seems to be unstable when using
> the '\s' and/or '\S' regex extensions:
>
> Example:
>
> id <at> pc:~$ echo '[ subCA2 ]' | sed -n 's|^\[\s\+\([^\s]\+[0-9]\+\)\s\+\]\s*$|\1|p'
> id <at> pc:~$ echo '[ subCA2 ]' | sed -n 's:^\[\s\+\([\S]\+[0-9]\+\)\s\+\]\s*$:\1:p'
Drop the square brackets and it works. I.e., change the latter to this:
$ echo '[ subCA2 ]' | sed -n 's:^\[\s\+\(\S\+[0-9]\+\)\s\+\]\s*$:\1:p'
subCA2
Or better still, use sed's -E option to make the regular expression
more readable, eliding **six** backslashes:
$ echo '[ subCA2 ]' | sed -nE 's:^\[\s+(\S+[0-9]+)\s+\]\s*$:\1:p'
I admit this is an unpleasant irregularity about GNU sed's "\S" extension,
since it's different from how things work in PCRE.
This is one of the reasons I urge people to use Perl instead of sed
(another is that PCRE lets you use "\d" and non-greedy modifiers
like "\S+?" below):
$ echo '[ subCA2 ]' | perl -nle 'm{^\[\s+(\S+?\d+)\s+\]\s*$} and print $1'
subCA2
Searching Sed's sources/docs for references to \S and \s vs ranges, I
found no trace, but did see this 4.1 NEWS entry:
* removed documentation for \s and \S which worked incorrectly
I'll leave this bug report open, because this is a wart that needs to
be documented.
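
A possible reading of why the bracketed forms fail, offered as a sketch and not as the maintainers' analysis: in a POSIX bracket expression the backslash is an ordinary character, so "[^\s]" means "any character except backslash and s", and "[\S]" means "backslash or S". The POSIX class "[^[:space:]]" is the way to say "not whitespace" inside brackets. The sample words below are made up for illustration, and GNU sed is assumed (for the "\+" extension):

```shell
#!/bin/sh
# Illustrative input: one word containing 's', one containing a
# backslash, and one containing neither. These words are hypothetical.
words=$(printf '%s\n' 'sub' 'a\b' 'root')

# [^\s] inside brackets: every character must be neither '\' nor 's'.
not_backslash_s=$(printf '%s\n' "$words" | sed -n '/^[^\s]\+$/p')

# [^[:space:]]: every character must be non-whitespace (POSIX class).
not_space=$(printf '%s\n' "$words" | sed -n '/^[^[:space:]]\+$/p')

printf 'lines matching [^\\s]+ only:\n%s\n' "$not_backslash_s"
printf 'lines matching [^[:space:]]+ only:\n%s\n' "$not_space"
```

Under these assumptions the first pattern should accept only "root", while the second should accept all three words.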
Information forwarded to bug-sed <at> gnu.org: bug#77462; Package sed. (Wed, 02 Apr 2025 20:32:04 GMT)
Message #11 received at 77462 <at> debbugs.gnu.org:
I think I have confirmed the bug. Consider this bash-script:
#! /bin/bash
for x in A B C D E F G H I J K L M N O P Q R S T U V W X Y Z a b c d e f g h i j k l m n o p q r s t u v w x y z; do
  echo "[ ${x}XXtxt1 ]" | sed -n 's|^\[\s\+\([^\s]\+[0-9]\+\)\s\+\]\s*$|\1|p'
done
sed prints a result in every case except the one where the captured
text starts with a lowercase 's'.
Regards,
Vidar Hanto
Norway
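
The loop above can be extended into a side-by-side check. This is only a sketch, assuming GNU sed: it runs the reporter's "[^\s]" pattern and a variant using the POSIX class "[^[:space:]]" over the same inputs and records which first letters produce no output. The variable names are hypothetical.

```shell
#!/bin/sh
# Collect the first letters for which each pattern prints nothing.
failed_bracket_s=""
failed_posix_class=""
for x in A B C D E F G H I J K L M N O P Q R S T U V W X Y Z \
         a b c d e f g h i j k l m n o p q r s t u v w x y z; do
  line="[ ${x}XXtxt1 ]"

  # Reporter's pattern: inside brackets, \s is a literal '\' and 's'.
  out=$(printf '%s\n' "$line" |
    sed -n 's|^\[\s\+\([^\s]\+[0-9]\+\)\s\+\]\s*$|\1|p')
  if [ -z "$out" ]; then failed_bracket_s="$failed_bracket_s$x"; fi

  # Same pattern with [^[:space:]] in place of [^\s].
  out=$(printf '%s\n' "$line" |
    sed -n 's|^\[\s\+\([^[:space:]]\+[0-9]\+\)\s\+\]\s*$|\1|p')
  if [ -z "$out" ]; then failed_posix_class="$failed_posix_class$x"; fi
done

printf 'no output with [^\\s] for: "%s"\n' "$failed_bracket_s"
printf 'no output with [^[:space:]] for: "%s"\n' "$failed_posix_class"
```

If the bracket expression is read as "not backslash, not s", the first list should contain only the lowercase "s" and the second should be empty, which matches the behaviour described in the report.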
This bug report was last modified 2 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.