GNU bug report logs - #78883
backslash interpretation in 's' replacement text violates POSIX
To reply to this bug, email your comments to 78883 AT debbugs.gnu.org.
Report forwarded to bug-sed <at> gnu.org: bug#78883; Package sed. (Mon, 23 Jun 2025 21:48:02 GMT)
Acknowledgement sent to Bruno Haible <bruno <at> clisp.org>: New bug report received and forwarded. Copy sent to bug-sed <at> gnu.org. (Mon, 23 Jun 2025 21:48:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
Hi,
When the replacement text in an 's' command contains an escape sequence
like '\n', GNU sed interprets this escape sequence, while other implementations
(OpenBSD sed, Solaris sed) do not.
How to reproduce:
$ echo foo | sed -e 's/f.*/line1\nline2/'
line1
line2
$ echo foo | POSIXLY_CORRECT=1 sed --posix -e 's/f.*/line1\nline2/'
line1
line2
Seen with GNU sed 4.9.
OpenBSD sed and Solaris sed, by contrast, produce:
$ echo foo | sed -e 's/f.*/line1\nline2/'
line1nline2
POSIX [1] is ambiguous here, I would say:
Quoting:
"For each other <backslash> encountered, the following character shall
lose its special meaning (if any)."
but also
"The meaning of an unescaped <backslash> immediately followed by any
character other than '&', <backslash>, a digit, <newline>, or the
delimiter character used for this command, is unspecified."
So the interpretation of escape sequences looks like a GNU extension.
By the description of the '--posix' option ("In order to simplify
writing portable scripts"), the --posix option should turn off this
interpretation. Or, better, emit a diagnostic.
Bruno
[1] https://pubs.opengroup.org/onlinepubs/9799919799/utilities/sed.html
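For scripts that must not depend on this behaviour, the second POSIX passage quoted above lists <newline> among the characters whose meaning after a backslash is specified, so a backslash followed by a literal newline is the portable way to split a line in the replacement. A minimal sketch, assuming a POSIX shell; the helper name split_line is hypothetical and used only for illustration:

```shell
# Portable spelling: a backslash followed by a literal newline in the
# replacement text, instead of the escape sequence \n.
# split_line is a hypothetical helper name, not part of sed.
split_line() {
  sed -e 's/f.*/line1\
line2/'
}

# Same input as in the report above.
printf 'foo\n' | split_line
```

Inside single quotes the shell passes the backslash and the newline to sed unchanged, so no implementation-specific escape handling is involved.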
Information forwarded to bug-sed <at> gnu.org: bug#78883; Package sed. (Tue, 24 Jun 2025 00:32:01 GMT)
Message #8 received at submit <at> debbugs.gnu.org (full text, mbox):
Bruno Haible via <bug-sed <at> gnu.org> writes:
> POSIX [1] is ambiguous here, I would say:
> Quoting:
> "For each other <backslash> encountered, the following character shall
> lose its special meaning (if any)."
> but also
> "The meaning of an unescaped <backslash> immediately followed by any
> character other than '&', <backslash>, a digit, <newline>, or the
> delimiter character used for this command, is unspecified."
>
> So the interpretation of escape sequences looks like a GNU extension.
>
> By the description of the '--posix' option ("In order to simplify
> writing portable scripts"), the --posix option should turn off this
> interpretation. Or, better, emit a diagnostic.
I just want to add a strong +1 for keeping it as a GNU extension. I find
it annoying that other systems' 'sed' commands do not behave the same
way.
I'm fine with adding a warning for when '--posix' is used though. That
would be helpful for scripts that aim to be as portable as possible
(e.g. the old gnulib-tool).
Collin
Information forwarded to bug-sed <at> gnu.org: bug#78883; Package sed. (Tue, 24 Jun 2025 00:42:02 GMT)
Message #14 received at submit <at> debbugs.gnu.org (full text, mbox):
Correcting the title. It's probably not a POSIX violation (as I initially
thought). Rather, it's a portability problem; and this is where the '--posix'
option becomes relevant.
Bruno
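Until a diagnostic exists, a script that cares about this portability problem could probe the installed sed itself. The following is only a sketch under stated assumptions: the function name sed_expands_backslash_n and the messages are hypothetical, and the probe merely repeats the reproduction from the original report with shorter strings.

```shell
# Hypothetical probe: succeeds if the installed sed turns \n in the
# replacement text of 's' into a newline (the behaviour reported for
# GNU sed 4.9), and fails otherwise (for example when the output is "anb").
sed_expands_backslash_n() {
  sed_probe_out=$(printf 'foo\n' | sed -e 's/f.*/a\nb/')
  sed_probe_expanded='a
b'
  [ "$sed_probe_out" = "$sed_probe_expanded" ]
}

if sed_expands_backslash_n; then
  printf '%s\n' 'this sed expands \n in the replacement; portable scripts should not rely on it'
else
  printf '%s\n' 'this sed does not expand \n in the replacement'
fi
```

The probe only describes the sed found first in PATH; it says nothing about the systems on which the script may run later, so avoiding the escape sequence altogether remains the safer choice.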