GNU bug report logs - #78883
backslash interpretation in 's' replacement text violates POSIX

Package: sed;

Reported by: Bruno Haible <bruno <at> clisp.org>

Date: Mon, 23 Jun 2025 21:48:02 UTC

Severity: normal

To reply to this bug, email your comments to 78883 AT debbugs.gnu.org.


Report forwarded to bug-sed <at> gnu.org:
bug#78883; Package sed. (Mon, 23 Jun 2025 21:48:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Bruno Haible <bruno <at> clisp.org>:
New bug report received and forwarded. Copy sent to bug-sed <at> gnu.org. (Mon, 23 Jun 2025 21:48:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Bruno Haible <bruno <at> clisp.org>
To: bug-sed <at> gnu.org
Subject: backslash interpretation in 's' replacement text violates POSIX
Date: Mon, 23 Jun 2025 23:46:47 +0200
Hi,

When the replacement text in an 's' command contains an escape sequence
like '\n', GNU sed interprets this escape sequence, while other implementations
(OpenBSD sed, Solaris sed) do not.

How to reproduce:

$ echo foo | sed -e 's/f.*/line1\nline2/'
line1
line2

$ echo foo | POSIXLY_CORRECT=1 sed --posix -e 's/f.*/line1\nline2/'
line1
line2

Seen with GNU sed 4.9.

OpenBSD sed and Solaris sed, by contrast, produce:

$ echo foo | sed -e 's/f.*/line1\nline2/'
line1nline2

POSIX [1] is ambiguous here, I would say:
Quoting:
  "For each other <backslash> encountered, the following character shall
   lose its special meaning (if any)."
but also
  "The meaning of an unescaped <backslash> immediately followed by any
   character other than '&', <backslash>, a digit, <newline>, or the
   delimiter character used for this command, is unspecified."

So the interpretation of escape sequences looks like a GNU extension.

By the description of the '--posix' option ("In order to simplify
writing portable scripts"), the --posix option should turn off this
interpretation. Or, better, emit a diagnostic.

Bruno

[1] https://pubs.opengroup.org/onlinepubs/9799919799/utilities/sed.html
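For scripts that must behave the same under GNU, OpenBSD, and Solaris sed, one way to put a newline into the replacement that POSIX does specify is a backslash followed by a literal newline. A minimal sketch, reusing the input from the report; the function name `split_line` is purely illustrative:

```shell
# Portable form: backslash followed by a literal newline in the replacement.
# This does not rely on any interpretation of the '\n' escape sequence.
split_line() {
  sed -e 's/f.*/line1\
line2/'
}

echo foo | split_line
```

With a POSIX-conforming sed this should print line1 and line2 on separate lines, with or without '--posix'.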







Information forwarded to bug-sed <at> gnu.org:
bug#78883; Package sed. (Tue, 24 Jun 2025 00:32:01 GMT) Full text and rfc822 format available.

Message #8 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Collin Funk <collin.funk1 <at> gmail.com>
To: Bruno Haible via <bug-sed <at> gnu.org>
Cc: 78883 <at> debbugs.gnu.org, Bruno Haible <bruno <at> clisp.org>
Subject: Re: bug#78883: backslash interpretation in 's' replacement text
 violates POSIX
Date: Mon, 23 Jun 2025 17:30:53 -0700
Bruno Haible via <bug-sed <at> gnu.org> writes:

> POSIX [1] is ambiguous here, I would say:
> Quoting:
>   "For each other <backslash> encountered, the following character shall
>    lose its special meaning (if any)."
> but also
>   "The meaning of an unescaped <backslash> immediately followed by any
>    character other than '&', <backslash>, a digit, <newline>, or the
>    delimiter character used for this command, is unspecified."
>
> So the interpretation of escape sequences looks like a GNU extension.
>
> By the description of the '--posix' option ("In order to simplify
> writing portable scripts"), the --posix option should turn off this
> interpretation. Or, better, emit a diagnostic.

Just want to add a strong +1 for keeping it as a GNU extension. I find
it annoying that other systems' 'sed' commands do not behave the same
way.

I'm fine with adding a warning for when '--posix' is used though. That
would be helpful for scripts that aim to be as portable as possible
(e.g. the old gnulib-tool).

Collin
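Until such a warning exists, a script that aims to be portable could probe the local sed itself. The following is only a sketch: the helper name `probe_sed_backslash_n` and the words it prints are hypothetical, and the two recognized outputs are simply the two behaviors shown in the original report.

```shell
# Hypothetical helper: report how the local sed treats '\n' in an 's' replacement.
probe_sed_backslash_n() {
  nl='
'
  out=$(echo foo | sed -e 's/f.*/a\nb/' 2>/dev/null)
  case $out in
    anb)       echo literal ;;  # backslash dropped, 'n' kept (OpenBSD, Solaris in the report)
    "a${nl}b") echo newline ;;  # '\n' turned into a newline (GNU sed 4.9 in the report)
    *)         echo unknown ;;  # anything else, including a sed error
  esac
}

probe_sed_backslash_n
```

A configure-style script could branch on the printed word, or refuse to continue when the result is not the one it expects.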




Information forwarded to bug-sed <at> gnu.org:
bug#78883; Package sed. (Tue, 24 Jun 2025 00:42:02 GMT) Full text and rfc822 format available.

Message #14 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Bruno Haible <bruno <at> clisp.org>
To: bug-sed <at> gnu.org
Cc: 78883 <at> debbugs.gnu.org, Collin Funk <collin.funk1 <at> gmail.com>
Subject: Re: bug#78883: backslash interpretation in 's' replacement text is a
 portability problem
Date: Tue, 24 Jun 2025 02:40:57 +0200
Correcting the title. It's probably not a POSIX violation (as I initially
thought). Rather, it's a portability problem; and this is where the '--posix'
option becomes relevant.

Bruno








This bug report was last modified 2 days ago.
