GNU bug report logs - #79231
sends non-printable characters to the terminal in error message

Package: gzip

Reported by: Vincent Lefevre <vincent <at> vinc17.net>

Date: Wed, 13 Aug 2025 14:51:01 UTC

Severity: normal

To reply to this bug, email your comments to 79231 AT debbugs.gnu.org.


Report forwarded to bug-gzip <at> gnu.org:
bug#79231; Package gzip. (Wed, 13 Aug 2025 14:51:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Vincent Lefevre <vincent <at> vinc17.net>:
New bug report received and forwarded. Copy sent to bug-gzip <at> gnu.org. (Wed, 13 Aug 2025 14:51:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Vincent Lefevre <vincent <at> vinc17.net>
To: bug-gzip <at> gnu.org
Subject: sends non-printable characters to the terminal in error message
Date: Wed, 13 Aug 2025 16:49:59 +0200
gzip can send non-printable characters to the terminal in its error
message. This is bad because escape sequences and control characters
can have unpredictable consequences in the terminal.

For instance,

$ touch "$(printf "file\e[H\e[c\n\b")"
$ gunzip file*

makes xterm crash with reverseWrap enabled.

Note: The end user is not necessarily the cause of such a file name,
which may come from a downloaded archive or from a bug in some
software.
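
One way a program can avoid the problem described above is to escape
non-printable bytes before a file name reaches the terminal. The following
is a minimal sketch in C, not gzip's code: the helper name escape_name and
the choice of octal escapes are assumptions made for illustration, and for
simplicity it treats every byte outside printable ASCII as non-printable,
ignoring the locale.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper (not part of gzip): copy NAME into OUT, replacing
   every byte outside printable ASCII (0x20..0x7E), and the backslash
   itself, with a C-style octal escape such as \033.  OUT is always
   NUL-terminated when OUTSIZE > 0; the result is truncated if it does
   not fit.  Returns OUT.  */
char *
escape_name (char const *name, char *out, size_t outsize)
{
  size_t o = 0;
  if (outsize == 0)
    return out;
  for (unsigned char const *p = (unsigned char const *) name; *p; p++)
    {
      if (0x20 <= *p && *p <= 0x7e && *p != '\\')
        {
          /* Need room for this byte plus the terminating NUL.  */
          if (outsize - o < 2)
            break;
          out[o++] = (char) *p;
        }
      else
        {
          /* Need room for a 4-byte escape plus the terminating NUL.  */
          if (outsize - o < 5)
            break;
          o += (size_t) snprintf (out + o, outsize - o, "\\%03o",
                                  (unsigned int) *p);
        }
    }
  out[o] = '\0';
  return out;
}
```

With such a helper, the name from the example above would appear in a
diagnostic as file\033[H\033[c\012\010 instead of being interpreted by the
terminal.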

-- 
Vincent Lefèvre <vincent <at> vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / Pascaline project (LIP, ENS-Lyon)




Information forwarded to bug-gzip <at> gnu.org:
bug#79231; Package gzip. (Wed, 13 Aug 2025 16:09:01 GMT) Full text and rfc822 format available.

Message #8 received at 79231 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Vincent Lefevre <vincent <at> vinc17.net>
Cc: 79231 <at> debbugs.gnu.org
Subject: Re: bug#79231: sends non-printable characters to the terminal in
 error message
Date: Wed, 13 Aug 2025 09:08:26 -0700
On 8/13/25 07:49, Vincent Lefevre wrote:
> $ touch "$(printf "file\e[H\e[c\n\b")"
> $ gunzip file*

Not sure it's gzip's job to sanitize file names that the user gave it. 
Pretty much every program in the universe will output file names 
as-is, if the user tells it the file name explicitly. Why should gzip be 
an exception?




Information forwarded to bug-gzip <at> gnu.org:
bug#79231; Package gzip. (Wed, 13 Aug 2025 16:17:02 GMT) Full text and rfc822 format available.

Message #11 received at 79231 <at> debbugs.gnu.org (full text, mbox):

From: Vincent Lefevre <vincent <at> vinc17.net>
To: 79231 <at> debbugs.gnu.org
Subject: Re: sends non-printable characters to the terminal in error message
Date: Wed, 13 Aug 2025 18:16:43 +0200
On 2025-08-13 16:49:59 +0200, Vincent Lefevre wrote:
> gzip can send non-printable characters to the terminal in its error
> message. This is bad because escape sequences and control characters
> can have unpredictable consequences in the terminal.

I forgot to say: this occurs with
  * gzip 1.13 in Debian 13 (trixie);
  * gzip 1.14 under Termux/Android.





Information forwarded to bug-gzip <at> gnu.org:
bug#79231; Package gzip. (Wed, 13 Aug 2025 16:40:02 GMT) Full text and rfc822 format available.

Message #14 received at 79231 <at> debbugs.gnu.org (full text, mbox):

From: Vincent Lefevre <vincent <at> vinc17.net>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 79231 <at> debbugs.gnu.org
Subject: Re: bug#79231: sends non-printable characters to the terminal in
 error message
Date: Wed, 13 Aug 2025 18:39:17 +0200
On 2025-08-13 09:08:26 -0700, Paul Eggert wrote:
> On 8/13/25 07:49, Vincent Lefevre wrote:
> > $ touch "$(printf "file\e[H\e[c\n\b")"
> > $ gunzip file*
> 
> Not sure it's gzip's job to sanitize file names that the user gave it.
> Pretty much every program in the universe will output file names as-is,

Many programs quote non-printable characters, e.g. those from
GNU Coreutils, but also xz (XZ Utils), diff from GNU diffutils,
and find from GNU findutils (I was the one who reported the
issue for find in 2005[*]).

[*] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=311384

> if the user tells it the file name explicitly.

Well, it is supplied by the shell (via the file* expansion), not by the user explicitly.
But the shell cannot sanitize the file name; otherwise gzip
would not find the file.

So, this would be up to the file system to prevent the creation
of such file names (I don't know what POSIX says on this point,
but POSIX might also require the opposite).

> Why should gzip be an exception?

Not really an exception (see above).





Information forwarded to bug-gzip <at> gnu.org:
bug#79231; Package gzip. (Wed, 13 Aug 2025 18:48:01 GMT) Full text and rfc822 format available.

Message #17 received at 79231 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Vincent Lefevre <vincent <at> vinc17.net>
Cc: 79231 <at> debbugs.gnu.org
Subject: Re: bug#79231: sends non-printable characters to the terminal in
 error message
Date: Wed, 13 Aug 2025 11:47:29 -0700
On 8/13/25 09:39, Vincent Lefevre wrote:
> Many programs quote non-printable characters, e.g. those from
> GNU Coreutils

Oh, thanks, I didn't know that. I see this was added to coreutils 
several years ago. In that case, patches to do this for gzip would be 
welcome.




Information forwarded to bug-gzip <at> gnu.org:
bug#79231; Package gzip. (Sun, 17 Aug 2025 02:02:01 GMT) Full text and rfc822 format available.

Message #20 received at 79231 <at> debbugs.gnu.org (full text, mbox):

From: Collin Funk <collin.funk1 <at> gmail.com>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: Vincent Lefevre <vincent <at> vinc17.net>, 79231 <at> debbugs.gnu.org
Subject: Re: bug#79231: sends non-printable characters to the terminal in
 error message
Date: Sat, 16 Aug 2025 19:01:12 -0700
Paul Eggert <eggert <at> cs.ucla.edu> writes:

> On 8/13/25 09:39, Vincent Lefevre wrote:
>> Many programs quote non-printable characters, e.g. those from
>> GNU Coreutils
>
> Oh, thanks, I didn't know that. I see this was added to coreutils
> several years ago. In that case, patches to do this for gzip would be
> welcome.

Is there any reason that gzip doesn't use quote and error from Gnulib?
e.g. to avoid dependencies on locale stuff?

I'm assuming that it is just because no one has cared enough to add it
to gzip, but that feels like the correct solution to this issue.

There are some places where it is a bit more work than adding
quote/quote_n like this:

    fprintf(stderr,"%s: %s/%s: pathname too long\n",
            program_name, dir, entry);

Ideally we could get rid of the MAX_PATH_LEN limitation on file names
(see GNU Coding Standards [1]) and therefore never have to print this
message. But that is more complex than this issue...
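
To illustrate why a message with two name components needs a little more
care, here is a rough, self-contained sketch of the idea behind numbered
quoting slots. It is not Gnulib's implementation: escape_slot and
warn_too_long are hypothetical names, the fixed slot size is an assumption,
and the escaping is ASCII-only rather than locale-aware.

```c
#include <stdio.h>
#include <stddef.h>

enum { NSLOTS = 2, SLOT_SIZE = 256 };

/* Hypothetical stand-in for the idea behind Gnulib's quote_n: each slot N
   has its own static buffer, so two escaped names can be passed to one
   fprintf call without overwriting each other.  Bytes outside printable
   ASCII, and the backslash, become octal escapes; long names are
   truncated.  The caller must pass 0 <= N < NSLOTS.  */
char const *
escape_slot (int n, char const *name)
{
  static char slots[NSLOTS][SLOT_SIZE];
  char *out = slots[n];
  size_t o = 0;
  /* Stop while there is still room for one 4-byte escape plus the NUL.  */
  for (unsigned char const *p = (unsigned char const *) name;
       *p && o + 5 <= SLOT_SIZE; p++)
    {
      if (0x20 <= *p && *p <= 0x7e && *p != '\\')
        out[o++] = (char) *p;
      else
        o += (size_t) snprintf (out + o, SLOT_SIZE - o, "\\%03o",
                                (unsigned int) *p);
    }
  out[o] = '\0';
  return out;
}

/* Variant of the diagnostic quoted above, with both components escaped.  */
void
warn_too_long (FILE *stream, char const *program_name,
               char const *dir, char const *entry)
{
  fprintf (stream, "%s: %s/%s: pathname too long\n",
           program_name, escape_slot (0, dir), escape_slot (1, entry));
}
```

Using a separate slot per argument is what lets both escaped strings stay
valid until fprintf has consumed them; a single shared static buffer would
print the same name twice.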

Collin

[1] https://www.gnu.org/prep/standards/standards.html#Semantics




Information forwarded to bug-gzip <at> gnu.org:
bug#79231; Package gzip. (Sun, 17 Aug 2025 15:46:01 GMT) Full text and rfc822 format available.

Message #23 received at 79231 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Collin Funk <collin.funk1 <at> gmail.com>
Cc: Vincent Lefevre <vincent <at> vinc17.net>, 79231 <at> debbugs.gnu.org
Subject: Re: bug#79231: sends non-printable characters to the terminal in
 error message
Date: Sun, 17 Aug 2025 08:45:32 -0700
On 8/16/25 19:01, Collin Funk wrote:
> Is there any reason that gzip doesn't use quote and error from Gnulib?
> e.g. to avoid dependencies on locale stuff?

Partly that, and partly because it's a symptom of a larger issue: gzip 
was written in a hurry and is poorly structured and people 
understandably don't want to mess with it. Decades ago I toyed with the 
idea of rewriting it from scratch but gave it up as a job not worth doing.




Information forwarded to bug-gzip <at> gnu.org:
bug#79231; Package gzip. (Sun, 17 Aug 2025 23:31:02 GMT) Full text and rfc822 format available.

Message #26 received at 79231 <at> debbugs.gnu.org (full text, mbox):

From: Antonio Diaz Diaz <antonio <at> gnu.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: Vincent Lefevre <vincent <at> vinc17.net>, 79231 <at> debbugs.gnu.org,
 Collin Funk <collin.funk1 <at> gmail.com>
Subject: Re: bug#79231: sends non-printable characters to the terminal in
 error message
Date: Mon, 18 Aug 2025 01:31:38 +0200
Paul Eggert wrote:
> On 8/16/25 19:01, Collin Funk wrote:
>> Is there any reason that gzip doesn't use quote and error from Gnulib?
>> e.g. to avoid dependencies on locale stuff?
>
> Partly that, and partly because it's a symptom of a larger issue: gzip
> was written in a hurry and is poorly structured and people
> understandably don't want to mess with it.

Maybe an alternative to searching for all the places where gzip would need
to be patched could be to reject outright any file name containing a
control character, i.e. any byte ch with (ch >= 1 && ch <= 31) || ch == 127.

If a file with such a name needs to be decompressed, it can be redirected to 
standard input.
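
The check proposed above can be sketched as a small C predicate. The
function name has_control_char is hypothetical; only the byte range comes
from this message.

```c
#include <stdbool.h>

/* Hypothetical helper: return true if NAME contains a byte in the
   proposed range, i.e. 1..31 or 127.  Bytes >= 128 (for example UTF-8
   sequences) are not treated as control characters here.  */
bool
has_control_char (char const *name)
{
  for (unsigned char const *p = (unsigned char const *) name; *p; p++)
    if ((1 <= *p && *p <= 31) || *p == 127)
      return true;
  return false;
}
```

A caller could test each command-line operand with this predicate and
refuse the operand, with a fixed diagnostic that does not echo the name,
when it returns true.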

POSIX is encouraging implementations to disallow the creation of file names 
containing any bytes that have the encoded value of a <newline> character. 
See https://pubs.opengroup.org/onlinepubs/9799919799/utilities/compress.html 
section CHANGE HISTORY subsection Issue 8.

Since January 2024, GNU ed is rejecting by default file names containing 
control chars and nobody has complained yet.

Best regards,
Antonio.




Information forwarded to bug-gzip <at> gnu.org:
bug#79231; Package gzip. (Sun, 17 Aug 2025 23:54:02 GMT) Full text and rfc822 format available.

Message #29 received at 79231 <at> debbugs.gnu.org (full text, mbox):

From: Vincent Lefevre <vincent <at> vinc17.net>
To: Antonio Diaz Diaz <antonio <at> gnu.org>
Cc: Paul Eggert <eggert <at> cs.ucla.edu>, 79231 <at> debbugs.gnu.org,
 Collin Funk <collin.funk1 <at> gmail.com>
Subject: Re: bug#79231: sends non-printable characters to the terminal in
 error message
Date: Mon, 18 Aug 2025 01:53:43 +0200
On 2025-08-18 01:31:38 +0200, Antonio Diaz Diaz wrote:
> Since January 2024, GNU ed is rejecting by default file names containing
> control chars and nobody has complained yet.

Perhaps for creation, but not as input, where GNU ed outputs
non-printable characters to the terminal due to the file name in
the error message. So this does not solve the problem. GNU ed is
as buggy as gzip in this respect. I've just reported the bug.





Information forwarded to bug-gzip <at> gnu.org:
bug#79231; Package gzip. (Sun, 17 Aug 2025 23:58:02 GMT) Full text and rfc822 format available.

Message #32 received at 79231 <at> debbugs.gnu.org (full text, mbox):

From: Vincent Lefevre <vincent <at> vinc17.net>
To: Antonio Diaz Diaz <antonio <at> gnu.org>
Cc: Paul Eggert <eggert <at> cs.ucla.edu>, 79231 <at> debbugs.gnu.org,
 Collin Funk <collin.funk1 <at> gmail.com>
Subject: Re: bug#79231: sends non-printable characters to the terminal in
 error message
Date: Mon, 18 Aug 2025 01:57:27 +0200
On 2025-08-18 01:53:43 +0200, Vincent Lefevre wrote:
> On 2025-08-18 01:31:38 +0200, Antonio Diaz Diaz wrote:
> > Since January 2024, GNU ed is rejecting by default file names containing
> > control chars and nobody has complained yet.
> 
> Perhaps for creation, but not as input, where GNU ed outputs
> non-printable characters to the terminal due to the file name in
> the error message. So this does not solve the problem. GNU ed is
> as buggy as gzip in this respect. I've just reported the bug.

In short, the only way to avoid the issue in any program would
be to make the Linux kernel prevent the creation of such
file names in the first place (well, archive utilities would
also need to filter such characters for their output in case
they can appear in archives).





Information forwarded to bug-gzip <at> gnu.org:
bug#79231; Package gzip. (Tue, 19 Aug 2025 10:55:02 GMT) Full text and rfc822 format available.

Message #35 received at 79231 <at> debbugs.gnu.org (full text, mbox):

From: Antonio Diaz Diaz <antonio <at> gnu.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: Vincent Lefevre <vincent <at> vinc17.net>, 79231 <at> debbugs.gnu.org
Subject: Re: bug#79231: sends non-printable characters to the terminal in
 error message
Date: Tue, 19 Aug 2025 12:55:42 +0200
Paul Eggert wrote:
> On 8/13/25 07:49, Vincent Lefevre wrote:
>> $ touch "$(printf "file\e[H\e[c\n\b")"
>> $ gunzip file*
>
> Not sure it's gzip's job to sanitize file names that the user gave it.
> Pretty much every program in the universe will output file names
> as-is, if the user tells it the file name explicitly. Why should gzip be
> an exception?

After thinking about it, I have reached the conclusion that the command line 
is an internal interface, much like memcpy. I.e., it is not directly exposed 
to the outside world. The efficient way to make an internal interface safe
is to make the caller responsible for supplying valid arguments.

For example, if the caller supplies to memcpy pointers to objects that 
overlap, the behavior is undefined. Likewise, if the caller of gzip supplies 
it with invalid file names, the behavior is undefined. (The caller of gzip 
can be the user or another program).

The case of tools like 'ls' or 'mv' is different. They must accept invalid 
file names because they are the means to check and rename files to give them 
valid names.

Therefore, I would agree with this issue being "fixed" by adding something 
like the following to the gzip documentation:

It is the responsibility of the caller to check that the file names supplied 
to gzip are valid. (For example, that they do not contain unprintable 
characters).

Best regards,
Antonio.




Information forwarded to bug-gzip <at> gnu.org:
bug#79231; Package gzip. (Tue, 19 Aug 2025 11:11:02 GMT) Full text and rfc822 format available.

Message #38 received at 79231 <at> debbugs.gnu.org (full text, mbox):

From: Vincent Lefevre <vincent <at> vinc17.net>
To: Antonio Diaz Diaz <antonio <at> gnu.org>
Cc: Paul Eggert <eggert <at> cs.ucla.edu>, 79231 <at> debbugs.gnu.org
Subject: Re: bug#79231: sends non-printable characters to the terminal in
 error message
Date: Tue, 19 Aug 2025 13:10:33 +0200
On 2025-08-19 12:55:42 +0200, Antonio Diaz Diaz wrote:
> Paul Eggert wrote:
> > On 8/13/25 07:49, Vincent Lefevre wrote:
> > > $ touch "$(printf "file\e[H\e[c\n\b")"
> > > $ gunzip file*
> > 
> > Not sure it's gzip's job to sanitize file names that the user gave it.
> > Pretty much every program in the universe will output file names
> > as-is, if the user tells it the file name explicitly. Why should gzip be
> > an exception?
> 
> After thinking about it, I have reached the conclusion that the command line
> is an internal interface, much like memcpy. I.e., it is not directly exposed
> to the outside world. The efficient way to make an internal interface safe
> is to make the caller responsible for supplying valid arguments.

As long as the file system allows the creation of such file names,
or at least makes them visible via the usual means, they must be
regarded as valid.

> For example, if the caller supplies to memcpy pointers to objects that
> overlap, the behavior is undefined. Likewise, if the caller of gzip supplies
> it with invalid file names, the behavior is undefined. (The caller of gzip
> can be the user or another program).

Pointers are under the control of the developer. Not the filenames,
which may come from external sources.

BTW, even undefined behavior in programming may be a very bad idea.
It is mostly there for efficiency reasons. There are programming
languages that avoid such issues.

> The case of tools like 'ls' or 'mv' is different. They must accept invalid
> file names because they are the means to check and rename files to give them
> valid names.

They should not be different, since they are all standard utilities. This
would introduce an inconsistency in the handling of files.

For administrative purposes, there could be additional utilities (or the
same utilities with options to make such files visible) if one considers
that such files should normally be hidden.





Information forwarded to bug-gzip <at> gnu.org:
bug#79231; Package gzip. (Tue, 19 Aug 2025 21:49:01 GMT) Full text and rfc822 format available.

Message #41 received at 79231 <at> debbugs.gnu.org (full text, mbox):

From: Collin Funk <collin.funk1 <at> gmail.com>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: Vincent Lefevre <vincent <at> vinc17.net>, 79231 <at> debbugs.gnu.org
Subject: Re: bug#79231: sends non-printable characters to the terminal in
 error message
Date: Tue, 19 Aug 2025 14:48:28 -0700
Paul Eggert <eggert <at> cs.ucla.edu> writes:

> On 8/16/25 19:01, Collin Funk wrote:
>> Is there any reason that gzip doesn't use quote and error from Gnulib?
>> e.g. to avoid dependencies on locale stuff?
>
> Partly that, and partly because it's a symptom of a larger issue: gzip
> was written in a hurry and is poorly structured and people
> understandably don't want to mess with it. Decades ago I toyed with
> the idea of rewriting it from scratch but gave it up as a job not
> worth doing.

Yep, I wanted to get rid of some of the ancient cruft (e.g. pre-standard
C tolower [1]), but it is a chore. One has to be careful with patches
that are less trivial than that one.

It probably makes sense to rewrite 'gzip' using zlib which is more
portable than Gnulib (IIRC) and very stable. If one wants experimental
optimizations they could link to zlib-ng instead. But also, pigz exists
using zlib, so maybe there isn't much benefit...

Collin

[1] https://debbugs.gnu.org/cgi/bugreport.cgi?bug=74196




Information forwarded to bug-gzip <at> gnu.org:
bug#79231; Package gzip. (Tue, 19 Aug 2025 21:58:02 GMT) Full text and rfc822 format available.

Message #44 received at 79231 <at> debbugs.gnu.org (full text, mbox):

From: Collin Funk <collin.funk1 <at> gmail.com>
To: Antonio Diaz Diaz <antonio <at> gnu.org>
Cc: Paul Eggert <eggert <at> cs.ucla.edu>, Vincent Lefevre <vincent <at> vinc17.net>,
 79231 <at> debbugs.gnu.org
Subject: Re: bug#79231: sends non-printable characters to the terminal in
 error message
Date: Tue, 19 Aug 2025 14:57:47 -0700
Antonio Diaz Diaz <antonio <at> gnu.org> writes:

> POSIX is encouraging implementations to disallow the creation of file
> names containing any bytes that have the encoded value of a <newline>
> character. See
> https://pubs.opengroup.org/onlinepubs/9799919799/utilities/compress.html
> section CHANGE HISTORY subsection Issue 8.

I meant to get around to mentioning this change on the Coreutils mailing
list.

It is my impression that this change was to avoid issues like the
following example:

    $ touch $'abc\ndef'
    $ for file in `find . -type f`; do stat "$file"; done
    stat: cannot statx './abc': No such file or directory
    stat: cannot statx 'def': No such file or directory

I don't think it was intended to prevent non-printable file names
from being emitted to terminals.

I don't like the idea of rejecting file names that the file system
and/or operating system allow. If POSIX were to mandate the behavior,
then my personal preference would be to hide it behind POSIXLY_CORRECT.

Collin




Information forwarded to bug-gzip <at> gnu.org:
bug#79231; Package gzip. (Wed, 20 Aug 2025 01:32:02 GMT) Full text and rfc822 format available.

Message #47 received at 79231 <at> debbugs.gnu.org (full text, mbox):

From: Vincent Lefevre <vincent <at> vinc17.net>
To: Collin Funk <collin.funk1 <at> gmail.com>
Cc: Antonio Diaz Diaz <antonio <at> gnu.org>, 79231 <at> debbugs.gnu.org,
 Paul Eggert <eggert <at> cs.ucla.edu>
Subject: Re: bug#79231: sends non-printable characters to the terminal in
 error message
Date: Wed, 20 Aug 2025 03:31:47 +0200
On 2025-08-19 14:57:47 -0700, Collin Funk wrote:
> Antonio Diaz Diaz <antonio <at> gnu.org> writes:
> 
> > POSIX is encouraging implementations to disallow the creation of file
> > names containing any bytes that have the encoded value of a <newline>
> > character. See
> > https://pubs.opengroup.org/onlinepubs/9799919799/utilities/compress.html
> > section CHANGE HISTORY subsection Issue 8.
> 
[...]
> 
> I don't like the idea of rejecting file names that the file system
> and/or operating system allow. If POSIX were to mandate the behavior,
> then my personal preference would be to hide it behind POSIXLY_CORRECT.

https://www.austingroupbugs.net/view.php?id=251 proposed to add the
paragraph

  Implementations are encouraged to have fopen() and freopen() report
  an [EILSEQ] error if mode begins with 'w' or 'a', the file did not
  previously exist, and the last component of pathname contains any
  bytes that have the encoded value of a <newline> character.

and similar text for other functions.
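
The behavior encouraged by that paragraph can be approximated in user space
as follows. This is only a sketch: fopen_reject_newline is a hypothetical
name, and an actual implementation would perform the check inside the C
library or kernel rather than in a wrapper.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical wrapper: behaves like fopen, except that it fails with
   EILSEQ when MODE begins with 'w' or 'a', PATHNAME does not already
   exist, and the last component of PATHNAME contains a newline.
   The existence check and the open are separate steps, so this sketch
   is racy.  */
FILE *
fopen_reject_newline (char const *pathname, char const *mode)
{
  char const *slash = strrchr (pathname, '/');
  char const *last = slash ? slash + 1 : pathname;

  if ((mode[0] == 'w' || mode[0] == 'a')
      && strchr (last, '\n') != NULL
      && access (pathname, F_OK) != 0
      && errno == ENOENT)
    {
      errno = EILSEQ;
      return NULL;
    }
  return fopen (pathname, mode);
}
```

Existing files whose names contain a newline can still be opened, in any
mode, which matches the "did not previously exist" condition in the
proposed text.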

But the consequence is that it will not be possible for "compress" to
create such a file. The paragraph

  If this utility is directed to create a new directory entry that
  contains any bytes that have the encoded value of a <newline>
  character, implementations are encouraged to treat this as an error.
  A future version of this standard may require implementations to
  treat this as an error.

could just mean that this is a consequence of fopen() failing
to create the file. (Implementations should be regarded as a whole.)





This bug report was last modified 80 days ago.
