GNU bug report logs - #79231
sends non-printable characters to the terminal in error message
To reply to this bug, email your comments to 79231 AT debbugs.gnu.org.
Report forwarded to bug-gzip <at> gnu.org: bug#79231; Package gzip. (Wed, 13 Aug 2025 14:51:02 GMT)
Acknowledgement sent to Vincent Lefevre <vincent <at> vinc17.net>: New bug report received and forwarded. Copy sent to bug-gzip <at> gnu.org. (Wed, 13 Aug 2025 14:51:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org:
gzip can send non-printable characters to the terminal in its error
message. This is bad because escape sequences and control characters
can have unpredictable consequences in the terminal.
For instance,
  $ touch "$(printf "file\e[H\e[c\n\b")"
  $ gunzip file*
makes xterm crash with reverseWrap enabled.
Note: The end user is not necessarily the cause of such a file name,
which may come from a downloaded archive or from a bug in some
software.
--
Vincent Lefèvre <vincent <at> vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / Pascaline project (LIP, ENS-Lyon)
Information forwarded to bug-gzip <at> gnu.org: bug#79231; Package gzip. (Wed, 13 Aug 2025 16:09:01 GMT)
Message #8 received at 79231 <at> debbugs.gnu.org:
On 8/13/25 07:49, Vincent Lefevre wrote:
> $ touch "$(printf "file\e[H\e[c\n\b")"
> $ gunzip file*
Not sure it's gzip's job to sanitize file names that the user gave it.
Pretty much every program in the universe will output file names
as-is, if the user tells it the file name explicitly. Why should gzip be
an exception?
Information forwarded to bug-gzip <at> gnu.org: bug#79231; Package gzip. (Wed, 13 Aug 2025 16:17:02 GMT)
Message #11 received at 79231 <at> debbugs.gnu.org:
On 2025-08-13 16:49:59 +0200, Vincent Lefevre wrote:
> gzip can send non-printable characters to the terminal in its error
> message. This is bad because escape sequences and control characters
> can have unpredictable consequences in the terminal.
I forgot to say: this occurs with
* gzip 1.13 in Debian 13 (trixie);
* gzip 1.14 under Termux/Android.
--
Vincent Lefèvre <vincent <at> vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / Pascaline project (LIP, ENS-Lyon)
Information forwarded to bug-gzip <at> gnu.org: bug#79231; Package gzip. (Wed, 13 Aug 2025 16:40:02 GMT)
Message #14 received at 79231 <at> debbugs.gnu.org:
On 2025-08-13 09:08:26 -0700, Paul Eggert wrote:
> On 8/13/25 07:49, Vincent Lefevre wrote:
> > $ touch "$(printf "file\e[H\e[c\n\b")"
> > $ gunzip file*
>
> Not sure it's gzip's job to sanitize file names that the user gave it.
> Pretty much every program in the universe will output file names as-is,
Many programs quote non-printable characters, e.g. those from
GNU Coreutils, but also xz (XZ Utils), diff from GNU diffutils,
and find from GNU findutils (I was the one who reported the
issue for find in 2005[*]).
[*] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=311384
> if the user tells it the file name explicitly.
Well, it is supplied by the shell, not explicitly by the user.
But the shell cannot sanitize the file name; otherwise gzip
would not find the file.
So it would be up to the file system to prevent the creation
of such file names (I don't know what POSIX says on this point;
it might even require the opposite).
> Why should gzip be an exception?
Not really an exception (see above).
--
Vincent Lefèvre <vincent <at> vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / Pascaline project (LIP, ENS-Lyon)
Information forwarded to bug-gzip <at> gnu.org: bug#79231; Package gzip. (Wed, 13 Aug 2025 18:48:01 GMT)
Message #17 received at 79231 <at> debbugs.gnu.org:
On 8/13/25 09:39, Vincent Lefevre wrote:
> Many programs quote non-printable characters, e.g. those from
> GNU Coreutils
Oh, thanks, I didn't know that. I see this was added to coreutils
several years ago. In that case, patches to do this for gzip would be
welcome.
Information forwarded to bug-gzip <at> gnu.org: bug#79231; Package gzip. (Sun, 17 Aug 2025 02:02:01 GMT)
Message #20 received at 79231 <at> debbugs.gnu.org:
Paul Eggert <eggert <at> cs.ucla.edu> writes:
> On 8/13/25 09:39, Vincent Lefevre wrote:
>> Many programs quote non-printable characters, e.g. those from
>> GNU Coreutils
>
> Oh, thanks, I didn't know that. I see this was added to coreutils
> several years ago. In that case, patches to do this for gzip would be
> welcome.
Is there any reason that gzip doesn't use quote and error from Gnulib?
e.g. to avoid dependencies on locale stuff?
I'm assuming that it is just because no one has cared enough to add it
to gzip, but that feels like the correct solution to this issue.
There are some places where it is a bit more work than adding
quote/quote_n like this:
    fprintf(stderr,"%s: %s/%s: pathname too long\n",
            program_name, dir, entry);
Ideally we could get rid of the MAX_PATH_LEN limitation on file names
(see GNU Coding Standards [1]) and therefore never have to print this
message. But that is more complex than this issue...
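As a rough illustration of what quoting the names in the "pathname too long" message could look like, here is a minimal sketch. It does not use Gnulib: quoted_n is a hypothetical stand-in for Gnulib's quote_n, and format_too_long, NSLOTS and SLOT_SIZE are likewise invented for this example. It assumes that every byte outside printable ASCII should be escaped, which ignores the locale, and it truncates over-long names instead of reallocating.

```c
#include <stdio.h>

enum { NSLOTS = 2, SLOT_SIZE = 4096 };

/* Hypothetical stand-in for Gnulib's quote_n (not the real thing):
   return static slot N (0 <= N < NSLOTS) holding NAME with
   backslashes doubled and every byte outside printable ASCII
   written as a \ooo octal escape.  Over-long names are truncated.  */
const char *
quoted_n (int n, const char *name)
{
  static char slot[NSLOTS][SLOT_SIZE];
  char *out = slot[n];
  size_t len = 0;

  /* Keep room for the longest escape (4 bytes) plus the final NUL.  */
  for (const unsigned char *p = (const unsigned char *) name;
       *p && len + 5 <= SLOT_SIZE; p++)
    {
      if (*p == '\\')
        {
          out[len++] = '\\';
          out[len++] = '\\';
        }
      else if (*p >= 0x20 && *p < 0x7f)
        out[len++] = (char) *p;
      else
        len += (size_t) sprintf (out + len, "\\%03o", (unsigned int) *p);
    }
  out[len] = '\0';
  return out;
}

/* Build the "pathname too long" message with both names escaped.
   Using two different slots lets both results coexist in one call.  */
int
format_too_long (char *msg, size_t size, const char *prog,
                 const char *dir, const char *entry)
{
  return snprintf (msg, size, "%s: %s/%s: pathname too long\n",
                   prog, quoted_n (0, dir), quoted_n (1, entry));
}
```

With dir "d\n" and entry "e\033" (a newline and an ESC byte), the message text contains d\012/e\033 as plain printable characters, so nothing reaches the terminal as a control sequence.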
Collin
[1] https://www.gnu.org/prep/standards/standards.html#Semantics
Information forwarded to bug-gzip <at> gnu.org: bug#79231; Package gzip. (Sun, 17 Aug 2025 15:46:01 GMT)
Message #23 received at 79231 <at> debbugs.gnu.org:
On 8/16/25 19:01, Collin Funk wrote:
> Is there any reason that gzip doesn't use quote and error from Gnulib?
> e.g. to avoid dependencies on locale stuff?
Partly that, and partly because it's a symptom of a larger issue: gzip
was written in a hurry and is poorly structured and people
understandably don't want to mess with it. Decades ago I toyed with the
idea of rewriting it from scratch but gave it up as a job not worth doing.
Information forwarded to bug-gzip <at> gnu.org: bug#79231; Package gzip. (Sun, 17 Aug 2025 23:31:02 GMT)
Message #26 received at 79231 <at> debbugs.gnu.org:
Paul Eggert wrote:
> On 8/16/25 19:01, Collin Funk wrote:
>> Is there any reason that gzip doesn't use quote and error from Gnulib?
>> e.g. to avoid dependencies on locale stuff?
>
> Partly that, and partly because it's a symptom of a larger issue: gzip
> was written in a hurry and is poorly structured and people
> understandably don't want to mess with it.
Maybe an alternative to searching for all the places where gzip would need
to be patched could be to reject outright any file name containing any
control char, i.e. any ch with (ch >= 1 && ch <= 31) || ch == 127.
If a file with such a name needs to be decompressed, it can be redirected to
standard input.
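A minimal sketch of such a check, assuming it would run on each file name operand before gzip opens the file. The function name name_has_control_char is hypothetical and does not exist in gzip; bytes of 128 and above (for example UTF-8 sequences) are deliberately left alone, matching the range given above.

```c
#include <stdbool.h>

/* Hypothetical check, not part of gzip: return true if NAME contains
   a control character in the range 1..31, or DEL (127).  */
bool
name_has_control_char (const char *name)
{
  for (const unsigned char *p = (const unsigned char *) name; *p; p++)
    if ((*p >= 1 && *p <= 31) || *p == 127)
      return true;
  return false;
}
```

A caller could print a fixed diagnostic that does not include the name and skip the operand when this returns true.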
POSIX is encouraging implementations to disallow the creation of file names
containing any bytes that have the encoded value of a <newline> character.
See https://pubs.opengroup.org/onlinepubs/9799919799/utilities/compress.html
section CHANGE HISTORY subsection Issue 8.
Since January 2024, GNU ed has rejected file names containing control
chars by default, and nobody has complained yet.
Best regards,
Antonio.
Information forwarded to bug-gzip <at> gnu.org: bug#79231; Package gzip. (Sun, 17 Aug 2025 23:54:02 GMT)
Message #29 received at 79231 <at> debbugs.gnu.org:
On 2025-08-18 01:31:38 +0200, Antonio Diaz Diaz wrote:
> Since January 2024, GNU ed has rejected file names containing control
> chars by default, and nobody has complained yet.
Perhaps for creation, but not as input, where GNU ed outputs
non-printable characters to the terminal due to the file name in
the error message. So this does not solve the problem. GNU ed is
as buggy as gzip in this respect. I've just reported the bug.
--
Vincent Lefèvre <vincent <at> vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / Pascaline project (LIP, ENS-Lyon)
Information forwarded to bug-gzip <at> gnu.org: bug#79231; Package gzip. (Sun, 17 Aug 2025 23:58:02 GMT)
Message #32 received at 79231 <at> debbugs.gnu.org:
On 2025-08-18 01:53:43 +0200, Vincent Lefevre wrote:
> On 2025-08-18 01:31:38 +0200, Antonio Diaz Diaz wrote:
> > Since January 2024, GNU ed has rejected file names containing control
> > chars by default, and nobody has complained yet.
>
> Perhaps for creation, but not as input, where GNU ed outputs
> non-printable characters to the terminal due to the file name in
> the error message. So this does not solve the problem. GNU ed is
> as buggy as gzip in this respect. I've just reported the bug.
In short, the only way to avoid the issue in every program would
be to make the Linux kernel prevent the creation of such
file names in the first place (well, archive utilities would
also need to filter such characters from their output in case
they can appear in archives).
--
Vincent Lefèvre <vincent <at> vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / Pascaline project (LIP, ENS-Lyon)
Information forwarded to bug-gzip <at> gnu.org: bug#79231; Package gzip. (Tue, 19 Aug 2025 10:55:02 GMT)
Message #35 received at 79231 <at> debbugs.gnu.org:
Paul Eggert wrote:
> On 8/13/25 07:49, Vincent Lefevre wrote:
>> $ touch "$(printf "file\e[H\e[c\n\b")"
>> $ gunzip file*
>
> Not sure it's gzip's job to sanitize file names that the user gave it.
> Pretty much every program in the universe will output file names
> as-is, if the user tells it the file name explicitly. Why should gzip be
> an exception?
After thinking about it, I have reached the conclusion that the command line
is an internal interface, much like memcpy. I.e., it is not directly exposed
to the outside world. The efficient way to make an internal interface safe
is to make the caller responsible for supplying valid arguments.
For example, if the caller supplies to memcpy pointers to objects that
overlap, the behavior is undefined. Likewise, if the caller of gzip supplies
it with invalid file names, the behavior is undefined. (The caller of gzip
can be the user or another program).
The case of tools like 'ls' or 'mv' is different. They must accept invalid
file names because they are the means to check and rename files to give them
valid names.
Therefore, I would agree with this issue being "fixed" by adding something
like the following to the gzip documentation:
It is the responsibility of the caller to check that the file names supplied
to gzip are valid. (For example, that they do not contain unprintable
characters).
Best regards,
Antonio.
Information forwarded to bug-gzip <at> gnu.org: bug#79231; Package gzip. (Tue, 19 Aug 2025 11:11:02 GMT)
Message #38 received at 79231 <at> debbugs.gnu.org:
On 2025-08-19 12:55:42 +0200, Antonio Diaz Diaz wrote:
> Paul Eggert wrote:
> > On 8/13/25 07:49, Vincent Lefevre wrote:
> > > $ touch "$(printf "file\e[H\e[c\n\b")"
> > > $ gunzip file*
> >
> > Not sure it's gzip's job to sanitize file names that the user gave it.
> > Pretty much every program in the universe will output file names
> > as-is, if the user tells it the file name explicitly. Why should gzip be
> > an exception?
>
> After thinking about it, I have reached the conclusion that the command line
> is an internal interface, much like memcpy. I.e., it is not directly exposed
> to the outside world. The efficient way to make an internal interface safe
> is to make the caller responsible for supplying valid arguments.
As long as the file system allows the creation of such file names,
or at least makes them visible via the usual means, they must be
regarded as valid.
> For example, if the caller supplies to memcpy pointers to objects that
> overlap, the behavior is undefined. Likewise, if the caller of gzip supplies
> it with invalid file names, the behavior is undefined. (The caller of gzip
> can be the user or another program).
Pointers are under the control of the developer. File names are not:
they may come from external sources.
BTW, even undefined behavior in programming may be a very bad idea.
It is mostly there for efficiency reasons. There are programming
languages that avoid such issues.
> The case of tools like 'ls' or 'mv' is different. They must accept invalid
> file names because they are the means to check and rename files to give them
> valid names.
They should not be different, since they are all standard utilities. That
would introduce an inconsistency in the handling of files.
For administrative purposes, there could be additional utilities (or the same
utilities with options to make such files visible) if one considers
that such files should normally be hidden.
--
Vincent Lefèvre <vincent <at> vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / Pascaline project (LIP, ENS-Lyon)
Information forwarded to bug-gzip <at> gnu.org: bug#79231; Package gzip. (Tue, 19 Aug 2025 21:49:01 GMT)
Message #41 received at 79231 <at> debbugs.gnu.org:
Paul Eggert <eggert <at> cs.ucla.edu> writes:
> On 8/16/25 19:01, Collin Funk wrote:
>> Is there any reason that gzip doesn't use quote and error from Gnulib?
>> e.g. to avoid dependencies on locale stuff?
>
> Partly that, and partly because it's a symptom of a larger issue: gzip
> was written in a hurry and is poorly structured and people
> understandably don't want to mess with it. Decades ago I toyed with
> the idea of rewriting it from scratch but gave it up as a job not
> worth doing.
Yep, I wanted to get rid of some of the ancient cruft (e.g. pre-standard
C tolower [1]), but it is a chore. One has to be careful with patches
that are less trivial than that one.
It probably makes sense to rewrite 'gzip' using zlib which is more
portable than Gnulib (IIRC) and very stable. If one wants experimental
optimizations they could link to zlib-ng instead. But also, pigz exists
using zlib, so maybe there isn't much benefit...
Collin
[1] https://debbugs.gnu.org/cgi/bugreport.cgi?bug=74196
Information forwarded to bug-gzip <at> gnu.org: bug#79231; Package gzip. (Tue, 19 Aug 2025 21:58:02 GMT)
Message #44 received at 79231 <at> debbugs.gnu.org:
Antonio Diaz Diaz <antonio <at> gnu.org> writes:
> POSIX is encouraging implementations to disallow the creation of file
> names containing any bytes that have the encoded value of a <newline>
> character. See
> https://pubs.opengroup.org/onlinepubs/9799919799/utilities/compress.html
> section CHANGE HISTORY subsection Issue 8.
I meant to get around to mentioning this change on the Coreutils mailing
list.
It is my impression that this change was to avoid issues like the
following example:
    $ touch $'abc\ndef'
    $ for file in `find . -type f`; do stat "$file"; done
    stat: cannot statx './abc': No such file or directory
    stat: cannot statx 'def': No such file or directory
I don't think it was intended to prevent non-printable file names
from being emitted to terminals.
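The ambiguity in that example comes from using newline as the separator between names. A minimal C sketch of the same effect, assuming a list that is newline-terminated (as with find -print) versus NUL-terminated (as with find -print0); count_records is a hypothetical helper written only for this illustration:

```c
#include <stddef.h>

/* Hypothetical helper: count the records in BUF[0..LEN-1], where
   each record is terminated by the byte DELIM.  */
size_t
count_records (const char *buf, size_t len, char delim)
{
  size_t n = 0;
  for (size_t i = 0; i < len; i++)
    if (buf[i] == delim)
      n++;
  return n;
}
```

For the single file name "./abc\ndef", a newline-terminated list is counted as two records, while the NUL-terminated form is counted as one, because a NUL byte cannot occur inside a file name.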
I don't like the idea of rejecting file names that the file system
and/or operating system allow. If POSIX were to mandate the behavior,
then my personal preference would be to hide it behind POSIXLY_CORRECT.
Collin
Information forwarded to bug-gzip <at> gnu.org: bug#79231; Package gzip. (Wed, 20 Aug 2025 01:32:02 GMT)
Message #47 received at 79231 <at> debbugs.gnu.org:
On 2025-08-19 14:57:47 -0700, Collin Funk wrote:
> Antonio Diaz Diaz <antonio <at> gnu.org> writes:
>
> > POSIX is encouraging implementations to disallow the creation of file
> > names containing any bytes that have the encoded value of a <newline>
> > character. See
> > https://pubs.opengroup.org/onlinepubs/9799919799/utilities/compress.html
> > section CHANGE HISTORY subsection Issue 8.
>
[...]
>
> I don't like the idea of rejecting file names that the file system
> and/or operating system allow. If POSIX were to mandate the behavior,
> then my personal preference would be to hide it behind POSIXLY_CORRECT.
https://www.austingroupbugs.net/view.php?id=251 proposed to add the
paragraph
  Implementations are encouraged to have fopen() and freopen() report
  an [EILSEQ] error if mode begins with 'w' or 'a', the file did not
  previously exist, and the last component of pathname contains any
  bytes that have the encoded value of a <newline> character.
and similar text for other functions.
But the consequence is that it will not be possible for "compress" to
create such a file. The paragraph
  If this utility is directed to create a new directory entry that
  contains any bytes that have the encoded value of a <newline>
  character, implementations are encouraged to treat this as an error.
  A future version of this standard may require implementations to
  treat this as an error.
could just mean that this is a consequence of the failure from fopen()
to create the file. (Implementations should be regarded as a whole.)
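To make the proposed fopen() wording concrete, here is a minimal user-space sketch of the encouraged behavior. The wrapper name fopen_no_newline is hypothetical; a real implementation would presumably do this inside the C library or kernel, and this sketch has a race between the existence check and the open.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical wrapper, invented for this example: fail with EILSEQ
   if MODE begins with 'w' or 'a', the file does not already exist,
   and the last component of PATHNAME contains a newline byte.
   Otherwise behave like fopen.  The access() check is not atomic
   with the fopen() call, so this is only an illustration.  */
FILE *
fopen_no_newline (const char *pathname, const char *mode)
{
  /* Find the last pathname component.  */
  const char *last = strrchr (pathname, '/');
  last = last ? last + 1 : pathname;

  if ((mode[0] == 'w' || mode[0] == 'a')
      && strchr (last, '\n') != NULL
      && access (pathname, F_OK) != 0 && errno == ENOENT)
    {
      errno = EILSEQ;
      return NULL;
    }
  return fopen (pathname, mode);
}
```

Under this sketch an existing file whose name contains a newline can still be opened, read, and overwritten; only the creation of a new directory entry is refused.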
--
Vincent Lefèvre <vincent <at> vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / Pascaline project (LIP, ENS-Lyon)
This bug report was last modified 80 days ago.