GNU bug report logs - #36367
Potential bug in grep / egrep

Package: grep

Reported by: Henrik Holst <henrik.holst <at> outlook.com>

Date: Mon, 24 Jun 2019 22:08:01 UTC

Severity: normal

Tags: notabug

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 36367 in the body.
You can then email your comments to 36367 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-grep <at> gnu.org:
bug#36367; Package grep. (Mon, 24 Jun 2019 22:08:01 GMT)

Acknowledgement sent to Henrik Holst <henrik.holst <at> outlook.com>:
New bug report received and forwarded. Copy sent to bug-grep <at> gnu.org. (Mon, 24 Jun 2019 22:08:01 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Henrik Holst <henrik.holst <at> outlook.com>
To: "bug-grep <at> gnu.org" <bug-grep <at> gnu.org>
Subject: Potential bug in grep / egrep
Date: Mon, 24 Jun 2019 22:01:22 +0000

I expected these two commands to produce the exact same result:

holst@hholst-lt:~$ egrep -l '^(telegram-desktop)$' /proc/*/cmdline
/proc/20596/cmdline
holst@hholst-lt:~$ egrep -x -l 'telegram-desktop' /proc/*/cmdline
/proc/20596/cmdline
/proc/self/cmdline
/proc/thread-self/cmdline
holst@hholst-lt:~$

Version:

holst@hholst-lt:~$ egrep --version
grep (GNU grep) 3.3
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Mike Haertel and others; see
<https://git.sv.gnu.org/cgit/grep.git/tree/AUTHORS>.
holst@hholst-lt:~$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=19.04
DISTRIB_CODENAME=disco
DISTRIB_DESCRIPTION="Ubuntu 19.04"
holst@hholst-lt:~$

Information forwarded to bug-grep <at> gnu.org:
bug#36367; Package grep. (Wed, 26 Jun 2019 15:53:02 GMT)

Message #8 received at submit <at> debbugs.gnu.org:

From: g1pi <at> libero.it
To: bug-grep <at> gnu.org
Subject: bug#36367: Potential bug in grep / egrep
Date: Wed, 26 Jun 2019 16:39:25 +0200

Hi Henrik.

It's not a bug, and it's described in the manpage:

    When  type  is  binary,  grep  may  treat non-text bytes as line
    terminators even without the -z  option.   This  means  choosing
    binary  versus text can affect whether a pattern matches a file.
    For example, when type is binary the pattern q$  might  match  q
    immediately  followed  by  a  null byte, even though this is not
    matched when type is text.  Conversely, when type is binary  the
    pattern . (period) might not match a null byte.

Despite its appearance, /proc/*/cmdline is a binary file, because the
arguments are separated by NUL bytes instead of blanks.

The cmdline of the first egrep process itself contains

    ...NUL^(telegram-desktop)$NUL...

which does not match the ERE '^(telegram-desktop)$' (or the equivalent
'^telegram-desktop$'), while the cmdline of the second egrep process contains

    ...NULtelegram-desktopNUL...

which DOES match -x 'telegram-desktop' because the surrounding NULs are
treated as line boundaries.  That is why the second command also lists
/proc/self/cmdline and /proc/thread-self/cmdline.
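
The effect can be reproduced without touching /proc. The following is only a
sketch under stated assumptions: the three files are hypothetical stand-ins for
cmdline files, built with printf, and the helper name count_exact_args is
invented for illustration. Instead of relying on grep's binary-file handling,
it converts each NUL to a newline with tr, which makes the "NUL as line
boundary" behaviour explicit.

```shell
#!/bin/sh
# Hypothetical stand-ins for /proc/PID/cmdline: each argument ends in a NUL.
tmpdir=$(mktemp -d)
printf 'telegram-desktop\0' > "$tmpdir/app"
printf 'egrep\0-x\0-l\0telegram-desktop\0' > "$tmpdir/grep_x"
printf 'egrep\0-l\0^(telegram-desktop)$\0' > "$tmpdir/grep_anchored"

# Turn every NUL into a newline so each argument becomes its own line,
# then count the lines that equal the fixed string exactly (-x -F).
count_exact_args() {
    tr '\0' '\n' < "$1" | grep -c -x -F -- "$2"
}

app_hits=$(count_exact_args "$tmpdir/app" 'telegram-desktop')
x_hits=$(count_exact_args "$tmpdir/grep_x" 'telegram-desktop')
anchored_hits=$(count_exact_args "$tmpdir/grep_anchored" 'telegram-desktop')

# Prints: app=1 grep_x=1 anchored=0
echo "app=$app_hits grep_x=$x_hits anchored=$anchored_hits"
rm -rf "$tmpdir"
```

The simulated command line of the `egrep -x` invocation contains
telegram-desktop as an argument of its own, so it counts as a match, whereas
the anchored invocation carries the pattern text '^(telegram-desktop)$', which
is a different string.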

By the way, parsing files under /proc, or the output of the ps command,
requires special care when done with grep and friends.  One popular trick to
keep grep from matching its own process is to alter the ERE slightly, so that
the pattern text no longer matches itself.

E.g.
    ps -elf | grep '\<some[t]hing\>'
matches "run something" without matching the grep process.
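
A small sketch of why the bracket trick works, assuming made-up ps output (the
user name, PIDs and process lines below are hypothetical, and the \< \> word
anchors are left out for brevity): the regex some[t]hing matches the text
"something", but the literal characters "some[t]hing" on grep's own command
line do not contain "something".

```shell
#!/bin/sh
# Hypothetical ps output: one real process plus the grep searching for it.
naive_ps='holst 101 run something
holst 102 grep something'
bracket_ps='holst 101 run something
holst 103 grep some[t]hing'

# A plain pattern also matches the grep process's own command line.
naive_hits=$(printf '%s\n' "$naive_ps" | grep -c 'something')

# With the bracket expression, only the real process line matches.
bracket_hits=$(printf '%s\n' "$bracket_ps" | grep -c 'some[t]hing')

# Prints: naive=2 bracket=1
echo "naive=$naive_hits bracket=$bracket_hits"
```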

However, the easiest way to cope with process tables is pgrep(1).

Best,
	g1




Added tag(s) notabug. Request was from Paul Eggert <eggert <at> cs.ucla.edu> to control <at> debbugs.gnu.org. (Thu, 02 Jan 2020 09:40:02 GMT)

bug closed, send any further explanations to 36367 <at> debbugs.gnu.org and Henrik Holst <henrik.holst <at> outlook.com>. Request was from Paul Eggert <eggert <at> cs.ucla.edu> to control <at> debbugs.gnu.org. (Thu, 02 Jan 2020 09:40:02 GMT)

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 30 Jan 2020 12:24:08 GMT)

This bug report was last modified 4 years and 86 days ago.

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.