GNU bug report logs - #38792: man grep
Reported by: Martin Simons <martin <at> webhuis.nl>
Date: Sun, 29 Dec 2019 15:19:01 UTC
Severity: normal
Done: Paul Eggert <eggert <at> cs.ucla.edu>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 38792 in the body.
You can then email your comments to 38792 AT debbugs.gnu.org in the normal way.
Report forwarded to bug-grep <at> gnu.org: bug#38792; Package grep. (Sun, 29 Dec 2019 15:19:01 GMT)
Acknowledgement sent to Martin Simons <martin <at> webhuis.nl>: New bug report received and forwarded. Copy sent to bug-grep <at> gnu.org. (Sun, 29 Dec 2019 15:19:01 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
Dear Friend,
At the moment I am working part time as a Unix / Linux teacher and also
as an AIX system administrator.
Privately I am using Debian / Ubuntu and I use it while in class, giving
presentations and on the fly examples of statements and so on.
In class I tell the students how grep should be used in a directory
called test. Please find the contents of the directory and some files
below. My example of use is as follows:
martin@laptop:~/test$ grep 'Jantje*' school.txt
Which delivers the desired output:
Jantje
Ik las dat Jantje
Jantje voortaan op tijd op school komt,
Hoogachtend, Jantjes vader, Piet Bel.
So much for the classroom. In practice, however, the command is usually
typed like this (objections are laughed away):
martin@laptop:~/test$ grep Jantje* school.txt
Delivering this undesired output:
Jantjes:Jantje zag eens pruimen hangen
Jantjes:en Jantje wilde pruimen plukken voor zijn moeder
Jantjes:toen zei Jantjes moeder
Jantjes:pas toch op Jantje!
school.txt:Jantje
school.txt:Ik las dat Jantje
school.txt:Jantje voortaan op tijd op school komt,
school.txt:Hoogachtend, Jantjes vader, Piet Bel.
To give the argument more weight I pointed the students to the man page
of grep, but it struck me that it contains not a single reference to the
need to quote the search pattern. This does not help.
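The difference between the two commands above can be reproduced with a small sketch. It assumes a scratch directory created with mktemp and cut-down, made-up file contents (the report does not show the contents of the file Jantje); the variable names are illustrative only.

```shell
#!/bin/sh
# Recreate a cut-down version of the reporter's test directory in a scratch location.
LC_ALL=C; export LC_ALL          # fixed glob sort order for reproducible output
dir=$(mktemp -d) && cd "$dir" || exit 1
printf 'Jantje zag eens pruimen hangen\n' > Jantje      # contents assumed
printf 'toen zei Jantjes moeder\n' > Jantjes
printf 'Geachte Heer de Vries,\nIk las dat Jantje\n' > school.txt

# Quoted: the shell passes the pattern through untouched; grep reads only school.txt.
quoted=$(grep 'Jantje*' school.txt)

# Unquoted: the shell first expands Jantje* to the matching file names, so grep
# really runs as: grep Jantje Jantjes school.txt
# (pattern "Jantje", files "Jantjes" and "school.txt").
unquoted=$(grep Jantje* school.txt)

printf 'quoted:\n%s\n\nunquoted:\n%s\n' "$quoted" "$unquoted"
cd / && rm -rf "$dir"
```

In the unquoted case the file named Jantje is never searched: its name has become the pattern, which is why the output gains file-name prefixes for the two remaining files.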
I checked the AIX man page for grep, and it at least gives some (though
not all) examples with quotes around the search pattern.
The example given above makes me assume that grave errors occur in
production environments, simply because users are not led to write
search patterns correctly.
It may not be the task of the grep project to provide a man page, but
even then I feel there is an opportunity for improvement here by
providing a basic man page with a couple of good examples. I would be
more than glad to contribute.
Files and contents of test.
Directory test:
-rw-r--r-- 1 martin martin 208 May 20 2019 Jantje
-rw-r--r-- 1 martin martin 124 Apr 30 2019 Jantjes
-rw-r--r-- 1 martin martin 242 Jun 27 2019 school.txt
The contents of Jantjes and school.txt are:
file Jantjes:
Jantje zag eens pruimen hangen
en Jantje wilde pruimen plukken voor zijn moeder
toen zei Jantjes moeder
pas toch op Jantje!
file school.txt:
Geachte Heer de Vries,
Jantje
Ik las dat Jantje
weer eens te laat op school kwam.
Ik kan u verzekeren dat wij er alles aan
zullen doen om er voor te zorgen dat
Jantje voortaan op tijd op school komt,
Hoogachtend, Jantjes vader, Piet Bel.
Met vriendelijke groet,
Martin.
LinkedIn: https://www.linkedin.com/in/martinsimons1/
GitHub: https://github.com/Webhuis/CFEngine-Roadshow/tree/master/
Twitter: https://twitter.com/webhuis @Webhuis #TheCFEngineRoadshow
Information forwarded to bug-grep <at> gnu.org: bug#38792; Package grep. (Sun, 29 Dec 2019 18:47:02 GMT)
Message #8 received at 38792 <at> debbugs.gnu.org (full text, mbox):
On 12/29/19 6:24 AM, Martin Simons wrote:
> It may not be the task of the grep project to provide a man page, but even then I
> feel there is an opportunity for improvement here
Right on both counts. I installed the attached patches to improve things a bit
in the next version of grep. Thanks for reporting the problem.
The GNU guidelines deprecate man pages, so extended examples should go into
doc/grep.texi where students can view them at
<https://www.gnu.org/software/grep/manual/>. So if you'd like to improve the
examples further, please suggest them as patches to doc/grep.texi in "git
format-patch" style. If they're extensive we'll also need to get the copyright
formalities done which I can fill you in about if you're interested in pursuing
this.
[0001-doc-document-quoting-better.patch (text/x-patch, attachment)]
[0002-doc-fix-typo-in-previous-patch.patch (text/x-patch, attachment)]
Information forwarded to bug-grep <at> gnu.org: bug#38792; Package grep. (Sun, 29 Dec 2019 19:01:01 GMT)
Message #11 received at submit <at> debbugs.gnu.org (full text, mbox):
2019-12-29 15:24:47 +0100, Martin Simons:
[...]
> martin@laptop:~/test$ grep 'Jantje*' school.txt
> Which delivers the desired output:
> Jantje
It also matches on "Jantj". The * regexp operator matches 0 or
more of the preceding atom, so e* matches 0 or more "e"s. It's
not to be confused with the "*" shell wildcard operator.
In effect, that's equivalent to
grep Jantj school.txt
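A quick way to check that equivalence, as a sketch with made-up sample lines (the variable names are illustrative):

```shell
# Three sample lines; only the first two contain "Jantj".
sample=$(printf 'Jantje\nJantj\nJan\n')

# e* means "zero or more e", so the trailing "e" is optional.
star=$(printf '%s\n' "$sample" | grep 'Jantje*')
plain=$(printf '%s\n' "$sample" | grep 'Jantj')

printf '%s\n' "$star"
[ "$star" = "$plain" ] && echo "same lines selected"
```

Both patterns select the same lines, including the one that has no "e" after "Jantj".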
[...]
> martin@laptop:~/test$ grep Jantje* school.txt
> Delivering this undesired output:
> Jantjes:Jantje zag eens pruimen hangen
> Jantjes:en Jantje wilde pruimen plukken voor zijn moeder
> Jantjes:toen zei Jantjes moeder
> Jantjes:pas toch op Jantje!
> school.txt:Jantje
> school.txt:Ik las dat Jantje
> school.txt:Jantje voortaan op tijd op school komt,
> school.txt:Hoogachtend, Jantjes vader, Piet Bel.
>
> In trying to give more weight to the argument I pointed the students to the
> man page of grep, but it struck me that there is not a single reference to
> the need of using quotes in the search pattern. This does not help.
All grep receives is a list of arguments. That * needs to be
quoted for the shell not to treat it as a glob operator, not for
grep.
You'll find an explanation of shell globbing (aka filename
generation, aka pathname expansion, aka filename expansion) and
the effect of quoting on it in your shell manual.
For the GNU shell, that's with:
info bash 'filename expansion'
What needs to be quoted for a given shell varies with the shell
implementation; grep has nothing to do with that.
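To see the argument list a command would receive, without involving grep at all, one can load the same words into the shell's positional parameters. This is only a sketch; the empty files and the scratch directory are assumptions made for illustration.

```shell
LC_ALL=C; export LC_ALL          # fixed glob sort order
dir=$(mktemp -d) && cd "$dir" || exit 1
: > Jantje; : > Jantjes; : > school.txt

# Unquoted: the shell replaces Jantje* with the matching names before the command runs.
set -- Jantje* school.txt
unquoted_args="$# args: $*"

# Quoted: the asterisk reaches the command as a literal character.
set -- 'Jantje*' school.txt
quoted_args="$# args: $*"

echo "$unquoted_args"
echo "$quoted_args"
cd / && rm -rf "$dir"
```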
--
Stephane
Information forwarded to bug-grep <at> gnu.org: bug#38792; Package grep. (Sun, 29 Dec 2019 19:16:02 GMT)
Message #14 received at submit <at> debbugs.gnu.org (full text, mbox):
2019-12-29 10:46:10 -0800, Paul Eggert:
> On 12/29/19 6:24 AM, Martin Simons wrote:
> > It may not be the task of the grep project to provide a man page, but even then I
> > feel there is an opportunity for improvement here
>
> Right on both counts. I installed the attached patches to improve things a bit
> in the next version of grep.
[...]
> From 8b7da49786e613c6ae9a2b299b1ce2187b32ed26 Mon Sep 17 00:00:00 2001
> Subject: [PATCH 1/2] doc: document quoting better
[...]
> +.SH "EXIT STATUS"
> +Normally the exit status is 0 if a line is selected, 1 if no lines
> +were selected, and 2 if an error occurred.
[...]
Note that that wording makes it unclear what the exit status
should be if -o is in use.
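For reference, the exit statuses under discussion can be observed with a short sketch like the following. The path in "$missing" is a hypothetical name assumed not to exist, and the -o result shown is simply what grep reports for this one input.

```shell
missing="${TMPDIR:-/tmp}/no-such-file-$$"    # hypothetical path, assumed absent

st_match=0;   printf 'abc\n' | grep -q 'b' || st_match=$?     # a line is selected
st_nomatch=0; printf 'abc\n' | grep -q 'z' || st_nomatch=$?   # no line is selected
st_error=0;   grep -s 'b' "$missing"       || st_error=$?     # unreadable file; -s hides the message
st_only=0;    printf 'abc\n' | grep -o 'b' >/dev/null || st_only=$?   # -o prints only the matched part

echo "$st_match $st_nomatch $st_error $st_only"
```

With GNU grep this prints "0 1 2 0": the -o case counts as a selected line because part of the line matched.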
[...]
> +$ \fBgrep\fP \-n 'f.*\e.c$' *g*.h /dev/null
[...]
It should be
grep -n -- 'f.*\.c$' *g*.h /dev/null
Or:
grep -ne 'f.*\.c$' -- *g*.h /dev/null
(unless $POSIXLY_CORRECT is set).
grep pattern *.h
is fine in POSIX-compliant greps, but not in GNU grep, as GNU
getopt*() accepts options after non-option arguments. IMO, it's
worth pointing out as it's a common gotcha with GNU utilities.
grep -e pattern *.h
is not fine in any grep.
(Why not use -H instead of /dev/null, btw?)
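The hazard can be illustrated with a file whose name looks like an option. The sketch below assumes a scratch directory and a deliberately awkward file name, -v; it runs only the two guarded forms, since the unguarded result depends on the grep implementation and on POSIXLY_CORRECT.

```shell
LC_ALL=C; export LC_ALL          # fixed glob sort order
dir=$(mktemp -d) && cd "$dir" || exit 1
printf 'hello\n' > ./-v          # a file whose name looks like the -v (invert match) option
printf 'hello\nbye\n' > a.txt

# Without "--", the glob would expand to: grep hello -v a.txt
# and GNU grep would treat -v as an option. With "--", every later argument is an operand.
guarded=$(grep -- 'hello' *)

# Prefixing the glob with ./ is another way to keep names from starting with "-".
prefixed=$(grep 'hello' ./*)

printf '%s\n' "$guarded" "$prefixed"
cd / && rm -rf "$dir"
```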
[...]
> +@example
> +$ @kbd{grep -n 'f.*\.c$' *g*.h /dev/null}
> +argmatch.h:1:/* definitions and prototypes for argmatch.c
> +@end example
[...]
same in texinfo.
--
Stephane
Information forwarded to bug-grep <at> gnu.org: bug#38792; Package grep. (Mon, 30 Dec 2019 03:00:02 GMT)
Message #17 received at 38792 <at> debbugs.gnu.org (full text, mbox):
On 12/29/19 11:10 AM, Stephane Chazelas wrote:
> Note that that wording makes it unclear what the exit status
> should be if -o is in use.
It seems reasonably clear that a line would be selected if any part of it is
selected. Anyway, that text is exactly the same as before, so rewordsmithing it
could be a different thread.
> It should be
>
> grep -n -- 'f.*\.c$' *g*.h /dev/null
Thanks, done in the attached patch.
> (why not using -H instead of /dev/null btw?).
I wanted the example to be portable.
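The portable /dev/null idiom mentioned here can be sketched as follows, using a made-up one-line file in a scratch directory (the names are illustrative):

```shell
dir=$(mktemp -d) || exit 1
printf 'eli:x:2098:1000:Eli Smith:/home/eli:/bin/bash\n' > "$dir/passwd"

# One file operand: grep prints the matching line without a file name.
bare=$(grep 'eli' "$dir/passwd")

# Two file operands: grep prefixes each match with its file name.
# /dev/null is empty, so it never contributes a match of its own.
named=$(grep 'eli' "$dir/passwd" /dev/null)

printf '%s\n%s\n' "$bare" "$named"
rm -rf "$dir"
```

The second form needs no option beyond what POSIX specifies, which is the portability argument made above.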
[0001-doc-Add-to-more-complex-example.patch (text/x-patch, attachment)]
Information forwarded to bug-grep <at> gnu.org: bug#38792; Package grep. (Mon, 30 Dec 2019 08:56:02 GMT)
Message #20 received at submit <at> debbugs.gnu.org (full text, mbox):
2019-12-29 18:59:22 -0800, Paul Eggert:
[...]
> > It should be
> >
> > grep -n -- 'f.*\.c$' *g*.h /dev/null
>
> Thanks, done in the attached patch.
[...]
There were a few other instances. The patch below attempts to
fix those (includes your patch).
I've also replaced find|xargs with the standard and more portable
(and simpler/faster...) find -exec + (GNU find used not to
support that, but that was eventually fixed over 10 years ago).
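As a rough illustration of the find -exec + form, here is a sketch over a made-up directory tree; it uses the portable /dev/null idiom rather than -H to force file names.

```shell
dir=$(mktemp -d) || exit 1
mkdir "$dir/sub"
printf 'hello world\n' > "$dir/sub/a.c"
printf 'hello again\n' > "$dir/notes.txt"

# "{} +" makes find append as many file names as fit to a single grep invocation,
# much like xargs, with no need for -print0 or xargs -0.
# /dev/null forces the file-name prefix even if only one file passes the -name test.
found=$(find "$dir" -name '*.c' -exec grep 'hello' /dev/null {} +)

printf '%s\n' "$found"
rm -rf "$dir"
```

Because every name find produces begins with the starting path, none of them can be mistaken for an option here.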
I've replaced the POSIX+XSI ps -e with POSIX ps -A (for BSD
compatibility).
I've moved the grep -H and grep -- FAQ entries to the front so
they are explained before being used.
diff --git a/doc/grep.in.1 b/doc/grep.in.1
index a382966..a91b2a6 100644
--- a/doc/grep.in.1
+++ b/doc/grep.in.1
@@ -1333,13 +1333,18 @@ The following example outputs the location and contents of any line
containing \*(lqf\*(rq and ending in \*(lq.c\*(rq,
within all files in the current directory whose names
contain \*(lqg\*(rq and end in \*(lq.h\*(rq.
-The command also searches the empty file /dev/null,
-so that file names are displayed
+The
+.B \-n
+option outputs line numbers, the
+.B \-\-
+argument treats expansions of \*(lq*g*.h\*(rq starting with \*(lq\-\*(rq
+as file names not options,
+and the empty file /dev/null causes file names to be output
even if only one file name happens to be of the form \*(lq*g*.h\*(rq.
.PP
.in +2n
.EX
-$ \fBgrep\fP \-n 'f.*\e.c$' *g*.h /dev/null
+$ \fBgrep\fP \-n \-\- 'f.*\e.c$' *g*.h /dev/null
argmatch.h:1:/* definitions and prototypes for argmatch.c
.EE
.in
diff --git a/doc/grep.texi b/doc/grep.texi
index 873b53c..de7d028 100644
--- a/doc/grep.texi
+++ b/doc/grep.texi
@@ -1585,12 +1585,13 @@ showing the location and contents of any line
containing @samp{f} and ending in @samp{.c},
within all files in the current directory whose names
contain @samp{g} and end in @samp{.h}.
-The command also searches the empty file @file{/dev/null},
-so that file names are displayed
+The @option{-n} option outputs line numbers, the @option{--} argument
+treats any later arguments starting with @samp{-} as file names not
+options, and the empty file @file{/dev/null} causes file names to be output
even if only one file name happens to be of the form @samp{*g*.h}.
@example
-$ @kbd{grep -n 'f.*\.c$' *g*.h /dev/null}
+$ @kbd{grep -n -- 'f.*\.c$' *g*.h /dev/null}
argmatch.h:1:/* definitions and prototypes for argmatch.c
@end example
@@ -1608,11 +1609,78 @@ Here are some common questions and answers about @command{grep} usage.
@enumerate
+@item
+How do I force @command{grep} to print the name of the file?
+
+Append @file{/dev/null}:
+
+@example
+grep 'eli' /etc/passwd /dev/null
+@end example
+
+gets you:
+
+@example
+/etc/passwd:eli:x:2098:1000:Eli Smith:/home/eli:/bin/bash
+@end example
+
+Alternatively, use @option{-H}, which is a GNU extension:
+
+@example
+grep -H 'eli' /etc/passwd
+@end example
+
+Using that trick is generally wanted when using @command{find}, shell
+globbing or other forms of expansion, or more generally when we don't
+know in advance which files are being searched and want that
+information to be returned. Without it
+
+@example
+grep -- pattern *.txt
+@end example
+
+could output the matched lines without indication of which file they
+were found in if there was only one non-hidden @samp{.txt} file in the
+current directory.
+
+@item
+What if a pattern or filename has a leading @samp{-}?
+
+@example
+grep -H -- '--cut here--' *
+@end example
+
+@noindent
+searches for all lines matching @samp{--cut here--}.
+
+@samp{--} marks the end of options. None of the arguments after
+@samp{--} are treated as options even if they start with
+@samp{-}.
+
+Alternatively, one can use @option{-e} to specify the pattern:
+
+@example
+grep -H -e '--cut here--' -- *
+@end example
+
+But @samp{--} is still needed to guard against filenames that
+start with @samp{-}.
+
+Another approach would be to use:
+
+@example
+grep -H -e '--cut here--' ./*
+@end example
+
+Which also guards against a file called @samp{-} (which
+@command{grep} would otherwise interpret as meaning standard
+input).
+
@item
How can I list just the names of matching files?
@example
-grep -l 'main' *.c
+grep -l -- 'main' *.c
@end example
@noindent
@@ -1630,45 +1698,33 @@ grep -r 'hello' /home/gigi
searches for @samp{hello} in all files
under the @file{/home/gigi} directory.
For more control over which files are searched,
-use @command{find}, @command{grep}, and @command{xargs}.
+use @command{find} in combination with @command{grep}.
For example, the following command searches only C files:
@example
-find /home/gigi -name '*.c' -print0 | xargs -0r grep -H 'hello'
+find /home/gigi -name '*.c' -exec grep -H 'hello' '@{@}' +
@end example
This differs from the command:
@example
-grep -H 'hello' *.c
+grep -H -- 'hello' *.c
@end example
-which merely looks for @samp{hello} in all files in the current
-directory whose names end in @samp{.c}.
+which merely looks for @samp{hello} in all (non-hidden) files in
+the current directory whose names end in @samp{.c}.
The @samp{find ...} command line above is more similar to the command:
@example
-grep -rH --include='*.c' 'hello' /home/gigi
+grep -r --include='*.c' 'hello' /home/gigi
@end example
-@item
-What if a pattern has a leading @samp{-}?
-
-@example
-grep -e '--cut here--' *
-@end example
-
-@noindent
-searches for all lines matching @samp{--cut here--}.
-Without @option{-e},
-@command{grep} would attempt to parse @samp{--cut here--} as a list of
-options.
@item
Suppose I want to search for a whole word, not a part of a word?
@example
-grep -w 'hello' *
+grep -H -w -- 'hello' *
@end example
@noindent
@@ -1679,7 +1735,7 @@ For more control, use @samp{\<} and
For example:
@example
-grep 'hello\>' *
+grep -H -- 'hello\>' *
@end example
@noindent
@@ -1690,38 +1746,17 @@ searches only for words ending in @samp{hello}, so it matches the word
How do I output context around the matching lines?
@example
-grep -C 2 'hello' *
+grep -H -C 2 -- 'hello' *
@end example
@noindent
prints two lines of context around each matching line.
-@item
-How do I force @command{grep} to print the name of the file?
-
-Append @file{/dev/null}:
-
-@example
-grep 'eli' /etc/passwd /dev/null
-@end example
-
-gets you:
-
-@example
-/etc/passwd:eli:x:2098:1000:Eli Smith:/home/eli:/bin/bash
-@end example
-
-Alternatively, use @option{-H}, which is a GNU extension:
-
-@example
-grep -H 'eli' /etc/passwd
-@end example
-
@item
Why do people use strange regular expressions on @command{ps} output?
@example
-ps -ef | grep '[c]ron'
+ps -Af | grep '[c]ron'
@end example
If the pattern had been written without the square brackets, it would
@@ -1788,6 +1823,9 @@ Use the special file name @samp{-}:
cat /etc/passwd | grep 'alain' - /etc/motd
@end example
+To match in a file called @samp{-} in the current directory, use
+@samp{./-}.
+
@item
@cindex palindromes
How to express palindromes in a regular expression?
diff --git a/gnulib b/gnulib
--- a/gnulib
+++ b/gnulib
@@ -1 +1 @@
-Subproject commit 575b0ecbad2f34d5777f9562eebc2d0c815bfc5c
+Subproject commit 575b0ecbad2f34d5777f9562eebc2d0c815bfc5c-dirty
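The last documentation change in the patch above, about a file literally named "-", can be tried out with a sketch like this one (scratch directory and file contents are assumptions):

```shell
dir=$(mktemp -d) && cd "$dir" || exit 1
printf 'line from the file\n' > ./-

# A bare "-" operand means standard input, so the file named "-" is not read.
from_stdin=$(printf 'line from stdin\n' | grep 'line' -)

# "./-" names the file explicitly.
from_file=$(grep 'line' ./- </dev/null)

printf '%s\n%s\n' "$from_stdin" "$from_file"
cd / && rm -rf "$dir"
```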
Information forwarded to bug-grep <at> gnu.org: bug#38792; Package grep. (Mon, 30 Dec 2019 10:00:02 GMT)
Message #23 received at 38792 <at> debbugs.gnu.org (full text, mbox):
Thanks, I installed the attached, which is a bit more conservative than what you
suggested but which I hope catches all the issues.
[0001-doc-robustify-some-examples.patch (text/x-patch, attachment)]
Information forwarded to bug-grep <at> gnu.org: bug#38792; Package grep. (Mon, 30 Dec 2019 17:55:01 GMT)
Message #26 received at 38792 <at> debbugs.gnu.org (full text, mbox):
2019-12-30 01:59:09 -0800, Paul Eggert:
> Thanks, I installed the attached, which is a bit more conservative than what you
> suggested but which I hope catches all the issues.
[...]
> @example
> -grep -l 'main' *.c
> +grep -l 'main' test-*.c
> @end example
[...]
Thanks,
though I find it a bit of a shame to /hide/ the problem like
that here.
It's so common for people to forget that "--" (I just came
across https://en.wikipedia.org/wiki/Glob_(programming), which
makes that mistake in the very first paragraph, for instance) that
it would be beneficial IMO if the manual showed the right way to do it,
especially for the GNU implementation of grep, which makes the
problem worse by accepting options after non-option arguments.
Please consider making it
grep -l -- 'main' *.c
to raise awareness and try and teach people "the right way".
--
Stephane
Information forwarded to bug-grep <at> gnu.org: bug#38792; Package grep. (Mon, 30 Dec 2019 18:46:01 GMT)
Message #29 received at 38792 <at> debbugs.gnu.org (full text, mbox):
On 12/30/19 9:54 AM, Stephane Chazelas wrote:
> Please consider making it
>
>
> grep -l -- 'main' *.c
>
> to raise awareness and try and teach people "the right way".
Interactively, I invariably use 'grep foo *.c' without the '--' and it works
just fine because the files under my control have names that do not start with
'-'. This is standard practice and simplifies the common-usage examples. So
although '--' should (and does) appear in examples to document a defensive
measure for scripts that must work even in nonstandard environments, '--'
needn't be used in examples everywhere that it could conceivably apply.
Reply sent to Paul Eggert <eggert <at> cs.ucla.edu>: You have taken responsibility. (Mon, 21 Sep 2020 19:11:02 GMT)
Notification sent to Martin Simons <martin <at> webhuis.nl>: bug acknowledged by developer. (Mon, 21 Sep 2020 19:11:02 GMT)
Message #34 received at 38792-done <at> debbugs.gnu.org (full text, mbox):
Discussion on this wordsmithing issue died down in December so I'm taking the
liberty of closing the bug report. We can reopen it or start a new one as necessary.
Bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 20 Oct 2020 11:24:14 GMT)
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.