GNU bug report logs - #31228
Update GREP manpage

Package: grep

Reported by: "Laura Morales" <lauretas <at> mail.com>

Date: Fri, 20 Apr 2018 16:26:02 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 31228 in the body.
You can then email your comments to 31228 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-grep <at> gnu.org:
bug#31228; Package grep. (Fri, 20 Apr 2018 16:26:02 GMT)

Acknowledgement sent to "Laura Morales" <lauretas <at> mail.com>:
New bug report received and forwarded. Copy sent to bug-grep <at> gnu.org. (Fri, 20 Apr 2018 16:26:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: "Laura Morales" <lauretas <at> mail.com>
To: bug-grep <at> gnu.org
Subject: Update GREP manpage
Date: Fri, 20 Apr 2018 18:03:09 +0200
Hi, I was trying to convert grep.1 to HTML and discovered some issues. In particular, it seems to use an unsupported macro for email addresses and URLs. You can see an example at https://manpages.debian.org/stretch/grep/grep.1.en.html#Reporting_Bugs where the output is

Email bug reports to An and a are available.

but instead it should be

Email bug reports to the bug-reporting address <bug-grep <at> gnu.org>. An email archive <http://lists.gnu.org/mailman/listinfo/bug-grep> and a bug tracker <http://debbugs.gnu.org/cgi/pkgreport.cgi?package=grep> are available.

I asked Ingo, the maintainer of mandoc, and his answer is that the grep manpage is malformed and should be updated.
Can you guys please fix the grep manpage? Thank you in advance.

Following is Ingo's reply:
--------------------------

> Look at this page: https://paste.debian.net/plain/1021106
> If I do (from Linux):
> $ mandoc -Thtml grep.1 > grep.1.html
> mandoc is not producing the output for email addresses and URLs.
> For instance take a look at the BUGS section
>
> .SH BUGS
> .SS "Reporting Bugs"
> Email bug reports to
> .MTO bug-grep <at> gnu.org "the bug-reporting address" .
> An
> .URL http://lists.gnu.org/mailman/listinfo/bug-grep "email archive"
> and a
> .URL http://debbugs.gnu.org/cgi/pkgreport.cgi?package=grep "bug tracker"
> are available.
>
> the HTML output is
>
> BUGS
> Reporting Bugs
> Email bug reports to An and a are available.
>
> Looks like it's completely skipping some macros.
> Is this a bug

Not a bug, no.

> or is there some flag that I can turn on to enable all macros?

No. We do not believe in options, stuff is supposed to just
work without users twiddling knobs.


This is supposed to be a groff_man(7) document but .MTO and .URL
are not man macros.

The man language does *not* provide markup for mail addresses
or URIs.

Please tell the maintainers of GNU grep(1) to fix their manual
page. They have the following four options. The smaller the
number, the better, cleaner, and more portable:

1. BEST:
Switch to the better mdoc(7) language which does provide
such markup.

2. If they insist on using the obsolete man(7) language and
want to remain portable, use standard macros like .I and
.B and accept that no hyperlinks will appear. It is
a limitation of the chosen language. You can't have the
cake and eat it, too.

3. If they don't care about portability and live in a GNU-only
world, use the -man-ext macros .MT and .UR, documented in
groff_man(7). mandoc(1) implements these for GNU compatibility,
but other formatters may not.

4. WORST:
If they absolutely insist on using the alien www.tmac macros
(even though it is irresponsible to mix random alien macro sets
into manual page markup), then at least fix the feature test
as follows:

.mso www.tmac
.if !dMTO \{\
. de MTO
\\$2 \(la\\$1\(ra\\$3
..
.\}
.if !dURL \{\
. de URL
\\$2 \(la\\$1\(ra\\$3
..
.\}

or maybe even better simply

. de MTO
\\$2 \(la\\$1\(ra\\$3
..
. de URL
\\$2 \(la\\$1\(ra\\$3
..
.mso www.tmac

because defining the macros first and then overriding them with GNU
implementations when available avoids the potentially non-portable
and fragile conditional.

But note that i do not recommend option 4 at all. It works with
groff and mandoc, but it is exceedingly ugly and potentially
non-portable. Manual pages have no business whatsoever defining their
own macros!

Also note that using \(.g to test for availability of features is
utterly stupid. *ANY* formatter that wants to stand a chance to
format anything properly positively *must* define it, no matter
whether it is groff or not.
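
For illustration only, option 3 above applied to the BUGS section
quoted earlier might look like the following sketch. It assumes the
groff_man(7) extension macros .MT/.ME and .UR/.UE, and it writes the
mail address in its plain form (this archive displays it obfuscated).
It is not the text of any actual grep patch.

```roff
.\" Sketch only: the quoted BUGS section rewritten with the
.\" groff_man(7) extension macros instead of .MTO and .URL.
.SH BUGS
.SS "Reporting Bugs"
Email bug reports to
.MT bug-grep@gnu.org
the bug-reporting address
.ME .
An
.UR http://lists.gnu.org/mailman/listinfo/bug-grep
email archive
.UE
and a
.UR http://debbugs.gnu.org/cgi/pkgreport.cgi?package=grep
bug tracker
.UE
are available.
```

The text between the opening and closing macro is the link text, and
the argument given to .ME or .UE is punctuation placed directly after
the link.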

Feel free to include this message, or extracts from it as you see
fit, when reporting to the GNU grep(1) maintainers.

Yours,
Ingo




Information forwarded to bug-grep <at> gnu.org:
bug#31228; Package grep. (Fri, 20 Apr 2018 19:49:02 GMT)

Message #8 received at 31228 <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Laura Morales <lauretas <at> mail.com>
Cc: 31228 <at> debbugs.gnu.org
Subject: Re: bug#31228: Update GREP manpage
Date: Fri, 20 Apr 2018 12:48:25 -0700
On 04/20/2018 09:03 AM, Laura Morales quoted Ingo Schwarze as writing:
> 1. BEST: Switch to the better mdoc(7) language which does provide
> such markup.

This does not work with Solaris nroff and troff, which we'd still like 
to port to, as Solaris is still a live platform.

> 2. If they insist on using the obsolete man(7) language and
> want to remain portable, use standard macros like .I and
> .B and accept that no hyperlinks will appear. It is
> a limitation of the chosen language. You can't have the
> cake and eat it, too.

Although this would port, it would be less readable on newer systems. 
For example, the HTML man page wouldn't contain links to the grep 
manual. We can do better.

> 3. If they don't care about portability and live in a GNU-only
> world, use the -man-ext macros .MT and .UR, documented in
> groff_man(7). mandoc(1) implements these for GNU compatibility,
> but other formatters may not.

(3) is not as unportable as Ingo suggested, since the MT and UR macros 
can work outside the GNU environment when included in the man page. 
Please try the attached patch, which I've installed into the grep 
master. If it runs afoul of the troff subset that Ingo is using, please 
let us know and (ideally) suggest a fix.
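
The patch itself is attached rather than shown inline, so the
following is only a hypothetical sketch of the general idea and not
its contents: a man page can carry small fallback definitions that
take effect when the formatter does not identify itself as
groff-compatible. The string name uR and the plain <...> rendering
are assumptions made for this example.

```roff
.\" Hypothetical fallback, NOT the committed patch.
.\" Traditional troff leaves the \n(.g register undefined (0), so
.\" the block below is processed only there. Formatters that set
.\" \n(.g skip it and use their own MT/ME/UR/UE implementations.
.if !\n(.g \{\
.de UR
.ds uR \\$1
..
.de UE
<\\*(uR>\\$1
..
.de MT
.ds uR \\$1
..
.de ME
<\\*(uR>\\$1
..
.\}
```

With these definitions the link text between .UR and .UE is printed
as ordinary text, followed by the address in angle brackets and any
trailing punctuation passed to .UE.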

> using \(.g to test for availability of features is
> utterly stupid. *ANY* formatter that wants to stand a chance to
> format anything properly positively *must* define it, no matter
> whether it is groff or not.

No, Solaris nroff can format the grep man page properly and it does not 
define \n(.g.
[0001-doc-port-better-to-mandoc.patch (text/x-patch, attachment)]

Information forwarded to bug-grep <at> gnu.org:
bug#31228; Package grep. (Fri, 20 Apr 2018 23:07:01 GMT)

Message #11 received at 31228 <at> debbugs.gnu.org:

From: Ingo Schwarze <schwarze <at> usta.de>
To: 31228 <at> debbugs.gnu.org
Cc: Laura Morales <lauretas <at> mail.com>
Subject: Re: bug#31228: Update GREP manpage
Date: Sat, 21 Apr 2018 00:59:03 +0200
Hi,

thanks for looking at this issue.

Paul Eggert wrote:
> Ingo Schwarze wrote:

>> 1. BEST: Switch to the better mdoc(7) language which does provide
>> such markup.

> This does not work with Solaris nroff and troff, which we'd still
> like to port to, as Solaris is still a live platform.

That would require
 - creating man(7) versions from the master mdoc(7) versions
   on your build host with "mandoc -Tman" when building release
   tarballs and including these autogenerated man(7) versions
   in the tarballs in addition to the mdoc(7) versions
 - deciding at autoconf time on the target system which version
   to install

Here is an example of how sudo(8) does it:

  https://www.sudo.ws/repos/sudo/file/tip/configure.ac

Of course, i admit that it requires a bit of work, but it would
give your users better manual pages, in particular regarding the
HTML output, which you do seem to care about considering the
view you express on option 2:

>> 2. If they insist on using the obsolete man(7) language and
>> want to remain portable, use standard macros like .I and
>> .B and accept that no hyperlinks will appear. It is
>> a limitation of the chosen language. You can't have the
>> cake and eat it, too.

> Although this would port, it would be less readable on newer systems.
> For example, the HTML man page wouldn't contain links to the grep
> manual. We can do better.

>> 3. If they don't care about portability and live in a GNU-only
>> world, use the -man-ext macros .MT and .UR, documented in
>> groff_man(7). mandoc(1) implements these for GNU compatibility,
>> but other formatters may not.

> (3) is not as unportable as Ingo suggested, since the MT and UR macros
> can work outside the GNU environment when included in the man page.
> Please try the attached patch, which I've installed into the grep
> master. If it runs afoul of the troff subset that Ingo is using, please
> let us know and (ideally) suggest a fix.

I'm not a big fan of copying macro implementations into individual
manual pages.  It bloats the page, and there is always a risk that
the fallback implementation, or the conditionals needed to enable
or disable it,  may run afoul of some formatter out there, in
particular when you use fancy stuff like .do, .ftr, .ev, .di,
.chop...

That said, groff and mandoc output of what you committed to git are
byte-by-byte identical except for the following one-blank difference
(mandoc output looks minimally better than groff output at that
point, groff prints one excess blank character because it wrongly
detects the end of a sentence where there is none):

 $ gmdiff grep.in.1
 ========== grep.in.1 ========== 
roff errors:
mandoc errors:
mandoc: grep.in.1:41:9: UNSUPP: unsupported roff request: do
mandoc: grep.in.1:42:1: UNSUPP: unsupported roff request: do
mandoc: grep.in.1:43:1: UNSUPP: unsupported roff request: do
--- /tmp/roff.out	Sat Apr 21 00:08:05 2018
+++ /tmp/mandoc.out	Sat Apr 21 00:08:05 2018
@@ -263,8 +263,8 @@
               the whole name, or any suffix starting after a / and before a
               +non-/.  When searching recursively, skip any subfile whose base
               name matches GLOB; the base name is the part after the last /.
-              A pattern can use *, ?, and [...]  as wildcards, and \ to quote
-              a wildcard or backslash character literally.
+              A pattern can use *, ?, and [...] as wildcards, and \ to quote a
+              wildcard or backslash character literally.
 
        --exclude-from=FILE
               Skip files whose base name matches any of the file-name globs

Your commit also happens to work with Heirloom roff.

So it *does* seem to be an improvement.

>> using \(.g to test for availability of features is
>> utterly stupid. *ANY* formatter that wants to stand a chance to
>> format anything properly positively *must* define it, no matter
>> whether it is groff or not.

> No, Solaris nroff can format the grep man page properly and it does not
> define \n(.g.

I don't doubt that an archaic formatter like Solaris 11 nroff can
format a page specifically written to work with archaic software,
even if \(.g is used.  :-)

What i meant is from the perspective of the formatter:  If a formatter
intends to work well with most manual pages found in the wild,
nowadays, it *must* define \(.g, even if it is not groff.  For
example, mandoc handles over 99% of the software packages in our
ports tree that contain manual pages.  Without defining \(.g, it
couldn't even come close.  I have no idea how many current manual
pages Solaris 11 nroff may be able to handle - i guess it may fail
on quite a few.

Anyway, thank you for fixing this!
  Ingo




Information forwarded to bug-grep <at> gnu.org:
bug#31228; Package grep. (Sat, 21 Apr 2018 20:56:01 GMT)

Message #14 received at 31228 <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Ingo Schwarze <schwarze <at> usta.de>, 31228 <at> debbugs.gnu.org
Subject: Re: bug#31228: Update GREP manpage
Date: Sat, 21 Apr 2018 13:55:17 -0700
Ingo Schwarze wrote:

> i admit that it requires a bit of work, but it would
> give your users better manual pages

We could add it to our list of things to do. In the meantime...

> groff and mandoc output of what you committed to git are
> byte-by-byte identical except for the following one-blank difference
> (mandoc output looks minimally better than groff output at that
> point, groff prints one excess blank character because it wrongly
> detects the end of a sentence where there is none):

That's a minor formatting glitch in the grep man page. Thanks for reporting it. 
I fixed it by installing the attached patch, which fixes some similar glitches 
too. I installed a couple of other man page patches while in the neighborhood; 
you can see the current version here:

https://git.savannah.gnu.org/cgit/grep.git/plain/doc/grep.in.1
[0001-doc-man-page-format-fixes.patch (text/x-patch, attachment)]

Information forwarded to bug-grep <at> gnu.org:
bug#31228; Package grep. (Sun, 22 Apr 2018 14:29:02 GMT)

Message #17 received at 31228 <at> debbugs.gnu.org:

From: Ingo Schwarze <schwarze <at> usta.de>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 31228 <at> debbugs.gnu.org
Subject: Re: bug#31228: Update GREP manpage
Date: Sun, 22 Apr 2018 16:28:20 +0200
Hi Paul,

Paul Eggert wrote on Sat, Apr 21, 2018 at 01:55:17PM -0700:
> Ingo Schwarze wrote:

>> i admit that it requires a bit of work, but it would
>> give your users better manual pages

> We could add it to our list of things to do. In the meantime...

Thanks.  Totally reasonable and not urgent.

>> groff and mandoc output of what you committed to git are
>> byte-by-byte identical except for the following one-blank difference
>> (mandoc output looks minimally better than groff output at that
>> point, groff prints one excess blank character because it wrongly
>> detects the end of a sentence where there is none):

> That's a minor formatting glitch in the grep man page.
> Thanks for reporting it. 

Heh.  I didn't even view it as a glitch in the grep(1) manual,
but you are right, there is nothing wrong with disambiguating
it with "]\&" on the closing bracket.
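
As a rough illustration of that disambiguation (a sketch of the
idea, not a quote from the attached patch): when an input line's
output ends in a period followed by a closing bracket, groff treats
it as the end of a sentence and adds extra space. Appending the
zero-width \& escape to the closing bracket prevents that.

```roff
.\" Sketch only: "[...]" would otherwise end the output line with
.\" a period plus closing bracket, which groff takes for the end
.\" of a sentence. The trailing \& suppresses that detection.
A pattern can use
.BR * ,
.BR ? ,
and
.BR [ ... ]\&
as wildcards, and
.B \e
to quote a wildcard or backslash character literally.
```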

> I fixed it by installing the attached patch, which fixes some
> similar glitches too.  I installed a couple of other man page
> patches while in the neighborhood; 
> you can see the current version here:
> 
> https://git.savannah.gnu.org/cgit/grep.git/plain/doc/grep.in.1

I confirm that it now renders byte-by-byte identically with
git-master groff and CVS-HEAD mandoc in -Tascii mode.  While
differences are not necessarily man page bugs, identical output
is usually not a bad sign.

> @@ -1126,7 +1127,7 @@ The default is a cyan text foreground over the terminal's default background.
>  .B ne
>  Boolean value that prevents clearing to the end of line
>  using Erase in Line (EL) to Right
> -.RB ( \\\\\\33[K )
> +.RB ( \e33[K )
>  each time a colorized item ends.
>  This is needed on terminals on which EL is not supported.
>  It is otherwise useful on terminals

Excellent idea, i'm kind of surprised the old version actually
worked...  :-)

Thanks again,
  Ingo




bug closed, send any further explanations to 31228 <at> debbugs.gnu.org and "Laura Morales" <lauretas <at> mail.com>. Request was from Paul Eggert <eggert <at> cs.ucla.edu> to control <at> debbugs.gnu.org. (Wed, 01 Jan 2020 07:41:02 GMT)

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Wed, 29 Jan 2020 12:24:05 GMT)

This bug report was last modified 4 years and 88 days ago.

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.