GNU bug report logs - #77012
Documentation problem: odd characters in ASCII text format manual

Package: grep

Reported by: Giacomo De Bello <giacomodebello <at> hotmail.it>

Date: Fri, 14 Mar 2025 14:02:03 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 77012 in the body.
You can then email your comments to 77012 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-grep <at> gnu.org:
bug#77012; Package grep. (Fri, 14 Mar 2025 14:02:03 GMT)

Acknowledgement sent to Giacomo De Bello <giacomodebello <at> hotmail.it>:
New bug report received and forwarded. Copy sent to bug-grep <at> gnu.org. (Fri, 14 Mar 2025 14:02:03 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Giacomo De Bello <giacomodebello <at> hotmail.it>
To: "bug-grep <at> gnu.org" <bug-grep <at> gnu.org>
Subject: Documentation problem: odd characters in ASCII text format manual
Date: Fri, 14 Mar 2025 09:22:23 +0000
Greetings, the manual file in ASCII text format is filled with odd characters; this may be a character conversion problem during file generation.
For example, you can see the problem in the third line of the plain text manual online

https://www.gnu.org/software/grep/manual/grep.txt

I quote:

2 Invoking ‘grep’

It should be

2 Invoking grep

or

2 Invoking 'grep'

or something else. As far as I can see, the HTML format of the manual has the word "grep" between <code> tags, and I guess the problem is or was somewhat related to that, in docbook2txt or gendocs.sh; I said "was" because maybe just relaunching gendocs.sh will fix that, as the document was last updated May 13, 2023. Regards.



Information forwarded to bug-grep <at> gnu.org:
bug#77012; Package grep. (Fri, 14 Mar 2025 17:39:02 GMT)

Message #8 received at 77012 <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Giacomo De Bello <giacomodebello <at> hotmail.it>
Cc: savannah-hackers <at> gnu.org, 77012 <at> debbugs.gnu.org
Subject: Re: bug#77012: Documentation problem: odd characters in ASCII text
 format manual
Date: Fri, 14 Mar 2025 10:38:30 -0700
On 2025-03-14 02:22, Giacomo De Bello wrote:
> Greetings, the manual file in ASCII text format is filled with odd characters; this may be a character conversion problem during file generation.
> For example, you can see the problem in the third line of the plain text manual online
> 
> https://www.gnu.org/software/grep/manual/grep.txt

Unfortunately the Savannah web server responds with a header that lacks 
the line "Content-Type: text/plain; charset=utf-8", needed because this 
file uses UTF-8 encoding, not ASCII.
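
For illustration, the odd characters can be reproduced with a minimal Python sketch. It assumes the client falls back to Windows-1252 when the server declares no charset (the legacy default mentioned later in this thread); the variable names are hypothetical.

```python
# Heading from the third line of grep.txt, as quoted in the report.
heading = "2 Invoking \u2018grep\u2019"  # curly quotes U+2018 and U+2019

# Bytes as stored in the UTF-8 encoded manual.
raw = heading.encode("utf-8")

# Assumption: with no declared charset, the client decodes as Windows-1252.
garbled = raw.decode("cp1252")

# Decoding with the encoding the file actually uses restores the text.
restored = raw.decode("utf-8")
```

Each curly quote occupies three bytes in UTF-8, so under this assumption it shows up as three characters (â€˜ and â€™) instead of one.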

I attempted to fix the server bug by adding an .htaccess file like this:

  <Files ".txt">
    AddDefaultCharset utf-8
    ForceType text/plain
  </Files>

at grep's root, but no dice.

Although possibly I messed up the CVS commit (I never remember how to 
use CVS any more), Savannah is reeeeallly slow right now and I'm getting 
a "502 Bad Gateway" when I try to visit, for example, 
<http://web.cvs.savannah.gnu.org/viewvc/grep/> to try to debug this. So 
I'll cc this email to savannah-hackers to see whether they can fix the 
UTF-8 issue for me.
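
One way to check what the server declares is sketched below, using only the Python standard library. The helper names are hypothetical, and `fetch_declared_charset` needs network access to the site in question.

```python
from email.message import Message
from urllib.request import urlopen


def declared_charset(content_type):
    """Return the charset parameter of a Content-Type header value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def fetch_declared_charset(url):
    """Fetch url and return the charset the server declares, or None."""
    with urlopen(url) as response:
        # response.headers is an email.message.Message subclass.
        return response.headers.get_content_charset()
```

A result of `None` from `fetch_declared_charset("https://www.gnu.org/software/grep/manual/grep.txt")` would mean the response carries no charset parameter, which is the situation described above.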

Thanks for reporting the problem.




Information forwarded to bug-grep <at> gnu.org:
bug#77012; Package grep. (Fri, 14 Mar 2025 19:43:02 GMT)

Message #11 received at 77012 <at> debbugs.gnu.org:

From: Giacomo De Bello <giacomodebello <at> hotmail.it>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: "savannah-hackers <at> gnu.org" <savannah-hackers <at> gnu.org>,
 "77012 <at> debbugs.gnu.org" <77012 <at> debbugs.gnu.org>
Subject: Re: bug#77012: Documentation problem: odd characters in ASCII text
 format manual
Date: Fri, 14 Mar 2025 19:42:38 +0000
On 2025-03-14 18:38, Paul Eggert wrote:

> the Savannah web server responds with a header that lacks
> the line "Content-Type: text/plain; charset=utf-8", needed because
> this file uses UTF-8 encoding, not ASCII.

Trying to be helpful, I found that Savannah migrated to "updated
systems" in November 2016 [1], and the grep.txt page was OK before [2]
and problematic after [3] the upgrade.
Anyway, another project, alive [4], has had a similar problem starting
on some day between May [5] and October [6] 2013.
But I'll stop here, to avoid adding noise and going off topic.

> Savannah is reeeeallly slow right now and I'm getting a "502 Bad
> Gateway" when I try to visit, for example,
> <http://web.cvs.savannah.gnu.org/viewvc/grep/>

Same here.

> So I'll cc this email to savannah-hackers to see whether they can fix
> the UTF-8 issue for me.

Same.

> Thanks for reporting the problem.

Thanks for your efforts.

[1] https://savannah.nongnu.org/news/?id=8710
[2] https://web.archive.org/web/20151225120101/http://www.gnu.org/software/grep/manual/grep.txt
[3] https://web.archive.org/web/20161224123914/http://www.gnu.org/software/grep/manual/grep.txt
[4] http://www.gnu.org/software/alive/manual/alive.txt
[5] https://web.archive.org/web/20130517082946/http://www.gnu.org/software/alive/manual/alive.txt
[6] https://web.archive.org/web/20131010110923/http://www.gnu.org/software/alive/manual/alive.txt



Reply sent to Paul Eggert <eggert <at> cs.ucla.edu>:
You have taken responsibility. (Mon, 17 Mar 2025 20:00:02 GMT)

Notification sent to Giacomo De Bello <giacomodebello <at> hotmail.it>:
bug acknowledged by developer. (Mon, 17 Mar 2025 20:00:02 GMT)

Message #16 received at 77012-done <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Giacomo De Bello <giacomodebello <at> hotmail.it>
Cc: savannah-hackers <at> gnu.org, 77012-done <at> debbugs.gnu.org
Subject: Re: bug#77012: Documentation problem: odd characters in ASCII text
 format manual
Date: Mon, 17 Mar 2025 12:59:05 -0700
Thanks for the further investigation. I solved the problem for grep by 
installing the following patch, and am closing the bug report.

https://web.cvs.savannah.gnu.org/viewvc/grep/grep/.htaccess?r1=1.1&r2=1.2

Presumably GNU project web pages should routinely have .htaccess files 
that say something like this:

  <Files "*.txt">
    AddDefaultCharset utf-8
    ForceType text/plain
  </Files>

as the old Windows-1252 default no longer makes sense. I'll take a look 
at doing this for the projects I help maintain.
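
The two .htaccess snippets in this thread differ only in the pattern given to <Files>: ".txt" in the first attempt, "*.txt" here. A rough Python sketch of why that matters, assuming <Files> wildcards behave like shell-style globbing on a file's base name (an approximation using `fnmatch`, not Apache's own matching code; the helper name is hypothetical):

```python
from fnmatch import fnmatchcase


def files_pattern_matches(pattern, filename):
    """Approximate an Apache <Files> wildcard match on a file's base name."""
    return fnmatchcase(filename, pattern)


# Without a wildcard, the pattern only matches a file literally named ".txt".
first_attempt = files_pattern_matches(".txt", "grep.txt")

# With "*", any base name ending in ".txt" matches.
fixed_pattern = files_pattern_matches("*.txt", "grep.txt")
```

Under that assumption the first pattern never applies to grep.txt, which would be consistent with the earlier attempt having no effect.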

Alternatively (and more efficiently) this could be addressed by changing 
the Apache configuration on Savannah; however, I lack admin access for that.






bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 15 Apr 2025 11:24:08 GMT)

This bug report was last modified 23 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.