GNU bug report logs - #51698
surrogate-pair test fails under Cygwin

Package: grep

Reported by: Duncan Roe <duncan_roe <at> optusnet.com.au>

Date: Tue, 9 Nov 2021 02:56:02 UTC

Severity: minor

Merged with 27555, 49983

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 51698 in the body.
You can then email your comments to 51698 AT debbugs.gnu.org in the normal way.

Report forwarded to bug-grep <at> gnu.org:
bug#51698; Package grep. (Tue, 09 Nov 2021 02:56:02 GMT)

Acknowledgement sent to Duncan Roe <duncan_roe <at> optusnet.com.au>:
New bug report received and forwarded. Copy sent to bug-grep <at> gnu.org. (Tue, 09 Nov 2021 02:56:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Duncan Roe <duncan_roe <at> optusnet.com.au>
To: bug-grep <at> gnu.org
Cc: Duncan Roe <duncan_roe <at> optusnet.com.au>
Subject: surrogate-pair test fails under Cygwin
Date: Tue, 9 Nov 2021 13:48:02 +1100
The 3rd surrogate-pair test fails under Cygwin:
> # Also test whether a surrogate-pair in the search string works.
It fails with grep-3.7 and with the latest commit.

Reproduces easily enough from the command line:
> printf '%s\n' "$(printf '\360\220\220\205')" >in
> LANG=en_US.utf8
> locale
> src/grep --file=in in

Reports a match under Linux but not under Cygwin. Tested Cygwin64 on Windows 7
Home and Windows 10.

Comparing gdb sessions between the platforms, I noticed:
> linux:  sbclen = '\001' <repeats 128 times>, '\377' <repeats 66 times>, '\376' <repeats 60 times>, "\377\377"
> cygwin: sbclen = '\001' <repeats 128 times>, '\377' <repeats 64 times>, '\376' <repeats 53 times>, '\377' <repeats 11 times>
in `dfa` (i.e. dfa.localeinfo.sbclen).

Also this:
> linux:  enlistnew (cpp=0x, new=0x "\360\220\220\205") at dfa.c:3928
> cygwin: enlistnew (cpp=0x, new=0x "\360\355\260\205") at dfa.c:3928
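
The two strings above differ only in their middle bytes, and the arithmetic below offers one possible reading of why. It is a sketch, not a diagnosis confirmed in this thread: it decodes the test bytes \360\220\220\205 to a code point, splits that code point into a UTF-16 surrogate pair (Cygwin's wchar_t is 16 bits wide), and then encodes the low surrogate on its own as three UTF-8-style bytes. All variable names are illustrative.

```shell
# Bytes of the test character, in octal as written in the report.
b1=0360; b2=0220; b3=0220; b4=0205

# Decode the 4-byte UTF-8 sequence into a code point.
cp=$(( (($b1 & 0x07) << 18) | (($b2 & 0x3F) << 12) | (($b3 & 0x3F) << 6) | ($b4 & 0x3F) ))

# Split the code point into a UTF-16 surrogate pair.
v=$(( cp - 0x10000 ))
hi=$(( 0xD800 + (v >> 10) ))
lo=$(( 0xDC00 + (v & 0x3FF) ))

# Encode the low surrogate by itself as a 3-byte UTF-8-style sequence.
l1=$(( 0xE0 | (lo >> 12) ))
l2=$(( 0x80 | ((lo >> 6) & 0x3F) ))
l3=$(( 0x80 | (lo & 0x3F) ))

printf 'code point:          U+%04X\n' "$cp"
printf 'surrogate pair:      %04X %04X\n' "$hi" "$lo"
printf 'low surrogate bytes: \\%o\\%o\\%o\n' "$l1" "$l2" "$l3"
```

This prints U+10405, the pair D801 DC05, and the bytes \355\260\205. Those three bytes are exactly what follows the leading \360 in the Cygwin string, which hints that the low surrogate is being re-encoded separately somewhere on that platform.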

Locale data is different for the same locale on the 2 systems. I investigated
this further by breakpointing the code as it starts to compute sbclen[250] which
is \376 under Linux but \377 under Cygwin. I captured the gdb sessions using
`script` and have attached them in the hope they are some help.
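
The run-length notation gdb uses for sbclen can be mapped back to byte indices to check the sbclen[250] observation. The sketch below assumes sbclen holds signed char values, so \001 reads as 1, \377 as -1 and \376 as -2 (as I understand gnulib's localeinfo, these mean single-byte character, encoding error, and lead byte of a longer sequence). The helper name rle_at and the "value:count" notation are made up for this illustration.

```shell
# Print the value stored at byte index $1 in a run-length list given as
# "value:count" words in the remaining arguments.
rle_at() {
  idx=$1; shift
  start=0
  for run in "$@"; do
    val=${run%%:*}
    len=${run##*:}
    if [ "$idx" -lt $((start + len)) ]; then
      printf '%s\n' "$val"
      return 0
    fi
    start=$((start + len))
  done
  return 1   # index lies beyond the end of the table
}

# Run lengths copied from the two gdb sessions quoted earlier.
linux="1:128 -1:66 -2:60 -1:2"
cygwin="1:128 -1:64 -2:53 -1:11"

# $linux and $cygwin are left unquoted on purpose so they split into words.
printf 'linux  sbclen[250] = %s\n' "$(rle_at 250 $linux)"
printf 'cygwin sbclen[250] = %s\n' "$(rle_at 250 $cygwin)"
```

Under these assumptions the Linux table gives -2 for byte 250 and the Cygwin table gives -1. The same lookup shows where the tables diverge: Linux treats bytes 194 to 253 as lead bytes, Cygwin treats bytes 192 to 244 that way.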

If your system rejects the tar.gz attachment I'll send them plaintext in
separate emails. They compare best in a side-by-side diff highlighting changed
characters. I find `tkdiff` good for this: from View choose "Show inline
comparison (recursive)".

Uninteresting changes between the sessions are removed:
 Automatic
 - strip hex numbers (addresses usually) to plain 0x
 - remove escape sequences (colouring &c.)
 - probably other stuff
 Specifics
 - force matching locale names
 - insert blank lines at linux:72 to line up return stmt
 - split linux:100 to more easily see later args

Cheers ... Duncan.
[gdb_sessions.tar.gz (application/x-tar-gz, attachment)]

Information forwarded to bug-grep <at> gnu.org:
bug#51698; Package grep. (Mon, 15 Nov 2021 05:20:02 GMT)

Message #8 received at 51698 <at> debbugs.gnu.org:

From: Duncan Roe <duncan_roe <at> optusnet.com.au>
To: 51698 <at> debbugs.gnu.org
Cc: Duncan Roe <duncan_roe <at> optusnet.com.au>
Subject: This is a duplicate
Date: Mon, 15 Nov 2021 16:19:47 +1100
Hi

Sorry but I just noticed this is a duplicate of bug 27555, last modified over 4
years ago.

To save anyone else wasting a lot of time over this, I could send in a patch to
skip the test under Cygwin - how about it?

(You just need to test whether $(uname -s) starts with "CYGWIN").
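
A minimal sketch of that check, assuming a POSIX shell. The helper name is_cygwin is hypothetical, and the echo lines stand in for whatever skip mechanism the grep test suite actually uses, which this thread does not show.

```shell
# Hypothetical helper: succeed if the given system name starts with "CYGWIN".
is_cygwin() {
  case $1 in
    CYGWIN*) return 0 ;;
    *)       return 1 ;;
  esac
}

if is_cygwin "$(uname -s)"; then
  echo "surrogate-pair: skipping the search-string test under Cygwin"
else
  echo "surrogate-pair: running the search-string test"
fi
```

A case pattern is used rather than an exact comparison because Cygwin's uname -s output carries a suffix after "CYGWIN" (for example a Windows version string), so only the prefix is stable.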

Cheers ... Duncan.




Merged 27555 51698. Request was from Paul Eggert <eggert <at> cs.ucla.edu> to control <at> debbugs.gnu.org. (Tue, 16 Nov 2021 18:13:01 GMT)

Information forwarded to bug-grep <at> gnu.org:
bug#51698; Package grep. (Tue, 16 Nov 2021 18:15:01 GMT)

Message #13 received at 51698 <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Duncan Roe <duncan_roe <at> optusnet.com.au>
Cc: 51698 <at> debbugs.gnu.org
Subject: Re: bug#51698: This is a duplicate
Date: Tue, 16 Nov 2021 10:14:22 -0800
On 11/14/21 21:19, Duncan Roe wrote:

> I just noticed this is a duplicate of bug 27555, last modified over 4
> years ago.

Thanks for mentioning that. I merged the bug reports.

> To save anyone else wasting a lot of time over this, I could send in a patch to
> skip the test under Cygwin - how about it?

Sounds good.




Severity set to 'minor' from 'normal'. Request was from Paul Eggert <eggert <at> cs.ucla.edu> to control <at> debbugs.gnu.org. (Wed, 24 Nov 2021 02:30:02 GMT)

Merged 27555 49983 51698. Request was from Paul Eggert <eggert <at> cs.ucla.edu> to control <at> debbugs.gnu.org. (Wed, 24 Nov 2021 02:30:02 GMT)

Removed tag(s) moreinfo. Request was from Paul Eggert <eggert <at> cs.ucla.edu> to control <at> debbugs.gnu.org. (Wed, 24 Nov 2021 02:38:01 GMT)

Bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Wed, 22 Dec 2021 12:24:07 GMT)

This bug report was last modified 2 years and 124 days ago.

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.