GNU logs - #78439, boring messages


Message sent to bug-grep@HIDDEN:


X-Loop: help-debbugs@HIDDEN
Subject: bug#78439: Accent insensitive grep
Resent-From: "Avid Seeker" <avidseeker@HIDDEN>
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-grep@HIDDEN
Resent-Date: Thu, 15 May 2025 07:47:02 +0000
Resent-Message-ID: <handler.78439.B.174729517812630 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: report 78439
X-GNU-PR-Package: grep
X-GNU-PR-Keywords: 
To: 78439 <at> debbugs.gnu.org
X-Debbugs-Original-To: <bug-grep@HIDDEN>
Received: via spool by submit <at> debbugs.gnu.org id=B.174729517812630
          (code B ref -1); Thu, 15 May 2025 07:47:02 +0000
Received: (at submit) by debbugs.gnu.org; 15 May 2025 07:46:18 +0000
Received: from localhost ([127.0.0.1]:50611 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1uFTIH-0003HZ-Cw
	for submit <at> debbugs.gnu.org; Thu, 15 May 2025 03:46:18 -0400
Received: from lists.gnu.org ([2001:470:142::17]:33014)
 by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.84_2) (envelope-from <avidseeker@HIDDEN>)
 id 1uFRT0-0002jA-Ns
 for submit <at> debbugs.gnu.org; Thu, 15 May 2025 01:49:15 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10])
 by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <avidseeker@HIDDEN>)
 id 1uFRSu-00034x-R3
 for bug-grep@HIDDEN; Thu, 15 May 2025 01:49:08 -0400
Received: from layka.disroot.org ([178.21.23.139])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128)
 (Exim 4.90_1) (envelope-from <avidseeker@HIDDEN>)
 id 1uFRSs-0000OJ-OG
 for bug-grep@HIDDEN; Thu, 15 May 2025 01:49:08 -0400
Received: from mail01.disroot.lan (localhost [127.0.0.1])
 by disroot.org (Postfix) with ESMTP id F12A1252B1
 for <bug-grep@HIDDEN>; Thu, 15 May 2025 07:49:02 +0200 (CEST)
X-Virus-Scanned: SPAM Filter at disroot.org
Received: from layka.disroot.org ([127.0.0.1])
 by localhost (disroot.org [127.0.0.1]) (amavis, port 10024) with ESMTP
 id L0h5YT0v0zFN for <bug-grep@HIDDEN>;
 Thu, 15 May 2025 07:49:02 +0200 (CEST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=disroot.org; s=mail;
 t=1747288142; bh=Xu5wWXBoen1LAu/+jswb7ZnfGuJhK88VGbgkOzROvSU=;
 h=Date:From:Subject:To;
 b=DiGgZXa2lvksGTqDEJDn7G+tMZyxuBiv7pTGu5ljyOyRF6Kcpd1wHcSHAF6g/QVsA
 NWKO6nxVlTJnIv9Cjj2Sn09cvaqiTwV8TrbLZjz87voGjFdLRXipgSCxhuB5ZcmFnK
 65ixavmohKImfEr/WDMXdcU6TmwqO0GjcriNfWvblBY4cyvq2uGclK/mC7se2JdDo1
 rEj30O7YFCY0Mn9oy/hT7CLbSRVJpXHK2NIgfIQ0I5/XuxnYAs0H/+suPy4gLPqTew
 CbnpIkhxgpXRDLmjtkgZNZfGGLS6sDonjjejTHctcmNwNvkpaMvOkcu81UppnFiCz3
 SfTrDGhgmr9cg==
Content-Type: text/plain; charset=UTF-8; format=Flowed
Date: Thu, 15 May 2025 05:49:00 +0000
Message-Id: <D9WHYA9BBOX7.394N0TBSJEIHJ@HIDDEN>
From: "Avid Seeker" <avidseeker@HIDDEN>
Mime-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Received-SPF: pass client-ip=178.21.23.139;
 envelope-from=avidseeker@HIDDEN; helo=layka.disroot.org
X-Spam_score_int: -20
X-Spam_score: -2.1
X-Spam_bar: --
X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1,
 DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1,
 RCVD_IN_VALIDITY_CERTIFIED_BLOCKED=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001,
 SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-Spam-Score: 0.9 (/)
X-Mailman-Approved-At: Thu, 15 May 2025 03:46:11 -0400
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -0.1 (/)

Re-iterating the question on SO <https://stackoverflow.com/questions/20937864/> of applying an
accent-insensitive grep to text (e.g., all accented forms of the letter 'e' should be
regarded as an ASCII 'e').

The response by Adam Katz mentions:
> You should not expect equivalence classes to be portable as they are too arcane.

What's the stance of grep developers on this? are equivalence classes the
right tool to approach this? I see that they depend on LC_COLLATE, in
which case it would be possible to setup a custom locale that matches
digraphs.

In the example he gave, he also mentions:
> This matches all words like aei... [but won't match] æi... it's quite
> likely that digraphs are beyond the reach of even the best equivalence
> class map.

Is there a way to setup a locale without having to recompile glibc or
are these locale values hardcoded into programs using glibc?

Thanks,
Avid




Message sent:


Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
X-Mailer: MIME-tools 5.505 (Entity 5.505)
Content-Type: text/plain; charset=utf-8
X-Loop: help-debbugs@HIDDEN
From: help-debbugs@HIDDEN (GNU bug Tracking System)
To: "Avid Seeker" <avidseeker@HIDDEN>
Subject: bug#78439: Acknowledgement (Accent insensitive grep)
Message-ID: <handler.78439.B.174729517812630.ack <at> debbugs.gnu.org>
References: <D9WHYA9BBOX7.394N0TBSJEIHJ@HIDDEN>
X-Gnu-PR-Message: ack 78439
X-Gnu-PR-Package: grep
Reply-To: 78439 <at> debbugs.gnu.org
Date: Thu, 15 May 2025 07:47:03 +0000

Thank you for filing a new bug report with debbugs.gnu.org.

This is an automatically generated reply to let you know your message
has been received.

Your message is being forwarded to the package maintainers and other
interested parties for their attention; they will reply in due course.

Your message has been sent to the package maintainer(s):
 bug-grep@HIDDEN

If you wish to submit further information on this problem, please
send it to 78439 <at> debbugs.gnu.org.

Please do not send mail to help-debbugs@HIDDEN unless you wish
to report a problem with the Bug-tracking system.

-- 
78439: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=78439
GNU Bug Tracking System
Contact help-debbugs@HIDDEN with problems


Message sent to bug-grep@HIDDEN:


X-Loop: help-debbugs@HIDDEN
Subject: bug#78439: Accent insensitive grep
Resent-From: Paul Eggert <eggert@HIDDEN>
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-grep@HIDDEN
Resent-Date: Thu, 15 May 2025 16:20:04 +0000
Resent-Message-ID: <handler.78439.B78439.174732599319036 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: followup 78439
X-GNU-PR-Package: grep
X-GNU-PR-Keywords: 
To: Avid Seeker <avidseeker@HIDDEN>
Cc: 78439 <at> debbugs.gnu.org
Received: via spool by 78439-submit <at> debbugs.gnu.org id=B78439.174732599319036
          (code B ref 78439); Thu, 15 May 2025 16:20:04 +0000
Received: (at 78439) by debbugs.gnu.org; 15 May 2025 16:19:53 +0000
Received: from localhost ([127.0.0.1]:55376 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1uFbJI-0004wv-6S
	for submit <at> debbugs.gnu.org; Thu, 15 May 2025 12:19:53 -0400
Received: from mail.cs.ucla.edu ([131.179.128.66]:53128)
 by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.84_2) (envelope-from <eggert@HIDDEN>)
 id 1uFbIo-0004vh-1m
 for 78439 <at> debbugs.gnu.org; Thu, 15 May 2025 12:19:27 -0400
Received: from localhost (localhost [127.0.0.1])
 by mail.cs.ucla.edu (Postfix) with ESMTP id 4FECB3C0140A0;
 Thu, 15 May 2025 09:19:15 -0700 (PDT)
Received: from mail.cs.ucla.edu ([127.0.0.1])
 by localhost (mail.cs.ucla.edu [127.0.0.1]) (amavis, port 10032) with ESMTP
 id NthAMA7Rzisg; Thu, 15 May 2025 09:19:15 -0700 (PDT)
Received: from localhost (localhost [127.0.0.1])
 by mail.cs.ucla.edu (Postfix) with ESMTP id 287D33C0149C6;
 Thu, 15 May 2025 09:19:15 -0700 (PDT)
DKIM-Filter: OpenDKIM Filter v2.10.3 mail.cs.ucla.edu 287D33C0149C6
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cs.ucla.edu;
 s=9D0B346E-2AEB-11ED-9476-E14B719DCE6C; t=1747325955;
 bh=8soblP3JaemDQZKum8be67J0ZRPSsL8Qx+A/TMkgNeA=;
 h=Message-ID:Date:MIME-Version:To:From;
 b=K25149QgUVi4oL2IA2soxE04OokaRDyI0eE0QpJsgiZnWLivjseFqpV3Jt5AzQxCu
 U8YoMsZ01PYeVNVLDLCVKyRRJOa5PJlJkf99oKeCHEaHWQDgye55ZeDaT+IhFMmERQ
 xwuUC4lDyl2Kaa92QR8FnuTJ3M6V/mucrDWVvPpsldeNX+wwv0EXKZvuvJLDNQ8CP/
 yXWCkkhGyS6ZjoLisYmMGHwJc0jqLS2rQJLWcFfaXdFKOHOQIZ2Sssd1Rwg4VVM0lL
 AHM684clcPe2FBdjwSbUYmRRPtBP+I8vlIB7F+wv4coLpYAj67E9Cvl8/G8J+538HB
 ZVDxedk57qA7A==
X-Virus-Scanned: amavis at mail.cs.ucla.edu
Received: from mail.cs.ucla.edu ([127.0.0.1])
 by localhost (mail.cs.ucla.edu [127.0.0.1]) (amavis, port 10026) with ESMTP
 id 9WD6NINoEkt5; Thu, 15 May 2025 09:19:15 -0700 (PDT)
Received: from [192.168.254.12]
 (47-147-225-25.fdr01.snmn.ca.ip.frontiernet.net [47.147.225.25])
 by mail.cs.ucla.edu (Postfix) with ESMTPSA id 0F7B03C0140A0;
 Thu, 15 May 2025 09:19:15 -0700 (PDT)
Message-ID: <36aaec4c-6a7e-4a7c-b9bf-e0ddf2efaa67@HIDDEN>
Date: Thu, 15 May 2025 09:19:14 -0700
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
References: <D9WHYA9BBOX7.394N0TBSJEIHJ@HIDDEN>
Content-Language: en-US
From: Paul Eggert <eggert@HIDDEN>
Organization: UCLA Computer Science Department
In-Reply-To: <D9WHYA9BBOX7.394N0TBSJEIHJ@HIDDEN>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-Spam-Score: 0.0 (/)
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -1.0 (-)

On 2025-05-14 22:49, Avid Seeker via Bug reports for GNU grep wrote:

> are equivalence classes the
> right tool to approach this?

They're supposed to be, yes ...

> I see that they depend on LC_COLLATE, in
> which case it would be possible to setup a custom locale that matches
> digraphs.

... though you're venturing into uncharted territory here. Please let us 
know of any monsters you find.

> Is there a way to setup a locale without having to recompile glibc

Yes, use localedef.
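
As an illustration of the accent folding discussed in this thread, here is a minimal sketch in Python. It does not use grep's equivalence classes or a custom locale; it swaps in a different technique, Unicode NFD normalization followed by stripping of combining marks. The names fold_accents, grep_folded and the DIGRAPHS table are hypothetical, introduced only for this example. The sketch also shows why digraphs such as æ need separate handling: they have no Unicode decomposition, so normalization alone leaves them untouched.

```python
import re
import unicodedata

# Hypothetical mapping for letters that have no Unicode decomposition.
# This table is an assumption for illustration and is far from complete.
DIGRAPHS = {"æ": "ae", "Æ": "AE", "œ": "oe", "Œ": "OE", "ß": "ss"}


def fold_accents(text: str) -> str:
    """Return text with digraphs expanded and combining accents removed."""
    # Expand digraphs first, since NFD leaves characters like 'æ' unchanged.
    expanded = "".join(DIGRAPHS.get(ch, ch) for ch in text)
    # NFD splits 'é' into 'e' followed by U+0301 (combining acute accent).
    decomposed = unicodedata.normalize("NFD", expanded)
    # Drop every combining mark, keeping only the base characters.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def grep_folded(pattern: str, lines):
    """Return the lines whose accent-folded form matches pattern.

    The pattern is assumed to be an unaccented regular expression;
    it is used as given and is not folded.
    """
    regex = re.compile(pattern)
    return [line for line in lines if regex.search(fold_accents(line))]


if __name__ == "__main__":
    for hit in grep_folded("aei", ["æi", "áéí", "xyz"]):
        print(hit)
```

Folding each input line rather than rewriting the pattern keeps the regular expression simple, at the cost of a per-line normalization pass. Whether a custom locale built with localedef can give grep itself the same behaviour for digraphs is left open in the thread.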






Last modified: Thu, 15 May 2025 16:30:02 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.