Message-ID: <36aaec4c-6a7e-4a7c-b9bf-e0ddf2efaa67@HIDDEN>
Date: Thu, 15 May 2025 09:19:14 -0700
Subject: Re: bug#78439: Accent insensitive grep
To: Avid Seeker <avidseeker@HIDDEN>
Cc: 78439 <at> debbugs.gnu.org
From: Paul Eggert <eggert@HIDDEN>
Organization: UCLA Computer Science Department
In-Reply-To: <D9WHYA9BBOX7.394N0TBSJEIHJ@HIDDEN>

On 2025-05-14 22:49, Avid Seeker via Bug reports for GNU grep wrote:

> are equivalence classes the
> right tool to approach this?

They're supposed to be, yes ...

> I see that they depend on LC_COLLATE, in
> which case it would be possible to setup a custom locale that matches
> digraphs.

... though you're venturing into uncharted territory here. Please let us know of any monsters you find.

> Is there a way to setup a locale without having to recompile glibc

Yes, use localedef.
bug-grep@HIDDEN: bug#78439; Package grep.
Date: Thu, 15 May 2025 05:49:00 +0000
Message-Id: <D9WHYA9BBOX7.394N0TBSJEIHJ@HIDDEN>
From: "Avid Seeker" <avidseeker@HIDDEN>
Subject: Accent insensitive grep
To: <bug-grep@HIDDEN>

Re-iterating the question on SO <https://stackoverflow.com/questions/20937864/> of applying an accent-insensitive grep to text. (e.g: all accents of a letter 'e' should be regarded as an ascii 'e').

The response by Adam Katz mentions:

> You should not expect equivalence classes to be portable as they are too arcane.

What's the stance of grep developers on this? are equivalence classes the right tool to approach this? I see that they depend on LC_COLLATE, in which case it would be possible to setup a custom locale that matches digraphs.

In the example he gave, he also mentions:

> This matches all words like aei... [but won't match] æi... it's quite
> likely that digraphs are beyond the reach of even the best equivalence
> class map.

Is there a way to setup a locale without having to recompile glibc or are these locale values hardcoded into programs using glibc?

Thanks,
Avid
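
For readers who want to see the digraph problem concretely, here is a minimal Python sketch of accent-insensitive matching. It does not use grep's equivalence classes or LC_COLLATE; it swaps in a different technique, Unicode NFD decomposition followed by removal of combining marks. The names `fold_accents`, `grep_folded`, and the `DIGRAPHS` table are hypothetical and introduced only for illustration. Characters such as æ have no canonical decomposition, so the sketch has to map them by hand, which is the same gap the quoted answer describes for equivalence classes.

```python
import unicodedata

# Hypothetical hand-written table (an assumption, not part of any library):
# NFD has no decomposition for these characters, so they need explicit entries.
DIGRAPHS = {"æ": "ae", "Æ": "AE", "œ": "oe", "Œ": "OE"}


def fold_accents(text: str) -> str:
    """Return text with combining marks removed and listed digraphs expanded."""
    # Split each accented letter into base letter + combining mark(s).
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks, keeping only the base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Expand the digraphs that decomposition leaves untouched.
    return "".join(DIGRAPHS.get(ch, ch) for ch in stripped)


def grep_folded(needle: str, lines):
    """Yield the lines whose folded form contains the folded needle."""
    folded_needle = fold_accents(needle)
    for line in lines:
        if folded_needle in fold_accents(line):
            yield line


if __name__ == "__main__":
    sample = ["café", "cafe", "æither", "other"]
    print(list(grep_folded("cafe", sample)))
    print(list(grep_folded("aei", sample)))
```

Without the `DIGRAPHS` step, the second search would find nothing, because decomposition alone leaves æ as a single character.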
"Avid Seeker" <avidseeker@HIDDEN>: bug-grep@HIDDEN.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997 nCipher Corporation Ltd,
1994-97 Ian Jackson.