Subject: bug#78439: Accent insensitive grep
From: "Avid Seeker" <avidseeker@HIDDEN>
To: <bug-grep@HIDDEN>
Date: Thu, 15 May 2025 05:49:00 +0000

Reiterating the question on SO
<https://stackoverflow.com/questions/20937864/> about applying an
accent-insensitive grep to text (e.g., all accented forms of the letter
'e' should be treated as an ASCII 'e').

The response by Adam Katz mentions:

> You should not expect equivalence classes to be portable as they are too
> arcane.

What is the stance of the grep developers on this? Are equivalence classes
the right tool for this? I see that they depend on LC_COLLATE, in which
case it would be possible to set up a custom locale that matches digraphs.

In the example he gave, he also mentions:

> This matches all words like aei... [but won't match] æi... it's quite
> likely that digraphs are beyond the reach of even the best equivalence
> class map.

Is there a way to set up a locale without having to recompile glibc, or are
these locale values hardcoded into programs using glibc?

Thanks,
Avid

From: help-debbugs@HIDDEN (GNU bug Tracking System)
To: "Avid Seeker" <avidseeker@HIDDEN>
Subject: bug#78439: Acknowledgement (Accent insensitive grep)
Date: Thu, 15 May 2025 07:47:03 +0000

Thank you for filing a new bug report with debbugs.gnu.org.

This is an automatically generated reply to let you know your message
has been received.

Your message is being forwarded to the package maintainers and other
interested parties for their attention; they will reply in due course.

Your message has been sent to the package maintainer(s):
 bug-grep@HIDDEN

If you wish to submit further information on this problem, please
send it to 78439 <at> debbugs.gnu.org.

Please do not send mail to help-debbugs@HIDDEN unless you wish
to report a problem with the Bug-tracking system.

--
78439: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=78439
GNU Bug Tracking System
Contact help-debbugs@HIDDEN with problems

From: Paul Eggert <eggert@HIDDEN>
To: Avid Seeker <avidseeker@HIDDEN>
Cc: 78439 <at> debbugs.gnu.org
Subject: bug#78439: Accent insensitive grep
Date: Thu, 15 May 2025 09:19:14 -0700

On 2025-05-14 22:49, Avid Seeker via Bug reports for GNU grep wrote:

> are equivalence classes the
> right tool to approach this?

They're supposed to be, yes ...

> I see that they depend on LC_COLLATE, in
> which case it would be possible to setup a custom locale that matches
> digraphs.

... though you're venturing into uncharted territory here. Please let us
know of any monsters you find.

> Is there a way to setup a locale without having to recompile glibc

Yes, use localedef.
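
The digraph concern quoted in the original report can be illustrated outside grep. The sketch below is not how grep or glibc equivalence classes work; it swaps in a different technique, Unicode NFD decomposition via Python's standard unicodedata module, purely as an illustration. The helper names fold_accents and accent_insensitive_grep are hypothetical. It shows that accented letters fold to their base letter, while a ligature such as æ has no decomposition and so still fails to match "aei".

```python
import re
import unicodedata


def fold_accents(text: str) -> str:
    """Decompose to NFD, then drop combining marks (e.g. 'é' becomes 'e')."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def accent_insensitive_grep(pattern: str, lines):
    """Yield the lines whose accent-folded form matches the folded pattern.

    Hypothetical helper for illustration only; the pattern is treated as a
    Python regular expression, not as a POSIX bracket expression.
    """
    regex = re.compile(fold_accents(pattern))
    for line in lines:
        if regex.search(fold_accents(line)):
            yield line


if __name__ == "__main__":
    sample = ["aei", "áéí", "æi", "xyz"]
    # 'æ' (U+00E6) has no canonical decomposition, so "æi" is not folded
    # to "aei" and is not reported as a match.
    for hit in accent_insensitive_grep("aei", sample):
        print(hit)
```

Under this approach a digraph or ligature would need an explicit mapping table (for example æ to "ae"), which is roughly the role a custom LC_COLLATE definition built with localedef would have to play for grep.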