GNU bug report logs - #40216
28.0.50; Misinformation in isearch char-fold


Package: emacs; Reported by: Juri Linkov <juri@HIDDEN>; Keywords: patch fixed; Done: Juri Linkov <juri@HIDDEN>; Maintainer for emacs is bug-gnu-emacs@HIDDEN.
Bug marked as fixed in version 28.0.50; send any further explanations to 40216 <at> debbugs.gnu.org and Juri Linkov <juri@HIDDEN>. Request was from Juri Linkov <juri@HIDDEN> to control <at> debbugs.gnu.org.
Added tag(s) fixed. Request was from Juri Linkov <juri@HIDDEN> to control <at> debbugs.gnu.org.

Message received at 40216 <at> debbugs.gnu.org:


Received: (at 40216) by debbugs.gnu.org; 30 Mar 2020 02:35:34 +0000
From juri@HIDDEN Sat Mar 28 20:10:17 2020
Received: from relay1-d.mail.gandi.net ([217.70.183.193]:37447)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <juri@HIDDEN>)
 id 1jILWq-00048l-Td; Sat, 28 Mar 2020 20:10:17 -0400
X-Originating-IP: 91.129.96.173
Received: from mail.gandi.net (m91-129-96-173.cust.tele2.ee [91.129.96.173])
 (Authenticated sender: juri@HIDDEN)
 by relay1-d.mail.gandi.net (Postfix) with ESMTPSA id 96EEA240004;
 Sun, 29 Mar 2020 00:10:09 +0000 (UTC)
From: Juri Linkov <juri@HIDDEN>
To: Robert Pluim <rpluim@HIDDEN>
Cc: Eli Zaretskii <eliz@HIDDEN>,  40216 <at> debbugs.gnu.org
Subject: Re: bug#40216: 28.0.50; Misinformation in isearch char-fold
Organization: LINKOV.NET
References: <87y2rppddp.fsf@HIDDEN> <m2zhc4eql0.fsf@HIDDEN>
 <87imiskzc6.fsf@HIDDEN> <m2imirea7c.fsf@HIDDEN>
 <83d08z9os5.fsf@HIDDEN> <m21rpf86lt.fsf@HIDDEN>
 <87ftdud8gj.fsf@HIDDEN> <83pncy8dlg.fsf@HIDDEN>
 <m2k1366vz3.fsf@HIDDEN>
Date: Sun, 29 Mar 2020 01:42:02 +0200
In-Reply-To: <m2k1366vz3.fsf@HIDDEN> (Robert Pluim's message of "Fri, 27
 Mar 2020 09:30:24 +0100")
Message-ID: <87k134jbcl.fsf@HIDDEN>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (x86_64-pc-linux-gnu)
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Spam-Score: -0.7 (/)
X-Debbugs-Envelope-To: 40216
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
X-List-Received-Date: Sun, 29 Mar 2020 00:10:17 -0000

tags 40216 fixed
close 40216 28.0.50
quit

>     Eli> So if one wants to support the kind of folding you expected, one would
>     Eli> have to customize char-fold-include to add those additional rules.
>
> Yes, wrong example. I guess this wouldnʼt be useful after all (and

Thanks for pointing out a possibility to optimize char-fold,
I haven't thought about this before.  But it seems this optimization
limits the usability of char-fold since matching non-ascii characters
on ascii text is not needed as often as matching ascii on non-ascii text,
or both ways.  Even the current default of folding ascii to non-ascii
is so useless for me that I have to enable char-fold-symmetric.

> I see nothing wrong with Juri's proposed fix to the actual issue).

So now pushed to master.




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#40216; Package emacs. Full text available.

Message received at 40216 <at> debbugs.gnu.org:


Received: (at 40216) by debbugs.gnu.org; 27 Mar 2020 08:30:33 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Fri Mar 27 04:30:33 2020
Received: from localhost ([127.0.0.1]:60510 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1jHkNt-0000Od-Dd
	for submit <at> debbugs.gnu.org; Fri, 27 Mar 2020 04:30:33 -0400
Received: from mail-wm1-f45.google.com ([209.85.128.45]:37232)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <rpluim@HIDDEN>) id 1jHkNs-0000OL-Cq
 for 40216 <at> debbugs.gnu.org; Fri, 27 Mar 2020 04:30:32 -0400
Received: by mail-wm1-f45.google.com with SMTP id d1so11342749wmb.2
 for <40216 <at> debbugs.gnu.org>; Fri, 27 Mar 2020 01:30:32 -0700 (PDT)
X-Received: by 2002:a1c:8090:: with SMTP id b138mr4393036wmd.55.1585297826002; 
 Fri, 27 Mar 2020 01:30:26 -0700 (PDT)
Received: from rpluim-mac ([2a01:e34:ecfc:a860:bdc9:e98c:bf1f:4cce])
 by smtp.gmail.com with ESMTPSA id a82sm13859970wmh.0.2020.03.27.01.30.24
 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Fri, 27 Mar 2020 01:30:25 -0700 (PDT)
From: Robert Pluim <rpluim@HIDDEN>
To: Eli Zaretskii <eliz@HIDDEN>
Subject: Re: bug#40216: 28.0.50; Misinformation in isearch char-fold
References: <87y2rppddp.fsf@HIDDEN> <m2zhc4eql0.fsf@HIDDEN>
 <87imiskzc6.fsf@HIDDEN> <m2imirea7c.fsf@HIDDEN>
 <83d08z9os5.fsf@HIDDEN> <m21rpf86lt.fsf@HIDDEN>
 <87ftdud8gj.fsf@HIDDEN> <83pncy8dlg.fsf@HIDDEN>
Date: Fri, 27 Mar 2020 09:30:24 +0100
In-Reply-To: <83pncy8dlg.fsf@HIDDEN> (Eli Zaretskii's message of "Fri, 27 Mar
 2020 10:24:27 +0300")
Message-ID: <m2k1366vz3.fsf@HIDDEN>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-Spam-Score: 0.0 (/)
X-Debbugs-Envelope-To: 40216
Cc: 40216 <at> debbugs.gnu.org, Juri Linkov <juri@HIDDEN>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -1.0 (-)

>>>>> On Fri, 27 Mar 2020 10:24:27 +0300, Eli Zaretskii <eliz@HIDDEN> said:

    >> From: Juri Linkov <juri@HIDDEN>
    >> Cc: Eli Zaretskii <eliz@HIDDEN>,  40216 <at> debbugs.gnu.org
    >> Date: Fri, 27 Mar 2020 01:04:12 +0200
    >>
    >> I tried to find Røbert by typing Robert, but char-fold fails to find it.
    >> A bug in char-fold?

    Eli> I don't think it's a bug, because ø doesn't have a decomposition in
    Eli> the Unicode character database:

    Eli>    (get-char-code-property ?ø 'decomposition) => (248)

    Eli> (i.e. the character "decomposes" into itself).  By contrast:

    Eli>    (get-char-code-property ?á 'decomposition) => (97 769)

    Eli> (i.e. á decomposes into a followed by U+0301 COMBINING ACUTE ACCENT).

    Eli> So if one wants to support the kind of folding you expected, one would
    Eli> have to customize char-fold-include to add those additional rules.

Yes, wrong example. I guess this wouldnʼt be useful after all (and I
see nothing wrong with Juri's proposed fix to the actual issue).

Robert





Message received at 40216 <at> debbugs.gnu.org:


Received: (at 40216) by debbugs.gnu.org; 27 Mar 2020 07:24:33 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Fri Mar 27 03:24:33 2020
Received: from localhost ([127.0.0.1]:60475 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1jHjM1-000510-Kx
	for submit <at> debbugs.gnu.org; Fri, 27 Mar 2020 03:24:33 -0400
Received: from eggs.gnu.org ([209.51.188.92]:37483)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <eliz@HIDDEN>) id 1jHjM0-00050k-9j
 for 40216 <at> debbugs.gnu.org; Fri, 27 Mar 2020 03:24:32 -0400
Received: from fencepost.gnu.org ([2001:470:142:3::e]:51843)
 by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from <eliz@HIDDEN>)
 id 1jHjLu-0004c9-Tw; Fri, 27 Mar 2020 03:24:27 -0400
Received: from [176.228.60.248] (port=4849 helo=home-c4e4a596f7)
 by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256)
 (Exim 4.82) (envelope-from <eliz@HIDDEN>)
 id 1jHjLt-0006f1-Ns; Fri, 27 Mar 2020 03:24:26 -0400
Date: Fri, 27 Mar 2020 10:24:27 +0300
Message-Id: <83pncy8dlg.fsf@HIDDEN>
From: Eli Zaretskii <eliz@HIDDEN>
To: Juri Linkov <juri@HIDDEN>
In-Reply-To: <87ftdud8gj.fsf@HIDDEN> (message from Juri Linkov on
 Fri, 27 Mar 2020 01:04:12 +0200)
Subject: Re: bug#40216: 28.0.50; Misinformation in isearch char-fold
References: <87y2rppddp.fsf@HIDDEN> <m2zhc4eql0.fsf@HIDDEN>
 <87imiskzc6.fsf@HIDDEN> <m2imirea7c.fsf@HIDDEN>
 <83d08z9os5.fsf@HIDDEN> <m21rpf86lt.fsf@HIDDEN>
 <87ftdud8gj.fsf@HIDDEN>
MIME-version: 1.0
Content-type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic]
X-Spam-Score: -0.7 (/)
X-Debbugs-Envelope-To: 40216
Cc: 40216 <at> debbugs.gnu.org, rpluim@HIDDEN
X-Spam-Score: -1.7 (-)

> From: Juri Linkov <juri@HIDDEN>
> Cc: Eli Zaretskii <eliz@HIDDEN>,  40216 <at> debbugs.gnu.org
> Date: Fri, 27 Mar 2020 01:04:12 +0200
> 
> I tried to find Røbert by typing Robert, but char-fold fails to find it.
> A bug in char-fold?

I don't think it's a bug, because ø doesn't have a decomposition in
the Unicode character database:

   (get-char-code-property ?ø 'decomposition) => (248)

(i.e. the character "decomposes" into itself).  By contrast:

   (get-char-code-property ?á 'decomposition) => (97 769)

(i.e. á decomposes into a followed by U+0301 COMBINING ACUTE ACCENT).

So if one wants to support the kind of folding you expected, one would
have to customize char-fold-include to add those additional rules.
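
A minimal sketch of such a customization, assuming char-fold-include is
an alist whose entries pair a character with the strings that should fold
into it.  The entries for o and O are illustrative choices, not rules
taken from this thread or from the Emacs defaults:

```elisp
;; Sketch only: the ?o and ?O entries below are illustrative assumptions.
(require 'char-fold)

;; Each entry is (CHAR STRING...): searching for CHAR also matches each STRING.
(customize-set-variable
 'char-fold-include
 (append char-fold-include
         '((?o "ø")
           (?O "Ø"))))

;; Non-nil means the folded regexp for "Robert" matches "Røbert".
(string-match-p (char-fold-to-regexp "Robert") "Røbert")
```

customize-set-variable is used here instead of setq on the assumption
that the option's setter must run for the folding table to be rebuilt.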





Message received at 40216 <at> debbugs.gnu.org:


Received: (at 40216) by debbugs.gnu.org; 26 Mar 2020 23:46:25 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Thu Mar 26 19:46:25 2020
Received: from localhost ([127.0.0.1]:60380 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1jHcCe-0005hY-BI
	for submit <at> debbugs.gnu.org; Thu, 26 Mar 2020 19:46:25 -0400
Received: from relay10.mail.gandi.net ([217.70.178.230]:41129)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <juri@HIDDEN>) id 1jHcCb-0005h2-Iv
 for 40216 <at> debbugs.gnu.org; Thu, 26 Mar 2020 19:46:22 -0400
Received: from mail.gandi.net (m91-129-96-173.cust.tele2.ee [91.129.96.173])
 (Authenticated sender: juri@HIDDEN)
 by relay10.mail.gandi.net (Postfix) with ESMTPSA id 7AD62240006;
 Thu, 26 Mar 2020 23:46:14 +0000 (UTC)
From: Juri Linkov <juri@HIDDEN>
To: Robert Pluim <rpluim@HIDDEN>
Subject: Re: bug#40216: 28.0.50; Misinformation in isearch char-fold
Organization: LINKOV.NET
References: <87y2rppddp.fsf@HIDDEN> <m2zhc4eql0.fsf@HIDDEN>
 <87imiskzc6.fsf@HIDDEN> <m2imirea7c.fsf@HIDDEN>
 <83d08z9os5.fsf@HIDDEN> <m21rpf86lt.fsf@HIDDEN>
Date: Fri, 27 Mar 2020 01:04:12 +0200
In-Reply-To: <m21rpf86lt.fsf@HIDDEN> (Robert Pluim's message of "Thu, 26
 Mar 2020 16:43:10 +0100")
Message-ID: <87ftdud8gj.fsf@HIDDEN>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (x86_64-pc-linux-gnu)
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Spam-Score: -0.7 (/)
X-Debbugs-Envelope-To: 40216
Cc: 40216 <at> debbugs.gnu.org, Eli Zaretskii <eliz@HIDDEN>
X-Spam-Score: -1.7 (-)

>     >> Do we need an option to char-fold-regexp that says 'only apply
>     >> char-folding to non-ascii characters'?
>
>     Eli> But this feature is not intended only to find variants of non-ASCII
>     Eli> characters when one searches for a non-ASCII, it is also intended to
>     Eli> find variants when searching for ASCII characters.  For example,
>     Eli> searching for a is supposed to find ä and à and á.  Or am I missing
>     Eli> something?
>
> Yes, thatʼs exactly right. But in the case where you have mainly
> characters where you donʼt want case-folding, it might make sense to
> restrict the folding to non-ascii as an optimisation. eg. Suppose my
> name were Røbert, with people frequently misspelling it as Robert, I
> might want isearch to just search for "R\\(?:ǿ\\|[øǿo]\\)bert"

I tried to find Røbert by typing Robert, but char-fold fails to find it.
A bug in char-fold?





Message received at 40216 <at> debbugs.gnu.org:


Received: (at 40216) by debbugs.gnu.org; 26 Mar 2020 23:46:22 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Thu Mar 26 19:46:22 2020
Received: from localhost ([127.0.0.1]:60378 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1jHcCb-0005hI-Vg
	for submit <at> debbugs.gnu.org; Thu, 26 Mar 2020 19:46:22 -0400
Received: from relay9-d.mail.gandi.net ([217.70.183.199]:52025)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <juri@HIDDEN>) id 1jHcCZ-0005gw-7v
 for 40216 <at> debbugs.gnu.org; Thu, 26 Mar 2020 19:46:20 -0400
X-Originating-IP: 91.129.96.173
Received: from mail.gandi.net (m91-129-96-173.cust.tele2.ee [91.129.96.173])
 (Authenticated sender: juri@HIDDEN)
 by relay9-d.mail.gandi.net (Postfix) with ESMTPSA id E21BEFF802;
 Thu, 26 Mar 2020 23:46:11 +0000 (UTC)
From: Juri Linkov <juri@HIDDEN>
To: Robert Pluim <rpluim@HIDDEN>
Subject: Re: bug#40216: 28.0.50; Misinformation in isearch char-fold
Organization: LINKOV.NET
References: <87y2rppddp.fsf@HIDDEN> <m2zhc4eql0.fsf@HIDDEN>
 <87imiskzc6.fsf@HIDDEN> <m2imirea7c.fsf@HIDDEN>
Date: Fri, 27 Mar 2020 01:00:18 +0200
In-Reply-To: <m2imirea7c.fsf@HIDDEN> (Robert Pluim's message of "Thu, 26
 Mar 2020 10:28:55 +0100")
Message-ID: <87mu82d8n1.fsf@HIDDEN>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (x86_64-pc-linux-gnu)
MIME-Version: 1.0
Content-Type: text/plain
X-Spam-Score: -0.7 (/)
X-Debbugs-Envelope-To: 40216
Cc: 40216 <at> debbugs.gnu.org
X-Spam-Score: -1.7 (-)

> Ah, I hadn't considered that use case. Do we need an option to
> char-fold-regexp that says 'only apply char-folding to non-ascii
> characters'? That would reduce the size of the regexp considerably.

Currently there are 2 covered use cases:

1. the default is to fold ascii to non-ascii characters;

2. non-nil char-fold-symmetric additionally folds
   non-ascii to ascii characters.

It seems you are proposing a third use case:

3. symmetric-only that can be implemented with a new non-nil option
   char-fold-symmetric-only that will fold only non-ascii characters
   to ascii.

I have doubts about how useful this will be.

The current default behavior is useful when the user types
ascii characters on the keyboard with ascii characters only.

The option char-fold-symmetric is useful to match pasted text
both ways ignoring all differences between ascii/non-ascii characters.

But for symmetric-only I can't imagine any useful use case.
For example, when you paste non-ascii characters into the search string,
and want to find corresponding ascii characters.  But why wouldn't you
want to find the other way around: pasting ascii characters
to find non-ascii counterparts?
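
The first two cases correspond to settings that already exist.  A minimal
sketch, assuming char-fold search is turned on through search-default-mode
(that choice is an assumption for illustration, not something stated in
this thread):

```elisp
(require 'char-fold)

;; Case 1 (default folding): typed ASCII also matches non-ASCII variants.
;; Making char-fold the default search mode is an assumption of this sketch.
(setq search-default-mode #'char-fold-to-regexp)

;; Case 2: additionally fold non-ASCII characters in the search string
;; to their ASCII counterparts.
(customize-set-variable 'char-fold-symmetric t)
```

The char-fold-symmetric-only option from case 3 is not shown, since it
is only a proposal here.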





Message received at 40216 <at> debbugs.gnu.org:


Received: (at 40216) by debbugs.gnu.org; 26 Mar 2020 15:43:20 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Thu Mar 26 11:43:20 2020
Received: from localhost ([127.0.0.1]:59525 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1jHUfA-0007SK-H8
	for submit <at> debbugs.gnu.org; Thu, 26 Mar 2020 11:43:20 -0400
Received: from mail-wm1-f53.google.com ([209.85.128.53]:39679)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <rpluim@HIDDEN>) id 1jHUf9-0007S1-3r
 for 40216 <at> debbugs.gnu.org; Thu, 26 Mar 2020 11:43:19 -0400
Received: by mail-wm1-f53.google.com with SMTP id a9so7515510wmj.4
 for <40216 <at> debbugs.gnu.org>; Thu, 26 Mar 2020 08:43:19 -0700 (PDT)
X-Received: by 2002:a05:600c:295e:: with SMTP id
 n30mr549425wmd.78.1585237392754; 
 Thu, 26 Mar 2020 08:43:12 -0700 (PDT)
Received: from rpluim-mac ([2a01:e34:ecfc:a860:bdc9:e98c:bf1f:4cce])
 by smtp.gmail.com with ESMTPSA id j5sm4050213wrr.47.2020.03.26.08.43.11
 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Thu, 26 Mar 2020 08:43:11 -0700 (PDT)
From: Robert Pluim <rpluim@HIDDEN>
To: Eli Zaretskii <eliz@HIDDEN>
Subject: Re: bug#40216: 28.0.50; Misinformation in isearch char-fold
References: <87y2rppddp.fsf@HIDDEN> <m2zhc4eql0.fsf@HIDDEN>
 <87imiskzc6.fsf@HIDDEN> <m2imirea7c.fsf@HIDDEN>
 <83d08z9os5.fsf@HIDDEN>
Date: Thu, 26 Mar 2020 16:43:10 +0100
In-Reply-To: <83d08z9os5.fsf@HIDDEN> (Eli Zaretskii's message of "Thu, 26 Mar
 2020 16:25:14 +0200")
Message-ID: <m21rpf86lt.fsf@HIDDEN>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-Spam-Score: 0.0 (/)
X-Debbugs-Envelope-To: 40216
Cc: 40216 <at> debbugs.gnu.org, juri@HIDDEN
X-Spam-Score: -1.0 (-)

>>>>> On Thu, 26 Mar 2020 16:25:14 +0200, Eli Zaretskii <eliz@HIDDEN> said:

    >> From: Robert Pluim <rpluim@HIDDEN>
    >> Date: Thu, 26 Mar 2020 10:28:55 +0100
    >> Cc: 40216 <at> debbugs.gnu.org
    >>
    >> Do we need an option to char-fold-regexp that says 'only apply
    >> char-folding to non-ascii characters'?

    Eli> But this feature is not intended only to find variants of non-ASCII
    Eli> characters when one searches for a non-ASCII, it is also intended to
    Eli> find variants when searching for ASCII characters.  For example,
    Eli> searching for a is supposed to find ä and à and á.  Or am I missing
    Eli> something?

Yes, thatʼs exactly right. But in the case where you have mainly
characters where you donʼt want case-folding, it might make sense to
restrict the folding to non-ascii as an optimisation. eg. Suppose my
name were Røbert, with people frequently misspelling it as Robert, I
might want isearch to just search for "R\\(?:ǿ\\|[øǿo]\\)bert"

Røbert





Message received at 40216 <at> debbugs.gnu.org:


Received: (at 40216) by debbugs.gnu.org; 26 Mar 2020 14:25:27 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Thu Mar 26 10:25:27 2020
Received: from localhost ([127.0.0.1]:59474 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1jHTRn-0005Lo-1l
	for submit <at> debbugs.gnu.org; Thu, 26 Mar 2020 10:25:27 -0400
Received: from eggs.gnu.org ([209.51.188.92]:53191)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <eliz@HIDDEN>) id 1jHTRl-0005Lb-Bv
 for 40216 <at> debbugs.gnu.org; Thu, 26 Mar 2020 10:25:25 -0400
Received: from fencepost.gnu.org ([2001:470:142:3::e]:35289)
 by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from <eliz@HIDDEN>)
 id 1jHTRf-0000qE-5j; Thu, 26 Mar 2020 10:25:19 -0400
Received: from [176.228.60.248] (port=1603 helo=home-c4e4a596f7)
 by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256)
 (Exim 4.82) (envelope-from <eliz@HIDDEN>)
 id 1jHTRc-0001wV-G2; Thu, 26 Mar 2020 10:25:17 -0400
Date: Thu, 26 Mar 2020 16:25:14 +0200
Message-Id: <83d08z9os5.fsf@HIDDEN>
From: Eli Zaretskii <eliz@HIDDEN>
To: Robert Pluim <rpluim@HIDDEN>
In-Reply-To: <m2imirea7c.fsf@HIDDEN> (message from Robert Pluim on Thu, 26
 Mar 2020 10:28:55 +0100)
Subject: Re: bug#40216: 28.0.50; Misinformation in isearch char-fold
References: <87y2rppddp.fsf@HIDDEN> <m2zhc4eql0.fsf@HIDDEN>
 <87imiskzc6.fsf@HIDDEN> <m2imirea7c.fsf@HIDDEN>
MIME-version: 1.0
Content-type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic]
X-Spam-Score: -0.7 (/)
X-Debbugs-Envelope-To: 40216
Cc: 40216 <at> debbugs.gnu.org, juri@HIDDEN
X-Spam-Score: -1.7 (-)

> From: Robert Pluim <rpluim@HIDDEN>
> Date: Thu, 26 Mar 2020 10:28:55 +0100
> Cc: 40216 <at> debbugs.gnu.org
> 
> Do we need an option to char-fold-regexp that says 'only apply
> char-folding to non-ascii characters'?

But this feature is not intended only to find variants of non-ASCII
characters when one searches for a non-ASCII, it is also intended to
find variants when searching for ASCII characters.  For example,
searching for a is supposed to find ä and à and á.  Or am I missing
something?





Message received at 40216 <at> debbugs.gnu.org:


Received: (at 40216) by debbugs.gnu.org; 26 Mar 2020 09:29:08 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Thu Mar 26 05:29:08 2020
Received: from localhost ([127.0.0.1]:58106 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1jHOp1-0003KJ-QB
	for submit <at> debbugs.gnu.org; Thu, 26 Mar 2020 05:29:07 -0400
Received: from mail-wr1-f44.google.com ([209.85.221.44]:38472)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <rpluim@HIDDEN>) id 1jHOoz-0003Jf-Au
 for 40216 <at> debbugs.gnu.org; Thu, 26 Mar 2020 05:29:05 -0400
Received: by mail-wr1-f44.google.com with SMTP id s1so6823744wrv.5
 for <40216 <at> debbugs.gnu.org>; Thu, 26 Mar 2020 02:29:05 -0700 (PDT)
X-Received: by 2002:adf:b35e:: with SMTP id k30mr8146831wrd.362.1585214939034; 
 Thu, 26 Mar 2020 02:28:59 -0700 (PDT)
Received: from rpluim-mac ([2a01:e34:ecfc:a860:5432:f8b5:445f:afcd])
 by smtp.gmail.com with ESMTPSA id l17sm2777495wrm.57.2020.03.26.02.28.57
 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Thu, 26 Mar 2020 02:28:58 -0700 (PDT)
From: Robert Pluim <rpluim@HIDDEN>
To: Juri Linkov <juri@HIDDEN>
Subject: Re: bug#40216: 28.0.50; Misinformation in isearch char-fold
References: <87y2rppddp.fsf@HIDDEN> <m2zhc4eql0.fsf@HIDDEN>
 <87imiskzc6.fsf@HIDDEN>
Date: Thu, 26 Mar 2020 10:28:55 +0100
In-Reply-To: <87imiskzc6.fsf@HIDDEN> (Juri Linkov's message of "Wed, 
 25 Mar 2020 22:29:29 +0200")
Message-ID: <m2imirea7c.fsf@HIDDEN>
MIME-Version: 1.0
Content-Type: text/plain
X-Spam-Score: 0.0 (/)
X-Debbugs-Envelope-To: 40216
Cc: 40216 <at> debbugs.gnu.org
X-Spam-Score: -1.0 (-)

>>>>> On Wed, 25 Mar 2020 22:29:29 +0200, Juri Linkov <juri@HIDDEN> said:

    >> Out of curiosity, what were you searching for that resulted in such a
    >> large regexp?

    Juri> Sometimes I pull a few lines (usually 1-3 lines, not more)
    Juri> from the buffer into the search string to confirm that the same lines
    Juri> exist in more places in the same buffer ignoring the differences defined
    Juri> by folding rules.  But after pulling 2 lines into the search string,
    Juri> the generated regexp becomes so long that the regexp search fails
    Juri> with the error "Regular expression too big".  Currently it silently
    Juri> switches to literal search without notification that it doesn't follow
    Juri> the folding rules anymore.  With the patch it informs about switching
    Juri> to literal search.

Ah, I hadn't considered that use case. Do we need an option to
char-fold-to-regexp that says 'only apply char-folding to non-ASCII
characters'? That would reduce the size of the regexp considerably.

Robert




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#40216; Package emacs.

Message received at 40216 <at> debbugs.gnu.org:


Received: (at 40216) by debbugs.gnu.org; 25 Mar 2020 20:59:37 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Wed Mar 25 16:59:37 2020
Received: from localhost ([127.0.0.1]:57841 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1jHD7h-0008A1-Mf
	for submit <at> debbugs.gnu.org; Wed, 25 Mar 2020 16:59:37 -0400
Received: from relay8-d.mail.gandi.net ([217.70.183.201]:39971)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <juri@HIDDEN>) id 1jHD7e-00089P-1v
 for 40216 <at> debbugs.gnu.org; Wed, 25 Mar 2020 16:59:34 -0400
X-Originating-IP: 91.129.96.173
Received: from mail.gandi.net (m91-129-96-173.cust.tele2.ee [91.129.96.173])
 (Authenticated sender: juri@HIDDEN)
 by relay8-d.mail.gandi.net (Postfix) with ESMTPSA id 111621BF203;
 Wed, 25 Mar 2020 20:59:26 +0000 (UTC)
From: Juri Linkov <juri@HIDDEN>
To: Robert Pluim <rpluim@HIDDEN>
Subject: Re: bug#40216: 28.0.50; Misinformation in isearch char-fold
Organization: LINKOV.NET
References: <87y2rppddp.fsf@HIDDEN> <m2zhc4eql0.fsf@HIDDEN>
Date: Wed, 25 Mar 2020 22:29:29 +0200
In-Reply-To: <m2zhc4eql0.fsf@HIDDEN> (Robert Pluim's message of "Wed, 25
 Mar 2020 10:22:51 +0100")
Message-ID: <87imiskzc6.fsf@HIDDEN>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (x86_64-pc-linux-gnu)
MIME-Version: 1.0
Content-Type: text/plain
X-Spam-Score: -0.7 (/)
X-Debbugs-Envelope-To: 40216
Cc: 40216 <at> debbugs.gnu.org

> Out of curiosity, what were you searching for that resulted in such a
> large regexp?

Sometimes I pull a few lines (usually 1-3 lines, not more)
from the buffer into the search string to confirm that the same lines
exist in more places in the same buffer ignoring the differences defined
by folding rules.  But after pulling 2 lines into the search string,
the generated regexp becomes so long that the regexp search fails
with the error "Regular expression too big".  Currently it silently
switches to literal search without notification that it doesn't follow
the folding rules anymore.  With the patch it informs about switching
to literal search.




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#40216; Package emacs.

Message received at 40216 <at> debbugs.gnu.org:


Received: (at 40216) by debbugs.gnu.org; 25 Mar 2020 09:23:01 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Wed Mar 25 05:23:01 2020
Received: from localhost ([127.0.0.1]:56201 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1jH2FZ-0001kx-CK
	for submit <at> debbugs.gnu.org; Wed, 25 Mar 2020 05:23:01 -0400
Received: from mail-wr1-f44.google.com ([209.85.221.44]:38493)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <rpluim@HIDDEN>) id 1jH2FX-0001kd-C4
 for 40216 <at> debbugs.gnu.org; Wed, 25 Mar 2020 05:22:59 -0400
Received: by mail-wr1-f44.google.com with SMTP id s1so1960995wrv.5
 for <40216 <at> debbugs.gnu.org>; Wed, 25 Mar 2020 02:22:59 -0700 (PDT)
X-Received: by 2002:adf:9022:: with SMTP id h31mr2291680wrh.223.1585128173159; 
 Wed, 25 Mar 2020 02:22:53 -0700 (PDT)
Received: from rpluim-mac ([2a01:e34:ecfc:a860:5432:f8b5:445f:afcd])
 by smtp.gmail.com with ESMTPSA id n186sm8309074wme.25.2020.03.25.02.22.52
 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Wed, 25 Mar 2020 02:22:52 -0700 (PDT)
From: Robert Pluim <rpluim@HIDDEN>
To: Juri Linkov <juri@HIDDEN>
Subject: Re: bug#40216: 28.0.50; Misinformation in isearch char-fold
References: <87y2rppddp.fsf@HIDDEN>
Date: Wed, 25 Mar 2020 10:22:51 +0100
In-Reply-To: <87y2rppddp.fsf@HIDDEN> (Juri Linkov's message of "Wed, 
 25 Mar 2020 01:00:18 +0200")
Message-ID: <m2zhc4eql0.fsf@HIDDEN>
MIME-Version: 1.0
Content-Type: text/plain
X-Spam-Score: 0.0 (/)
X-Debbugs-Envelope-To: 40216
Cc: 40216 <at> debbugs.gnu.org

>>>>> On Wed, 25 Mar 2020 01:00:18 +0200, Juri Linkov <juri@HIDDEN> said:

    Juri> Tags: patch
    Juri> When the size of the generated regexp in char-fold isearch mode reaches
    Juri> a certain limit, it silently falls back to literal search without notifying
    Juri> the user about this fact.  Thus uninformed users might miss some search hits.

Out of curiosity, what were you searching for that resulted in such a
large regexp?

Robert




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#40216; Package emacs.

Message received at submit <at> debbugs.gnu.org:


Received: (at submit) by debbugs.gnu.org; 24 Mar 2020 23:03:34 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Tue Mar 24 19:03:34 2020
Received: from localhost ([127.0.0.1]:55734 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1jGsa6-00005s-Jc
	for submit <at> debbugs.gnu.org; Tue, 24 Mar 2020 19:03:34 -0400
Received: from lists.gnu.org ([209.51.188.17]:39340)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <juri@HIDDEN>) id 1jGsa2-00005U-33
 for submit <at> debbugs.gnu.org; Tue, 24 Mar 2020 19:03:30 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10]:48340)
 by lists.gnu.org with esmtp (Exim 4.90_1)
 (envelope-from <juri@HIDDEN>) id 1jGsZz-0000ct-Ub
 for bug-gnu-emacs@HIDDEN; Tue, 24 Mar 2020 19:03:28 -0400
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org
X-Spam-Level: 
X-Spam-Status: No, score=0.1 required=5.0 tests=BAYES_50,RCVD_IN_DNSWL_LOW
 autolearn=disabled version=3.3.2
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from <juri@HIDDEN>) id 1jGsZy-0006Sb-DS
 for bug-gnu-emacs@HIDDEN; Tue, 24 Mar 2020 19:03:27 -0400
Received: from relay9-d.mail.gandi.net ([217.70.183.199]:38243)
 by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32)
 (Exim 4.71) (envelope-from <juri@HIDDEN>) id 1jGsZy-0006R1-6F
 for bug-gnu-emacs@HIDDEN; Tue, 24 Mar 2020 19:03:26 -0400
X-Originating-IP: 91.129.96.173
Received: from mail.gandi.net (m91-129-96-173.cust.tele2.ee [91.129.96.173])
 (Authenticated sender: juri@HIDDEN)
 by relay9-d.mail.gandi.net (Postfix) with ESMTPSA id C491EFF802;
 Tue, 24 Mar 2020 23:03:21 +0000 (UTC)
From: Juri Linkov <juri@HIDDEN>
To: bug-gnu-emacs@HIDDEN
Subject: 28.0.50; Misinformation in isearch char-fold
Organization: LINKOV.NET
Date: Wed, 25 Mar 2020 01:00:18 +0200
Message-ID: <87y2rppddp.fsf@HIDDEN>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (x86_64-pc-linux-gnu)
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic]
 [fuzzy]
X-Received-From: 217.70.183.199
X-Spam-Score: -0.0 (/)
X-Debbugs-Envelope-To: submit

--=-=-=
Content-Type: text/plain

Tags: patch

When the size of the generated regexp in char-fold isearch mode reaches
a certain limit, it silently falls back to literal search without notifying
the user about this fact.  Thus uninformed users might miss some search hits.

Here is a patch that removes the arbitrary size limit in
char-fold-to-regexp, which made it return a quoted string.  Instead,
when the regexp search fails, isearch explicitly switches to literal
search mode, tries to find the next occurrence in literal mode, and
displays a message about the switch for 2 seconds:


--=-=-=
Content-Type: text/x-diff
Content-Disposition: inline; filename=isearch-literal-char-fold.patch

diff --git a/lisp/char-fold.el b/lisp/char-fold.el
index f8a303956e..34561a2efe 100644
--- a/lisp/char-fold.el
+++ b/lisp/char-fold.el
@@ -370,11 +377,7 @@ char-fold-to-regexp
       (setq i (1+ i)))
     (when (> spaces 0)
       (push (char-fold--make-space-string spaces) out))
-    (let ((regexp (apply #'concat (nreverse out))))
-      ;; Limited by `MAX_BUF_SIZE' in `regex-emacs.c'.
-      (if (> (length regexp) 5000)
-          (regexp-quote string)
-        regexp))))
+    (apply #'concat (nreverse out))))
 
 
 ;;; Commands provided for completeness.
diff --git a/lisp/isearch.el b/lisp/isearch.el
index ddf9190dc6..7625ec12b5 100644
--- a/lisp/isearch.el
+++ b/lisp/isearch.el
@@ -2011,15 +2011,16 @@ regexp
 (defvar isearch-message-properties minibuffer-prompt-properties
   "Text properties that are added to the isearch prompt.")
 
-(defun isearch--momentary-message (string)
-  "Print STRING at the end of the isearch prompt for 1 second."
+(defun isearch--momentary-message (string &optional seconds)
+  "Print STRING at the end of the isearch prompt for 1 second.
+The optional argument SECONDS overrides the number of seconds."
   (let ((message-log-max nil))
     (message "%s%s%s"
              (isearch-message-prefix nil isearch-nonincremental)
              isearch-message
              (apply #'propertize (format " [%s]" string)
                     isearch-message-properties)))
-  (sit-for 1))
+  (sit-for (or seconds 1)))
 
 (isearch-define-mode-toggle lax-whitespace " " nil
   "In ordinary search, toggles the value of the variable
@@ -3443,7 +3444,10 @@ isearch-search
 	    (string-match "\\`Regular expression too big" isearch-error))
        (cond
 	(isearch-regexp-function
-	 (setq isearch-error "Too many words"))
+         (setq isearch-error nil)
+         (setq isearch-regexp-function nil)
+         (isearch-search-and-update)
+         (isearch--momentary-message "Too many words; switched to literal mode" 2))
 	((and isearch-lax-whitespace search-whitespace-regexp)
 	 (setq isearch-error "Too many spaces for whitespace matching"))))))
 

--=-=-=--
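
For readers who want to try the idea outside isearch, here is a minimal
sketch of the same fallback.  It is not part of the patch above.  The
function name `my-char-fold-search-forward' is hypothetical, and the
sketch assumes that an oversized regexp signals `invalid-regexp' (the
"Regular expression too big" error quoted in this report):

```elisp
(require 'char-fold)

;; Hypothetical helper, not part of Emacs or of the patch in this report.
(defun my-char-fold-search-forward (string)
  "Search forward for STRING using char folding.
If the folded regexp is rejected by the regexp compiler (for
example because it is too big), tell the user and fall back to a
literal search.  Return the position of the end of the match, or
nil if nothing was found."
  (condition-case err
      (re-search-forward (char-fold-to-regexp string) nil t)
    (invalid-regexp
     ;; Make the switch visible instead of changing behavior silently.
     (message "%s; switched to literal search"
              (error-message-string err))
     (search-forward string nil t))))
```

Unlike the patch, this sketch does not touch isearch state; it only
shows the pattern of catching the error, notifying the user, and
retrying literally.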




Acknowledgement sent to Juri Linkov <juri@HIDDEN>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs@HIDDEN.
Report forwarded to bug-gnu-emacs@HIDDEN:
bug#40216; Package emacs.
Last modified: Mon, 30 Mar 2020 02:45:01 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.