GNU bug report logs - #19873
Ill-formed regular expression is constructed in forward-paragraph.

Please note: This is a static page, with minimal formatting, updated once a day.

Package: emacs; Reported by: Alan Mackenzie <acm@HIDDEN>; dated Sun, 15 Feb 2015 10:39:01 UTC; Maintainer for emacs is bug-gnu-emacs@HIDDEN.

Message received at 19873 <at> debbugs.gnu.org:


Date: Sun, 26 Feb 2017 18:57:05 +0200
Message-Id: <837f4criu6.fsf@HIDDEN>
From: Eli Zaretskii <eliz@HIDDEN>
To: Marcin Borkowski <mbork@HIDDEN>
In-reply-to: <87o9xodhq4.fsf@jane> (message from Marcin Borkowski on Sun, 26
 Feb 2017 17:44:51 +0100)
Subject: Re: bug#19873: Ill-formed regular expression is constructed in
 forward-paragraph.
References: <20150215103122.GA3282@HIDDEN> <87o9xodhq4.fsf@jane>
Cc: acm@HIDDEN, 19873 <at> debbugs.gnu.org

> From: Marcin Borkowski <mbork@HIDDEN>
> Date: Sun, 26 Feb 2017 17:44:51 +0100
> Cc: 19873 <at> debbugs.gnu.org
> 
> First of all, my Emacs has this as paragraph-start:
> 
> "\\|[ 	]*$"
> 
> and this as paragraph-separate:
> 
> "[ 	]*$"
> 
> and frankly speaking, I'm not sure why they differ at all (by default).

I believe this is explained in the Emacs manual, node "Paragraphs".
In a nutshell, these two regexps have different purposes.
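
As a rough illustration of that difference (a sketch only, not code from
Emacs itself): the manual describes paragraph-separate as matching lines
that separate paragraphs without belonging to any of them, and
paragraph-start as matching lines that either start or separate a
paragraph.  The helper name my-line-matches-p below is hypothetical, and
the two regexps are assumed to be the documented defaults.

```elisp
;; Hypothetical helper: test REGEXP against the start of a single line,
;; given as a string.  "\\`" anchors the match at the start of the string.
(defun my-line-matches-p (regexp line)
  "Return non-nil if REGEXP matches at the start of the string LINE."
  (string-match-p (concat "\\`\\(?:" regexp "\\)") line))

;; Assumed default values of the two variables.
(defvar my-default-paragraph-start "\f\\|[ \t]*$")
(defvar my-default-paragraph-separate "[ \t\f]*$")

;; A form feed followed by text starts a paragraph but does not separate:
(my-line-matches-p my-default-paragraph-start "\fChapter")    ; non-nil
(my-line-matches-p my-default-paragraph-separate "\fChapter") ; nil

;; A whitespace-only line matches both, so it separates paragraphs:
(my-line-matches-p my-default-paragraph-start "  \t")         ; non-nil
(my-line-matches-p my-default-paragraph-separate "  \t")      ; non-nil
```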




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#19873; Package emacs. Full text available.

Message received at 19873 <at> debbugs.gnu.org:


From: Marcin Borkowski <mbork@HIDDEN>
To: Alan Mackenzie <acm@HIDDEN>
Subject: Re: bug#19873: Ill-formed regular expression is constructed in
 forward-paragraph.
References: <20150215103122.GA3282@HIDDEN>
Date: Sun, 26 Feb 2017 17:44:51 +0100
In-Reply-To: <20150215103122.GA3282@HIDDEN> (Alan Mackenzie's message
 of "Sun, 15 Feb 2015 10:31:22 +0000")
Message-ID: <87o9xodhq4.fsf@jane>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.0.50 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
Cc: 19873 <at> debbugs.gnu.org

On 2015-02-15, at 10:31, Alan Mackenzie <acm@HIDDEN> wrote:

> Hello, Emacs!
>
> In forward-paragraph, L37, a regular expression is constructed as
> follows:
>
> (let* ...
>  (sp-parstart (concat "^[ \t]*\\(?:" parstart "\\|" parsep "\\)"))
>  ...)
>
> Here parstart and parsep are, more or less,
> paragraph-{start,separate}.
>
> The problem is that parstart and parsep themselves are likely to begin
> with "[ \t]*" (the default values certainly do), so we have two
> consecutive matchers for an arbitrary amount of whitespace.  This causes
> the regexp engine to run very slowly when a line starts with lots of WS
> but doesn't match.
>
> This problem seems to be the cause of bug # 19846 (where holding down the
> spacebar inside a C comment causes Emacs to seize up when auto-fill mode
> is enabled).

Hi Alan, hi all,

I put this bug on my todo-list some time ago and decided now to revisit
it.

I'm wondering what could be done about it.  First of all, my Emacs has
this as paragraph-start:

"\\|[ 	]*$"

and this as paragraph-separate:

"[ 	]*$"

and frankly speaking, I'm not sure why they differ at all (by default).
Also, even though forward-paragraph checks for "^" at their beginning,
they actually don't begin with that character (again, by default).

My first thought is to add a check whether paragraph-start and
paragraph-separate match something like

"^\\^?\\[[[:space:]]+\\][+*]?"

and if yes, make parstart/parsep equal to them, but without the matching
part.
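
A minimal sketch of what such a check might look like, under a few
assumptions: the function name my-strip-leading-ws-matcher is
hypothetical, "\\`" is used instead of "^" so that the test is anchored
at the start of the regexp string, and the bracket contents are spelled
out as space, tab and form feed rather than [:space:] so that the result
does not depend on the current syntax table.

```elisp
;; Hypothetical helper sketching the proposal: drop a leading bracketed
;; whitespace matcher such as "[ \t]*" (optionally preceded by "^").
(defun my-strip-leading-ws-matcher (regexp)
  "Return REGEXP without a leading bracketed whitespace matcher, if any.
If REGEXP does not begin with such a matcher, return it unchanged."
  (if (string-match "\\`\\^?\\[[ \t\f]+\\][+*]?" regexp)
      (substring regexp (match-end 0))
    regexp))

(my-strip-leading-ws-matcher "[ \t]*$")       ; returns "$"
(my-strip-leading-ws-matcher "\f\\|[ \t]*$")  ; returned unchanged
```

Note that stripping only preserves the meaning of the combined regexp
when the removed part is a "*" matcher over characters that the
"^[ \t]*" prefix already covers.  Removing "[ \t]+", a bare "[ \t]", or
a class that also contains form feed changes what the result matches, so
a real fix would have to be more careful than this sketch.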

WDYT?

-- 
Marcin Borkowski
http://octd.wmi.amu.edu.pl/en/Marcin_Borkowski
Faculty of Mathematics and Computer Science
Adam Mickiewicz University




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#19873; Package emacs. Full text available.

Message received at submit <at> debbugs.gnu.org:


Date: Sun, 15 Feb 2015 10:31:22 +0000
To: bug-gnu-emacs@HIDDEN
Subject: Ill-formed regular expression is constructed in forward-paragraph.
Message-ID: <20150215103122.GA3282@HIDDEN>
From: Alan Mackenzie <acm@HIDDEN>

Hello, Emacs!

In forward-paragraph, L37, a regular expression is constructed as
follows:

(let* ...
 (sp-parstart (concat "^[ \t]*\\(?:" parstart "\\|" parsep "\\)"))
 ...)

Here parstart and parsep are, more or less,
paragraph-{start,separate}.

The problem is that parstart and parsep themselves are likely to begin
with "[ \t]*" (the default values certainly do), so we have two
consecutive matchers for an arbitrary amount of whitespace.  This causes
the regexp engine to run very slowly when a line starts with lots of WS
but doesn't match.
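
The effect can be probed with a small sketch like the following.  It is
an approximation, not the code in forward-paragraph: the names prefixed
with my- are hypothetical, and parstart and parsep are assumed to be the
documented default values of paragraph-start and paragraph-separate,
without the extra processing forward-paragraph applies to them.

```elisp
(require 'benchmark)

;; Assumed inputs: the documented defaults of the two variables.
(defvar my-default-parstart "\f\\|[ \t]*$")
(defvar my-default-parsep "[ \t\f]*$")

(defun my-sp-parstart (parstart parsep)
  "Build the regexp as in the `concat' form quoted above."
  (concat "^[ \t]*\\(?:" parstart "\\|" parsep "\\)"))

(defun my-failing-match-time (n)
  "Return the seconds spent on one failing match.
The subject is a line of N spaces followed by \"x\", which the
constructed regexp cannot match."
  (let ((re (my-sp-parstart my-default-parstart my-default-parsep))
        (line (concat (make-string n ?\s) "x")))
    (car (benchmark-run 1 (string-match re line)))))
```

Evaluating (my-failing-match-time 500) and (my-failing-match-time 2000)
and comparing the two results gives a rough idea of how the cost grows
with the amount of leading whitespace.  If the analysis above is right,
the time should grow faster than linearly in N.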

This problem seems to be the cause of bug # 19846 (where holding down the
spacebar inside a C comment causes Emacs to seize up when auto-fill mode
is enabled).

-- 
Alan Mackenzie (Nuremberg, Germany).




Acknowledgement sent to Alan Mackenzie <acm@HIDDEN>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs@HIDDEN. Full text available.
Report forwarded to bug-gnu-emacs@HIDDEN:
bug#19873; Package emacs. Full text available.
Last modified: Sun, 26 Feb 2017 17:00:02 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.