GNU bug report logs - #31817
Possible bug in regex search

Please note: This is a static page, with minimal formatting, updated once a day.
Click here to see this page with the latest information and nicer formatting.

Package: emacs; Reported by: Michele Pes <mp81ss@HIDDEN>; merged with #6640, #20230, #34823; dated Wed, 13 Jun 2018 17:54:03 UTC; Maintainer for emacs is bug-gnu-emacs@HIDDEN.
Merged 6640 20230 31817 34823. Request was from Noam Postavsky <npostavs@HIDDEN> to control <at> debbugs.gnu.org. Full text available.
Merged 6640 20230 31817. Request was from Noam Postavsky <npostavs@HIDDEN> to control <at> debbugs.gnu.org. Full text available.

Message received at 31817 <at> debbugs.gnu.org:


Received: (at 31817) by debbugs.gnu.org; 14 Jun 2018 00:38:01 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Wed Jun 13 20:38:01 2018
Received: from localhost ([127.0.0.1]:47082 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1fTGH3-0005Wh-B8
	for submit <at> debbugs.gnu.org; Wed, 13 Jun 2018 20:38:01 -0400
Received: from mail-io0-f173.google.com ([209.85.223.173]:39048)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <npostavs@HIDDEN>) id 1fTGH1-0005WV-K7
 for 31817 <at> debbugs.gnu.org; Wed, 13 Jun 2018 20:38:00 -0400
Received: by mail-io0-f173.google.com with SMTP id f1-v6so5403815ioh.6
 for <31817 <at> debbugs.gnu.org>; Wed, 13 Jun 2018 17:37:59 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=from:to:cc:subject:references:date:in-reply-to:message-id
 :user-agent:mime-version;
 bh=63IHoFuqT3BjLn1h99n8o01eLdjeZMcdedbJEthLe1Q=;
 b=DtzoSL5UqOwEUt1KSHgiZiTdIOilWqKKmY/R5A9ZWm92HlXW6BDAHYpIsLQAeTNQFl
 UeluJPZHDb5qK5EEF3fIqohmDwOPQCbhbdLREG6D8AX3ZE30+kov1y2DfOToGiAmXi7f
 SReALHUm7W/3Zfz/1PRq2OdkaKNHgDi7pK4y37GVEM4+C8wZmnbaCoaJ3O2/ov61fyMf
 l0hrHafO/aOXk5ZkRofTFxSrexji7ytZ6snKhOSGtszfKrp2HDIjmgbWWcUHv33PIu4r
 6IYUYIq189GFwG5+y/G+5lyJg4TgkL3eHsrX8FkoZnotbXrsblGgwG+8dIq1i9yVsyNP
 TXtQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:from:to:cc:subject:references:date:in-reply-to
 :message-id:user-agent:mime-version;
 bh=63IHoFuqT3BjLn1h99n8o01eLdjeZMcdedbJEthLe1Q=;
 b=mNrr1q11SF/jIcAfpdpoac/oDdqbqC8fxUWQfOEDuwV/u0FXzdmv8TXwwlkXJEQxcf
 mJpykuBM0Ft9bO8QhihF6FUJkEtfmhTP4mI3XmTAGGve/xKdrQieyjTnnlvywJVcvoq4
 Bc7GvFfStzMLdT7n1hto+x9A1HMap4kspZjki9JsRFvalrm9rgsiMSbM3Jx26jJ2jxT5
 0aGekbWj4jXdWVQwKhyWRoVmhkxM7FbpUrtJoBTiba+Yz7rBzXQbIACaa9UG6AyMxfO0
 jXIBV73Z7v8hyOrN4ONyjRHv5gCqCmA39nly64vPX0zSQlFfmlGpg/3DwK3adBfg5GaA
 FTAg==
X-Gm-Message-State: APt69E1qvov2csmaV6IVR+Wo5L9UiJtUFKF91KJ/DVkrAiQGiYsQDlsZ
 PFehzrSIY6BoRlDBGb1HKc3Sqw==
X-Google-Smtp-Source: ADUXVKLlf0LOhJZZqAAH/aR2shRU4C44aQMDuP0Mbwjmr9/oNdSQJDaC6DiotViVbh+ZphNHNSGKmA==
X-Received: by 2002:a6b:2286:: with SMTP id
 i128-v6mr349940ioi.289.1528936673972; 
 Wed, 13 Jun 2018 17:37:53 -0700 (PDT)
Received: from zebian (cbl-45-2-119-34.yyz.frontiernetworks.ca. [45.2.119.34])
 by smtp.googlemail.com with ESMTPSA id
 e6-v6sm1411324iog.7.2018.06.13.17.37.52
 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256);
 Wed, 13 Jun 2018 17:37:53 -0700 (PDT)
From: Noam Postavsky <npostavs@HIDDEN>
To: "Michele Pes" <mp81ss@HIDDEN>
Subject: Re: bug#31817: Possible bug in regex search
References: <1528899604.595409.22564.45300@HIDDEN>
 <1528899618.415948.22587.31975@HIDDEN>
Date: Wed, 13 Jun 2018 20:37:51 -0400
In-Reply-To: <1528899618.415948.22587.31975@HIDDEN> (Michele Pes's
 message of "Wed, 13 Jun 2018 17:20:18 +0300")
Message-ID: <87sh5qw6cg.fsf@HIDDEN>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-Spam-Score: 0.0 (/)
X-Debbugs-Envelope-To: 31817
Cc: 31817 <at> debbugs.gnu.org
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -1.0 (-)

"Michele Pes" <mp81ss@HIDDEN> writes:

> Hi,I'm working with regex, and I isolated an unexpected behaviour.If I paste
> attached code code in scratch buffer and evaluate it, emacs hangs (cpu stays at
> 100% until I kill emacs)

This is because the current regexp engine is implemented with
backtracking, so certain patterns trigger an exponential amount of
searching.

> (let (
>       (regex
>        (concat "[ \f\t\n\r\v]" "*\\(" "_*[[:alpha:]]+[A-Z0-9a-z_]*"

So here, [[:alpha:]]+ and [A-Z0-9a-z_]* have a lot of overlap, so when
matching against "void", for example, the matcher could match
[[:alpha:]]+ with "v", "vo", "voi", or "void".  And for each of these it
goes and matches the rest of the pattern (with additional backtracking)
all of which takes a long time.

It's better to use non-overlapping matches, like

(let ((regex
       (concat "[ \f\t\n\r\v]*"
               "\\(" "[_[:alpha:]][A-Z0-9a-z_]*" "[ \f\t\n\r\v\\*]*" "\\)+"
               "(" "[ \f\t\n\r\v]*" "\\*?" "[ \f\t\n\r\v]*"
               "\\(" "[_[:alpha:]][A-Z0-9a-z_]*" "\\)"
               "[ \f\t\n\r\v]*" "[()]"))
      (x "void Vsdk_Init(void) {     /* Open watchdog iwdt driver, watchdog is running now */     iwdt.p_api->open"))
  (if (and (string-match "(" x)
           (string-match regex x))
      (match-string 2 x)
    (message "did not match")))

Also note, if you meant the regex to match starting from the beginning
of the string, you should prefix it with "\\`", so that the string-match
won't try other positions.

Further reading:

https://en.wikipedia.org/wiki/ReDoS
https://swtch.com/~rsc/regexp/regexp1.html




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#31817; Package emacs. Full text available.

Message received at submit <at> debbugs.gnu.org:


Received: (at submit) by debbugs.gnu.org; 13 Jun 2018 17:53:31 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Wed Jun 13 13:53:31 2018
Received: from localhost ([127.0.0.1]:46881 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1fT9xa-0000fs-QZ
	for submit <at> debbugs.gnu.org; Wed, 13 Jun 2018 13:53:31 -0400
Received: from eggs.gnu.org ([208.118.235.92]:54457)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <mp81ss@HIDDEN>) id 1fT6eo-0006lk-UJ
 for submit <at> debbugs.gnu.org; Wed, 13 Jun 2018 10:21:56 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from <mp81ss@HIDDEN>) id 1fT6ei-0000n7-M6
 for submit <at> debbugs.gnu.org; Wed, 13 Jun 2018 10:21:49 -0400
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org
X-Spam-Level: 
X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50,FREEMAIL_FROM,
 HTML_MESSAGE,T_DKIM_INVALID autolearn=disabled version=3.3.2
Received: from lists.gnu.org ([2001:4830:134:3::11]:41606)
 by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32)
 (Exim 4.71) (envelope-from <mp81ss@HIDDEN>) id 1fT6ei-0000ms-HZ
 for submit <at> debbugs.gnu.org; Wed, 13 Jun 2018 10:21:48 -0400
Received: from eggs.gnu.org ([2001:4830:134:3::10]:45423)
 by lists.gnu.org with esmtp (Exim 4.71)
 (envelope-from <mp81ss@HIDDEN>) id 1fT6eh-0003Gt-6x
 for bug-gnu-emacs@HIDDEN; Wed, 13 Jun 2018 10:21:48 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from <mp81ss@HIDDEN>) id 1fT6ec-0000Z2-VR
 for bug-gnu-emacs@HIDDEN; Wed, 13 Jun 2018 10:21:47 -0400
Received: from mxout3.rambler.ru ([81.19.78.102]:9329)
 by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32)
 (Exim 4.71) (envelope-from <mp81ss@HIDDEN>) id 1fT6ec-0005Kl-CW
 for bug-gnu-emacs@HIDDEN; Wed, 13 Jun 2018 10:21:42 -0400
Received: from saddam1.rambler.ru (saddam1.rambler.ru [10.32.16.1])
 by mxout3.rambler.ru (Postfix) with ESMTP id D48D37C0244
 for <bug-gnu-emacs@HIDDEN>; Wed, 13 Jun 2018 17:20:18 +0300 (MSK)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rambler.ru; s=mail;
 t=1528899619; bh=5A/q9hX/lZk1HGitxVcDAJGATxMFtTOjuqF8e0+xhr8=;
 h=From:To:Reply-To:Subject:Date:In-Reply-To:References;
 b=a5e94aHuOFnLb07nT2FTBDHcfGrdrYf/zsl4yA024Gf7VFjuJ/O+DZMNiqjmU4EA3
 2xuCBNB6vQneygUBvvxRw2W1Xv0OQrugMB4cYg6i7NU7c/vk8sg+5UUm6/LYpfGEQf
 yHcWyFQHiLJ6pU7IAoCJLPr5/LSBmcnnPo5IwR2E=
Received: from localhost.localdomain (localhost [127.0.0.1])
 by saddam1.rambler.ru (Postfix) with ESMTP id C503D253B02
 for <bug-gnu-emacs@HIDDEN>; Wed, 13 Jun 2018 17:20:18 +0300 (MSK)
Received: from [79.8.93.92] by mail.rambler.ru with HTTP;
 Wed, 13 Jun 2018 17:20:18 +0300
From: "Michele Pes" <mp81ss@HIDDEN>
To: bug-gnu-emacs@HIDDEN
Subject: Possible bug in regex search
Date: Wed, 13 Jun 2018 17:20:18 +0300
Content-Transfer-Encoding: 7bit
Content-Type: multipart/mixed; boundary="_----------=_15288996182258762"
In-Reply-To: <1528899604.595409.22564.45300@HIDDEN>
Message-Id: <1528899618.415948.22587.31975@HIDDEN>
MIME-Version: 1.0
References: <1528899604.595409.22564.45300@HIDDEN>
X-Mailer: Rambler WebMail, http://mail.rambler.ru/
X-Rambler-User: mp81ss@HIDDEN/79.8.93.92
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic]
 [fuzzy]
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x
X-Received-From: 2001:4830:134:3::11
X-Spam-Score: -4.0 (----)
X-Debbugs-Envelope-To: submit
X-Mailman-Approved-At: Wed, 13 Jun 2018 13:53:29 -0400
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Reply-To: Michele Pes <mp81ss@HIDDEN>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -5.0 (-----)


This is a multi-part message in MIME format.

--_----------=_15288996182258762
Content-Type: multipart/alternative; boundary="_----------=_15288996182258763"

This is a multi-part message in MIME format.

--_----------=_15288996182258763
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"; format="flowed"

Hi,I'm working with regex, and I isolated an unexpected behaviour.If I paste
attached code code in scratch buffer and evaluate it, emacs hangs (cpu stay=
s at
100% until I kill emacs)Im running emacs for windows x64, in particular this
package:
https://sourceforge.net/projects/emacsbinw64/files/release/emacs-w64-25.3-O=
2-with-modules.7z/download
The variable case-fold-search is set to t.
Can you help me?
Thank you very much,
Michele Pes

--_----------=_15288996182258763
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable
Content-Type: text/html; charset="utf-8"

<div>Hi,</div><div>I'm working with regex, and I isolated an unexpected beh=
aviour.</div><div>If I paste attached code code in scratch buffer and evalu=
ate it, emacs hangs (cpu stays at 100% until I kill emacs)</div><div>Im run=
ning emacs for windows x64, in particular this package:</div><div><br data-=
mce-bogus=3D"1"></div><div>https://sourceforge.net/projects/emacsbinw64/fil=
es/release/emacs-w64-25.3-O2-with-modules.7z/download</div><div><br data-mc=
e-bogus=3D"1"></div><div>The variable&nbsp;case-fold-search is set to t.</d=
iv><div><br data-mce-bogus=3D"1"></div><div>Can you help me?</div><div><br>=
</div><div>Thank you very much,</div><div><br data-mce-bogus=3D"1"></div><d=
iv>Michele Pes</div><div><br></div>=

--_----------=_15288996182258763--

--_----------=_15288996182258762
Content-Disposition: attachment; filename="bug.el"
Content-Transfer-Encoding: base64
Content-Type: application/octet-stream; name="bug.el"

KGxldCAoDQogICAgICAocmVnZXgNCiAgICAgICAoY29uY2F0ICJbIFxmXHRc
blxyXHZdIiAiKlxcKCIgIl8qW1s6YWxwaGE6XV0rW0EtWjAtOWEtel9dKiIN
CiAgICAgICAgICAgICAgICJbIFxmXHRcblxyXHZcXCpdIg0KICAgICAgICAg
ICAgICAgIipcXCkrKCIgIlsgXGZcdFxuXHJcdl0iDQogICAgICAgICAgICAg
ICAiKlxcKiIgIlsgXGZcdFxuXHJcdl0iDQogICAgICAgICAgICAgICAiKlxc
KCIgIl8qW1s6YWxwaGE6XV0rW0EtWjAtOWEtel9dKiIgIlxcKSINCiAgICAg
ICAgICAgICAgICJbIFxmXHRcblxyXHZdIiAiKlsoKV0iKSkNCg0KICAgICAg
KHggInZvaWQgVnNka19Jbml0KHZvaWQpIHsgICAgIC8qIE9wZW4gd2F0Y2hk
b2cgaXdkdCBkcml2ZXIsIHdhdGNoZG9nIGlzIHJ1bm5pbmcgbm93ICovICAg
ICBpd2R0LnBfYXBpLT5vcGVuIikpDQogIChpZiAoc3RyaW5nLW1hdGNoICIo
IiB4KQ0KICAgICAgKHByb2cyDQogICAgICAgICAgKHN0cmluZy1tYXRjaCBy
ZWdleCB4KQ0KICAgICAgICAgIChtYXRjaC1zdHJpbmcgMiB4KSkNCiAgICAo
bWVzc2FnZSAiZGlkIG5vdCBtYXRjaCIpKSkNCg==

--_----------=_15288996182258762--




Acknowledgement sent to Michele Pes <mp81ss@HIDDEN>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs@HIDDEN. Full text available.
Report forwarded to bug-gnu-emacs@HIDDEN:
bug#31817; Package emacs. Full text available.
Please note: This is a static page, with minimal formatting, updated once a day.
Click here to see this page with the latest information and nicer formatting.
Last modified: Mon, 25 Nov 2019 12:00:02 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.