GNU bug report logs - #20307
25.0.50; (regexp-opt nil ...) returns ""


Package: emacs

Reported by: David Kastrup <dak <at> gnu.org>

Date: Sun, 12 Apr 2015 09:51:01 UTC

Severity: wishlist

Found in version 25.0.50

Done: Mattias Engdegård <mattiase <at> acm.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 20307 in the body.
You can then email your comments to 20307 AT debbugs.gnu.org in the normal way.



Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#20307; Package emacs. (Sun, 12 Apr 2015 09:51:02 GMT)

Acknowledgement sent to David Kastrup <dak <at> gnu.org>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Sun, 12 Apr 2015 09:51:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: David Kastrup <dak <at> gnu.org>
To: bug-gnu-emacs <at> gnu.org
Subject: 25.0.50; (regexp-opt nil ...) returns ""
Date: Sun, 12 Apr 2015 11:50:09 +0200
Both

M-: (regexp-opt nil) RET

and

M-: (regexp-opt nil t) RET

return "".  However, they should return a regexp matching _nothing_
rather than everything, and the second invocation should also count as
one \(\) pairing.

So something like "[b-a]" and "\([b-a]\)" or
"[^[:unibyte:][:multibyte:]]" or something similarly contorted.
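
A minimal sketch of the distinction drawn above, using only `string-match` and `regexp-opt-depth`; the sample strings are arbitrary and chosen purely for illustration. The empty regexp matches at the start of any string, whereas a reversed range such as "[b-a]" is an empty character set and cannot match any character.

```elisp
;; The empty regexp matches (with an empty match) at position 0 of any string.
(string-match "" "anything")          ; => 0
(string-match "" "")                  ; => 0

;; A reversed range is an empty character set, so it can never match.
(string-match "[b-a]" "abc")          ; => nil
(string-match "[b-a]" "")             ; => nil

;; Wrapping it in \( \) supplies the one capture group that a caller
;; of (regexp-opt nil t) would expect to be able to count.
(regexp-opt-depth "\\([b-a]\\)")      ; => 1
(regexp-opt-depth "[b-a]")            ; => 0
```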



In GNU Emacs 25.0.50.1 (i686-pc-linux-gnu, GTK+ Version 3.12.2)
 of 2015-03-04 on lola
Repository revision: ca2b0e220ee6b2cab538e84703559696ce477e71
Windowing system distributor `The X.Org Foundation', version 11.0.11600000
System Description:	Ubuntu 14.10

Configured using:
 `configure --without-toolkit-scroll-bars'

Configured features:
XPM JPEG TIFF GIF PNG RSVG IMAGEMAGICK SOUND GPM DBUS GCONF GSETTINGS
NOTIFY LIBSELINUX GNUTLS LIBXML2 FREETYPE M17N_FLT LIBOTF XFT ZLIB

Important settings:
  value of $LC_MONETARY: en_US.UTF-8
  value of $LC_NUMERIC: en_US.UTF-8
  value of $LC_TIME: en_US.UTF-8
  value of $LANG: en_US.UTF-8
  value of $XMODIFIERS: @im=ibus
  locale-coding-system: utf-8-unix

Major mode: Group

Minor modes in effect:
  gnus-undo-mode: t
  diff-auto-refine-mode: t
  TeX-PDF-mode: t
  desktop-save-mode: t
  minibuffer-electric-default-mode: t
  tooltip-mode: t
  global-eldoc-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  buffer-read-only: t
  line-number-mode: t

Recent messages:
nnml: Reading incoming mail (no new mail)...done
Reading active file from private via nnml...done
Reading active file from archive via nnfolder...done
Reading active file via nndraft...done
Reading active file from /home/dak3/Mail/family via nndir...done
Checking new news...done
Opening nnml server on private...done
Expiring articles...done
"" [2 times]
Auto-saving...
Quit

Load-path shadows:
None found.

Features:
(shadow nnir emacsbug sendmail mule-util sort smiley gnus-cite flow-fill
mm-archive mail-extr gnus-bcklg gnus-kill gnus-async qp gnus-ml
disp-table pop3 nndir nndraft nnmh gnutls network-stream nsm auth-source
cl-macs eieio gv eieio-core cl-generic pcase starttls nnml nnfolder
nnnil gnus-agent gnus-srvr gnus-score score-mode nnvirtual gnus-msg
gnus-art mm-uu mml2015 mm-view mml-smime smime password-cache dig
mailcap nntp gnus-cache gnus-sum gnus-group gnus-undo gnus-start
gnus-cloud nnimap nnmail mail-source tls utf7 netrc nnoo parse-time
gnus-spec gnus-int gnus-range gnus-win warnings help-mode debug tar-mode
message format-spec rfc822 mml mml-sec mm-decode mm-bodies mm-encode
mail-parse rfc2231 rfc2047 rfc2045 ietf-drums mailabbrev gmm-utils
mailheader sh-script smie executable make-mode smerge-mode nxml-uchnm
rng-xsd xsd-regexp rng-cmpct rng-nxml rng-valid rng-loc rng-uri
rng-parse nxml-parse rng-match rng-dt rng-util rng-pttrn nxml-ns
nxml-mode nxml-outln nxml-rap nxml-util nxml-glyph nxml-enc xmltok
python json autorevert filenotify add-log tex-info texinfo latexenc
jka-compr preview prv-emacs tex-bar toolbar-x noutline outline latex
edmacro kmacro tex-style reftex-dcr reftex-auc reftex reftex-vars
dired-x dired scheme vc vc-dispatcher vc-git diff-mode easy-mmode
lilypond-mode compile comint ansi-color ring font-latex byte-opt
bytecomp byte-compile cl-extra seq cconv plain-tex tex-buf tex dbus xml
crm cc-mode cc-fonts cc-guess cc-menus cc-cmds cc-styles cc-align
cc-engine cc-vars cc-defs info easymenu package epg-config advice
desktop frameset minibuf-eldef gnus gnus-ems nnheader gnus-util
mail-utils mm-util help-fns mail-prsvr wid-edit cl-loaddefs cl-lib
cus-start cus-load preview-latex tex-site auto-loads server time-date
tooltip eldoc electric uniquify ediff-hook vc-hooks lisp-float-type
mwheel x-win x-dnd tool-bar dnd fontset image regexp-opt fringe
tabulated-list newcomment elisp-mode lisp-mode prog-mode register page
menu-bar rfn-eshadow timer select scroll-bar mouse jit-lock font-lock
syntax facemenu font-core frame cham georgian utf-8-lang misc-lang
vietnamese tibetan thai tai-viet lao korean japanese hebrew greek
romanian slovak czech european ethiopic indian cyrillic chinese
case-table epa-hook jka-cmpr-hook help simple abbrev minibuffer
cl-preloaded nadvice loaddefs button faces cus-face macroexp files
text-properties overlay sha1 md5 base64 format env code-pages mule
custom widget hashtable-print-readable backquote make-network-process
dbusbind gfilenotify dynamic-setting system-font-setting
font-render-setting move-toolbar gtk x-toolkit x multi-tty emacs)

Memory information:
((conses 8 624077 47421)
 (symbols 24 53351 145)
 (miscs 20 1251 1014)
 (strings 16 106692 17355)
 (string-bytes 1 3514198)
 (vectors 8 38616)
 (vector-slots 4 967485 13436)
 (floats 8 355 575)
 (intervals 28 23406 1055)
 (buffers 520 305)
 (heap 1024 69082 6867))

-- 
David Kastrup





Message #8 received at 20307 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> IRO.UMontreal.CA>
To: David Kastrup <dak <at> gnu.org>
Cc: 20307 <at> debbugs.gnu.org
Subject: Re: bug#20307: 25.0.50; (regexp-opt nil ...) returns ""
Date: Sun, 12 Apr 2015 08:09:14 -0400
> So something like "[b-a]" and "\([b-a]\)" or
> "[^[:unibyte:][:multibyte:]]" or something similarly contorted.

I can't remember which regexp I used last time I needed one like that,
but something like "\\`\\'a" has the advantage of being efficient (the
\` anchor at the start is detected by Emacs and prevents looking for
a match over the whole searched area).
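
A small check of the suggested pattern, offered as a sketch: "\\`" matches only at the start of the text and "\\'" only at the end, so the literal "a" would have to follow the end of the text, which cannot happen. The sample inputs are arbitrary, and the efficiency claim is not measured here.

```elisp
;; No input, including the empty string and "a" itself, can match.
(let ((never "\\`\\'a"))
  (mapcar (lambda (s) (string-match never s))
          '("" "a" "aa" "xyz")))
;; => (nil nil nil nil)
```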


        Stefan





Message #11 received at 20307 <at> debbugs.gnu.org (full text, mbox):

From: Artur Malabarba <bruce.connor.am <at> gmail.com>
To: David Kastrup <dak <at> gnu.org>
Cc: 20307 <at> debbugs.gnu.org
Subject: Re: bug#20307: 25.0.50; (regexp-opt nil ...) returns ""
Date: Mon, 13 Apr 2015 15:08:25 +0100
2015-04-12 10:50 GMT+01:00 David Kastrup
> Both
>
> M-: (regexp-opt nil) RET
>
> and
>
> M-: (regexp-opt nil t) RET
>
> return "".  However, they should return a regexp matching nothing
> rather than everything, and the second invocation should also count as
> one () pairing.

I agree there should be () on the second one, but I strongly disagree
they should match nothing.

regexp-opt is NOT meant to match only the given strings. It is meant
to match anything containing the given strings.

There is a very fundamental difference in that. The fewer strings you
pass to regexp-opt, the MORE things the regexp will match. Why would
we suddenly flip that on its head when going from 1 to 0 strings?





Message #14 received at 20307 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Artur Malabarba <bruce.connor.am <at> gmail.com>
Cc: 20307 <at> debbugs.gnu.org, David Kastrup <dak <at> gnu.org>
Subject: Re: bug#20307: 25.0.50; (regexp-opt nil ...) returns ""
Date: Mon, 13 Apr 2015 11:19:44 -0400
> The fewer strings you pass to regexp-opt, the MORE things the regexp
> will match.

Hmm... I don't think so.


        Stefan





Message #17 received at 20307 <at> debbugs.gnu.org (full text, mbox):

From: David Kastrup <dak <at> gnu.org>
To: Artur Malabarba <bruce.connor.am <at> gmail.com>
Cc: 20307 <at> debbugs.gnu.org
Subject: Re: bug#20307: 25.0.50; (regexp-opt nil ...) returns ""
Date: Mon, 13 Apr 2015 17:34:34 +0200
Artur Malabarba <bruce.connor.am <at> gmail.com> writes:

> 2015-04-12 10:50 GMT+01:00 David Kastrup
>> Both
>>
>> M-: (regexp-opt nil) RET
>>
>> and
>>
>> M-: (regexp-opt nil t) RET
>>
>> return "".  However, they should return a regexp matching nothing
>> rather than everything, and the second invocation should also count as
>> one () pairing.
>
> I agree there should be () on the second one, but I strongly disagree
> they should match nothing.
>
> regexp-opt is NOT meant to match only the given strings. It is meant
> to match anything containing the given strings.

Well, and no string to match has been given.  This is not
(regexp-opt '(""))
but rather
(regexp-opt '())

> There is a very fundamental difference in that. The fewer strings you
> pass to regexp-opt, the MORE things the regexp will match.

Come again?

> Why would we suddenly flip that on its head when going from 1 to 0
> strings?

(regexp-opt '("a" "b" "c")) -> "[abc]"
(regexp-opt '("a" "b")) -> "[ab]"

Quite literally (execute C-x C-e after the expressions above if you
don't believe me).  So how does "[ab]" match more than "[abc]" ?
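
The same point can be checked by behaviour rather than by reading the generated regexp. This is a sketch: `my-matched-by` is a hypothetical helper introduced only for this illustration, and the candidate list is arbitrary.

```elisp
(defun my-matched-by (regexp candidates)
  "Return the elements of CANDIDATES that REGEXP matches.
Hypothetical helper, for illustration only."
  (let (result)
    (dolist (s candidates (nreverse result))
      (when (string-match-p regexp s)
        (push s result)))))

;; Three input strings: three of the four candidates match.
(my-matched-by (regexp-opt '("a" "b" "c")) '("a" "b" "c" "d"))
;; => ("a" "b" "c")

;; Two input strings: only two candidates match, so fewer inputs
;; give a regexp that matches less, not more.
(my-matched-by (regexp-opt '("a" "b")) '("a" "b" "c" "d"))
;; => ("a" "b")
```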

-- 
David Kastrup





Message #20 received at 20307 <at> debbugs.gnu.org (full text, mbox):

From: Artur Malabarba <bruce.connor.am <at> gmail.com>
To: David Kastrup <dak <at> gnu.org>
Cc: 20307 <at> debbugs.gnu.org
Subject: Re: bug#20307: 25.0.50; (regexp-opt nil ...) returns ""
Date: Mon, 13 Apr 2015 16:59:36 +0100
I believe you. Please ignore this babbling baboon.
That's what I get for typing on the bus.


Message #23 received at 20307 <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattiase <at> acm.org>
To: 20307 <at> debbugs.gnu.org
Subject: [PATCH] make regexp-opt return a no-match return value with empty
 input
Date: Mon, 25 Feb 2019 15:57:44 +0100
Here is a patch (moved from Bug#34641 where it was independently reported).
[0001-Correct-regexp-opt-return-value-for-empty-string-lis.patch (application/octet-stream, attachment)]
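
The attached patch is not reproduced in this log. As a rough sketch of the requested behaviour only, an Emacs that still returns "" for an empty list could be wrapped as below. `my-regexp-opt` is a hypothetical name, the never-matching body "\\`a\\`" is an assumption chosen for the example, and PAREN values such as `words` are simply treated as a request for an explicit group.

```elisp
(defun my-regexp-opt (strings &optional paren)
  "Like `regexp-opt', but return a never-matching regexp if STRINGS is empty.
Hypothetical helper for illustration; not the code from the patch."
  (if strings
      (regexp-opt strings paren)
    ;; "\\`a\\`" can never match: "a" cannot be followed by start-of-text.
    (concat (cond ((stringp paren) paren) ; caller-supplied opening group
                  (paren "\\(")           ; explicit (capturing) group
                  (t "\\(?:"))            ; shy group
            "\\`a\\`"
            "\\)")))

(string-match (my-regexp-opt nil) "abc")               ; => nil
(regexp-opt-depth (my-regexp-opt nil t))               ; => 1
(string-match (my-regexp-opt '("foo" "bar")) "a bar")  ; => 2
```

Non-empty lists are passed straight through to `regexp-opt`, so only the empty-list case changes.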


Message #26 received at 20307 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Mattias Engdegård <mattiase <at> acm.org>
Cc: 20307 <at> debbugs.gnu.org
Subject: Re: bug#20307: [PATCH] make regexp-opt return a no-match return value
 with empty input
Date: Sat, 02 Mar 2019 14:37:26 +0200
> From: Mattias Engdegård <mattiase <at> acm.org>
> Date: Mon, 25 Feb 2019 15:57:44 +0100
> 
> +If STRINGS is empty, the return value is a regexp that never
> +matches anything.

This says "empty", but the actual test will catch nil as well as an
empty list.  Should we perhaps mention that?

Thanks.





Message #29 received at 20307 <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattiase <at> acm.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 20307 <at> debbugs.gnu.org
Subject: Re: bug#20307: [PATCH] make regexp-opt return a no-match return value
 with empty input
Date: Sat, 2 Mar 2019 15:21:52 +0100
On 2 March 2019 at 13:37, Eli Zaretskii <eliz <at> gnu.org> wrote:
> 
>> From: Mattias Engdegård <mattiase <at> acm.org>
>> Date: Mon, 25 Feb 2019 15:57:44 +0100
>> 
>> +If STRINGS is empty, the return value is a regexp that never
>> +matches anything.
> 
> This says "empty", but the actual test will catch nil as well as an
> empty list.  Should we perhaps mention that?

Sorry, I don't understand. Do we in general distinguish nil from the empty list in documentation?
Or did you mean that the phrase should be "If STRINGS is the empty list..."?







Message #32 received at 20307 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Mattias Engdegård <mattiase <at> acm.org>
Cc: 20307 <at> debbugs.gnu.org
Subject: Re: bug#20307: [PATCH] make regexp-opt return a no-match return value
 with empty input
Date: Sat, 02 Mar 2019 16:41:17 +0200
> From: Mattias Engdegård <mattiase <at> acm.org>
> Date: Sat, 2 Mar 2019 15:21:52 +0100
> Cc: 20307 <at> debbugs.gnu.org
> 
> > This says "empty", but the actual test will catch nil as well as an
> > empty list.  Should we perhaps mention that?
> 
> Sorry, I don't understand. Do we in general distinguish nil from the empty list in documentation?

I don't know.  I thought the reader might not be aware of their
equivalence, so being explicit would be better.

> Or did you mean that the phrase should be "If STRINGS is the empty list..."?

Yes, that would take care of the issue.
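
For readers unsure about the equivalence discussed above, a short illustration using only built-in predicates: in Emacs Lisp, nil and the empty list are the same object, while a list containing the empty string is a non-empty list.

```elisp
(eq nil '())       ; => t    nil and the empty list are identical
(null (list))      ; => t    a freshly built empty list is nil too
(consp '(""))      ; => t    a list holding "" is not empty
(length '(""))     ; => 1
```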





Message #35 received at 20307 <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattiase <at> acm.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 20307 <at> debbugs.gnu.org
Subject: Re: bug#20307: [PATCH] make regexp-opt return a no-match return value
 with empty input
Date: Sat, 2 Mar 2019 15:49:07 +0100
>> Sorry, I don't understand. Do we in general distinguish nil from the empty list in documentation?
> 
> I don't know.  I thought the reader might not be aware of their
> equivalence, so being explicit would be better.
> 
>> Or did you mean that the phrase should be "If STRINGS is the empty list..."?
> 
> Yes, that would take care of the issue.

Now done in the doc string and in searching.texi.

New patch attached.

[0001-Correct-regexp-opt-return-value-for-empty-string-lis.patch (application/octet-stream, attachment)]


Message #38 received at 20307 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Mattias Engdegård <mattiase <at> acm.org>
Cc: 20307 <at> debbugs.gnu.org
Subject: Re: bug#20307: [PATCH] make regexp-opt return a no-match return value
 with empty input
Date: Sat, 02 Mar 2019 16:57:54 +0200
> From: Mattias Engdegård <mattiase <at> acm.org>
> Date: Sat, 2 Mar 2019 15:49:07 +0100
> Cc: 20307 <at> debbugs.gnu.org
> 
> >> Or did you mean that the phrase should be "If STRINGS is the empty list..."?
> > 
> > Yes, that would take care of the issue.
> 
> Now done in the doc string and in searching.texi.
> 
> New patch attached.

Fine with me, thanks.




Reply sent to Mattias Engdegård <mattiase <at> acm.org>:
You have taken responsibility. (Sat, 02 Mar 2019 15:25:02 GMT)

Notification sent to David Kastrup <dak <at> gnu.org>:
bug acknowledged by developer. (Sat, 02 Mar 2019 15:25:02 GMT)

Message #43 received at 20307-done <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattiase <at> acm.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 20307-done <at> debbugs.gnu.org
Subject: Re: bug#20307: [PATCH] make regexp-opt return a no-match return value
 with empty input
Date: Sat, 2 Mar 2019 16:24:09 +0100
On 2 March 2019 at 15:57, Eli Zaretskii <eliz <at> gnu.org> wrote:
>> 
>> New patch attached.
> 
> Fine with me, thanks.

Thank you, pushed.






bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 31 Mar 2019 11:24:04 GMT)

This bug report was last modified 5 years and 21 days ago.
