GNU bug report logs -
#37440
[PATCH] New rx implementation with extension constructs
Previous Next
Reported by: Mattias Engdegård <mattiase <at> acm.org>
Date: Tue, 17 Sep 2019 12:51:02 UTC
Severity: wishlist
Tags: patch
Done: Paul Eggert <eggert <at> cs.ucla.edu>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 37440 in the body.
You can then email your comments to 37440 AT debbugs.gnu.org in the normal way.
Toggle the display of automated, internal messages from the tracker.
Report forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#37440
; Package
emacs
.
(Tue, 17 Sep 2019 12:51:02 GMT)
Full text and
rfc822 format available.
Acknowledgement sent
to
Mattias Engdegård <mattiase <at> acm.org>
:
New bug report received and forwarded. Copy sent to
bug-gnu-emacs <at> gnu.org
.
(Tue, 17 Sep 2019 12:51:02 GMT)
Full text and
rfc822 format available.
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
[Continuing from https://lists.gnu.org/archive/html/emacs-devel/2019-09/msg00048.html]
Here is a new rx implementation (faster, easier to work with, fewer bugs, better tests), and, as a separate patch, an rx extension mechanism adding the macros `rx-define', `rx-let' and `rx-let-eval'.
The first patch is a ground-up rewrite of rx. It should be completely compatible.
The second patch adds
(rx-define NAME [ARGS] RX)
(rx-let ((NAME [ARGS] RX) ...) BODY)
(rx-let-eval ((NAME [ARGS] RX) ...) BODY)
as mentioned in the emacs-devel thread earlier. Additions to the manual are included.
Although I believe this to be a consistent and useful design that could be used as-is, some points worth thinking about are:
* Allow for multiple RXs in the definitions, making an implicit (seq ...). This could be done with the Schemeish syntax
(rx-define NAME RX...)
(rx-define (NAME ARGS...) RX...)
which is quite readable as "definition mirrors use". Should then the &rest parameter be declared using a dotted list, as
(rx-define (NAME ARG1 ARG2 . ARG-REST) RX...)
?
* There is some disagreement regarding whether function-like definitions should be standard Lisp expressions instead of the restricted substitution-based macros in this patch, as in
(rx-define whole (x) `(seq bos ,x eos))
I believe the usability of the chosen design is better, but see the point of not reinventing the wheel.
* Not entirely satisfied with the name `rx-let-eval', but unless someone comes up with something better, it stands.
[0001-New-rx-implementation.patch (application/octet-stream, attachment)]
[0002-Add-rx-extension-mechanism.patch (application/octet-stream, attachment)]
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#37440
; Package
emacs
.
(Tue, 17 Sep 2019 17:48:01 GMT)
Full text and
rfc822 format available.
Message #8 received at 37440 <at> debbugs.gnu.org (full text, mbox):
I like the rx proposal in <https://bugs.gnu.org/37440>. Let's install it into
master if no significant problems turn up after a few days of review. Thanks for
bird-dogging it.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#37440
; Package
emacs
.
(Tue, 24 Sep 2019 17:56:02 GMT)
Full text and
rfc822 format available.
Message #11 received at 37440 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
I tried the proposed patches with current Emacs master on Fedora 30
x86-64 and got a test failure as shown in the attached file. Could you
please look into that? Thanks.
[rx-tests.log (text/x-log, attachment)]
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#37440
; Package
emacs
.
(Wed, 25 Sep 2019 12:35:01 GMT)
Full text and
rfc822 format available.
Message #14 received at 37440 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
24 sep. 2019 kl. 19.55 skrev Paul Eggert <eggert <at> cs.ucla.edu>:
>
> I tried the proposed patches with current Emacs master on Fedora 30 x86-64 and got a test failure as shown in the attached file.
Thank you! Those failures only occur when running test loaded from a byte-compiled file -- I suppose you used TEST_LOAD_EL=no.
First, the unibyte and multibyte forms of a string like "\326" print the same but aren't equal:
(string-to-multibyte "\326")
=> "\326"
(equal (string-to-multibyte "\326") "\326")
=> nil
This means that if a multibyte string ends up as a constant in byte-compiled code, surprise, it may become a unibyte value when loaded. The test had to be made to work both interpreted and compiled. Fortunately the regexp engine was recently fixed with respect to raw bytes, making its semantics invariant for strings with the same print representation, so this is not a problem with the rx implementation.
The second item of interest was that `rx-define', since it relies on `eval-and-compile', doesn't expand to code when macroexpanded. I don't know if it will be a problem in practice. The test now uses an auxiliary function as work-around.
Updated patches attached.
[0001-New-rx-implementation.patch (application/octet-stream, attachment)]
[0002-Add-rx-extension-mechanism.patch (application/octet-stream, attachment)]
Reply sent
to
Paul Eggert <eggert <at> cs.ucla.edu>
:
You have taken responsibility.
(Wed, 25 Sep 2019 21:31:02 GMT)
Full text and
rfc822 format available.
Notification sent
to
Mattias Engdegård <mattiase <at> acm.org>
:
bug acknowledged by developer.
(Wed, 25 Sep 2019 21:31:02 GMT)
Full text and
rfc822 format available.
Message #19 received at 37440-done <at> debbugs.gnu.org (full text, mbox):
Thanks, I installed those patches into master and am closing the bug report.
bug archived.
Request was from
Debbugs Internal Request <help-debbugs <at> gnu.org>
to
internal_control <at> debbugs.gnu.org
.
(Thu, 24 Oct 2019 11:24:08 GMT)
Full text and
rfc822 format available.
This bug report was last modified 4 years and 186 days ago.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.