GNU bug report logs - #37440
[PATCH] New rx implementation with extension constructs

Previous Next

Package: emacs;

Reported by: Mattias Engdegård <mattiase <at> acm.org>

Date: Tue, 17 Sep 2019 12:51:02 UTC

Severity: wishlist

Tags: patch

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 37440 in the body.
You can then email your comments to 37440 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#37440; Package emacs. (Tue, 17 Sep 2019 12:51:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Mattias Engdegård <mattiase <at> acm.org>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Tue, 17 Sep 2019 12:51:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattiase <at> acm.org>
To: bug-gnu-emacs <at> gnu.org
Subject: [PATCH] New rx implementation with extension constructs
Date: Tue, 17 Sep 2019 14:49:51 +0200
[Message part 1 (text/plain, inline)]
[Continuing from https://lists.gnu.org/archive/html/emacs-devel/2019-09/msg00048.html]

Here is a new rx implementation (faster, easier to work with, fewer bugs, better tests), and, as a separate patch, an rx extension mechanism adding the macros `rx-define', `rx-let' and `rx-let-eval'.

The first patch is a ground-up rewrite of rx. It should be completely compatible.

The second patch adds

(rx-define NAME [ARGS] RX)
(rx-let ((NAME [ARGS] RX) ...) BODY)
(rx-let-eval ((NAME [ARGS] RX) ...) BODY)

as mentioned in the emacs-devel thread earlier. Additions to the manual are included.

Although I believe this to be a consistent and useful design that could be used as-is, some points worth thinking about are:

* Allow for multiple RXs in the definitions, making an implicit (seq ...). This could be done with the Schemeish syntax

(rx-define NAME RX...)
(rx-define (NAME ARGS...) RX...)

which is quite readable as "definition mirrors use". Should then the &rest parameter be declared using a dotted list, as

(rx-define (NAME ARG1 ARG2 . ARG-REST) RX...)

?

* There is some disagreement regarding whether function-like definitions should be standard Lisp expressions instead of the restricted substitution-based macros in this patch, as in

(rx-define whole (x) `(seq bos ,x eos))

I believe the usability of the chosen design is better, but see the point of not reinventing the wheel.

* Not entirely satisfied with the name `rx-let-eval', but unless someone comes up with something better, it stands.
[0001-New-rx-implementation.patch (application/octet-stream, attachment)]
[0002-Add-rx-extension-mechanism.patch (application/octet-stream, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#37440; Package emacs. (Tue, 17 Sep 2019 17:48:01 GMT) Full text and rfc822 format available.

Message #8 received at 37440 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Mattias Engdegård <mattiase <at> acm.org>
Cc: 37440 <at> debbugs.gnu.org
Subject: Re: [PATCH] New rx implementation with extension constructs
Date: Tue, 17 Sep 2019 10:47:37 -0700
I like the rx proposal in <https://bugs.gnu.org/37440>. Let's install it into 
master if no significant problems turn up after a few days of review. Thanks for 
bird-dogging it.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#37440; Package emacs. (Tue, 24 Sep 2019 17:56:02 GMT) Full text and rfc822 format available.

Message #11 received at 37440 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Mattias Engdegård <mattiase <at> acm.org>
Cc: 37440 <at> debbugs.gnu.org
Subject: Re: [PATCH] New rx implementation with extension constructs
Date: Tue, 24 Sep 2019 10:55:07 -0700
[Message part 1 (text/plain, inline)]
I tried the proposed patches with current Emacs master on Fedora 30 
x86-64 and got a test failure as shown in the attached file. Could you 
please look into that? Thanks.
[rx-tests.log (text/x-log, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#37440; Package emacs. (Wed, 25 Sep 2019 12:35:01 GMT) Full text and rfc822 format available.

Message #14 received at 37440 <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattiase <at> acm.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 37440 <at> debbugs.gnu.org
Subject: Re: [PATCH] New rx implementation with extension constructs
Date: Wed, 25 Sep 2019 14:33:46 +0200
[Message part 1 (text/plain, inline)]
24 sep. 2019 kl. 19.55 skrev Paul Eggert <eggert <at> cs.ucla.edu>:
> 
> I tried the proposed patches with current Emacs master on Fedora 30 x86-64 and got a test failure as shown in the attached file. 

Thank you! Those failures only occur when running test loaded from a byte-compiled file -- I suppose you used TEST_LOAD_EL=no.

First, the unibyte and multibyte forms of a string like "\326" print the same but aren't equal:

(string-to-multibyte "\326")
=> "\326"
(equal (string-to-multibyte "\326") "\326")
=> nil

This means that if a multibyte string ends up as a constant in byte-compiled code, surprise, it may become a unibyte value when loaded. The test had to be made to work both interpreted and compiled. Fortunately the regexp engine was recently fixed with respect to raw bytes, making its semantics invariant for strings with the same print representation, so this is not a problem with the rx implementation.

The second item of interest was that `rx-define', since it relies on `eval-and-compile', doesn't expand to code when macroexpanded. I don't know if it will be a problem in practice. The test now uses an auxiliary function as work-around.

Updated patches attached.

[0001-New-rx-implementation.patch (application/octet-stream, attachment)]
[0002-Add-rx-extension-mechanism.patch (application/octet-stream, attachment)]

Reply sent to Paul Eggert <eggert <at> cs.ucla.edu>:
You have taken responsibility. (Wed, 25 Sep 2019 21:31:02 GMT) Full text and rfc822 format available.

Notification sent to Mattias Engdegård <mattiase <at> acm.org>:
bug acknowledged by developer. (Wed, 25 Sep 2019 21:31:02 GMT) Full text and rfc822 format available.

Message #19 received at 37440-done <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Mattias Engdegård <mattiase <at> acm.org>
Cc: 37440-done <at> debbugs.gnu.org
Subject: Re: [PATCH] New rx implementation with extension constructs
Date: Wed, 25 Sep 2019 14:30:28 -0700
Thanks, I installed those patches into master and am closing the bug report.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 24 Oct 2019 11:24:08 GMT) Full text and rfc822 format available.

This bug report was last modified 4 years and 186 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.