GNU bug report logs - #26338
26.0.50; Collect all matches for REGEXP in current buffer

Package: emacs;

Reported by: Tino Calancha <tino.calancha <at> gmail.com>

Date: Sun, 2 Apr 2017 12:42:01 UTC

Severity: wishlist

Tags: wontfix

Found in version 26.0.50

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 26338 in the body.
You can then email your comments to 26338 AT debbugs.gnu.org in the normal way.


Report forwarded to juri <at> linkov.net, bug-gnu-emacs <at> gnu.org:
bug#26338; Package emacs. (Sun, 02 Apr 2017 12:42:01 GMT)

Acknowledgement sent to Tino Calancha <tino.calancha <at> gmail.com>:
New bug report received and forwarded. Copy sent to juri <at> linkov.net, bug-gnu-emacs <at> gnu.org. (Sun, 02 Apr 2017 12:42:01 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Tino Calancha <tino.calancha <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: 26.0.50; Collect all matches for REGEXP in current buffer
Date: Sun, 02 Apr 2017 21:41:30 +0900
X-Debbugs-CC: Juri Linkov <juri <at> linkov.net>
Severity: wishlist

Hi,

We have `count-matches' in replace.el, which returns the
number of matches of a regexp.  Why not have a standard
function `collect-matches' as well?

I know about `xref-collect-matches', but it uses the grep program: some
users might not have grep installed, or they may prefer to use Emacs regexps.

I've been using something similar to the patch below for a while.
It probably doesn't need to be a command, just a normal function.

What do you think?
Regards,
Tino
--8<-----------------------------cut here---------------start------------->8---
commit ccc78b19aa044f6bdb27875937320ed06c2b517a
Author: Tino Calancha <tino.calancha <at> gmail.com>
Date:   Sun Apr 2 21:37:19 2017 +0900

    collect-matches: New command
    
    Collect all matches for REGEXP in current buffer (Bug#26338).
    * lisp/replace.el (collect-matches): New command.

diff --git a/lisp/replace.el b/lisp/replace.el
index a7b8ae6a34..6f2c6c9a2b 100644
--- a/lisp/replace.el
+++ b/lisp/replace.el
@@ -1002,6 +1002,44 @@ how-many
 				 (if (= count 1) "" "s")))
       count)))
 
+(defun collect-matches (regexp &optional region group limit)
+  "Collect matches for REGEXP following point.
+Optional arg REGION, if non-nil, mean restrict search to the
+ specified region.  Otherwise search the entire buffer.
+ REGION must be a list of (START . END) positions as returned by
+ `region-bounds'.
+Interactively, in Transient Mark mode when the mark is active, operate
+ on the contents of the region.  Otherwise, operate from point to the
+ end of (the accessible portion of) the buffer.
+Optional GROUP if non-nil, then is the regexp group to save.  Otherwise,
+ save the whole match.  Interactively, a numeric prefix set GROUP.
+Optional LIMIT if non-nil, then stop after such number of matches.
+ Otherwise collect all of them."
+  (interactive
+   (list (read-regexp "Collect matches for regexp: ")
+         (and (use-region-p) (region-bounds))
+         (if current-prefix-arg (prefix-numeric-value current-prefix-arg) 0)
+         nil))
+  (unless group (setq group 0))
+  (let* ((count 0)
+         (start (if region (max (caar region) (point-min)) (point)))
+         (end (if region (min (cdar region) (point-max)) (point-max)))
+         res)
+    (save-excursion
+      (goto-char start)
+      (catch '--collect-matches-end
+        (while (re-search-forward regexp nil t)
+          (unless (>= (point) end)
+            (push (match-string-no-properties group) res)
+            (cl-incf count))
+          (when (or (>= (point) end)
+                    (and (natnump limit) (>= count limit)))
+            (throw '--collect-matches-end nil)))))
+    (message "%d Match%s: %s"
+             count (if (= count 1) "" "es")
+             (mapconcat 'identity (setq res (nreverse res)) " "))
+    res))
+
 
 (defvar occur-menu-map
   (let ((map (make-sparse-keymap)))
--8<-----------------------------cut here---------------end--------------->8---
In GNU Emacs 26.0.50 (build 1, x86_64-pc-linux-gnu, GTK+ Version 3.22.9)
 of 2017-04-02
Repository revision: afabe53b562675b6279cc670ceba32357fac2214




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#26338; Package emacs. (Sun, 02 Apr 2017 15:58:02 GMT)

Message #8 received at 26338 <at> debbugs.gnu.org:

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Tino Calancha <tino.calancha <at> gmail.com>, 26338 <at> debbugs.gnu.org
Cc: juri linkov <juri <at> linkov.net>
Subject: Re: bug#26338: 26.0.50; Collect all matches for REGEXP in current
 buffer
Date: Sun, 2 Apr 2017 18:57:02 +0300
On 02.04.2017 15:41, Tino Calancha wrote:

> we have `count-matches' in replace.el, which returns the
> number of matches of a regexp.  Why not to have an standard
> function `collect-matches' as well?
> 
> I know `xref-collect-matches' but it uses grep program: some users might
> not have grep installed, or they may prefer to use Emacs regexps.
> 
> I've being using for a while something similar than the patch below.
> Probably it doesn't need to be a command, just a normal function.
> 
> What do you think?
When used interactively, isn't M-x occur doing something like this?

And for Elisp programs, (while (re-search-forward ...)) is usually 
sufficient. That's a three-liner at worst.

And I've never had a need to limit the number of matches, personally.
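
The loop idiom mentioned above can be sketched roughly as follows.  This is
only an illustration: the function name `my-collect-matches' is hypothetical
and not part of Emacs, and the sketch scans the whole accessible buffer
rather than starting at point.

```elisp
;; Hypothetical helper, for illustration only; not part of Emacs.
(defun my-collect-matches (regexp &optional group)
  "Return a list of the matches for REGEXP in the current buffer.
GROUP, if non-nil, is the regexp group to collect instead of the
whole match."
  (let ((group (or group 0))
        res)
    (save-excursion
      (goto-char (point-min))
      (while (and (not (eobp))
                  (re-search-forward regexp nil t))
        (push (match-string-no-properties group) res)
        ;; An empty match does not move point; step forward by hand
        ;; so the loop cannot spin forever.
        (when (and (= (match-beginning 0) (match-end 0))
                   (not (eobp)))
          (forward-char 1))))
    (nreverse res)))
```

The core really is the three lines of the `while' form; the rest is
bookkeeping (restoring point, guarding against empty matches, and
returning the matches in buffer order).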




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#26338; Package emacs. (Sun, 02 Apr 2017 22:13:02 GMT)

Message #11 received at 26338 <at> debbugs.gnu.org:

From: Juri Linkov <juri <at> linkov.net>
To: Tino Calancha <tino.calancha <at> gmail.com>
Cc: 26338 <at> debbugs.gnu.org
Subject: Re: bug#26338: 26.0.50;
 Collect all matches for REGEXP in current buffer
Date: Mon, 03 Apr 2017 01:10:02 +0300
> we have `count-matches' in replace.el, which returns the
> number of matches of a regexp.  Why not to have an standard
> function `collect-matches' as well?
>
> I know `xref-collect-matches' but it uses grep program: some users might
> not have grep installed, or they may prefer to use Emacs regexps.
>
> I've being using for a while something similar than the patch below.
> Probably it doesn't need to be a command, just a normal function.
>
> What do you think?

But there is already the occur-collect feature implemented in occur-1
and occur-read-primary-args.  Why would we need a separate command?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#26338; Package emacs. (Mon, 03 Apr 2017 03:59:01 GMT)

Message #14 received at 26338 <at> debbugs.gnu.org:

From: Tino Calancha <tino.calancha <at> gmail.com>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 26338 <at> debbugs.gnu.org, juri linkov <juri <at> linkov.net>,
 Tino Calancha <tino.calancha <at> gmail.com>
Subject: Re: bug#26338: 26.0.50; Collect all matches for REGEXP in current
 buffer
Date: Mon, 3 Apr 2017 12:58:44 +0900 (JST)

On Sun, 2 Apr 2017, Dmitry Gutov wrote:

> On 02.04.2017 15:41, Tino Calancha wrote:
>
>> we have `count-matches' in replace.el, which returns the
>> number of matches of a regexp.  Why not to have an standard
>> function `collect-matches' as well?
>> 
>> I know `xref-collect-matches' but it uses grep program: some users might
>> not have grep installed, or they may prefer to use Emacs regexps.
>> 
>> I've being using for a while something similar than the patch below.
>> Probably it doesn't need to be a command, just a normal function.
>> 
>> What do you think?
> When used interactively, isn't M-x occur doing something like this?
>
> And for Elisp programs, (while (re-search-forward ...)) is usually 
> sufficient. That's a three-liner at worst.
The same might be argued for occur.  You can just increment a counter
inside (while (re-search-forward ...))

> And I've never had a need to limit the number of matches, personally.
I did, often, while implementing Bug#25493.  Let's say I am interested in
the last 200 commits modifying a file foo.el.
M-x: find-library foo RET
C-x v l
M-: (setq hashes (collect-matches "^commit \\([[:xdigit:]]+\\)" nil 1 200))

In this case there is no need to go beyond 200; that's why the LIMIT
argument might be useful.

Another example,
let's say I want to know the first two defuns in subr.el
M-x: find-library subr RET
M-: (collect-matches "^(defun \\([^[:blank:]]+\\)" nil 1 2) RET

Of course you could do:
M-: (seq-take (collect-matches "^(defun \\([^[:blank:]]+\\)" nil 1) 2) RET
;; But if you just want the 2 leading defun's this is a waste.
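
For comparison, the early stop that LIMIT provides could be sketched with a
plain loop like the one below.  The name `my-collect-first-matches' is
hypothetical, the sketch starts at the beginning of the buffer, and it
assumes REGEXP never matches the empty string.

```elisp
;; Hypothetical helper, for illustration only; not part of Emacs.
(defun my-collect-first-matches (regexp group limit)
  "Return at most LIMIT matches of GROUP for REGEXP in the current buffer.
Assumes REGEXP does not match the empty string."
  (let ((count 0)
        res)
    (save-excursion
      (goto-char (point-min))
      ;; The count is tested first, so no search is attempted once
      ;; LIMIT matches have been collected.
      (while (and (< count limit)
                  (re-search-forward regexp nil t))
        (push (match-string-no-properties group) res)
        (setq count (1+ count))))
    (nreverse res)))
```

Unlike `seq-take' applied to the full list, the search stops as soon as
LIMIT matches have been found.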




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#26338; Package emacs. (Mon, 03 Apr 2017 04:02:02 GMT)

Message #17 received at 26338 <at> debbugs.gnu.org:

From: Tino Calancha <tino.calancha <at> gmail.com>
To: Juri Linkov <juri <at> linkov.net>
Cc: 26338 <at> debbugs.gnu.org, Tino Calancha <tino.calancha <at> gmail.com>
Subject: Re: bug#26338: 26.0.50; Collect all matches for REGEXP in current
 buffer
Date: Mon, 3 Apr 2017 13:01:50 +0900 (JST)

On Mon, 3 Apr 2017, Juri Linkov wrote:

>> we have `count-matches' in replace.el, which returns the
>> number of matches of a regexp.  Why not to have an standard
>> function `collect-matches' as well?
>>
>> I know `xref-collect-matches' but it uses grep program: some users might
>> not have grep installed, or they may prefer to use Emacs regexps.
>>
>> I've being using for a while something similar than the patch below.
>> Probably it doesn't need to be a command, just a normal function.
>>
>> What do you think?
>
> But there is already the occur-collect feature implemented in occur-1
> and occur-read-primary-args.  Why would we need a separate command?
Sorry, I don't know about `occur-collect'; I cannot find its definition.
It doesn't seem to be a defun in replace.el.
See my previous e-mail and let me know if `occur-collect' can serve
that purpose.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#26338; Package emacs. (Mon, 03 Apr 2017 06:14:02 GMT)

Message #20 received at 26338 <at> debbugs.gnu.org:

From: Tino Calancha <tino.calancha <at> gmail.com>
To: Juri Linkov <juri <at> linkov.net>
Cc: 26338 <at> debbugs.gnu.org
Subject: Re: bug#26338: 26.0.50;
 Collect all matches for REGEXP in current buffer
Date: Mon, 03 Apr 2017 15:13:07 +0900
Juri Linkov <juri <at> linkov.net> writes:

>> we have `count-matches' in replace.el, which returns the
>> number of matches of a regexp.  Why not to have an standard
>> function `collect-matches' as well?
>>
>> I know `xref-collect-matches' but it uses grep program: some users might
>> not have grep installed, or they may prefer to use Emacs regexps.
>>
>> I've being using for a while something similar than the patch below.
>> Probably it doesn't need to be a command, just a normal function.
>>
>> What do you think?
>
> But there is already the occur-collect feature implemented in occur-1
> and occur-read-primary-args.  Why would we need a separate command?
Indeed, I don't think we need a new command for this.  I am thinking more
of a standard function.
The following:
(occur "defun\\s +\\(\\S +\\)" "\\1")

doesn't return the collected things.  It writes the matches in the *Occur*
buffer.  Then, if you want a list with the matches, you must loop
again inside *Occur*, which is sub-optimal.
To me, it makes sense to have an `occur-collect' which just returns the
list of matches.
Then we might use such a function in the implementation of occur-1,
which could make that implementation cleaner.
We might also get a LIMIT argument for occur, which might come
in handy for multi-occur with a lot of input buffers (just an idea).




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#26338; Package emacs. (Mon, 03 Apr 2017 23:37:02 GMT)

Message #23 received at 26338 <at> debbugs.gnu.org:

From: Juri Linkov <juri <at> linkov.net>
To: Tino Calancha <tino.calancha <at> gmail.com>
Cc: 26338 <at> debbugs.gnu.org
Subject: Re: bug#26338: 26.0.50;
 Collect all matches for REGEXP in current buffer
Date: Tue, 04 Apr 2017 02:35:29 +0300
>>> What do you think?
>>
>> But there is already the occur-collect feature implemented in occur-1
>> and occur-read-primary-args.  Why would we need a separate command?
> Indeed i don't think we need a new command for this.  I am thinking more
> in an standard function.
> Following:
> (occur "defun\\s +\\(\\S +\\)" "\\1")
>
> doesn't return the collected things.  It writes the matches in *Occur*
> buffer.  Then, if you want a list with the matches you must loop
> again inside *Occur* which is sub-optimal.
> For me, it has sense to have a `occur-collect' which just returns the
> list of matches.
> Then, we might use such function in the implementation of occur-1
> which could bring a cleaner implementation.
> We might get also the LIMIT argument for occur which might come
> in handy for multi-occur with lot of input buffers (just an idea).

occur-collect is intended for interactive use.  As for programmatic use,
Dmitry is right: a universal idiom is (while (re-search-forward ...)).
This is why e.g. the docstring of ‘replace-regexp’ recommends to use
an explicit loop like (while (re-search-forward ...) (replace-match ...))
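
As an illustration of that replacement idiom, a minimal sketch might look
like this.  The wrapper name `my-replace-all' is hypothetical, and the
sketch assumes REGEXP never matches the empty string.

```elisp
;; Hypothetical helper, for illustration only; not part of Emacs.
(defun my-replace-all (regexp to-string)
  "Replace every match for REGEXP in the current buffer with TO-STRING.
Return the number of replacements made.
Assumes REGEXP does not match the empty string."
  (let ((count 0))
    (save-excursion
      (goto-char (point-min))
      (while (re-search-forward regexp nil t)
        ;; FIXEDCASE is t, so TO-STRING is inserted with its case
        ;; unchanged; \\& and \\N in TO-STRING are still expanded.
        (replace-match to-string t)
        (setq count (1+ count))))
    count))
```

Point ends up after each replacement, so the next search continues with
the text that has not been examined yet.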




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#26338; Package emacs. (Tue, 04 Apr 2017 01:39:02 GMT)

Message #26 received at 26338 <at> debbugs.gnu.org:

From: Tino Calancha <tino.calancha <at> gmail.com>
To: Juri Linkov <juri <at> linkov.net>
Cc: 26338 <at> debbugs.gnu.org
Subject: Re: bug#26338: 26.0.50;
 Collect all matches for REGEXP in current buffer
Date: Tue, 04 Apr 2017 10:37:48 +0900
Juri Linkov <juri <at> linkov.net> writes:

>>>> What do you think?
>>>
>>> But there is already the occur-collect feature implemented in occur-1
>>> and occur-read-primary-args.  Why would we need a separate command?
>> Indeed i don't think we need a new command for this.  I am thinking more
>> in an standard function.
>> Following:
>> (occur "defun\\s +\\(\\S +\\)" "\\1")
>>
>> doesn't return the collected things.  It writes the matches in *Occur*
>> buffer.  Then, if you want a list with the matches you must loop
>> again inside *Occur* which is sub-optimal.
>> For me, it has sense to have a `occur-collect' which just returns the
>> list of matches.
>> Then, we might use such function in the implementation of occur-1
>> which could bring a cleaner implementation.
>> We might get also the LIMIT argument for occur which might come
>> in handy for multi-occur with lot of input buffers (just an idea).
>
> occur-collect is intended for interactive use.  As for programmatic use,
> Dmitry is right: a universal idiom is (while (re-search-forward ...)).
> This is why e.g. the docstring of ‘replace-regexp’ recommends to use
> an explicit loop like (while (re-search-forward ...) (replace-match ...))
OK, thanks.  Let me make one last proposal before I go back to my dark
cave and start painting animals on the walls.

Any interest in something like this?:

(defmacro with-collect-matches (regexp &optional group &rest body)
  "Collect matches for REGEXP and eval BODY for each match.
BODY is evaluated with `it' bound to the match.
Optional GROUP if non-nil, then is the regexp group to save.  Otherwise,
save the whole match."




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#26338; Package emacs. (Tue, 04 Apr 2017 02:21:01 GMT)

Message #29 received at 26338 <at> debbugs.gnu.org:

From: Tino Calancha <tino.calancha <at> gmail.com>
To: Juri Linkov <juri <at> linkov.net>
Cc: 26338 <at> debbugs.gnu.org, tino.calancha <at> gmail.com
Subject: Re: bug#26338: 26.0.50;
 Collect all matches for REGEXP in current buffer
Date: Tue, 04 Apr 2017 11:20:18 +0900
Tino Calancha <tino.calancha <at> gmail.com> writes:

> Juri Linkov <juri <at> linkov.net> writes:
>> occur-collect is intended for interactive use.  As for programmatic use,
>> Dmitry is right: a universal idiom is (while (re-search-forward ...)).
>> This is why e.g. the docstring of ‘replace-regexp’ recommends to use
>> an explicit loop like (while (re-search-forward ...) (replace-match ...))
> OK thanks.  Let me ask you my last proposal before come back to my dark
> cave and start painting animals in the walls.
>
> Any interest in something like this?:
>
> (defmacro with-collect-matches (regexp &optional group &rest body)
>   "Collect matches for REGEXP and eval BODY for each match.
> BODY is evaluated with `it' bound to the match.
> Optional GROUP if non-nil, then is the regexp group to save.  Otherwise,
> save the whole match."
Sorry, I was painting a mammoth and I forgot something in the docstring:

--8<-----------------------------cut here---------------start------------->8---
(defmacro with-collect-matches (regexp &optional group &rest body)
  "Collect matches for REGEXP and eval BODY for each match.
BODY is evaluated with `it' bound to the match.
Optional GROUP if non-nil, then is the regexp group to save.  Otherwise,
save the whole match.
Return a list with the matches."
--8<-----------------------------cut here---------------end--------------->8---

So, for instance:
M-x find-library replace RET
M-: (length (with-collect-matches "^(defun \\(\\S +\\)" 1)) RET
=> 52

M-x find-library replace RET
M-: (length (with-collect-matches "^(defun \\(\\S +\\)" 1
              (with-current-buffer (get-buffer-create "*Matches*")
                (when (string-match "\\`query-" it)
                  (insert (format "%s\n" it)))))) RET
=> 52
;; Same return as before but only write into *Matches* those
;; functions with name starting with "query-".
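
Only the docstring of the macro is given above.  One possible body,
sketched under assumptions, is shown below: the name
`my-with-collect-matches' is hypothetical, the search runs from point to
the end of the buffer, and REGEXP is assumed not to match the empty string.

```elisp
;; Hypothetical sketch of a body for the proposed macro; not part of Emacs.
(defmacro my-with-collect-matches (regexp &optional group &rest body)
  "Collect matches for REGEXP after point and eval BODY for each match.
BODY is evaluated with `it' bound to the match.
GROUP, if non-nil, is the regexp group to save; otherwise save the
whole match.  Return a list with the matches."
  (declare (indent 2))
  ;; Uninterned symbols keep the macro's own variables out of BODY's way.
  (let ((re (make-symbol "regexp"))
        (grp (make-symbol "group"))
        (res (make-symbol "res")))
    `(let ((,re ,regexp)
           (,grp (or ,group 0))
           ,res)
       (save-excursion
         (while (re-search-forward ,re nil t)
           ;; The match is extracted before BODY runs, so BODY may
           ;; clobber the match data (e.g. with `string-match').
           (let ((it (match-string-no-properties ,grp)))
             (push it ,res)
             ,@body)))
       (nreverse ,res))))
```

The return value does not depend on BODY, which is why both calls in the
message above yield the same length.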




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#26338; Package emacs. (Tue, 04 Apr 2017 14:33:01 GMT)

Message #32 received at 26338 <at> debbugs.gnu.org:

From: Marcin Borkowski <mbork <at> mbork.pl>
To: Tino Calancha <tino.calancha <at> gmail.com>
Cc: 26338 <at> debbugs.gnu.org, Juri Linkov <juri <at> linkov.net>
Subject: Re: bug#26338: 26.0.50;
 Collect all matches for REGEXP in current buffer
Date: Tue, 04 Apr 2017 16:32:12 +0200
On 2017-04-04, at 03:37, Tino Calancha <tino.calancha <at> gmail.com> wrote:

> Juri Linkov <juri <at> linkov.net> writes:
>
>>>>> What do you think?
>>>>
>>>> But there is already the occur-collect feature implemented in occur-1
>>>> and occur-read-primary-args.  Why would we need a separate command?
>>> Indeed i don't think we need a new command for this.  I am thinking more
>>> in an standard function.
>>> Following:
>>> (occur "defun\\s +\\(\\S +\\)" "\\1")
>>>
>>> doesn't return the collected things.  It writes the matches in *Occur*
>>> buffer.  Then, if you want a list with the matches you must loop
>>> again inside *Occur* which is sub-optimal.
>>> For me, it has sense to have a `occur-collect' which just returns the
>>> list of matches.
>>> Then, we might use such function in the implementation of occur-1
>>> which could bring a cleaner implementation.
>>> We might get also the LIMIT argument for occur which might come
>>> in handy for multi-occur with lot of input buffers (just an idea).
>>
>> occur-collect is intended for interactive use.  As for programmatic use,
>> Dmitry is right: a universal idiom is (while (re-search-forward ...)).
>> This is why e.g. the docstring of ‘replace-regexp’ recommends to use
>> an explicit loop like (while (re-search-forward ...) (replace-match ...))
> OK thanks.  Let me ask you my last proposal before come back to my dark
> cave and start painting animals in the walls.
>
> Any interest in something like this?:
>
> (defmacro with-collect-matches (regexp &optional group &rest body)
>   "Collect matches for REGEXP and eval BODY for each match.
> BODY is evaluated with `it' bound to the match.
> Optional GROUP if non-nil, then is the regexp group to save.  Otherwise,
> save the whole match."

Sorry if this was said already, but why a macro and not a map-like
function?

Best,

-- 
Marcin Borkowski




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#26338; Package emacs. (Wed, 05 Apr 2017 12:00:03 GMT)

Message #35 received at 26338 <at> debbugs.gnu.org:

From: Tino Calancha <tino.calancha <at> gmail.com>
To: Marcin Borkowski <mbork <at> mbork.pl>
Cc: 26338 <at> debbugs.gnu.org, Juri Linkov <juri <at> linkov.net>,
 Tino Calancha <tino.calancha <at> gmail.com>
Subject: Re: bug#26338: 26.0.50; Collect all matches for REGEXP in current
 buffer
Date: Wed, 5 Apr 2017 20:58:51 +0900 (JST)

On Tue, 4 Apr 2017, Marcin Borkowski wrote:

>> Any interest in something like this?:
>>
>> (defmacro with-collect-matches (regexp &optional group &rest body)
>>   "Collect matches for REGEXP and eval BODY for each match.
>> BODY is evaluated with `it' bound to the match.
>> Optional GROUP if non-nil, then is the regexp group to save.  Otherwise,
>> save the whole match."
>
> Sorry if this was said already, but why a macro and not a map-like
> function?
No special reason.  It's the second idea which came to my mind after
my initial proposal was declined.  Maybe because it is shorter to write:
(with-collect-matches regexp)
than
(foo-collect-matches regexp nil #'identity)

if you are just interested in the list of matches.  Implementing it as
a map function might also be nice.  I don't see much enthusiasm for
the proposal, though :-(

So far people think that it's easy to write a while loop.  I wonder
if they think the same about the existence of `dolist': they should
never use it and always write a `while' loop instead.  I don't think they
do that anyway.

I will repeat it once more.  I find it nice to have an operator returning a
list with matches for REGEXP.  If such an operator, in addition, accepts a
body of code or a function, then I find this operator very nice
and elegant.

Regards,
Tino




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#26338; Package emacs. (Wed, 05 Apr 2017 13:11:01 GMT)

Message #38 received at 26338 <at> debbugs.gnu.org:

From: npostavs <at> users.sourceforge.net
To: Tino Calancha <tino.calancha <at> gmail.com>
Cc: 26338 <at> debbugs.gnu.org, Marcin Borkowski <mbork <at> mbork.pl>,
 Juri Linkov <juri <at> linkov.net>
Subject: Re: bug#26338: 26.0.50;
 Collect all matches for REGEXP in current buffer
Date: Wed, 05 Apr 2017 09:11:30 -0400
Tino Calancha <tino.calancha <at> gmail.com> writes:

>
> So far people think that it's easy to write a while loop.  I wonder if
> they think the same about the existence of `dolist': the should
> never use it and always write a `while' loop instead.  Don't think they
> do that anyway.

Perhaps a macro that loops over matches?

    (defmacro domatches (spec &rest body)
      "Loop over matches to REGEXP.

      \(fn (MATCH-VAR [GROUP] REGEXP [BOUND]) BODY...)")

Or an addition to cl-loop that would allow doing something like

    (cl-loop for m being the matches of "foo\\|bar"
             do ...)

Then you could easily 'collect m' to get the list of matches if you want
that.

> I will repeat it once more.  I find nice, having an operator returning
> a list with matches for REGEXP.

I don't think that's come up for me very much, if at all.  It seems
easier to just operate on the matches directly rather than collecting
and then mapping.

> If such operator, in addition,
> accepts a body of code or a function, then i find this operator very
> nice
> and elegant.

Forcing collection on the looping operator seems inelegant to me.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#26338; Package emacs. (Wed, 05 Apr 2017 22:06:02 GMT)

Message #41 received at 26338 <at> debbugs.gnu.org:

From: Juri Linkov <juri <at> linkov.net>
To: Tino Calancha <tino.calancha <at> gmail.com>
Cc: 26338 <at> debbugs.gnu.org, Marcin Borkowski <mbork <at> mbork.pl>
Subject: Re: bug#26338: 26.0.50;
 Collect all matches for REGEXP in current buffer
Date: Thu, 06 Apr 2017 01:03:04 +0300
>> Sorry if this was said already, but why a macro and not a map-like
>> function?
> No special reason.  It's the second idea which came to my mind after
> my initial proposal was declined.  Maybe because is shorter to do:
> (with-collect-matches regexp)
> than
> (foo-collect-matches regexp nil #'identity)
>
> if you are just interested in the list of matches.  Implementing it as
> a map function might be also nice.  Don't see a big enthusiasm on
> the proposal, though :-(
>
> So far people think that it's easy to write a while loop.  I wonder if they
> think the same about the existence of `dolist': the should
> never use it and always write a `while' loop instead.  Don't think they
> do that anyway.
>
> I will repeat it once more.  I find nice, having an operator returning
> a list with matches for REGEXP.  If such operator, in addition, accepts
> a body of code or a function, then i find this operator very nice
> and elegant.

A mapcar-like function presumes a lambda where you can process every
match as you need, but going this way you'd have a temptation to
implement an analogous API from other programming languages like e.g.
https://apidock.com/ruby/String/scan
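
For concreteness, a mapcar-like variant of the kind discussed here might
look like the sketch below.  The name `my-map-matches' and its argument
order are assumptions made for illustration; it searches from point and
assumes REGEXP does not match the empty string.

```elisp
;; Hypothetical helper, for illustration only; not part of Emacs.
(defun my-map-matches (function regexp &optional group)
  "Call FUNCTION on each match for REGEXP after point.
GROUP, if non-nil, selects the regexp group passed to FUNCTION.
Return the list of results, in buffer order."
  (let ((group (or group 0))
        res)
    (save-excursion
      (while (re-search-forward regexp nil t)
        ;; FUNCTION receives the match as a string and is free to
        ;; run its own searches; the match data is restored afterwards.
        (push (save-match-data
                (funcall function (match-string-no-properties group)))
              res)))
    (nreverse res)))
```

Passing #'identity as FUNCTION gives the plain list of matches.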




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#26338; Package emacs. (Fri, 07 Apr 2017 10:07:02 GMT)

Message #44 received at 26338 <at> debbugs.gnu.org:

From: Tino Calancha <tino.calancha <at> gmail.com>
To: npostavs <at> users.sourceforge.net
Cc: 26338 <at> debbugs.gnu.org, Juri Linkov <juri <at> linkov.net>,
 Marcin Borkowski <mbork <at> mbork.pl>, Tino Calancha <tino.calancha <at> gmail.com>
Subject: Re: bug#26338: 26.0.50; Collect all matches for REGEXP in current
 buffer
Date: Fri, 7 Apr 2017 19:06:30 +0900 (JST)

On Wed, 5 Apr 2017, npostavs <at> users.sourceforge.net wrote:

> Tino Calancha <tino.calancha <at> gmail.com> writes:
>
>>
>> So far people think that it's easy to write a while loop.  I wonder if
>> they think the same about the existence of `dolist': the should
>> never use it and always write a `while' loop instead.  Don't think they
>> do that anyway.
>
> Perhaps a macro that loops over matches?
>
>    (defmacro domatches (spec &rest body)
>      "Loop over matches to REGEXP.
>
>      \(fn (MATCH-VAR [GROUP] REGEXP [BOUND]) BODY...)")
>
> Or an addition to cl-loop that would allow doing something like
>
>    (cl-loop for m being the matches of "foo\\|bar"
>             do ...)
>
> Then you could easily 'collect m' to get the list of matches if you want
> that.
Your proposals look nice to me ;-)
>
>> I will repeat it once more.  I find nice, having an operator returning
>> a list with matches for REGEXP.
>
> I don't think that's come up for me very much, if at all.  It seems
> easier to just operate on the matches directly rather than collecting
> and then mapping.
Sometimes I want to collect matches for different purposes, e.g. to feed
them into other functions accepting a list.  That's why I miss a standard
operator for collecting matches.  Sure, it can be done with a `while' loop
in 3-5 lines.  With the operator it would be just one function call.

>> If such operator, in addition,
>> accepts a body of code or a function, then i find this operator very
>> nice
>> and elegant.
>
> Forcing collection on the looping operator seems inelegant to me.
You know, beauty is in the eye of the beholder.  Elegance too.  Maybe
you don't like the blue jersey I am wearing now; my mum made it
for me and I love it ;-)
I suppose it depends on the name of the operator.  It is no
surprise if `collect-matches' collects matches; it is a bit of a surprise if
`domatches' does such a thing.
Thank you for your opinion :-)




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#26338; Package emacs. (Fri, 07 Apr 2017 14:41:01 GMT)

Message #47 received at 26338 <at> debbugs.gnu.org:

From: Drew Adams <drew.adams <at> oracle.com>
To: Tino Calancha <tino.calancha <at> gmail.com>, npostavs <at> users.sourceforge.net
Cc: 26338 <at> debbugs.gnu.org, Marcin Borkowski <mbork <at> mbork.pl>,
 Juri Linkov <juri <at> linkov.net>
Subject: RE: bug#26338: 26.0.50; Collect all matches for REGEXP in current
 buffer
Date: Fri, 7 Apr 2017 07:40:28 -0700 (PDT)
> > Or an addition to cl-loop that would allow doing something like
> >
> >    (cl-loop for m being the matches of "foo\\|bar"
> >             do ...)
> >
> > Then you could easily 'collect m' to get the list of matches if you want
> > that.
>
> Your proposals looks nice to me ;-)

(Caveat: I have not been following this thread.)

I think that `cl-loop' should be as close to Common Lisp `loop'
as we can reasonably make it.  We should _not_ be adding other
features to it or changing its behavior away from what it is
supposedly emulating.

If you want, create a _different_ macro that is Emacs-specific,
with whatever behavior you want.  Call it whatever you want
that will not be confused with Common Lisp emulation.

Please keep `cl-' for Common Lisp emulation.  We've already
seen more than enough tampering with this - people adding
their favorite thing to the `cl-' namespace.  Not good.
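
For what it is worth, the existing `while' and `collect' clauses of
`cl-loop' can already express the collection without any new clause.  The
following is a sketch only: the wrapper name `my-loop-collect-matches' is
hypothetical, it searches from point, and it assumes REGEXP does not match
the empty string.

```elisp
(require 'cl-lib)

;; Hypothetical helper built only from existing `cl-loop' clauses.
(defun my-loop-collect-matches (regexp &optional group limit)
  "Return the matches for REGEXP after point as a list.
GROUP, if non-nil, is the regexp group to collect.
LIMIT, if non-nil, is the maximum number of matches to collect."
  (let ((group (or group 0)))
    (save-excursion
      (if limit
          ;; The counting clause ends the loop after LIMIT iterations.
          (cl-loop for _n from 0 below limit
                   while (re-search-forward regexp nil t)
                   collect (match-string-no-properties group))
        (cl-loop while (re-search-forward regexp nil t)
                 collect (match-string-no-properties group))))))
```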




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#26338; Package emacs. (Fri, 07 Apr 2017 14:48:02 GMT)

Message #50 received at 26338 <at> debbugs.gnu.org:

From: Tino Calancha <tino.calancha <at> gmail.com>
To: Juri Linkov <juri <at> linkov.net>
Cc: 26338 <at> debbugs.gnu.org, Dmitry Gutov <dgutov <at> yandex.ru>,
 Marcin Borkowski <mbork <at> mbork.pl>, tino.calancha <at> gmail.com,
 npostavs <at> users.sourceforge.net
Subject: Re: bug#26338: 26.0.50;
 Collect all matches for REGEXP in current buffer
Date: Fri, 07 Apr 2017 23:47:16 +0900
Juri Linkov <juri <at> linkov.net> writes:

>>> Sorry if this was said already, but why a macro and not a map-like
>>> function?
>> No special reason.  It's the second idea which came to my mind after
>> my initial proposal was declined.  Maybe because is shorter to do:
>> (with-collect-matches regexp)
>> than
>> (foo-collect-matches regexp nil #'identity)
>>
>> if you are just interested in the list of matches.  Implementing it as
>> a map function might be also nice.  Don't see a big enthusiasm on
>> the proposal, though :-(
>>
>> So far people think that it's easy to write a while loop.  I wonder if they
>> think the same about the existence of `dolist': the should
>> never use it and always write a `while' loop instead.  Don't think they
>> do that anyway.
>>
>> I will repeat it once more.  I find nice, having an operator returning
>> a list with matches for REGEXP.  If such operator, in addition, accepts
>> a body of code or a function, then i find this operator very nice
>> and elegant.
>
> A mapcar-like function presumes a lambda where you can process every
> match as you need, but going this way you'd have a temptation to
> implement an analogous API from other programming languages like e.g.
> https://apidock.com/ruby/String/scan
I am not crazy about the mapcar-like implementation either.
Actually, I have changed my mind after Noah's nice suggestion.  He
mentioned the possibility of extending `cl-loop' with a new clause to
iterate over matches for a regexp.
I think this clause fits well in cl-loop; this way we don't need to
introduce a new function/macro name.


--8<-----------------------------cut here---------------start------------->8---
commit 59e66771d13fce73ff5220ce3df677b9247c9c52
Author: Tino Calancha <tino.calancha <at> gmail.com>
Date:   Fri Apr 7 23:31:08 2017 +0900

    New clause in cl-loop to iterate in the matches of a regexp
    
    Add new clause in cl-loop facility to loop over the matches for
    REGEXP in the current buffer (Bug#26338).
    * lisp/emacs-lisp/cl-macs.el (cl--parse-loop-clause): Add new clause.
    (cl-loop): update docstring.
    * doc/misc/cl.texi (For Clauses): Document the new clause.
    * etc/NEWS: Mention this change.

diff --git a/doc/misc/cl.texi b/doc/misc/cl.texi
index 2339d57631..6c5c43ad09 100644
--- a/doc/misc/cl.texi
+++ b/doc/misc/cl.texi
@@ -2030,6 +2030,21 @@ For Clauses
 This clause iterates over a sequence, with @var{var} a @code{setf}-able
 reference onto the elements; see @code{in-ref} above.
 
+@item for @var{var} being the matches of @var{regexp}
+This clause iterates over the matches for @var{regexp} in the current buffer.
+By default, @var{var} is bound to the full match.  Optionally, @var{var}
+might be bound to a subpart of the match.  It's also possible to restrict
+the loop to a given number of matches.  For example,
+
+@example
+(cl-loop for x being the matches of "^(defun \\(\\S +\\)"
+         using '(group 1 limit 10)
+         collect x)
+@end example
+
+@noindent
+collects the next 10 function names after point.
+
 @item for @var{var} being the symbols [of @var{obarray}]
 This clause iterates over symbols, either over all interned symbols
 or over all symbols in @var{obarray}.  The loop is executed with
diff --git a/etc/NEWS b/etc/NEWS
index aaca229d5c..03f6ecb88b 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -862,6 +862,10 @@ instead of its first.
 * Lisp Changes in Emacs 26.1
 
 +++
+** New clause in cl-loop to iterate in the matches for a regexp
+in the current buffer.
+
++++
 ** Emacs now supports records for user-defined types, via the new
 functions 'copy-record', 'make-record', 'record', and 'recordp'.
 Records are now used internally to represent cl-defstruct and defclass
diff --git a/lisp/emacs-lisp/cl-macs.el b/lisp/emacs-lisp/cl-macs.el
index 25c9f99992..50596c066e 100644
--- a/lisp/emacs-lisp/cl-macs.el
+++ b/lisp/emacs-lisp/cl-macs.el
@@ -892,6 +892,7 @@ cl-loop
       the overlays/intervals [of BUFFER] [from POS1] [to POS2]
       the frames/buffers
       the windows [of FRAME]
+      the matches of/for REGEXP [using (group GROUP [limit LIMIT])]
   Iteration clauses:
     repeat INTEGER
     while/until/always/never/thereis CONDITION
@@ -1339,6 +1340,33 @@ cl--parse-loop-clause
 		  (push (list temp-idx `(1+ ,temp-idx))
 			loop-for-steps)))
 
+               ((memq word '(match matches))
+		(let* ((_ (or (and (not (memq (car cl--loop-args) '(of for)))
+				    (error "Expected `of'"))))
+		      (regexp (cl--pop2 cl--loop-args))
+                      (group-limit
+                       (and (eq (car cl--loop-args) 'using)
+                            (consp (cadr cl--loop-args))
+                            (>= (length (cadr cl--loop-args)) 2)
+                            (cadr (cl--pop2 cl--loop-args))))
+                      (group
+                       (or (and group-limit
+                                (cl-find 'group group-limit)
+                                (nth (1+ (cl-position 'group group-limit)) group-limit))
+                           0))
+                      (limit
+                       (and group-limit
+                            (cl-find 'limit group-limit)
+                            (nth (1+ (cl-position 'limit group-limit)) group-limit)))
+                      (count (make-symbol "--cl-count")))
+                  (push (list count 0) loop-for-bindings)
+                  (push (list var nil) loop-for-bindings)
+                  (push `(re-search-forward ,regexp nil t) cl--loop-body)
+                  (push `(or (null ,limit) (and (natnump ,limit) (< ,count ,limit))) cl--loop-body)
+                  (push (list count `(1+ ,count)) loop-for-sets)
+                  (push (list var `(match-string-no-properties ,group))
+                        loop-for-sets)))
+
 	       ((memq word hash-types)
 		(or (memq (car cl--loop-args) '(in of))
                     (error "Expected `of'"))
--8<-----------------------------cut here---------------end--------------->8---
In GNU Emacs 26.0.50 (build 7, x86_64-pc-linux-gnu, GTK+ Version 3.22.11)
 of 2017-04-07
Repository revision: 67aeaa74af8504f950f653136d749c6dd03a60de




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#26338; Package emacs. (Fri, 07 Apr 2017 15:29:02 GMT) Full text and rfc822 format available.

Message #53 received at 26338 <at> debbugs.gnu.org (full text, mbox):

From: Noam Postavsky <npostavs <at> users.sourceforge.net>
To: Tino Calancha <tino.calancha <at> gmail.com>
Cc: 26338 <at> debbugs.gnu.org, Dmitry Gutov <dgutov <at> yandex.ru>,
 Marcin Borkowski <mbork <at> mbork.pl>, Juri Linkov <juri <at> linkov.net>
Subject: Re: bug#26338: 26.0.50;
 Collect all matches for REGEXP in current buffer
Date: Fri, 7 Apr 2017 11:28:46 -0400
On Fri, Apr 7, 2017 at 10:47 AM, Tino Calancha <tino.calancha <at> gmail.com> wrote:
> +
> +@example
> +(cl-loop for x being the matches of "^(defun \\(\\S +\\)"
> +         using '(group 1 limit 10)
> +         collect x)
> +@end example

You can reuse the existing 'repeat N' clause instead of 'using (limit N)'.

(cl-loop for x being the matches of "^(defun \\(\\S +\\)" using (group 1)
         repeat 10
         collect x)
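
For comparison, roughly the same result can already be had in stock Emacs,
without any new clause, by driving the loop with `while' and
`re-search-forward'.  This is only a minimal sketch: the helper name
`my-next-defun-names' is made up for illustration and is not part of the
proposed patch.

```elisp
(require 'cl-lib)

(defun my-next-defun-names (n)
  "Return up to N function names defined after point in the current buffer.
Hypothetical helper; it uses only existing `cl-loop' clauses."
  (save-excursion
    ;; `repeat' comes first, so no extra search runs once N names are found.
    (cl-loop repeat n
             while (re-search-forward "^(defun \\(\\S +\\)" nil t)
             collect (match-string-no-properties 1))))
```

Because the search runs inside `save-excursion', point is left where it was.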

Regarding Drew's concerns about extending cl-loop with more non-Common
Lisp things, I just don't see that as a problem. I suppose it would be
nice to have a more easily extensible looping macro, like iterate [1].
That would be quite a bit of work though.

[1]: https://common-lisp.net/project/iterate/




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#26338; Package emacs. (Fri, 07 Apr 2017 15:55:02 GMT) Full text and rfc822 format available.

Message #56 received at 26338 <at> debbugs.gnu.org (full text, mbox):

From: Drew Adams <drew.adams <at> oracle.com>
To: Noam Postavsky <npostavs <at> users.sourceforge.net>, Tino Calancha
 <tino.calancha <at> gmail.com>
Cc: 26338 <at> debbugs.gnu.org, Dmitry Gutov <dgutov <at> yandex.ru>,
 Marcin Borkowski <mbork <at> mbork.pl>, Juri Linkov <juri <at> linkov.net>
Subject: RE: bug#26338: 26.0.50; Collect all matches for REGEXP in current
 buffer
Date: Fri, 7 Apr 2017 08:54:44 -0700 (PDT)
> Regarding Drew's concerns about extending cl-loop with more non-Common
> Lisp things, I just don't see that as a problem.

It depends on what one considers "a problem".  I think it
is a problem for `cl-', which was intended for Common Lisp
emulation, to become a dumping ground for anyone's idea of
a cool thing to add to Emacs.  That's not what it's for.

We've already had a couple of things unrelated to CL that
were misguidedly added to `cl-'.  We should not continue
that practice (and we really should remove those from the
`cl-' namespace).

There is nothing preventing Emacs from adding any constructs
it wants.  There just is no reason why the `cl-' namespace
(and the `cl*.el' files) should be polluted with stuff that
is not Common Lisp emulation.

A user of `cl-loop' should be able to expect Common Lisp
`loop', or as close to it as we can get.

> I suppose it would be nice to have a more easily extensible
> looping macro, like iterate [1].  That would be quite a bit
> of work though.

As for `iterate': If this is what you mean:
https://common-lisp.net/project/iterate/

then I'm all in favor of it.  I much prefer it to `loop'.
But I don't see anyone stepping forward to add it to Emacs.

Even then, I would probably prefer that we add it to the
`cl-' namespace and stay as close as possible to emulating
the Common Lisp `iterate' (no, it is not part of the CL
language, but yes, it is something developed for/with CL).

There are lots of users of CL, and lots of CL code.  Both
should find a simple, straightforward path to Emacs.  We
should minimize any differences between Emacs emulations
and the things being emulated.

But again, nothing prevents Emacs adding a different
construct that does exactly what you want, with all the
bells and whistles you think are improvements over `loop'
or `iterate' or whatever.

That should not be in the `cl-' namespace, and we should
not confuse users by passing it off as (even a partial)
emulation of a Common Lisp construct.  That's all.

Just one opinion.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#26338; Package emacs. (Sat, 08 Apr 2017 04:46:02 GMT) Full text and rfc822 format available.

Message #59 received at 26338 <at> debbugs.gnu.org (full text, mbox):

From: Tino Calancha <tino.calancha <at> gmail.com>
To: Drew Adams <drew.adams <at> oracle.com>
Cc: 26338 <at> debbugs.gnu.org, Juri Linkov <juri <at> linkov.net>,
 npostavs <at> users.sourceforge.net, Marcin Borkowski <mbork <at> mbork.pl>,
 Tino Calancha <tino.calancha <at> gmail.com>
Subject: RE: bug#26338: 26.0.50; Collect all matches for REGEXP in current
 buffer
Date: Sat, 8 Apr 2017 13:45:49 +0900 (JST)

On Fri, 7 Apr 2017, Drew Adams wrote:

>>> Or an addition to cl-loop that would allow doing something like
>>>
>>>    (cl-loop for m being the matches of "foo\\|bar"
>>>             do ...)
>>>
>>> Then you could easily 'collect m' to get the list of matches if you want
>>> that.
>>
>> Your proposals looks nice to me ;-)
>
> (Caveat: I have not been following this thread.)
>
> I think that `cl-loop' should be as close to Common Lisp `loop'
> as we can reasonably make it.  We should _not_ be adding other
> features to it or changing its behavior away from what it is
> supposedly emulating.
>
> If you want, create a _different_ macro that is Emacs-specific,
> with whatever behavior you want.  Call it whatever you want
> that will not be confused with Common Lisp emulation.
>
> Please keep `cl-' for Common Lisp emulation.  We've already
> seen more than enough tampering with this - people adding
> their favorite thing to the `cl-' namespace.  Not good.
Drew, I respect your opinion, but so far the change would just extend
`cl-loop', which, as you noticed, has already been extended before.
For instance, we have:
cl-loop for x being the overlays/buffers ...

I don't see a problem with having those things.  We already point out in
the manual that these are Emacs-specific things, so nobody should be
fooled by that.  As long as we cover all CL clauses, what problem could
there be in having a few more?

I find it interesting to be able to do things like the following:

--8<-----------------------------cut here---------------start------------->8---
(require 'find-lisp)
(let ((op "defun")
      (dir (expand-file-name "lisp" source-directory)))
  (setq funcs
        (cl-loop for f in (find-lisp-find-files dir "\\.el\\'") nconc
                 (with-temp-buffer
                   (insert-file-contents-literally f)
                   (let ((regexp (format "^(%s \\(\\S +\\)" op)))
                     (cl-loop for x being the matches of regexp using '(group 1) collect x)))))
  (length funcs))

=> 38898 ; op: defun
=> 1256 ; op: defmacro
=> 1542 ; op: defsubst

--8<-----------------------------cut here---------------end--------------->8---




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#26338; Package emacs. (Sat, 08 Apr 2017 05:50:02 GMT) Full text and rfc822 format available.

Message #62 received at 26338 <at> debbugs.gnu.org (full text, mbox):

From: Drew Adams <drew.adams <at> oracle.com>
To: Tino Calancha <tino.calancha <at> gmail.com>
Cc: 26338 <at> debbugs.gnu.org, Juri Linkov <juri <at> linkov.net>,
 Marcin Borkowski <mbork <at> mbork.pl>, npostavs <at> users.sourceforge.net
Subject: RE: bug#26338: 26.0.50; Collect all matches for REGEXP in current
 buffer
Date: Fri, 7 Apr 2017 22:49:29 -0700 (PDT)
> >>> Or an addition to cl-loop that would allow doing something like
> >>>    (cl-loop for m being the matches of "foo\\|bar"
> >>>             do ...)
> >>> Then you could easily 'collect m' to get the list of matches if you want
> >>> that.
> >> Your proposals looks nice to me ;-)
> >
> > (Caveat: I have not been following this thread.)
> >
> > I think that `cl-loop' should be as close to Common Lisp `loop'
> > as we can reasonably make it.  We should _not_ be adding other
> > features to it or changing its behavior away from what it is
> > supposedly emulating.
> >
> > If you want, create a _different_ macro that is Emacs-specific,
> > with whatever behavior you want.  Call it whatever you want
> > that will not be confused with Common Lisp emulation.
> >
> > Please keep `cl-' for Common Lisp emulation.  We've already
> > seen more than enough tampering with this - people adding
> > their favorite thing to the `cl-' namespace.  Not good.
>
> Drew, i respect your opinion; but so far the change
> would just extend `cl-loop' which as you noticed has being
> already extended before.  For instance, we have:
> cl-loop for x being the overlays/buffers ...
> 
> Don't see a problem to have those things.  We already point out in the
> manual that these are Emacs specific things, so nobody should be fooled
> with that.  As far as we cover all CL clauses, what problem could be in
> having a few more?

You make a fair point when you stick to only extension and
keep compatibility for the rest.  I still disagree with it,
for the reasons given below.

And because the next enhancement proposal will perhaps just
point to this one as more precedent for making changes,
without bothering to, itself, ensure that "we cover all CL
clauses" etc.  As you pointed out: "so far"...  Little by
little, we've already seen `cl-' diluted from CL by being
incrementally "enhanced".

Here's my general opinion on this kind of thing:

I agree that such things are useful.  I have nothing against
them, and I'm glad to see Emacs have them.  My objection is
to using `cl-loop' for it.

`cl-loop' - and all of the stuff in `cl-*' - should be for
Common Lisp emulation.  Nothing more or less.  That's my
opinion.

Emacs should have its _own_, non-cl-* functions, macros,
variables, whatever.  It can take Common Lisp constructs
as a point of departure or inspiration, and extend, enhance,
limit, or in any way change that point of departure as is
most fitting and useful for Emacs.

That's normal.  I'm all for that kind of thing.  But it
should not be confused with Common Lisp emulation.  `cl-'
should be kept for Common Lisp emulation.

Users should be able to recognize when they are using CL
code (an emulation of it).  Users should be able to take
existing CL code and use it in Emacs with little or no
modification (no, we're not there yet, and never will be
completely, but it's a good goal).

Put all this stuff - and more - into an `eloop' macro.
Since it will be so much better and more Emacsy, with
specifics that are especially useful for Emacs, it is
what users will (eventually) use instead of `cl-loop'.

Since it will do everything that `cl-loop' does (and
more), eventually only the rare user who needs, or for
some reason really wants, code that is CL or close to
it will use `cl-loop'.  Everyone else will use `eloop'.
No problem.

I am sure that my opinion on this is a minority one -
perhaps a minority of one.

But going the other direction, along lines such as what
you suggest:

1. We lose the value of `cl-' as an emulation of CL.  And
   typically we lose compatibility with existing CL code.

2. We lose the ability, when seeing something `cl-', to
   know we can look it up in the (fine) Common Lisp docs.

3. Where does it stop?  What's the point of `cl-', if
   anything goes and we can stuff whatever into it?

What prevents Emacs design from doing the right thing?
What do we lose by putting non-CL stuff into an `eloop'
that extends `cl-loop' in Emacsy ways?

Sure, invent more and better and different.  But put it
in a different namespace or in no namespace.

If `cl-' is just an Emacs thing and no longer a Common
Lisp thing then why the pretense of having a CL manual;
and using a `cl-' namespace; and pointing to the CL docs
for explanation (the Emacs docs explain practically
nothing about its `cl-' constructs - there is really no
doc for `cl-' in Emacs)?

What's the point?  Leave `cl-loop' as Common Lisp's `loop'.
Create a more-and-better, more Emacsy `eloop' or whatever.
Complete freedom - do whatever.  My vote is only that `cl-'
be kept for CL (and even be cleaned up to be more like it).
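
A construct outside the `cl-' namespace could be quite small.  The following
is only a sketch under stated assumptions: the name `my-do-matches' and its
(VAR REGEXP [GROUP]) argument layout are hypothetical, modelled on `dolist',
and nothing like it was proposed in this thread.

```elisp
(defmacro my-do-matches (spec &rest body)
  "Evaluate BODY once for each match of a regexp after point.
SPEC is (VAR REGEXP [GROUP]).  VAR is bound to the text of subgroup
GROUP of each match; GROUP defaults to 0, the whole match.
Hypothetical macro, shown for illustration only."
  (declare (indent 1))
  (let ((regexp (make-symbol "regexp"))
        (group (make-symbol "group")))
    ;; Evaluate REGEXP and GROUP once, before the loop starts.
    `(let ((,regexp ,(nth 1 spec))
           (,group ,(or (nth 2 spec) 0)))
       (save-excursion
         (while (re-search-forward ,regexp nil t)
           ;; Bind VAR before BODY runs, since BODY may clobber the match data.
           (let ((,(car spec) (match-string-no-properties ,group)))
             ,@body))))))
```

A call such as (my-do-matches (m "foo\\|bar") (push m acc)) then collects the
matches into a list ACC that the caller has bound beforehand.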




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#26338; Package emacs. (Sat, 08 Apr 2017 11:48:02 GMT) Full text and rfc822 format available.

Message #65 received at 26338 <at> debbugs.gnu.org (full text, mbox):

From: Philipp Stephani <p.stephani2 <at> gmail.com>
To: Tino Calancha <tino.calancha <at> gmail.com>, Drew Adams <drew.adams <at> oracle.com>
Cc: 26338 <at> debbugs.gnu.org, Juri Linkov <juri <at> linkov.net>,
 Marcin Borkowski <mbork <at> mbork.pl>, npostavs <at> users.sourceforge.net
Subject: Re: bug#26338: 26.0.50;
 Collect all matches for REGEXP in current buffer
Date: Sat, 08 Apr 2017 11:46:50 +0000
Tino Calancha <tino.calancha <at> gmail.com> schrieb am Sa., 8. Apr. 2017 um
06:46 Uhr:

>
>
> On Fri, 7 Apr 2017, Drew Adams wrote:
>
> >>> Or an addition to cl-loop that would allow doing something like
> >>>
> >>>    (cl-loop for m being the matches of "foo\\|bar"
> >>>             do ...)
> >>>
> >>> Then you could easily 'collect m' to get the list of matches if you
> want
> >>> that.
> >>
> >> Your proposals looks nice to me ;-)
> >
> > (Caveat: I have not been following this thread.)
> >
> > I think that `cl-loop' should be as close to Common Lisp `loop'
> > as we can reasonably make it.  We should _not_ be adding other
> > features to it or changing its behavior away from what it is
> > supposedly emulating.
> >
> > If you want, create a _different_ macro that is Emacs-specific,
> > with whatever behavior you want.  Call it whatever you want
> > that will not be confused with Common Lisp emulation.
> >
> > Please keep `cl-' for Common Lisp emulation.  We've already
> > seen more than enough tampering with this - people adding
> > their favorite thing to the `cl-' namespace.  Not good.
> Drew, i respect your opinion; but so far the change
> would just extend `cl-loop' which as you noticed has being already
> extended before.
> For instance, we have:
> cl-loop for x being the overlays/buffers ...
>
> Don't see a problem to have those things.


I do. They couple the idea of an iterable with a looping construct, and
such coupling is bad for various reasons:
- Coupling of unrelated entities is always an antipattern.
- For N iterables and M looping constructs, you need to implement N*M
integrations.
Instead this should use an iterable, e.g. a generator function
(iter-defun). cl-loop supports these out of the box.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#26338; Package emacs. (Sat, 08 Apr 2017 13:43:02 GMT) Full text and rfc822 format available.

Message #68 received at 26338 <at> debbugs.gnu.org (full text, mbox):

From: Tino Calancha <tino.calancha <at> gmail.com>
To: Philipp Stephani <p.stephani2 <at> gmail.com>
Cc: Marcin Borkowski <mbork <at> mbork.pl>, npostavs <at> users.sourceforge.net,
 26338 <at> debbugs.gnu.org, Tino Calancha <tino.calancha <at> gmail.com>,
 Juri Linkov <juri <at> linkov.net>, Drew Adams <drew.adams <at> oracle.com>
Subject: Re: bug#26338: 26.0.50; Collect all matches for REGEXP in current
 buffer
Date: Sat, 8 Apr 2017 22:42:41 +0900 (JST)

On Sat, 8 Apr 2017, Philipp Stephani wrote:

> 
> 
> Tino Calancha <tino.calancha <at> gmail.com> schrieb am Sa., 8. Apr. 2017 um 06:46 Uhr:
> 
>
>       On Fri, 7 Apr 2017, Drew Adams wrote:
>
>       >>> Or an addition to cl-loop that would allow doing something like
>       >>>
>       >>>    (cl-loop for m being the matches of "foo\\|bar"
>       >>>             do ...)
>       >>>
>       >>> Then you could easily 'collect m' to get the list of matches if you want
>       >>> that.
>       >>
>       >> Your proposals looks nice to me ;-)
>       >
>       > (Caveat: I have not been following this thread.)
>       >
>       > I think that `cl-loop' should be as close to Common Lisp `loop'
>       > as we can reasonably make it.  We should _not_ be adding other
>       > features to it or changing its behavior away from what it is
>       > supposedly emulating.
>       >
>       > If you want, create a _different_ macro that is Emacs-specific,
>       > with whatever behavior you want.  Call it whatever you want
>       > that will not be confused with Common Lisp emulation.
>       >
>       > Please keep `cl-' for Common Lisp emulation.  We've already
>       > seen more than enough tampering with this - people adding
>       > their favorite thing to the `cl-' namespace.  Not good.
>       Drew, i respect your opinion; but so far the change
>       would just extend `cl-loop' which as you noticed has being already
>       extended before.
>       For instance, we have:
>       cl-loop for x being the overlays/buffers ...
>
>       Don't see a problem to have those things. 
> 
> 
> I do. They couple the idea of an iterable with a looping construct, and such coupling is bad for various reasons:
> - Coupling of unrelated entities is always an antipattern.
> - For N iterables and M looping constructs, you need to implement N*M integrations.
> Instead this should use an iterable, e.g. a generator function (iter-defun). cl-loop supports these out of the box.
Then, like Drew (but for different reasons), you don't like that we have:
cl-loop for x being the buffers ...

but it seems you are fine with having the iter-by clause in cl-loop, which
seems to be an Emacs extension (correct me if I am wrong).  So in principle,
you are happy with adding useful extensions to CL, not just keeping it an
emulation as Drew wants.

Your point is about performance.  I am driven by easy-to-write code.
Maybe you can provide an example of how to write those things using
the iter-by cl-loop clause.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#26338; Package emacs. (Sat, 08 Apr 2017 13:50:03 GMT) Full text and rfc822 format available.

Message #71 received at 26338 <at> debbugs.gnu.org (full text, mbox):

From: Tino Calancha <tino.calancha <at> gmail.com>
To: Noam Postavsky <npostavs <at> users.sourceforge.net>
Cc: 26338 <at> debbugs.gnu.org, Juri Linkov <juri <at> linkov.net>,
 Marcin Borkowski <mbork <at> mbork.pl>, Dmitry Gutov <dgutov <at> yandex.ru>
Subject: Re: bug#26338: 26.0.50;
 Collect all matches for REGEXP in current buffer
Date: Sat, 08 Apr 2017 22:49:48 +0900
Noam Postavsky <npostavs <at> users.sourceforge.net> writes:

> On Fri, Apr 7, 2017 at 10:47 AM, Tino Calancha <tino.calancha <at> gmail.com> wrote:
>> +
>> +@example
>> +(cl-loop for x being the matches of "^(defun \\(\\S +\\)"
>> +         using '(group 1 limit 10)
>> +         collect x)
>> +@end example
>
> You can reuse the existing 'repeat N' clause instead of 'using (limit N)'.
>
> (cl-loop for x being the matches of "^(defun \\(\\S +\\)" using (group 1)
>          repeat 10
>          collect x)
Right! Thank you.
I fixed some other parts of the patch as well.
--8<-----------------------------cut here---------------start------------->8---
commit fc2eed78e8c5591c3aad358a885b4b5bae6c1041
Author: Tino Calancha <tino.calancha <at> gmail.com>
Date:   Sat Apr 8 22:49:10 2017 +0900

    New clause in cl-loop to iterate in the matches of a regexp
    
    Add new clause in cl-loop facility to loop over the matches for
    REGEXP in the current buffer (Bug#26338).
    * lisp/emacs-lisp/cl-macs.el (cl--parse-loop-clause): Add new clause.
    (cl-loop): update docstring.
    * doc/misc/cl.texi (For Clauses): Document the new clause.
    * etc/NEWS: Mention this change.

diff --git a/doc/misc/cl.texi b/doc/misc/cl.texi
index 2339d57631..40b90d6003 100644
--- a/doc/misc/cl.texi
+++ b/doc/misc/cl.texi
@@ -2030,6 +2030,22 @@ For Clauses
 This clause iterates over a sequence, with @var{var} a @code{setf}-able
 reference onto the elements; see @code{in-ref} above.
 
+@item for @var{var} being the matches of @var{regexp}
+This clause iterates over the matches for @var{regexp} in the current buffer.
+By default, @var{var} is bound to the full match.  Optionally, @var{var}
+might be bound to a subpart of the match.
+For example,
+
+@example
+(cl-loop for x being the matches of "^(defun \\(\\S +\\)" using (group 1)
+         repeat 10
+         collect x)
+@end example
+
+@noindent
+collects the next 10 function names after point.
+This clause is an extension to standard Common Lisp.
+
 @item for @var{var} being the symbols [of @var{obarray}]
 This clause iterates over symbols, either over all interned symbols
 or over all symbols in @var{obarray}.  The loop is executed with
@@ -2487,8 +2503,8 @@ Other Clauses
 This package's @code{cl-loop} macro is compatible with that of Common
 Lisp, except that a few features are not implemented:  @code{loop-finish}
 and data-type specifiers.  Naturally, the @code{for} clauses that
-iterate over keymaps, overlays, intervals, frames, windows, and
-buffers are Emacs-specific extensions.
+iterate over keymaps, overlays, intervals, frames, windows, buffers, and
+matches for a regexp in the current buffer are Emacs-specific extensions.
 
 @node Multiple Values
 @section Multiple Values
diff --git a/etc/NEWS b/etc/NEWS
index e351abc159..b8298bf180 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -862,6 +862,10 @@ instead of its first.
 * Lisp Changes in Emacs 26.1
 
 +++
+** New clause in cl-loop to iterate in the matches for a regexp
+in the current buffer.
+
++++
 ** Emacs now supports records for user-defined types, via the new
 functions 'make-record', 'record', and 'recordp'.  Records are now
 used internally to represent cl-defstruct and defclass instances, for
diff --git a/lisp/emacs-lisp/cl-macs.el b/lisp/emacs-lisp/cl-macs.el
index ecb89fd51d..4710efd0a9 100644
--- a/lisp/emacs-lisp/cl-macs.el
+++ b/lisp/emacs-lisp/cl-macs.el
@@ -892,6 +892,7 @@ cl-loop
       the overlays/intervals [of BUFFER] [from POS1] [to POS2]
       the frames/buffers
       the windows [of FRAME]
+      the matches of/for REGEXP [using (group GROUP)]
   Iteration clauses:
     repeat INTEGER
     while/until/always/never/thereis CONDITION
@@ -1339,6 +1340,24 @@ cl--parse-loop-clause
 		  (push (list temp-idx `(1+ ,temp-idx))
 			loop-for-steps)))
 
+               ((memq word '(match matches))
+		(let* ((_ (or (and (not (memq (car cl--loop-args) '(of for)))
+                                   (error "Expected `of'"))))
+                       (regexp `(if (stringp ,(cadr cl--loop-args))
+                                    ,(cl--pop2 cl--loop-args)
+                                  (error "Regexp must be an string")))
+                       (group
+                        (if (eq (car cl--loop-args) 'using)
+                            (if (and (= (length (cadr cl--loop-args)) 2)
+                                     (eq (cl-caadr cl--loop-args) 'group))
+                                (cadr (cl--pop2 cl--loop-args))
+                              (error "Bad `using' clause"))
+                          0)))
+                  (push (list var nil) loop-for-bindings)
+                  (push `(re-search-forward ,regexp nil t) cl--loop-body)
+                  (push (list var `(match-string-no-properties ,group))
+                        loop-for-sets)))
+
 	       ((memq word hash-types)
 		(or (memq (car cl--loop-args) '(in of))
                     (error "Expected `of'"))
--8<-----------------------------cut here---------------end--------------->8---
In GNU Emacs 26.0.50 (build 14, x86_64-pc-linux-gnu, GTK+ Version 3.22.11)
 of 2017-04-08
Repository revision: 4fbfd7ad53810153371a588a9bd1a69230f60dd5




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#26338; Package emacs. (Sat, 08 Apr 2017 14:42:01 GMT) Full text and rfc822 format available.

Message #74 received at 26338 <at> debbugs.gnu.org (full text, mbox):

From: Philipp Stephani <p.stephani2 <at> gmail.com>
To: Tino Calancha <tino.calancha <at> gmail.com>
Cc: 26338 <at> debbugs.gnu.org, Juri Linkov <juri <at> linkov.net>,
 Marcin Borkowski <mbork <at> mbork.pl>, Drew Adams <drew.adams <at> oracle.com>,
 npostavs <at> users.sourceforge.net
Subject: Re: bug#26338: 26.0.50;
 Collect all matches for REGEXP in current buffer
Date: Sat, 08 Apr 2017 14:41:29 +0000
Tino Calancha <tino.calancha <at> gmail.com> schrieb am Sa., 8. Apr. 2017 um
15:42 Uhr:

>
>
> On Sat, 8 Apr 2017, Philipp Stephani wrote:
>
> >
> >
> > Tino Calancha <tino.calancha <at> gmail.com> schrieb am Sa., 8. Apr. 2017 um
> 06:46 Uhr:
> >
> >
> >       On Fri, 7 Apr 2017, Drew Adams wrote:
> >
> >       >>> Or an addition to cl-loop that would allow doing something like
> >       >>>
> >       >>>    (cl-loop for m being the matches of "foo\\|bar"
> >       >>>             do ...)
> >       >>>
> >       >>> Then you could easily 'collect m' to get the list of matches
> if you want
> >       >>> that.
> >       >>
> >       >> Your proposals looks nice to me ;-)
> >       >
> >       > (Caveat: I have not been following this thread.)
> >       >
> >       > I think that `cl-loop' should be as close to Common Lisp `loop'
> >       > as we can reasonably make it.  We should _not_ be adding other
> >       > features to it or changing its behavior away from what it is
> >       > supposedly emulating.
> >       >
> >       > If you want, create a _different_ macro that is Emacs-specific,
> >       > with whatever behavior you want.  Call it whatever you want
> >       > that will not be confused with Common Lisp emulation.
> >       >
> >       > Please keep `cl-' for Common Lisp emulation.  We've already
> >       > seen more than enough tampering with this - people adding
> >       > their favorite thing to the `cl-' namespace.  Not good.
> >       Drew, i respect your opinion; but so far the change
> >       would just extend `cl-loop' which as you noticed has being already
> >       extended before.
> >       For instance, we have:
> >       cl-loop for x being the overlays/buffers ...
> >
> >       Don't see a problem to have those things.
> >
> >
> > I do. They couple the idea of an iterable with a looping construct, and
> such coupling is bad for various reasons:
> > - Coupling of unrelated entities is always an antipattern.
> > - For N iterables and M looping constructs, you need to implement N*M
> integrations.
> > Instead this should use an iterable, e.g. a generator function
> (iter-defun). cl-loop supports these out of the box.
> Then, you don't like (as Drew, but for different reasons) that we have:
> cl-loop for x being the buffers ...
>

I don't like it, but it's there and cannot be removed for compatibility
reasons, so I'm not arguing about it. I'm arguing against adding more such
one-off forms.


>
> but it seems you are fine having iter-by clause in cl-loop, which seems an
> Emacs extension (correctme if i am wrong).  So in principle, you are happy
> with adding useful extensions to CL, not just keep it an emulation as
> Drew wants.
>

Yes, I don't care about Common Lisp. The iter-by clause is less of a
problem than 'buffers' etc. because it's not a one-off that couples a
looping construct with some random semantics.


>
> Your point is about performance.


No, I care mostly about clarity, simplicity, and good API design, including
separation of concerns.


>   I am driven by easy to write code.
> Maybe you can provide an example about how to write those things using
> the iter-by cl-loop clause.


Sure:
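(require 'cl-lib)     ; for `cl-loop'
(require 'generator)  ; for `iter-defun', `iter-yield' and `iter-do'

;; Yield each match of REGEXP, searching forward from point.
(iter-defun re-matches (regexp)
  (while (re-search-forward regexp nil t)
    (iter-yield (match-string 0))))

;; Consume the generator with `iter-do' ...
(iter-do (m (re-matches (rx digit)))
  (print m))

;; ... or with the `iter-by' clause of `cl-loop'.
(cl-loop for m iter-by (re-matches (rx digit))
         do (print m))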
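
A minimal sketch of how the generator above could be tried outside an
interactive buffer. It assumes lexical binding is enabled (generator.el
expects it); `my-collect-digits' is a hypothetical helper introduced only
for illustration, not part of any proposed patch.

```elisp
;;; -*- lexical-binding: t; -*-
(require 'cl-lib)
(require 'generator)

;; Same generator as in the message: yield each match after point.
(iter-defun re-matches (regexp)
  (while (re-search-forward regexp nil t)
    (iter-yield (match-string 0))))

;; Hypothetical helper: run the generator over TEXT in a temp buffer.
(defun my-collect-digits (text)
  "Return the list of single digits found in TEXT, in order."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    ;; The generator body runs lazily, each time `cl-loop' asks for
    ;; the next value, so it searches the temp buffer.
    (cl-loop for m iter-by (re-matches (rx digit))
             collect m)))

(my-collect-digits "a1b22c3")  ; => ("1" "2" "2" "3")
```

Note that the generator searches whatever buffer is current when a value
is requested, starting from point, so the caller decides where the search
begins.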

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#26338; Package emacs. (Sat, 08 Apr 2017 15:21:02 GMT) Full text and rfc822 format available.

Message #77 received at 26338 <at> debbugs.gnu.org (full text, mbox):

From: Tino Calancha <tino.calancha <at> gmail.com>
To: Philipp Stephani <p.stephani2 <at> gmail.com>
Cc: Marcin Borkowski <mbork <at> mbork.pl>, npostavs <at> users.sourceforge.net,
 26338 <at> debbugs.gnu.org, Tino Calancha <tino.calancha <at> gmail.com>,
 Juri Linkov <juri <at> linkov.net>, Drew Adams <drew.adams <at> oracle.com>
Subject: Re: bug#26338: 26.0.50; Collect all matches for REGEXP in current
 buffer
Date: Sun, 9 Apr 2017 00:20:22 +0900 (JST)

On Sat, 8 Apr 2017, Philipp Stephani wrote:

> 
> 
> Tino Calancha <tino.calancha <at> gmail.com> wrote on Sat, 8 Apr 2017 at 15:42:
> 
>
>       On Sat, 8 Apr 2017, Philipp Stephani wrote:
>
>       >
>       >
>       > Tino Calancha <tino.calancha <at> gmail.com> wrote on Sat, 8 Apr 2017 at 06:46:
>       >
>       >
>       >       On Fri, 7 Apr 2017, Drew Adams wrote:
>       >
>       >       >>> Or an addition to cl-loop that would allow doing something like
>       >       >>>
>       >       >>>    (cl-loop for m being the matches of "foo\\|bar"
>       >       >>>             do ...)
>       >       >>>
>       >       >>> Then you could easily 'collect m' to get the list of matches if you want
>       >       >>> that.
>       >       >>
>       >       >> Your proposals looks nice to me ;-)
>       >       >
>       >       > (Caveat: I have not been following this thread.)
>       >       >
>       >       > I think that `cl-loop' should be as close to Common Lisp `loop'
>       >       > as we can reasonably make it.  We should _not_ be adding other
>       >       > features to it or changing its behavior away from what it is
>       >       > supposedly emulating.
>       >       >
>       >       > If you want, create a _different_ macro that is Emacs-specific,
>       >       > with whatever behavior you want.  Call it whatever you want
>       >       > that will not be confused with Common Lisp emulation.
>       >       >
>       >       > Please keep `cl-' for Common Lisp emulation.  We've already
>       >       > seen more than enough tampering with this - people adding
>       >       > their favorite thing to the `cl-' namespace.  Not good.
>       >       Drew, I respect your opinion, but so far the change
>       >       would just extend `cl-loop', which, as you noticed, has already
>       >       been extended before.
>       >       For instance, we have:
>       >       cl-loop for x being the overlays/buffers ...
>       >
>       >       I don't see a problem with having those things.
>       >
>       >
>       > I do. They couple the idea of an iterable with a looping construct, and such coupling is bad for various reasons:
>       > - Coupling of unrelated entities is always an antipattern.
>       > - For N iterables and M looping constructs, you need to implement N*M integrations.
>       > Instead this should use an iterable, e.g. a generator function (iter-defun). cl-loop supports these out of the box.
>       Then, you don't like (as Drew, but for different reasons) that we have:
>       cl-loop for x being the buffers ...
> 
> 
> I don't like it, but it's there and cannot be removed for compatibility reasons, so I'm not arguing about it. I'm arguing against
> adding more such one-off forms.
I see.  Thanks for the clarification.
>  
>
>       but it seems you are fine with having the iter-by clause in cl-loop, which
>       seems to be an Emacs extension (correct me if I am wrong).  So in principle,
>       you are happy with adding useful extensions to CL, not just keeping it an
>       emulation as Drew wants.
> 
> 
> Yes, I don't care about Common Lisp. The iter-by clause is less of a problem than 'buffers' etc. because it's not a one-off that
> couples a looping construct with some random semantics.
Some people like it and refer to it as the 'expressivity' of the loop
facility.  I guess it's a matter of taste: you don't need to use such
constructs if you don't like them.  Some people do.
  
>
>       Your point is about performance.
> 
> 
> No, I care mostly about clarity, simplicity, and good API design, including separation of concerns.
Expressivity and readability might be a kind of clarity.
I totally agree about API design and separation of concerns.
>  
>         I am driven by easy to write code.
>       Maybe you can provide an example about how to write those things using
>       the iter-by cl-loop clause.
> 
> 
> Sure:
> (require 'generator)
> (iter-defun re-matches (regexp)
>   (while (re-search-forward regexp nil t)
>     (iter-yield (match-string 0))))
> (iter-do (m (re-matches (rx digit)))
>   (print m))
> (cl-loop for m iter-by (re-matches (rx digit))
>          do (print m))
Thank you very much for your examples.  They are nice.  I am not
as familiar as you with generators.  I must study them more.

Between A) and B), the second looks at least as simple and clear as
the first one, and probably more readable.

A)
(iter-defun re-matches (regexp)
  (while (re-search-forward regexp nil t)
    (iter-yield (match-string-no-properties 1))))

(cl-loop for m iter-by (re-matches "^(defun \\(\\S +\\)")
         collect m)

B)
(cl-loop for m being the matches of "^(defun \\(\\S +\\)"
         collect m)
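
A minimal sketch of how (A) could be run on sample text, assuming lexical
binding is enabled. Form (B) is only a proposed syntax and is not
something `cl-loop' accepts today, so it cannot be run for comparison.
The optional GROUP argument and the names `my-re-matches' and
`my-defun-names' are hypothetical additions for illustration.

```elisp
;;; -*- lexical-binding: t; -*-
(require 'cl-lib)
(require 'generator)

;; Variant of the generator in (A) with a hypothetical optional GROUP
;; argument selecting which regexp subgroup to yield (default 0).
(iter-defun my-re-matches (regexp &optional group)
  (while (re-search-forward regexp nil t)
    (iter-yield (match-string-no-properties (or group 0)))))

;; Hypothetical helper: collect the names of top-level defuns in TEXT.
(defun my-defun-names (text)
  "Return the names following \"(defun \" at line starts in TEXT."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (cl-loop for m iter-by (my-re-matches "^(defun \\(\\S +\\)" 1)
             collect m)))

(my-defun-names "(defun foo ()\n  1)\n(defun bar (x)\n  x)\n")
;; => ("foo" "bar")
```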

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#26338; Package emacs. (Sat, 08 Apr 2017 15:30:02 GMT) Full text and rfc822 format available.

Message #80 received at 26338 <at> debbugs.gnu.org (full text, mbox):

From: Tino Calancha <tino.calancha <at> gmail.com>
To: Drew Adams <drew.adams <at> oracle.com>
Cc: 26338 <at> debbugs.gnu.org, Juri Linkov <juri <at> linkov.net>,
 npostavs <at> users.sourceforge.net, Marcin Borkowski <mbork <at> mbork.pl>,
 Tino Calancha <tino.calancha <at> gmail.com>
Subject: RE: bug#26338: 26.0.50; Collect all matches for REGEXP in current
 buffer
Date: Sun, 9 Apr 2017 00:29:35 +0900 (JST)

On Fri, 7 Apr 2017, Drew Adams wrote:


> Put all this stuff - and more - into an `eloop' macro.
> Since it will be so much better and more Emacsy, with
> specifics that are especially useful for Emacs, it is
> what users will (eventually) use instead of `cl-loop'.
>
> Since it will do everything that `cl-loop' does (and
> more), eventually only the rare user who needs, or for
> some reason really wants, code that is CL or close to
> it will use `cl-loop'.  Everyone else will use `eloop'.
> No problem.
I guess that might cause a lot of code duplication.
IMO expert CL lispers will be sadder about this emulation's
lack of multiple return values than about the addition
of some extensions.  They can choose not to use them if they
don't like them.
Just one opinion too.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#26338; Package emacs. (Sat, 08 Apr 2017 15:38:01 GMT) Full text and rfc822 format available.

Message #83 received at 26338 <at> debbugs.gnu.org (full text, mbox):

From: npostavs <at> users.sourceforge.net
To: Philipp Stephani <p.stephani2 <at> gmail.com>
Cc: 26338 <at> debbugs.gnu.org, Juri Linkov <juri <at> linkov.net>,
 Marcin Borkowski <mbork <at> mbork.pl>, Drew Adams <drew.adams <at> oracle.com>,
 Tino Calancha <tino.calancha <at> gmail.com>
Subject: Re: bug#26338: 26.0.50;
 Collect all matches for REGEXP in current buffer
Date: Sat, 08 Apr 2017 11:38:28 -0400
Philipp Stephani <p.stephani2 <at> gmail.com> writes:

>>> - Coupling of unrelated entities is always an antipattern.
>>> - For N iterables and M looping constructs, you need to implement
>>> N*M integrations.
>
> Yes, I don't care about Common Lisp. The iter-by clause is less of a
> problem than 'buffers' etc. because it's not a one-off that couples a
> looping construct with some random semantics.

It's sort of related to Drew's concerns in that Emacs deals with the N*M
problem by setting M=1, which is why only cl-loop gets the pressure to add
more enhancements.

There are some practical problems with iter-defun though: it has several
bugs on which there doesn't seem to be any movement[1][2][3], it's
reported to be slow[4], and cl-loop's iter-by keyword is not documented
at all (that could be easily fixed, at least).  I wonder if streams[5]
is a better direction.  It already has stream-regexp, though it returns
match-data rather than a matched string.

(package-install 'stream)
(require 'stream)
(require 'seq)

(seq-do (lambda (m)
          (set-match-data m)
          (print (match-string 0)))
        (stream-regexp (current-buffer) (rx digit)))

(cl-loop with matches = (stream-regexp (current-buffer) (rx digit))
         for m = (stream-pop matches) while m
         do (set-match-data m) (print (match-string 0)))

[1]: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=26073
[2]: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=25965
[3]: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=26068
[4]: http://lists.gnu.org/archive/html/emacs-devel/2017-03/msg00264.html
[5]: https://elpa.gnu.org/packages/stream.html




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#26338; Package emacs. (Sat, 08 Apr 2017 15:43:01 GMT) Full text and rfc822 format available.

Message #86 received at 26338 <at> debbugs.gnu.org (full text, mbox):

From: Drew Adams <drew.adams <at> oracle.com>
To: Tino Calancha <tino.calancha <at> gmail.com>
Cc: 26338 <at> debbugs.gnu.org, Juri Linkov <juri <at> linkov.net>,
 Marcin Borkowski <mbork <at> mbork.pl>, npostavs <at> users.sourceforge.net
Subject: RE: bug#26338: 26.0.50; Collect all matches for REGEXP in current
 buffer
Date: Sat, 8 Apr 2017 08:42:32 -0700 (PDT)
> > Put all this stuff - and more - into an `eloop' macro.
> > Since it will be so much better and more Emacsy, with
> > specifics that are especially useful for Emacs, it is
> > what users will (eventually) use instead of `cl-loop'.
> >
> > Since it will do everything that `cl-loop' does (and
> > more), eventually only the rare user who needs, or for
> > some reason really wants, code that is CL or close to
> > it will use `cl-loop'.  Everyone else will use `eloop'.
> > No problem.
>
> I guess that might cause a lot of duplication of code.

Why? The implementation of `cl-loop' or `eloop' could
leverage the implementation of the other, or they could
both leverage the implementation of a helper macro or
function.
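
A rough sketch of what such layering could look like, under stated
assumptions: `my-eloop', `my-eloop--translate' and `my-re-matches' are
hypothetical names, the only extra clause handled is `being the matches
of REGEXP', and everything else is passed to `cl-loop' unchanged. It is
an illustration of the idea, not a proposed implementation, and it needs
lexical binding for generator.el.

```elisp
;;; -*- lexical-binding: t; -*-
(require 'cl-lib)
(require 'generator)

;; Hypothetical generator: yield each match of REGEXP after point.
(iter-defun my-re-matches (regexp)
  (while (re-search-forward regexp nil t)
    (iter-yield (match-string-no-properties 0))))

(eval-and-compile
  ;; Defined at expansion time too, because the macro below calls it.
  (defun my-eloop--translate (clauses)
    "Rewrite `being the matches of REGEXP' in CLAUSES to an `iter-by' clause."
    (let (out)
      (while clauses
        (if (and (eq (nth 0 clauses) 'being)
                 (eq (nth 1 clauses) 'the)
                 (eq (nth 2 clauses) 'matches)
                 (eq (nth 3 clauses) 'of))
            (progn
              (push 'iter-by out)
              (push `(my-re-matches ,(nth 4 clauses)) out)
              (setq clauses (nthcdr 5 clauses)))
          (push (pop clauses) out)))
      (nreverse out))))

;; Hypothetical non-`cl-' macro that reuses `cl-loop' for everything
;; except the one Emacs-specific clause it translates.
(defmacro my-eloop (&rest clauses)
  `(cl-loop ,@(my-eloop--translate clauses)))

(defun my-eloop-demo (text)
  "Return all runs of digits in TEXT using `my-eloop'."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (my-eloop for m being the matches of "[0-9]+"
              collect m)))

(my-eloop-demo "x=10, y=20")  ; => ("10" "20")
```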

> IMO expert CL lispers will be sadder about this emulation's
> lack of multiple return values than about the addition
> of some extensions.  They can choose not to use them if they
> don't like them.  Just one opinion too.

It's not about expert CL users.  It's about whether we want
to provide a CL emulation library or not, regardless of how
complete that emulation might be.

If we go the way we're headed, `cl-*' loses all meaning.
It's just a namespace that happens to also include some
constructs that emulate CL constructs, along with lots of
other stuff that does not.

AND along with stuff that kind of emulates but also kind of
does not, i.e., does something that confuses things by seeming,
in some cases, to emulate CL functionality but in other cases
(for the same construct) does something completely un-CL.

I do, completely, see the advantage of adding helpful
functionality, building on CL constructs.  I disagree that
that should be done to what are supposed to be CL-construct
emulations.

I do not understand the reluctance to do such enhancement in
separate, non-`cl-' functions and macros.  What would be
lost in doing that?  And wrt implementation, IMO that would
end up being simpler, not more complex.  The `cl-' emulation
code is already quite complex.  Separating out non-`cl-'
features from it could only make it simpler.

And any Emacs feature that builds on and enhances an existing
`cl-' feature need not continue to emulate all of the `cl-'
behavior - it has no such obligation.  It can still leverage
commonalities that would be factored out to serve as helpers
for both `cl-' and non-`cl-'.

What's the downside to what I'm suggesting?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#26338; Package emacs. (Sat, 22 Apr 2017 19:38:02 GMT) Full text and rfc822 format available.

Message #89 received at 26338 <at> debbugs.gnu.org (full text, mbox):

From: Philipp Stephani <p.stephani2 <at> gmail.com>
To: npostavs <at> users.sourceforge.net
Cc: 26338 <at> debbugs.gnu.org, Juri Linkov <juri <at> linkov.net>,
 Marcin Borkowski <mbork <at> mbork.pl>, Drew Adams <drew.adams <at> oracle.com>,
 Tino Calancha <tino.calancha <at> gmail.com>
Subject: Re: bug#26338: 26.0.50;
 Collect all matches for REGEXP in current buffer
Date: Sat, 22 Apr 2017 19:36:57 +0000
<npostavs <at> users.sourceforge.net> wrote on Sat, 8 Apr 2017 at 17:37:

> Philipp Stephani <p.stephani2 <at> gmail.com> writes:
>
> >>> - Coupling of unrelated entities is always an antipattern.
> >>> - For N iterables and M looping constructs, you need to implement
> >>> N*M integrations.
> >
> > Yes, I don't care about Common Lisp. The iter-by clause is less of a
> > problem than 'buffers' etc. because it's not a one-off that couples a
> > looping construct with some random semantics.
>
> It's sort of related to Drew's concerns in that Emacs deals with the N*M
> problem by setting M=1, which is why only cl-loop gets the pressure to add
> more enhancements.
>
> There are some practical problems with iter-defun though: it has several
> bugs on which there doesn't seem to be any movement[1][2][3],


That's unfortunate, because it's a really well-designed library. Stefan has
apparently resumed work on these issues (e.g. commit
89898e43c7ceef28bb3c2116b4d8a3ec96d9c8da), so let's hope they will be fixed
eventually.


> it's
> reported to be slow[4], and cl-loop's iter-by keyword is not documented
> at all (that could be easily fixed, at least).  I wonder if streams[5]
> is a better direction.
>

Maybe, though I'd be hesitant to add yet another library for the same thing
to Emacs, and I much prefer generator.el's interface.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#26338; Package emacs. (Sat, 22 Apr 2017 19:43:02 GMT) Full text and rfc822 format available.

Message #92 received at 26338 <at> debbugs.gnu.org (full text, mbox):

From: Philipp Stephani <p.stephani2 <at> gmail.com>
To: Tino Calancha <tino.calancha <at> gmail.com>
Cc: 26338 <at> debbugs.gnu.org, Juri Linkov <juri <at> linkov.net>,
 Marcin Borkowski <mbork <at> mbork.pl>, Drew Adams <drew.adams <at> oracle.com>,
 npostavs <at> users.sourceforge.net
Subject: Re: bug#26338: 26.0.50;
 Collect all matches for REGEXP in current buffer
Date: Sat, 22 Apr 2017 19:42:15 +0000
Tino Calancha <tino.calancha <at> gmail.com> wrote on Sat, 8 Apr 2017 at
17:20:

>
> >
> >       Your point is about performance.
> >
> >
> > No, I care mostly about clarity, simplicity, and good API design,
> > including separation of concerns.
> Expressivity and readability might be a kind of clarity.
>

Yes, but it seems we mean different things with these words. "Readability"
for me means (among other things) that each logical entity has a single
purpose. The single purpose of the (already way too complex) cl-loop macro
is iterating over things. It doesn't concern itself with the things it
should iterate over and where they come from.


> I totally agree about API design and separation of concerns.
> >
> >         I am driven by easy-to-write code.
> >       Maybe you can provide an example about how to write those things
> >       using the iter-by cl-loop clause.
> >
> >
> > Sure:
> > (require 'generator)
> > (iter-defun re-matches (regexp)
> >   (while (re-search-forward regexp nil t)
> >     (iter-yield (match-string 0))))
> > (iter-do (m (re-matches (rx digit)))
> >   (print m))
> > (cl-loop for m iter-by (re-matches (rx digit))
> >          do (print m))
> Thank you very much for your examples.  They are nice.  I am not
> as familiar as you with generators.  I must study them more.
>
> Between A) and B), the second looks at least as simple and clear as
> the first one, and probably more readable.
>

I disagree. (A) clearly separates the generation of the stream of objects
to iterate over from the iteration, (B) doesn't. (A) is extensible to any
kind of iteration as long as it can be expressed using generators (or
lists, vectors, ...), while for (B) you need a new keyword for every new
thing to iterate over.
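
The separation can be sketched as follows: one generator, written once,
feeds three different consumers without any of them knowing about
regexps. This assumes lexical binding is enabled; `my-demo-consumers' is
a hypothetical name used only for the illustration.

```elisp
;;; -*- lexical-binding: t; -*-
(require 'cl-lib)
(require 'generator)

;; The producer: knows about regexps, nothing about how it is consumed.
(iter-defun re-matches (regexp)
  (while (re-search-forward regexp nil t)
    (iter-yield (match-string-no-properties 0))))

(defun my-demo-consumers (text regexp)
  "Return matches of REGEXP in TEXT as gathered by three consumers."
  (with-temp-buffer
    (insert text)
    (let (via-iter-do via-iter-next)
      ;; Consumer 1: `iter-do'.
      (goto-char (point-min))
      (iter-do (m (re-matches regexp))
        (push m via-iter-do))
      ;; Consumer 2: manual `iter-next', which signals
      ;; `iter-end-of-sequence' when the generator is exhausted.
      (goto-char (point-min))
      (let ((it (re-matches regexp)))
        (condition-case nil
            (while t (push (iter-next it) via-iter-next))
          (iter-end-of-sequence nil)))
      ;; Consumer 3: `cl-loop' with `iter-by'.
      (goto-char (point-min))
      (list (nreverse via-iter-do)
            (nreverse via-iter-next)
            (cl-loop for m iter-by (re-matches regexp)
                     collect m)))))

(my-demo-consumers "a1b2" "[0-9]")
;; => (("1" "2") ("1" "2") ("1" "2"))
```

Adding a new source of values means writing one more generator, and
adding a new looping construct means teaching it about iterators once.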


>
> A)
> (iter-defun re-matches (regexp)
>    (while (re-search-forward regexp nil t)
>      (iter-yield (match-string-no-properties 1))))
>
> (cl-loop for m iter-by (re-matches "^(defun \\(\\S +\\)")
>           collect m)
>
> B)
> (cl-loop for m being the matches of "^(defun \\(\\S +\\)"
>           collect m)

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#26338; Package emacs. (Tue, 15 Sep 2020 15:42:01 GMT) Full text and rfc822 format available.

Message #95 received at 26338 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Tino Calancha <tino.calancha <at> gmail.com>
Cc: 26338 <at> debbugs.gnu.org, juri linkov <juri <at> linkov.net>
Subject: Re: bug#26338: 26.0.50; Collect all matches for REGEXP in current
 buffer
Date: Tue, 15 Sep 2020 17:41:16 +0200
Tino Calancha <tino.calancha <at> gmail.com> writes:

> we have `count-matches' in replace.el, which returns the
> number of matches of a regexp.  Why not have a standard
> function `collect-matches' as well?

A bunch of discussion then followed, and several patches, but it seemed
like many people just thought that this didn't seem generally useful
enough...  and I agree.  Gathering regexp matches from a buffer is a
three-liner, and chopping out parts of that is easy enough with seq-take
and friends, so adding separate functionality for this just doesn't seem
warranted, in my opinion.
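
For reference, the kind of short loop meant here could look like the
sketch below. `my-buffer-matches' is a hypothetical name, and the sketch
assumes REGEXP never matches the empty string (otherwise the loop would
not advance).

```elisp
(require 'seq)

;; Hypothetical helper: gather every match of REGEXP in the buffer.
(defun my-buffer-matches (regexp)
  "Return all matches of REGEXP in the current buffer, in buffer order."
  (let (matches)
    (save-excursion
      (goto-char (point-min))
      (while (re-search-forward regexp nil t)
        (push (match-string-no-properties 0) matches)))
    (nreverse matches)))

;; Keep only the first two matches with `seq-take'.
(with-temp-buffer
  (insert "one 1, two 22, three 333")
  (seq-take (my-buffer-matches "[0-9]+") 2))
;; => ("1" "22")
```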

So I'm closing this bug report.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Added tag(s) wontfix. Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Tue, 15 Sep 2020 15:42:02 GMT) Full text and rfc822 format available.

bug closed, send any further explanations to 26338 <at> debbugs.gnu.org and Tino Calancha <tino.calancha <at> gmail.com> Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Tue, 15 Sep 2020 15:42:02 GMT) Full text and rfc822 format available.

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Wed, 14 Oct 2020 11:24:06 GMT) Full text and rfc822 format available.

This bug report was last modified 3 years and 193 days ago.
