GNU bug report logs - #13369: 24.1; compile message parsing slow because of omake hack
To reply to this bug, email your comments to 13369 AT debbugs.gnu.org.
Report forwarded to bug-gnu-emacs <at> gnu.org: bug#13369; Package emacs. (Sun, 06 Jan 2013 20:05:02 GMT) Full text and rfc822 format available.
Acknowledgement sent to Mattias Engdegård <mattiase <at> bredband.net>: New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Sun, 06 Jan 2013 20:05:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
Parsing compilation messages in compilation-mode can be very slow for
large buffers (thousands of error lines); it can take many
seconds. Experiments show that it is the presence of omake in
compilation-error-regexp-alist that causes most of the trouble; removing
it mostly cures the problem.
The omake regexp does not look too troublesome, but there are some
omake-specific hacks in compile.el that are more worrying. In
particular, this code (in compilation-parse-errors) looks suspicious:
(cond
 ((not (memq 'omake compilation-error-regexp-alist)) nil)
 ((string-match "\\`\\([^^]\\|^\\( \\*\\|\\[\\)\\)" pat)
  nil) ;; Not anchored or anchored but already allows empty spaces.
 (t (setq pat (concat "^ *" (substring pat 1)))))
The slightly alarming concept of regexp-matching a regexp aside, this
one doesn't make sense - shouldn't the ^ (following the \|) be escaped?
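To make the effect of the unescaped ^ concrete, here is a minimal sketch that applies the guard regexp exactly as quoted above, and a variant with the ^ escaped, to a few sample patterns. The names my-buggy-guard, my-fixed-guard and my-omake-rewrite are made up for this illustration, and the function only mimics the quoted cond under the assumption that omake is in the list; it is not code from compile.el.

```elisp
;; Guard as quoted from compile.el: the ^ after \| acts as a line anchor.
(defconst my-buggy-guard "\\`\\([^^]\\|^\\( \\*\\|\\[\\)\\)")
;; Hypothetical repair: escape the ^ so that it matches a literal "^".
(defconst my-fixed-guard "\\`\\([^^]\\|\\^\\( \\*\\|\\[\\)\\)")

(defun my-omake-rewrite (guard pat)
  "Return PAT as the quoted cond would leave it, using GUARD as the test.
Assumes `omake' is in `compilation-error-regexp-alist'."
  (if (string-match guard pat)
      pat                                  ; left alone
    (concat "^ *" (substring pat 1))))     ; leading "^" becomes "^ *"

(my-omake-rewrite my-buggy-guard "^ *foo") ; => "^ * *foo" (rewritten anyway)
(my-omake-rewrite my-fixed-guard "^ *foo") ; => "^ *foo"   (left alone)
(my-omake-rewrite my-buggy-guard "^[ab]")  ; => "^ *[ab]"
(my-omake-rewrite my-fixed-guard "^[ab]")  ; => "^[ab]"
```

With the guard as quoted, every pattern that begins with ^ is rewritten, including those that already start with "^ *" or "^[". Stacking another " *" in front of such patterns may well be part of the slowdown, though that is a guess rather than something measured here.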
Apparently the code was at some time changed from
(when (and (= ?^ (aref pat 0)) ; anchored: starts with "^"
           ;; but does not allow an arbitrary number of leading spaces
           (not (and (= ? (aref pat 1)) (= ?* (aref pat 2)))))
which looks more correct, and conveys the intent somewhat better
(and may be more efficient than the regexp for all I know).
It's not clear to me how the present code could ever have worked.
At the very least the regexp in compilation-parse-errors should
be fixed.
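The intent of the older character-based test can be sketched as a standalone predicate. The function name is invented for this example, and the length checks are an addition so that very short patterns do not signal an error. Note that this older test has no exemption for patterns starting with "^[", which the regexp version appears to have been trying to add.

```elisp
(defun my-anchored-without-leading-spaces-p (pat)
  "Return non-nil if PAT starts with \"^\" but not with \"^ *\".
A sketch of the intent of the older code, with added length checks."
  (and (> (length pat) 0)
       (= ?^ (aref pat 0))                 ; anchored: starts with "^"
       (not (and (> (length pat) 2)
                 (= ?\s (aref pat 1))      ; followed by a space ...
                 (= ?* (aref pat 2))))))   ; ... and a star

(my-anchored-without-leading-spaces-p "^foo")   ; => t
(my-anchored-without-leading-spaces-p "^ *foo") ; => nil
(my-anchored-without-leading-spaces-p "foo")    ; => nil
```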
In GNU Emacs 24.1.1 (powerpc-apple-darwin, NS apple-appkit-1038.36)
of 2012-06-10 on bob.porkrind.org
Windowing system distributor `Apple', version 10.3.949
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#13369; Package emacs. (Mon, 07 Jan 2013 01:25:02 GMT)
Message #8 received at 13369 <at> debbugs.gnu.org (full text, mbox):
Mattias Engdegård wrote:
> Parsing compilation messages in compilation-mode can be very slow for
> large buffers (thousands of error lines); it can take many
> seconds. Experiments show that it is the presence of omake in
> compilation-error-regexp-alist that causes most of the trouble; removing
> it mostly cures the problem.
[...]
> one doesn't make sense - shouldn't the ^ (following the \|) be escaped?
Yes, I think so.
Does making that change remove the slowdown that you see?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#13369; Package emacs. (Mon, 07 Jan 2013 02:14:01 GMT)
Message #11 received at 13369 <at> debbugs.gnu.org (full text, mbox):
On 7 Jan 2013, at 02:24, Glenn Morris wrote:
> Does making that change remove the slowdown that you see?
Substantially, but not entirely. (I can try measuring it exactly if
you want it quantified, but it goes from being unusable to merely
annoyingly sluggish.)
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#13369; Package emacs. (Mon, 07 Jan 2013 08:15:03 GMT)
Message #14 received at 13369 <at> debbugs.gnu.org (full text, mbox):
Mattias Engdegård wrote:
> Substantially, but not entirely. (I can try measuring it exactly if
> you want it quantified, but it goes from being unusable to merely
> annoyingly sluggish.)
It might be useful to have some numbers, yes.
Could you compare the time with the \\^ change to the time with the
omake part of compilation-parse-errors commented out entirely?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#13369; Package emacs. (Mon, 07 Jan 2013 21:51:02 GMT)
Message #17 received at 13369 <at> debbugs.gnu.org (full text, mbox):
On 7 Jan 2013, at 09:14, Glenn Morris wrote:
> Could you compare the time with the \\^ change to the time with the
> omake part of compilation-parse-errors commented out entirely?
Here are the times, in seconds, for executing compilation-parse-errors
far down a large compile buffer (5000 lines, or about 440 KiB),
with and without omake present in compilation-error-regexp-alist:
      omake
present   absent
   30.3      3.2   Standard code
    6.5      3.2   repaired regexp (escaped ^)
    3.2      3.2   COND expression removed
In the last case, the entire COND surrounding the faulty regexp was
edited out.
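For anyone who wants to take comparable measurements, the following is a rough sketch of how raw regexp searching over a buffer could be timed with benchmark-run. It is not how the numbers above were obtained: it does not call compilation-parse-errors at all, and the function name my-time-regexps is hypothetical.

```elisp
(require 'benchmark)

(defun my-time-regexps (regexps)
  "Return the seconds needed to scan the current buffer once per regexp.
REGEXPS is a list of regexp strings; each is searched for from the top."
  (car (benchmark-run 1
         (dolist (re regexps)
           (save-excursion
             (goto-char (point-min))
             (while (and (not (eobp)) (re-search-forward re nil t))
               ;; Step over empty matches so the loop always advances.
               (when (= (match-beginning 0) (match-end 0))
                 (unless (eobp) (forward-char 1)))))))))
```

Calling it in a large compilation buffer, once with a short list of regexps and once with a long one, gives a first impression of how the cost grows with the number of entries.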
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#13369; Package emacs. (Tue, 08 Jan 2013 20:15:02 GMT)
Message #20 received at 13369 <at> debbugs.gnu.org (full text, mbox):
Mattias Engdegård wrote:
> Here are the times, in seconds, for executing compilation-parse-errors
> far down a large compile buffer (5000 lines, or about 440 KiB),
> with and without omake present in compilation-error-regexp-alist:
>
>       omake
> present   absent
>    30.3      3.2   Standard code
>     6.5      3.2   repaired regexp (escaped ^)
>     3.2      3.2   COND expression removed
Thanks. Could you also give the numbers for
compilation-error-regexp-alist containing only `gnu' (assuming that is
the one that is relevant for your test case)?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#13369; Package emacs. (Tue, 08 Jan 2013 21:41:02 GMT)
Message #23 received at 13369 <at> debbugs.gnu.org (full text, mbox):
On 8 Jan 2013, at 21:14, Glenn Morris wrote:
> Thanks. Could you also give the numbers for
> compilation-error-regexp-alist containing only `gnu' (assuming that is
> the one that is relevant for your test case)?
These times are with a slightly different compilation buffer:
 all   no omake   gnu only
32.7        3.4        0.3   standard code
 6.8        3.4        0.3   repaired regexp (escaped ^)
 3.4        3.4        0.3   COND expression removed
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#13369; Package emacs. (Tue, 08 Jan 2013 22:42:02 GMT)
Message #26 received at 13369 <at> debbugs.gnu.org (full text, mbox):
Mattias Engdegård wrote:
> On 8 Jan 2013, at 21:14, Glenn Morris wrote:
>
>> Thanks. Could you also give the numbers for
>> compilation-error-regexp-alist containing only `gnu' (assuming that is
>> the one that is relevant for your test case)?
>
> These times are with a slightly different compilation buffer:
>
>  all   no omake   gnu only
> 32.7        3.4        0.3   standard code
>  6.8        3.4        0.3   repaired regexp (escaped ^)
>  3.4        3.4        0.3   COND expression removed
OK, thank you. So having fixed the omake ^ issue, basically to me it
just seems to be the case that the more entries are in
compilation-error-regexp-alist, the slower things get.
Maybe we should encourage people to prune it to only the entries they
use, or maybe some less common elements should not be there by default.
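As a sketch of what such pruning could look like in an init file: compilation-error-regexp-alist and the symbols omake and gnu are the ones discussed in this thread; which entries to keep is only an example and depends on the tools in use.

```elisp
(require 'compile)

;; Either drop just the omake entry and keep the rest ...
(setq compilation-error-regexp-alist
      (remq 'omake compilation-error-regexp-alist))

;; ... or, alternatively, keep only the entries that are actually needed,
;; for example just `gnu'.  (This second form overrides the first.)
(setq compilation-error-regexp-alist '(gnu))
```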
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#13369; Package emacs. (Wed, 09 Jan 2013 01:48:01 GMT)
Message #29 received at 13369 <at> debbugs.gnu.org (full text, mbox):
>>> Thanks. Could you also give the numbers for
>>> compilation-error-regexp-alist containing only `gnu' (assuming that is
>>> the one that is relevant for your test case)?
>> These times are with a slightly different compilation buffer:
>>  all   no omake   gnu only
>> 32.7        3.4        0.3   standard code
>>  6.8        3.4        0.3   repaired regexp (escaped ^)
>>  3.4        3.4        0.3   COND expression removed
> OK, thank you. So having fixed the omake ^ issue, basically to me it
> just seems to be the case that the more entries are in
> compilation-error-regexp-alist, the slower things get.
> Maybe we should encourage people to prune it to only the entries they
> use, or maybe some less common elements should not be there by default.
Yes, every entry costs time, which is why I've been resisting adding
more entries and would rather push the problem upstream to convince the
tools' authors to stick to the standard GNU message format.
I think compile.el would benefit from a different regex engine where we
could do a lex-style union of all regexps into a single automaton.
Stefan
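For comparison, here is what a plain textual union of regexps looks like with the existing engine. This is only a sketch with an invented function name: Emacs still matches the result by backtracking, so it is not the single automaton suggested above, and the group numbers of the individual regexps would shift if they contained capturing groups.

```elisp
(defun my-regexp-union (regexps)
  "Return one regexp matching whatever any element of REGEXPS matches.
Each element is wrapped in a shy group so the alternation binds correctly."
  (mapconcat (lambda (re) (concat "\\(?:" re "\\)")) regexps "\\|"))

(my-regexp-union '("a+" "b"))                      ; => "\\(?:a+\\)\\|\\(?:b\\)"
(string-match (my-regexp-union '("a+" "b")) "xxb") ; => 2
```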
Merged 3700 9065 13369. Request was from Glenn Morris <rgm <at> gnu.org> to control <at> debbugs.gnu.org. (Wed, 09 Jan 2013 02:00:02 GMT)
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#13369; Package emacs. (Wed, 09 Jan 2013 11:13:02 GMT)
Message #34 received at 13369 <at> debbugs.gnu.org (full text, mbox):
On 9 Jan 2013, at 02:47, Stefan Monnier wrote:
>>>  all   no omake   gnu only
>>> 32.7        3.4        0.3   standard code
>>>  6.8        3.4        0.3   repaired regexp (escaped ^)
>>>  3.4        3.4        0.3   COND expression removed
>> OK, thank you. So having fixed the omake ^ issue, basically to me it
>> just seems to be the case that the more entries are in
>> compilation-error-regexp-alist, the slower things get.
>> Maybe we should encourage people to prune it to only the entries they
>> use, or maybe some less common elements should not be there by default.
>
> Yes, every entry costs time, which is why I've been resisting adding
> more entries and would rather push the problem upstream to convince the
> tools' authors to stick to the standard GNU message format.
Note however that the omake entry is still special: while its own regexp is
fast and simple, its mere presence in the list makes the remaining
parsing twice as slow (as seen from the measurements above).
I'm also still somewhat suspicious of how the hack mutilates other
regexps in ways that may change their meaning.
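A small example of such a change in meaning, using only the rewrite step from the quoted cond (the helper name is hypothetical): a pattern that used to require its text at the very start of a line also matches indented lines after the rewrite.

```elisp
(defun my-allow-leading-spaces (pat)
  "Return PAT with its leading \"^\" replaced by \"^ *\", as the hack does."
  (concat "^ *" (substring pat 1)))

(string-match "^foo" "  foo")                            ; => nil
(string-match (my-allow-leading-spaces "^foo") "  foo")  ; => 0
```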
In addition to fixing the regexp, I suggest omake be disabled by
default because of its impact and since it's somewhat of a special need.
> I think compile.el would benefit from a different regex engine where
> we could do a lex-style union of all regexps into a single automaton.
That would be nice, especially if the result could be a DFA.
I would also suggest switching to rx notation for the regexps.
(The ^ quoting bug is one that would never have occurred with rx,
and that is a very small regexp.)
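To illustrate the point, this is roughly how the intended guard might read in rx notation. It is a sketch of the intent as described in this report, not code from compile.el, and the constant name is made up. Because rx treats strings as literals, the "^" cannot silently turn into an anchor.

```elisp
(require 'rx)

(defconst my-guard-rx
  (rx string-start
      (or (not (any "^"))                ; not anchored at all
          (seq "^" (or " *" "["))))      ; anchored, then a literal " *" or "["
  "The intended guard in rx notation; strings are quoted automatically.")

(string-match my-guard-rx "foo")     ; => 0   (not anchored)
(string-match my-guard-rx "^ *foo")  ; => 0   (already allows leading spaces)
(string-match my-guard-rx "^foo")    ; => nil (candidate for rewriting)
```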
I actually wrote a simple regexp-to-rx translator, like rx in reverse,
just to be able to make sense of the ones in compile.el. I'd be happy
to share.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#13369; Package emacs. (Wed, 09 Jan 2013 13:44:02 GMT)
Message #37 received at 13369 <at> debbugs.gnu.org (full text, mbox):
Mattias Engdegård <mattiase <at> bredband.net> writes:
> I actually wrote a simple regexp-to-rx translator, like rx in reverse,
> just to be able to make sense of the ones in compile.el. I'd be happy
> to share.
Why not just share, instead of saying that you will be happy to do so?
I personally find rx easy to edit and use. I am also drifting away from
Emacs Lisp regexps to rx.
ps: Someone shared a perl(?)-to-Emacs regexp converter a couple of months ago and
wanted to include it as part of GNU ELPA.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#13369; Package emacs. (Wed, 09 Jan 2013 14:32:04 GMT)
Message #40 received at 13369 <at> debbugs.gnu.org (full text, mbox):
> Why not just share, instead of saying that you will be happy to do so?
Sorry, I just assumed that someone already wrote such a thing and that
it would be more polished than my amateurish attempt. Here it is.
[xr.el (application/octet-stream, attachment)]
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#13369; Package emacs. (Wed, 09 Jan 2013 15:18:02 GMT)
Message #43 received at 13369 <at> debbugs.gnu.org (full text, mbox):
Mattias Engdegård <mattiase <at> bredband.net> writes:
Thanks, that was quick. Maybe you want to indicate whether you want to
assign the copyright for that code to the FSF so that it could be improved upon
by others and distributed with Emacs or GNU ELPA.
>> Why not just share, instead of saying that you will be happy to do so?
>
> Sorry, I just assumed that someone already wrote such a thing
[OT, The following comment concerns re-builder]
In re-builder, there is a way to convert between various regexp styles.
It is bound to C-c TAB by default. It is not clear to me whether
re-builder supports rx-to-regexp conversions.
When I try converting the following regexp (C-h v org-heading-regexp) in
read format to rx format
"^\\(\\*+\\)\\(?: +\\(.*?\\)\\)?[ \t]*$"
I am seeing that the re-builder translates that to
,----
| '()
`----
with the following message
,----
| rx-form: Unknown rx form `nil'
`----
I am not sure whether that counts as a bug. It is possible that
re-builder doesn't support such a translation or that I am using the
interface wrongly.
While,
(xr "^\\(\\*+\\)\\(?: +\\(.*?\\)\\)?[ \t]*$")
gives me
(seq bol
     (group
      (one-or-more "*"))
     (opt
      (one-or-more " ")
      (group
       (minimal-match
        (zero-or-more nonl))))
     (zero-or-more
      (any " " "\t"))
     eol)
> and that it would be more polished than my amateurish attempt. Here it
> is.
I will let others review the changes.
Some libraries like org.el use complex regexps. For someone who wants
to dig deep into what the regexps amount to, without resorting to
pen and paper, one can imagine a utility which overlays or tooltips a
regexp-like string with its rx counterpart. It could be quite useful.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#13369; Package emacs. (Wed, 09 Jan 2013 20:23:02 GMT)
Message #46 received at 13369 <at> debbugs.gnu.org (full text, mbox):
> I actually wrote a simple regexp-to-rx translator, like rx in reverse,
> just to be able to make sense of the ones in compile.el. I'd be happy
> to share.
Reminds me of my old lex.el, so I've just added it to GNU ELPA.
Stefan
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#13369; Package emacs. (Thu, 10 Jan 2013 18:56:02 GMT)
Message #49 received at 13369 <at> debbugs.gnu.org (full text, mbox):
On 9 Jan 2013, at 16:17, Jambunathan K wrote:
> Thanks, that was quick. Maybe you want to indicate whether you want to
> assign the copyright for that code to the FSF so that it could be improved
> upon by others and distributed with Emacs or GNU ELPA.
Thank you, but I doubt I could get my employer to sign any copyright
papers, which to the best of my understanding is required for
distribution with Emacs. Please correct me if I'm wrong.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#13369; Package emacs. (Thu, 10 Jan 2013 19:35:02 GMT)
Message #52 received at 13369 <at> debbugs.gnu.org (full text, mbox):
>> Thanks, that was quick. Maybe you want to indicate whether you want to
>> assign the copyright for that code to the FSF so that it could be improved upon
>> by others and distributed with Emacs or GNU ELPA.
> Thank you, but I doubt I could get my employer to sign any copyright
> papers, which to the best of my understanding is required for
> distribution with Emacs. Please correct me if I'm wrong.
Indeed, it's needed, but only very few employers really refuse to sign
the relevant paperwork (which is a disclaimer that they have no
copyright interest in your work on Emacs).
Many employers will need some convincing (and reminding), but if I were
you I wouldn't give up just on the assumption that it can't be done.
Stefan
Merged 3700 9065 13369 29554. Request was from Noam Postavsky <npostavs <at> users.sourceforge.net> to control <at> debbugs.gnu.org. (Tue, 05 Dec 2017 00:30:04 GMT)
This bug report was last modified 6 years and 165 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.