GNU bug report logs - #61436
Emacs Freezing With Java Files


Package: emacs;

Reported by: Hank Greenburg <hank.greenburg <at> protonmail.com>

Date: Sat, 11 Feb 2023 20:47:02 UTC

Severity: normal

Found in versions 30.0.50, 29.1.50

Done: Alan Mackenzie <acm <at> muc.de>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 61436 in the body.
You can then email your comments to 61436 AT debbugs.gnu.org in the normal way.



Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#61436; Package emacs. (Sat, 11 Feb 2023 20:47:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Hank Greenburg <hank.greenburg <at> protonmail.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Sat, 11 Feb 2023 20:47:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Hank Greenburg <hank.greenburg <at> protonmail.com>
To: "bug-gnu-emacs <at> gnu.org" <bug-gnu-emacs <at> gnu.org>
Subject: Emacs Freezing With Java Files
Date: Sat, 11 Feb 2023 18:16:41 +0000
[Message part 1 (text/plain, inline)]
I have a few Java files that are about 500 lines of code and I can't move around in them much before Emacs freezes. I first thought it was java-lsp but it still happened after disabling and uninstalling it. I also uninstalled lsp-mode, but that didn't change anything.

I started doing CPU profiles of it and found that which-function-mode was taking up 67% of my CPU usage. While this was happening, all I was doing was holding the down arrow until it froze about 350 lines in. I didn't press any other keys.

So I disabled which-function-mode and moved around the buffer just fine! Though when trying to edit the file (just hit enter), it froze again. This time it seems like electric-indent-mode was taking up close to 50% of my CPU usage.

I disabled that and tried again and then it froze again with c-indent-line-or-region eating up 63% of my CPU when I use TAB.

While using debug-on-quit I get the below output. Any idea what's happening here and how it can be addressed? I do know though that if I launch emacs with the -Q argument, then there aren't any problems at all. I tried large files of other types and it only seems to happen with Java files. I attached screenshots of the CPU profiler outputs for each of the three scenarios. Attached is also the Java file as well as my init file.

I am using Emacs version 28.2 on EndeavourOS, but reproduced the results using both Emacs 29 and the master branch.

Debugger entered--Lisp error: (quit)
beginning-of-defun()
c-get-fallback-scan-pos(17794)
c-parse-state-get-strategy(17794 1)
c-parse-state-1()
c-parse-state()
c-guess-basic-syntax()
c-indent-line()
#f(compiled-function () (interactive nil) #<bytecode 0x180248dcca1cc57e>)()
c-indent-command(nil)
c-indent-line-or-region(nil nil)
funcall-interactively(c-indent-line-or-region nil nil)
call-interactively(c-indent-line-or-region nil nil)
command-execute(c-indent-line-or-region)
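
A minimal sketch of how a CPU profile and a backtrace like the ones above can be collected, using the built-in profiler and the standard debug-on-quit variable. The exact steps used in the original session are not stated, so the order shown in the comments is an assumption. The forms are meant to be evaluated one at a time (for example with M-:), not loaded as a single file.

```elisp
;; Enter the debugger when C-g interrupts a hung command, so the
;; backtrace shows where Emacs was spinning.
(setq debug-on-quit t)

;; Start sampling CPU usage before reproducing the freeze.
(profiler-start 'cpu)

;; ... reproduce the freeze here (for example, hold the down arrow in
;; the Java buffer), then press C-g to regain control ...

;; Display the call tree with per-function CPU percentages, then stop
;; the profiler.
(profiler-report)
(profiler-stop)
```

With debug-on-quit set, pressing C-g during the hang should pop up a backtrace buffer similar to the one quoted in this message.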
[Message part 2 (text/html, inline)]
[emacs-freezing-cpu-profile-indent-mode-off.png (image/png, attachment)]
[emacs-freezing-cpu-profile.png (image/png, attachment)]
[emacs-freezing-cpu-profile-no-function-mode.png (image/png, attachment)]
[P1.java (text/x-java, attachment)]
[init.org (text/org, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#61436; Package emacs. (Sun, 12 Feb 2023 05:39:02 GMT) Full text and rfc822 format available.

Message #8 received at 61436 <at> debbugs.gnu.org (full text, mbox):

From: Hank Greenburg <hank.greenburg <at> protonmail.com>
To: Hank Greenburg <hank.greenburg <at> protonmail.com>
Cc: 61436 <at> debbugs.gnu.org
Subject: Re: bug#61436: Emacs Freezing With Java Files
Date: Sun, 12 Feb 2023 00:24:49 +0000
[Message part 1 (text/plain, inline)]
To follow up on this, somehow disabling which-function-mode from my config has solved the problem. That doesn't make sense though because I didn't add that until after it had frozen in the first place. I know that for certain because I had to use IntelliJ and got the idea for using which-function-mode from IntelliJ.

To clarify, this froze every single time I tried to edit the file no matter what modes I had turned on or off. It happened at least 40 times in total and never once was I able to actually edit the file within Emacs. After disabling which-function-mode though I have restarted Emacs a few times, written in the file, scrolled through it all without a single hiccup.

So my problem is seemingly solved but there is an underlying bug somewhere that I'm not really able to isolate aside from disabling which-function-mode.

[Message part 2 (text/html, inline)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#61436; Package emacs. (Sun, 12 Feb 2023 06:01:02 GMT) Full text and rfc822 format available.

Message #11 received at 61436 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Hank Greenburg <hank.greenburg <at> protonmail.com>
Cc: 61436 <at> debbugs.gnu.org
Subject: Re: bug#61436: Emacs Freezing With Java Files
Date: Sun, 12 Feb 2023 08:00:04 +0200
> Date: Sat, 11 Feb 2023 18:16:41 +0000
> From:  Hank Greenburg via "Bug reports for GNU Emacs,
>  the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org>
> 
> I have a few Java files that are about 500 lines of code and I can't move around in them much before Emacs
> freezes. I first thought it was java-lsp but it still happened after disabling and uninstalling it. I also uninstalled
> lsp-mode as well, but that didn't change anything. 
> 
> I started doing CPU profiles of it and found that which-function-mode was taking up 67% of my CPU usage.
> While this is happening all I was doing was holding the down arrow until it froze about 350 lines in. Didn't
> press any other buttons. 
> 
> So I disabled which-function-mode and moved around the buffer just fine! Though when trying to edit the file
> (just hit enter), it froze again. This time it seems like electric-indent-mode was taking up close to 50% of my
> CPU usage. 
> 
> I disabled that and tried again and then it froze again with c-indent-line-or-region eating up 63% of my CPU
> when I use TAB.
> 
> While using debug-on-quit I get the below output. Any idea what's happening here and how it can be
> addressed? I do know though that if I launch emacs with the -Q argument, then there aren't any problems at
> all. I tried large files of other types and it only seems to happen with Java files. I attached screenshots of the
> CPU profiler outputs for each of the three scenarios. Attached is also the Java file as well as my init file. 
> 
> I am using emacs version 28.2 on EndeavorOS, but reproduced the results using both emacs 29 and the
> master branch. 

I confirm that the issue doesn't happen with "emacs -Q" and the file
you posted.  So some of your customizations trigger the problem, and
we must find out which one(s).  Can you try selectively enabling only
parts of your init files to find which customizations are the reason?
If your customizations are not too many, starting "emacs -Q" and then
evaluating the customizations one by one could be a good way of
finding the culprit(s).  Another possibility is bisecting:
successively divide the init file in two halves and see which half
causes the problem, then divide that half in two, etc. etc., until you
get to a small enough part you can post here (or figure out yourself).

Thanks.
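
The bisecting approach described above can be sketched as follows. This is only an illustration under stated assumptions: the helper name my/eval-first-n-forms is hypothetical, and it assumes the customizations are available as a plain Emacs Lisp file (the init file in this report is an Org file, so it would have to be tangled to Lisp first).

```elisp
;; Hypothetical helper, not part of Emacs: evaluate only the first N
;; top-level forms of FILE, so the init file can be narrowed down
;; step by step.
(defun my/eval-first-n-forms (file n)
  "Evaluate the first N top-level forms of FILE."
  (with-temp-buffer
    (insert-file-contents file)
    (goto-char (point-min))
    ;; `read' on a buffer reads one form starting at point and leaves
    ;; point after it, so each iteration evaluates the next form.
    ;; It signals `end-of-file' if N exceeds the number of forms.
    (dotimes (_ n)
      (eval (read (current-buffer)) t))))

;; Usage (the path and the count are placeholders): start "emacs -Q",
;; evaluate the helper, then try half of the forms.
;; (my/eval-first-n-forms "~/tmp/init.el" 10)
```

If the freeze reproduces with the first half of the forms, restart "emacs -Q" and repeat with a smaller count; otherwise the culprit is in the second half.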




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#61436; Package emacs. (Sun, 12 Feb 2023 06:31:01 GMT) Full text and rfc822 format available.

Message #14 received at 61436 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Hank Greenburg <hank.greenburg <at> protonmail.com>
Cc: 61436 <at> debbugs.gnu.org
Subject: Re: bug#61436: Emacs Freezing With Java Files
Date: Sun, 12 Feb 2023 08:30:16 +0200
> Cc: 61436 <at> debbugs.gnu.org
> Date: Sun, 12 Feb 2023 00:24:49 +0000
> From:  Hank Greenburg via "Bug reports for GNU Emacs,
>  the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org>
> 
> To follow up on this, somehow disabling which-function-mode from my config has solved the problem. That
> doesn't make sense though because I didn't add that until after it had frozen in the first place. I know that for
> certain because I had to use IntelliJ and got the idea for using which-function-mode from IntelliJ. 
> 
> To clarify, this froze every single time I tried to edit the file no matter what modes I had turned on or off. It
> happened at least 40 times in total and never once was I able to actually edit the file within Emacs. After
> disabling which-function-mode though I have restarted Emacs a few times, written in the file, scrolled through
> it all without a single hiccup. 
> 
> So my problem is seemingly solved but there is an underlying bug somewhere that I'm not really able to
> isolate aside from disabling which-function-mode. 

If I visit the file you posted, and then turn on which-function-mode
in it, I cannot reproduce the freezes.  Scrolling through the file
becomes slower, but nowhere near "freezing".

Do you see something different when you enable just
which-function-mode in "emacs -Q"?
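
One way to run that experiment, sketched under the assumption that a one-line init file is acceptable. The file name ~/tmp/wf-only.el and the location of P1.java are hypothetical.

```elisp
;; ~/tmp/wf-only.el (hypothetical file name)
;; Load with:  emacs -Q -l ~/tmp/wf-only.el ~/tmp/P1.java
;; Nothing but which-function-mode is enabled, so any slowdown or
;; freeze can be attributed to it alone.
(which-function-mode 1)
```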




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#61436; Package emacs. (Sun, 12 Feb 2023 16:54:02 GMT) Full text and rfc822 format available.

Message #17 received at 61436 <at> debbugs.gnu.org (full text, mbox):

From: Hank Greenburg <hank.greenburg <at> protonmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 61436 <at> debbugs.gnu.org
Subject: Re: bug#61436: Emacs Freezing With Java Files
Date: Sun, 12 Feb 2023 16:52:51 +0000
I was just able to reproduce this 5 times in a row by following the steps below.

Launch with emacs -Q

Run org-babel-execute-src-block on the Repos block and then on the Hyperbole block. The Hyperbole block won't work unless you do the Repos block first.

Go to the Java file. Hold CTRL + Up/Down Arrow keys to quickly move through the file. 

This works just fine until which-function-mode is activated. After that is activated I can scroll through maybe twice before it freezes. 

It almost always happens after hitting the end of the file, and the first jump up after that. Or at least it has happened 4/5 times by doing that. The other time I avoided the end of the file to see if that was the trigger and it still froze. 







Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#61436; Package emacs. (Sun, 12 Feb 2023 17:07:01 GMT) Full text and rfc822 format available.

Message #20 received at 61436 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Hank Greenburg <hank.greenburg <at> protonmail.com>
Cc: 61436 <at> debbugs.gnu.org
Subject: Re: bug#61436: Emacs Freezing With Java Files
Date: Sun, 12 Feb 2023 19:05:39 +0200
> Date: Sun, 12 Feb 2023 16:52:51 +0000
> From: Hank Greenburg <hank.greenburg <at> protonmail.com>
> Cc: 61436 <at> debbugs.gnu.org
> 
> Launch with emacs -Q
> 
> org-babel-execute-src-block for the Repos block and Hyperbole. The Hyperbole block won't work unless you do the Repos block first.

Sorry, I don't understand: what is "the Repos block", and what is the
"Hyperbole block"?  What do I need to do after starting "emacs -Q" to
reproduce what you describe above? which commands to type?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#61436; Package emacs. (Sun, 12 Feb 2023 17:12:02 GMT) Full text and rfc822 format available.

Message #23 received at 61436 <at> debbugs.gnu.org (full text, mbox):

From: Hank Greenburg <hank.greenburg <at> protonmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 61436 <at> debbugs.gnu.org
Subject: Re: bug#61436: Emacs Freezing With Java Files
Date: Sun, 12 Feb 2023 17:11:22 +0000
[Message part 1 (text/plain, inline)]
Sorry, by blocks I mean the org-mode code blocks that are under the headers "Repos (Elpa and Melpa)" and "Hyperbole" in the init.org file I sent with my first email. I've attached it again here.

Underneath each of those headers is an org-mode source block. Put point inside the block, run "org-babel-execute-src-block", and say yes when it asks whether you want to run the code. Do that for the Repos block and then the Hyperbole block, then follow the steps below.


Go to the Java file. Hold CTRL + Up/Down Arrow keys to quickly move through the file.

This works just fine until which-function-mode is activated. After that is activated I can scroll through maybe twice before it freezes.

It almost always happens after hitting the end of the file, and the first jump up after that. Or at least it has happened 4/5 times by doing that. The other time I avoided the end of the file to see if that was the trigger and it still froze.



[init.org (text/org, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#61436; Package emacs. (Mon, 09 Oct 2023 20:28:02 GMT) Full text and rfc822 format available.

Message #26 received at 61436 <at> debbugs.gnu.org (full text, mbox):

From: Jens Schmidt <jschmidt4gnu <at> vodafonemail.de>
To: Hank Greenburg <hank.greenburg <at> protonmail.com>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 61436 <at> debbugs.gnu.org
Subject: Re: bug#61436: Emacs Freezing With Java Files
Date: Mon, 09 Oct 2023 22:26:31 +0200
found 61436 29.1.50
found 61436 30.0.50
thanks

Hank Greenburg <hank.greenburg <at> protonmail.com> writes:
> This works just fine until which-function-mode is
> activated. After that is activated I can scroll through maybe
> twice before it freezes.

I can confirm both on emacs-29 and on master.  (Actually, I was hoping
for another case of my bug#60768 wreaking havoc, but this issue is
actually something different.  Even though `beginning-of-defun' is also
involved here through `which-function' calls ...)

A slightly easier to follow reproducer:

- Ensure package "hyperbole" is installed.  (Its only role in this issue
  seems to be a "background load generator", but I'm not 100% sure
  here.)

- Save Hank's Java source P1.java to ~/tmp.

- Save the following to ~/tmp/init.el:

------------------------- snip -------------------------
(require 'package)
(add-to-list 'package-archives
             '("melpa" . "https://melpa.org/packages/"))
(add-to-list 'package-archives
             '("gnu" . "https://elpa.gnu.org/packages/"))
(package-initialize)
(hyperbole-mode 1)
(which-function-mode 1)
------------------------- snip -------------------------

- Start emacs as

  ./src/emacs -Q -l ~/tmp/init.el ~/tmp/P1.java

  and wait for the compiler warnings to calm down.

- As Hank has recommended, forward/backward paragraph through P1.java.
  Rather soon it should hang.

And yes, it's which-function-mode which at some level inf-loops.  Maybe
the timer itself, maybe some upper layer.  If I find time I could try
digging into this.




bug Marked as found in versions 29.1.50. Request was from Jens Schmidt <jschmidt4gnu <at> vodafonemail.de> to control <at> debbugs.gnu.org. (Mon, 09 Oct 2023 20:28:02 GMT) Full text and rfc822 format available.

bug Marked as found in versions 30.0.50. Request was from Jens Schmidt <jschmidt4gnu <at> vodafonemail.de> to control <at> debbugs.gnu.org. (Mon, 09 Oct 2023 20:28:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#61436; Package emacs. (Tue, 10 Oct 2023 21:00:02 GMT) Full text and rfc822 format available.

Message #33 received at 61436 <at> debbugs.gnu.org (full text, mbox):

From: Jens Schmidt <jschmidt4gnu <at> vodafonemail.de>
To: Hank Greenburg <hank.greenburg <at> protonmail.com>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 61436 <at> debbugs.gnu.org,
 Mats Lidell <mats.lidell <at> lidells.se>, Bob Weiner <rsw <at> gnu.org>
Subject: Re: bug#61436: Emacs Freezing With Java Files
Date: Tue, 10 Oct 2023 22:58:54 +0200
Jens Schmidt <jschmidt4gnu <at> vodafonemail.de> writes:

>   Ensure package "hyperbole" is installed.  (Its only role in this
>   issue seems to be a "background load generator", but I'm not 100%
>   sure here.)

The second part above is not true!  Hyperbole mode is the culprit,
unless proven otherwise.  Here is a 100%-freezing reproducer (with
P1.java as from the initial post):

------------------------- init.el -------------------------
(require 'package)
(add-to-list 'package-archives
	     '("melpa" . "https://melpa.org/packages/"))
(add-to-list 'package-archives
	     '("gnu" . "https://elpa.gnu.org/packages/"))
(package-initialize)
;(setq hkey-init nil)
(hyperbole-mode 1)
------------------------- init.el -------------------------

Execute Emacs as:

  ./src/emacs -Q -l ~/tmp/init.el +181 ~/tmp/P1.java

That always freezes Emacs (29 and master) even before it has a chance to
display P1.java.  The freeze happens in function
`c-get-fallback-scan-pos', where the while loop inf-loops, BUT:

If you uncomment the line setting `hkey-init' to nil in init.el and
repeat: No freeze.

Not sure how to continue here - since this is a GNU ELPA package, it can
be further handled on Emacs debbugs, no?  Mats, Bob?

Disclaimer: I do not use Hyperbole as a regular user, I installed it
through `package-install' just for the purpose of this bug, as follows:

------------------------- snip -------------------------
Package hyperbole is installed.

     Status: Installed in ‘hyperbole-8.0.0/’. Delete
    Version: 8.0.0
     Commit: 4214716e06920a3e10db5811bd22a343ad6435d9
    Summary: GNU Hyperbole: The Everyday Hypertextual Information Manager
   Requires: emacs-27.0
    Website: https://www.gnu.org/software/hyperbole
   Keywords: comm convenience files frames hypermedia languages mail matching mouse multimedia outlines tools wp
 Maintainer: Bob Weiner <rsw <at> gnu.org>, Mats Lidell <matsl <at> gnu.org>
     Author: Bob Weiner
Other versions: 8.0.0 (gnu).
------------------------- snip -------------------------




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#61436; Package emacs. (Wed, 11 Oct 2023 07:30:02 GMT) Full text and rfc822 format available.

Message #36 received at 61436 <at> debbugs.gnu.org (full text, mbox):

From: Mats Lidell <mats.lidell <at> lidells.se>
To: Jens Schmidt <jschmidt4gnu <at> vodafonemail.de>
Cc: Hank Greenburg <hank.greenburg <at> protonmail.com>, 61436 <at> debbugs.gnu.org,
 Eli Zaretskii <eliz <at> gnu.org>, Bob Weiner <rsw <at> gnu.org>
Subject: Re: bug#61436: Emacs Freezing With Java Files
Date: Wed, 11 Oct 2023 09:28:59 +0200
Hi Jens,

Thanks for the report. Seems non-trivial.

> Jens Schmidt writes:
> Jens Schmidt <jschmidt4gnu <at> vodafonemail.de> writes:
>
> >   Ensure package "hyperbole" is installed.  (Its only role in this
> >   issue seems to be a "background load generator", but I'm not 100%
> >   sure here.)
>
> The second part above is not true!  Hyperbole mode is the culprit,
> unless proven otherwise.  Here is a 100%-freezing reproducer (with
> P1.java as from the initial post):
>
> ------------------------- init.el -------------------------
> (require 'package)
> (add-to-list 'package-archives
> 	     '("melpa" . "https://melpa.org/packages/"))
> (add-to-list 'package-archives
> 	     '("gnu" . "https://elpa.gnu.org/packages/"))
> (package-initialize)
> ;(setq hkey-init nil)
> (hyperbole-mode 1)
> ------------------------- init.el -------------------------
>
> Execute Emacs as:
>
>   ./src/emacs -Q -l ~/tmp/init.el +181 ~/tmp/P1.java
>
> That always freezes Emacs (29 and master) even before it has a chance to
> display P1.java.  The freeze happens in function
> `c-get-fallback-scan-pos', where the while loop inf-loops, BUT:
>
> If you uncomment the line setting `hkey-init' to nil in init.el and
> repeat: No freeze.

I have tried to recreate the freezing (with 29.1 running in Docker) and I don't
see the exact same behavior. For me P1.java is displayed and there is no
freeze, but when I try to go to the top or bottom of the file it freezes and I
have to hit C-g. After that it seems to work. This is just an observation.
There can be more things affected of course.

I have tried both with the hyperbole stable and devel packages and get the
same behavior. 

Note: I don't know what P1.java means here. I have picked a java file at
random that I had on my machine that is large. Is P1.java a specific file that
has been shared earlier?

> Not sure how to continue here - since this is a GNU ELPA package, it can
> be further handled on Emacs debbugs, no?  Mats, Bob?

Hyperbole has its own tracker.

https://debbugs.gnu.org/cgi/pkgreport.cgi?package=hyperbole

%% Mats




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#61436; Package emacs. (Wed, 11 Oct 2023 10:19:02 GMT) Full text and rfc822 format available.

Message #39 received at 61436 <at> debbugs.gnu.org (full text, mbox):

From: Robert Weiner <rsw <at> gnu.org>
To: Mats Lidell <mats.lidell <at> lidells.se>
Cc: Hank Greenburg <hank.greenburg <at> protonmail.com>,
 Jens Schmidt <jschmidt4gnu <at> vodafonemail.de>, Eli Zaretskii <eliz <at> gnu.org>,
 61436 <at> debbugs.gnu.org
Subject: Re: bug#61436: Emacs Freezing With Java Files
Date: Wed, 11 Oct 2023 06:17:50 -0400
[Message part 1 (text/plain, inline)]
Jens wrote:

> That always freezes Emacs (29 and master) even before it has a chance to
> display P1.java.  The freeze happens in function
> `c-get-fallback-scan-pos', where the while loop inf-loops, BUT:
>
> If you uncomment the line setting `hkey-init' to nil in init.el and
> repeat: No freeze.

As you note above, the infinite loop is coming from a Lisp function in
Emacs core, not from Hyperbole.  A Hyperbole setting may help you to see a
state reached in that function that you otherwise would not, but it is not
a Hyperbole bug; it is an unhandled state outside of Hyperbole.  When the
issue is found, we will have to work around it in Hyperbole since we
support Emacs versions back to 27.1 but that is another matter.  Thanks for
pointing it out.

-- rsw


[Message part 2 (text/html, inline)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#61436; Package emacs. (Wed, 11 Oct 2023 19:40:02 GMT) Full text and rfc822 format available.

Message #42 received at 61436 <at> debbugs.gnu.org (full text, mbox):

From: Jens Schmidt <jschmidt4gnu <at> vodafonemail.de>
To: Alan Mackenzie <acm <at> muc.de>
Cc: Robert Weiner <rsw <at> gnu.org>, Hank Greenburg <hank.greenburg <at> protonmail.com>,
 Mats Lidell <mats.lidell <at> lidells.se>, Eli Zaretskii <eliz <at> gnu.org>,
 rswgnu <at> gmail.com, 61436 <at> debbugs.gnu.org
Subject: Re: bug#61436: Emacs Freezing With Java Files
Date: Wed, 11 Oct 2023 21:38:26 +0200
Hi Alan,

could you please have a look as well?  This seems to be related to
cc-mode/java-mode.  New, complete reproducer at the very bottom of this
mail.

Thanks!

Hi Robert & Mats,

Robert Weiner <rsw <at> gnu.org> writes:

> Jens wrote:
>
>> That always freezes Emacs (29 and master) even before it has a chance to
>> display P1.java.  The freeze happens in function
>> `c-get-fallback-scan-pos', where the while loop inf-loops, BUT:
>>
>> If you uncomment the line setting `hkey-init' to nil in init.el and
>> repeat: No freeze.
>
> As you note above, the infinite loop is coming from a Lisp function in
> Emacs core, not from Hyperbole.  A Hyperbole setting may help you to
> see a state reached in that function that you otherwise would not, but
> it is not a Hyperbole bug; it is an unhandled state outside of
> Hyperbole.

Well, yes and no.  The next closest culprit seems to be this hook
addition from function `hui-select-initialize':

  ;; These hooks let you select C++ and Java methods and classes by
  ;; double-clicking on the first character of a definition or on its
  ;; opening or closing brace.  This is all necessary since some
  ;; programmers don't put their function braces in the first column.
  (var:add-and-run-hook
   'java-mode-hook
   (lambda ()
     (setq defun-prompt-regexp
	   "^[ \t]*\\(\\(\\(public\\|protected\\|private\\|const\\|abstract\\|synchronized\\|final\\|static\\|threadsafe\\|transient\\|native\\|volatile\\)\\s-+\\)*\\(\\(\\([[a-zA-Z][][_$.a-zA-Z0-9]*[][_$.a-zA-Z0-9]+\\|[[a-zA-Z]\\)\\s-*\\)\\s-+\\)\\)?\\(\\([[a-zA-Z][][_$.a-zA-Z0-9]*\\s-+\\)\\s-*\\)?\\([_a-zA-Z][^][ \t:;.,{}()=]*\\|\\([_$a-zA-Z][_$.a-zA-Z0-9]*\\)\\)\\s-*\\(([^);{}]*)\\)?\\([] \t]*\\)\\(\\s-*\\<throws\\>\\s-*\\(\\([_$a-zA-Z][_$.a-zA-Z0-9]*\\)[, \t\n\r\f]*\\)+\\)?\\s-*")))

I (very generally) think that Emacs does not have to grok every regexp
in every context, but I leave that concrete case for Alan and/or others
to decide.
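
Until that is decided, a possible user-side workaround can be sketched as below. It is untested and rests on two assumptions: that the freeze depends only on the value of `defun-prompt-regexp' in the Java buffer (as the reproducers in this thread suggest), and that this hook function runs after the one Hyperbole installs.

```elisp
;; Reset `defun-prompt-regexp' to nil (its default value) in Java
;; buffers.  The third argument t appends the function to the hook,
;; on the assumption that it then runs after Hyperbole's hook.
(add-hook 'java-mode-hook
          (lambda ()
            (setq-local defun-prompt-regexp nil))
          t)
```

The cost is that the Hyperbole feature the regexp supports (selecting Java methods and classes by double-clicking on a definition or its braces) may no longer work as described in the comment above.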

> On Wed, Oct 11, 2023 at 3:29 AM Mats Lidell <mats.lidell <at> lidells.se> wrote:
>
>  Thanks for the report.

Actually, not mine.  I'm just the messenger who did some root-cause
analysis.

>  Note: I don't know what P1.java means here. I have picked a java file
>  at random that I had on my machine that is large. Is P1.java a
>  specific file that has been shared earlier?

The OP has provided that, see below.

>  Hyperbole has its own tracker.
>
>  https://debbugs.gnu.org/cgi/pkgreport.cgi?package=hyperbole

Ok, thanks.  As soon as we know whose bug this is we could forward or
not.


Now for the next reproducer (Hyperbole no longer required, but still
present through its regexp :-):

- Save the following to ~/tmp/init.el:

------------------------- snip -------------------------
(add-hook
 'java-mode-hook
 (lambda ()
   (setq defun-prompt-regexp
	 "^[ \t]*\\(\\(\\(public\\|protected\\|private\\|const\\|abstract\\|synchronized\\|final\\|static\\|threadsafe\\|transient\\|native\\|volatile\\)\\s-+\\)*\\(\\(\\([[a-zA-Z][][_$.a-zA-Z0-9]*[][_$.a-zA-Z0-9]+\\|[[a-zA-Z]\\)\\s-*\\)\\s-+\\)\\)?\\(\\([[a-zA-Z][][_$.a-zA-Z0-9]*\\s-+\\)\\s-*\\)?\\([_a-zA-Z][^][ \t:;.,{}()=]*\\|\\([_$a-zA-Z][_$.a-zA-Z0-9]*\\)\\)\\s-*\\(([^);{}]*)\\)?\\([] \t]*\\)\\(\\s-*\\<throws\\>\\s-*\\(\\([_$a-zA-Z][_$.a-zA-Z0-9]*\\)[, \t\n\r\f]*\\)+\\)?\\s-*")))
------------------------- snip -------------------------

- Save attachment P1.java from the initial message

  https://yhetil.org/emacs-bugs/ZPOcahP9yPJ-kLcgipM3-l0jatXJSQWKPfObrlOkIB3dagud85x2DGXGhPpQn1QNqNksVmPIRc1intyW_Cx1Z9ou2vBZ5QLDpLTi_VFVYyg=@protonmail.com/

  to ~/tmp/P1.java.

- Start Emacs as

  ./src/emacs -Q -l ~/tmp/init.el +181 ~/tmp/P1.java

That always freezes Emacs (29 and master) even before it has a chance to
display P1.java.  The freeze happens in function
`c-get-fallback-scan-pos', where the while loop inf-loops.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#61436; Package emacs. (Wed, 11 Oct 2023 20:08:01 GMT) Full text and rfc822 format available.

Message #45 received at 61436 <at> debbugs.gnu.org (full text, mbox):

From: Robert Weiner <rswgnu <at> gmail.com>
To: Jens Schmidt <jschmidt4gnu <at> vodafonemail.de>
Cc: Robert Weiner <rsw <at> gnu.org>, Hank Greenburg <hank.greenburg <at> protonmail.com>,
 Mats Lidell <mats.lidell <at> lidells.se>, Alan Mackenzie <acm <at> muc.de>,
 Eli Zaretskii <eliz <at> gnu.org>, 61436 <at> debbugs.gnu.org
Subject: Re: bug#61436: Emacs Freezing With Java Files
Date: Wed, 11 Oct 2023 21:07:10 +0100
Those are some pretty old regexps in Hyperbole that we have not updated in many years.  Maybe we just need to cross-check them against what is currently in Emacs to resolve this.  I will have a look.  Thanks for tracing through this.

-- rsw





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#61436; Package emacs. (Wed, 11 Oct 2023 21:44:03 GMT) Full text and rfc822 format available.

Message #48 received at 61436 <at> debbugs.gnu.org (full text, mbox):

From: Mats Lidell <mats.lidell <at> lidells.se>
To: Jens Schmidt <jschmidt4gnu <at> vodafonemail.de>
Cc: Robert Weiner <rsw <at> gnu.org>, Hank Greenburg <hank.greenburg <at> protonmail.com>,
 Alan Mackenzie <acm <at> muc.de>, Eli Zaretskii <eliz <at> gnu.org>, rswgnu <at> gmail.com,
 61436 <at> debbugs.gnu.org
Subject: Re: bug#61436: Emacs Freezing With Java Files
Date: Wed, 11 Oct 2023 23:43:02 +0200
Hi,

> Jens Schmidt writes:
> Now for the next reproducer (Hyperbole no longer required, but still
> present through its regexp :-):

I can confirm that I get a harder freeze this time using the submitted P1.java
file.  I now get the reported behavior of not even displaying the file.

I've tried on occasion, over the years, to use Emacs for coding Java in
my day job. It has always ended in some freeze that feels quite like this. I
bet it is this. It would be a fun irony if it is Hyperbole's regexp that has
caused it, and it is about to be solved now when I'm not coding Java! 🤣

Anyway, I will take a look too. With the clear description and an easy
reproducer, it could be possible to solve! ;-)

%% Mats




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#61436; Package emacs. (Wed, 11 Oct 2023 22:04:01 GMT) Full text and rfc822 format available.

Message #51 received at 61436 <at> debbugs.gnu.org (full text, mbox):

From: Alan Mackenzie <acm <at> muc.de>
To: Jens Schmidt <jschmidt4gnu <at> vodafonemail.de>
Cc: Robert Weiner <rsw <at> gnu.org>, Hank Greenburg <hank.greenburg <at> protonmail.com>,
 Mats Lidell <mats.lidell <at> lidells.se>, Eli Zaretskii <eliz <at> gnu.org>,
 rswgnu <at> gmail.com, 61436 <at> debbugs.gnu.org
Subject: Re: bug#61436: Emacs Freezing With Java Files
Date: Wed, 11 Oct 2023 22:03:05 +0000
Hello, Jens.

On Wed, Oct 11, 2023 at 21:38:26 +0200, Jens Schmidt wrote:
> Hi Alan,

> could you please have a look as well?  This seems to be related to
> cc-mode/java-mode.  New, complete reproducer at the very bottom of this
> mail.

> Thanks!

> Hi Robert & Mats,

> Robert Weiner <rsw <at> gnu.org> writes:

> > Jens wrote:

> >> That always freezes Emacs (29 and master) even before it has a chance to
> >> display P1.java.  The freeze happens in function
> >> `c-get-fallback-scan-pos', where the while loop inf-loops, BUT:

> >> If you uncomment the line setting `hkey-init' to nil in init.el and
> >> repeat: No freeze.

> > As you note above, the infinite loop is coming from a Lisp function in
> > Emacs core, not from Hyperbole.  A Hyperbole setting may help you to
> > see a state reached in that function that you otherwise would not, but
> > it is not a Hyperbole bug; it is an unhandled state outside of
> > Hyperbole.

> Well, yes and no.  The next closest culprit seems to be this hook
> addition from function `hui-select-initialize':

>   ;; These hooks let you select C++ and Java methods and classes by
>   ;; double-clicking on the first character of a definition or on its
>   ;; opening or closing brace.  This is all necessary since some
>   ;; programmers don't put their function braces in the first column.
>   (var:add-and-run-hook
>    'java-mode-hook
>    (lambda ()
>      (setq defun-prompt-regexp
> 	   "^[ \t]*\\(\\(\\(public\\|protected\\|private\\|const\\|abstract\\|synchronized\\|final\\|static\\|threadsafe\\|transient\\|native\\|volatile\\)\\s-+\\)*\\(\\(\\([[a-zA-Z][][_$.a-zA-Z0-9]*[][_$.a-zA-Z0-9]+\\|[[a-zA-Z]\\)\\s-*\\)\\s-+\\)\\)?\\(\\([[a-zA-Z][][_$.a-zA-Z0-9]*\\s-+\\)\\s-*\\)?\\([_a-zA-Z][^][ \t:;.,{}()=]*\\|\\([_$a-zA-Z][_$.a-zA-Z0-9]*\\)\\)\\s-*\\(([^);{}]*)\\)?\\([] \t]*\\)\\(\\s-*\\<throws\\>\\s-*\\(\\([_$a-zA-Z][_$.a-zA-Z0-9]*\\)[, \t\n\r\f]*\\)+\\)?\\s-*")))

> I (very generally) think that Emacs does not have to grok every regexp
> in every context, but I leave that concrete case for Alan and/or others
> to decide.

I think that that regexp might be the source of the hang.  It is
ill-conditioned.  (I've elided all of the keywords between "public" and
"volatile" to try and make it more readable):

"^[ \t]*\\(\\(\\(public\\|volatile\\)\\s-+\\)*\\(\\(\\([[a-zA-Z][][_$.a-zA-Z0-9]*[][_$.a-zA-Z0-9]+\\|[[a-zA-Z]\\)\\s-*\\))\\s-+\\)\\)? \\(\\([[a-zA-Z][][_$.a-zA-Z0-9]*\\s-+\\)\\s-*\\)?\\([_a-zA-Z][^][ \t:;.,{}()^?=]*\\|\\([_$a-zA-Z][_$.a-zA-Z0-9]*\\)\\)\\s-*\\(([^);{}]*)\\)?\\([] \t]*\\)\\(\\s-*\\<throws\\>\\s-*\\(\\([_$a-zA-Z][_$.a-zA-Z0-9]*\\)[, \t\n\r\f]*\\)+\\)?\\s-*"

The first problem seems to be just after "volatile\\)\\s-+\\)*", where you've got:

[[a-zA-Z][][_$.a-zA-Z0-9]*[][_$.a-zA-Z0-9]+\\|[[a-zA-Z]
                         ^                ^

, in other words [...]*[...]+, where the ...s match largely the same
characters.  In the event of a failure to match, the Emacs regexp engine
will try every possible combination of these.  This isn't all that bad,
but in a string of N matching characters inside a global mismatch, it
will try out all N-1 ways of splitting up the string between those two
regexp fragments.  In fact, here, the [...]* is entirely redundant (as
well as being harmful) and could be removed.
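
A minimal sketch of that redundancy, assuming only the two fragments
quoted above; the names `my-id-redundant', `my-id-simplified' and
`my-match-length' are made up for illustration and exist in neither
Hyperbole nor Emacs.  It compares how far each fragment matches at the
start of a few sample strings:

```elisp
;; Hypothetical names; only the regexp fragments come from the thread.
(defconst my-id-redundant "[[a-zA-Z][][_$.a-zA-Z0-9]*[][_$.a-zA-Z0-9]+"
  "Fragment as it appears in the long regexp.")

(defconst my-id-simplified "[[a-zA-Z][][_$.a-zA-Z0-9]+"
  "The same fragment with the redundant starred class removed.")

(defun my-match-length (regexp string)
  "Return where REGEXP stops matching at the start of STRING, or nil."
  (and (string-match (concat "\\`\\(?:" regexp "\\)") string)
       (match-end 0)))

;; Both fragments should accept the same prefixes of each sample.
(dolist (s '("String[]" "a" "java.util.Map x"))
  (message "%S: %S %S" s
           (my-match-length my-id-redundant s)
           (my-match-length my-id-simplified s)))
```

The simplified form leaves the engine only one way to split a run of
identifier characters, which is the point of dropping the starred class.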

Another problem is right near the end of the regexp where there is:

\\(\\([_$a-zA-Z][_$.a-zA-Z0-9]*\\)[, \t\n\r\f]*\\)+

, or rewriting it in an easier to read fashion on several lines:

\\(                                            \\)+  
   \\(                         \\)[, \t\n\r\f]*
      [_$a-zA-Z][_$.a-zA-Z0-9]*
      1111111111111111111111111   2222222222222


.  Here, if you have a sequence of identifier characters, which are
inside a global mismatch, they can all be matched by 1.  However, they
can also be matched by 1, with any number (especially an infinite number)
of zero length strings matching 2.  In this case, the regexp engine will
try out all the ways of matching, an infinite number of them, before
giving up.  Here might be one of the places in the regexp which is
hanging.  It might well be that the second * in that expression should be
a +.
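
One possible way to remove that ambiguity, sketched here under the
assumption that the fragment is meant to match a comma-separated list
of exception names after "throws": require a comma before every
repetition, so that each iteration must consume something.  The names
`my-java-id', `my-throws-list' and `my-throws-list-p' are hypothetical,
and this is not the regexp that ended up in Hyperbole or Emacs.

```elisp
;; Hypothetical names; a sketch, not the actual fix.
(defconst my-java-id "[_$a-zA-Z][_$.a-zA-Z0-9]*"
  "One (possibly dotted) Java identifier, as in the original fragment.")

(defconst my-throws-list
  (concat my-java-id
          ;; Every further identifier must be preceded by a comma, so no
          ;; iteration of the starred group can match the empty string.
          "\\(?:[ \t\n\r\f]*,[ \t\n\r\f]*" my-java-id "\\)*")
  "An identifier followed by zero or more comma-separated identifiers.")

(defun my-throws-list-p (string)
  "Return t if STRING as a whole is a comma-separated identifier list."
  (and (string-match (concat "\\`\\(?:" my-throws-list "\\)\\'") string)
       t))

(my-throws-list-p "IOException, java.lang.Error")
```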

Earlier on in the regexp, I can see \\s-*\\)\\s-+, a possibly zero-length
sequence of space-syntax characters, followed by a non-empty sequence of
them.  I haven't analysed this in detail, but it smells like trouble.

It may well be that persevering with this regexp is a lost cause, and
you'd do better to construct a new regexp from scratch using more
structured methods (perhaps something similar to what's in cc-awk.el).
In fact the regexp looks horribly like one in the CC Mode manual which
was explicitly designated unsupported.  ;-(

Just as a matter of interest, I wrote a tool quite a few years ago to
diagnose and rewrite ill-conditioned regexps, but never got it to release
quality.  I tried out this tool on the regexp, but its output regexp hung
in Java Mode just as much as the original.  But this tool did help me
spot some of the solecisms which I analysed above.


> > On Wed, Oct 11, 2023 at 3:29 AM Mats Lidell <mats.lidell <at> lidells.se> wrote:
> >
> >  Thanks for the report.

> Actually, not mine.  I'm just the messenger who did some root-cause
> analysis.

> >  Note: I don't know what P1.java means here. I have picked a java file
> >  at random that I had on my machine that is large. Is P1.java a
> >  specific file that has been shared earlier?

> The OP has provided that, see below.

> >  Hyperbole has its own tracker.
> >
> >  https://debbugs.gnu.org/cgi/pkgreport.cgi?package=hyperbole

> Ok, thanks.  As soon as we know whose bug this is we could forward or
> not.


> Now for the next reproducer (Hyperbole no longer required, but still
> present through its regexp :-):

> - Save the following to ~/tmp/init.el:

> ------------------------- snip -------------------------
> (add-hook
>  'java-mode-hook
>  (lambda ()
>    (setq defun-prompt-regexp
> 	 "^[ \t]*\\(\\(\\(public\\|protected\\|private\\|const\\|abstract\\|synchronized\\|final\\|static\\|threadsafe\\|transient\\|native\\|volatile\\)\\s-+\\)*\\(\\(\\([[a-zA-Z][][_$.a-zA-Z0-9]*[][_$.a-zA-Z0-9]+\\|[[a-zA-Z]\\)\\s-*\\)\\s-+\\)\\)?\\(\\([[a-zA-Z][][_$.a-zA-Z0-9]*\\s-+\\)\\s-*\\)?\\([_a-zA-Z][^][ \t:;.,{}()=]*\\|\\([_$a-zA-Z][_$.a-zA-Z0-9]*\\)\\)\\s-*\\(([^);{}]*)\\)?\\([] \t]*\\)\\(\\s-*\\<throws\\>\\s-*\\(\\([_$a-zA-Z][_$.a-zA-Z0-9]*\\)[, \t\n\r\f]*\\)+\\)?\\s-*")))
> ------------------------- snip -------------------------

> - Save attachment P1.java from the initial message

>   https://yhetil.org/emacs-bugs/ZPOcahP9yPJ-kLcgipM3-l0jatXJSQWKPfObrlOkIB3dagud85x2DGXGhPpQn1QNqNksVmPIRc1intyW_Cx1Z9ou2vBZ5QLDpLTi_VFVYyg=@protonmail.com/

>   to ~/tmp/P1.java.

> - Start Emacs as

>   ./src/emacs -Q -l ~/tmp/init.el +181 ~/tmp/P1.java

> That always freezes Emacs (29 and master) even before it has a chance to
> display P1.java.  The freeze happens in function
> `c-get-fallback-scan-pos', where the while loop inf-loops.

c-get-fallback-scan-pos tries to move to the beginning of a function.
This probably involves defun-prompt-regexp when it is non-nil.  :-(

-- 
Alan Mackenzie (Nuremberg, Germany).




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#61436; Package emacs. (Thu, 12 Oct 2023 20:00:02 GMT) Full text and rfc822 format available.

Message #54 received at 61436 <at> debbugs.gnu.org (full text, mbox):

From: Jens Schmidt <jschmidt4gnu <at> vodafonemail.de>
To: Alan Mackenzie <acm <at> muc.de>
Cc: Robert Weiner <rsw <at> gnu.org>, Hank Greenburg <hank.greenburg <at> protonmail.com>,
 Mats Lidell <mats.lidell <at> lidells.se>, Eli Zaretskii <eliz <at> gnu.org>,
 rswgnu <at> gmail.com, 61436 <at> debbugs.gnu.org
Subject: Re: bug#61436: Emacs Freezing With Java Files
Date: Thu, 12 Oct 2023 21:58:06 +0200
Hi Alan,

Alan Mackenzie <acm <at> muc.de> writes:

> It may well be that persevering with this regexp is a lost cause, and
> you'd do better to construct a new regexp from scratch using more
> structured methods (perhaps something similar to what's in cc-awk.el).
> In fact the regexp looks horribly like one in the CC Mode manual which
> was explicitly designated unsupported.  ;-(

and thanks for your analysis.  I felt as well that this regexp looks,
well, funny, but what intrigues me is that it cannot (only) be the
complexity and ill-conditioned-ness of that regexp that lets Emacs
freeze.

>> That always freezes Emacs (29 and master) even before it has a chance to
>> display P1.java.  The freeze happens in function
>> `c-get-fallback-scan-pos', where the while loop inf-loops.
>
> c-get-fallback-scan-pos tries to move to the beginning of a function.
> This probably involves defun-prompt-regexp when it is non-nil.  :-(

Otherwise we would see hangs or exponential behavior (?) somewhere in
the Emacs regexp machinery, but they take place in that while loop.  So
I guess that there must be some other, additional quality that this
regexp fulfills.  Like: "matches the empty string" (which it does not,
as far as I can tell) or: "must only match before curlies" or whatnot.

Unfortunately, the doc string/info doc of `defun-prompt-regexp' provides
only exactly that latter criterion:

  That is to say, a defun begins on a line that starts with a match for
  this regular expression, followed by a character with open-parenthesis
  syntax.

I guess that only pruning that regexp until things start unfreezing
could give an answer here.  Or more tracing to see how point moves in
`c-get-fallback-scan-pos'.  But I need some tracing break here ...


... or so I thought, I just couldn't resist:

I expanded and instrumented that function from emacs-29 as follows,
(hopefully) not changing any of its logic:

------------------------- snip -------------------------
(defun c-get-fallback-scan-pos (here)
  ;; Return a start position for building `c-state-cache' from scratch.  This
  ;; will be at the top level, 2 defuns back.  Return nil if we don't find
  ;; these defun starts a reasonable way back.
  (message "c-get-fallback-scan-pos")
  (save-excursion
    (save-restriction
      (when (> here (* 10 c-state-cache-too-far))
	(narrow-to-region (- here (* 10 c-state-cache-too-far)) here))
      ;; Go back 2 bods, but ignore any bogus positions returned by
      ;; beginning-of-defun (i.e. open paren in column zero).
      (goto-char here)
      (let ((cnt 2))
	(message "beginning-of-defun-loop-00: %d %d" cnt (point))
	(while (not (or (bobp) (zerop cnt)))
	  (message "beginning-of-defun-loop-01: %d" (point))
	  (let (beginning-of-defun-function end-of-defun-function)
	    (beginning-of-defun))
	  (and defun-prompt-regexp
	       (looking-at defun-prompt-regexp)
	       (message "beginning-of-defun-loop-02: %d" (point))
	       (goto-char (match-end 0)))
	  (message "beginning-of-defun-loop-03: %d" (point))
	  (if (eq (char-after) ?\{)
	      (setq cnt (1- cnt)))))
      (and (not (bobp))
	   (point)))))
------------------------- snip -------------------------

That results in the message triple

------------------------- snip -------------------------
beginning-of-defun-loop-01: 5879
beginning-of-defun-loop-02: 5801
beginning-of-defun-loop-03: 5879
beginning-of-defun-loop-01: 5879
beginning-of-defun-loop-02: 5801
beginning-of-defun-loop-03: 5879
...
------------------------- snip -------------------------

inf-looping.  These points are (|: 5801, ^: 5879) here in P1.java:

------------------------- snip -------------------------
178    } catch (Exception e) {
179|      error("symTable.addDecl", "unexpected error with a single HashMap " + e)^;
180    }
181
------------------------- snip -------------------------

So the catch-block just before line 181 is recognized as a potential BOD
(previous trailing open curly?).  But then `defun-prompt-regexp' matches
the function call in the catch-block as a defun prompt (which it had
better not?), taking point back to 5879, where the next BOD search finds
the exact same BOD again.

So probably there are really two issues here:

1. The `defun-prompt-regexp' used by Hyperbole, which matches too
   broadly, and

2. function `c-get-fallback-scan-pos', which could try harder to avoid
   inf-loops when such things happen.
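
As an illustration of what "try harder" in item 2 could mean, here is a
generic progress guard.  It is only a sketch: `my-scan-back' is a
made-up helper and not part of CC Mode.  The idea is to give up as soon
as one backward step fails to move point strictly backwards:

```elisp
;; Hypothetical helper, not CC Mode code.
(defun my-scan-back (step-fn count)
  "Call STEP-FN up to COUNT times, expecting point to move backwards.
Return the final position, or nil if some step made no backward
progress (which would otherwise repeat forever)."
  (save-excursion
    (catch 'stuck
      (dotimes (_ count)
        (let ((before (point)))
          (funcall step-fn)
          ;; Point did not move backwards: bail out instead of looping.
          (unless (< (point) before)
            (throw 'stuck nil))))
      (point))))

;; A step that jumps back and then forward again, like 5801 -> 5879
;; in the trace above, is detected on its first iteration.
(with-temp-buffer
  (insert (make-string 100 ?x))
  (goto-char 80)
  (my-scan-back (lambda () (goto-char 20) (goto-char 80)) 2))
```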

But that's where I *really* stop here :-)




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#61436; Package emacs. (Fri, 13 Oct 2023 12:43:01 GMT) Full text and rfc822 format available.

Message #57 received at 61436 <at> debbugs.gnu.org (full text, mbox):

From: Alan Mackenzie <acm <at> muc.de>
To: Jens Schmidt <jschmidt4gnu <at> vodafonemail.de>
Cc: Robert Weiner <rsw <at> gnu.org>, Hank Greenburg <hank.greenburg <at> protonmail.com>,
 Mats Lidell <mats.lidell <at> lidells.se>, Eli Zaretskii <eliz <at> gnu.org>,
 rswgnu <at> gmail.com, 61436 <at> debbugs.gnu.org
Subject: Re: bug#61436: Emacs Freezing With Java Files
Date: Fri, 13 Oct 2023 12:41:57 +0000
[Message part 1 (text/plain, inline)]
Hello, Jens.

On Thu, Oct 12, 2023 at 21:58:06 +0200, Jens Schmidt wrote:
> Hi Alan,

> Alan Mackenzie <acm <at> muc.de> writes:

[ .... ]

> >> That always freezes Emacs (29 and master) even before it has a chance to
> >> display P1.java.  The freeze happens in function
> >> `c-get-fallback-scan-pos', where the while loop inf-loops.

Yes.

> > c-get-fallback-scan-pos tries to move to the beginning of a function.
> > This probably involves defun-prompt-regexp when it is non-nil.  :-(

> Otherwise we would see hangs or exponential behavior (?) somewhere in
> the Emacs regexp machinery, but they take place in that while loop.  So
> I guess that there must be some other, additional quality that this
> regexp fulfills.  Like: "matches the empty string" (which it does not,
> as far as I can tell) or: "must only match before curlies" or whatnot.

> Unfortunately, the doc string/info doc of `defun-prompt-regexp' provides
> only exactly that latter criterion:

>   That is to say, a defun begins on a line that starts with a match for
>   this regular expression, followed by a character with open-parenthesis
>   syntax.

> I guess that only pruning that regexp until things start unfreezing
> could give an answer here.  Or more tracing to see how point moves in
> `c-get-fallback-scan-pos'.  But I need some tracing break here ...


> ... or so I thought, I just couldn't resist:

> I expanded and instrumented that function from emacs-29 as follows,
> (hopefully) not changing any of its logic:

> ------------------------- snip -------------------------
> (defun c-get-fallback-scan-pos (here)
>   ;; Return a start position for building `c-state-cache' from scratch.  This
>   ;; will be at the top level, 2 defuns back.  Return nil if we don't find
>   ;; these defun starts a reasonable way back.
>   (message "c-get-fallback-scan-pos")
>   (save-excursion
>     (save-restriction
>       (when (> here (* 10 c-state-cache-too-far))
> 	(narrow-to-region (- here (* 10 c-state-cache-too-far)) here))
>       ;; Go back 2 bods, but ignore any bogus positions returned by
>       ;; beginning-of-defun (i.e. open paren in column zero).
>       (goto-char here)
>       (let ((cnt 2))
> 	(message "beginning-of-defun-loop-00: %d %d" cnt (point))
> 	(while (not (or (bobp) (zerop cnt)))
> 	  (message "beginning-of-defun-loop-01: %d" (point))
> 	  (let (beginning-of-defun-function end-of-defun-function)
> 	    (beginning-of-defun))
> 	  (and defun-prompt-regexp
> 	       (looking-at defun-prompt-regexp)
> 	       (message "beginning-of-defun-loop-02: %d" (point))
> 	       (goto-char (match-end 0)))
> 	  (message "beginning-of-defun-loop-03: %d" (point))
> 	  (if (eq (char-after) ?\{)
> 	      (setq cnt (1- cnt)))))
>       (and (not (bobp))
> 	   (point)))))
> ------------------------- snip -------------------------

> That results in the message triple

> ------------------------- snip -------------------------
> beginning-of-defun-loop-01: 5879
> beginning-of-defun-loop-02: 5801
> beginning-of-defun-loop-03: 5879
> beginning-of-defun-loop-01: 5879
> beginning-of-defun-loop-02: 5801
> beginning-of-defun-loop-03: 5879
> ...
> ------------------------- snip -------------------------

> inf-looping.  These points are (|: 5801, ^: 5879) here in P1.java:

> ------------------------- snip -------------------------
> 178    } catch (Exception e) {
> 179|      error("symTable.addDecl", "unexpected error with a single HashMap " + e)^;
> 180    }
> 181
> ------------------------- snip -------------------------

> So the catch-block just before line 181 is recognized as a potential BOD
> (previous trailing open curly?).  But then `defun-prompt-regexp' matches
> the function call in the catch-block as defun prompt regexp (which it
> better should not?), taking point back to where, on next BOD search, the
> exact previous BOD is found again.

> So probably there are really two issues here:

> 1. The `defun-prompt-regexp' used by Hyperbole, which matches too
>    broadly, and

> 2. function `c-get-fallback-scan-pos', which could try harder to avoid
>    inf-loops when such things happen.

> But that's where I *really* stop here :-)

You've diagnosed the bug completely.  Thanks!  The hang was caused
entirely by the loop in c-get-fallback-scan-pos, not the deficiencies in
that long regexp.

defun-prompt-regexp, when appended with a \\s( (as is done in
beginning-of-defun-raw) matches the "      error(" on L179 of P1.java.
The bare defun-prompt-regexp (as used in CC Mode) matches the entire
line except the terminating ;.  This regexp could do with some
amendment, but it is not the main cause of the bug.
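
To make that concrete, here is a small sketch using a much reduced
stand-in for the long regexp (an optional indentation, a name, and an
optional parenthesised argument list).  `my-prompt-re' and
`my-prompt-match-end' are made-up names, and the stand-in is an
assumption for illustration, not the Hyperbole regexp itself:

```elisp
;; Hypothetical, reduced stand-in for the long defun-prompt-regexp.
(defconst my-prompt-re
  "^[ \t]*[_a-zA-Z][_$.a-zA-Z0-9]*\\s-*\\(([^);{}]*)\\)?"
  "Indentation, a name, and an optional parenthesised argument list.")

(defun my-prompt-match-end (line)
  "Return the buffer position where `my-prompt-re' stops matching LINE.
LINE is inserted into a temporary buffer; return nil if no match."
  (with-temp-buffer
    (insert line)
    (goto-char (point-min))
    (and (looking-at my-prompt-re)
         (match-end 0))))

;; A plain function call is matched right up to the terminating
;; semicolon, so the character after the match is `;', not `{'.
(my-prompt-match-end
 "      error(\"symTable.addDecl\", \"unexpected error \" + e);")
```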

To solve the bug, I'm amending the macro c-beginning-of-defun-1 so that
it only stops at a defun-prompt-regexp position when it also found a {.
Otherwise it will keep looping until it finds a better position or BOB.

Would all concerned please apply the attached patch to the Emacs master
branch, directory lisp/progmodes.  Then please byte compile CC Mode in
full (a macro has been changed), and try the result on your real Java
code.  (If anybody wants any help applying the patch or byte compiling,
feel free to send me private mail.)  Then please confirm that the bug is
indeed fixed.  Thanks!

-- 
Alan Mackenzie (Nuremberg, Germany).

[diff.20231013.diff (text/plain, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#61436; Package emacs. (Fri, 13 Oct 2023 18:04:02 GMT) Full text and rfc822 format available.

Message #60 received at 61436 <at> debbugs.gnu.org (full text, mbox):

From: Mats Lidell <mats.lidell <at> lidells.se>
To: Alan Mackenzie <acm <at> muc.de>
Cc: Robert Weiner <rsw <at> gnu.org>, Hank Greenburg <hank.greenburg <at> protonmail.com>,
 Eli Zaretskii <eliz <at> gnu.org>, rswgnu <at> gmail.com,
 Jens Schmidt <jschmidt4gnu <at> vodafonemail.de>, 61436 <at> debbugs.gnu.org
Subject: Re: bug#61436: Emacs Freezing With Java Files
Date: Fri, 13 Oct 2023 20:02:40 +0200
Hi Alan,

> Alan Mackenzie writes:
> Would all concerned please apply the attached patch to the Emacs master
> branch, directory lisp/progmodes.  Then please byte compile CC Mode in
> full (a macro has been changed), and try the result on your real Java
> code.  (If anybody wants any help applying the patch or byte compiling,
> feel free to send me private mail.)  Then please confirm that the bug is
> indeed fixed.  Thanks!

I have verified that there is no freeze with the patch applied for the test
case with the P1.java file. Latest master as of today with native byte
compilation.  (Commit = baf778c7caa) Without the patch the freeze is there. So
it is looking good. :-)

Yours
-- 
%% Mats




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#61436; Package emacs. (Fri, 13 Oct 2023 20:44:01 GMT) Full text and rfc822 format available.

Message #63 received at 61436 <at> debbugs.gnu.org (full text, mbox):

From: Jens Schmidt <jschmidt4gnu <at> vodafonemail.de>
To: Alan Mackenzie <acm <at> muc.de>
Cc: Robert Weiner <rsw <at> gnu.org>, Hank Greenburg <hank.greenburg <at> protonmail.com>,
 Mats Lidell <mats.lidell <at> lidells.se>, Eli Zaretskii <eliz <at> gnu.org>,
 rswgnu <at> gmail.com, 61436 <at> debbugs.gnu.org
Subject: Re: bug#61436: Emacs Freezing With Java Files
Date: Fri, 13 Oct 2023 22:42:04 +0200
Hi Alan,

Alan Mackenzie <acm <at> muc.de> writes:

> To solve the bug, I'm amending the macro c-beginning-of-defun-1 so that
> it only stops at a defun-prompt-regexp position when it also found a {.
> Otherwise it will keep looping until it finds a better position or BOB.

Thanks.

> Then please confirm that the bug is
> indeed fixed.

For the fun of it I tried Hank's initial testcase as well, which is a
bit less straightforward to set up.  The freezes are indeed gone with
your patch.  But I noticed that which-function-mode, when moving rapidly
through the file, cannot always determine the current function name,
and then displays "[n/a]" in the mode line.

And indeed, when executing the simplified test case

  ./src/emacs -Q -l ~/tmp/init.el +181 ~/tmp/P1.java

and then immediately hitting C-M-a, point jumps to the beginning of the
preceding catch clause (point=5779 of 18142) instead of BOD.

This behavior is again tied to the `defun-prompt-regexp' used by
Hyperbole - without that regexp C-M-a jumps to the real BOD.




Reply sent to Alan Mackenzie <acm <at> muc.de>:
You have taken responsibility. (Sat, 14 Oct 2023 19:43:01 GMT) Full text and rfc822 format available.

Notification sent to Hank Greenburg <hank.greenburg <at> protonmail.com>:
bug acknowledged by developer. (Sat, 14 Oct 2023 19:43:02 GMT) Full text and rfc822 format available.

Message #68 received at 61436-done <at> debbugs.gnu.org (full text, mbox):

From: Alan Mackenzie <acm <at> muc.de>
To: Jens Schmidt <jschmidt4gnu <at> vodafonemail.de>,
 Mats Lidell <mats.lidell <at> lidells.se>
Cc: Robert Weiner <rsw <at> gnu.org>, Hank Greenburg <hank.greenburg <at> protonmail.com>,
 61436-done <at> debbugs.gnu.org, acm <at> muc.de, Eli Zaretskii <eliz <at> gnu.org>,
 rswgnu <at> gmail.com
Subject: Re: bug#61436: Emacs Freezing With Java Files
Date: Sat, 14 Oct 2023 19:41:41 +0000
Hello, Jens and Mats.

On Fri, Oct 13, 2023 at 22:42:04 +0200, Jens Schmidt wrote:
> Hi Alan,

> Alan Mackenzie <acm <at> muc.de> writes:

> > To solve the bug, I'm amending the macro c-beginning-of-defun-1 so that
> > it only stops at a defun-prompt-regexp position when it also found a {.
> > Otherwise it will keep looping until it finds a better position or BOB.

> Thanks.

> > Then please confirm that the bug is indeed fixed.

> For the fun of it I tried Hank's initial testcase as well, which is a
> bit less straightforward to set up.  The freezes are indeed gone with
> your patch.

Thanks for the testing.  Seeing as how both of you confirm the original
bug is fixed with the patch, I'm closing it with this post.

> But I noticed that which-function-mode, when rapidly moving through
> the file, cannot always determine the current function name, then
> displaying "[n/a]" in the mode line.

> And indeed, when executing the simplified test case

>   ./src/emacs -Q -l ~/tmp/init.el +181 ~/tmp/P1.java

> and then immediately hitting C-M-a, point jumps to the beginning of the
> preceding catch clause (point=5779 of 18142) instead of BOD.

I can't reproduce this, even when setting defun-prompt-regexp to the
original large regexp from hui-select.el.

> This behavior is again tied to the `defun-prompt-regexp' used by
> Hyperbole - without that regexp C-M-a jumps to the real BOD.

Mats, I'm willing to work on that regular expression, and also the one
for C++.  As I mentioned earlier, I've got some tools which work on
regexps, in particular pp-regexp, which prints a regexp more readably on
several lines, and fix-re, which rewrites a regexp when it is
ill-conditioned in certain ways.

I foresee reverse engineering the regexps into more readable forms built
up by concatenating basic blocks.  For example for the java regexp I
would define

    (defconst id "[a-zA-Z][][_$.a-zA-Z0-9]*")

, and use this id in a largish concat form.

I'm also willing to share pp-regexp and fix-re with you(r team), if that
might help, on the understanding that neither is of release quality.

-- 
Alan Mackenzie (Nuremberg, Germany).




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#61436; Package emacs. (Sun, 15 Oct 2023 10:22:01 GMT) Full text and rfc822 format available.

Message #71 received at 61436-done <at> debbugs.gnu.org (full text, mbox):

From: Robert Weiner <rswgnu <at> gmail.com>
To: Alan Mackenzie <acm <at> muc.de>
Cc: Robert Weiner <rsw <at> gnu.org>, Hank Greenburg <hank.greenburg <at> protonmail.com>,
 Mats Lidell <mats.lidell <at> lidells.se>, 61436-done <at> debbugs.gnu.org,
 Eli Zaretskii <eliz <at> gnu.org>, Jens Schmidt <jschmidt4gnu <at> vodafonemail.de>
Subject: Re: bug#61436: Emacs Freezing With Java Files
Date: Sun, 15 Oct 2023 06:20:15 -0400
Hi Alan:

Would be great if you can improve those two regexps.  The only
requirement is that they be able to recognize all defuns in the two
languages as best a regexp can, so that the whole defun can be
selected based on finding the opening brace regardless of coding
style.

Thanks.

-- Bob





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#61436; Package emacs. (Mon, 16 Oct 2023 14:06:01 GMT) Full text and rfc822 format available.

Message #74 received at 61436-done <at> debbugs.gnu.org (full text, mbox):

From: Alan Mackenzie <acm <at> muc.de>
To: Robert Weiner <rswgnu <at> gmail.com>
Cc: Robert Weiner <rsw <at> gnu.org>, Hank Greenburg <hank.greenburg <at> protonmail.com>,
 Mats Lidell <mats.lidell <at> lidells.se>, 61436-done <at> debbugs.gnu.org, acm <at> muc.de,
 Eli Zaretskii <eliz <at> gnu.org>, Jens Schmidt <jschmidt4gnu <at> vodafonemail.de>
Subject: Re: bug#61436: Emacs Freezing With Java Files
Date: Mon, 16 Oct 2023 14:05:20 +0000
Hello, Bob.

On Sun, Oct 15, 2023 at 06:20:15 -0400, Robert Weiner wrote:
> Hi Alan:

> Would be great if you can improve those two regexps.  The only
> requirement is that they be able to recognize all defuns in the two
> languages as best a regexp can, so that the whole defun can be
> selected based on finding the opening brace regardless of coding
> style.

So far, I've only looked at the Java regexp.  It had some serious
deficiencies, notably:
(i) It used "\\s-" (space syntax) a lot.  This fails to match \n, which in
Java mode has comment-end syntax.
(ii) The bit for the parenthesis expression was in an optional part of
the regexp with the result that it would match "almost anything" rather
than a defun start.
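
Point (i) can be checked in isolation.  The following is a minimal
sketch, not taken from the thread: the function name
demo-newline-matchers is hypothetical, and rather than loading Java mode
it builds a throwaway syntax table in which newline has comment-end
syntax, which is what the message above says Java mode does.

```elisp
(defun demo-newline-matchers ()
  "Return (SYNTAX-CLASS-MATCH . CHAR-SET-MATCH) for a newline character.
The newline is given comment-end syntax in a temporary syntax table
\(an assumption standing in for Java mode's real table)."
  (let ((table (make-syntax-table)))
    ;; In the standard table newline is whitespace; make it comment-end.
    (modify-syntax-entry ?\n ">" table)
    (with-temp-buffer
      (with-syntax-table table
        (insert "\n")
        (goto-char (point-min))
        (cons (and (looking-at "\\s-") t)             ; syntax-class test
              (and (looking-at "[ \t\n\r\f]") t)))))) ; explicit char set

;; (demo-newline-matchers)  ; => (nil . t)
```

The explicit character set does not depend on the buffer's syntax
table, which is presumably why the rewritten regexp spells out
"[ \t\n\r\f]" instead of using "\\s-".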

In the following regexp these faults are fixed.  Additionally, I've
included more modifiers (things like private, volatile) which Java seems
to have gathered over the years.  I've also attempted to match generic
functions.  I don't know how well this will work out.

Here's the regexp.  Would people please try it out and let me know how
well it works.  

(defconst java-defun-prompt-regexp
  (let ((space* "[ \t\n\r\f]*")
        (space+ "[ \t\n\r\f]+")
        (modifier*
         (concat "\\(?:"
                 (regexp-opt '("abstract" "const" "default" "final" "native"
                               "private" "protected" "public" "static"
                               "strictfp" "synchronized" "threadsafe"
                               "transient" "volatile")
                             'words)    ; Compatible with XEmacs
                 space+ "\\)*"))
        (ids-with-dots "[_$a-zA-Z][_$.a-zA-Z0-9]*")
        (ids-with-dot-\[\] "[[_$a-zA-Z][][_$.a-zA-Z0-9]*")
        (paren-exp "([^);{}]*)")
        (generic-exp "<[^(){};]*>"))
    (concat "^[ \t]*"
            modifier*
            "\\(?:" generic-exp space* "\\)?"
            ids-with-dot-\[\] space+                ; first part of type
            "\\(?:" ids-with-dot-\[\] space+ "\\)?" ; optional second part of type.
            "\\(?:[_a-zA-Z][^][ \t:;.,{}()=<>]*"    ; defun name
                "\\|" ids-with-dot*
            "\\)" space*
            paren-exp
            "\\(?:" space* "]\\)*"      ; What's this for?
            "\\(?:" space* "\\<throws\\>" space* ids-with-dot-\[\]s*
                  "\\(?:," space* ids-with-dot-\[\]s* "\\)*"
            "\\)?"
            space*)))

> Thanks.

> -- Bob

> > On Oct 14, 2023, at 8:41 PM, Alan Mackenzie <acm <at> muc.de> wrote:

[ .... ]

> > Mats, I'm willing to work on that regular expression, and also the one
> > for C++.  As I mentioned earlier, I've got some tools which work on
> > regexps, in particular pp-regexp, which prints a regexp more readably on
> > several lines, and fix-re, which rewrites a regexp when it is
> > ill-conditioned in certain ways.

> > I foresee reverse engineering the regexps into more readable forms built
> > up by concatenating basic blocks.  For example for the java regexp I
> > would define

> >    (defconst id "[a-zA-Z][][_$.a-zA-Z0-9]*")

> > , and use this id in a largish concat form.

> > I'm also willing to share pp-regexp and fix-re with you(r team), if that
> > might help, on the understanding that neither is of release quality.

-- 
Alan Mackenzie (Nuremberg, Germany).




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#61436; Package emacs. (Mon, 16 Oct 2023 19:12:01 GMT) Full text and rfc822 format available.

Message #77 received at 61436-done <at> debbugs.gnu.org (full text, mbox):

From: Robert Weiner <rswgnu <at> gmail.com>
To: Alan Mackenzie <acm <at> muc.de>
Cc: Robert Weiner <rsw <at> gnu.org>, Hank Greenburg <hank.greenburg <at> protonmail.com>,
 Mats Lidell <mats.lidell <at> lidells.se>, 61436-done <at> debbugs.gnu.org,
 Eli Zaretskii <eliz <at> gnu.org>, Jens Schmidt <jschmidt4gnu <at> vodafonemail.de>
Subject: Re: bug#61436: Emacs Freezing With Java Files
Date: Mon, 16 Oct 2023 20:10:14 +0100
Nice job, Alan.  Thanks, look forward to trying it out.

-- Bob





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#61436; Package emacs. (Sat, 21 Oct 2023 22:16:02 GMT) Full text and rfc822 format available.

Message #80 received at 61436-done <at> debbugs.gnu.org (full text, mbox):

From: Mats Lidell <mats.lidell <at> lidells.se>
To: Alan Mackenzie <acm <at> muc.de>
Cc: Robert Weiner <rsw <at> gnu.org>, Hank Greenburg <hank.greenburg <at> protonmail.com>,
 61436-done <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>,
 Robert Weiner <rswgnu <at> gmail.com>, Jens Schmidt <jschmidt4gnu <at> vodafonemail.de>
Subject: Re: bug#61436: Emacs Freezing With Java Files
Date: Sun, 22 Oct 2023 00:14:43 +0200
Hi Alan,

Sorry for coming back late to this issue but today I tried to use the
suggested regexp and got into some problems doing that.

> Alan Mackenzie writes:
> Here's the regexp.  Would people please try it out and let me know how
> well it works.  
>
> (defconst java-defun-prompt-regexp
>   (let ((space* "[ \t\n\r\f]*")
>         (space+ "[ \t\n\r\f]+")
>         (modifier*
>          (concat "\\(?:"
>                  (regexp-opt '("abstract" "const" "default" "final" "native"
>                                "private" "protected" "public" "static"
>                                "strictfp" "synchronized" "threadsafe"
>                                "transient" "volatile")
>                              'words)    ; Compatible with XEmacs
>                  space+ "\\)*"))
>         (ids-with-dots "[_$a-zA-Z][_$.a-zA-Z0-9]*")
>         (ids-with-dot-\[\] "[[_$a-zA-Z][][_$.a-zA-Z0-9]*")
>         (paren-exp "([^);{}]*)")
>         (generic-exp "<[^(){};]*>"))
>     (concat "^[ \t]*"
>             modifier*
>             "\\(?:" generic-exp space* "\\)?"
>             ids-with-dot-\[\] space+                ; first part of type
>             "\\(?:" ids-with-dot-\[\] space+ "\\)?" ; optional second part of type.
>             "\\(?:[_a-zA-Z][^][ \t:;.,{}()=<>]*"    ; defun name
>                 "\\|" ids-with-dot*
>             "\\)" space*
>             paren-exp
>             "\\(?:" space* "]\\)*"      ; What's this for?
>             "\\(?:" space* "\\<throws\\>" space* ids-with-dot-\[\]s*
>                   "\\(?:," space* ids-with-dot-\[\]s* "\\)*"
>             "\\)?"
>             space*)))

Can there be some typos in there or missing lines? I get a compiler warning
for space+ being used in the let, should it be a let*? ids-with-dots is
reported as not used. Can it be meant to be ids-with-dot*?
ids-with-dot-\[\]s* is undefined!?

Yours
-- 
%% Mats




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#61436; Package emacs. (Sun, 22 Oct 2023 14:16:01 GMT) Full text and rfc822 format available.

Message #83 received at 61436-done <at> debbugs.gnu.org (full text, mbox):

From: Alan Mackenzie <acm <at> muc.de>
To: Mats Lidell <mats.lidell <at> lidells.se>
Cc: Robert Weiner <rsw <at> gnu.org>, Hank Greenburg <hank.greenburg <at> protonmail.com>,
 61436-done <at> debbugs.gnu.org, acm <at> muc.de, Eli Zaretskii <eliz <at> gnu.org>,
 Robert Weiner <rswgnu <at> gmail.com>, Jens Schmidt <jschmidt4gnu <at> vodafonemail.de>
Subject: Re: bug#61436: Emacs Freezing With Java Files
Date: Sun, 22 Oct 2023 14:15:08 +0000
Hello, Mats.

On Sun, Oct 22, 2023 at 00:14:43 +0200, Mats Lidell wrote:
> Hi Alan,

> Sorry for coming back late to this issue but today I tried to use the
> suggested regexp and got into some problems doing that.

Sorry about that.

> > Alan Mackenzie writes:
> > Here's the regexp.  Would people please try it out and let me know how
> > well it works.  
> >
> > (defconst java-defun-prompt-regexp
> >   (let ((space* "[ \t\n\r\f]*")
> >         (space+ "[ \t\n\r\f]+")
> >         (modifier*
> >          (concat "\\(?:"
> >                  (regexp-opt '("abstract" "const" "default" "final" "native"
> >                                "private" "protected" "public" "static"
> >                                "strictfp" "synchronized" "threadsafe"
> >                                "transient" "volatile")
> >                              'words)    ; Compatible with XEmacs
> >                  space+ "\\)*"))
> >         (ids-with-dots "[_$a-zA-Z][_$.a-zA-Z0-9]*")
> >         (ids-with-dot-\[\] "[[_$a-zA-Z][][_$.a-zA-Z0-9]*")
> >         (paren-exp "([^);{}]*)")
> >         (generic-exp "<[^(){};]*>"))
> >     (concat "^[ \t]*"
> >             modifier*
> >             "\\(?:" generic-exp space* "\\)?"
> >             ids-with-dot-\[\] space+                ; first part of type
> >             "\\(?:" ids-with-dot-\[\] space+ "\\)?" ; optional second part of type.
> >             "\\(?:[_a-zA-Z][^][ \t:;.,{}()=<>]*"    ; defun name
> >                 "\\|" ids-with-dot*
> >             "\\)" space*
> >             paren-exp
> >             "\\(?:" space* "]\\)*"      ; What's this for?
> >             "\\(?:" space* "\\<throws\\>" space* ids-with-dot-\[\]s*
> >                   "\\(?:," space* ids-with-dot-\[\]s* "\\)*"
> >             "\\)?"
> >             space*)))

> Can there be some typos in there or missing lines? I get a compiler warning
> for space+ being used in the let, should it be a let*? ids-with-dots is
> reported as not used. Can it be meant to be ids-with-dot*?
> ids-with-dot-\[\]s* is undefined!?

No, the regexp just wasn't tested right.  I made the mistake in
"testing" it of having all the things like space+ already defined as
defconsts.  So I failed to notice that the let should have been a let*,
and that some uses of the bound variables didn't have their actual
names.  :-(
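
The let versus let* difference can be seen in a small sketch.  The
function and variable names below are hypothetical and are not part of
the regexp in this thread; the forms are passed to eval with lexical
binding so that the failing case can be caught at run time.

```elisp
(defun demo-let-vs-let* ()
  "Return a list (LET-RESULT LET*-RESULT) for the same pair of bindings.
The variable names are hypothetical and exist only in the quoted forms."
  (list
   ;; With `let', every init form is evaluated before any variable is
   ;; bound, so the second init form cannot see `demo-space+'.
   (condition-case err
       (eval '(let ((demo-space+ "[ \t]+")
                    (demo-modifier (concat "static" demo-space+)))
                demo-modifier)
             t)
     (void-variable (list 'void-variable (cadr err))))
   ;; With `let*', bindings are made one at a time, so a later init
   ;; form can use an earlier variable.
   (eval '(let* ((demo-space+ "[ \t]+")
                 (demo-modifier (concat "static" demo-space+)))
            demo-modifier)
         t)))

;; (demo-let-vs-let*)
;; => ((void-variable demo-space+) "static[ \t]+")
```

If demo-space+ were already defined globally, for instance by a
defconst, the let form would quietly pick up that global value instead
of signalling an error, which matches the testing mistake described
above.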

Can I ask you to delete that buggy version and try the following
instead?  Thanks!

(defconst java-defun-prompt-regexp
  (let* ((space* "[ \t\n\r\f]*")
         (space+ "[ \t\n\r\f]+")
         (modifier*
          (concat "\\(?:"
                  (regexp-opt '("abstract" "const" "default" "final" "native"
                                "private" "protected" "public" "static"
                                "strictfp" "synchronized" "threadsafe"
                                "transient" "volatile")
                              'words)   ; Compatible with XEmacs
                  space+ "\\)*"))
         (ids-with-dots "[_$a-zA-Z][_$.a-zA-Z0-9]*")
         (ids-with-dot-\[\] "[[_$a-zA-Z][][_$.a-zA-Z0-9]*")
         (paren-exp "([^);{}]*)")
         (generic-exp "<[^(){};]*>"))
    (concat "^[ \t]*"
            modifier*
            "\\(?:" generic-exp space* "\\)?"
            ids-with-dot-\[\] space+                ; first part of type
            "\\(?:" ids-with-dot-\[\] space+ "\\)?" ; optional second part of type.
            "\\(?:[_a-zA-Z][^][ \t:;.,{}()=<>]*"    ; defun name
                "\\|" ids-with-dots
            "\\)" space*
            paren-exp
            "\\(?:" space* "]\\)*"      ; What's this for?
            "\\(?:" space* "\\<throws\\>" space* ids-with-dot-\[\]
                  "\\(?:," space* ids-with-dot-\[\] "\\)*"
            "\\)?"
            space*)))

> Yours
> -- 
> %% Mats

-- 
Alan Mackenzie (Nuremberg, Germany).




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#61436; Package emacs. (Sun, 22 Oct 2023 17:19:02 GMT) Full text and rfc822 format available.

Message #86 received at 61436-done <at> debbugs.gnu.org (full text, mbox):

From: Mats Lidell <mats.lidell <at> lidells.se>
To: Alan Mackenzie <acm <at> muc.de>
Cc: Robert Weiner <rsw <at> gnu.org>, Hank Greenburg <hank.greenburg <at> protonmail.com>,
 61436-done <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>,
 Robert Weiner <rswgnu <at> gmail.com>, Jens Schmidt <jschmidt4gnu <at> vodafonemail.de>
Subject: Re: bug#61436: Emacs Freezing With Java Files
Date: Sun, 22 Oct 2023 19:17:59 +0200
Hi Alan,

> Alan Mackenzie writes:
> Can I ask you to delete that buggy version and try the following
> instead?  Thanks!

Looks good to me. With the regexp and using Emacs 30.0.50 I moved around,
jumping from function to function back and forth. I tested that with a few
java files including the one submitted with the initial bug report. During the
testing I did not experience any noticeable freeze.

%% Mats




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Mon, 20 Nov 2023 12:24:08 GMT) Full text and rfc822 format available.

bug unarchived. Request was from Alan Mackenzie <acm <at> muc.de> to control <at> debbugs.gnu.org. (Wed, 17 Apr 2024 13:18:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#61436; Package emacs. (Wed, 17 Apr 2024 13:23:01 GMT) Full text and rfc822 format available.

Message #93 received at 61436 <at> debbugs.gnu.org (full text, mbox):

From: Alan Mackenzie <acm <at> muc.de>
To: 61436 <at> debbugs.gnu.org
Subject: Re: bug#61436: Emacs Freezing With Java Files
Date: Wed, 17 Apr 2024 13:22:16 +0000
Hello, Bob.

On Tue, Apr 16, 2024 at 21:35:59 -0400, Robert Weiner wrote:
>    Hi Alan:
>    I just re-read this whole thread and realized you resolved the problem
>    for the Java defun-prompt-regexp but not the C++
>    defun-prompt-regexp in Hyperbole's hui-select.el:L404 (probably were
>    just tired after all of that).
>    Today, someone else reported that the C++ regexp was hanging their
>    Emacs.  Do you think you could pick this back up and rework the C++
>    regexp as you did the Java one?  It would be a big help; otherwise, I
>    think we'll just have to disable that functionality in Hyperbole.
>    Best regards,
>    Bob
>    On Sun, Oct 22, 2023 at 1:18 PM Mats Lidell
>    <mats.lidell <at> lidells.se> wrote:

Yes, I'll happily finish off that C++ regexp.  I made considerable
progress with it back in October, getting something basically working but
with some rough edges.  One problem is that the regexp was ~1600
characters long.  I don't know if this might make the program slow -
possibly not.

I've found the .el file I was working in, and located my notes from
October.  It's going to take longer than a day or two, but hopefully
less than a week or two.

[ .... ]

-- 
Alan Mackenzie (Nuremberg, Germany).




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#61436; Package emacs. (Wed, 17 Apr 2024 18:51:02 GMT) Full text and rfc822 format available.

Message #96 received at 61436 <at> debbugs.gnu.org (full text, mbox):

From: Alan Mackenzie <acm <at> muc.de>
To: rswgnu <at> gmail.com
Cc: Hank Greenburg <hank.greenburg <at> protonmail.com>,
 Mats Lidell <mats.lidell <at> lidells.se>, acm <at> muc.de, Eli Zaretskii <eliz <at> gnu.org>,
 Jens Schmidt <jschmidt4gnu <at> vodafonemail.de>, 61436 <at> debbugs.gnu.org
Subject: Re: bug#61436: Emacs Freezing With Java Files
Date: Wed, 17 Apr 2024 18:50:33 +0000
Hello again, Bob.

On Wed, Apr 17, 2024 at 13:06:50 +0000, Alan Mackenzie wrote:
> On Tue, Apr 16, 2024 at 21:35:59 -0400, Robert Weiner wrote:
> >    Hi Alan:
> >    I just re-read this whole thread and realized you resolved the problem
> >    for the Java defun-prompt-regexp but not the C++
> >    defun-prompt-regexp in Hyperbole's hui-select.el:L404 (probably were
> >    just tired after all of that).
> >    Today, someone else reported that the C++ regexp was hanging their
> >    Emacs.  Do you think you could pick this back up and rework the C++
> >    regexp as you did the Java one?  It would be a big help; otherwise, I
> >    think we'll just have to disable that functionality in Hyperbole.
> >    Best regards,
> >    Bob
> >    On Sun, Oct 22, 2023 at 1:18 PM Mats Lidell
> >    <mats.lidell <at> lidells.se> wrote:

> Yes, I'll happily finish off that C++ regexp.  I made considerable
> progress with it back in October, getting something basically working but
> with some rough edges.  One problem is that the regexp was ~1600
> characters long.  I don't know if this might make the program slow -
> possibly not.

> I've found the .el file I was working in, and located my notes from
> October.  It's going to take longer than a day or two, but hopefully
> less than a week or two.

It was rather easier than I'd anticipated.  My first attempt is below.
It should find most C++ defun starts, but not all.  In particular it
won't recognise one with nested parens or nested template delimiters;
regexps cannot handle arbitrary nesting, and it didn't seem worth the
trouble to code in a small bounded degree of nesting, though this surely
could be done if I'm wrong here.
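
As an illustration of what a small bounded degree of nesting could look
like, here is a minimal sketch.  It is not part of the regexp in this
message; the function names are hypothetical, and the excluded
characters simply mirror those used for paren-exp elsewhere in the
thread.

```elisp
(defun demo-nested-paren-regexp (depth)
  "Return a regexp matching a parenthesised expression.
Parentheses may nest up to DEPTH levels deep (DEPTH >= 1)."
  (let ((re "([^(){};]*)"))             ; depth 1: no inner parens at all
    (dotimes (_ (1- depth))
      ;; One more level: between a pair of parens allow any mix of
      ;; ordinary characters and groups of the previous depth.
      (setq re (concat "(\\(?:[^(){};]\\|" re "\\)*)")))
    re))

(defun demo-matches-whole-p (depth string)
  "Return t if the depth-DEPTH regexp matches all of STRING."
  (and (string-match-p
        (concat "\\`" (demo-nested-paren-regexp depth) "\\'")
        string)
       t))

;; (demo-matches-whole-p 1 "(int (*f)(int), char c)")  ; => nil
;; (demo-matches-whole-p 2 "(int (*f)(int), char c)")  ; => t
```

Each extra level wraps the previous regexp once more, so the string
grows by a fixed amount per level; whether that is acceptable for an
already long regexp would need testing.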

The regexp is not small.  At the latest count it was 2,223 characters
long.  I hope this won't affect performance too much.

Please try out this regexp, and let me know how well it's working.
Thanks!

> [ .... ]



(defconst c++-defun-prompt-regexp
  (let*
      ((space* "[ \t\n\r\f]*")
       (space+ "[ \t\n\r\f]+")
       (ad-hoc-requires-clause
	(concat "\\(?:requires" space* "[][()<> \t\n\r\f_$a-zA-Z0-9&|\"'+=.,*:~-]+" space* "\\)?"))
       (id (concat "[_$~a-zA-Z][_$a-zA-Z0-9]*")
	   ;; (concat "\\(\\(~" space* "\\)?" "\\([_$a-zA-Z][_$a-zA-Z0-9]*\\)\\)")
	   )
       (template-brackets "\\(?:<[^;{}]*>\\)")
       (id-<> (concat id "\\(?:" space* template-brackets "\\)?"))
       (id-:: (concat id-<> "\\(?:" space* "::" space* id-<> "\\)*"))
       (paren-exp "([^{};]*)")
       (template-exp\? (concat "\\(?:template" space* template-brackets space* "\\)?"))
       (type-prefix-modifier* (concat "\\(?:\\(?:"
				      "\\(?:\\<extern" space+ "\"[^\"]+\"\\)"
				      "\\|"
				      (regexp-opt '("auto" "const" "explicit" "extern"
						    "friend" "inline" "mutable"
						    "noexcept" "overload"
						    "register" "static" "typedef"
						    "virtual" "volatile")
						  'words)
				      "\\)"
				      space+
				      "\\)*"))
       (type-exp (concat
		  "\\(?:\\(?:" template-brackets space* "\\)?"
		  type-prefix-modifier*
		  "\\(?:\\(?:decltype" space* paren-exp space* "\\)"
		  "\\|"
		  "\\(?:"
		  "\\(?:class\\|enum\\|struct\\|typename\\|union\\)"
		  "\\(?:" space* "\\.\\.\\.\\)?\\)"
		  space+ id space*
		  "\\(?::" id-:: space* "\\)?"
		  "\\|"
		  id-:: space*
		  "\\)"
		  "\\)\\{1,2\\}"))
       (type-mid-modifier* (concat "\\(?:"
				   (regexp-opt
				    '("auto" "consteval" "constexpr"
				      "constinit" "explicit"
				      "extern" "friend" "inline"
				      "mutable" "noexcept" "register"
				      "static" "template"
				      "thread_local" "throw"
				      "virtual" "volatile")
				    'words)
				   space+ "\\)*"))
       (operator-exp (concat "\\(?:operator\\>" space*
			     "\\(?:[][a-z_+*/%^?&|!~<>,:=-]+"
			     "\\|()\\|\"\""
			     "\\)" space*
			     "\\)"))

       (name-exp			; matches foo or (* foo), etc.
	(concat "\\(?:(" space* "[*&]+" space* id-:: space* "[][()]*" ")"
		"\\|\\(?:[*&]+" space* "\\)?" id-::
		"\\)" space*))
       (type-suffix-modifier* (concat "\\(?:"
				      (regexp-opt
				       '("auto" "const" "noexcept"
					 "requires" "throw" "volatile")
				       'words)
				      space+ "\\)*"))
       (post-paren-modifier* (concat "\\(?:"
				     (regexp-opt
				      '("const" "final" "override"
					"mutable")
				      'words)
				     space* "\\)*")))

    (concat template-exp\?
	    "\\(?:" ad-hoc-requires-clause "\\)?"
	    type-exp
	    type-mid-modifier*
	    "\\(?:" operator-exp "\\|" name-exp "\\)"
	    type-suffix-modifier*
	    paren-exp space*
	    "\\(?:->" space* type-exp "\\)?"
	    post-paren-modifier*
	    "{")))


-- 
Alan Mackenzie (Nuremberg, Germany).




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#61436; Package emacs. (Wed, 17 Apr 2024 22:25:03 GMT) Full text and rfc822 format available.

Message #99 received at 61436 <at> debbugs.gnu.org (full text, mbox):

From: Robert Weiner <rswgnu <at> gmail.com>
To: Alan Mackenzie <acm <at> muc.de>
Cc: Hank Greenburg <hank.greenburg <at> protonmail.com>,
 Jens Schmidt <jschmidt4gnu <at> vodafonemail.de>, Eli Zaretskii <eliz <at> gnu.org>,
 Mats Lidell <mats.lidell <at> lidells.se>, 61436 <at> debbugs.gnu.org
Subject: Re: bug#61436: Emacs Freezing With Java Files
Date: Wed, 17 Apr 2024 18:24:21 -0400
Great, will do, thanks, Alan.

-- Bob





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#61436; Package emacs. (Fri, 19 Apr 2024 02:22:02 GMT) Full text and rfc822 format available.

Message #102 received at 61436 <at> debbugs.gnu.org (full text, mbox):

From: Robert Weiner <rsw <at> gnu.org>
To: Alan Mackenzie <acm <at> muc.de>
Cc: Hank Greenburg <hank.greenburg <at> protonmail.com>,
 Jens Schmidt <jschmidt4gnu <at> vodafonemail.de>, Eli Zaretskii <eliz <at> gnu.org>,
 Mats Lidell <mats.lidell <at> lidells.se>, 61436 <at> debbugs.gnu.org
Subject: Re: bug#61436: Emacs Freezing With Java Files
Date: Thu, 18 Apr 2024 22:19:51 -0400
[Message part 1 (text/plain, inline)]
Hi Alan:

I'm starting to look at your rewrite of the c++-defun-prompt-regexp.  I am
wondering if we need one for the equivalent java regexp or if the patch you
mention in your prior message is all that should be needed there.

Regards,

Bob
[Message part 2 (text/html, inline)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#61436; Package emacs. (Fri, 19 Apr 2024 03:00:02 GMT) Full text and rfc822 format available.

Message #105 received at 61436 <at> debbugs.gnu.org (full text, mbox):

From: Robert Weiner <rsw <at> gnu.org>
To: Alan Mackenzie <acm <at> muc.de>
Cc: Hank Greenburg <hank.greenburg <at> protonmail.com>,
 Jens Schmidt <jschmidt4gnu <at> vodafonemail.de>, Eli Zaretskii <eliz <at> gnu.org>,
 Mats Lidell <mats.lidell <at> lidells.se>, 61436 <at> debbugs.gnu.org
Subject: Re: bug#61436: Emacs Freezing With Java Files
Date: Thu, 18 Apr 2024 22:58:15 -0400
[Message part 1 (text/plain, inline)]
Hi Alan:

This is to confirm that I have tested your cc-defs.el patch and that it
works properly, eliminating the Emacs hang when using the Hyperbole
java-defun-prompt-regexp.  Nice work.

Regards,

Bob


On Fri, Oct 13, 2023 at 8:42 AM Alan Mackenzie <acm <at> muc.de> wrote:

> Hello, Jens.
>
> On Thu, Oct 12, 2023 at 21:58:06 +0200, Jens Schmidt wrote:
> > Hi Alan,
>
> > Alan Mackenzie <acm <at> muc.de> writes:
>
> [ .... ]
>
> > >> That always freezes Emacs (29 and master) even before it has a chance to
> > >> display P1.java.  The freeze happens in function
> > >> `c-get-fallback-scan-pos', where the while loop inf-loops.
>
> Yes.
>
> > > c-get-fallback-scan-pos tries to move to the beginning of a function.
> > > This probably involves defun-prompt-regexp when it is non-nil.  :-(
>
> > Otherwise we would see hangs or exponential behavior (?) somewhere in
> > the Emacs regexp machinery, but they take place in that while loop.  So
> > I guess that there must be some other, additional quality that this
> > regexp fulfills.  Like: "matches the empty string" (which it does not,
> > as far as I can tell) or: "must only match before curlies" or whatnot.
>
> > Unfortunately, the doc string/info doc of `defun-prompt-regexp' provides
> > only exactly that latter criterion:
>
> >   That is to say, a defun begins on a line that starts with a match for
> >   this regular expression, followed by a character with open-parenthesis
> >   syntax.
>
> > I guess that only pruning that regexp until things start unfreezing
> > could give an answer here.  Or more tracing to see how point moves in
> > `c-get-fallback-scan-pos'.  But I need some tracing break here ...
>
>
> > ... or so I thought, I just couldn't resist:
>
> > I expanded and instrumented that function from emacs-29 as follows,
> > (hopefully) not changing any of its logic:
>
> > ------------------------- snip -------------------------
> > (defun c-get-fallback-scan-pos (here)
> >   ;; Return a start position for building `c-state-cache' from scratch.  This
> >   ;; will be at the top level, 2 defuns back.  Return nil if we don't find
> >   ;; these defun starts a reasonable way back.
> >   (message "c-get-fallback-scan-pos")
> >   (save-excursion
> >     (save-restriction
> >       (when (> here (* 10 c-state-cache-too-far))
> >         (narrow-to-region (- here (* 10 c-state-cache-too-far)) here))
> >       ;; Go back 2 bods, but ignore any bogus positions returned by
> >       ;; beginning-of-defun (i.e. open paren in column zero).
> >       (goto-char here)
> >       (let ((cnt 2))
> >         (message "beginning-of-defun-loop-00: %d %d" cnt (point))
> >         (while (not (or (bobp) (zerop cnt)))
> >           (message "beginning-of-defun-loop-01: %d" (point))
> >           (let (beginning-of-defun-function end-of-defun-function)
> >             (beginning-of-defun))
> >           (and defun-prompt-regexp
> >                (looking-at defun-prompt-regexp)
> >                (message "beginning-of-defun-loop-02: %d" (point))
> >                (goto-char (match-end 0)))
> >           (message "beginning-of-defun-loop-03: %d" (point))
> >           (if (eq (char-after) ?\{)
> >               (setq cnt (1- cnt)))))
> >       (and (not (bobp))
> >            (point)))))
> > ------------------------- snip -------------------------
>
> > That results in the message triple
>
> > ------------------------- snip -------------------------
> > beginning-of-defun-loop-01: 5879
> > beginning-of-defun-loop-02: 5801
> > beginning-of-defun-loop-03: 5879
> > beginning-of-defun-loop-01: 5879
> > beginning-of-defun-loop-02: 5801
> > beginning-of-defun-loop-03: 5879
> > ...
> > ------------------------- snip -------------------------
>
> > inf-looping.  These points are (|: 5801, ^: 5879) here in P1.java:
>
> > ------------------------- snip -------------------------
> > 178    } catch (Exception e) {
> > 179|      error("symTable.addDecl", "unexpected error with a single HashMap " + e)^;
> > 180    }
> > 181
> > ------------------------- snip -------------------------
>
> > So the catch-block just before line 181 is recognized as a potential BOD
> > (previous trailing open curly?).  But then `defun-prompt-regexp' matches
> > the function call in the catch-block as defun prompt regexp (which it
> > better should not?), taking point back to where, on next BOD search, the
> > exact previous BOD is found again.
>
> > So probably there are really two issues here:
>
> > 1. The `defun-prompt-regexp' used by Hyperbole, which matches too
> >    broadly, and
>
> > 2. function `c-get-fallback-scan-pos', which could try harder to avoid
> >    inf-loops when such things happen.
>
> > But that's where I *really* stop here :-)
>
> You've diagnosed the bug completely.  Thanks!  The hang was caused
> entirely by the loop in c-get-fallback-scan-pos, not the deficiencies in
> that long regexp.
>
> defun-prompt-regexp, when appended with a \\s( (as is done in
> beginning-of-defun-raw) matches the "      error(" on L179 of P1.java.
> The bare defun-prompt-regexp (as used in CC Mode) matches the entire
> line except the terminating ;.  This regexp could do with some
> amendment, but it is not the main cause of the bug.
>
> To solve the bug, I'm amending the macro c-beginning-of-defun-1 so that
> it only stops at a defun-prompt-regexp position when it also found a {.
> Otherwise it will keep looping until it finds a better position or BOB.
>
> Would all concerned please apply the attached patch to the Emacs master
> branch, directory lisp/progmodes.  Then please byte compile CC Mode in
> full (a macro has been changed), and try the result on your real Java
> code.  (If anybody wants any help applying the patch or byte compiling,
> feel free to send me private mail.)  Then please confirm that the bug is
> indeed fixed.  Thanks!
>
> --
> Alan Mackenzie (Nuremberg, Germany).
>
>
[Message part 2 (text/html, inline)]
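
The check described in the message above (accept a defun-prompt-regexp
match only when a { follows it) can be illustrated with a small,
self-contained sketch.  This is not the CC Mode patch: the regexp below
is a deliberately over-broad toy, and every name prefixed my-toy- is
hypothetical.  It only shows why a call such as error(...) inside a
catch block can match such a regexp, and how testing the character
after the match rejects it.

```elisp
;; Hypothetical, deliberately over-broad stand-in for a defun prompt
;; regexp: optional indentation, optional leading words (e.g. a return
;; type), an identifier, a parenthesised argument list, trailing blanks.
(defvar my-toy-defun-prompt-regexp
  (concat "[ \t]*"
          "\\(?:[A-Za-z_][A-Za-z0-9_.]*[ \t]+\\)*"
          "[A-Za-z_][A-Za-z0-9_.]*[ \t]*"
          "([^{};]*)[ \t]*"))

(defun my-toy-defun-prompt-accepted-p ()
  "Return non-nil if the toy regexp matches at point and `{' follows the match."
  (and (looking-at my-toy-defun-prompt-regexp)
       (eq (char-after (match-end 0)) ?\{)))

(defun my-toy-classify-lines (text)
  "Return a list of (LINE MATCHES ACCEPTED) entries, one per line of TEXT.
MATCHES is t when the toy regexp matches at the start of the line.
ACCEPTED is t only when that match is directly followed by `{'."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (let ((n 1) result)
      (while (not (eobp))
        (push (list n
                    (and (looking-at my-toy-defun-prompt-regexp) t)
                    (and (my-toy-defun-prompt-accepted-p) t))
              result)
        (forward-line 1)
        (setq n (1+ n)))
      (nreverse result))))
```

For the three-line input

  void addDecl(String name) {
        error("a", "b" + e);
  }

(passed as one string with a newline after each line),
my-toy-classify-lines returns ((1 t t) (2 t nil) (3 nil nil)): both the
method header and the error call match the toy regexp, but only the
header is accepted, because the match on line 2 ends at ; rather than {.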

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#61436; Package emacs. (Fri, 19 Apr 2024 04:41:04 GMT) Full text and rfc822 format available.

Message #108 received at 61436 <at> debbugs.gnu.org (full text, mbox):

From: Robert Weiner <rswgnu <at> gmail.com>
To: rswgnu <at> gmail.com
Cc: Hank Greenburg <hank.greenburg <at> protonmail.com>,
 Mats Lidell <mats.lidell <at> lidells.se>, acm <at> muc.de, Eli Zaretskii <eliz <at> gnu.org>,
 Jens Schmidt <jschmidt4gnu <at> vodafonemail.de>, 61436 <at> debbugs.gnu.org
Subject: Re: bug#61436: Emacs Freezing With Java Files
Date: Fri, 19 Apr 2024 00:40:19 -0400
If we do need your c++ regexp, do we have a repeatable test that generates the hang?  I cannot generate such a problem even without your patch while using the prior Hyperbole c++ regexp.


-- Bob

> On Apr 18, 2024, at 10:19 PM, Robert Weiner <rsw <at> gnu.org> wrote:
> 
> 
> Hi Alan:
> 
> I'm starting to look at your rewrite of the c++-defun-prompt-regexp.  I am wondering if we need one for the equivalent java regexp or if the patch you mention in your prior message is all that should be needed there.
> 
> Regards,
> 
> Bob




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#61436; Package emacs. (Fri, 19 Apr 2024 16:00:07 GMT) Full text and rfc822 format available.

Message #111 received at 61436 <at> debbugs.gnu.org (full text, mbox):

From: Alan Mackenzie <acm <at> muc.de>
To: Robert Weiner <rswgnu <at> gmail.com>
Cc: Hank Greenburg <hank.greenburg <at> protonmail.com>,
 Mats Lidell <mats.lidell <at> lidells.se>, acm <at> muc.de, Eli Zaretskii <eliz <at> gnu.org>,
 Jens Schmidt <jschmidt4gnu <at> vodafonemail.de>, 61436 <at> debbugs.gnu.org
Subject: Re: bug#61436: Emacs Freezing With Java Files
Date: Fri, 19 Apr 2024 15:59:22 +0000
Hello, Bob.

On Fri, Apr 19, 2024 at 00:40:19 -0400, Robert Weiner wrote:
> If we do need your c++ regexp, do we have a repeatable test that
> generates the hang?  I cannot generate such a problem even without
> your patch while using the prior Hyperbole c++ regexp.

I'm afraid I don't (yet) have a repeatable test case.  I have a raw
init.el file from pillowtrucker <at> proton.me, but it's quite some work to
extract a test procedure from it.  This is a separate bug involving
a hang in a C++ file, but I noticed it was using Hyperbole and that
Hyperbole's value of defun-prompt-regexp was set.

> -- Bob

> > On Apr 18, 2024, at 10:19 PM, Robert Weiner <rsw <at> gnu.org> wrote:

> > 
> > Hi Alan:

> > I'm starting to look at your rewrite of the c++-defun-prompt-regexp.
> > I am wondering if we need one for the equivalent java regexp or if
> > the patch you mention in your prior message is all that should be
> > needed there.

> > Regards,

> > Bob

-- 
Alan Mackenzie (Nuremberg, Germany).




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sat, 18 May 2024 11:24:07 GMT) Full text and rfc822 format available.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.