GNU bug report logs - #71896
shell-resync-dirs hang

Previous Next

Package: emacs;

Reported by: Troy Hinckley <troyhinckley <at> dabrev.com>

Date: Tue, 2 Jul 2024 05:17:04 UTC

Severity: normal

Done: Eli Zaretskii <eliz <at> gnu.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 71896 in the body.
You can then email your comments to 71896 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#71896; Package emacs. (Tue, 02 Jul 2024 05:17:04 GMT) Full text and rfc822 format available.

Acknowledgement sent to Troy Hinckley <troyhinckley <at> dabrev.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Tue, 02 Jul 2024 05:17:04 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Troy Hinckley <troyhinckley <at> dabrev.com>
To: bug-gnu-emacs <at> gnu.org
Subject: shell-resync-dirs hang
Date: Mon, 1 Jul 2024 21:22:35 -0500
[Message part 1 (text/plain, inline)]
This is related to #59804 and #54384.

I will occasionally (about 30% of the time) see hangs when running shell-resync-dirs in Emacs 29. I tracked this down to two issues:


First is in shell-eval-command. For Emacs 29 this function was added to fix #54384. It has this section in the code:

```
;; Wait until we get a prompt (which will be a line without
 ;; a newline). This is far from fool-proof -- if something
 ;; outputs incomplete data and then sleeps, we'll think
 ;; we've received the prompt.
 (while (not (let* ((lines (string-lines result))
 (last (car (last lines))))
 (and (length> lines 0)
 (not (equal last ""))
 (or (not prev)
 (not (equal last prev)))
 (setq prev last))))
 (accept-process-output proc 0 100))
```

Note that is says that is looking for “a line without a newline” to determine if we have reached the prompt. However this code does not actually do that. If the result ends in a newline it will still terminate the loop and not wait for more input. We can see that by the fact that the following code evaluates to nil.

```
(let ((result "dirs\n") prev)
 (not (let* ((lines (string-lines result))
 (last (car (last lines))))
 (and (length> lines 0)
 (not (equal last ""))
 (or (not prev)
 (not (equal last prev)))
 (setq prev last)))))
```

I am not sure what this code is supposed to do, but the issue arrises if the process output sends anything to this function it will terminate and not wait for more input. In my case the issue is that the shell is echoing the command followed by the result (comint-process-echoes). About 30% of the time these two lines get sent as part of two different outputs. Meaning the second line (the directory for shell-resync-dirs) does not get captured and instead gets printed to the terminal.


This leads us to the hang. The issue is this code in shell-resync-dirs:

```
(while dlsl
 (let ((newelt "")
 tem1 tem2)
 (while newelt
 ;; We need tem1 because we don't want to prepend
 ;; `comint-file-name-prefix' repeatedly into newelt via tem2.
 (setq tem1 (pop dlsl)
 tem2 (concat comint-file-name-prefix tem1 newelt))
 (cond ((file-directory-p tem2)
 (push tem2 ds)
 (when (string= " " (car dlsl))
 (pop dlsl))
 (setq newelt nil))
 (t
 (setq newelt (concat tem1 newelt)))))))
```

This loop can only terminate if tem2 is a valid directory. Otherwise it will take the default case in the cond and loop forever. And since the bug in shell-eval-command does not provide the directory when the process output is split, we get a hang.

I believe both of these need to be fixed to properly fix the bug.

For the shell-eval-command I don’t understand what that loop is trying to do now, so I am not sure how to fix it without breaking its functionality. I would just use (string-suffix-p “\n” result) to check if the output ends in a newline, but the code is obviously trying to do something more complex there.

If we fix that issue then it will resolve the hang in shell-resync-dirs, but I think that is just glossing over the problem. That functions should never hang, no matter what output it get’s from the shell. My recommendation would be to add `(and dlsl newelt)` as the condition for the inner while loop with newelt. That way if dlsl is empty, it will terminate the loop since there is nothing more to process. This fixed the issue for me.

-Troy Hinckley
[Message part 2 (text/html, inline)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#71896; Package emacs. (Sat, 06 Jul 2024 09:59:02 GMT) Full text and rfc822 format available.

Message #8 received at 71896 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Troy Hinckley <troyhinckley <at> dabrev.com>
Cc: 71896 <at> debbugs.gnu.org
Subject: Re: bug#71896: shell-resync-dirs hang
Date: Sat, 06 Jul 2024 12:58:00 +0300
> Date: Mon, 1 Jul 2024 21:22:35 -0500
> From: Troy Hinckley <troyhinckley <at> dabrev.com>
> 
> This is related to #59804 and #54384.
> 
> I will occasionally (about 30% of the time) see hangs when running shell-resync-dirs in Emacs 29. I tracked this
> down to two issues:
> 
> First is in shell-eval-command. For Emacs 29 this function was added to fix #54384. It has this section in the
> code:
> 
> ```
> ;; Wait until we get a prompt (which will be a line without
>  ;; a newline). This is far from fool-proof -- if something
>  ;; outputs incomplete data and then sleeps, we'll think
>  ;; we've received the prompt.
>  (while (not (let* ((lines (string-lines result))
>  (last (car (last lines))))
>  (and (length> lines 0)
>  (not (equal last ""))
>  (or (not prev)
>  (not (equal last prev)))
>  (setq prev last))))
>  (accept-process-output proc 0 100))
> ```
> 
> Note that is says that is looking for “a line without a newline” to determine if we have reached the prompt.
> However this code does not actually do that. If the result ends in a newline it will still terminate the loop and not
> wait for more input. We can see that by the fact that the following code evaluates to nil.
> 
> ```
> (let ((result "dirs\n") prev)
>  (not (let* ((lines (string-lines result))
>  (last (car (last lines))))
>  (and (length> lines 0)
>  (not (equal last ""))
>  (or (not prev)
>  (not (equal last prev)))
>  (setq prev last)))))
> ```
> 
> I am not sure what this code is supposed to do, but the issue arrises if the process output sends anything to
> this function it will terminate and not wait for more input. In my case the issue is that the shell is echoing the
> command followed by the result (comint-process-echoes). About 30% of the time these two lines get sent as
> part of two different outputs. Meaning the second line (the directory for shell-resync-dirs) does not get captured
> and instead gets printed to the terminal.

Does the patch below solve the problem in shell-eval-command?

> This leads us to the hang. The issue is this code in shell-resync-dirs:
> 
> ```
> (while dlsl
>  (let ((newelt "")
>  tem1 tem2)
>  (while newelt
>  ;; We need tem1 because we don't want to prepend
>  ;; `comint-file-name-prefix' repeatedly into newelt via tem2.
>  (setq tem1 (pop dlsl)
>  tem2 (concat comint-file-name-prefix tem1 newelt))
>  (cond ((file-directory-p tem2)
>  (push tem2 ds)
>  (when (string= " " (car dlsl))
>  (pop dlsl))
>  (setq newelt nil))
>  (t
>  (setq newelt (concat tem1 newelt)))))))
> ```
> 
> This loop can only terminate if tem2 is a valid directory. Otherwise it will take the default case in the cond and
> loop forever. And since the bug in shell-eval-command does not provide the directory when the process output
> is split, we get a hang.
> 
> I believe both of these need to be fixed to properly fix the bug.
> 
> For the shell-eval-command I don’t understand what that loop is trying to do now, so I am not sure how to fix it
> without breaking its functionality. I would just use (string-suffix-p “\n” result) to check if the output ends in a
> newline, but the code is obviously trying to do something more complex there.
> 
> If we fix that issue then it will resolve the hang in shell-resync-dirs, but I think that is just glossing over the
> problem. That functions should never hang, no matter what output it get’s from the shell. My recommendation
> would be to add `(and dlsl newelt)` as the condition for the inner while loop with newelt. That way if dlsl is
> empty, it will terminate the loop since there is nothing more to process. This fixed the issue for me. 

Thanks, I think I agree with your suggestion for shell-resync-dirs.
But please undo the fix you evidently made there to avoid the infloop,
and see if the patch below for shell-eval-command makes
shell-resync-dirs do its job by correctly resynchronizing
shell-dirtrack.

diff --git a/lisp/shell.el b/lisp/shell.el
index e1936ff..f86156e 100644
--- a/lisp/shell.el
+++ b/lisp/shell.el
@@ -1629,10 +1629,12 @@ shell-eval-command
           ;; a newline).  This is far from fool-proof -- if something
           ;; outputs incomplete data and then sleeps, we'll think
           ;; we've received the prompt.
-          (while (not (let* ((lines (string-lines result))
-                             (last (car (last lines))))
+          (while (not (let* ((lines (string-lines result nil t))
+                             (last (car (last lines)))
+                             (last-end (substring last -1)))
                         (and (length> lines 0)
-                             (not (equal last ""))
+                             (not (member last '("" "\n")))
+                             (not (equal last-end "\n"))
                              (or (not prev)
                                  (not (equal last prev)))
                              (setq prev last))))




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#71896; Package emacs. (Mon, 08 Jul 2024 17:05:01 GMT) Full text and rfc822 format available.

Message #11 received at 71896 <at> debbugs.gnu.org (full text, mbox):

From: Troy Hinckley <troyhinckley <at> dabrev.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 71896 <at> debbugs.gnu.org
Subject: Re: bug#71896: shell-resync-dirs hang
Date: Mon, 8 Jul 2024 12:04:07 -0500
[Message part 1 (text/plain, inline)]
Thanks Eli,
The proposed patch throws an error when `last’ is “” because (substring “” -1) produces (args-out-of-range “” -1 nil).

-Troy Hinckley
On Jul 6, 2024 at 4:58 AM -0500, Eli Zaretskii <eliz <at> gnu.org>, wrote:
> > Date: Mon, 1 Jul 2024 21:22:35 -0500
> > From: Troy Hinckley <troyhinckley <at> dabrev.com>
> >
> > This is related to #59804 and #54384.
> >
> > I will occasionally (about 30% of the time) see hangs when running shell-resync-dirs in Emacs 29. I tracked this
> > down to two issues:
> >
> > First is in shell-eval-command. For Emacs 29 this function was added to fix #54384. It has this section in the
> > code:
> >
> > ```
> > ;; Wait until we get a prompt (which will be a line without
> > ;; a newline). This is far from fool-proof -- if something
> > ;; outputs incomplete data and then sleeps, we'll think
> > ;; we've received the prompt.
> > (while (not (let* ((lines (string-lines result))
> > (last (car (last lines))))
> > (and (length> lines 0)
> > (not (equal last ""))
> > (or (not prev)
> > (not (equal last prev)))
> > (setq prev last))))
> > (accept-process-output proc 0 100))
> > ```
> >
> > Note that is says that is looking for “a line without a newline” to determine if we have reached the prompt.
> > However this code does not actually do that. If the result ends in a newline it will still terminate the loop and not
> > wait for more input. We can see that by the fact that the following code evaluates to nil.
> >
> > ```
> > (let ((result "dirs\n") prev)
> > (not (let* ((lines (string-lines result))
> > (last (car (last lines))))
> > (and (length> lines 0)
> > (not (equal last ""))
> > (or (not prev)
> > (not (equal last prev)))
> > (setq prev last)))))
> > ```
> >
> > I am not sure what this code is supposed to do, but the issue arrises if the process output sends anything to
> > this function it will terminate and not wait for more input. In my case the issue is that the shell is echoing the
> > command followed by the result (comint-process-echoes). About 30% of the time these two lines get sent as
> > part of two different outputs. Meaning the second line (the directory for shell-resync-dirs) does not get captured
> > and instead gets printed to the terminal.
>
> Does the patch below solve the problem in shell-eval-command?
>
> > This leads us to the hang. The issue is this code in shell-resync-dirs:
> >
> > ```
> > (while dlsl
> > (let ((newelt "")
> > tem1 tem2)
> > (while newelt
> > ;; We need tem1 because we don't want to prepend
> > ;; `comint-file-name-prefix' repeatedly into newelt via tem2.
> > (setq tem1 (pop dlsl)
> > tem2 (concat comint-file-name-prefix tem1 newelt))
> > (cond ((file-directory-p tem2)
> > (push tem2 ds)
> > (when (string= " " (car dlsl))
> > (pop dlsl))
> > (setq newelt nil))
> > (t
> > (setq newelt (concat tem1 newelt)))))))
> > ```
> >
> > This loop can only terminate if tem2 is a valid directory. Otherwise it will take the default case in the cond and
> > loop forever. And since the bug in shell-eval-command does not provide the directory when the process output
> > is split, we get a hang.
> >
> > I believe both of these need to be fixed to properly fix the bug.
> >
> > For the shell-eval-command I don’t understand what that loop is trying to do now, so I am not sure how to fix it
> > without breaking its functionality. I would just use (string-suffix-p “\n” result) to check if the output ends in a
> > newline, but the code is obviously trying to do something more complex there.
> >
> > If we fix that issue then it will resolve the hang in shell-resync-dirs, but I think that is just glossing over the
> > problem. That functions should never hang, no matter what output it get’s from the shell. My recommendation
> > would be to add `(and dlsl newelt)` as the condition for the inner while loop with newelt. That way if dlsl is
> > empty, it will terminate the loop since there is nothing more to process. This fixed the issue for me.
>
> Thanks, I think I agree with your suggestion for shell-resync-dirs.
> But please undo the fix you evidently made there to avoid the infloop,
> and see if the patch below for shell-eval-command makes
> shell-resync-dirs do its job by correctly resynchronizing
> shell-dirtrack.
>
> diff --git a/lisp/shell.el b/lisp/shell.el
> index e1936ff..f86156e 100644
> --- a/lisp/shell.el
> +++ b/lisp/shell.el
> @@ -1629,10 +1629,12 @@ shell-eval-command
> ;; a newline). This is far from fool-proof -- if something
> ;; outputs incomplete data and then sleeps, we'll think
> ;; we've received the prompt.
> - (while (not (let* ((lines (string-lines result))
> - (last (car (last lines))))
> + (while (not (let* ((lines (string-lines result nil t))
> + (last (car (last lines)))
> + (last-end (substring last -1)))
> (and (length> lines 0)
> - (not (equal last ""))
> + (not (member last '("" "\n")))
> + (not (equal last-end "\n"))
> (or (not prev)
> (not (equal last prev)))
> (setq prev last))))
[Message part 2 (text/html, inline)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#71896; Package emacs. (Mon, 08 Jul 2024 17:46:02 GMT) Full text and rfc822 format available.

Message #14 received at 71896 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Troy Hinckley <troyhinckley <at> dabrev.com>
Cc: 71896 <at> debbugs.gnu.org
Subject: Re: bug#71896: shell-resync-dirs hang
Date: Mon, 08 Jul 2024 20:42:11 +0300
> Date: Mon, 8 Jul 2024 12:04:07 -0500
> From: Troy Hinckley <troyhinckley <at> dabrev.com>
> Cc: 71896 <at> debbugs.gnu.org
> 
> The proposed patch throws an error when `last’ is “” because (substring “” -1) produces (args-out-of-range “” -
> 1 nil).

Is that the only problem with the patch?  That is, if this is
trivially fixed, does all the rest work correctly and fixes the
problems?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#71896; Package emacs. (Mon, 08 Jul 2024 17:54:01 GMT) Full text and rfc822 format available.

Message #17 received at 71896 <at> debbugs.gnu.org (full text, mbox):

From: Troy Hinckley <troyhinckley <at> dabrev.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 71896 <at> debbugs.gnu.org
Subject: Re: bug#71896: shell-resync-dirs hang
Date: Mon, 8 Jul 2024 12:53:06 -0500
[Message part 1 (text/plain, inline)]
Yes, if I fix that one issue with the patch then all the rest works correctly and fixes the problem.
Thank you.

-Troy Hinckley
On Jul 8, 2024 at 12:42 PM -0500, Eli Zaretskii <eliz <at> gnu.org>, wrote:
> > Date: Mon, 8 Jul 2024 12:04:07 -0500
> > From: Troy Hinckley <troyhinckley <at> dabrev.com>
> > Cc: 71896 <at> debbugs.gnu.org
> >
> > The proposed patch throws an error when `last’ is “” because (substring “” -1) produces (args-out-of-range “” -
> > 1 nil).
>
> Is that the only problem with the patch? That is, if this is
> trivially fixed, does all the rest work correctly and fixes the
> problems?
[Message part 2 (text/html, inline)]

Reply sent to Eli Zaretskii <eliz <at> gnu.org>:
You have taken responsibility. (Fri, 12 Jul 2024 07:02:01 GMT) Full text and rfc822 format available.

Notification sent to Troy Hinckley <troyhinckley <at> dabrev.com>:
bug acknowledged by developer. (Fri, 12 Jul 2024 07:02:01 GMT) Full text and rfc822 format available.

Message #22 received at 71896-done <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Troy Hinckley <troyhinckley <at> dabrev.com>
Cc: 71896-done <at> debbugs.gnu.org
Subject: Re: bug#71896: shell-resync-dirs hang
Date: Fri, 12 Jul 2024 10:00:46 +0300
> Date: Mon, 8 Jul 2024 12:53:06 -0500
> From: Troy Hinckley <troyhinckley <at> dabrev.com>
> Cc: 71896 <at> debbugs.gnu.org
> 
> Yes, if I fix that one issue with the patch then all the rest works correctly and fixes the problem.

Thanks, so I've now installed an augmented fix on the emacs-30 release
branch, and I'm therefore closing this bug.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Fri, 09 Aug 2024 11:24:12 GMT) Full text and rfc822 format available.

This bug report was last modified 146 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.