GNU bug report logs - #79436
[PATCH] Use up-to-date time in wait_reading_process_output

Previous Next

Package: emacs;

Reported by: Spencer Baugh <sbaugh <at> janestreet.com>

Date: Fri, 12 Sep 2025 15:40:03 UTC

Severity: normal

Tags: patch

Done: Eli Zaretskii <eliz <at> gnu.org>

To reply to this bug, email your comments to 79436 AT debbugs.gnu.org.
There is no need to reopen the bug first.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to eggert <at> cs.ucla.edu, dmitry <at> gutov.dev, bug-gnu-emacs <at> gnu.org:
bug#79436; Package emacs. (Fri, 12 Sep 2025 15:40:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Spencer Baugh <sbaugh <at> janestreet.com>:
New bug report received and forwarded. Copy sent to eggert <at> cs.ucla.edu, dmitry <at> gutov.dev, bug-gnu-emacs <at> gnu.org. (Fri, 12 Sep 2025 15:40:06 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Spencer Baugh <sbaugh <at> janestreet.com>
To: bug-gnu-emacs <at> gnu.org
Subject: [PATCH] Use up-to-date time in wait_reading_process_output
Date: Fri, 12 Sep 2025 11:38:29 -0400
[Message part 1 (text/plain, inline)]
Tags: patch


Compile the following C program, close_and_sit.c:

#include <stdio.h>
#include <unistd.h>
void main() {
  fclose(stdout);
  fclose(stderr);
  while (1) sleep(1);
}

Then run this snippet of Lisp, updating the executable file name
appropriately:

(make-process
 :name "close_and_sit"
 :command '("./close_and_sit")
 :connection-type 'pipe)
(make-process :name "true" :command '("true"))
(message "accept-process-output")
(accept-process-output nil 1)

This will cause Emacs to hang; accept-process-output will never time
out.

There are two bugs which combine to cause this.

The first bug is in our timeout handling.  The attached patch resolves
that, which causes accept-process-output to time out properly and fixes
the bug.

Paul, this patch partially reverts a change you made to cache
current_timespec.  Does it seem right to you?

The second bug is that when a child process closes stdout/stderr, we
continue watching the pipe, generating spurious events.  (We should
still be timing out despite the spurious events - real events could also
cause this hang, this bug just makes it easier to trigger the hang.)
I'm still working on a patch to fix the second bug.

[0001-Use-up-to-date-time-in-wait_reading_process_output.patch (text/patch, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#79436; Package emacs. (Sat, 27 Sep 2025 08:49:02 GMT) Full text and rfc822 format available.

Message #8 received at 79436 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: eggert <at> cs.ucla.edu, Spencer Baugh <sbaugh <at> janestreet.com>
Cc: 79436 <at> debbugs.gnu.org, dmitry <at> gutov.dev
Subject: Re: bug#79436: [PATCH] Use up-to-date time in
 wait_reading_process_output
Date: Sat, 27 Sep 2025 11:47:46 +0300
Ping!  Paul, any comments or suggestions?

> Cc: Paul Eggert <eggert <at> cs.ucla.edu>, Dmitry Gutov <dmitry <at> gutov.dev>
> Date: Fri, 12 Sep 2025 11:38:29 -0400
> From:  Spencer Baugh via "Bug reports for GNU Emacs,
>  the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org>
> 
> Compile the following C program, close_and_sit.c:
> 
> #include <stdio.h>
> #include <unistd.h>
> void main() {
>   fclose(stdout);
>   fclose(stderr);
>   while (1) sleep(1);
> }
> 
> Then run this snippet of Lisp, updating the executable file name
> appropriately:
> 
> (make-process
>  :name "close_and_sit"
>  :command '("./close_and_sit")
>  :connection-type 'pipe)
> (make-process :name "true" :command '("true"))
> (message "accept-process-output")
> (accept-process-output nil 1)
> 
> This will cause Emacs to hang; accept-process-output will never time
> out.
> 
> There are two bugs which combine to cause this.
> 
> The first bug is in our timeout handling.  The attached patch resolves
> that, which causes accept-process-output to time out properly and fixes
> the bug.
> 
> Paul, this patch partially reverts a change you made to cache
> current_timespec.  Does it seem right to you?
> 
> The second bug is that when a child process closes stdout/stderr, we
> continue watching the pipe, generating spurious events.  (We should
> still be timing out despite the spurious events - real events could also
> cause this hang, this bug just makes it easier to trigger the hang.)
> I'm still working on a patch to fix the second bug.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#79436; Package emacs. (Sat, 27 Sep 2025 19:42:01 GMT) Full text and rfc822 format available.

Message #11 received at 79436 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 79436 <at> debbugs.gnu.org, dmitry <at> gutov.dev,
 Spencer Baugh <sbaugh <at> janestreet.com>
Subject: Re: bug#79436: [PATCH] Use up-to-date time in
 wait_reading_process_output
Date: Sat, 27 Sep 2025 12:41:27 -0700
[Message part 1 (text/plain, inline)]
On 2025-09-27 01:47, Eli Zaretskii wrote:
> Paul, any comments or suggestions?

Sure, that patch looks good and we can do even better by using monotonic 
timestamps, and let's make them coarse as that's faster and good enough. 
The motivation for my 2015 patch was to avoid unnecessary roundtrips to 
the kernel, and nowadays coarse timestamps do that on many platforms.

I installed the attached two patches to do all that.

Surely there are other places where Emacs should use monotonic 
timestamps, or coarse timestamps, or both, but one thing at a time (so 
to speak...).

>> From:  Spencer Baugh via "Bug reports for GNU Emacs,
>> The second bug is that when a child process closes stdout/stderr, we
>> continue watching the pipe, generating spurious events.  (We should
>> still be timing out despite the spurious events - real events could also
>> cause this hang, this bug just makes it easier to trigger the hang.)
>> I'm still working on a patch to fix the second bug.

I didn't look at this issue, as I assumed your ping was about the bug 
mentioned in the Subject line.
[0001-Prefer-coarse-timestamps-when-using-X-sync.patch (text/x-patch, attachment)]
[0002-Use-up-to-date-time-in-wait_reading_process_output.patch (text/x-patch, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#79436; Package emacs. (Tue, 30 Sep 2025 18:50:02 GMT) Full text and rfc822 format available.

Message #14 received at 79436 <at> debbugs.gnu.org (full text, mbox):

From: Spencer Baugh <sbaugh <at> janestreet.com>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 79436 <at> debbugs.gnu.org, dmitry <at> gutov.dev, Eli Zaretskii <eliz <at> gnu.org>
Subject: Re: bug#79436: [PATCH] Use up-to-date time in
 wait_reading_process_output
Date: Tue, 30 Sep 2025 14:49:17 -0400
[Message part 1 (text/plain, inline)]
Paul Eggert <eggert <at> cs.ucla.edu> writes:

> On 2025-09-27 01:47, Eli Zaretskii wrote:
>> Paul, any comments or suggestions?
>
> Sure, that patch looks good and we can do even better by using
> monotonic timestamps, and let's make them coarse as that's faster and
> good enough. The motivation for my 2015 patch was to avoid unnecessary
> roundtrips to the kernel, and nowadays coarse timestamps do that on
> many platforms.
>
> I installed the attached two patches to do all that.
>
> Surely there are other places where Emacs should use monotonic
> timestamps, or coarse timestamps, or both, but one thing at a time (so
> to speak...).

Thanks, these patches look good to me.

So now the the example I started this bug with, no longer hangs, because
the timeout actually works.  But the example still causes Emacs to
busy-wait because of spurious events: if you run the example now, you'll
see Emacs at 100% CPU.

The attached patch fixes that issue.

[0001-Stop-monitoring-fds-after-receiving-EOF.patch (text/x-patch, inline)]
From bef644da2a51296247bde9dd907d551202995106 Mon Sep 17 00:00:00 2001
From: Spencer Baugh <sbaugh <at> janestreet.com>
Date: Tue, 30 Sep 2025 14:47:51 -0400
Subject: [PATCH] Stop monitoring fds after receiving EOF

When a subprocess closes its stdout/stderr pipe, that causes
pselect to always indicate that fd is readable, and read to
always return with EOF on that fd.  Therefore when we receive an
EOF we need to stop monitoring the fd.  Otherwise Emacs will
spin at 100% CPU, repeatedly reading that same EOF off the fd.

* src/process.c (wait_reading_process_output): When
read_process_output returns EOF, stop monitoring the
channel. (bug#79436)
(handle_child_signal): Don't call delete_read_fd; this is
handled by deactivate_process or by wait_reading_process_output.
---
 src/process.c | 48 +++++++++++++++++-------------------------------
 1 file changed, 17 insertions(+), 31 deletions(-)

diff --git a/src/process.c b/src/process.c
index 0d842dac5e9..9c74cc84199 100644
--- a/src/process.c
+++ b/src/process.c
@@ -6076,30 +6076,24 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
 		    }
 		}
 #endif /* HAVE_PTYS */
-	      /* If we can detect process termination, don't consider the
-		 process gone just because its pipe is closed.  */
-	      else if (nread == 0 && !NETCONN_P (proc) && !SERIALCONN_P (proc)
-		       && !PIPECONN_P (proc))
-		;
-	      else if (nread == 0 && PIPECONN_P (proc))
-		{
-		  /* Preserve status of processes already terminated.  */
-		  XPROCESS (proc)->tick = ++process_tick;
-		  deactivate_process (proc);
-		  if (EQ (XPROCESS (proc)->status, Qrun))
-		    pset_status (XPROCESS (proc),
-				 list2 (Qexit, make_fixnum (0)));
-		}
-	      else
+	      else if (nread == 0)
 		{
-		  /* Preserve status of processes already terminated.  */
-		  XPROCESS (proc)->tick = ++process_tick;
-		  deactivate_process (proc);
-		  if (XPROCESS (proc)->raw_status_new)
-		    update_status (XPROCESS (proc));
-		  if (EQ (XPROCESS (proc)->status, Qrun))
-		    pset_status (XPROCESS (proc),
-				 list2 (Qexit, make_fixnum (256)));
+		  /* Stop reading the channel since it returned EOF. */
+		  delete_read_fd (channel);
+
+		  /* For some kinds of processes, EOF means the process
+		     is terminated.  */
+		  if (NETCONN_P (proc) || SERIALCONN_P (proc) || PIPECONN_P (proc))
+		    {
+		      /* Preserve status of processes already terminated.  */
+		      XPROCESS (proc)->tick = ++process_tick;
+		      deactivate_process (proc);
+		      if (!PIPECONN_P (proc) && XPROCESS (proc)->raw_status_new)
+			update_status (XPROCESS (proc));
+		      if (EQ (XPROCESS (proc)->status, Qrun))
+			pset_status (XPROCESS (proc),
+				     list2 (Qexit, make_fixnum (256)));
+		    }
 		}
 	    }
 	  if (FD_ISSET (channel, &Writeok)
@@ -7753,17 +7747,9 @@ handle_child_signal (int sig)
 	  p->raw_status = status;
 	  p->raw_status_new = 1;
 
-	  /* If process has terminated, stop waiting for its output.  */
 	  if (WIFSIGNALED (status) || WIFEXITED (status))
 	    {
-	      bool clear_desc_flag = 0;
 	      p->alive = 0;
-	      if (p->infd >= 0)
-		clear_desc_flag = 1;
-
-	      /* clear_desc_flag avoids a compiler bug in Microsoft C.  */
-	      if (clear_desc_flag)
-		delete_read_fd (p->infd);
 	    }
 	}
     }
-- 
2.43.7


Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#79436; Package emacs. (Tue, 30 Sep 2025 19:14:02 GMT) Full text and rfc822 format available.

Message #17 received at 79436 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Spencer Baugh <sbaugh <at> janestreet.com>
Cc: 79436 <at> debbugs.gnu.org, dmitry <at> gutov.dev, eggert <at> cs.ucla.edu
Subject: Re: bug#79436: [PATCH] Use up-to-date time in
 wait_reading_process_output
Date: Tue, 30 Sep 2025 22:13:33 +0300
> From: Spencer Baugh <sbaugh <at> janestreet.com>
> Cc: Eli Zaretskii <eliz <at> gnu.org>,  79436 <at> debbugs.gnu.org,  dmitry <at> gutov.dev
> Date: Tue, 30 Sep 2025 14:49:17 -0400
> 
> When a subprocess closes its stdout/stderr pipe, that causes
> pselect to always indicate that fd is readable, and read to
> always return with EOF on that fd.  Therefore when we receive an
> EOF we need to stop monitoring the fd.  Otherwise Emacs will
> spin at 100% CPU, repeatedly reading that same EOF off the fd.

Why are we sure that nread == 0 means there's nothing more to read
from the process, ever?  Can't it mean that we just read everything
the process had written so far?





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#79436; Package emacs. (Tue, 30 Sep 2025 19:27:02 GMT) Full text and rfc822 format available.

Message #20 received at 79436 <at> debbugs.gnu.org (full text, mbox):

From: Spencer Baugh <sbaugh <at> janestreet.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 79436 <at> debbugs.gnu.org, dmitry <at> gutov.dev, eggert <at> cs.ucla.edu
Subject: Re: bug#79436: [PATCH] Use up-to-date time in
 wait_reading_process_output
Date: Tue, 30 Sep 2025 15:26:07 -0400
Eli Zaretskii <eliz <at> gnu.org> writes:

>> From: Spencer Baugh <sbaugh <at> janestreet.com>
>> Cc: Eli Zaretskii <eliz <at> gnu.org>,  79436 <at> debbugs.gnu.org,  dmitry <at> gutov.dev
>> Date: Tue, 30 Sep 2025 14:49:17 -0400
>> 
>> When a subprocess closes its stdout/stderr pipe, that causes
>> pselect to always indicate that fd is readable, and read to
>> always return with EOF on that fd.  Therefore when we receive an
>> EOF we need to stop monitoring the fd.  Otherwise Emacs will
>> spin at 100% CPU, repeatedly reading that same EOF off the fd.
>
> Why are we sure that nread == 0 means there's nothing more to read
> from the process, ever?  Can't it mean that we just read everything
> the process had written so far?

Receiving 0 (EOF) back from a read on a pipe or pty (or socket, or
indeed any non-regular file) means the write end of the pipe or pty is
closed, and there will never again be data.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#79436; Package emacs. (Tue, 30 Sep 2025 19:37:02 GMT) Full text and rfc822 format available.

Message #23 received at 79436 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Spencer Baugh <sbaugh <at> janestreet.com>
Cc: 79436 <at> debbugs.gnu.org, dmitry <at> gutov.dev, Eli Zaretskii <eliz <at> gnu.org>
Subject: Re: bug#79436: [PATCH] Use up-to-date time in
 wait_reading_process_output
Date: Tue, 30 Sep 2025 12:36:17 -0700
On 2025-09-30 12:13, Eli Zaretskii wrote:
> Why are we sure that nread == 0 means there's nothing more to read
> from the process, ever?  Can't it mean that we just read everything
> the process had written so far?

For pipes and sockets, nread==0 means there will never be more input 
data. However, I serial ports can give you input data after nread==0.

I have a different question. After the proposed patch, after nread==0 we 
call deactivate_process which means we can no longer write to the pipe 
process. Is that intended? From an OS point of view, the process at the 
other end may be willing to read data from its input pipe even after 
it's closed its output pipe.

Also, some minor indenting/commenting suggestions:

diff --git a/src/process.c b/src/process.c
index 41a6c042871..d2d36e84305 100644
--- a/src/process.c
+++ b/src/process.c
@@ -6050,12 +6050,14 @@ wait_reading_process_output (intmax_t 
time_limit, int nsecs, int read_kbd,
 		  delete_read_fd (channel);

 		  /* For some kinds of processes, EOF means the process
-		     is terminated.  */
-		  if (NETCONN_P (proc) || SERIALCONN_P (proc) || PIPECONN_P (proc))
+		     is no longer active.  */
+		  if (NETCONN_P (proc) || SERIALCONN_P (proc)
+		      || PIPECONN_P (proc))
 		    {
-		      /* Preserve status of processes already terminated.  */
 		      XPROCESS (proc)->tick = ++process_tick;
 		      deactivate_process (proc);
+		      /* A pipe process might still be running.
+			 Preserve status of other process types.  */
 		      if (!PIPECONN_P (proc) && XPROCESS (proc)->raw_status_new)
 			update_status (XPROCESS (proc));
 		      if (EQ (XPROCESS (proc)->status, Qrun))
@@ -7716,9 +7718,7 @@ handle_child_signal (int sig)
 	  p->raw_status_new = 1;

 	  if (WIFSIGNALED (status) || WIFEXITED (status))
-	    {
-	      p->alive = 0;
-	    }
+	    p->alive = 0;
 	}
     }






Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#79436; Package emacs. (Tue, 30 Sep 2025 19:45:01 GMT) Full text and rfc822 format available.

Message #26 received at 79436 <at> debbugs.gnu.org (full text, mbox):

From: Spencer Baugh <sbaugh <at> janestreet.com>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 79436 <at> debbugs.gnu.org, dmitry <at> gutov.dev, Eli Zaretskii <eliz <at> gnu.org>
Subject: Re: bug#79436: [PATCH] Use up-to-date time in
 wait_reading_process_output
Date: Tue, 30 Sep 2025 15:44:12 -0400
Paul Eggert <eggert <at> cs.ucla.edu> writes:

> On 2025-09-30 12:13, Eli Zaretskii wrote:
>> Why are we sure that nread == 0 means there's nothing more to read
>> from the process, ever?  Can't it mean that we just read everything
>> the process had written so far?
>
> For pipes and sockets, nread==0 means there will never be more input
> data. However, I serial ports can give you input data after nread==0.

Though note that behavior doesn't change for serial ports.  We closed
them on nread==0 before, and we continue to do that now.  Behavior is
only changing for regular subprocesses.

> I have a different question. After the proposed patch, after nread==0
> we call deactivate_process which means we can no longer write to the
> pipe process.

For pipe processes, yes.

> Is that intended? From an OS point of view, the process
> at the other end may be willing to read data from its input pipe even
> after it's closed its output pipe.

I agree, but this is how it worked for pipe processes before my change,
I'm just preserving the existing behavior.

We don't call deactivate_process for regular subprocess - as you say, a
regular subprocess could still accept input even though it's closed its
output.

> Also, some minor indenting/commenting suggestions:

Will include these in the next version.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#79436; Package emacs. (Wed, 01 Oct 2025 06:31:01 GMT) Full text and rfc822 format available.

Message #29 received at 79436 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Spencer Baugh <sbaugh <at> janestreet.com>, eggert <at> cs.ucla.edu
Cc: 79436 <at> debbugs.gnu.org, dmitry <at> gutov.dev
Subject: Re: bug#79436: [PATCH] Use up-to-date time in
 wait_reading_process_output
Date: Wed, 01 Oct 2025 09:30:16 +0300
> From: Spencer Baugh <sbaugh <at> janestreet.com>
> Cc: 79436 <at> debbugs.gnu.org,  dmitry <at> gutov.dev,  eggert <at> cs.ucla.edu
> Date: Tue, 30 Sep 2025 15:26:07 -0400
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> >> From: Spencer Baugh <sbaugh <at> janestreet.com>
> >> Cc: Eli Zaretskii <eliz <at> gnu.org>,  79436 <at> debbugs.gnu.org,  dmitry <at> gutov.dev
> >> Date: Tue, 30 Sep 2025 14:49:17 -0400
> >> 
> >> When a subprocess closes its stdout/stderr pipe, that causes
> >> pselect to always indicate that fd is readable, and read to
> >> always return with EOF on that fd.  Therefore when we receive an
> >> EOF we need to stop monitoring the fd.  Otherwise Emacs will
> >> spin at 100% CPU, repeatedly reading that same EOF off the fd.
> >
> > Why are we sure that nread == 0 means there's nothing more to read
> > from the process, ever?  Can't it mean that we just read everything
> > the process had written so far?
> 
> Receiving 0 (EOF) back from a read on a pipe or pty (or socket, or
> indeed any non-regular file) means the write end of the pipe or pty is
> closed, and there will never again be data.

After reading the relevant references, I'm not sure.

Paul, is this 100% correct on Posix systems?

Sure, the references say that if the write side of the pipe or FIFO
(and PTY is a FIFO, AFAIU) is closed, the return value is zero.  But
the question is: can the return value of 'read' be zero in any other
situations?  For example, what happens if the process on the write
side of the pipe writes zero bytes to the pipe?

Btw, there's more to this situation than meets the eye: on MS-Windows
your test case doesn't hang because accept-process-output returns
after the 1-sec timeout.

So I think we should tread very carefully here.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#79436; Package emacs. (Wed, 01 Oct 2025 07:17:02 GMT) Full text and rfc822 format available.

Message #32 received at 79436 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Eli Zaretskii <eliz <at> gnu.org>, Spencer Baugh <sbaugh <at> janestreet.com>
Cc: 79436 <at> debbugs.gnu.org, dmitry <at> gutov.dev
Subject: Re: bug#79436: [PATCH] Use up-to-date time in
 wait_reading_process_output
Date: Wed, 1 Oct 2025 00:15:43 -0700
On 2025-09-30 23:30, Eli Zaretskii wrote:
>> From: Spencer Baugh <sbaugh <at> janestreet.com>
>> Receiving 0 (EOF) back from a read on a pipe or pty (or socket, or
>> indeed any non-regular file) means the write end of the pipe or pty is
>> closed, and there will never again be data.
> 
> After reading the relevant references, I'm not sure.
> 
> Paul, is this 100% correct on Posix systems?
> 
> Sure, the references say that if the write side of the pipe or FIFO
> (and PTY is a FIFO, AFAIU) is closed, the return value is zero.  But
> the question is: can the return value of 'read' be zero in any other
> situations?  For example, what happens if the process on the write
> side of the pipe writes zero bytes to the pipe?

POSIX doesn't say what happens in that situation.

In practice, a write of zero bytes typically does nothing. But in 
STREAMS-based systems like AIX or Solaris, if a process uses 
ioctl+I_SWROPT+SNDZERO on a pipe because it wants to send a zero-byte 
message, it can then do so, and the recipient will get a zero-byte read. 
As I vaguely recall, this was to simulate a user typing ^D at a 
terminal, which means EOF if you type it at the start of a line (but you 
can keep typing after EOF and later reads will see more data). Although 
POSIX marked this feature as obsolescent a while ago, and completely 
removed it in POSIX.1-2024, AIX and Solaris (and maybe a few other older 
systems) still support it. I believe it's even the default under AIX.





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#79436; Package emacs. (Wed, 01 Oct 2025 07:32:02 GMT) Full text and rfc822 format available.

Message #35 received at 79436 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 79436 <at> debbugs.gnu.org, sbaugh <at> janestreet.com, dmitry <at> gutov.dev
Subject: Re: bug#79436: [PATCH] Use up-to-date time in
 wait_reading_process_output
Date: Wed, 01 Oct 2025 10:31:06 +0300
> Date: Wed, 1 Oct 2025 00:15:43 -0700
> Cc: 79436 <at> debbugs.gnu.org, dmitry <at> gutov.dev
> From: Paul Eggert <eggert <at> cs.ucla.edu>
> 
> On 2025-09-30 23:30, Eli Zaretskii wrote:
> >> From: Spencer Baugh <sbaugh <at> janestreet.com>
> >> Receiving 0 (EOF) back from a read on a pipe or pty (or socket, or
> >> indeed any non-regular file) means the write end of the pipe or pty is
> >> closed, and there will never again be data.
> > 
> > After reading the relevant references, I'm not sure.
> > 
> > Paul, is this 100% correct on Posix systems?
> > 
> > Sure, the references say that if the write side of the pipe or FIFO
> > (and PTY is a FIFO, AFAIU) is closed, the return value is zero.  But
> > the question is: can the return value of 'read' be zero in any other
> > situations?  For example, what happens if the process on the write
> > side of the pipe writes zero bytes to the pipe?
> 
> POSIX doesn't say what happens in that situation.
> 
> In practice, a write of zero bytes typically does nothing. But in 
> STREAMS-based systems like AIX or Solaris, if a process uses 
> ioctl+I_SWROPT+SNDZERO on a pipe because it wants to send a zero-byte 
> message, it can then do so, and the recipient will get a zero-byte read. 
> As I vaguely recall, this was to simulate a user typing ^D at a 
> terminal, which means EOF if you type it at the start of a line (but you 
> can keep typing after EOF and later reads will see more data). Although 
> POSIX marked this feature as obsolescent a while ago, and completely 
> removed it in POSIX.1-2024, AIX and Solaris (and maybe a few other older 
> systems) still support it. I believe it's even the default under AIX.

Thanks.

If we want to interpret zero as EOF, we'd need to audit at least the
w32 emulation of the relevant calls (sys_read etc.).  I'm not sure we
make a point of returning zero there only when the other side of the
pipe is closed, and as I wrote, the original recipe doesn't behave on
Windows as it does on GNU/Linux.

So I generally wonder whether this change is really needed so sorely
that we have to dig so deep into the areas of code that were working
for us for ages.  Why would a program suitable for being a sub-process
of Emacs close its stdout?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#79436; Package emacs. (Wed, 01 Oct 2025 13:04:02 GMT) Full text and rfc822 format available.

Message #38 received at 79436 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Spencer Baugh <sbaugh <at> janestreet.com>
Cc: 79436 <at> debbugs.gnu.org, dmitry <at> gutov.dev, eggert <at> cs.ucla.edu
Subject: Re: bug#79436: [PATCH] Use up-to-date time in
 wait_reading_process_output
Date: Wed, 01 Oct 2025 16:02:57 +0300
> From: Spencer Baugh <sbaugh <at> janestreet.com>
> Cc: 79436 <at> debbugs.gnu.org,  dmitry <at> gutov.dev,  Eli Zaretskii <eliz <at> gnu.org>
> Date: Tue, 30 Sep 2025 15:44:12 -0400
> 
> Paul Eggert <eggert <at> cs.ucla.edu> writes:
> 
> > On 2025-09-30 12:13, Eli Zaretskii wrote:
> >> Why are we sure that nread == 0 means there's nothing more to read
> >> from the process, ever?  Can't it mean that we just read everything
> >> the process had written so far?
> >
> > For pipes and sockets, nread==0 means there will never be more input
> > data. However, I serial ports can give you input data after nread==0.
> 
> Though note that behavior doesn't change for serial ports.  We closed
> them on nread==0 before, and we continue to do that now.  Behavior is
> only changing for regular subprocesses.
> 
> > I have a different question. After the proposed patch, after nread==0
> > we call deactivate_process which means we can no longer write to the
> > pipe process.
> 
> For pipe processes, yes.
> 
> > Is that intended? From an OS point of view, the process
> > at the other end may be willing to read data from its input pipe even
> > after it's closed its output pipe.
> 
> I agree, but this is how it worked for pipe processes before my change,
> I'm just preserving the existing behavior.
> 
> We don't call deactivate_process for regular subprocess - as you say, a
> regular subprocess could still accept input even though it's closed its
> output.
> 
> > Also, some minor indenting/commenting suggestions:
> 
> Will include these in the next version.

Please make the next version as close as possible to the current code.
That way, if I decide not to make this effective on WINDOWSNT, it will
be a matter of a single simple #ifdef.  From what I can see, all you
need is to add delete_read_fd call to the clause whose condition is

	      else if (nread == 0 && !NETCONN_P (proc) && !SERIALCONN_P (proc)
		       && !PIPECONN_P (proc))

where we currently do nothing.  I understand that having almost the
same code more than once is not very elegant, but making the code with
multiple #ifdef's more readable trumps that in my book.

I also don't see why it would be TRT to do this part:

> @@ -7753,17 +7747,9 @@ handle_child_signal (int sig)
>  	  p->raw_status = status;
>  	  p->raw_status_new = 1;
>  
> -	  /* If process has terminated, stop waiting for its output.  */
>  	  if (WIFSIGNALED (status) || WIFEXITED (status))
>  	    {
> -	      bool clear_desc_flag = 0;
>  	      p->alive = 0;
> -	      if (p->infd >= 0)
> -		clear_desc_flag = 1;
> -
> -	      /* clear_desc_flag avoids a compiler bug in Microsoft C.  */
> -	      if (clear_desc_flag)
> -		delete_read_fd (p->infd);
>  	    }

You commit log message says "Don't call delete_read_fd; this is
handled by deactivate_process or by wait_reading_process_output", but
the new code doesn't call deactivate_process for a regular
sub-processes (and neither did the old one), so AFAIU we still need
that part.  Or at least it's an unrelated change.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#79436; Package emacs. (Wed, 01 Oct 2025 16:00:02 GMT) Full text and rfc822 format available.

Message #41 received at 79436 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 79436 <at> debbugs.gnu.org, sbaugh <at> janestreet.com, dmitry <at> gutov.dev
Subject: Re: bug#79436: [PATCH] Use up-to-date time in
 wait_reading_process_output
Date: Wed, 1 Oct 2025 08:59:03 -0700
On 2025-10-01 00:31, Eli Zaretskii wrote:
> Why would a program suitable for being a sub-process
> of Emacs close its stdout?

Because they want to report write errors. For performance reasons many 
operating systems defer writes and report trailing IO errors only upon 
close. To detect these errors, many programs check whether 'close 
(STDOUT_FILENO)' fails. GNU 'cat' is one example.

Typically these programs exit soon after closing, but if later cleanup 
operations hang, they won't do that.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#79436; Package emacs. (Wed, 01 Oct 2025 16:36:02 GMT) Full text and rfc822 format available.

Message #44 received at 79436 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 79436 <at> debbugs.gnu.org, sbaugh <at> janestreet.com, dmitry <at> gutov.dev
Subject: Re: bug#79436: [PATCH] Use up-to-date time in
 wait_reading_process_output
Date: Wed, 01 Oct 2025 19:35:16 +0300
> Date: Wed, 1 Oct 2025 08:59:03 -0700
> Cc: sbaugh <at> janestreet.com, 79436 <at> debbugs.gnu.org, dmitry <at> gutov.dev
> From: Paul Eggert <eggert <at> cs.ucla.edu>
> 
> On 2025-10-01 00:31, Eli Zaretskii wrote:
> > Why would a program suitable for being a sub-process
> > of Emacs close its stdout?
> 
> Because they want to report write errors. For performance reasons many 
> operating systems defer writes and report trailing IO errors only upon 
> close. To detect these errors, many programs check whether 'close 
> (STDOUT_FILENO)' fails. GNU 'cat' is one example.

No, I meant why would a program do that and yet not exit.

> Typically these programs exit soon after closing, but if later cleanup 
> operations hang, they won't do that.

That's exactly what I meant: if they exit right away, there's no
significant reason to make these changes.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#79436; Package emacs. (Wed, 08 Oct 2025 18:00:02 GMT) Full text and rfc822 format available.

Message #47 received at 79436 <at> debbugs.gnu.org (full text, mbox):

From: Spencer Baugh <sbaugh <at> janestreet.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 79436 <at> debbugs.gnu.org, dmitry <at> gutov.dev, eggert <at> cs.ucla.edu
Subject: Re: bug#79436: [PATCH] Use up-to-date time in
 wait_reading_process_output
Date: Wed, 08 Oct 2025 13:58:46 -0400
[Message part 1 (text/plain, inline)]
Eli Zaretskii <eliz <at> gnu.org> writes:

>> From: Spencer Baugh <sbaugh <at> janestreet.com>
>> Cc: 79436 <at> debbugs.gnu.org,  dmitry <at> gutov.dev,  Eli Zaretskii <eliz <at> gnu.org>
>> Date: Tue, 30 Sep 2025 15:44:12 -0400
>> 
>> Paul Eggert <eggert <at> cs.ucla.edu> writes:
>> 
>> > On 2025-09-30 12:13, Eli Zaretskii wrote:
>> >> Why are we sure that nread == 0 means there's nothing more to read
>> >> from the process, ever?  Can't it mean that we just read everything
>> >> the process had written so far?
>> >
>> > For pipes and sockets, nread==0 means there will never be more input
>> > data. However, I serial ports can give you input data after nread==0.
>> 
>> Though note that behavior doesn't change for serial ports.  We closed
>> them on nread==0 before, and we continue to do that now.  Behavior is
>> only changing for regular subprocesses.
>> 
>> > I have a different question. After the proposed patch, after nread==0
>> > we call deactivate_process which means we can no longer write to the
>> > pipe process.
>> 
>> For pipe processes, yes.
>> 
>> > Is that intended? From an OS point of view, the process
>> > at the other end may be willing to read data from its input pipe even
>> > after it's closed its output pipe.
>> 
>> I agree, but this is how it worked for pipe processes before my change,
>> I'm just preserving the existing behavior.
>> 
>> We don't call deactivate_process for regular subprocess - as you say, a
>> regular subprocess could still accept input even though it's closed its
>> output.
>> 
>> > Also, some minor indenting/commenting suggestions:
>> 
>> Will include these in the next version.
>
> Please make the next version as close as possible to the current code.
> That way, if I decide not to make this effective on WINDOWSNT, it will
> be a matter of a single simple #ifdef.  From what I can see, all you
> need is to add delete_read_fd call to the clause whose condition is
>
> 	      else if (nread == 0 && !NETCONN_P (proc) && !SERIALCONN_P (proc)
> 		       && !PIPECONN_P (proc))
>
> where we currently do nothing.  I understand that having almost the
> same code more than once is not very elegant, but making the code with
> multiple #ifdef's more readable trumps that in my book.

Okay, done.

> I also don't see why it would be TRT to do this part:
>
>> @@ -7753,17 +7747,9 @@ handle_child_signal (int sig)
>>  	  p->raw_status = status;
>>  	  p->raw_status_new = 1;
>>  
>> -	  /* If process has terminated, stop waiting for its output.  */
>>  	  if (WIFSIGNALED (status) || WIFEXITED (status))
>>  	    {
>> -	      bool clear_desc_flag = 0;
>>  	      p->alive = 0;
>> -	      if (p->infd >= 0)
>> -		clear_desc_flag = 1;
>> -
>> -	      /* clear_desc_flag avoids a compiler bug in Microsoft C.  */
>> -	      if (clear_desc_flag)
>> -		delete_read_fd (p->infd);
>>  	    }
>
> You commit log message says "Don't call delete_read_fd; this is
> handled by deactivate_process or by wait_reading_process_output", but
> the new code doesn't call deactivate_process for a regular
> sub-processes (and neither did the old one), so AFAIU we still need
> that part.  Or at least it's an unrelated change.

Yes, that is basically an unrelated change.

My motivation for it is: with the addition of this delete_read_fd call
in wait_reading_process_output, I see no remaining reason to call
delete_read_fd in handle_child_signal, and real reasons not to do so:
this behavior in handle_child_signal ultimately contributes to thread
bugs.  (My investigation of thread bugs related to handle_child_signal
was how I ended up finding this subprocess-closing-stdout bug)

But anyway, we can discuss that change later if you prefer.

Attached is the simple one-line patch which just fixes the "spurious
events on EOF" issue.

[0001-Stop-monitoring-fds-after-receiving-EOF.patch (text/x-patch, inline)]
From ce9aca67677317a9f0ac86a5108c1580b651ed2c Mon Sep 17 00:00:00 2001
From: Spencer Baugh <sbaugh <at> janestreet.com>
Date: Tue, 30 Sep 2025 14:47:51 -0400
Subject: [PATCH] Stop monitoring fds after receiving EOF

When a subprocess closes its stdout/stderr pipe, that causes
pselect to always indicate that fd is readable, and read to
always return with EOF on that fd.  Therefore when we receive an
EOF we need to stop monitoring the fd.  Otherwise Emacs will
spin at 100% CPU, repeatedly reading that same EOF off the fd.

* src/process.c (wait_reading_process_output): When
read_process_output returns EOF, stop monitoring the
channel. (bug#79436)
---
 src/process.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/process.c b/src/process.c
index fb6ef8b2d86..2ee77883e9e 100644
--- a/src/process.c
+++ b/src/process.c
@@ -6086,7 +6086,7 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
 		 process gone just because its pipe is closed.  */
 	      else if (nread == 0 && !NETCONN_P (proc) && !SERIALCONN_P (proc)
 		       && !PIPECONN_P (proc))
-		;
+		delete_read_fd (channel);
 	      else if (nread == 0 && PIPECONN_P (proc))
 		{
 		  /* Preserve status of processes already terminated.  */
-- 
2.43.7


Reply sent to Eli Zaretskii <eliz <at> gnu.org>:
You have taken responsibility. (Sat, 11 Oct 2025 09:32:01 GMT) Full text and rfc822 format available.

Notification sent to Spencer Baugh <sbaugh <at> janestreet.com>:
bug acknowledged by developer. (Sat, 11 Oct 2025 09:32:02 GMT) Full text and rfc822 format available.

Message #52 received at 79436-done <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Spencer Baugh <sbaugh <at> janestreet.com>
Cc: 79436-done <at> debbugs.gnu.org, dmitry <at> gutov.dev, eggert <at> cs.ucla.edu
Subject: Re: bug#79436: [PATCH] Use up-to-date time in
 wait_reading_process_output
Date: Sat, 11 Oct 2025 12:30:53 +0300
> From: Spencer Baugh <sbaugh <at> janestreet.com>
> Cc: 79436 <at> debbugs.gnu.org,  dmitry <at> gutov.dev,  eggert <at> cs.ucla.edu
> Date: Wed, 08 Oct 2025 13:58:46 -0400
> 
> > You commit log message says "Don't call delete_read_fd; this is
> > handled by deactivate_process or by wait_reading_process_output", but
> > the new code doesn't call deactivate_process for a regular
> > sub-processes (and neither did the old one), so AFAIU we still need
> > that part.  Or at least it's an unrelated change.
> 
> Yes, that is basically an unrelated change.
> 
> My motivation for it is: with the addition of this delete_read_fd call
> in wait_reading_process_output, I see no remaining reason to call
> delete_read_fd in handle_child_signal, and real reasons not to do so:
> this behavior in handle_child_signal ultimately contributes to thread
> bugs.  (My investigation of thread bugs related to handle_child_signal
> was how I ended up finding this subprocess-closing-stdout bug)
> 
> But anyway, we can discuss that change later if you prefer.
> 
> Attached is the simple one-line patch which just fixes the "spurious
> events on EOF" issue.

Thanks, installed on master.

The patch caused one of the process-tests to hang on MS-Windows, so
I've disabled the change for WINDOWSNT.  I'm not surprised by the test
failure, since EOF indication from sub-processes is detected
differently on MS-Windows, while zero-size reads could happen due to
how this stuff works on Windows.  Note that according to Paul, Posix
doesn't specify what happens when the subprocess writes zero bytes to
its end of the pipe, so maybe this is not limited to Windows.




This bug report was last modified 26 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.