GNU bug report logs - #48118
27.1; 28; Only first process receives output with multiple running processes

Package: emacs;

Reported by: Daniel Mendler <mail <at> daniel-mendler.de>

Date: Fri, 30 Apr 2021 13:45:02 UTC

Severity: normal

Tags: fixed

Found in version 27.1

Fixed in version 28.1

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 48118 in the body.
You can then email your comments to 48118 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#48118; Package emacs. (Fri, 30 Apr 2021 13:45:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Daniel Mendler <mail <at> daniel-mendler.de>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Fri, 30 Apr 2021 13:45:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Daniel Mendler <mail <at> daniel-mendler.de>
To: bug-gnu-emacs <at> gnu.org
Subject: 27.1; 28; Only first process receives output with multiple running
 processes
Date: Fri, 30 Apr 2021 15:44:17 +0200
When running multiple asynchronous processes only the output of the
first process is handled. This happens when the first process
continuously produces a huge amount of output, for example when running
`ripgrep` as done by my `consult-ripgrep` command (part of my Consult
package). Then Emacs is stuck handling the output of the first process.
The output of the second process is not read until the first process is
terminated. I expect Emacs to treat the running processes fairly. The
issue also occurs if a :filter function is specified. Both Emacs 27 and
28 are affected.

Minimal reproducible example by @jakanakaevangeli:

(progn
  (setq pa (make-process
             :name "yes-a"
             :command '("yes")
             :connection-type 'pipe
             :buffer (setq a (generate-new-buffer " *a*"))))
  (setq pb (make-process
             :name "yes-b"
             :command '("yes")
             :connection-type 'pipe
             :buffer (setq b (generate-new-buffer " *b*"))))
  (run-at-time
   1 1
   (lambda ()
     (message "size a: %s\nsize b: %s"
              (buffer-size a) (buffer-size b)))))

Original bug discussions:
https://github.com/minad/consult/issues/272
https://github.com/minad/consult/pull/297




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#48118; Package emacs. (Fri, 30 Apr 2021 14:19:01 GMT) Full text and rfc822 format available.

Message #8 received at 48118 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Daniel Mendler <mail <at> daniel-mendler.de>
Cc: 48118 <at> debbugs.gnu.org
Subject: Re: bug#48118: 27.1; 28;
 Only first process receives output with multiple running processes
Date: Fri, 30 Apr 2021 17:17:58 +0300
> From: Daniel Mendler <mail <at> daniel-mendler.de>
> Date: Fri, 30 Apr 2021 15:44:17 +0200
> 
> When running multiple asynchronous processes only the output of the
> first process is handled. This happens when the first process
> continuously produces a huge amount of output, for example when running
> `ripgrep` as done by my `consult-ripgrep` command (part of my Consult
> package). Then Emacs is stuck handling the output of the first process.
> The output of the second process is not read until the first process is
> terminated. I expect Emacs to treat the running processes fairly.

Why can't you call stop-process from your sentinel function(s) to
avoid the problem?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#48118; Package emacs. (Fri, 30 Apr 2021 14:24:01 GMT) Full text and rfc822 format available.

Message #11 received at 48118 <at> debbugs.gnu.org (full text, mbox):

From: Daniel Mendler <mail <at> daniel-mendler.de>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 48118 <at> debbugs.gnu.org
Subject: Re: bug#48118: 27.1; 28; Only first process receives output with
 multiple running processes
Date: Fri, 30 Apr 2021 16:23:34 +0200
On 4/30/21 4:17 PM, Eli Zaretskii wrote:
>> When running multiple asynchronous processes only the output of the
>> first process is handled. This happens when the first process
>> continuously produces a huge amount of output, for example when running
>> `ripgrep` as done by my `consult-ripgrep` command (part of my Consult
>> package). Then Emacs is stuck handling the output of the first process.
>> The output of the second process is not read until the first process is
>> terminated. I expect Emacs to treat the running processes fairly.
> 
> Why can't you call stop-process from your sentinel function(s) to
> avoid the problem?

What do you mean? I don't want to stop the processes. I want to have
them running asynchronously and concurrently and Emacs should receive
the incoming data of both processes. When I have to stop the processes
the benefit of running the processes asynchronously is lost.

------
I forgot to append the system information to the report.

In GNU Emacs 27.1 (build 1, x86_64-pc-linux-gnu, GTK+ Version 3.24.5,
cairo version 1.16.0)
 of 2021-02-09, modified by Debian built on 3df710f593d9
Repository revision: b0229d4bbaea7fcddffced393512c650212830db
Repository branch: deb/emacs/d/sid/master
Windowing system distributor 'The X.Org Foundation', version 11.0.12004000
System Description: Debian GNU/Linux 10 (buster)

Configured using:
 'configure --build x86_64-linux-gnu --prefix=/usr
 --sharedstatedir=/var/lib --libexecdir=/usr/lib
 --localstatedir=/var/lib --infodir=/usr/share/info
 --mandir=/usr/share/man --enable-libsystemd --with-pop=yes
 --enable-locallisppath=/etc/emacs:/usr/local/share/emacs/27.1/site-lisp:/usr/local/share/emacs/site-lisp:/usr/share/emacs/27.1/site-lisp:/usr/share/emacs/site-lisp
 --with-sound=alsa --without-gconf --with-mailutils --build
 x86_64-linux-gnu --prefix=/usr --sharedstatedir=/var/lib
 --libexecdir=/usr/lib --localstatedir=/var/lib
 --infodir=/usr/share/info --mandir=/usr/share/man --enable-libsystemd
 --with-pop=yes
 --enable-locallisppath=/etc/emacs:/usr/local/share/emacs/27.1/site-lisp:/usr/local/share/emacs/site-lisp:/usr/share/emacs/27.1/site-lisp:/usr/share/emacs/site-lisp
 --with-sound=alsa --without-gconf --with-mailutils --with-cairo
 --with-x=yes --with-x-toolkit=gtk3 --with-toolkit-scroll-bars
 'CFLAGS=-g -O2 -fdebug-prefix-map=/emacs/emacs=.
 -fstack-protector-strong -Wformat -Werror=format-security -Wall'
 'CPPFLAGS=-Wdate-time -D_FORTIFY_SOURCE=2' LDFLAGS=-Wl,-z,relro'

Configured features:
XPM JPEG TIFF GIF PNG RSVG CAIRO SOUND GPM DBUS GSETTINGS GLIB NOTIFY
INOTIFY ACL LIBSELINUX GNUTLS LIBXML2 FREETYPE HARFBUZZ M17N_FLT LIBOTF
ZLIB TOOLKIT_SCROLL_BARS GTK3 X11 XDBE XIM MODULES THREADS LIBSYSTEMD
JSON PDUMPER LCMS2 GMP




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#48118; Package emacs. (Fri, 30 Apr 2021 14:27:02 GMT) Full text and rfc822 format available.

Message #14 received at 48118 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: mail <at> daniel-mendler.de
Cc: 48118 <at> debbugs.gnu.org
Subject: Re: bug#48118: 27.1; 28;
 Only first process receives output with multiple running processes
Date: Fri, 30 Apr 2021 17:26:00 +0300
> Date: Fri, 30 Apr 2021 17:17:58 +0300
> From: Eli Zaretskii <eliz <at> gnu.org>
> Cc: 48118 <at> debbugs.gnu.org
> 
> Why can't you call stop-process from your sentinel function(s) to
> avoid the problem?

Sorry, I meant filter functions, of course.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#48118; Package emacs. (Fri, 30 Apr 2021 14:32:02 GMT) Full text and rfc822 format available.

Message #17 received at 48118 <at> debbugs.gnu.org (full text, mbox):

From: Daniel Mendler <mail <at> daniel-mendler.de>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 48118 <at> debbugs.gnu.org
Subject: Re: bug#48118: 27.1; 28; Only first process receives output with
 multiple running processes
Date: Fri, 30 Apr 2021 16:30:54 +0200
On 4/30/21 4:26 PM, Eli Zaretskii wrote:
>> Date: Fri, 30 Apr 2021 17:17:58 +0300
>> From: Eli Zaretskii <eliz <at> gnu.org>
>> Cc: 48118 <at> debbugs.gnu.org
>>
>> Why can't you call stop-process from your sentinel function(s) to
>> avoid the problem?
> 
> Sorry, I meant filter functions, of course.

So you say I should repeatedly stop the current process in the filter
function in order to allow the other process to take precedence, since
the underlying Emacs handling of asynchronous processes is unable to
read from two processes at once? This does not sound like a good
solution to me. What is preventing Emacs from treating multiple
processes fairly?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#48118; Package emacs. (Fri, 30 Apr 2021 14:32:02 GMT) Full text and rfc822 format available.

Message #20 received at 48118 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Daniel Mendler <mail <at> daniel-mendler.de>
Cc: 48118 <at> debbugs.gnu.org
Subject: Re: bug#48118: 27.1; 28; Only first process receives output with
 multiple running processes
Date: Fri, 30 Apr 2021 17:31:09 +0300
> Cc: 48118 <at> debbugs.gnu.org
> From: Daniel Mendler <mail <at> daniel-mendler.de>
> Date: Fri, 30 Apr 2021 16:23:34 +0200
> 
> > Why can't you call stop-process from your sentinel function(s) to
> > avoid the problem?
> 
> What do you mean? I don't want to stop the processes. I want to have
> them running asynchronously and concurrently and Emacs should receive
> the incoming data of both processes. When I have to stop the processes
> the benefit of running the processes asynchronously is lost.

So let's talk about this in more detail, okay?

First, what does "fairness" mean in this context?  Given that there
are multiple simultaneous asynchronous subprocesses that produce
output at different rates and consume CPU at different levels, what
would it mean for Emacs to be "fair"?

Second, suppose we have multiple ripgrep subprocesses running, and
Emacs will somehow read from each one of them in a round-robin
fashion: what do you expect the user to do, and how, to handle the
results of all those subprocesses simultaneously and "fairly"?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#48118; Package emacs. (Fri, 30 Apr 2021 14:35:02 GMT) Full text and rfc822 format available.

Message #23 received at 48118 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Daniel Mendler <mail <at> daniel-mendler.de>
Cc: 48118 <at> debbugs.gnu.org
Subject: Re: bug#48118: 27.1; 28; Only first process receives output with
 multiple running processes
Date: Fri, 30 Apr 2021 17:34:38 +0300
> Cc: 48118 <at> debbugs.gnu.org
> From: Daniel Mendler <mail <at> daniel-mendler.de>
> Date: Fri, 30 Apr 2021 16:30:54 +0200
> 
> So you say I should repeatedly stop the current process in the filter
> function in order to allow the other process to take precedence

Yes.

> since the underlying Emacs handling of asynchronous processes is
> unable to read from two processes at once?

No.  The problem is not the _ability_ to read from more than one
subprocess -- the ability does exist.  The problem is that doing so
would run afoul of other scenarios.

> me. What is preventing Emacs from treating multiple processes
> fairly?

I asked elsewhere what you mean by "fairly" in this context.

But the general answer to your question is that Emacs knows nothing
about the processes, their importance, their output rates, and the
respective filter functions.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#48118; Package emacs. (Fri, 30 Apr 2021 14:46:02 GMT) Full text and rfc822 format available.

Message #26 received at 48118 <at> debbugs.gnu.org (full text, mbox):

From: Daniel Mendler <mail <at> daniel-mendler.de>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 48118 <at> debbugs.gnu.org
Subject: Re: bug#48118: 27.1; 28; Only first process receives output with
 multiple running processes
Date: Fri, 30 Apr 2021 16:45:25 +0200
On 4/30/21 4:34 PM, Eli Zaretskii wrote:
>> So you say I should repeatedly stop the current process in the filter
>> function in order to allow the other process to take precedence
> 
> Yes.

This is not a good solution. What if I have multiple packages which read
from asynchronous processes? Maybe I cannot control all of the processes
and their scheduling.

>> since the underlying Emacs handling of asynchronous processes is
>> unable to read from two processes at once?
> 
> No.  The problem is not the _ability_ to read from more than one
> subprocess -- the ability does exist.  The problem is that doing so
> would run afoul of other scenarios.

Which scenarios break?

>> me. What is preventing Emacs from treating multiple processes
>> fairly?
> 
> I asked elsewhere what you mean by "fairly" in this context.
> 
> But the general answer to your question is that Emacs knows nothing
> about the processes, their importance, their output rates, and the
> respective filter functions.

Okay good. How can I configure it such that two processes both populate
their buffers in a round-robin fashion?

> First, what does "fairness" mean in this context?  Given that there
> are multiple simultaneous asynchronous subprocesses that produce
> output at different rates and consume CPU at different levels, what
> would it mean for Emacs to be "fair"?
>
> Second, suppose we have multiple ripgrep subprocesses running, and
> Emacs will somehow read from each one of them in a round-robin
> fashion: what do you expect the user to do, and how, to handle the
> results of all those subprocesses simultaneously and "fairly"?

I agree with you that fairness is a difficult problem. But the problem
is omnipresent at the os level. There you have scheduling problems in
the io layer, in the process scheduling layer, in the memory management
layer and so on. There is certainly some heuristic that one can apply.
For example by comparing the amount of data produced by multiple
processes one could decide which process is read next. Or one can use a
deadline criterion.
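
The first heuristic mentioned above (comparing how much data has
already been read from each process) could look roughly like the
following C sketch. It is only a model under stated assumptions: the
names pick_least_served, ready and bytes_read are hypothetical and do
not exist in Emacs's process.c, and readiness is passed in as a plain
array instead of coming from pselect.

```c
#include <stdbool.h>

/* Hypothetical helper, not part of Emacs: among the channels marked
   ready, return the one with the fewest bytes read so far.  Ties go
   to the lowest channel number.  Returns -1 if nothing is ready.  */
int
pick_least_served (const bool *ready, const long *bytes_read, int max_desc)
{
  int best = -1;
  for (int channel = 0; channel <= max_desc; channel++)
    if (ready[channel]
        && (best < 0 || bytes_read[channel] < bytes_read[best]))
      best = channel;
  return best;
}
```

Under this rule a channel that has been read heavily loses its turn
to any other ready channel that has been read less, so a single fast
producer cannot monopolize the reads. It is a heuristic only and
makes no claim about how processes with very different output rates
ought to be weighted.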

I am not happy with the argument that Emacs cannot do any better than
stopping the second process and only handling the first process.

If you don't want to hardcode the scheduling behavior there could be
some pluggable scheduler. This would be better than having to write my
own scheduling by hand for each `make-process` call.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#48118; Package emacs. (Fri, 30 Apr 2021 15:00:02 GMT) Full text and rfc822 format available.

Message #29 received at 48118 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Daniel Mendler <mail <at> daniel-mendler.de>
Cc: 48118 <at> debbugs.gnu.org
Subject: Re: bug#48118: 27.1; 28; Only first process receives output with
 multiple running processes
Date: Fri, 30 Apr 2021 17:59:00 +0300
> Cc: 48118 <at> debbugs.gnu.org
> From: Daniel Mendler <mail <at> daniel-mendler.de>
> Date: Fri, 30 Apr 2021 16:45:25 +0200
> 
> On 4/30/21 4:34 PM, Eli Zaretskii wrote:
> >> So you say I should repeatedly stop the current process in the filter
> >> function in order to allow the other process to take precedence
> > 
> > Yes.
> 
> This is not a good solution. What if I have multiple packages which read
> from asynchronous processes? Maybe I cannot control all of the processes
> and their scheduling.

That's not what I meant.  I meant that if your Lisp program launches a
subprocess that is known to spill huge amounts of output at high rate,
and you don't want to starve other subprocesses, your filter function
can stop the process from time to time to give others an opportunity
to have their outputs read.

> >> since the underlying Emacs handling of asynchronous processes is
> >> unable to read from two processes at once?
> > 
> > No.  The problem is not the _ability_ to read from more than one
> > subprocess -- the ability does exist.  The problem is that doing so
> > would run afoul of other scenarios.
> 
> Which scenarios break?

For example, if the filter function calls accept-process-output.  Or
does anything else that changes which processes do or don't have
output available.

> > But the general answer to your question is that Emacs knows nothing
> > about the processes, their importance, their output rates, and the
> > respective filter functions.
> 
> Okay good. How can I configure it such that two processes both populate
> their buffers in a round-robin fashion?

What does this mean, exactly?  Which quantity should be doled in a
round-robin fashion? bytes read from the processes? something else?

If the bytes read, then how do you suggest to handle two processes
which produce output at very different rates?

> I am not happy with the argument that Emacs cannot do any better than
> stopping the second process and only handling the first process.

I'm not saying that Emacs cannot do that, I'm trying to understand
what that would mean in practice.

> If you don't want to hardcode the scheduling behavior there could be
> some pluggable scheduler. This would be better than having to write my
> own scheduling by hand for each `make-process` call.

Please hold your horses, you are getting too far ahead of the
discussion.  I asked those questions for a reason: I think we cannot
make any meaningful progress without answering them first.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#48118; Package emacs. (Fri, 30 Apr 2021 15:40:02 GMT) Full text and rfc822 format available.

Message #32 received at 48118 <at> debbugs.gnu.org (full text, mbox):

From: Daniel Mendler <mail <at> daniel-mendler.de>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 48118 <at> debbugs.gnu.org
Subject: Re: bug#48118: 27.1; 28; Only first process receives output with
 multiple running processes
Date: Fri, 30 Apr 2021 17:39:35 +0200
On 4/30/21 4:59 PM, Eli Zaretskii wrote:
> That's not what I meant.  I meant that if your Lisp program launches a
> subprocess that is known to spill huge amounts of output at high rate,
> and you don't want to starve other subprocesses, your filter function
> can stop the process from time to time to give others an opportunity
> to have their outputs read.

This is true. However I expect Emacs to do that for me. I see the
situation like this - I have multiple packages creating processes and
competing for runtime. Then I expect Emacs to take some kind of
preemptive role and ensure that none of the processes misbehaves.
However one can also take the standpoint that each process should try to
behave well by itself and should be scheduled explicitly; Emacs should
stay out of this business.

>>>> since the underlying Emacs handling of asynchronous processes is
>>>> unable to read from two processes at once?
>>>
>>> No.  The problem is not the _ability_ to read from more than one
>>> subprocess -- the ability does exist.  The problem is that doing so
>>> would run afoul of other scenarios.
>>
>> Which scenarios break?
> 
> For example, if the filter function calls accept-process-output.  Or
> does anything else that changes which processes do or don't have
> output available.

Does this necessarily prevent scheduling? I interpret
`accept-process-output` as a function which prioritizes a process, but I
am unsure if this makes it impossible to implement additional scheduling.

> What does this mean, exactly?  Which quantity should be doled in a
> round-robin fashion? bytes read from the processes? something else?
> 
> If the bytes read, then how do you suggest to handle two processes
> which produce output at very different rates?

For example bytes read or time spent to handle a process (time spent in
the filter function?). If a process has eaten up its time it has to wait
until it gets scheduled next. If you have two processes with very
different rates, the slow process may not use up its allotted time slot
and the faster process is still allowed to run.

>> I am not happy with the argument that Emacs cannot do any better than
>> stopping the second process and only handling the first process.
> 
> I'm not saying that Emacs cannot do that, I'm trying to understand
> what that would mean in practice.

Actually I would also like to understand what the best process handling
looks like. When I stumbled over this issue, it astonished me that Emacs
does not seem to do any scheduling at all and handles only a single
process. As far as I know other language runtimes handle this problem
differently, attempting some kind of scheduling.

What is the reason for the current behavior? Is it predictability? If I
understand correctly, Emacs always reads from the first process. If data
arrives, Emacs does not read from the second process at all. Only if
no data is available from the first process, the second process is
handled. Is it like this?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#48118; Package emacs. (Fri, 30 Apr 2021 15:59:02 GMT) Full text and rfc822 format available.

Message #35 received at 48118 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Daniel Mendler <mail <at> daniel-mendler.de>
Cc: 48118 <at> debbugs.gnu.org
Subject: Re: bug#48118: 27.1; 28; Only first process receives output with
 multiple running processes
Date: Fri, 30 Apr 2021 18:58:06 +0300
> Cc: 48118 <at> debbugs.gnu.org
> From: Daniel Mendler <mail <at> daniel-mendler.de>
> Date: Fri, 30 Apr 2021 17:39:35 +0200
> 
> >> Which scenarios break?
> > 
> > For example, if the filter function calls accept-process-output.  Or
> > does anything else that changes which processes do or don't have
> > output available.
> 
> Does this necessarily prevent scheduling? I interpret
> `accept-process-output` as a function which prioritizes a process, but I
> am unsure if this makes it impossible to implement additional scheduling.

A call to accept-process-output prioritizes a process only if it
explicitly requests output from that single process.  Which is not
necessarily true in all cases.

> > What does this mean, exactly?  Which quantity should be doled in a
> > round-robin fashion? bytes read from the processes? something else?
> > 
> > If the bytes read, then how do you suggest to handle two processes
> > which produce output at very different rates?
> 
> For example bytes read or time spent to handle a process (time spent in
> the filter function?).

Bytes read has a problem when processes produce output at very
different rates.  Time spent to handle may (and usually does) mean the
filter function does something expensive, it doesn't necessarily tell
anything about the output from the subprocess.

> When I stumbled over this issue, it astonished me that Emacs
> does not seem to do any scheduling at all and handles only a single
> process.

If you read the code, you will see this isn't what happens.  What
happens is that Emacs reads a chunk of output from the first process
it sees ready, then it goes back and re-checks which processes are
ready -- and in your scenario I think it again sees that the first
process is ready.

> What is the reason for the current behavior? Is it predictability? If I
> understand correctly, Emacs always reads from the first process. If data
> arrives, Emacs does not read from the second process at all. Only if
> no data is available from the first process, the second process is
> handled. Is it like this?

In your scenario, yes.  It depends on how much output a process
produces in one go.

I suggest reading the code of wait_reading_process_output, it has some
non-trivial logic in this department.
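
The behavior described above (read one chunk from the first ready
process, then go back and re-check readiness) can be modeled with a
small C sketch. This is a simplified simulation, not the code of
wait_reading_process_output: the function names and the ready array
are hypothetical, and a channel that is marked ready is assumed to
stay ready forever, as with "yes".

```c
#include <stdbool.h>

/* Hypothetical model: return the lowest-numbered ready channel,
   or -1 if no channel is ready.  */
int
pick_lowest_ready (const bool *ready, int max_desc)
{
  for (int channel = 0; channel <= max_desc; channel++)
    if (ready[channel])
      return channel;
  return -1;
}

/* Simulate ROUNDS iterations of "read one chunk, then re-check".
   READS[channel] counts how many chunks each channel was given.  */
void
simulate_lowest_first (const bool *ready, int max_desc, int rounds,
                       int *reads)
{
  for (int i = 0; i < rounds; i++)
    {
      int channel = pick_lowest_ready (ready, max_desc);
      if (channel < 0)
        break;
      reads[channel]++;
    }
}
```

With two channels that are always ready, every round in this model
picks the lower-numbered one, which matches the symptom in the report:
the second buffer stays empty until the first process stops producing.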




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#48118; Package emacs. (Fri, 30 Apr 2021 16:11:01 GMT) Full text and rfc822 format available.

Message #38 received at 48118 <at> debbugs.gnu.org (full text, mbox):

From: jakanakaevangeli <at> chiru.no
To: Daniel Mendler <mail <at> daniel-mendler.de>
Cc: eliz <at> gnu.org, 48118 <at> debbugs.gnu.org
Subject: Re: bug#48118: 27.1; 28; Only first process receives output with
 multiple running processes
Date: Fri, 30 Apr 2021 18:15:16 +0200
Excuse me if I'm mistaken, but doesn't pselect return all file
descriptors that are ready for reading? Couldn't we then simply process
all of them (in a for loop) after pselect returns?
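
The idea in this message, servicing every descriptor that a single
readiness check reports, can be sketched in C as follows. The sketch
is illustrative only: service_all_ready is a hypothetical helper, it
uses plain select with a zero timeout instead of pselect to stay
short, and it makes no attempt to reproduce the additional logic in
Emacs's wait_reading_process_output.

```c
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: poll FDS[0..N-1] once without blocking and
   read one chunk from every descriptor reported ready.  Returns the
   number of descriptors from which data was actually read.  */
int
service_all_ready (const int *fds, int n)
{
  fd_set available;
  int max_fd = -1;

  FD_ZERO (&available);
  for (int i = 0; i < n; i++)
    {
      FD_SET (fds[i], &available);
      if (fds[i] > max_fd)
        max_fd = fds[i];
    }

  struct timeval timeout = { 0, 0 };	/* Return immediately.  */
  if (select (max_fd + 1, &available, NULL, NULL, &timeout) <= 0)
    return 0;

  int serviced = 0;
  char buf[256];
  for (int i = 0; i < n; i++)
    if (FD_ISSET (fds[i], &available)
        && read (fds[i], buf, sizeof buf) > 0)
      serviced++;
  return serviced;
}
```

Because the loop visits every ready descriptor before select is
called again, a descriptor with a low number cannot keep a higher
one from being read within the same pass.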




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#48118; Package emacs. (Fri, 30 Apr 2021 16:18:02 GMT) Full text and rfc822 format available.

Message #41 received at 48118 <at> debbugs.gnu.org (full text, mbox):

From: Daniel Mendler <mail <at> daniel-mendler.de>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 48118 <at> debbugs.gnu.org
Subject: Re: bug#48118: 27.1; 28; Only first process receives output with
 multiple running processes
Date: Fri, 30 Apr 2021 18:17:49 +0200
On 4/30/21 5:58 PM, Eli Zaretskii wrote:
> A call to accept-process-output prioritizes a process only if it
> explicitly requests output from that single process.  Which is not
> necessarily true in all cases.

Yes, I have seen that in the documentation.

>>> What does this mean, exactly?  Which quantity should be doled in a
>>> round-robin fashion? bytes read from the processes? something else?
>>>
>>> If the bytes read, then how do you suggest to handle two processes
>>> which produce output at very different rates?
>>
>> For example bytes read or time spent to handle a process (time spent in
>> the filter function?).
> 
> Bytes read has a problem when processes produce output at very
> different rates.  Time spent to handle may (and usually does) mean the
> filter function does something expensive, it doesn't necessarily tell
> anything about the output from the subprocess.

Of course it is not possible to find a perfect scheduling algorithm. But
how does the OS handle it if you have multiple processes which produce
output with vastly different rates? I am not claiming this problem has
been solved, but there are certainly some heuristics. Emacs is also
dependent on the OS scheduling, depending on how Emacs schedules its
reads/writes from the processes, the OS scheduler adjusts accordingly.
This furthermore complicates the picture.

>> When I stumbled over this issue, it astonished me that Emacs
>> does not seem to do any scheduling at all and handles only a single
>> process.
> 
> If you read the code, you will see this isn't what happens.  What
> happens is that Emacs reads a chunk of output from the first process
> it sees ready, then it goes back and re-checks which processes are
> ready -- and in your scenario I think it again sees that the first
> process is ready.

This is what we assumed. Emacs could check the second process the next
time. This way one may get a slightly more fair behavior. It would
certainly not be perfect and you could throw scenarios at it which would
make it behave unexpectedly. It may behave a bit more expectedly in the
common case?

> I suggest reading the code of wait_reading_process_output, it has some
> non-trivial logic in this department.

I will do that. Has this problem been discussed before?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#48118; Package emacs. (Fri, 30 Apr 2021 17:53:02 GMT) Full text and rfc822 format available.

Message #44 received at 48118 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: jakanakaevangeli <at> chiru.no
Cc: mail <at> daniel-mendler.de, 48118 <at> debbugs.gnu.org
Subject: Re: bug#48118: 27.1; 28; Only first process receives output with
 multiple running processes
Date: Fri, 30 Apr 2021 20:52:29 +0300
> From: jakanakaevangeli <at> chiru.no
> Cc: eliz <at> gnu.org, 48118 <at> debbugs.gnu.org
> Date: Fri, 30 Apr 2021 18:15:16 +0200
> 
> Excuse me if I'm mistaken, but doesn't pselect return all file
> descriptors that are ready for reading? Couldn't we then simply process
> all of them (in a for loop) after pselect returns?

That's what we do, but with a twist that I described.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#48118; Package emacs. (Fri, 30 Apr 2021 18:07:02 GMT) Full text and rfc822 format available.

Message #47 received at 48118 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Daniel Mendler <mail <at> daniel-mendler.de>
Cc: 48118 <at> debbugs.gnu.org
Subject: Re: bug#48118: 27.1; 28; Only first process receives output with
 multiple running processes
Date: Fri, 30 Apr 2021 21:06:22 +0300
> Cc: 48118 <at> debbugs.gnu.org
> From: Daniel Mendler <mail <at> daniel-mendler.de>
> Date: Fri, 30 Apr 2021 18:17:49 +0200
> 
> > Bytes read has a problem when processes produce output at very
> > different rates.  Time spent to handle may (and usually does) mean the
> > filter function does something expensive, it doesn't necessarily tell
> > anything about the output from the subprocess.
> 
> Of course it is not possible to find a perfect scheduling algorithm. But
> how does the OS handle it if you have multiple processes which produce
> output with vastly different rates? I am not claiming this problem has
> been solved, but there are certainly some heuristics. Emacs is also
> dependent on the OS scheduling, depending on how Emacs schedules its
> reads/writes from the processes, the OS scheduler adjusts accordingly.
> This furthermore complicates the picture.

I'm sure patches to tune the scheduling to specific use cases will be
welcome.  My gut feeling is that we will need some variables to allow
Lisp programs to tell Emacs how to handle the various kinds of
processes and combinations thereof, but if you can come up with
patches that automatically adapt to the process's behavior, that would
be even better.

You could also try playing with the value of read-process-output-max,
perhaps enlarging it will make the problem in your case less severe.

> > If you read the code, you will see this isn't what happens.  What
> > happens is that Emacs reads a chunk of output from the first process
> > it sees ready, then it goes back and re-checks which processes are
> > ready -- and in your scenario I think it again sees that the first
> > process is ready.
> 
> This is what we assumed. Emacs could check the second process the next
> time. This way one may get a slightly more fair behavior. It would
> certainly not be perfect and you could throw scenarios at it which would
> make it behave unexpectedly. It may behave a bit more expectedly in the
> common case?

We could try that (conditioned on some new variable) and see if this
has downsides.

> > I suggest reading the code of wait_reading_process_output, it has some
> > non-trivial logic in this department.
> 
> I will do that. Has this problem been discussed before?

I don't think so, but my memory is not to be trusted.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#48118; Package emacs. (Sun, 02 May 2021 07:24:01 GMT) Full text and rfc822 format available.

Message #50 received at 48118 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Daniel Mendler <mail <at> daniel-mendler.de>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 48118 <at> debbugs.gnu.org
Subject: Re: bug#48118: 27.1; 28; Only first process receives output with
 multiple running processes
Date: Sun, 02 May 2021 09:23:33 +0200
Daniel Mendler <mail <at> daniel-mendler.de> writes:

> This is what we assumed. Emacs could check the second process the next
> time. This way one may get a slightly more fair behavior.

Yes, more fairness in how Emacs handles process output would be nice,
indeed.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#48118; Package emacs. (Mon, 24 May 2021 21:01:01 GMT) Full text and rfc822 format available.

Message #53 received at 48118 <at> debbugs.gnu.org (full text, mbox):

From: miha <at> kamnitnik.top
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: mail <at> daniel-mendler.de, eliz <at> gnu.org, 48118 <at> debbugs.gnu.org
Subject: Re: bug#48118: 27.1; 28; Only first process receives output with
 multiple running processes
Date: Mon, 24 May 2021 23:05:47 +0200
[Message part 1 (text/plain, inline)]
I propose the following simple patch. It makes output from multiple
/bin/yes programs arrive at the same rate and multiple grep processes
can run without them seemingly blocking each other.


[0001-Try-to-not-prioritise-reading-from-lower-fds.patch (text/x-patch, inline)]
From 29544585ec07ec180bb13fac9142d3755c597cd9 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Miha=20Rihtar=C5=A1i=C4=8D?= <miha <at> kamnitnik.top>
Date: Mon, 24 May 2021 22:46:47 +0200
Subject: [PATCH] Try to not prioritise reading from lower fds

* src/process.c (wait_reading_process_output): When looping through
fds, continue from where we left off.
---
 src/process.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/src/process.c b/src/process.c
index 47a2a6f1a3..9c2f328ebc 100644
--- a/src/process.c
+++ b/src/process.c
@@ -5134,6 +5134,7 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
 			     Lisp_Object wait_for_cell,
 			     struct Lisp_Process *wait_proc, int just_wait_proc)
 {
+  static int last_read_channel = -1;
   int channel, nfds;
   fd_set Available;
   fd_set Writeok;
@@ -5188,6 +5189,7 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
   while (1)
     {
       bool process_skipped = false;
+      bool wrapped;
 
       /* If calling from keyboard input, do not quit
 	 since we want to return C-g as an input character.
@@ -5722,8 +5724,17 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
             d->func (channel, d->data);
 	}
 
-      for (channel = 0; channel <= max_desc; channel++)
-	{
+      for (channel = last_read_channel + 1, wrapped = false;
+	   !wrapped || (channel <= last_read_channel && channel <= max_desc);
+	   channel++)
+        {
+	  if (channel > max_desc)
+	    {
+	      wrapped = true;
+	      channel = -1;
+	      continue;
+	    }
+
 	  if (FD_ISSET (channel, &Available)
 	      && ((fd_callback_info[channel].flags & (KEYBOARD_FD | PROCESS_FD))
 		  == PROCESS_FD))
@@ -5761,6 +5772,7 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
 		     don't try to read from any other processes
 		     before doing the select again.  */
 		  FD_ZERO (&Available);
+		  last_read_channel = channel;
 
 		  if (do_display)
 		    redisplay_preserve_echo_area (12);
-- 
2.31.1

[signature.asc (application/pgp-signature, inline)]
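
For readers following the loop condition in the patch above, here is a standalone model of it. This is only a sketch, not Emacs code: the function name scan_order and the out array are hypothetical, while last_read_channel, max_desc and wrapped mirror the names in the diff. It records the order in which channels are visited, starting just after the last channel that was read and wrapping around once.

```c
#include <assert.h>
#include <stdbool.h>

/* Record, in OUT, the channels visited by a loop shaped like the one
   in the patch, for a given LAST_READ_CHANNEL and MAX_DESC.  Return
   the number of channels visited.  OUT needs room for MAX_DESC + 1
   entries.  Standalone model for illustration only.  */
int
scan_order (int last_read_channel, int max_desc, int *out)
{
  int n = 0;
  int channel;
  bool wrapped;

  assert (max_desc >= -1);

  for (channel = last_read_channel + 1, wrapped = false;
       !wrapped || (channel <= last_read_channel && channel <= max_desc);
       channel++)
    {
      if (channel > max_desc)
        {
          /* Ran off the end: restart at 0 (the loop's channel++
             turns -1 into 0).  */
          wrapped = true;
          channel = -1;
          continue;
        }
      out[n++] = channel;
    }
  return n;
}
```

With last_read_channel = 3 and max_desc = 5 the model visits 4, 5, 0, 1, 2, 3. A stale last_read_channel larger than max_desc falls through to a plain 0..max_desc scan, so each channel is still visited once.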

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#48118; Package emacs. (Tue, 25 May 2021 11:39:02 GMT) Full text and rfc822 format available.

Message #56 received at 48118 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: miha <at> kamnitnik.top
Cc: mail <at> daniel-mendler.de, larsi <at> gnus.org, 48118 <at> debbugs.gnu.org
Subject: Re: bug#48118: 27.1; 28; Only first process receives output with
 multiple running processes
Date: Tue, 25 May 2021 14:38:17 +0300
> From: miha <at> kamnitnik.top
> Cc: mail <at> daniel-mendler.de, eliz <at> gnu.org, 48118 <at> debbugs.gnu.org
> Date: Mon, 24 May 2021 23:05:47 +0200
> 
> I propose the following simple patch. It makes output from multiple
> /bin/yes programs arrive at the same rate and multiple grep processes
> can run without them seemingly blocking each other.

Thanks, but I don't think we can make such changes unconditionally.
I'm okay with trying this by default, but we should have a Lisp
variable that would allow getting back to the old behavior.  That's
because if some user complains about some problems, and we think the
problems are caused by this change, we could tell that user to flip
the variable and see if the problems go away.

That variable should also be in NEWS.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#48118; Package emacs. (Tue, 25 May 2021 15:14:02 GMT) Full text and rfc822 format available.

Message #59 received at 48118 <at> debbugs.gnu.org (full text, mbox):

From: <miha <at> kamnitnik.top>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: mail <at> daniel-mendler.de, larsi <at> gnus.org, 48118 <at> debbugs.gnu.org
Subject: Re: bug#48118: 27.1; 28; Only first process receives output with
 multiple running processes
Date: Tue, 25 May 2021 17:18:22 +0200
[Message part 1 (text/plain, inline)]
Eli Zaretskii <eliz <at> gnu.org> writes:

>> From: miha <at> kamnitnik.top
>> Cc: mail <at> daniel-mendler.de, eliz <at> gnu.org, 48118 <at> debbugs.gnu.org
>> Date: Mon, 24 May 2021 23:05:47 +0200
>> 
>> I propose the following simple patch. It makes output from multiple
>> /bin/yes programs arrive at the same rate and multiple grep processes
>> can run without them seemingly blocking each other.
>
> Thanks, but I don't think we can make such changes unconditionally.
> I'm okay with trying this by default, but we should have a Lisp
> variable that would allow getting back to the old behavior.  That's
> because if some user complains about some problems, and we think the
> problems are caused by this change, we could tell that user to flip
> the variable and see if the problems go away.
>
> That variable should also be in NEWS.

Revised patch.

[0001-Try-to-not-prioritise-reading-from-lower-fds.patch (text/x-patch, inline)]
From 77fb8097f2ed20f034230a5c61dee00880bbcd24 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Miha=20Rihtar=C5=A1i=C4=8D?= <miha <at> kamnitnik.top>
Date: Mon, 24 May 2021 22:46:47 +0200
Subject: [PATCH] Try to not prioritise reading from lower fds

* src/process.c (wait_reading_process_output): When looping through
fds, continue from where we left off.
(syms_of_process): Vprocess_prioritize_lower_fds: New variable
---
 etc/NEWS      |  7 +++++++
 src/process.c | 27 +++++++++++++++++++++++++--
 2 files changed, 32 insertions(+), 2 deletions(-)

diff --git a/etc/NEWS b/etc/NEWS
index 1541b74a3b..cc767c8dc2 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -2706,6 +2706,13 @@ the Emacs Lisp reference manual for background.
 * Lisp Changes in Emacs 28.1
 
 +++
+** New variable 'process-prioritize-lower-fds'
+When looping through file descriptors to handle subprocess output, try
+to continue from where the previous loop left off instead of always
+beginning from file descriptor zero.  Set this variable to t to get
+the old behaviour.
+
+---
 ** New function 'sxhash-equal-including-properties'.
 This is identical to 'sxhash-equal' but accounting also for string
 properties.
diff --git a/src/process.c b/src/process.c
index 47a2a6f1a3..7bf55c203e 100644
--- a/src/process.c
+++ b/src/process.c
@@ -5134,6 +5134,7 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
 			     Lisp_Object wait_for_cell,
 			     struct Lisp_Process *wait_proc, int just_wait_proc)
 {
+  static int last_read_channel = -1;
   int channel, nfds;
   fd_set Available;
   fd_set Writeok;
@@ -5188,6 +5189,8 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
   while (1)
     {
       bool process_skipped = false;
+      bool wrapped;
+      int channel_start;
 
       /* If calling from keyboard input, do not quit
 	 since we want to return C-g as an input character.
@@ -5722,8 +5725,20 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
             d->func (channel, d->data);
 	}
 
-      for (channel = 0; channel <= max_desc; channel++)
-	{
+      channel_start
+	= !NILP (Vprocess_prioritize_lower_fds) ? 0 : last_read_channel + 1;
+
+      for (channel = channel_start, wrapped = false;
+	   !wrapped || (channel < channel_start && channel <= max_desc);
+	   channel++)
+        {
+	  if (channel > max_desc)
+	    {
+	      wrapped = true;
+	      channel = -1;
+	      continue;
+	    }
+
 	  if (FD_ISSET (channel, &Available)
 	      && ((fd_callback_info[channel].flags & (KEYBOARD_FD | PROCESS_FD))
 		  == PROCESS_FD))
@@ -5761,6 +5776,7 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
 		     don't try to read from any other processes
 		     before doing the select again.  */
 		  FD_ZERO (&Available);
+		  last_read_channel = channel;
 
 		  if (do_display)
 		    redisplay_preserve_echo_area (12);
@@ -8477,6 +8493,13 @@ syms_of_process (void)
 The variable takes effect when `start-process' is called.  */);
   Vprocess_adaptive_read_buffering = Qt;
 
+  DEFVAR_LISP ("process-prioritize-lower-fds", Vprocess_prioritize_lower_fds,
+	       doc: /* If nil, try to not prioritize reading from any process.
+Emacs loops through file descriptors to receive data from subprocesses.  After
+accepting output from the first file descriptor with available data, restart the
+loop from the file descriptor 0 if this option is non-nil.  */);
+  Vprocess_prioritize_lower_fds = Qnil;
+
   DEFVAR_LISP ("interrupt-process-functions", Vinterrupt_process_functions,
 	       doc: /* List of functions to be called for `interrupt-process'.
 The arguments of the functions are the same as for `interrupt-process'.
-- 
2.31.1

[signature.asc (application/pgp-signature, inline)]
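
The effect of the new variable can be illustrated with a small simulation. This is a rough sketch under stated assumptions, not the Emacs implementation: it assumes every "ready" channel always has output pending (as with several /bin/yes processes) and that exactly one channel is read per select pass, as the comment in the patched code describes. The helper names first_ready_from and simulate_reads are hypothetical, and the modulo scan is a simplification of the wrapped loop in the diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not Emacs code: first ready channel when
   scanning from START and wrapping past MAX_DESC, or -1 if none.  */
int
first_ready_from (const bool *ready, int max_desc, int start)
{
  assert (max_desc >= 0 && start >= 0);
  int count = max_desc + 1;
  for (int i = 0; i < count; i++)
    {
      int channel = (start + i) % count;
      if (ready[channel])
        return channel;
    }
  return -1;
}

/* Model ROUNDS passes of "select, then read from one channel".
   SERVED[c] is incremented each time channel c gets the read.  */
void
simulate_reads (const bool *ready, int max_desc, int rounds,
                bool prioritize_lower_fds, int *served)
{
  int last_read_channel = -1;
  for (int r = 0; r < rounds; r++)
    {
      int start = prioritize_lower_fds ? 0 : last_read_channel + 1;
      int channel = first_ready_from (ready, max_desc, start);
      if (channel < 0)
        break;
      served[channel]++;
      last_read_channel = channel;
    }
}
```

With channels 3 and 5 permanently ready, ten rounds with prioritize_lower_fds set give all ten reads to channel 3, which is the starvation reported in this bug. With it unset, the reads alternate and each channel is served five times.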

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#48118; Package emacs. (Tue, 25 May 2021 17:13:02 GMT) Full text and rfc822 format available.

Message #62 received at 48118 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: miha <at> kamnitnik.top
Cc: mail <at> daniel-mendler.de, larsi <at> gnus.org, 48118 <at> debbugs.gnu.org
Subject: Re: bug#48118: 27.1; 28; Only first process receives output with
 multiple running processes
Date: Tue, 25 May 2021 20:12:28 +0300
> From: <miha <at> kamnitnik.top>
> Cc: larsi <at> gnus.org, mail <at> daniel-mendler.de, 48118 <at> debbugs.gnu.org
> Date: Tue, 25 May 2021 17:18:22 +0200
> 
> Revised patch.

Thanks.

> --- a/etc/NEWS
> +++ b/etc/NEWS
> @@ -2706,6 +2706,13 @@ the Emacs Lisp reference manual for background.
>  * Lisp Changes in Emacs 28.1
>  
>  +++
> +** New variable 'process-prioritize-lower-fds'
> +When looping through file descriptors to handle subprocess output, try
> +to continue from where the previous loop left off instead of always
> +beginning from file descriptor zero.  Set this variable to t to get
> +the old behaviour.
> +
> +---

The "+++" and "---" markers should be reversed.

> +  DEFVAR_LISP ("process-prioritize-lower-fds", Vprocess_prioritize_lower_fds,
> +	       doc: /* If nil, try to not prioritize reading from any process.
> +Emacs loops through file descriptors to receive data from subprocesses.  After
> +accepting output from the first file descriptor with available data, restart the
> +loop from the file descriptor 0 if this option is non-nil.  */);
> +  Vprocess_prioritize_lower_fds = Qnil;

Please use DEFVAR_BOOL, since this is a boolean variable.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#48118; Package emacs. (Tue, 25 May 2021 17:58:01 GMT) Full text and rfc822 format available.

Message #65 received at 48118 <at> debbugs.gnu.org (full text, mbox):

From: <miha <at> kamnitnik.top>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: mail <at> daniel-mendler.de, larsi <at> gnus.org, 48118 <at> debbugs.gnu.org
Subject: Re: bug#48118: 27.1; 28; Only first process receives output with
 multiple running processes
Date: Tue, 25 May 2021 20:02:57 +0200
[Message part 1 (text/plain, inline)]
Eli Zaretskii <eliz <at> gnu.org> writes:

> The "+++" and "---" markers should be reversed.

Whoops.

>> +  DEFVAR_LISP ("process-prioritize-lower-fds", Vprocess_prioritize_lower_fds,
>> +	       doc: /* If nil, try to not prioritize reading from any process.
>> +Emacs loops through file descriptors to receive data from subprocesses.  After
>> +accepting output from the first file descriptor with available data, restart the
>> +loop from the file descriptor 0 if this option is non-nil.  */);
>> +  Vprocess_prioritize_lower_fds = Qnil;
>
> Please use DEFVAR_BOOL, since this is a boolean variable.

Thanks for the feedback, posting revised patch.

[0001-Try-to-not-prioritise-reading-from-lower-fds.patch (text/x-patch, inline)]
[signature.asc (application/pgp-signature, inline)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#48118; Package emacs. (Tue, 25 May 2021 19:03:02 GMT) Full text and rfc822 format available.

Message #68 received at 48118 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: <miha <at> kamnitnik.top>
Cc: mail <at> daniel-mendler.de, Eli Zaretskii <eliz <at> gnu.org>, 48118 <at> debbugs.gnu.org
Subject: Re: bug#48118: 27.1; 28; Only first process receives output with
 multiple running processes
Date: Tue, 25 May 2021 21:02:48 +0200
<miha <at> kamnitnik.top> writes:

> Thanks for the feedback, posting revised patch.

I did some testing, and the network processes seem to work fine with
this change, so I've applied the patch and pushed to Emacs 28 now.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Added tag(s) fixed. Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Tue, 25 May 2021 19:04:01 GMT) Full text and rfc822 format available.

bug marked as fixed in version 28.1, send any further explanations to 48118 <at> debbugs.gnu.org and Daniel Mendler <mail <at> daniel-mendler.de> Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Tue, 25 May 2021 19:04:01 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#48118; Package emacs. (Fri, 04 Jun 2021 13:36:01 GMT) Full text and rfc822 format available.

Message #75 received at 48118 <at> debbugs.gnu.org (full text, mbox):

From: Philipp <p.stephani2 <at> gmail.com>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: Daniel Mendler <mail <at> daniel-mendler.de>, 48118 <at> debbugs.gnu.org
Subject: Re: bug#48118: 27.1; 28; Only first process receives output with
 multiple running processes
Date: Fri, 4 Jun 2021 15:34:53 +0200

> Am 02.05.2021 um 09:23 schrieb Lars Ingebrigtsen <larsi <at> gnus.org>:
> 
> Daniel Mendler <mail <at> daniel-mendler.de> writes:
> 
>> This is what we assumed. Emacs could check the second process the next
>> time. This way one may get a slightly more fair behavior.
> 
> Yes, more fairness in how Emacs handles process output would be nice,
> indeed.

An alternative approach would be to randomly shuffle the file descriptors before selecting on them.  At least that's what e.g. Go is doing (see the code starting from "generate permuted order" in https://golang.org/src/runtime/select.go).



Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#48118; Package emacs. (Fri, 04 Jun 2021 14:02:02 GMT) Full text and rfc822 format available.

Message #78 received at 48118 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Philipp <p.stephani2 <at> gmail.com>
Cc: mail <at> daniel-mendler.de, larsi <at> gnus.org, 48118 <at> debbugs.gnu.org
Subject: Re: bug#48118: 27.1; 28;
 Only first process receives output with multiple running processes
Date: Fri, 04 Jun 2021 17:00:54 +0300
> From: Philipp <p.stephani2 <at> gmail.com>
> Date: Fri, 4 Jun 2021 15:34:53 +0200
> Cc: Daniel Mendler <mail <at> daniel-mendler.de>, 48118 <at> debbugs.gnu.org
> 
> An alternative approach would be to randomly shuffle the file descriptors before selecting on them.  At least that's what e.g. Go is doing (see the code starting from "generate permuted order" in https://golang.org/src/runtime/select.go).

We could have such a behavior as an option.  But we'd need to make
sure the random numbers coming out of that are really random and give
each handle the same chance, even for short time durations.
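
As a rough illustration of the shuffling idea and of the uniformity concern, here is a minimal sketch of a Fisher-Yates shuffle over channel indices. Nothing here is from Emacs or from the Go runtime: xorshift32 is an assumed stand-in generator chosen only to keep the example self-contained, and the function names are hypothetical. The rejection step avoids the modulo bias that a plain "r % bound" would introduce; how random the result is still depends entirely on the generator.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in PRNG for the sketch (xorshift32).  *STATE must be nonzero;
   the result is never zero.  */
uint32_t
xorshift32 (uint32_t *state)
{
  uint32_t x = *state;
  x ^= x << 13;
  x ^= x >> 17;
  x ^= x << 5;
  return *state = x;
}

/* Return a value in [0, BOUND) without modulo bias, by rejecting
   draws that fall in the incomplete final bucket.  */
uint32_t
uniform_below (uint32_t *state, uint32_t bound)
{
  assert (bound > 0 && *state != 0);
  /* Draws are shifted to [0, UINT32_MAX); LIMIT is the largest
     multiple of BOUND that fits in that range.  */
  uint32_t limit = UINT32_MAX - UINT32_MAX % bound;
  uint32_t r;
  do
    r = xorshift32 (state) - 1;
  while (r >= limit);
  return r % bound;
}

/* Fill ORDER with 0..N-1 and shuffle it in place (Fisher-Yates).  */
void
shuffled_channel_order (int *order, int n, uint32_t *state)
{
  for (int i = 0; i < n; i++)
    order[i] = i;
  for (int i = n - 1; i > 0; i--)
    {
      int j = (int) uniform_below (state, (uint32_t) i + 1);
      int tmp = order[i];
      order[i] = order[j];
      order[j] = tmp;
    }
}
```

A caller would scan the ready descriptors in the order given by ORDER instead of 0..max_desc. Unlike the round-robin approach, this gives fairness only on average, which is why the quality of the random numbers over short runs matters.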




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sat, 03 Jul 2021 11:24:05 GMT) Full text and rfc822 format available.

This bug report was last modified 2 years and 291 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.