GNU bug report logs - #66020
[PATCH] Reduce GC churn in read_process_output

Package: emacs;

Reported by: Dmitry Gutov <dmitry <at> gutov.dev>

Date: Sat, 16 Sep 2023 01:27:02 UTC

Severity: wishlist

Tags: patch

Done: Dmitry Gutov <dmitry <at> gutov.dev>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 66020 in the body.
You can then email your comments to 66020 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox

Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#66020; Package emacs. (Sat, 16 Sep 2023 01:27:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Dmitry Gutov <dmitry <at> gutov.dev>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Sat, 16 Sep 2023 01:27:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: bug-gnu-emacs <at> gnu.org
Subject: [PATCH] Reduce GC churn in read_process_output
Date: Sat, 16 Sep 2023 04:26:11 +0300

[Message part 1 (text/plain, inline)]

As suggested, I'm filing this in a new bug report for wider review.

The updated patch attached.

The previous discussion was in bug#64735, and the benchmarks (which more 
or less hold for the new patch as well) can be viewed here (last table): 
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=64735#506. Except the new 
version somehow performs a little better at read-process-output-max=4096 
as well, despite seemingly doing more.

Let me know if I DRYed this too much, or if there are better names for 
the extracted common routines, or etc. Or for the new variable 
(read-process-output-fast).

[read_and_insert_process_output_v2.diff (text/x-patch, attachment)]

Severity set to 'wishlist' from 'normal' Request was from Stefan Kangas <stefankangas <at> gmail.com> to control <at> debbugs.gnu.org. (Mon, 18 Sep 2023 22:52:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#66020; Package emacs. (Tue, 19 Sep 2023 20:01:01 GMT) Full text and rfc822 format available.

Message #10 received at 66020 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 66020 <at> debbugs.gnu.org
Subject: bug#66020 (bug#64735 spin-off): regarding the default for
 read-process-output-max
Date: Tue, 19 Sep 2023 22:59:43 +0300

This is another continuation from bug#64735, a subthread in this bug 
seems more fitting, given that I did most of the tests with its patch 
applied.

On 16/09/2023 08:37, Eli Zaretskii wrote:
>> Date: Sat, 16 Sep 2023 04:32:26 +0300
>> Cc:luangruo <at> yahoo.com,sbaugh <at> janestreet.com,yantar92 <at> posteo.net,
>>   64735 <at> debbugs.gnu.org
>> From: Dmitry Gutov<dmitry <at> gutov.dev>
>>
>>>> I wonder what scenario that might become apparent in. Launching many
>>>> small processes at once? Can't think of a realistic test case.
>>> One process suffices.  The effect might not be significant, but
>>> slowdowns due to new features are generally considered regressions.
>> We'd need some objective way to evaluate this. Otherwise we'd just stop
>> at the prospect of slowing down some process somewhere by 9ns (never
>> mind speeding others up).
> That could indeed happen, and did happen in other cases.  My personal
> conclusion from similar situations is that it is impossible to tell in
> advance what the reaction will be; we need to present the numbers and
> see how the chips fall.

I wrote this test:

(defun test-ls-output ()
  (with-temp-buffer
    (let ((proc
           (make-process :name "ls"
                         :sentinel (lambda (&rest _))
                         :buffer (current-buffer)
                         :stderr (current-buffer)
                         :connection-type 'pipe
                         :command '("ls"))))
      (while (accept-process-output proc))
      (buffer-string))))

And tried to find some case where the difference is the least in favor 
of high buffer length. The one in favor of it we already know (a process 
with lots and lots of output).

But when running 'ls' on a small directory (output 500 chars long), the 
variance in benchmarking is larger than any difference I can see from 
changing read-process-output-max from 4096 to 40960 (or to 40900 even). 
The benchmark is the following:

  (benchmark 1000 '(let ((read-process-output-fast t) 
(read-process-output-max 4096)) (test-ls-output)))

When the directory is a little large (output ~50000 chars), there is 
more nuance. At first, as long as (!) read_and_insert_process_output_v2 
patch is applied and read-process-output-fast is non-nil, the difference 
is negligible:

| read-process-output-max | bench result                        |
|                    4096 | (4.566418994 28 0.8000380139999992) |
|                   40960 | (4.640526664 32 0.8330555910000008) |
|                  409600 | (4.629948652 30 0.7989731299999994) |

For completeness, here are the same results for 
read-process-output-fast=nil (emacs-29 is similar, though all a little 
slower):

| read-process-output-max | bench result                        |
|                    4096 | (4.953397326 52 1.354643750000001)  |
|                   40960 | (6.942334958 75 2.0616055079999995) |
|                  409600 | (7.124765651 76 2.0892871070000005) |

But as the session gets older (and I repeat these and other 
memory-intensive benchmarks), the outlay changes, and the larger buffer 
leads to uniformly worse number (the below is taken with 
read-process-output-fast=t; with that var set to nil the results were 
even worse):

| read-process-output-max | bench result                        |
|                    4096 | (5.02324481 41 0.8851443580000051)  |
|                   40960 | (5.438721274 61 1.2202541989999958) |
|                  409600 | (6.11188183 77 1.5461468160000038)  |

...which seems odd given that in general, the buffer length closer to 
the length of the output should be preferable, because otherwise it is 
allocated multiple times, and read_process_output is likewise called 
more. Perhaps longer strings get more difficult to allocate as 
fragmentation increases?

So, the last table is from a session I had running from yesterday, and 
the first table was produced after I restarted Emacs about an hour ago 
(the numbers were stable for 1-2 hours while I was writing this email 
on-and-off, then started degrading again a little bit, though not yet -- 
a couple of hours since -- even halfway to the numbers in the last table).

Where to go from here?

- Maybe we declare the difference insignificant and bump the value of 
read-process-output-max, given that it helps in other cases,
- Or try to find out the cause for degradation,
- Or keep the default the same, but make it easier to use different 
value for different processes (meaning, we resurrect the discussion in 
bug#38561).

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#66020; Package emacs. (Wed, 20 Sep 2023 11:21:02 GMT) Full text and rfc822 format available.

Message #13 received at 66020 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dmitry <at> gutov.dev>, Stefan Kangas <stefankangas <at> gmail.com>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 66020 <at> debbugs.gnu.org
Subject: Re: bug#66020 (bug#64735 spin-off): regarding the default for
 read-process-output-max
Date: Wed, 20 Sep 2023 14:20:13 +0300

> Date: Tue, 19 Sep 2023 22:59:43 +0300
> Cc: 66020 <at> debbugs.gnu.org
> From: Dmitry Gutov <dmitry <at> gutov.dev>
> 
> - Maybe we declare the difference insignificant and bump the value of 
> read-process-output-max, given that it helps in other cases,
> - Or try to find out the cause for degradation,
> - Or keep the default the same, but make it easier to use different 
> value for different processes (meaning, we resurrect the discussion in 
> bug#38561).

I'd try the same experiment on other use cases, say "M-x grep" and
"M-x compile" with large outputs, and if you see the same situation
there (i.e. larger buffers are no worse), try increasing the default
value on master.

Stefan & Stefan: any comments or suggestions?

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#66020; Package emacs. (Thu, 21 Sep 2023 00:59:02 GMT) Full text and rfc822 format available.

Message #16 received at 66020 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Eli Zaretskii <eliz <at> gnu.org>, Stefan Kangas <stefankangas <at> gmail.com>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 66020 <at> debbugs.gnu.org
Subject: Re: bug#66020 (bug#64735 spin-off): regarding the default for
 read-process-output-max
Date: Thu, 21 Sep 2023 03:57:43 +0300

On 20/09/2023 14:20, Eli Zaretskii wrote:
>> Date: Tue, 19 Sep 2023 22:59:43 +0300
>> Cc: 66020 <at> debbugs.gnu.org
>> From: Dmitry Gutov <dmitry <at> gutov.dev>
>>
>> - Maybe we declare the difference insignificant and bump the value of
>> read-process-output-max, given that it helps in other cases,
>> - Or try to find out the cause for degradation,
>> - Or keep the default the same, but make it easier to use different
>> value for different processes (meaning, we resurrect the discussion in
>> bug#38561).
> 
> I'd try the same experiment on other use cases, say "M-x grep" and
> "M-x compile" with large outputs, and if you see the same situation
> there (i.e. larger buffers are no worse), try increasing the default
> value on master.

I've run one particular rgrep search a few times (24340 hits, ~44s when 
the variable's value is either 4096 or 409600). And it makes sense that 
there is no difference: compilation modes do a lot more work than just 
capturing the process output or splitting it into strings.

That leaves the question of what new value to use. 409600 is optimal for 
a large-output process but seems too much as default anyway (even if I 
have very little experimental proof for that hesitance: any help with 
that would be very welcome).

I did some more experimenting, though. At a superficial glance, 
allocating the 'chars' buffer at the beginning of read_process_output is 
problematic because we could instead reuse a buffer for the whole 
duration of the process. I tried that (adding a new field to 
Lisp_Process and setting it in make_process), although I had to use a 
value produced by make_uninit_string: apparently simply storing a char* 
field inside a managed structure creates problems for the GC and early 
segfaults. Anyway, the result was slightly _slower_ than the status quo.

So I read what 'alloca' does, and it looks hard to beat. But it's only 
used (as you of course know) when the value is <= MAX_ALLOCA, which is 
currently 16384. Perhaps an optimal default value shouldn't exceed this, 
even if it's hard to create a benchmark that shows a difference. With 
read-process-output-max set to 16384, my original benchmark gets about 
halfway to the optimal number.

And I think we should make the process "remember" the value at its 
creation either way (something touched on in bug#38561): in bug#55737 we 
added an fcntl call to make the larger values take effect. But this call 
is in create_process: so any subsequent increase to a large value of 
this var won't have effect. Might as well remember it there (in a new 
field), then it'll be easier to use different values of it for different 
processes (set using let-binding at the time of the process' creation).

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#66020; Package emacs. (Thu, 21 Sep 2023 02:37:02 GMT) Full text and rfc822 format available.

Message #19 received at 66020 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: Eli Zaretskii <eliz <at> gnu.org>, Stefan Kangas <stefankangas <at> gmail.com>,
 66020 <at> debbugs.gnu.org
Subject: Re: bug#66020 (bug#64735 spin-off): regarding the default for
 read-process-output-max
Date: Wed, 20 Sep 2023 22:36:18 -0400

> make_process), although I had to use a value produced by make_uninit_string:
> apparently simply storing a char* field inside a managed structure creates
> problems for the GC and early segfaults. Anyway, the result was slightly

That should depend on *where* you put that field.  Basically, it has to
come after:

    /* The thread a process is linked to, or nil for any thread.  */
    Lisp_Object thread;
    /* After this point, there are no Lisp_Objects.  */

since all the words up to that point will be traced by the GC (and
assumed to be Lisp_Object fields).  But of course, if you created the
buffer with `make_uninit_string` then it'll be inside the Lisp heap and
so it'll be reclaimed if the GC doesn't find any reference to it.


        Stefan

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#66020; Package emacs. (Thu, 21 Sep 2023 07:43:01 GMT) Full text and rfc822 format available.

Message #22 received at 66020 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: 66020 <at> debbugs.gnu.org, stefankangas <at> gmail.com, monnier <at> iro.umontreal.ca
Subject: Re: bug#66020 (bug#64735 spin-off): regarding the default for
 read-process-output-max
Date: Thu, 21 Sep 2023 10:42:31 +0300

> Date: Thu, 21 Sep 2023 03:57:43 +0300
> Cc: 66020 <at> debbugs.gnu.org
> From: Dmitry Gutov <dmitry <at> gutov.dev>
> 
> That leaves the question of what new value to use. 409600 is optimal for 
> a large-output process but seems too much as default anyway (even if I 
> have very little experimental proof for that hesitance: any help with 
> that would be very welcome).

How does the throughput depend on this value?  If the dependence curve
plateaus at some lower value, we could use that lower value as a
"good-enough" default.

> I did some more experimenting, though. At a superficial glance, 
> allocating the 'chars' buffer at the beginning of read_process_output is 
> problematic because we could instead reuse a buffer for the whole 
> duration of the process. I tried that (adding a new field to 
> Lisp_Process and setting it in make_process), although I had to use a 
> value produced by make_uninit_string: apparently simply storing a char* 
> field inside a managed structure creates problems for the GC and early 
> segfaults. Anyway, the result was slightly _slower_ than the status quo.
> 
> So I read what 'alloca' does, and it looks hard to beat. But it's only 
> used (as you of course know) when the value is <= MAX_ALLOCA, which is 
> currently 16384. Perhaps an optimal default value shouldn't exceed this, 
> even if it's hard to create a benchmark that shows a difference. With 
> read-process-output-max set to 16384, my original benchmark gets about 
> halfway to the optimal number.

Which I think means we should stop worrying about the overhead of
malloc for this purpose, as it is fast enough, at least on GNU/Linux.

> And I think we should make the process "remember" the value at its 
> creation either way (something touched on in bug#38561): in bug#55737 we 
> added an fcntl call to make the larger values take effect. But this call 
> is in create_process: so any subsequent increase to a large value of 
> this var won't have effect.

Why would the variable change after create_process?  I'm afraid I
don't understand what issue you are trying to deal with here.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#66020; Package emacs. (Thu, 21 Sep 2023 08:08:01 GMT) Full text and rfc822 format available.

Message #25 received at 66020 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Kangas <stefankangas <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>, Dmitry Gutov <dmitry <at> gutov.dev>, 
 Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 66020 <at> debbugs.gnu.org
Subject: Re: bug#66020 (bug#64735 spin-off): regarding the default for
 read-process-output-max
Date: Thu, 21 Sep 2023 01:07:29 -0700

Eli Zaretskii <eliz <at> gnu.org> writes:

> Stefan & Stefan: any comments or suggestions?

FWIW, I've had the below snippet in my .emacs for the last two years,
and haven't noticed any adverse effects.  I never bothered making any
actual benchmarks though:

    ;; Maybe faster:
    (setq read-process-output-max
          (max read-process-output-max (* 64 1024)))

I added the above after the discussion here:

    https://lists.gnu.org/r/emacs-devel/2021-03/msg01461.html

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#66020; Package emacs. (Thu, 21 Sep 2023 13:17:02 GMT) Full text and rfc822 format available.

Message #28 received at 66020 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: 66020 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca, stefankangas <at> gmail.com
Subject: Re: bug#66020 (bug#64735 spin-off): regarding the default for
 read-process-output-max
Date: Thu, 21 Sep 2023 16:16:31 +0300

> Date: Thu, 21 Sep 2023 15:20:57 +0300
> Cc: Eli Zaretskii <eliz <at> gnu.org>, Stefan Kangas <stefankangas <at> gmail.com>,
>  66020 <at> debbugs.gnu.org
> From: Dmitry Gutov <dmitry <at> gutov.dev>
> 
> On 21/09/2023 05:36, Stefan Monnier wrote:
> >> make_process), although I had to use a value produced by make_uninit_string:
> >> apparently simply storing a char* field inside a managed structure creates
> >> problems for the GC and early segfaults. Anyway, the result was slightly
> > That should depend on*where*  you put that field.  Basically, it has to
> > come after:
> > 
> >      /* The thread a process is linked to, or nil for any thread.  */
> >      Lisp_Object thread;
> >      /* After this point, there are no Lisp_Objects.  */
> > 
> > since all the words up to that point will be traced by the GC (and
> > assumed to be Lisp_Object fields).
> 
> Ah, thanks. That calls for another try.
> 
> ...still no improvement, though no statistically significant slowdown 
> either this time.

Why did you expect a significant improvement?  Allocating and freeing
the same-size buffer in quick succession has got to be optimally
handled by modern malloc implementations, so I wouldn't be surprised
by what you discover.  There should be no OS calls, just reuse of a
buffer that was just recently free'd.  The overhead exists, but is
probably very small, so it is lost in the noise.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#66020; Package emacs. (Thu, 21 Sep 2023 13:18:02 GMT) Full text and rfc822 format available.

Message #31 received at 66020 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: 66020 <at> debbugs.gnu.org, stefankangas <at> gmail.com, monnier <at> iro.umontreal.ca
Subject: Re: bug#66020 (bug#64735 spin-off): regarding the default for
 read-process-output-max
Date: Thu, 21 Sep 2023 16:17:31 +0300

> Date: Thu, 21 Sep 2023 15:27:41 +0300
> Cc: 66020 <at> debbugs.gnu.org
> From: Dmitry Gutov <dmitry <at> gutov.dev>
> 
> >      https://lists.gnu.org/r/emacs-devel/2021-03/msg01461.html
> 
> The archive seems down (so I can't read this), but if you found a 
> tangible improvement from the above setting, you might also want to try 
> out the patch at the top of this bug report.

It is back up now.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#66020; Package emacs. (Thu, 21 Sep 2023 14:38:01 GMT) Full text and rfc822 format available.

Message #34 received at 66020 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 66020 <at> debbugs.gnu.org, stefankangas <at> gmail.com, monnier <at> iro.umontreal.ca
Subject: Re: bug#66020 (bug#64735 spin-off): regarding the default for
 read-process-output-max
Date: Thu, 21 Sep 2023 17:37:23 +0300

On 21/09/2023 10:42, Eli Zaretskii wrote:
>> Date: Thu, 21 Sep 2023 03:57:43 +0300
>> Cc: 66020 <at> debbugs.gnu.org
>> From: Dmitry Gutov <dmitry <at> gutov.dev>
>>
>> That leaves the question of what new value to use. 409600 is optimal for
>> a large-output process but seems too much as default anyway (even if I
>> have very little experimental proof for that hesitance: any help with
>> that would be very welcome).
> 
> How does the throughput depend on this value?  If the dependence curve
> plateaus at some lower value, we could use that lower value as a
> "good-enough" default.

Depends on what we're prepared to call a plateau. Strictly speaking, not 
really. But we have a "sweet spot": for the process in my original 
benchmark ('find' with lots of output) it seems to be around 1009600. 
Here's a table (numbers are different from before because they're 
results of (benchmark 5 ...) divided by 5, meaning GC is amortized:

|    4096 | 0.78 |
|   16368 | 0.69 |
|   40960 | 0.65 |
|  409600 | 0.59 |
| 1009600 | 0.56 |
| 2009600 | 0.64 |
| 4009600 | 0.65 |

The process's output length is 27244567 in this case. Still above the 
largest of the buffers in this example.

Notably, only allocating the buffer once at the start of the process 
(experiment mentioned in the email to Stefan M.) doesn't change the 
dynamics: buffer lengths above ~1009600 make the performance worse.

So there must be some negative factor associated with higher buffers. 
There is an obvious positive one: the longer the buffer, the longer we 
don't switch between processes, so that overhead is lower.

We could look into improving that part specifically: for example, 
reading from the process multiple times into 'chars' right away while 
there is still pending output present (either looping inside 
read_process_output, or calling it in a loop in 
wait_reading_process_output, at least until the process' buffered output 
is exhausted). That could reduce reactivity, however (can we find out 
how much is already buffered in advance, and only loop until we exhaust 
that length?)

>> I did some more experimenting, though. At a superficial glance,
>> allocating the 'chars' buffer at the beginning of read_process_output is
>> problematic because we could instead reuse a buffer for the whole
>> duration of the process. I tried that (adding a new field to
>> Lisp_Process and setting it in make_process), although I had to use a
>> value produced by make_uninit_string: apparently simply storing a char*
>> field inside a managed structure creates problems for the GC and early
>> segfaults. Anyway, the result was slightly _slower_ than the status quo.
>>
>> So I read what 'alloca' does, and it looks hard to beat. But it's only
>> used (as you of course know) when the value is <= MAX_ALLOCA, which is
>> currently 16384. Perhaps an optimal default value shouldn't exceed this,
>> even if it's hard to create a benchmark that shows a difference. With
>> read-process-output-max set to 16384, my original benchmark gets about
>> halfway to the optimal number.
> 
> Which I think means we should stop worrying about the overhead of
> malloc for this purpose, as it is fast enough, at least on GNU/Linux.

Perhaps. If we're not too concerned about memory fragmentation (that's 
the only explanation I have for the table "session gets older" -- last 
one -- in a previous email with test-ls-output timings).

>> And I think we should make the process "remember" the value at its
>> creation either way (something touched on in bug#38561): in bug#55737 we
>> added an fcntl call to make the larger values take effect. But this call
>> is in create_process: so any subsequent increase to a large value of
>> this var won't have effect.
> 
> Why would the variable change after create_process?  I'm afraid I
> don't understand what issue you are trying to deal with here.

Well, what could we lose by saving the value of read-process-output-max 
in create_process? Currently I suppose one could vary its value while a 
process is still running, to implement some adaptive behavior or 
whatnot. But that's already semi-broken because fcntl is called in 
create_process.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#66020; Package emacs. (Thu, 21 Sep 2023 15:00:02 GMT) Full text and rfc822 format available.

Message #37 received at 66020 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: 66020 <at> debbugs.gnu.org, stefankangas <at> gmail.com, monnier <at> iro.umontreal.ca
Subject: Re: bug#66020 (bug#64735 spin-off): regarding the default for
 read-process-output-max
Date: Thu, 21 Sep 2023 17:59:36 +0300

> Date: Thu, 21 Sep 2023 17:37:23 +0300
> Cc: stefankangas <at> gmail.com, monnier <at> iro.umontreal.ca, 66020 <at> debbugs.gnu.org
> From: Dmitry Gutov <dmitry <at> gutov.dev>
> 
> > How does the throughput depend on this value?  If the dependence curve
> > plateaus at some lower value, we could use that lower value as a
> > "good-enough" default.
> 
> Depends on what we're prepared to call a plateau. Strictly speaking, not 
> really. But we have a "sweet spot": for the process in my original 
> benchmark ('find' with lots of output) it seems to be around 1009600. 
> Here's a table (numbers are different from before because they're 
> results of (benchmark 5 ...) divided by 5, meaning GC is amortized:
> 
> |    4096 | 0.78 |
> |   16368 | 0.69 |
> |   40960 | 0.65 |
> |  409600 | 0.59 |
> | 1009600 | 0.56 |
> | 2009600 | 0.64 |
> | 4009600 | 0.65 |

Not enough data points between 40960 and 409600, IMO.  40960 sounds
like a good spot for the default value.

> >> And I think we should make the process "remember" the value at its
> >> creation either way (something touched on in bug#38561): in bug#55737 we
> >> added an fcntl call to make the larger values take effect. But this call
> >> is in create_process: so any subsequent increase to a large value of
> >> this var won't have effect.
> > 
> > Why would the variable change after create_process?  I'm afraid I
> > don't understand what issue you are trying to deal with here.
> 
> Well, what could we lose by saving the value of read-process-output-max 
> in create_process?

It's already recorded in the size of the pipe, so why would we need to
record it once more?

> Currently I suppose one could vary its value while a process is
> still running, to implement some adaptive behavior or whatnot. But
> that's already semi-broken because fcntl is called in
> create_process.

I see no reason to support such changes during the process run,
indeed.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#66020; Package emacs. (Thu, 21 Sep 2023 17:34:01 GMT) Full text and rfc822 format available.

Message #40 received at 66020 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 66020 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca, stefankangas <at> gmail.com
Subject: Re: bug#66020: (bug#64735 spin-off): regarding the default for
 read-process-output-max
Date: Thu, 21 Sep 2023 20:33:29 +0300

[Message part 1 (text/plain, inline)]

On 21/09/2023 17:37, Dmitry Gutov wrote:
> We could look into improving that part specifically: for example, 
> reading from the process multiple times into 'chars' right away while 
> there is still pending output present (either looping inside 
> read_process_output, or calling it in a loop in 
> wait_reading_process_output, at least until the process' buffered output 
> is exhausted). That could reduce reactivity, however (can we find out 
> how much is already buffered in advance, and only loop until we exhaust 
> that length?)

Hmm, the naive patch below offers some improvement for the value 4096, 
but still not comparable to raising the buffer size: 0.76 -> 0.72.

diff --git a/src/process.c b/src/process.c
index 2376d0f288d..a550e223f78 100644
--- a/src/process.c
+++ b/src/process.c
@@ -5893,7 +5893,7 @@ wait_reading_process_output (intmax_t time_limit, 
int nsecs, int read_kbd,
 	      && ((fd_callback_info[channel].flags & (KEYBOARD_FD | PROCESS_FD))
 		  == PROCESS_FD))
 	    {
-	      int nread;
+	      int nread = 0, nnread;

 	      /* If waiting for this channel, arrange to return as
 		 soon as no more input to be processed.  No more
@@ -5912,7 +5912,13 @@ wait_reading_process_output (intmax_t time_limit, 
int nsecs, int read_kbd,
 	      /* Read data from the process, starting with our
 		 buffered-ahead character if we have one.  */

-	      nread = read_process_output (proc, channel);
+	      do
+		{
+		  nnread = read_process_output (proc, channel);
+		  nread += nnread;
+		}
+	      while (nnread >= 4096);
+
 	      if ((!wait_proc || wait_proc == XPROCESS (proc))
 		  && got_some_output < nread)
 		got_some_output = nread;


And "unlocking" the pipe size on the external process takes the 
performance further up a notch (by default it's much larger): 0.72 -> 0.65.

diff --git a/src/process.c b/src/process.c
index 2376d0f288d..85fc1b4d0c8 100644
--- a/src/process.c
+++ b/src/process.c
@@ -2206,10 +2206,10 @@ create_process (Lisp_Object process, char 
**new_argv, Lisp_Object current_dir)
       inchannel = p->open_fd[READ_FROM_SUBPROCESS];
       forkout = p->open_fd[SUBPROCESS_STDOUT];

-#if (defined (GNU_LINUX) || defined __ANDROID__)	\
-  && defined (F_SETPIPE_SZ)
-      fcntl (inchannel, F_SETPIPE_SZ, read_process_output_max);
-#endif /* (GNU_LINUX || __ANDROID__) && F_SETPIPE_SZ */
+/* #if (defined (GNU_LINUX) || defined __ANDROID__)	\ */
+/*   && defined (F_SETPIPE_SZ) */
+/*       fcntl (inchannel, F_SETPIPE_SZ, read_process_output_max); */
+/* #endif /\* (GNU_LINUX || __ANDROID__) && F_SETPIPE_SZ *\/ */
     }

   if (!NILP (p->stderrproc))

Apparently the patch from bug#55737 also made things a little worse by 
default, by limiting concurrency (the external process has to wait while 
the pipe is blocked, and by default Linux's pipe is larger). Just 
commenting it out makes performance a little better as well, though not 
as much as the two patches together.

Note that both changes above are just PoC (e.g. the hardcoded 4096, and 
probably other details like carryover).

I've tried to make a more nuanced loop inside read_process_output 
instead (as replacement for the first patch above), and so far it 
performs worse that the baseline. If anyone can see when I'm doing wrong 
(see attachment), comments are very welcome.

[read_process_output_nn_inside.diff (text/x-patch, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#66020; Package emacs. (Thu, 21 Sep 2023 17:41:02 GMT) Full text and rfc822 format available.

Message #43 received at 66020 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 66020 <at> debbugs.gnu.org, stefankangas <at> gmail.com, monnier <at> iro.umontreal.ca
Subject: Re: bug#66020 (bug#64735 spin-off): regarding the default for
 read-process-output-max
Date: Thu, 21 Sep 2023 20:40:35 +0300

On 21/09/2023 17:59, Eli Zaretskii wrote:
>> Date: Thu, 21 Sep 2023 17:37:23 +0300
>> Cc: stefankangas <at> gmail.com, monnier <at> iro.umontreal.ca, 66020 <at> debbugs.gnu.org
>> From: Dmitry Gutov <dmitry <at> gutov.dev>
>>
>>> How does the throughput depend on this value?  If the dependence curve
>>> plateaus at some lower value, we could use that lower value as a
>>> "good-enough" default.
>>
>> Depends on what we're prepared to call a plateau. Strictly speaking, not
>> really. But we have a "sweet spot": for the process in my original
>> benchmark ('find' with lots of output) it seems to be around 1009600.
>> Here's a table (numbers are different from before because they're
>> results of (benchmark 5 ...) divided by 5, meaning GC is amortized:
>>
>> |    4096 | 0.78 |
>> |   16368 | 0.69 |
>> |   40960 | 0.65 |
>> |  409600 | 0.59 |
>> | 1009600 | 0.56 |
>> | 2009600 | 0.64 |
>> | 4009600 | 0.65 |
> 
> Not enough data points between 40960 and 409600, IMO.  40960 sounds
> like a good spot for the default value.

Or 32K, from the thread linked to previously (datagram size). And ifwe 
were to raise MAX_ALLOCA by 2x, we could still use 'alloca'.

Neither would be optimal for my test scenario, though still an 
improvement. But see my other email with experimental patches, those 
bear improvement with the default 4096.

>>>> And I think we should make the process "remember" the value at its
>>>> creation either way (something touched on in bug#38561): in bug#55737 we
>>>> added an fcntl call to make the larger values take effect. But this call
>>>> is in create_process: so any subsequent increase to a large value of
>>>> this var won't have effect.
>>>
>>> Why would the variable change after create_process?  I'm afraid I
>>> don't understand what issue you are trying to deal with here.
>>
>> Well, what could we lose by saving the value of read-process-output-max
>> in create_process?
> 
> It's already recorded in the size of the pipe, so why would we need to
> record it once more?

'read_process_output' looks it up once more, to set the value of 
'readmax' and allocate char*chars.

Can we get the "recorded" value back from the pipe somehow?

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#66020; Package emacs. (Thu, 21 Sep 2023 17:55:02 GMT) Full text and rfc822 format available.

Message #46 received at 66020 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 66020 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca, stefankangas <at> gmail.com
Subject: Re: bug#66020 (bug#64735 spin-off): regarding the default for
 read-process-output-max
Date: Thu, 21 Sep 2023 20:54:30 +0300

On 21/09/2023 16:16, Eli Zaretskii wrote:
>> Date: Thu, 21 Sep 2023 15:20:57 +0300
>> Cc: Eli Zaretskii<eliz <at> gnu.org>, Stefan Kangas<stefankangas <at> gmail.com>,
>>   66020 <at> debbugs.gnu.org
>> From: Dmitry Gutov<dmitry <at> gutov.dev>
>>
>> On 21/09/2023 05:36, Stefan Monnier wrote:
>>>> make_process), although I had to use a value produced by make_uninit_string:
>>>> apparently simply storing a char* field inside a managed structure creates
>>>> problems for the GC and early segfaults. Anyway, the result was slightly
>>> That should depend on*where*  you put that field.  Basically, it has to
>>> come after:
>>>
>>>       /* The thread a process is linked to, or nil for any thread.  */
>>>       Lisp_Object thread;
>>>       /* After this point, there are no Lisp_Objects.  */
>>>
>>> since all the words up to that point will be traced by the GC (and
>>> assumed to be Lisp_Object fields).
>> Ah, thanks. That calls for another try.
>>
>> ...still no improvement, though no statistically significant slowdown
>> either this time.
> Why did you expect a significant improvement?

No need to be surprised, I'm still growing intuition for what is fast 
and what is slow at this level of abstraction.

> Allocating and freeing
> the same-size buffer in quick succession has got to be optimally
> handled by modern malloc implementations, so I wouldn't be surprised
> by what you discover.  There should be no OS calls, just reuse of a
> buffer that was just recently free'd.  The overhead exists, but is
> probably very small, so it is lost in the noise.

There are context switches after 'read_process_output' exits (control is 
returned to Emacs's event loop, the external process runs again, we wait 
on it with 'select'), it might not be there later, especially outside of 
the lab situation where we benchmark just single external process. So I 
don't know.

I'm not majorly concerned, of course, and wouldn't be at all, if not for 
the previously recorded minor degragation with larger buffers in the 
longer-running session (last table in https://debbugs.gnu.org/66020#10).

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#66020; Package emacs. (Thu, 21 Sep 2023 18:40:02 GMT) Full text and rfc822 format available.

Message #49 received at 66020 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: 66020 <at> debbugs.gnu.org, stefankangas <at> gmail.com, monnier <at> iro.umontreal.ca
Subject: Re: bug#66020 (bug#64735 spin-off): regarding the default for
 read-process-output-max
Date: Thu, 21 Sep 2023 21:39:34 +0300

> Date: Thu, 21 Sep 2023 20:40:35 +0300
> Cc: stefankangas <at> gmail.com, monnier <at> iro.umontreal.ca, 66020 <at> debbugs.gnu.org
> From: Dmitry Gutov <dmitry <at> gutov.dev>
> 
> Can we get the "recorded" value back from the pipe somehow?

There's F_GETPIPE_SZ command to fcntl, so I think we can.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#66020; Package emacs. (Thu, 21 Sep 2023 18:43:02 GMT) Full text and rfc822 format available.

Message #52 received at 66020 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 66020 <at> debbugs.gnu.org, stefankangas <at> gmail.com, monnier <at> iro.umontreal.ca
Subject: Re: bug#66020 (bug#64735 spin-off): regarding the default for
 read-process-output-max
Date: Thu, 21 Sep 2023 21:42:01 +0300

On 21/09/2023 21:39, Eli Zaretskii wrote:
>> Date: Thu, 21 Sep 2023 20:40:35 +0300
>> Cc:stefankangas <at> gmail.com,monnier <at> iro.umontreal.ca,66020 <at> debbugs.gnu.org
>> From: Dmitry Gutov<dmitry <at> gutov.dev>
>>
>> Can we get the "recorded" value back from the pipe somehow?
> There's F_GETPIPE_SZ command to fcntl, so I think we can.

I'll rephrase: is this a good idea (doing a +1 syscall every time we 
read a chunk, I'm not sure of its performance anyway), or should we add 
a new field to Lisp_Process after all?

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#66020; Package emacs. (Thu, 21 Sep 2023 18:50:01 GMT) Full text and rfc822 format available.

Message #55 received at 66020 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: 66020 <at> debbugs.gnu.org, stefankangas <at> gmail.com, monnier <at> iro.umontreal.ca
Subject: Re: bug#66020 (bug#64735 spin-off): regarding the default for
 read-process-output-max
Date: Thu, 21 Sep 2023 21:49:00 +0300

> Date: Thu, 21 Sep 2023 21:42:01 +0300
> Cc: stefankangas <at> gmail.com, monnier <at> iro.umontreal.ca, 66020 <at> debbugs.gnu.org
> From: Dmitry Gutov <dmitry <at> gutov.dev>
> 
> On 21/09/2023 21:39, Eli Zaretskii wrote:
> >> Date: Thu, 21 Sep 2023 20:40:35 +0300
> >> Cc:stefankangas <at> gmail.com,monnier <at> iro.umontreal.ca,66020 <at> debbugs.gnu.org
> >> From: Dmitry Gutov<dmitry <at> gutov.dev>
> >>
> >> Can we get the "recorded" value back from the pipe somehow?
> > There's F_GETPIPE_SZ command to fcntl, so I think we can.
> 
> I'll rephrase: is this a good idea (doing a +1 syscall every time we 
> read a chunk, I'm not sure of its performance anyway), or should we add 
> a new field to Lisp_Process after all?

If you indeed need the size, then do add it to the process object.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#66020; Package emacs. (Sat, 23 Sep 2023 21:52:01 GMT) Full text and rfc822 format available.

Message #58 received at 66020 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: stefankangas <at> gmail.com, 66020 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca
Subject: Re: bug#66020: (bug#64735 spin-off): regarding the default for
 read-process-output-max
Date: Sun, 24 Sep 2023 00:51:28 +0300

On 21/09/2023 20:33, Dmitry Gutov wrote:
> On 21/09/2023 17:37, Dmitry Gutov wrote:
>> We could look into improving that part specifically: for example, 
>> reading from the process multiple times into 'chars' right away while 
>> there is still pending output present (either looping inside 
>> read_process_output, or calling it in a loop in 
>> wait_reading_process_output, at least until the process' buffered 
>> output is exhausted). That could reduce reactivity, however (can we 
>> find out how much is already buffered in advance, and only loop until 
>> we exhaust that length?)
> 
> Hmm, the naive patch below offers some improvement for the value 4096, 
> but still not comparable to raising the buffer size: 0.76 -> 0.72.
> 
> diff --git a/src/process.c b/src/process.c
> index 2376d0f288d..a550e223f78 100644
> --- a/src/process.c
> +++ b/src/process.c
> @@ -5893,7 +5893,7 @@ wait_reading_process_output (intmax_t time_limit, 
> int nsecs, int read_kbd,
>             && ((fd_callback_info[channel].flags & (KEYBOARD_FD | 
> PROCESS_FD))
>             == PROCESS_FD))
>           {
> -          int nread;
> +          int nread = 0, nnread;
> 
>             /* If waiting for this channel, arrange to return as
>            soon as no more input to be processed.  No more
> @@ -5912,7 +5912,13 @@ wait_reading_process_output (intmax_t time_limit, 
> int nsecs, int read_kbd,
>             /* Read data from the process, starting with our
>            buffered-ahead character if we have one.  */
> 
> -          nread = read_process_output (proc, channel);
> +          do
> +        {
> +          nnread = read_process_output (proc, channel);
> +          nread += nnread;
> +        }
> +          while (nnread >= 4096);
> +
>             if ((!wait_proc || wait_proc == XPROCESS (proc))
>             && got_some_output < nread)
>           got_some_output = nread;
> 
> 
> And "unlocking" the pipe size on the external process takes the 
> performance further up a notch (by default it's much larger): 0.72 -> 0.65.
> 
> diff --git a/src/process.c b/src/process.c
> index 2376d0f288d..85fc1b4d0c8 100644
> --- a/src/process.c
> +++ b/src/process.c
> @@ -2206,10 +2206,10 @@ create_process (Lisp_Object process, char 
> **new_argv, Lisp_Object current_dir)
>         inchannel = p->open_fd[READ_FROM_SUBPROCESS];
>         forkout = p->open_fd[SUBPROCESS_STDOUT];
> 
> -#if (defined (GNU_LINUX) || defined __ANDROID__)    \
> -  && defined (F_SETPIPE_SZ)
> -      fcntl (inchannel, F_SETPIPE_SZ, read_process_output_max);
> -#endif /* (GNU_LINUX || __ANDROID__) && F_SETPIPE_SZ */
> +/* #if (defined (GNU_LINUX) || defined __ANDROID__)    \ */
> +/*   && defined (F_SETPIPE_SZ) */
> +/*       fcntl (inchannel, F_SETPIPE_SZ, read_process_output_max); */
> +/* #endif /\* (GNU_LINUX || __ANDROID__) && F_SETPIPE_SZ *\/ */
>       }
> 
>     if (!NILP (p->stderrproc))
> 
> Apparently the patch from bug#55737 also made things a little worse by 
> default, by limiting concurrency (the external process has to wait while 
> the pipe is blocked, and by default Linux's pipe is larger). Just 
> commenting it out makes performance a little better as well, though not 
> as much as the two patches together.
> 
> Note that both changes above are just PoC (e.g. the hardcoded 4096, and 
> probably other details like carryover).
> 
> I've tried to make a more nuanced loop inside read_process_output 
> instead (as replacement for the first patch above), and so far it 
> performs worse that the baseline. If anyone can see when I'm doing wrong 
> (see attachment), comments are very welcome.

This seems to have been a dead end: while looping does indeed make 
things faster, it doesn't really fit the approach of the 
'adaptive_read_buffering' part that's implemented in read_process_output.

And if the external process is crazy fast (while we, e.g. when using a 
Lisp filter, are not so fast), the result could be much reduced 
interactivity, with this one process keeping us stuck in the loop.

But it seems I've found an answer to one previous question: "can we find 
out how much is already buffered in advance?"

The patch below asks that from the OS (how portable is this? not sure) 
and allocates a larger buffer when more output has been buffered. If we 
keep OS's default value of pipe buffer size (64K on Linux and 16K-ish on 
macOS, according to 
https://unix.stackexchange.com/questions/11946/how-big-is-the-pipe-buffer), 
that means auto-scaling the buffer on Emacs's side depending on how much 
the process outputs. The effect on performance is similar to the 
previous (looping) patch (0.70 -> 0.65), and is comparable to bumping 
read-process-output-max to 65536.

So if we do decide to bump the default, I suppose the below should not 
be necessary. And I don't know whether we should be concerned about 
fragmentation: this way buffers do get allocates in different sizes 
(almost always multiples of 4096, but with rare exceptions among larger 
values).

diff --git a/src/process.c b/src/process.c
index 2376d0f288d..13cf6d6c50d 100644
--- a/src/process.c
+++ b/src/process.c
@@ -6137,7 +6145,18 @@
   specpdl_ref count = SPECPDL_INDEX ();
   Lisp_Object odeactivate;
   char *chars;

+#ifdef USABLE_FIONREAD
+#ifdef DATAGRAM_SOCKETS
+  if (!DATAGRAM_CHAN_P (channel))
+#endif
+    {
+      int available_read;
+      ioctl (p->infd, FIONREAD, &available_read);
+      readmax = MAX (readmax, available_read);
+    }
+#endif
+
   USE_SAFE_ALLOCA;
   chars = SAFE_ALLOCA (sizeof coding->carryover + readmax);

What do people think?

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#66020; Package emacs. (Sat, 23 Sep 2023 22:47:01 GMT) Full text and rfc822 format available.

Message #61 received at 66020 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: 66020 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>,
 Stefan Monnier <monnier <at> IRO.UMontreal.CA>,
 Stefan Kangas <stefankangas <at> gmail.com>
Subject: Re: bug#66020: [PATCH] Reduce GC churn in read_process_output
Date: Sun, 24 Sep 2023 01:45:46 +0300

[Message part 1 (text/plain, inline)]

Here are the previously discussed patches, collected in one place for 
easier review.

As explained, the last one is more experimental, a possible alternative 
to bumping read-process-output-max to 65536.

[0001-Go-around-calling-the-default-process-filter-reducin.patch (text/x-patch, attachment)]

[0002-Remember-the-value-of-read_process_output_max-when-a.patch (text/x-patch, attachment)]

[0003-Dynamically-increase-the-readmax-if-the-pipe-has-mor.patch (text/x-patch, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#66020; Package emacs. (Sun, 24 Sep 2023 05:31:01 GMT) Full text and rfc822 format available.

Message #64 received at 66020 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: Paul Eggert <eggert <at> cs.ucla.edu>, stefankangas <at> gmail.com,
 66020 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca
Subject: Re: bug#66020: (bug#64735 spin-off): regarding the default for
 read-process-output-max
Date: Sun, 24 Sep 2023 08:29:37 +0300

> Date: Sun, 24 Sep 2023 00:51:28 +0300
> From: Dmitry Gutov <dmitry <at> gutov.dev>
> Cc: 66020 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca, stefankangas <at> gmail.com
> 
> But it seems I've found an answer to one previous question: "can we find 
> out how much is already buffered in advance?"
> 
> The patch below asks that from the OS (how portable is this? not sure) 
> and allocates a larger buffer when more output has been buffered. If we 
> keep OS's default value of pipe buffer size (64K on Linux and 16K-ish on 
> macOS, according to 
> https://unix.stackexchange.com/questions/11946/how-big-is-the-pipe-buffer), 
> that means auto-scaling the buffer on Emacs's side depending on how much 
> the process outputs. The effect on performance is similar to the 
> previous (looping) patch (0.70 -> 0.65), and is comparable to bumping 
> read-process-output-max to 65536.
> 
> So if we do decide to bump the default, I suppose the below should not 
> be necessary. And I don't know whether we should be concerned about 
> fragmentation: this way buffers do get allocates in different sizes 
> (almost always multiples of 4096, but with rare exceptions among larger 
> values).
> 
> diff --git a/src/process.c b/src/process.c
> index 2376d0f288d..13cf6d6c50d 100644
> --- a/src/process.c
> +++ b/src/process.c
> @@ -6137,7 +6145,18 @@
>     specpdl_ref count = SPECPDL_INDEX ();
>     Lisp_Object odeactivate;
>     char *chars;
> 
> +#ifdef USABLE_FIONREAD
> +#ifdef DATAGRAM_SOCKETS
> +  if (!DATAGRAM_CHAN_P (channel))
> +#endif
> +    {
> +      int available_read;
> +      ioctl (p->infd, FIONREAD, &available_read);
> +      readmax = MAX (readmax, available_read);
> +    }
> +#endif
> +
>     USE_SAFE_ALLOCA;
>     chars = SAFE_ALLOCA (sizeof coding->carryover + readmax);
> 
> What do people think?

I think we should increase the default size, and the rest (querying
the system about the pipe size) looks like an unnecessary complication
to me.

I've added Paul Eggert to this discussion, as I'd like to hear his
opinions about this stuff.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#66020; Package emacs. (Sun, 26 May 2024 15:21:02 GMT) Full text and rfc822 format available.

Message #67 received at 66020 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Eli Zaretskii <eliz <at> gnu.org>, Paul Eggert <eggert <at> cs.ucla.edu>
Cc: stefankangas <at> gmail.com, 66020 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca
Subject: Re: bug#66020: (bug#64735 spin-off): regarding the default for
 read-process-output-max
Date: Sun, 26 May 2024 18:20:15 +0300

Hi Paul and Eli,

On 24/09/2023 08:29, Eli Zaretskii wrote:

>> What do people think?
> I think we should increase the default size, and the rest (querying
> the system about the pipe size) looks like an unnecessary complication
> to me.
> 
> I've added Paul Eggert to this discussion, as I'd like to hear his
> opinions about this stuff.

Do you think we can get this in before the emacs-30 branch is cut?

To summarize:

* Patch 1 (reduces consing when the default filter is used by moving it 
into C - skipping the creation of Lisp strings).
* Patch 2
  a) It ensures that the dynamic binding of read-process-output-max is 
saved when the process it created and used for its processing - ensuring 
that the current dynamic value is not just used when creating the pipe, 
but also later when reading from it.
  b) A few lines are outdated: part of the fix went in with bug#66288.
* Patch 3 can be replaced with just upping the default value of 
read-process-output-max to 65536.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#66020; Package emacs. (Sun, 26 May 2024 16:03:02 GMT) Full text and rfc822 format available.

Message #70 received at 66020 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: eggert <at> cs.ucla.edu, Dmitry Gutov <dmitry <at> gutov.dev>
Cc: stefankangas <at> gmail.com, 66020 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca
Subject: Re: bug#66020: (bug#64735 spin-off): regarding the default for
 read-process-output-max
Date: Sun, 26 May 2024 19:01:44 +0300

> Date: Sun, 26 May 2024 18:20:15 +0300
> Cc: 66020 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca, stefankangas <at> gmail.com
> From: Dmitry Gutov <dmitry <at> gutov.dev>
> 
> Hi Paul and Eli,
> 
> On 24/09/2023 08:29, Eli Zaretskii wrote:
> 
> >> What do people think?
> > I think we should increase the default size, and the rest (querying
> > the system about the pipe size) looks like an unnecessary complication
> > to me.
> > 
> > I've added Paul Eggert to this discussion, as I'd like to hear his
> > opinions about this stuff.
> 
> Do you think we can get this in before the emacs-30 branch is cut?

I'll try to recollect the discussion and review the patches one of
these days.

Paul, your input (as well as that of everybody else on the CC list)
will be most welcome.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#66020; Package emacs. (Sun, 26 May 2024 23:29:02 GMT) Full text and rfc822 format available.

Message #73 received at 66020 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Kangas <stefankangas <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>, eggert <at> cs.ucla.edu,
 Dmitry Gutov <dmitry <at> gutov.dev>
Cc: 66020 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca
Subject: Re: bug#66020: (bug#64735 spin-off): regarding the default for
 read-process-output-max
Date: Sun, 26 May 2024 16:27:19 -0700

Eli Zaretskii <eliz <at> gnu.org> writes:

> I'll try to recollect the discussion and review the patches one of
> these days.
>
> Paul, your input (as well as that of everybody else on the CC list)
> will be most welcome.

FWIW, I'd be in favor of raising `read-process-output-max' to something
like 40960 (as Eli suggested in this thread), or perhaps some power of 2
close to that like 32768 or 65536.

This is based on it being seemingly faster in the benchmarks in this
thread, and me having used that locally for 2-3 years and noting no
adverse effects.

See also the discussion here:
https://lists.gnu.org/r/emacs-devel/2021-03/msg01461.html

(I didn't review patch 1 and 2, so no opinion on those.)

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#66020; Package emacs. (Sat, 08 Jun 2024 12:13:02 GMT) Full text and rfc822 format available.

Message #76 received at 66020 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stefan Kangas <stefankangas <at> gmail.com>, dmitry <at> gutov.dev
Cc: eggert <at> cs.ucla.edu, 66020 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca
Subject: Re: bug#66020: (bug#64735 spin-off): regarding the default for
 read-process-output-max
Date: Sat, 08 Jun 2024 15:11:44 +0300

> From: Stefan Kangas <stefankangas <at> gmail.com>
> Date: Sun, 26 May 2024 16:27:19 -0700
> Cc: 66020 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> > I'll try to recollect the discussion and review the patches one of
> > these days.
> >
> > Paul, your input (as well as that of everybody else on the CC list)
> > will be most welcome.
> 
> FWIW, I'd be in favor of raising `read-process-output-max' to something
> like 40960 (as Eli suggested in this thread), or perhaps some power of 2
> close to that like 32768 or 65536.
> 
> This is based on it being seemingly faster in the benchmarks in this
> thread, and me having used that locally for 2-3 years and noting no
> adverse effects.
> 
> See also the discussion here:
> https://lists.gnu.org/r/emacs-devel/2021-03/msg01461.html
> 
> (I didn't review patch 1 and 2, so no opinion on those.)

I guess we can install all 3 patches and see if anything breaks.

Reply sent to Dmitry Gutov <dmitry <at> gutov.dev>:
You have taken responsibility. (Sun, 09 Jun 2024 01:41:01 GMT) Full text and rfc822 format available.

Notification sent to Dmitry Gutov <dmitry <at> gutov.dev>:
bug acknowledged by developer. (Sun, 09 Jun 2024 01:41:02 GMT) Full text and rfc822 format available.

Message #81 received at 66020-done <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Eli Zaretskii <eliz <at> gnu.org>, Stefan Kangas <stefankangas <at> gmail.com>
Cc: eggert <at> cs.ucla.edu, monnier <at> iro.umontreal.ca, 66020-done <at> debbugs.gnu.org
Subject: Re: bug#66020: (bug#64735 spin-off): regarding the default for
 read-process-output-max
Date: Sun, 9 Jun 2024 03:12:16 +0300

On 08/06/2024 15:11, Eli Zaretskii wrote:
>> From: Stefan Kangas<stefankangas <at> gmail.com>
>> Date: Sun, 26 May 2024 16:27:19 -0700
>> Cc:66020 <at> debbugs.gnu.org,monnier <at> iro.umontreal.ca
>>
>> Eli Zaretskii<eliz <at> gnu.org>  writes:
>>
>>> I'll try to recollect the discussion and review the patches one of
>>> these days.
>>>
>>> Paul, your input (as well as that of everybody else on the CC list)
>>> will be most welcome.
>> FWIW, I'd be in favor of raising `read-process-output-max' to something
>> like 40960 (as Eli suggested in this thread), or perhaps some power of 2
>> close to that like 32768 or 65536.
>>
>> This is based on it being seemingly faster in the benchmarks in this
>> thread, and me having used that locally for 2-3 years and noting no
>> adverse effects.
>>
>> See also the discussion here:
>> https://lists.gnu.org/r/emacs-devel/2021-03/msg01461.html
>>
>> (I didn't review patch 1 and 2, so no opinion on those.)
> I guess we can install all 3 patches and see if anything breaks.

Thank you, now pushed to master.

For read-process-output-max, I chose the higher of the powers of two, 
but please feel free to amend.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#66020; Package emacs. (Tue, 11 Jun 2024 16:02:01 GMT) Full text and rfc822 format available.

Message #84 received at 66020-done <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Eli Zaretskii <eliz <at> gnu.org>, Stefan Kangas <stefankangas <at> gmail.com>
Cc: eggert <at> cs.ucla.edu, monnier <at> iro.umontreal.ca, 66020-done <at> debbugs.gnu.org
Subject: Re: bug#66020: (bug#64735 spin-off): regarding the default for
 read-process-output-max
Date: Tue, 11 Jun 2024 06:12:21 +0300

[Message part 1 (text/plain, inline)]

On 09/06/2024 03:12, Dmitry Gutov wrote:
>>>
>> I guess we can install all 3 patches and see if anything breaks.
> 
> Thank you, now pushed to master.

On the heels of bug#71452, I've pushed three updates.

Here's a third one (hopefully last) which I'd like a second opinion on. 
Process output -- apparently -- needs to be inserted before markers.

Is it okay to make adjust_markers_for_insert non-static and call it 
here? See attached.

[read_and_insert_process_output_before_markers.diff (text/x-patch, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#66020; Package emacs. (Tue, 11 Jun 2024 16:02:02 GMT) Full text and rfc822 format available.

Message #87 received at 66020-done <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 66020-done <at> debbugs.gnu.org, eggert <at> cs.ucla.edu, stefankangas <at> gmail.com,
 monnier <at> iro.umontreal.ca
Subject: Re: bug#66020: (bug#64735 spin-off): regarding the default for
 read-process-output-max
Date: Tue, 11 Jun 2024 14:41:07 +0300

On 11/06/2024 09:51, Eli Zaretskii wrote:
>> Date: Tue, 11 Jun 2024 06:12:21 +0300
>> From: Dmitry Gutov<dmitry <at> gutov.dev>
>> Cc:eggert <at> cs.ucla.edu,monnier <at> iro.umontreal.ca,66020-done <at> debbugs.gnu.org
>>
>> Is it okay to make adjust_markers_for_insert non-static and call it
>> here? See attached.
> Technically, this is okay, but I'd like to hear from Stefan about
> whether it's correct to insert process output before the markers.

FWIW, internal-default-process-filter calls 
insert_from_string_before_markers.

My goal is just to maintain compatibility.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#66020; Package emacs. (Tue, 11 Jun 2024 16:34:01 GMT) Full text and rfc822 format available.

Message #90 received at 66020-done <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: Dmitry Gutov <dmitry <at> gutov.dev>, eggert <at> cs.ucla.edu, stefankangas <at> gmail.com,
 66020-done <at> debbugs.gnu.org
Subject: Re: bug#66020: (bug#64735 spin-off): regarding the default for
 read-process-output-max
Date: Tue, 11 Jun 2024 08:55:53 -0400

> Technically, this is okay, but I'd like to hear from Stefan about
> whether it's correct to insert process output before the markers.

AFAIK, when the insertion is done by the default process filter, it's
indeed done "before the markers".  I have no idea why it's done this
way and would have preferred it to be different in many cases, but it's
been that way forever and changing it would inevitably break some code.


        Stefan

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#66020; Package emacs. (Tue, 11 Jun 2024 17:15:02 GMT) Full text and rfc822 format available.

Message #93 received at 66020-done <at> debbugs.gnu.org (full text, mbox):

From: Ihor Radchenko <yantar92 <at> posteo.net>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Dmitry Gutov <dmitry <at> gutov.dev>, Eli Zaretskii <eliz <at> gnu.org>,
 eggert <at> cs.ucla.edu, stefankangas <at> gmail.com, 66020-done <at> debbugs.gnu.org
Subject: Re: bug#66020: (bug#64735 spin-off): regarding the default for
 read-process-output-max
Date: Tue, 11 Jun 2024 17:15:46 +0000

Stefan Monnier via "Bug reports for GNU Emacs, the Swiss army knife of
text editors" <bug-gnu-emacs <at> gnu.org> writes:

>> Technically, this is okay, but I'd like to hear from Stefan about
>> whether it's correct to insert process output before the markers.
>
> AFAIK, when the insertion is done by the default process filter, it's
> indeed done "before the markers".  I have no idea why it's done this
> way and would have preferred it to be different in many cases, but it's
> been that way forever and changing it would inevitably break some code.

FYI, I just upgraded to the latest master, and observed some lines in
*Async-native-compile-log* inserted not at the end, but on the second
line before last.

Is it something others also see?

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#66020; Package emacs. (Tue, 11 Jun 2024 18:10:02 GMT) Full text and rfc822 format available.

Message #96 received at 66020-done <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Ihor Radchenko <yantar92 <at> posteo.net>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Eli Zaretskii <eliz <at> gnu.org>, eggert <at> cs.ucla.edu, stefankangas <at> gmail.com,
 66020-done <at> debbugs.gnu.org
Subject: Re: bug#66020: (bug#64735 spin-off): regarding the default for
 read-process-output-max
Date: Tue, 11 Jun 2024 21:09:05 +0300

On 11/06/2024 20:15, Ihor Radchenko wrote:
> Stefan Monnier via "Bug reports for GNU Emacs, the Swiss army knife of
> text editors"<bug-gnu-emacs <at> gnu.org>  writes:
> 
>>> Technically, this is okay, but I'd like to hear from Stefan about
>>> whether it's correct to insert process output before the markers.
>> AFAIK, when the insertion is done by the default process filter, it's
>> indeed done "before the markers".  I have no idea why it's done this
>> way and would have preferred it to be different in many cases, but it's
>> been that way forever and changing it would inevitably break some code.
> FYI, I just upgraded to the latest master, and observed some lines in
> *Async-native-compile-log*  inserted not at the end, but on the second
> line before last.
> 
> Is it something others also see?

I have now pushed the proposed patch (with "before markers" behavior).

Could you rebuild and see whether the async-native-compile scenario 
improves?

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#66020; Package emacs. (Tue, 11 Jun 2024 19:33:02 GMT) Full text and rfc822 format available.

Message #99 received at 66020-done <at> debbugs.gnu.org (full text, mbox):

From: Ihor Radchenko <yantar92 <at> posteo.net>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: 66020-done <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>,
 eggert <at> cs.ucla.edu, Stefan Monnier <monnier <at> iro.umontreal.ca>,
 stefankangas <at> gmail.com
Subject: Re: bug#66020: (bug#64735 spin-off): regarding the default for
 read-process-output-max
Date: Tue, 11 Jun 2024 19:33:44 +0000

Dmitry Gutov <dmitry <at> gutov.dev> writes:

>> FYI, I just upgraded to the latest master, and observed some lines in
>> *Async-native-compile-log*  inserted not at the end, but on the second
>> line before last.
>> 
>> Is it something others also see?
>
> I have now pushed the proposed patch (with "before markers" behavior).
>
> Could you rebuild and see whether the async-native-compile scenario 
> improves?

Back to normal now.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#66020; Package emacs. (Tue, 11 Jun 2024 20:01:02 GMT) Full text and rfc822 format available.

Message #102 received at 66020-done <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Ihor Radchenko <yantar92 <at> posteo.net>
Cc: 66020-done <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>,
 eggert <at> cs.ucla.edu, Stefan Monnier <monnier <at> iro.umontreal.ca>,
 stefankangas <at> gmail.com
Subject: Re: bug#66020: (bug#64735 spin-off): regarding the default for
 read-process-output-max
Date: Tue, 11 Jun 2024 23:00:30 +0300

On 11/06/2024 22:33, Ihor Radchenko wrote:
>>> FYI, I just upgraded to the latest master, and observed some lines in
>>> *Async-native-compile-log*   inserted not at the end, but on the second
>>> line before last.
>>>
>>> Is it something others also see?
>> I have now pushed the proposed patch (with "before markers" behavior).
>>
>> Could you rebuild and see whether the async-native-compile scenario
>> improves?
> Back to normal now.

Perfect, thanks for checking.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#66020; Package emacs. (Tue, 11 Jun 2024 20:25:02 GMT) Full text and rfc822 format available.

Message #105 received at 66020 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: dmitry <at> gutov.dev, eggert <at> cs.ucla.edu, stefankangas <at> gmail.com,
 66020 <at> debbugs.gnu.org
Subject: Re: bug#66020: (bug#64735 spin-off): regarding the default for
 read-process-output-max
Date: Tue, 11 Jun 2024 16:06:05 +0300

> From: Stefan Monnier <monnier <at> iro.umontreal.ca>
> Cc: Dmitry Gutov <dmitry <at> gutov.dev>,  stefankangas <at> gmail.com,
>   eggert <at> cs.ucla.edu,  66020-done <at> debbugs.gnu.org
> Date: Tue, 11 Jun 2024 08:55:53 -0400
> 
> > Technically, this is okay, but I'd like to hear from Stefan about
> > whether it's correct to insert process output before the markers.
> 
> AFAIK, when the insertion is done by the default process filter, it's
> indeed done "before the markers".  I have no idea why it's done this
> way and would have preferred it to be different in many cases, but it's
> been that way forever and changing it would inevitably break some code.

OK, thanks.  I guess compatibility wins here.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#66020; Package emacs. (Tue, 11 Jun 2024 20:42:07 GMT) Full text and rfc822 format available.

Message #108 received at 66020-done <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: 66020-done <at> debbugs.gnu.org, eggert <at> cs.ucla.edu, stefankangas <at> gmail.com,
 monnier <at> iro.umontreal.ca
Subject: Re: bug#66020: (bug#64735 spin-off): regarding the default for
 read-process-output-max
Date: Tue, 11 Jun 2024 09:51:51 +0300

> Date: Tue, 11 Jun 2024 06:12:21 +0300
> From: Dmitry Gutov <dmitry <at> gutov.dev>
> Cc: eggert <at> cs.ucla.edu, monnier <at> iro.umontreal.ca, 66020-done <at> debbugs.gnu.org
> 
> Is it okay to make adjust_markers_for_insert non-static and call it 
> here? See attached.

Technically, this is okay, but I'd like to hear from Stefan about
whether it's correct to insert process output before the markers.

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Wed, 10 Jul 2024 11:24:09 GMT) Full text and rfc822 format available.

This bug report was last modified 1 year ago.

Previous Next

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.

GNU bug report logs - #66020 [PATCH] Reduce GC churn in read_process_output

GNU bug report logs - #66020
[PATCH] Reduce GC churn in read_process_output