GNU bug report logs - #75574
Adaptive read buffering is a pessimization
To reply to this bug, email your comments to 75574 AT debbugs.gnu.org.
Report forwarded to bug-gnu-emacs <at> gnu.org: bug#75574; Package emacs. (Wed, 15 Jan 2025 05:43:02 GMT)
Acknowledgement sent to Daniel Colascione <dancol <at> dancol.org>: New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Wed, 15 Jan 2025 05:43:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
The adaptive read buffering code delays reads in the hope that we can read in buffer chunks if we wait a little bit between reads from a process producing a lot of output. Sounds good. Doesn't work. The attempted optimization reduces performance in various scenarios and causes an 8x regression in performance for me in flows involving mixtures of big and small reads.
With adaptive reading, we increase the read delay every time we get a short read and decrease it when we get a full buffer of data or do a write. The problem is that 1) there are legitimate flows involving long sequences of reads without an intervening write and 2) reads (especially from PTYs) may *never* report a full buffer because the kernel limits the maximum read size no matter how big the backlog is. (For example, the Darwin kernel limits PTY (and presumably TTY in general?) reads to 1024 bytes, but the default Emacs read size is 64k, so we never recognize a signal that we should reduce the read delay.)
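The ratchet described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the code in Emacs: the names `next_delay_ms` and `simulate_delay_ms`, the step and cap constants, and the definition of a "short read" as anything smaller than the buffer are all hypothetical. It shows that when the kernel caps each read below the buffer size, the delay climbs to its maximum and never comes back down, however large the backlog is.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical constants for illustration; not the values Emacs uses.  */
enum { DELAY_STEP_MS = 20, DELAY_MAX_MS = 100 };

/* Toy model of the policy: a full buffer lowers the delay, anything
   shorter raises it, up to a cap.  */
static int
next_delay_ms (int delay_ms, size_t nread, size_t bufsize)
{
  assert (bufsize > 0 && nread <= bufsize);
  if (nread == bufsize)
    return delay_ms >= DELAY_STEP_MS ? delay_ms - DELAY_STEP_MS : 0;
  return delay_ms + DELAY_STEP_MS <= DELAY_MAX_MS
         ? delay_ms + DELAY_STEP_MS : DELAY_MAX_MS;
}

/* Simulate up to NREADS reads from a source with BACKLOG bytes pending
   whose kernel caps each read at KERNEL_CAP bytes.  Returns the delay
   in effect after the last read.  */
static int
simulate_delay_ms (size_t backlog, size_t kernel_cap, size_t bufsize,
                   int nreads)
{
  int delay_ms = 0;
  for (int i = 0; i < nreads && backlog > 0; i++)
    {
      size_t nread = backlog < bufsize ? backlog : bufsize;
      if (nread > kernel_cap)
        nread = kernel_cap;
      backlog -= nread;
      delay_ms = next_delay_ms (delay_ms, nread, bufsize);
    }
  return delay_ms;
}
```

With a 1 MiB backlog, a 64k buffer, and a 1024-byte kernel cap, every read in this model counts as short, so the delay saturates even though the producer is clearly in a bulk flow.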
I'd suggest just deleting the feature. It's not worth the complexity and edge cases, IMHO.
If that's not an option, I'd suggest detecting bulk flows by doing a zero timeout select() after we're tempted to increase the delay and actually increasing the delay only when that select times out.
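A minimal sketch of that check, assuming a POSIX system. The function names `data_pending_now` and `should_increase_delay` are hypothetical and do not come from the Emacs sources; only `select()` and the `fd_set` macros are real APIs.

```c
#include <stdbool.h>
#include <sys/select.h>
#include <unistd.h>

/* Return true if FD has data readable right now.  A zero timeout makes
   select() poll and return immediately instead of blocking.  */
static bool
data_pending_now (int fd)
{
  fd_set readfds;
  struct timeval zero = { 0, 0 };

  FD_ZERO (&readfds);
  FD_SET (fd, &readfds);
  /* select() returns 0 on timeout, meaning nothing is ready.  An error
     (-1) is also treated as "nothing pending" in this sketch.  */
  return select (fd + 1, &readfds, NULL, NULL, &zero) > 0;
}

/* Hypothetical policy hook: after a short read, raise the delay only
   when the poll times out, i.e. the producer has gone quiet.  If more
   data is already waiting, we are in a bulk flow and should read again
   without adding delay.  */
static bool
should_increase_delay (int fd)
{
  return !data_pending_now (fd);
}
```

The point of the design is that the decision no longer depends on guessing the kernel's maximum read size: it asks the kernel directly whether a backlog exists.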
Just tweaking the maximum read size probably isn't a good idea: it's an implementation detail and can change with time and over the types of FD from which we read.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#75574; Package emacs. (Wed, 15 Jan 2025 15:00:02 GMT)
Message #8 received at 75574 <at> debbugs.gnu.org (full text, mbox):
> Date: Tue, 14 Jan 2025 21:42:13 -0800
> From: Daniel Colascione <dancol <at> dancol.org>
>
> The adaptive read buffering code delays reads in the hope that we can read in buffer chunks if we wait a
> little bit between reads from a process producing a lot of output. Sounds good. Doesn't work. The attempted
> optimization reduces performance in various scenarios and causes an 8x regression in performance for me
> in flows involving mixtures of big and small reads.
>
> With adaptive reading, we increase the read delay every time we get a short read and decrease it when we
> get a full buffer of data or do a write. The problem is 1) that there are legitimate flows involving long
> sequences of reads without an intervening write and 2) reads (especially from PTYs) may *never* report a
> full buffer because the kernel limits the maximum read size no matter how big the backlog is. (For example,
> the Darwin kernel limits PTY (and presumably TTY in general?) reads to 1024 bytes, but the default Emacs
> read size is 64k, so we never recognize a signal that we should reduce the read delay.)
>
> I'd suggest just deleting the feature. It's not worth the complexity and edge cases, IMHO.
>
> If that's not an option, I'd suggest detecting bulk flows by doing a zero timeout select() after we're tempted to
> increase the delay and actually increasing the delay only when that select times out.
>
> Just tweaking the maximum read size probably isn't a good idea: it's an implementation detail and can
> change with time and over the types of FD from which we read.
AFAICS, this feature can be disabled by setting
process-adaptive-read-buffering to the nil value, either globally or
let-binding it around start-process. Does that solve your problems,
and if so, can we conclude that Emacs allows both using the feature
and disabling it?
AFAIR, in some situations this was really useful, otherwise we wouldn't
have set it to t by default.
Or maybe we should introduce a new special value of
process-adaptive-read-buffering, which would be equivalent to nil when
reading from PTYs, if that case can never benefit from these delays?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#75574; Package emacs. (Wed, 15 Jan 2025 15:38:02 GMT)
Message #11 received at 75574 <at> debbugs.gnu.org (full text, mbox):
On January 15, 2025 6:59:12 AM PST, Eli Zaretskii <eliz <at> gnu.org> wrote:
>AFAICS, this feature can be disabled by setting
>process-adaptive-read-buffering to the nil value, either globally or
>let-binding it around start-process. Does that solve your problems,
>and if so, can we conclude that Emacs allows both using the feature
>and disabling it?
Well, I mean, Emacs is free software, so in theory users can disable any bug with enough elbow grease, even ones we don't know about. I have the radical idea that it shouldn't be slow by default and that users shouldn't have to manually disable known bugs.
>AFAIR, in some situations this was really useful, otherwise we wouldn't
>have set it to t by default.
Plenty of questionable things sound like good ideas at the time.
>Or maybe we should introduce a new special value of
>process-adaptive-read-buffering, which would be equivalent to nil when
>reading from PTYs, if that case can never benefit from these delays?
Even if you were to limit adaptive reading to pipes and sockets, you'd still have the problem of trying to guess the maximum read size. It's just broken. I don't see a reason to continue to maintain the complexity. What are we saving? A few context switches, maybe? Machines are fast enough that we don't need to make a queue form for the sake of batching.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#75574; Package emacs. (Wed, 15 Jan 2025 15:57:02 GMT)
Message #14 received at 75574 <at> debbugs.gnu.org (full text, mbox):
> Date: Wed, 15 Jan 2025 07:37:10 -0800
> From: Daniel Colascione <dancol <at> dancol.org>
> CC: 75574 <at> debbugs.gnu.org
>
> On January 15, 2025 6:59:12 AM PST, Eli Zaretskii <eliz <at> gnu.org> wrote:
> >AFAICS, this feature can be disabled by setting
> >process-adaptive-read-buffering to the nil value, either globally or
> >let-binding it around start-process. Does that solve your problems,
> >and if so, can we conclude that Emacs allows both using the feature
> >and disabling it?
>
> Well, I mean, Emacs is free software, so in theory users can disable any bug with enough elbow grease, even ones we don't know about. I have the radical idea that it shouldn't be slow by default and that users shouldn't have to manually disable known bugs.
>
> >AFAIR, in some situations this was really useful, otherwise we wouldn't
> >have set it to t by default.
>
> Plenty of questionable things sound like good ideas at the time.
It wasn't just a good idea, though. It actually helped fix some use
cases.
> >Or maybe we should introduce a new special value of
> >process-adaptive-read-buffering, which would be equivalent to nil when
> >reading from PTYs, if that case can never benefit from these delays?
>
> Even if you were to limit adaptive reading to pipes and sockets, you'd still have the problem of trying to guess the maximum read size. It's just broken. I don't see a reason to continue to maintain the complexity. What are we saving? A few context switches, maybe? Machines are fast enough that we don't need to make a queue form for the sake of batching.
I see your point.
However, this area is really tricky, and supports many different usage
patterns. There are Lisp programs that read relatively small chunks
of input from a subprocess, and then there are other programs which
read huge amounts of data. I wouldn't risk deciding we have all of
them figured out and can conclude that none of them needs this.
You can try searching the bug tracker for
process-adaptive-read-buffering; it will bring up several hits.
We could begin by setting this variable to nil by default and see if
someone complains. If enough time has passed without any complaints,
we could then consider actually removing the feature.