GNU bug report logs - #53041
29.0.50; TRAMP spins the CPU by polling the child processes without a delay

Package: emacs

Reported by: Dima Kogan <dima <at> secretsauce.net>

Date: Wed, 5 Jan 2022 23:04:02 UTC

Severity: normal

Tags: wontfix

Found in version 29.0.50

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 53041 in the body.
You can then email your comments to 53041 AT debbugs.gnu.org in the normal way.



Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#53041; Package emacs. (Wed, 05 Jan 2022 23:04:02 GMT)

Acknowledgement sent to Dima Kogan <dima <at> secretsauce.net>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Wed, 05 Jan 2022 23:04:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Dima Kogan <dima <at> secretsauce.net>
To: bug-gnu-emacs <at> gnu.org
Subject: 29.0.50; TRAMP spins the CPU by polling the child processes without
 a delay
Date: Wed, 05 Jan 2022 15:03:48 -0800

Hi. I use TRAMP regularly, and I often see it redline my CPU, which
shouldn't be happening.

The cause in all cases I've seen is TRAMP expecting some output from the
child process, and looking for this output in a delay-less loop. For
instance (tramp-process-one-action) looks like this:

  (defun tramp-process-one-action (proc vec actions)
      ....
    (while (not found)
      (while (tramp-accept-process-output proc 0))
      .... )

The (while (tramp-accept-process-output proc 0)) form does

  Read all available data; returns immediately if none is available

So here we spin the CPU until there's some data to look at AND until the
incoming data meets some condition we're looking for. In order to not
spin, at least one of the (tramp-accept-process-output) calls needs to
block. The simplest thing to do to fix this is to replace

  (while (tramp-accept-process-output proc 0))

with

  (tramp-accept-process-output proc nil)

Here we block until we get SOME data back. I think this is probably
good-enough, since the outer loop will get more data, if it's needed. If
we really want to replace the original logic with blocking, we can do
this instead:

  (let (timeout)
    (while
        (prog1
            (tramp-accept-process-output proc timeout)
          (setq timeout 0))))

Either one of these makes most of these issues disappear. There are more
places in the code where we call (tramp-accept-process-output ... 0),
and I think they're all wrong: we should always block. I can send a
patch, but let's agree on the approach first. My preference is to
replace all the (while (tramp-accept-process-output proc 0)) with
(tramp-accept-process-output proc nil) unless there's a specific reason
not to.
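
As an illustration of the trade-off described above, here is a minimal sketch of a wait loop that blocks for a short time slice on each pass, so it neither spins (timeout 0) nor waits indefinitely (timeout nil). It is not Tramp's code: the function name my/wait-for-output, the 0.1 second slice, and the 10 second default deadline are all assumptions chosen for the example, and it calls plain accept-process-output rather than tramp-accept-process-output.

```elisp
(defun my/wait-for-output (proc regexp &optional timeout)
  "Wait until the buffer of PROC contains a match for REGEXP.
Return non-nil on a match, nil if PROC exits or TIMEOUT seconds
\(default 10) pass first.  Hypothetical helper, not part of Tramp."
  (let ((deadline (+ (float-time) (or timeout 10)))
        found)
    (with-current-buffer (process-buffer proc)
      (while (and
              ;; Check the buffer first, so output that arrived
              ;; before the process exited is still noticed.
              (not (setq found
                         (save-excursion
                           (goto-char (point-min))
                           (re-search-forward regexp nil t))))
              (process-live-p proc)
              (< (float-time) deadline))
        ;; Block for up to 0.1 s waiting for output from PROC only.
        ;; A timeout of 0 here would return immediately and turn
        ;; this loop into a busy wait.
        (accept-process-output proc 0.1 nil t)))
    found))
```

Called as (my/wait-for-output proc "ready" 5), this sleeps in the event loop between reads, and the deadline bounds how long a dead connection can hold up the caller.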

One easy way to reproduce one such behavior:

1. Start up emacs
2. open /ssh:SERVER:FILE
3. Break the network connection (I'm on a laptop. Leaving the wifi area
   is enough)
4. Try to type into the buffer visiting FILE
5. See emacs block the user while spinning the CPU.

Thanks




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#53041; Package emacs. (Sun, 09 Jan 2022 13:47:01 GMT)

Message #8 received at 53041 <at> debbugs.gnu.org:

From: Michael Albinus <michael.albinus <at> gmx.de>
To: Dima Kogan <dima <at> secretsauce.net>
Cc: 53041 <at> debbugs.gnu.org
Subject: Re: bug#53041: 29.0.50; TRAMP spins the CPU by polling the child
 processes without a delay
Date: Sun, 09 Jan 2022 14:46:04 +0100

Dima Kogan <dima <at> secretsauce.net> writes:

> Hi.

Hi Dima,

> I use TRAMP regularly, and I often see it redline my CPU, which
> shouldn't be happening.
>
> The cause in all cases I've seen is TRAMP expecting some output from the
> child process, and looking for this output in a delay-less loop. For
> instance (tramp-process-one-action) looks like this:
>
>   (defun tramp-process-one-action (proc vec actions)
>       ....
>     (while (not found)
>       (while (tramp-accept-process-output proc 0))
>       .... )
>
> The (while (tramp-accept-process-output proc 0)) form does
>
>   Read all available data; returns immediately if none is available
>
> So here we spin the CPU until there's some data to look at AND until the
> incoming data meets some condition we're looking for. In order to not
> spin, at least one of the (tramp-accept-process-output) calls needs to
> block. The simplest thing to do to fix this is to replace
>
>   (while (tramp-accept-process-output proc 0))
>
> with
>
>   (tramp-accept-process-output proc nil)
>
> Here we block until we get SOME data back. I think this is probably
> good-enough, since the outer loop will get more data, if it's needed. If
> we really want to replace the original logic with blocking, we can do
> this instead:
>
>   (let (timeout)
>     (while
>         (prog1
>             (tramp-accept-process-output proc timeout)
>           (setq timeout 0))))
>
> Either one of these makes most of these issues disappear. There are more
> places in the code where we call (tramp-accept-process-output ... 0),
> and I think they're all wrong: we should always block. I can send a
> patch, but let's agree on the approach first. My preference is to
> replace all the (while (tramp-accept-process-output proc 0)) with
> (tramp-accept-process-output proc nil) unless there's a specific reason
> not to.
>
> One easy way to reproduce one such behavior:
>
> 1. Start up emacs
> 2. open /ssh:SERVER:FILE
> 3. Break the network connection (I'm on a laptop. Leaving the wifi area
>    is enough)
> 4. Try to type into the buffer visiting FILE
> 5. See emacs block the user while spinning the CPU.

This was discussed several times already. The most recent discussion wrt
Tramp starts at <https://lists.gnu.org/archive/html/emacs-devel/2019-01/msg00301.html>.

The pattern (while (accept-process-output p)) was proposed by Stefan
Monnier in <https://lists.gnu.org/archive/html/emacs-devel/2019-01/msg00338.html>,
so this is used in Tramp. I do not want to reopen this can of worms, really.

To fix your problem of a broken connection, the Tramp manual recommends
adding "ServerAliveInterval 5" to your ~/.ssh/config; see
(info "(tramp) Frequently Asked Questions"). Additionally, you might set
"ServerAliveCountMax 2".
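
For reference, the two settings named above could be written in ~/.ssh/config roughly as follows. The "Host *" pattern is an assumption made for this sketch; a narrower pattern that matches only the hosts accessed through Tramp would work as well.

```
# Applies to every host; narrow the pattern if preferred (assumption).
Host *
    # Send a keepalive probe after 5 seconds without data from the server.
    ServerAliveInterval 5
    # Give up after 2 unanswered probes.
    ServerAliveCountMax 2
```

With these values, ssh should drop an unresponsive connection after roughly 10 seconds (5 seconds times 2 probes), at which point the remote process exits and the wait on the Emacs side can end.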

> Thanks

Best regards, Michael.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#53041; Package emacs. (Fri, 14 Jan 2022 08:14:01 GMT)

Message #11 received at 53041 <at> debbugs.gnu.org:

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Michael Albinus <michael.albinus <at> gmx.de>
Cc: Dima Kogan <dima <at> secretsauce.net>, 53041 <at> debbugs.gnu.org
Subject: Re: bug#53041: 29.0.50; TRAMP spins the CPU by polling the child
 processes without a delay
Date: Fri, 14 Jan 2022 09:13:12 +0100

Michael Albinus <michael.albinus <at> gmx.de> writes:

> The pattern (while (accept-process-output p)) was proposed by Stefan
> Monnier in
> <https://lists.gnu.org/archive/html/emacs-devel/2019-01/msg00338.html>,
> so this is used in Tramp. I do not want to reopen this can of worms, really.
>
> To fix your problem of a broken connection, the Tramp manual recommends
> to add "ServerAliveInterval 5" in your ~/.ssh/config, see (info "(tramp)
> Frequently Asked Questions") . Additionally, you might set
> "ServerAliveCountMax 2".

If I understand correctly, this means that we won't be doing anything
further in this bug report, and I'm therefore closing it.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Added tag(s) wontfix. Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Fri, 14 Jan 2022 08:14:02 GMT)

Bug closed; send any further explanations to 53041 <at> debbugs.gnu.org and Dima Kogan <dima <at> secretsauce.net>. Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Fri, 14 Jan 2022 08:14:02 GMT)

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#53041; Package emacs. (Fri, 14 Jan 2022 18:35:02 GMT)

Message #18 received at 53041 <at> debbugs.gnu.org:

From: Dima Kogan <dima <at> secretsauce.net>
To: Michael Albinus <michael.albinus <at> gmx.de>
Cc: 53041 <at> debbugs.gnu.org
Subject: Re: bug#53041: 29.0.50; TRAMP spins the CPU by polling the child
 processes without a delay
Date: Fri, 14 Jan 2022 10:33:47 -0800

Michael Albinus <michael.albinus <at> gmx.de> writes:

> This was discussed several times already. The most recent discussion wrt
> Tramp starts at <https://lists.gnu.org/archive/html/emacs-devel/2019-01/msg00301.html>.
>
> The pattern (while (accept-process-output p)) was proposed by Stefan
> Monnier in <https://lists.gnu.org/archive/html/emacs-devel/2019-01/msg00338.html>,
> so this is used in Tramp. I do not want to reopen this can of worms, really.
>
> To fix your problem of a broken connection, the Tramp manual recommends
> to add "ServerAliveInterval 5" in your ~/.ssh/config, see (info "(tramp)
> Frequently Asked Questions") . Additionally, you might set "ServerAliveCountMax 2".

Thanks for the links, Michael. I'll dogfood some patches for a while,
and we can maybe talk about it later if those consistently work well.

Thanks




Bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sat, 12 Feb 2022 12:24:08 GMT)

This bug report was last modified 2 years and 72 days ago.

