GNU bug report logs - #41720
[PATCH] store: Use buffered I/O for all protocol writes

Previous Next

Package: guix-patches;

Reported by: Lars-Dominik Braun <ldb <at> leibniz-psychology.org>

Date: Fri, 5 Jun 2020 09:12:01 UTC

Severity: normal

Tags: patch

Done: Ludovic Courtès <ludo <at> gnu.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 41720 in the body.
You can then email your comments to 41720 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to guix-patches <at> gnu.org:
bug#41720; Package guix-patches. (Fri, 05 Jun 2020 09:12:01 GMT) Full text and rfc822 format available.

Acknowledgement sent to Lars-Dominik Braun <ldb <at> leibniz-psychology.org>:
New bug report received and forwarded. Copy sent to guix-patches <at> gnu.org. (Fri, 05 Jun 2020 09:12:01 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Lars-Dominik Braun <ldb <at> leibniz-psychology.org>
To: guix-patches <at> gnu.org
Subject: [PATCH] store: Use buffered I/O for all protocol writes
Date: Fri, 5 Jun 2020 11:11:51 +0200
[Message part 1 (text/plain, inline)]
Hi,

I did some digging to investigate why SSH-based guix operations are currently
so slow. One of the reasons seems to be that some operations write single words
to the socket and due to NOWAIT these result in a lot of small packets.

The attached patch modifies store.scm to buffer all writes, so that won’t happen
any more. I’m seeing about ~40% speedup (6.194s vs 10.075s for the best out of
five runs) for `GUIX_DAEMON_SOCKET=ssh://localhost guix environment guix --
true`, but it also seems to have a negative impact on UNIX domain socket
communication by ~10% (1.561s vs 1.385s). For the notorious r-learnr package
it’s still ~20% better (ssh, 16.620/21.418) and only ~3% worse (unix,
11.489/11.199). Keep in mind localhost has a much larger MTU and much lower
latency than usual networks though. It might benefit from having an even bigger
write buffer (64k vs. currently 8k).

Cheers,
Lars

[0001-store-Use-buffered-I-O-for-all-protocol-writes.patch (text/x-diff, attachment)]
[signature.asc (application/pgp-signature, inline)]

Information forwarded to guix-patches <at> gnu.org:
bug#41720; Package guix-patches. (Fri, 05 Jun 2020 21:34:01 GMT) Full text and rfc822 format available.

Message #8 received at 41720 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Lars-Dominik Braun <ldb <at> leibniz-psychology.org>
Cc: 41720 <at> debbugs.gnu.org
Subject: Re: [bug#41720] [PATCH] store: Use buffered I/O for all protocol
 writes
Date: Fri, 05 Jun 2020 23:33:02 +0200
Hi,

Lars-Dominik Braun <ldb <at> leibniz-psychology.org> skribis:

> I did some digging to investigate why SSH-based guix operations are currently
> so slow. One of the reasons seems to be that some operations write single words
> to the socket and due to NOWAIT these result in a lot of small packets.

Good catch!

> The attached patch modifies store.scm to buffer all writes, so that won’t happen
> any more. I’m seeing about ~40% speedup (6.194s vs 10.075s for the best out of
> five runs) for `GUIX_DAEMON_SOCKET=ssh://localhost guix environment guix --
> true`, but it also seems to have a negative impact on UNIX domain socket
> communication by ~10% (1.561s vs 1.385s). For the notorious r-learnr package
> it’s still ~20% better (ssh, 16.620/21.418) and only ~3% worse (unix,
> 11.489/11.199). Keep in mind localhost has a much larger MTU and much lower
> latency than usual networks though. It might benefit from having an even bigger
> write buffer (64k vs. currently 8k).

Nice, though the slowdown on Unix-domain sockets is annoying given that
it’s the primary transport.  Are the 10% significant and stable over
multiple runs?

It may be that writing through the buffered port, with the extra
allocations, indirection through the custom port, etc., outweighs the
gains.

OTOH all the other RPCs are buffered, so it makes sense to just do the
same.

(It may be wiser to make the ‘operation’ macro smarter and emit code
that directly allocates a bytevector of the right size and sends it.
Future work!)

> From eb38dc246e5dd33f8a48a99aded8b8dc0c378376 Mon Sep 17 00:00:00 2001
> From: Lars-Dominik Braun <ldb <at> leibniz-psychology.org>
> Date: Fri, 5 Jun 2020 10:38:32 +0200
> Subject: [PATCH] store: Use buffered I/O for all protocol writes
>
> * guix/store.scm (run-gc) Use buffered output port.
> (export-path) Same.
> (add-file-tree-to-store) Same.
> (set-build-options): Same. Add explicit flush.

Look:

--8<---------------cut here---------------start------------->8---
$ GUIX_PROFILING=rpc guix environment guix -- true
Remote procedure call summary: 1927 RPCs
  built-in-builders              ...     1
  build-things                   ...     1
  query-substitutable-path-infos ...     1
  valid-path?                    ...     2
  add-to-store/tree              ...    19
  query-references               ...   108
  add-to-store                   ...   133
  add-text-to-store              ...  1662
--8<---------------cut here---------------end--------------->8---

So probably the gain you’re seeing comes from ‘add-file-tree-to-store’.

Otherwise the patch LGTM as long as it doesn’t introduce any test suite
regressions.

Thanks for investigating!

Ludo’.




Information forwarded to guix-patches <at> gnu.org:
bug#41720; Package guix-patches. (Mon, 08 Jun 2020 06:43:02 GMT) Full text and rfc822 format available.

Message #11 received at 41720 <at> debbugs.gnu.org (full text, mbox):

From: Lars-Dominik Braun <ldb <at> leibniz-psychology.org>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 41720 <at> debbugs.gnu.org
Subject: Re: [bug#41720] [PATCH] store: Use buffered I/O for all protocol
 writes
Date: Mon, 8 Jun 2020 08:42:28 +0200
[Message part 1 (text/plain, inline)]
Hi Ludo,

> Nice, though the slowdown on Unix-domain sockets is annoying given that
> it’s the primary transport.  Are the 10% significant and stable over
> multiple runs?
I don’t want to blind your eyes with meaningless statistics here, but let’s do
this properly then. So for `guix environment guix -- true` with 10 trials these
are the results:

---snip---
$ ministat -A -c 99 before_ssh after_ssh
x before_ssh
+ after_ssh
    N           Min           Max        Median           Avg        Stddev
x  10          7.63        10.635       10.1585        9.9382    0.85517585
+  10         5.986         6.424        6.1755        6.1917    0.15398416
Difference at 99.0% confidence
        -3.7465 +/- 0.790815
        -37.698% +/- 5.0788%
        (Student's t, pooled s = 0.614425)

$ ministat -A -c 99 before_unix after_unix
x before_unix
+ after_unix
    N           Min           Max        Median           Avg        Stddev
x  10         1.528         1.624        1.5445        1.5636   0.035842867
+  10         1.576         1.701        1.6145        1.6229   0.037869513
Difference at 99.0% confidence
        0.0593 +/- 0.0474548
        3.79253% +/- 3.08989%
        (Student's t, pooled s = 0.0368701)
---snap---

And the same for `guix environment --ad-hoc r-learnr`:

---snip---
$ ministat -A -c 99 before_ssh after_ssh
x before_ssh
+ after_ssh
    N           Min           Max        Median           Avg        Stddev
x  10        20.104        21.629        21.082       21.0115    0.45142472
+  10        16.082        16.713        16.292       16.3416    0.25018224
Difference at 99.0% confidence
        -4.6699 +/- 0.469719
        -22.2254% +/- 1.86734%
        (Student's t, pooled s = 0.364949)

$ ministat -A -c 99 before_unix after_unix
x before_unix
+ after_unix
    N           Min           Max        Median           Avg        Stddev
x  10        11.322        11.529        11.401       11.4127   0.062467858
+  10        11.249        11.469       11.3515        11.368   0.078689262
No difference proven at 99.0% confidence
---snap---

> Otherwise the patch LGTM as long as it doesn’t introduce any test suite
> regressions.
Yes, it passes the test suite.

Lars

[signature.asc (application/pgp-signature, inline)]

Reply sent to Ludovic Courtès <ludo <at> gnu.org>:
You have taken responsibility. (Tue, 09 Jun 2020 07:50:02 GMT) Full text and rfc822 format available.

Notification sent to Lars-Dominik Braun <ldb <at> leibniz-psychology.org>:
bug acknowledged by developer. (Tue, 09 Jun 2020 07:50:02 GMT) Full text and rfc822 format available.

Message #16 received at 41720-done <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Lars-Dominik Braun <ldb <at> leibniz-psychology.org>
Cc: 41720-done <at> debbugs.gnu.org
Subject: Re: [bug#41720] [PATCH] store: Use buffered I/O for all protocol
 writes
Date: Tue, 09 Jun 2020 09:49:49 +0200
Hi!

Lars-Dominik Braun <ldb <at> leibniz-psychology.org> skribis:

>> Nice, though the slowdown on Unix-domain sockets is annoying given that
>> it’s the primary transport.  Are the 10% significant and stable over
>> multiple runs?
> I don’t want to blind your eyes with meaningless statistics here, but let’s do
> this properly then. So for `guix environment guix -- true` with 10 trials these
> are the results:
>
> ---snip---
> $ ministat -A -c 99 before_ssh after_ssh
> x before_ssh
> + after_ssh
>     N           Min           Max        Median           Avg        Stddev
> x  10          7.63        10.635       10.1585        9.9382    0.85517585
> +  10         5.986         6.424        6.1755        6.1917    0.15398416
> Difference at 99.0% confidence
>         -3.7465 +/- 0.790815
>         -37.698% +/- 5.0788%
>         (Student's t, pooled s = 0.614425)
>
> $ ministat -A -c 99 before_unix after_unix
> x before_unix
> + after_unix
>     N           Min           Max        Median           Avg        Stddev
> x  10         1.528         1.624        1.5445        1.5636   0.035842867
> +  10         1.576         1.701        1.6145        1.6229   0.037869513
> Difference at 99.0% confidence
>         0.0593 +/- 0.0474548
>         3.79253% +/- 3.08989%
>         (Student's t, pooled s = 0.0368701)
> ---snap---
>
> And the same for `guix environment --ad-hoc r-learnr`:
>
> ---snip---
> $ ministat -A -c 99 before_ssh after_ssh
> x before_ssh
> + after_ssh
>     N           Min           Max        Median           Avg        Stddev
> x  10        20.104        21.629        21.082       21.0115    0.45142472
> +  10        16.082        16.713        16.292       16.3416    0.25018224
> Difference at 99.0% confidence
>         -4.6699 +/- 0.469719
>         -22.2254% +/- 1.86734%
>         (Student's t, pooled s = 0.364949)
>
> $ ministat -A -c 99 before_unix after_unix
> x before_unix
> + after_unix
>     N           Min           Max        Median           Avg        Stddev
> x  10        11.322        11.529        11.401       11.4127   0.062467858
> +  10        11.249        11.469       11.3515        11.368   0.078689262
> No difference proven at 99.0% confidence
> ---snap---

Awesome, thanks a lot.  Looks like the difference on Unix-domain sockets
is not significant.

Applied!

What package provides ‘ministat’?

Thanks,
Ludo’.




Information forwarded to guix-patches <at> gnu.org:
bug#41720; Package guix-patches. (Tue, 09 Jun 2020 08:22:02 GMT) Full text and rfc822 format available.

Message #19 received at 41720-done <at> debbugs.gnu.org (full text, mbox):

From: Lars-Dominik Braun <ldb <at> leibniz-psychology.org>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 41720-done <at> debbugs.gnu.org
Subject: Re: [bug#41720] [PATCH] store: Use buffered I/O for all protocol
 writes
Date: Tue, 9 Jun 2020 10:21:51 +0200
[Message part 1 (text/plain, inline)]
Hi,

great!

> What package provides ‘ministat’?
It’s originally a FreeBSD tool[1], but there are several unofficial Linux ports
on GitHub, like [2].

Cheers,
Lars

[1] https://www.freebsd.org/cgi/man.cgi?query=ministat&sektion=1&manpath=freebsd-release-ports
[2] https://github.com/csjayp/ministat

[signature.asc (application/pgp-signature, inline)]

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 07 Jul 2020 11:24:04 GMT) Full text and rfc822 format available.

This bug report was last modified 3 years and 294 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.