Received: (at 39728) by debbugs.gnu.org; 24 Feb 2020 21:23:56 +0000 From debbugs-submit-bounces <at> debbugs.gnu.org Mon Feb 24 16:23:56 2020 Received: from localhost ([127.0.0.1]:54204 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1j6LCm-0005mT-3e for submit <at> debbugs.gnu.org; Mon, 24 Feb 2020 16:23:56 -0500 Received: from eggs.gnu.org ([209.51.188.92]:42208) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <ludo@HIDDEN>) id 1j6LCk-0005mB-V0 for 39728 <at> debbugs.gnu.org; Mon, 24 Feb 2020 16:23:55 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]:34891) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from <ludo@HIDDEN>) id 1j6LCf-00086z-4p; Mon, 24 Feb 2020 16:23:49 -0500 Received: from [80.215.212.253] (port=17708 helo=ribbon) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from <ludo@HIDDEN>) id 1j6LCe-0007dP-HW; Mon, 24 Feb 2020 16:23:48 -0500 From: =?utf-8?Q?Ludovic_Court=C3=A8s?= <ludo@HIDDEN> To: Julien Lepiller <julien@HIDDEN> Subject: Re: [bug#39728] [PATCH] Allow parallel downloads and builds References: <20200221235307.535fb453@HIDDEN> Date: Mon, 24 Feb 2020 22:23:45 +0100 In-Reply-To: <20200221235307.535fb453@HIDDEN> (Julien Lepiller's message of "Fri, 21 Feb 2020 23:53:07 +0100") Message-ID: <87h7zfel26.fsf@HIDDEN> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Spam-Score: -0.7 (/) X-Debbugs-Envelope-To: 39728 Cc: 39728 <at> debbugs.gnu.org X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -1.7 (-) Hi! Julien Lepiller <julien@HIDDEN> skribis: > This patch allows to count builds and downloads separately. The idea is > that downloads need bandwidth, but no CPU, while builds do not need > bandwidth, but need CPU. With this patch, guix will be able to download > substitutes while building unrelated packages. Currently, guix needs to > wait for the download to finish before proceeding to the build. This > should reduce the time of guix commands that need to build and download > things at the same time. > > What do you think? I think it=E2=80=99s a good idea! I wonder what the UI will look like: (guix status) would no longer display a progress bar when there=E2=80=99s more than on job (build or down= load) taking place at the same time. >>From 9c059d81ba4f4016f8c400b403f8c5edbdb160c2 Mon Sep 17 00:00:00 2001 > From: Julien Lepiller <julien@HIDDEN> > Date: Fri, 21 Feb 2020 23:41:33 +0100 > Subject: [PATCH] nix: Count build and download jobs separately. ^ I=E2=80=99d write =E2=80=9Cdaemon:=E2=80=9D here. :-) > This allows to run downloads (that take bandwith) and builds (that take ^ =E2=80=9CThis allows us=E2=80=9D > CPU time) independently from one another. > > * nix/nix-daemon/guix-daemon.cc: Add a max-download-jobs option. > * nix/libstore/globals.hh: Add a maxDownloadJobs setting. > * nix/libstore/globals.cc: Add a default value to it. > * nix/libstore/build.cc: Manage build and download jobs separately. For the final patch, please specify the entities changed (classes, functions, etc.). > + /* Number of download slots occupied. This includes substitution and > + built-ins. */ > + unsigned int nrDownloads; Note that not all builtins are downloads. Fixed-output derivations are (usually) also downloads. (It=E2=80=99d be the first time the daemon gets a notion of =E2=80=9Cdownlo= ad=E2=80=9D. We should make sure it doesn=E2=80=99t conflict with other assumptions.) > /* Registers a running child process. `inBuildSlot' means that > the process counts towards the jobs limit. */ > void childStarted(GoalPtr goal, pid_t pid, > - const set<int> & fds, bool inBuildSlot, bool respectTimeouts); > + const set<int> & fds, bool inBuildSlot, bool inDownloadSlot, > + bool respectTimeouts); How about replacing these two Booleans by a single enum? > + unsigned int curDownloads =3D worker.getNrDownloads(); > + if (curDownloads >=3D (settings.maxDownloadJobs=3D=3D0?1:settings.ma= xDownloadJobs) && > + fixedOutput) { This is hard to parse and lacking spacing. :-) Perhaps make an intermediate function or variable? > +void Worker::waitForDownloadSlot(GoalPtr goal) > +{ > + debug("wait for download slot"); > + if (getNrDownloads() < (settings.maxDownloadJobs=3D=3D0?1:settings.m= axDownloadJobs)) Same here. > @@ -118,6 +119,7 @@ void Settings::update() > { > _get(tryFallback, "build-fallback"); > _get(maxBuildJobs, "build-max-jobs"); > + _get(maxDownloadJobs, "download-max-jobs"); We should also allow =E2=80=98set-build-options=E2=80=99 to set this option= , as well as add it to =E2=80=98%standard-build-options=E2=80=99. That can prolly come = in a separate patch. > + { "max-downloads", 'D', n_("N"), 0, > + n_("allow at most N download jobs") }, We=E2=80=99d need to update doc/guix.texi. It would be great if you could test this patch for your daily usage. I find it surprisingly easy to break things in the daemon. :-) Thank you! Ludo=E2=80=99.
guix-patches@HIDDEN
:bug#39728
; Package guix-patches
.
Full text available.Received: (at submit) by debbugs.gnu.org; 21 Feb 2020 22:53:34 +0000 From debbugs-submit-bounces <at> debbugs.gnu.org Fri Feb 21 17:53:34 2020 Received: from localhost ([127.0.0.1]:48348 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1j5HAs-0008CC-1o for submit <at> debbugs.gnu.org; Fri, 21 Feb 2020 17:53:34 -0500 Received: from lists.gnu.org ([209.51.188.17]:55942) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <julien@HIDDEN>) id 1j5HAq-0008C3-30 for submit <at> debbugs.gnu.org; Fri, 21 Feb 2020 17:53:32 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:44754) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from <julien@HIDDEN>) id 1j5HAn-0000RJ-QH for guix-patches@HIDDEN; Fri, 21 Feb 2020 17:53:31 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50,URIBL_BLOCKED autolearn=disabled version=3.3.2 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from <julien@HIDDEN>) id 1j5HAi-0008CD-1t for guix-patches@HIDDEN; Fri, 21 Feb 2020 17:53:29 -0500 Received: from lepiller.eu ([2a00:5884:8208::1]:39044) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from <julien@HIDDEN>) id 1j5HAh-0008AT-4v for guix-patches@HIDDEN; Fri, 21 Feb 2020 17:53:24 -0500 Received: from lepiller.eu (localhost [127.0.0.1]) by lepiller.eu (OpenSMTPD) with ESMTP id cd01c274 for <guix-patches@HIDDEN>; Fri, 21 Feb 2020 22:53:18 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed; d=lepiller.eu; h=date:from :to:subject:message-id:mime-version:content-type; s=dkim; bh=n++ 9yoIChEqgG20cqPGuVAueYEx728KISUJPEfg13k4=; b=Y08S5xrCZ+3Zzo/X308 qvrWmAKKlugH1AmrL5Zk6iKmLk5QgPOXglurseeDqZTxT5L4ee3gX9jpJxjrBWVC KoiCvqv+vrs4GmttN/X3+aY+HWx7vOdYtfwm+xt6YsLMeXFZqFcuuGPvsNNeNWzn CwOeKgsycJmlDRCItsG/y8S+IReKupngF0RoCnFxOzocPgNVu45cxD0KZPZ8UNjS jZaOqOkw8DIYwCNB2fkS3HdWAose8bFYv5mLuimWksGLltXMQWtiVspWwaU2AVao k4tBLAN70dj2K+jOym6M3PaEl93oZmpMXdti4K3b5vwxvKCShDyltPwDYhLHGhm5 JBA== Received: by lepiller.eu (OpenSMTPD) with ESMTPSA id e21dd517 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO) for <guix-patches@HIDDEN>; Fri, 21 Feb 2020 22:53:18 +0000 (UTC) Date: Fri, 21 Feb 2020 23:53:07 +0100 From: Julien Lepiller <julien@HIDDEN> To: guix-patches@HIDDEN Subject: [PATCH] Allow parallel downloads and builds Message-ID: <20200221235307.535fb453@HIDDEN> X-Mailer: Claws Mail 3.17.4 (GTK+ 2.24.32; x86_64-unknown-linux-gnu) MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="MP_/08Zvlxdk=mg4mXHm6u3=if1" X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2a00:5884:8208::1 X-Spam-Score: -0.7 (/) X-Debbugs-Envelope-To: submit X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -1.7 (-) --MP_/08Zvlxdk=mg4mXHm6u3=if1 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Content-Disposition: inline Hi guix! This patch allows to count builds and downloads separately. The idea is that downloads need bandwidth, but no CPU, while builds do not need bandwidth, but need CPU. With this patch, guix will be able to download substitutes while building unrelated packages. Currently, guix needs to wait for the download to finish before proceeding to the build. This should reduce the time of guix commands that need to build and download things at the same time. What do you think? --MP_/08Zvlxdk=mg4mXHm6u3=if1 Content-Type: text/x-patch Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename=0001-nix-Count-build-and-download-jobs-separately.patch From 9c059d81ba4f4016f8c400b403f8c5edbdb160c2 Mon Sep 17 00:00:00 2001 From: Julien Lepiller <julien@HIDDEN> Date: Fri, 21 Feb 2020 23:41:33 +0100 Subject: [PATCH] nix: Count build and download jobs separately. This allows to run downloads (that take bandwith) and builds (that take CPU time) independently from one another. * nix/nix-daemon/guix-daemon.cc: Add a max-download-jobs option. * nix/libstore/globals.hh: Add a maxDownloadJobs setting. * nix/libstore/globals.cc: Add a default value to it. * nix/libstore/build.cc: Manage build and download jobs separately. --- nix/libstore/build.cc | 75 +++++++++++++++++++++++++++-------- nix/libstore/globals.cc | 2 + nix/libstore/globals.hh | 3 ++ nix/nix-daemon/guix-daemon.cc | 5 +++ 4 files changed, 69 insertions(+), 16 deletions(-) diff --git a/nix/libstore/build.cc b/nix/libstore/build.cc index 17e92c68a7..d08904c4ee 100644 --- a/nix/libstore/build.cc +++ b/nix/libstore/build.cc @@ -204,6 +204,7 @@ struct Child set<int> fds; bool respectTimeouts; bool inBuildSlot; + bool inDownloadSlot; time_t lastOutput; /* time we last got output on stdout/stderr */ time_t timeStarted; }; @@ -231,10 +232,14 @@ private: /* Child processes currently running. */ Children children; - /* Number of build slots occupied. This includes local builds and - substitutions but not remote builds via the build hook. */ + /* Number of build slots occupied. This includes local builds but not + substitution, built-ins and remote builds via the build hook. */ unsigned int nrLocalBuilds; + /* Number of download slots occupied. This includes substitution and + built-ins. */ + unsigned int nrDownloads; + /* Maps used to prevent multiple instantiations of a goal for the same derivation / path. */ WeakGoalMap derivationGoals; @@ -275,15 +280,18 @@ public: /* Wake up a goal (i.e., there is something for it to do). */ void wakeUp(GoalPtr goal); - /* Return the number of local build and substitution processes - currently running (but not remote builds via the build - hook). */ + /* Return the number of local build processes currently running (but + not remote builds via the build hook). */ unsigned int getNrLocalBuilds(); + /* Return the number of downloads currently running. */ + unsigned int getNrDownloads(); + /* Registers a running child process. `inBuildSlot' means that the process counts towards the jobs limit. */ void childStarted(GoalPtr goal, pid_t pid, - const set<int> & fds, bool inBuildSlot, bool respectTimeouts); + const set<int> & fds, bool inBuildSlot, bool inDownloadSlot, + bool respectTimeouts); /* Unregisters a running child process. `wakeSleepers' should be false if there is no sense in waking up goals that are sleeping @@ -295,6 +303,10 @@ public: might be right away). */ void waitForBuildSlot(GoalPtr goal); + /* Put `goal' to sleep until a download slot becomes available (which + might be right away). */ + void waitForDownloadSlot(GoalPtr goal); + /* Wait for any goal to finish. Pretty indiscriminate way to wait for some resource that some other goal is holding. */ void waitForAnyGoal(GoalPtr goal); @@ -1359,12 +1371,21 @@ void DerivationGoal::tryToBuild() derivation prefers to be done locally, do it even if maxBuildJobs is 0. */ unsigned int curBuilds = worker.getNrLocalBuilds(); - if (curBuilds >= settings.maxBuildJobs && !(buildLocally && curBuilds == 0)) { + if (curBuilds >= settings.maxBuildJobs && !(buildLocally && curBuilds == 0) && + !fixedOutput) { worker.waitForBuildSlot(shared_from_this()); outputLocks.unlock(); return; } + unsigned int curDownloads = worker.getNrDownloads(); + if (curDownloads >= (settings.maxDownloadJobs==0?1:settings.maxDownloadJobs) && + fixedOutput) { + worker.waitForDownloadSlot(shared_from_this()); + outputLocks.unlock(); + return; + } + try { /* Okay, we have to build. */ @@ -1648,7 +1669,7 @@ HookReply DerivationGoal::tryBuildHook() set<int> fds; fds.insert(hook->fromHook.readSide); fds.insert(hook->builderOut.readSide); - worker.childStarted(shared_from_this(), hook->pid, fds, false, true); + worker.childStarted(shared_from_this(), hook->pid, fds, false, false, true); if (settings.printBuildTrace) printMsg(lvlError, format("@ build-started %1% - %2% %3% %4%") @@ -2030,7 +2051,7 @@ void DerivationGoal::startBuilder() pid.setSeparatePG(true); builderOut.writeSide.close(); worker.childStarted(shared_from_this(), pid, - singleton<set<int> >(builderOut.readSide), true, true); + singleton<set<int> >(builderOut.readSide), !fixedOutput, fixedOutput, true); /* Check if setting up the build environment failed. */ string msg = readLine(builderOut.readSide); @@ -3034,11 +3055,11 @@ void SubstitutionGoal::tryToRun() trace("trying to run"); /* Make sure that we are allowed to start a build. Note that even - is maxBuildJobs == 0 (no local builds allowed), we still allow - a substituter to run. This is because substitutions cannot be - distributed to another machine via the build hook. */ - if (worker.getNrLocalBuilds() >= (settings.maxBuildJobs == 0 ? 1 : settings.maxBuildJobs)) { - worker.waitForBuildSlot(shared_from_this()); + is maxDownloadJobs == 0 (no downloads allowed), we still allow + a substituter to run. This is because we always need to download, so + not allowing is meaningless. */ + if (worker.getNrDownloads() >= (settings.maxDownloadJobs == 0 ? 1 : settings.maxDownloadJobs)) { + worker.waitForDownloadSlot(shared_from_this()); return; } @@ -3107,7 +3128,7 @@ void SubstitutionGoal::tryToRun() outPipe.writeSide.close(); logPipe.writeSide.close(); worker.childStarted(shared_from_this(), - pid, singleton<set<int> >(logPipe.readSide), true, true); + pid, singleton<set<int> >(logPipe.readSide), false, true, true); state = &SubstitutionGoal::finished; @@ -3242,6 +3263,7 @@ Worker::Worker(LocalStore & store) if (working) abort(); working = true; nrLocalBuilds = 0; + nrDownloads = 0; lastWokenUp = 0; permanentFailure = false; timedOut = false; @@ -3334,9 +3356,14 @@ unsigned Worker::getNrLocalBuilds() return nrLocalBuilds; } +unsigned Worker::getNrDownloads() +{ + return nrDownloads; +} + void Worker::childStarted(GoalPtr goal, - pid_t pid, const set<int> & fds, bool inBuildSlot, + pid_t pid, const set<int> & fds, bool inBuildSlot, bool inDownloadSlot, bool respectTimeouts) { Child child; @@ -3344,9 +3371,11 @@ void Worker::childStarted(GoalPtr goal, child.fds = fds; child.timeStarted = child.lastOutput = time(0); child.inBuildSlot = inBuildSlot; + child.inDownloadSlot = inDownloadSlot; child.respectTimeouts = respectTimeouts; children[pid] = child; if (inBuildSlot) nrLocalBuilds++; + if (inDownloadSlot) nrDownloads++; } @@ -3362,6 +3391,11 @@ void Worker::childTerminated(pid_t pid, bool wakeSleepers) nrLocalBuilds--; } + if (i->second.inDownloadSlot) { + assert(nrDownloads > 0); + nrDownloads--; + } + children.erase(pid); if (wakeSleepers) { @@ -3386,6 +3420,15 @@ void Worker::waitForBuildSlot(GoalPtr goal) addToWeakGoals(wantingToBuild, goal); } +void Worker::waitForDownloadSlot(GoalPtr goal) +{ + debug("wait for download slot"); + if (getNrDownloads() < (settings.maxDownloadJobs==0?1:settings.maxDownloadJobs)) + wakeUp(goal); /* we can do it right away */ + else + addToWeakGoals(wantingToBuild, goal); +} + void Worker::waitForAnyGoal(GoalPtr goal) { diff --git a/nix/libstore/globals.cc b/nix/libstore/globals.cc index 0cc001fbe4..416033718d 100644 --- a/nix/libstore/globals.cc +++ b/nix/libstore/globals.cc @@ -29,6 +29,7 @@ Settings::Settings() tryFallback = false; buildVerbosity = lvlError; maxBuildJobs = 1; + maxDownloadJobs = 1; buildCores = 1; readOnlyMode = false; thisSystem = SYSTEM; @@ -118,6 +119,7 @@ void Settings::update() { _get(tryFallback, "build-fallback"); _get(maxBuildJobs, "build-max-jobs"); + _get(maxDownloadJobs, "download-max-jobs"); _get(buildCores, "build-cores"); _get(thisSystem, "system"); _get(multiplexedBuildOutput, "multiplexed-build-output"); diff --git a/nix/libstore/globals.hh b/nix/libstore/globals.hh index 27616a2283..c033f8ed56 100644 --- a/nix/libstore/globals.hh +++ b/nix/libstore/globals.hh @@ -90,6 +90,9 @@ struct Settings { /* Maximum number of parallel build jobs. 0 means unlimited. */ unsigned int maxBuildJobs; + /* Maximum number of parallel download jobs. 0 means 1. */ + unsigned int maxDownloadJobs; + /* Number of CPU cores to utilize in parallel within a build, i.e. by passing this number to Make via '-j'. 0 means that the number of actual CPU cores on the local host ought to be diff --git a/nix/nix-daemon/guix-daemon.cc b/nix/nix-daemon/guix-daemon.cc index cd949aca67..6997c07f5a 100644 --- a/nix/nix-daemon/guix-daemon.cc +++ b/nix/nix-daemon/guix-daemon.cc @@ -99,6 +99,8 @@ static const struct argp_option options[] = }, { "max-jobs", 'M', n_("N"), 0, n_("allow at most N build jobs") }, + { "max-downloads", 'D', n_("N"), 0, + n_("allow at most N download jobs") }, { "timeout", GUIX_OPT_TIMEOUT, n_("SECONDS"), 0, n_("mark builds as failed after SECONDS of activity") }, { "max-silent-time", GUIX_OPT_MAX_SILENT_TIME, n_("SECONDS"), 0, @@ -276,6 +278,9 @@ parse_opt (int key, char *arg, struct argp_state *state) case 'M': settings.set ("build-max-jobs", arg); break; + case 'D': + settings.set ("download-max-jobs", arg); + break; case GUIX_OPT_TIMEOUT: settings.set ("build-timeout", arg); break; -- 2.24.0 --MP_/08Zvlxdk=mg4mXHm6u3=if1--
Julien Lepiller <julien@HIDDEN>
:guix-patches@HIDDEN
.
Full text available.guix-patches@HIDDEN
:bug#39728
; Package guix-patches
.
Full text available.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997 nCipher Corporation Ltd,
1994-97 Ian Jackson.