GNU bug report logs - #33210
Cuirass: Use a SQLite in single-thread mode

Previous Next

Package: guix-patches;

Reported by: Clément Lassieur <clement <at> lassieur.org>

Date: Tue, 30 Oct 2018 20:36:03 UTC

Severity: normal

Done: Clément Lassieur <clement <at> lassieur.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 33210 in the body.
You can then email your comments to 33210 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to guix-patches <at> gnu.org:
bug#33210; Package guix-patches. (Tue, 30 Oct 2018 20:36:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Clément Lassieur <clement <at> lassieur.org>:
New bug report received and forwarded. Copy sent to guix-patches <at> gnu.org. (Tue, 30 Oct 2018 20:36:03 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Clément Lassieur <clement <at> lassieur.org>
To: guix-patches <at> gnu.org
Subject: Cuirass: Use a SQLite in single-thread mode
Date: Tue, 30 Oct 2018 21:35:06 +0100
Hi,

These patches are supposed to slightly improve Cuirass' performances,
because it doesn't use the multi-threading features.

Clément




Information forwarded to guix-patches <at> gnu.org:
bug#33210; Package guix-patches. (Tue, 30 Oct 2018 20:48:01 GMT) Full text and rfc822 format available.

Message #8 received at 33210 <at> debbugs.gnu.org (full text, mbox):

From: Clément Lassieur <clement <at> lassieur.org>
To: 33210 <at> debbugs.gnu.org
Subject: [PATCH 1/3] gnu: Add sqlite-with-single-thread.
Date: Tue, 30 Oct 2018 21:47:24 +0100
* gnu/packages/databases.scm (sqlite-with-single-thread): New variable.
---
 gnu/packages/databases.scm | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/gnu/packages/databases.scm b/gnu/packages/databases.scm
index 87fb170e5..24914cd87 100644
--- a/gnu/packages/databases.scm
+++ b/gnu/packages/databases.scm
@@ -32,6 +32,7 @@
 ;;; Copyright © 2017 Kristofer Buffington <kristoferbuffington <at> gmail.com>
 ;;; Copyright © 2018 Amirouche Boubekki <amirouche <at> hypermove.net>
 ;;; Copyright © 2018 Joshua Sierles, Nextjournal <joshua <at> nextjournal.com>
+;;; Copyright © 2018 Clément Lassieur <clement <at> lassieur.org>
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -1223,6 +1224,15 @@ is in the public domain.")
                               "-DSQLITE_ENABLE_DBSTAT_VTAB "
                               "-DSQLITE_ENABLE_COLUMN_METADATA")))))))
 
+;; This is used by Cuirass.
+(define-public sqlite-with-single-thread
+  (package (inherit sqlite)
+    (name "sqlite-with-single-thread")
+    (arguments
+     (substitute-keyword-arguments (package-arguments sqlite)
+       ((#:configure-flags flags)
+        `(cons "--disable-threadsafe" ,flags))))))
+
 (define-public tdb
   (package
     (name "tdb")
-- 
2.19.1





Information forwarded to guix-patches <at> gnu.org:
bug#33210; Package guix-patches. (Tue, 30 Oct 2018 20:48:02 GMT) Full text and rfc822 format available.

Message #11 received at 33210 <at> debbugs.gnu.org (full text, mbox):

From: Clément Lassieur <clement <at> lassieur.org>
To: 33210 <at> debbugs.gnu.org
Subject: [PATCH 2/3] gnu: Add guile-sqlite3-with-single-thread.
Date: Tue, 30 Oct 2018 21:47:25 +0100
* gnu/packages/guile.scm (guile-sqlite3-with-single-thread): New variable.
---
 gnu/packages/guile.scm | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/gnu/packages/guile.scm b/gnu/packages/guile.scm
index 9e3300337..4f9ddf913 100644
--- a/gnu/packages/guile.scm
+++ b/gnu/packages/guile.scm
@@ -20,6 +20,7 @@
 ;;; Copyright © 2018 Arun Isaac <arunisaac <at> systemreboot.net>
 ;;; Copyright © 2018 Pierre-Antoine Rouby <pierre-antoine.rouby <at> inria.fr>
 ;;; Copyright © 2018 Eric Bavier <bavier <at> member.fsf.org>
+;;; Copyright © 2018 Clément Lassieur <clement <at> lassieur.org>
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -71,6 +72,7 @@
   #:use-module (gnu packages xdisorg)
   #:use-module (gnu packages xorg)
   #:use-module (gnu packages networking)
+  #:use-module ((guix build utils) #:select (alist-replace))
   #:use-module (guix packages)
   #:use-module (guix download)
   #:use-module (guix git-download)
@@ -1166,6 +1168,13 @@ Guile's foreign function interface.")
      "This package provides Guile bindings to the SQLite database system.")
     (license license:gpl3+)))
 
+(define-public guile-sqlite3-with-single-thread
+  (package
+    (inherit guile-sqlite3)
+    (name "guile-sqlite3-with-single-thread")
+    (inputs (alist-replace "sqlite" (list sqlite-with-single-thread)
+                           (package-inputs guile-sqlite3)))))
+
 (define-public guile2.0-sqlite3
   (package-for-guile-2.0 guile-sqlite3))
 
-- 
2.19.1





Information forwarded to guix-patches <at> gnu.org:
bug#33210; Package guix-patches. (Tue, 30 Oct 2018 20:48:02 GMT) Full text and rfc822 format available.

Message #14 received at 33210 <at> debbugs.gnu.org (full text, mbox):

From: Clément Lassieur <clement <at> lassieur.org>
To: 33210 <at> debbugs.gnu.org
Subject: [PATCH 3/3] gnu: cuirass: Use SQLite in single-thread mode.
Date: Tue, 30 Oct 2018 21:47:26 +0100
* gnu/packages/ci.scm (cuirass)[inputs]: Replace GUILE-SQLITE3 with
GUILE-SQLITE3-WITH-SINGLE-THREAD.
---
 gnu/packages/ci.scm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gnu/packages/ci.scm b/gnu/packages/ci.scm
index 1cac8b9fb..126948109 100644
--- a/gnu/packages/ci.scm
+++ b/gnu/packages/ci.scm
@@ -261,7 +261,7 @@ their dependencies.")
          ("guile-fibers" ,guile-fibers)
          ("guile-gcrypt" ,guile-gcrypt)
          ("guile-json" ,guile-json)
-         ("guile-sqlite3" ,guile-sqlite3)
+         ("guile-sqlite3" ,guile-sqlite3-with-single-thread)
          ("guile-git" ,guile-git)
          ;; FIXME: this is propagated by "guile-git", but it needs to be among
          ;; the inputs to add it to GUILE_LOAD_PATH.
-- 
2.19.1





Information forwarded to guix-patches <at> gnu.org:
bug#33210; Package guix-patches. (Sun, 04 Nov 2018 23:09:02 GMT) Full text and rfc822 format available.

Message #17 received at 33210 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Clément Lassieur <clement <at> lassieur.org>
Cc: 33210 <at> debbugs.gnu.org
Subject: Re: [bug#33210] Cuirass: Use a SQLite in single-thread mode
Date: Mon, 05 Nov 2018 00:08:23 +0100
Hello,

Clément Lassieur <clement <at> lassieur.org> skribis:

> These patches are supposed to slightly improve Cuirass' performances,
> because it doesn't use the multi-threading features.

Did you notice a measurable difference?

We could do it, but IMO that should be a last resort because I’d expect
it to be a micro-optimization.

WDYT?

Ludo’.




Information forwarded to guix-patches <at> gnu.org:
bug#33210; Package guix-patches. (Mon, 05 Nov 2018 08:04:02 GMT) Full text and rfc822 format available.

Message #20 received at 33210 <at> debbugs.gnu.org (full text, mbox):

From: Clément Lassieur <clement <at> lassieur.org>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 33210 <at> debbugs.gnu.org
Subject: Re: [bug#33210] Cuirass: Use a SQLite in single-thread mode
Date: Mon, 05 Nov 2018 09:02:57 +0100
Ludovic Courtès <ludo <at> gnu.org> writes:

> Hello,
>
> Clément Lassieur <clement <at> lassieur.org> skribis:
>
>> These patches are supposed to slightly improve Cuirass' performances,
>> because it doesn't use the multi-threading features.
>
> Did you notice a measurable difference?

I haven't done any measurement yet, but according to the SQLite
documentation:

    Setting -DSQLITE_THREADSAFE=0 causes all of the mutex and
    thread-safety logic in SQLite to be omitted. This is the single
    compile-time option that makes the most difference in optimizing the
    performance of SQLite.

So even if the optimization is small, it's the option that has the
biggest impact on performance.

> We could do it, but IMO that should be a last resort because I’d expect
> it to be a micro-optimization.

Lots of micro-optimizations lead to an overall faster application ;-).
And this one doesn't make the code more complicated.  To me it's just a
bonus.

[1]: https://www.sqlite.org/compile.html




Information forwarded to guix-patches <at> gnu.org:
bug#33210; Package guix-patches. (Tue, 06 Nov 2018 00:13:02 GMT) Full text and rfc822 format available.

Message #23 received at 33210 <at> debbugs.gnu.org (full text, mbox):

From: Danny Milosavljevic <dannym <at> scratchpost.org>
To: Clément Lassieur <clement <at> lassieur.org>
Cc: 33210 <at> debbugs.gnu.org, Ludovic Courtès <ludo <at> gnu.org>
Subject: Re: [bug#33210] Cuirass: Use a SQLite in single-thread mode
Date: Tue, 6 Nov 2018 01:11:54 +0100
[Message part 1 (text/plain, inline)]
Hi Clément,

> I haven't done any measurement yet, but according to the SQLite
> documentation:
> 
>     Setting -DSQLITE_THREADSAFE=0 causes all of the mutex and
>     thread-safety logic in SQLite to be omitted. This is the single
>     compile-time option that makes the most difference in optimizing the
>     performance of SQLite.
> 
> So even if the optimization is small, it's the option that has the
> biggest impact on performance.
> 
> > We could do it, but IMO that should be a last resort because I’d expect
> > it to be a micro-optimization.  
> 
> Lots of micro-optimizations lead to an overall faster application ;-).
> And this one doesn't make the code more complicated.  To me it's just a
> bonus.

Keep in mind that if we want consistent views via the web interface,
the cuirass evaluator has to use its own connection independent of the
web interface (so that the web interface doesn't see half-finished stuff).
If that's still possible after that then fine.

Right now, as I mentioned before, it can happen that you request a certain
filter when requesting something from the web and the result will actually
contain data that does not match the filter.  What happened is that the
data in the transaction got changed before we returned it but after the
first query ran.

This is not supposed to happen in relational database systems.  The reason
why it happens here is because we use the same transaction (session) for
both the updates done by the evaluator and the queries done by the web
interface.  It would be much better if the queries by the web interface
had their own transaction.  It was fine before but in some refactoring,
"evaluate" ceased to be an external program and I didn't notice what
happened to it.
[Message part 2 (application/pgp-signature, inline)]

Information forwarded to guix-patches <at> gnu.org:
bug#33210; Package guix-patches. (Tue, 06 Nov 2018 00:51:02 GMT) Full text and rfc822 format available.

Message #26 received at 33210 <at> debbugs.gnu.org (full text, mbox):

From: Clément Lassieur <clement <at> lassieur.org>
To: Danny Milosavljevic <dannym <at> scratchpost.org>
Cc: 33210 <at> debbugs.gnu.org, Ludovic Courtès <ludo <at> gnu.org>
Subject: Re: [bug#33210] Cuirass: Use a SQLite in single-thread mode
Date: Tue, 06 Nov 2018 01:50:11 +0100
Hi Danny,

Danny Milosavljevic <dannym <at> scratchpost.org> writes:

> Keep in mind that if we want consistent views via the web interface,
> the cuirass evaluator has to use its own connection independent of the
> web interface (so that the web interface doesn't see half-finished stuff).
> If that's still possible after that then fine.
>
> Right now, as I mentioned before, it can happen that you request a certain
> filter when requesting something from the web and the result will actually
> contain data that does not match the filter.  What happened is that the
> data in the transaction got changed before we returned it but after the
> first query ran.
>
> This is not supposed to happen in relational database systems.  The reason
> why it happens here is because we use the same transaction (session) for
> both the updates done by the evaluator and the queries done by the web
> interface.  It would be much better if the queries by the web interface
> had their own transaction.  It was fine before but in some refactoring,
> "evaluate" ceased to be an external program and I didn't notice what
> happened to it.

Yes, I know, but I remember that I told you[1] that there was an easy
(one line) fix for this. :-)

Do you want to send a patch?

[1]: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=32234#57




Information forwarded to guix-patches <at> gnu.org:
bug#33210; Package guix-patches. (Tue, 06 Nov 2018 11:21:02 GMT) Full text and rfc822 format available.

Message #29 received at 33210 <at> debbugs.gnu.org (full text, mbox):

From: Danny Milosavljevic <dannym <at> scratchpost.org>
To: Clément Lassieur <clement <at> lassieur.org>
Cc: 33210 <at> debbugs.gnu.org, Ludovic Courtès <ludo <at> gnu.org>
Subject: Re: [bug#33210] Cuirass: Use a SQLite in single-thread mode
Date: Tue, 6 Nov 2018 12:20:36 +0100
[Message part 1 (text/plain, inline)]
Hi Clément,

On Tue, 06 Nov 2018 01:50:11 +0100
Clément Lassieur <clement <at> lassieur.org> wrote:

> > This is not supposed to happen in relational database systems.  The reason
> > why it happens here is because we use the same transaction (session) for
> > both the updates done by the evaluator and the queries done by the web
> > interface.  It would be much better if the queries by the web interface
> > had their own transaction.  It was fine before but in some refactoring,
> > "evaluate" ceased to be an external program and I didn't notice what
> > happened to it.  
> 
> Yes, I know, but I remember that I told you[1] that there was an easy
> (one line) fix for this. :-)

That would really only be a workaround (arguably still better than what we
have now - which is basically you ask for a banana and I give you an apple
- what if other procedures in cuirass assume it's a banana? :) ).

We would be doing the work that SQLite would have done, but we'd do it in a bad
way (by blocking everyone else).

It's not really useful to undo all the progress relational databases have made.

If we had multiple transactions in progress touching the same row, SQLite would
use MVCC to hold diverging copies of the same row at the same time, blocking
nobody.
[Message part 2 (application/pgp-signature, inline)]

Information forwarded to guix-patches <at> gnu.org:
bug#33210; Package guix-patches. (Tue, 06 Nov 2018 14:08:02 GMT) Full text and rfc822 format available.

Message #32 received at 33210 <at> debbugs.gnu.org (full text, mbox):

From: Clément Lassieur <clement <at> lassieur.org>
To: Danny Milosavljevic <dannym <at> scratchpost.org>
Cc: 33210 <at> debbugs.gnu.org
Subject: Re: [bug#33210] Cuirass: Use a SQLite in single-thread mode
Date: Tue, 06 Nov 2018 15:07:40 +0100
Hey Danny,

Danny Milosavljevic <dannym <at> scratchpost.org> writes:

> Hi Clément,
>
> On Tue, 06 Nov 2018 01:50:11 +0100
> Clément Lassieur <clement <at> lassieur.org> wrote:
>
>> > This is not supposed to happen in relational database systems.  The reason
>> > why it happens here is because we use the same transaction (session) for
>> > both the updates done by the evaluator and the queries done by the web
>> > interface.  It would be much better if the queries by the web interface
>> > had their own transaction.  It was fine before but in some refactoring,
>> > "evaluate" ceased to be an external program and I didn't notice what
>> > happened to it.  
>> 
>> Yes, I know, but I remember that I told you[1] that there was an easy
>> (one line) fix for this. :-)
>
> That would really only be a workaround (arguably still better than what we
> have now - which is basically you ask for a banana and I give you an apple
> - what if other procedures in cuirass assume it's a banana? :) ).
>
> We would be doing the work that SQLite would have done, but we'd do it in a bad
> way (by blocking everyone else).
>
> It's not really useful to undo all the progress relational databases have made.
>
> If we had multiple transactions in progress touching the same row, SQLite would
> use MVCC to hold diverging copies of the same row at the same time, blocking
> nobody.

It blocks everyone indeed, because we take care of the scheduling in a
rather basic way.  But if I understand correctly, the overall spent time
is more or less the same: only the order of the requests differs.  There
is an essential difference however: if we take care of the scheduling,
we won't have SQLITE_BUSY blocking the Fibers scheduler all the time.
And blocking the Fibers scheduler has an impact on all other possibly
unrelated Fibers clients.  When guile-sqlite3 handles SQLITE_BUSY
correctly, I'll be happy to switch back to a multi-threading SQLite.
While waiting for it, I believe our users want a fast Cuirass, and I'd
like the time spent in the Fibers scheduler to be balanced by removing
the SQLite now useless mutexes.

Does that make sense?

Clément




Information forwarded to guix-patches <at> gnu.org:
bug#33210; Package guix-patches. (Tue, 06 Nov 2018 14:41:03 GMT) Full text and rfc822 format available.

Message #35 received at 33210 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Clément Lassieur <clement <at> lassieur.org>
Cc: 33210 <at> debbugs.gnu.org
Subject: Re: [bug#33210] Cuirass: Use a SQLite in single-thread mode
Date: Tue, 06 Nov 2018 15:40:11 +0100
Hi Clément,

Clément Lassieur <clement <at> lassieur.org> skribis:

> Ludovic Courtès <ludo <at> gnu.org> writes:
>
>> Hello,
>>
>> Clément Lassieur <clement <at> lassieur.org> skribis:
>>
>>> These patches are supposed to slightly improve Cuirass' performances,
>>> because it doesn't use the multi-threading features.
>>
>> Did you notice a measurable difference?
>
> I haven't done any measurement yet, but according to the SQLite
> documentation:
>
>     Setting -DSQLITE_THREADSAFE=0 causes all of the mutex and
>     thread-safety logic in SQLite to be omitted. This is the single
>     compile-time option that makes the most difference in optimizing the
>     performance of SQLite.
>
> So even if the optimization is small, it's the option that has the
> biggest impact on performance.
>
>> We could do it, but IMO that should be a last resort because I’d expect
>> it to be a micro-optimization.
>
> Lots of micro-optimizations lead to an overall faster application ;-).
> And this one doesn't make the code more complicated.  To me it's just a
> bonus.

I agree it doesn’t complicate the code; still, that’s a couple of
additional package variants to deal with, for hardly measurable benefits
I suspect.

I think we should focus on higher-level optimizations at this
development stage of Cuirass.  For instance I have been meaning to patch
it so that it doesn’t have to process all the build logs, since it
doesn’t do anything with those logs and processing them involves tons of
syscalls and string processing and introduces latency in fiber
scheduling.  This is a simple change that could have a more visible
impact I believe.  Hopefully I’ll get there real soon…

WDYT?

Ludo’.




Information forwarded to guix-patches <at> gnu.org:
bug#33210; Package guix-patches. (Tue, 06 Nov 2018 20:11:02 GMT) Full text and rfc822 format available.

Message #38 received at 33210 <at> debbugs.gnu.org (full text, mbox):

From: Danny Milosavljevic <dannym <at> scratchpost.org>
To: Clément Lassieur <clement <at> lassieur.org>
Cc: 33210 <at> debbugs.gnu.org
Subject: Re: [bug#33210] Cuirass: Use a SQLite in single-thread mode
Date: Tue, 6 Nov 2018 21:10:49 +0100
[Message part 1 (text/plain, inline)]
Hi Clément,

> rather basic way.  But if I understand correctly, the overall spent time
> is more or less the same: only the order of the requests differs.

Yeah, right now users can query something using the web interface while
a build is updating (or running) at the cost of the returned data being
potentially very strange.

The "one-line fix" would make it worse in that users cannot query while
a build is running, making them wait until the build is done (approx.
30 min) before the query succeeds.  The upside is that the returned
data is consistent at all times.  This is how DBMSes did it in the 90s,
too.

What I'd like to eventually have is the proper fix where users can query
while a build is running, *and* the build doesn't have to wait either.
This works just fine using two transactions with WAL mode of sqlite,
which means it uses MVCC in order to keep both "world views", one for the
querier and one for the builder (easily extended to an arbitrary
number of queriers and builders at once by just having more transactions)
while they are both using "the world".

> is an essential difference however: if we take care of the scheduling,
> we won't have SQLITE_BUSY blocking the Fibers scheduler all the time.
> And blocking the Fibers scheduler has an impact on all other possibly
> unrelated Fibers clients.

Right.  I just wanted to make sure we understand the possible implications here.

In the end I'm not sure we even need multithreading even for my scenario -
maybe (probably) just having an extra sqlite_open would be enough, threads
or not.  On the other hand there are shared caches etc and this change here
could cause some very tricky problems then.

I have to say I liked the external evaluator much more since then all
this complexity would be contained in the external program and it would
just magically work without special-casing any of this stuff.

>  When guile-sqlite3 handles SQLITE_BUSY
> correctly, I'll be happy to switch back to a multi-threading SQLite.
> While waiting for it, I believe our users want a fast Cuirass, and I'd
> like the time spent in the Fibers scheduler to be balanced by removing
> the SQLite now useless mutexes.

That makes sense.

It's difficult for guile-sqlite3 to handle SQLITE_BUSY correctly since
sqlite also uses SQLITE_BUSY to indicate errors that you are supposed to
fail on.

In the non-presence of a busy handler, it's not possible to distinguish
whether the SQLITE_BUSY was of the "please retry" kind or of the "don't
you retry" kind.

It would mean that guile-sqlite3 would have to have its own flag that
indicates whether the busy handler was called, and check this one.
Resetting this flag would also have to be potentially thread-safe
(for other users of guile-sqlite3).

That's always assuming that sqlite3 undos whatever it was trying to
do before returning SQLITE_BUSY so it actually makes sense to retry
the call later.

So something like this:

guile_sqlite_handle_busy(...) {
  guile_struct->busy_handler_called = true;
  return 0; // fail
}

guile_sqlite_open {
  int rc = native_sqlite_open(...);
  native_sqlite_set_busy_handler(..., guile_sqlite_handle_busy);
  // FIXME: check for errors here and fail on error
  guile_struct->busy_handler_called = false;
}

guile_sqlite_method {
  int rc, busy_handler_called;
  do {
    rc = native_sqlite_method(...);
  } while (rc == SQLITE_BUSY && (busy_handler_called = test-and-reset(guile_struct->busy_handler_called), yield));
  return rc;
}

Hmmmmmmmm.  I think that can be done.

Notes for myself:

pager.c
        busyHandler
                btreeInvokeBusyHandler
                        sqlite3BtreeBeginTrans
        sqlite3PagerSetBusyhandler
        SQLITE_FCNTL_BUSYHANDLER 

[Message part 2 (application/pgp-signature, inline)]

Information forwarded to guix-patches <at> gnu.org:
bug#33210; Package guix-patches. (Wed, 07 Nov 2018 11:39:01 GMT) Full text and rfc822 format available.

Message #41 received at 33210 <at> debbugs.gnu.org (full text, mbox):

From: Clément Lassieur <clement <at> lassieur.org>
To: Danny Milosavljevic <dannym <at> scratchpost.org>
Cc: 33210 <at> debbugs.gnu.org
Subject: Re: [bug#33210] Cuirass: Use a SQLite in single-thread mode
Date: Wed, 07 Nov 2018 12:38:20 +0100
Hi Danny,

Danny Milosavljevic <dannym <at> scratchpost.org> writes:

> Yeah, right now users can query something using the web interface while
> a build is updating (or running) at the cost of the returned data being
> potentially very strange.

This is quite unlikely.

> The "one-line fix" would make it worse in that users cannot query while
> a build is running, making them wait until the build is done (approx.
> 30 min) before the query succeeds.  The upside is that the returned
> data is consistent at all times.  This is how DBMSes did it in the 90s,
> too.

No, there are no SQL requests during the build.  There are some before
the build, and some after the build.  But they are all short.

> What I'd like to eventually have is the proper fix where users can query
> while a build is running, *and* the build doesn't have to wait either.
> This works just fine using two transactions with WAL mode of sqlite,
> which means it uses MVCC in order to keep both "world views", one for the
> querier and one for the builder (easily extended to an arbitrary
> number of queriers and builders at once by just having more transactions)
> while they are both using "the world".

It's already the case, because all the queries are very short.

>> is an essential difference however: if we take care of the scheduling,
>> we won't have SQLITE_BUSY blocking the Fibers scheduler all the time.
>> And blocking the Fibers scheduler has an impact on all other possibly
>> unrelated Fibers clients.
>
> Right.  I just wanted to make sure we understand the possible implications here.
>
> In the end I'm not sure we even need multithreading even for my scenario -
> maybe (probably) just having an extra sqlite_open would be enough, threads
> or not.  On the other hand there are shared caches etc and this change here
> could cause some very tricky problems then.

I don't understand this.  Could you explain why we would need an extra
sqlite_open, what change are you talking about, and why there could be
tricky problems with shared caches?

> I have to say I liked the external evaluator much more since then all
> this complexity would be contained in the external program and it would
> just magically work without special-casing any of this stuff.

The evaluator is still external, I'm not sure what you are talking
about.

>>  When guile-sqlite3 handles SQLITE_BUSY
>> correctly, I'll be happy to switch back to a multi-threading SQLite.
>> While waiting for it, I believe our users want a fast Cuirass, and I'd
>> like the time spent in the Fibers scheduler to be balanced by removing
>> the SQLite now useless mutexes.
>
> That makes sense.
>
> It's difficult for guile-sqlite3 to handle SQLITE_BUSY correctly since
> sqlite also uses SQLITE_BUSY to indicate errors that you are supposed to
> fail on.

[...]

> Hmmmmmmmm.  I think that can be done.

Cool!

Cheers,
Clément




Information forwarded to guix-patches <at> gnu.org:
bug#33210; Package guix-patches. (Thu, 08 Nov 2018 00:00:02 GMT) Full text and rfc822 format available.

Message #44 received at 33210 <at> debbugs.gnu.org (full text, mbox):

From: Danny Milosavljevic <dannym <at> scratchpost.org>
To: Clément Lassieur <clement <at> lassieur.org>
Cc: 33210 <at> debbugs.gnu.org
Subject: Re: [bug#33210] Cuirass: Use a SQLite in single-thread mode
Date: Thu, 8 Nov 2018 00:59:41 +0100
[Message part 1 (text/plain, inline)]
Hi Clément,

> > Yeah, right now users can query something using the web interface while
> > a build is updating (or running) at the cost of the returned data being
> > potentially very strange.  
> 
> This is quite unlikely.

When testing the cuirass status frontend it happened regularily without
me trying to make it happen - took quite some time to find the cause, too.

> No, there are no SQL requests during the build.  There are some before
> the build, and some after the build.  But they are all short.

Yes, if there is one transaction right before starting the build and another
transaction at the end of the build, then it's much better.

> > I have to say I liked the external evaluator much more since then all
> > this complexity would be contained in the external program and it would
> > just magically work without special-casing any of this stuff.  
> 
> The evaluator is still external, I'm not sure what you are talking
> about.

Hmm, I'll read through the source for a bit.  I was of the impression
that now the cuirass main process did the updating of the build status
rather than the evaluator.

> > It's difficult for guile-sqlite3 to handle SQLITE_BUSY correctly since
> > sqlite also uses SQLITE_BUSY to indicate errors that you are supposed to
> > fail on.  
> 
> [...]
> 
> > Hmmmmmmmm.  I think that can be done.  

I've tried it and it works well enough, although some of the sqlite
documentation makes it sound like one cannot *just* retry some of the
calls (for example: sqlite3_step).

It's one of the disadantages of fibers that every C library has to have
special code in it to support it somehow (if at all) - it means that
it has to be written in a way to make all calls non-blocking.
[Message part 2 (application/pgp-signature, inline)]

Information forwarded to guix-patches <at> gnu.org:
bug#33210; Package guix-patches. (Thu, 08 Nov 2018 07:46:01 GMT) Full text and rfc822 format available.

Message #47 received at 33210 <at> debbugs.gnu.org (full text, mbox):

From: Clément Lassieur <clement <at> lassieur.org>
To: Danny Milosavljevic <dannym <at> scratchpost.org>
Cc: 33210 <at> debbugs.gnu.org, Andy Wingo <wingo <at> pobox.com>
Subject: Re: [bug#33210] Cuirass: Use a SQLite in single-thread mode
Date: Thu, 08 Nov 2018 08:45:33 +0100
Danny Milosavljevic <dannym <at> scratchpost.org> writes:

> Hi Clément,
>
>> > Yeah, right now users can query something using the web interface while
>> > a build is updating (or running) at the cost of the returned data being
>> > potentially very strange.  
>> 
>> This is quite unlikely.
>
> When testing the cuirass status frontend it happened regularily without
> me trying to make it happen - took quite some time to find the cause, too.

Maybe it happened without me noticing then.  I'll trust you on this :-)

>> No, there are no SQL requests during the build.  There are some before
>> the build, and some after the build.  But they are all short.
>
> Yes, if there is one transaction right before starting the build and another
> transaction at the end of the build, then it's much better.
>
>> > I have to say I liked the external evaluator much more since then all
>> > this complexity would be contained in the external program and it would
>> > just magically work without special-casing any of this stuff.  
>> 
>> The evaluator is still external, I'm not sure what you are talking
>> about.
>
> Hmm, I'll read through the source for a bit.  I was of the impression
> that now the cuirass main process did the updating of the build status
> rather than the evaluator.

Oh, yes, the updating of the build status is done by the main process,
but the evaluator is still external and doesn't query the database.  Why
don't you like that?

>> > It's difficult for guile-sqlite3 to handle SQLITE_BUSY correctly since
>> > sqlite also uses SQLITE_BUSY to indicate errors that you are supposed to
>> > fail on.  
>> 
>> [...]
>> 
>> > Hmmmmmmmm.  I think that can be done.  
>
> I've tried it and it works well enough, although some of the sqlite
> documentation makes it sound like one cannot *just* retry some of the
> calls (for example: sqlite3_step).
>
> It's one of the disadantages of fibers that every C library has to have
> special code in it to support it somehow (if at all) - it means that
> it has to be written in a way to make all calls non-blocking.

Indeed.  I wonder how goroutines deal with this.  Cc'ing Andy.

Clément




Reply sent to Clément Lassieur <clement <at> lassieur.org>:
You have taken responsibility. (Thu, 13 Dec 2018 13:58:03 GMT) Full text and rfc822 format available.

Notification sent to Clément Lassieur <clement <at> lassieur.org>:
bug acknowledged by developer. (Thu, 13 Dec 2018 13:58:03 GMT) Full text and rfc822 format available.

Message #52 received at 33210-done <at> debbugs.gnu.org (full text, mbox):

From: Clément Lassieur <clement <at> lassieur.org>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 33210-done <at> debbugs.gnu.org
Subject: Re: [bug#33210] Cuirass: Use a SQLite in single-thread mode
Date: Thu, 13 Dec 2018 14:57:47 +0100
Ludovic Courtès <ludo <at> gnu.org> writes:

> Hi Clément,
>
> Clément Lassieur <clement <at> lassieur.org> skribis:
>
>> Ludovic Courtès <ludo <at> gnu.org> writes:
>>
>>> Hello,
>>>
>>> Clément Lassieur <clement <at> lassieur.org> skribis:
>>>
>>>> These patches are supposed to slightly improve Cuirass' performances,
>>>> because it doesn't use the multi-threading features.
>>>
>>> Did you notice a measurable difference?
>>
>> I haven't done any measurement yet, but according to the SQLite
>> documentation:
>>
>>     Setting -DSQLITE_THREADSAFE=0 causes all of the mutex and
>>     thread-safety logic in SQLite to be omitted. This is the single
>>     compile-time option that makes the most difference in optimizing the
>>     performance of SQLite.
>>
>> So even if the optimization is small, it's the option that has the
>> biggest impact on performance.
>>
>>> We could do it, but IMO that should be a last resort because I’d expect
>>> it to be a micro-optimization.
>>
>> Lots of micro-optimizations lead to an overall faster application ;-).
>> And this one doesn't make the code more complicated.  To me it's just a
>> bonus.
>
> I agree it doesn’t complicate the code; still, that’s a couple of
> additional package variants to deal with, for hardly measurable benefits
> I suspect.
>
> I think we should focus on higher-level optimizations at this
> development stage of Cuirass.  For instance I have been meaning to patch
> it so that it doesn’t have to process all the build logs, since it
> doesn’t do anything with those logs and processing them involves tons of
> syscalls and string processing and introduces latency in fiber
> scheduling.  This is a simple change that could have a more visible
> impact I believe.  Hopefully I’ll get there real soon…

Understood!

Sorry to reply that late.  Closing.

Clément




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Fri, 11 Jan 2019 12:24:03 GMT) Full text and rfc822 format available.

This bug report was last modified 5 years and 78 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.