GNU bug report logs - #51021
detect loops in module/package graph

Previous Next

Package: guix;

Reported by: raingloom <raingloom <at> riseup.net>

Date: Tue, 5 Oct 2021 00:59:02 UTC

Severity: normal

To reply to this bug, email your comments to 51021 AT debbugs.gnu.org.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-guix <at> gnu.org:
bug#51021; Package guix. (Tue, 05 Oct 2021 00:59:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to raingloom <raingloom <at> riseup.net>:
New bug report received and forwarded. Copy sent to bug-guix <at> gnu.org. (Tue, 05 Oct 2021 00:59:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: raingloom <raingloom <at> riseup.net>
To: Guix Bugs <bug-guix <at> gnu.org>
Subject: detect loops in module/package graph
Date: Tue, 5 Oct 2021 02:58:19 +0200
I'll be short and blunt, currently it sucks big time whenever you have
a loop in your packages.
There is no indication other than Guix hanging forever without any
output.

I don't want to spend my time manually finding loops in graphs,
computers are better at that.

Sadly I don't know when I'll have time to implement this, so if someone
knows of a solution, they should not hesitate with sending a patch and
making all our lives easier.




Information forwarded to bug-guix <at> gnu.org:
bug#51021; Package guix. (Tue, 05 Oct 2021 08:02:02 GMT) Full text and rfc822 format available.

Message #8 received at 51021 <at> debbugs.gnu.org (full text, mbox):

From: Mark H Weaver <mhw <at> netris.org>
To: raingloom <raingloom <at> riseup.net>, 51021 <at> debbugs.gnu.org
Subject: Re: bug#51021: detect loops in module/package graph
Date: Tue, 05 Oct 2021 03:59:00 -0400
Hi,

raingloom <raingloom <at> riseup.net> writes:
> I'll be short and blunt, currently it sucks big time whenever you have
> a loop in your packages.

Agreed.  I've been concerned about this problem since the early days of
Guix.  See <https://bugs.gnu.org/18247>.

Back in August 2014, there was a strongly connected component (SCC)
containing 51 package modules:

  (acl algebra attr avahi base curl cyrus-sasl dejagnu docbook doxygen
   fltk fontutils gcc gd gdb gettext ghostscript gl glib gnome gnupg
   gnutls graphviz groff gsasl gstreamer gtk iso-codes lesstif
   libcanberra linux lisp maths mit-krb5 mpi netpbm openldap pdf
   pulseaudio rrdtool shishi ssh tcl tcsh texinfo texlive valgrind web
   xiph xml xorg)

Since then, that SCC has grown to 277 package modules:

  (acl admin aidc algebra apr aspell assembly astronomy attr audio
   augeas authentication autogen autotools avahi backup base bash bdw-gc
   bison bittorrent boost build-tools c calendar cdrom certs check cmake
   code compression cook cpio cpp crates-graphics crates-gtk crates-io
   cross-base crypto cryptsetup cups curl cyrus-sasl databases
   datastructures dav debian dejagnu dictionaries disk django djvu dns
   docbook docker documentation ebook ed elf emacs emacs-xyz enchant
   enlightenment fabric-management fcitx file-systems finance firmware
   flex fltk fonts fontutils freedesktop freeipmi ftp game-development
   gawk gcc gd gdb geo gettext ghostscript gimp gl glib gnome gnome-xyz
   gnu-doc gnunet gnupg gnuzilla golang gpodder gps graphics graphviz
   groff groovy gsasl gstreamer gtk guile guile-xyz gv haskell
   haskell-apps haskell-check haskell-crypto haskell-web haskell-xyz
   hurd ibus icu4c image image-processing imagemagick inkscape iso-codes
   java java-compression javascript jemalloc jupyter kde kde-frameworks
   kde-internet kde-pim kde-plasma kerberos language less lesstif
   libcanberra libedit libevent libffi libftdi libidn libphidget
   libreoffice libunistring libusb linphone linux lirc lisp lisp-xyz
   llvm logging lsof lua mail man markup maths matrix maven
   maven-parent-pom mcrypt mes messaging mingw monitoring mono mp3 mpd
   mpi multiprecision music nano ncurses netpbm nettle networking nfs
   ninja node noweb nss ocaml ocr onc-rpc openbox opencl openldap
   openstack package-management parallel password-utils patchutils
   pciutils pcre pdf perl perl-check perl-compression perl-maths
   perl-web photo php plotutils polkit popt pretty-print protobuf
   pulseaudio python python-check python-compression python-crypto
   python-science python-web python-xyz qt rails rdesktop rdf readline
   rpc rrdtool rsync ruby rust rust-apps samba scanner scheme screen sdl
   search security-token selinux serialization shells slang speech
   sphinx spice sqlite ssh sssd suckless swig sync tcl telephony
   terminals tex texinfo text-editors textutils time tls tor uml upnp
   valgrind version-control video vim virtualization vpn vulkan w3m web
   web-browsers webkit wget wine wordnet wxwidgets xdisorg xfig xiph xml
   xorg)

There's also a second, much smaller SCC:

  (bioconductor bioinformatics cran graph machine-learning statistics)

> I don't want to spend my time manually finding loops in graphs,
> computers are better at that.
>
> Sadly I don't know when I'll have time to implement this, so if someone
> knows of a solution, they should not hesitate with sending a patch and
> making all our lives easier.

I've attached a script that I hacked up in 2014 to analyze the Guix
package module dependency graph.  Note the (chdir "gnu/packages") in the
middle of it, so it must be loaded from the top directory of the Guix
source code, and the REPL will be in "gnu/packages" after loading it.
Here's an example of its use:

--8<---------------cut here---------------start------------->8---
mhw <at> jojen ~/guix$ guile -l cycle-viewer.scm
Found the following non-trivial strongly-connected components:
(bioconductor
  bioinformatics
  cran
  graph
  statistics
  machine-learning)

(autotools
  perl
  base
[… 272 lines elided …]
  libftdi
  perl-maths)

GNU Guile 3.0.2
Copyright (C) 1995-2020 Free Software Foundation, Inc.

Guile comes with ABSOLUTELY NO WARRANTY; for details type `,show w'.
This program is free software, and you are welcome to redistribute it
under certain conditions; type `,show c' for details.

Enter `,help' for help.
scheme@(guile-user)> (length non-trivial-sccs)
$1 = 2
scheme@(guile-user)> (map length non-trivial-sccs)
$2 = (6 277)
scheme@(guile-user)> (first non-trivial-sccs)
$3 = (bioconductor bioinformatics cran graph statistics machine-learning)
scheme@(guile-user)> (second non-trivial-sccs)
$4 = (autotools perl base acl attr gettext check bash compression …)
scheme@(guile-user)> ,pp (edges-within (first non-trivial-sccs))
$5 = ((bioconductor . statistics)
 (bioconductor . graph)
 (bioconductor . cran)
 (bioconductor . bioinformatics)
 (bioinformatics . statistics)
 (bioinformatics . machine-learning)
 (bioinformatics . graph)
 (bioinformatics . cran)
 (bioinformatics . bioconductor)
 (cran . statistics)
 (cran . machine-learning)
 (cran . graph)
 (cran . bioinformatics)
 (graph . statistics)
 (graph . cran)
 (graph . bioinformatics)
 (graph . bioconductor)
 (machine-learning . statistics)
 (machine-learning . cran)
 (statistics . machine-learning)
 (statistics . cran))
scheme@(guile-user)> (with-output-to-file "/tmp/BIO-SCC.dot" (lambda () (write-scc-dot (first non-trivial-sccs))))
scheme@(guile-user)> (with-output-to-file "/tmp/MAIN-SCC.dot" (lambda () (write-scc-dot (second non-trivial-sccs))))
scheme@(guile-user)> ,quit
--8<---------------cut here---------------end--------------->8---

If someone would like to polish this into a more usable tool, and
possibly integrate it into Guix, please feel free.

       Mark

-- 
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about <https://stallmansupport.org>.




Information forwarded to bug-guix <at> gnu.org:
bug#51021; Package guix. (Tue, 05 Oct 2021 08:06:01 GMT) Full text and rfc822 format available.

Message #11 received at 51021 <at> debbugs.gnu.org (full text, mbox):

From: Mark H Weaver <mhw <at> netris.org>
To: raingloom <raingloom <at> riseup.net>, 51021 <at> debbugs.gnu.org
Subject: Re: bug#51021: detect loops in module/package graph
Date: Tue, 05 Oct 2021 04:03:24 -0400
[Message part 1 (text/plain, inline)]
Earlier, I wrote
> I've attached a script that I hacked up in 2014 to analyze the Guix
> package module dependency graph.

Here's the script:

[cycle-viewer.scm (text/plain, inline)]
;;; cycle-viewer.scm: a Guix package module dependency graph analyzer
;;; Copyright (C) 2014  Mark H Weaver <mhw <at> netris.org>
;;;
;;; This program is free software: you can redistribute it and/or modify
;;; it under the terms of the GNU General Public License as published by
;;; the Free Software Foundation, either version 3 of the License, or
;;; (at your option) any later version.
;;;
;;; This program is distributed in the hope that it will be useful,
;;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
;;; GNU General Public License for more details.
;;;
;;; You should have received a copy of the GNU General Public License
;;; along with this program.  If not, see <http://www.gnu.org/licenses/>.

(use-modules (srfi srfi-1)
             (srfi srfi-26)
             (ice-9 match)
             (ice-9 ftw)
             (ice-9 pretty-print))

;;;
;;; Tarjan's strongly connected components algorithm
;;;
;;; Robert Tarjan, Depth-first search and linear graph algorithms.
;;; SIAM Journal on Computing, 1(2):146-160, 1972.
;;;
;;;
;;; vertices is the list of vertices, which may be any objects that
;;; can be distinguished using 'equal?'.
;;;
;;; edges is the list of edges, where each edge is a pair (w . v)
;;; representing the directed edge w => v, for vertices w and v.
;;;
;;; The return value is a list of the strongly-connected components,
;;; where each strongly-connected component (SCC) is represented as a
;;; list of the vertices it contains.  The returned SCCs are sorted in
;;; topological order.
;;;
(define (strongly-connected-components vertices edges)
  (define size (length vertices))
  (define vs (iota size))

  (define lookup
    (let ((t (make-hash-table size)))
      (for-each (cut hash-set! t <> <>) vertices vs)
      (cut hash-ref t <>)))

  (define name
    (let ((t (make-vector size #f)))
      (for-each (cut vector-set! t <> <>) vs vertices)
      (cut vector-ref t <>)))

  (define (vector-update! v i f)
    (vector-set! v i (f (vector-ref v i))))

  (define (compose f g) (lambda (x) (f (g x))))

  (define successors
    (let ((t (make-vector size '())))
      (for-each (lambda (v w) (vector-update! t v (cut cons w <>)))
                (map (compose lookup car) edges)
                (map (compose lookup cdr) edges))
      (cut vector-ref t <>)))

  (define new-index
    (let ((i -1))
      (lambda ()
        (set! i (+ i 1))
        i)))

  (define index-table (make-vector size #f))
  (define index       (cut vector-ref  index-table <>))
  (define set-index!  (cut vector-set! index-table <> <>))

  (define lowlink-table (make-vector size size))
  (define lowlink (cut vector-ref lowlink-table <>))
  (define (update-lowlink! v x)
    (if v (vector-update! lowlink-table v (cut min x <>))))

  (define done-table (make-bitvector size #f))
  (define done? (cut bitvector-ref  done-table <>))
  (define done! (cut bitvector-set! done-table <> #t))

  (define results '())
  (define pending '())

  (define (finalize! v)
    (let loop ((names '()) (p pending))
      (done! (car p))
      (cond ((eqv? v (car p))
             (set! pending (cdr p))
             (set! results (cons (cons (name v) names)
                                 results)))
            (else (loop (cons (name (car p))
                              names)
                        (cdr p))))))

  (let loop ((v #f) (ws vs) (stack '()))
    (cond ((pair? ws)
           (let ((w (car ws)))
             (cond ((index w) => (lambda (wi)
                                   (if (not (done? w))
                                       (update-lowlink! v wi))
                                   (loop v (cdr ws) stack)))
                   (else (let ((wi (new-index)))
                           (set-index! w wi)
                           (update-lowlink! w wi)
                           (set! pending (cons w pending))
                           (loop w (successors w)
                                 (cons (cons v (cdr ws))
                                       stack)))))))
          ((pair? stack)
           (if (and v (= (index v) (lowlink v)))
               (finalize! v))
           (update-lowlink! (caar stack) (lowlink v))
           (loop (caar stack) (cdar stack) (cdr stack)))
          (else results))))

(chdir "gnu/packages")

(define files (scandir "." (cut string-suffix? ".scm" <>)))
(define headers (map (cut call-with-input-file <> read)
                     files))
(define modules
  (filter-map
   (lambda (header)
     (match header
       (('define-module ('gnu 'packages name) . _) name)
       (('define-module module-name . _)
        (format (current-warning-port)
                "Warning: found unexpected module name ~S in gnu/packages/*.scm~%"
                module-name)
        #f)))
   headers))

(define dependencies
  (append-map
   (lambda (header)
     (match header
       (('define-module ('gnu 'packages module) . rest)
        (let loop ((rest rest)
                   (deps '()))
          (match rest
            (() deps)
            ((#:use-module ('gnu 'packages name) . rest)
             (loop rest `((,module . ,name) . ,deps)))
            ((#:use-module (('gnu 'packages name) . _) . rest)
             (loop rest `((,module . ,name) . ,deps)))
            ((#:use-module _ . rest)
             (loop rest deps))
            ((#:export _ . rest)
             (loop rest deps))
            ((#:autoload _ _ . rest)
             (loop rest deps)))))
       (('define-module module-name . _) '())))
   headers))

(define sccs (strongly-connected-components modules dependencies))

(define (non-trivial? scc)
  (not (= 1 (length scc))))

(define non-trivial-sccs (filter non-trivial? sccs))

(unless (null? non-trivial-sccs)
  (display "Found the following non-trivial strongly-connected components:")
  (newline)
  (for-each (lambda (scc)
              (pretty-print scc)
              (newline))
            non-trivial-sccs))

(define (edges-within vs)
  (filter (match-lambda
            ((a . b) (and (member a vs) (member b vs))))
          dependencies))

(define (edges-involving vs)
  (filter (match-lambda
            ((a . b) (or (member a vs) (member b vs))))
          dependencies))

(define (edges-from vs)
  (filter (match-lambda
            ((a . b) (member a vs)))
          dependencies))

(define (edges-to vs)
  (filter (match-lambda
            ((a . b) (member b vs)))
          dependencies))

(define (module-label module)
  (symbol->string module))

(define* (write-edges-dot edges #:optional (port (current-output-port)))
  (display "digraph {\n" port)
  (for-each (match-lambda
              ((a . b) (format port "  ~S -> ~S;\n"
                               (module-label a)
                               (module-label b))))
            edges)
  (display "}\n" port))

(define* (write-scc-dot scc #:optional (port (current-output-port)))
  (write-edges-dot (edges-within scc) port))

[Message part 3 (text/plain, inline)]
-- 
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about <https://stallmansupport.org>.

Information forwarded to bug-guix <at> gnu.org:
bug#51021; Package guix. (Thu, 07 Oct 2021 13:29:02 GMT) Full text and rfc822 format available.

Message #14 received at 51021 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Mark H Weaver <mhw <at> netris.org>
Cc: 51021 <at> debbugs.gnu.org, raingloom <raingloom <at> riseup.net>
Subject: Re: bug#51021: detect loops in module/package graph
Date: Thu, 07 Oct 2021 15:28:48 +0200
Hi Mark,

Mark H Weaver <mhw <at> netris.org> skribis:

> raingloom <raingloom <at> riseup.net> writes:
>> I'll be short and blunt, currently it sucks big time whenever you have
>> a loop in your packages.
>
> Agreed.  I've been concerned about this problem since the early days of
> Guix.  See <https://bugs.gnu.org/18247>.
>
> Back in August 2014, there was a strongly connected component (SCC)
> containing 51 package modules:

Thanks for the analysis and for the updated patch!

Module cycles are something we allow and even rely on, so finding cycles
in itself is not necessarily helpful.  What would help is finding cyclic
top-level references, which are those that cause problems, but that’s
another story.

WDYT?

Now, I’m not sure if raingloom was talking about these cycles, or rather
about cycles in the package dependency graph?

Chris Baines proposed a patch a while back to report those, though I
can’t find it anymore.  IIRC, the difficulty was in making sure cycle
detection would not be too expensive, and in keeping a readable style.

Thanks,
Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#51021; Package guix. (Mon, 11 Oct 2021 07:50:01 GMT) Full text and rfc822 format available.

Message #17 received at 51021 <at> debbugs.gnu.org (full text, mbox):

From: zimoun <zimon.toutoune <at> gmail.com>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: Mark H Weaver <mhw <at> netris.org>, 51021 <at> debbugs.gnu.org
Subject: Re: bug#51021: detect loops in module/package graph
Date: Mon, 11 Oct 2021 09:49:00 +0200
Hi Ludo,

On Thu, 7 Oct 2021 at 15:34, Ludovic Courtès <ludo <at> gnu.org> wrote:
> Mark H Weaver <mhw <at> netris.org> skribis:
> > raingloom <raingloom <at> riseup.net> writes:

> >> I'll be short and blunt, currently it sucks big time whenever you have
> >> a loop in your packages.
> >
> > Agreed.  I've been concerned about this problem since the early days of
> > Guix.  See <https://bugs.gnu.org/18247>.
> >
> > Back in August 2014, there was a strongly connected component (SCC)
> > containing 51 package modules:
>
> Thanks for the analysis and for the updated patch!
>
> Module cycles are something we allow and even rely on, so finding cycles
> in itself is not necessarily helpful.  What would help is finding cyclic
> top-level references, which are those that cause problems, but that’s
> another story.

What Mark had implemented [1] works for any directed graph.  What do
you mean by "top-level references"?

1: <https://en.wikipedia.org/wiki/Tarjan%27s_strongly_connected_components_algorithm>

> Now, I’m not sure if raingloom was talking about these cycles, or rather
> about cycles in the package dependency graph?

Probably. ;-)

But the way to detect cycle could be applied to any graph that Guix
uses.  For instance, something totally irrelevant that I would never
think of: channels [2]. :-)

2: <http://issues.guix.gnu.org/issue/41069>

> Chris Baines proposed a patch a while back to report those, though I
> can’t find it anymore.  IIRC, the difficulty was in making sure cycle
> detection would not be too expensive, and in keeping a readable style.

From my memories about Graph Theory, the algorithm Mark is proposing
is an efficient way to detect cycles (cycle is strong connected
component).  BTW, detect cycle is (almost) the same complexity as
topological sort; since cycles are obstacle for topological sort to
exist, somehow.  I remember too the Chris's proposal but I do not
remember what they implemented.

I do not understand what you mean by "keeping a readable style".

All the best,
simon




Information forwarded to bug-guix <at> gnu.org:
bug#51021; Package guix. (Tue, 12 Oct 2021 09:48:02 GMT) Full text and rfc822 format available.

Message #20 received at 51021 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: zimoun <zimon.toutoune <at> gmail.com>
Cc: Mark H Weaver <mhw <at> netris.org>, 51021 <at> debbugs.gnu.org
Subject: Re: bug#51021: detect loops in module/package graph
Date: Tue, 12 Oct 2021 11:47:39 +0200
Hi,

zimoun <zimon.toutoune <at> gmail.com> skribis:

> What Mark had implemented [1] works for any directed graph.  What do
> you mean by "top-level references"?

Reference to variables coming from one module of an SCC that appear at
the top level of another module in the SCC.

>> Chris Baines proposed a patch a while back to report those, though I
>> can’t find it anymore.  IIRC, the difficulty was in making sure cycle
>> detection would not be too expensive, and in keeping a readable style.
>
> From my memories about Graph Theory, the algorithm Mark is proposing
> is an efficient way to detect cycles

What I meant is that ‘package-derivation’ traverses the package graph,
so it’s a natural place to add cycle detection.

But since ‘package-derivation’ is so central, care must be taken about
the performance hit and about its readability.

Thanks,
Ludo’.




This bug report was last modified 2 years and 197 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.