Package: guix
Reported by: Eric Brown <ecbrown <at> ericcbrown.com>
Date: Thu, 10 Jun 2021 11:16:01 UTC
Severity: normal
Done: Mathieu Othacehe <othacehe <at> gnu.org>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 48945 in the body.
You can then email your comments to 48945 AT debbugs.gnu.org in the normal way.
Message #5 received at submit <at> debbugs.gnu.org:
From: Eric Brown <ecbrown <at> ericcbrown.com>
To: bug-guix <at> gnu.org
Subject: PostgreSQL + Cuirass Errors
Date: Thu, 10 Jun 2021 06:14:55 -0500

Hello:

Executive Summary:
- Can't reinstall Cuirass and/or PostgreSQL
- Divide by 0 error reported by postgres when computing metrics

Details:

I am having issues reconfiguring Cuirass and PostgreSQL. I wonder if
these are related to several issues in PostgreSQL; they seem to occur
when I reconfigure Cuirass and/or PostgreSQL without Cuirass present,
i.e. my "database server".

/etc/config.scm:
----------------
(define %cuirass-specs
  #~(list (specification
           (name "my-cbc")
           (build '(packages "cbc")))
          (specification
           (name "my-ipopt")
           (build '(packages "ipopt")))
          (specification
           (name "my-linux-libre")
           (build '(packages "linux-libre")))
          (specification
           (name "my-openblas-ilp64")
           (build '(packages "openblas-ilp64")))
          (specification
           (name "my-qtbase")
           (build '(packages "qtbase")))
          (specification
           (name "my-sylpheed")
           (build '(packages "sylpheed")))
          (specification
           (name "my-texlive")
           (build '(packages "texlive")))))

(service cuirass-service-type
         (cuirass-configuration
          (specifications %cuirass-specs)))

An example session trying to get Cuirass re-installed:

1) Comment out Cuirass in /etc/config.scm and reconfigure

building /gnu/store/9nmk3q8nwk51wqanpw4a5agwak0yfhpj-upgrade-shepherd-services.scm.drv...
shepherd: Removing service 'cuirass-web'...
shepherd: Done.
shepherd: Removing service 'postgres-roles'...
shepherd: Done.
shepherd: Removing service 'cuirass'...
shepherd: Done.
shepherd: Removing service 'postgres'...
shepherd: Done.
shepherd: Service host-name has been started.
shepherd: Service user-homes has been started.
shepherd: Service sysctl has been started.
shepherd: Service host-name has been started.
shepherd: Service term-auto could not be started.

To complete the upgrade, run 'herd restart SERVICE' to stop,
upgrade, and restart each service that was not automatically restarted.

Run 'herd status' to view the list of services on your system

2) At shell:

# rm -rf /var/log/cuirass /var/log/cuirass.log* /var/log/cuirass.log /var/log/cuirass-web.log /var/cache/cuirass /var/lib/postgresql/data /var/lib/cuirass

3) Reboot

4) Check no files above are regenerated, e.g. by other services
   requiring postgresql (none found)

5) Re-enable Cuirass in /etc/config.scm, reconfigure:
   (frequently observed error at end of this item)

selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default timezone ... US/Central
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... sh: locale: command not found
2021-06-10 05:57:26.532 CDT [1370] WARNING: no usable system locales were found
ok
syncing data to disk ... ok

WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success. You can now start the database server using:

    /gnu/store/jsa77nkqcvsck4ksvm2b8sccl174hai4-postgresql-10.17/bin/pg_ctl -D /var/lib/postgresql/data -l logfile start

The following derivation will be built:
   /gnu/store/bmzhdkki40d8y6d6n9a3gw4g70xmv824-install-bootloader.scm.drv
building /gnu/store/bmzhdkki40d8y6d6n9a3gw4g70xmv824-install-bootloader.scm.drv...
guix system: bootloader successfully installed on '/boot/efi'
shepherd: Service host-name has been started.
shepherd: Service user-homes has been started.
shepherd: Service sysctl has been started.
shepherd: Service host-name has been started.
shepherd: Service term-auto could not be started.
guix system: warning: exception caught while executing 'start' on service 'postgres':
Throw to key `%exception' with args `("#<&invoke-error program: \"/gnu/store/4x3h2096cvzvq65wv40a4acwdyks9ivc-pg_ctl-wrapper\" arguments: (\"start\") exit-status: 1 term-signal: #f stop-signal: #f>")'.
guix system: warning: some services could not be upgraded
hint: To allow changes to all the system services to take effect, you will need to reboot.

6) Reboot

7) telnet localhost 5432

telnet localhost 5432
Trying 127.0.0.1...
telnet: Unable to connect to remote host: Connection refused

--------

I am also observing divide-by-zero errors reported by a PG process when
computing metrics. Perhaps it is ignorable, but it seems to throw a
Scheme "stack trace" that doesn't look good. I was unable to capture
the specific message due to thrashing to restart Cuirass and the DB.

I am able to reproduce this on several machines; this is my third
attempt to install on a fresh machine and use it as I expect (ability
to add/remove/reconfigure services, etc.).

This may be a red herring, but I can't help but feel that postgres is
getting pulled in from other services as well, and that there may be a
collision (e.g. PostgreSQL 10 and 13 both seem to get referenced).

I have stripped this system back to (essentially) bare-bones.scm, and
see that PostgreSQL is even referenced by the networkmanager
package/service. (Which I am loath to revert to dhcp since it handles
wireguard. :-( )

Best regards
Eric

PS I would add that I have seen an error like:

guix system: warning: exception caught while executing 'start' on service 'postgres':
Throw to key `%exception' with args `("#<&invoke-error program: \"/gnu/store/4x3h2096cvzvq65wv40a4acwdyks9ivc-pg_ctl-wrapper\" arguments: (\"start\") exit-status: 1 term-signal: #f stop-signal: #f>")'.

in another context; it was for nginx, but a reboot fixed that and I can
serve pages.

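The telnet probe in step 7 can also be scripted, which makes it easier to repeat after each reconfigure or reboot. The following is a minimal sketch, assuming Python is available on the machine; `port_open` is a hypothetical helper name, and 5432 is simply the port probed in the report. It only tells whether something accepts TCP connections on that port, not whether PostgreSQL itself is healthy.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError)
        # when nothing is listening, which is the "Connection refused"
        # case shown by telnet in the report.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Same target as `telnet localhost 5432` in step 7.
    print("port 5432 reachable:", port_open("127.0.0.1", 5432))
```

A `False` result here corresponds to the refused connection in the report; the pg_ctl log would still be needed to find out why the server did not start.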
Message #8 received at 48945 <at> debbugs.gnu.org:
From: Eric Brown <ecbrown <at> ericcbrown.com>
To: 48945 <at> debbugs.gnu.org
Subject: Re: bug#48945: PostgreSQL + Cuirass Errors
Date: Tue, 15 Jun 2021 13:48:51 -0500

Eric Brown <ecbrown <at> ericcbrown.com> writes:

> Hello:
>
> Executive Summary:
> - Can't reinstall Cuirass and/or PostgreSQL
> - Divide by 0 error reported by postgres when computing metrics
>

An update on this:

I have reinstalled, and I can get PostgreSQL working. I think my
problem was trying to "reset" cuirass by removing it from config.scm
repeatedly, and shuffling up uid/gid's etc. I think I can avoid this.

The other problem remains: it seems that cuirass rolls along pretty
well for a while and then will report an error. It could also be
triggered perhaps because I am adding a package build rule after
reconfigure -- but I think it's appeared with just these packages as
well.

-----------------
2021-06-14T09:35:01 Updating metric percentage-failure-10-last-eval-per-spec (my-texlive) to 0.0.
2021-06-14T09:35:01 Updating metric percentage-failure-100-last-eval-per-spec (my-texlive) to 18.42105263157895.
2021-06-14T09:35:01 Updating metric percentage-failed-eval-per-spec (my-texlive) to 18.42105263157895.
2021-06-14T09:35:01 Updating metric average-10-last-eval-duration-per-spec (my-xfce) to 31.0.
2021-06-14T09:35:01 Updating metric average-100-last-eval-duration-per-spec (my-xfce) to 31.0.
2021-06-14T09:35:01 Updating metric average-eval-duration-per-spec (my-xfce) to 31.0.
2021-06-14T09:35:01 Updating metric percentage-failure-10-last-eval-per-spec (my-xfce) to 0.0.
2021-06-14T09:35:01 Updating metric percentage-failure-100-last-eval-per-spec (my-xfce) to 0.0.
2021-06-14T09:35:01 Updating metric percentage-failed-eval-per-spec (my-xfce) to 0.0.
2021-06-14T09:35:01 Failed to compute metric average-eval-build-start-time (14335).
2021-06-14T09:35:01 Updating metric average-eval-build-complete-time (14335) to 12.0.
2021-06-14T09:35:01 Updating metric evaluation-completion-speed (14335) to 300.0.
2021-06-14T09:35:01 Failed to compute metric average-eval-build-start-time (14206).
2021-06-14T09:35:01 Updating metric average-eval-build-complete-time (14206) to 1.0.
2021-06-14T09:35:01 Updating metric evaluation-completion-speed (14206) to 3600.0.
2021-06-14T09:35:01 Failed to compute metric average-eval-build-start-time (14196).
2021-06-14T09:35:01 Updating metric average-eval-build-complete-time (14196) to 0.0.
2021-06-14T09:35:01 fatal: uncaught exception 'psql-query-error' in 'metrics' fiber!
2021-06-14T09:35:01 exception arguments: (fatal-error "PGRES_FATAL_ERROR" "ERROR: division by zero\n")
In ice-9/boot-9.scm:
  1747:15 11 (with-exception-handler #<procedure 7fa71b9f25d0 at ic?> ?)
  1752:10 10 (with-exception-handler _ _ #:unwind? _ # _)
    724:2  9 (call-with-prompt ("break") #<procedure 7fa71f8cfe00 a?> ?)
    724:2  8 (call-with-prompt ("continue") #<procedure 7fa71f8d5f8?> ?)
In ice-9/eval.scm:
    619:8  7 (_ #(#(#<directory (cuirass scripts register) 7fa72?> ?)))
In cuirass/logging.scm:
    58:18  6 (call-with-time-logging "Metrics update" #<procedure 7f?>)
In ice-9/boot-9.scm:
  1685:16  5 (raise-exception _ #:continuable? _)
  1780:13  4 (_ #<&compound-exception components: (#<&error> #<&orig?>)
   2137:6  3 (_ _ . _)
  1747:15  2 (with-exception-handler #<procedure 7fa7250ccb10 at ic?> ?)
In cuirass/utils.scm:
   299:22  1 (_)
In unknown file:
           0 (make-stack #t)
ERROR: In procedure make-stack:
Throw to key `psql-query-error' with args `(fatal-error "PGRES_FATAL_ERROR" "ERROR: division by zero\n")'.

Some deprecated features have been used. Set the environment
variable GUILE_WARN_DEPRECATED to "detailed" and rerun the
program to get more information. Set it to "no" to suppress
this message.

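The values logged just before the crash fit a rate of the form 3600 / duration: a complete time of 12.0 is followed by a speed of 300.0, 1.0 by 3600.0, and the last complete time logged is 0.0. This is an inference from the log alone, not a description of the actual Cuirass query. Under that assumption, a minimal sketch of the arithmetic and of a zero guard could look like this; `rate_per_hour` is a hypothetical name.

```python
from typing import Optional

SECONDS_PER_HOUR = 3600.0


def rate_per_hour(duration_seconds: float) -> Optional[float]:
    """Return 3600 / duration, or None when the duration is zero.

    Assumed formula, inferred from the logged pairs (12.0 -> 300.0,
    1.0 -> 3600.0); the real metric may be computed differently.
    """
    if duration_seconds == 0:
        # Without this guard the division fails, which is the analogue
        # of PostgreSQL's "ERROR: division by zero" in the log.
        return None
    return SECONDS_PER_HOUR / duration_seconds


if __name__ == "__main__":
    for duration in (12.0, 1.0, 0.0):
        print(duration, "->", rate_per_hour(duration))
```

In SQL the same guard is commonly written by wrapping the divisor in `NULLIF(divisor, 0)`, so that the division yields NULL instead of raising an error.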
Message #11 received at 48945 <at> debbugs.gnu.org:
From: "Reza Alizadeh Majd" <r.majd <at> pantherx.org>
To: 48945 <at> debbugs.gnu.org
Subject: Re: bug#48945: PostgreSQL + Cuirass Errors
Date: Mon, 19 Jul 2021 15:12:46 +0430

Hi,

Is there any update on this issue?

I receive the same "division by zero" error after running cuirass for a
while. After deleting the records with `starttime` equal to 0 from the
`builds` table, cuirass could start again, but the issue happens again
after a while.

--
Reza Alizadeh Majd
PantherX Team
https://pantherx.org

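The workaround described above amounts to a single DELETE statement. The sketch below illustrates it on an in-memory SQLite database used purely as a stand-in for the real PostgreSQL database. Only the table name `builds` and the column `starttime` come from the message; the `id` and `stoptime` columns, the sample rows, and the helper name `purge_unstarted` are made up for illustration. As the message notes, the error came back after a while, so this is a temporary measure at best.

```python
import sqlite3

# In-memory stand-in; the real data lives in PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE builds ("
    "id INTEGER PRIMARY KEY, starttime INTEGER, stoptime INTEGER)"
)
# Hypothetical sample rows: two with starttime = 0, one regular build.
conn.executemany(
    "INSERT INTO builds (starttime, stoptime) VALUES (?, ?)",
    [(0, 0), (1623681000, 1623681012), (0, 1623681300)],
)


def purge_unstarted(connection: sqlite3.Connection) -> int:
    """Delete builds whose starttime is 0 and return how many were removed."""
    cursor = connection.execute("DELETE FROM builds WHERE starttime = 0")
    connection.commit()
    return cursor.rowcount


removed = purge_unstarted(conn)
remaining = conn.execute("SELECT COUNT(*) FROM builds").fetchone()[0]
print(f"removed {removed} row(s), {remaining} left")
```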
Message #16 received at 48945-done <at> debbugs.gnu.org:
From: Mathieu Othacehe <othacehe <at> gnu.org>
To: "Reza Alizadeh Majd" <r.majd <at> pantherx.org>
Cc: 48945-done <at> debbugs.gnu.org
Subject: Re: bug#48945: PostgreSQL + Cuirass Errors
Date: Wed, 11 Aug 2021 10:07:54 +0200

Hello,

> deleting the records with `starttime` equal to 0 from `builds` table,
> cuirass could start again. but the issue happens again after a while.

This is fixed with aa2f682facce5de727bdae5bbd5d1a2a27923ebb and
1dcaebc66097ce503bd827c7b28e0a0936c1daee.

Thanks,

Mathieu

Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Wed, 08 Sep 2021 11:24:06 GMT)