GNU bug report logs - #76811
[PATCH] services: nginx: Replace invoke with spawn-command.
Report forwarded to guix-patches <at> gnu.org: bug#76811; Package guix-patches. (Fri, 07 Mar 2025 13:04:02 GMT)
Acknowledgement sent to Christopher Baines <mail <at> cbaines.net>: New bug report received and forwarded. Copy sent to guix-patches <at> gnu.org. (Fri, 07 Mar 2025 13:04:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
I'm not sure where invoke is coming from here, but it could be from (guix
build utils), that uses system* which uses waitpid, which might cause problems
with recent versions of the shepherd?
At least I'm seeing issues on multiple machines where attempting to restart
the nginx service sometimes causes the shepherd to hang.
* gnu/services/web.scm (nginx-shepherd-service): Replace invoke with
spawn-command.
Change-Id: Ie9ce4be9a4df121465b28148612b4fbc45fb5126
---
gnu/services/web.scm | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/gnu/services/web.scm b/gnu/services/web.scm
index 7593cd2eaa..b46a4db73f 100644
--- a/gnu/services/web.scm
+++ b/gnu/services/web.scm
@@ -870,7 +870,8 @@ (define (nginx-shepherd-service config)
          (nginx-action
           (lambda args
             #~(lambda _
-                (invoke #$nginx-binary "-c" #$config-file #$@args)
+                (spawn-command
+                 (list #$nginx-binary "-c" #$config-file #$@args))
                 (match '#$args
                   (("-s" . _) #t)
                   (_
base-commit: 9bc4c9f521caab8aa8d88aa948a650945bb55838
--
2.48.1
Information forwarded to guix-patches <at> gnu.org: bug#76811; Package guix-patches. (Fri, 07 Mar 2025 13:57:01 GMT)
Message #8 received at 76811 <at> debbugs.gnu.org (full text, mbox):
Christopher Baines <mail <at> cbaines.net> skribis:
> I'm not sure where invoke is coming from here, but it could be from (guix
> build utils), that uses system* which uses waitpid, which might cause problems
> with recent versions of the shepherd?
>
> At least I'm seeing issues on multiple machines where attempting to restart
> the nginx service sometimes causes the shepherd to hang.
>
> * gnu/services/web.scm (nginx-shepherd-service): Replace invoke with
> spawn-command.
>
> Change-Id: Ie9ce4be9a4df121465b28148612b4fbc45fb5126
Hi! ‘invoke’ uses ‘system*’, which is an alias for ‘spawn-command’ (see
‘replace-core-bindings!’ in ‘shepherd.scm’) so the only effect of this
patch is that errors from “nginx -c nginx.conf …” would be ignored.
I think we need a reproducer for the hang so we can pinpoint the
problem because it’s a pretty serious bug!
Ludo’.
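
To make the point about ignored errors concrete, here is a minimal sketch in plain Guile. It uses the core `system*` rather than the Shepherd's replacement, and `checked-invoke` is a hypothetical stand-in for `invoke`, not the actual code from (guix build utils). It assumes an `sh` binary is on PATH.

```scheme
;; Plain Guile sketch; outside shepherd, 'system*' is the core binding.
;; 'checked-invoke' is a hypothetical stand-in for 'invoke'.
(define (checked-invoke program . args)
  "Run PROGRAM with ARGS; raise an error unless it exits with status 0."
  (let ((status (apply system* program args)))
    (unless (zero? status)
      (error "command failed" (cons program args) status))
    #t))

;; Status-returning style: the failure is only visible if the caller
;; inspects the returned status.
(define status (system* "sh" "-c" "exit 3"))
(display (status:exit-val status))
(newline)

;; invoke style: the same failure becomes an exception.
(catch #t
  (lambda () (checked-invoke "sh" "-c" "exit 3"))
  (lambda (key . rest)
    (display "command failed, exception raised")
    (newline)))
```

If the return value of the status-returning call is simply discarded, as in the patched service action, a failing command goes unnoticed.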
Reply sent to Christopher Baines <mail <at> cbaines.net>: You have taken responsibility. (Fri, 07 Mar 2025 17:19:02 GMT)
Notification sent to Christopher Baines <mail <at> cbaines.net>: bug acknowledged by developer. (Fri, 07 Mar 2025 17:19:02 GMT)
Message #13 received at 76811-close <at> debbugs.gnu.org (full text, mbox):
Ludovic Courtès <ludo <at> gnu.org> writes:
> Christopher Baines <mail <at> cbaines.net> skribis:
>
>> I'm not sure where invoke is coming from here, but it could be from (guix
>> build utils), that uses system* which uses waitpid, which might cause problems
>> with recent versions of the shepherd?
>>
>> At least I'm seeing issues on multiple machines where attempting to restart
>> the nginx service sometimes causes the shepherd to hang.
>>
>> * gnu/services/web.scm (nginx-shepherd-service): Replace invoke with
>> spawn-command.
>>
>> Change-Id: Ie9ce4be9a4df121465b28148612b4fbc45fb5126
>
> Hi! ‘invoke’ uses ‘system*’, which is an alias for ‘spawn-command’ (see
> ‘replace-core-bindings!’ in ‘shepherd.scm’) so the only effect of this
> patch is that errors from “nginx -c nginx.conf …” would be ignored.
Ah, yes, I see, I've tried to verify this and it does seem that the
nginx server is using this system* replacement.
> I think we need a reproducer for the hang so we can pinpoint the
> problem because it’s a pretty serious bug!
I did try restarting nginx over and over again in the system test os,
but that seemed to work.
On a VM I have though, it only takes a few restarts for it to hang, I'm
not sure why though.
Information forwarded to guix-patches <at> gnu.org: bug#76811; Package guix-patches. (Fri, 07 Mar 2025 23:10:02 GMT)
Message #16 received at 76811 <at> debbugs.gnu.org (full text, mbox):
Hey,
> I did try restarting nginx over and over again in the system test os,
> but that seemed to work.
>
> On a VM I have though, it only takes a few restarts for it to hang, I'm
> not sure why though.
It could be that nginx alone would work well, but combining it with
another service that does unusual things triggers the bug.
One way to investigate would be to start from ‘bayfront.scm’ (which I
think has that problem) and boil it down until we have something that
can run in a VM and reproduces the problem.
I don’t see any ‘waitpid’ uses left in Shepherd services under
gnu/services/*.scm.
One thing I found that is risky is ‘guix-data-service-setup-database’:
it loads a bunch of (guix-data-service …) modules into PID 1 and runs
non-trivial code in there; I strongly recommend doing this in a separate
process, similar to how ‘bffe-shepherd-services’ does it.
Ludo’.
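
The separate-process recommendation can be sketched roughly as follows. This is plain Guile, not Shepherd service code: `run-in-separate-process` is a hypothetical helper, and it assumes a `guile` binary is on PATH. The idea is only that modules loaded by the child never enter the calling process.

```scheme
;; Hypothetical helper: evaluate EXPR in a fresh 'guile' child process, so
;; that any modules it loads stay out of the calling process.
(define (run-in-separate-process expr)
  "Evaluate EXPR in a child Guile and return the child's exit code."
  (let ((status (system* "guile" "-c" (object->string expr))))
    (status:exit-val status)))

;; The child loads (srfi srfi-1) and reports success or failure through its
;; exit code; the parent's module table is left untouched.
(define exit-code
  (run-in-separate-process
   '(begin
      (use-modules (srfi srfi-1))
      (exit (if (= 1 (first (list 1 2))) 0 1)))))

(display exit-code)
(newline)
```

In an actual Shepherd service, a constructor such as `make-forkexec-constructor` would play the role of the helper above.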
Information forwarded to guix-patches <at> gnu.org: bug#76811; Package guix-patches. (Mon, 10 Mar 2025 22:05:02 GMT)
Message #19 received at 76811 <at> debbugs.gnu.org (full text, mbox):
Ludovic Courtès <ludo <at> gnu.org> skribis:
> One thing I found that is risky is ‘guix-data-service-setup-database’:
> it loads a bunch of (guix-data-service …) modules into PID 1 and runs
> non-trivial code in there; I strongly recommend doing this in a separate
> process, similar to how ‘bffe-shepherd-services’ does it.
The guix-data-service test shows that:
https://ci.guix.gnu.org/build/9557216/log
Namely:
--8<---------------cut here---------------start------------->8---
[ 5.788985] shepherd[1]: Service loopback started.
[ 5.790001] shepherd[1]: Service loopback running with value #t.
Uncaught exception in task:
In fibers.scm:
172:8 6 (_)
In shepherd/service/system-log.scm:
180:10 5 (run-system-log #<<channel> getq: #<atomic-box 7fcc8e8?> ?)
In srfi/srfi-1.scm:
586:17 4 (map1 (#<input-output: socket 17> #<input: /proc/kmsg?>))
In shepherd/service/system-log.scm:
181:33 3 (_ #<input-output: socket 17>)
In fibers/io-wakeup.scm:
72:13 2 (make-wait-operation #<procedure 7fcc8e8b0300 at fiber?> ?)
72:13 1 (make-wait-operation #f #<procedure 7fcc8e8b0300 at fi?> ?)
In ice-9/boot-9.scm:
1685:16 0 (raise-exception _ #:continuable? _)
ice-9/boot-9.scm:1685:16: In procedure raise-exception:
Wrong type to apply: #<syntax-transformer make-base-operation>
--8<---------------cut here---------------end--------------->8---
Here bindings in (fibers io-wakeup) are likely “polluted” by loading
guile-fibers-next via the (guix-data-service …) modules.
Ludo’.
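
As a rough illustration of this class of error, the sketch below shows how applying a binding that turns out to be a macro produces a "Wrong type to apply: #<syntax-transformer …>" failure. All names are hypothetical, and the scenario it imitates (code expecting a procedure while the loaded module provides a macro of the same name) is an assumption about what the pollution looks like, not a confirmed diagnosis.

```scheme
;; Hypothetical names throughout; this only reproduces the *kind* of error.
(define-syntax-rule (make-thing x) (list x))

;; Fetch the binding's value at run time, much as code that expects
;; 'make-thing' to be a procedure would end up doing.
(define binding (module-ref (current-module) 'make-thing))

;; Applying the macro object as if it were a procedure raises
;; 'wrong-type-arg'; the handler returns the exception key.
(define outcome
  (catch 'wrong-type-arg
    (lambda () (binding 42))
    (lambda (key . args) key)))

(display (macro? binding))
(newline)
(display outcome)
(newline)
```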