GNU bug report logs - #36510
confusing mcron logging

Previous Next

Package: guix;

Reported by: Robert Vollmert <rob <at> vllmrt.net>

Date: Fri, 5 Jul 2019 13:37:01 UTC

Severity: normal

Done: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 36510 in the body.
You can then email your comments to 36510 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-guix <at> gnu.org:
bug#36510; Package guix. (Fri, 05 Jul 2019 13:37:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Robert Vollmert <rob <at> vllmrt.net>:
New bug report received and forwarded. Copy sent to bug-guix <at> gnu.org. (Fri, 05 Jul 2019 13:37:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Robert Vollmert <rob <at> vllmrt.net>
To: bug-guix <at> gnu.org
Subject: confusing mcron logging
Date: Fri, 5 Jul 2019 15:35:35 +0200
I have two mcron jobs on my system, certbot renewal and
a handwritten and currently buggy guile job. This is an
excerpt from /var/log/mcron.log:

>>>>>

Saving debug log to /var/log/letsencrypt/letsencrypt.log
Plugins selected: Authenticator webroot, Installer None
Cert not yet due for renewal
Keeping the existing certificate

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Certificate not yet due for renewal; no action taken.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Acquiring or renewing certificate: garp.vllmrt.net
Saving debug log to /var/log/letsencrypt/letsencrypt.log
Plugins selected: Authenticator webroot, Installer None
Cert not yet due for renewal
Keeping the existing certificate

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Certificate not yet due for renewal; no action taken.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Acquiring or renewing certificate: garp.vllmrt.net
Backtrace:
           9 (apply-smob/1 #<catch-closure 5cf300>)
In ice-9/boot-9.scm:
    829:9  8 (catch mcron-error #<procedure 7fe67c318d28 at mcron/s?> ?)
In mcron/scripts/mcron.scm:
     99:7  7 (_)
In mcron/base.scm:
   234:12  6 (_ #<continuation 5ad660>)
In srfi/srfi-1.scm:
    640:9  5 (for-each #<procedure run-job (job)> (#<<job> user: #(?>))
In mcron/base.scm:
   186:10  4 (run-job #<<job> user: #("root" "x" 0 0 "System adminis?>)
In ice-9/eval.scm:
   293:34  3 (_ #(#(#<directory (mcron scripts mcron) 6a9c80>)))
   182:19  2 (proc #(#(#<directory (mcron scripts mcron) 6a9c80>)))
   142:16  1 (compile-top-call _ (7 . get-string-all) ((10 (# . #) ?)))
In unknown file:
           0 (%resolve-variable (7 . get-string-all) #<directory (mc?>)

ERROR: In procedure %resolve-variable:
Unbound variable: get-string-all
Saving debug log to /var/log/letsencrypt/letsencrypt.log
Plugins selected: Authenticator webroot, Installer None
Cert not yet due for renewal
Keeping the existing certificate

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Certificate not yet due for renewal; no action taken.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Acquiring or renewing certificate: garp.vllmrt.net
Saving debug log to /var/log/letsencrypt/letsencrypt.log
Plugins selected: Authenticator webroot, Installer None
Cert not yet due for renewal
Keeping the existing certificate

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Certificate not yet due for renewal; no action taken.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Acquiring or renewing certificate: garp.vllmrt.net

<<<<<

It’s impossible to tell what output is from which job; which jobs succeeded or
didn’t; when they ran.

Suggestions:
- mcron should log the timestamp and a job id of every job when it starts
- mcron should log the timestamp and status and job id of every job when it finishes
- job output should be prefixed by some job id





Information forwarded to bug-guix <at> gnu.org:
bug#36510; Package guix. (Fri, 05 Jul 2019 20:39:02 GMT) Full text and rfc822 format available.

Message #8 received at 36510 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Robert Vollmert <rob <at> vllmrt.net>
Cc: 36510 <at> debbugs.gnu.org
Subject: Re: bug#36510: confusing mcron logging
Date: Fri, 05 Jul 2019 22:37:57 +0200
Hi,

Robert Vollmert <rob <at> vllmrt.net> skribis:

> Suggestions:
> - mcron should log the timestamp and a job id of every job when it starts
> - mcron should log the timestamp and status and job id of every job when it finishes
> - job output should be prefixed by some job id

+1!  +3 even.  :-)

Something that can help debugging to some extent (but is definitely no
substitute for what you suggest above!) is ‘sudo herd schedule mcron’.
I use that to manually run jobs that appear not to work as expected.

Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#36510; Package guix. (Fri, 05 Jul 2019 20:50:02 GMT) Full text and rfc822 format available.

Message #11 received at 36510 <at> debbugs.gnu.org (full text, mbox):

From: Robert Vollmert <rob <at> vllmrt.net>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 36510 <at> debbugs.gnu.org
Subject: Re: bug#36510: confusing mcron logging
Date: Fri, 5 Jul 2019 22:48:57 +0200
> On 5. Jul 2019, at 22:37, Ludovic Courtès <ludo <at> gnu.org> wrote:
> Something that can help debugging to some extent (but is definitely no
> substitute for what you suggest above!) is ‘sudo herd schedule mcron’.
> I use that to manually run jobs that appear not to work as expected.

That only works for non-guile jobs though as far as I understand, where
'herd schedule mcron' prints a store path.

https://debbugs.gnu.org/cgi/bugreport.cgi?bug=36430





Information forwarded to bug-guix <at> gnu.org:
bug#36510; Package guix. (Wed, 18 Aug 2021 00:54:02 GMT) Full text and rfc822 format available.

Message #14 received at 36510 <at> debbugs.gnu.org (full text, mbox):

From: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 36510 <at> debbugs.gnu.org, Robert Vollmert <rob <at> vllmrt.net>
Subject: Re: bug#36510: confusing mcron logging
Date: Tue, 17 Aug 2021 20:53:03 -0400
Hello Robert and Ludovic,

Ludovic Courtès <ludo <at> gnu.org> writes:

> Hi,
>
> Robert Vollmert <rob <at> vllmrt.net> skribis:
>
>> Suggestions:
>> - mcron should log the timestamp and a job id of every job when it starts
>> - mcron should log the timestamp and status and job id of every job when it finishes
>> - job output should be prefixed by some job id
>
> +1!  +3 even.  :-)

I've sent a patch upstream that implements all of the above [0].  I've
been using it on my system, it works well so far!  I'm also keeping this
work in a public Notabug git repo [1].

Hopefully it gets merged and Guix System can reap the benefits :-).

Thanks for the suggestions!

Maxim

[0]  https://lists.gnu.org/archive/html/bug-mcron/2021-08/msg00005.html
[1]  https://notabug.org/apteryx/mcron




Information forwarded to bug-guix <at> gnu.org:
bug#36510; Package guix. (Tue, 24 Aug 2021 12:33:01 GMT) Full text and rfc822 format available.

Message #17 received at 36510 <at> debbugs.gnu.org (full text, mbox):

From: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 36510 <at> debbugs.gnu.org, Robert Vollmert <rob <at> vllmrt.net>
Subject: Re: bug#36510: confusing mcron logging
Date: Tue, 24 Aug 2021 08:32:30 -0400
Hello,

I sent a v3 of the output annotation patch [0], which no longer blocks
reading the output of long-running child processes.

I've reconfigured my system with it, so far so good.

[0]  https://lists.gnu.org/archive/html/bug-mcron/2021-08/msg00008.html




Information forwarded to bug-guix <at> gnu.org:
bug#36510; Package guix. (Mon, 30 Aug 2021 09:51:01 GMT) Full text and rfc822 format available.

Message #20 received at 36510 <at> debbugs.gnu.org (full text, mbox):

From: Ludovic Courtès <ludo <at> gnu.org>
To: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>
Cc: 36510 <at> debbugs.gnu.org, Robert Vollmert <rob <at> vllmrt.net>
Subject: Re: bug#36510: confusing mcron logging
Date: Mon, 30 Aug 2021 11:49:47 +0200
Hello Maxim,

Maxim Cournoyer <maxim.cournoyer <at> gmail.com> skribis:

> Ludovic Courtès <ludo <at> gnu.org> writes:
>
>> Hi,
>>
>> Robert Vollmert <rob <at> vllmrt.net> skribis:
>>
>>> Suggestions:
>>> - mcron should log the timestamp and a job id of every job when it starts
>>> - mcron should log the timestamp and status and job id of every job when it finishes
>>> - job output should be prefixed by some job id
>>
>> +1!  +3 even.  :-)
>
> I've sent a patch upstream that implements all of the above [0].  I've
> been using it on my system, it works well so far!  I'm also keeping this
> work in a public Notabug git repo [1].

That’s a much welcome improvement, thank you!

Ludo’.




Information forwarded to bug-guix <at> gnu.org:
bug#36510; Package guix. (Tue, 04 Jan 2022 16:10:02 GMT) Full text and rfc822 format available.

Message #23 received at 36510 <at> debbugs.gnu.org (full text, mbox):

From: Dale Mellor <mcron-lsfnyl <at> rdmp.org>
To: 36510 <at> debbugs.gnu.org, Maxim Cournoyer <maxim.cournoyer <at> gmail.com>
Subject: Re: [PATCH v3] base: Annotate output with job information.
Date: Tue, 04 Jan 2022 13:21:44 +0000
Hi, sorry for the delay but I've had a bit of time over Christmas
   to look things over. I've given this a lot of consideration.


I am happy to drop compatibility with guile-2.2 and older; I
   think we can make a minor version bump for this break with
   legacy.



Does this belong in mcron?  The mcron source code is currently
   3,000 lines, to which you are bringing over 500 new ones to
   make a facility which is geared towards debugging in the GUIX
   system (I am all-in on GUIX myself, but mcron is a generic GNU
   program with use-cases outside of this system).  I wonder if
   this is the best place: perhaps it is shepherd, which is
   responsible for the /var/log/mcron.log file, to be responsible
   for the amended logging messages?  And then again, isn't this
   exactly what syslogd does anyway?  Most likely timings will be
   more accurate if they are generated in mcron.

   In your use-case, of debugging the system, I would think that
   more specialized messages placed directly in the cron jobs
   themselves would be a better aid to your work, as you can
   target them to the problem at hand.  And you could send those
   to syslogd if you wanted.



The output is a little unpredictable.  The script (which is
   admittedly somewhat pathological)

     (job '(next-second '(0 30)) '(begin (display "test: ")
                                         (system "date")))

   produces

     2022-01-04T11:24:00 (...): running...
     2022-01-04T11:24:00 (...): Tue 4 Jan 11:24:00 GMT 2022
     2022-01-04T11:24:00 (...): test: completed in 0.022s
     2022-01-04T11:24:30 (...): running...
     2022-01-04T11:24:30 (...): Tue 4 Jan 11:24:30 GMT 2022
     2022-01-04T11:25:00 (...): running...
     2022-01-04T11:25:00 (...): Tue 4 Jan 11:25:00 GMT 2022
     ...



But all things considered your changes are generally useful to
   have, including outside of the GUIX system, and I would very
   much like to have them there.  But to be sure not to break any
   existing applications, I would like the changes to be opt-in
   via a command-line switch -l; the --log-format option can
   remain to customize this (please also make -L a short option
   alternative; also -D as short for --date-format).

   I am willing and able to do this work myself in a reasonable
   time-frame if you would like me to.



Best wishes, Dale






Reply sent to Maxim Cournoyer <maxim.cournoyer <at> gmail.com>:
You have taken responsibility. (Mon, 21 Nov 2022 01:23:01 GMT) Full text and rfc822 format available.

Notification sent to Robert Vollmert <rob <at> vllmrt.net>:
bug acknowledged by developer. (Mon, 21 Nov 2022 01:23:02 GMT) Full text and rfc822 format available.

Message #28 received at 36510-done <at> debbugs.gnu.org (full text, mbox):

From: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>
To: Dale Mellor <mcron-lsfnyl <at> rdmp.org>
Cc: 36510-done <at> debbugs.gnu.org
Subject: Re: bug#36510: confusing mcron logging
Date: Sun, 20 Nov 2022 20:22:25 -0500
Hello Dale,

Dale Mellor <mcron-lsfnyl <at> rdmp.org> writes:

> Hi, sorry for the delay but I've had a bit of time over Christmas
>    to look things over. I've given this a lot of consideration.

Apologies for my lack of reply thus far, it seems your mail had fallen
in cracks.

>
> I am happy to drop compatibility with guile-2.2 and older; I think we
> can make a minor version bump for this break with legacy.
>
>
>
> Does this belong in mcron?  The mcron source code is currently
>    3,000 lines, to which you are bringing over 500 new ones to
>    make a facility which is geared towards debugging in the GUIX
>    system (I am all-in on GUIX myself, but mcron is a generic GNU
>    program with use-cases outside of this system).  I wonder if
>    this is the best place: perhaps it is shepherd, which is
>    responsible for the /var/log/mcron.log file, to be responsible
>    for the amended logging messages?  And then again, isn't this
>    exactly what syslogd does anyway?  Most likely timings will be
>    more accurate if they are generated in mcron.

Since Shepherd 0.9+, it now appends logging information to every output
it handles, so this feature has indeed become less important, but still
useful: I've recently bumped our package of mcron in Guix and I'm using
its annotation facility to prepend the process ID to its output.  I
think the grunt of new lines added must be as documentation and test
code, so that's not so bad as it seems I think.

>    In your use-case, of debugging the system, I would think that
>    more specialized messages placed directly in the cron jobs
>    themselves would be a better aid to your work, as you can
>    target them to the problem at hand.  And you could send those
>    to syslogd if you wanted.

Here's a sample output from the Guix build farm:

--8<---------------cut here---------------start------------->8---
2022-11-21 01:56:15 84005 /gnu/store/ypyz886hd7qaw0g8ba5a595dc0qgnj3q-update-guix.gnu.org: running...
2022-11-21 01:59:24 84005 /gnu/store/ypyz886hd7qaw0g8ba5a595dc0qgnj3q-update-guix.gnu.org: Updating channel 'guix' from Git repository at 'https://git.savannah.gnu.org/git/guix.git'...
2022-11-21 01:59:24 84005 /gnu/store/ypyz886hd7qaw0g8ba5a595dc0qgnj3q-update-guix.gnu.org: Computing Guix derivation for 'x86_64-linux'...  
2022-11-21 01:59:24 84005 /gnu/store/ypyz886hd7qaw0g8ba5a595dc0qgnj3q-update-guix.gnu.org: [2022-11-21T01:56:18+0100] building web site from 'https://git.savannah.gnu.org/git/guix/guix-artwork.git'...
2022-11-21 01:59:24 84005 /gnu/store/ypyz886hd7qaw0g8ba5a595dc0qgnj3q-update-guix.gnu.org: completed in 189.325s
2022-11-21 02:00:00 91665 /gnu/store/xsc4x68avp8nmrf3hgvhd26yl3k90jqz-check-disk-space: running...
2022-11-21 02:00:00 91665 /gnu/store/xsc4x68avp8nmrf3hgvhd26yl3k90jqz-check-disk-space: completed in 0.046s
--8<---------------cut here---------------end--------------->8---

The timestamp is now generated by Shepherd, and mcron adds the PID of
the job, such as 84005 above.  To have some indication of how long the
job ran available at a quick glance is very useful for admin purposes.

>
>
> The output is a little unpredictable.  The script (which is
>    admittedly somewhat pathological)
>
>      (job '(next-second '(0 30)) '(begin (display "test: ")
>                                          (system "date")))
>
>    produces
>
>      2022-01-04T11:24:00 (...): running...
>      2022-01-04T11:24:00 (...): Tue 4 Jan 11:24:00 GMT 2022
>      2022-01-04T11:24:00 (...): test: completed in 0.022s
>      2022-01-04T11:24:30 (...): running...
>      2022-01-04T11:24:30 (...): Tue 4 Jan 11:24:30 GMT 2022
>      2022-01-04T11:25:00 (...): running...
>      2022-01-04T11:25:00 (...): Tue 4 Jan 11:25:00 GMT 2022
>      ...

I've noticed that too, that some jobs somehow escape producing the
"completed in x..." message.  I'll try looking into that, it's probably
a subtle bug.

> But all things considered your changes are generally useful to
>    have, including outside of the GUIX system, and I would very
>    much like to have them there.  But to be sure not to break any
>    existing applications, I would like the changes to be opt-in
>    via a command-line switch -l; the --log-format option can
>    remain to customize this (please also make -L a short option
>    alternative; also -D as short for --date-format).
>
>    I am willing and able to do this work myself in a reasonable
>    time-frame if you would like me to.

Thank you for taking on yourself the above work, Dale!  I was happily
surprise to see this change had landed with your improvement on top.

I think this Guix issue can now be closed :-).

-- 
Thanks,
Maxim




Information forwarded to bug-guix <at> gnu.org:
bug#36510; Package guix. (Tue, 29 Nov 2022 03:32:01 GMT) Full text and rfc822 format available.

Message #31 received at 36510 <at> debbugs.gnu.org (full text, mbox):

From: Maxim Cournoyer <maxim.cournoyer <at> gmail.com>
To: Dale Mellor <mcron-lsfnyl <at> rdmp.org>
Cc: 36510 <at> debbugs.gnu.org
Subject: Re: bug#36510: confusing mcron logging
Date: Mon, 28 Nov 2022 22:31:38 -0500
Hi,

Maxim Cournoyer <maxim.cournoyer <at> gmail.com> writes:

> Dale Mellor <mcron-lsfnyl <at> rdmp.org> writes:

[...]

>> The output is a little unpredictable.  The script (which is
>>    admittedly somewhat pathological)
>>
>>      (job '(next-second '(0 30)) '(begin (display "test: ")
>>                                          (system "date")))
>>
>>    produces
>>
>>      2022-01-04T11:24:00 (...): running...
>>      2022-01-04T11:24:00 (...): Tue 4 Jan 11:24:00 GMT 2022
>>      2022-01-04T11:24:00 (...): test: completed in 0.022s
>>      2022-01-04T11:24:30 (...): running...
>>      2022-01-04T11:24:30 (...): Tue 4 Jan 11:24:30 GMT 2022
>>      2022-01-04T11:25:00 (...): running...
>>      2022-01-04T11:25:00 (...): Tue 4 Jan 11:25:00 GMT 2022
>>      ...

I tried reproducing this, but couldn't, using the latest GNU Shepherd as
shipped in Guix.

> I've noticed that too, that some jobs somehow escape producing the
> "completed in x..." message.  I'll try looking into that, it's probably
> a subtle bug.

I took some time looking at the issue, and it was more straightforward
than I had hoped: I was using exec in my job, which was basically
hijacking the mcron's forked job process and loosing what it would have
normally done upon completion (print status).  Turning the 'execl' calls
into 'system*' fixed it:

--8<---------------cut here---------------start------------->8---
modified   guix/hurd.scm
@@ -36,14 +36,14 @@
   ;; Run 'updatedb' at 3AM every day.
   #~(job '(next-hour '(3))
          (lambda ()
-           (execl #$(file-append findutils "/bin/updatedb") "updatedb"
-                  (string-append "--prunepaths="
-                                 "/gnu/store "
-                                 "/media "
-                                 "/mnt "
-                                 "/tmp "
-                                 "/var/tmp "
-                                 "/var/lib ")))
+           (system* #$(file-append findutils "/bin/updatedb")
+                    (string-append "--prunepaths="
+                                   "/gnu/store "
+                                   "/media "
+                                   "/mnt "
+                                   "/tmp "
+                                   "/var/tmp "
+                                   "/var/lib ")))
          "updatedb"))
 
 (define btrfs-balance-job
@@ -52,15 +52,15 @@
   ;; low (5%) to minimize wear on the SSD.  Runs at 5 AM every 3 days.
   #~(job '(next-hour-from (next-day (range 1 31 3)) '(5))
          (lambda ()
-           (execl #$(file-append btrfs-progs "/bin/btrfs") "btrfs"
-                  "balance" "start" "-dusage=5" "/"))
+           (system* #$(file-append btrfs-progs "/bin/btrfs")
+                    "balance" "start" "-dusage=5" "/"))
          "btrfs-balance"))
 
 (define btrbk-job
   #~(job '(next-hour)
          (lambda ()
-           (execl #$(file-append btrbk "/bin/btrbk") "btrbk"
-                  "-q" "-c" #$(local-file "btrbk.conf") "run"))
+           (system* #$(file-append btrbk "/bin/btrbk")
+                    "-q" "-c" #$(local-file "btrbk.conf") "run"))
          "btrbk"))
 
--8<---------------cut here---------------end--------------->8---

-- 
Thanks,
Maxim




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 27 Dec 2022 12:24:05 GMT) Full text and rfc822 format available.

This bug report was last modified 1 year and 121 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.