GNU bug report logs -
#79762
Guile may run stale code if source changes too quickly
To reply to this bug, email your comments to 79762 AT debbugs.gnu.org.
Report forwarded to bug-guile <at> gnu.org: bug#79762; Package guile. (Mon, 03 Nov 2025 22:02:01 GMT)
Acknowledgement sent to Rob Browning <rlb <at> defaultvalue.org>: New bug report received and forwarded. Copy sent to bug-guile <at> gnu.org. (Mon, 03 Nov 2025 22:02:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
In the past I'd noticed that guile appeared to be running stale code,
and I finally got a chance to track it down with help from Dale Smith.
The cause is that filesystems have finite (and sometimes quite coarse)
timestamp "granularity", and guile currently treats equal timestamps
between a .scm file and .go file as meaning that the .go file is up to
date, i.e. represents the compilation of the code in the .scm
file. Right now that is not always true, and can produce undesirable
results:
For example, if you do this:
  echo '(display "something bad\n")' > test.scm
  guile test.scm
  echo '(display "the (broken) fix)' > test.scm
  guile test.scm
The final guile invocation may actually succeed while printing
"something bad" if the test.go generated by the first guile invocation
ends up with the same timestamp as the test.scm containing the fix. It
is not even relevant that the fix is itself broken (invalid syntax),
because guile never attempts to compile it --- it just runs the "bad"
test.go.
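The effect of coarse granularity can be sketched with plain integer arithmetic. This is an illustrative model only, not Guile code: the names `truncate_ms` and `go_verdict`, the 1-second granularity, and the sample times (in milliseconds, for readability) are all assumptions chosen for the example.

```sh
#!/bin/sh
# Illustrative model only (hypothetical names, not Guile code).

# Round a timestamp down to a filesystem's timestamp granularity.
truncate_ms() {
    # $1 = timestamp in ms, $2 = granularity in ms
    echo $(( $1 / $2 * $2 ))
}

# Apply the rule described in the report: a .go whose recorded mtime is
# equal to or later than the source's counts as up to date.
go_verdict() {
    # $1 = recorded source mtime, $2 = recorded .go mtime
    if [ "$1" -le "$2" ]; then echo fresh; else echo stale; fi
}

go_written=5100      # .go written at t = 5.1 s
src_rewritten=5900   # source rewritten at t = 5.9 s
granularity=1000     # assumed 1 s timestamp granularity

# With exact timestamps the rewrite is noticed ...
go_verdict "$src_rewritten" "$go_written"

# ... but with 1 s granularity both times are recorded as 5 s, so the
# outdated .go is accepted.
go_verdict "$(truncate_ms "$src_rewritten" "$granularity")" \
           "$(truncate_ms "$go_written" "$granularity")"
```

The first call reports the .go as stale and the second reports it as fresh, which is the situation the commands above can run into.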
To fix this, we can change compiled_is_fresh in load.c to require the
.go to be strictly newer than the corresponding .scm, i.e. change the <=
to < here:
  if (source_mtime.tv_sec < compiled_mtime.tv_sec
      || (source_mtime.tv_sec == compiled_mtime.tv_sec
          && source_mtime.tv_nsec <= compiled_mtime.tv_nsec))
    compiled_is_newer = 1;
  ...
and in more-recent? in boot-9.scm, change >= to > here:
  (and (= (stat:mtime stat1) (stat:mtime stat2))
       (>= (stat:mtimensec stat1)
           (stat:mtimensec stat2)))))
This script demonstrates the problem for the latter case:
[test-stale-load (text/x-sh, inline)]
#!/bin/bash

set -ueo pipefail

guile="${GUILE:-guile}"
top="$(pwd)"
tmpdir=''

on-exit() { cd "$top"; if test "$tmpdir"; then rm -rf "$tmpdir"; fi; }
trap on-exit EXIT

tmpdir="$(mktemp -d show-stale-load-XXXXXXX)"
cd "$tmpdir"
mkdir cache
export XDG_CACHE_HOME="$(pwd)/cache"

while true; do
    echo '(display "game over\n")' > foo.scm
    "$guile" foo.scm > /dev/null
    # (Compilation of) this broken fix should be rejected, but rarely
    # it's not because foo.scm here ends up with the same timestamp as
    # the foo.go compiled above.
    echo '(display "fixed)' > foo.scm
    if "$guile" foo.scm | tee foo.log; then
        echo 'broken fix did not provoke an error'
    fi
    if grep -q 'game over' foo.log; then
        # stat foo.scm "$(find -name foo.scm.go)"
        echo '=== ran bad code even after fix was in place ==='
        exit 2
    fi
done
[Message part 3 (text/plain, inline)]
Thanks to Dale Smith for finding the root causes.
--
Rob Browning
rlb @defaultvalue.org and @debian.org
GPG as of 2011-07-10 E6A9 DA3C C9FD 1FF8 C676 D2C4 C0F0 39E9 ED1B 597A
GPG as of 2002-11-03 14DD 432F AE39 534D B592 F9A0 25C8 D377 8C7E 73A4
Information forwarded to bug-guile <at> gnu.org: bug#79762; Package guile. (Thu, 06 Nov 2025 09:06:02 GMT)
Message #8 received at 79762 <at> debbugs.gnu.org (full text, mbox):
Hi Rob,
Rob Browning <rlb <at> defaultvalue.org> skribis:
> To fix this, we can change compiled_is_fresh in load.c to require the
> .go to be strictly newer than the corresponding .scm, i.e. change the <=
> to < here:
>
>   if (source_mtime.tv_sec < compiled_mtime.tv_sec
>       || (source_mtime.tv_sec == compiled_mtime.tv_sec
>           && source_mtime.tv_nsec <= compiled_mtime.tv_nsec))
>     compiled_is_newer = 1;
>   ...
>
> and in more-recent? in boot-9.scm, change >= to > here:
>
>   (and (= (stat:mtime stat1) (stat:mtime stat2))
>        (>= (stat:mtimensec stat1)
>            (stat:mtimensec stat2)))))
While this change is justified by the examples you gave, I think it
would cause problems you wouldn’t expect.
In particular, in Guix and Nix, timestamps on build artifacts are reset
to the Epoch + 1 second, for reproducibility purposes. This means that
.scm and .go files have the exact same timestamp.
With the change above, all .go files would be considered stale and Guile
would end up auto-recompiling everything.
So I think we’ll have to leave that unchanged.
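To make the objection concrete, the two comparison rules can be modelled on (seconds, nanoseconds) pairs. This is only a sketch: `fresh_current` and `fresh_strict` are hypothetical names that mirror the quoted C condition and the proposed change; they are not functions in Guile.

```sh
#!/bin/sh
# Hypothetical model of the two rules. Arguments are:
#   source_sec source_nsec compiled_sec compiled_nsec

# Current rule as quoted above: equal timestamps count as fresh.
fresh_current() {
    if [ "$1" -lt "$3" ] || { [ "$1" -eq "$3" ] && [ "$2" -le "$4" ]; }; then
        echo fresh
    else
        echo stale
    fi
}

# Proposed rule: the compiled file must be strictly newer.
fresh_strict() {
    if [ "$1" -lt "$3" ] || { [ "$1" -eq "$3" ] && [ "$2" -lt "$4" ]; }; then
        echo fresh
    else
        echo stale
    fi
}

# Both files reset to the Epoch + 1 second, as described for Guix and Nix.
fresh_current 1 0 1 0
fresh_strict  1 0 1 0
```

Under the current rule the pair is reported as fresh; under the strict rule the same pair is reported as stale, which is what would trigger auto-recompilation.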
Now, I would hope that the problem you describe is rare enough in
practice that this is not too much of a problem?
Thanks,
Ludo’.
Information forwarded to bug-guile <at> gnu.org: bug#79762; Package guile. (Thu, 06 Nov 2025 09:20:02 GMT)
Message #11 received at 79762 <at> debbugs.gnu.org (full text, mbox):
Rob Browning wrote:
> In the past I'd noticed that guile appeared to be running stale code,
> and I finally got a chance to track it down with help from Dale Smith.
>
> The cause is that filesystems have finite (and sometimes quite coarse)
> timestamp "granularity", and guile currently treats equal timestamps
> between a .scm file and .go file as meaning that the .go file is up to
> date, i.e. represents the compilation of the code in the .scm
> file. Right now that is not always true, and can produce undesirable
> results:
>
> For example, if you do this:
>
>   echo '(display "something bad\n")' > test.scm
>   guile test.scm
>   echo '(display "the (broken) fix)' > test.scm
>   guile test.scm
This is an artificial example. In what situation did you encounter
this issue? I am wondering if it is easier to fix that situation
instead of submitting this patch.
> The final guile invocation may actually succeed while printing
> "something bad" if the test.go generated by the first guile invocation
> ends up with the same timestamp as the one with the fix. It's not even
> relevant that the fix is actually itself broken (invalid syntax) because
> guile never attempts to compile it --- it just runs the "bad" test.go.
>
> To fix this, we can change compiled_is_fresh in load.c to require the
> .go to be strictly newer than the corresponding .scm, i.e. change the <=
> to < here:
If you do that, you will break a lot of Guile Autotools projects whose
object files currently have timestamps >= those of their sources. See for example
<https://lists.gnu.org/archive/html/guile-devel/2010-07/msg00125.html>,
which explains an Automake hack to establish
object timestamp >= source timestamp.
To obtain a strict inequality, a sleep command would further need to
be used (as discussed there), but currently autotools projects do not
do this and will break with your patch.
Regards,
Nikolaos Chatzikonstantinou
Information forwarded to bug-guile <at> gnu.org: bug#79762; Package guile. (Thu, 06 Nov 2025 17:44:02 GMT)
Message #14 received at 79762 <at> debbugs.gnu.org (full text, mbox):
Nikolaos Chatzikonstantinou <nchatz314 <at> gmail.com> writes:
> This is an artificial example. In what situation did you encounter
> this issue? I am wondering if it is easier to fix that situation
> instead of submitting this patch.
I believe I've run into guile running stale code more than once (after
I figured out what was going on), which is why I started looking around
for a cause, but I don't have a good idea exactly what might have been
provoking it, and it's been a while --- I just made a note to eventually
investigate.
Though I've also wondered whether loading might somehow (also) be
capable of falling back to a stale .go after a compilation error, or
similar, which could explain what I saw, and would probably be more likely.
Thanks for the evaluation
--
Rob Browning
Information forwarded to bug-guile <at> gnu.org: bug#79762; Package guile. (Thu, 06 Nov 2025 18:10:01 GMT)
Message #17 received at 79762 <at> debbugs.gnu.org (full text, mbox):
Ludovic Courtès <ludo <at> gnu.org> writes:
> In particular, in Guix and Nix, timestamps on build artifacts are
> reset to the Epoch + 1 second, for reproducibility purposes. This
> means that .scm and .go files have the exact same timestamp.
Ahh, OK. Clojure had this same problem in Debian a good while back with
dh_strip_nondeterminism. There it was causing performance problems
because clojure would recompile all the packaged source code every time
it ran, even though it also had compiled code, because the timestamps in
the jar were equal and clojure did consider that "stale". I think it was
fixed by having dh_strip_nondeterminism set the source time a bit before
the compiled time while stripping. https://bugs.debian.org/877418
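The workaround described above (giving the source a slightly earlier timestamp than the compiled file) might look roughly like the following. This is a sketch under stated assumptions, not what dh_strip_nondeterminism, Guix, or Nix actually do: the function name `normalize_pair` and the chosen times (Epoch + 1 s and Epoch + 2 s) are hypothetical, and it relies on `touch -t` plus the widely supported, but non-POSIX, `-nt` test.

```sh
#!/bin/sh
# Hypothetical helper: give a source/compiled pair fixed, reproducible
# timestamps while keeping the compiled file strictly newer.
normalize_pair() {
    # $1 = source file, $2 = compiled file
    TZ=UTC touch -t 197001010000.01 "$1"   # source:   Epoch + 1 s
    TZ=UTC touch -t 197001010000.02 "$2"   # compiled: Epoch + 2 s
}

dir="$(mktemp -d)"
: > "$dir/foo.scm"
: > "$dir/foo.go"
normalize_pair "$dir/foo.scm" "$dir/foo.go"

# A strict "newer than" check can now succeed even though both
# timestamps are fixed.
if [ "$dir/foo.go" -nt "$dir/foo.scm" ]; then
    echo 'compiled file is strictly newer'
fi
rm -rf "$dir"
```

With timestamps normalized this way, a strict comparison would not force recompilation, at the cost of changing how the build tooling resets times.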
> Now, I would hope that the problem you describe is rare enough in
> practice that this is not too much of a problem?
Well, as mentioned in my reply to Nikolaos Chatzikonstantinou, I believe
I've seen it a number of times, but not often, and it was quite
confusing, which is why I started looking around in the first place. But
also as mentioned, I did wonder if there might be some additional, more
likely cause.
Thanks for taking a look
--
Rob Browning
This bug report was last modified 1 day ago.