GNU bug report logs - #49517
[PATCH] gnu: txr: Build documentation and update to 265.

Package: guix-patches;

Reported by: "Paul A. Patience" <paul <at> apatience.com>

Date: Sun, 11 Jul 2021 00:38:02 UTC

Severity: normal

Tags: patch

Done: Guillaume Le Vaillant <glv <at> posteo.net>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 49517 in the body.
You can then email your comments to 49517 AT debbugs.gnu.org in the normal way.


Report forwarded to guix-patches <at> gnu.org:
bug#49517; Package guix-patches. (Sun, 11 Jul 2021 00:38:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to "Paul A. Patience" <paul <at> apatience.com>:
New bug report received and forwarded. Copy sent to guix-patches <at> gnu.org. (Sun, 11 Jul 2021 00:38:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: "Paul A. Patience" <paul <at> apatience.com>
To: "guix-patches <at> gnu.org" <guix-patches <at> gnu.org>
Subject: [PATCH] gnu: txr: Build documentation and update to 265.
Date: Sun, 11 Jul 2021 00:37:06 +0000
[Message part 1 (text/plain, inline)]
Empty Message
[0001-gnu-txr-Build-documentation.patch (text/x-patch, attachment)]
[0002-gnu-txr-Update-to-265.patch (text/x-patch, attachment)]

Information forwarded to guix-patches <at> gnu.org:
bug#49517; Package guix-patches. (Mon, 12 Jul 2021 01:02:02 GMT) Full text and rfc822 format available.

Message #8 received at submit <at> debbugs.gnu.org (full text, mbox):

From: "Paul A. Patience" <paul <at> apatience.com>
To: "guix-patches <at> gnu.org" <guix-patches <at> gnu.org>
Subject: Re: [PATCH] gnu: txr: Build documentation and update to 265.
Date: Mon, 12 Jul 2021 01:01:18 +0000
[Message part 1 (text/plain, inline)]
I've managed to fix one of the failing tests
and narrowed down the problem of the others.
(Only the second patch is different, but I've
attached both for your convenience.)

Best regards,
Paul
[0001-gnu-txr-Build-documentation.patch (text/x-patch, attachment)]
[0002-gnu-txr-Update-to-265.patch (text/x-patch, attachment)]

Information forwarded to guix-patches <at> gnu.org:
bug#49517; Package guix-patches. (Tue, 13 Jul 2021 23:46:01 GMT) Full text and rfc822 format available.

Message #11 received at submit <at> debbugs.gnu.org (full text, mbox):

From: "Paul A. Patience" <paul <at> apatience.com>
To: "guix-patches <at> gnu.org" <guix-patches <at> gnu.org>
Cc: Kaz Kylheku <kaz <at> kylheku.com>
Subject: Re: [PATCH] gnu: txr: Build documentation and update to 265.
Date: Tue, 13 Jul 2021 23:45:34 +0000
[Message part 1 (text/plain, inline)]
On Sunday, July 11th, 2021 at 21:01, Paul A. Patience <paul <at> apatience.com> wrote:

> I've managed to fix one of the failing tests
> and narrowed down the problem of the others.

Kaz Kylheku has determined the cause of the failing tests,
so I've updated the comment to reflect his conclusions.

There has been a new release so I've updated the package
to TXR version 266.

Once again, only the second attached patch is different
from my initial submission.

Best regards,
Paul
[0001-gnu-txr-Build-documentation.patch (text/x-patch, attachment)]
[0002-gnu-txr-Update-to-266.patch (text/x-patch, attachment)]

Information forwarded to guix-patches <at> gnu.org:
bug#49517; Package guix-patches. (Sat, 17 Jul 2021 09:58:02 GMT) Full text and rfc822 format available.

Message #14 received at 49517 <at> debbugs.gnu.org (full text, mbox):

From: Guillaume Le Vaillant <glv <at> posteo.net>
To: "Paul A. Patience" <paul <at> apatience.com>
Cc: Kaz Kylheku <kaz <at> kylheku.com>, 49517 <at> debbugs.gnu.org
Subject: Re: [bug#49517] [PATCH] gnu: txr: Build documentation and update to
 265.
Date: Sat, 17 Jul 2021 09:57:02 +0000
[Message part 1 (text/plain, inline)]
Paul A. Patience <paul <at> apatience.com> skribis:

> On Sunday, July 11th, 2021 at 21:01, Paul A. Patience <paul <at> apatience.com> wrote:
>
>> I've managed to fix one of the failing tests
>> and narrowed down the problem of the others.
>
> Kaz Kylheku has determined the cause of the failing tests,
> so I've updated the comment to reflect his conclusions.
>
> There has been a new release so I've updated the package
> to TXR version 266.
>
> Once again, only the second attached patch is different
> from my initial submission.
>
> Best regards,
> Paul

Hi,

When testing the patch to build the HTML and PDF documentation,
I noticed that the 'share/doc/txr-263/txr-manpage.pdf' file is not
reproducible. There are some timestamps and UUIDs in it that change at
each build (diffoscope output attached).

Could you take a look at that and see if there's a way to make it
reproducible?
Thanks.
[txr-diffoscope.txt.lz (application/octet-stream, attachment)]
[signature.asc (application/pgp-signature, inline)]

Information forwarded to guix-patches <at> gnu.org:
bug#49517; Package guix-patches. (Sun, 18 Jul 2021 02:33:01 GMT) Full text and rfc822 format available.

Message #17 received at 49517 <at> debbugs.gnu.org (full text, mbox):

From: Kaz Kylheku <kaz <at> kylheku.com>
To: Guillaume Le Vaillant <glv <at> posteo.net>
Cc: "Paul A. Patience" <paul <at> apatience.com>, 49517 <at> debbugs.gnu.org
Subject: Re: [bug#49517] [PATCH] gnu: txr: Build documentation and update to
 265.
Date: Sat, 17 Jul 2021 15:51:51 -0700
On 2021-07-17 02:57, Guillaume Le Vaillant wrote:
> Hi,
> 
> When testing the patch to build the HTML and PDF documentation,
> I noticed that the 'share/doc/txr-263/txr-manpage.pdf' file is not
> reproducible. There are some timestamps and UUIDs in it that change at
> each build (diffoscope output attached).
> 
> Could you take a look at that and see if there's a way to make it
> reproducible?
> Thanks.

Hi Guillaume,

Thank you for your report. I don't see anything in the pdfroff
documentation about getting rid of this. I might use a program similar
to this one to just overwrite the UUIDs and dates:

(let* ((pdf (file-get-string "txr-manpage.pdf"))
       (start (search-str pdf "<?xpacket begin="))
       (end (if start (search-str pdf "<?xpacket end" start)))
       (xml (if end [pdf start..end]))
       (orig-len (len xml)))
  (unless xml
    (format *stderr* "XML block not found in PDF")
    (exit nil))
  (upd xml
    (regsub #/uuid:........-....-....-....-............/
            "uuid:00000000-0000-0000-0000-000000000000")
    (regsub #/Date>....-..-..T..:..:..-..:../
            "Date>1970-01-01T00:00:00-00:00"))
  (assert (eql (len xml) orig-len))
  (set [pdf start..end] xml)
  (file-put-string "txr-manpage.pdf.temp" pdf)
  (rename-path "txr-manpage.pdf.temp" "txr-manpage.pdf"))
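
The same idea can be rendered as a Python sketch for readers who do not
have TXR at hand. It is not part of any proposed patch: the function name
clobber_stamps is hypothetical, and like the program above it only
recognizes timestamps written with a negative UTC offset.

```python
import re


def clobber_stamps(pdf: bytes) -> bytes:
    """Overwrite UUIDs and dates inside the XMP packet without changing its length."""
    start = pdf.find(b"<?xpacket begin=")
    end = pdf.find(b"<?xpacket end", start) if start != -1 else -1
    if start == -1 or end == -1:
        raise ValueError("XML block not found in PDF")

    xml = pdf[start:end]
    orig_len = len(xml)

    # Fixed-width patterns: every replacement has the same length as its match.
    xml = re.sub(rb"uuid:.{8}-.{4}-.{4}-.{4}-.{12}",
                 b"uuid:00000000-0000-0000-0000-000000000000", xml)
    xml = re.sub(rb"Date>.{4}-..-..T..:..:..-..:..",
                 b"Date>1970-01-01T00:00:00-00:00", xml)

    # PDF cross-reference tables store byte offsets, so the size must not change.
    assert len(xml) == orig_len
    return pdf[:start] + xml + pdf[end:]
```

Keeping the packet length constant is the key design choice: it lets the
rest of the file stay byte-for-byte untouched.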


I have some questions.

1. When, for the sake of reproducible binary builds,
   we replace date stamps with fixed dates, is there a preference for
   what date to use? I used the Unix epoch, as you can see.

   I'm aware of the convention involving the environment
   variable SOURCE_DATE_EPOCH.

   Should I use that?

2. Is there some recommended practice with regard to some
   ./configure option or environment/make variable to react to
   for ensuring reproducible builds? So that is to say, suppose
   I don't wish to do the above embedded XML cleaning, except
   when building for a distro that strives for reproducibility.

   For opting in to reproducibility, should I again rely on
   SOURCE_DATE_EPOCH and have the build react to it?
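
A minimal sketch of the opt-in behaviour asked about in question 2,
written in Python purely for illustration. The function name
build_timestamp and the fallback to the current time when the variable
is unset are assumptions, not something this thread settles on.

```python
import os
import time


def build_timestamp(environ=os.environ):
    """Return (epoch, iso_string, reproducible) for the current build."""
    raw = environ.get("SOURCE_DATE_EPOCH")
    if raw is not None and raw.strip().isdigit():
        # The variable is set: use it instead of the wall clock.
        epoch = int(raw)
        reproducible = True
    else:
        # Assumed fallback: an ordinary, non-reproducible build.
        epoch = int(time.time())
        reproducible = False
    iso = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(epoch))
    return epoch, iso, reproducible
```

With this shape, a build needs no separate ./configure switch: the
presence of SOURCE_DATE_EPOCH alone selects the reproducible path.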


Thanks ...




Information forwarded to guix-patches <at> gnu.org:
bug#49517; Package guix-patches. (Sun, 18 Jul 2021 03:44:01 GMT) Full text and rfc822 format available.

Message #20 received at 49517 <at> debbugs.gnu.org (full text, mbox):

From: Kaz Kylheku <kaz <at> kylheku.com>
To: Guillaume Le Vaillant <glv <at> posteo.net>
Cc: "Paul A. Patience" <paul <at> apatience.com>, 49517 <at> debbugs.gnu.org
Subject: Re: [bug#49517] [PATCH] gnu: txr: Build documentation and update to
 265.
Date: Sat, 17 Jul 2021 20:43:36 -0700
On 2021-07-17 15:51, Kaz Kylheku wrote:
> On 2021-07-17 02:57, Guillaume Le Vaillant wrote:
>> Hi,
>> 
>> When testing the patch to build the HTML and PDF documentation,
>> I noticed that the 'share/doc/txr-263/txr-manpage.pdf' file is not
>> reproducible. There are some timestamps and UUIDs in it that change at
>> each build (diffoscope output attached).
>> 
>> Could you take a look at that and see if there's a way to make it
>> reproducible?
>> Thanks.
> 
> Hi Guillaume,
> 
> Thank you for your report. I don't see anything in the pdfroff
> documentation about getting rid of this. I might use a program similar
> to this one to just overwrite the UUIDs and dates:

I've noticed that there are some dates in the document which
respond to SOURCE_DATE_EPOCH:

  2 0 obj
  <</Producer(GPL Ghostscript 9.26)
  /CreationDate(D:20210717203740-07'00')
  /ModDate(D:20210717203740-07'00')
  /Creator(groff version 1.22.3)>>endobj

If I build with the SOURCE_DATE_EPOCH environment variable,
these dates from Ghostscript follow that variable.
That's why Guillaume isn't seeing an issue in that section
of the file.
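
One way to check which timestamps a given toolchain derives from
SOURCE_DATE_EPOCH is to pull the dates out of the Info dictionary and
compare them with the expected value. The Python sketch below does that;
the names info_dates and expected_stamp are hypothetical, and it assumes
the dictionary is stored uncompressed, as in the excerpt above, and that
the tool writes UTC when the variable is honoured.

```python
import re
import time

# Matches /CreationDate(D:YYYYMMDDHHMMSS... and /ModDate(D:YYYYMMDDHHMMSS...
INFO_DATE = re.compile(rb"/(CreationDate|ModDate)\(D:(\d{14})")


def info_dates(pdf: bytes) -> dict:
    """Map each Info-dictionary date key to its 14-digit timestamp."""
    return {m.group(1).decode(): m.group(2).decode()
            for m in INFO_DATE.finditer(pdf)}


def expected_stamp(epoch: int) -> str:
    """The 14-digit UTC timestamp corresponding to a SOURCE_DATE_EPOCH value."""
    return time.strftime("%Y%m%d%H%M%S", time.gmtime(epoch))
```

If every value returned by info_dates equals expected_stamp(epoch), that
section of the file follows the variable.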

Here is what I am going with:

commit 8fbf3f55446427c06248ce222a05fd09d77ac878 (HEAD -> master)
Author: Kaz Kylheku <kaz <at> kylheku.com>
Date:   Sat Jul 17 19:11:20 2021 -0700

    doc: reproducible PDF.

    * Makefile (txr-manpage.pdf): If SOURCE_DATE_EPOCH exists,
    then run pdf-clobber-stamps.tl.

    * pdf-clobber-stamps.tl: New file.

diff --git a/Makefile b/Makefile
index 0094985f..cac9b3c0 100644
--- a/Makefile
+++ b/Makefile
@@ -560,6 +560,7 @@ txr-manpage.html: txr.1 genman.txr
 txr-manpage.pdf: txr.1 checkman.txr
        $(TXR) checkman.txr $<
        tbl $< | pdfroff -ww -man --no-toc - > $@
+       [ $$SOURCE_DATE_EPOCH ] && $(TXR) pdf-clobber-stamps.tl || true

 #
 # Special targets used by ./configure
diff --git a/pdf-clobber-stamps.tl b/pdf-clobber-stamps.tl
new file mode 100644
index 00000000..0e56a44d
--- /dev/null
+++ b/pdf-clobber-stamps.tl
@@ -0,0 +1,19 @@
+(let* ((epoch (or (tointz (getenv "SOURCE_DATE_EPOCH")) 0))
+       (isotime (time-string-utc epoch "%FT%T+00:00"))
+       (pdf (file-get-string "txr-manpage.pdf"))
+       (start (search-str pdf "<?xpacket begin="))
+       (end (if start (search-str pdf "<?xpacket end" start)))
+       (xml (if end [pdf start..end]))
+       (orig-len (len xml)))
+  (unless xml
+    (format *stderr* "XML block not found in PDF")
+    (exit nil))
+  (upd xml
+    (regsub #/uuid:........-....-....-....-............/
+            "uuid:00000000-0000-0000-0000-000000000000")
+    (regsub #/Date>....-..-..T..:..:..-..:../
+            `Date>@isotime`))
+  (assert (eql (len xml) orig-len))
+  (set [pdf start..end] xml)
+  (file-put-string "txr-manpage.pdf.temp" pdf)
+  (rename-path "txr-manpage.pdf.temp" "txr-manpage.pdf"))




Information forwarded to guix-patches <at> gnu.org:
bug#49517; Package guix-patches. (Sun, 18 Jul 2021 10:37:02 GMT) Full text and rfc822 format available.

Message #23 received at 49517 <at> debbugs.gnu.org (full text, mbox):

From: Guillaume Le Vaillant <glv <at> posteo.net>
To: Kaz Kylheku <kaz <at> kylheku.com>
Cc: "Paul A. Patience" <paul <at> apatience.com>, 49517 <at> debbugs.gnu.org
Subject: Re: [bug#49517] [PATCH] gnu: txr: Build documentation and update to
 265.
Date: Sun, 18 Jul 2021 10:36:36 +0000
[Message part 1 (text/plain, inline)]
Kaz Kylheku <kaz <at> kylheku.com> skribis:

> On 2021-07-17 15:51, Kaz Kylheku wrote:
>> On 2021-07-17 02:57, Guillaume Le Vaillant wrote:
>>> Hi,
>>> When testing the patch to build the HTML and PDF documentation,
>>> I noticed that the 'share/doc/txr-263/txr-manpage.pdf' file is not
>>> reproducible. There are some timestamps and UUIDs in it that change at
>>> each build (diffoscope output attached).
>>> Could you take a look at that and see if there's a way to make it
>>> reproducible?
>>> Thanks.
>> Hi Guillaume,
>> Thank you for your report. I don't see anything in the pdfroff
>> documentation about getting rid of this. I might use a program similar
>> to this one to just overwrite the UUIDs and dates:
>
> I've noticed that there are some dates in the document which
> respond to SOURCE_DATE_EPOCH:
>
>   2 0 obj
>   <</Producer(GPL Ghostscript 9.26)
>   /CreationDate(D:20210717203740-07'00')
>   /ModDate(D:20210717203740-07'00')
>   /Creator(groff version 1.22.3)>>endobj
>
> If I build with the SOURCE_DATE_EPOCH environment variable,
> these dates from Ghostscript follow that variable.
> That's why Guillaume isn't seeing an issue in that section
> of the file.

Hi Kaz,

I tried your patch and it doesn't fix all the timestamps in the
environment used to build Guix packages:
 - Timestamps have the "YYYY-MM-DDTHH:MM:SSZ" format instead of
   "YYYY-MM-DDTHH:MM:SS+00:00"
 - There are two "...Date(D:YYYYMMDDHHMMSSZ..." timestamps after the XML
   block, although SOURCE_DATE_EPOCH is set to 1 in the environment

With the following modified 'pdf-clobber-stamps.tl' the document becomes
reproducible with Guix (but probably not in some other environments,
depending on the timezone format):

--8<---------------cut here---------------start------------->8---
(let* ((epoch (or (tointz (getenv "SOURCE_DATE_EPOCH")) 0))
       (isotime (time-string-utc epoch "%FT%TZ"))
       (pdf (file-get-string "txr-manpage.pdf"))
       (start (search-str pdf "<?xpacket begin="))
       (end (if start (search-str pdf "<?xpacket end" start)))
       (xml (if end [pdf start..end]))
       (orig-len (len xml)))
  (unless xml
    (format *stderr* "XML block not found in PDF")
    (exit nil))
  (upd xml
    (regsub #/uuid:........-....-....-....-............/
            "uuid:00000000-0000-0000-0000-000000000000")
    (regsub #/Date>....-..-..T..:..:..Z/
            `Date>@isotime`))
  (assert (eql (len xml) orig-len))
  (set [pdf start..end] xml)
  (upd pdf
    (regsub #/Date\(D:..............Z/
            "Date(D:19700101000001Z"))
  (file-put-string "txr-manpage.pdf.temp" pdf)
  (rename-path "txr-manpage.pdf.temp" "txr-manpage.pdf"))
--8<---------------cut here---------------end--------------->8---
[signature.asc (application/pgp-signature, inline)]

Information forwarded to guix-patches <at> gnu.org:
bug#49517; Package guix-patches. (Sun, 18 Jul 2021 13:00:02 GMT) Full text and rfc822 format available.

Message #26 received at 49517 <at> debbugs.gnu.org (full text, mbox):

From: "Paul A. Patience" <paul <at> apatience.com>
To: Guillaume Le Vaillant <glv <at> posteo.net>
Cc: Kaz Kylheku <kaz <at> kylheku.com>, 49517 <at> debbugs.gnu.org
Subject: Re: [bug#49517] [PATCH] gnu: txr: Build documentation and update to
 265.
Date: Sun, 18 Jul 2021 12:59:29 +0000
[Message part 1 (text/plain, inline)]
Hi Guillaume,

On Sunday, July 18th, 2021 at 06:36, Guillaume Le Vaillant <glv <at> posteo.net> wrote:

> Hi Kaz,
>
> I tried your patch and it doesn't fix all the timestamps in the
> environment used to build Guix packages:

I had sent an email last night but accidentally only to Kaz. Here it is below:

On Saturday, July 17th, 2021 at 18:51, Kaz Kylheku <kaz <at> kylheku.com> wrote:
> On 2021-07-17 02:57, Guillaume Le Vaillant wrote:
>> When testing the patch to build the HTML and PDF documentation,
>> I noticed that the 'share/doc/txr-263/txr-manpage.pdf' file is not
>> reproducible. There are some timestamps and UUIDs in it that change at
>> each build (diffoscope output attached).

I've updated the first patch to fix this by setting GS_GENERATE_UUIDS
to 0, which seems to be the standard Guix way to patch groff's use of
Ghostscript.
It removes most of the date (i.e., the hours, minutes and seconds) and
the UUID, but leaves the year, month and day:

  $ xxd /gnu/store/h94iilsa2xsp2ymn3k9x3ckmvfjha731-txr-266/share/doc/txr-266/txr-manpage.pdf | grep -C 1 Date
  00231430: 702f 312e 302f 273e 3c78 6d70 3a4d 6f64  p/1.0/'><xmp:Mod
  00231440: 6966 7944 6174 653e 3230 3231 2d30 372d  ifyDate>2021-07-
  00231450: 3138 3c2f 786d 703a 4d6f 6469 6679 4461  18</xmp:ModifyDa
  --
  00231470: 6174 653e 3230 3231 2d30 372d 3138 3c2f  ate>2021-07-18</
  00231480: 786d 703a 4372 6561 7465 4461 7465 3e0a  xmp:CreateDate>.
  00231490: 3c78 6d70 3a43 7265 6174 6f72 546f 6f6c  <xmp:CreatorTool

Is this acceptable?
Otherwise we may have to resort to a variation of the method Kaz
mentioned, though it's probably better to fix the Ghostscript patches
implementing GS_GENERATE_UUIDS, because otherwise any package relying on
groff to make PDFs will suffer from this very problem.
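
For readers who want to confirm what the hex dump above contains without
the ASCII column, here is a small Python sketch that decodes the hex
groups of one xxd output line. The helper name xxd_line_bytes is
hypothetical, and it assumes xxd's default layout (offset, colon,
space-separated hex groups, two spaces, ASCII column).

```python
def xxd_line_bytes(line: str) -> bytes:
    """Decode the hex columns of one line of default-format `xxd` output."""
    # Drop the "00231440:" offset prefix.
    _offset, rest = line.strip().split(":", 1)
    # Hex groups are separated by single spaces; two spaces start the ASCII column.
    hex_part = rest.strip().split("  ", 1)[0]
    return bytes.fromhex(hex_part.replace(" ", ""))
```

Applied to the two ModifyDate lines of the dump, it yields the date with
only year, month and day, which is the behaviour described above.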

> Thank you for your report. I don't see anything in the pdfroff
> documentation about getting rid of this.

The problem is in fact with Ghostscript [1].
Ghostscript is the program adding the metadata.

> 2. Is there some recommended practice with regard to some
>     ./configure option or environment/make variable to react to
>     for ensuring reproducible builds? So that is to say, suppose
>     I don't wish to do the above embedded XML cleaning, except
>     when building for a distro that strives for reproducibility.
>
>     For opting in to reproducibility, should I again rely on
>     SOURCE_DATE_EPOCH and have the build react to it?

I think the goal of SOURCE_DATE_EPOCH is for projects such as TXR to
need do nothing, and rather have Guix arrange for the "builder"
applications (i.e., Ghostscript here) to produce reproducible outputs.
In this case with GS_GENERATE_UUIDS=0.
So I don't think TXR need change anything.

Since I had to make a change in one of the patches, I have added a third
patch (squeezed in between the other two) adjusting the installation of
the license files.
The three patches are attached.

(Kaz, if there's anything TXR should change, perhaps it is the target
directory of the license files, i.e., $(datadir) -> $(docdir).
I think it's more common in general to install license files into
/usr/share/doc/APP rather than /usr/share/APP -- at least, that's where
Guix installs them.
This would render the second attached patch unnecessary.)

Best regards,
Paul

[1]: https://bugs.ghostscript.com/show_bug.cgi?id=696765
[0001-gnu-txr-Build-documentation.patch (text/x-patch, attachment)]
[0002-gnu-txr-Fix-license-installation.patch (text/x-patch, attachment)]
[0003-gnu-txr-Update-to-266.patch (text/x-patch, attachment)]

Information forwarded to guix-patches <at> gnu.org:
bug#49517; Package guix-patches. (Sun, 18 Jul 2021 20:28:01 GMT) Full text and rfc822 format available.

Message #29 received at 49517 <at> debbugs.gnu.org (full text, mbox):

From: Kaz Kylheku <kaz <at> kylheku.com>
To: Guillaume Le Vaillant <glv <at> posteo.net>
Cc: "Paul A. Patience" <paul <at> apatience.com>, 49517 <at> debbugs.gnu.org
Subject: Re: [bug#49517] [PATCH] gnu: txr: Build documentation and update to
 265.
Date: Sun, 18 Jul 2021 13:27:18 -0700
On 2021-07-18 03:36, Guillaume Le Vaillant wrote:
> Hi Kaz,
> 
> I tried your patch and it doesn't fix all the timestamps in the
> environment used to build Guix packages:
>  - Timestamps have the "YYYY-MM-DDTHH:MM:SSZ" format instead of
>    "YYYY-MM-DDTHH:MM:SS+00:00"
>  - There are two "...Date(D:YYYYMMDDHHMMSSZ..." timestamps after the XML
>    block, although SOURCE_DATE_EPOCH is set to 1 in the environment

These are precisely the entries I was referring to in my other post.
In the Ubuntu environment, these are following SOURCE_DATE_EPOCH.

In fact, all the dates follow SOURCE_DATE_EPOCH. Even with my
hack commented out, if we do this:

  $ SOURCE_DATE_EPOCH=0 make txr-manpage.pdf
  ./txr checkman.txr txr.1
  tbl txr.1 | pdfroff -ww -man --no-toc - > txr-manpage.pdf
  ./pdfroff-eCdDwXuD8U/pdf29977.cmp:1: warning: macro `pdfhref' not defined
  txr.1:36: warning: number register `M2' not defined
  # [ $SOURCE_DATE_EPOCH ] && ./txr pdf-clobber-stamps.tl || true

the resulting dates are all set to 1970-01-01:

  $ strings txr-manpage.pdf | grep -E 'Mod|Crea'
  <rdf:Description rdf:about='uuid:9f558000-55ee-11bd-0000-096f2d10ec33' xmlns:xmp='http://ns.adobe.com/xap/1.0/'><xmp:ModifyDate>1970-01-01T00:00:00Z</xmp:ModifyDate>
  <xmp:CreateDate>1970-01-01T00:00:00Z</xmp:CreateDate>
  <xmp:CreatorTool>groff version 1.22.3</xmp:CreatorTool></rdf:Description>
  /CreationDate(D:19700101000000Z00'00')
  /ModDate(D:19700101000000Z00'00')
  /Creator(groff version 1.22.3)>>endobj

Moreover, the uuid: strings are not changing between repetitions.

Either Ubuntu has a different upstream for these tools, or else they
have some patches (which would be worth stealing instead of repeating
the work).

Moreover, if Ubuntu has patches for this, it might be getting them from
Debian.

> With the following modified 'pdf-clobber-stamps.tl' the document becomes
> reproducible with Guix (but probably not in some other environments,
> depending on the timezone format):

This is interesting, not to mention an annoying variation. I wonder
where this timezone format is coming from? It doesn't seem to be any
locale variable under LC_TIME.

It's also weird how the timezone is expressed with a colon in the
Ubuntu build, as -07:00.  I don't see anything in strftime for that,
looking at the latest Glibc documentation online.

In the Ghostscript code it seems that the latter dates: /CreationDate
and all, are the source of the values put into the XML.

The /CreationDate is being printed using a gs_sprintf call. Here is
the link to the Debian repo, inside a function called
pdf_image_finish_file:

https://sources.debian.org/src/ghostscript/9.53.3%7Edfsg-7/devices/gdevpdfimg.c/?hl=670#L753

        gs_sprintf(CreationDate, "(D:%04d%02d%02d%02d%02d%02d%c%02d\'%02d\')",
            tms.tm_year + 1900, tms.tm_mon + 1, tms.tm_mday,
            tms.tm_hour, tms.tm_min, tms.tm_sec,
            timesign, timeoffset / 60, timeoffset % 60);
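
The format string can be exercised outside Ghostscript. The Python
sketch below applies the same printf-style layout; it is an
illustration, not Ghostscript code. The name pdf_info_date is
hypothetical, and the offset is assumed to be in minutes, which is what
the `timeoffset / 60, timeoffset % 60` arguments suggest.

```python
import time


def pdf_info_date(tm: time.struct_time, timesign: str, offset_minutes: int) -> str:
    """Format a broken-down time the way the gs_sprintf call above does."""
    # Python's struct_time already holds the full year and a 1-based month,
    # so the "+ 1900" and "+ 1" adjustments of the C code are not needed.
    return "(D:%04d%02d%02d%02d%02d%02d%s%02d'%02d')" % (
        tm.tm_year, tm.tm_mon, tm.tm_mday,
        tm.tm_hour, tm.tm_min, tm.tm_sec,
        timesign, offset_minutes // 60, offset_minutes % 60)
```

For the epoch with a 'Z' designator this produces the same string as the
/CreationDate line in the Ubuntu output quoted earlier in this message.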


I found the code which converts the date with the colon in the timezone,
the function pdf_xmp_convert_time:

https://sources.debian.org/src/ghostscript/9.53.3%7Edfsg-7/devices/vector/gdevpdfe.c/#L222

It looks the same as in the ArtifexSoftware ghostpdl upstream. It is
ad-hoc code, not using strftime, which puts in the colon.

This behavior is conditional depending on the input, though.
There is a case in which it puts in a Z and terminates, resulting
(I am guessing) in the format seen on Guix:

    dt[19] = buf[14]; /* designator */
    if (dt[19] == 'Z')
        return 20;
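
A Python sketch of the conversion as described here, not a transcription
of Ghostscript's function. The name xmp_time is hypothetical, and the
sketch assumes the input always carries all 14 date and time digits.

```python
def xmp_time(pdf_date: str) -> str:
    """Convert D:YYYYMMDDHHMMSS<sign>HH'MM' to the XMP (ISO 8601 style) form."""
    s = pdf_date[2:] if pdf_date.startswith("D:") else pdf_date
    out = f"{s[0:4]}-{s[4:6]}-{s[6:8]}T{s[8:10]}:{s[10:12]}:{s[12:14]}"

    designator = s[14:15]
    if designator in ("", "Z"):
        # A 'Z' designator ends the conversion: no numeric offset is emitted.
        return out + designator

    # Otherwise copy the offset hours, skip the apostrophe, and insert a colon.
    return f"{out}{designator}{s[15:17]}:{s[18:20]}"
```

The early return for 'Z' is what yields the shorter timestamp format,
while a signed offset yields the longer one with a colon.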

The pdf_image_finish_file function writes a Z if it is compiled with
#ifdef CLUSTER. This CLUSTER compile-time switch has to do with some
"cluster testing" that requires reproducible files.

It will also write a Z if it finds that the time offset is zero:

  #ifdef CLUSTER
        memset(&t, 0, sizeof(t));
        memset(&tms, 0, sizeof(tms));
        timesign = 'Z';
        timeoffset = 0;
  #else
        time(&t);
        tms = *gmtime(&t);
        tms.tm_isdst = -1;
        timeoffset = (int)difftime(t, mktime(&tms)); /* tz+dst in seconds */
        timesign = (timeoffset == 0 ? 'Z' : timeoffset < 0 ? '-' : '+');
        timeoffset = any_abs(timeoffset) / 60;
        tms = *localtime(&t);
  #endif

Aha, this may be what is going on in the Guix build: that the time
offset has been set to zero and so the 'Z' character is written; then
the conversion function to the other date format writes a 'Z' and quits.
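
The designator choice in the #else branch above can be sketched in
Python as follows. The helper name tz_designator is hypothetical; the
sketch only mirrors the sign selection and the seconds-to-minutes
division, and says nothing about how a particular build environment
arrives at a zero offset.

```python
def tz_designator(offset_seconds: int):
    """Return (designator, offset_minutes) the way the C excerpt computes them."""
    if offset_seconds == 0:
        sign = "Z"          # zero offset from UTC: written as 'Z'
    elif offset_seconds < 0:
        sign = "-"
    else:
        sign = "+"
    return sign, abs(offset_seconds) // 60
```

A build whose local time equals UTC therefore gets 'Z', while a machine
seven hours behind UTC gets '-' and 420 minutes.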

I don't see where this code reacts to SOURCE_DATE_EPOCH like I'm seeing
on Ubuntu; maybe I'm looking at the wrong branch of the Debian repo,
or it really is Ubuntu who did that?

In any case, if we end up needing any aspect of my hack, I think I can
make it account for all the variations we can expect to see out of this
code.

Cheers ...




Information forwarded to guix-patches <at> gnu.org:
bug#49517; Package guix-patches. (Sun, 18 Jul 2021 21:30:02 GMT) Full text and rfc822 format available.

Message #32 received at 49517 <at> debbugs.gnu.org (full text, mbox):

From: "Paul A. Patience" <paul <at> apatience.com>
To: Kaz Kylheku <kaz <at> kylheku.com>
Cc: Guillaume Le Vaillant <glv <at> posteo.net>, 49517 <at> debbugs.gnu.org
Subject: Re: [bug#49517] [PATCH] gnu: txr: Build documentation and update to
 265.
Date: Sun, 18 Jul 2021 21:28:58 +0000
On Sunday, July 18th, 2021 at 16:27, Kaz Kylheku <kaz <at> kylheku.com> wrote:
> Either Ubuntu has a different upstream for these tools, or else they
> have some patches (which would be worth stealing instead of repeating
> the work).
>
> Moreover, if Ubuntu has patches for this, it might be getting them from
> Debian.

I know Debian is making great efforts to obtain reproducible builds [1],
and in fact if you look at the first message of the bug report I
previously linked [2], they mention that they have been using some
custom patches to get Ghostscript to produce reproducible output
(on Debian).
In fact, we can find some information about Debian's Ghostscript patch
here [3], though unfortunately the link to the patch is dead.

(There is also more information about reproducible builds here [4].)

Best regards,
Paul

[1]: https://isdebianreproducibleyet.com/
[2]: https://bugs.ghostscript.com/show_bug.cgi?id=696765
[3]: https://wiki.debian.org/ReproducibleBuilds/PdfGeneratedByGhostscript
[4]: https://wiki.debian.org/ReproducibleBuilds/Howto





Information forwarded to guix-patches <at> gnu.org:
bug#49517; Package guix-patches. (Mon, 19 Jul 2021 03:24:02 GMT) Full text and rfc822 format available.

Message #35 received at 49517 <at> debbugs.gnu.org (full text, mbox):

From: Kaz Kylheku <kaz <at> kylheku.com>
To: Guillaume Le Vaillant <glv <at> posteo.net>
Cc: "Paul A. Patience" <paul <at> apatience.com>, 49517 <at> debbugs.gnu.org
Subject: Re: [bug#49517] [PATCH] gnu: txr: Build documentation and update to
 265.
Date: Sun, 18 Jul 2021 20:23:07 -0700
On 2021-07-18 03:36, Guillaume Le Vaillant wrote:
> Hi Kaz,
> 
> I tried your patch and it doesn't fix all the timestamps in the
> environment used to build Guix packages:

OK,

I have a new patch which works for both the observed time
formats issued by Ghostscript, which I reproduced and tested.

Patch follows.

I extended the capture region to extract not only the XML
but that bit of PostScript with the dates which follows right
after it.

(I structured the script this way to avoid doing a regex search
and replace of the whole file, which is not only more time
consuming but risks more false positives than necessary.)

The replacement argument of regsub can be a function; the function
receives the original string and calculates its replacement.
So we can check for a trailing Z and act accordingly.
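
The function-as-replacement technique has a direct analogue in Python's
re.sub, shown here only to illustrate the idea; the patch itself is in
TXR Lisp. The names XMP_DATE and clobber_xmp_dates are hypothetical.

```python
import re

# Either a 'Z' designator or a signed HH:MM offset may follow the seconds.
XMP_DATE = re.compile(r"Date>\d{4}-\d\d-\d\dT\d\d:\d\d:\d\d(Z|[+-]\d\d:\d\d)")


def clobber_xmp_dates(xml: str, isotime: str) -> str:
    """Replace every XMP date, keeping each match's original length."""
    def replacement(match: re.Match) -> str:
        # Keep 'Z' where the original had it; otherwise emit a zero offset.
        suffix = "Z" if match.group(1) == "Z" else "+00:00"
        return f"Date>{isotime}{suffix}"

    return XMP_DATE.sub(replacement, xml)
```

Because the suffix is chosen per match, both observed formats are
rewritten without changing the size of the XML block.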

commit 920ae93cd768222db7387ee026f2d779d5e6de09 (HEAD -> master)
Author: Kaz Kylheku <kaz <at> kylheku.com>
Date:   Sat Jul 17 19:11:20 2021 -0700

    doc: reproducible PDF.

    * Makefile (txr-manpage.pdf): If SOURCE_DATE_EPOCH exists,
    then run pdf-clobber-stamps.tl.

    * pdf-clobber-stamps.tl: New file.

diff --git a/Makefile b/Makefile
index 0094985f..cac9b3c0 100644
--- a/Makefile
+++ b/Makefile
@@ -560,6 +560,7 @@ txr-manpage.html: txr.1 genman.txr
 txr-manpage.pdf: txr.1 checkman.txr
        $(TXR) checkman.txr $<
        tbl $< | pdfroff -ww -man --no-toc - > $@
+       [ $$SOURCE_DATE_EPOCH ] && $(TXR) pdf-clobber-stamps.tl || true

 #
 # Special targets used by ./configure
diff --git a/pdf-clobber-stamps.tl b/pdf-clobber-stamps.tl
new file mode 100644
index 00000000..78ea06c6
--- /dev/null
+++ b/pdf-clobber-stamps.tl
@@ -0,0 +1,22 @@
+(let* ((epoch (or (tointz (getenv "SOURCE_DATE_EPOCH")) 0))
+       (pdf (file-get-string "txr-manpage.pdf"))
+       (start (search-str pdf "<?xpacket begin="))
+       (end (if start (search-str pdf "/Creator(" start)))
+       (xml (if end [pdf start..end]))
+       (orig-len (len xml))
+       (isotime (time-string-utc epoch "%FT%T"))
+       (gstime (time-string-utc epoch "%Y%m%d%H%M%SZ0000")))
+  (unless xml
+    (format *stderr* "XML block not found in PDF")
+    (exit nil))
+  (upd xml
+    (regsub #/uuid:........-....-....-....-............/
+            "uuid:00000000-0000-0000-0000-000000000000")
+    (regsub #/Date>....-..-..T..:..:..(Z|[+\-]..:..)/
+            (ret `Date>@isotime@(if (ends-with "Z" @1) "Z" "+00:00")`))
+    (regsub #/Date\(D:..............[Z+\-]..../
+            `Date(D:@gstime`))
+  (assert (eql (len xml) orig-len))
+  (set [pdf start..end] xml)
+  (file-put-string "txr-manpage.pdf.temp" pdf)
+  (rename-path "txr-manpage.pdf.temp" "txr-manpage.pdf"))





Information forwarded to guix-patches <at> gnu.org:
bug#49517; Package guix-patches. (Mon, 19 Jul 2021 12:09:02 GMT) Full text and rfc822 format available.

Message #38 received at 49517 <at> debbugs.gnu.org (full text, mbox):

From: Guillaume Le Vaillant <glv <at> posteo.net>
To: "Paul A. Patience" <paul <at> apatience.com>
Cc: Kaz Kylheku <kaz <at> kylheku.com>, 49517 <at> debbugs.gnu.org
Subject: Re: [bug#49517] [PATCH] gnu: txr: Build documentation and update to
 265.
Date: Mon, 19 Jul 2021 12:08:19 +0000
[Message part 1 (text/plain, inline)]
Paul A. Patience <paul <at> apatience.com> skribis:

> On Sunday, July 18th, 2021 at 16:27, Kaz Kylheku <kaz <at> kylheku.com> wrote:
>> Either Ubuntu has a different upstream for these tools, or else they
>> have some patches (which would be worth stealing instead of repeating
>> the work).
>>
>> Moreover, if Ubuntu has patches for this, it might be getting them from
>> Debian.
>
> I know Debian is making great efforts to obtain reproducible builds [1],
> and in fact if you look at the first message of the bug report I
> previously linked [2], they mention that they have been using some
> custom patches to get Ghostscript to produce reproducible output
> (on Debian).
> In fact, we can find some information about Debian's Ghostscript patch
> here [3], though unfortunately the link to the patch is dead.
>
> (There is also more information about reproducible builds here [4].)
>
> Best regards,
> Paul
>
> [1]: https://isdebianreproducibleyet.com/
> [2]: https://bugs.ghostscript.com/show_bug.cgi?id=696765
> [3]: https://wiki.debian.org/ReproducibleBuilds/PdfGeneratedByGhostscript
> [4]: https://wiki.debian.org/ReproducibleBuilds/Howto

So Debian indeed has a patch adding the possibility to set the timestamp
based on SOURCE_DATE_EPOCH (see '2010_add_build_timestamp_setting.patch'
in [1] for example).

Guix also has a patch, but a different one based on GS_GENERATE_UUIDS.
However this patch is missing a part disabling two of the timestamps.
I proposed a patch to fix that (see [2]).
With this fix, 'pdf-clobber-stamps.tl' is not necessary anymore to build
the documentation reproducibly in Guix, but it might still be useful for
some other build environments.

[1] http://security.debian.org/debian-security/pool/updates/main/g/ghostscript/ghostscript_9.26a~dfsg-0+deb9u7.debian.tar.xz
[2] https://bugs.gnu.org/49640
[signature.asc (application/pgp-signature, inline)]

Information forwarded to guix-patches <at> gnu.org:
bug#49517; Package guix-patches. (Mon, 19 Jul 2021 21:32:01 GMT) Full text and rfc822 format available.

Message #41 received at 49517 <at> debbugs.gnu.org (full text, mbox):

From: Kaz Kylheku <kaz <at> kylheku.com>
To: Guillaume Le Vaillant <glv <at> posteo.net>
Cc: "Paul A. Patience" <paul <at> apatience.com>, 49517 <at> debbugs.gnu.org
Subject: Re: [bug#49517] [PATCH] gnu: txr: Build documentation and update to
 265.
Date: Mon, 19 Jul 2021 14:31:15 -0700
On 2021-07-19 05:08, Guillaume Le Vaillant wrote:
> So Debian indeed has a patch adding the possibility to set the timestamp
> based on SOURCE_DATE_EPOCH (see '2010_add_build_timestamp_setting.patch'
> in [1] for example).

Looks like they rolled out this patch into production in 2015.

Is there a reason why Guix can't just steal the Debian patches
related to reproducibility? (Like underlying differences in the overall
approach which lead to incompatibilities?)

It would probably be best if distros did this the same way, so
there are no surprises.

GNU/Linux could set a precedent for other platforms, even.
If I'm building something on, say, Cygwin, OpenBSD or macOS, and the
reproducibility stuff works the same way as on GNU/Linux, that's
great.

Here is a powerful argument why Just One Way of doing it is better:

Distros should not be carrying patches for this in the first place;
the programs themselves should be upstreaming the changes for
reproducibility.

If there is an agreed-upon /de facto/ (or /de jure/) standard way
of doing it, it is easier to persuade the individual program developers
to accept the changes. They have a single target to hit which covers
all platforms.

In contrast, if reproducibility is an /ad hoc/ OS-and-distro-specific
matter, they are going to be understandably less motivated to upstream
the changes.

Nobody wants a situation in their source tree like:

  patches/for-debian
         /for-guix
         /for-solaris
         ...

Just one implementation, committed into trunk, with no #ifdefs.




Reply sent to Guillaume Le Vaillant <glv <at> posteo.net>:
You have taken responsibility. (Tue, 20 Jul 2021 09:08:01 GMT) Full text and rfc822 format available.

Notification sent to "Paul A. Patience" <paul <at> apatience.com>:
bug acknowledged by developer. (Tue, 20 Jul 2021 09:08:01 GMT) Full text and rfc822 format available.

Message #46 received at 49517-done <at> debbugs.gnu.org (full text, mbox):

From: Guillaume Le Vaillant <glv <at> posteo.net>
To: "Paul A. Patience" <paul <at> apatience.com>
Cc: Kaz Kylheku <kaz <at> kylheku.com>, 49517-done <at> debbugs.gnu.org
Subject: Re: [bug#49517] [PATCH] gnu: txr: Build documentation and update to
 265.
Date: Tue, 20 Jul 2021 09:07:20 +0000
[Message part 1 (text/plain, inline)]
Paul A. Patience <paul <at> apatience.com> skribis:

> I've updated the first patch to fix this by setting GS_GENERATE_UUIDS
> to 0, which seems to be the standard Guix way to patch groff's use of
> Ghostscript.
> It removes most of the date (i.e., the hours, minutes and seconds) and
> the UUID, but leaves the year, month and day:
>
>   $ xxd /gnu/store/h94iilsa2xsp2ymn3k9x3ckmvfjha731-txr-266/share/doc/txr-266/txr-manpage.pdf | grep -C 1 Date
>   00231430: 702f 312e 302f 273e 3c78 6d70 3a4d 6f64  p/1.0/'><xmp:Mod
>   00231440: 6966 7944 6174 653e 3230 3231 2d30 372d  ifyDate>2021-07-
>   00231450: 3138 3c2f 786d 703a 4d6f 6469 6679 4461  18</xmp:ModifyDa
>   --
>   00231470: 6174 653e 3230 3231 2d30 372d 3138 3c2f  ate>2021-07-18</
>   00231480: 786d 703a 4372 6561 7465 4461 7465 3e0a  xmp:CreateDate>.
>   00231490: 3c78 6d70 3a43 7265 6174 6f72 546f 6f6c  <xmp:CreatorTool
>
> Is this acceptable?
> Otherwise we may have to resort to a variation of the method Kaz
> mentioned, though it's probably better to fix the Ghostscript patches
> implementing GS_GENERATE_UUIDS, because otherwise any package relying on
> groff to make PDFs will suffer from this very problem.

Hi Paul,

I pushed your patches as 75922458af60081bf6964006d5b9c180ff9ec8ca and
following with some modifications. I added a phase replacing the
hardcoded "/bin/sh" by the real path to bash in "/gnu/store/...", which
makes all the tests pass.
For now the PDF documentation still has the "ModifyDate" and
"CreateDate" fields. The fix for this is in the core-updates branch, so
when core-updates gets merged into master, the PDF should become
reproducible.
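
As a rough way to check whether a built PDF still carries the
"ModifyDate" and "CreateDate" fields discussed above, the following
sketch scans the raw file bytes for them, much like the xxd | grep
pipeline quoted earlier. The name xmp_dates is hypothetical, and the
sketch assumes the XMP metadata stream is stored uncompressed, as it
appears to be in the dump above.

```python
import re

# Matches XMP date elements such as <xmp:ModifyDate>2021-07-18</xmp:ModifyDate>.
XMP_DATE = re.compile(rb"<xmp:(ModifyDate|CreateDate)>([^<]*)</xmp:\1>")


def xmp_dates(data):
    """Return (field, value) pairs for XMP date elements found in raw bytes."""
    return [
        (m.group(1).decode("ascii"), m.group(2).decode("ascii", "replace"))
        for m in XMP_DATE.finditer(data)
    ]
```

For example, xmp_dates(open(path, "rb").read()) returns an empty list
when neither field is present in the file at path.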
[signature.asc (application/pgp-signature, inline)]

Information forwarded to guix-patches <at> gnu.org:
bug#49517; Package guix-patches. (Tue, 20 Jul 2021 09:19:02 GMT) Full text and rfc822 format available.

Message #49 received at 49517 <at> debbugs.gnu.org (full text, mbox):

From: Guillaume Le Vaillant <glv <at> posteo.net>
To: Kaz Kylheku <kaz <at> kylheku.com>
Cc: "Paul A. Patience" <paul <at> apatience.com>, 49517 <at> debbugs.gnu.org
Subject: Re: [bug#49517] [PATCH] gnu: txr: Build documentation and update to
 265.
Date: Tue, 20 Jul 2021 09:18:39 +0000
[Message part 1 (text/plain, inline)]
Kaz Kylheku <kaz <at> kylheku.com> skribis:

> On 2021-07-19 05:08, Guillaume Le Vaillant wrote:
>> So Debian indeed has a patch adding the possibility to set the timestamp
>> based on SOURCE_DATE_EPOCH (see '2010_add_build_timestamp_setting.patch'
>> in [1] for example).
>
> Looks like they rolled out this patch into production in 2015.
>
> Is there a reason why Guix can't just steal the Debian patches
> related to reproducibility? (Like underlying differences in the overall
> approach that lead to incompatibilities?)

I don't think so; the developer who made the patch for Guix probably
just didn't know about Debian's patch.


> It would probably be best if distros did this the same way, so
> there are no surprises.
>
> GNU/Linux could set a precedent for other platforms, even.
> If I'm building something on, say, Cygwin, OpenBSD or macOS, and the
> reproducibility stuff works the same way as on GNU/Linux, that's
> great.
>
> Here is a powerful argument why Just One Way of doing it is better:
>
> Distros should not be carrying patches for this in the first place;
> the programs themselves should be upstreaming the changes for
> reproducibility.
>
> If there is an agreed-upon /de facto/ (or /de jure/) standard way
> of doing it, it is easier to persuade the individual program developers to
> accept the changes. They have a single target to hit which covers
> all platforms.
>
> In contrast, if reproducibility is an /ad hoc/ OS-and-distro-specific
> matter, they are going to be understandably less motivated to upstream
> the changes.
>
> Nobody wants a situation in their source tree like:
>
>   patches/for-debian
>          /for-guix
>          /for-solaris
>          ...
>
> Just one implementation, committed into trunk, with no #ifdefs.

In this case, upstream explicitly refused to merge the patches for
reproducibility (https://bugs.ghostscript.com/show_bug.cgi?id=698208).
[signature.asc (application/pgp-signature, inline)]

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 17 Aug 2021 11:24:05 GMT) Full text and rfc822 format available.

This bug report was last modified 2 years and 245 days ago.
