GNU bug report logs - #60288
[RESEND #2] [PATCH v1 0/2] Start adding ZIM file(s)

Package: guix-patches

Reported by: Denis 'GNUtoo' Carikli <GNUtoo <at> cyberdimension.org>

Date: Fri, 23 Dec 2022 22:09:01 UTC

Severity: normal

Tags: patch

Done: Denis 'GNUtoo' Carikli <GNUtoo <at> cyberdimension.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 60288 in the body.
You can then email your comments to 60288 AT debbugs.gnu.org in the normal way.


Report forwarded to guix-patches <at> gnu.org:
bug#60288; Package guix-patches. (Fri, 23 Dec 2022 22:09:01 GMT)

Acknowledgement sent to Denis 'GNUtoo' Carikli <GNUtoo <at> cyberdimension.org>:
New bug report received and forwarded. Copy sent to guix-patches <at> gnu.org. (Fri, 23 Dec 2022 22:09:01 GMT)

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Denis 'GNUtoo' Carikli <GNUtoo <at> cyberdimension.org>
To: guix-patches <at> gnu.org
Cc: Denis 'GNUtoo' Carikli <GNUtoo <at> cyberdimension.org>
Subject: [RESEND #2] [PATCH v1 0/2] Start adding ZIM file(s)
Date: Fri, 23 Dec 2022 23:07:31 +0100
[Message part 1 (text/plain, inline)]
Hi,

Here are two small patches. 

The first one adds #:substitutable? to the copy build system.

I don't know how to check whether it works as intended, though. It's
similar to commit d0050ea8ad1c32d94cf5ba6725a0fc961bb23f38
("build-system/go: Add #:substitutable? argument."), so normally
it shouldn't be an issue, but it would be best if someone could
double-check it, as it would avoid keeping very large substitutes
around.

The second patch adds a ZIM file. I'll most likely send more
patches to add additional ZIM file packages (about 10) later
on. I prefer doing it this way as it avoids having to deal with
potential rebases breaking if there is something wrong with my
second patch.

Denis 'GNUtoo' Carikli (2):
  build-system/copy: Add #:substitutable? argument.
  gnu: Add wikipedia_en_all_maxi

 gnu/local.mk               |  1 +
 gnu/packages/zim-files.scm | 86 ++++++++++++++++++++++++++++++++++++++
 guix/build-system/copy.scm |  4 +-
 3 files changed, 90 insertions(+), 1 deletion(-)
 create mode 100644 gnu/packages/zim-files.scm


base-commit: c193b5203b31246a6d74270c8086c45851561947
-- 
2.38.1

[Message part 2 (application/pgp-signature, inline)]

Information forwarded to guix-patches <at> gnu.org:
bug#60288; Package guix-patches. (Fri, 23 Dec 2022 22:21:01 GMT)

Message #8 received at 60288 <at> debbugs.gnu.org (full text, mbox):

From: Denis 'GNUtoo' Carikli <GNUtoo <at> cyberdimension.org>
To: 60288 <at> debbugs.gnu.org
Cc: Denis 'GNUtoo' Carikli <GNUtoo <at> cyberdimension.org>
Subject: [PATCH v1 1/2] build-system/copy: Add #:substitutable? argument.
Date: Fri, 23 Dec 2022 23:20:23 +0100
* guix/build-system/copy.scm (copy-build): Add 'substitutable?'
  argument.
---
 guix/build-system/copy.scm | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/guix/build-system/copy.scm b/guix/build-system/copy.scm
index 4894ba46fb..bb4d2daaa8 100644
--- a/guix/build-system/copy.scm
+++ b/guix/build-system/copy.scm
@@ -96,7 +96,8 @@ (define* (copy-build name inputs
                      (target #f)
                      (imported-modules %copy-build-system-modules)
                      (modules '((guix build copy-build-system)
-                                (guix build utils))))
+                                (guix build utils)))
+                     (substitutable? #t))
   "Build SOURCE using INSTALL-PLAN, and with INPUTS."
   (define builder
     (with-imported-modules imported-modules
@@ -129,6 +130,7 @@ (define builder
     (gexp->derivation name builder
                       #:system system
                       #:target #f
+                      #:substitutable? substitutable?
                       #:guile-for-build guile)))
 
 (define copy-build-system
-- 
2.38.1
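
On the open question from the cover letter (how to check that the new
argument works), one possible check is to lower a package that sets
#:substitutable? #f and ask whether the resulting derivation is
substitutable. The sketch below is not part of the series. It assumes
that both patches are applied, that a build daemon is running, and that
it is evaluated from "./pre-inst-env guix repl". The package variable
is the one added by patch 2/2.

```scheme
;; Sketch only: assumes patches 1/2 and 2/2 are applied to the checkout.
(use-modules (guix store)
             (guix packages)
             (guix derivations)
             (gnu packages zim-files))   ; module added by patch 2/2

;; Computing the derivation does not download or build anything.
(with-store store
  (let ((drv (package-derivation store wikipedia-en-all-maxi)))
    ;; Should be #f if #:substitutable? #f reached gexp->derivation,
    ;; and #t for packages that leave the default in place.
    (substitutable-derivation? drv)))
```

Running the same expression on an unrelated copy-build-system package
would be a reasonable way to confirm that the default is unchanged.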





Information forwarded to guix-patches <at> gnu.org:
bug#60288; Package guix-patches. (Fri, 23 Dec 2022 22:21:02 GMT)

Message #11 received at 60288 <at> debbugs.gnu.org (full text, mbox):

From: Denis 'GNUtoo' Carikli <GNUtoo <at> cyberdimension.org>
To: 60288 <at> debbugs.gnu.org
Cc: Denis 'GNUtoo' Carikli <GNUtoo <at> cyberdimension.org>
Subject: [PATCH v1 2/2] gnu: Add wikipedia_en_all_maxi
Date: Fri, 23 Dec 2022 23:20:24 +0100
* gnu/packages/zim-files.scm (wikipedia-en-all-maxi): New variable.
---
 gnu/local.mk               |  1 +
 gnu/packages/zim-files.scm | 86 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 87 insertions(+)
 create mode 100644 gnu/packages/zim-files.scm

diff --git a/gnu/local.mk b/gnu/local.mk
index 5b8944f568..8957554fc2 100644
--- a/gnu/local.mk
+++ b/gnu/local.mk
@@ -643,6 +643,7 @@ GNU_SYSTEM_MODULES =				\
   %D%/packages/xfce.scm				\
   %D%/packages/zig.scm				\
   %D%/packages/zile.scm				\
+  %D%/packages/zim-files.scm			\
   %D%/packages/zwave.scm			\
 						\
   %D%/services.scm				\
diff --git a/gnu/packages/zim-files.scm b/gnu/packages/zim-files.scm
new file mode 100644
index 0000000000..49b7accb52
--- /dev/null
+++ b/gnu/packages/zim-files.scm
@@ -0,0 +1,86 @@
+;;; GNU Guix --- Functional package management for GNU
+;;; Copyright © 2022 Denis 'GNUtoo' Carikli <GNUtoo <at> cyberdimension.org>
+;;;
+;;; This file is part of GNU Guix.
+;;;
+;;; GNU Guix is free software; you can redistribute it and/or modify it
+;;; under the terms of the GNU General Public License as published by
+;;; the Free Software Foundation; either version 3 of the License, or (at
+;;; your option) any later version.
+;;;
+;;; GNU Guix is distributed in the hope that it will be useful, but
+;;; WITHOUT ANY WARRANTY; without even the implied warranty of
+;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;;; GNU General Public License for more details.
+;;;
+;;; You should have received a copy of the GNU General Public License
+;;; along with GNU Guix.  If not, see <http://www.gnu.org/licenses/>.
+
+(define-module (gnu packages zim-files)
+  #:use-module (gnu packages)
+  #:use-module (guix build-system copy)
+  #:use-module (guix download)
+  #:use-module (guix gexp)
+  #:use-module (guix utils)
+  #:use-module ((guix licenses) #:prefix license:)
+  #:use-module (guix packages))
+
+;;; Commentary:
+;;;
+;;; Many Guix contributors have a tendency to update packages in this
+;;; way: they only update the package revision and then launch a build
+;;; that fails just to make Guix tell them the right base32 hash. They
+;;; then update the base32 hash and launch the build again.
+;;;
+;;; However some ZIM files are quite big. At the time of writing,
+;;; wikipedia_en_all_maxi_2022-05.zim is about 89 GiB.
+;;;
+;;; So this approach is time-consuming, as the second time Guix
+;;; will download the same file again from scratch.
+;;;
+;;; The solution to this issue is to download the SHA-256 checksum file
+;;; (simply append .sha256 to the URL of the ZIM file). It contains a
+;;; line like this:
+;;; f12163513307893c87fd75009b1d61677bae675627eaadf4cb0fa63953eea021  wikipedia_en_all_maxi_2022-05.zim
+;;;
+;;; You can then use this hash to compute the base32 with nix-hash:
+;;; $ nix-hash --type sha256 --to-base32 \
+;;; f12163513307893c87fd75009b1d61677bae675627eaadf4cb0fa63953eea021
+;;; 08d0xr9kk9hgrgsavsi7arkswyv7c4frn03mzn3kr2876d8n68gi
+
+(define-public wikipedia-en-all-maxi
+  (package
+    (name "wikipedia-en-all-maxi")
+    (version "2022-05")
+    (source (origin
+              (method url-fetch)
+              (uri (string-append
+                    "https://mirror.download.kiwix.org/zim/wikipedia/"
+                    (string-replace-substring name "-" "_")
+                    "_" version ".zim"))
+              (sha256
+               (base32
+                "08d0xr9kk9hgrgsavsi7arkswyv7c4frn03mzn3kr2876d8n68gi"))))
+    (build-system copy-build-system)
+    (arguments
+     (list
+      ;; We are not (yet) generating the zim file, so it doesn't make sense to
+      ;; build substitutes.
+      #:substitutable? #f
+      ;; If we use kiwix-serve, the path of the ZIM file needs to be passed to
+      ;; it. And if the filename has a version in it, we'd need to update the
+      ;; path manually each time the package is updated. We also need to
+      ;; change the filename to match the package name.
+      #:install-plan #~'((#$(string-append
+                             (string-replace-substring name "-" "_")
+                             "_" version ".zim")
+                          #$(string-append "share/" name ".zim")))))
+    (synopsis
+     "Complete English Wikipedia packed in a ZIM file, for offline usage with
+Kiwix")
+    (description
+     "Wikipedia is a free encyclopedia.  This is the English version.  It
+contains all the articles, and all the media (images, etc.) present in
+the articles at a scaled-down resolution.")
+    (home-page "https://en.wikipedia.org/wiki/Main_Page")
+    (license license:cc-by-sa3.0)))
-- 
2.38.1
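
The hex-to-base32 step described in the Commentary of the patch above
can probably also be done without nix-hash, from Guile itself. The
sketch below is not part of the patch. It assumes a Guix checkout or
installation whose modules are on the load path (for example inside
"guix repl"), and uses the conversion procedures from (guix base16) and
(guix base32).

```scheme
;; Sketch only: convert the published hex SHA-256 to the nix-base32 form
;; expected by the (base32 "...") field of an origin.
(use-modules (guix base16)
             (guix base32))

(define hex-digest
  ;; Taken from the .sha256 file quoted in the Commentary.
  "f12163513307893c87fd75009b1d61677bae675627eaadf4cb0fa63953eea021")

;; The result should match the nix-hash output quoted in the Commentary.
(display (bytevector->nix-base32-string
          (base16-string->bytevector hex-digest)))
(newline)
```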





Information forwarded to guix-patches <at> gnu.org:
bug#60288; Package guix-patches. (Wed, 28 Dec 2022 18:15:01 GMT)

Message #14 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Christopher Baines <mail <at> cbaines.net>
To: Denis 'GNUtoo' Carikli <GNUtoo <at> cyberdimension.org>
Cc: 60288 <at> debbugs.gnu.org, guix-patches <at> gnu.org
Subject: Re: [bug#60288] [RESEND #2] [PATCH v1 0/2] Start adding ZIM file(s)
Date: Wed, 28 Dec 2022 18:10:54 +0000
[Message part 1 (text/plain, inline)]
Denis 'GNUtoo' Carikli <GNUtoo <at> cyberdimension.org> writes:

> Here are two small patches.
>
> The first one adds #:substitutable? to the copy build system.
>
> I don't know how to check whether it works as intended, though. It's
> similar to commit d0050ea8ad1c32d94cf5ba6725a0fc961bb23f38
> ("build-system/go: Add #:substitutable? argument."), so normally
> it shouldn't be an issue, but it would be best if someone could
> double-check it, as it would avoid keeping very large substitutes
> around.
>
> The second patch adds a ZIM file. I'll most likely send more
> patches to add additional ZIM file packages (about 10) later
> on. I prefer doing it this way as it avoids having to deal with
> potential rebases breaking if there is something wrong with my
> second patch.
>
> Denis 'GNUtoo' Carikli (2):
>   build-system/copy: Add #:substitutable? argument.
>   gnu: Add wikipedia_en_all_maxi

I haven't looked at this in detail, but one comment on the QA
failures. Building the package for this large file involves copying it
from the store, to another place in the store. This requires 2x the
space which this large file takes up, which is a pretty wasteful
approach.

This is the reason behind the build failures I've seen, the build
machines run out of space when attempting the file copy. Maybe an
alternative if you want to have a package would be to symlink to the
source. That way, there's only a large file and a symlink in the store,
rather than two copies of the same large file.

Chris
[signature.asc (application/pgp-signature, inline)]

Information forwarded to guix-patches <at> gnu.org:
bug#60288; Package guix-patches. (Wed, 28 Dec 2022 18:15:02 GMT)

Information forwarded to guix-patches <at> gnu.org:
bug#60288; Package guix-patches. (Thu, 29 Dec 2022 23:21:02 GMT)

Message #20 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Denis 'GNUtoo' Carikli <GNUtoo <at> cyberdimension.org>
To: Christopher Baines <mail <at> cbaines.net>
Cc: 60288 <at> debbugs.gnu.org, guix-patches <at> gnu.org
Subject: Re: [bug#60288] [RESEND #2] [PATCH v1 0/2] Start adding ZIM file(s)
Date: Fri, 30 Dec 2022 00:19:50 +0100
[Message part 1 (text/plain, inline)]
On Wed, 28 Dec 2022 18:10:54 +0000
Christopher Baines <mail <at> cbaines.net> wrote:
> I haven't looked at this in detail, but one comment on the QA
> failures. Building the package for this large file involves copying it
> from the store, to another place in the store. This requires 2x the
> space which this large file takes up, which is a pretty wasteful
> approach.
Not only that, but it also takes a very long time to do that copy on
slower machines with an encrypted rootfs.

> This is the reason behind the build failures I've seen, the build
> machines run out of space when attempting the file copy. Maybe an
> alternative if you want to have a package would be to symlink to the
> source. That way, there's only a large file and a symlink in the
> store, rather than two copies of the same large file.
I'll try that. I hope that guix gc will not garbage collect the source
though.

Do you know if it's possible just to have a source package somehow
(and download the source to a specific filename) and not copy anything
at all?

Denis.
[Message part 2 (application/pgp-signature, inline)]

Information forwarded to guix-patches <at> gnu.org:
bug#60288; Package guix-patches. (Mon, 02 Jan 2023 20:03:02 GMT)

Message #26 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Denis 'GNUtoo' Carikli <GNUtoo <at> cyberdimension.org>
To: Christopher Baines <mail <at> cbaines.net>
Cc: 60288 <at> debbugs.gnu.org, guix-patches <at> gnu.org
Subject: Re: [bug#60288] [RESEND #2] [PATCH v1 0/2] Start adding ZIM file(s)
Date: Mon, 2 Jan 2023 21:01:41 +0100
[Message part 1 (text/plain, inline)]
On Wed, 28 Dec 2022 18:10:54 +0000
Christopher Baines <mail <at> cbaines.net> wrote:
> Maybe an alternative if you want to have a package would be to
> symlink to the source.
The issue is that I don't know how to refer to the source in a
situation like that.

I didn't really find good examples of all that. So far the best
I saw was to either define (source [...]) and reuse it in multiple
packages or to reuse the source of another package with (package-source
<package name>) like in linux.scm.

The GNU build system copies the source into the current directory, so
I really have no idea what to do here. We might also need to add the
source to the inputs, native-inputs or propagated-inputs somehow, so
that it is not garbage collected when we install the ZIM file. Is
propagated-inputs the way to go?

Denis.
[Message part 2 (application/pgp-signature, inline)]
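
The thread ends without an answer to the question in the message above.
One possible shape for the symlink approach suggested earlier is
sketched below, under stated assumptions. It is not a reviewed or
tested package. It swaps copy-build-system for trivial-build-system
with a gexp builder, refers to the source with (package-source
this-package), and uses a hypothetical variable name. Because the
symlink target is a store file name, the output should end up with a
reference to the source, which would normally keep "guix gc" from
collecting the source while the package is live. No propagated-inputs
should be needed for that.

```scheme
;; Sketch only: a symlinking variant of the package from patch 2/2.
(use-modules (guix packages)
             (guix download)
             (guix gexp)
             (guix build-system trivial)
             ((guix licenses) #:prefix license:))

(define-public wikipedia-en-all-maxi/symlink   ; hypothetical name
  (package
    (name "wikipedia-en-all-maxi")
    (version "2022-05")
    (source (origin
              (method url-fetch)
              (uri (string-append
                    "https://mirror.download.kiwix.org/zim/wikipedia/"
                    "wikipedia_en_all_maxi_" version ".zim"))
              (sha256
               (base32
                "08d0xr9kk9hgrgsavsi7arkswyv7c4frn03mzn3kr2876d8n68gi"))))
    (build-system trivial-build-system)
    (arguments
     (list
      #:modules '((guix build utils))
      #:builder
      #~(begin
          (use-modules (guix build utils))
          (let ((share (string-append #$output "/share")))
            (mkdir-p share)
            ;; Link to the downloaded file instead of copying it, so the
            ;; store holds one large file plus a symlink.
            (symlink #$(package-source this-package)
                     (string-append share "/" #$name ".zim"))))))
    (synopsis "Complete English Wikipedia packed in a ZIM file")
    (description
     "Wikipedia is a free encyclopedia.  This is the English version.")
    (home-page "https://en.wikipedia.org/wiki/Main_Page")
    (license license:cc-by-sa3.0)))
```

Whether substitutes for such a package should also be disabled, as in
patch 2/2, is left open here. The sketch does not pass #:substitutable?
because it is an assumption, not something the thread confirms, that
trivial-build-system accepts that argument.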

bug closed, send any further explanations to 60288 <at> debbugs.gnu.org and Denis 'GNUtoo' Carikli <GNUtoo <at> cyberdimension.org> Request was from Denis 'GNUtoo' Carikli <GNUtoo <at> cyberdimension.org> to control <at> debbugs.gnu.org. (Tue, 05 Dec 2023 15:00:02 GMT)

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Wed, 03 Jan 2024 12:24:11 GMT)


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.