GNU bug report logs - #77898
31.0.50; arc-mode: Split PKZIP archive signature not recognized
To reply to this bug, email your comments to 77898 AT debbugs.gnu.org.
Report forwarded to bug-gnu-emacs <at> gnu.org: bug#77898; Package emacs. (Fri, 18 Apr 2025 12:33:03 GMT)
Acknowledgement sent to Jens Schmidt <jschmidt4gnu <at> vodafonemail.de>: New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Fri, 18 Apr 2025 12:33:03 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
The attached zip file is a "real" one, in the sense that it has been
created in the wild. It is a stripped down "terraform stack" generated
by one of the cloud providers.
InfoZIP's unzip can handle that file without problems:

  [emacs-master]$ unzip -v
  UnZip 6.00 of 20 April 2009, by Debian. Original by Info-ZIP.
  [...]

  [emacs-master]$ unzip -l tf-stack.zip
  Archive:  tf-stack.zip
    Length      Date    Time    Name
  ---------  ---------- -----   ----
         38  2025-04-01 08:43   provider.tf
  ---------                     -------
         38                     1 file

However, "./src/emacs -Q tf-stack.zip" fails on it with:

  File mode specification error: (error "Buffer format not recognized")

It turns out that the zip file starts with a special marker 0x08074b50
for spanned or split archives, as defined in the pseudo zip file
specification "APPNOTE.txt" by PKWARE. See section 8.5 "Capacities and
Markers" of that document for more details.
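
As an illustration outside of Emacs, the following Python sketch names the 32-bit signature found at the start of a byte string. The three constants are the values quoted in this report; the helper name `leading_signature` is hypothetical. ZIP stores these values little-endian, so 0x08074b50 appears on disk as the bytes "PK\007\010".

```python
import struct

# Signature values as quoted in this report (ZIP stores them little-endian).
LOCAL_FILE_HEADER = 0x04034B50      # bytes on disk: PK\x03\x04
SPLIT_MARKER = 0x08074B50           # bytes on disk: PK\x07\x08
TEMP_SPANNING_MARKER = 0x30304B50   # bytes on disk: PK00


def leading_signature(data: bytes) -> str:
    """Name the 32-bit signature at the start of DATA (hypothetical helper)."""
    if len(data) < 4:
        return "too short"
    # "<I" decodes an unsigned 32-bit little-endian integer.
    (value,) = struct.unpack("<I", data[:4])
    names = {
        LOCAL_FILE_HEADER: "local file header",
        SPLIT_MARKER: "spanned/split marker",
        TEMP_SPANNING_MARKER: "temporary spanning marker",
    }
    return names.get(value, "unknown")


if __name__ == "__main__":
    # A split archive as described above: marker first, then a local header.
    print(leading_signature(b"PK\x07\x08PK\x03\x04"))
```
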
The attached patch adds support for the aforementioned special
marker, alongside the already supported "temporary spanning marker"
0x30304b50. The patch itself is trivial, but I tried to include the
above information in the commit message as well, for future reference.
WDYT?
Thanks!
In GNU Emacs 31.0.50 (build 6, x86_64-pc-linux-gnu, GTK+ Version
3.24.38, cairo version 1.16.0) of 2025-04-18 built on sappc2
Repository revision: 9f0c43a3d1b38c93bf21bf25db0f5dd489338d7c
Repository branch: master
Windowing system distributor 'The X.Org Foundation', version 11.0.12201009
System Description: Debian GNU/Linux 12 (bookworm)
Configured using:
'configure --with-native-compilation --with-mailutils'
Configured features:
ACL CAIRO DBUS FREETYPE GIF GLIB GMP GNUTLS GPM GSETTINGS HARFBUZZ JPEG
LCMS2 LIBOTF LIBSELINUX LIBSYSTEMD LIBXML2 M17N_FLT MODULES NATIVE_COMP
NOTIFY INOTIFY PDUMPER PNG RSVG SECCOMP SOUND THREADS TIFF
TOOLKIT_SCROLL_BARS WEBP X11 XDBE XIM XINERAMA XINPUT2 XPM XRANDR GTK3
ZLIB
Important settings:
value of $LC_COLLATE: POSIX
value of $LC_TIME: POSIX
value of $LANG: en_US.UTF-8
value of $XMODIFIERS: @im=ibus
locale-coding-system: utf-8-unix
Major mode: Fundamental
Minor modes in effect:
tooltip-mode: t
global-eldoc-mode: t
show-paren-mode: t
electric-indent-mode: t
mouse-wheel-mode: t
tool-bar-mode: t
menu-bar-mode: t
file-name-shadow-mode: t
global-font-lock-mode: t
blink-cursor-mode: t
minibuffer-regexp-mode: t
line-number-mode: t
indent-tabs-mode: t
transient-mark-mode: t
auto-composition-mode: t
auto-encryption-mode: t
auto-compression-mode: t
Load-path shadows:
None found.
Features:
(shadow sort mail-extr emacsbug lisp-mnt message mailcap yank-media puny
dired dired-loaddefs rfc822 mml mml-sec password-cache epa derived epg
rfc6068 epg-config gnus-util time-date mm-decode mm-bodies mm-encode
mail-parse rfc2231 mailabbrev gmm-utils mailheader cl-loaddefs cl-lib
sendmail rfc2047 rfc2045 ietf-drums mm-util mail-prsvr mail-utils
compile text-property-search comint subr-x ansi-osc ansi-color ring
comp-run bytecomp byte-compile comp-common rx arc-mode archive-mode rmc
iso-transl tooltip cconv eldoc paren electric uniquify ediff-hook
vc-hooks lisp-float-type elisp-mode mwheel term/x-win x-win
term/common-win x-dnd touch-screen tool-bar dnd fontset image regexp-opt
fringe tabulated-list replace newcomment text-mode lisp-mode prog-mode
register page tab-bar menu-bar rfn-eshadow isearch easymenu timer select
scroll-bar mouse jit-lock font-lock syntax font-core term/tty-colors
frame minibuffer nadvice seq simple cl-generic indonesian philippine
cham georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao
korean japanese eucjp-ms cp51932 hebrew greek romanian slovak czech
european ethiopic indian cyrillic chinese composite emoji-zwj charscript
charprop case-table epa-hook jka-cmpr-hook help abbrev obarray oclosure
cl-preloaded button loaddefs theme-loaddefs faces cus-face macroexp
files window text-properties overlay sha1 md5 base64 format env
code-pages mule custom widget keymap hashtable-print-readable backquote
threads dbusbind inotify lcms2 dynamic-setting system-font-setting
font-render-setting cairo gtk x-toolkit xinput2 x multi-tty move-toolbar
make-network-process tty-child-frames native-compile emacs)
Memory information:
((conses 16 72011 11711) (symbols 48 7037 0) (strings 32 18600 1457)
(string-bytes 1 583099) (vectors 16 11590)
(vector-slots 8 158155 10743) (floats 8 23 12) (intervals 56 264 0)
(buffers 984 13))
[tf-stack.zip (application/zip, attachment)]
[0001-Detect-more-types-of-split-zip-archives.patch (text/x-patch, attachment)]
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77898; Package emacs. (Fri, 18 Apr 2025 12:41:02 GMT)
Message #8 received at 77898 <at> debbugs.gnu.org (full text, mbox):
For a simpler reproducer, I just noticed that

  zip -s1000000 split.zip README

with InfoZIP's zip 3.0

  [emacs-master]$ zip -v
  Copyright (c) 1990-2008 Info-ZIP - Type 'zip "-L"' for software license.
  This is Zip 3.0 (July 5th 2008), by Info-ZIP.

also creates zip files that Emacs (without my patch) cannot grok.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77898; Package emacs. (Fri, 18 Apr 2025 12:52:05 GMT)
Message #11 received at 77898 <at> debbugs.gnu.org (full text, mbox):
> Date: Fri, 18 Apr 2025 14:30:59 +0200
> From: Jens Schmidt via "Bug reports for GNU Emacs,
> the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org>
>
> The attached zip file is a "real" one, in the sense that it has been
> created in the wild. It is a stripped down "terraform stack" generated
> by one of the cloud providers.
>
> InfoZIP's unzip can handle that file without problems:
>
> [emacs-master]$ unzip -v
> UnZip 6.00 of 20 April 2009, by Debian. Original by Info-ZIP.
> [...]
>
> [emacs-master]$ unzip -l tf-stack.zip
> Archive: tf-stack.zip
> Length Date Time Name
> --------- ---------- ----- ----
> 38 2025-04-01 08:43 provider.tf
> --------- -------
> 38 1 file
>
> However, "./src/emacs -Q tf-stack.zip" fails on it with:
>
> File mode specification error: (error "Buffer format not recognized")
>
> It turns out that the zip file starts with a special marker 0x08074b50
> for spanned or split archives, as defined in the pseudo zip file
> specification "APPNOTE.txt" by PKWARE. See section 8.5 "Capacities and
> Markers" of that document for more details.
>
> The attached patch provides support for the aforementioned special
> marker, alongside with the already present "temporary spanning marker"
> 0x30304b50. The patch itself is trivial, but I tried to provide above
> information also in the commit message for future reference.
>
> WDYT?
Please add comments describing the possible signatures, with pointers
to specific sections of APPNOTE.txt.
Also, is it possible to add some tests for this?
Thanks.
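
One way to obtain input for such a test without shipping a binary fixture is to synthesize it. The Python sketch below is only an illustration and is not taken from the patch: it builds an ordinary one-member zip in memory and prepends the split marker. The helper name `make_marked_zip` and the member content are made up, and the central directory offsets are not adjusted for the four extra bytes, so the result is an assumption-laden stand-in that is useful only for exercising checks on the leading bytes, not a faithful split archive.

```python
import io
import zipfile

SPLIT_MARKER = b"PK\x07\x08"  # 0x08074b50 in little-endian byte order


def make_marked_zip(name: str, payload: bytes) -> bytes:
    """Return a one-member zip with the split marker prepended.

    Hypothetical helper: offsets in the central directory are NOT
    corrected, so only the leading bytes resemble a real split archive.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
        zf.writestr(name, payload)
    return SPLIT_MARKER + buf.getvalue()


if __name__ == "__main__":
    data = make_marked_zip("provider.tf", b"# placeholder content\n")
    # Show the marker followed by the local file header signature.
    print(data[:8])
```
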
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77898; Package emacs. (Sat, 19 Apr 2025 12:33:01 GMT)
Message #14 received at 77898 <at> debbugs.gnu.org (full text, mbox):
On 2025-04-18 14:51, Eli Zaretskii wrote:
> Please add comments describing the possible signatures, with pointers
> to specific sections of APPNOTE.txt.
As an aside, how would you add a comment to the first branch of a
`cond' if you would like to keep the branches aligned to the *end*
of the `cond' keyword? I usually do these as follows:

  (cond ;; See APPNOTE.txt (version 6.3.10) from PKWARE for the zip
        ;; file signatures:
        ;; - PK\003\004 == 0x04034b50: local file header signature
        ;;   (section 4.3.7)
        ;; - PK\007\010 == 0x08074b50 (followed by local header):
        ;;   spanned/split archive signature (section 8.5.3)
        ;; - PK00 == 0x30304b50 (followed by local header): temporary
        ;;   spanned/split archive signature (section 8.5.4)
        ((looking-at "\\(?:PK\007\010\\|PK00\\)?[P]K\003\004") 'zip)

But the Emacs indentation machinery clearly disagrees on that and
I'm not sure to what extent Emacs's indentation is normative in its
own sources.
Thanks.
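
To make explicit what the regexp in the `cond` branch above accepts, here is a rough Python equivalent operating on bytes. It is a sketch for illustration, not part of the patch, and the function name `looks_like_zip` is hypothetical. The optional group matches either of the two spanning markers, after which the local file header signature must follow; the Emacs character class `[P]` matches the same text as a plain `P`, and `\007\010` are octal escapes for the bytes 0x07 and 0x08. Like `looking-at`, `re.match` only matches at the starting position.

```python
import re

# Optional spanned/split marker (PK\x07\x08) or temporary spanning
# marker (PK00), followed by the mandatory local file header signature.
ZIP_START = re.compile(rb"(?:PK\x07\x08|PK00)?PK\x03\x04")


def looks_like_zip(data: bytes) -> bool:
    """Return True if DATA starts like a (possibly split) zip archive."""
    return ZIP_START.match(data) is not None


if __name__ == "__main__":
    samples = (
        b"PK\x03\x04...",
        b"PK\x07\x08PK\x03\x04...",
        b"PK00PK\x03\x04...",
        b"PK\x07\x08...",
    )
    for sample in samples:
        print(sample[:8], looks_like_zip(sample))
```
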
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77898; Package emacs. (Sat, 19 Apr 2025 13:09:05 GMT)
Message #17 received at 77898 <at> debbugs.gnu.org (full text, mbox):
> Date: Sat, 19 Apr 2025 14:31:27 +0200
> Cc: 77898 <at> debbugs.gnu.org
> From: Jens Schmidt <jschmidt4gnu <at> vodafonemail.de>
>
> As an aside, how would you add a comment to the first branch of a
> `cond' if you would like to keep the branches aligned to the *end*
> of the `cond' keyword? I usually do these as follows:
>
> (cond ;; See APPNOTE.txt (version 6.3.10) from PKWARE for the zip
> ;; file signatures:
> ;; - PK\003\004 == 0x04034b50: local file header signature
> ;; (section 4.3.7)
> ;; - PK\007\010 == 0x08074b50 (followed by local header):
> ;; spanned/split archive signature (section 8.5.3)
> ;; - PK00 == 0x30304b50 (followed by local header): temporary
> ;; spanned/split archive signature (section 8.5.4)
> ((looking-at "\\(?:PK\007\010\\|PK00\\)?[P]K\003\004") 'zip)
Something like below, perhaps?

  (cond (looking-at
         ;; Comment
         "PATTERN"

Or even

  ;; See APPNOTE.txt
  (cond ((looking-at "\\(?:PK\007\010\\|PK00\\)?[P]K\003\004") 'zip)
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77898; Package emacs. (Sat, 19 Apr 2025 19:09:02 GMT)
Message #20 received at 77898 <at> debbugs.gnu.org (full text, mbox):
On 2025-04-19 15:07, Eli Zaretskii wrote:
> Or even
>
> ;; See APPNOTE.txt
> (cond ((looking-at "\\(?:PK\007\010\\|PK00\\)?[P]K\003\004") 'zip)
Thanks, I went for that.
Please find attached the next version of the patch. I added tests not
only for zip and split zip detection, but also for all archivers that
follow the calling pattern

  ARCHIVER PARAMETER... ARCHIVE FILE...

and that I could easily install on my GNU/Linux system, which turned
out to be surprisingly many.
Of course, such tests are highly OS-dependent and have the potential
to cause a lot of failures, even though I tried to code them in a
defensive manner.
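
For readers unfamiliar with this kind of test, the defensive idea can be sketched as follows. This is a hypothetical Python illustration, not the code from the attached patch: the helper `try_create_archive`, its parameters, and the 30-second timeout are all assumptions. It runs an archiver using the calling pattern above only if the program is installed, and returns None (meaning "skip") rather than failing when anything goes wrong.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path
from typing import Optional, Sequence


def try_create_archive(archiver: str, params: Sequence[str], workdir: Path,
                       archive_name: str, member_name: str) -> Optional[Path]:
    """Run ARCHIVER PARAMETER... ARCHIVE FILE... inside WORKDIR.

    Return the archive path on success, or None to signal that the
    caller should skip the test (hypothetical helper).
    """
    exe = shutil.which(archiver)
    if exe is None:
        return None  # archiver not installed on this system
    try:
        result = subprocess.run(
            [exe, *params, archive_name, member_name],
            cwd=workdir, capture_output=True, timeout=30, check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None  # could not run the program, or it hung
    archive = workdir / archive_name
    if result.returncode != 0 or not archive.is_file():
        return None  # archiver ran but did not produce the archive
    return archive


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        work = Path(tmp)
        (work / "README").write_text("placeholder\n")
        # Prints a path if Info-ZIP's zip is available, otherwise None.
        print(try_create_archive("zip", ["-q"], work, "test.zip", "README"))
```

Returning None instead of raising keeps a missing or misbehaving archiver from being reported as a failure of the detection code itself.
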
WDYT?
Thanks.
[0001-Detect-more-types-of-split-zip-archives.patch (text/x-patch, attachment)]
This bug report was last modified 4 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.