GNU bug report logs - #33708
26.1.90; nhexl-mode performance

Package: emacs;

Reported by: Guido Kraemer <gkraemer <at> bgc-jena.mpg.de>

Date: Tue, 11 Dec 2018 17:08:01 UTC

Severity: normal

Found in version 26.1.90

Done: Stefan Monnier <monnier <at> iro.umontreal.ca>

Bug is archived. No further changes may be made.


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#33708; Package emacs. (Tue, 11 Dec 2018 17:08:01 GMT)

Acknowledgement sent to Guido Kraemer <gkraemer <at> bgc-jena.mpg.de>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Tue, 11 Dec 2018 17:08:01 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Guido Kraemer <gkraemer <at> bgc-jena.mpg.de>
To: bug-gnu-emacs <at> gnu.org
Subject: 26.1.90; nhexl-mode performance
Date: Tue, 11 Dec 2018 18:06:40 +0100

Filing a bug report because of the discussion on

https://emacs.stackexchange.com/questions/46492/how-to-search-for-a-sequence-of-bytes-in-hexl-mode/


nhexl-mode performance is really bad in large files.

Occurred in the files of the Bitcoin blockchain. The beginning of every
block is marked with the byte sequence `f9beb4d9`. Searching for this
byte sequence in nhexl-mode is really slow.

In case you do not have the bitcoin client installed, I uploaded the
first file of the blockchain here (134MB, link will be valid for 30
days):

https://ufile.io/z08bl

Thanks for looking into this.

Guido.






In GNU Emacs 26.1.90 (build 1, x86_64-pc-linux-gnu, X toolkit, Xaw scroll bars)
of 2018-12-01 built on uranus
Repository revision: 7851ae8b443c62a41ea4f4440512aa56cc87b9b7
Windowing system distributor 'The X.Org Foundation




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#33708; Package emacs. (Fri, 14 Dec 2018 18:41:01 GMT)

Message #8 received at 33708 <at> debbugs.gnu.org:

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Guido Kraemer <gkraemer <at> bgc-jena.mpg.de>
Cc: 33708 <at> debbugs.gnu.org
Subject: Re: bug#33708: 26.1.90; nhexl-mode performance
Date: Fri, 14 Dec 2018 13:40:36 -0500

> Occurred in the files of the Bitcoin blockchain. The beginning of every
> block is marked with the byte sequence `f9beb4d9`. Searching for this
> byte sequence in nhexl-mode is really slow.

Duh, indeed it's painful, and it's a plain performance bug in nhexl-mode.
I just installed the patch below which seems to fix it (along with
another less severe bug).
The corresponding 1.2 package should appear soon on GNU ELPA.

Thanks for the test case (tho it was painful to get due to having to go
through my browser and coerce it to run non-free Javascript code).


        Stefan


diff --git a/packages/nhexl-mode/nhexl-mode.el b/packages/nhexl-mode/nhexl-mode.el
index 89d91182f..a52a90081 100644
--- a/packages/nhexl-mode/nhexl-mode.el
+++ b/packages/nhexl-mode/nhexl-mode.el
@@ -807,22 +807,31 @@ Return the corresponding nibble, if applicable."
       (push (string-to-number (substring string i (+ i 2)) 16)
             chars)
       (setq i (+ i 2)))
-    (let* ((base (regexp-quote (apply #'string (nreverse chars))))
-           (newstr
-            (if (>= i (length string))
-                base
-              (cl-assert (= (1+ i) (length string)))
-              (let ((nibble (string-to-number (substring string i) 16)))
-                ;; FIXME: if one of the two bounds is a special char
-                ;; like `]` or `^' we can get into trouble!
-                (format "%s[%c-%c]" base
-                        (* 16 nibble)
-                        (+ 15 (* 16 nibble)))))))
+    (let* ((base (regexp-quote (apply #'unibyte-string (nreverse chars))))
+           (re
+            (concat (if (>= i (length string))
+                        base
+                      (cl-assert (= (1+ i) (length string)))
+                      (let ((nibble (string-to-number (substring string i) 16)))
+                        ;; FIXME: if one of the two bounds is a special char
+                        ;; like `]` or `^' we can get into trouble!
+                        (concat base
+                                (unibyte-string ?\[ (* 16 nibble) ?-
+                                                   (+ 15 (* 16 nibble)) ?\]))))
+                    ;; We also search for the literal hex string here, so the
+                    ;; search stops as soon as one is found, otherwise we too
+                    ;; easily fall into the trap of bug#33708 where at every
+                    ;; cycle we first search unsuccessfully through the whole
+                    ;; buffer with one kind of search before trying the
+                    ;; other search.
+                    ;; Don't bother regexp-quoting the string since we know
+                    ;; it's only made of hex chars!
+                    "\\|" string)))
       (let ((case-fold-search nil))
         (funcall (if isearch-forward
                      #'re-search-forward
                    #'re-search-backward)
-                 newstr bound noerror)))))
+                 re bound noerror)))))
 
 (defun nhexl--isearch-search-fun (orig-fun)
   (let ((def-fun (funcall orig-fun)))
@@ -830,9 +839,18 @@ Return the corresponding nibble, if applicable."
       (unless bound
         (setq bound (if isearch-forward (point-max) (point-min))))
       (let ((startpos (point))
-            (def (funcall def-fun string bound noerror)))
-        ;; Don't search further than what `def-fun' found.
-        (if def (setq bound (match-beginning 0)))
+            def)
+        ;; Hex address search.
+        (when (and nhexl-isearch-hex-addresses
+                   (> (length string) 1)
+                   (string-match-p "\\`[[:xdigit:]]+:?\\'" string))
+          ;; Could be a hexadecimal address.
+          (goto-char startpos)
+          (let ((newdef (nhexl--isearch-match-hex-address string bound noerror)))
+            (when newdef
+              (setq def newdef)
+              (setq bound (match-beginning 0)))))
+        ;; Hex bytes search
         (when (and nhexl-isearch-hex-bytes
                    (> (length string) 1)
                    (string-match-p "\\`[[:xdigit:]]+\\'" string))
@@ -842,12 +860,10 @@ Return the corresponding nibble, if applicable."
             (when newdef
               (setq def newdef)
               (setq bound (match-beginning 0)))))
-        (when (and nhexl-isearch-hex-addresses
-                   (> (length string) 1)
-                   (string-match-p "\\`[[:xdigit:]]+:?\\'" string))
-          ;; Could be a hexadecimal address.
+        ;; Normal search.
+        (progn
           (goto-char startpos)
-          (let ((newdef (nhexl--isearch-match-hex-address string bound noerror)))
+          (let ((newdef (funcall def-fun string bound noerror)))
             (when newdef
               (setq def newdef)
               (setq bound (match-beginning 0)))))
@@ -909,17 +925,19 @@ Return the corresponding nibble, if applicable."
             #'nhexl--isearch-highlight-cleanup)
 (defun nhexl--isearch-highlight-cleanup (&rest _)
   (when (and nhexl-mode nhexl-isearch-hex-highlight)
-    (dolist (ol isearch-lazy-highlight-overlays)
-      (when (and (overlayp ol) (eq (overlay-buffer ol) (current-buffer)))
-        (put-text-property (overlay-start ol) (overlay-end ol)
-                           'fontified nil)))))
+    (with-silent-modifications
+      (dolist (ol isearch-lazy-highlight-overlays)
+        (when (and (overlayp ol) (eq (overlay-buffer ol) (current-buffer)))
+          (put-text-property (overlay-start ol) (overlay-end ol)
+                             'fontified nil))))))
 
 (advice-add 'isearch-lazy-highlight-match :after
             #'nhexl--isearch-highlight-match)
 (defun nhexl--isearch-highlight-match (&optional mb me)
   (when (and nhexl-mode nhexl-isearch-hex-highlight
              (integerp mb) (integerp me))
-    (put-text-property mb me 'fontified nil)))
+    (with-silent-modifications
+      (put-text-property mb me 'fontified nil))))
 
 (defun nhexl--line-width-watcher (_sym _newval op where)
   (when (eq op 'set)
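
The central idea of the first hunk is to fold both interpretations of the search string (raw bytes, and the literal hex text) into one regexp, so that a single pass over the buffer stops at whichever comes first. The following is a minimal standalone sketch of that construction, not the code from nhexl-mode itself: the name `my-hex-bytes-regexp` is hypothetical, and the loop that pairs up hex digits is an assumption, since the patch shows only its tail.

```elisp
(require 'cl-lib)

(defun my-hex-bytes-regexp (string)
  "Return a unibyte regexp matching the hex digits in STRING.
The regexp matches either the raw bytes that STRING spells out or the
literal text of STRING.  Hypothetical helper for illustration only; it
is not a function of nhexl-mode."
  (cl-assert (string-match-p "\\`[[:xdigit:]]+\\'" string))
  (let ((i 0)
        (bytes ()))
    ;; Turn each complete pair of hex digits into one byte.
    (while (< (1+ i) (length string))
      (push (string-to-number (substring string i (+ i 2)) 16) bytes)
      (setq i (+ i 2)))
    (concat
     (regexp-quote (apply #'unibyte-string (nreverse bytes)))
     ;; A leftover single digit matches any byte with that high nibble.
     (when (< i (length string))
       (let ((nibble (string-to-number (substring string i) 16)))
         (unibyte-string ?\[ (* 16 nibble) ?- (+ 15 (* 16 nibble)) ?\])))
     ;; One alternation, so a single pass stops at whichever form comes
     ;; first.  STRING needs no quoting: it contains only hex digits.
     "\\|" string)))
```

With the reporter's example, `(my-hex-bytes-regexp "f9beb4d9")` yields a pattern that can be handed to `re-search-forward` with `case-fold-search` bound to nil, as the patch does.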




Reply sent to Stefan Monnier <monnier <at> iro.umontreal.ca>:
You have taken responsibility. (Fri, 14 Dec 2018 22:53:02 GMT)

Notification sent to Guido Kraemer <gkraemer <at> bgc-jena.mpg.de>:
bug acknowledged by developer. (Fri, 14 Dec 2018 22:53:02 GMT)

Message #13 received at 33708-done <at> debbugs.gnu.org:

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Guido Kraemer <gkraemer <at> bgc-jena.mpg.de>
Subject: Re: bug#33708: 26.1.90; nhexl-mode performance
Date: Fri, 14 Dec 2018 17:52:31 -0500

> - Thanks for fixing this, performance is great now!

Thanks for confirming.

> - Sorry for using that weird site for file sharing, you could have installed
> the bitcoin client (which is MIT licensed) and downloaded the
> blockchain ;-),

I hesitated to do that, indeed.

> what would you use for ad-hoc file sharing?

I'd put it on my web server, not linked from any page.

> - I think there is another minor bug: When the cursor is at the very
> beginning of the buffer and you search for the byte sequence at the very
> beginning of the file, search will jump to the second occurrence. Happens in
> the example of the original bug report.

Yeah, it's a misfeature that I'm not sure how to fix:

When you type `C-s f a`, you first search for `f` and this one is not
treated as a hex-search so it jumps to the first `f` char, so when you
get to type `a` Isearch keeps searching from that `f` rather than
restarting from the beginning of the buffer.

I could change the rule so that `C-s f` already treats `f` as
a hex-search, I guess.


        Stefan
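
The rule discussed in this message can be pictured as a predicate on the search string. This is only a sketch under assumptions: `my-hex-search-string-p` and its MIN-LENGTH argument are hypothetical and do not exist in nhexl-mode; the default of 2 mirrors the `(> (length string) 1)` test visible in the patch, and passing 1 models the change Stefan says he could make.

```elisp
(defun my-hex-search-string-p (string &optional min-length)
  "Return non-nil if STRING should be searched for as hex bytes.
STRING qualifies when it has at least MIN-LENGTH characters (default 2,
mirroring the `(> (length string) 1)' test in the patch) and contains
only hex digits.  Hypothetical helper, not part of nhexl-mode."
  (and (>= (length string) (or min-length 2))
       (string-match-p "\\`[[:xdigit:]]+\\'" string)
       t))
```

Under the default, the first keystroke of `C-s f a` is not a hex search, which is why Isearch has already moved to a literal `f` before the second digit arrives. With a MIN-LENGTH of 1, the single digit would qualify from the start.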




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sat, 12 Jan 2019 12:24:04 GMT)

This bug report was last modified 5 years and 106 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.