GNU bug report logs - #37393
26.2.90; [PATCH] Speed up 'csv-align-fields'

Previous Next

Package: emacs;

Reported by: Simen Heggestøyl <simenheg <at> gmail.com>

Date: Thu, 12 Sep 2019 17:08:01 UTC

Severity: normal

Tags: patch

Found in version 26.2.90

Done: Simen Heggestøyl <simenheg <at> gmail.com>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 37393 in the body.
You can then email your comments to 37393 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#37393; Package emacs. (Thu, 12 Sep 2019 17:08:01 GMT) Full text and rfc822 format available.

Acknowledgement sent to Simen Heggestøyl <simenheg <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Thu, 12 Sep 2019 17:08:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Simen Heggestøyl <simenheg <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Cc: Stefan Monnier <monnier <at> iro.umontreal.ca>, Leo Liu <sdl.web <at> gmail.com>
Subject: 26.2.90; [PATCH] Speed up 'csv-align-fields'
Date: Thu, 12 Sep 2019 19:07:33 +0200
[Message part 1 (text/plain, inline)]
The attached patch attempts to speed up the 'csv-align-fields' command
by avoiding expensive calls to 'current-column', instead reusing field
widths already computed by 'csv--column-widths'.

I felt an urge to speed up the command a bit while working with large
(100 000+ lines) CSV files. Below are benchmarks produced by running

  (benchmark 3 '(csv-align-fields nil (point-min) (point-max)))

in three CSV files from the real world of various sizes. In these cases
the speedup seems to be around 1.5x—2x.

~400 line file:
  Before: Elapsed time: 0.175867s
  After:  Elapsed time: 0.086809s

~50 000 line file:
  Before: Elapsed time: 34.665853s (7.480686s in 35 GCs)
  After:  Elapsed time: 24.349081s (7.154716s in 27 GCs)

~110 000 line file:
  Before: Elapsed time: 82.444038s (19.799686s in 51 GCs)
  After:  Elapsed time: 40.184331s (9.037813s in 25 GCs)

(I've put on CC the two of you who seem to have done most of the work on
this mode lately, hope that's OK.)

-- Simen
[0001-Speed-up-csv-align-fields.patch (text/x-diff, inline)]
From 4fc82f1f66c736bcfbc15d20ff53bd3e21e8a8e1 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Simen=20Heggest=C3=B8yl?= <simenheg <at> gmail.com>
Date: Thu, 12 Sep 2019 18:54:28 +0200
Subject: [PATCH] Speed up 'csv-align-fields'

* packages/csv-mode/csv-mode.el: Bump version number and make the
dependency on Emacs 24.1 or higher explicit.
(csv--column-widths): Return the field widths as well.
(csv-align-fields): Speed up by using the field widths already computed
by 'csv--column-widths'.
---
 packages/csv-mode/csv-mode.el | 30 ++++++++++++++++--------------
 1 file changed, 16 insertions(+), 14 deletions(-)

diff --git a/packages/csv-mode/csv-mode.el b/packages/csv-mode/csv-mode.el
index 40f70330a..dc2555687 100644
--- a/packages/csv-mode/csv-mode.el
+++ b/packages/csv-mode/csv-mode.el
@@ -4,7 +4,8 @@
 
 ;; Author: "Francis J. Wright" <F.J.Wright <at> qmul.ac.uk>
 ;; Time-stamp: <23 August 2004>
-;; Version: 1.7
+;; Version: 1.8
+;; Package-Requires: ((emacs "24.1"))
 ;; Keywords: convenience
 
 ;; This package is free software; you can redistribute it and/or modify
@@ -969,24 +970,26 @@ The fields yanked are those last killed by `csv-kill-fields'."
   (and (overlay-get o 'csv) (delete-overlay o)))
 
 (defun csv--column-widths ()
-  (let ((widths '()))
+  (let ((column-widths '())
+        (field-widths '()))
     ;; Construct list of column widths:
     (while (not (eobp))                   ; for each record...
       (or (csv-not-looking-at-record)
-          (let ((w widths)
+          (let ((w column-widths)
                 (col (current-column))
-                x)
+                field-width)
             (while (not (eolp))
               (csv-end-of-field)
-              (setq x (- (current-column) col)) ; Field width.
+              (setq field-width (- (current-column) col))
+              (push field-width field-widths)
               (if w
-                  (if (> x (car w)) (setcar w x))
-                (setq w (list x)
-                      widths (nconc widths w)))
+                  (if (> field-width (car w)) (setcar w field-width))
+                (setq w (list field-width)
+                      column-widths (nconc column-widths w)))
               (or (eolp) (forward-char))  ; Skip separator.
               (setq w (cdr w) col (current-column)))))
       (forward-line))
-    widths))
+    (list column-widths (nreverse field-widths))))
 
 (defun csv-align-fields (hard beg end)
   "Align all the fields in the region to form columns.
@@ -1017,23 +1020,22 @@ If there is no selected region, default to the whole buffer."
       (narrow-to-region beg end)
       (set-marker end nil)
       (goto-char (point-min))
-      (let ((widths (csv--column-widths)))
+      (pcase-let ((`(,column-widths ,field-widths) (csv--column-widths)))
 
 	;; Align fields:
 	(goto-char (point-min))
 	(while (not (eobp))		; for each record...
 	  (unless (csv-not-looking-at-record)
-            (let ((w widths)
+            (let ((w column-widths)
                   (column 0))    ;Desired position of left-side of this column.
               (while (and w (not (eolp)))
                 (let* ((beg (point))
                        (align-padding (if (bolp) 0 csv-align-padding))
                        (left-padding 0) (right-padding 0)
-                       (field-width
-                        (- (- (current-column)
-                              (progn (csv-end-of-field) (current-column)))))
+                       (field-width (pop field-widths))
                        (column-width (pop w))
                        (x (- column-width field-width))) ; Required padding.
+                  (csv-end-of-field)
                   (set-marker end (point)) ; End of current field.
                   ;; beg = beginning of current field
                   ;; end = (point) = end of current field
-- 
2.23.0


Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#37393; Package emacs. (Thu, 12 Sep 2019 17:47:02 GMT) Full text and rfc822 format available.

Message #8 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Simen Heggestøyl <simenheg <at> gmail.com>
Cc: bug-gnu-emacs <at> gnu.org, Leo Liu <sdl.web <at> gmail.com>
Subject: Re: 26.2.90; [PATCH] Speed up 'csv-align-fields'
Date: Thu, 12 Sep 2019 13:46:36 -0400
> The attached patch attempts to speed up the 'csv-align-fields' command
> by avoiding expensive calls to 'current-column', instead reusing field
> widths already computed by 'csv--column-widths'.

Sounds good.  I rarely use large CSV files, but I know the operation is slow.

I'm OK with the patch, tho please see my comment below.

> I felt an urge to speed up the command a bit while working with large
> (100 000+ lines) CSV files. Below are benchmarks produced by running
>
>   (benchmark 3 '(csv-align-fields nil (point-min) (point-max)))
>
> in three CSV files from the real world of various sizes. In these cases
> the speedup seems to be around 1.5x—2x.
>
> ~400 line file:
>   Before: Elapsed time: 0.175867s
>   After:  Elapsed time: 0.086809s
>
> ~50 000 line file:
>   Before: Elapsed time: 34.665853s (7.480686s in 35 GCs)
>   After:  Elapsed time: 24.349081s (7.154716s in 27 GCs)
>
> ~110 000 line file:
>   Before: Elapsed time: 82.444038s (19.799686s in 51 GCs)
>   After:  Elapsed time: 40.184331s (9.037813s in 25 GCs)

40s is still slow, but a factor of 2 is good, thanks.

If you're interested in this line, I think there are two avenues to
improve the behavior further:
- align lazily via jit-lock (this way the time is determined by the
  amount of text displayed rather than the total file size).
- make align-fields' into a mode, where fields are kept aligned even while
  the buffer is modified.

>  (defun csv--column-widths ()
> -  (let ((widths '()))
> +  (let ((column-widths '())
> +        (field-widths '()))

I think the return value is now sufficiently complex that the function
deserves a docstring describing it.


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#37393; Package emacs. (Sun, 15 Sep 2019 15:56:02 GMT) Full text and rfc822 format available.

Message #11 received at 37393 <at> debbugs.gnu.org (full text, mbox):

From: Simen Heggestøyl <simenheg <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 37393 <at> debbugs.gnu.org, sdl.web <at> gmail.com
Subject: Re: 26.2.90; [PATCH] Speed up 'csv-align-fields'
Date: Sun, 15 Sep 2019 17:55:44 +0200
[Message part 1 (text/plain, inline)]
Stefan Monnier <monnier <at> iro.umontreal.ca> writes:

> Sounds good.  I rarely use large CSV files, but I know the operation is slow.
>
> I'm OK with the patch, tho please see my comment below.

Thanks for reviewing it.

> 40s is still slow, but a factor of 2 is good, thanks.

Yes (though 40s is the time for all three benchmark runs, so one
alignment is 40s/3).

> If you're interested in this line, I think there are two avenues to
> improve the behavior further:
> - align lazily via jit-lock (this way the time is determined by the
>   amount of text displayed rather than the total file size).

Wouldn't that still depend on knowing the column widths? I find that the
column width computation is taking about 80% of the time when calling
'csv-align-fields' (after the patch).

> - make align-fields' into a mode, where fields are kept aligned even while
>   the buffer is modified.

That sounds nice.

>>  (defun csv--column-widths ()
>> -  (let ((widths '()))
>> +  (let ((column-widths '())
>> +        (field-widths '()))
>
> I think the return value is now sufficiently complex that the function
> deserves a docstring describing it.

Agreed, I'll add one before I install the patch.

I've also attached a new suggestion for speeding up the column width
computation itself by eliminating another 'current-column'-call. I'm not
too sure about its correctness yet, but it seems to work in a few tests
I've done, and it sped up 'csv--column-widths' by a factor of 1.3–1.4.
[0001-WIP.patch (text/x-diff, inline)]
From c3c077170aefa8ba0cd5d8f8b824c85eb0f01a66 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Simen=20Heggest=C3=B8yl?= <simenheg <at> gmail.com>
Date: Sun, 15 Sep 2019 17:31:40 +0200
Subject: [PATCH] WIP

---
 packages/csv-mode/csv-mode.el | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/packages/csv-mode/csv-mode.el b/packages/csv-mode/csv-mode.el
index dc2555687..00107f51e 100644
--- a/packages/csv-mode/csv-mode.el
+++ b/packages/csv-mode/csv-mode.el
@@ -976,18 +976,26 @@ The fields yanked are those last killed by `csv-kill-fields'."
     (while (not (eobp))                   ; for each record...
       (or (csv-not-looking-at-record)
           (let ((w column-widths)
-                (col (current-column))
+                (col-beg (current-column))
+                col-end
                 field-width)
             (while (not (eolp))
               (csv-end-of-field)
-              (setq field-width (- (current-column) col))
+              (setq col-end (current-column))
+              (setq field-width (- col-end col-beg))
               (push field-width field-widths)
               (if w
                   (if (> field-width (car w)) (setcar w field-width))
                 (setq w (list field-width)
                       column-widths (nconc column-widths w)))
-              (or (eolp) (forward-char))  ; Skip separator.
-              (setq w (cdr w) col (current-column)))))
+              (unless (eolp)
+                (forward-char)            ; Skip separator.
+                (setq w (cdr w))
+                (setq col-beg (if (= (char-before) ?\t)
+                                  (* (/ (+ col-end tab-width)
+                                        tab-width)
+                                     tab-width)
+                                (+ col-end (char-width (char-before)))))))))
       (forward-line))
     (list column-widths (nreverse field-widths))))
 
-- 
2.23.0

[Message part 3 (text/plain, inline)]
-- Simen

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#37393; Package emacs. (Sun, 15 Sep 2019 16:18:01 GMT) Full text and rfc822 format available.

Message #14 received at 37393 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Simen Heggestøyl <simenheg <at> gmail.com>
Cc: 37393 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca, sdl.web <at> gmail.com
Subject: Re: bug#37393: 26.2.90; [PATCH] Speed up 'csv-align-fields'
Date: Sun, 15 Sep 2019 19:17:49 +0300
> From: Simen Heggestøyl <simenheg <at> gmail.com>
> Date: Sun, 15 Sep 2019 17:55:44 +0200
> Cc: 37393 <at> debbugs.gnu.org, sdl.web <at> gmail.com
> 
> > If you're interested in this line, I think there are two avenues to
> > improve the behavior further:
> > - align lazily via jit-lock (this way the time is determined by the
> >   amount of text displayed rather than the total file size).
> 
> Wouldn't that still depend on knowing the column widths? I find that the
> column width computation is taking about 80% of the time when calling
> 'csv-align-fields' (after the patch).

I'm talking here about something I don't understand well enough, but
did you try computing column width using vertical-motion?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#37393; Package emacs. (Sun, 15 Sep 2019 18:44:02 GMT) Full text and rfc822 format available.

Message #17 received at 37393 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Simen Heggestøyl <simenheg <at> gmail.com>
Cc: 37393 <at> debbugs.gnu.org, sdl.web <at> gmail.com
Subject: Re: 26.2.90; [PATCH] Speed up 'csv-align-fields'
Date: Sun, 15 Sep 2019 14:43:14 -0400
>> If you're interested in this line, I think there are two avenues to
>> improve the behavior further:
>> - align lazily via jit-lock (this way the time is determined by the
>>   amount of text displayed rather than the total file size).
> Wouldn't that still depend on knowing the column widths?

Of the whole file?  No: instead you'd only use the max of the columns that
you've seen so far.  When it increases (thus invalidating
alignment-overlays already created), you just "flush" those overlays and
rebuild them.

> I've also attached a new suggestion for speeding up the column width
> computation itself by eliminating another 'current-column'-call. I'm not
> too sure about its correctness yet, but it seems to work in a few tests
> I've done, and it sped up 'csv--column-widths' by a factor of 1.3–1.4.

Looks OK, but deserves a comment explaining that this computation of the
new `col-beg` using tab-width and char-width is there to avoid an
additional call to current-column (it's basically
a "fast-current-column" which gets its extra speed from doing the work
more incrementally, whereas current-column always starts counting from
BOL).

Maybe we could get yet more speedup by making it possible to pass to
`current-column` (or a new C function) a start position along with its
column, so we'd avoid re-traversing the part of the line that we've
already processed.


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#37393; Package emacs. (Tue, 17 Sep 2019 16:54:01 GMT) Full text and rfc822 format available.

Message #20 received at 37393 <at> debbugs.gnu.org (full text, mbox):

From: Simen Heggestøyl <simenheg <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>, Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 37393 <at> debbugs.gnu.org, sdl.web <at> gmail.com
Subject: Re: 26.2.90; [PATCH] Speed up 'csv-align-fields'
Date: Tue, 17 Sep 2019 18:53:06 +0200
[Message part 1 (text/plain, inline)]
Eli Zaretskii <eliz <at> gnu.org> writes:

> I'm talking here about something I don't understand well enough, but
> did you try computing column width using vertical-motion?

No, I haven't yet tried anything in that direction.

Stefan Monnier <monnier <at> iro.umontreal.ca> writes:

>>> If you're interested in this line, I think there are two avenues to
>>> improve the behavior further:
>>> - align lazily via jit-lock (this way the time is determined by the
>>>   amount of text displayed rather than the total file size).
>> Wouldn't that still depend on knowing the column widths?
>
> Of the whole file?  No: instead you'd only use the max of the columns that
> you've seen so far.  When it increases (thus invalidating
> alignment-overlays already created), you just "flush" those overlays and
> rebuild them.

Wouldn't then the columns appear to jump about whenever a new max width
is discovered? I also guess you'd lose the ability to do e.g.
C-u 1000 C-n and stay in the same column?

> Maybe we could get yet more speedup by making it possible to pass to
> `current-column` (or a new C function) a start position along with its
> column, so we'd avoid re-traversing the part of the line that we've
> already processed.

I think that sounds like a good idea; then my ugly "fast-current-column"
patch from last time won't be needed. I have no prior experience with
Emacs' C internals, but an initial naive attempt is attached. I've only
tested it on some basic cases where it seems to behave correctly and
gives a speedup factor of around 4,5x–5x. Could this track be something
worth exploring further?
[0001-WIP.patch (text/x-diff, inline)]
From 16d8915b8ad96b2aa7fb0350468b4af847c5ff19 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Simen=20Heggest=C3=B8yl?= <simenheg <at> gmail.com>
Date: Tue, 17 Sep 2019 17:32:00 +0200
Subject: [PATCH] WIP

---
 src/indent.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/src/indent.c b/src/indent.c
index 1b589a710c..f2c525d882 100644
--- a/src/indent.c
+++ b/src/indent.c
@@ -46,6 +46,10 @@ #define CR 015
 
 ptrdiff_t last_known_column_point;
 
+/* Value of point byte when current_column was called.  */
+
+ptrdiff_t last_known_column_point_byte;
+
 /* Value of MODIFF when current_column was called.  */
 
 static modiff_count last_known_column_modified;
@@ -325,6 +329,7 @@ DEFUN ("current-column", Fcurrent_column, Scurrent_column, 0, 0, 0,
 invalidate_current_column (void)
 {
   last_known_column_point = 0;
+  last_known_column_point_byte = 0;
 }
 
 ptrdiff_t
@@ -454,6 +459,7 @@ current_column (void)
 
   last_known_column = col;
   last_known_column_point = PT;
+  last_known_column_point_byte = PT_BYTE;
   last_known_column_modified = MODIFF;
 
   return col;
@@ -545,6 +551,17 @@ scan_for_column (ptrdiff_t *endpos, EMACS_INT *goalcol, ptrdiff_t *prevcol)
   ptrdiff_t scan, scan_byte, next_boundary;
 
   scan = find_newline (PT, PT_BYTE, BEGV, BEGV_BYTE, -1, NULL, &scan_byte, 1);
+
+  if (scan < last_known_column_point
+      && end > last_known_column_point
+      && MODIFF == last_known_column_modified)
+    {
+      scan = last_known_column_point;
+      scan_byte = last_known_column_point_byte;
+      col = last_known_column;
+      prev_col = last_known_column;
+    }
+
   next_boundary = scan;
 
   window = Fget_buffer_window (Fcurrent_buffer (), Qnil);
@@ -701,6 +718,7 @@ scan_for_column (ptrdiff_t *endpos, EMACS_INT *goalcol, ptrdiff_t *prevcol)
 
   last_known_column = col;
   last_known_column_point = PT;
+  last_known_column_point_byte = PT_BYTE;
   last_known_column_modified = MODIFF;
 
   if (goalcol)
@@ -848,6 +866,7 @@ DEFUN ("indent-to", Findent_to, Sindent_to, 1, 2, "NIndent to column: ",
 
   last_known_column = mincol;
   last_known_column_point = PT;
+  last_known_column_point_byte = PT_BYTE;
   last_known_column_modified = MODIFF;
 
   XSETINT (column, mincol);
@@ -1040,6 +1059,7 @@ COLUMN (otherwise).  In addition, if FORCE is t, and the line is too short
 
   last_known_column = col;
   last_known_column_point = PT;
+  last_known_column_point_byte = PT_BYTE;
   last_known_column_modified = MODIFF;
 
   return make_fixnum (col);
-- 
2.23.0

[Message part 3 (text/plain, inline)]
-- Simen

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#37393; Package emacs. (Tue, 17 Sep 2019 17:24:02 GMT) Full text and rfc822 format available.

Message #23 received at 37393 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Simen Heggestøyl <simenheg <at> gmail.com>
Cc: 37393 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca, sdl.web <at> gmail.com
Subject: Re: 26.2.90; [PATCH] Speed up 'csv-align-fields'
Date: Tue, 17 Sep 2019 20:23:19 +0300
> From: Simen Heggestøyl <simenheg <at> gmail.com>
> Cc: 37393 <at> debbugs.gnu.org, sdl.web <at> gmail.com
> Date: Tue, 17 Sep 2019 18:53:06 +0200
> 
> > Maybe we could get yet more speedup by making it possible to pass to
> > `current-column` (or a new C function) a start position along with its
> > column, so we'd avoid re-traversing the part of the line that we've
> > already processed.
> 
> I think that sounds like a good idea; then my ugly "fast-current-column"
> patch from last time won't be needed.

Once again, risking to talk about something I don't clearly
understand: vertical-motion already allows you to pass an argument
which tells what is the current column.

Any rationale to use current-column instead of vertical-motion?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#37393; Package emacs. (Tue, 17 Sep 2019 19:13:02 GMT) Full text and rfc822 format available.

Message #26 received at 37393 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Simen Heggestøyl <simenheg <at> gmail.com>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 37393 <at> debbugs.gnu.org, sdl.web <at> gmail.com
Subject: Re: 26.2.90; [PATCH] Speed up 'csv-align-fields'
Date: Tue, 17 Sep 2019 15:12:28 -0400
>> Of the whole file?  No: instead you'd only use the max of the columns that
>> you've seen so far.  When it increases (thus invalidating
>> alignment-overlays already created), you just "flush" those overlays and
>> rebuild them.
> Wouldn't then the columns appear to jump about whenever a new max width
> is discovered?

Yes, but that should only happen as part of a scroll or similar
"significant" visual change anyway.

> I also guess you'd lose the ability to do e.g.
> C-u 1000 C-n and stay in the same column?

Probably, yes.  Seems like a minor issue to me.

>> Maybe we could get yet more speedup by making it possible to pass to
>> `current-column` (or a new C function) a start position along with its
>> column, so we'd avoid re-traversing the part of the line that we've
>> already processed.
> I think that sounds like a good idea; then my ugly "fast-current-column"

> Could this track be something worth exploring further?

Your call,


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#37393; Package emacs. (Tue, 17 Sep 2019 19:15:02 GMT) Full text and rfc822 format available.

Message #29 received at 37393 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 37393 <at> debbugs.gnu.org,
 Simen Heggestøyl <simenheg <at> gmail.com>, sdl.web <at> gmail.com
Subject: Re: 26.2.90; [PATCH] Speed up 'csv-align-fields'
Date: Tue, 17 Sep 2019 15:14:45 -0400
> Once again, risking to talk about something I don't clearly
> understand: vertical-motion already allows you to pass an argument
> which tells what is the current column.
> Any rationale to use current-column instead of vertical-motion?

But we need to measure the width of a particular chunk of buffer text,
and I don't see how vertical-motion can be used for that: the only
things it returns are:
- a new point position
- the number of screen lines moved over
I can't see how to use it to compute the text's width.


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#37393; Package emacs. (Wed, 18 Sep 2019 02:36:02 GMT) Full text and rfc822 format available.

Message #32 received at 37393 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 37393 <at> debbugs.gnu.org, simenheg <at> gmail.com, sdl.web <at> gmail.com
Subject: Re: 26.2.90; [PATCH] Speed up 'csv-align-fields'
Date: Wed, 18 Sep 2019 05:34:52 +0300
> From: Stefan Monnier <monnier <at> iro.umontreal.ca>
> Cc: Simen Heggestøyl <simenheg <at> gmail.com>,
>   37393 <at> debbugs.gnu.org,
>   sdl.web <at> gmail.com
> Date: Tue, 17 Sep 2019 15:14:45 -0400
> 
> > Once again, risking to talk about something I don't clearly
> > understand: vertical-motion already allows you to pass an argument
> > which tells what is the current column.
> > Any rationale to use current-column instead of vertical-motion?
> 
> But we need to measure the width of a particular chunk of buffer text,
> and I don't see how vertical-motion can be used for that: the only
> things it returns are:
> - a new point position
> - the number of screen lines moved over
> I can't see how to use it to compute the text's width.

What about posn-at-point?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#37393; Package emacs. (Wed, 18 Sep 2019 20:00:02 GMT) Full text and rfc822 format available.

Message #35 received at 37393 <at> debbugs.gnu.org (full text, mbox):

From: Simen Heggestøyl <simenheg <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 37393 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca, sdl.web <at> gmail.com
Subject: Re: 26.2.90; [PATCH] Speed up 'csv-align-fields'
Date: Wed, 18 Sep 2019 21:59:20 +0200
Eli Zaretskii <eliz <at> gnu.org> writes:

> What about posn-at-point?

That one doesn't seem to take the display width of the characters into
account. For instance, in a buffer with "<space><tab><space>" and point
at the end, 'current-column' returns column 9 as expected, while
'posn-at-point' gives 3.

-- Simen




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#37393; Package emacs. (Wed, 18 Sep 2019 20:09:02 GMT) Full text and rfc822 format available.

Message #38 received at 37393 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Simen Heggestøyl <simenheg <at> gmail.com>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 37393 <at> debbugs.gnu.org, sdl.web <at> gmail.com
Subject: Re: 26.2.90; [PATCH] Speed up 'csv-align-fields'
Date: Wed, 18 Sep 2019 16:08:27 -0400
>> What about posn-at-point?
> That one doesn't seem to take the display width of the characters into
> account. For instance, in a buffer with "<space><tab><space>" and point
> at the end, 'current-column' returns column 9 as expected, while
> 'posn-at-point' gives 3.

It should return a lot more information than just a bare number.
I suspect you did not look at the right number.  I think you'd have to
use the pixel (X . Y) information (and divide it by the frame's char
width), rather than the (COL . ROW) one.


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#37393; Package emacs. (Thu, 19 Sep 2019 15:52:02 GMT) Full text and rfc822 format available.

Message #41 received at 37393 <at> debbugs.gnu.org (full text, mbox):

From: Simen Heggestøyl <simenheg <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: eliz <at> gnu.org, 37393 <at> debbugs.gnu.org, sdl.web <at> gmail.com
Subject: Re: 26.2.90; [PATCH] Speed up 'csv-align-fields'
Date: Thu, 19 Sep 2019 17:51:33 +0200
Stefan Monnier <monnier <at> iro.umontreal.ca> writes:

> It should return a lot more information than just a bare number.
> I suspect you did not look at the right number.  I think you'd have to
> use the pixel (X . Y) information (and divide it by the frame's char
> width), rather than the (COL . ROW) one.

Your suspicion is correct, I looked at the COL field.

By using the pixel X divided by (frame-char-width), the results look
right, but I'm getting a slowdown on a 400 line file from 0.03s to 2s.

-- Simen




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#37393; Package emacs. (Thu, 19 Sep 2019 17:31:01 GMT) Full text and rfc822 format available.

Message #44 received at 37393 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Simen Heggestøyl <simenheg <at> gmail.com>
Cc: 37393 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca, sdl.web <at> gmail.com
Subject: Re: 26.2.90; [PATCH] Speed up 'csv-align-fields'
Date: Thu, 19 Sep 2019 20:30:16 +0300
> From: Simen Heggestøyl <simenheg <at> gmail.com>
> Cc: eliz <at> gnu.org, 37393 <at> debbugs.gnu.org, sdl.web <at> gmail.com
> Date: Thu, 19 Sep 2019 17:51:33 +0200
> 
> By using the pixel X divided by (frame-char-width), the results look
> right, but I'm getting a slowdown on a 400 line file from 0.03s to 2s.

Then I guess this technique won't help in your case.  Sorry for
distracting you.




Reply sent to Simen Heggestøyl <simenheg <at> gmail.com>:
You have taken responsibility. (Wed, 09 Oct 2019 16:34:02 GMT) Full text and rfc822 format available.

Notification sent to Simen Heggestøyl <simenheg <at> gmail.com>:
bug acknowledged by developer. (Wed, 09 Oct 2019 16:34:02 GMT) Full text and rfc822 format available.

Message #49 received at 37393-done <at> debbugs.gnu.org (full text, mbox):

From: Simen Heggestøyl <simenheg <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 37393-done <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca, sdl.web <at> gmail.com
Subject: Re: 26.2.90; [PATCH] Speed up 'csv-align-fields'
Date: Wed, 09 Oct 2019 18:33:16 +0200
Eli Zaretskii <eliz <at> gnu.org> writes:

> Then I guess this technique won't help in your case.  Sorry for
> distracting you.

No problem, thanks the suggestions.

Closing this bug now as the original patch has been installed (with some
changes suggested by Stefan).

-- Simen




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 07 Nov 2019 12:24:10 GMT) Full text and rfc822 format available.

This bug report was last modified 4 years and 172 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.