GNU bug report logs -
#77684
[PATCH] ; * lisp/emacs-lisp/vtable.el (vtable--limit-string): Use binary search
Previous Next
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 77684 in the body.
You can then email your comments to 77684 AT debbugs.gnu.org in the normal way.
Toggle the display of automated, internal messages from the tracker.
Report forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#77684
; Package
emacs
.
(Thu, 10 Apr 2025 00:14:02 GMT)
Full text and
rfc822 format available.
Acknowledgement sent
to
Kristoffer Balintona <krisbalintona <at> gmail.com>
:
New bug report received and forwarded. Copy sent to
bug-gnu-emacs <at> gnu.org
.
(Thu, 10 Apr 2025 00:14:02 GMT)
Full text and
rfc822 format available.
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
Tags: patch
Hello,
Attached is a patch to vtable.el that improves the performance of
vtable-limit-string. This is because of the expensive calls to
string-pixel-width.
The current implementation of the function is O(n), whereas the version
in the attached patch uses a binary search, which has an improved time
complexity of O(n).
I discovered how expensive vtable-limit-string is when working on a
personal project of mine that creates vtables with hundreds of rows.
According to profiler-report, it was by far the largest bottleneck.
Below are two basic benchmarks that show the markedly improved
performance of the version in the attached patch:
(benchmark-run-compiled 10
(let ((buf-name "*temp*"))
(get-buffer-create buf-name)
;; The function below is defined by my own project. It
creates a vtable
;; in buffer named BUF-NAME
(org-roam-folgezettel-list buf-name)
(kill-buffer buf-name)))
;; Original (with GC): (16.77486223 17 6.687552347000008)
;; Proposed (with GC): (4.469360591 5 1.919122549000008)
(let ((gc-cons-threshold most-positive-fixnum))
(benchmark-run-compiled 10
(let ((buf-name "*temp*"))
(get-buffer-create buf-name)
;; The function below is defined by my own project. It
creates a vtable
;; in buffer named BUF-NAME
(org-roam-folgezettel-list buf-name)
(kill-buffer buf-name))))
;; Original (no GC): (10.767429898 0 0.0)
;; Proposed (no GC): (2.299831031 0 0.0)
This is my first patch to Emacs, so please let me know if I’ve missed
any conventions for properly contributing.
In GNU Emacs 31.0.50 (build 1, x86_64-pc-linux-gnu, GTK+ Version
3.24.49, cairo version 1.18.4) of 2025-03-31 built on UnoriginalName
Windowing system distributor 'Microsoft Corporation', version 11.0.12010000
System Description: Arch Linux
Configured using:
'configure 'CFLAGS=-O2 -march=native -mtune=native
-fomit-frame-pointer' --with-mailutils --with-gtk --with-dbus
--with-native-compilation'
--
With appreciation,
Kristoffer
[0001-lisp-emacs-lisp-vtable.el-vtable-limit-string-Use-bi.patch (text/patch, attachment)]
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#77684
; Package
emacs
.
(Sun, 13 Apr 2025 09:28:03 GMT)
Full text and
rfc822 format available.
Message #8 received at 77684 <at> debbugs.gnu.org (full text, mbox):
> From: Kristoffer Balintona <krisbalintona <at> gmail.com>
> Date: Wed, 9 Apr 2025 17:13:10 -0700
>
> Attached is a patch to vtable.el that improves the performance of
> vtable-limit-string. This is because of the expensive calls to
> string-pixel-width.
>
> The current implementation of the function is O(n), whereas the version
> in the attached patch uses a binary search, which has an improved time
> complexity of O(n).
>
> I discovered how expensive vtable-limit-string is when working on a
> personal project of mine that creates vtables with hundreds of rows.
> According to profiler-report, it was by far the largest bottleneck.
Binary search will only work if adding or removing a character will
always change the result of string-pixel-width. But this is not true
in general, due to character composition and other Emacs display
features. Here's a simple example of such an issue:
(string-pixel-width "a")
=> 9
(string-pixel-width (concat "a" "́"))
=> 9
So adding a character doesn't affect the result of string-pixel-width,
in this case because the two characters will be displayed by a single
glyph that combines the base character 'a' and the acute accent.
Did you try your modified code in such situations, or did you only try
it with simple characters that never combine?
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#77684
; Package
emacs
.
(Sun, 13 Apr 2025 11:16:06 GMT)
Full text and
rfc822 format available.
Message #11 received at 77684 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
On Sun, Apr 13, 2025 at 5:28 AM Eli Zaretskii <eliz <at> gnu.org> wrote:
> > From: Kristoffer Balintona <krisbalintona <at> gmail.com>
> > Date: Wed, 9 Apr 2025 17:13:10 -0700
> >
> > Attached is a patch to vtable.el that improves the performance of
> > vtable-limit-string. This is because of the expensive calls to
> > string-pixel-width.
> >
> > The current implementation of the function is O(n), whereas the version
> > in the attached patch uses a binary search, which has an improved time
> > complexity of O(n).
> >
> > I discovered how expensive vtable-limit-string is when working on a
> > personal project of mine that creates vtables with hundreds of rows.
> > According to profiler-report, it was by far the largest bottleneck.
>
> Binary search will only work if adding or removing a character will
> always change the result of string-pixel-width. But this is not true
> in general, due to character composition and other Emacs display
> features. Here's a simple example of such an issue:
>
> (string-pixel-width "a")
> => 9
> (string-pixel-width (concat "a" "́"))
> => 9
>
> So adding a character doesn't affect the result of string-pixel-width,
> in this case because the two characters will be displayed by a single
> glyph that combines the base character 'a' and the acute accent.
>
> Did you try your modified code in such situations, or did you only try
> it with simple characters that never combine?
>
You can avoid the cost of this string metric by supplying your own global
displayer function for the table or a column-specific displayer function.
Apropos timing as I've been spending some time with vtable and I have a
bunch of bug fixes and a few enhancements coming. One will slow down that
function a little more...vtable breaks under remapped fonts; e.g., when
using text-scale-adjust so I've fixed that (I have some pixel adjustments
left on the header which still gets out of alignment a little).
[Message part 2 (text/html, inline)]
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#77684
; Package
emacs
.
(Tue, 15 Apr 2025 01:36:02 GMT)
Full text and
rfc822 format available.
Message #14 received at 77684 <at> debbugs.gnu.org (full text, mbox):
On Sun, Apr 13 2025, Eli Zaretskii wrote:
> Binary search will only work if adding or removing a character will
> always change the result of string-pixel-width. But this is not true
> in general, due to character composition and other Emacs display
> features. Here's a simple example of such an issue:
>
> (string-pixel-width "a")
> => 9
> (string-pixel-width (concat "a" "́"))
> => 9
>
> So adding a character doesn't affect the result of string-pixel-width,
> in this case because the two characters will be displayed by a single
> glyph that combines the base character 'a' and the acute accent.
>
> Did you try your modified code in such situations, or did you only try
> it with simple characters that never combine?
I see. I think you're right: the vast majority of characters I tried my
code against were simple characters. I rarely interact with characters
that combine so when I wrote my version I hadn't considered how they're
treated in Emacs.
So I suppose my patch doesn't work. However, then, my question is
whether vtable--limit-string can be made significantly more performant to
begin with, or if we're stuck with string-pixel-width being a potential
bottleneck.
OTOH, Ship Mints made a suggestion elsewhere in the thread that I'll
try. If it suffices, I'll use that for my personal uses instead.
--
With appreciation,
Kristoffer
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#77684
; Package
emacs
.
(Tue, 15 Apr 2025 02:55:02 GMT)
Full text and
rfc822 format available.
Message #17 received at 77684 <at> debbugs.gnu.org (full text, mbox):
On Sun, Apr 13 2025, Ship Mints wrote:
> You can avoid the cost of this string metric by supplying your own global
> displayer function for the table or a column-specific displayer function.
>
> Apropos timing as I've been spending some time with vtable and I have a
> bunch of bug fixes and a few enhancements coming. One will slow down that
> function a little more...vtable breaks under remapped fonts; e.g., when
> using text-scale-adjust so I've fixed that (I have some pixel adjustments
> left on the header which still gets out of alignment a little).
Thank you, that's a good idea. I hadn't realized the truncation based on
string width happens during the display phase. Based on Eli's remarks,
my patch might not be appropriate, but I think modifying the displayer
function will do.
Though I suppose it'd be nice if the Info manual mentioned that the
displayer would be where users could address performance related
concerns.
--
In gratitude,
Kristoffer
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#77684
; Package
emacs
.
(Tue, 15 Apr 2025 07:33:02 GMT)
Full text and
rfc822 format available.
Message #20 received at 77684 <at> debbugs.gnu.org (full text, mbox):
> From: Kristoffer Balintona <krisbalintona <at> gmail.com>
> Date: Mon, 14 Apr 2025 18:35:48 -0700
> Cc: 77684 <at> debbugs.gnu.org
>
> On Sun, Apr 13 2025, Eli Zaretskii wrote:
>
> > Binary search will only work if adding or removing a character will
> > always change the result of string-pixel-width. But this is not true
> > in general, due to character composition and other Emacs display
> > features. Here's a simple example of such an issue:
> >
> > (string-pixel-width "a")
> > => 9
> > (string-pixel-width (concat "a" "́"))
> > => 9
> >
> > So adding a character doesn't affect the result of string-pixel-width,
> > in this case because the two characters will be displayed by a single
> > glyph that combines the base character 'a' and the acute accent.
> >
> > Did you try your modified code in such situations, or did you only try
> > it with simple characters that never combine?
>
> I see. I think you're right: the vast majority of characters I tried my
> code against were simple characters. I rarely interact with characters
> that combine so when I wrote my version I hadn't considered how they're
> treated in Emacs.
>
> So I suppose my patch doesn't work. However, then, my question is
> whether vtable--limit-string can be made significantly more performant to
> begin with, or if we're stuck with string-pixel-width being a potential
> bottleneck.
Perhaps we could still use binary search, but augment it as follows:
when a match is found by binary search, run an additional loop, which
adds back one character at a time, as long as string-pixel-width
returns the same value. The last character added which still yields
the same value will be the point where to truncate the string.
For long enough strings, this might still be a win, even if characters
are composed. And for the simple case of characters that are not
composed, this adds just one extra call to string-pixel-width.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#77684
; Package
emacs
.
(Tue, 15 Apr 2025 11:31:01 GMT)
Full text and
rfc822 format available.
Message #23 received at 77684 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
On Tue, Apr 15, 2025 at 3:33 AM Eli Zaretskii <eliz <at> gnu.org> wrote:
> > From: Kristoffer Balintona <krisbalintona <at> gmail.com>
> > Date: Mon, 14 Apr 2025 18:35:48 -0700
> > Cc: 77684 <at> debbugs.gnu.org
> >
> > On Sun, Apr 13 2025, Eli Zaretskii wrote:
> >
> > > Binary search will only work if adding or removing a character will
> > > always change the result of string-pixel-width. But this is not true
> > > in general, due to character composition and other Emacs display
> > > features. Here's a simple example of such an issue:
> > >
> > > (string-pixel-width "a")
> > > => 9
> > > (string-pixel-width (concat "a" "́"))
> > > => 9
> > >
> > > So adding a character doesn't affect the result of string-pixel-width,
> > > in this case because the two characters will be displayed by a single
> > > glyph that combines the base character 'a' and the acute accent.
> > >
> > > Did you try your modified code in such situations, or did you only try
> > > it with simple characters that never combine?
> >
> > I see. I think you're right: the vast majority of characters I tried my
> > code against were simple characters. I rarely interact with characters
> > that combine so when I wrote my version I hadn't considered how they're
> > treated in Emacs.
> >
> > So I suppose my patch doesn't work. However, then, my question is
> > whether vtable--limit-string can be made significantly more performant to
> > begin with, or if we're stuck with string-pixel-width being a potential
> > bottleneck.
>
> Perhaps we could still use binary search, but augment it as follows:
> when a match is found by binary search, run an additional loop, which
> adds back one character at a time, as long as string-pixel-width
> returns the same value. The last character added which still yields
> the same value will be the point where to truncate the string.
>
> For long enough strings, this might still be a win, even if characters
> are composed. And for the simple case of characters that are not
> composed, this adds just one extra call to string-pixel-width.
>
Perhaps a binary search would suffice. In vtable's case, a heuristic for
the string pixel-fit abbreviator could be hinted. It could start with the
column's name length. Just an idea.
vtable does not currently consider just the rows that are displayed, but
considers the entire row set.
In the end, the programmer knows the data and really should consider
accommodating strings they suspect or know will need pixel-wise
abbreviation. Perhaps the vtable documentation should mention that
pixelwise operations can be expensive and to specify column widths and
provide "formatters" and "displayers" tuned to their data.
-Stephane
[Message part 2 (text/html, inline)]
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#77684
; Package
emacs
.
(Tue, 22 Apr 2025 21:37:01 GMT)
Full text and
rfc822 format available.
Message #26 received at 77684 <at> debbugs.gnu.org (full text, mbox):
On Tue, Apr 15 2025, Eli Zaretskii wrote:
> Perhaps we could still use binary search, but augment it as follows:
> when a match is found by binary search, run an additional loop, which
> adds back one character at a time, as long as string-pixel-width
> returns the same value. The last character added which still yields
> the same value will be the point where to truncate the string.
>
> For long enough strings, this might still be a win, even if characters
> are composed. And for the simple case of characters that are not
> composed, this adds just one extra call to string-pixel-width.
Thank you for the suggestion, Eli. Unfortunately, I wasn't able to find
an implementation that worked exhaustively. I ran into trouble I
couldn't resolve when the match returned by the binary search landed
within a ligature/emoji (which can span multiple characters without
necessarily changing the pixel width in predictable, monotonic ways).
Aside from those cases, I think the binary search works quite well.
However, since it doesn't cover all the cases the current implementation
does, I think the current implementation should be kept.
I may return to this in the future, but for now I think if developers
need performance, they should set the displayer themselves (which I will
do for my own personal projects).
--
In gratitude,
Kristoffer
Reply sent
to
Eli Zaretskii <eliz <at> gnu.org>
:
You have taken responsibility.
(Wed, 23 Apr 2025 12:07:05 GMT)
Full text and
rfc822 format available.
Notification sent
to
Kristoffer Balintona <krisbalintona <at> gmail.com>
:
bug acknowledged by developer.
(Wed, 23 Apr 2025 12:07:05 GMT)
Full text and
rfc822 format available.
Message #31 received at 77684-done <at> debbugs.gnu.org (full text, mbox):
> From: Kristoffer Balintona <krisbalintona <at> gmail.com>
> Date: Tue, 22 Apr 2025 17:36:32 -0400
> Cc: 77684 <at> debbugs.gnu.org
>
> On Tue, Apr 15 2025, Eli Zaretskii wrote:
>
> > Perhaps we could still use binary search, but augment it as follows:
> > when a match is found by binary search, run an additional loop, which
> > adds back one character at a time, as long as string-pixel-width
> > returns the same value. The last character added which still yields
> > the same value will be the point where to truncate the string.
> >
> > For long enough strings, this might still be a win, even if characters
> > are composed. And for the simple case of characters that are not
> > composed, this adds just one extra call to string-pixel-width.
>
> Thank you for the suggestion, Eli. Unfortunately, I wasn't able to find
> an implementation that worked exhaustively. I ran into trouble I
> couldn't resolve when the match returned by the binary search landed
> within a ligature/emoji (which can span multiple characters without
> necessarily changing the pixel width in predictable, monotonic ways).
>
> Aside from those cases, I think the binary search works quite well.
> However, since it doesn't cover all the cases the current implementation
> does, I think the current implementation should be kept.
>
> I may return to this in the future, but for now I think if developers
> need performance, they should set the displayer themselves (which I will
> do for my own personal projects).
Thanks. So let's close this bug for now. If someone has clever ideas
how to speed up vtable--limit-string, we can always reopen.
bug archived.
Request was from
Debbugs Internal Request <help-debbugs <at> gnu.org>
to
internal_control <at> debbugs.gnu.org
.
(Thu, 22 May 2025 11:24:19 GMT)
Full text and
rfc822 format available.
This bug report was last modified 15 days ago.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.