GNU bug report logs - #47712
27.1; Provide `string-display-width` function, which takes properties into account, `substring-width`

Previous Next

Package: emacs;

Reported by: Daniel Mendler <mail <at> daniel-mendler.de>

Date: Sun, 11 Apr 2021 21:17:02 UTC

Severity: normal

Found in version 27.1

Done: Eli Zaretskii <eliz <at> gnu.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 47712 in the body.
You can then email your comments to 47712 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#47712; Package emacs. (Sun, 11 Apr 2021 21:17:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Daniel Mendler <mail <at> daniel-mendler.de>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Sun, 11 Apr 2021 21:17:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Daniel Mendler <mail <at> daniel-mendler.de>
To: bug-gnu-emacs <at> gnu.org
Subject: 27.1; Provide `string-display-width` function, which takes properties
 into account, `substring-width`
Date: Sun, 11 Apr 2021 23:16:13 +0200
Provide a `string-display-width' function, which computes the
`string-width', but takes invisible display properties into account.
This function is often needed by packages.

For example my Consult package (https://github.com/minad/consult) has a
crude (non-recursive) version of that function. Then the Embark package
has (https://github.com/oantolin/embark) has a similar function. Last
but not least, the function already exists in Org under the name
`org-string-width'.

It is reasonable to implement this function in Elisp. But all the
implementations have to use `string-width' for measuring the parts which
make up the actual string. Unfortunately this requires allocations for
the substrings. A possible solution to this problem is to implement a
primitive function `substring-width' and implement `string-width' in
terms of that function.

(defun consult--display-width (string)
  "Compute width of STRING taking display and invisible properties into 
account."
  (let ((pos 0) (width 0) (end (length string)))
    (while (< pos end)
      (let ((nextd (next-single-property-change
                    pos 'display string end))
            (display (get-text-property
                      pos 'display string)))
        (if (stringp display)
            (setq width (+ width (string-width display))
                  pos nextd)
          (while (< pos nextd)
            (let ((nexti (next-single-property-change
                          pos 'invisible string nextd)))
              (unless (get-text-property
                       pos 'invisible string)
                (setq width
                      (+ width
                         (string-width
                          ;; Avoid allocation for the full string.
                          ;; There should be a `substring-width'
                          ;; provided by Emacs. TODO: Propose
                          ;; upstream? Alternatively propose this
                          ;; whole `display-width' function to
                          ;; upstream.
                          (if (and (= pos 0) (= nexti end))
                              string
                            (substring-no-properties
                             string pos nexti))))))
              (setq pos nexti))))))
    width))




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47712; Package emacs. (Mon, 12 Apr 2021 02:28:02 GMT) Full text and rfc822 format available.

Message #8 received at 47712 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Daniel Mendler <mail <at> daniel-mendler.de>
Cc: 47712 <at> debbugs.gnu.org
Subject: Re: bug#47712: 27.1;
 Provide `string-display-width` function, which takes properties into
 account, `substring-width`
Date: Mon, 12 Apr 2021 05:26:41 +0300
> From: Daniel Mendler <mail <at> daniel-mendler.de>
> Date: Sun, 11 Apr 2021 23:16:13 +0200
> 
> Provide a `string-display-width' function, which computes the
> `string-width', but takes invisible display properties into account.
> This function is often needed by packages.

We already have window-text-pixel-size; doesn't it fit the bill?  If
not, why not?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47712; Package emacs. (Mon, 12 Apr 2021 08:46:02 GMT) Full text and rfc822 format available.

Message #11 received at 47712 <at> debbugs.gnu.org (full text, mbox):

From: Daniel Mendler <mail <at> daniel-mendler.de>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 47712 <at> debbugs.gnu.org
Subject: Re: bug#47712: 27.1; Provide `string-display-width` function, which
 takes properties into account, `substring-width`
Date: Mon, 12 Apr 2021 10:45:45 +0200
On 4/12/21 4:26 AM, Eli Zaretskii wrote:
> We already have window-text-pixel-size; doesn't it fit the bill?  If
> not, why not?

`window-text-pixel-size` computes the size of a portion of the window, 
it does not take a string argument. The function `string-width` instead 
takes a string argument returning the number of columns needed to 
display this string. This function is needed to compute widths of 
strings which are not yet displayed.

`string-display-width` is a proposed function which takes properties 
into account as `window-text-pixel-size` does, but without going through 
the display system first. Instead it computes the `string-width` of a 
flattened string, where the invisible parts have been removed and the 
displayed parts have been replaced.

Orgmode uses its incarnation of `string-display-width`, 
`org-string-width`, for formatting its tables (see org-table.el). The 
width computation is performed before displaying the strings in the window.

`substring-width` is a proposed generalization of `string-width` which 
allows computation of the width of a substring, without requiring the 
allocation of a substring. `string-display-width` or 
`substring-display-width` can be implemented based on `string-width` or 
`substring-width`, with`substring-width` being advantageous since it 
does not require the substring allocation.





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47712; Package emacs. (Mon, 12 Apr 2021 08:55:02 GMT) Full text and rfc822 format available.

Message #14 received at 47712 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Daniel Mendler <mail <at> daniel-mendler.de>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 47712 <at> debbugs.gnu.org
Subject: Re: bug#47712: 27.1; Provide `string-display-width` function, which
 takes properties into account, `substring-width`
Date: Mon, 12 Apr 2021 10:53:50 +0200
Daniel Mendler <mail <at> daniel-mendler.de> writes:

> `window-text-pixel-size` computes the size of a portion of the window,
> it does not take a string argument. The function `string-width`
> instead takes a string argument returning the number of columns needed
> to display this string. This function is needed to compute widths of
> strings which are not yet displayed.

If I understand correctly (and I may not), the proposed function just
takes properties into account (like `invisible' and `display'), but does
not compute pixel sizes at all?

I can see how that may be useful in some cases (and is very different
from `window-text-pixel-size'), so I think adding something like that
would be nice.  But the name is confusing -- it sounds like it's
computing the displayed width, and it's not -- if there's images or text
with a different font in there, it's not taken into account?

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47712; Package emacs. (Mon, 12 Apr 2021 09:10:01 GMT) Full text and rfc822 format available.

Message #17 received at 47712 <at> debbugs.gnu.org (full text, mbox):

From: Daniel Mendler <mail <at> daniel-mendler.de>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 47712 <at> debbugs.gnu.org
Subject: Re: bug#47712: 27.1; Provide `string-display-width` function, which
 takes properties into account, `substring-width`
Date: Mon, 12 Apr 2021 11:08:52 +0200
On 4/12/21 10:53 AM, Lars Ingebrigtsen wrote:
> I can see how that may be useful in some cases (and is very different
> from `window-text-pixel-size'), so I think adding something like that
> would be nice.  But the name is confusing -- it sounds like it's
> computing the displayed width, and it's not -- if there's images or text
> with a different font in there, it's not taken into account?

Yes, a better name could be `string-property-width`? I proposed 
`string-display-width` since it takes the 'display property into 
account, but maybe this gives false associations.

Note that I would like to have an implementation of 
`substring-width`/`string-display-width`/`substring-display-width` which 
does not allocate, since this reduces the GC pressure when formatting 
many items. How do you think about this?

Regarding images, different fonts, maybe a `string-property-pixel-width` 
would be useful too. But for the use cases I have in mind (formatting 
monospaced text) computing columns is sufficient.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47712; Package emacs. (Mon, 12 Apr 2021 12:23:01 GMT) Full text and rfc822 format available.

Message #20 received at 47712 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Daniel Mendler <mail <at> daniel-mendler.de>
Cc: 47712 <at> debbugs.gnu.org
Subject: Re: bug#47712: 27.1; Provide `string-display-width` function, which
 takes properties into account, `substring-width`
Date: Mon, 12 Apr 2021 15:21:38 +0300
> Cc: 47712 <at> debbugs.gnu.org
> From: Daniel Mendler <mail <at> daniel-mendler.de>
> Date: Mon, 12 Apr 2021 10:45:45 +0200
> 
> On 4/12/21 4:26 AM, Eli Zaretskii wrote:
> > We already have window-text-pixel-size; doesn't it fit the bill?  If
> > not, why not?
> 
> `window-text-pixel-size` computes the size of a portion of the window, 
> it does not take a string argument.

That is easy to work around, right?  Just insert the string into a
temporary buffer and say with-current-buffer (and/or
with-selected-window, if needed).

> The function `string-width` instead takes a string argument
> returning the number of columns needed to display this string. This
> function is needed to compute widths of strings which are not yet
> displayed.

But you compute the width because you are going to display that string
soon enough, right?  Or did I misunderstand the purpose?

In general, calculating a string's width out of display context is
meaningless in Emacs.  More about that below.

> `string-display-width` is a proposed function which takes properties 
> into account as `window-text-pixel-size` does, but without going through 
> the display system first.

That's exactly my point: by not using the display code, you allow up
front inaccurate results.  There's no practical way to implement this
in Lisp without yielding inaccurate and even grossly incorrect results
in some cases (see below), so any such implementation will be limited
from the get-go, in ways that are hard to even document precisely, let
alone use reliably.  Thus, users of such an implementation are bound
to be unpleasantly surprised if they try using it in a situation
different from yours.

> Instead it computes the `string-width` of a flattened string, where
> the invisible parts have been removed and the displayed parts have
> been replaced.

Yes, I understood what the code does.  But it only handles some of the
possible use cases, and will fail in others.  For example, it ignores
faces (which could change the metrics of characters); it assumes that
all the characters of the string can be displayed by the buffer's
fonts (because "tofu" has very different dimensions); it uses
char-width to measure a character's width (which is only accurate for
text-mode frames); it doesn't handle display properties that aren't
strings, such as (space :align-to N); etc.

> Orgmode uses its incarnation of `string-display-width`, 
> `org-string-width`, for formatting its tables (see org-table.el). The 
> width computation is performed before displaying the strings in the window.

What you propose might be good enough for org-table, but it falls
short of what is required from a general-purpose core feature: such a
feature needs to support, or at least be able to support, all the
possible display-related features that could affect the string's
width.  Your proposed implementation cannot support all of those
features even in principle, because some of the display features
mentioned above aren't accessible directly from Lisp.  And if we try
implementing this in C, we will end up with window-text-pixel-size,
because that's why it was written in the first place.  It was written
to use the display code and requires a context of a window, because
otherwise the problem you are trying to solve cannot be solved:
features like fonts, faces, invisibility specs, etc. -- all of those
depend on windows or buffers or frames, so trying to estimate a
string's width without that context will produce incorrect results.

So I urge you to try to use window-text-pixel-size for org-table and
elsewhere, because that's what that function is for.  If it lacks some
feature or has bugs, we will improve it.

Thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47712; Package emacs. (Mon, 12 Apr 2021 12:51:02 GMT) Full text and rfc822 format available.

Message #23 received at 47712 <at> debbugs.gnu.org (full text, mbox):

From: Daniel Mendler <mail <at> daniel-mendler.de>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 47712 <at> debbugs.gnu.org
Subject: Re: bug#47712: 27.1; Provide `string-display-width` function, which
 takes properties into account, `substring-width`
Date: Mon, 12 Apr 2021 14:50:25 +0200
On 4/12/21 2:21 PM, Eli Zaretskii wrote:
> That is easy to work around, right?  Just insert the string into a
> temporary buffer and say with-current-buffer (and/or
> with-selected-window, if needed).

I have not tested this but I suspect this to be slow for a few thousand 
strings.

>> The function `string-width` instead takes a string argument
>> returning the number of columns needed to display this string. This
>> function is needed to compute widths of strings which are not yet
>> displayed.
> 
> But you compute the width because you are going to display that string
> soon enough, right?  Or did I misunderstand the purpose?

I am using it to generate candidate strings for `completing-read` and 
Org is using it to format tables. So no, I don't think that's soon enough.

> In general, calculating a string's width out of display context is
> meaningless in Emacs.  More about that below.

I know about the context dependence. But there is the `string-width` 
function which is also computed in the current context. I am only asking 
for an improved version of already existing functionality. My most 
minimal feature request is just a function `substring-width`.

> That's exactly my point: by not using the display code, you allow up
> front inaccurate results.  There's no practical way to implement this
> in Lisp without yielding inaccurate and even grossly incorrect results
> in some cases (see below), so any such implementation will be limited
> from the get-go, in ways that are hard to even document precisely, let
> alone use reliably.  Thus, users of such an implementation are bound
> to be unpleasantly surprised if they try using it in a situation
> different from yours.

I agree that it would not work in all case. But why does `string-width` 
exist then? Is it deprecated?

>> Instead it computes the `string-width` of a flattened string, where
>> the invisible parts have been removed and the displayed parts have
>> been replaced.
> 
> Yes, I understood what the code does.  But it only handles some of the
> possible use cases, and will fail in others.  For example, it ignores
> faces (which could change the metrics of characters); it assumes that
> all the characters of the string can be displayed by the buffer's
> fonts (because "tofu" has very different dimensions); it uses
> char-width to measure a character's width (which is only accurate for
> text-mode frames); it doesn't handle display properties that aren't
> strings, such as (space :align-to N); etc.

That's right. In particular not supporting the :space alignment property 
is a serious limitation. But in text-mode frames the computation will 
return a correct result if it considers 'invisible and 'display recursively.

> So I urge you to try to use window-text-pixel-size for org-table and
> elsewhere, because that's what that function is for.  If it lacks some
> feature or has bugs, we will improve it.

It is fair to reject the feature request for a 
`string-display/property-width` function, because it would be hard to 
implement a generally useful function.

However if you attack `string-width` for not computing something 
correct, then one may want to consider deprecating `string-width` 
altogether or at least make it clear in the documentation that this 
function should not be used and something else based on the window 
function should be used instead.

Note that `gnus-correct-length` has been replaced by `string-width`, so 
maybe Lars can say something about the justification for `string-width`?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47712; Package emacs. (Mon, 12 Apr 2021 13:22:02 GMT) Full text and rfc822 format available.

Message #26 received at 47712 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Daniel Mendler <mail <at> daniel-mendler.de>
Cc: 47712 <at> debbugs.gnu.org
Subject: Re: bug#47712: 27.1; Provide `string-display-width` function, which
 takes properties into account, `substring-width`
Date: Mon, 12 Apr 2021 16:21:15 +0300
> Cc: 47712 <at> debbugs.gnu.org
> From: Daniel Mendler <mail <at> daniel-mendler.de>
> Date: Mon, 12 Apr 2021 14:50:25 +0200
> 
> On 4/12/21 2:21 PM, Eli Zaretskii wrote:
> > That is easy to work around, right?  Just insert the string into a
> > temporary buffer and say with-current-buffer (and/or
> > with-selected-window, if needed).
> 
> I have not tested this but I suspect this to be slow for a few thousand 
> strings.

Can we please see both methods benchmarked?  I'd also like to
understand better in which situation you need to do this with
thousands of strings in one go.  In any case, I presume that running
your code on thousands of strings also takes some time, let alone
conses many strings.

> > In general, calculating a string's width out of display context is
> > meaningless in Emacs.  More about that below.
> 
> I know about the context dependence. But there is the `string-width` 
> function which is also computed in the current context. I am only asking 
> for an improved version of already existing functionality. My most 
> minimal feature request is just a function `substring-width`.

string-width is from my POV a historical accident.  The accident
happened long ago enough to preclude deleting it, but I'd like to
limit its use as much as possible.  I certainly would like to avoid
extending it or making it support more features (it currently supports
only composed characters).

> However if you attack `string-width` for not computing something 
> correct, then one may want to consider deprecating `string-width` 
> altogether or at least make it clear in the documentation that this 
> function should not be used and something else based on the window 
> function should be used instead.

I'm fine with doing that, of course.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47712; Package emacs. (Mon, 12 Apr 2021 13:33:02 GMT) Full text and rfc822 format available.

Message #29 received at 47712 <at> debbugs.gnu.org (full text, mbox):

From: Daniel Mendler <mail <at> daniel-mendler.de>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 47712 <at> debbugs.gnu.org
Subject: Re: bug#47712: 27.1; Provide `string-display-width` function, which
 takes properties into account, `substring-width`
Date: Mon, 12 Apr 2021 15:32:33 +0200
On 4/12/21 3:21 PM, Eli Zaretskii wrote:
> Can we please see both methods benchmarked?  I'd also like to
> understand better in which situation you need to do this with
> thousands of strings in one go.  In any case, I presume that running
> your code on thousands of strings also takes some time, let alone
> conses many strings.

Sure, I can prepare something in order to check if the slowdown is 
significant.

> string-width is from my POV a historical accident.  The accident
> happened long ago enough to preclude deleting it, but I'd like to
> limit its use as much as possible.  I certainly would like to avoid
> extending it or making it support more features (it currently supports
> only composed characters).

Okay, fair enough. I am all for removing/obsoleting functionality which 
is considered an accident as long as a good alternative exists. But if 
no acceptable alternative exists, the function has a justification and 
then it may also be possible to improve upon it.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47712; Package emacs. (Mon, 12 Apr 2021 13:42:02 GMT) Full text and rfc822 format available.

Message #32 received at 47712 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Daniel Mendler <mail <at> daniel-mendler.de>
Cc: 47712 <at> debbugs.gnu.org
Subject: Re: bug#47712: 27.1; Provide `string-display-width` function, which
 takes properties into account, `substring-width`
Date: Mon, 12 Apr 2021 16:40:51 +0300
> Cc: 47712 <at> debbugs.gnu.org
> From: Daniel Mendler <mail <at> daniel-mendler.de>
> Date: Mon, 12 Apr 2021 15:32:33 +0200
> 
> Okay, fair enough. I am all for removing/obsoleting functionality which 
> is considered an accident as long as a good alternative exists. But if 
> no acceptable alternative exists, the function has a justification and 
> then it may also be possible to improve upon it.

Let's first establish whether indeed there's no alternative.  I'd be
surprised if that's the case, since window-text-pixel-size was written
to support precisely this kind of need.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47712; Package emacs. (Mon, 12 Apr 2021 14:06:01 GMT) Full text and rfc822 format available.

Message #35 received at 47712 <at> debbugs.gnu.org (full text, mbox):

From: Daniel Mendler <mail <at> daniel-mendler.de>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 47712 <at> debbugs.gnu.org
Subject: Re: bug#47712: 27.1; Provide `string-display-width` function, which
 takes properties into account, `substring-width`
Date: Mon, 12 Apr 2021 16:05:41 +0200
I gave it a quick test. See the function `string-pixel-width` below. It 
seems that it does not take 'invisible and 'display into account. I 
probably have to change something to ensure that the properties are not 
ignored.

But for we can still look at the micro benchmark.  The `string-width` 
function is 200 times faster than the `string-pixel-width` function. 
This is a huge difference, but as usual with microbenchmarks, one can 
argue that the difference will be less pronounced in a realistic 
computation.

I am still not happy with replacing `string-width` with something so 
much slower. However `window-text-pixel-size` also gives a different, 
much more precise result since it takes everything into account (or at 
least it should, invisible and display properties included). In the uses 
cases I mentioned one relies on monospace faces and formatting.

(defmacro bench (&rest body)
  (let ((start (make-symbol "t")))
    `(let (,start)
       (setq ,start (current-time))
       ,@body
       (float-time (time-since ,start)))))

(defun string-pixel-width (string)
  (with-temp-buffer
    (insert string)
    (car (window-text-pixel-size nil (point-min) (point-max)))))

;; returns 56 for all of the following strings, which is wrong
(string-pixel-width "1234")
(string-pixel-width (propertize "1234" 'invisible t))
(string-pixel-width (propertize "1234" 'display " "))

(defvar test-string
  (concat "some string with "
          (propertize "invisible substring" 'invisible t)
          " and "
          (propertize "a displayed substring"
                      'display "an overwritten substring")))

;; 5s
(bench
  (dotimes (_ 10000)
    (string-pixel-width test-string)))

;; 2.5s
(bench
  (dotimes (_ 1000000)
    (string-width test-string)))

;; 3.5s
(bench
  (dotimes (_ 1000000)
    (consult--display-width test-string)))




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47712; Package emacs. (Mon, 12 Apr 2021 14:16:01 GMT) Full text and rfc822 format available.

Message #38 received at 47712 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Daniel Mendler <mail <at> daniel-mendler.de>
Cc: 47712 <at> debbugs.gnu.org
Subject: Re: bug#47712: 27.1; Provide `string-display-width` function, which
 takes properties into account, `substring-width`
Date: Mon, 12 Apr 2021 17:15:17 +0300
> Cc: 47712 <at> debbugs.gnu.org
> From: Daniel Mendler <mail <at> daniel-mendler.de>
> Date: Mon, 12 Apr 2021 16:05:41 +0200
> 
> I gave it a quick test. See the function `string-pixel-width` below. It 
> seems that it does not take 'invisible and 'display into account. I 
> probably have to change something to ensure that the properties are not 
> ignored.

If these properties are ignored, they will also be ignored on display.

> But for we can still look at the micro benchmark.  The `string-width` 
> function is 200 times faster than the `string-pixel-width` function. 

And if you reuse the same temp buffer?

> I am still not happy with replacing `string-width` with something so 
> much slower.

With 0.5 millisecond per call, I don't see a problem.  And I expect
that to go down if the buffer is reused.

> (defmacro bench (&rest body)
>    (let ((start (make-symbol "t")))
>      `(let (,start)
>         (setq ,start (current-time))
>         ,@body
>         (float-time (time-since ,start)))))

Please use benchmark-run, as that also tells us about GC during the
run, and important aspect.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47712; Package emacs. (Mon, 12 Apr 2021 14:34:02 GMT) Full text and rfc822 format available.

Message #41 received at 47712 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: mail <at> daniel-mendler.de
Cc: 47712 <at> debbugs.gnu.org
Subject: Re: bug#47712: 27.1;
 Provide `string-display-width` function, which takes properties into
 account, `substring-width`
Date: Mon, 12 Apr 2021 17:32:56 +0300
> Date: Mon, 12 Apr 2021 17:15:17 +0300
> From: Eli Zaretskii <eliz <at> gnu.org>
> Cc: 47712 <at> debbugs.gnu.org
> 
> > But for we can still look at the micro benchmark.  The `string-width` 
> > function is 200 times faster than the `string-pixel-width` function. 
> 
> And if you reuse the same temp buffer?

My benchmarking indicates that reusing the buffer makes
string-pixel-width only 10 times slower than string-width, i.e. 50
about microseconds per call.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47712; Package emacs. (Mon, 12 Apr 2021 14:37:01 GMT) Full text and rfc822 format available.

Message #44 received at 47712 <at> debbugs.gnu.org (full text, mbox):

From: Daniel Mendler <mail <at> daniel-mendler.de>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 47712 <at> debbugs.gnu.org
Subject: Re: bug#47712: 27.1; Provide `string-display-width` function, which
 takes properties into account, `substring-width`
Date: Mon, 12 Apr 2021 16:36:05 +0200
On 4/12/21 4:15 PM, Eli Zaretskii wrote:
> If these properties are ignored, they will also be ignored on display.

No, something is wrong. 'display should not be ignored.

>> But for we can still look at the micro benchmark.  The `string-width`
>> function is 200 times faster than the `string-pixel-width` function.
> 
> And if you reuse the same temp buffer?

Sorry, I should have said that I tried reusing the same buffer. But it 
was not faster when I tried that. The buffer switching has a significant 
overhead. In order to get a fair benchmark one should measure the following:

;; 1.4s
(with-temp-buffer
  (bench
   (dotimes (_ 10000)
     (erase-buffer)
     (insert test-string)
     (car (window-text-pixel-size nil (point-min) (point-max))))))

Given that benchmark the `window-text-pixel-size` function is still over 
50 times slower.

> With 0.5 millisecond per call, I don't see a problem.  And I expect
> that to go down if the buffer is reused.

No, 0.5ms per call is not acceptable. When processing 2000 strings takes 
a second, it is not viable to use this to preprocess and format many 
strings. It may be okay for computing a handful strings which are being 
displayed right away.

Given the benchmark I think it makes sense to continue to use 
`string-width` for certain use cases which can live with the limitations 
of only working correctly in text mode.

But I understand that you don't want to add a half-broken 
`string-display-width` API on top of the already half-broken 
`string-width` API. One may still discuss the implementation of a 
`substring-width` API which generalizes `string-width`.

(defun string-width (s)
  (substring-width s 0 (length s)))

(defun substring-width (s a b)
  (string-width (substring s a b)))




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47712; Package emacs. (Mon, 12 Apr 2021 14:39:02 GMT) Full text and rfc822 format available.

Message #47 received at 47712 <at> debbugs.gnu.org (full text, mbox):

From: Daniel Mendler <mail <at> daniel-mendler.de>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 47712 <at> debbugs.gnu.org
Subject: Re: bug#47712: 27.1; Provide `string-display-width` function, which
 takes properties into account, `substring-width`
Date: Mon, 12 Apr 2021 16:38:08 +0200
On 4/12/21 4:32 PM, Eli Zaretskii wrote:
> My benchmarking indicates that reusing the buffer makes
> string-pixel-width only 10 times slower than string-width, i.e. 50
> about microseconds per call.

Can you please paste the exact code you used? As I wrote in my previous 
mail, I tried without buffer switching, but it was still 50 times 
slower. 10 times slower is still not good, but given that number I may 
stop arguing, since the factor will be even less in real code.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47712; Package emacs. (Mon, 12 Apr 2021 17:03:02 GMT) Full text and rfc822 format available.

Message #50 received at 47712 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Daniel Mendler <mail <at> daniel-mendler.de>
Cc: 47712 <at> debbugs.gnu.org
Subject: Re: bug#47712: 27.1; Provide `string-display-width` function, which
 takes properties into account, `substring-width`
Date: Mon, 12 Apr 2021 20:01:53 +0300
> Cc: 47712 <at> debbugs.gnu.org
> From: Daniel Mendler <mail <at> daniel-mendler.de>
> Date: Mon, 12 Apr 2021 16:38:08 +0200
> 
> On 4/12/21 4:32 PM, Eli Zaretskii wrote:
> > My benchmarking indicates that reusing the buffer makes
> > string-pixel-width only 10 times slower than string-width, i.e. 50
> > about microseconds per call.
> 
> Can you please paste the exact code you used? As I wrote in my previous 
> mail, I tried without buffer switching, but it was still 50 times 
> slower.

  (defun string-pixel-width (string)
    (with-current-buffer (get-buffer-create "foo")
      (erase-buffer)
      (insert string)
      (window-text-pixel-size nil (point-min) (point-max))))

  (defvar test-string
     (concat "some string with "
	     (propertize "invisible substring" 'invisible t)
	     " and "
	     (propertize "a displayed substring"
			 'display "an overwritten substring")))

  (benchmark-run 100000 (string-pixel-width test-string))
  (benchmark-run 100000 (string-width test-string))

> 10 times slower is still not good, but given that number I may 
> stop arguing, since the factor will be even less in real code.

You can still use your code in org-table, if it does the job, I just
don't think we should have a semi-working API in core.

(There's indeed something strange with the results, I think the
with-current-buffer thing is not enough (because if I manually switch
to buffer "foo" and run the function, it returns correct results).  I
will take a closer look when I have time, unless martin beats me to
it.)




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47712; Package emacs. (Mon, 12 Apr 2021 17:10:02 GMT) Full text and rfc822 format available.

Message #53 received at 47712 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Daniel Mendler <mail <at> daniel-mendler.de>
Cc: 47712 <at> debbugs.gnu.org
Subject: Re: bug#47712: 27.1; Provide `string-display-width` function, which
 takes properties into account, `substring-width`
Date: Mon, 12 Apr 2021 20:09:08 +0300
> Cc: 47712 <at> debbugs.gnu.org
> From: Daniel Mendler <mail <at> daniel-mendler.de>
> Date: Mon, 12 Apr 2021 16:36:05 +0200
> 
> One may still discuss the implementation of a 
> `substring-width` API which generalizes `string-width`.
> 
> (defun string-width (s)
>    (substring-width s 0 (length s)))
> 
> (defun substring-width (s a b)
>    (string-width (substring s a b)))

Why not simply extend string-width to accept 2 optional arguments?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47712; Package emacs. (Mon, 12 Apr 2021 17:15:02 GMT) Full text and rfc822 format available.

Message #56 received at 47712 <at> debbugs.gnu.org (full text, mbox):

From: Daniel Mendler <mail <at> daniel-mendler.de>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 47712 <at> debbugs.gnu.org
Subject: Re: bug#47712: 27.1; Provide `string-display-width` function, which
 takes properties into account, `substring-width`
Date: Mon, 12 Apr 2021 19:13:53 +0200
On 4/12/21 7:09 PM, Eli Zaretskii wrote:
>> Cc: 47712 <at> debbugs.gnu.org
>> From: Daniel Mendler <mail <at> daniel-mendler.de>
>> Date: Mon, 12 Apr 2021 16:36:05 +0200
>>
>> One may still discuss the implementation of a
>> `substring-width` API which generalizes `string-width`.
>>
>> (defun string-width (s)
>>     (substring-width s 0 (length s)))
>>
>> (defun substring-width (s a b)
>>     (string-width (substring s a b)))
> 
> Why not simply extend string-width to accept 2 optional arguments?

I agree, that is better.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47712; Package emacs. (Mon, 12 Apr 2021 17:19:01 GMT) Full text and rfc822 format available.

Message #59 received at 47712 <at> debbugs.gnu.org (full text, mbox):

From: Daniel Mendler <mail <at> daniel-mendler.de>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 47712 <at> debbugs.gnu.org
Subject: Re: bug#47712: 27.1; Provide `string-display-width` function, which
 takes properties into account, `substring-width`
Date: Mon, 12 Apr 2021 19:18:11 +0200
On 4/12/21 7:01 PM, Eli Zaretskii wrote:
> You can still use your code in org-table, if it does the job, I just
> don't think we should have a semi-working API in core.

This is a reasonable. Extending `string-width` with two optional 
begin/end arguments would still be a welcome addition.

> (There's indeed something strange with the results, I think the
> with-current-buffer thing is not enough (because if I manually switch
> to buffer "foo" and run the function, it returns correct results).  I
> will take a closer look when I have time, unless martin beats me to
> it.)

Thank you, it would be great if you figure it out. Then the 
`window-text-pixel-size` becomes a good alternative to computing the 
string width manually from the display and invisible properties as I am 
doing as of now.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47712; Package emacs. (Tue, 13 Apr 2021 07:07:02 GMT) Full text and rfc822 format available.

Message #62 received at 47712 <at> debbugs.gnu.org (full text, mbox):

From: martin rudalics <rudalics <at> gmx.at>
To: Eli Zaretskii <eliz <at> gnu.org>, Daniel Mendler <mail <at> daniel-mendler.de>
Cc: 47712 <at> debbugs.gnu.org
Subject: Re: bug#47712: 27.1; Provide `string-display-width` function, which
 takes properties into account, `substring-width`
Date: Tue, 13 Apr 2021 09:06:06 +0200
>    (defun string-pixel-width (string)
>      (with-current-buffer (get-buffer-create "foo")
>        (erase-buffer)
>        (insert string)
>        (window-text-pixel-size nil (point-min) (point-max))))

[...]

> (There's indeed something strange with the results, I think the
> with-current-buffer thing is not enough (because if I manually switch
> to buffer "foo" and run the function, it returns correct results).  I
> will take a closer look when I have time, unless martin beats me to
> it.)

For `window-text-pixel-size' to work correctly, the buffer in question
must be the buffer of the first argument of `window-text-pixel-size'
(that's what the display engine expects).  OTOH Elisp code doesn't have
to care about whether that buffer is current (but it can't harm either).

So one way to write `string-pixel-width' should be

(defun string-pixel-width (string)
  (let ((buffer (get-buffer-create "*foo*"))
	(old-buffer (window-buffer)))
    (with-current-buffer buffer
      (erase-buffer)
      (insert string)
      (set-window-buffer (selected-window) buffer)
      (prog1
	  (window-text-pixel-size nil (point-min) (point-max))
	(set-window-buffer (selected-window) old-buffer)))))

martin




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47712; Package emacs. (Tue, 13 Apr 2021 08:02:02 GMT) Full text and rfc822 format available.

Message #65 received at 47712 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Daniel Mendler <mail <at> daniel-mendler.de>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 47712 <at> debbugs.gnu.org
Subject: Re: bug#47712: 27.1; Provide `string-display-width` function, which
 takes properties into account, `substring-width`
Date: Tue, 13 Apr 2021 10:01:12 +0200
Daniel Mendler <mail <at> daniel-mendler.de> writes:

> Note that I would like to have an implementation of
> `substring-width`/`string-display-width`/`substring-display-width`
> which does not allocate, since this reduces the GC pressure when
> formatting many items. How do you think about this?
>
> Regarding images, different fonts, maybe a
> `string-property-pixel-width` would be useful too. But for the use
> cases I have in mind (formatting monospaced text) computing columns is
> sufficient.

Sure; I think a non-allocating function to count monospaced string width
would be useful.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47712; Package emacs. (Tue, 13 Apr 2021 12:02:02 GMT) Full text and rfc822 format available.

Message #68 received at 47712 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: mail <at> daniel-mendler.de, 47712 <at> debbugs.gnu.org
Subject: Re: bug#47712: 27.1; Provide `string-display-width` function, which
 takes properties into account, `substring-width`
Date: Tue, 13 Apr 2021 15:00:35 +0300
> From: Lars Ingebrigtsen <larsi <at> gnus.org>
> Cc: Eli Zaretskii <eliz <at> gnu.org>,  47712 <at> debbugs.gnu.org
> Date: Tue, 13 Apr 2021 10:01:12 +0200
> 
> Daniel Mendler <mail <at> daniel-mendler.de> writes:
> 
> > Note that I would like to have an implementation of
> > `substring-width`/`string-display-width`/`substring-display-width`
> > which does not allocate, since this reduces the GC pressure when
> > formatting many items. How do you think about this?
> >
> > Regarding images, different fonts, maybe a
> > `string-property-pixel-width` would be useful too. But for the use
> > cases I have in mind (formatting monospaced text) computing columns is
> > sufficient.
> 
> Sure; I think a non-allocating function to count monospaced string width
> would be useful.

I won't mount the barricades over this, but please keep in mind that
even with fixed-pitch fonts this will not always produce correct
results, because on GUI frames the "double-width" characters not
always take exactly two columns (it depends on the font).  So this
function will only work reliably on text-mode (a.k.a. TTY) frames and
for Latin characters on GUI frames.

If you still think we should have such a semi-broken function, at
least include some warning in its doc string, so that users wouldn't
be unpleasantly surprised.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47712; Package emacs. (Tue, 13 Apr 2021 12:24:02 GMT) Full text and rfc822 format available.

Message #71 received at 47712 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: martin rudalics <rudalics <at> gmx.at>
Cc: mail <at> daniel-mendler.de, 47712 <at> debbugs.gnu.org
Subject: Re: bug#47712: 27.1; Provide `string-display-width` function, which
 takes properties into account, `substring-width`
Date: Tue, 13 Apr 2021 15:23:22 +0300
> Cc: 47712 <at> debbugs.gnu.org
> From: martin rudalics <rudalics <at> gmx.at>
> Date: Tue, 13 Apr 2021 09:06:06 +0200
> 
> (defun string-pixel-width (string)
>    (let ((buffer (get-buffer-create "*foo*"))
> 	(old-buffer (window-buffer)))
>      (with-current-buffer buffer
>        (erase-buffer)
>        (insert string)
>        (set-window-buffer (selected-window) buffer)
>        (prog1
> 	  (window-text-pixel-size nil (point-min) (point-max))
> 	(set-window-buffer (selected-window) old-buffer)))))

Thanks.  This produces correct results, and takes 35-37 usec per call
here.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47712; Package emacs. (Tue, 13 Apr 2021 12:26:01 GMT) Full text and rfc822 format available.

Message #74 received at 47712 <at> debbugs.gnu.org (full text, mbox):

From: Daniel Mendler <mail <at> daniel-mendler.de>
To: Eli Zaretskii <eliz <at> gnu.org>, Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: 47712 <at> debbugs.gnu.org
Subject: Re: bug#47712: 27.1; Provide `string-display-width` function, which
 takes properties into account, `substring-width`
Date: Tue, 13 Apr 2021 14:25:18 +0200
On 4/13/21 2:00 PM, Eli Zaretskii wrote:
>> From: Lars Ingebrigtsen <larsi <at> gnus.org>
>>> Note that I would like to have an implementation of
>>> `substring-width`/`string-display-width`/`substring-display-width`
>>> which does not allocate, since this reduces the GC pressure when
>>> formatting many items. How do you think about this?
>>
>> Sure; I think a non-allocating function to count monospaced string width
>> would be useful.
> 
> I won't mount the barricades over this, but please keep in mind that
> even with fixed-pitch fonts this will not always produce correct
> results, because on GUI frames the "double-width" characters not
> always take exactly two columns (it depends on the font).  So this
> function will only work reliably on text-mode (a.k.a. TTY) frames and
> for Latin characters on GUI frames.
> 
> If you still think we should have such a semi-broken function, at
> least include some warning in its doc string, so that users wouldn't
> be unpleasantly surprised.

Yes, `string-width` is broken for face-dependent purposes. However there 
are still these handful of use cases which will probably not go away 
(org-mode table, formatting monospaced text, ...). In those existing 
cases the `string-width` function is often used in combination with 
`substring`, i.e., `(string-width (substring str beg end)`.

Therefore I would be happy with the following resolution:

1. Add two arguments begin and end to `string-width` to improve the 
current uses of `string-width`.

2. Document the caveats of `string-width` in the docstring (works only 
reliable in text mode for multi-width chars) and maybe mention 
`window-text-pixel-size` or the `string-pixel-width` function by Martin.

----

(defun string-pixel-width (string)
 (let ((buffer (get-buffer-create " *string-pixel-width*"))
       (old-buffer (window-buffer)))
   (unwind-protect
       (with-current-buffer buffer
         (erase-buffer)
         (insert string)
         (set-window-buffer (selected-window) buffer)
         (car (window-text-pixel-size nil (point-min) (point-max))))
     (set-window-buffer (selected-window) old-buffer))))

(I don't think it is needed to add the `string-pixel-width` function to 
the core. The function or some variant of it is easy enough for packages 
to include and it cannot be used in this form since it leads to these 
temporary buffers lying around.)




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47712; Package emacs. (Wed, 14 Apr 2021 08:52:01 GMT) Full text and rfc822 format available.

Message #77 received at 47712 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Daniel Mendler <mail <at> daniel-mendler.de>
Cc: larsi <at> gnus.org, 47712 <at> debbugs.gnu.org
Subject: Re: bug#47712: 27.1; Provide `string-display-width` function, which
 takes properties into account, `substring-width`
Date: Wed, 14 Apr 2021 11:50:36 +0300
> Cc: 47712 <at> debbugs.gnu.org
> From: Daniel Mendler <mail <at> daniel-mendler.de>
> Date: Tue, 13 Apr 2021 14:25:18 +0200
> 
> Yes, `string-width` is broken for face-dependent purposes. However there 
> are still these handful of use cases which will probably not go away 
> (org-mode table, formatting monospaced text, ...). In those existing 
> cases the `string-width` function is often used in combination with 
> `substring`, i.e., `(string-width (substring str beg end)`.
> 
> Therefore I would be happy with the following resolution:
> 
> 1. Add two arguments begin and end to `string-width` to improve the 
> current uses of `string-width`.
> 
> 2. Document the caveats of `string-width` in the docstring (works only 
> reliable in text mode for multi-width chars) and maybe mention 
> `window-text-pixel-size` or the `string-pixel-width` function by Martin.

Your wishes have been granted, see the latest master branch.

Is there anything else to do with this bug report, or can we close it?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#47712; Package emacs. (Wed, 14 Apr 2021 10:50:01 GMT) Full text and rfc822 format available.

Message #80 received at 47712 <at> debbugs.gnu.org (full text, mbox):

From: Daniel Mendler <mail <at> daniel-mendler.de>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: larsi <at> gnus.org, 47712 <at> debbugs.gnu.org
Subject: Re: bug#47712: 27.1; Provide `string-display-width` function, which
 takes properties into account, `substring-width`
Date: Wed, 14 Apr 2021 12:49:17 +0200
On 4/14/21 10:50 AM, Eli Zaretskii wrote:
> Your wishes have been granted, see the latest master branch.
> 
> Is there anything else to do with this bug report, or can we close it?

Thank you for the quick resolution, Eli! Please close this issue.




Reply sent to Eli Zaretskii <eliz <at> gnu.org>:
You have taken responsibility. (Wed, 14 Apr 2021 11:39:02 GMT) Full text and rfc822 format available.

Notification sent to Daniel Mendler <mail <at> daniel-mendler.de>:
bug acknowledged by developer. (Wed, 14 Apr 2021 11:39:02 GMT) Full text and rfc822 format available.

Message #85 received at 47712-done <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Daniel Mendler <mail <at> daniel-mendler.de>
Cc: larsi <at> gnus.org, 47712-done <at> debbugs.gnu.org
Subject: Re: bug#47712: 27.1; Provide `string-display-width` function, which
 takes properties into account, `substring-width`
Date: Wed, 14 Apr 2021 14:38:29 +0300
> Cc: larsi <at> gnus.org, 47712 <at> debbugs.gnu.org
> From: Daniel Mendler <mail <at> daniel-mendler.de>
> Date: Wed, 14 Apr 2021 12:49:17 +0200
> 
> On 4/14/21 10:50 AM, Eli Zaretskii wrote:
> > Your wishes have been granted, see the latest master branch.
> > 
> > Is there anything else to do with this bug report, or can we close it?
> 
> Thank you for the quick resolution, Eli! Please close this issue.

Done, thanks.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 13 May 2021 11:24:04 GMT) Full text and rfc822 format available.

This bug report was last modified 2 years and 320 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.