GNU bug report logs - #18385
24.3.93; posn-at-point doesn't account for tab-width

Package: emacs

Reported by: Dmitry <dgutov <at> yandex.ru>

Date: Mon, 1 Sep 2014 21:54:01 UTC

Severity: normal

Found in version 24.3.93

Done: Eli Zaretskii <eliz <at> gnu.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 18385 in the body.
You can then email your comments to 18385 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#18385; Package emacs. (Mon, 01 Sep 2014 21:54:02 GMT)

Acknowledgement sent to Dmitry <dgutov <at> yandex.ru>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Mon, 01 Sep 2014 21:54:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Dmitry <dgutov <at> yandex.ru>
To: bug-gnu-emacs <at> gnu.org
Subject: 24.3.93; posn-at-point doesn't account for tab-width
Date: Tue, 02 Sep 2014 01:53:12 +0400
1. New, empty buffer.
2. (insert "a\tb")
3. (posn-actual-col-row (posn-at-point))
=> (3 . 0)

It should probably return (9 . 0).

I'm not 100% sure this is actually a bug, but (posn-actual-col-row
(posn-at-point)) returns the "visually" correct column values in the
more complex cases (after text with `display' or `compose-region' called
on it), so not accounting for tab-width looks surprising.

Originally: https://github.com/company-mode/company-mode/issues/175

In GNU Emacs 24.3.93.1 (x86_64-unknown-linux-gnu, GTK+ Version 3.10.8)
 of 2014-08-18 on axl
Repository revision: 117447 eliz <at> gnu.org-20140817144850-xgexz1n2z8s4aiur
Windowing system distributor `The X.Org Foundation', version 11.0.11501000
System Description:	Ubuntu 14.04.1 LTS






Reply sent to Eli Zaretskii <eliz <at> gnu.org>:
You have taken responsibility. (Tue, 02 Sep 2014 15:25:01 GMT)

Notification sent to Dmitry <dgutov <at> yandex.ru>:
bug acknowledged by developer. (Tue, 02 Sep 2014 15:25:02 GMT)

Message #10 received at 18385-done <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry <dgutov <at> yandex.ru>
Cc: 18385-done <at> debbugs.gnu.org
Subject: Re: bug#18385: 24.3.93; posn-at-point doesn't account for tab-width
Date: Tue, 02 Sep 2014 18:24:12 +0300
> From: Dmitry <dgutov <at> yandex.ru>
> Date: Tue, 02 Sep 2014 01:53:12 +0400
> 
> 1. New, empty buffer.
> 2. (insert "a\tb")
> 3. (posn-actual-col-row (posn-at-point))
> => (3 . 0)
> 
> It should probably return (9 . 0).

No, it should return (3 . 0), as it does.  You misunderstand the
contract of this function (which is not surprising, since the issue is
a subtle one, and the documentation, while it tries to be accurate,
has a hard time communicating its intent due to inherent ambiguity of
the related terminology).

This sentence from the doc string of posn-actual-col-row says it all:

  These are the actual row number in the window and character number
  in that row.

"Character number in that row".  IOW, it counts characters, not visual
columns.  This function, and the data in its POSITION argument which
it accesses, are designed to make it easy to find the glyph (or
"display element") in a screen line, so it simply provides the ordinal
number of the "thing at point" on its screen line, disregarding the
screen dimensions of that thing.

So this is not a bug, but intended, if obscure, behavior.

> I'm not 100% sure this is actually a bug, but (posn-actual-col-row
> (posn-at-point)) returns the "visually" correct column values in the
> more complex cases (after text with `display' or `compose-region' called
> on it), so not accounting for tab-width looks surprising.

As long as posn-actual-col-row deals with characters of the same
dimensions (i.e. the same font), it will always produce seemingly
accurate "column" counts, no matter whether these characters come from
a buffer, a display property, or an overlay string.  (It counts
characters on display, so the source from which they came is
irrelevant.)  But as soon as you have something in the line whose
glyph is larger or smaller than the other characters in that line, the
"column" produced by the function will be skewed, because it's
actually not a visual column, but a count of "display elements" from
the beginning of the screen line.  E.g., try insert-image or put a
display property which uses ':align-to' or ':width', and you will see
that the image and the stretch of whitespace produced by those are
each counted as a single "column", no matter what their actual
dimensions are.

IOW, posn-actual-col-row is not reliable when you want screen
coordinates in row/column units.

> Originally: https://github.com/company-mode/company-mode/issues/175

And you were right to resolve that by using posn-col-row instead.
That function translates pixel coordinates into row/column units,
which is much closer to what you want.

(Yes, it's not easy to do the job of the display engine.)
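
A minimal sketch of the comparison discussed above, assuming it is
evaluated interactively in a buffer shown in the selected window
(posn-at-point returns nil when point is not visible there).  The
helper name my/column-report is hypothetical; current-column,
posn-at-point, posn-actual-col-row and posn-col-row are the existing
functions named in this thread.

```elisp
;; Hypothetical helper: gather three notions of "column" at point.
(defun my/column-report ()
  "Return an alist comparing column values at point.
`current-column' counts character widths in the buffer text,
`posn-actual-col-row' reports the glyph index on the screen line,
and `posn-col-row' converts pixel coordinates to column units."
  (let ((posn (posn-at-point)))
    (list (cons 'current-column (current-column))
          ;; Both posn-based values are nil when point is not displayed.
          (cons 'glyph-index (and posn (car (posn-actual-col-row posn))))
          (cons 'posn-col (and posn (car (posn-col-row posn)))))))
```

With the recipe from the report ("a\tb", point at the end, tab-width
8), current-column is 9, while the thread shows the glyph index to be
3 on a graphical frame.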




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#18385; Package emacs. (Tue, 02 Sep 2014 23:31:01 GMT)

Message #13 received at 18385 <at> debbugs.gnu.org:

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: 18385 <at> debbugs.gnu.org
Cc: eliz <at> gnu.org
Subject: Re: bug#18385: 24.3.93; posn-at-point doesn't account for tab-width
Date: Wed, 03 Sep 2014 03:30:10 +0400
Hi Eli,

Thanks for the detailed explanation.

Eli Zaretskii <eliz <at> gnu.org> writes:

> "Character number in that row".  IOW, it counts characters, not visual
> columns.  This function, and the data in its POSITION argument which
> it accesses, are designed to make it easy to find the glyph (or
> "display element") in a screen line, so it simply provides the ordinal
> number of the "thing at point" on its screen line, disregarding the
> screen dimensions of that thing.

Okay, if character widths are "applied" after the glyphs to be displayed
are collected, I guess this makes a certain amount of sense.

Is this value useful, though? Since the visual buffer contents are
inaccessible from Lisp, I believe this value wouldn't be properly
correct in most contexts, even in tty, where the major pitfalls you
described can't happen.

AFAICS, this function is only called in two places in Emacs code:

- From `line-move-partial', where I don't understand what it does.
- From `proced-sort-header', where the caller apparently just assumes
  there are no multiple-width characters on that line.

So, was the decision not to return a current-column-like value made
due to performance considerations?

> And you were right to resolve that by using posn-col-row instead.
> That function translates pixel coordinates into row/column units,
> which is much closer to what you want.

Thanks, that's good to know.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#18385; Package emacs. (Wed, 03 Sep 2014 16:13:01 GMT)

Message #16 received at 18385 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 18385 <at> debbugs.gnu.org
Subject: Re: bug#18385: 24.3.93; posn-at-point doesn't account for tab-width
Date: Wed, 03 Sep 2014 19:12:42 +0300
> From: Dmitry Gutov <dgutov <at> yandex.ru>
> Cc: eliz <at> gnu.org
> Date: Wed, 03 Sep 2014 03:30:10 +0400
> 
> > "Character number in that row".  IOW, it counts characters, not visual
> > columns.  This function, and the data in its POSITION argument which
> > it accesses, are designed to make it easy to find the glyph (or
> > "display element") in a screen line, so it simply provides the ordinal
> > number of the "thing at point" on its screen line, disregarding the
> > screen dimensions of that thing.
> 
> Okay, if character widths are "applied" after the glyphs to be displayed
> are collected, I guess this makes a certain amount of sense.

Technically, the width is "applied" during the same (single) pass the
display elements are collected.

If you want a more accurate description of what happens, it is this:
Glyphs collected for each display line are stored in an array of
structures which specify how to display each glyph (and that includes
the calculated pixel width of the glyphs).  What posn-actual-col-row
gives you is the _index_ of the corresponding glyph structure in that
array.
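
The distinction between a glyph's index and its visual column can be
sketched with a toy model.  This is not the display engine's real data
structure: the list-of-pairs representation, the function name
my/glyph-index-and-column, and the pixel widths below are all
assumptions made for illustration.

```elisp
;; Toy model only: a screen line is a list of (LABEL . PIXEL-WIDTH) glyphs.
(defun my/glyph-index-and-column (glyphs index char-width)
  "Return (INDEX . COLUMN) for the position before glyph INDEX in GLYPHS.
COLUMN is the summed pixel width of the preceding glyphs divided by
CHAR-WIDTH, i.e. a pixel-derived column, whereas INDEX ignores widths."
  (let ((x 0)
        (i 0))
    (dolist (g glyphs)
      ;; Only glyphs to the left of INDEX contribute to the X offset.
      (when (< i index)
        (setq x (+ x (cdr g))))
      (setq i (1+ i)))
    (cons index (/ x char-width))))

;; "a\tb" on a graphical frame, assuming 10-pixel-wide characters and a
;; TAB glyph stretching to the tab stop at column 8 (7 columns = 70 px).
(my/glyph-index-and-column '((a . 10) (tab . 70) (b . 10)) 3 10)
;; => (3 . 9)
```

The index (3) matches what the report observed, while the
pixel-derived column (9) matches what the reporter expected.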

> Is this value useful, though? Since the visual buffer contents are
> inaccessible from Lisp

posn-actual-col-row is just an accessor function for the event data
structure.  The event data structure from which it extracts its output
is used not only from Lisp; it is also used by the display engine
itself, and there this information is very useful, because it makes it
easy to find the glyph where the user clicks the mouse.  The glyph
holds a reference to the object from which it came and other useful
information required to process click events.

> I believe this value wouldn't be properly correct in most contexts,
> even in tty, where the major pitfalls you described can't happen.

First, ':align-to' display properties are supported on a TTY as well,
as are TABs (of course).  But this is actually one more subtle issue
with posn-actual-col-row, because if you try your recipe in a
text-mode frame, you will see that there posn-actual-col-row counts
the TAB as 7 columns, and your recipe works as you expected!

This is again a manifestation of how click events and posn-at-point
implement the "column" part: they count _glyphs_.  Now, on a TTY,
TABs, ':align-to', and the like are implemented by inserting a
suitable number of glyphs that display as a blank character (a space,
ASCII code 32), because a text-mode terminal cannot display
variable-width characters (well, there are double-width characters in
CJK locales, but let's ignore that for a moment).  So this is what you
get in the "column" on a TTY.
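
A rough sketch of why the glyph count on a TTY coincides with the
visual column for the original recipe.  The function name
my/tty-glyph-count is hypothetical, and the model is simplified: it
assumes every non-TAB character occupies exactly one cell (so it
ignores the double-width CJK case) and that a TAB is expanded into
blank glyphs up to the next tab stop.

```elisp
;; Simplified model of TTY glyph production, not Emacs internals.
(defun my/tty-glyph-count (string tab-width)
  "Return the number of one-cell glyphs STRING would produce in this model.
Each TAB is expanded to blanks up to the next multiple of TAB-WIDTH;
every other character is assumed to produce exactly one glyph."
  (let ((count 0))
    (dolist (ch (string-to-list string))
      (if (eq ch ?\t)
          ;; Advance to the next tab stop, as if padding with blank glyphs.
          (setq count (* tab-width (1+ (/ count tab-width))))
        (setq count (1+ count))))
    count))

(my/tty-glyph-count "a\tb" 8)
;; => 9
```

Here the TAB contributes 7 blank glyphs, so the glyph index after "b"
is 9, the same number as the visual column.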

> AFAICS, this function is only called in two places in Emacs code:
> 
> - From `line-move-partial', where I don't understand what it does.

line-move-partial uses the "row" part of what posn-actual-col-row
returns, so the problems with "columns" don't happen there.

> - From `proced-sort-header', where the caller apparently just assumes
>   there are no multiple-width characters on that line.

And that's a valid assumption in that case, because these "columns"
are applied to the Proced's header line, which Proced itself
generates, so it knows what is there.  In addition, the "column"
returned by posn-actual-col-row is used there to index into the
header-line string, so again, a pure character count is TRT.

> So, was the decision not to return a current-column-like value made
> due to performance considerations?

The event data structure is fundamentally pixel-based.  The basic
first-hand information in the event is the pixel-unit X and Y, from
which, given enough code, you can recompute everything else.  The
event data structure provides additional pre-computed attributes of
the event to make the job of its users easier; the glyph coordinates
COL and ROW are 2 of these pre-computed attributes, which some of the
users of the event structure find very useful.  Admittedly, Lisp
programs generally shouldn't use those, except in very special cases.
But every complex data structure should have accessor functions to its
parts, and the event structure is no exception.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#18385; Package emacs. (Wed, 03 Sep 2014 22:11:01 GMT)

Message #19 received at 18385 <at> debbugs.gnu.org:

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 18385 <at> debbugs.gnu.org
Subject: Re: bug#18385: 24.3.93; posn-at-point doesn't account for tab-width
Date: Thu, 04 Sep 2014 02:10:42 +0400
On 09/03/2014 08:12 PM, Eli Zaretskii wrote:

> If you want a more accurate description of what happens, it is this:
> Glyphs collected for each display line are stored in an array of
> structures which specify how to display each glyph (and that includes
> the calculated pixel width of the glyphs).  What posn-actual-col-row
> gives you is the _index_ of the corresponding glyph structure in that
> array.

Thank you.

>> I believe this value wouldn't be properly correct in most contexts,
>> even in tty, where the major pitfalls you described can't happen.
>
> First, ':align-to' display properties are supported on a TTY as well,
> as are TABs (of course).  But this is actually one more subtle issue
> with posn-actual-col-row, because if you try your recipe in a
> text-mode frame, you will see that there posn-actual-col-row counts
> the TAB as 7 columns, and your recipe works as you expected!

Looks like yet another reason not to use this from Lisp code. :)

> generates, so it knows what is there.  In addition, the "column"
> returned by posn-actual-col-row is used there to index into the
> header-line string, so again, a pure character count is TRT.

Ah, indeed. It's a different application from what I had in mind.

> But every complex data structure should have accessor functions to its
> parts, and the event structure is no exception.

I guess its docstring was what tripped me up. "Actual column" sounds too
close to the name of `current-column', which does count character widths.

Do you think something like this change would make sense?

Or, also, instead of "contain", maybe use "correspond to" (unrelated to 
the present discussion).

=== modified file 'lisp/subr.el'
--- lisp/subr.el	2014-09-02 15:16:42 +0000
+++ lisp/subr.el	2014-09-03 22:09:02 +0000
@@ -1149,10 +1149,13 @@
 	      (/ (cdr pair) (+ (frame-char-height frame) spacing))))))))

 (defun posn-actual-col-row (position)
-  "Return the actual column and row in POSITION, measured in characters.
-These are the actual row number in the window and character number in that row.
-Return nil if POSITION does not contain the actual position; in that case
+  "Return the actual row character number and row number in POSITION.
+Return nil if POSITION does not contain an actual position; in that case
 `posn-col-row' can be used to get approximate values.
+
+Consider using `posn-col-row' instead either way, because this
+function doesn't take character widths into account.
+
 POSITION should be a list of the form returned by the `event-start'
 and `event-end' functions."
   (nth 6 position))






Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#18385; Package emacs. (Thu, 04 Sep 2014 15:23:02 GMT)

Message #22 received at 18385 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 18385 <at> debbugs.gnu.org
Subject: Re: bug#18385: 24.3.93; posn-at-point doesn't account for tab-width
Date: Thu, 04 Sep 2014 18:23:00 +0300
> Date: Thu, 04 Sep 2014 02:10:42 +0400
> From: Dmitry Gutov <dgutov <at> yandex.ru>
> CC: 18385 <at> debbugs.gnu.org
> 
> I guess its docstring was what tripped me up. "Actual column" sounds too
> close to the name of `current-column', which does count character widths.
> 
> Do you think something like this change would make sense?

Yes, I made a similar change (on the emacs-24 branch).

Thanks.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Fri, 03 Oct 2014 11:24:04 GMT)

This bug report was last modified 9 years and 217 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.