GNU bug report logs - #13399
24.3.50; Word-wrap can't wrap at zero-width space U-200B

Package: emacs;

Reported by: martin rudalics <rudalics <at> gmx.at>

Date: Thu, 10 Jan 2013 08:31:02 UTC

Severity: wishlist

Found in version 24.3.50

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 13399 in the body.
You can then email your comments to 13399 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#13399; Package emacs. (Thu, 10 Jan 2013 08:31:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to martin rudalics <rudalics <at> gmx.at>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Thu, 10 Jan 2013 08:31:03 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: martin rudalics <rudalics <at> gmx.at>
To: Bug-Gnu-Emacs <bug-gnu-emacs <at> gnu.org>
Subject: 24.3.50; Word-wrap can't wrap at zero-width space U-200B
Date: Thu, 10 Jan 2013 09:29:25 +0100

With emacs -Q evaluate

(with-current-buffer (get-buffer-create "*foo*")
  (dotimes (i 1000)
    (insert "1234\u200b")) ; U+200B ZERO WIDTH SPACE
  (setq word-wrap t)
  (display-buffer "*foo*"))

where the character after 1234 is a zero-width space character with
unicode code point U-200B.  As can be seen in the window showing *foo*,
lines are not regularly wrapped at that character.  Doing

(with-current-buffer (get-buffer-create "*foo*")
  (dotimes (i 1000)
    (insert "1234 "))
  (setq word-wrap t)
  (display-buffer "*foo*"))

instead wraps lines as expected.

Observed with GNU Emacs 24.3.50.1 (i386-mingw-nt5.1.2600)
 of 2013-01-07 on MACHNO

martin





Message #8 received at 13399 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: martin rudalics <rudalics <at> gmx.at>
Cc: 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50;
	Word-wrap can't wrap at zero-width space U-200B
Date: Thu, 10 Jan 2013 21:15:04 +0200

> Date: Thu, 10 Jan 2013 09:29:25 +0100
> From: martin rudalics <rudalics <at> gmx.at>
> 
> With emacs -Q evaluate
> 
> (with-current-buffer (get-buffer-create "*foo*")
>    (dotimes (i 1000)
>      (insert "1234\u200b")) ; U+200B ZERO WIDTH SPACE
>    (setq word-wrap t)
>    (display-buffer "*foo*"))
> 
> where the character after 1234 is a zero-width space character with
> unicode code point U-200B.  As can be seen in the window showing *foo*,
> lines are not regularly wrapped at that character.

You mean, not wrapped at all.  Witness the continuation bitmaps in the
fringes, which shouldn't appear when a line is wrapped.

> Doing
> 
> (with-current-buffer (get-buffer-create "*foo*")
>    (dotimes (i 1000)
>      (insert "1234 "))
>    (setq word-wrap t)
>    (display-buffer "*foo*"))
> 
> instead wraps lines as expected.

If anything, this is a missing feature, since word-wrap is explicitly
coded to break lines only on SPC and TAB characters.  See the
IT_DISPLAYING_WHITESPACE macro in xdisp.c.

If we want to add more characters to the set, we should probably
arrange a special char-table for this, and have it exposed to Lisp, so
it could be customized.  Patches are welcome.





Message #11 received at 13399 <at> debbugs.gnu.org (full text, mbox):

From: martin rudalics <rudalics <at> gmx.at>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50; Word-wrap can't wrap at zero-width space
	U-200B
Date: Fri, 11 Jan 2013 09:16:45 +0100

>> As can be seen in the window showing *foo*,
>> lines are not regularly wrapped at that character.
>
> You mean, not wrapped at all.  Witness the continuation bitmaps in the
> fringes, which shouldn't appear when a line is wrapped.

I thought these bitmaps appear when a line is wrapped.

> If anything, this is a missing feature, since word-wrap is explicitly
> coded to break lines only on SPC and TAB characters.

The doc-string of `word-wrap' says

  When word-wrapping is on, continuation lines are wrapped at the space
  or tab character nearest to the right window edge

Since U-200B is a space character the line should wrap at it.  Also

  this character is intended for invisible word separation and for line
  break control; it has no width, but its presence between two
  characters does not prevent increased letter spacing in justification

and Emacs apparently does handle it specially since it reserves a few
pixels when drawing it.  But documentation on `word-wrap' is scarce ...

> See the
> IT_DISPLAYING_WHITESPACE macro in xdisp.c.

I tried to understand the code but failed.

> If we want to add more characters to the set, we should probably
> arrange a special char-table for this, and have it exposed to Lisp, so
> it could be customized.  Patches are welcome.

IIUC all breakable spaces are between U-2000 and U-200B so maybe a
character table is not needed.

Anyway, exposing displayed text to Lisp would be great.  We'd just need
two functions - one that gets the pixel width of an arbitrary buffer
string wrt a specific window, and one that gets the pixel height of an
arbitrary buffer string (newlines ignored) wrt a specific window.  This
way we could get rid of lots of problems currently hidden in the display
engine ...

martin




Message #14 received at 13399 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: martin rudalics <rudalics <at> gmx.at>
Cc: 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50;
	Word-wrap can't wrap at zero-width space U-200B
Date: Fri, 11 Jan 2013 10:58:08 +0200

> Date: Fri, 11 Jan 2013 09:16:45 +0100
> From: martin rudalics <rudalics <at> gmx.at>
> CC: 13399 <at> debbugs.gnu.org
> 
>  >> As can be seen in the window showing *foo*,
>  >> lines are not regularly wrapped at that character.
>  >
>  > You mean, not wrapped at all.  Witness the continuation bitmaps in the
>  > fringes, which shouldn't appear when a line is wrapped.
> 
> I thought these bitmaps appear when a line is wrapped.

Not by default.  Not unless you customize visual-line-fringe-indicators.

>  > If anything, this is a missing feature, since word-wrap is explicitly
>  > coded to break lines only on SPC and TAB characters.
> 
> The doc-string of `word-wrap' says
> 
>    When word-wrapping is on, continuation lines are wrapped at the space
>    or tab character nearest to the right window edge
> 
> Since U-200B is a space character the line should wrap at it.

No, it means literally "the space character", U+0020.

>  Also
> 
>    this character is intended for invisible word separation and for line
>    break control; it has no width, but its presence between two
>    characters does not prevent increased letter spacing in justification
> 
> and Emacs apparently does handle it specially since it reserves a few
> pixels when drawing it.

See glyphless-char-display and glyphless-char-display-control for why.

> But documentation on `word-wrap' is scarce ...

Actually, it doesn't exist, apart from the doc string.

>  > See the
>  > IT_DISPLAYING_WHITESPACE macro in xdisp.c.
> 
> I tried to understand the code but failed.

  #define IT_DISPLAYING_WHITESPACE(it)                                     \
    /* If the character to be displayed is SPC or TAB */                   \
    ((it->what == IT_CHARACTER && (it->c == ' ' || it->c == '\t'))         \
    /* Or we are iterating over a display or overlay string, ... */        \
     || ((STRINGP (it->string)                                             \
    /* ... and the character at current string position is SPC or TAB */   \
          && (SREF (it->string, IT_STRING_BYTEPOS (*it)) == ' '            \
              || SREF (it->string, IT_STRING_BYTEPOS (*it)) == '\t'))      \
    /* Or we are iterating over a C string, ... */                         \
         || (it->s                                                         \
    /* ... and the character at current string position is SPC or TAB */   \
             && (it->s[IT_BYTEPOS (*it)] == ' '                            \
                 || it->s[IT_BYTEPOS (*it)] == '\t'))                      \
    /* Or the iterator is before end of buffer's reachable portion, ... */ \
         || (IT_BYTEPOS (*it) < ZV_BYTE                                    \
    /* ... and the character at current buffer position is SPC or TAB */   \
             && (*BYTE_POS_ADDR (IT_BYTEPOS (*it)) == ' '                  \
                 || *BYTE_POS_ADDR (IT_BYTEPOS (*it)) == '\t'))))

In any case, you can clearly see that it only tests for literal SPC
and TAB characters.

>  > If we want to add more characters to the set, we should probably
>  > arrange a special char-table for this, and have it exposed to Lisp, so
>  > it could be customized.  Patches are welcome.
> 
> IIUC all breakable spaces are between U-2000 and U-200B so maybe a
> character table is not needed.

Who said we want to break only at breakable space characters?  Who said
Unicode will never add more such characters in another block?  And
what about low-ASCII characters, which are already in a different
block?

In any case, even if you are right, a char-table is a way to store
character properties efficiently.  In particular, it will waste very
little storage to mark a contiguous range of characters with the same
property.  The advantage of using a char-table is that it will
dynamically expand as needed if more characters are added to the set.
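
As a rough illustration of the char-table idea discussed here, the
following sketch builds such a table from Lisp with the ordinary
char-table API.  It is only a sketch: the names `my-word-wrap-chars'
and `my-wrap-char-p' are hypothetical and do not exist in Emacs, and
the range U+2000..U+200B is simply the one mentioned in this thread,
taken at face value.

```elisp
;; Hypothetical sketch: a char-table marking characters at which a
;; line may be broken.  Neither name below exists in Emacs.
(defvar my-word-wrap-chars
  (let ((table (make-char-table 'my-word-wrap-chars nil)))
    ;; What the display engine tests for today: literal SPC and TAB.
    (aset table ?\s t)
    (aset table ?\t t)
    ;; A single range entry covers U+2000..U+200B (the range named in
    ;; the thread; whether every character in it should allow a break
    ;; is an assumption).
    (set-char-table-range table '(#x2000 . #x200B) t)
    table)
  "Char-table that is non-nil for characters allowing a line break.")

(defun my-wrap-char-p (char)
  "Return non-nil if CHAR is marked in `my-word-wrap-chars'."
  (aref my-word-wrap-chars char))
```

Marking another character later is one more `aset' or
`set-char-table-range' call, so the set can grow without touching the
code that consults the table.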

> Anyway, exposing displayed text to Lisp would be great.  We'd just need
> two functions - one that gets the pixel width of an arbitrary buffer
> string wrt a specific window, and one that gets the pixel height of an
> arbitrary buffer string (newlines ignored) wrt a specific window.  This
> way we could get rid of lots of problems currently hidden in the display
> engine ...

You lost me here.  By "exposing to Lisp" I meant expose the char-table
of word-wrap characters to Lisp.  What did _you_ want exposed to Lisp?




Message #17 received at 13399 <at> debbugs.gnu.org (full text, mbox):

From: martin rudalics <rudalics <at> gmx.at>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50; Word-wrap can't wrap at zero-width space
	U-200B
Date: Fri, 11 Jan 2013 11:29:46 +0100

>>  > You mean, not wrapped at all.  Witness the continuation bitmaps in the
>>  > fringes, which shouldn't appear when a line is wrapped.
>>
>> I thought these bitmaps appear when a line is wrapped.
>
> Not by default.  Not unless you customize visual-line-fringe-indicators.

With emacs -Q I see curly arrows in the fringes regardless of whether I
set `visual-line-fringe-indicators' or not.  What am I missing?

>> The doc-string of `word-wrap' says
>>
>>    When word-wrapping is on, continuation lines are wrapped at the space
>>    or tab character nearest to the right window edge
>>
>> Since U-200B is a space character the line should wrap at it.
>
> No, it means literally "the space character", U+0020.

So `word-wrap' is ASCII-only?  The doc-string should say so.

>> and Emacs apparently does handle it specially since it reserves a few
>> pixels when drawing it.
>
> See glyphless-char-display and glyphless-char-display-control for why.

IIUC it has a `thin-space' display method entry and I could set this to
`zero-width' (the doc-string of `glyphless-char-display' is ambiguous
about that)?  Does this also mean that I can separate text properties of
adjacent words by inserting a zero-width space between them?

>   #define IT_DISPLAYING_WHITESPACE(it)					\
>     /* If the character to be displayed is SPC or TAB */
[...]
> In any case, you can clearly see that it only tests for literal SPC
> and TAB characters.

Even if I don't understand the code I can see that, yes.

>>  > If we want to add more characters to the set, we should probably
>>  > arrange a special char-table for this, and have it exposed to Lisp, so
>>  > it could be customized.  Patches are welcome.
>>
>> IIUC all breakable spaces are between U-2000 and U-200B so maybe a
>> character table is not needed.
>
> Who said we want to break only at breakable space characters?  Who said
> Unicode will never add more such characters in another block?  And
> what about low-ASCII characters, which are already in a different
> block?

But implementing a character table and working with it is harder.

> In any case, even if you are right, a char-table is a way to store
> character properties efficiently.  In particular, it will waste very
> little storage to mark a contiguous range of characters with the same
> property.  The advantage of using a char-table is that it will
> dynamically expand as needed if more characters are added to the set.

Is it useful to make a _separate_ table for line-break properties?

>> Anyway, exposing displayed text to Lisp would be great.  We'd just need
>> two functions - one that gets the pixel width of an arbitrary buffer
>> string wrt a specific window, and one that gets the pixel height of an
>> arbitrary buffer string (newlines ignored) wrt a specific window.  This
>> way we could get rid of lots of problems currently hidden in the display
>> engine ...
>
> You lost me here.  By "exposing to Lisp" I meant expose the char-table
> of word-wrap characters to Lisp.

I only now understand what you meant.

> What did _you_ want exposed to Lisp?

Two functions: One to get the width of some arbitrary buffer text in
pixels and one to get the full height of a buffer text line in pixels.
The former would be used for doing word-wrapping variants in Lisp, the
latter for fitting windows to their buffers.

martin




Message #20 received at 13399 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: martin rudalics <rudalics <at> gmx.at>
Cc: 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50;
	Word-wrap can't wrap at zero-width space U-200B
Date: Fri, 11 Jan 2013 12:57:38 +0200

> Date: Fri, 11 Jan 2013 11:29:46 +0100
> From: martin rudalics <rudalics <at> gmx.at>
> CC: 13399 <at> debbugs.gnu.org
> 
>  >>  > You mean, not wrapped at all.  Witness the continuation bitmaps in the
>  >>  > fringes, which shouldn't appear when a line is wrapped.
>  >>
>  >> I thought these bitmaps appear when a line is wrapped.
>  >
>  > Not by default.  Not unless you customize visual-line-fringe-indicators.
> 
> With emacs -Q I see curly arrows in the fringes regardless of whether I
> set `visual-line-fringe-indicators' or not.  What am I missing?

Is this still with U+200B?  You need to try with regular spaces.
Then the indicators should disappear on wrapped lines.

>  >> The doc-string of `word-wrap' says
>  >>
>  >>    When word-wrapping is on, continuation lines are wrapped at the space
>  >>    or tab character nearest to the right window edge
>  >>
>  >> Since U-200B is a space character the line should wrap at it.
>  >
>  > No, it means literally "the space character", U+0020.
> 
> So `word-wrap' is ASCII-only?

Yes.

> The doc-string should say so.

Well, I personally find it hard to imagine that "the space character"
could be interpreted as something other than U+0020.  But I see what
you mean.

>  >> and Emacs apparently does handle it specially since it reserves a few
>  >> pixels when drawing it.
>  >
>  > See glyphless-char-display and glyphless-char-display-control for why.
> 
> IIUC it has a `thin-space' display method entry and I could set this to
> `zero-width' (the doc-string of `glyphless-char-display' is ambiguous
> about that)?

Yes.
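
A minimal sketch of that change, assuming it is evaluated in a running
session: `glyphless-char-display' is a char-table, so the entry for
U+200B can be set directly.  Customizing
`glyphless-char-display-control' afterwards may overwrite the entry
again.

```elisp
;; Display U+200B ZERO WIDTH SPACE with no width at all, instead of
;; the `thin-space' method mentioned above.
(set-char-table-range glyphless-char-display #x200B 'zero-width)

;; Inspect the display method now recorded for the character.
(aref glyphless-char-display #x200B)
```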

>  Does this also mean that I can separate text properties of
> adjacent words by inserting a zero-width space between them?

Yes, I think so (if I understand correctly what you mean).

>  >>  > If we want to add more characters to the set, we should probably
>  >>  > arrange a special char-table for this, and have it exposed to Lisp, so
>  >>  > it could be customized.  Patches are welcome.
>  >>
>  >> IIUC all breakable spaces are between U-2000 and U-200B so maybe a
>  >> character table is not needed.
>  >
>  > Who said we want to break only at breakable space characters?  Who said
>  > Unicode will never add more such characters in another block?  And
>  > what about low-ASCII characters, which are already in a different
>  > block?
> 
> But implementing a character table and working with it is harder.

I don't think it's harder, it's actually very simple.  You have a
simple API for setting values in the table and a simple API for
accessing a property of a character.

>  > In any case, even if you are right, a char-table is a way to store
>  > character properties efficiently.  In particular, it will waste very
>  > little storage to mark a contiguous range of characters with the same
>  > property.  The advantage of using a char-table is that it will
>  > dynamically expand as needed if more characters are added to the set.
> 
> Is it useful to make a _separate_ table for line-break properties?

Why not?  What existing table would you reuse for that?

>  > What did _you_ want exposed to Lisp?
> 
> Two functions: One to get the width of some arbitrary buffer text in
> pixels and one to get the full height of a buffer text line in pixels.
> The former would be used for doing word-wrapping variants in Lisp, the
> latter for fitting windows to their buffers.

The latter already exists as window-line-height, doesn't it?

Anyway, how would you word-wrap in Lisp, except by adding display
strings with newlines (which AFAIR features like longlines
etc. already do)?




Message #23 received at 13399 <at> debbugs.gnu.org (full text, mbox):

From: martin rudalics <rudalics <at> gmx.at>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50; Word-wrap can't wrap at zero-width space
	U-200B
Date: Fri, 11 Jan 2013 15:30:04 +0100

>> With emacs -Q I see curly arrows in the fringes regardless of whether I
>> set `visual-line-fringe-indicators' or not.  What am I missing?
>
> Is this still with U+200B?  You need to try with regular spaces.

My bad.  It works when I enable `visual-line-mode'.  It doesn't with
only `word-wrap' set to t.

> Then the indicators should disappear on wrapped lines.
[...]
>> So `word-wrap' is ASCII-only?
>
> Yes.
>
>> The doc-string should say so.
>
> Well, I personally find it hard to imagine that "the space character"
> could be interpreted as something other than U+0020.  But I see what
> you mean.

Nothing serious.  I detected zero-width spaces and noticed that emacs
can display them, so I was misled into thinking word wrap would handle them.

>> IIUC it has a `thin-space' display method entry and I could set this to
>> `zero-width' (the doc-string of `glyphless-char-display' is ambiguous
>> about that)?
>
> Yes.

Good.

>>  Does this also mean that I can separate text properties of
>> adjacent words by inserting a zero-width space between them?
>
> Yes, I think so (if I understand correctly what you mean).

Never mind, it works.  What I meant was that when, for example, I have
two adjacent parts of text with the same mouse-face property and the
mouse hovers over one of the words, the other word gets highlighted as
well.  Maybe it's just stickiness or whatever, but till now I hadn't
found a method to turn this off.  Not recommended for normal buffers
because `forward-char' appears to hang, but that's a different story.

>> But implementing a character table and working with it is harder.
>
> I don't think it's harder, it's actually very simple.  You have a
> simple API for setting values in the table and a simple API for
> accessing a property of a character.

OK.  I take your word for it.  Maybe it could be also useful for adding
soft hyphens (if we can make `forward-char' handle them).

>> Is it useful to make a _separate_ table for line-break properties?
>
> Why not?  What existing table would you reuse for that?

No idea.  Do we have a list of predefined character tables somewhere?

>> Two functions: One to get the width of some arbitrary buffer text in
>> pixels and one to get the full height of a buffer text line in pixels.
>> The former would be used for doing word-wrapping variants in Lisp, the
>> latter for fitting windows to their buffers.
>
> The latter already exists as window-line-height, doesn't it?

This needs an up to date display, IIUC :-(

> Anyway, how would you word-wrap in Lisp, except by adding display
> strings with newlines (which AFAIR features like longlines
> etc. already do)?

By adding hard newlines.  All I care about is to (1) show the entire
buffer text in a fixed-width window and (2) make that window as small as
possible.

martin




Message #26 received at 13399 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: martin rudalics <rudalics <at> gmx.at>
Cc: 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50;
	Word-wrap can't wrap at zero-width space U-200B
Date: Fri, 11 Jan 2013 16:49:01 +0200

> Date: Fri, 11 Jan 2013 15:30:04 +0100
> From: martin rudalics <rudalics <at> gmx.at>
> CC: 13399 <at> debbugs.gnu.org
> 
>  >>  Does this also mean that I can separate text properties of
>  >> adjacent words by inserting a zero-width space between them?
>  >
>  > Yes, I think so (if I understand correctly what you mean).
> 
> Never mind, it works.  What I meant was that when, for example, I have
> two adjacent parts of text with the same mouse-face property and the
> mouse hovers over one of the words, the other word gets highlighted as
> well.  Maybe it's just stickiness or whatever

No, it's because, when mouse highlight finds a character with
mouse-face, it looks forward for the first character without that
face, and highlights everything in between.

>  >> Two functions: One to get the width of some arbitrary buffer text in
>  >> pixels and one to get the full height of a buffer text line in pixels.
>  >> The former would be used for doing word-wrapping variants in Lisp, the
>  >> latter for fitting windows to their buffers.
>  >
>  > The latter already exists as window-line-height, doesn't it?
> 
> This needs an up to date display, IIUC :-(

Yes, but only because the code to do that without looking at the
current glyph matrix was never written.  We do similar things all over
the display engine.

>  > Anyway, how would you word-wrap in Lisp, except by adding display
>  > strings with newlines (which AFAIR features like longlines
>  > etc. already do)?
> 
> By adding hard newlines.

I thought that way is "deprecated" in favor of C-level word-wrap,
which is why longlines.el is in obsolete/...




Message #29 received at 13399 <at> debbugs.gnu.org (full text, mbox):

From: martin rudalics <rudalics <at> gmx.at>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50; Word-wrap can't wrap at zero-width space
	U-200B
Date: Fri, 11 Jan 2013 16:17:04 +0100

> No, it's because, when mouse highlight finds a character with
> mouse-face, it looks forward for the first character without that
> face, and highlights everything in between.

I thought so.  Seems like zero-width spaces are the only means to fix
this.

>> This needs an up to date display, IIUC :-(
>
> Yes, but only because the code to do that without looking at the
> current glyph matrix was never written.  We do similar things all over
> the display engine.

Could you give writing this sort of "maybe this year" priority?  Look at
that silly code in `fit-window-to-buffer' for a motivation.  Just that
the return value of such a function would have to include the height
needed for continuation lines as well.

>> By adding hard newlines.
>
> I thought that way is "deprecated" in favor of C-level word-wrap,
> which is why longlines.el is in obsolete/...

But I have to calculate the height of the window _before_ redisplay.
And for knowing the height of the window I have to know the number of
displayed lines.

martin




Message #32 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Christopher Schmidt <christopher <at> ch.ristopher.com>
To: bug-gnu-emacs <at> gnu.org
Subject: Re: bug#13399: 24.3.50;
	Word-wrap can't wrap at zero-width space U-200B
Date: Fri, 11 Jan 2013 15:22:29 +0000 (GMT)

martin rudalics <rudalics <at> gmx.at> writes:
>> No, it's because, when mouse highlight finds a character with
>> mouse-face, it looks forward for the first character without that
>> face, and highlights everything in between.
>
> I thought so.  Seems like zero-width spaces are the only means to fix
> this.

You can insert invisible text in between the characters, such as

    (propertize " " 'invisible t)

        Christopher
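
A small sketch of the invisible-separator approach, assuming the two
chunks are built as strings before insertion; the helper name
`my-join-highlighted' is hypothetical.  The separator carries no
`mouse-face' property, so the forward scan for highlighted text stops
there.

```elisp
;; Hypothetical helper: join two words that both carry `mouse-face',
;; separated by an invisible space that carries no `mouse-face'.
(defun my-join-highlighted (word1 word2)
  "Return WORD1 and WORD2 as separately highlightable chunks."
  (concat (propertize word1 'mouse-face 'highlight)
          (propertize " " 'invisible t)
          (propertize word2 'mouse-face 'highlight)))

;; Example: the result can be passed to `insert' in any buffer.
(my-join-highlighted "foo" "bar")
```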




Message #35 received at 13399 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: martin rudalics <rudalics <at> gmx.at>
Cc: 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50;
	Word-wrap can't wrap at zero-width space U-200B
Date: Fri, 11 Jan 2013 17:53:57 +0200

> Date: Fri, 11 Jan 2013 16:17:04 +0100
> From: martin rudalics <rudalics <at> gmx.at>
> CC: 13399 <at> debbugs.gnu.org
> 
>  >> This needs an up to date display, IIUC :-(
>  >
>  > Yes, but only because the code to do that without looking at the
>  > current glyph matrix was never written.  We do similar things all over
>  > the display engine.
> 
> Could you give writing this sort of "maybe this year" priority?

I cannot promise that, but I will try.  (It is actually a very good
exercise for someone who wants to get to know the display engine,
hint, hint...)

> But I have to calculate the height of the window _before_ redisplay.
> And for knowing the height of the window I have to know the number of
> displayed lines.

Don't vertical-motion, pos-visible-in-window-p, and their ilk provide
that already?




Message #38 received at 13399 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: martin rudalics <rudalics <at> gmx.at>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50;
	Word-wrap can't wrap at zero-width space U-200B
Date: Fri, 11 Jan 2013 11:08:02 -0500

> Never mind, it works.  What I meant was that when, for example, I have
> two adjacent parts of text with the same mouse-face property and the
> mouse hovers over one of the words, the other word gets highlighted as
> well.  Maybe it's just stickiness or whatever, but till now I hadn't
> found a method to turn this off.  Not recommended for normal buffers
> because `forward-char' appears to hang, but that's a different story.

Text properties apply to characters, so they don't have a natural notion
of "extent" and "boundaries", but Emacs usually invents those notions
when needed by treating any run of characters whose text-property value
is `eq' as one extent.
In the case of the mouse-face property that means you can use (list 'my-face)
on the chunk you want to make sure it's not `eq' to an adjacent chunk.
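
A short sketch of this trick, with the hypothetical helper
`my-highlight-chunk' and the `highlight' face standing in for
`my-face': each call allocates a new list, so adjacent chunks carry
`mouse-face' values that are `equal' but not `eq', and are therefore
treated as separate extents.

```elisp
;; Hypothetical helper illustrating the (list 'my-face) trick.
(defun my-highlight-chunk (text)
  "Return TEXT with a `mouse-face' value unique to this call."
  ;; `list' allocates a new cons each time, so two chunks never share
  ;; an `eq' property value even though the values are `equal'.
  (propertize text 'mouse-face (list 'highlight)))

;; Two adjacent chunks whose property values differ under `eq'.
(defvar my-two-chunks
  (concat (my-highlight-chunk "foo") (my-highlight-chunk "bar")))
```

`next-single-property-change' compares values the same way, so it can
be used to check where one chunk ends and the next begins.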

>> The latter already exists as window-line-height, doesn't it?
> This needs an up to date display, IIUC :-(

W.r.t. functions that return the pixel width/height of a string, I guess
you'd presume that the string would be displayed at the leftmost
position on a line, since the width/height of a string will depend on
where it's displayed in the window (which affects the width of TAB
chars, and the placement of line wraps).

>> Anyway, how would you word-wrap in Lisp, except by adding display
>> strings with newlines (which AFAIR features like longlines
>> etc. already do)?
> By adding hard newlines.  All I care about is to (1) show the entire
> buffer text in a fixed-width window and (2) make that window as small as
> possible.

How 'bout starting by making the window as high as you can, then call
(posn-at-point (point-max)), then shrink the window accordingly?


        Stefan




Message #41 received at submit <at> debbugs.gnu.org (full text, mbox):

From: martin rudalics <rudalics <at> gmx.at>
To: bug-gnu-emacs <at> gnu.org
Subject: Re: bug#13399: 24.3.50; Word-wrap can't wrap at zero-width space
	U-200B
Date: Fri, 11 Jan 2013 19:04:14 +0100

> You can insert invisible text in between the characters, such as
>
>     (propertize " " 'invisible t)

One could also use display properties or overlays.  But zero-width
spaces can be easily saved to file and read back (not that this is
something I'd need in the case at hand).

martin




Message #44 received at 13399 <at> debbugs.gnu.org (full text, mbox):

From: martin rudalics <rudalics <at> gmx.at>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50; Word-wrap can't wrap at zero-width space
	U-200B
Date: Fri, 11 Jan 2013 19:04:47 +0100

> I cannot promise that, but I will try.  (It is actually a very good
> exercise for someone who wants to get to know the display engine,
> hint, hint...)

... a talented, young programmer.  Where do you find them these days?

>> But I have to calculate the height of the window _before_ redisplay.
>> And for knowing the height of the window I have to know the number of
>> displayed lines.
>
> Don't vertical-motion, pos-visible-in-window-p, and their ilk provide
> that already?

`fit-window-to-buffer' uses `pos-visible-in-window-p' already, with poor
results here (in particular when drawing boxes around text).  And
`vertical-motion' doesn't pay attention to the height of text.  At least
that was my experience when I tried to rewrite `fit-window-to-buffer'
using `count-screen-lines'.

martin




Message #47 received at 13399 <at> debbugs.gnu.org (full text, mbox):

From: martin rudalics <rudalics <at> gmx.at>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50; Word-wrap can't wrap at zero-width space
	U-200B
Date: Fri, 11 Jan 2013 19:06:24 +0100

> In the case of the mouse-face property that means you can use (list 'my-face)
> on the chunk you want to make sure it's not `eq' to an adjacent chunk.

Good to know that trick (undocumented I presume).

> W.r.t. functions that return the pixel width/height of a string, I guess
> you'd presume that the string would be displayed at the leftmost
> position on a line, since the width/height of a string will depend on
> where it's displayed in the window (which affects the width of TAB
> chars, and the placement of line wraps).

Yes.  In my use case the buffer has no newline.

>>> Anyway, how would you word-wrap in Lisp, except by adding display
>>> strings with newlines (which AFAIR features like longlines
>>> etc. already do)?
>> By adding hard newlines.  All I care about is to (1) show the entire
>> buffer text in a fixed-width window and (2) make that window as small as
>> possible.
>
> How 'bout starting by making the window as high as you can, then call
> (posn-at-point (point-max)), then shrink the window accordingly?

How could `posn-at-point' possibly work if the display is not up to
date?  And what I want to avoid is the redisplay.

martin




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#13399; Package emacs. (Fri, 11 Jan 2013 18:51:02 GMT) Full text and rfc822 format available.

Message #50 received at 13399 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: martin rudalics <rudalics <at> gmx.at>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50;
	Word-wrap can't wrap at zero-width space U-200B
Date: Fri, 11 Jan 2013 13:50:00 -0500
>> In the case of the mouse-face property that means you can use (list
>> 'my-face) on the chunk you want to make sure it's not `eq' to an
>> adjacent chunk.
> Good to know that trick (undocumented I presume).

Maybe it's documented somewhere, but at least the doc of the
`mouse-face' text property doesn't mention how boundaries are
determined, indeed.
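
A minimal sketch of the trick, assuming (as described above) that two
adjacent `mouse-face' chunks are kept apart whenever their property values
are not `eq'.  The function name `my-mouse-face-demo', the face name
`my-face' and the sample text are made up for illustration only:

```elisp
(defun my-mouse-face-demo ()
  "Return (EQUAL-P EQ-P) for two adjacent `mouse-face' values.
Purely illustrative; `my-face' need not be a defined face."
  (with-temp-buffer
    (insert "foobar")
    ;; Each call to `list' conses a fresh object, so the two property
    ;; values are `equal' but not `eq'.
    (put-text-property 1 4 'mouse-face (list 'my-face))
    (put-text-property 4 7 'mouse-face (list 'my-face))
    (let ((a (get-text-property 1 'mouse-face))
          (b (get-text-property 4 'mouse-face)))
      (list (equal a b) (eq a b)))))

(my-mouse-face-demo)  ; => (t nil)
```

Both chunks name the same face, yet the values are distinct objects, which
is what the boundary test is said to look at.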

>> W.r.t. functions that return the pixel width/height of a string, I guess
>> you'd presume that the string would be displayed at the leftmost
>> position on a line, since the width/height of a string will depend on
>> where it's displayed in the window (which affects the width of TAB
>> chars, and the placement of line wraps).
> Yes.  In my use case the buffer has no newline.

Does the absence of newline make a difference to the problem?

> How could `posn-at-point' possibly work if the display is not up to
> date?

Actually, it does not require the display to be up-to-date, only the
glyph-matrices.

> And what I want to avoid is the redisplay.

You should be able to update the glyph-matrices without causing the
display to immediately reflect the changes.  E.g. window-end with
a non-nil `update' argument should do that, IIUC.


        Stefan




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#13399; Package emacs. (Fri, 11 Jan 2013 19:09:01 GMT) Full text and rfc822 format available.

Message #53 received at 13399 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: martin rudalics <rudalics <at> gmx.at>
Cc: monnier <at> iro.umontreal.ca, 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50;
	Word-wrap can't wrap at zero-width space U-200B
Date: Fri, 11 Jan 2013 21:08:02 +0200
> Date: Fri, 11 Jan 2013 19:06:24 +0100
> From: martin rudalics <rudalics <at> gmx.at>
> CC: Eli Zaretskii <eliz <at> gnu.org>, 13399 <at> debbugs.gnu.org
> 
> How could `posn-at-point' possibly work if the display is not up to
> date?

Very simply: it simulates the display.  See pos_visible_p (which
posn-at-point calls).  That's what I meant when I said that doing this
for determining the line width or height should be easy.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#13399; Package emacs. (Fri, 11 Jan 2013 19:30:01 GMT) Full text and rfc822 format available.

Message #56 received at 13399 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: rudalics <at> gmx.at, 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50;
	Word-wrap can't wrap at zero-width space U-200B
Date: Fri, 11 Jan 2013 21:29:19 +0200
> From: Stefan Monnier <monnier <at> iro.umontreal.ca>
> Cc: Eli Zaretskii <eliz <at> gnu.org>,  13399 <at> debbugs.gnu.org
> Date: Fri, 11 Jan 2013 13:50:00 -0500
> 
> > How could `posn-at-point' possibly work if the display is not up to
> > date?
> 
> Actually, it does not require the display to be up-to-date, only the
> glyph-matrices.

No, you cannot have up-to-date glyph matrices without triggering
redisplay.  Updating the glyph matrices is the first stage of
redisplay (the second stage being updating the windows using the
differences between the "current" and the "desired" glyph matrices).

What you mean is use move_it_* family of functions.  These _simulate_
redisplay by computing all the metrics of all the characters they move
across, but without producing glyphs and glyph matrices.  Because you
don't actually need the glyph matrices for the task at hand, you only
need the metrics of each display line (its ascent, descent, and pixel
width), and those are computed and tracked by the display iterator
even if no glyphs are produced.

> You should be able to update the glyph-matrices without causing the
> display to immediately reflect the changes.  E.g. window-end with
> a non-nil `update' argument should do that, IIUC.

window-end with a non-nil UPDATE arg uses the above mentioned
technique: it calls move_it_vertically, which moves through the lines
computing their metrics, but doesn't produce any glyphs.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#13399; Package emacs. (Fri, 11 Jan 2013 22:49:01 GMT) Full text and rfc822 format available.

Message #59 received at 13399 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: rudalics <at> gmx.at, 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50;
	Word-wrap can't wrap at zero-width space U-200B
Date: Fri, 11 Jan 2013 17:47:46 -0500
>> > How could `posn-at-point' possibly work if the display is not up to
>> > date?
>> Actually, it does not require the display to be up-to-date, only the
>> glyph-matrices.
> No, you cannot have up-to-date glyph matrices without triggering
> redisplay.

Looks like I misunderstood, indeed.  This said, I don't see why we
couldn't provide ways to update the glyph matrices while leaving the
display update for later.


        Stefan




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#13399; Package emacs. (Sat, 12 Jan 2013 08:29:02 GMT) Full text and rfc822 format available.

Message #62 received at 13399 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: rudalics <at> gmx.at, 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50;
	Word-wrap can't wrap at zero-width space U-200B
Date: Sat, 12 Jan 2013 10:28:15 +0200
> From: Stefan Monnier <monnier <at> iro.umontreal.ca>
> Cc: rudalics <at> gmx.at,  13399 <at> debbugs.gnu.org
> Date: Fri, 11 Jan 2013 17:47:46 -0500
> 
> >> > How could `posn-at-point' possibly work if the display is not up to
> >> > date?
> >> Actually, it does not require the display to be up-to-date, only the
> >> glyph-matrices.
> > No, you cannot have up-to-date glyph matrices without triggering
> > redisplay.
> 
> Looks like I misunderstood, indeed.  This said, I don't see why we
> couldn't provide ways to update the glyph matrices while leaving the
> display update for later.

Generating up-to-date matrices without reflecting them on the glass is
easy: just skip the calls to update_frame in redisplay_internal.

But why would we need that?  Most everything we need to know about
display is already tracked by the display iterator, so available even
without generating glyphs, and that's what the move_it_* functions do.
These functions do their job by traversing only small portions of the
buffer, just large enough for the job at hand to be done.  OTOH,
updating the entire glyph matrices of all of the windows on all of the
frames AFAIR takes the lion's share of time used by redisplay, so we
might as well force a complete redisplay when we do need complete
up-to-date matrices.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#13399; Package emacs. (Sat, 12 Jan 2013 13:22:02 GMT) Full text and rfc822 format available.

Message #65 received at 13399 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: rudalics <at> gmx.at, 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50;
	Word-wrap can't wrap at zero-width space U-200B
Date: Sat, 12 Jan 2013 08:20:50 -0500
> But why would we need that?  Most everything we need to know about
> display is already tracked by the display iterator, so available even
> without generating glyphs, and that's what the move_it_* functions do.
> These functions do their job by traversing only small portions of the
> buffer, just large enough for the job at hand to be done.

What about posn-at-point?

> OTOH, updating the entire glyph matrices of all of the windows on all
> of the frames AFAIR takes the lion's share of time used by redisplay,
> so we might as well force a complete redisplay when we do need
> complete up-to-date matrices.

posn-at-point doesn't need to refresh all glyph matrices, only the one
of the selected window.


        Stefan




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#13399; Package emacs. (Sat, 12 Jan 2013 14:13:02 GMT) Full text and rfc822 format available.

Message #68 received at 13399 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: rudalics <at> gmx.at, 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50;
	Word-wrap can't wrap at zero-width space U-200B
Date: Sat, 12 Jan 2013 16:12:38 +0200
> From: Stefan Monnier <monnier <at> iro.umontreal.ca>
> Cc: rudalics <at> gmx.at,  13399 <at> debbugs.gnu.org
> Date: Sat, 12 Jan 2013 08:20:50 -0500
> 
> > But why would we need that?  Most everything we need to know about
> > display is already tracked by the display iterator, so available even
> > without generating glyphs, and that's what the move_it_* functions do.
> > These functions do their job by traversing only small portions of the
> > buffer, just large enough for the job at hand to be done.
> 
> What about posn-at-point?

What about it?  It already uses move_it_*, see pos_visible_p, which
does all the work.

> > OTOH, updating the entire glyph matrices of all of the windows on all
> > of the frames AFAIR takes the lion's share of time used by redisplay,
> > so we might as well force a complete redisplay when we do need
> > complete up-to-date matrices.
> 
> posn-at-point doesn't need to refresh all glyph matrices, only the one
> of the selected window.

Look at pos_visible_p, and you will see that what it does is
start_display at window top, then move the display iterator to where
point is displayed, and take the pixel coordinates from the display
iterator when that's done.  What else is needed, that the glyph matrix
of the window would provide?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#13399; Package emacs. (Sat, 12 Jan 2013 14:31:05 GMT) Full text and rfc822 format available.

Message #71 received at 13399 <at> debbugs.gnu.org (full text, mbox):

From: martin rudalics <rudalics <at> gmx.at>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: monnier <at> iro.umontreal.ca, 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50; Word-wrap can't wrap at zero-width space
	U-200B
Date: Sat, 12 Jan 2013 15:29:57 +0100
[Message part 1 (text/plain, inline)]
> Very simply: it simulates the display.  See pos_visible_p (which
> posn-at-point calls).  That's what I meant when I said that doing this
> for determining the line width or height should be easy.

Here I see the following call sequences:

`fit-window-to-buffer'
  -> `count-screen-lines' -> `vertical-motion' -> move_it_to
  -> `pos-visible-in-window-p' -> pos_visible_p -> move_it_to

So everything `fit-window-to-buffer' does ends up calling move_it_to and
the loop called via `pos-visible-in-window-p' is likely silly.  Using

`posn-at-point' -> `pos-visible-in-window-p' -> pos_visible_p -> move_it_to

`window-end' -> move_it_vertically -> move_it_to

probably won't produce anything else.  This means that somehow
move_it_to fails to DTRT here.

I don't have a deterministic scenario to produce the bug.  I attach a
file you can try via `eval-buffer' followed by M-x foo.  After that
you'd have to resize the frame randomly (usually shrinking it
vertically) until it hides the last line(s) of the *foo* window.  This
usually takes a few seconds here.

The hiding does not occur when I do not draw a box around characters.  I
didn't try different character heights, bold face, etc. so far.  Note
that in normal work I use a maximized frame and the bug shows up
frequently when changing text in *foo* (using another hook) or the
window configuration.

I now have to manually trigger `window-line-height' on each of *foo*'s
lines when the hiding occurs, add the return values, and try to find out
what goes on.  This will take some time.  If you have a better idea, I'd
be all ears.

martin
[foo.el (application/emacs-lisp, inline)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#13399; Package emacs. (Sat, 12 Jan 2013 14:57:02 GMT) Full text and rfc822 format available.

Message #74 received at 13399 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: martin rudalics <rudalics <at> gmx.at>
Cc: monnier <at> iro.umontreal.ca, 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50;
	Word-wrap can't wrap at zero-width space U-200B
Date: Sat, 12 Jan 2013 16:56:31 +0200
> Date: Sat, 12 Jan 2013 15:29:57 +0100
> From: martin rudalics <rudalics <at> gmx.at>
> CC: monnier <at> iro.umontreal.ca, 13399 <at> debbugs.gnu.org
> 
>  > Very simply: it simulates the display.  See pos_visible_p (which
>  > posn-at-point calls).  That's what I meant when I said that doing this
>  > for determining the line width or height should be easy.
> 
> Here I see the following call sequences:
> 
> `fit-window-to-buffer'
>    -> `count-screen-lines' -> `vertical-motion' -> move_it_to
>    -> `pos-visible-in-window-p' -> pos_visible_p -> move_it_to
> 
> So everything `fit-window-to-buffer' does ends up calling move_it_to and
> the loop called via `pos-visible-in-window-p' is likely silly.

fit-window-to-buffer should probably call pos-visible-in-window-p with
argument PARTIALLY non-nil, and then it should be possible to get rid
of the loop.

> Using
> 
> `posn-at-point' -> `pos-visible-in-window-p' -> pos_visible_p -> move_it_to
> 
> `window-end' -> move_it_vertically -> move_it_to
> 
> probably won't produce anything else.  This means that somehow
> move_it_to fails to DTRT here.

If you can show a case where it fails, it should be possible to fix
it.  OTOH, it could be that these Lisp tricks try to work around
failures in move_it_to that were already fixed.

> I don't have a deterministic scenario to produce the bug.  I attach a
> file you can try via `eval-buffer' followed by M-x foo.  After that
> you'd have to resize the frame randomly (usually shrinking it
> vertically) until it hides the last line(s) of the *foo* window.  This
> usually takes a few seconds here.
> 
> The hiding does not occur when I do not draw a box around characters.  I
> didn't try different character heights, bold face, etc. so far.  Note
> that in normal work I use a maximized frame and the bug shows up
> frequently when changing text in *foo* (using another hook) or the
> window configuration.

I think what you see only happens when the last line is partially
visible (where "partially" might mean just one of its pixels).  If
that is not desirable, then we will need to be more accurate when we
say "visible" and have a more precise definition of what exactly
constitutes a window's height when a single line, in particular the
last one, might have characters of different descent values.

In any case, this is a failure in fit-window-to-buffer, if anything;
it is not necessarily evidence that the move_it_* functions fail.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#13399; Package emacs. (Sat, 12 Jan 2013 16:07:02 GMT) Full text and rfc822 format available.

Message #77 received at 13399 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: rudalics <at> gmx.at, 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50;
	Word-wrap can't wrap at zero-width space U-200B
Date: Sat, 12 Jan 2013 11:06:33 -0500
> What else is needed, that the glyph matrix
> of the window would provide?

Caching, tho maybe it doesn't matter.


        Stefan




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#13399; Package emacs. (Sat, 12 Jan 2013 16:39:02 GMT) Full text and rfc822 format available.

Message #80 received at 13399 <at> debbugs.gnu.org (full text, mbox):

From: martin rudalics <rudalics <at> gmx.at>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: monnier <at> iro.umontreal.ca, 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50; Word-wrap can't wrap at zero-width space
	U-200B
Date: Sat, 12 Jan 2013 17:37:57 +0100
> fit-window-to-buffer should probably call pos-visible-in-window-p with
> argument PARTIALLY non-nil, and then it should be possible to get rid
> of the loop.

But PARTIALLY nil means to return nil if the position is only partially
visible.  And in this case `fit-window-to-buffer' should try to enlarge
the window.

> I think what you see only happens when the last line is partially
> visible (where "partially" might mean just one of its pixels).  If
> that is not desirable, then we will need to be more accurate when we
> say "visible" and have a more precise definition of what exactly
> constitutes a window's height when a single line, in particular the
> last one, might have characters of different descent values.
>
> In any case, this is a failure in fit-window-to-buffer, if anything;
> it is not necessarily evidence that the move_it_* functions fail.

Maybe.  I now call `count-screen-lines' with COUNT-FINAL-NEWLINE t and
this seems to work although I'm quite sure that it didn't work earlier.
I think for the moment I'll just add a COUNT-FINAL-NEWLINE argument to
`fit-window-to-buffer' and see whether it works.

martin




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#13399; Package emacs. (Sat, 12 Jan 2013 16:53:01 GMT) Full text and rfc822 format available.

Message #83 received at 13399 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: martin rudalics <rudalics <at> gmx.at>
Cc: monnier <at> iro.umontreal.ca, 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50;
	Word-wrap can't wrap at zero-width space U-200B
Date: Sat, 12 Jan 2013 18:51:59 +0200
> Date: Sat, 12 Jan 2013 17:37:57 +0100
> From: martin rudalics <rudalics <at> gmx.at>
> CC: monnier <at> iro.umontreal.ca, 13399 <at> debbugs.gnu.org
> 
>  > fit-window-to-buffer should probably call pos-visible-in-window-p with
>  > argument PARTIALLY non-nil, and then it should be possible to get rid
>  > of the loop.
> 
> But PARTIALLY nil means to return nil if the position is only partially
> visible.  And in this case `fit-window-to-buffer' should try to enlarge
> the window.

Should it?  It enlarges the window in line units, so the enlarged
window will again show a partially visible line, no?

> I now call `count-screen-lines' with COUNT-FINAL-NEWLINE t and
> this seems to work although I'm quite sure that it didn't work earlier.

count-screen-lines relies on vertical-motion, which got several
improvements lately.  Maybe this is the reason.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#13399; Package emacs. (Sat, 12 Jan 2013 18:02:02 GMT) Full text and rfc822 format available.

Message #86 received at 13399 <at> debbugs.gnu.org (full text, mbox):

From: martin rudalics <rudalics <at> gmx.at>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: monnier <at> iro.umontreal.ca, 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50; Word-wrap can't wrap at zero-width space
	U-200B
Date: Sat, 12 Jan 2013 19:01:16 +0100
>> But PARTIALLY nil means to return nil if the position is only partially
>> visible.  And in this case `fit-window-to-buffer' should try to enlarge
>> the window.
>
> Should it?  It enlarges the window in line units, so the enlarged
> window will again show a partially visible line, no?

That's the clue.  The bug is in this part of `fit-window-to-buffer':

	    (if (zerop delta)
		;; Return zero if DELTA became zero in the process.
		0

The delta comes from `count-screen-lines' and that function returns the
number of lines in the buffer but NOT in canonical line units.  Removing
this conditional now seems to fix the problem for sure.

Still `fit-window-to-buffer' would benefit from a function that returned
either the pixel height needed for displaying the region or its number
of canonical line units.  Obviously, the former would be preferable when
we switch to pixel-sized windows.

martin




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#13399; Package emacs. (Sat, 12 Jan 2013 18:39:02 GMT) Full text and rfc822 format available.

Message #89 received at 13399 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: martin rudalics <rudalics <at> gmx.at>
Cc: monnier <at> iro.umontreal.ca, 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50;
	Word-wrap can't wrap at zero-width space U-200B
Date: Sat, 12 Jan 2013 20:38:35 +0200
> Date: Sat, 12 Jan 2013 19:01:16 +0100
> From: martin rudalics <rudalics <at> gmx.at>
> CC: monnier <at> iro.umontreal.ca, 13399 <at> debbugs.gnu.org
> 
> Still `fit-window-to-buffer' would benefit from a function that returned
> either the pixel height needed for displaying the region or its number
> of canonical line units.  Obviously, the former would be preferable when
> we switch to pixel-sized windows.

Would something like this be good enough?

  (save-excursion
    (move-beginning-of-line 1)
    (setq pos1 (posn-at-point)))
  (save-excursion
    (move-beginning-of-line 2)
    (setq pos2 (posn-at-point)))

Then use the Y member of the returned information in pos1 and pos2.
(Alternatively, you could do something similar with
pos-visible-in-window-p instead of posn-at-point.)  Will this fit the
bill?
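
Wrapped up as a function, the idea above might look like the following
sketch.  The name `my-line-pixel-height' is hypothetical, and the sketch
assumes that the current buffer is shown in the selected window and that
both positions are visible there (otherwise `posn-at-point' returns nil).
Note that `move-beginning-of-line' moves by logical lines, so for a wrapped
line the result covers all of its screen lines:

```elisp
(defun my-line-pixel-height ()
  "Return the pixel height of the line at point, or nil if unknown.
Illustrative only: assumes the current buffer is displayed in the
selected window and that this line and the next are both visible."
  (let (pos1 pos2)
    (save-excursion
      (move-beginning-of-line 1)
      (setq pos1 (posn-at-point)))
    (save-excursion
      (move-beginning-of-line 2)
      (setq pos2 (posn-at-point)))
    ;; `posn-x-y' returns (X . Y); the difference of the two Y values
    ;; is the vertical distance between the two line starts.
    (when (and pos1 pos2)
      (- (cdr (posn-x-y pos2))
         (cdr (posn-x-y pos1))))))
```

On the last line of a buffer there is no following line to measure
against, so the result is not meaningful there.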




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#13399; Package emacs. (Mon, 14 Jan 2013 18:06:02 GMT) Full text and rfc822 format available.

Message #92 received at 13399 <at> debbugs.gnu.org (full text, mbox):

From: martin rudalics <rudalics <at> gmx.at>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: monnier <at> iro.umontreal.ca, 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50; Word-wrap can't wrap at zero-width space
	U-200B
Date: Mon, 14 Jan 2013 19:04:15 +0100
> Would something like this be good enough?
>
>   (save-excursion
>     (move-beginning-of-line 1)
>     (setq pos1 (posn-at-point)))
>   (save-excursion
>     (move-beginning-of-line 2)
>     (setq pos2 (posn-at-point)))
>
> Then use the Y member of the returned information in pos1 and pos2.

Looks like this should work.  But at the moment I'm a bit lost with the
information returned by `posn-at-point': What precisely does the value
of (nth 2 (posn-at-point (point-max))) stand for?  If my buffer ends with a
newline, is that the value of the lowest pixel of the character box of
the character just above the cursor?  Can it include line spacing?

I wonder because I find this calculation in `posn-col-row' confusing:

	      (- (/ (cdr pair) (+ (frame-char-height frame) spacing))
		 (if (null (with-current-buffer (window-buffer window)
			     header-line-format))
		     0 1))))))))

It does not round values, so the value of rows can be less than needed
for showing the entire text.  OTOH it seems to apply spacing to the last
line of a buffer.  Finally, if a buffer wants a header line, evaluating
(posn-col-row (posn-at-point (point-min))) gives (0 . -1).  Is that
useful?
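
To make the rounding concern concrete, here is a small sketch with made-up
numbers (a 35-pixel Y offset and a 17-pixel line height, both purely
illustrative, as are the function names).  It relies only on the fact that
`/' truncates when both of its arguments are integers:

```elisp
(defun my-rows-truncated (y line-height)
  "Rows covered by Y pixels, using truncating integer division."
  (/ y line-height))

(defun my-rows-rounded-up (y line-height)
  "Rows needed to show Y pixels in full."
  (ceiling y line-height))

(list (my-rows-truncated 35 17)
      (my-rows-rounded-up 35 17))  ; => (2 3)
```

With these numbers the truncated count is one row short of what is needed
to show all 35 pixels.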

So I'm working with the raw data returned by `posn-at-point' and the
results are not worse than with the current approach.  But I still seem
to lose some pixels somewhere ...

martin




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#13399; Package emacs. (Sat, 02 Feb 2013 16:50:02 GMT) Full text and rfc822 format available.

Message #95 received at 13399 <at> debbugs.gnu.org (full text, mbox):

From: martin rudalics <rudalics <at> gmx.at>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: Stefan Monnier <monnier <at> iro.umontreal.ca>, 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50; Word-wrap can't wrap at zero-width space
	U-200B
Date: Sat, 02 Feb 2013 17:48:42 +0100
[Message part 1 (text/plain, inline)]
> But why would we need that?  Most everything we need to know about
> display is already tracked by the display iterator, so available even
> without generating glyphs, and that's what the move_it_* functions do.
> These functions do their job by traversing only small portions of the
> buffer, just large enough for the job at hand to be done.

I rewrote `fit-window-to-buffer' and `fit-frame-to-buffer' using the
display iterator.  Please have a look at the attached patch.

Thanks, martin
[fit-window-to-buffer.diff (text/plain, inline)]
=== modified file 'lisp/window.el'
--- lisp/window.el	2013-01-02 16:13:04 +0000
+++ lisp/window.el	2013-02-02 14:58:22 +0000
@@ -6074,211 +6074,428 @@
 			     (eobp)
 			     window))))
 
-;;; Resizing buffers to fit their contents exactly.
+;;; Resizing windows and frames to fit their contents exactly.
+(defcustom fit-window-to-buffer-horizontally nil
+  "Non-nil means `fit-window-to-buffer' can resize windows horizontally.
+If this is nil, `fit-window-to-buffer' never resizes windows
+horizontally.  If this is `only', it can resize windows
+horizontally only.  Any other value means `fit-window-to-buffer'
+can resize windows in both dimensions."
+  :type 'boolean
+  :version "24.4"
+  :group 'help)
+
 (defcustom fit-frame-to-buffer nil
-  "Non-nil means `fit-window-to-buffer' can resize frames.
+  "Non-nil means `fit-frame-to-buffer' can resize frames.
 A frame can be resized if and only if its root window is a live
-window.  The height of the root window is subject to the values
-of `fit-frame-to-buffer-max-height' and `window-min-height'."
+window.  If this is `horizontally', frames can be resized
+horizontally only.  If this is `vertically', frames can be
+resized vertically only.  Any other non-nil value means frames
+can be resized in both dimensions.  See also
+`fit-frame-to-buffer-margins' and `fit-frame-to-buffer-sizes'."
   :type 'boolean
-  :version "24.3"
+  :version "24.4"
   :group 'help)
 
-(defcustom fit-frame-to-buffer-bottom-margin 4
-  "Bottom margin for the command `fit-frame-to-buffer'.
-This is the number of lines that function leaves free at the bottom of
-the display, in order to not obscure any system task bar or panel.
-If you do not have one (or if it is vertical) you might want to
-reduce this.  If it is thicker, you might want to increase this."
-  ;; If you set this too small, fit-frame-to-buffer can shift the
-  ;; frame up to avoid the panel.
-  :type 'integer
-  :version "24.3"
-  :group 'windows)
-
-(defun fit-frame-to-buffer (&optional frame max-height min-height)
+(defcustom fit-frame-to-buffer-margins '(0 0 0 72)
+  "Margins around frame for `fit-frame-to-buffer'.
+This list specifies the numbers of pixels to be left free on the
+left, above, the right, and below a frame that shall be fit to
+its buffer.  The value specified here can be overridden for a
+specific frame by that frame's `fit-frame-to-buffer-margins'
+parameter, if present.
+
+On some window systems the calculation of frame sizes can be
+incorrect.  Increasing the value of the third and/or fourth
+element of this variable can fix that.
+
+See also `fit-frame-to-buffer-sizes'."
+  :version "24.4"
+  :type '(list
+	  (integer :tag "Left" :size 5)
+	  (integer :tag " Above" :size 5)
+	  (integer :tag " Right" :size 5)
+	  (integer :tag " Below" :size 5))
+  :group 'windows)
+
+(defcustom fit-frame-to-buffer-sizes '(nil nil nil nil)
+  "Size boundaries of frame for `fit-frame-to-buffer'.
+This list specifies the total maximum and minimum lines and
+maximum and minimum columns of the root window of any frame that
+shall be fit to its buffer.  If any of these values is non-nil,
+it will override the value supplied by the respective arguments
+of `fit-frame-to-buffer'.
+
+On window systems where the menubar can wrap, fitting a frame to
+its buffer may swallow the last line(s).  Specifying an
+appropriate minimum width value here can avoid such wrapping.
+
+See also `fit-frame-to-buffer-margins'."
+  :version "24.4"
+  :type '(list
+	  (choice
+	   :tag "Maximum Height"
+	   :value nil
+	   :format "%[MaxHeight%] %v  "
+	   (const :tag "None" :format "%t" nil)
+	   (integer :tag "Lines" :size 5))
+	  (choice
+	   :tag "Minimum Height"
+	   :value nil
+	   :format "%[MinHeight%] %v  "
+	   (const :tag "None" :format "%t" nil)
+	   (integer :tag "Lines" :size 5))
+	  (choice
+	   :tag "Maximum Width"
+	   :value nil
+	   :format "%[MaxWidth%] %v  "
+	   (const :tag "None" :format "%t" nil)
+	   (integer :tag "Columns" :size 5))
+	  (choice
+	   :tag "Minimum Width"
+	   :value nil
+	   :format "%[MinWidth%] %v\n"
+	   (const :tag "None" :format "%t" nil)
+	   (integer :tag "Columns" :size 5)))
+  :group 'windows)
+
+(defun window--sanitize-margin (margin left right)
+  "Return MARGIN if it's a number between LEFT and RIGHT."
+  (if (and (numberp margin)
+	   (<= left (- right margin)) (<= margin right))
+      margin
+    0))
+
+(defun fit-frame-to-buffer (&optional frame max-height min-height max-width min-width)
   "Adjust height of FRAME to display its buffer contents exactly.
 FRAME can be any live frame and defaults to the selected one.
+MAX-HEIGHT, MIN-HEIGHT, MAX-WIDTH and MIN-WIDTH specify bounds on
+the new total size of FRAME's root window.
 
-Optional argument MAX-HEIGHT specifies the maximum height of FRAME.
-It defaults to the height of the display below the current
-top line of FRAME, minus `fit-frame-to-buffer-bottom-margin'.
-Optional argument MIN-HEIGHT specifies the minimum height of FRAME.
-The default corresponds to `window-min-height'."
+The option `fit-frame-to-buffer' controls whether this function
+has any effect.  New position and size of FRAME are additionally
+determined by the options `fit-frame-to-buffer-sizes' and
+`fit-frame-to-buffer-margins' or the corresponding parameters of
+FRAME.  This function can fail to fit the buffer's height when
+`word-wrap' is turned on in that buffer."
   (interactive)
   (setq frame (window-normalize-frame frame))
-  (let* ((root (frame-root-window frame))
-	 (frame-min-height
-	  (+ (- (frame-height frame) (window-total-size root))
-	     window-min-height))
-	 (frame-top (frame-parameter frame 'top))
-	 (top (if (consp frame-top)
-		  (funcall (car frame-top) (cadr frame-top))
-		frame-top))
-	 (frame-max-height
-	  (- (/ (- (x-display-pixel-height frame) top)
-		(frame-char-height frame))
-	     fit-frame-to-buffer-bottom-margin))
-	 (compensate 0)
-	 delta)
-    (when (and (window-live-p root) (not (window-size-fixed-p root)))
-      (with-selected-window root
-	(cond
-	 ((not max-height)
-	  (setq max-height frame-max-height))
-	 ((numberp max-height)
-	  (setq max-height (min max-height frame-max-height)))
-	 (t
-	  (error "%s is an invalid maximum height" max-height)))
-	(cond
-	 ((not min-height)
-	  (setq min-height frame-min-height))
-	 ((numberp min-height)
-	  (setq min-height (min min-height frame-min-height)))
-	 (t
-	  (error "%s is an invalid minimum height" min-height)))
-	;; When tool-bar-mode is enabled and we have just created a new
-	;; frame, reserve lines for toolbar resizing.  This is needed
-	;; because for reasons unknown to me Emacs (1) reserves one line
-	;; for the toolbar when making the initial frame and toolbars
-	;; are enabled, and (2) later adds the remaining lines needed.
-	;; Our code runs IN BETWEEN (1) and (2).  YMMV when you're on a
-	;; system that behaves differently.
-	(let ((quit-restore (window-parameter root 'quit-restore))
-	      (lines (tool-bar-lines-needed frame)))
-	  (when (and quit-restore (eq (car quit-restore) 'frame)
-		     (not (zerop lines)))
-	    (setq compensate (1- lines))))
-	(message "%s" compensate)
-	(setq delta
-	      ;; Always count a final newline - we don't do any
-	      ;; post-processing, so let's play safe.
-	      (+ (count-screen-lines nil nil t)
-		 (- (window-body-size))
-		 compensate)))
-      ;; Move away from final newline.
-      (when (and (eobp) (bolp) (not (bobp)))
-	(set-window-point root (line-beginning-position 0)))
-      (set-window-start root (point-min))
-      (set-window-vscroll root 0)
-      (condition-case nil
-	  (set-frame-height
-	   frame
-	   (min (max (+ (frame-height frame) delta)
-		     min-height)
-		max-height))
-	(error (setq delta nil))))
-    delta))
+  (when (and (window-live-p (frame-root-window frame))
+	     fit-frame-to-buffer
+	     (or (not window-size-fixed)
+		 (and (eq window-size-fixed 'height)
+		      (not (eq fit-frame-to-buffer 'vertically)))
+		 (and (eq window-size-fixed 'width)
+		      (not (eq fit-frame-to-buffer 'horizontally)))))
+    (with-selected-window (frame-root-window frame)
+      (let* ((window (frame-root-window frame))
+	     (char-width (frame-char-width))
+	     (char-height (frame-char-height))
+	     (display-width (display-pixel-width frame))
+	     (display-height (display-pixel-height frame))
+	     ;; Sanitize margins.
+	     (margins (or (frame-parameter frame 'fit-frame-to-buffer-margins)
+			  fit-frame-to-buffer-margins))
+	     (left-margin (window--sanitize-margin
+			   (nth 0 margins) 0 display-width))
+	     (top-margin (window--sanitize-margin
+			  (nth 1 margins) 0 display-height))
+	     (right-margin (window--sanitize-margin
+			    (nth 2 margins) left-margin display-width))
+	     (bottom-margin (window--sanitize-margin
+			     (nth 3 margins) top-margin display-height))
+	     ;; The pixel width of FRAME.
+	     (frame-width (frame-pixel-width))
+	     ;; The difference between FRAME's pixel and parameter
+	     ;; widths.
+	     (frame-extra-width
+	      (- frame-width (* (frame-width) char-width)))
+	     ;; The pixel width of the body of FRAME's window.
+	     (window-body-width (* (window-body-width) char-width))
+	     ;; The difference in pixels between total and body width of
+	     ;; FRAME's window.
+	     (window-extra-width
+	      (- (* (window-total-width) char-width) window-body-width))
+	     ;; The difference in pixels between the frame's pixel width
+	     ;; and the window's body width.
+	     (extra-width
+	      (* char-width (- (frame-width) (window-body-width))))
+	     ;; The maximum width we can use for fitting.
+	     (fit-width
+	      (- display-width (- frame-width window-body-width)
+		 left-margin right-margin))
+	     ;; The pixel position of FRAME's left border.  We usually
+	     ;; try to leave this alone.
+	     (left
+	      (let ((left (frame-parameter nil 'left)))
+		(if (consp left)
+		    (funcall (car left) (cadr left))
+		  left)))
+	     ;; The pixel height of FRAME.
+	     (frame-height (frame-pixel-height))
+	     ;; The difference between FRAME's pixel and parameter
+	     ;; heights.
+	     (frame-extra-height
+	      (- frame-height (* (frame-height) char-height)))
+	     ;; When tool-bar-mode is enabled and we just created a new
+	     ;; frame, reserve lines for toolbar resizing.  Needed
+	     ;; because for reasons unknown to me Emacs (1) reserves one
+	     ;; line for the toolbar when making the initial frame and
+	     ;; toolbars are enabled, and (2) later adds the remaining
+	     ;; lines needed.  Our code runs IN BETWEEN (1) and (2).
+	     ;; YMMV when you're on a system that behaves differently.
+	     (toolbar-extra-height
+	      (let ((quit-restore (window-parameter window 'quit-restore))
+		    (lines (tool-bar-lines-needed frame)))
+		(* char-height
+		   (if (and quit-restore (eq (car quit-restore) 'frame)
+			    (not (zerop lines)))
+		       (1- lines)
+		     0))))
+	     ;; The pixel height of FRAME's window.
+	     (window-body-height (* (window-body-height) char-height))
+	     ;; The difference in pixels between total and body height
+	     ;; of FRAME's window.
+	     (window-extra-height
+	      (- (* (window-total-height) char-height) window-body-height))
+	     ;; The difference in pixels between the frame's pixel
+	     ;; height and the window's body height.
+	     (extra-height
+	      (* (- (frame-height) (window-body-height)) char-height))
+	     ;; The maximum height we can use for fitting.
+	     (fit-height
+	      (- display-height (- frame-height window-body-height)
+		 top-margin bottom-margin toolbar-extra-height))
+	     ;; The pixel position of FRAME's top border.  We usually
+	     ;; try to leave this alone.
+	     (top
+	      (let ((top (frame-parameter nil 'top)))
+		(if (consp top)
+		    (funcall (car top) (cadr top))
+		  top)))
+	     ;; Sanitize minimum and maximum sizes.
+	     (sizes (or (frame-parameter frame 'fit-frame-to-buffer-sizes)
+			fit-frame-to-buffer-sizes))
+	     (max-height
+	      (cond
+	       ((numberp (nth 0 sizes))
+		(- (* (nth 0 sizes) char-height) window-extra-height))
+	       ((numberp max-height)
+		(- (* max-height char-height) window-extra-height))))
+	     (min-height
+	      (cond
+	       ((numberp (nth 1 sizes))
+		(- (* (nth 1 sizes) char-height) window-extra-height))
+	       ((numberp min-height)
+		(- (* min-height char-height) window-extra-height))))
+	     (max-width
+	      (cond
+	       ((numberp (nth 2 sizes))
+		(- (* (nth 2 sizes) char-width) window-extra-width))
+	       ((numberp max-width)
+		(- (* max-width char-width) window-extra-width))))
+	     (min-width
+	      (cond
+	       ((numberp (nth 3 sizes))
+		(- (* (nth 3 sizes) char-width) window-extra-width))
+	       ((numberp min-width)
+		(- (* min-width char-width) window-extra-width))))
+	     (value (window-buffer-pixel-size
+		     nil (point-min) (point-max)
+		     display-width display-height))
+	     (width (car value))
+	     (height (cdr value))
+	     remainder)
+	;; Round sizes (hopefully we can drop these as soon as we can
+	;; resize pixelwise).  First add pixels to obtain full last
+	;; lines and columns.
+	(setq remainder (% width char-width))
+	(unless (zerop remainder)
+	  (setq width (+ width (- char-width remainder))))
+	(setq remainder (% height char-height))
+	(setq height (+ height (- char-height remainder)))
+	;; Now make sure that we don't get larger than our rounded
+	;; maximum lines and columns.
+	(when (> width fit-width)
+	  (setq width (- fit-width (% fit-width char-width))))
+	(when (> height fit-height)
+	  (setq height (- fit-height (% fit-height char-height))))
+	;; Don't change height or width when the window's size is fixed
+	;; in either direction.
+	(cond
+	 ((eq window-size-fixed 'height)
+	  (setq height nil))
+	 ((eq window-size-fixed 'width)
+	  (setq width nil)))
+	(when width
+	  ;; Fit to maximum and minimum widths.
+	  (when max-width
+	    (setq width (min width max-width)))
+	  (when min-width
+	    (setq width (max width min-width)))
+	  ;; Add extra width.
+	  (setq width (+ width extra-width))
+	  ;; Preserve right margin.
+	  (let ((right (+ left width frame-extra-width))
+		(max-right (- display-width right-margin)))
+	    (cond
+	     ((> right max-right)
+	      ;; Move FRAME to left.
+	      (setq left (max 0 (- left (- right max-right)))))
+	     ((< left left-margin)
+	      ;; Move frame to right.
+	      (setq left left-margin)))))
+	(when height
+	  ;; Fit to maximum and minimum heights.
+	  (when max-height
+	    (setq height (min height max-height)))
+	  (when min-height
+	    (setq height (max height min-height)))
+	  ;; Add extra height.
+	  (setq height (+ height extra-height))
+	  ;; Preserve bottom and top margins.
+	  (let ((bottom (+ top height frame-extra-height))
+		(max-bottom (- display-height bottom-margin)))
+	    (cond
+	     ((> bottom max-bottom)
+	      ;; Move FRAME up.
+	      (setq top (max 0 (- top (- bottom max-bottom)))))
+	     ((< top top-margin)
+	      ;; Move frame down.
+	      (setq top top-margin)))))
+	;; Apply changes.
+	(set-frame-position frame left top)
+	(set-frame-size
+	 frame
+	 (if width (/ width char-width) (frame-width))
+	 (if height (/ height char-height) (frame-height)))))))
 
-(defun fit-window-to-buffer (&optional window max-height min-height)
-  "Adjust height of WINDOW to display its buffer's contents exactly.
+(defun fit-window-to-buffer (&optional window max-height min-height max-width min-width)
+  "Adjust size of WINDOW to display its buffer's contents exactly.
 WINDOW must be a live window and defaults to the selected one.
 
-Optional argument MAX-HEIGHT specifies the maximum height of
-WINDOW and defaults to the height of WINDOW's frame.  Optional
-argument MIN-HEIGHT specifies the minimum height of WINDOW and
-defaults to `window-min-height'.  Both MAX-HEIGHT and MIN-HEIGHT
-are specified in lines and include the mode line and header line,
-if any.
-
-If WINDOW is a full height window, then if the option
-`fit-frame-to-buffer' is non-nil, this calls the function
-`fit-frame-to-buffer' to adjust the frame height.
-
-Return the number of lines by which WINDOW was enlarged or
-shrunk.  If an error occurs during resizing, return nil but don't
-signal an error.
+If WINDOW is part of a vertical combination, adjust WINDOW's
+height.  The new height is calculated from the number of lines of
+the accessible portion of its buffer.  The optional argument
+MAX-HEIGHT specifies a maximum height and defaults to the height
+of WINDOW's frame.  The optional argument MIN-HEIGHT specifies a
+minimum height and defaults to `window-min-height'.  Both
+MAX-HEIGHT and MIN-HEIGHT are specified in lines and include the
+mode line and header line, if any.
+
+If WINDOW is part of a horizontal combination and the value of
+the option `fit-window-to-buffer-horizontally' is non-nil, adjust
+WINDOW's width.  The new width of WINDOW is calculated from the
+maximum length of its buffer's lines that follow the current
+start position of WINDOW.  The optional argument MAX-WIDTH
+specifies a maximum width and defaults to the width of WINDOW's
+frame.  The optional argument MIN-WIDTH specifies a minimum width
+and defaults to `window-min-width'.  Both MAX-WIDTH and MIN-WIDTH
+are specified in columns and include fringes, margins and
+scrollbars, if any.
+
+If WINDOW is its frame's root window, then if the option
+`fit-frame-to-buffer' is non-nil, call `fit-frame-to-buffer' to
+adjust the frame's size.
 
 Note that even if this function makes WINDOW large enough to show
-_all_ lines of its buffer you might not see the first lines when
-WINDOW was scrolled."
+_all_ parts of its buffer you might not see the first part when
+WINDOW was scrolled.  If WINDOW is resized horizontally, you will
+not see the top of its buffer unless WINDOW starts at its minimum
+accessible position."
   (interactive)
   (setq window (window-normalize-window window t))
-  (cond
-   ((window-size-fixed-p window))
-   ((window-full-height-p window)
-    (when fit-frame-to-buffer
-      (fit-frame-to-buffer (window-frame window))))
-   (t
+  (if (eq window (frame-root-window window))
+      (when fit-frame-to-buffer
+	;; Fit WINDOW's frame to buffer.
+	(fit-frame-to-buffer
+	 (window-frame window) max-height min-height max-width min-width))
     (with-selected-window window
-      (let* ((height (window-total-size))
+      (let* ((frame (window-frame))
+	     (char-height (frame-char-height))
+	     (char-width (frame-char-width))
+	     (display-height (display-pixel-height))
+	     (total-height (window-total-height))
+	     (body-height (window-body-height))
+	     (body-width (window-body-width))
 	     (min-height
-	      ;; Adjust MIN-HEIGHT.
+	      ;; Sanitize MIN-HEIGHT.
 	      (if (numberp min-height)
 		  ;; Can't get smaller than `window-safe-min-height'.
 		  (max min-height window-safe-min-height)
 		;; Preserve header and mode line if present.
 		(window-min-size nil nil t)))
 	     (max-height
-	      ;; Adjust MAX-HEIGHT.
+	      ;; Sanitize MAX-HEIGHT.
 	      (if (numberp max-height)
-		  ;; Can't get larger than height of frame.
-		  (min max-height
-		       (window-total-size (frame-root-window window)))
-		;; Don't delete other windows.
-		(+ height (window-max-delta nil nil window))))
-	     ;; Make `desired-height' the height necessary to show
-	     ;; all of WINDOW's buffer, constrained by MIN-HEIGHT
-	     ;; and MAX-HEIGHT.
-	     (desired-height
-	      (max
-	       (min
-		(+ (count-screen-lines)
-		   ;; For non-minibuffers count the mode line, if any.
-		   (if (and (not (window-minibuffer-p window))
-			    mode-line-format)
-		       1
-		     0)
-		   ;; Count the header line, if any.
-		   (if header-line-format 1 0))
-		max-height)
-	       min-height))
-	     (desired-delta
-	      (- desired-height (window-total-size window)))
-	     (delta
-	      (if (> desired-delta 0)
-		  (min desired-delta
-		       (window-max-delta window nil window))
-		(max desired-delta
-		     (- (window-min-delta window nil window))))))
-	(condition-case nil
-	    (if (zerop delta)
-		;; Return zero if DELTA became zero in the process.
-		0
-	      ;; Don't try to redisplay with the cursor at the end on its
-	      ;; own line--that would force a scroll and spoil things.
-	      (when (and (eobp) (bolp) (not (bobp)))
-		;; It's silly to put `point' at the end of the previous
-		;; line and so maybe force horizontal scrolling.
-		(set-window-point window (line-beginning-position 0)))
-	      ;; Call `window-resize' with OVERRIDE argument equal WINDOW.
-	      (window-resize window delta nil window)
-	      ;; Check if the last line is surely fully visible.  If
-	      ;; not, enlarge the window.
-	      (let ((end (save-excursion
-			   (goto-char (point-max))
-			   (when (and (bolp) (not (bobp)))
-			     ;; Don't include final newline.
-			     (backward-char 1))
-			   (when truncate-lines
-			     ;; If line-wrapping is turned off, test the
-			     ;; beginning of the last line for
-			     ;; visibility instead of the end, as the
-			     ;; end of the line could be invisible by
-			     ;; virtue of extending past the edge of the
-			     ;; window.
-			     (forward-line 0))
-			   (point))))
-		(set-window-vscroll window 0)
-		;; This loop might in some rare pathological cases raise
-		;; an error - another reason for the `condition-case'.
-		(while (and (< desired-height max-height)
-			    (= desired-height (window-total-size))
-			    (not (pos-visible-in-window-p end)))
-		  (window-resize window 1 nil window)
-		  (setq desired-height (1+ desired-height)))))
-	  (error (setq delta nil)))
-	delta)))))
+		  (min (+ total-height (window-max-delta)) max-height)
+		(+ total-height (window-max-delta))))
+	     height)
+	(cond
+	 ;; If WINDOW is vertically combined, try to resize it
+	 ;; vertically.
+	 ((and (not (eq fit-window-to-buffer-horizontally 'only))
+	       (not (window-size-fixed-p window))
+	       (window-combined-p))
+	  ;; Vertically we always want to fit the entire buffer.
+	  ;; WINDOW'S height can't get larger than its frame's pixel
+	  ;; height.  Its width remains fixed.
+	  (setq height (cdr (window-buffer-pixel-size
+			     nil (point-min) (point-max)
+			     (* body-width char-width)
+			     (frame-pixel-height))))
+	  ;; Round height.
+	  (setq height (+ (/ height char-height)
+			  (if (zerop (% height char-height)) 0 1)))
+	  (unless (= height body-height)
+	    (window-resize-no-error
+	     window
+	     (- (max min-height
+		     (min max-height
+			  (+ total-height (- height body-height))))
+		total-height)
+	     nil window)))
+	 ;; If WINDOW is horizontally combined, try to resize it
+	 ;; horizontally.
+	 ((and fit-window-to-buffer-horizontally
+	       (not (window-size-fixed-p window t))
+	       (window-combined-p nil t))
+	  (let* ((display-width (display-pixel-width))
+		 (total-width (window-total-width))
+		 (min-width
+		  ;; Sanitize MIN-WIDTH.
+		  (if (numberp min-width)
+		      ;; Can't get smaller than `window-safe-min-width'.
+		      (max min-width window-safe-min-width)
+		    ;; Preserve fringes, margins, scrollbars if present.
+		    (window-min-size nil t t)))
+		 (max-width
+		  ;; Sanitize MAX-WIDTH.
+		  (if (numberp max-width)
+		      (min (+ total-width (window-max-delta nil t)) max-width)
+		    (+ total-width (window-max-delta nil t))))
+		 ;; When fitting horizontally, assume that WINDOW's start
+		 ;; position remains unaltered.  WINDOW can't get wider
+		 ;; than its frame's pixel width, its height remains
+		 ;; unaltered.
+		 (width (car (window-buffer-pixel-size
+			      nil (window-start) (point-max)
+			      (frame-pixel-width)
+			      ;; Add one char-height to assure that
+			      ;; we're on the safe side.  This
+			      ;; overshoots when the first line below
+			      ;; the bottom is wider than the window.
+			      (* body-height char-height)))))
+	    (setq width (+ (/ width char-width)
+			   (if (zerop (% width char-width)) 0 1)))
+	    (unless (= width body-width)
+	      (window-resize-no-error
+	       window
+	       (- (max min-width
+		       (min max-width
+			    (+ total-width (- width body-width))))
+		  total-width)
+	       t window)))))))))
 
 (defun window-safely-shrinkable-p (&optional window)
   "Return t if WINDOW can be shrunk without shrinking other windows.

=== modified file 'src/dispextern.h'
--- src/dispextern.h	2013-01-02 16:13:04 +0000
+++ src/dispextern.h	2013-02-02 14:56:54 +0000
@@ -2489,6 +2489,9 @@
      pixel_width with each call to produce_glyphs.  */
   int current_x;
 
+  /* Maximum x pixel position encountered within a display line.  */
+  int max_current_x;
+
   /* Accumulated width of continuation lines.  If > 0, this means we
      are currently in a continuation line.  This is initially zero and
      incremented/reset by display_line, move_it_to etc.  */

=== modified file 'src/xdisp.c'
--- src/xdisp.c	2013-01-05 21:18:01 +0000
+++ src/xdisp.c	2013-02-02 14:56:32 +0000
@@ -2996,7 +2996,7 @@
 
 	  it->current_y = first_y;
 	  it->vpos = 0;
-	  it->current_x = it->hpos = 0;
+	  it->current_x = it->max_current_x = it->hpos = 0;
 	}
     }
 }
@@ -8814,7 +8814,10 @@
 
 	  /* If TO_CHARPOS is reached or ZV, we don't have to do more.  */
 	  if (skip == MOVE_POS_MATCH_OR_ZV)
-	    reached = 5;
+	    {
+	      it->max_current_x = max (it->current_x, it->max_current_x);
+	      reached = 5;
+	    }
 	  else if (skip == MOVE_X_REACHED)
 	    {
 	      /* If TO_X was reached, we want to know whether TO_Y is
@@ -8883,6 +8886,8 @@
 		      skip = move_it_in_display_line_to
 			(it, -1, prev_x, MOVE_TO_X);
 		    }
+
+		  it->max_current_x = max (it->current_x, it->max_current_x);
 		  reached = 6;
 		}
 	    }
@@ -8908,15 +8913,18 @@
       switch (skip)
 	{
 	case MOVE_POS_MATCH_OR_ZV:
+	  it->max_current_x = max (it->current_x, it->max_current_x);
 	  reached = 8;
 	  goto out;
 
 	case MOVE_NEWLINE_OR_CR:
+	  it->max_current_x = max (it->current_x, it->max_current_x);
 	  set_iterator_to_next (it, 1);
 	  it->continuation_lines_width = 0;
 	  break;
 
 	case MOVE_LINE_TRUNCATED:
+	  it->max_current_x = it->last_visible_x;
 	  it->continuation_lines_width = 0;
 	  reseat_at_next_visible_line_start (it, 0);
 	  if ((op & MOVE_TO_POS) != 0
@@ -8928,6 +8936,7 @@
 	  break;
 
 	case MOVE_LINE_CONTINUED:
+	  it->max_current_x = it->last_visible_x;
 	  /* For continued lines ending in a tab, some of the glyphs
 	     associated with the tab are displayed on the current
 	     line.  Since it->current_x does not include these glyphs,
@@ -9326,6 +9335,93 @@
 	  && it->dpvec + it->current.dpvec_index != it->dpend);
 }
 
+DEFUN ("window-buffer-pixel-size", Fwindow_buffer_pixel_size, Swindow_buffer_pixel_size, 0, 5, 0,
+       doc: /* Return size of WINDOW's buffer in pixels.
+WINDOW must be a live window and defaults to the selected one.  The
+return value is a cons of the maximum pixel-width of any line and the
+maximum pixel-height of all lines.
+
+The optional argument X_LIMIT, if non-nil, specifies the maximum
+pixel-width that can be returned.  X_LIMIT nil or omitted, means to use
+the pixel-width of WINDOW's body; use this if you do not intend to
+change the width of WINDOW.  Use the maximum width WINDOW can be
+expanded to if you intend to change WINDOW's width.
+
+The optional argument Y_LIMIT, if non-nil, specifies the maximum
+pixel-height to scan.  Lines starting below Y_LIMIT are not scanned.
+Since calculating the pixel-height of a large buffer can take some time,
+it makes sense to specify this argument if the size of the buffer is
+unknown.  */)
+  (Lisp_Object window, Lisp_Object from, Lisp_Object to, Lisp_Object x_limit, Lisp_Object y_limit)
+{
+  struct window *w = decode_live_window (window);
+  Lisp_Object buf, value;
+  struct buffer *b;
+  struct it it;
+  struct buffer *old_buffer = NULL;
+  ptrdiff_t start, end;
+  struct text_pos startp, endp;
+  void *itdata = NULL;
+  int max_y = -1;
+
+  buf = w->buffer;
+  CHECK_BUFFER (buf);
+  b = XBUFFER (buf);
+
+  if (NILP (from))
+    start = BEGV;
+  else
+    {
+      CHECK_NUMBER_COERCE_MARKER (from);
+      start = min (max (XINT (from), BEGV), ZV);
+    }
+
+  if (NILP (to))
+    end = ZV;
+  else
+    {
+      CHECK_NUMBER_COERCE_MARKER (to);
+      end = max (start, min (XINT (to), ZV));
+    }
+
+  if (b != current_buffer)
+    {
+      old_buffer = current_buffer;
+      set_buffer_internal (b);
+    }
+
+  if (!NILP (y_limit))
+    {
+      CHECK_NUMBER (y_limit);
+      max_y = XINT (y_limit);
+    }
+
+  itdata = bidi_shelve_cache ();
+  SET_TEXT_POS (startp, start, CHAR_TO_BYTE (start));
+  start_display (&it, w, startp);
+
+  if (!NILP (x_limit))
+    {
+      CHECK_NUMBER (x_limit);
+      it.last_visible_x = XINT (x_limit);
+    }
+
+  /* Actually, we never want move_it_to stop at to_x.  But to make sure
+     that move_it_in_display_line_to always moves far enough, we set it
+     to MOST_POSITIVE_FIXNUM and specify MOVE_TO_X.  */
+  move_it_to (&it, end, MOST_POSITIVE_FIXNUM, max_y, -1,
+	      MOVE_TO_POS | MOVE_TO_X | MOVE_TO_Y);
+  last_height = 0;
+  value = Fcons (make_number (it.max_current_x),
+		 make_number (it.current_y));
+/** 		 make_number (line_bottom_y (&it))); **/
+  bidi_unshelve_cache (itdata, 0);
+
+  if (old_buffer)
+    set_buffer_internal (old_buffer);
+
+  return value;
+}
 
 /***********************************************************************
 			       Messages
@@ -28808,6 +28904,7 @@
   defsubr (&Sformat_mode_line);
   defsubr (&Sinvisible_p);
   defsubr (&Scurrent_bidi_paragraph_direction);
+  defsubr (&Swindow_buffer_pixel_size);
 
   DEFSYM (Qmenu_bar_update_hook, "menu-bar-update-hook");
   DEFSYM (Qoverriding_terminal_local_map, "overriding-terminal-local-map");
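
Editorial sketch, not part of the patch: the Lisp code above converts the pixel sizes returned by `window-buffer-pixel-size' into whole lines and columns by rounding up to the next multiple of the character height or width.  The same arithmetic can be written in C roughly as follows.  The helper names `round_up_to_cell' and `cells_needed' are hypothetical and do not exist in Emacs.

```c
#include <assert.h>

/* Round PIXELS up to the next multiple of CELL, where CELL is a
   character width or height in pixels.  A value that is already a
   multiple of CELL is returned unchanged.  (Hypothetical helper.)  */
int
round_up_to_cell (int pixels, int cell)
{
  assert (cell > 0 && pixels >= 0);
  int remainder = pixels % cell;
  return remainder == 0 ? pixels : pixels + (cell - remainder);
}

/* Number of whole cells (lines or columns) needed to show PIXELS
   pixels.  (Hypothetical helper.)  */
int
cells_needed (int pixels, int cell)
{
  return round_up_to_cell (pixels, cell) / cell;
}
```

With a 16-pixel character height, a text height of 33 pixels would need three lines, while exactly 32 pixels would need two.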



Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#13399; Package emacs. (Sat, 02 Feb 2013 17:54:01 GMT)

Message #98 received at 13399 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: martin rudalics <rudalics <at> gmx.at>
Cc: monnier <at> iro.umontreal.ca, 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50;
	Word-wrap can't wrap at zero-width space U-200B
Date: Sat, 02 Feb 2013 19:52:19 +0200
> Date: Sat, 02 Feb 2013 17:48:42 +0100
> From: martin rudalics <rudalics <at> gmx.at>
> CC: Stefan Monnier <monnier <at> iro.umontreal.ca>, 13399 <at> debbugs.gnu.org
> 
> I rewrote `fit-window-to-buffer' and `fit-frame-to-buffer' using the
> display iterator.  Please have a look at the attached patch.

I looked at the C parts.

> --- src/dispextern.h	2013-01-02 16:13:04 +0000
> +++ src/dispextern.h	2013-02-02 14:56:54 +0000
> @@ -2489,6 +2489,9 @@
>       pixel_width with each call to produce_glyphs.  */
>    int current_x;
>  
> +  /* Maximum x pixel position encountered within a display line.  */
> +  int max_current_x;

Adding a struct member for the sake of just one user sounds
unjustified.  Can we instead make move_it_to accumulate the value
internally and return it?

In any case, the comment is inaccurate, since the value is accumulated
across all the display lines traversed by the iterator, not computed
per display line.

> +DEFUN ("window-buffer-pixel-size", Fwindow_buffer_pixel_size, Swindow_buffer_pixel_size, 0, 5, 0,

Why not window-text-pixel-size?  The "buffer" part doesn't belong
here, I think.

> +       doc: /* Return size of WINDOW's buffer in pixels.
> +WINDOW must be a live window and defaults to the selected one.  The
> +return value is a cons of the maximum pixel-width of any line and the
> +maximum pixel-height of all lines.
> +
> +The optional argument X_LIMIT, if non-nil, specifies the maximum
> +pixel-width that can be returned.  X_LIMIT nil or omitted, means to use
> +the pixel-width of WINDOW's body; use this if you do not intend to
> +change the width of WINDOW.  Use the maximum width WINDOW can be
> +expanded to if you intend to change WINDOW's width.
> +
> +The optional argument Y_LIMIT, if non-nil, specifies the maximum
> +pixel-height to scan.  Lines starting below Y_LIMIT are not scanned.

"Lines starting below Y_LIMIT" is ambiguous.  I suggest

  Lines whose y-coordinate is beyond Y_LIMIT will not be scanned.

> +Since calculating the pixel-height of a large buffer can take some time,
> +it makes sense to specify this argument if the size of the buffer is
> +unknown.  */)

The doc string keeps silent about arguments FROM and TO.

> +  /* Actually, we never want move_it_to stop at to_x.  But to make sure
> +     that move_it_in_display_line_to always moves far enough, we set it
> +     to MOST_POSITIVE_FIXNUM and specify MOVE_TO_X.  */
> +  move_it_to (&it, end, MOST_POSITIVE_FIXNUM, max_y, -1,
> +	      MOVE_TO_POS | MOVE_TO_X | MOVE_TO_Y);

Did you test what this does when END is covered by a display string?

Thanks.
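
Editorial sketch, not part of the message above: the suggestion to let the scanning function return the maximum x position, instead of storing it in a new struct member, might look roughly like this toy model.  Here an array of per-line pixel widths stands in for the real display iterator, and `scan_max_x' and `combine_max_x' are hypothetical names, not Emacs functions.  Capping at the limit mirrors the patch, which records `last_visible_x' for truncated and continued lines.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a display-line scan: LINE_WIDTHS holds the pixel
   width of each of N lines.  Return the widest line seen, capped at
   X_LIMIT, rather than recording it in a shared iterator struct.
   (Hypothetical function.)  */
int
scan_max_x (const int *line_widths, size_t n, int x_limit)
{
  assert (x_limit >= 0);
  int max_x = 0;
  for (size_t i = 0; i < n; i++)
    {
      /* A line wider than the limit is continued or truncated, so it
         contributes exactly X_LIMIT.  */
      int x = line_widths[i] > x_limit ? x_limit : line_widths[i];
      if (x > max_x)
        max_x = x;
    }
  return max_x;
}

/* Combine the results of two successive scans on the caller's side.
   (Hypothetical function.)  */
int
combine_max_x (int first, int second)
{
  return first > second ? first : second;
}
```

Under this assumption, a caller that scans a buffer in several pieces keeps its own running maximum, and the iterator struct stays the same size.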




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#13399; Package emacs. (Sat, 02 Feb 2013 18:23:01 GMT)

Message #101 received at 13399 <at> debbugs.gnu.org:

From: martin rudalics <rudalics <at> gmx.at>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: monnier <at> iro.umontreal.ca, 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50; Word-wrap can't wrap at zero-width space
	U-200B
Date: Sat, 02 Feb 2013 19:20:44 +0100
>> +  /* Maximum x pixel position encountered within a display line.  */
>> +  int max_current_x;
>
> Adding a struct member for the sake of just one user sounds
> unjustified.  Can we instead make move_it_to accumulate the value
> internally and return it?

I don't know.  IIUC most iterator functions never return something
useful.  And if one wants to glue together the results of subsequent
calls of move_it_to, it might make sense to not reset the value
internally.

> In any case, the comment is inaccurate, since the value is accumulated
> across all the display lines traversed by the iterator, not computed
> per display line.

Would "within any line traversed by the iterator" be better?

>> +DEFUN ("window-buffer-pixel-size", Fwindow_buffer_pixel_size, Swindow_buffer_pixel_size, 0, 5, 0,
>
> Why not window-text-pixel-size?  The "buffer" part doesn't belong
> here, I think.

Since I also look at buffer portions outside the window, such a term
wouldn't be very accurate either.

> "Lines starting below Y_LIMIT" is ambiguous.  I suggest
>
>   Lines whose y-coordinate is beyond Y_LIMIT will not be scanned.

OK.

>> +Since calculating the pixel-height of a large buffer can take some time,
>> +it makes sense to specify this argument if the size of the buffer is
>> +unknown.  */)
>
> The doc string keeps silent about arguments FROM and TO.

... because I only added them later on ;-) Initially I always scanned
from `point-min' to min (`point-max', y_limit) but later I noticed that
with side-by-side windows it makes sense to start at `window-start'.

>> +  /* Actually, we never want move_it_to stop at to_x.  But to make sure
>> +     that move_it_in_display_line_to always moves far enough, we set it
>> +     to MOST_POSITIVE_FIXNUM and specify MOVE_TO_X.  */
>> +  move_it_to (&it, end, MOST_POSITIVE_FIXNUM, max_y, -1,
>> +	      MOVE_TO_POS | MOVE_TO_X | MOVE_TO_Y);
>
> Did you test what this does when END is covered by a display string?

No.  I didn't try invisible text and other atrocities either.  In fact,
I never experimented with a non-ZV end position so far.  Which problems
do you see?

martin
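
Editorial sketch, not part of the message above: the patch forces the FROM and TO arguments into the accessible portion of the buffer and keeps the end from preceding the start.  A standalone C version of that clamping could look like this.  The `struct range' type and the `clamp_range' function are hypothetical, and `begv' and `zv' are plain parameters standing in for the BEGV and ZV macros.

```c
#include <assert.h>

/* A scan range with START <= END.  (Hypothetical type.)  */
struct range
{
  long start;
  long end;
};

/* Clamp FROM and TO into [BEGV, ZV] and make sure that the end does
   not precede the start, following the patch's
   start = min (max (from, BEGV), ZV) and
   end = max (start, min (to, ZV)).  (Hypothetical function.)  */
struct range
clamp_range (long from, long to, long begv, long zv)
{
  assert (begv <= zv);
  struct range r;
  long lo = from < begv ? begv : from;
  r.start = lo > zv ? zv : lo;
  long hi = to > zv ? zv : to;
  r.end = hi < r.start ? r.start : hi;
  return r;
}
```

A reversed request such as FROM 50 and TO 10 in a buffer spanning 1 to 100 collapses to the empty range 50..50 instead of signaling an error.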




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#13399; Package emacs. (Sat, 02 Feb 2013 18:38:02 GMT)

Message #104 received at 13399 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: martin rudalics <rudalics <at> gmx.at>
Cc: monnier <at> iro.umontreal.ca, 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50;
	Word-wrap can't wrap at zero-width space U-200B
Date: Sat, 02 Feb 2013 20:36:10 +0200
> Date: Sat, 02 Feb 2013 19:20:44 +0100
> From: martin rudalics <rudalics <at> gmx.at>
> CC: monnier <at> iro.umontreal.ca, 13399 <at> debbugs.gnu.org
> 
>  >> +  /* Maximum x pixel position encountered within a display line.  */
>  >> +  int max_current_x;
>  >
>  > Adding a struct member for the sake of just one user sounds
>  > unjustified.  Can we instead make move_it_to accumulate the value
>  > internally and return it?
> 
> I don't know.  IIUC most iterator functions never return something
> useful.

Because there was no need.  Now there is.

> And if one wants to glue together the results of subsequent calls of
> move_it_to, it might make sense to not reset the value internally.

It should be simple enough to add them up externally.

Adding a member to 'struct it' has the downside of enlarging the
structure, which slows down any code that copies such structs.  We are
going to punish all the users for the benefit of just one.  So I think
we should avoid that.

>  > In any case, the comment is inaccurate, since the value is accumulated
>  > across all the display lines traversed by the iterator, not computed
>  > per display line.
> 
> Would "within any line traversed by the iterator" be better?

Yes.

> 
>  >> +DEFUN ("window-buffer-pixel-size", Fwindow_buffer_pixel_size, Swindow_buffer_pixel_size, 0, 5, 0,
>  >
>  > Why not window-text-pixel-size?  The "buffer" part doesn't belong
>  > here, I think.
> 
> Since I also look at buffer portions outside the window, such a term
> wouldn't be very accurate either.

Then how about text-pixel-size?

>  >> +  /* Actually, we never want move_it_to stop at to_x.  But to make sure
>  >> +     that move_it_in_display_line_to always moves far enough, we set it
>  >> +     to MOST_POSITIVE_FIXNUM and specify MOVE_TO_X.  */
>  >> +  move_it_to (&it, end, MOST_POSITIVE_FIXNUM, max_y, -1,
>  >> +	      MOVE_TO_POS | MOVE_TO_X | MOVE_TO_Y);
>  >
>  > Did you test what this does when END is covered by a display string?
> 
> No.  I didn't try invisible text and other atrocities either.  In fact,
> I never experimented with a non-ZV end position so far.  Which problems
> do you see?

move_it_to might overshoot in those cases, i.e. you get more pixels
than strictly needed.  But OTOH, I see no way of asking for more
specific limits in these cases, nor a use-case where that could be
needed.  So I guess we are okay until someone hollers.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#13399; Package emacs. (Sun, 03 Feb 2013 09:46:02 GMT)

Message #107 received at 13399 <at> debbugs.gnu.org:

From: martin rudalics <rudalics <at> gmx.at>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: monnier <at> iro.umontreal.ca, 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50; Word-wrap can't wrap at zero-width space
	U-200B
Date: Sun, 03 Feb 2013 10:44:07 +0100
>> I don't know.  IIUC most iterator functions never return something
>> useful.
>
> Because there was no need.  Now there is.

But hardly anyone delving into move_it_to wants to know that it returns
the maximum length of any row it encountered.  So I'm quite reluctant to
break this tradition.

> Adding a member to 'struct it' has the downside of enlarging the
> structure, which slows down any code that copies such structs.  We are
> going to punish all the users for the benefit of just one.  So I think
> we should avoid that.

How about making this static in xdisp.c like last_max_ascent and
last_height?

>>  >> +DEFUN ("window-buffer-pixel-size", Fwindow_buffer_pixel_size, Swindow_buffer_pixel_size, 0, 5, 0,
>>  >
>>  > Why not window-text-pixel-size?  The "buffer" part doesn't belong
>>  > here, I think.
>>
>> Since I also look at buffer portions outside the window, such a term
>> wouldn't be very accurate either.
>
> Then how about text-pixel-size?

This would imply that I had to accept any text as argument.  Maybe
`window-buffer-text-pixel-size' would be most accurate - after all it's
about a window, its buffer, and (part of) that buffer's text.

> move_it_to might overshoot in those cases, i.e. you get more pixels
> than strictly needed.

I earlier tried (line_bottom_y (&it)) as the return value but that
simply added the last line's height twice.  Anyway, in my limited
experience with this function it's still better to overshoot than
getting too few pixels.  The thing that troubled me most was this

	     If TO_X is not specified, use a TO_X of zero.  The reason
	     is to make the outcome of this function more predictable.
	     If we didn't use TO_X == 0, we would stop at the end of
	     the line which is probably not what a caller would expect
	     to happen.

which caused me to exit a line too early and consequently not show the
end of the last visible line of a window when it was the longest one.  I
eventually fixed this by calling move_it_to with to_x equal to
MOST_POSITIVE_FIXNUM.  Yet, having to_x equal -1 _not_ go to
last_visible_x seems quite arbitrary to me.

> But OTOH, I see no way of asking for more
> specific limits in these cases, nor a use-case where that could be
> needed.  So I guess we are okay until someone hollers.

I think so too.  I didn't try testing any special cases and IIUC nobody
tried my patch so far.

martin




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#13399; Package emacs. (Sun, 03 Feb 2013 16:03:01 GMT)

Message #110 received at 13399 <at> debbugs.gnu.org:

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: martin rudalics <rudalics <at> gmx.at>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50;
	Word-wrap can't wrap at zero-width space U-200B
Date: Sun, 03 Feb 2013 11:01:20 -0500
>> Then how about text-pixel-size?
> This would imply that I had to accept any text as argument.

No, I think it's OK.  window-buffer-pixel-size was OK as well, as far as
I'm concerned.


        Stefan




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#13399; Package emacs. (Sun, 03 Feb 2013 18:59:01 GMT)

Message #113 received at 13399 <at> debbugs.gnu.org:

From: martin rudalics <rudalics <at> gmx.at>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50; Word-wrap can't wrap at zero-width space
	U-200B
Date: Sun, 03 Feb 2013 19:57:31 +0100
Just to recite the initial problem and your proposal:

>> With emacs -Q evaluate
>>
>> (with-current-buffer (get-buffer-create "*foo*")
>>    (dotimes (i 1000)
>>      (insert "1234​")) ; U-200B
>>    (setq word-wrap t)
>>    (display-buffer "*foo*"))
>>
>> where the character after 1234 is a zero-width space character with
>> unicode code point U-200B.  As can be seen in the window showing *foo*,
>> lines are not regularly wrapped at that character.
>
> You mean, not wrapped at all.  Witness the continuation bitmaps in the
> fringes, which shouldn't appear when a line is wrapped.
>
>> Doing
>>
>> (with-current-buffer (get-buffer-create "*foo*")
>>    (dotimes (i 1000)
>>      (insert "1234 "))
>>    (setq word-wrap t)
>>    (display-buffer "*foo*"))
>>
>> instead wraps lines as expected.
>
> If anything, this is a missing feature, since word-wrap is explicitly
> coded to break lines only on SPC and TAB characters.  See the
> IT_DISPLAYING_WHITESPACE macro in xdisp.c.
>
> If we want to add more characters to the set, we should probably
> arrange a special char-table for this, and have it exposed to Lisp, so
> it could be customized.  Patches are welcome.

I now rewrote IT_DISPLAYING_WHITESPACE as

#define IT_DISPLAYING_WHITESPACE(it)					\
  ((it->what == IT_CHARACTER						\
    && !NILP (CHAR_TABLE_REF (Vword_wrap_chars, it->c)))		\
   || ((STRINGP (it->string)						\
	&& !NILP (CHAR_TABLE_REF					\
		   (Vword_wrap_chars,					\
		      SREF (it->string, IT_STRING_BYTEPOS (*it)))))	\
       || (it->s && !NILP (CHAR_TABLE_REF				\
			    (Vword_wrap_chars,				\
			       it->s[IT_BYTEPOS (*it)])))		\
       || (IT_BYTEPOS (*it) < ZV_BYTE					\
	   && !NILP (CHAR_TABLE_REF					\
		      (Vword_wrap_chars,				\
			 (*BYTE_POS_ADDR (IT_BYTEPOS (*it))))))))	\


and have a character table called `word-wrap-chars' such that
(aref word-wrap-chars ?​) returns t, but it doesn't wrap at a
U-200B character.  Is there some additional wrinkle like some
hardcoded space/tab in the word-wrap code I have to observe?
Or is my code wrong?

Thanks, martin
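
For illustration, the table-driven idea can be sketched in standalone C as follows. This is not the Emacs char-table API: `wrap_chars` and `wrap_char_p` are hypothetical names, and a sorted array stands in for a Lisp-visible char-table. The one point it tries to show is that the lookup key has to be a fully decoded code point (0x200B for the zero-width space), not a single byte of the character's encoding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a Lisp-visible char-table: the sorted set
   of code points at which a line may be wrapped.  */
const uint32_t wrap_chars[] = {
  0x0009,  /* TAB */
  0x0020,  /* SPC */
  0x200B   /* ZERO WIDTH SPACE */
};

/* Return true if code point C is in the sorted array TABLE of N
   entries (binary search).  */
bool
wrap_char_p (const uint32_t *table, size_t n, uint32_t c)
{
  size_t lo = 0, hi = n;

  assert (table != NULL || n == 0);
  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;

      if (table[mid] == c)
        return true;
      if (table[mid] < c)
        lo = mid + 1;
      else
        hi = mid;
    }
  return false;
}
```

With such a table, adding another wrap character is a data change rather than a change to the predicate.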





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#13399; Package emacs. (Sun, 03 Feb 2013 19:34:01 GMT)

Message #116 received at 13399 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: martin rudalics <rudalics <at> gmx.at>
Cc: monnier <at> iro.umontreal.ca, 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50;
	Word-wrap can't wrap at zero-width space U-200B
Date: Sun, 03 Feb 2013 21:32:01 +0200
> Date: Sun, 03 Feb 2013 10:44:07 +0100
> From: martin rudalics <rudalics <at> gmx.at>
> CC: monnier <at> iro.umontreal.ca, 13399 <at> debbugs.gnu.org
> 
>  >> I don't know.  IIUC most iterator functions never return something
>  >> useful.
>  >
>  > Because there was no need.  Now there is.
> 
> But hardly anyone delving into move_it_to wants to know that it returns
> the maximum length of any row it encountered.  So I'm quite reluctant to
> break this tradition.

There's no tradition.  Callers that don't care about the return value
are free to ignore it.  I don't see a problem here.  We have many
internal functions that return values which might be useful to
someone.

> How about making this static in xdisp.c like last_max_ascent and
> last_height?

That'd be even worse, IMO.

>  > Then how about text-pixel-size?
> 
> This would imply that I had to accept any text as argument.

I don't think so, but maybe you will like text-region-pixel-size?

> Maybe
> `window-buffer-text-pixel-size' would be most accurate - after all it's
> about a window, its buffer, and (part of) that buffer's text.

It hints that it returns size of text in a window, which is false.

But we are bike-shedding.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#13399; Package emacs. (Sun, 03 Feb 2013 19:47:02 GMT)

Message #119 received at 13399 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: martin rudalics <rudalics <at> gmx.at>
Cc: 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50;
	Word-wrap can't wrap at zero-width space U-200B
Date: Sun, 03 Feb 2013 21:45:29 +0200
> Date: Sun, 03 Feb 2013 19:57:31 +0100
> From: martin rudalics <rudalics <at> gmx.at>
> CC: 13399 <at> debbugs.gnu.org
> 
> I now rewrote IT_DISPLAYING_WHITESPACE as
> 
> #define IT_DISPLAYING_WHITESPACE(it)					\
>    ((it->what == IT_CHARACTER						\
>      && !NILP (CHAR_TABLE_REF (Vword_wrap_chars, it->c)))		\
>     || ((STRINGP (it->string)						\
> 	&& !NILP (CHAR_TABLE_REF					\
> 		   (Vword_wrap_chars,					\
> 		      SREF (it->string, IT_STRING_BYTEPOS (*it)))))	\
>         || (it->s && !NILP (CHAR_TABLE_REF				\
> 			    (Vword_wrap_chars,				\
> 			       it->s[IT_BYTEPOS (*it)])))		\
>         || (IT_BYTEPOS (*it) < ZV_BYTE					\
> 	   && !NILP (CHAR_TABLE_REF					\
> 		      (Vword_wrap_chars,				\
> 			 (*BYTE_POS_ADDR (IT_BYTEPOS (*it))))))))	\
> 
> 
> and have a character table called `word-wrap-chars' such that
> (aref word-wrap-chars ?​) returns t, but it doesn't wrap at a
> U-200B character.  Is there some additional wrinkle like some
> hardcoded space/tab in the word-wrap code I have to observe?
> Or is my code wrong?

Does CHAR_TABLE_REF return the right value?

Also, the code is wrong here:

   SREF (it->string, IT_STRING_BYTEPOS (*it))

and here:

   it->s[IT_BYTEPOS (*it)]

and here:

   *BYTE_POS_ADDR (IT_BYTEPOS (*it))

in that it assumes the character is always one byte, which is
clearly wrong with non-ASCII characters.  You should instead use
FETCH_CHAR and STRING_CHAR.
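
To make the one-byte problem concrete, here is a minimal standalone C sketch. It is not Emacs code: `decode_utf8_at` is a hypothetical helper that stands in for the kind of multibyte decoding the macros named above perform, and it assumes valid UTF-8 input and a byte position on a character boundary. U+200B is encoded in UTF-8 as the three bytes E2 80 8B, so reading a single byte at that position yields 0xE2, which is not the character being tested.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not an Emacs function: decode the UTF-8
   sequence that starts at byte offset BYTEPOS in S.  Assumes S holds
   valid UTF-8 and BYTEPOS is on a character boundary.  */
uint32_t
decode_utf8_at (const unsigned char *s, size_t bytepos)
{
  const unsigned char *p = s + bytepos;

  if (p[0] < 0x80)                      /* 1 byte: plain ASCII */
    return p[0];
  if ((p[0] & 0xE0) == 0xC0)            /* 2-byte sequence */
    return ((uint32_t) (p[0] & 0x1F) << 6) | (uint32_t) (p[1] & 0x3F);
  if ((p[0] & 0xF0) == 0xE0)            /* 3-byte sequence */
    return ((uint32_t) (p[0] & 0x0F) << 12)
           | ((uint32_t) (p[1] & 0x3F) << 6)
           | (uint32_t) (p[2] & 0x3F);
  /* Otherwise it must be a 4-byte sequence.  */
  assert ((p[0] & 0xF8) == 0xF0);
  return ((uint32_t) (p[0] & 0x07) << 18)
         | ((uint32_t) (p[1] & 0x3F) << 12)
         | ((uint32_t) (p[2] & 0x3F) << 6)
         | (uint32_t) (p[3] & 0x3F);
}

/* "1234" followed by U+200B, which UTF-8 encodes as E2 80 8B.  */
const unsigned char sample[] = "1234\xE2\x80\x8B";
```

A table lookup keyed on `sample[4]` would therefore test 0xE2, while one keyed on the decoded value at byte offset 4 tests the intended code point.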





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#13399; Package emacs. (Mon, 04 Feb 2013 17:06:02 GMT)

Message #122 received at 13399 <at> debbugs.gnu.org:

From: martin rudalics <rudalics <at> gmx.at>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: monnier <at> iro.umontreal.ca, 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50; Word-wrap can't wrap at zero-width space
	U-200B
Date: Mon, 04 Feb 2013 18:04:14 +0100
[Message part 1 (text/plain, inline)]
> But we are bike-shedding.

Indeed.  How about the attached patch?

martin
[fit-window-to-buffer.diff (text/plain, inline)]
=== modified file 'lisp/window.el'
--- lisp/window.el	2013-01-02 16:13:04 +0000
+++ lisp/window.el	2013-02-04 10:17:41 +0000
@@ -6074,211 +6074,428 @@
 			     (eobp)
 			     window))))
 
-;;; Resizing buffers to fit their contents exactly.
+;;; Resizing windows and frames to fit their contents exactly.
+(defcustom fit-window-to-buffer-horizontally nil
+  "Non-nil means `fit-window-to-buffer' can resize windows horizontally.
+If this is nil, `fit-window-to-buffer' never resizes windows
+horizontally.  If this is `only', it can resize windows
+horizontally only.  Any other value means `fit-window-to-buffer'
+can resize windows in both dimensions."
+  :type 'boolean
+  :version "24.4"
+  :group 'help)
+
 (defcustom fit-frame-to-buffer nil
-  "Non-nil means `fit-window-to-buffer' can resize frames.
+  "Non-nil means `fit-frame-to-buffer' can resize frames.
 A frame can be resized if and only if its root window is a live
-window.  The height of the root window is subject to the values
-of `fit-frame-to-buffer-max-height' and `window-min-height'."
+window.  If this is `horizontally', frames can be resized
+horizontally only.  If this is `vertically', frames can be
+resized vertically only.  Any other non-nil value means frames
+can be resized in both dimensions.  See also
+`fit-frame-to-buffer-margins' and `fit-frame-to-buffer-sizes'."
   :type 'boolean
-  :version "24.3"
+  :version "24.4"
   :group 'help)
 
-(defcustom fit-frame-to-buffer-bottom-margin 4
-  "Bottom margin for the command `fit-frame-to-buffer'.
-This is the number of lines that function leaves free at the bottom of
-the display, in order to not obscure any system task bar or panel.
-If you do not have one (or if it is vertical) you might want to
-reduce this.  If it is thicker, you might want to increase this."
-  ;; If you set this too small, fit-frame-to-buffer can shift the
-  ;; frame up to avoid the panel.
-  :type 'integer
-  :version "24.3"
-  :group 'windows)
-
-(defun fit-frame-to-buffer (&optional frame max-height min-height)
+(defcustom fit-frame-to-buffer-margins '(0 0 0 72)
+  "Margins around frame for `fit-frame-to-buffer'.
+This list specifies the numbers of pixels to be left free on the
+left, above, the right, and below a frame that shall be fit to
+its buffer.  The value specified here can be overridden for a
+specific frame by that frame's `fit-frame-to-buffer-margins'
+parameter, if present.
+
+On some window systems the calculation of frame sizes can be
+incorrect.  Increasing the value of the third and/or fourth
+element of this variable can fix that.
+
+See also `fit-frame-to-buffer-sizes'."
+  :version "24.4"
+  :type '(list
+	  (integer :tag "Left" :size 5)
+	  (integer :tag " Above" :size 5)
+	  (integer :tag " Right" :size 5)
+	  (integer :tag " Below" :size 5))
+  :group 'windows)
+
+(defcustom fit-frame-to-buffer-sizes '(nil nil nil nil)
+  "Size boundaries of frame for `fit-frame-to-buffer'.
+This list specifies the total maximum and minimum lines and
+maximum and minimum columns of the root window of any frame that
+shall be fit to its buffer.  If any of these values is non-nil,
+it will override the value supplied by the respective arguments
+of `fit-frame-to-buffer'.
+
+On window systems where the menubar can wrap, fitting a frame to
+its buffer may swallow the last line(s).  Specifying an
+appropriate minimum width value here can avoid such wrapping.
+
+See also `fit-frame-to-buffer-margins'."
+  :version "24.4"
+  :type '(list
+	  (choice
+	   :tag "Maximum Height"
+	   :value nil
+	   :format "%[MaxHeight%] %v  "
+	   (const :tag "None" :format "%t" nil)
+	   (integer :tag "Lines" :size 5))
+	  (choice
+	   :tag "Minimum Height"
+	   :value nil
+	   :format "%[MinHeight%] %v  "
+	   (const :tag "None" :format "%t" nil)
+	   (integer :tag "Lines" :size 5))
+	  (choice
+	   :tag "Maximum Width"
+	   :value nil
+	   :format "%[MaxWidth%] %v  "
+	   (const :tag "None" :format "%t" nil)
+	   (integer :tag "Columns" :size 5))
+	  (choice
+	   :tag "Minimum Width"
+	   :value nil
+	   :format "%[MinWidth%] %v\n"
+	   (const :tag "None" :format "%t" nil)
+	   (integer :tag "Columns" :size 5)))
+  :group 'windows)
+
+(defun window--sanitize-margin (margin left right)
+  "Return MARGIN if it's a number between LEFT and RIGHT."
+  (if (and (numberp margin)
+	   (<= left (- right margin)) (<= margin right))
+      margin
+    0))
+
+(defun fit-frame-to-buffer (&optional frame max-height min-height max-width min-width)
   "Adjust height of FRAME to display its buffer contents exactly.
 FRAME can be any live frame and defaults to the selected one.
+MAX-HEIGHT, MIN-HEIGHT, MAX-WIDTH and MIN-WIDTH specify bounds on
+the new total size of FRAME's root window.
 
-Optional argument MAX-HEIGHT specifies the maximum height of FRAME.
-It defaults to the height of the display below the current
-top line of FRAME, minus `fit-frame-to-buffer-bottom-margin'.
-Optional argument MIN-HEIGHT specifies the minimum height of FRAME.
-The default corresponds to `window-min-height'."
+The option `fit-frame-to-buffer' controls whether this function
+has any effect.  New position and size of FRAME are additionally
+determined by the options `fit-frame-to-buffer-sizes' and
+`fit-frame-to-buffer-margins' or the corresponding parameters of
+FRAME.  This function can fail to fit the buffer's height when
+`word-wrap' is turned on in that buffer."
   (interactive)
   (setq frame (window-normalize-frame frame))
-  (let* ((root (frame-root-window frame))
-	 (frame-min-height
-	  (+ (- (frame-height frame) (window-total-size root))
-	     window-min-height))
-	 (frame-top (frame-parameter frame 'top))
-	 (top (if (consp frame-top)
-		  (funcall (car frame-top) (cadr frame-top))
-		frame-top))
-	 (frame-max-height
-	  (- (/ (- (x-display-pixel-height frame) top)
-		(frame-char-height frame))
-	     fit-frame-to-buffer-bottom-margin))
-	 (compensate 0)
-	 delta)
-    (when (and (window-live-p root) (not (window-size-fixed-p root)))
-      (with-selected-window root
-	(cond
-	 ((not max-height)
-	  (setq max-height frame-max-height))
-	 ((numberp max-height)
-	  (setq max-height (min max-height frame-max-height)))
-	 (t
-	  (error "%s is an invalid maximum height" max-height)))
-	(cond
-	 ((not min-height)
-	  (setq min-height frame-min-height))
-	 ((numberp min-height)
-	  (setq min-height (min min-height frame-min-height)))
-	 (t
-	  (error "%s is an invalid minimum height" min-height)))
-	;; When tool-bar-mode is enabled and we have just created a new
-	;; frame, reserve lines for toolbar resizing.  This is needed
-	;; because for reasons unknown to me Emacs (1) reserves one line
-	;; for the toolbar when making the initial frame and toolbars
-	;; are enabled, and (2) later adds the remaining lines needed.
-	;; Our code runs IN BETWEEN (1) and (2).  YMMV when you're on a
-	;; system that behaves differently.
-	(let ((quit-restore (window-parameter root 'quit-restore))
-	      (lines (tool-bar-lines-needed frame)))
-	  (when (and quit-restore (eq (car quit-restore) 'frame)
-		     (not (zerop lines)))
-	    (setq compensate (1- lines))))
-	(message "%s" compensate)
-	(setq delta
-	      ;; Always count a final newline - we don't do any
-	      ;; post-processing, so let's play safe.
-	      (+ (count-screen-lines nil nil t)
-		 (- (window-body-size))
-		 compensate)))
-      ;; Move away from final newline.
-      (when (and (eobp) (bolp) (not (bobp)))
-	(set-window-point root (line-beginning-position 0)))
-      (set-window-start root (point-min))
-      (set-window-vscroll root 0)
-      (condition-case nil
-	  (set-frame-height
-	   frame
-	   (min (max (+ (frame-height frame) delta)
-		     min-height)
-		max-height))
-	(error (setq delta nil))))
-    delta))
+  (when (and (window-live-p (frame-root-window frame))
+	     fit-frame-to-buffer
+	     (or (not window-size-fixed)
+		 (and (eq window-size-fixed 'height)
+		      (not (eq fit-frame-to-buffer 'vertically)))
+		 (and (eq window-size-fixed 'width)
+		      (not (eq fit-frame-to-buffer 'horizontally)))))
+    (with-selected-window (frame-root-window frame)
+      (let* ((window (frame-root-window frame))
+	     (char-width (frame-char-width))
+	     (char-height (frame-char-height))
+	     (display-width (display-pixel-width frame))
+	     (display-height (display-pixel-height frame))
+	     ;; Sanitize margins.
+	     (margins (or (frame-parameter frame 'fit-frame-to-buffer-margins)
+			  fit-frame-to-buffer-margins))
+	     (left-margin (window--sanitize-margin
+			   (nth 0 margins) 0 display-width))
+	     (top-margin (window--sanitize-margin
+			  (nth 1 margins) 0 display-height))
+	     (right-margin (window--sanitize-margin
+			    (nth 2 margins) left-margin display-width))
+	     (bottom-margin (window--sanitize-margin
+			     (nth 3 margins) top-margin display-height))
+	     ;; The pixel width of FRAME.
+	     (frame-width (frame-pixel-width))
+	     ;; The difference between FRAME's pixel and parameter
+	     ;; widths.
+	     (frame-extra-width
+	      (- frame-width (* (frame-width) char-width)))
+	     ;; The pixel width of FRAME's window body.
+	     (window-body-width (* (window-body-width) char-width))
+	     ;; The difference in pixels between total and body width of
+	     ;; FRAME's window.
+	     (window-extra-width
+	      (- (* (window-total-width) char-width) window-body-width))
+	     ;; The difference in pixels between the frame's pixel width
+	     ;; and the window's body width.
+	     (extra-width
+	      (* char-width (- (frame-width) (window-body-width))))
+	     ;; The maximum width we can use for fitting.
+	     (fit-width
+	      (- display-width (- frame-width window-body-width)
+		 left-margin right-margin))
+	     ;; The pixel position of FRAME's left border.  We usually
+	     ;; try to leave this alone.
+	     (left
+	      (let ((left (frame-parameter nil 'left)))
+		(if (consp left)
+		    (funcall (car left) (cadr left))
+		  left)))
+	     ;; The pixel height of FRAME.
+	     (frame-height (frame-pixel-height))
+	     ;; The difference between FRAME's pixel and parameter
+	     ;; heights.
+	     (frame-extra-height
+	      (- frame-height (* (frame-height) char-height)))
+	     ;; When tool-bar-mode is enabled and we just created a new
+	     ;; frame, reserve lines for toolbar resizing.  Needed
+	     ;; because for reasons unknown to me Emacs (1) reserves one
+	     ;; line for the toolbar when making the initial frame and
+	     ;; toolbars are enabled, and (2) later adds the remaining
+	     ;; lines needed.  Our code runs IN BETWEEN (1) and (2).
+	     ;; YMMV when you're on a system that behaves differently.
+	     (toolbar-extra-height
+	      (let ((quit-restore (window-parameter window 'quit-restore))
+		    (lines (tool-bar-lines-needed frame)))
+		(* char-height
+		   (if (and quit-restore (eq (car quit-restore) 'frame)
+			    (not (zerop lines)))
+		       (1- lines)
+		     0))))
+	     ;; The pixel height of FRAME's window.
+	     (window-body-height (* (window-body-height) char-height))
+	     ;; The difference in pixels between total and body height
+	     ;; of FRAME's window.
+	     (window-extra-height
+	      (- (* (window-total-height) char-height) window-body-height))
+	     ;; The difference in pixels between the frame's pixel
+	     ;; height and the window's body height.
+	     (extra-height
+	      (* (- (frame-height) (window-body-height)) char-height))
+	     ;; The maximum height we can use for fitting.
+	     (fit-height
+	      (- display-height (- frame-height window-body-height)
+		 top-margin bottom-margin toolbar-extra-height))
+	     ;; The pixel position of FRAME's top border.  We usually
+	     ;; try to leave this alone.
+	     (top
+	      (let ((top (frame-parameter nil 'top)))
+		(if (consp top)
+		    (funcall (car top) (cadr top))
+		  top)))
+	     ;; Sanitize minimum and maximum sizes.
+	     (sizes (or (frame-parameter frame 'fit-frame-to-buffer-sizes)
+			fit-frame-to-buffer-sizes))
+	     (max-height
+	      (cond
+	       ((numberp (nth 0 sizes))
+		(- (* (nth 0 sizes) char-height) window-extra-height))
+	       ((numberp max-height)
+		(- (* max-height char-height) window-extra-height))))
+	     (min-height
+	      (cond
+	       ((numberp (nth 1 sizes))
+		(- (* (nth 1 sizes) char-height) window-extra-height))
+	       ((numberp min-height)
+		(- (* min-height char-height) window-extra-height))))
+	     (max-width
+	      (cond
+	       ((numberp (nth 2 sizes))
+		(- (* (nth 2 sizes) char-width) window-extra-width))
+	       ((numberp max-width)
+		(- (* max-width char-width) window-extra-width))))
+	     (min-width
+	      (cond
+	       ((numberp (nth 3 sizes))
+		(- (* (nth 3 sizes) char-width) window-extra-width))
+	       ((numberp min-width)
+		(- (* min-width char-width) window-extra-width))))
+	     (value (window-text-pixel-size
+		     nil (point-min) (point-max)
+		     display-width display-height))
+	     (width (car value))
+	     (height (cdr value))
+	     remainder)
+	;; Round sizes (hopefully we can drop these as soon as we can
+	;; resize pixelwise).  First add pixels to obtain full last
+	;; lines and columns.
+	(setq remainder (% width char-width))
+	(unless (zerop remainder)
+	  (setq width (+ width (- char-width remainder))))
+	(setq remainder (% height char-height))
+	(setq height (+ height (- char-height remainder)))
+	;; Now make sure that we don't get larger than our rounded
+	;; maximum lines and columns.
+	(when (> width fit-width)
+	  (setq width (- fit-width (% fit-width char-width))))
+	(when (> height fit-height)
+	  (setq height (- fit-height (% fit-height char-height))))
+	;; Don't change height or width when the window's size is fixed
+	;; in either direction.
+	(cond
+	 ((eq window-size-fixed 'height)
+	  (setq height nil))
+	 ((eq window-size-fixed 'width)
+	  (setq width nil)))
+	(when width
+	  ;; Fit to maximum and minimum widths.
+	  (when max-width
+	    (setq width (min width max-width)))
+	  (when min-width
+	    (setq width (max width min-width)))
+	  ;; Add extra width.
+	  (setq width (+ width extra-width))
+	  ;; Preserve right margin.
+	  (let ((right (+ left width frame-extra-width))
+		(max-right (- display-width right-margin)))
+	    (cond
+	     ((> right max-right)
+	      ;; Move FRAME to left.
+	      (setq left (max 0 (- left (- right max-right)))))
+	     ((< left left-margin)
+	      ;; Move frame to right.
+	      (setq left left-margin)))))
+	(when height
+	  ;; Fit to maximum and minimum heights.
+	  (when max-height
+	    (setq height (min height max-height)))
+	  (when min-height
+	    (setq height (max height min-height)))
+	  ;; Add extra height.
+	  (setq height (+ height extra-height))
+	  ;; Preserve bottom and top margins.
+	  (let ((bottom (+ top height frame-extra-height))
+		(max-bottom (- display-height bottom-margin)))
+	    (cond
+	     ((> bottom max-bottom)
+	      ;; Move FRAME up.
+	      (setq top (max 0 (- top (- bottom max-bottom)))))
+	     ((< top top-margin)
+	      ;; Move frame down.
+	      (setq top top-margin)))))
+	;; Apply changes.
+	(set-frame-position frame left top)
+	(set-frame-size
+	 frame
+	 (if width (/ width char-width) (frame-width))
+	 (if height (/ height char-height) (frame-height)))))))
 
-(defun fit-window-to-buffer (&optional window max-height min-height)
-  "Adjust height of WINDOW to display its buffer's contents exactly.
+(defun fit-window-to-buffer (&optional window max-height min-height max-width min-width)
+  "Adjust size of WINDOW to display its buffer's contents exactly.
 WINDOW must be a live window and defaults to the selected one.
 
-Optional argument MAX-HEIGHT specifies the maximum height of
-WINDOW and defaults to the height of WINDOW's frame.  Optional
-argument MIN-HEIGHT specifies the minimum height of WINDOW and
-defaults to `window-min-height'.  Both MAX-HEIGHT and MIN-HEIGHT
-are specified in lines and include the mode line and header line,
-if any.
-
-If WINDOW is a full height window, then if the option
-`fit-frame-to-buffer' is non-nil, this calls the function
-`fit-frame-to-buffer' to adjust the frame height.
-
-Return the number of lines by which WINDOW was enlarged or
-shrunk.  If an error occurs during resizing, return nil but don't
-signal an error.
+If WINDOW is part of a vertical combination, adjust WINDOW's
+height.  The new height is calculated from the number of lines of
+the accessible portion of its buffer.  The optional argument
+MAX-HEIGHT specifies a maximum height and defaults to the height
+of WINDOW's frame.  The optional argument MIN-HEIGHT specifies a
+minimum height and defaults to `window-min-height'.  Both
+MAX-HEIGHT and MIN-HEIGHT are specified in lines and include the
+mode line and header line, if any.
+
+If WINDOW is part of a horizontal combination and the value of
+the option `fit-window-to-buffer-horizontally' is non-nil, adjust
+WINDOW's width.  The new width of WINDOW is calculated from the
+maximum length of its buffer's lines that follow the current
+start position of WINDOW.  The optional argument MAX-WIDTH
+specifies a maximum width and defaults to the width of WINDOW's
+frame.  The optional argument MIN-WIDTH specifies a minimum width
+and defaults to `window-min-width'.  Both MAX-WIDTH and MIN-WIDTH
+are specified in columns and include fringes, margins and
+scrollbars, if any.
+
+If WINDOW is its frame's root window, then if the option
+`fit-frame-to-buffer' is non-nil, call `fit-frame-to-buffer' to
+adjust the frame's size.
 
 Note that even if this function makes WINDOW large enough to show
-_all_ lines of its buffer you might not see the first lines when
-WINDOW was scrolled."
+_all_ parts of its buffer you might not see the first part when
+WINDOW was scrolled.  If WINDOW is resized horizontally, you will
+not see the top of its buffer unless WINDOW starts at its minimum
+accessible position."
   (interactive)
   (setq window (window-normalize-window window t))
-  (cond
-   ((window-size-fixed-p window))
-   ((window-full-height-p window)
-    (when fit-frame-to-buffer
-      (fit-frame-to-buffer (window-frame window))))
-   (t
+  (if (eq window (frame-root-window window))
+      (when fit-frame-to-buffer
+	;; Fit WINDOW's frame to buffer.
+	(fit-frame-to-buffer
+	 (window-frame window) max-height min-height max-width min-width))
     (with-selected-window window
-      (let* ((height (window-total-size))
+      (let* ((frame (window-frame))
+	     (char-height (frame-char-height))
+	     (char-width (frame-char-width))
+	     (display-height (display-pixel-height))
+	     (total-height (window-total-height))
+	     (body-height (window-body-height))
+	     (body-width (window-body-width))
 	     (min-height
-	      ;; Adjust MIN-HEIGHT.
+	      ;; Sanitize MIN-HEIGHT.
 	      (if (numberp min-height)
 		  ;; Can't get smaller than `window-safe-min-height'.
 		  (max min-height window-safe-min-height)
 		;; Preserve header and mode line if present.
 		(window-min-size nil nil t)))
 	     (max-height
-	      ;; Adjust MAX-HEIGHT.
+	      ;; Sanitize MAX-HEIGHT.
 	      (if (numberp max-height)
-		  ;; Can't get larger than height of frame.
-		  (min max-height
-		       (window-total-size (frame-root-window window)))
-		;; Don't delete other windows.
-		(+ height (window-max-delta nil nil window))))
-	     ;; Make `desired-height' the height necessary to show
-	     ;; all of WINDOW's buffer, constrained by MIN-HEIGHT
-	     ;; and MAX-HEIGHT.
-	     (desired-height
-	      (max
-	       (min
-		(+ (count-screen-lines)
-		   ;; For non-minibuffers count the mode line, if any.
-		   (if (and (not (window-minibuffer-p window))
-			    mode-line-format)
-		       1
-		     0)
-		   ;; Count the header line, if any.
-		   (if header-line-format 1 0))
-		max-height)
-	       min-height))
-	     (desired-delta
-	      (- desired-height (window-total-size window)))
-	     (delta
-	      (if (> desired-delta 0)
-		  (min desired-delta
-		       (window-max-delta window nil window))
-		(max desired-delta
-		     (- (window-min-delta window nil window))))))
-	(condition-case nil
-	    (if (zerop delta)
-		;; Return zero if DELTA became zero in the process.
-		0
-	      ;; Don't try to redisplay with the cursor at the end on its
-	      ;; own line--that would force a scroll and spoil things.
-	      (when (and (eobp) (bolp) (not (bobp)))
-		;; It's silly to put `point' at the end of the previous
-		;; line and so maybe force horizontal scrolling.
-		(set-window-point window (line-beginning-position 0)))
-	      ;; Call `window-resize' with OVERRIDE argument equal WINDOW.
-	      (window-resize window delta nil window)
-	      ;; Check if the last line is surely fully visible.  If
-	      ;; not, enlarge the window.
-	      (let ((end (save-excursion
-			   (goto-char (point-max))
-			   (when (and (bolp) (not (bobp)))
-			     ;; Don't include final newline.
-			     (backward-char 1))
-			   (when truncate-lines
-			     ;; If line-wrapping is turned off, test the
-			     ;; beginning of the last line for
-			     ;; visibility instead of the end, as the
-			     ;; end of the line could be invisible by
-			     ;; virtue of extending past the edge of the
-			     ;; window.
-			     (forward-line 0))
-			   (point))))
-		(set-window-vscroll window 0)
-		;; This loop might in some rare pathological cases raise
-		;; an error - another reason for the `condition-case'.
-		(while (and (< desired-height max-height)
-			    (= desired-height (window-total-size))
-			    (not (pos-visible-in-window-p end)))
-		  (window-resize window 1 nil window)
-		  (setq desired-height (1+ desired-height)))))
-	  (error (setq delta nil)))
-	delta)))))
+		  (min (+ total-height (window-max-delta)) max-height)
+		(+ total-height (window-max-delta))))
+	     height)
+	(cond
+	 ;; If WINDOW is vertically combined, try to resize it
+	 ;; vertically.
+	 ((and (not (eq fit-window-to-buffer-horizontally 'only))
+	       (not (window-size-fixed-p window))
+	       (window-combined-p))
+	  ;; Vertically we always want to fit the entire buffer.
+	  ;; WINDOW'S height can't get larger than its frame's pixel
+	  ;; height.  Its width remains fixed.
+	  (setq height (cdr (window-text-pixel-size
+			     nil (point-min) (point-max)
+			     (* body-width char-width)
+			     (frame-pixel-height))))
+	  ;; Round height.
+	  (setq height (+ (/ height char-height)
+			  (if (zerop (% height char-height)) 0 1)))
+	  (unless (= height body-height)
+	    (window-resize-no-error
+	     window
+	     (- (max min-height
+		     (min max-height
+			  (+ total-height (- height body-height))))
+		total-height)
+	     nil window)))
+	 ;; If WINDOW is horizontally combined, try to resize it
+	 ;; horizontally.
+	 ((and fit-window-to-buffer-horizontally
+	       (not (window-size-fixed-p window t))
+	       (window-combined-p nil t))
+	  (let* ((display-width (display-pixel-width))
+		 (total-width (window-total-width))
+		 (min-width
+		  ;; Sanitize MIN-WIDTH.
+		  (if (numberp min-width)
+		      ;; Can't get smaller than `window-safe-min-width'.
+		      (max min-width window-safe-min-width)
+		    ;; Preserve fringes, margins, scrollbars if present.
+		    (window-min-size nil nil t)))
+		 (max-width
+		  ;; Sanitize MAX-WIDTH.
+		  (if (numberp max-width)
+		      (min (+ total-width (window-max-delta nil t)) max-width)
+		    (+ total-width (window-max-delta nil t))))
+		 ;; When fitting horizontally, assume that WINDOW's start
+		 ;; position remains unaltered.  WINDOW can't get wider
+		 ;; than its frame's pixel width, its height remains
+		 ;; unaltered.
+		 (width (car (window-text-pixel-size
+			      nil (window-start) (point-max)
+			      (frame-pixel-width)
+			      ;; Add one char-height to assure that
+			      ;; we're on the safe side.  This
+			      ;; overshoots when the first line below
+			      ;; the bottom is wider than the window.
+			      (* body-height char-height)))))
+	    (setq width (+ (/ width char-width)
+			   (if (zerop (% width char-width)) 0 1)))
+	    (unless (= width body-width)
+	      (window-resize-no-error
+	       window
+	       (- (max min-width
+		       (min max-width
+			    (+ total-width (- width body-width))))
+		  total-width)
+	       t window)))))))))
 
 (defun window-safely-shrinkable-p (&optional window)
   "Return t if WINDOW can be shrunk without shrinking other windows.

=== modified file 'src/dispextern.h'
--- src/dispextern.h	2013-01-02 16:13:04 +0000
+++ src/dispextern.h	2013-02-04 10:23:56 +0000
@@ -3054,7 +3054,7 @@
 void init_iterator_to_row_start (struct it *, struct window *,
                                  struct glyph_row *);
 void start_display (struct it *, struct window *, struct text_pos);
-void move_it_to (struct it *, ptrdiff_t, int, int, int, int);
+int move_it_to (struct it *, ptrdiff_t, int, int, int, int);
 void move_it_vertically (struct it *, int);
 void move_it_vertically_backward (struct it *, int);
 void move_it_by_lines (struct it *, ptrdiff_t);

=== modified file 'src/xdisp.c'
--- src/xdisp.c	2013-01-05 21:18:01 +0000
+++ src/xdisp.c	2013-02-04 13:43:35 +0000
@@ -8734,13 +8734,17 @@
 
    If TO_CHARPOS is in invisible text, e.g. a truncated part of a
    screen line, this function will set IT to the next position that is
-   displayed to the right of TO_CHARPOS on the screen.  */
-
-void
+   displayed to the right of TO_CHARPOS on the screen.
+
+   Return the maximum pixel length of any line scanned but never more
+   than it.last_visible_x.  */
+
+int
 move_it_to (struct it *it, ptrdiff_t to_charpos, int to_x, int to_y, int to_vpos, int op)
 {
   enum move_it_result skip, skip2 = MOVE_X_REACHED;
   int line_height, line_start_x = 0, reached = 0;
+  int max_current_x = 0;
   void *backup_data = NULL;
 
   for (;;)
@@ -8814,7 +8818,10 @@
 
 	  /* If TO_CHARPOS is reached or ZV, we don't have to do more.  */
 	  if (skip == MOVE_POS_MATCH_OR_ZV)
-	    reached = 5;
+	    {
+	      max_current_x = max (it->current_x, max_current_x);
+	      reached = 5;
+	    }
 	  else if (skip == MOVE_X_REACHED)
 	    {
 	      /* If TO_X was reached, we want to know whether TO_Y is
@@ -8871,6 +8878,9 @@
 	      if (to_y >= it->current_y
 		  && to_y < it->current_y + line_height)
 		{
+		  if (to_y > it->current_y)
+		    max_current_x = max (it->current_x, max_current_x);
+
 		  /* When word-wrap is on, TO_X may lie past the end
 		     of a wrapped line.  Then it->current is the
 		     character on the next line, so backtrack to the
@@ -8883,6 +8893,7 @@
 		      skip = move_it_in_display_line_to
 			(it, -1, prev_x, MOVE_TO_X);
 		    }
+
 		  reached = 6;
 		}
 	    }
@@ -8908,15 +8919,18 @@
       switch (skip)
 	{
 	case MOVE_POS_MATCH_OR_ZV:
+	  max_current_x = max (it->current_x, max_current_x);
 	  reached = 8;
 	  goto out;
 
 	case MOVE_NEWLINE_OR_CR:
+	  max_current_x = max (it->current_x, max_current_x);
 	  set_iterator_to_next (it, 1);
 	  it->continuation_lines_width = 0;
 	  break;
 
 	case MOVE_LINE_TRUNCATED:
+	  max_current_x = it->last_visible_x;
 	  it->continuation_lines_width = 0;
 	  reseat_at_next_visible_line_start (it, 0);
 	  if ((op & MOVE_TO_POS) != 0
@@ -8928,6 +8942,7 @@
 	  break;
 
 	case MOVE_LINE_CONTINUED:
+	  max_current_x = it->last_visible_x;
 	  /* For continued lines ending in a tab, some of the glyphs
 	     associated with the tab are displayed on the current
 	     line.  Since it->current_x does not include these glyphs,
@@ -8997,6 +9012,8 @@
     bidi_unshelve_cache (backup_data, 1);
 
   TRACE_MOVE ((stderr, "move_it_to: reached %d\n", reached));
+
+  return max_current_x;
 }
 
 
@@ -9326,6 +9343,95 @@
 	  && it->dpvec + it->current.dpvec_index != it->dpend);
 }
 
+DEFUN ("window-text-pixel-size", Fwindow_text_pixel_size, Swindow_text_pixel_size, 0, 5, 0,
+       doc: /* Return the size of the text of WINDOW's buffer in pixels.
+WINDOW must be a live window and defaults to the selected one.  The
+return value is a cons of the maximum pixel-width of any text line and
+the maximum pixel-height of all text lines.
+
+The optional argument FROM, if non-nil, specifies the first text
+position and defaults to the minimum accessible position of the buffer.
+TO, if non-nil, specifies the last text position and defaults to the
+maximum accessible position of the buffer.
+
+The optional argument X_LIMIT, if non-nil, specifies the maximum text
+width that can be returned.  X_LIMIT nil or omitted, means to use the
+pixel-width of WINDOW's body; use this if you do not intend to change
+the width of WINDOW.  Use the maximum width WINDOW may assume if you
+intend to change WINDOW's width.
+
+The optional argument Y_LIMIT, if non-nil, specifies the maximum text
+height that can be returned.  Text lines whose y-coordinate is beyond
+Y_LIMIT are ignored.  Since calculating the text height of a large
+buffer can take some time, it makes sense to specify this argument if
+the size of the buffer is unknown.  */)
+  (Lisp_Object window, Lisp_Object from, Lisp_Object to, Lisp_Object x_limit, Lisp_Object y_limit)
+{
+  struct window *w = decode_live_window (window);
+  Lisp_Object buf;
+  struct buffer *b;
+  struct it it;
+  struct buffer *old_buffer = NULL;
+  ptrdiff_t start, end;
+  struct text_pos startp, endp;
+  void *itdata = NULL;
+  int max_y = -1, x, y;
+
+  buf = w->buffer;
+  CHECK_BUFFER (buf);
+  b = XBUFFER (buf);
+
+  if (NILP (from))
+    start = BEGV;
+  else
+    {
+      CHECK_NUMBER_COERCE_MARKER (from);
+      start = min (max (XINT (from), BEGV), ZV);
+    }
+
+  if (NILP (to))
+    end = ZV;
+  else
+    {
+      CHECK_NUMBER_COERCE_MARKER (to);
+      end = max (start, min (XINT (to), ZV));
+    }
+
+  if (b != current_buffer)
+    {
+      old_buffer = current_buffer;
+      set_buffer_internal (b);
+    }
+
+  if (!NILP (y_limit))
+    {
+      CHECK_NUMBER (y_limit);
+      max_y = XINT (y_limit);
+    }
+
+  itdata = bidi_shelve_cache ();
+  SET_TEXT_POS (startp, start, CHAR_TO_BYTE (start));
+  start_display (&it, w, startp);
+
+  if (!NILP (x_limit))
+    {
+      CHECK_NUMBER (x_limit);
+      it.last_visible_x = XINT (x_limit);
+    }
+
+  /* Actually, we never want move_it_to to stop at to_x.  But to make sure
+     that move_it_in_display_line_to always moves far enough, we set it
+     to MOST_POSITIVE_FIXNUM and specify MOVE_TO_X.  */
+  x = move_it_to (&it, end, MOST_POSITIVE_FIXNUM, max_y, -1,
+		  MOVE_TO_POS | MOVE_TO_X | MOVE_TO_Y);
+  y = it.current_y;
+  bidi_unshelve_cache (itdata, 0);
+
+  if (old_buffer)
+    set_buffer_internal (old_buffer);
+
+  return Fcons (make_number (x), make_number (y));
+}
 
 /***********************************************************************
 			       Messages
@@ -28808,6 +28914,7 @@
   defsubr (&Sformat_mode_line);
   defsubr (&Sinvisible_p);
   defsubr (&Scurrent_bidi_paragraph_direction);
+  defsubr (&Swindow_text_pixel_size);
 
   DEFSYM (Qmenu_bar_update_hook, "menu-bar-update-hook");
   DEFSYM (Qoverriding_terminal_local_map, "overriding-terminal-local-map");



Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#13399; Package emacs. (Mon, 04 Feb 2013 17:59:01 GMT) Full text and rfc822 format available.

Message #125 received at 13399 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: martin rudalics <rudalics <at> gmx.at>
Cc: monnier <at> iro.umontreal.ca, 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50;
	Word-wrap can't wrap at zero-width space U-200B
Date: Mon, 04 Feb 2013 19:57:31 +0200
> Date: Mon, 04 Feb 2013 18:04:14 +0100
> From: martin rudalics <rudalics <at> gmx.at>
> CC: monnier <at> iro.umontreal.ca, 13399 <at> debbugs.gnu.org
> 
> How about the attached patch?

The C parts look OK to me.  Thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#13399; Package emacs. (Fri, 08 Dec 2017 02:38:01 GMT) Full text and rfc822 format available.

Message #128 received at 13399 <at> debbugs.gnu.org (full text, mbox):

From: Adam Tack <adam.tack.513 <at> gmail.com>
To: 13399 <at> debbugs.gnu.org
Subject: 24.3.50; Word-wrap can't wrap at zero-width space U-200B
Date: Fri, 8 Dec 2017 01:02:08 +0000
I have a patch for the original issue of word-wrap not wrapping at a
zero-width space.  The implementation uses a character table, and is
closely based on that written by Martin Rudalics
(https://debbugs.gnu.org/cgi/bugreport.cgi?bug=13399#113), with Eli
Zaretskii's suggestions regarding unicode.

The patch applies cleanly to the latest master, compiles on GNU+Linux
(Ubuntu Xenial) and appears to work — both of the following tests
result in the expected wrapping on the zero-width space character (the
first of these is taken verbatim from this bug thread, the second,
adapted from the first, checks that there is no regression of Bug#11341):

(with-current-buffer (get-buffer-create "*foo*")
  (dotimes (i 1000)
    (insert "1234")) ; U-200B
  (setq word-wrap t)
  (display-buffer "*foo*"))

(with-current-buffer (get-buffer-create "*bar*")
  (dotimes (i 1000)
    (insert "1234")) ; U-200B
  (setq word-wrap t)
  (setq whitespace-display-mappings
    '((space-mark 32
              [183]
              [46])
      (space-mark 160
              [164]
              [95])
      (space-mark 8203
              [164]
              [95])
      (newline-mark 10
            [36 10])
      (tab-mark 9
            [187 9]
            [92 9])))
  (whitespace-mode)
  (display-buffer "*bar*"))

Setting other word-wrap characters using set-char-table-range with
lisp also works as expected in the simple situations that I tested.

However, this is my first foray into modifying a serious C codebase,
so I am not sure if I have done the right thing.  In particular, I
have serious doubts about the second and third cases from
IT_DISPLAYING_WHITESPACE, especially since I don't really know when
they would be applicable.

   || ((STRINGP (it->string)                        \
    && !NILP (CHAR_TABLE_REF                    \
          (Vword_wrap_chars, STRING_CHAR            \
           (SDATA (it->string) + IT_STRING_BYTEPOS (*it)))))    \
       || (it->s && !NILP (CHAR_TABLE_REF                \
               (Vword_wrap_chars,                \
                STRING_CHAR(it->s + IT_BYTEPOS (*it)))))    \

Additionally, I'm not certain whether syms_of_character in character.c
is the right location for the definition of the char-table and whether
the range of characters U+2000 to U+200B should be in the chartable,
or if it should just be space and tab, by default.


I am aware that if this were to be accepted, I would also need to make
a change to etc/NEWS, probably the docstring of `word-wrap' and
somewhere in the Texinfo manual.

I have not yet filled out a copyright assignment form, though I will
do so if this patch (modulo changes) is considered acceptable.

Thanks!
[word_wrap_char_table.diff (text/plain, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#13399; Package emacs. (Fri, 08 Dec 2017 10:13:01 GMT) Full text and rfc822 format available.

Message #131 received at 13399 <at> debbugs.gnu.org (full text, mbox):

From: martin rudalics <rudalics <at> gmx.at>
To: Adam Tack <adam.tack.513 <at> gmail.com>, 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50; Word-wrap can't wrap at zero-width space
 U-200B
Date: Fri, 08 Dec 2017 11:12:23 +0100
> I have a patch for the original issue of word-wrap not wrapping at a
> zero-width space.  The implementation uses a character table, and is
> closely based on that written by Martin Rudalics
> (https://debbugs.gnu.org/cgi/bugreport.cgi?bug=13399#113), with Eli
> Zaretskii's suggestions regarding unicode.
>
> The patch applies cleanly to the latest master, compiles on GNU+Linux
> (Ubuntu Xenial) and appears to work — both of the following tests
> result in the expected wrapping on the zero-width space character (the
> first of these is taken verbatim from this bug thread, the second,
> adapted from the first, checks that there is no regression of Bug#11341):

Thank you very much for working on this.  The patch applies here and
seems to work as needed.

martin





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#13399; Package emacs. (Fri, 08 Dec 2017 15:40:02 GMT) Full text and rfc822 format available.

Message #134 received at 13399 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Adam Tack <adam.tack.513 <at> gmail.com>
Cc: 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50;
 Word-wrap can't wrap at zero-width space U-200B
Date: Fri, 08 Dec 2017 17:38:29 +0200
> From: Adam Tack <adam.tack.513 <at> gmail.com>
> Date: Fri, 8 Dec 2017 01:02:08 +0000
> 
> I have a patch for the original issue of word-wrap not wrapping at a
> zero-width space.  The implementation uses a character table, and is
> closely based on that written by Martin Rudalics
> (https://debbugs.gnu.org/cgi/bugreport.cgi?bug=13399#113), with Eli
> Zaretskii's suggestions regarding unicode.

Thanks for working on this!

> However, this is my first foray into modifying a serious C codebase,
> so I am not sure if I have done the right thing.  In particular, I
> have serious doubts about the second and third cases from
> IT_DISPLAYING_WHITESPACE, especially since I don't really know when
> they would be applicable.
> 
>    || ((STRINGP (it->string)                        \
>     && !NILP (CHAR_TABLE_REF                    \
>           (Vword_wrap_chars, STRING_CHAR            \
>            (SDATA (it->string) + IT_STRING_BYTEPOS (*it)))))    \
>        || (it->s && !NILP (CHAR_TABLE_REF                \
>                (Vword_wrap_chars,                \
>                 STRING_CHAR(it->s + IT_BYTEPOS (*it)))))    \

I think this is okay, but maybe the macro could be converted into an
inline function, and then fetching the character from the various
objects separated from looking up the char-table for that character?

> Additionally, I'm not certain whether syms_of_character in character.c
> is the right location for the definition of the char-table and whether
> the range of characters U+2000 to U+200B should be in the chartable,
> or if it should just be space and tab, by default.

Well, since it's a char-table, users will probably want to control
which characters cause word-wrap.  One idea would be to have a minor
mode or some such, providing users an ability to include or exclude
different groups of related whitespace characters as a whole?  This
could be in follow-up patches, though.

We could also look at LineBreak.txt in the Unicode database for
inspiration and ideas.

But I do think that the default should be only TAB and SPC, as Emacs
always did, and the rest should be optional, and probably in Lisp, not
C.

> I am aware that if this were to be accepted, I would also need to make
> a change to etc/NEWS, probably the docstring of `word-wrap' and
> somewhere in the Texinfo manual.

And also a couple of tests (the ones you used would be a good start).

> I have not yet filled out a copyright assignment form, though I will
> do so if this patch (modulo changes) is considered acceptable.

I will send the forms off-list, thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#13399; Package emacs. (Fri, 08 Dec 2017 20:10:01 GMT) Full text and rfc822 format available.

Message #137 received at 13399 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: adam.tack.513 <at> gmail.com
Cc: 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50;
 Word-wrap can't wrap at zero-width space U-200B
Date: Fri, 08 Dec 2017 22:08:22 +0200
> Date: Fri, 08 Dec 2017 17:38:29 +0200
> From: Eli Zaretskii <eliz <at> gnu.org>
> Cc: 13399 <at> debbugs.gnu.org
> 
> >    || ((STRINGP (it->string)                        \
> >     && !NILP (CHAR_TABLE_REF                    \
> >           (Vword_wrap_chars, STRING_CHAR            \
> >            (SDATA (it->string) + IT_STRING_BYTEPOS (*it)))))    \
> >        || (it->s && !NILP (CHAR_TABLE_REF                \
> >                (Vword_wrap_chars,                \
> >                 STRING_CHAR(it->s + IT_BYTEPOS (*it)))))    \

One other thought: since TAB and SPC are single-byte characters,
whereas the other "whitespace" characters are not, supporting the
non-ASCII whitespace will be associated with some performance hit in
the display engine, because it requires a char-table look up and
fetching multibyte characters.  So perhaps we should allow the
word-wrap-chars char-table to be nil (and make that the default), and
in that case support only TAB and SPC as word-wrap characters.  This
would let the default configuration work as fast as it does now,
imposing the performance penalty only on those who want to support
more whitespace characters.

WDYT?
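
As a rough illustration of the nil-by-default idea above (a sketch only,
not code from the patch): the lookup below takes a fast path for SPC and
TAB when no table is configured and consults a table otherwise.  The
names wrap_char_p and WRAP_TABLE_SIZE are hypothetical, and a flat array
of flags stands in for a real Emacs char-table.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a char-table: a flat array of flags
   indexed by code point, or NULL when no table is configured.  */
#define WRAP_TABLE_SIZE 0x3000

static bool
wrap_char_p (const bool *table, int c)
{
  /* Fast path: with no table, only SPC and TAB are wrap characters,
     so no table lookup is performed.  */
  if (table == NULL)
    return c == ' ' || c == '\t';

  /* Slow path: consult the table, treating out-of-range code points
     as non-wrapping.  */
  return c >= 0 && c < WRAP_TABLE_SIZE && table[c];
}
```

With a NULL table the check costs two comparisons; the table lookup is
paid only by callers that supply a table.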




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#13399; Package emacs. (Sat, 09 Dec 2017 03:51:02 GMT) Full text and rfc822 format available.

Message #140 received at 13399 <at> debbugs.gnu.org (full text, mbox):

From: Adam Tack <adam.tack.513 <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50;
 Word-wrap can't wrap at zero-width space U-200B
Date: Sat, 9 Dec 2017 03:50:05 +0000
Thanks for the very fast replies and the suggestions!

> I think this is okay, but maybe the macro could be converted into an
> inline function, and then fetching the character from the various
> objects separated from looking up the char-table for that character?

I've made the conversion — it's now slightly less messy.  Regarding
the separation, I think that the most that can be done is to have the
look-up in a separate function.  Regrettably, trying to first obtain
the character, for example via a set of if-else clauses, and then
looking it up, which would be cleaner, can't really work since the
cases (in particular the first and fourth) are not disjunct.

> Well, since it's a char-table, users will probably want to control
> which characters cause word-wrap.  One idea would be to have a minor
> mode or some such, providing users an ability to include or exclude
> different groups of related whitespace characters as a whole?  This
> could be in follow-up patches, though.

Customisability was the idea. :)  I'm not sure how best to expose it in
a reasonably user-friendly way, though.  For the time being, allowing
control directly via the char-table might suffice.

> We could also look at LineBreak.txt in the Unicode database for
> inspiration and ideas.

The three main customisation options that I'm considering are:

i) Unicode whitespace (U+2000 - U+200B),
ii) vim's breakat characters (default " ^I!@*-+;:,./?"), since
presumably they had given it some thought,
iii) The characters in LineBreak.txt (parsing the file shouldn't be
hard, if there aren't copyright issues).
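
Options i) and ii) above could be sketched roughly as follows, assuming
a flat flag array in place of a real char-table.  The helper names
add_unicode_spaces and add_ascii_chars and the WRAP_TABLE_SIZE bound are
hypothetical and exist only for this illustration.

```c
#include <stdbool.h>

/* Hypothetical table size, large enough to cover U+200B.  */
#define WRAP_TABLE_SIZE 0x3000

/* Option i): mark the code points U+2000..U+200B.  */
static void
add_unicode_spaces (bool *table)
{
  for (int c = 0x2000; c <= 0x200B; c++)
    table[c] = true;
}

/* Option ii): mark every byte of an ASCII string, such as vim's
   default breakat value.  */
static void
add_ascii_chars (bool *table, const char *chars)
{
  for (const char *p = chars; *p != '\0'; p++)
    table[(unsigned char) *p] = true;
}
```

Calling add_ascii_chars (table, " \t!@*-+;:,./?") would mirror the vim
default quoted above, with ^I written as a tab.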

> But I do think that the default should be only TAB and SPC, as Emacs
> always did, and the rest should be optional, and probably in Lisp, not
> C.

> And also a couple of tests (the ones you used would be a good start).

These would presumably have to be in test/manual since the position of
the word-wrap depends on too many variables (width of window, font
type, font size)?

> I will send the forms off-list, thanks.

Thanks!

> One other thought: since TAB and SPC are single-byte characters,
> whereas the other "whitespace" characters are not, supporting the
> non-ASCII whitespace will be associated with some performance hit in
> the display engine, because it requires a char-table look up and
> fetching multibyte characters.  So perhaps we should allow the
> word-wrap-chars char-table to be nil (and make that the default), and
> in that case support only TAB and SPC as word-wrap characters.  This
> would let the default configuration work as fast as it does now,
> imposing the performance penalty only on those who want to support
> more whitespace characters.

> WDYT?

That seems sensible.  The old behaviour will now be the default and
look-up using the char-table only enabled with the global minor mode
`word-wrap-char-table-mode' (suggestions for a catchier name very
welcome).  For the time being, its definition is in a new file
`lisp/word-wrap.el'.  Also temporarily, for ease of testing, it allows
wrapping on the unicode whitespace characters.


The current iteration is attached.  Until they've found a proper home,
the slightly updated tests are below.

(require 'word-wrap)

(with-current-buffer (get-buffer-create "*bar*")
  (dotimes (i 1000)
    (insert "1234")) ; U-200B
  (setq word-wrap t)
  (setq whitespace-display-mappings
    '((space-mark 32
              [183]
              [46])
      (space-mark 160
              [164]
              [95])
      (space-mark 8203
              [164]
              [95])
      (newline-mark 10
            [36 10])
      (tab-mark 9
            [187 9]
            [92 9])))
  (whitespace-mode)
  (word-wrap-char-table-mode)
  (display-buffer "*bar*"))

(with-current-buffer (get-buffer-create "*foo*")
  (dotimes (i 1000)
    (insert "1234")) ; U-200B
  (setq word-wrap t)
  (word-wrap-char-table-mode)
  (display-buffer "*foo*"))
[word_wrap_char_table.diff (text/plain, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#13399; Package emacs. (Tue, 12 Dec 2017 17:14:02 GMT) Full text and rfc822 format available.

Message #143 received at 13399 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Adam Tack <adam.tack.513 <at> gmail.com>
Cc: 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50;
 Word-wrap can't wrap at zero-width space U-200B
Date: Tue, 12 Dec 2017 19:13:33 +0200
> From: Adam Tack <adam.tack.513 <at> gmail.com>
> Date: Sat, 9 Dec 2017 03:50:05 +0000
> Cc: 13399 <at> debbugs.gnu.org
> 
> > I think this is okay, but maybe the macro could be converted into an
> > inline function, and then fetching the character from the various
> > objects separated from looking up the char-table for that character?
> 
> I've made the conversion — it's now slightly less messy.  Regarding
> the separation, I think that the most that can be done is to have the
> look-up in a separate function.  Regrettably, trying to first obtain
> the character, for example via a set of if-else clauses, and then
> looking it up, which would be cleaner, can't really work since the
> cases (in particular the first and fourth) are not disjunct.

Hmm... not sure why you arrived at this conclusion.  E.g., what's
wrong with the implementation at the bottom of this message?

> > We could also look at LineBreak.txt in the Unicode database for
> > inspiration and ideas.
> 
> The three main customisation options that I'm considering are:
> 
> i) Unicode whitespace (U+2000 - U+200B),

Yes.

> ii) vim's breakat characters (default " ^I!@*-+;:,./?"), since
> presumably they had given it some thought,

Maybe.  I'm not sure in what modes this would be TRT.

> iii) The characters in LineBreak.txt (parsing the file shouldn't be
> hard, if there aren't copyright issues).

We already import several UCD files, see admin/unidata, where you will
also find copyright.html from the Unicode Consortium.

> > And also a couple of tests (the ones you used would be a good start).
> 
> These would presumably have to be in test/manual since the position of
> the word-wrap depends on too many variables (width of window, font
> type, font size)?

test/manual is okay.

> diff --git a/lisp/word-wrap.el b/lisp/word-wrap.el
> new file mode 100644
> index 0000000..6d59a83
> --- /dev/null
> +++ b/lisp/word-wrap.el
> @@ -0,0 +1,21 @@
> +(define-minor-mode word-wrap-char-table-mode
> +  "Toggle wrapping using a look-up to word-wrap-chars, globally.
> +
> +Currently, this allows word wrapping on the characters U+2000 to
> +U+200B in addition to the default of space and tab, when
> +`word-wrap' is set to t.
> +
> +(Provisional and unstable.)
> +"
> +  :global t
> +  :lighter "uws "
> +  (if word-wrap-char-table-mode
> +      (progn (setq word-wrap-chars (make-char-table nil nil))
> +             (set-char-table-range word-wrap-chars 9 t)
> +             (set-char-table-range word-wrap-chars 32 t)
> +             (set-char-table-range word-wrap-chars
> +                                   '(8192 . 8203) t))
> +    (setq word-wrap-chars nil)))

This should probably go into simple.el.

Thanks.

Here's the implementation of IT_DISPLAYING_WHITESPACE I had in mind:

static inline bool
IT_DISPLAYING_WHITESPACE (struct it *it)
{
  bool char_table_p = CHAR_TABLE_P (Vword_wrap_chars);
  int c;

  if (it->what == IT_CHARACTER)
    c = it->c;
  else if (!char_table_p)
    {
      if (STRINGP (it->string))
	c = SREF (it->string, IT_STRING_BYTEPOS (*it));
      else if (it->s)
	c = it->s[IT_BYTEPOS (*it)];
      else if (IT_BYTEPOS (*it) < ZV_BYTE)
	c = *BYTE_POS_ADDR (IT_BYTEPOS (*it));
    }
  else
    {
      if (STRINGP (it->string))
	c = STRING_CHAR (SDATA (it->string) + IT_STRING_BYTEPOS (*it));
      else if (it->s)
	c = STRING_CHAR (it->s + IT_BYTEPOS (*it));
      else if (IT_BYTEPOS (*it) < ZV_BYTE)
	c = FETCH_CHAR_AS_MULTIBYTE (IT_BYTEPOS (*it));
    }

  return
    char_table_p
    ? !NILP (CHAR_TABLE_REF (Vword_wrap_chars, c))
    : (c == ' ' || c == '\t');
}




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#13399; Package emacs. (Wed, 13 Dec 2017 04:02:01 GMT) Full text and rfc822 format available.

Message #146 received at 13399 <at> debbugs.gnu.org (full text, mbox):

From: Adam Tack <adam.tack.513 <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50;
 Word-wrap can't wrap at zero-width space U-200B
Date: Wed, 13 Dec 2017 04:00:56 +0000
Sorry for not working further on this, but I didn't have time.  I will
get back to finishing this, soon.

> Hmm... not sure why you arrived at this conclusion.  E.g., what's
> wrong with the implementation at the bottom of this message?

This was very similar to my first try.  Unfortunately, it doesn't work
correctly in whitespace-mode, even with just normal spaces, regressing
on Bug#11341.

(with-current-buffer (get-buffer-create "*bar*")
  (dotimes (i 1000)
    (insert "1234 ")) ; Space
  (setq word-wrap t)
  (whitespace-mode)
  (display-buffer "*bar*"))

The spaces are displayed as `·', so it->c returns 183, none of the
further tests are checked and IT_DISPLAYING_WHITESPACE returns False.
(In the currently used implementation, if it->c is not one of ' ' or '\t'
then the later tests are all checked.)
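
The difference can be shown with a toy model (a sketch only; struct
toy_it and the function names are hypothetical and much simpler than the
real struct it).  Here `displayed' plays the role of it->c after a
display table has replaced the space with `·' (183), and `underlying' is
the character actually in the buffer.

```c
#include <stdbool.h>

/* Hypothetical, simplified iterator: DISPLAYED is what a display
   table puts on screen, UNDERLYING is the character in the buffer.  */
struct toy_it
{
  int displayed;
  int underlying;
};

static bool
space_or_tab_p (int c)
{
  return c == ' ' || c == '\t';
}

/* if-else style: the first source that yields a character decides,
   so the underlying character is never examined.  */
static bool
whitespace_first_match (const struct toy_it *it)
{
  return space_or_tab_p (it->displayed);
}

/* OR-chain style: every source gets a chance to report whitespace.  */
static bool
whitespace_any_match (const struct toy_it *it)
{
  return space_or_tab_p (it->displayed)
	 || space_or_tab_p (it->underlying);
}
```

For an iterator with displayed = 183 and underlying = ' ', the first
variant reports no whitespace while the second one does, which matches
the whitespace-mode behaviour described above.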

I thought about changing the order of the tests to something like the
following (ignoring the special case of ' ' and '\t', here, for
brevity):

static inline bool
IT_DISPLAYING_WHITESPACE (struct it *it) {
  int c;
  if (IT_BYTEPOS (*it) < ZV_BYTE)
    c = FETCH_CHAR (IT_BYTEPOS (*it));
  else if (it->what == IT_CHARACTER)
    c = it->c;
  else if (STRINGP (it->string))
    c = STRING_CHAR (SDATA (it->string) + IT_STRING_BYTEPOS (*it));
  else if (it->s)
    c = STRING_CHAR (it->s + IT_BYTEPOS (*it));
  else
    return false;

  return !NILP (CHAR_TABLE_REF (Vword_wrap_chars, c));
}

which in the case of whitespace-mode does TRT, but I worried that
there might be situations where wrapping on the display character
is correct.  The crux (as I had previously, but very unclearly,
written) is that under "normal" circumstances, both
`(it->what == IT_CHARACTER)' and `(IT_BYTEPOS (*it) < ZV_BYTE)'
are true.

Additionally, I wasn't sure whether there should be a fall-through,
since on the one hand, it prevents emacs crashing if (weirdly) all the
previous tests return false, but on the other, it might preclude some magic
compiler optimisation.

Chaining ORs side-stepped both issues, so I settled on keeping it, though
it might have been the wrong decision.

> > ii) vim's breakat characters (default " ^I!@*-+;:,./?"), since
> > presumably they had given it some thought,

> Maybe.  I'm not sure in what modes this would be TRT.

It should almost certainly not be the default in any mode, but it
might, perhaps, be a useful, pre-defined option for some users.  (For
instance, when wrapping long URLs or paths in comments:

|;;                                                     |
|https://very.long.url/that-will-not-fit-on-a-single-lin|
|e-anyway-but-could-at-least-start-on-the-same-line-as-t|
|he-comment-sign-and-break-at-slightly-more-logical-plac|
|es                                                     |

looks (IMO at least!) less aesthetically pleasing than:

|;; https://very.long.url/that-will-not-fit-on-a-single-|
|line-anyway-but-could-at-least-start-on-the-same-line- |
|as-the-comment-sign-and-break-at-slightly-more-logical-|
|places                                                 |

where `|' is the margin.

The same sometimes holds for excessively long variable names.  I
definitely wouldn't impose this preference on others, but I assume
that some might share it.)  Using vim's choice helps avoid
bike-shedding.

> We already import several UCD files, see admin/unidata, where you will
> also find copyright.html from the Unicode Consortium.

Great! That's convenient.

> test/manual is okay.

Thanks!

> This should probably go into simple.el.

I'll move it there.


Thanks for the help!




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#13399; Package emacs. (Wed, 13 Dec 2017 16:10:01 GMT) Full text and rfc822 format available.

Message #149 received at 13399 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Adam Tack <adam.tack.513 <at> gmail.com>
Cc: 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50;
 Word-wrap can't wrap at zero-width space U-200B
Date: Wed, 13 Dec 2017 18:09:17 +0200
> From: Adam Tack <adam.tack.513 <at> gmail.com>
> Date: Wed, 13 Dec 2017 04:00:56 +0000
> Cc: 13399 <at> debbugs.gnu.org
> 
> > Hmm... not sure why you arrived at this conclusion.  E.g., what's
> > wrong with the implementation at the bottom of this message?
> 
> This was very similar to my first try.  Unfortunately, it doesn't work
> correctly in whitespace-mode, even with just normal spaces, regressing
> on Bug#11341.

Right, I missed that.  But then I guess it might be better to leave
this a macro, as the function is not prettier.  Or maybe leave the
case of the char-table being nil as a macro, and put the rest in a
function, like we do in several other places.

Thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#13399; Package emacs. (Sun, 17 Dec 2017 02:23:01 GMT) Full text and rfc822 format available.

Message #152 received at 13399 <at> debbugs.gnu.org (full text, mbox):

From: Adam Tack <adam.tack.513 <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50;
 Word-wrap can't wrap at zero-width space U-200B
Date: Sun, 17 Dec 2017 02:22:12 +0000
> Right, I missed that.  But then I guess it might be better to leave
> this a macro, as the function is not prettier.  Or maybe leave the
> case of the char-table being nil as a macro, and put the rest in a
> function, like we do in several other places.

I've split the non-nil char-table case out into a function, as I
think that using a named function slightly improves readability, and
a macro over 20 lines long somehow feels "wrong".  If the
compiler does actually follow the inline directive, there should be no
additional performance hit.

Is the convention for "static inline" functions to be in lower-case?
(The few cases that I've found in w32term.c are not capitalised.)

Attached is a working (and hopefully complete) patch for word-wrap
using the `word-wrap-chars' char-table (without a nice lisp
interface).


Also attached is a patch for the lisp interface.

However, upon reflection, I'm not sure whether my approach of using a
new minor mode (the horribly named `word-wrap-chars-mode') is the
best, as it's not very discoverable and adds yet another minor mode,
whose existence must be remembered and whose lighter will clutter the
mode-line.  The alternative would be to just use `visual-line-mode',
without changing the default behaviour and allow customization by
customizing `word-wrap-type'.

`visual-line-mode' would then be modified slightly so that, when
being enabled:

i)  it calls `set-word-wrap-chars',
ii) it saves `word-wrap-chars' to `visual-line--saved-state', if it
had been locally set (like for the other relevant variables),

and when disabled, it unsets `word-wrap-chars'.  The default of
`word-wrap-type' would be changed to ascii-whitespace.

By default, the effect of `visual-line-mode' would not change, other
than a redundant call to `set-word-wrap-chars' which would keep
`word-wrap-chars' set to nil.  `word-wrap-type' would have to be
customized for char-table-based wrapping to be used.  Would such
a change to `visual-line-mode' be acceptable?

(The attached patch for the lisp interface then mostly stays the same,
other than the dropping of `word-wrap-chars-mode'.)


An option to add all Unicode line-breaking characters from
LineBreak.txt to the `word-wrap-chars' char-table is still missing.
I'm not too comfortable with the unicode-category-table, the option is
probably not urgently needed, and it would be an incomplete
implementation anyway, since the algorithm can only use the last
character, not the following one or the surroundings.

Would the cleanest approach be to add a new `line-break' character
property (corresponding to the Unicode Line_Break:
http://unicode.org/reports/tr44/#Line_Break ), set it by parsing
LineBreak.txt (during compilation of emacs), and then set
`word-wrap-chars' based on which characters have the correct
`line-break' property (SP, ZW, BA, B2, HY (?), SY (?), CB (???),
SA (???)) when needed (and perhaps cache it)?
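
One reason a dedicated line-break property might be needed is that the
existing General_Category property does not separate breakable from
non-breaking spaces.  The sketch below looks that property up with the
existing function `get-char-code-property'; the helper name
`my-general-categories' is hypothetical and only serves this example.

```elisp
(defun my-general-categories (chars)
  "Return an alist of (CHAR . GENERAL-CATEGORY) for each character in CHARS.
Hypothetical helper for illustration only."
  (mapcar (lambda (c)
            (cons c (get-char-code-property c 'general-category)))
          chars))

;; SPACE, NO-BREAK SPACE and ZERO WIDTH SPACE.
(my-general-categories '(#x20 #xA0 #x200B))
```

With a current Unicode database this should report Zs for both SPACE
and NO-BREAK SPACE, and Cf for U+200B, so the general category alone
cannot express "wrapping is allowed here".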


Thank you for your continued guidance and best wishes,
Adam
[word_wrap_char_table.diff (text/plain, attachment)]
[lisp_interface.diff (text/plain, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#13399; Package emacs. (Fri, 18 Sep 2020 14:57:01 GMT) Full text and rfc822 format available.

Message #155 received at 13399 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Adam Tack <adam.tack.513 <at> gmail.com>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50; Word-wrap can't wrap at zero-width space
 U-200B
Date: Fri, 18 Sep 2020 16:55:40 +0200
Adam Tack <adam.tack.513 <at> gmail.com> writes:

> I've split the non-nil char-table case out into a function, as I
> think that using a named function slightly improves readability, and
> having a macro over 20 lines long, somehow feels "wrong".  If the
> compiler does actually follow the inline directive, there should be no
> additional performance hit.

This was the last post in the thread, and the patch no longer applied,
so I've respun it for Emacs 28.

However, I can't find any copyright assignment on file -- Adam, did you
go through with the assignment process?

diff --git a/doc/emacs/display.texi b/doc/emacs/display.texi
index e7b8745a04..9fcca8c6e6 100644
--- a/doc/emacs/display.texi
+++ b/doc/emacs/display.texi
@@ -1831,6 +1831,14 @@ Visual Line Mode
 report.  You can add categories to a character using the command
 @code{modify-category-entry}.
 
+@vindex word-wrap-chars
+@findex word-wrap-chars-mode
+  Word boundaries and hence points at which word wrap can occur are,
+by default, considered to occur on the space and tab characters.  If
+you prefer word-wrap to be permissible at other characters, you can
+change the value of the char-table @code{word-wrap-chars}, or use
+@code{word-wrap-chars-mode}, which does this for you.
+
 @node Display Custom
 @section Customization of Display
 
diff --git a/etc/NEWS b/etc/NEWS
index 54bad068f8..f3216ed445 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -73,6 +73,19 @@ its implementation has been removed from the Linux kernel.
 OpenBSD 5.3 and older releases are no longer supported, as they lack
 proper pty support that Emacs needs.
 
++++
+** The characters at which word-wrapping occurs can now be controlled
+using the new `word-wrap-chars' char-table.  If `word-wrap-chars' is
+nil (the default), then word-wrapping will occur only on the space or
+tab characters, as has been the case until now.
+
+The most convenient way to change the characters at which wrap occurs
+is customizing the new variable `word-wrap-type' and using the new
+`word-wrap-chars-mode' minor mode, which sets `word-wrap-chars' based
+on `word-wrap-type', for you.  The options for `word-wrap-type' are
+ascii-whitespace, unicode-whitespace and a customizable list of
+character codes and character code ranges.
+
 
 * Startup Changes in Emacs 28.1
 
diff --git a/lisp/simple.el b/lisp/simple.el
index 7dc695848b..b881cbc23e 100644
--- a/lisp/simple.el
+++ b/lisp/simple.el
@@ -7269,6 +7269,117 @@ turn-on-visual-line-mode
 (define-globalized-minor-mode global-visual-line-mode
   visual-line-mode turn-on-visual-line-mode)
 
+
+(defvar word-wrap-type)
+
+(defvar word-wrap-chars--saved nil)
+
+(define-minor-mode word-wrap-chars-mode
+  "Toggle wrapping using a look-up to `word-wrap-chars'.
+The exact choice of characters on which wrapping occurs, depends
+on the value of `word-wrap-type'.  By default, `word-wrap-type'
+is set to unicode-whitespace, which allows word wrapping on all
+breakable Unicode whitespace, not only space and tab.
+
+For details of other customization options, see
+`word-wrap-type'.
+
+This minor mode has no effect unless `visual-line-mode' is
+enabled or `word-wrap' is set to t.
+
+To toggle wrapping using a look-up, globally, use
+`global-word-wrap-chars-mode'."
+  :group 'visual-line
+  :lighter " wwc"
+  (if word-wrap-chars-mode
+      (progn
+        (if (local-variable-p 'word-wrap-chars)
+            (setq-local word-wrap-chars--saved
+                        word-wrap-chars))
+        (set-word-wrap-chars))
+    (setq-local word-wrap-chars word-wrap-chars--saved)))
+
+(defun turn-on-word-wrap-chars-mode ()
+  (word-wrap-chars-mode 1))
+
+(define-globalized-minor-mode global-word-wrap-chars-mode
+  word-wrap-chars-mode turn-on-word-wrap-chars-mode)
+
+(defun update-word-wrap-chars ()
+  "Update `word-wrap-chars' upon Customize of `word-wrap-type'.
+
+Only buffers which use the `word-wrap-chars-mode' are affected."
+  (mapc (lambda (buf)
+          (with-current-buffer buf
+            (when word-wrap-chars-mode
+              (set-word-wrap-chars))))
+        (buffer-list)))
+
+(defun set-word-wrap-chars ()
+  "Set `word-wrap-chars' locally, based on `word-wrap-type'."
+  (cond
+   ((eq word-wrap-type 'ascii-whitespace)
+    (setq-local word-wrap-chars nil))
+   ((eq word-wrap-type 'unicode-whitespace)
+    (set-word-wrap-chars-from-list
+     '(9 32 5760 (8192 . 8198) (8200 . 8203) 8287 12288)))
+   ((listp word-wrap-type)
+    (set-word-wrap-chars-from-list word-wrap-type))))
+
+(defun set-word-wrap-chars-from-list (list)
+  "Set `word-wrap-chars' locally from a list.
+Each element of the list can be a character code (code point) or
+a cons of character codes, representing the two (inclusive)
+endpoints of the range of characters."
+  (setq-local
+   word-wrap-chars
+   (let ((char-table (make-char-table nil nil)))
+     (dolist (range list char-table)
+       (set-char-table-range char-table range t)))))
+
+(defcustom word-wrap-type
+  'unicode-whitespace
+  "Characters on which word-wrap occurs.
+This variable controls the value of `word-wrap-chars' that is set
+by `word-wrap-chars-mode'.  `word-wrap-chars' determines on what
+characters word-wrapping can occur, when `word-wrap' is t or
+`visual-line-mode' is enabled.
+
+Possible values are ascii-whitespace, unicode-whitespace or a
+custom list of characters and character ranges.
+
+If the value is `ascii-whitespace', word-wrap is only on space
+and tab.  If the value is `unicode-whitespace', word-wrap is on
+all the Unicode whitespace characters that permit wrapping,
+including but not limited to space and tab.
+
+If a custom list of characters and ranges is used, word wrap is
+on these characters and character ranges.  The ranges are
+inclusive of both endpoints.
+
+When you change this without using customize, you need to call
+`update-word-wrap-chars' to update the word wrap in current
+buffers.  For instance:
+
+(setq word-wrap-type \\='(9 32 ?_))
+(update-word-wrap-chars)
+
+will set the wrappable characters to space, tab and underscore,
+in all buffers in `word-wrap-chars-mode' and using the default
+value of `word-wrap-type'.
+"
+  :type '(choice (const :tag "Space and tab" ascii-whitespace)
+		 (const :tag "All unicode spaces" unicode-whitespace)
+		 (repeat :tag "Custom characters or ranges"
+			 :value (9 32)
+			 (choice (character)
+				 (cons :tag "Range" character character))))
+  :set (lambda (symbol value)
+	 (set-default symbol value)
+	 (update-word-wrap-chars))
+  :group 'visual-line
+  :version "28.1")
+
 
 (defun transpose-chars (arg)
   "Interchange characters around point, moving forward one character.
diff --git a/src/buffer.c b/src/buffer.c
index 241f2d43a9..5c26323d69 100644
--- a/src/buffer.c
+++ b/src/buffer.c
@@ -5786,7 +5786,12 @@ syms_of_buffer (void)
 Visual Line mode.  Visual Line mode, when enabled, sets `word-wrap'
 to t, and additionally redefines simple editing commands to act on
 visual lines rather than logical lines.  See the documentation of
-`visual-line-mode'.  */);
+`visual-line-mode'.
+
+If `word-wrap-chars' is non-nil and a char-table, continuation lines
+are wrapped on the characters in `word-wrap-chars' whose value is t,
+rather than the space and tab characters.  `word-wrap-chars-mode'
+provides a convenient interface for using this.  */);
 
   DEFVAR_PER_BUFFER ("default-directory", &BVAR (current_buffer, directory),
 		     Qstringp,
diff --git a/src/character.c b/src/character.c
index 5860f6a0c8..032f4fc12b 100644
--- a/src/character.c
+++ b/src/character.c
@@ -1084,4 +1084,14 @@ syms_of_character (void)
 See The Unicode Standard for the meaning of those values.  */);
   /* The correct char-table is setup in characters.el.  */
   Vunicode_category_table = Qnil;
+
+  DEFVAR_LISP ("word-wrap-chars", Vword_wrap_chars,
+	       doc: /* A char-table for characters at which word-wrap occurs.
+Such characters have value t in this table.  If the char-table is nil,
+word-wrap occurs only on space and tab.
+
+For a more user-friendly way of changing the characters at which
+word-wrap can occur, consider using `word-wrap-chars-mode' and
+customizing `word-wrap-type'. */);
+  Vword_wrap_chars = Qnil;
 }
diff --git a/src/xdisp.c b/src/xdisp.c
index 615f0ca7cf..744b9a52c7 100644
--- a/src/xdisp.c
+++ b/src/xdisp.c
@@ -494,20 +494,42 @@ #define IT_OVERFLOW_NEWLINE_INTO_FRINGE(it) false
 #endif /* HAVE_WINDOW_SYSTEM */
 
 /* Test if the display element loaded in IT, or the underlying buffer
-   or string character, is a space or a TAB character.  This is used
-   to determine where word wrapping can occur.  */
+   or string character, is a space or tab (by default, to avoid the
+   unnecessary performance hit of char-table lookup).  If
+   word-wrap-chars is a char-table, then instead check if the relevant
+   element or character belongs to the char-table.  This is used to
+   determine where word wrapping can occur.  */
 
 #define IT_DISPLAYING_WHITESPACE(it)					\
-  ((it->what == IT_CHARACTER && (it->c == ' ' || it->c == '\t'))	\
-   || ((STRINGP (it->string)						\
-	&& (SREF (it->string, IT_STRING_BYTEPOS (*it)) == ' '		\
-	    || SREF (it->string, IT_STRING_BYTEPOS (*it)) == '\t'))	\
-       || (it->s							\
-	   && (it->s[IT_BYTEPOS (*it)] == ' '				\
-	       || it->s[IT_BYTEPOS (*it)] == '\t'))			\
-       || (IT_BYTEPOS (*it) < ZV_BYTE					\
-	   && (*BYTE_POS_ADDR (IT_BYTEPOS (*it)) == ' '			\
-	       || *BYTE_POS_ADDR (IT_BYTEPOS (*it)) == '\t'))))
+  (!CHAR_TABLE_P (Vword_wrap_chars)					\
+   ? ((it->what == IT_CHARACTER && (it->c == ' ' || it->c == '\t'))	\
+      || ((STRINGP (it->string)						\
+	   && (SREF (it->string, IT_STRING_BYTEPOS (*it)) == ' '	\
+	       || SREF (it->string, IT_STRING_BYTEPOS (*it)) == '\t'))	\
+	  || (it->s							\
+	      && (it->s[IT_BYTEPOS (*it)] == ' '			\
+		  || it->s[IT_BYTEPOS (*it)] == '\t'))			\
+	  || (IT_BYTEPOS (*it) < ZV_BYTE				\
+	      && (*BYTE_POS_ADDR (IT_BYTEPOS (*it)) == ' '		\
+		  || *BYTE_POS_ADDR (IT_BYTEPOS (*it)) == '\t'))))	\
+   : it_displaying_word_wrap_char (it))
+
+static inline bool
+char_is_word_wrap_char_p (int c) {
+  return !NILP (CHAR_TABLE_REF (Vword_wrap_chars, c));
+}
+
+static inline bool
+it_displaying_word_wrap_char (struct it *it) {
+  return ((it->what == IT_CHARACTER && char_is_word_wrap_char_p (it->c))
+	  || (STRINGP (it->string) && char_is_word_wrap_char_p
+	      (STRING_CHAR
+	       (SDATA (it->string) + IT_STRING_BYTEPOS (*it))))
+	  || (it->s && char_is_word_wrap_char_p
+	      (STRING_CHAR(it->s + IT_BYTEPOS (*it))))
+	  || (IT_BYTEPOS (*it) < ZV_BYTE && char_is_word_wrap_char_p
+	      (FETCH_CHAR (IT_BYTEPOS (*it)))));
+}
 
 /* These are the category sets we use.  They are defined by
    kinsoku.el and chracters.el.  */
diff --git a/test/manual/word-wrap-test.el b/test/manual/word-wrap-test.el
new file mode 100644
index 0000000000..593c2decc7
--- /dev/null
+++ b/test/manual/word-wrap-test.el
@@ -0,0 +1,127 @@
+;;; word-wrap-test.el --- Tests for word-wrap  -*- lexical-binding: t -*-
+
+;; Copyright (C) 2017 Free Software Foundation, Inc.
+
+;; This file is part of GNU Emacs.
+
+;; This program is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation, either version 3 of the License, or
+;; (at your option) any later version.
+
+;; This program is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with this program.  If not, see <https://www.gnu.org/licenses/>.
+
+;;; Commentary:
+
+;; Run the tests M-x word-wrap-test-[1-4] which correspond to the four
+;; combinations:
+;;
+;; i)  whitespace-mode being enabled and disabled,
+;;
+;; ii) word-wrap-chars being nil and equal to a char-table that
+;; specifies U-200B as the only word-wrap character.
+;;
+;; The tests with whitespace-mode are needed to help avoid a
+;; regression on Bug#11341.
+
+;;; Code:
+
+(defvar whitespace-display-mappings-for-zero-width-space
+  '((space-mark 32
+                [183]
+                [46])
+    (space-mark 160
+                [164]
+                [95])
+    (space-mark 8203
+                [164]
+                [95])
+    (newline-mark 10
+                  [36 10])
+    (tab-mark 9
+              [187 9]
+              [92 9])))
+
+(defun word-wrap-test-1 ()
+  "Check word-wrap for nil `word-wrap-chars'."
+  (interactive)
+  (let ((buf (get-buffer-create "*Word-wrap Test 1*")))
+    (with-current-buffer buf
+      (erase-buffer)
+      (insert "Word wrap should occur for space.\n\n")
+      (dotimes (i 100)
+        (insert "1234567 ")) ; Space
+      (insert "\n\nWord wrap should NOT occur for U-200B.\n\n")
+      (dotimes (i 100)
+        (insert "1234567​")) ; U-200B
+      (setq word-wrap t)
+      (setq-local word-wrap-chars nil)
+      (whitespace-mode -1)
+      (display-buffer buf))))
+
+(defun word-wrap-test-2 ()
+  "Check word-wrap for nil `word-wrap-chars' with whitespace-mode."
+  (interactive)
+  (let ((buf (get-buffer-create "*Word-wrap Test 2*")))
+    (with-current-buffer buf
+      (erase-buffer)
+      (insert "Word wrap should occur for space (displayed as `·').\n\n")
+      (dotimes (i 100)
+        (insert "1234567 ")) ; Space
+      (insert "\n\nWord wrap should NOT occur for U-200B (displayed as `¤').\n\n")
+      (dotimes (i 100)
+        (insert "1234567​")) ; U-200B
+      (setq word-wrap t)
+      (setq-local word-wrap-chars nil)
+      (setq-local whitespace-display-mappings
+                  whitespace-display-mappings-for-zero-width-space)
+      (whitespace-mode)
+      (display-buffer buf))))
+
+(defun word-wrap-test-3 ()
+  "Check word-wrap if `word-wrap-chars' is a char-table."
+  (interactive)
+  (let ((buf (get-buffer-create "*Word-wrap Test 3*")))
+    (with-current-buffer buf
+      (erase-buffer)
+      (insert "Word wrap should NOT occur for space.\n\n")
+      (dotimes (i 100)
+        (insert "1234567 ")) ; Space
+      (insert "\n\nWord wrap should occur for U-200B.\n\n")
+      (dotimes (i 100)
+        (insert "1234567​")) ; U-200B
+      (setq word-wrap t)
+      (setq-local word-wrap-chars
+                  (let ((ct (make-char-table nil nil)))
+                    (set-char-table-range ct 8203 t)
+                    ct))
+      (whitespace-mode -1)
+      (display-buffer buf))))
+
+(defun word-wrap-test-4 ()
+  "Check word-wrap if `word-wrap-chars' is a char-table, for whitespace-mode."
+  (interactive)
+  (let ((buf (get-buffer-create "*Word-wrap Test 4*")))
+    (with-current-buffer buf
+      (erase-buffer)
+      (insert "Word wrap should NOT occur for space (displayed as `·').\n\n")
+      (dotimes (i 100)
+        (insert "1234567 ")) ; Space
+      (insert "\n\nWord wrap should occur for U-200B (displayed as `¤').\n\n")
+      (dotimes (i 100)
+        (insert "1234567​")) ; U-200B
+      (setq word-wrap t)
+      (setq-local word-wrap-chars
+                  (let ((ct (make-char-table nil nil)))
+                    (set-char-table-range ct 8203 t)
+                    ct))
+      (setq-local whitespace-display-mappings
+                  whitespace-display-mappings-for-zero-width-space)
+      (whitespace-mode)
+      (display-buffer buf))))


-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#13399; Package emacs. (Fri, 18 Sep 2020 15:40:01 GMT) Full text and rfc822 format available.

Message #158 received at 13399 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: adam.tack.513 <at> gmail.com, 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50; Word-wrap can't wrap at zero-width space
 U-200B
Date: Fri, 18 Sep 2020 18:39:48 +0300
> From: Lars Ingebrigtsen <larsi <at> gnus.org>
> Cc: Eli Zaretskii <eliz <at> gnu.org>,  13399 <at> debbugs.gnu.org
> Date: Fri, 18 Sep 2020 16:55:40 +0200
> 
> Adam Tack <adam.tack.513 <at> gmail.com> writes:
> 
> > I've split the non-nil char-table case out into a function, as I
> > think that using a named function slightly improves readability, and
> > having a macro over 20 lines long, somehow feels "wrong".  If the
> > compiler does actually follow the inline directive, there should be no
> > additional performance hit.
> 
> This was the last post in the thread, and the patch no longer applied,
> so I've respun it for Emacs 28.

Since we now have word-wrap-by-category, wouldn't that solve the
problem, if we add the necessary category to the category set of
zero-width space character?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#13399; Package emacs. (Sat, 19 Sep 2020 13:16:01 GMT) Full text and rfc822 format available.

Message #161 received at 13399 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: adam.tack.513 <at> gmail.com, 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50; Word-wrap can't wrap at zero-width space
 U-200B
Date: Sat, 19 Sep 2020 15:15:32 +0200
Eli Zaretskii <eliz <at> gnu.org> writes:

> Since we now have word-wrap-by-category, wouldn't that solve the
> problem, if we add the necessary category to the category set of
> zero-width space character?

Ah, right.  So the solution here would be to use modify-category-entry
on U-200B, but I guess whether to do so depends on the use case, and
it's not something we would want to do by default?

So there doesn't seem to be anything more to do in this bug report, and
I'm closing it.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




bug closed, send any further explanations to 13399 <at> debbugs.gnu.org and martin rudalics <rudalics <at> gmx.at> Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Sat, 19 Sep 2020 13:16:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#13399; Package emacs. (Sat, 19 Sep 2020 14:37:01 GMT) Full text and rfc822 format available.

Message #166 received at 13399 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: adam.tack.513 <at> gmail.com, 13399 <at> debbugs.gnu.org
Subject: Re: bug#13399: 24.3.50; Word-wrap can't wrap at zero-width space
 U-200B
Date: Sat, 19 Sep 2020 17:36:45 +0300
> From: Lars Ingebrigtsen <larsi <at> gnus.org>
> Cc: adam.tack.513 <at> gmail.com,  13399 <at> debbugs.gnu.org
> Date: Sat, 19 Sep 2020 15:15:32 +0200
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> > Since we now have word-wrap-by-category, wouldn't that solve the
> > problem, if we add the necessary category to the category set of
> > zero-width space character?
> 
> Ah, right.  So the solution here would be to use modify-category-entry
> on U-200B, but I guess whether to do so depends on the use case, and
> it's not something we would want to do by default?

Yes, I think so.
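
For reference, the category-based approach discussed above might look
like the following in an init file.  This is an untested sketch, not
something posted in the thread: it assumes Emacs 28 or later (for
`word-wrap-by-category'), and that `|' is the line-breakable category
consulted by that option.  Whether the result is desirable depends on
the use case, as noted above.

```elisp
;; Mark U+200B (ZERO WIDTH SPACE) as line-breakable in the standard
;; category table.  This is global, so it is a per-user choice.
(modify-category-entry #x200B ?| (standard-category-table))

;; Let word-wrap also break after characters with the `|' category.
(setq word-wrap-by-category t)

;; Word wrapping itself still has to be on in the buffer, e.g.:
(add-hook 'text-mode-hook #'visual-line-mode)
```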




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 18 Oct 2020 11:24:13 GMT) Full text and rfc822 format available.

This bug report was last modified 3 years and 162 days ago.


