GNU bug report logs - #50983
28.0.50; [REGRESSION, BUG] Display bugs with uncommon characters


Package: emacs;

Reported by: Rudi C <rudiwillalwaysloveyou <at> gmail.com>

Date: Sat, 2 Oct 2021 22:51:02 UTC

Severity: normal

Found in version 28.0.50

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 50983 in the body.
You can then email your comments to 50983 AT debbugs.gnu.org in the normal way.



Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#50983; Package emacs. (Sat, 02 Oct 2021 22:51:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Rudi C <rudiwillalwaysloveyou <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Sat, 02 Oct 2021 22:51:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Rudi C <rudiwillalwaysloveyou <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: 28.0.50; [REGRESSION, BUG] Display bugs with uncommon characters
Date: Sun, 3 Oct 2021 02:20:24 +0330
[Message part 1 (text/plain, inline)]
I have two display bugs to report, one a regression that is not present in
emacs 27. I start with this regression.

1. `curl https://files.lilf.ir/tmp/weird.txt > weird.txt`
2. `emacs -Q -nw weird.txt`
3. try editing the text, deleting characters, etc. The character display
will get messed up.

Here is a screenshot of emacs before editing the file:

https://files.lilf.ir/tmp/tmp.kik6vbBw8S.png

And here is a screenshot after I do `backspace a`:
https://files.lilf.ir/tmp/tmp.Twz5ZXVbR6.png

I have tried this bug with emacs 27 (both myself and some other user on
IRC), and it is not present there.

The second bug:
1. `curl https://files.lilf.ir/tmp/bug.txt > bug.txt`
2. do `cat bug.txt` and note the output:
https://files.lilf.ir/tmp/tmp.HKfKc9PUds.png

3. `emacs -Q -nw bug.txt`
As you can see, emacs is displaying the file incorrectly:
https://files.lilf.ir/tmp/tmp.0yKbCbB80R.png

In particular, the line `#+TITLE: sharif/contact info` is not displayed at
all.

I could reproduce this bug on both emacs 27 and 28.

Additional info:

In GNU Emacs 28.0.50 (build 1, x86_64-apple-darwin20.3.0, NS appkit-2022.30
Version 11.2.1 (Build 20D75))
 of 2021-09-21 built on Fereidoons-MacBook-Pro.local
System Description:  macOS 11.2.1

Configured using:
 'configure --disable-dependency-tracking --disable-silent-rules
 --enable-locallisppath=/usr/local/share/emacs/site-lisp
 --infodir=/usr/local/Cellar/emacs-plus <at> 28/28.0.50/share/info/emacs
 --prefix=/usr/local/Cellar/emacs-plus <at> 28/28.0.50 --with-xml2
 --with-gnutls --with-native-compilation --without-dbus
 --with-imagemagick --with-modules --with-rsvg --with-xwidgets --with-ns
 --disable-ns-self-contained 'CFLAGS=-I/usr/local/opt/gcc/include
 -I/usr/local/opt/libgccjit/include -I/usr/local/opt/gmp/include
 -I/usr/local/opt/jpeg/include' 'LDFLAGS=-L/usr/local/lib/gcc/11
 -I/usr/local/opt/gcc/include -I/usr/local/opt/libgccjit/include
 -I/usr/local/opt/gmp/include -I/usr/local/opt/jpeg/include''

Configured features:
ACL GIF GLIB GMP GNUTLS IMAGEMAGICK JPEG JSON LCMS2 LIBXML2 MODULES
NATIVE_COMP NOTIFY KQUEUE NS PDUMPER PNG RSVG THREADS TIFF
TOOLKIT_SCROLL_BARS XIM XWIDGETS ZLIB

Important settings:
  value of $LC_ALL: en_US.UTF-8
  value of $LANG: en_US.UTF-8
  locale-coding-system: utf-8-unix

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#50983; Package emacs. (Sun, 03 Oct 2021 05:53:02 GMT) Full text and rfc822 format available.

Message #8 received at 50983 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Rudi C <rudiwillalwaysloveyou <at> gmail.com>
Cc: 50983 <at> debbugs.gnu.org
Subject: Re: bug#50983: 28.0.50;
 [REGRESSION, BUG] Display bugs with uncommon characters
Date: Sun, 03 Oct 2021 08:51:39 +0300
> From: Rudi C <rudiwillalwaysloveyou <at> gmail.com>
> Date: Sun, 3 Oct 2021 02:20:24 +0330
> 
> I have two display bugs to report, one a regression that is not present in emacs 27. I start with this
> regression.
> 
> 1. `curl https://files.lilf.ir/tmp/weird.txt > weird.txt`
> 2. `emacs -Q -nw weird.txt`
> 3. try editing the text, deleting characters, etc. The character display will get messed up.
> 
> Here is a screenshot of emacs before editing the file:
> 
> https://files.lilf.ir/tmp/tmp.kik6vbBw8S.png
> 
> And here is a screenshot after I do `backspace a`:
> https://files.lilf.ir/tmp/tmp.Twz5ZXVbR6.png
> 
> I have tried this bug with emacs 27 (both myself and some other user on IRC), and it is not present there. 
> 
> The second bug:
> 1. `curl https://files.lilf.ir/tmp/bug.txt > bug.txt`
> 2. do `cat bug.txt` and note the output:
> https://files.lilf.ir/tmp/tmp.HKfKc9PUds.png
> 
> 3. `emacs -Q -nw bug.txt`
> As you can see, emacs is displaying the file incorrectly:
> https://files.lilf.ir/tmp/tmp.0yKbCbB80R.png
> 
> In particular, the line `#+TITLE: sharif/contact info` is not displayed at all.
> 
> I could reproduce this bug on both emacs 27 and 28.

I'm unable to reproduce any of this on my system.  Both files display
correctly, and the problems after deleting characters and/or after
displaying the file in a -nw session don't happen.

This could be specific to macOS, where AFAIK the display is
implemented slightly differently from the other platforms.  Or maybe
something else is at work here.  For the -nw problems, this could
perhaps be related to the terminal emulator you are using (just a
guess, I have no real explanation how that could hide entire portions
of the file's display).

P.S. The site which you use to post the files is problematic: its
certificate is expired or invalid, and at least on one of my systems
wget said the TLS handshake failed, perhaps for the same reason.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#50983; Package emacs. (Sun, 03 Oct 2021 06:48:02 GMT) Full text and rfc822 format available.

Message #11 received at 50983 <at> debbugs.gnu.org (full text, mbox):

From: Rudi C <rudiwillalwaysloveyou <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 50983 <at> debbugs.gnu.org
Subject: Re: bug#50983: 28.0.50;
 [REGRESSION, BUG] Display bugs with uncommon characters
Date: Sun, 3 Oct 2021 10:17:21 +0330
[Message part 1 (text/plain, inline)]
> The site which you use to post the files is problematic: its
> certificate is expired or invalid.

I use caddy to automatically manage its certificates, and I don't get any
cert errors myself. Can you be more specific? Perhaps you need newer
versions of wget?

https://www.ssllabs.com/ssltest/analyze.html?d=files.lilf.ir also says
everything is okay.

> This could be specific to macOS

I tested `bug.txt` via SSH on an Ubuntu server with emacs 27.2, and it was
no different:
https://files.lilf.ir/tmp/tmp.mZxt5Pilap.png

Testing it with other terminal apps, none of the bugs occur with
`terminal.app`.

The RTL is all wrong on `terminal.app` though
(https://files.lilf.ir/tmp/tmp.1UjK8TYGoG.png), but I guess it's
unrelated. Alacritty doesn't show the bug, and it also doesn't mess up
the RTL shaping.

However, the bug is probably an interaction between both emacs and the
terminal app Kitty, as `vim` does not have this problem. Interestingly,
neovim does. (This is true for both of the bugs; vim doesn't have them,
emacs and nvim do, and only on Kitty.)

BTW, I tested using `command kitty --config=/dev/null`, so the bug did not
have anything to do with my Kitty config. (A screenshot in unconfigured Kitty:
https://files.lilf.ir/tmp/tmp.UcEXWQkTwn.png)

If you think the issue is to be upstreamed to Kitty, can you open an issue
on their Github? (https://github.com/kovidgoyal/kitty/issues)

Thanks.


Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#50983; Package emacs. (Sun, 03 Oct 2021 09:02:01 GMT) Full text and rfc822 format available.

Message #14 received at 50983 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Rudi C <rudiwillalwaysloveyou <at> gmail.com>
Cc: 50983 <at> debbugs.gnu.org
Subject: Re: bug#50983: 28.0.50;
 [REGRESSION, BUG] Display bugs with uncommon characters
Date: Sun, 03 Oct 2021 12:01:36 +0300
> From: Rudi C <rudiwillalwaysloveyou <at> gmail.com>
> Date: Sun, 3 Oct 2021 10:17:21 +0330
> Cc: 50983 <at> debbugs.gnu.org
> 
> > The site which you use to post the files is problematic: its
> > certificate is expired or invalid.
> 
> I use caddy to automatically manage its certificates, and I don't get any cert errors myself. Can you be more
> specific? Perhaps you need newer versions of wget? 

No, I don't think so.  Anyway, this is a tangent; if you think
everything is okay with the site, I can fetch files regardless.

> https://www.ssllabs.com/ssltest/analyze.html?d=files.lilf.ir also says everything is okay.
> 
> > This could be specific to macOS
> 
> I tested `bug.txt` via SSH on an Ubuntu server with emacs 27.2, and it was no different:
> https://files.lilf.ir/tmp/tmp.mZxt5Pilap.png

How exactly did you do that?  Where was Emacs running and where was the
display running?  Was that with or without X forwarding?

Also, this is the second file; what about the first one?  Do you see
problems on Ubuntu when deleting characters in it, and if so, with
which characters, and what problems does this cause?

> Testing it with other terminal apps, none of the bugs occur with `terminal.app`. 
> 
> The RTL is all wrong on `terminal.app` though (https://files.lilf.ir/tmp/tmp.1UjK8TYGoG.png)
> , but I guess it's unrelated. Alacritty doesn't show the bug, and it also doesn't mess up the RTL shaping.
> 

To display RTL text on a terminal, you need to turn off bidirectional
features of the terminal, if it has them, because Emacs performs the
bidirectional processing by itself.

> If you think the issue is to be upstreamed to Kitty, can you open an issue on their Github?
> (https://github.com/kovidgoyal/kitty/issues) 

Sorry, I wouldn't know what to write there, and cannot present any
data as I don't have Kitty installed.  I think it's better that you do
it.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#50983; Package emacs. (Sun, 03 Oct 2021 09:10:01 GMT) Full text and rfc822 format available.

Message #17 received at 50983 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Rudi C <rudiwillalwaysloveyou <at> gmail.com>
Cc: 50983 <at> debbugs.gnu.org
Subject: Re: bug#50983: 28.0.50; [REGRESSION, BUG] Display bugs with
 uncommon characters
Date: Sun, 03 Oct 2021 11:08:49 +0200
[Message part 1 (text/plain, inline)]
Rudi C <rudiwillalwaysloveyou <at> gmail.com> writes:

> I have two display bugs to report, one a regression that is not present in
> emacs 27. I start with this regression.
>
> 1. `curl https://files.lilf.ir/tmp/weird.txt > weird.txt`
> 2. `emacs -Q -nw weird.txt`
> 3. try editing the text, deleting characters, etc. The character display will get
> messed up.

I'm unable to reproduce this problem, but my Emacs looks very different
from yours -- I'm thinking of the line breaking in particular.  Here's
what mine looks like with "emacs -Q" under Debian/bullseye:

[Message part 2 (image/png, inline)]
[Message part 3 (text/plain, inline)]

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#50983; Package emacs. (Sun, 03 Oct 2021 09:12:02 GMT) Full text and rfc822 format available.

Message #20 received at 50983 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 50983 <at> debbugs.gnu.org, Rudi C <rudiwillalwaysloveyou <at> gmail.com>
Subject: Re: bug#50983: 28.0.50; [REGRESSION, BUG] Display bugs with
 uncommon characters
Date: Sun, 03 Oct 2021 11:11:23 +0200
Eli Zaretskii <eliz <at> gnu.org> writes:

> P.S. The site which you use to post the files is problematic: its
> certificate is expired or invalid, and at least on one of my systems
> wget said the TLS handshake failed, perhaps for the same reason.

It's not expired, but if you're getting messages saying that it is, it
probably means that your wget is too old.  It's due to the Let's Encrypt
meltdown on October 1st.  Here's one of the many, many threads about it;
basically you have to delete the X3 certificate:

https://stackoverflow.com/questions/69387175/git-for-windows-ssl-certificate-problem-certificate-has-expired

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#50983; Package emacs. (Sun, 03 Oct 2021 09:55:02 GMT) Full text and rfc822 format available.

Message #23 received at 50983 <at> debbugs.gnu.org (full text, mbox):

From: Alan Third <alan <at> idiocy.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 50983 <at> debbugs.gnu.org, Rudi C <rudiwillalwaysloveyou <at> gmail.com>
Subject: Re: bug#50983: 28.0.50; [REGRESSION, BUG] Display bugs with uncommon
 characters
Date: Sun, 3 Oct 2021 10:54:01 +0100
On Sun, Oct 03, 2021 at 08:51:39AM +0300, Eli Zaretskii wrote:
> 
> This could be specific to macOS, where AFAIK the display is
> implemented slightly differently from the other platforms.

As far as I'm aware the "-nw" display is implemented the same as any
other platform. For the record I can't see these problems on GUI emacs
on the mac, but I do see them in the terminal using iTerm2.

It looks like sometimes the display is incorrect, and other times the
action actually changes the underlying buffer contents in an
unexpected way.

-- 
Alan Third




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#50983; Package emacs. (Sun, 03 Oct 2021 10:05:01 GMT) Full text and rfc822 format available.

Message #26 received at 50983 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Alan Third <alan <at> idiocy.org>
Cc: 50983 <at> debbugs.gnu.org, rudiwillalwaysloveyou <at> gmail.com
Subject: Re: bug#50983: 28.0.50; [REGRESSION, BUG] Display bugs with uncommon
 characters
Date: Sun, 03 Oct 2021 13:04:36 +0300
> Date: Sun, 3 Oct 2021 10:54:01 +0100
> From: Alan Third <alan <at> idiocy.org>
> Cc: Rudi C <rudiwillalwaysloveyou <at> gmail.com>, 50983 <at> debbugs.gnu.org
> 
> On Sun, Oct 03, 2021 at 08:51:39AM +0300, Eli Zaretskii wrote:
> > 
> > This could be specific to macOS, where AFAIK the display is
> > implemented slightly differently from the other platforms.
> 
> As far as I'm aware the "-nw" display is implemented the same as any
> other platform.

I meant the GUI part of the report.

> For the record I can't see these problems on GUI emacs
> on the mac, but I do see them in the terminal using iTerm2.

Does iTerm2 have some support for bidirectional text?  If so, you need
to disable it for Emacs -nw to display bidirectional text correctly.

> It looks like sometimes the display is incorrect, and other times the
> action actually changes the underlying buffer contents in an
> unexpected way.

Any idea what could cause that?  Does it happen with plain-ASCII text
as well?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#50983; Package emacs. (Sun, 03 Oct 2021 10:26:01 GMT) Full text and rfc822 format available.

Message #29 received at 50983 <at> debbugs.gnu.org (full text, mbox):

From: Alan Third <alan <at> idiocy.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 50983 <at> debbugs.gnu.org, rudiwillalwaysloveyou <at> gmail.com
Subject: Re: bug#50983: 28.0.50; [REGRESSION, BUG] Display bugs with uncommon
 characters
Date: Sun, 3 Oct 2021 11:24:59 +0100
On Sun, Oct 03, 2021 at 01:04:36PM +0300, Eli Zaretskii wrote:
> > Date: Sun, 3 Oct 2021 10:54:01 +0100
> > From: Alan Third <alan <at> idiocy.org>
> > Cc: Rudi C <rudiwillalwaysloveyou <at> gmail.com>, 50983 <at> debbugs.gnu.org
> > 
> > For the record I can't see these problems on GUI emacs
> > on the mac, but I do see them in the terminal using iTerm2.
> 
> Does iTerm2 have some support for bidirectional text?  If so, you need
> to disable it for Emacs -nw to display bidirectional text correctly.

I don't see any options (but there are a LOT of options).

> > It looks like sometimes the display is incorrect, and other times the
> > action actually changes the underlying buffer contents in an
> > unexpected way.
> 
> Any idea what could cause that?  Does it happen with plain-ASCII text
> as well?

Actually, I was wrong. If I follow the instructions for the first
example, by removing the character indicated by an underscore in
Rudi's first screenshot, it actually deletes the previous "o" in
"note", and displays the rest wrongly, as shown in his second
screenshot.

If I put the cursor over that underscore character and do
describe-char, it tells me it's an "o", so the problem exists even
before editing.

I don't see these problems in normal ascii text, or even normal UTF-8
text, even RTL. For example the Hebrew text in HELLO behaves exactly
as I'd expect.
-- 
Alan Third




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#50983; Package emacs. (Sun, 03 Oct 2021 10:49:01 GMT) Full text and rfc822 format available.

Message #32 received at 50983 <at> debbugs.gnu.org (full text, mbox):

From: Rudi C <rudiwillalwaysloveyou <at> gmail.com>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: 50983 <at> debbugs.gnu.org
Subject: Re: bug#50983: 28.0.50;
 [REGRESSION, BUG] Display bugs with uncommon characters
Date: Sun, 3 Oct 2021 14:18:29 +0330
[Message part 1 (text/plain, inline)]
> I'm unable to reproduce this problem, but my Emacs looks very different
> from yours -- I'm thinking of the line breaking in particular.

Aren't you using emacs 27? Mine also looks like that in 27. It also doesn't
have the bug in that version.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#50983; Package emacs. (Sun, 03 Oct 2021 10:51:01 GMT) Full text and rfc822 format available.

Message #35 received at 50983 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Alan Third <alan <at> idiocy.org>
Cc: 50983 <at> debbugs.gnu.org, alan <at> idiocy.org, rudiwillalwaysloveyou <at> gmail.com
Subject: Re: bug#50983: 28.0.50; [REGRESSION, BUG] Display bugs with uncommon
 characters
Date: Sun, 03 Oct 2021 13:49:29 +0300
> Date: Sun, 3 Oct 2021 11:24:59 +0100
> From: Alan Third <alan <at> idiocy.org>
> Cc: rudiwillalwaysloveyou <at> gmail.com, 50983 <at> debbugs.gnu.org
> 
> > > It looks like sometimes the display is incorrect, and other times the
> > > action actually changes the underlying buffer contents in an
> > > unexpected way.
> > 
> > Any idea what could cause that?  Does it happen with plain-ASCII text
> > as well?
> 
> Actually, I was wrong. If I follow the instructions for the first
> example, by removing the character indicated by an underscore in
> Rudi's first screenshot, it actually deletes the previous "o" in
> "note", and displays the rest wrongly, as shown in his second
> screenshot.
> 
> If I put the cursor over that underscore character and do
> describe-char, it tells me it's an "o", so the problem exists even
> before editing.

Is this in a GUI frame or a TTY frame?

And what do you mean by "underscore character"? What is its Unicode
codepoint?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#50983; Package emacs. (Sun, 03 Oct 2021 11:02:02 GMT) Full text and rfc822 format available.

Message #38 received at 50983 <at> debbugs.gnu.org (full text, mbox):

From: Rudi C <rudiwillalwaysloveyou <at> gmail.com>
To: Alan Third <alan <at> idiocy.org>, Eli Zaretskii <eliz <at> gnu.org>, 
 Rudi C <rudiwillalwaysloveyou <at> gmail.com>, 50983 <at> debbugs.gnu.org
Subject: Re: bug#50983: 28.0.50;
 [REGRESSION, BUG] Display bugs with uncommon characters
Date: Sun, 3 Oct 2021 14:30:56 +0330
[Message part 1 (text/plain, inline)]
> I don't see these problems in normal ascii text, or even normal UTF-8 text

Are the text files I have attached not normal UTF-8? (I have no idea how to
notice if they are normal or not.)

> support for bidirectional text

This is irrelevant to these bugs, I think. If your terminal does RTL
reordering, then emacs does that, too, so the ordering will get reversed,
but it shouldn't have anything to do with these two bugs.

Also, can you be more specific about where you do observe the bugs? In TUI
emacs on iTerm?

I can confirm that the bug with `weird.txt` happens on iTerm, too, again
with both emacs and neovim! But the bug with `bug.txt` does not happen in
iTerm, only on Kitty.



Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#50983; Package emacs. (Sun, 03 Oct 2021 11:13:02 GMT) Full text and rfc822 format available.

Message #41 received at 50983 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Rudi C <rudiwillalwaysloveyou <at> gmail.com>
Cc: 50983 <at> debbugs.gnu.org, alan <at> idiocy.org, rudiwillalwaysloveyou <at> gmail.com
Subject: Re: bug#50983: 28.0.50;
 [REGRESSION, BUG] Display bugs with uncommon characters
Date: Sun, 03 Oct 2021 14:11:52 +0300
> From: Rudi C <rudiwillalwaysloveyou <at> gmail.com>
> Date: Sun, 3 Oct 2021 14:30:56 +0330
> 
> Also, can you be more specific about where you do observe the bugs? In TUI emacs on iTerm? 
> 
> I can confirm that the bug with `weird.txt` happens on iTerm, too, again with both emacs and neovim! But the
> bug with `bug.txt` does not happen in iTerm, only on Kitty.

This sounds like the terminal emulators have a problem in supporting
unusual Unicode characters, such as zero-width or double-width
characters, perhaps?  I see no Emacs problem here, since it happens
only on some terminal emulators and not on others.
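
Editor's sketch (not from the thread): one way to check whether a file
contains the kinds of characters suspected above is to scan it with
Python's standard `unicodedata` module. The function name
`find_suspect_chars` and the choice of classes (format characters,
combining marks, East Asian wide/fullwidth) are assumptions made for
illustration; terminal emulators differ in how they treat each class.

```python
import unicodedata


def find_suspect_chars(text):
    """Return (index, codepoint, name, reason) for each character whose
    on-screen width a terminal might not count as exactly one column."""
    hits = []
    for i, ch in enumerate(text):
        cat = unicodedata.category(ch)
        if cat == "Cf":
            # Format characters, e.g. ZERO WIDTH SPACE and SOFT HYPHEN.
            reason = "format"
        elif cat in ("Mn", "Me"):
            # Non-spacing and enclosing combining marks.
            reason = "combining"
        elif unicodedata.east_asian_width(ch) in ("W", "F"):
            # Characters usually drawn two columns wide.
            reason = "wide"
        else:
            continue
        name = unicodedata.name(ch, "<unnamed>")
        hits.append((i, f"U+{ord(ch):04X}", name, reason))
    return hits


if __name__ == "__main__":
    # Hypothetical sample text, not the contents of weird.txt.
    sample = "note-\u200btaking so\u00adft \u4e2d"
    for hit in find_suspect_chars(sample):
        print(hit)
```

Running this over the text of a downloaded file would list candidate
positions to inspect; it does not say which of them a given terminal
actually mishandles.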




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#50983; Package emacs. (Sun, 03 Oct 2021 11:27:01 GMT) Full text and rfc822 format available.

Message #44 received at 50983 <at> debbugs.gnu.org (full text, mbox):

From: Alan Third <alan <at> idiocy.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 50983 <at> debbugs.gnu.org, rudiwillalwaysloveyou <at> gmail.com
Subject: Re: bug#50983: 28.0.50; [REGRESSION, BUG] Display bugs with uncommon
 characters
Date: Sun, 3 Oct 2021 12:26:22 +0100
On Sun, Oct 03, 2021 at 01:49:29PM +0300, Eli Zaretskii wrote:
> > Date: Sun, 3 Oct 2021 11:24:59 +0100
> > From: Alan Third <alan <at> idiocy.org>
> > Cc: rudiwillalwaysloveyou <at> gmail.com, 50983 <at> debbugs.gnu.org
> > 
> > > > It looks like sometimes the display is incorrect, and other times the
> > > > action actually changes the underlying buffer contents in an
> > > > unexpected way.
> > > 
> > > Any idea what could cause that?  Does it happen with plain-ASCII text
> > > as well?
> > 
> > Actually, I was wrong. If I follow the instructions for the first
> > example, by removing the character indicated by an underscore in
> > Rudi's first screenshot, it actually deletes the previous "o" in
> > "note", and displays the rest wrongly, as shown in his second
> > screenshot.
> > 
> > If I put the cursor over that underscore character and do
> > describe-char, it tells me it's an "o", so the problem exists even
> > before editing.
> 
> Is this in a GUI frame or a TTY frame?

All TTY, GUI works fine.

> And what do you mean by "underscore character"? What is its Unicode
> codepoint?

In the screenshot (and in my own iTerm2 session) there is an
underscore character after "note-". I think it's inserted by the
terminal as a placeholder for something it doesn't understand.

In GUI Emacs that position in the file has a zero width character.

If I do describe-char on the underscore it says it's a plain ascii
"o", which is clearly incorrect. In GUI it says it's 8203 (0x200B),
"ZERO WIDTH SPACE", and as I said it displays as a zero width space.

I think I agree with your other email that it's down to the terminal
doing something strange with characters it doesn't understand.
-- 
Alan Third




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#50983; Package emacs. (Sun, 03 Oct 2021 12:03:02 GMT) Full text and rfc822 format available.

Message #47 received at 50983 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Alan Third <alan <at> idiocy.org>
Cc: 50983 <at> debbugs.gnu.org, rudiwillalwaysloveyou <at> gmail.com
Subject: Re: bug#50983: 28.0.50; [REGRESSION, BUG] Display bugs with uncommon
 characters
Date: Sun, 03 Oct 2021 15:02:20 +0300
> Date: Sun, 3 Oct 2021 12:26:22 +0100
> From: Alan Third <alan <at> idiocy.org>
> Cc: rudiwillalwaysloveyou <at> gmail.com, 50983 <at> debbugs.gnu.org
> 
> > And what do you mean by "underscore character"? What is its Unicode
> > codepoint?
> 
> In the screenshot (and in my own iTerm2 session) there is an
> underscore character after "note-". I think it's inserted by the
> terminal as a placeholder for something it doesn't understand.

No, it's a special face we use to display some characters that may
look like ASCII, but aren't.  See nobreak-char-display.

> In GUI Emacs that position in the file has a zero width character.
> 
> If I do describe-char on the underscore it says it's a plain ascii
> "o", which is clearly incorrect. In GUI it says it's 8203 (0x200B),
> "ZERO WIDTH SPACE", and as I said it displays as a zero width space.

Can you show the output of "C-x =" on all the characters, one by one,
starting from "n" in "note" and ending with "t" in "taking" after it?
Are they all incorrect, i.e. do not correspond to the place the cursor
is on?  That is, does the corruption start around there or does it
start much earlier (and if the latter, where does it start)?

> I think I agree with your other email that it's down to the terminal
> doing something strange with characters it doesn't understand.

If this is the case, the only way to fix the display is to use
us-ascii as terminal encoding.  Or maybe set up the terminal for a
"simpler" encoding, like latin-1, and then set up Emacs to match using
set-terminal-coding-system.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#50983; Package emacs. (Sun, 03 Oct 2021 12:55:02 GMT) Full text and rfc822 format available.

Message #50 received at 50983 <at> debbugs.gnu.org (full text, mbox):

From: Alan Third <alan <at> idiocy.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 50983 <at> debbugs.gnu.org, rudiwillalwaysloveyou <at> gmail.com
Subject: Re: bug#50983: 28.0.50; [REGRESSION, BUG] Display bugs with uncommon
 characters
Date: Sun, 3 Oct 2021 13:54:33 +0100
On Sun, Oct 03, 2021 at 03:02:20PM +0300, Eli Zaretskii wrote:
> Can you show the output of "C-x =" on all the characters, one by one,
> starting from "n" in "note" and ending with "t" in "taking" after it?
> Are they all incorrect, i.e. do not correspond to the place the cursor
> is on?  That is, does the corruption start around there or does it
> start much earlier (and if the latter, where does it start)?

n -> i
o -> t
t -> h
e -> SPC
- -> n
_ -> o
t -> t

So it's off-set by some 4 characters.

Looking at the raw file, there are 4 0xAD (SOFT HYPHEN) characters
before "note", and after each one the offset increases by one.

I do not see them displayed in the terminal.

> > I think I agree with your other email that it's down to the terminal
> > doing something strange with characters it doesn't understand.
> 
> If this is the case, the only way to fix the display is to use
> us-ascii as the terminal encoding.  Or maybe set up the terminal for a
> "simpler" encoding, like latin-1, and then set up Emacs to match it
> using set-terminal-coding-system.

Indeed, changing the "character encoding" setting in iTerm to ASCII
displays the soft-hyphens as a red "A" and everything seems to work
right.

The default is UTF-8 and it reports itself as "xterm-256color". I
suspect most terminal applications on macOS will default to UTF-8
since that's the default everywhere else, which might help explain why
this seems to be limited to macOS.
-- 
Alan Third




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#50983; Package emacs. (Sun, 03 Oct 2021 14:50:01 GMT)

Message #53 received at 50983 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Alan Third <alan <at> idiocy.org>
Cc: 50983 <at> debbugs.gnu.org, rudiwillalwaysloveyou <at> gmail.com
Subject: Re: bug#50983: 28.0.50; [REGRESSION, BUG] Display bugs with uncommon
 characters
Date: Sun, 03 Oct 2021 17:48:53 +0300
> Date: Sun, 3 Oct 2021 13:54:33 +0100
> From: Alan Third <alan <at> idiocy.org>
> Cc: 50983 <at> debbugs.gnu.org, rudiwillalwaysloveyou <at> gmail.com
> 
> n -> i
> o -> t
> t -> h
> e -> SPC
> - -> n
> _ -> o
> t -> t
> 
> So it's offset by 4 characters.
> 
> Looking at the raw file, there are 4 0xAD (SOFT HYPHEN) characters
> before "note", and after each one the offset increases by one.
> 
> I do not see them displayed in the terminal.

So this terminal seems to be unable to display those SOFT HYPHEN
characters (or maybe it's a "feature"?), and since Emacs knows nothing
about that, the relation between cursor position and buffer position
is disrupted.

> > If this is the case, the only way to fix the display is to use
> > us-ascii as the terminal encoding.  Or maybe set up the terminal for a
> > "simpler" encoding, like latin-1, and then set up Emacs to match it
> > using set-terminal-coding-system.
> 
> Indeed, changing the "character encoding" setting in iTerm to ASCII
> displays the soft-hyphens as a red "A" and everything seems to work
> right.
> 
> The default is UTF-8 and it reports itself as "xterm-256color". I
> suspect most terminal applications on macOS will default to UTF-8
> since that's the default everywhere else, which might help explain why
> this seems to be limited to macOS.

OK, thanks.  I think we can say this is not an Emacs problem.  I
recommend filing a bug with the developers of the terminal; maybe
they can tell you how to avoid this with some setting.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#50983; Package emacs. (Mon, 04 Oct 2021 08:06:02 GMT)

Message #56 received at 50983 <at> debbugs.gnu.org:

From: Rudi C <rudiwillalwaysloveyou <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 50983 <at> debbugs.gnu.org, Alan Third <alan <at> idiocy.org>
Subject: Re: bug#50983: 28.0.50;
 [REGRESSION, BUG] Display bugs with uncommon characters
Date: Mon, 4 Oct 2021 11:35:41 +0330
> I see no Emacs problem here

But the problem does not happen with vim (nor with emacs 27 for
`weird.txt`), so it is clearly an interaction of different elements.

Anyhow, I have opened an [upstream issue](
https://github.com/kovidgoyal/kitty/issues/4094). Please subscribe to it so
that you might offer your emacs expertise there, if needed.

> changing the "character encoding" setting in iTerm to ASCII

This is a most loathsome workaround. I do want UTF-8, as I use mathematical
symbols, emojis, and non-English languages. Anyhow, making the text full of
random unrecognized characters is not much better than the current behavior.

On Sun, Oct 3, 2021 at 2:42 PM Eli Zaretskii <eliz <at> gnu.org> wrote:

> > From: Rudi C <rudiwillalwaysloveyou <at> gmail.com>
> > Date: Sun, 3 Oct 2021 14:30:56 +0330
> >
> > Also, can you be more specific about where you do observe the bugs? In
> > TUI emacs on iTerm?
> >
> > I can confirm that the bug with `weird.txt` happens on iTerm, too, again
> > with both emacs and neovim! But the bug with `bug.txt` does not happen
> > in iTerm, only on Kitty.
>
> This sounds like the terminal emulators have a problem in supporting
> unusual Unicode characters, such as zero-width or double-width
> characters, perhaps?  I see no Emacs problem here, since it happens
> only on some terminal emulators and not on others.
>

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#50983; Package emacs. (Mon, 04 Oct 2021 12:41:02 GMT)

Message #59 received at 50983 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Rudi C <rudiwillalwaysloveyou <at> gmail.com>
Cc: 50983 <at> debbugs.gnu.org, alan <at> idiocy.org
Subject: Re: bug#50983: 28.0.50;
 [REGRESSION, BUG] Display bugs with uncommon characters
Date: Mon, 04 Oct 2021 15:40:05 +0300
> From: Rudi C <rudiwillalwaysloveyou <at> gmail.com>
> Date: Mon, 4 Oct 2021 11:35:41 +0330
> Cc: Alan Third <alan <at> idiocy.org>, 50983 <at> debbugs.gnu.org
> 
> But the problem does not happen with vim (nor with emacs 27 for `weird.txt`), so it is clearly an interaction
> of different elements. 
> 
> Anyhow, I have opened an [upstream issue](https://github.com/kovidgoyal/kitty/issues/4094). Please
> subscribe to it so that you might offer your emacs expertise there, if needed.

I subscribed and posted the following comment:

Emacs uses character width tables computed from the latest Unicode
Standard version 14.0.0, using the data in the file
EastAsianWidth.txt.  In that text, the U+00AD SOFT HYPHEN character,
which caused the problems in your file, has the East Asian Width
property value of A, which stands for "Ambiguous".  The definition of
this value in the Unicode Standard Annex 11 (UAX#11) is as follows:

  East Asian Ambiguous (A): All characters that can be sometimes wide
  and sometimes narrow. Ambiguous characters require additional
  information not contained in the character code to further resolve
  their width.

    Ambiguous characters occur in East Asian legacy character sets as
    wide characters, but as narrow (i.e., normal-width) characters in
    non-East Asian usage.

And since the file you show didn't have any East Asian legacy
characters, treating SOFT HYPHEN as narrow is IMO correct.
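
The property value can also be checked outside Emacs.  A minimal
sketch in Python, assuming the interpreter's bundled Unicode tables
(which may not be version 14.0.0) agree on this character;
resolve_width is a hypothetical helper written for this example and
is not how Emacs computes widths:

```python
import unicodedata

def resolve_width(ch, east_asian_context=False):
    """Resolve a column width from the East Asian Width property,
    following the UAX#11 wording quoted above (hypothetical helper)."""
    eaw = unicodedata.east_asian_width(ch)
    if eaw in ("W", "F"):          # Wide and Fullwidth: two columns
        return 2
    if eaw == "A":                 # Ambiguous: depends on the context
        return 2 if east_asian_context else 1
    return 1                       # everything else treated as narrow here

shy = "\u00ad"
print(unicodedata.name(shy), unicodedata.east_asian_width(shy))
print(resolve_width(shy), resolve_width(shy, east_asian_context=True))
```

The helper ignores combining and other zero-width characters, so it
only illustrates the Ambiguous case discussed here.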

> > changing the "character encoding" setting in iTerm to ASCII
> 
> This is a most loathsome workaround. I do want UTF-8, as I use mathematical symbols, emojis, and non-English
> languages. Anyhow, making the text full of random unrecognized characters is not much better than the
> current behavior.

It is better because it doesn't confuse the user regarding which
character he or she is editing.

But I agree with you that the results are hardly satisfactory, so my
recommendation is not to use Kitty in conjunction with Emacs.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#50983; Package emacs. (Mon, 04 Oct 2021 13:51:02 GMT)

Message #62 received at 50983 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: rudiwillalwaysloveyou <at> gmail.com
Cc: 50983 <at> debbugs.gnu.org, alan <at> idiocy.org
Subject: Re: bug#50983: 28.0.50;
 [REGRESSION, BUG] Display bugs with uncommon characters
Date: Mon, 04 Oct 2021 16:50:15 +0300
> Date: Mon, 04 Oct 2021 15:40:05 +0300
> From: Eli Zaretskii <eliz <at> gnu.org>
> Cc: 50983 <at> debbugs.gnu.org, alan <at> idiocy.org
> 
> > From: Rudi C <rudiwillalwaysloveyou <at> gmail.com>
> > Date: Mon, 4 Oct 2021 11:35:41 +0330
> > Cc: Alan Third <alan <at> idiocy.org>, 50983 <at> debbugs.gnu.org
> > 
> > But the problem does not happen with vim (nor with emacs 27 for `weird.txt`), so it is clearly an interaction
> > of different elements. 
> > 
> > Anyhow, I have opened an [upstream issue](https://github.com/kovidgoyal/kitty/issues/4094). Please
> > subscribe to it so that you might offer your emacs expertise there, if needed.
> 
> I subscribed and posted the following comment:
> 
> Emacs uses character width tables computed from the latest Unicode
> Standard version 14.0.0, using the data in the file
> EastAsianWidth.txt.  In that text, the U+00AD SOFT HYPHEN character,
> which caused the problems in your file, has the East Asian Width
> property value of A, which stands for "Ambiguous".  The definition of
> this value in the Unicode Standard Annex 11 (UAX#11) is as follows:
> 
>   East Asian Ambiguous (A): All characters that can be sometimes wide
>   and sometimes narrow. Ambiguous characters require additional
>   information not contained in the character code to further resolve
>   their width.
> 
>     Ambiguous characters occur in East Asian legacy character sets as
>     wide characters, but as narrow (i.e., normal-width) characters in
>     non-East Asian usage.
> 
> And since the file you show didn't have any East Asian legacy
> characters, treating SOFT HYPHEN as narrow is IMO correct.

To summarize the comments there:

The problematic character in the first example is U+00AD SOFT HYPHEN.
Kitty assumes that character is never rendered, and therefore
effectively treats it as zero-width character.
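
To check whether another file contains the same kind of character, a
scan along these lines may help.  It is a sketch: filtering on the
Unicode general category Cf (format) is an assumption of this
example, since the discussion only establishes U+00AD as a trigger,
and find_format_chars is a hypothetical name.

```python
import unicodedata

def find_format_chars(text):
    """Return (index, code point, name) for every character whose
    general category is Cf (format), e.g. U+00AD SOFT HYPHEN."""
    return [
        (index, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for index, ch in enumerate(text)
        if unicodedata.category(ch) == "Cf"
    ]

# Invented sample text with one soft hyphen between "a" and "b".
for hit in find_format_chars("a\u00adb"):
    print(hit)
```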

I don't see how Emacs display can possibly work correctly on such a
terminal, so I think we should close this bug report as "notabug".

For the second example, I think there could be an issue with character
compositions on this terminal, so the OP is advised to try turning off
auto-composition-mode.  If that solves the problem, fine; if not, I
guess Kitty once again assumes something about how such sequences are
rendered, and those assumptions don't fit how Emacs displays them in
reality, and if so, that problem, too, has no satisfactory solution
(and isn't an Emacs bug).




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#50983; Package emacs. (Sun, 04 Sep 2022 21:38:01 GMT)

Message #65 received at 50983 <at> debbugs.gnu.org:

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 50983 <at> debbugs.gnu.org, rudiwillalwaysloveyou <at> gmail.com, alan <at> idiocy.org
Subject: Re: bug#50983: 28.0.50; [REGRESSION, BUG] Display bugs with
 uncommon characters
Date: Sun, 04 Sep 2022 23:37:45 +0200
Eli Zaretskii <eliz <at> gnu.org> writes:

> I don't see how Emacs display can possibly work correctly on such a
> terminal, so I think we should close this bug report as "notabug".
>
> For the second example, I think there could be an issue with character
> compositions on this terminal, so the OP is advised to try turning off
> auto-composition-mode.  If that solves the problem, fine; if not, I
> guess Kitty once again assumes something about how such sequences are
> rendered, and those assumptions don't fit how Emacs displays them in
> reality, and if so, that problem, too, has no satisfactory solution
> (and isn't an Emacs bug).

So I guess the conclusion is that there's nothing to be done on the
Emacs side, and I'm therefore closing this bug report.




bug closed, send any further explanations to 50983 <at> debbugs.gnu.org and Rudi C <rudiwillalwaysloveyou <at> gmail.com> Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Sun, 04 Sep 2022 21:38:02 GMT)

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Mon, 03 Oct 2022 11:24:17 GMT)

This bug report was last modified 1 year and 177 days ago.
