Package: emacs;
Reported by: 张海 <dreaming.in.code.zh <at> gmail.com>
Date: Sat, 8 Mar 2025 06:54:01 UTC
Severity: normal
Found in version 30.1
Done: Eli Zaretskii <eliz <at> gnu.org>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 76852 in the body.
You can then email your comments to 76852 AT debbugs.gnu.org in the normal way.
Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#76852; Package emacs. (Sat, 08 Mar 2025 06:54:01 GMT)
Acknowledgement sent to 张海 <dreaming.in.code.zh <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org.
(Sat, 08 Mar 2025 06:54:01 GMT) Message #5 received at submit <at> debbugs.gnu.org:

From: 张海 <dreaming.in.code.zh <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: 30.1; Regression in whitespace-mode causes display issue under terminal
Date: Fri, 7 Mar 2025 13:22:45 -0800

Hi Emacs maintainers,

I recently updated from Emacs 29.4 to 30.1 and noticed an issue when I use global-whitespace-mode under terminal - each line would either automatically wrap before reaching the right edge of the terminal, or the cursor can go beyond the actual line end as if there were virtual spaces. Inserting new characters also causes characters after the insertion to drift right further than necessary and makes it impossible to properly edit a document. The amount of drifting seems to be related to the number of whitespace characters shown by whitespace-mode. An image showing the bug is available at https://imgur.com/a/PWbFgyE .

I verified that this issue occurs with `emacs -nw -q` (so that my .emacs isn't interfering) and the `*scratch*` buffer (its vanilla English-only content) using `M-x global-whitespace-mode` on Emacs 30.1 running under gnome-terminal, gnome-console and foot (the Wayland terminal emulator) on most recent Arch Linux running GNOME 3 on Wayland. My locale is zh_CN.UTF-8 and my terminal font, i.e. the `monospace` font, is set to `WenQuanYi Micro Hei Mono` (文泉驿等宽微米黑 in Chinese, part of the wqy-microhei Arch Linux package), if that matters. Downgrading to Emacs 29.4 immediately solved this issue so I believe it is a regression.

Here's a screen recording and the corresponding termscript of this issue:
- Video: https://imgur.com/a/IhA8JxD
- Termscript: https://files.catbox.moe/tskpy8

I've also attached some environment info and output from `M-x report-emacs-bug` below, and hopefully they can be helpful in debugging this. Thanks in advance!

Hai

TERM=xterm-256color
No /etc/termcap
LC_ALL=
LC_COLLATE=
LC_CTYPE=
LC_MESSAGES=
LC_TIME=
LANG=zh_CN.UTF-8
fc-match monospace: wqy-microhei.ttc: "文泉驿等宽微米黑" "Regular"

In GNU Emacs 30.1 (build 2, x86_64-pc-linux-gnu, GTK+ Version 3.24.48, cairo version 1.18.2)
Windowing system distributor 'The X.Org Foundation', version 11.0.12401006
System Description: Arch Linux

Configured using:
 'configure --with-x-toolkit=gtk3 --sysconfdir=/etc --prefix=/usr --libexecdir=/usr/lib --localstatedir=/var --disable-build-details --with-cairo --with-harfbuzz --with-libsystemd --with-modules --with-native-compilation=aot --with-tree-sitter 'CFLAGS=-march=x86-64 -mtune=generic -O2 -pipe -fno-plt -fexceptions -Wp,-D_FORTIFY_SOURCE=3 -Wformat -Werror=format-security -fstack-clash-protection -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -g -ffile-prefix-map=/build/emacs/src=/usr/src/debug/emacs -flto=auto' 'LDFLAGS=-Wl,-O1 -Wl,--sort-common -Wl,--as-needed -Wl,-z,relro -Wl,-z,now -Wl,-z,pack-relative-relocs -flto=auto''

Configured features:
ACL CAIRO DBUS FREETYPE GIF GLIB GMP GNUTLS GPM GSETTINGS HARFBUZZ JPEG LCMS2 LIBOTF LIBSYSTEMD LIBXML2 M17N_FLT MODULES NATIVE_COMP NOTIFY INOTIFY PDUMPER PNG RSVG SECCOMP SOUND SQLITE3 THREADS TIFF TOOLKIT_SCROLL_BARS TREE_SITTER WEBP X11 XDBE XIM XINPUT2 XPM GTK3 ZLIB

Important settings:
  value of $LANG: zh_CN.UTF-8
  value of $XMODIFIERS: @im=ibus
  locale-coding-system: utf-8-unix

Major mode: Fundamental

Minor modes in effect:
  global-whitespace-mode: t
  tooltip-mode: t
  global-eldoc-mode: t
  show-paren-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  blink-cursor-mode: t
  minibuffer-regexp-mode: t
  buffer-read-only: t
  line-number-mode: t
  indent-tabs-mode: t
  transient-mark-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t

Load-path shadows:
None found.

Features: (shadow sort mail-extr filecache compile comint ansi-osc ansi-color ring comp-run bytecomp byte-compile comp-common rx emacsbug message mailcap yank-media puny dired dired-loaddefs rfc822 mml mml-sec password-cache epa derived epg rfc6068 epg-config gnus-util text-property-search time-date subr-x mm-decode mm-bodies mm-encode mail-parse rfc2231 mailabbrev gmm-utils mailheader cl-loaddefs cl-lib sendmail rfc2047 rfc2045 ietf-drums mm-util mail-prsvr mail-utils cus-start cus-load disp-table whitespace china-util rmc iso-transl tooltip cconv eldoc paren electric uniquify ediff-hook vc-hooks lisp-float-type elisp-mode mwheel term/x-win x-win term/common-win x-dnd touch-screen tool-bar dnd fontset image regexp-opt fringe tabulated-list replace newcomment text-mode lisp-mode prog-mode register page tab-bar menu-bar rfn-eshadow isearch easymenu timer select scroll-bar mouse jit-lock font-lock syntax font-core term/tty-colors frame minibuffer nadvice seq simple cl-generic indonesian philippine cham georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms cp51932 hebrew greek romanian slovak czech european ethiopic indian cyrillic chinese composite emoji-zwj charscript charprop case-table epa-hook jka-cmpr-hook help abbrev obarray oclosure cl-preloaded button loaddefs theme-loaddefs faces cus-face macroexp files window text-properties overlay sha1 md5 base64 format env code-pages mule custom widget keymap hashtable-print-readable backquote threads dbusbind inotify lcms2 dynamic-setting system-font-setting font-render-setting cairo gtk x-toolkit xinput2 x multi-tty move-toolbar make-network-process native-compile emacs) Memory information: ((conses 16 81904 15452) (symbols 48 7904 0) (strings 32 19386 2345) (string-bytes 1 583198) (vectors 16 12555) (vector-slots 8 212422 11491) (floats 8 22 3) (intervals 56 411 13) (buffers 992 12))
(Sat, 08 Mar 2025 11:16:02 GMT) Message #8 received at 76852 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: 张海 <dreaming.in.code.zh <at> gmail.com>
Cc: 76852 <at> debbugs.gnu.org
Subject: Re: bug#76852: 30.1; Regression in whitespace-mode causes display issue under terminal
Date: Sat, 08 Mar 2025 13:15:42 +0200

> From: 张海 <dreaming.in.code.zh <at> gmail.com>
> Date: Fri, 7 Mar 2025 13:22:45 -0800
>
> I recently updated from Emacs 29.4 to 30.1 and noticed an issue when I
> use global-whitespace-mode under terminal - each line would either
> automatically wrap before reaching the right edge of the terminal, or
> the cursor can go beyond the actual line end as if there were virtual
> spaces. Inserting new characters also causes characters after the
> insertion to drift right further than necessary and makes it
> impossible to properly edit a document. The amount of drifting seems
> to be related to the number of whitespace characters shown by
> whitespace-mode. An image showing the bug is available at
> https://imgur.com/a/PWbFgyE .
>
> I verified that this issue occurs with `emacs -nw -q` (so that my
> .emacs isn't interfering) and the `*scratch*` buffer (its vanilla
> English-only content) using `M-x global-whitespace-mode` on Emacs 30.1
> running under gnome-terminal, gnome-console and foot (the Wayland
> terminal emulator) on most recent Arch Linux running GNOME 3 on
> Wayland. My locale is zh_CN.UTF-8 and my terminal font, i.e. the
> `monospace` font, is set to `WenQuanYi Micro Hei Mono` (文泉驿等宽微米黑 in
> Chinese, part of the wqy-microhei Arch Linux package), if that
> matters. Downgrading to Emacs 29.4 immediately solved this issue so I
> believe it is a regression.
>
> Here's a screen recording and the corresponding termscript of this issue:
> - Video: https://imgur.com/a/IhA8JxD
> - Termscript: https://files.catbox.moe/tskpy8
>
> I've also attached some environment info and output from `M-x
> report-emacs-bug` below, and hopefully they can be helpful in
> debugging this. Thanks in advance!

Thanks. I cannot reproduce these problems, but then I don't have access to a system with gnome-console.

Can someone please reproduce these display issues and debug them?

(Sat, 08 Mar 2025 20:44:02 GMT) Message #11 received at 76852 <at> debbugs.gnu.org:

From: 张海 <dreaming.in.code.zh <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 76852 <at> debbugs.gnu.org
Subject: Re: bug#76852: 30.1; Regression in whitespace-mode causes display issue under terminal
Date: Sat, 8 Mar 2025 12:43:09 -0800

> I cannot reproduce these problems, but then I don't have access to a
> system with gnome-console.
>
> Can someone please reproduce these display issues and debug them?

Thanks for the reply. I did some further debugging and found out this issue also disappears on 30.1 if I set my LANG=en_US.UTF-8 , whereas my current env is LANG=zh_CN.UTF-8 . The same issue still exists if I set LANG=ja_JP.UTF-8 . (I do have all the three locales enabled in my /etc/locale.gen .) So I suspect this might be a regression in 30.1 (compared to 29.4) about logic handling full/half width characters under terminal for the whitespace-mode characters (e.g. middle dot) when LANG is a CJK locale.

I also tried to identify if this issue is related to my font configuration, but the issue still happened when I prepended `DejaVu Sans Mono` to my `monospace` font and verified that it became my terminal font. So I think this issue is less likely related to my font.

(Sun, 09 Mar 2025 02:54:03 GMT) Message #14 received at 76852 <at> debbugs.gnu.org:

From: 张海 <dreaming.in.code.zh <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 76852 <at> debbugs.gnu.org
Subject: Re: bug#76852: 30.1; Regression in whitespace-mode causes display issue under terminal
Date: Sat, 8 Mar 2025 18:53:35 -0800

> Thanks for the reply. I did some further debugging and found out this
> issue also disappears on 30.1 if I set my LANG=en_US.UTF-8 , whereas
> my current env is LANG=zh_CN.UTF-8 . The same issue still exists if I
> set LANG=ja_JP.UTF-8 . (I do have all the three locales enabled in my
> /etc/locale.gen .) So I suspect this might be a regression in 30.1
> (compared to 29.4) about logic handling full/half width characters
> under terminal for the whitespace-mode characters (e.g. middle dot)
> when LANG is a CJK locale.

I want to add some more tests that I did, with en_US.UTF-8 and zh_CN.UTF-8 enabled in my /etc/locale.gen :

1. env -i TERM=$TERM LANG=zh_CN.UTF-8 emacs -nw -q shows the issue;
2. env -i TERM=$TERM LANG=zh_CN.UTF-8 LC_CTYPE=en_US.UTF-8 emacs -nw -q doesn't show the issue;
3. env -i TERM=$TERM LC_CTYPE=zh_CN.UTF-8 emacs -nw -q shows the issue again.

So I believe this is an issue that more precisely happens when LC_CTYPE is set to a CJK locale.

And I tried the tests above under tty (which uses the kernel default font) and got the same result. So I'm relatively confident this issue is unrelated to my fonts now.

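When repeating the three invocations above, it can help to confirm from inside each session which locale settings Emacs actually inherited. The following is only a sketch to evaluate with `M-:` or in `*scratch*`; the message format is arbitrary and not something used in this report.

```elisp
;; Report the locale-related environment as seen from inside Emacs.
;; `getenv' returns nil for unset variables, which "%s" prints as "nil".
(message "LANG=%s LC_CTYPE=%s LC_ALL=%s language-environment=%s"
         (getenv "LANG")
         (getenv "LC_CTYPE")
         (getenv "LC_ALL")
         current-language-environment)
```

The result is shown in the echo area and recorded in the `*Messages*` buffer, so the three runs can be compared side by side.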
(Sun, 09 Mar 2025 06:13:02 GMT) Message #17 received at 76852 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: 张海 <dreaming.in.code.zh <at> gmail.com>
Cc: 76852 <at> debbugs.gnu.org
Subject: Re: bug#76852: 30.1; Regression in whitespace-mode causes display issue under terminal
Date: Sun, 09 Mar 2025 08:12:07 +0200

> From: 张海 <dreaming.in.code.zh <at> gmail.com>
> Date: Sat, 8 Mar 2025 12:43:09 -0800
> Cc: 76852 <at> debbugs.gnu.org
>
> > I cannot reproduce these problems, but then I don't have access to a
> > system with gnome-console.
> >
> > Can someone please reproduce these display issues and debug them?
>
> Thanks for the reply. I did some further debugging and found out this
> issue also disappears on 30.1 if I set my LANG=en_US.UTF-8 , whereas
> my current env is LANG=zh_CN.UTF-8 . The same issue still exists if I
> set LANG=ja_JP.UTF-8 . (I do have all the three locales enabled in my
> /etc/locale.gen .) So I suspect this might be a regression in 30.1
> (compared to 29.4) about logic handling full/half width characters
> under terminal for the whitespace-mode characters (e.g. middle dot)
> when LANG is a CJK locale.

That rings a bell. What happens if, before turning on whitespace-mode for the first time in a session, you customize the variable cjk-ambiguous-chars-are-wide to the nil value? This must be done either via setopt or interactively via customize-option, not via setq. Does the problem go away if you do that, and then turn on global-whitespace-mode?

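A minimal sketch of the suggestion above, assuming an Emacs 30 session started with `emacs -nw -q` under a CJK locale. It uses only the names mentioned in the message (`setopt`, `cjk-ambiguous-chars-are-wide`, `global-whitespace-mode`); the ordering is the point of the example.

```elisp
;; Evaluate before whitespace-mode is enabled for the first time.
;; `setopt' goes through the option's Custom machinery; per the
;; message above, a plain `setq' is not sufficient for this variable.
(setopt cjk-ambiguous-chars-are-wide nil)

;; Only afterwards turn on the mode whose display was affected.
(global-whitespace-mode 1)
```

The same two forms could go into an init file in this order; note that the option affects the whole Emacs session, not just whitespace-mode.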
(Sun, 09 Mar 2025 07:03:02 GMT) Message #20 received at 76852 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: 张海 <dreaming.in.code.zh <at> gmail.com>
Cc: 76852 <at> debbugs.gnu.org
Subject: Re: bug#76852: 30.1; Regression in whitespace-mode causes display issue under terminal
Date: Sun, 09 Mar 2025 09:01:47 +0200

> From: 张海 <dreaming.in.code.zh <at> gmail.com>
> Date: Sat, 8 Mar 2025 18:53:35 -0800
> Cc: 76852 <at> debbugs.gnu.org
>
> 1. env -i TERM=$TERM LANG=zh_CN.UTF-8 emacs -nw -q shows the issue;
> 2. env -i TERM=$TERM LANG=zh_CN.UTF-8 LC_CTYPE=en_US.UTF-8 emacs -nw
> -q doesn't show the issue;
> 3. env -i TERM=$TERM LC_CTYPE=zh_CN.UTF-8 emacs -nw -q shows the issue
> again.
>
> So I believe this is an issue that more precisely happens when
> LC_CTYPE is set to a CJK locale.
>
> And I tried the tests above under tty (which uses the kernel default
> font) and got the same result. So I'm relatively confident this issue
> is unrelated to my fonts now.

Can you look for a font where the characters U+00B7 MIDDLE DOT and U+00A4 CURRENCY SIGN have fullwidth glyphs? I presume some CJK-specific fonts should be like that, since these characters are referred to as having "ambiguous width" in the Unicode character database. If you can find such a font, please try setting up gnome-console to use it, and see if the problem then goes away.

Thanks.

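To see how many columns Emacs itself assumes for the two characters named above, a small check along these lines could be evaluated in the affected session. It is a sketch for illustration only; the output format is made up here, and the reported widths depend on the locale and on `cjk-ambiguous-chars-are-wide`.

```elisp
;; For each character, log its code point, Unicode name, and the
;; column width Emacs currently assigns to it via `char-width'.
(dolist (ch '(#xB7 #xA4))
  (message "U+%04X %s: char-width %d"
           ch
           (get-char-code-property ch 'name)
           (char-width ch)))
```

Both lines end up in the `*Messages*` buffer. Comparing them with the number of cells the terminal emulator actually draws for each character shows whether Emacs and the terminal disagree.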
(Wed, 12 Mar 2025 04:57:02 GMT) Message #23 received at 76852 <at> debbugs.gnu.org:

From: 张海 <dreaming.in.code.zh <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 76852 <at> debbugs.gnu.org
Subject: Re: bug#76852: 30.1; Regression in whitespace-mode causes display issue under terminal
Date: Tue, 11 Mar 2025 21:56:00 -0700

> That rings a bell. What happens if, before turning on whitespace-mode
> for the first time in a session, you customize the variable
> cjk-ambiguous-chars-are-wide to the nil value? This must be done
> either via setopt or interactively via customize-option, not via setq.
> Does the problem go away if you do that, and then turn on
> global-whitespace-mode?

I can confirm this does make the issue go away.

But looking at the NEWS announcement for cjk-ambiguous-chars-are-wide, it seems to suggest the previous default was already full-width, which conflicts with the fact that whitespace-mode was working fine in terminal in Emacs 29.4 under a CJK locale. So was there an unintentional default behavior change, and should that be fixed?

> Can you look for a font where the characters U+00B7 MIDDLE DOT and
> U+00A4 CURRENCY SIGN have fullwidth glyphs? I presume some
> CJK-specific fonts should be like that, since these characters are
> referred to as having "ambiguous width" in the Unicode character
> database. If you can find such a font, please try setting up
> gnome-console to use it, and see if the problem then goes away.

No, the same issue persisted and the only difference is that this time only the first half of the full-width middle dot was rendered.

(Wed, 12 Mar 2025 14:22:01 GMT) Message #26 received at 76852 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: 张海 <dreaming.in.code.zh <at> gmail.com>
Cc: 76852 <at> debbugs.gnu.org
Subject: Re: bug#76852: 30.1; Regression in whitespace-mode causes display issue under terminal
Date: Wed, 12 Mar 2025 16:21:12 +0200

> From: 张海 <dreaming.in.code.zh <at> gmail.com>
> Date: Tue, 11 Mar 2025 21:56:00 -0700
> Cc: 76852 <at> debbugs.gnu.org
>
> > That rings a bell. What happens if, before turning on whitespace-mode
> > for the first time in a session, you customize the variable
> > cjk-ambiguous-chars-are-wide to the nil value? This must be done
> > either via setopt or interactively via customize-option, not via setq.
> > Does the problem go away if you do that, and then turn on
> > global-whitespace-mode?
>
> I can confirm this does make the issue go away.

OK, so we now at least understand what caused the issue.

> But looking at the NEWS announcement for cjk-ambiguous-chars-are-wide,
> it seems to suggest the previous default was already full-width, which
> conflicts with the fact that whitespace-mode was working fine in
> terminal in Emacs 29.4 under a CJK locale. So was there an
> unintentional default behavior change, and should that be fixed?

The change was intentional, but it follows the Unicode data tables, so evidently some characters which were previously half-width became full-width by default in Emacs 30. Thus, the NEWS text is indeed slightly misleading.

> > Can you look for a font where the characters U+00B7 MIDDLE DOT and
> > U+00A4 CURRENCY SIGN have fullwidth glyphs? I presume some
> > CJK-specific fonts should be like that, since these characters are
> > referred to as having "ambiguous width" in the Unicode character
> > database. If you can find such a font, please try setting up
> > gnome-console to use it, and see if the problem then goes away.
>
> No, the same issue persisted and the only difference is that this time
> only the first half of the full-width middle dot was rendered.

Hmm... does gnome-console have any configuration options for this? It sounds like it assumes these characters are half-width regardless of what the font does. This URL:

  https://superuser.com/questions/573876/how-to-let-gnome-terminal-to-use-specific-font-to-display-punctuations-in-their

seems to imply you should be able to control this aspect of the terminal.

(Thu, 13 Mar 2025 06:50:01 GMT) Message #29 received at 76852 <at> debbugs.gnu.org:

From: 张海 <dreaming.in.code.zh <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 76852 <at> debbugs.gnu.org
Subject: Re: bug#76852: 30.1; Regression in whitespace-mode causes display issue under terminal
Date: Wed, 12 Mar 2025 23:49:17 -0700

> The change was intentional, but it follows the Unicode data tables, so
> evidently some characters which were previously half-width became
> full-width by default in Emacs 30. Thus, the NEWS text is indeed
> slightly misleading.

Does that mean the solution is that users who use a CJK locale will need to set cjk-ambiguous-chars-are-wide to nil in order for whitespace-mode to work? It seems to me this might mean most users with a CJK locale will need to do that if they ever use whitespace-mode, because it never makes sense for the whitespace replacement char (the middle dot) to occupy the space of two characters.

> Hmm... does gnome-console have any configuration options for this? It
> sounds like it assumes these characters are half-width regardless of
> what the font does. This URL:
>
> https://superuser.com/questions/573876/how-to-let-gnome-terminal-to-use-specific-font-to-display-punctuations-in-their
>
> seems to imply you should be able to control this aspect of the
> terminal.

I can confirm this does allow the middle dot to show up as two characters wide (even without a font change), which looks strange but does make line wrapping and editing work.

(Thu, 13 Mar 2025 07:37:01 GMT) Message #32 received at 76852 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: 张海 <dreaming.in.code.zh <at> gmail.com>
Cc: 76852 <at> debbugs.gnu.org
Subject: Re: bug#76852: 30.1; Regression in whitespace-mode causes display issue under terminal
Date: Thu, 13 Mar 2025 09:35:59 +0200

> From: 张海 <dreaming.in.code.zh <at> gmail.com>
> Date: Wed, 12 Mar 2025 23:49:17 -0700
> Cc: 76852 <at> debbugs.gnu.org
>
> > The change was intentional, but it follows the Unicode data tables, so
> > evidently some characters which were previously half-width became
> > full-width by default in Emacs 30. Thus, the NEWS text is indeed
> > slightly misleading.
>
> Does that mean the solution is that users who use a CJK locale will
> need to set cjk-ambiguous-chars-are-wide to nil in order for
> whitespace-mode to work?

Not necessarily. I still haven't decided what would be the best solution for that, but telling users to set cjk-ambiguous-chars-are-wide to nil for the benefit of whitespace-mode is definitely not high on the list, because the effects of that variable are global on the entire Emacs session. I'm asking these questions in order to understand better the issues and the possible solutions.

Currently, the two solutions I'm pondering are:

  . remove from the list of ambiguous-width characters some of the characters with low Unicode codepoints, including the 2 characters used by whitespace-mode, on the assumption that these characters are unlikely to be full-width in fonts used by Emacs users in CJK locales
  . change the default of cjk-ambiguous-chars-are-wide to nil, on the assumption that most users in CJK locales use fonts where these characters have half-width glyphs

> > Hmm... does gnome-console have any configuration options for this? It
> > sounds like it assumes these characters are half-width regardless of
> > what the font does. This URL:
> >
> > https://superuser.com/questions/573876/how-to-let-gnome-terminal-to-use-specific-font-to-display-punctuations-in-their
> >
> > seems to imply you should be able to control this aspect of the
> > terminal.
>
> I can confirm this does allow the middle dot to show up as two
> characters wide (even without a font change), which looks strange but
> does make line wrapping and editing work.

OK, so in any case, we should extend the doc string of cjk-ambiguous-chars-are-wide to mention the possible need to customize the terminal emulator to draw the ambiguous-width characters according to the font that is actually being used.

Thanks. Let me think a bit more about what would be the best solution. But could you tell which font you used that has full-width glyphs for these characters? Is it unusual to use that font for terminal emulators in CJK locales? I'm wondering why we didn't hear by now complaints from CJK users about the effects of cjk-ambiguous-chars-are-wide other than on whitespace-mode. Maybe the other ambiguous-width characters are seldom used in practice? Or maybe too little time has passed since Emacs 30 was released?

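The first option above could be approximated for a single session by overriding the width of the two characters directly. This is only an illustrative sketch and not the change being considered for Emacs itself. It assumes that editing entries of `char-width-table` with `set-char-table-range` is acceptable for local experiments, and the override may be lost whenever Emacs rebuilds that table (for example after changing the language environment or re-customizing `cjk-ambiguous-chars-are-wide`).

```elisp
;; Session-local experiment, not an official fix: force the two
;; characters displayed by whitespace-mode (U+00B7 MIDDLE DOT and
;; U+00A4 CURRENCY SIGN) to occupy one column each.
(dolist (ch '(#xB7 #xA4))
  (set-char-table-range char-width-table ch 1))

;; Inspect the widths Emacs now reports for the two characters.
(list (char-width #xB7) (char-width #xA4))
```

Unlike setting `cjk-ambiguous-chars-are-wide` to nil, this leaves the width of every other ambiguous-width character untouched, which is the trade-off the first option is aiming for.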
(Thu, 13 Mar 2025 09:33:02 GMT) Message #35 received at 76852 <at> debbugs.gnu.org:

From: 张海 <dreaming.in.code.zh <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 76852 <at> debbugs.gnu.org
Subject: Re: bug#76852: 30.1; Regression in whitespace-mode causes display issue under terminal
Date: Thu, 13 Mar 2025 02:32:33 -0700

> Currently, the two solutions I'm pondering are:
>
> . remove from the list of ambiguous-width characters some of the
> characters with low Unicode codepoints, including the 2 characters
> used by whitespace-mode, on the assumption that these characters
> are unlikely to be full-width in fonts used by Emacs users in CJK
> locales

I wonder how emacs 29.4 dealt with this - did it also have a special list of characters that it treats as half-width, while the rest are treated as full-width like what was in the announcement?

> . change the default of cjk-ambiguous-chars-are-wide to nil, on
> the assumption that most users in CJK locales use fonts where
> these characters have half-width glyphs

I think this might be a good option because both TTY and some popular terminal emulators like gnome-terminal, gnome-console and foot ship with the ambiguous CJK characters defaulted to half width. (I tested them and saw how wide the middle dot was.) And some of them don't even offer an option to change it to full width. I don't know about other terminal emulators like Konsole though.

> Thanks. Let me think a bit more about what would be the best
> solution. But could you tell which font you used that has full-width
> glyphs for these characters? Is it unusual to use that font for
> terminal emulators in CJK locales?

The fonts I'm using, and most Chinese Linux users may be using (to my understanding), are:
- WenQuanYi (WQY) fonts
- Noto (i.e. Source Han, a different branding) fonts
See also https://wiki.archlinux.org/title/Localization/Chinese

The WenQuanYi fonts have a much longer history and the middle dot is always half width in them (checked just now). I'm currently using WenQuanYi Micro Hei.

The Noto/Source Han fonts are relatively new and have an interesting situation where the Noto Sans CJK SC/TC have the middle dot as full width but Noto Sans CJK JP/KR have it as half width.

Some Microsoft proprietary system-default fonts for Chinese characters, e.g. SimSun and SimHei, also have the middle dot as full width, but I guess few Linux users would be using them.

I should also mention that some CJK users prepend an English font before their CJK font for usage in UI/terminal, because English fonts usually contain better quality glyphs for latin letters than the ones embedded in CJK fonts - essentially they only use the CJK-only part of the CJK fonts. So the middle dot will always be half width for them.

> I'm wondering why we didn't hear
> by now complaints from CJK users about the effects of
> cjk-ambiguous-chars-are-wide other than on whitespace-mode. Maybe the
> other ambiguous-width characters are seldom used in practice? Or
> maybe too little time has passed since Emacs 30 was released?

I think it could be both:

1. Emacs 30 is still new-ish, e.g. it landed in Arch Linux stable on Feb 25, only two weeks ago

2. Not a lot of CJK users set a CJK locale ($LANG) for their terminal - they usually set it for the DE but not necessarily in /etc/locale.conf for the entire environment, because that means localized messages won't appear properly under TTY with its default font. I did that because I would manually set LANG if I actually have to use TTY instead of a terminal emulator some day.

(Thu, 13 Mar 2025 14:12:02 GMT) Message #38 received at 76852 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: 张海 <dreaming.in.code.zh <at> gmail.com>
Cc: 76852 <at> debbugs.gnu.org
Subject: Re: bug#76852: 30.1; Regression in whitespace-mode causes display issue under terminal
Date: Thu, 13 Mar 2025 16:11:35 +0200

> From: 张海 <dreaming.in.code.zh <at> gmail.com>
> Date: Thu, 13 Mar 2025 02:32:33 -0700
> Cc: 76852 <at> debbugs.gnu.org
>
> > Currently, the two solutions I'm pondering are:
> >
> > . remove from the list of ambiguous-width characters some of the
> > characters with low Unicode codepoints, including the 2 characters
> > used by whitespace-mode, on the assumption that these characters
> > are unlikely to be full-width in fonts used by Emacs users in CJK
> > locales
>
> I wonder how emacs 29.4 dealt with this - did it also have a special
> list of characters that it treats as half-width, while the rest are
> treated as full-width like what was in the announcement?

Emacs 29 didn't have the notion of ambiguous-width characters. Each character was either half-width or full-width. And the characters used by whitespace-mode were half-width.

> > . change the default of cjk-ambiguous-chars-are-wide to nil, on
> > the assumption that most users in CJK locales use fonts where
> > these characters have half-width glyphs
>
> I think this might be a good option because both TTY and some popular
> terminal emulators like gnome-terminal, gnome-console and foot ship
> with the ambiguous CJK characters defaulted to half width.

Yes, but are we sure that users in CJK locales don't customize terminal emulators to default the ambiguous-width characters to full-width? If many users do that, then Emacs should cater to the majority.

> > Thanks. Let me think a bit more about what would be the best
> > solution. But could you tell which font you used that has full-width
> > glyphs for these characters? Is it unusual to use that font for
> > terminal emulators in CJK locales?
>
> The fonts I'm using, and most Chinese Linux users may be using (to my
> understanding), are:
> - WenQuanYi (WQY) fonts
> - Noto (i.e. Source Han, a different branding) fonts
> See also https://wiki.archlinux.org/title/Localization/Chinese
>
> The WenQuanYi fonts have a much longer history and the middle dot is
> always half width in them (checked just now). I'm currently using
> WenQuanYi Micro Hei.
>
> The Noto/Source Han fonts are relatively new and have an interesting
> situation where the Noto Sans CJK SC/TC have the middle dot as full
> width but Noto Sans CJK JP/KR have it as half width.
>
> Some Microsoft proprietary system-default fonts for Chinese
> characters, e.g. SimSun and SimHei, also have the middle dot as full
> width, but I guess few Linux users would be using them.
>
> I should also mention that some CJK users prepend an English font
> before their CJK font for usage in UI/terminal, because English fonts
> usually contain better quality glyphs for latin letters than the ones
> embedded in CJK fonts - essentially they only use the CJK-only part of
> the CJK fonts. So the middle dot will always be half width for them.
>
> > I'm wondering why we didn't hear
> > by now complaints from CJK users about the effects of
> > cjk-ambiguous-chars-are-wide other than on whitespace-mode. Maybe the
> > other ambiguous-width characters are seldom used in practice? Or
> > maybe too little time has passed since Emacs 30 was released?
>
> I think it could be both:
>
> 1. Emacs 30 is still new-ish, e.g. it landed in Arch Linux stable on
> Feb 25, only two weeks ago
>
> 2. Not a lot of CJK users set a CJK locale ($LANG) for their terminal
> - they usually set it for the DE but not necessarily in
> /etc/locale.conf for the entire environment, because that means
> localized messages won't appear properly under TTY with its default
> font. I did that because I would manually set LANG if I actually have
> to use TTY instead of a terminal emulator some day.

Thanks for the info, I will think about this some more.

Message #41 received at 76852 <at> debbugs.gnu.org:
From: Eli Zaretskii <eliz <at> gnu.org> To: dreaming.in.code.zh <at> gmail.com Cc: 76852 <at> debbugs.gnu.org Subject: Re: bug#76852: 30.1; Regression in whitespace-mode causes display issue under terminal Date: Sat, 15 Mar 2025 13:45:27 +0200
> Cc: 76852 <at> debbugs.gnu.org
> Date: Thu, 13 Mar 2025 16:11:35 +0200
> From: Eli Zaretskii <eliz <at> gnu.org>
>
> Thanks for the info, I will think about this some more.

I eventually decided to go with the more conservative approach of
removing the two characters used by whitespace-mode from the list of
ambiguous-width characters.

This change is not installed on the emacs-30 branch, so the next Emacs
release should have this problem fixed.
Message #44 received at 76852 <at> debbugs.gnu.org:
From: 张海 <dreaming.in.code.zh <at> gmail.com> To: Eli Zaretskii <eliz <at> gnu.org> Cc: 76852 <at> debbugs.gnu.org Subject: Re: bug#76852: 30.1; Regression in whitespace-mode causes display issue under terminal Date: Sat, 15 Mar 2025 13:17:21 -0700
Thanks for fixing this for whitespace-mode!

In case anyone has a similar problem and finds this thread: as an
opinionated user, I feel it makes the most sense to always keep the
ambiguous CJK character width setting consistent between my terminal
and Emacs, so that the text layout of Emacs in a terminal never breaks
for any of those characters. I've added the following to my .emacs
(all my terminals have defaulted to half width for years):

;;; half width CJK ambiguous characters (same as most terminal
;;; defaults)
(setopt cjk-ambiguous-chars-are-wide nil)
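To check which width Emacs currently assumes, a minimal sketch along
these lines may help. It relies only on the standard `char-width`
function; the choice of the middle dot (U+00B7) as the character to
inspect is an assumption based on the discussion above, and the
message format is arbitrary. Evaluate it in *scratch* with C-x C-e.

```elisp
;; Sketch only: report the display width Emacs assumes for the middle
;; dot (U+00B7), the character discussed in this thread.
;; `bound-and-true-p' keeps this working on Emacs 29, where the
;; option `cjk-ambiguous-chars-are-wide' does not exist.
(let ((dot ?\u00B7))
  (message "cjk-ambiguous-chars-are-wide=%S, (char-width U+00B7)=%d"
           (bound-and-true-p cjk-ambiguous-chars-are-wide)
           (char-width dot)))
```

If this reports a width of 2 while the terminal draws the middle dot
in a single column, that mismatch would be consistent with the
drifting described in this report.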
Message #49 received at 76852-done <at> debbugs.gnu.org:
From: Eli Zaretskii <eliz <at> gnu.org> To: dreaming.in.code.zh <at> gmail.com Cc: 76852-done <at> debbugs.gnu.org Subject: Re: bug#76852: 30.1; Regression in whitespace-mode causes display issue under terminal Date: Sat, 29 Mar 2025 14:19:54 +0300
> Cc: 76852 <at> debbugs.gnu.org
> Date: Sat, 15 Mar 2025 13:45:27 +0200
> From: Eli Zaretskii <eliz <at> gnu.org>
>
> I eventually decided to go with the more conservative approach of
> removing the two characters used by whitespace-mode from the list of
> ambiguous-width characters.
>
> This change is not installed on the emacs-30 branch, so the next Emacs
> release should have this problem fixed.

The above should have said "This change is now installed on the
emacs-30 branch".

No further comments, so I presume the fix is okay, and I'm closing
this bug.
Debbugs Internal Request <help-debbugs <at> gnu.org>
to internal_control <at> debbugs.gnu.org
.
(Sat, 26 Apr 2025 11:24:10 GMT) Full text and rfc822 format available.