X-Loop: help-debbugs@HIDDEN Subject: bug#70000: 29.2; Grapheme handling incorrect Resent-From: Phillip Susi <phill@HIDDEN> Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> Resent-CC: bug-gnu-emacs@HIDDEN Resent-Date: Mon, 25 Mar 2024 18:47:01 +0000 Resent-Message-ID: <handler.70000.B.171139236311697 <at> debbugs.gnu.org> Resent-Sender: help-debbugs@HIDDEN X-GNU-PR-Message: report 70000 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: 70000 <at> debbugs.gnu.org X-Debbugs-Original-To: bug-gnu-emacs@HIDDEN Received: via spool by submit <at> debbugs.gnu.org id=B.171139236311697 (code B ref -1); Mon, 25 Mar 2024 18:47:01 +0000 Received: (at submit) by debbugs.gnu.org; 25 Mar 2024 18:46:03 +0000 Received: from localhost ([127.0.0.1]:36258 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1ropKc-00032a-8G for submit <at> debbugs.gnu.org; Mon, 25 Mar 2024 14:46:02 -0400 Received: from lists.gnu.org ([2001:470:142::17]:44128) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <phill@HIDDEN>) id 1ropKY-00031n-V0 for submit <at> debbugs.gnu.org; Mon, 25 Mar 2024 14:46:00 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from <phill@HIDDEN>) id 1ropKS-0005wA-5o for bug-gnu-emacs@HIDDEN; Mon, 25 Mar 2024 14:45:52 -0400 Received: from vps.thesusis.net ([34.202.238.73]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from <phill@HIDDEN>) id 1ropKQ-0001lu-An for bug-gnu-emacs@HIDDEN; Mon, 25 Mar 2024 14:45:51 -0400 Received: by vps.thesusis.net (Postfix, from userid 1000) id C454A2B46D; Mon, 25 Mar 2024 14:45:48 -0400 (EDT) From: Phillip Susi <phill@HIDDEN> Date: Mon, 25 Mar 2024 14:45:48 -0400 Message-ID: <878r26duar.fsf@HIDDEN> MIME-Version: 1.0 Content-Type: text/plain Received-SPF: pass client-ip=34.202.238.73; 
envelope-from=phill@HIDDEN; helo=vps.thesusis.net X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-Spam-Score: 0.9 (/) X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -0.1 (/)

I had some terminal breakage the other day when browsing email with
notmuch.  Now, a ways down the rabbit hole, it seems this is because
Emacs does not handle grapheme clusters correctly.  I found this
article:

https://mitchellh.com/writing/grapheme-clusters-in-terminals

If I paste the grapheme from that article into GUI Emacs, it is
displayed as two separate characters, each two columns wide, instead of
the correct way: as a single double-wide character.  C-f and C-b move
over the character as if it were one; however, Backspace deletes only
the second codepoint, leaving both the first and the zero-width joiner.
If C-f and C-b treat it as one, then so should Backspace.

Under recent versions of the foot terminal emulator, this character is
displayed as a single double-wide character, but Emacs still assumes it
is 4 columns wide, leading to terminal breakage.
Emacs should not assume that the width of a grapheme cluster is what
wcwidth() reports.  Instead, it needs to query the cursor position
after printing one to find out how wide the terminal actually
displayed it.

In GNU Emacs 29.2 (build 1, x86_64-pc-linux-gnu, GTK+ Version 3.24.39, cairo version 1.18.0) of 2024-02-26 built on localhost
System Description: Gentoo Linux

Configured using:
 'configure --prefix=/usr --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --mandir=/usr/share/man --infodir=/usr/share/info --datadir=/usr/share --sysconfdir=/etc --localstatedir=/var/lib --datarootdir=/usr/share --disable-silent-rules --docdir=/usr/share/doc/emacs-29.2-r1 --htmldir=/usr/share/doc/emacs-29.2-r1/html --libdir=/usr/lib64 --program-suffix=-emacs-29 --includedir=/usr/include/emacs-29 --infodir=/usr/share/info/emacs-29 --localstatedir=/var --enable-locallisppath=/etc/emacs:/usr/share/emacs/site-lisp --without-compress-install --without-hesiod --without-pop --with-file-notification=inotify --with-pdumper --enable-acl --with-dbus --with-modules --without-gameuser --with-libgmp --with-gpm --with-native-compilation=aot --without-json --without-kerberos --without-kerberos5 --with-lcms2 --without-xml2 --without-mailutils --without-selinux --without-sqlite3 --with-gnutls --with-libsystemd --with-threads --with-tree-sitter --without-wide-int --with-sound=alsa --with-zlib --with-pgtk --without-x --without-ns --with-toolkit-scroll-bars --without-gconf --without-gsettings --without-harfbuzz --without-libotf --without-m17n-flt --without-xwidgets --with-gif --with-jpeg --with-png --with-rsvg --with-tiff --without-webp --without-imagemagick --with-dumping=pdumper 'CFLAGS=-march=native -O2 -pipe' 'LDFLAGS=-Wl,-O1 -Wl,--as-needed''

Configured features:
ACL CAIRO DBUS FREETYPE GIF GLIB GMP GNUTLS GPM JPEG LCMS2 LIBSYSTEMD MODULES NATIVE_COMP NOTIFY INOTIFY PDUMPER PGTK PNG RSVG SECCOMP SOUND THREADS TIFF TOOLKIT_SCROLL_BARS TREE_SITTER XIM GTK3 ZLIB

Important settings:
  value of $LANG:
en_US.UTF-8 locale-coding-system: utf-8-unix Major mode: Lisp Interaction Minor modes in effect: tooltip-mode: t global-eldoc-mode: t eldoc-mode: t show-paren-mode: t electric-indent-mode: t mouse-wheel-mode: t menu-bar-mode: t file-name-shadow-mode: t global-font-lock-mode: t font-lock-mode: t blink-cursor-mode: t column-number-mode: t line-number-mode: t indent-tabs-mode: t transient-mark-mode: t auto-composition-mode: t auto-encryption-mode: t auto-compression-mode: t Load-path shadows: None found. Features: (shadow sort mail-extr emacsbug message yank-media puny dired dired-loaddefs rfc822 mml mml-sec epa derived epg rfc6068 epg-config gnus-util text-property-search time-date mm-decode mm-bodies mm-encode mail-parse rfc2231 mailabbrev gmm-utils mailheader sendmail rfc2047 rfc2045 ietf-drums mm-util mail-prsvr mail-utils cus-start cus-load wid-edit descr-text enriched disp-table facemenu comp comp-cstr warnings icons rx cl-extra help-mode manoj-dark-theme site-gentoo ranger-autoloads scopeline-autoloads package browse-url url url-proxy url-privacy url-expand url-methods url-history url-cookie generate-lisp-file url-domsuf url-util mailcap url-handlers url-parse auth-source cl-seq eieio eieio-core cl-macs password-cache json subr-x map byte-opt gv bytecomp byte-compile url-vars cl-loaddefs cl-lib rmc iso-transl tooltip cconv eldoc paren electric uniquify ediff-hook vc-hooks lisp-float-type elisp-mode mwheel term/pgtk-win pgtk-win term/common-win pgtk-dnd tool-bar dnd fontset image regexp-opt fringe tabulated-list replace newcomment text-mode lisp-mode prog-mode register page tab-bar menu-bar rfn-eshadow isearch easymenu timer select scroll-bar mouse jit-lock font-lock syntax font-core term/tty-colors frame minibuffer nadvice seq simple cl-generic indonesian philippine cham georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms cp51932 hebrew greek romanian slovak czech european ethiopic indian cyrillic chinese composite 
emoji-zwj charscript charprop case-table epa-hook jka-cmpr-hook help abbrev obarray oclosure cl-preloaded button loaddefs theme-loaddefs faces cus-face macroexp files window text-properties overlay sha1 md5 base64 format env code-pages mule custom widget keymap hashtable-print-readable backquote threads dbusbind inotify dynamic-setting font-render-setting cairo gtk pgtk lcms2 multi-tty make-network-process native-compile emacs) Memory information: ((conses 16 121243 14450) (symbols 48 22924 0) (strings 32 87992 2869) (string-bytes 1 2065634) (vectors 16 27491) (vector-slots 8 1623278 223666) (floats 8 58 48) (intervals 56 908 0) (buffers 984 13))
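
The 4-versus-2 column disagreement described in the report can be reproduced with a per-codepoint width sum. The sketch below is a rough Python stand-in for wcwidth(), not the libc or Emacs implementation; the sample sequence (U+1F9D1, ZWJ, U+1F33E) is an assumed example, since the report does not include the actual characters.

```python
import unicodedata

def codepoint_width(ch: str) -> int:
    """Approximate column width of one codepoint (simplified stand-in for wcwidth)."""
    # Combining marks and format characters (such as ZWJ) occupy no column.
    if unicodedata.combining(ch) or unicodedata.category(ch) in ("Mn", "Me", "Cf"):
        return 0
    # East Asian Wide / Fullwidth codepoints occupy two columns.
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2
    return 1

def naive_width(text: str) -> int:
    """Sum the per-codepoint widths, ignoring any composition into clusters."""
    return sum(codepoint_width(ch) for ch in text)

# Assumed example: two wide emoji joined by a zero-width joiner.
SAMPLE = "\U0001F9D1\u200D\U0001F33E"

if __name__ == "__main__":
    print(naive_width(SAMPLE))  # 2 + 0 + 2 = 4
```

A terminal that composes the sequence into one glyph would advance the cursor by 2 columns, while the per-codepoint sum says 4; that gap is the desynchronization the report describes.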
From: Eli Zaretskii <eliz@HIDDEN>
To: Phillip Susi <phill@HIDDEN>
Cc: 70000 <at> debbugs.gnu.org
Subject: bug#70000: 29.2; Grapheme handling incorrect
Date: Mon, 25 Mar 2024 21:35:24 +0200
Message-Id: <86cyrije9v.fsf@HIDDEN>
In-Reply-To: <878r26duar.fsf@HIDDEN> (message from Phillip Susi on Mon, 25 Mar 2024 14:45:48 -0400)
References: <878r26duar.fsf@HIDDEN>

tags 70000 notabug
thanks

> From: Phillip Susi <phill@HIDDEN>
> Date: Mon, 25 Mar 2024 14:45:48 -0400
>
> I had some terminal breakage the other day when browsing email with
> notmuch.  Now a ways down the rabbit hole, it seems this is because
> emacs does not correctly handle graphemes.  I found this article here:
>
> https://mitchellh.com/writing/grapheme-clusters-in-terminals
>
> If I paste that grapheme into GUI emacs, it is displayed as two separate
> characters, each two columns wide, instead of the correct way: as a
> single double wide character.

First, the above blog talks about text-mode terminals (a.k.a. "TTYs"),
so it is not relevant to a GUI Emacs session.

Second, how that particular sequence of codepoints is displayed on GUI
frames depends on how your Emacs was built.  According to the list of
features included in your report, viz.:

  Configured features:
  ACL CAIRO DBUS FREETYPE GIF GLIB GMP GNUTLS GPM JPEG LCMS2 LIBSYSTEMD
  MODULES NATIVE_COMP NOTIFY INOTIFY PDUMPER PGTK PNG RSVG SECCOMP SOUND
  THREADS TIFF TOOLKIT_SCROLL_BARS TREE_SITTER XIM GTK3 ZLIB

your Emacs is built without HarfBuzz, which I think explains why it
displays the above sequence as 2 separate characters.  Furthermore, the
appearance depends on the fonts you have installed; specifically, Emoji
sequences need a font with good support for the Emoji Unicode blocks.
In my Emacs, which does use HarfBuzz, I see a single grapheme cluster.

> C-f and C-b move over the character as if
> it were one, however, backspace deletes only the second, leaving both
> the first and the zero width joiner.  If C-f and C-b treat it as one,
> then so should backspace.

That Backspace deletes a single codepoint is a feature: it allows
easier editing of composable character sequences, such as Emoji.  E.g.,
imagine you want to make a slight change to the Emoji by modifying just
the second of the two characters composed into a grapheme cluster.

Emacs supports deletion of the entire grapheme cluster with the command
delete-forward-char, by default bound to the <Delete> function key.

> Under recent versions of the foot terminal emulator, this character is
> displayed as a single, double wide character, but emacs assumes it still
> is 4 columns wide, leading to terminal breakage.

Emacs cannot know what the terminal does with these characters, because
there is no widely accepted protocol for accessing that information.
Different terminal emulators behave differently, and some even have
settings that modify this behavior.

> Emacs needs to not assume the width of graphemes are what wcwidth()
> reports, but instead need to query the cursor position after
> printing one to find out how wide the terminal actually displayed it
> as.

Querying the cursor position won't help in this case because it is
Emacs that moves the cursor when you type C-f, not the terminal.

I see no Emacs bug here.  Until we have standard ways of querying
text-mode terminals about their processing of composable character
sequences into grapheme clusters, there is no way for Emacs to behave
correctly with all such terminal emulators.  Sorry.
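
The two deletion granularities discussed above (one codepoint versus the whole cluster) can be illustrated outside Emacs. This is a minimal Python sketch with a deliberately simplified cluster rule (ZWJ joins its neighbours; nonspacing and enclosing marks attach to what precedes them); it is not the Unicode segmentation algorithm and not how Emacs implements either command. The function names and the sample sequence are hypothetical.

```python
import unicodedata

ZWJ = "\u200d"

def is_extender(ch: str) -> bool:
    """True for codepoints that attach to the preceding one (simplified rule)."""
    return ch == ZWJ or unicodedata.category(ch) in ("Mn", "Me")

def cluster_end(text: str, start: int) -> int:
    """Index just past the simplified grapheme cluster that begins at `start`."""
    i = start + 1
    while i < len(text) and (is_extender(text[i]) or text[i - 1] == ZWJ):
        i += 1
    return i

def delete_backward_codepoint(text: str) -> str:
    """Remove only the final codepoint, as Backspace does in the thread."""
    return text[:-1]

def delete_forward_cluster(text: str, pos: int) -> str:
    """Remove the whole cluster starting at `pos`."""
    return text[:pos] + text[cluster_end(text, pos):]

# Assumed example sequence: wide emoji + ZWJ + wide emoji.
SAMPLE = "\U0001F9D1\u200D\U0001F33E"

if __name__ == "__main__":
    # Codepoint deletion leaves the first emoji and the joiner behind.
    print([hex(ord(c)) for c in delete_backward_codepoint(SAMPLE)])
    # Cluster deletion removes all three codepoints and keeps the "a".
    print(delete_forward_cluster(SAMPLE + "a", 0))
```

Deleting by codepoint leaves the first emoji and the joiner, which is exactly the leftover the original report complains about; deleting by cluster removes the sequence as a unit.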
From: Phillip Susi <phill@HIDDEN>
To: Eli Zaretskii <eliz@HIDDEN>
Cc: 70000 <at> debbugs.gnu.org
Subject: bug#70000: 29.2; Grapheme handling incorrect
Date: Wed, 27 Mar 2024 10:11:30 -0400
Message-ID: <875xx7epd9.fsf@HIDDEN>
In-Reply-To: <86cyrije9v.fsf@HIDDEN>
References: <878r26duar.fsf@HIDDEN> <86cyrije9v.fsf@HIDDEN>

Eli Zaretskii <eliz@HIDDEN> writes:

> Querying the cursor position won't help in this case because it is
> Emacs that moves the cursor when you type C-f, not the terminal.

I'm not talking about C-f, but simply displaying the characters on the
screen.  Emacs assumes the width is 4 when it prints this character,
and so it thinks that the cursor moved over 4 places.  When the
terminal actually moves the cursor over only 2 places, Emacs gets out
of sync with the terminal, and massive breakage occurs.

By reading back the cursor position from the terminal after displaying
a grapheme cluster, Emacs would learn how the terminal displayed it and
could update its idea of where the cursor is.

I originally ran into this problem not with a ZWJ, but with an emoji
followed by variation selector 16 that someone used in the subject line
of an email.  When I browsed my inbox with notmuch, the terminal went
FUBAR.
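
The read-back idea above corresponds to the ANSI "Device Status Report" exchange: the application writes ESC [ 6 n and the terminal answers ESC [ row ; col R. The Python sketch below only builds the query and parses replies; it does no terminal I/O, the function names are hypothetical, and the reply strings in the example are made up for illustration.

```python
import re
from typing import Tuple

# Query the application writes to the terminal (DSR, cursor position).
DSR_CURSOR_POSITION = "\x1b[6n"

# Cursor Position Report the terminal writes back: ESC [ row ; col R
_CPR = re.compile(r"\x1b\[(\d+);(\d+)R")

def parse_cpr(reply: str) -> Tuple[int, int]:
    """Return the 1-based (row, col) from a cursor position report."""
    match = _CPR.fullmatch(reply)
    if match is None:
        raise ValueError(f"not a cursor position report: {reply!r}")
    return int(match.group(1)), int(match.group(2))

def measured_width(reply_before: str, reply_after: str) -> int:
    """Columns the cursor advanced between two reports on the same row."""
    row0, col0 = parse_cpr(reply_before)
    row1, col1 = parse_cpr(reply_after)
    if row0 != row1:
        raise ValueError("cursor changed rows; width cannot be inferred")
    return col1 - col0

if __name__ == "__main__":
    # Made-up replies: cursor at column 1 before printing, column 3 after.
    print(measured_width("\x1b[1;1R", "\x1b[1;3R"))  # 2
```

Each measurement costs a write and a read on the terminal connection, and it only yields a width after the cluster has been printed.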
From: Eli Zaretskii <eliz@HIDDEN>
To: Phillip Susi <phill@HIDDEN>
Cc: 70000 <at> debbugs.gnu.org
Subject: bug#70000: 29.2; Grapheme handling incorrect
Date: Wed, 27 Mar 2024 19:17:39 +0200
Message-Id: <865xx7iogc.fsf@HIDDEN>
In-Reply-To: <875xx7epd9.fsf@HIDDEN> (message from Phillip Susi on Wed, 27 Mar 2024 10:11:30 -0400)
References: <878r26duar.fsf@HIDDEN> <86cyrije9v.fsf@HIDDEN> <875xx7epd9.fsf@HIDDEN>

> From: Phillip Susi <phill@HIDDEN>
> Cc: 70000 <at> debbugs.gnu.org
> Date: Wed, 27 Mar 2024 10:11:30 -0400
>
> Eli Zaretskii <eliz@HIDDEN> writes:
>
> > Querying the cursor position won't help in this case because it is
> > Emacs that moves the cursor when you type C-f, not the terminal.
>
> I'm not talking about C-f, but simply displaying the characters on the
> screen.  Emacs assumes the width is 4 when it prints this character, and
> so it thinks that the cursor moved over 4 places.  When the terminal
> actually only moves the cursor over 2 spaces, emacs gets out of sync
> with the terminal, and massive breakage occurs.

I understand what you are saying, but this is not how the Emacs display
code works.  It needs to know the width of every character displayed on
the screen, and it needs to be able to determine that even without
actually displaying the character.

When Emacs is about to redraw some portion of the screen, it moves the
cursor to that place.  To be able to move the cursor there, it needs to
compute the screen coordinates of every character that is currently
shown, so it can construct the command for the terminal driver to move
the cursor to that place.  If Emacs were to rely on displaying
characters for that, it would need to constantly redraw large portions
of the screen, and that would both be much slower and cause unpleasant
flickering of the display, due to redrawing of screen portions that
don't actually change.  So this technique is out of the question for
Emacs.

> By reading back the cursor position from the terminal after displaying a
> grapheme cluster, it would learn how the terminal displayed it and
> update its idea of where the cursor is correctly.

I understand.  But Emacs also needs this information long after the
characters were drawn.  For example, imagine that Emacs displays these
characters on the screen, and then leaves most of the screen intact and
periodically redraws some small portion of it, like updating the
current time in the lower-right corner when Emacs is otherwise idle.
To do that, Emacs needs to move the cursor from its current position
somewhere on the screen to the lower-right corner, redraw the time
there, then move the cursor back to where it was.  These cursor moves
are based on the ability to calculate the geometry of each character on
display without actually writing the characters to the screen.

In addition, if Emacs had to query the cursor position after each
written character, its redisplay would be much slower than it is now.

> I originally ran into this problem not with a ZWJ, but with an emoji
> followed by variation selector 16 that someone used in a subject line of
> an email, and when browsing my inbox with notmuch, the terminal went
> FUBAR.

Yes, that's a known issue with some of the terminal emulators that
compose Emoji and other similar character sequences into grapheme
clusters, while ignoring the width that is expected from the result.
I'm not aware of any good solution, unfortunately.  Sometimes,
disabling auto-composition-mode helps, but even that cannot solve all
the problems, especially when each of the characters composed by the
terminal into a single grapheme cluster has non-zero width according to
the Unicode tables.  (If only the first character in the composed
sequence has non-zero width and the rest are zero-width, disabling
auto-composition-mode might produce a correct display.)

The bottom line is what I said at the beginning: we need some protocol
by which a terminal emulator could be queried about whether it supports
character composition, and if so, what is the screen width of a given
sequence of codepoints that will be composed, without actually
displaying them.  Better yet, some standard table of such widths could
be accepted by complying terminal emulators, and then Emacs could use
such a table to know the width in advance (similarly to how it knows
that from the Unicode data files).  Until such protocols or tables
exist, Emacs will be unable to produce correct display on these
terminal emulators.
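
The dependence of cursor movement on precomputed widths can be shown with a little arithmetic. The following Python sketch is only an illustration under assumed numbers, not Emacs's display code: it computes the column of a cell from a list of widths and builds the standard cursor-position escape sequence (ESC [ row ; col H). The helper names and the width lists are hypothetical.

```python
from typing import Sequence

def column_of(widths: Sequence[int], index: int) -> int:
    """1-based screen column where the item at `index` starts on its row."""
    return 1 + sum(widths[:index])

def cursor_move(row: int, col: int) -> str:
    """ANSI cursor-position command for a 1-based (row, col)."""
    return f"\x1b[{row};{col}H"

# Assumed row contents: "a", one composed emoji sequence, "b".
assumed_widths = [1, 4, 1]   # what the application believes
terminal_widths = [1, 2, 1]  # what a composing terminal actually used

if __name__ == "__main__":
    believed = column_of(assumed_widths, 2)
    actual = column_of(terminal_widths, 2)
    # The application addresses column 6, but "b" really sits at column 4.
    print(believed, actual, repr(cursor_move(1, believed)))
```

Because the move command is built from the believed column, every later partial redraw on that row lands 2 columns off, without any character being reprinted in between.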
From: Phillip Susi <phill@HIDDEN>
To: Eli Zaretskii <eliz@HIDDEN>
Cc: 70000 <at> debbugs.gnu.org
Subject: bug#70000: 29.2; Grapheme handling incorrect
Date: Thu, 28 Mar 2024 12:16:33 -0400
Message-ID: <87bk6y8h7i.fsf@HIDDEN>
In-Reply-To: <865xx7iogc.fsf@HIDDEN>
References: <878r26duar.fsf@HIDDEN> <86cyrije9v.fsf@HIDDEN> <875xx7epd9.fsf@HIDDEN> <865xx7iogc.fsf@HIDDEN>

Eli Zaretskii <eliz@HIDDEN> writes:

> I understand. But Emacs needs this information also long after the
> characters were already drawn. For example, imagine that Emacs

Yes, it would have to learn the width the first time it displays each
grapheme and build a list of known widths to remember for future use.

> In addition, if Emacs had to query the cursor position after each
> written character, its redisplay would be much slower than it is now.

It would only need to query when printing a grapheme cluster, and only
the first time. After that, it could remember.
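
[Editorial illustration, not part of the original message.] The
"measure once, then remember" idea proposed above could be sketched
roughly as follows. The cursor position query (ESC [ 6 n) and its reply
(ESC [ row ; col R) are standard terminal control sequences; everything
else here, including the class name MeasuredWidthCache and the
injectable measure callable, is hypothetical and exists only for this
sketch. It does not address the objection raised in the thread that
widths are needed before anything has been drawn.

```python
import re
from typing import Callable, Dict, Tuple

# Device Status Report query; the terminal answers with a Cursor
# Position Report of the form ESC [ <row> ; <col> R.
CPR_QUERY = b"\x1b[6n"
_CPR_REPLY = re.compile(rb"\x1b\[(\d+);(\d+)R")


def parse_cpr(reply: bytes) -> Tuple[int, int]:
    """Return (row, column), both 1-based, from a cursor position report."""
    m = _CPR_REPLY.search(reply)
    if m is None:
        raise ValueError(f"not a cursor position report: {reply!r}")
    return int(m.group(1)), int(m.group(2))


def columns_advanced(before: bytes, after: bytes) -> int:
    """Columns the cursor moved between two reports on the same row."""
    (row0, col0), (row1, col1) = parse_cpr(before), parse_cpr(after)
    if row0 != row1:
        raise ValueError("cursor changed rows; width cannot be inferred")
    return col1 - col0


class MeasuredWidthCache:
    """Remember the measured width of each grapheme cluster.

    `measure` is a hypothetical callable that writes the cluster to the
    terminal and returns how many columns the cursor advanced.
    """

    def __init__(self, measure: Callable[[str], int]) -> None:
        self._measure = measure
        self._widths: Dict[str, int] = {}
        self.queries = 0  # number of times the terminal was asked

    def width(self, cluster: str) -> int:
        # Query the terminal only the first time a cluster is seen.
        if cluster not in self._widths:
            self.queries += 1
            self._widths[cluster] = self._measure(cluster)
        return self._widths[cluster]
```

A real measure function would have to send CPR_QUERY before and after
writing the cluster and read both replies from the terminal; that part
is left out because it depends on the terminal I/O layer.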
From: Stefan Kangas <stefankangas@HIDDEN>
To: Eli Zaretskii <eliz@HIDDEN>
Cc: 70000 <at> debbugs.gnu.org, Phillip Susi <phill@HIDDEN>
Subject: bug#70000: 29.2; Grapheme handling incorrect
Date: Fri, 28 Feb 2025 19:04:51 -0800
Message-ID: <CADwFkmkquvhv94SMPRHBwneaDzSnjX6Rz+rHDp+S8YOTaAvttQ@HIDDEN>
In-Reply-To: <865xx7iogc.fsf@HIDDEN>
References: <878r26duar.fsf@HIDDEN> <86cyrije9v.fsf@HIDDEN> <875xx7epd9.fsf@HIDDEN> <865xx7iogc.fsf@HIDDEN>

Eli Zaretskii <eliz@HIDDEN> writes:

>> From: Phillip Susi <phill@HIDDEN>
>> Cc: 70000 <at> debbugs.gnu.org
>> Date: Wed, 27 Mar 2024 10:11:30 -0400
>>
>> Eli Zaretskii <eliz@HIDDEN> writes:
>>
>> > Querying the cursor position won't help in this case because it is
>> > Emacs that moves the cursor when you type C-f, not the terminal.
>>
>> I'm not talking about C-f, but simply displaying the characters on the
>> screen. Emacs assumes the width is 4 when it prints this character, and
>> so it thinks that the cursor moved over 4 places. When the terminal
>> actually only moves the cursor over 2 spaces, emacs gets out of sync
>> with the terminal, and massive breakage occurs.
>
> I understand what you are saying, but this is not how Emacs display
> code works. It needs to know the width of every character displayed
> on the screen, and it needs to be able to determine that even without
> actually displaying the character.
>
> When Emacs is about to redraw some portion of the screen, it moves the
> cursor to that place. To be able to move the cursor there, it needs
> to be able to compute the coordinates on the screen of every character
> that is currently shown, so it can construct the command for the
> terminal driver to move cursor to that place. If Emacs were to rely
> on displaying characters for that, it would have needed to constantly
> redraw large portions of the screen, and that would both be much
> slower and cause unpleasant flickering of the display, due to
> redrawing of screen portions that don't actually change.
>
> So this technique is out of the question for Emacs.
>
>> By reading back the cursor position from the terminal after displaying a
>> grapheme cluster, it would learn how the terminal displayed it and
>> update its idea of where the cursor is correctly.
>
> I understand. But Emacs needs this information also long after the
> characters were already drawn. For example, imagine that Emacs
> displays these characters on the screen, and then leaves most of the
> screen intact and periodically redraws some small portion of the
> screen, like updating current time in the lower-right corner of the
> screen when Emacs is otherwise idle. To do that, Emacs needs to move
> the cursor from its current position somewhere on the screen to the
> lower-right corner, redraw the time there, then move the cursor back
> to where it was. These cursor moves are based on the ability to
> calculate the geometry of each character on display without actually
> writing the characters to the screen.
>
> In addition, if Emacs had to query the cursor position after each
> written character, its redisplay would be much slower than it is now.
>
>> I originally ran into this problem not with a ZWJ, but with an emoji
>> followed by alternate selector 16 that someone used in a subject line of
>> an email, and when browsing my inbox with notmuch, the terminal went
>> FUBAR.
>
> Yes, that's a known issue with some of the terminal emulators that
> compose Emoji and other similar character sequences into grapheme
> clusters, while ignoring the width that is expected from the result.
> I'm not aware of any good solution, unfortunately. Sometimes,
> disabling auto-composition-mode helps, but even that cannot solve all
> the problems, especially when each of the characters composed by the
> terminal into a single grapheme cluster has non-zero width according
> to the Unicode tables. (If only the first character in the composed
> sequence has non-zero width and the rest are zero-width, disabling
> auto-composition-mode might produce a correct display.)
>
> The bottom line is what I said at the beginning: we need some protocol
> by which a terminal emulator could be queried about whether it
> supports character composition, and if so, what is the screen width of
> a given sequence of codepoints that will be composed, without actually
> displaying them. Better yet, some standard table of such widths could
> be accepted by complying terminal emulators, and then Emacs could use
> such a table to know the width in advance (similarly to how it knows
> that from the Unicode data files).
>
> Until such protocols or tables exist, Emacs will be unable to produce
> correct display on these terminal emulators.

It seems to me like this should be closed as a wontfix?
From: Eli Zaretskii <eliz@HIDDEN>
To: Stefan Kangas <stefankangas@HIDDEN>
Cc: 70000 <at> debbugs.gnu.org, phill@HIDDEN
Subject: bug#70000: 29.2; Grapheme handling incorrect
Date: Sat, 01 Mar 2025 11:33:53 +0200
Message-Id: <86cyf1qem6.fsf@HIDDEN>
In-Reply-To: <CADwFkmkquvhv94SMPRHBwneaDzSnjX6Rz+rHDp+S8YOTaAvttQ@HIDDEN> (message from Stefan Kangas on Fri, 28 Feb 2025 19:04:51 -0800)
References: <878r26duar.fsf@HIDDEN> <86cyrije9v.fsf@HIDDEN> <875xx7epd9.fsf@HIDDEN> <865xx7iogc.fsf@HIDDEN> <CADwFkmkquvhv94SMPRHBwneaDzSnjX6Rz+rHDp+S8YOTaAvttQ@HIDDEN>

tags 70000 wontfix
close 70000
thanks

> From: Stefan Kangas <stefankangas@HIDDEN>
> Date: Fri, 28 Feb 2025 19:04:51 -0800
> Cc: Phillip Susi <phill@HIDDEN>, 70000 <at> debbugs.gnu.org
>
> Eli Zaretskii <eliz@HIDDEN> writes:
>
> >> From: Phillip Susi <phill@HIDDEN>
> >> Cc: 70000 <at> debbugs.gnu.org
> >> Date: Wed, 27 Mar 2024 10:11:30 -0400
> >>
> >> Eli Zaretskii <eliz@HIDDEN> writes:
> >>
> >> > Querying the cursor position won't help in this case because it is
> >> > Emacs that moves the cursor when you type C-f, not the terminal.
> >>
> >> I'm not talking about C-f, but simply displaying the characters on the
> >> screen. Emacs assumes the width is 4 when it prints this character, and
> >> so it thinks that the cursor moved over 4 places. When the terminal
> >> actually only moves the cursor over 2 spaces, emacs gets out of sync
> >> with the terminal, and massive breakage occurs.
> >
> > I understand what you are saying, but this is not how Emacs display
> > code works. It needs to know the width of every character displayed
> > on the screen, and it needs to be able to determine that even without
> > actually displaying the character.
> >
> > When Emacs is about to redraw some portion of the screen, it moves the
> > cursor to that place. To be able to move the cursor there, it needs
> > to be able to compute the coordinates on the screen of every character
> > that is currently shown, so it can construct the command for the
> > terminal driver to move cursor to that place. If Emacs were to rely
> > on displaying characters for that, it would have needed to constantly
> > redraw large portions of the screen, and that would both be much
> > slower and cause unpleasant flickering of the display, due to
> > redrawing of screen portions that don't actually change.
> >
> > So this technique is out of the question for Emacs.
> >
> >> By reading back the cursor position from the terminal after displaying a
> >> grapheme cluster, it would learn how the terminal displayed it and
> >> update its idea of where the cursor is correctly.
> >
> > I understand. But Emacs needs this information also long after the
> > characters were already drawn. For example, imagine that Emacs
> > displays these characters on the screen, and then leaves most of the
> > screen intact and periodically redraws some small portion of the
> > screen, like updating current time in the lower-right corner of the
> > screen when Emacs is otherwise idle. To do that, Emacs needs to move
> > the cursor from its current position somewhere on the screen to the
> > lower-right corner, redraw the time there, then move the cursor back
> > to where it was. These cursor moves are based on the ability to
> > calculate the geometry of each character on display without actually
> > writing the characters to the screen.
> >
> > In addition, if Emacs had to query the cursor position after each
> > written character, its redisplay would be much slower than it is now.
> >
> >> I originally ran into this problem not with a ZWJ, but with an emoji
> >> followed by alternate selector 16 that someone used in a subject line of
> >> an email, and when browsing my inbox with notmuch, the terminal went
> >> FUBAR.
> >
> > Yes, that's a known issue with some of the terminal emulators that
> > compose Emoji and other similar character sequences into grapheme
> > clusters, while ignoring the width that is expected from the result.
> > I'm not aware of any good solution, unfortunately. Sometimes,
> > disabling auto-composition-mode helps, but even that cannot solve all
> > the problems, especially when each of the characters composed by the
> > terminal into a single grapheme cluster has non-zero width according
> > to the Unicode tables. (If only the first character in the composed
> > sequence has non-zero width and the rest are zero-width, disabling
> > auto-composition-mode might produce a correct display.)
> >
> > The bottom line is what I said at the beginning: we need some protocol
> > by which a terminal emulator could be queried about whether it
> > supports character composition, and if so, what is the screen width of
> > a given sequence of codepoints that will be composed, without actually
> > displaying them. Better yet, some standard table of such widths could
> > be accepted by complying terminal emulators, and then Emacs could use
> > such a table to know the width in advance (similarly to how it knows
> > that from the Unicode data files).
> >
> > Until such protocols or tables exist, Emacs will be unable to produce
> > correct display on these terminal emulators.
>
> It seems to me like this should be closed as a wontfix?

Yes, now done.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997 nCipher Corporation Ltd,
1994-97 Ian Jackson.