Package: emacs;
Reported by: Aura Kelloniemi <kaura.dev <at> sange.fi>
Date: Thu, 17 Feb 2022 06:58:01 UTC
Severity: normal
Found in version 29.0.50
To reply to this bug, email your comments to 54032 AT debbugs.gnu.org.
Message #5 received at submit <at> debbugs.gnu.org:
From: Aura Kelloniemi <kaura.dev <at> sange.fi>
To: bug-gnu-emacs <at> gnu.org
Subject: 29.0.50; Emoji display on Linux console switched to hexadecimal output
Date: Thu, 17 Feb 2022 08:57:40 +0200
Hello,

on recent Emacs development repository builds, emoji characters are no longer displayed on Linux console. Instead Emacs prints \UABCDEF hexadecimal codes.

This is due to the commit 10c680551e899805a6de7360e9b65986fd87df72 which probably makes things better on some terminals. Reverting this commit fixed the issue for me, and emojis are again displayed as usual.

Linux console is (sort of) capable of displaying emojis. Console font can be configured so that it has glyphs for emojis exactly the same way as for other characters. Also, blind users using refreshable braille displays use Linux console to access Emacs. The braille terminal driver is able to detect the correct character code points even when Linux itself is not able to display them properly on the screen. This detection is done using the /dev/vcsu (virtual console screen unicode) character devices. For these reasons it is important that Emacs outputs the real characters to the terminal on Linux console.

Linux console has a terminal type string of "linux". lisp/term/linux.el already contains some Linux terminal specific code (which unfortunately assumes that Linux has a default character set of Latin-1, which has never been true).

My preferred solution to this problem would be to add and document a way to configure character display logic on TTYs more precisely. It would be great to be able to control the terminal output of Unicode on grapheme cluster precision – i.e. allow the user to define a function which translates code points/grapheme clusters into something that their terminal can display.

I believe that Linux console is not the only terminal that behaves peculiarly when it comes to Unicode support. So this might benefit others than Linux VT users too.
--
Aura

In GNU Emacs 29.0.50 (build 3, x86_64-pc-linux-gnu, GTK+ Version 3.24.31, cairo version 1.17.4) of 2022-02-16 built on solaria
Repository revision: e6e723bb4d300e6ceeeb12bf43bf3d54a6108cac
Repository branch: makepkg
System Description: Arch Linux

Configured using:
 'configure --prefix=/usr --sysconfdir=/etc --libexecdir=/usr/lib --localstatedir=/var --with-native-compilation --with-x-toolkit=gtk3 --with-xft --with-wide-int --with-modules --with-gameuser=:games --with-sound=alsa --with-cairo --with-harfbuzz --enable-link-time-optimization 'CFLAGS=-march=native -mtune=native -O2 -pipe -fno-plt -fuse-ld=gold -flto' CPPFLAGS=-D_FORTIFY_SOURCE=2 LDFLAGS=-Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now'

Configured features:
ACL CAIRO DBUS FREETYPE GIF GLIB GMP GNUTLS GPM GSETTINGS HARFBUZZ JPEG JSON LCMS2 LIBOTF LIBSYSTEMD LIBXML2 M17N_FLT MODULES NATIVE_COMP NOTIFY INOTIFY PDUMPER PNG RSVG SECCOMP SOUND SQLITE3 THREADS TIFF TOOLKIT_SCROLL_BARS WEBP X11 XDBE XIM XPM GTK3 ZLIB

Important settings:
  value of $LANG: fi_FI.UTF-8
  locale-coding-system: utf-8-unix

Major mode: Fundamental

Minor modes in effect:
  telega-root-auto-fill-mode: t
  telega-active-locations-mode: t
  telega-patrons-mode: t
  gpm-mouse-mode: t
  leaf-key-override-global-mode: t
  shell-dirtrack-mode: t
  savehist-mode: t
  minibuffer-electric-default-mode: t
  icomplete-mode: t
  tooltip-mode: t
  global-eldoc-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: linux
  auto-encryption-mode: t
  auto-compression-mode: t
  column-number-mode: t
  line-number-mode: t

Load-path shadows:
/home/aura/.config/emacs/elpa/transient-20220130.1941/transient hides /usr/share/emacs/29.0.50/lisp/transient

Features:
(shadow sort company-oddmuse company-keywords company-etags etags fileloop xref project company-gtags company-dabbrev-code company-dabbrev company-files company-clang company-capf company-cmake
company-semantic company-template company-bbdb mail-extr textsec uni-scripts idna-mapping ucs-normalize uni-confusable textsec-check shr pixel-fill kinsoku vterm bookmark face-remap compile term disp-table ehelp vterm-module term/xterm xterm mule-util telega-obsolete telega telega-tdlib-events telega-webpage visual-fill-column telega-root telega-info telega-chat telega-modes image-mode exif telega-company telega-user telega-notifications notifications dbus telega-voip telega-msg telega-tme telega-sticker telega-i18n telega-vvnote bindat telega-ffplay telega-media telega-sort telega-filter telega-ins telega-folders telega-inline telega-tdlib telega-util rainbow-identifiers org-element avl-tree generator org ob ob-tangle ob-ref ob-lob ob-table ob-exp org-macro org-footnote org-src ob-comint org-pcomplete org-list org-faces org-entities noutline outline org-version ob-emacs-lisp ob-core ob-eval org-table oc-basic bibtex ol org-keys oc org-compat advice org-macs org-loaddefs dired-aux color ewoc telega-server telega-core telega-customize svg dom xml emacsbug sendmail find-func cursor-sensor comp comp-cstr warnings rx cl-extra help-mode t-mouse term/linux recentf tree-widget notmuch notmuch-tree notmuch-jump notmuch-hello notmuch-show notmuch-print notmuch-crypto notmuch-mua notmuch-message notmuch-draft notmuch-maildir-fcc notmuch-address notmuch-company notmuch-parser notmuch-wash diff-mode easy-mmode coolj notmuch-query goto-addr thingatpt icalendar diary-lib diary-loaddefs cal-menu calendar cal-loaddefs notmuch-tag crm notmuch-lib notmuch-version notmuch-compat hl-line message yank-media rmc puny dired dired-loaddefs rfc822 mml mailabbrev mail-utils gmm-utils mailheader mm-view mml-smime mml-sec epa derived epg rfc6068 epg-config gnus-util text-property-search smime dig mm-decode mm-bodies mm-encode mail-parse rfc2231 rfc2047 rfc2045 mm-util ietf-drums mail-prsvr company pcase server leaf-keywords leaf finder-inf package browse-url url url-proxy url-privacy 
url-expand url-methods url-history url-cookie url-domsuf url-util mailcap url-handlers url-parse url-vars tramp tramp-loaddefs trampver tramp-integration cus-edit pp wid-edit files-x tramp-compat shell pcomplete comint ansi-color ring parse-time iso8601 time-date ls-lisp format-spec auth-source cl-seq eieio eieio-core cl-macs eieio-loaddefs password-cache json map savehist minibuf-eldef keypad ido seq gv subr-x byte-opt bytecomp byte-compile cconv icomplete desktop frameset cl-loaddefs cl-lib cus-load info iso-transl tooltip eldoc paren electric uniquify ediff-hook vc-hooks lisp-float-type elisp-mode mwheel term/x-win x-win term/common-win x-dnd tool-bar dnd fontset image regexp-opt fringe tabulated-list replace newcomment text-mode lisp-mode prog-mode register page tab-bar menu-bar rfn-eshadow isearch easymenu timer select scroll-bar mouse jit-lock font-lock syntax font-core term/tty-colors frame minibuffer cl-generic cham georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms cp51932 hebrew greek romanian slovak czech european ethiopic indian cyrillic chinese composite emoji-zwj charscript charprop case-table epa-hook jka-cmpr-hook help simple abbrev obarray cl-preloaded nadvice button loaddefs faces cus-face macroexp files window text-properties overlay sha1 md5 base64 format env code-pages mule custom widget keymap hashtable-print-readable backquote threads dbusbind inotify lcms2 dynamic-setting system-font-setting font-render-setting cairo move-toolbar gtk x-toolkit x multi-tty make-network-process native-compile emacs) Memory information: ((conses 16 899499 30380) (symbols 48 32992 10) (strings 32 222707 14412) (string-bytes 1 6442228) (vectors 16 114679) (vector-slots 8 1497371 73794) (floats 8 8158 532) (intervals 56 5228 1491) (buffers 992 15))
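
The report above says the braille driver recovers code points through the /dev/vcsu devices. As a rough illustration of that idea, here is a minimal Python sketch. It assumes the layout described in the Linux vcs(4) manual page, where each screen cell is stored as one 32-bit code point in host byte order. The names `decode_vcsu` and `read_console_text` are hypothetical helpers, not part of Emacs or of any braille driver.

```python
import struct


def decode_vcsu(data: bytes) -> str:
    """Decode a /dev/vcsu-style buffer (assumed: one 32-bit code point
    per screen cell, host byte order) into a Python string."""
    usable = len(data) - len(data) % 4  # ignore a trailing partial cell
    cells = struct.unpack(f"={usable // 4}I", data[:usable])
    # Values outside the Unicode range become U+FFFD instead of raising.
    return "".join(chr(cp) if cp <= 0x10FFFF else "\ufffd" for cp in cells)


def read_console_text(path: str = "/dev/vcsu") -> str:
    """Read the current virtual console; needs read permission on the device."""
    with open(path, "rb") as dev:
        return decode_vcsu(dev.read())


if __name__ == "__main__":
    try:
        print(read_console_text())
    except OSError as err:
        # Expected when not on a Linux virtual console or without permission.
        print(f"cannot read /dev/vcsu: {err}")
```

A reader of this kind sees whatever code points the application wrote to the console, independently of whether the console font has a glyph for them. That is why a hexadecimal escape written by Emacs reaches the braille display as an escape rather than as the original character.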
Message #8 received at 54032 <at> debbugs.gnu.org:
From: Eli Zaretskii <eliz <at> gnu.org>
To: Aura Kelloniemi <kaura.dev <at> sange.fi>
Cc: 54032 <at> debbugs.gnu.org
Subject: Re: bug#54032: 29.0.50; Emoji display on Linux console switched to hexadecimal output
Date: Thu, 17 Feb 2022 09:52:15 +0200
> From: Aura Kelloniemi <kaura.dev <at> sange.fi>
> Date: Thu, 17 Feb 2022 08:57:40 +0200
>
> on recent Emacs development repository builds, emoji characters are no longer
> displayed on Linux console. Instead Emacs prints \UABCDEF hexadecimal codes.
>
> This is due to the commit 10c680551e899805a6de7360e9b65986fd87df72 which
> probably makes things better on some terminals. Reverting this commit fixed
> the issue for me, and emojis are again displayed as usual.

Thanks, but I think we need more detailed information to understand the problem.

First, when you say "display emoji characters", which characters exactly do you mean? Can you show specific examples of text that includes Emoji, which displayed the Emoji glyphs before the above commit, but not after it? In particular, are we talking about single codepoints in the Emoji block, or are we talking about Emoji sequences that involve more than one codepoint (and are supposed to display like a single Emoji glyph)? For each example, please show both the text and what you see on display for that text.

Also, what is the value of auto-composition-mode? I think it should be the string "linux", in which case please try setting it to t and see if the display becomes better or worse.

> Linux console is (sort of) capable of displaying emojis. Console font can be
> configured so that it has glyphs for emojis exactly the same way as for other
> characters.

If that is the case, why did Emacs think the terminal cannot display these characters? Can you step with GDB inside terminal_glyph_code, when it is called for the first time in the Emacs session, and see whether the ioctl call we issue in calculate_glyph_code_table returns valid values for the Emoji codepoints?

> Also, blind users using refreshable braille displays use Linux
> console to access Emacs. The braille terminal driver is able to detect the
> correct character code points even when Linux itself is not able to display
> them properly on the screen. This detection is done using the /dev/vcsu
> (virtual console screen unicode) character devices.

Are you saying that the braille terminal driver will not respond to the ioctl call we issue in calculate_glyph_code_table? Is there any other method of knowing which characters are supported in that case?

> For these reasons it is important that emacs outputs the real
> characters to the terminal on Linux console.

Outputting codepoints for which there are no glyphs produces an unreadable display, since (AFAIU) the console displays them all as the "diamond" replacement character. Detecting the fact that a codepoint cannot be displayed allows us to produce something that at least can be interpreted, and allows the user to install optional features (such as those provided by latin1-disp.el) which will replace the characters that cannot be displayed by equivalent strings, for example ASCII strings. So we would like to keep the automatic detection of whether a given character can be displayed by the console, although it sounds like the current solution should be made more flexible and sophisticated in some way.

> Linux console has a terminal type string of "linux". lisp/term/linux.el
> contains already some Linux terminal specific code (which unfortunately
> assumes though that Linux has a default character set of Latin-1, which has
> never been true).

That's just the default. We attempt to detect which characters can be displayed later on, when the functions I mentioned above are called during startup.

> My preferred solution to this problem would be to add and document a way to
> configure character display logic on TTYs more precisely. It would be great to
> be able to control the terminal output of Unicode on grapheme cluster
> precision – i.e. allow the user to define a function which translates code
> points/grapheme clusters into something that their terminal can display.

I'm not yet sure something like that would be needed. It will certainly slow down the display on the console, which is undesirable for obvious reasons. It is also too complex (not every Emacs user can write Lisp programs that play sophisticated games with characters and glyphs).

I think we don't yet have a detailed enough understanding of the issue to discuss solutions, so I suggest postponing this discussion until the questions I asked above are answered, and we have a good understanding of what is going on. The commit you point to was made based on reports from another user of the Linux console (albeit not about Emoji), and in that case the change had a positive effect. So the issue is not simple, and we need a good understanding of it before we devise a solution.

Thanks.
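
For readers who want to see what such a console query looks like outside of GDB, the following Python sketch asks the Linux console for its Unicode-to-font map and then applies a fallback for characters that are missing from it. It assumes that the ioctl in question is GIO_UNIMAP from <linux/kd.h>; the thread does not name the request, so treat that as a guess. The helper names `query_unimap`, `lookup_glyph` and `render`, and the `\UXXXXXX` fallback format, are illustrative and are not Emacs code.

```python
import ctypes
import fcntl

GIO_UNIMAP = 0x4B66  # request number from <linux/kd.h>


class UniPair(ctypes.Structure):
    # Mirrors struct unipair: both fields are 16-bit.
    _fields_ = [("unicode", ctypes.c_ushort), ("fontpos", ctypes.c_ushort)]


class UniMapDesc(ctypes.Structure):
    # Mirrors struct unimapdesc.
    _fields_ = [("entry_ct", ctypes.c_ushort),
                ("entries", ctypes.POINTER(UniPair))]


def query_unimap(fd: int, capacity: int = 0xFFFF) -> dict:
    """Return {code point: font position} for the console behind fd.
    Raises OSError when fd is not a Linux virtual console."""
    entries = (UniPair * capacity)()
    desc = UniMapDesc(capacity, ctypes.cast(entries, ctypes.POINTER(UniPair)))
    fcntl.ioctl(fd, GIO_UNIMAP, desc)  # the kernel fills entries and entry_ct
    return {entries[i].unicode: entries[i].fontpos
            for i in range(desc.entry_ct)}


def lookup_glyph(table: dict, cp: int):
    """Font position for cp, or None when the map has no entry for it."""
    return table.get(cp)


def render(text: str, table: dict) -> str:
    """Keep characters that have a glyph; replace the rest with a hex escape
    (hypothetical format, chosen only for illustration)."""
    return "".join(ch if lookup_glyph(table, ord(ch)) is not None
                   else f"\\U{ord(ch):06X}" for ch in text)


if __name__ == "__main__":
    try:
        table = query_unimap(0)  # standard input
        print(len(table), "mapped code points")
    except OSError as err:
        print(f"not a Linux console: {err}")
```

One detail worth noting: the `unicode` field of `struct unipair` is 16 bits wide, so a map obtained this way cannot contain entries above U+FFFF. Most emoji code points lie above that limit. Whether this is what makes Emacs treat them as undisplayable is not established in this thread, and stepping through terminal_glyph_code as requested above would confirm or rule it out.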