GNU bug report logs - #42256
27.0.50; composition

Package: emacs;

Reported by: rms <at> gnu.org

Date: Wed, 8 Jul 2020 02:42:02 UTC

Severity: normal

Found in version 27.0.50

Done: Eli Zaretskii <eliz <at> gnu.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 42256 in the body.
You can then email your comments to 42256 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#42256; Package emacs. (Wed, 08 Jul 2020 02:42:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to rms <at> gnu.org:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Wed, 08 Jul 2020 02:42:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Richard Stallman <rms <at> gnu.org>
To: bug-gnu-emacs <at> gnu.org
Subject: 27.0.50; composition
Date: Tue, 07 Jul 2020 22:41:14 -0400
On the tty, a composition shows up as u followed by a diamond.
I try to find out with C-u C-x = what the diamond stands for,
and get this, which tells me the hex code 304 but does not
say what the character looks like or means.

It does so for the u, which I can see, but not for the 0x304
which I cannot see.

It would be nice for the description of a composition to give the name
or description of each of the components.

             position: 1484 of 26036 (6%), column: 12
            character: u (displayed as u) (codepoint 117, #o165, #x75)
              charset: ascii (ASCII (ISO646 IRV))
code point in charset: 0x75
               script: latin
               syntax: w 	which means: word
             category: .:Base, L:Left-to-right (strong), a:ASCII, l:Latin, r:Roman
             to input: type "C-x 8 RET 75" or "C-x 8 RET LATIN SMALL LETTER U"
          buffer code: #x75
            file code: #x75 (encoded by coding system utf-8-unix)
              display: composed to form "ū" (see below)

Composed with the following character(s) "̄" by these characters:
 u (#x75)
 ̄ (#x304)

Character code properties: customize what to show
  name: LATIN SMALL LETTER U
  general-category: Ll (Letter, Lowercase)
  decomposition: (117) ('u')

[back]



In GNU Emacs 27.0.50 (build 3, x86_64-pc-linux-gnu, GTK+ Version 2.24.30)
 of 2019-06-28 built on freetop
Repository revision: 093f5d0045cc5facd3728e385a71ef84f218bdfe
Repository branch: master
System Description: Trisquel GNU/Linux Flidas (8.0)

Recent messages:
Type C-x 1 to delete the help window.
Char: ) (41, #o51, #x29) point=1486 of 26036 (6%) column=13

Char: u (117, #o165, #x75) point=1484 of 26036 (6%) column=12
Mark set [2 times]
Saved text until "e)
  decomposition: (117) ('u')

[back]
"

Configured using:
 'configure 'CFLAGS=-O0 -g' --with-gnutls=ifavailable'

Configured features:
XPM JPEG TIFF GIF PNG RSVG SOUND GPM DBUS GSETTINGS GLIB NOTIFY
INOTIFY LIBXML2 FREETYPE HARFBUZZ M17N_FLT LIBOTF XFT ZLIB
TOOLKIT_SCROLL_BARS GTK2 X11 XDBE XIM THREADS PDUMPER GMP

Important settings:
  value of $LANG: en_US.UTF-8
  locale-coding-system: utf-8-unix

Major mode: Text

Minor modes in effect:
  shell-dirtrack-mode: t
  gpm-mouse-mode: t
  tooltip-mode: t
  global-eldoc-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  buffer-read-only: t
  line-number-mode: t
  transient-mark-mode: t
  abbrev-mode: t

Load-path shadows:
None found.

Features:
(shadow emacsbug descr-text ispell smerge-mode vc vc-dispatcher
cl-extra parse-time vc-cvs mhtml-mode css-mode smie eww mm-url gnus
nnheader wid-edit url-queue url url-proxy url-privacy url-expand
url-methods url-history url-cookie url-domsuf mailcap color js imenu
sgml-mode shell pcomplete grep mule-util compile comint ansi-color
conf-mode quail help-mode rmailout vc-git diff-mode easy-mmode
bug-reference cc-mode cc-fonts cc-guess cc-menus cc-cmds cc-styles
cc-align cc-engine cc-vars cc-defs dired-aux misearch multi-isearch
thingatpt etags fileloop generator xref project ring dabbrev mailalias
sendmail rmailkwd url-util shr svg xml dom browse-url qp rmailmm
message rmc puny format-spec rfc822 mml mml-sec epa epg gnus-util
text-property-search time-date mm-decode mm-bodies mm-encode
mailabbrev gmm-utils mailheader mail-parse rfc2231 rmail
rmail-loaddefs rfc2047 rfc2045 ietf-drums mm-util mail-prsvr
mail-utils dired dired-loaddefs t-mouse term/linux elec-pair view
derived paren cus-start cus-load advice finder-inf package easymenu
epg-config url-handlers url-parse auth-source cl-seq eieio eieio-core
cl-macs eieio-loaddefs password-cache json subr-x map url-vars seq
byte-opt gv bytecomp byte-compile cconv cl-loaddefs cl-lib tooltip
eldoc electric uniquify ediff-hook vc-hooks lisp-float-type mwheel
term/x-win x-win term/common-win x-dnd tool-bar dnd fontset image
regexp-opt fringe tabulated-list replace newcomment text-mode
elisp-mode lisp-mode prog-mode register page menu-bar rfn-eshadow
isearch timer select scroll-bar mouse jit-lock font-lock syntax
facemenu font-core term/tty-colors frame cl-generic cham georgian
utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao korean
japanese eucjp-ms cp51932 hebrew greek romanian slovak czech european
ethiopic indian cyrillic chinese composite charscript charprop
case-table epa-hook jka-cmpr-hook help simple abbrev obarray
minibuffer cl-preloaded nadvice loaddefs button faces cus-face
macroexp files text-properties overlay sha1 md5 base64 format env
code-pages mule custom widget hashtable-print-readable backquote
threads dbusbind inotify dynamic-setting system-font-setting
font-render-setting move-toolbar gtk x-toolkit x multi-tty
make-network-process emacs)

Memory information:
((conses 16 343141 64732)
 (symbols 48 28965 2)
 (strings 32 114233 932)
 (string-bytes 1 2935245)
 (vectors 16 36731)
 (vector-slots 8 1556392 168068)
 (floats 8 230 196)
 (intervals 56 69684 806)
 (buffers 992 72)
 (heap 1024 33230 3039))
[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]


-- 
Dr Richard Stallman
Chief GNUisance of the GNU Project (https://gnu.org)
Founder, Free Software Foundation (https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)






Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#42256; Package emacs. (Wed, 08 Jul 2020 14:19:02 GMT) Full text and rfc822 format available.

Message #8 received at 42256 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: rms <at> gnu.org
Cc: 42256 <at> debbugs.gnu.org
Subject: Re: bug#42256: 27.0.50; composition
Date: Wed, 08 Jul 2020 17:18:50 +0300
> From: Richard Stallman <rms <at> gnu.org>
> Date: Tue, 07 Jul 2020 22:41:14 -0400
> 
> 
> On the tty, a composition shows up as u followed by a diamond.
> I try to find out with C-u C-x = what the diamond stands for,
> and get this, which tells me the hex code 304 but does not
> say what the character looks like or means.

I'm not sure I understand what you'd like to see there in addition to
what is shown (the codepoint in hex).  That diamond means that your
terminal cannot display this codepoint, so Emacs cannot usefully show
you what it looks like (it does show it on my system, where that
character can be displayed).  Given that your terminal cannot display
this character, what would help you to know more?  We could perhaps
display the Unicode name of the character (COMBINING MACRON), would
that help?  Alternatively, you could go to that character and type
"C-u C-x =", which would then show the full information about it.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#42256; Package emacs. (Thu, 09 Jul 2020 03:02:01 GMT) Full text and rfc822 format available.

Message #11 received at 42256 <at> debbugs.gnu.org (full text, mbox):

From: Richard Stallman <rms <at> gnu.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 42256 <at> debbugs.gnu.org
Subject: Re: bug#42256: 27.0.50; composition
Date: Wed, 08 Jul 2020 23:01:27 -0400

  > I'm not sure I understand what you'd like to see there in addition to
  > what is shown (the codepoint in hex).  That diamond means that your
  > terminal cannot display this codepoint, so Emacs cannot usefully show
  > you what it looks like

I did not say "show", I said "say":

    and get this, which tells me the hex code 304 but does not
    say what the character looks like or means.

Unicode characters have names which say what they look like.  For
instance, á is LATIN SMALL LETTER A WITH ACUTE.  Even if my terminal
could not display á, that name would tell me what it is.

If that diamond were not inside a composition, I could use C-u C-x =
on it and find out what character that is.  The flaw here is that
there is no way to see the descriptive name of the second character in
a composition.

C-u C-x = shows the name for the first composed character, #x75,
but fails to show it for #x304:

    Composed with the following character(s) "̄" by these characters:
     u (#x75)
     ̄ (#x304)

    Character code properties: customize what to show
      name: LATIN SMALL LETTER U
      general-category: Ll (Letter, Lowercase)
      decomposition: (117) ('u')

   [nothing further]

I would like C-u C-x = on a composed character to show the name for
each character in the composition.







Reply sent to Eli Zaretskii <eliz <at> gnu.org>:
You have taken responsibility. (Thu, 09 Jul 2020 17:53:01 GMT) Full text and rfc822 format available.

Notification sent to rms <at> gnu.org:
bug acknowledged by developer. (Thu, 09 Jul 2020 17:53:02 GMT) Full text and rfc822 format available.

Message #16 received at 42256-done <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: rms <at> gnu.org
Cc: 42256-done <at> debbugs.gnu.org
Subject: Re: bug#42256: 27.0.50; composition
Date: Thu, 09 Jul 2020 20:51:48 +0300
> From: Richard Stallman <rms <at> gnu.org>
> Cc: 42256 <at> debbugs.gnu.org
> Date: Wed, 08 Jul 2020 23:01:27 -0400
> 
>   > I'm not sure I understand what you'd like to see there in addition to
>   > what is shown (the codepoint in hex).  That diamond means that your
>   > terminal cannot display this codepoint, so Emacs cannot usefully show
>   > you what it looks like
> 
> I did not say "show", I said "say":

Please forgive me for not getting this fine nuance of what you said.
We frequently use "say" meaning something a program displays (as in
"Emacs says this:" etc.), so it was easy for me to misunderstand.

> If that diamond were not inside a composition, I could use C-u C-x =
> on it and find out what character that is.  The flaw here is that
> there is no way to see the descriptive name of the second character in
> a composition.

As I said, you can "C-u C-x =" on that diamond in the *Help* buffer,
but I agree that it would be handy to have the info shown
automatically.

> I would like C-u C-x = on a composed character to show the name for
> each character in the composition.

OK, I've now added the names of the characters to the composition
information display on TTY frames.  From now on Emacs will say in this
case:

  Composed with the following character(s) "̄" by these characters:
   u (#x75) LATIN SMALL LETTER U
   ̄ (#x304) COMBINING MACRON

I'm therefore closing this bug report.
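
For anyone who wants to reproduce this kind of listing outside Emacs, here is a minimal Python sketch. It is not the code that was added to Emacs; the function name `describe_components` is made up for illustration, and it relies only on the standard `unicodedata` module to look up the Unicode name of each code point.

```python
import unicodedata

def describe_components(text):
    """Return one line per code point: character, hex code, Unicode name."""
    lines = []
    for ch in text:
        # Use a placeholder for code points that have no Unicode name.
        name = unicodedata.name(ch, "<unnamed>")
        lines.append(f"{ch} (#x{ord(ch):x}) {name}")
    return lines

# "u" followed by U+0304, the sequence discussed in this report.
for line in describe_components("u\u0304"):
    print(line)
```

The point of the sketch is that the name lookup needs only the code point, so it works even when the terminal cannot draw the character.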




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#42256; Package emacs. (Fri, 10 Jul 2020 00:04:02 GMT) Full text and rfc822 format available.

Message #19 received at 42256 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: 42256 <at> debbugs.gnu.org
Cc: eliz <at> gnu.org, rms <at> gnu.org
Subject: Re: bug#42256: 27.0.50; composition
Date: Fri, 10 Jul 2020 02:36:41 +0300
>> I would like C-u C-x = on a composed character to show the name for
>> each character in the composition.
>
> OK, I've now added the names of the characters to the composition
> information display on TTY frames.

This has been a big problem for me, thanks for fixing.  I see now
all combining characters displayed in the same Help buffer on TTY:

  Composed with the following character(s) "́" by these characters:
   a (#x61) LATIN SMALL LETTER A
   ́ (#x301) COMBINING ACUTE ACCENT

But I wonder why display combining character names only on TTY frames?
On GUI frames it currently displays only:

  Composed with the following character(s) "́" using this font:
    x:-misc-fixed-medium-r-normal--15-108-100-100-c-60-iso10646-1
  by these glyphs:
    [0 1 97 97 6 0 6 12 3 nil]
    [0 1 769 769 6 0 6 12 3 [-6 0 0]]

I don't know what these glyph numbers mean, but still no combining
character names are displayed on GUI frames.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#42256; Package emacs. (Fri, 10 Jul 2020 03:53:01 GMT) Full text and rfc822 format available.

Message #22 received at 42256 <at> debbugs.gnu.org (full text, mbox):

From: Richard Stallman <rms <at> gnu.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: eliz <at> gnu.org, 42256 <at> debbugs.gnu.org
Subject: Re: bug#42256: 27.0.50; composition
Date: Thu, 09 Jul 2020 23:52:21 -0400

  > As I said, you can "C-u C-x =" on that diamond in the *Help* buffer,

Sorry, I didn't understand that point before.  I guess it would
work.

    > Composed with the following character(s) "̄" by these characters:
    >  u (#x75) LATIN SMALL LETTER U
    >  ̄ (#x304) COMBINING MACRON

Thank you.  I expect all will agree that is more helpful.







Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#42256; Package emacs. (Fri, 10 Jul 2020 06:23:02 GMT) Full text and rfc822 format available.

Message #25 received at 42256 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Juri Linkov <juri <at> linkov.net>
Cc: rms <at> gnu.org, 42256 <at> debbugs.gnu.org
Subject: Re: bug#42256: 27.0.50; composition
Date: Fri, 10 Jul 2020 09:21:50 +0300
> From: Juri Linkov <juri <at> linkov.net>
> Cc: eliz <at> gnu.org,  rms <at> gnu.org
> Date: Fri, 10 Jul 2020 02:36:41 +0300
> 
> But I wonder why display combining character names only on TTY frames?

Because font glyphs have no names, at least not names that Emacs knows
about and that could be of use to users.

>   Composed with the following character(s) "́" using this font:
>     x:-misc-fixed-medium-r-normal--15-108-100-100-c-60-iso10646-1
>   by these glyphs:
>     [0 1 97 97 6 0 6 12 3 nil]
>     [0 1 769 769 6 0 6 12 3 [-6 0 0]]
> 
> I don't know what these glyph numbers mean

Which numbers?  If you mean the components of the glyph vectors, see
the doc string of composition-get-gstring.  I don't think the details
of this information are useful for casual users.
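
As a reading aid, the following Python sketch labels the fields of the two glyph vectors quoted above. It assumes the layout given in that doc string, [FROM-IDX TO-IDX C CODE WIDTH LBEARING RBEARING ASCENT DESCENT ADJUSTMENT]; the `Glyph` class and its field names are invented here for illustration and are not part of Emacs.

```python
from typing import NamedTuple, Optional, Tuple

class Glyph(NamedTuple):
    from_idx: int   # first character offset covered by this glyph
    to_idx: int     # last character offset covered by this glyph
    char: int       # code point of the character
    code: int       # glyph code in the font
    width: int      # metrics, in pixels
    lbearing: int
    rbearing: int
    ascent: int
    descent: int
    adjustment: Optional[Tuple[int, int, int]]  # (x-off, y-off, wadjust) or None

# The two glyph vectors from the GUI output quoted in this message.
glyphs = [
    Glyph(0, 1, 97, 97, 6, 0, 6, 12, 3, None),
    Glyph(0, 1, 769, 769, 6, 0, 6, 12, 3, (-6, 0, 0)),
]

for g in glyphs:
    print(f"chars {g.from_idx}..{g.to_idx}: U+{g.char:04X}, "
          f"glyph code {g.code}, width {g.width}, adjustment {g.adjustment}")
```

Read this way, the third element of each vector is the character's code point (97 is "a", 769 is #x301), and the last element of the second vector shifts the accent back over the base letter.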

> but still no combining character names are displayed on GUI frames.

I don't think I understand what you mean by that.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#42256; Package emacs. (Sat, 11 Jul 2020 02:19:01 GMT) Full text and rfc822 format available.

Message #28 received at 42256 <at> debbugs.gnu.org (full text, mbox):

From: Richard Stallman <rms <at> gnu.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 42256 <at> debbugs.gnu.org, juri <at> linkov.net
Subject: Re: bug#42256: 27.0.50; composition
Date: Fri, 10 Jul 2020 22:17:50 -0400

  > Because font glyphs have no names, at least not names that Emacs knows
  > about and that could be of use to users.

Names such as LATIN SMALL LETTER A WITH ACUTE belong to Unicode code
points, not to glyphs.  They do not depend on the font used to display
the character.

So I think it makes sense to show those names independent of
the kind of display.

On a graphic display, it should show the glyph data (as now)
in addition to those Unicode character names.
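
A small Python sketch of the same point, using only the standard `unicodedata` module: the names are looked up from the code points alone, with no font or display involved. The variable names are illustrative.

```python
import unicodedata

precomposed = "\u00e1"  # the letter a with acute as a single code point
decomposed = unicodedata.normalize("NFD", precomposed)  # "a" followed by U+0301

# The precomposed character has its own name.
print(unicodedata.name(precomposed))

# After canonical decomposition, each component has a name too.
for ch in decomposed:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```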







Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#42256; Package emacs. (Sun, 12 Jul 2020 00:36:02 GMT) Full text and rfc822 format available.

Message #31 received at 42256 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: rms <at> gnu.org, 42256 <at> debbugs.gnu.org
Subject: Re: bug#42256: 27.0.50; composition
Date: Sun, 12 Jul 2020 02:57:03 +0300
>> But I wonder why display combining character names only on TTY frames?
>
> Because font glyphs have no names, at least not names that Emacs knows
> about and that could be of use to users.

Font glyphs have no names indeed, but the characters that they display
have names.

>>   Composed with the following character(s) "́" using this font:
>>     x:-misc-fixed-medium-r-normal--15-108-100-100-c-60-iso10646-1
>>   by these glyphs:
>>     [0 1 97 97 6 0 6 12 3 nil]
>>     [0 1 769 769 6 0 6 12 3 [-6 0 0]]
>>
>> I don't know what these glyph numbers mean
>
> Which numbers?  If you mean the components of the glyph vectors, see
> the doc string of composition-get-gstring.  I don't think the details
> of this information are useful for casual users.

I agree that the details of the glyph vectors are not useful for users.
But the character names are hugely useful, even on GUI frames,
not only on TTY frames.

>> but still no combining character names are displayed on GUI frames.
>
> I don't think I understand what you mean by that.

The names of combining characters under point are now displayed on TTY frames,
but are still not displayed on GUI frames.  This information about buffer
characters is needed regardless of fonts and glyphs.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#42256; Package emacs. (Sun, 12 Jul 2020 15:42:02 GMT) Full text and rfc822 format available.

Message #34 received at 42256 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Juri Linkov <juri <at> linkov.net>
Cc: rms <at> gnu.org, 42256 <at> debbugs.gnu.org
Subject: Re: bug#42256: 27.0.50; composition
Date: Sun, 12 Jul 2020 18:40:37 +0300
> From: Juri Linkov <juri <at> linkov.net>
> Cc: 42256 <at> debbugs.gnu.org,  rms <at> gnu.org
> Date: Sun, 12 Jul 2020 02:57:03 +0300
> 
> I agree that the details of the glyph vectors are not useful for users.
> But the character names are hugely useful, even on GUI frames,
> not only on TTY frames.

On TTY frames, they are the only information available about the
composition (because the actual composition is done by the terminal
emulator).  On GUI frames, we have more important information already
shown.

If someone wants or needs to know which characters participated in a
composition on a GUI frame, they can go to those characters in the
*Help* buffer and type "C-u C-x =".

That said, if someone wants to work on adding the character names to
the GUI display as well, I won't object.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#42256; Package emacs. (Sun, 12 Jul 2020 23:37:01 GMT) Full text and rfc822 format available.

Message #37 received at 42256 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: rms <at> gnu.org, 42256 <at> debbugs.gnu.org
Subject: Re: bug#42256: 27.0.50; composition
Date: Mon, 13 Jul 2020 02:35:00 +0300
> If someone wants or needs to know which characters participated in a
> composition on a GUI frame, they can go to those characters in the
> *Help* buffer and type "C-u C-x =".

It's what I'm doing all the time:

1. type "C-u C-x ="
2. move point to the combining character
3. type "C-u C-x =" again

This takes too much time.

> That said, if someone wants to work on adding the character names to
> the GUI display as well, I won't object.

Ok, done on master in commit 46a0c115f0.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#42256; Package emacs. (Mon, 13 Jul 2020 02:57:01 GMT) Full text and rfc822 format available.

Message #40 received at 42256 <at> debbugs.gnu.org (full text, mbox):

From: Richard Stallman <rms <at> gnu.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 42256 <at> debbugs.gnu.org, juri <at> linkov.net
Subject: Re: bug#42256: 27.0.50; composition
Date: Sun, 12 Jul 2020 22:56:27 -0400

  > On TTY frames, they are the only information available about the
  > composition (because the actual composition is done by the terminal
  > emulator).  On GUI frames, we have more important information already
  > shown.

I contend that the Unicode character name is more meaningful to the user
than the numeric glyph codes.  I hope someone will extend the display
of all the components, now implemented on TTYs, to graphic displays as well.

It doesn't affect me personally since I normally run Emacs only on TTYs,
but I expect it will help other users.







Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#42256; Package emacs. (Mon, 13 Jul 2020 03:39:01 GMT) Full text and rfc822 format available.

Message #43 received at 42256 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Juri Linkov <juri <at> linkov.net>
Cc: rms <at> gnu.org, 42256 <at> debbugs.gnu.org
Subject: Re: bug#42256: 27.0.50; composition
Date: Mon, 13 Jul 2020 06:37:52 +0300
> From: Juri Linkov <juri <at> linkov.net>
> Cc: 42256 <at> debbugs.gnu.org,  rms <at> gnu.org
> Date: Mon, 13 Jul 2020 02:35:00 +0300
> 
> > If someone wants or needs to know which characters participated in a
> > composition on a GUI frame, they can go to those characters in the
> > *Help* buffer and type "C-u C-x =".
> 
> It's what I'm doing all the time:

Why do you need that, may I ask?  Why is it important to know which
characters were composed, and in what usage scenario?

> > That said, if someone wants to work on adding the character names to
> > the GUI display as well, I won't object.
> 
> Ok, done on master in commit 46a0c115f0.

Please also update the Emacs manual, where it describes this display,
because now the text there is outdated.

Thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#42256; Package emacs. (Mon, 13 Jul 2020 13:41:01 GMT) Full text and rfc822 format available.

Message #46 received at 42256 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: juri <at> linkov.net
Cc: rms <at> gnu.org, 42256 <at> debbugs.gnu.org
Subject: Re: bug#42256: 27.0.50; composition
Date: Mon, 13 Jul 2020 16:39:53 +0300
> Date: Mon, 13 Jul 2020 06:37:52 +0300
> From: Eli Zaretskii <eliz <at> gnu.org>
> Cc: rms <at> gnu.org, 42256 <at> debbugs.gnu.org
> 
> > > That said, if someone wants to work on adding the character names to
> > > the GUI display as well, I won't object.
> > 
> > Ok, done on master in commit 46a0c115f0.
> 
> Please also update the Emacs manual, where it describes this display,
> because now the text there is outdated.

Actually, the results are inaccurate or even incorrect, at least in
some cases.  Here's one case where the results are wrong:

  emacs -Q
  C-h h
  C-u 411 M-g c
  C-u C-x =

You will see towards the end of the *Help* buffer:

 Composed with the following character(s) "്" using this font:
   harfbuzz:-outline-Kartika-normal-normal-normal-serif-13-*-*-*-p-*-iso10646-1
 by these glyphs:
   [0 1 3384 337 12 0 12 9 0 nil]
 from these character(s):
   സ (#xd38) MALAYALAM LETTER SA
   ് (#xd4d) MALAYALAM SIGN VIRAMA
   ക (#xd15) MALAYALAM LETTER KA
   ാ (#xd3e) MALAYALAM VOWEL SIGN AA

The added list of characters seems to imply that 4 characters were
composed at buffer position 411.  But actually only the first 2 of
them were composed, as can clearly be seen from the line starting with
"Composed with" above.

A similar problem happens at buffer position 413 of HELLO.  And at
position 872 you can see an even more stark example: instead of 2
characters, we show 8(!).

Can this please be fixed?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#42256; Package emacs. (Tue, 14 Jul 2020 00:56:02 GMT) Full text and rfc822 format available.

Message #49 received at 42256 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: rms <at> gnu.org, 42256 <at> debbugs.gnu.org
Subject: Re: bug#42256: 27.0.50; composition
Date: Tue, 14 Jul 2020 03:13:12 +0300
> Why do you need that, may I ask?  Why is it important to know which
> characters were composed, and in what usage scenario?

For the same reason that there is a need to see the name of
the base character.  Displaying information about only part of a
composition (only its first character) is not enough to see
which characters participate in the composition, and especially to learn
the names of the usually small accent glyphs that are hard to distinguish
when composed with the base character.

> Please also update the Emacs manual, where it describes this display,
> because now the text there is outdated.

Done.

> Actually, the results are inaccurate or even incorrect, at least in
> some cases.  Here's one case where the results are wrong:
>
>   emacs -Q
>   C-h h
>   C-u 411 M-g c
>   C-u C-x =
>
> You will see towards the end of the *Help* buffer:
>
>  Composed with the following character(s) "്" using this font:
>    harfbuzz:-outline-Kartika-normal-normal-normal-serif-13-*-*-*-p-*-iso10646-1
>  by these glyphs:
>    [0 1 3384 337 12 0 12 9 0 nil]
>  from these character(s):
>    സ (#xd38) MALAYALAM LETTER SA
>    ് (#xd4d) MALAYALAM SIGN VIRAMA
>    ക (#xd15) MALAYALAM LETTER KA
>    ാ (#xd3e) MALAYALAM VOWEL SIGN AA

I tried, but got a different output:

Composed with the following character(s) "്കാ" using this font:
  ftcrhb:-PfEd-Lohit Malayalam-normal-normal-normal-*-13-*-*-*-*-0-iso10646-1
by these glyphs:
  [0 3 3384 184 14 0 15 8 5 nil]
  [0 3 3405 71 6 0 6 8 0 nil]
from these character(s):
  സ (#xd38) MALAYALAM LETTER SA
  ് (#xd4d) MALAYALAM SIGN VIRAMA
  ക (#xd15) MALAYALAM LETTER KA
  ാ (#xd3e) MALAYALAM VOWEL SIGN AA

The difference is in "Composed with the following character(s) "്കാ"
and in the rows of glyphs.  And according to the composition string "്കാ"
the list of 4 characters is correct.

> The added list of characters seems to imply that 4 characters were
> composed at buffer position 411.  But actually only the first 2 of
> them were composed, as can clearly be seen from the line starting with
> "Composed with" above.
>
> A similar problem happens at buffer position 413 of HELLO.

Here is the output from 'C-u 413 M-g c C-u C-x =':

Composed with the following character(s) "ം" using this font:
  ftcrhb:-PfEd-Lohit Malayalam-normal-normal-normal-*-13-*-*-*-*-0-iso10646-1
by these glyphs:
  [0 1 3376 59 8 0 8 8 0 nil]
  [0 1 3330 16 7 0 8 6 0 nil]
from these character(s):
  ര (#xd30) MALAYALAM LETTER RA
  ം (#xd02) MALAYALAM SIGN ANUSVARA

Again, it seems the list of characters is correct according
to the text "Composed with the following character(s) "ം".

> And at position 872 you can see an even more stark example: instead of
> 2 characters, we show 8(!).

I don't understand where these 8 characters are coming from.
This composition of 8 characters is returned by find-composition.
Maybe the bug is in find-composition?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#42256; Package emacs. (Tue, 14 Jul 2020 02:38:01 GMT) Full text and rfc822 format available.

Message #52 received at 42256 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Juri Linkov <juri <at> linkov.net>
Cc: rms <at> gnu.org, 42256 <at> debbugs.gnu.org
Subject: Re: bug#42256: 27.0.50; composition
Date: Tue, 14 Jul 2020 05:36:56 +0300
> From: Juri Linkov <juri <at> linkov.net>
> Cc: rms <at> gnu.org,  42256 <at> debbugs.gnu.org
> Date: Tue, 14 Jul 2020 03:13:12 +0300
> 
> I tried, but got a different output:

The output you get depends on the font as well, so I see no problem
here.

> > And at position 872 you can see an even more stark example: instead of
> > 2 characters, we show 8(!).
> 
> I don't understand where these 8 characters are coming from.
> This composition of 8 characters is returned by find-composition.
> Maybe the bug is in find-composition?

No, there's no bug in find-composition: it returns what we should pass
to the text shaper.  The problem here is that your code assumes all
the characters we passed to the shaper are a single grapheme cluster,
which is not true.

I suggest looking at the code which displays the "Composed with" line
and decides which characters to show there, and do the same in your
addition.
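
To illustrate the distinction, here is a rough Python sketch, not the Emacs code itself. It assumes that the first two elements of each glyph vector (FROM-IDX and TO-IDX) are inclusive character offsets into the run handed to the shaper, and it picks the cluster containing a given offset. The function name `cluster_range` is hypothetical; the index pairs are copied from the two outputs for position 411 quoted in this thread.

```python
def cluster_range(glyph_ranges, idx):
    """Return (first, last) offsets of the cluster containing idx, or None."""
    for from_idx, to_idx in glyph_ranges:
        # The first glyph whose range reaches idx belongs to its cluster.
        if to_idx >= idx:
            return (from_idx, to_idx)
    return None

# Characters passed to the shaper: SA, VIRAMA, KA, VOWEL SIGN AA.
chars = ["\u0d38", "\u0d4d", "\u0d15", "\u0d3e"]

# (FROM-IDX, TO-IDX) pairs from the glyph vectors reported with each font.
kartika = [(0, 1)]
lohit = [(0, 3), (0, 3)]

for font, ranges in (("Kartika", kartika), ("Lohit Malayalam", lohit)):
    first, last = cluster_range(ranges, 0)
    print(font, [f"U+{ord(c):04X}" for c in chars[first:last + 1]])
```

With the Kartika data only the first two characters fall in the cluster at offset 0, while with the Lohit Malayalam data all four do, which matches the two "Composed with" lines reported above.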

Thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#42256; Package emacs. (Tue, 14 Jul 2020 23:22:02 GMT) Full text and rfc822 format available.

Message #55 received at 42256 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: rms <at> gnu.org, 42256 <at> debbugs.gnu.org
Subject: Re: bug#42256: 27.0.50; composition
Date: Wed, 15 Jul 2020 02:20:50 +0300
> I suggest looking at the code which displays the "Composed with" line
> and decides which characters to show there, and do the same in your
> addition.

Do you think it's now better?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#42256; Package emacs. (Wed, 15 Jul 2020 14:40:01 GMT) Full text and rfc822 format available.

Message #58 received at 42256 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Juri Linkov <juri <at> linkov.net>
Cc: rms <at> gnu.org, 42256 <at> debbugs.gnu.org
Subject: Re: bug#42256: 27.0.50; composition
Date: Wed, 15 Jul 2020 17:39:21 +0300
> From: Juri Linkov <juri <at> linkov.net>
> Cc: rms <at> gnu.org,  42256 <at> debbugs.gnu.org
> Date: Wed, 15 Jul 2020 02:20:50 +0300
> 
> > I suggest looking at the code which displays the "Composed with" line
> > and decides which characters to show there, and do the same in your
> > addition.
> 
> Do you think it's now better?

It is much better, thanks.  But there is still one small glitch: try
"C-u C-x =" on buffer position 872 in HELLO, and you will see that
U+0651 ARABIC SHADDA is displayed as an empty box.  By contrast, the
"Composed with" line displays the shadda correctly.  I think this is
because you display only a single character (by using 'aref'), whereas
combining characters need to be surrounded by TABs to display
correctly without combining with their neighbors.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#42256; Package emacs. (Wed, 15 Jul 2020 23:57:01 GMT) Full text and rfc822 format available.

Message #61 received at 42256 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: rms <at> gnu.org, 42256 <at> debbugs.gnu.org
Subject: Re: bug#42256: 27.0.50; composition
Date: Thu, 16 Jul 2020 02:43:45 +0300
>> > I suggest looking at the code which displays the "Composed with" line
>> > and decides which characters to show there, and do the same in your
>> > addition.
>> 
>> Do you think it's now better?
>
> It is much better, thanks.  But there is still one small glitch: try
> "C-u C-x =" on buffer position 872 in HELLO, and you will see that
> U+0651 ARABIC SHADDA is displayed as an empty box.  By contrast, the
> "Composed with" line displays the shadda correctly.  I think this is
> because you display only a single character (by using 'aref'), whereas
> combining characters need to be surrounded by TABs to display
> correctly without combining with their neighbors.

Thanks for the hint, I missed the invisible text property named ‘composition’
with TABs on the "Composed with" line.  Now fixed.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#42256; Package emacs. (Thu, 16 Jul 2020 16:42:01 GMT) Full text and rfc822 format available.

Message #64 received at 42256 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Juri Linkov <juri <at> linkov.net>
Cc: rms <at> gnu.org, 42256 <at> debbugs.gnu.org
Subject: Re: bug#42256: 27.0.50; composition
Date: Thu, 16 Jul 2020 19:40:48 +0300
> From: Juri Linkov <juri <at> linkov.net>
> Cc: rms <at> gnu.org,  42256 <at> debbugs.gnu.org
> Date: Thu, 16 Jul 2020 02:43:45 +0300
> 
> Thanks for the hint, I missed the invisible text property named ‘composition’
> with TABs on the "Composed with" line.  Now fixed.

Thanks, confirmed.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Fri, 14 Aug 2020 11:24:05 GMT) Full text and rfc822 format available.

This bug report was last modified 3 years and 255 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.