GNU bug report logs -
#55319
28.1.50; Abugida not rendered correctly (MacOS)
Reported by: Kai Ma <justksqsf <at> gmail.com>
Date: Sun, 8 May 2022 16:23:01 UTC
Severity: wishlist
Found in version 28.1.50
Done: Eli Zaretskii <eliz <at> gnu.org>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 55319 in the body.
You can then email your comments to 55319 AT debbugs.gnu.org in the normal way.
Report forwarded to bug-gnu-emacs <at> gnu.org: bug#55319; Package emacs. (Sun, 08 May 2022 16:23:01 GMT)
Acknowledgement sent to Kai Ma <justksqsf <at> gmail.com>: New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Sun, 08 May 2022 16:23:01 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
I installed the Crisa Regular font [1] (font name is “Crisa”) and tried to type some zbalermorna [2] (an abugida) into Emacs.
However, the positions of the vowels are not correct, as shown in the attached screenshot obtained in emacs -Q.
The vowels should be right above the consonants.
The correct rendering can be seen at this web page [3] (using a decent modern Web browser).
I can confirm that other applications using the system GUI toolkit work, e.g. TextEdit.app.
[1] https://github.com/jackhumbert/zbalermorna/tree/master/fonts/
[2] https://jackhumbert.github.io/zbalermorna/
[3] https://jackhumbert.github.io/zbalermorna/examiner/#crisa-regular
In GNU Emacs 28.1.50 (build 1, x86_64-apple-darwin21.4.0, NS appkit-2113.40 Version 12.3.1 (Build 21E258))
of 2022-04-05 built on Kais-MacBook.local
Windowing system distributor 'Apple', version 10.3.2113
System Description: macOS 12.3.1
Configured using:
'configure --disable-dependency-tracking --disable-silent-rules
--enable-locallisppath=/usr/local/share/emacs/site-lisp
--infodir=/usr/local/Cellar/emacs-plus <at> 28/28.0.50/share/info/emacs
--prefix=/usr/local/Cellar/emacs-plus <at> 28/28.0.50 --with-xml2
--with-gnutls --with-native-compilation --with-dbus
--without-imagemagick --with-modules --with-rsvg --with-xwidgets
--with-ns --disable-ns-self-contained
'CFLAGS=-I/usr/local/opt/gcc/include -I/usr/local/opt/libgccjit/include
-I/usr/local/opt/gmp/include -I/usr/local/opt/jpeg/include'
'LDFLAGS=-L/usr/local/lib/gcc/11 -I/usr/local/opt/gcc/include
-I/usr/local/opt/libgccjit/include -I/usr/local/opt/gmp/include
-I/usr/local/opt/jpeg/include''
Configured features:
ACL DBUS GIF GLIB GMP GNUTLS JPEG JSON LCMS2 LIBXML2 MODULES NATIVE_COMP
NOTIFY KQUEUE NS PDUMPER PNG RSVG THREADS TIFF TOOLKIT_SCROLL_BARS XIM
XWIDGETS ZLIB
Important settings:
value of $LC_CTYPE: UTF-8
value of $LANG: en_CN <at> calendar=iso8601.UTF-8
locale-coding-system: utf-8-unix
Major mode: Lisp Interaction
Minor modes in effect:
text-scale-mode: t
tooltip-mode: t
global-eldoc-mode: t
eldoc-mode: t
show-paren-mode: t
electric-indent-mode: t
mouse-wheel-mode: t
tool-bar-mode: t
menu-bar-mode: t
file-name-shadow-mode: t
global-font-lock-mode: t
font-lock-mode: t
blink-cursor-mode: t
auto-composition-mode: t
auto-encryption-mode: t
auto-compression-mode: t
line-number-mode: t
indent-tabs-mode: t
transient-mark-mode: t
Load-path shadows:
None found.
Features:
(shadow comp comp-cstr warnings rx cl-extra sort mail-extr emacsbug
message rmc puny dired dired-loaddefs rfc822 mml mml-sec epa derived epg
rfc6068 epg-config gnus-util rmail rmail-loaddefs auth-source cl-seq
eieio eieio-core cl-macs eieio-loaddefs password-cache json map
text-property-search seq byte-opt gv bytecomp byte-compile cconv
mm-decode mm-bodies mm-encode mail-parse rfc2231 mailabbrev gmm-utils
mailheader sendmail rfc2047 rfc2045 ietf-drums mm-util mail-prsvr
mail-utils face-remap cus-start cus-load rfc1345 quail help-mode lojban
vc-git diff-mode easy-mmode vc-dispatcher time-date subr-x cl-loaddefs
cl-lib iso-transl tooltip eldoc paren electric uniquify ediff-hook
vc-hooks lisp-float-type elisp-mode mwheel term/ns-win ns-win
ucs-normalize mule-util term/common-win tool-bar dnd fontset image
regexp-opt fringe tabulated-list replace newcomment text-mode lisp-mode
prog-mode register page tab-bar menu-bar rfn-eshadow isearch easymenu
timer select scroll-bar mouse jit-lock font-lock syntax font-core
term/tty-colors frame minibuffer cl-generic cham georgian utf-8-lang
misc-lang vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms
cp51932 hebrew greek romanian slovak czech european ethiopic indian
cyrillic chinese composite emoji-zwj charscript charprop case-table
epa-hook jka-cmpr-hook help simple abbrev obarray cl-preloaded nadvice
button loaddefs faces cus-face macroexp files window text-properties
overlay sha1 md5 base64 format env code-pages mule custom widget
hashtable-print-readable backquote threads xwidget-internal dbusbind
kqueue cocoa ns lcms2 multi-tty make-network-process native-compile
emacs)
Memory information:
((conses 16 124050 4825)
(symbols 48 9858 1)
(strings 32 28293 1994)
(string-bytes 1 877108)
(vectors 16 24427)
(vector-slots 8 422635 7736)
(floats 8 34 43)
(intervals 56 1634 0)
(buffers 992 14))
[PastedGraphic-1.png (image/png, inline)]
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#55319; Package emacs. (Sun, 08 May 2022 16:58:01 GMT)
Message #8 received at 55319 <at> debbugs.gnu.org (full text, mbox):
severity 55319 wishlist
thanks
> From: Kai Ma <justksqsf <at> gmail.com>
> Date: Sun, 8 May 2022 19:45:04 +0800
>
> I installed the Crisa Regular font [1] (font name is “Crisa”) and tried to type some zbalermorna [2] (an abugida) into Emacs.
>
> However, the positions of the vowels are not correct, as shown in the attached screenshot obtained in emacs -Q.
> The vowels should be right above the consonants.
>
> The correct rendering can be seen at this web page [3] (using a decent modern Web browser).
> I can confirm that other applications using the system GUI toolkit work, e.g. TextEdit.app.
Emacs doesn't OOTB support scripts whose characters are not in
Unicode. When characters are not in Unicode, their properties and
attributes aren't known, unless someone tells Emacs what they are.
The sites to which you point indicate that this script was created for
an artificial language and its characters use the Private Use Area
codepoints of the Unicode code-space. So making Emacs support this
invented script will need some work from someone who knows the details
and can submit patches which add these characters and their properties
to the databases Emacs needs in order to handle those characters.
Severity set to 'wishlist' from 'normal'. Request was from Eli Zaretskii <eliz <at> gnu.org> to control <at> debbugs.gnu.org. (Sun, 08 May 2022 16:58:02 GMT)
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#55319; Package emacs. (Mon, 09 May 2022 02:39:01 GMT)
Message #13 received at 55319 <at> debbugs.gnu.org (full text, mbox):
> From: Kai Ma <justksqsf <at> gmail.com>
> Date: Mon, 9 May 2022 09:43:48 +0800
> Cc: 55319 <at> debbugs.gnu.org
>
> > On May 9, 2022, at 00:57, Eli Zaretskii <eliz <at> gnu.org> wrote:
> >
> > Emacs doesn't OOTB support scripts whose characters are not in
> > Unicode. When characters are not in Unicode, their properties and
> > attributes aren't known, unless someone tells Emacs what they are.
>
> That was my thought, too. However, I don’t think this is the root cause in this case.
>
> 1. Other applications (including Web browsers and the native GUI toolkit) render
> the text just fine. This makes me believe the font file itself contains enough information.
I don't know about other applications and their needs, but I do know
what Emacs needs to support a character. A font cannot contain enough
information for Emacs to use a character in general, and doesn't even
include enough information for Emacs to display that character. More
importantly, Emacs never takes information about characters from
fonts. It's actually the other way around: Emacs needs to know enough
about a character to choose the right font for it.
> 2. Emacs also discovers some glyphs should be composed, such as intonation marks (e.g. #xed8c),
> but not the vowels. I don’t know why this happens. I just tried copying character properties
> from these good ones. It didn’t work.
Emacs doesn't discover composition rules. The composition rules are
part of the Emacs code, see the various *.el files in lisp/language/
directory. Some of these composition rules are derived automatically
from character properties, see composite.el and characters.el (which
cannot happen without Emacs knowing up-front about the properties).
> But in general, this could just be Emacs not fully supporting OpenType features.
Emacs relies on text-shaping engines for full OTF support. AFAIK,
text-shaping engines also don't support PUA characters without special
measures.
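One way to see what Emacs currently knows about a code point is to query its character properties and its composition-rule entry. The following is a minimal sketch, not part of the original exchange; the helper name `my-describe-codepoint` is hypothetical, while `get-char-code-property`, `char-script-table` and `composition-function-table` are standard Emacs facilities.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-describe-codepoint (char)
  "Return a plist of what Emacs records for CHAR."
  (list :general-category (get-char-code-property char 'general-category)
        :combining-class (get-char-code-property char 'canonical-combining-class)
        :script (aref char-script-table char)
        ;; Rules consulted by auto-composition for CHAR, or nil if none.
        :composition-rules (aref composition-function-table char)))

;; #xeda0 is one of the PUA code points the report treats as a vowel.
(my-describe-codepoint #xeda0)
```

For a Private Use Area code point the general category is expected to be `Co`, which tells Emacs nothing about whether the character is a base letter or a combining mark.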
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#55319; Package emacs. (Mon, 09 May 2022 07:05:06 GMT)
Message #16 received at 55319 <at> debbugs.gnu.org (full text, mbox):
> On May 9, 2022, at 00:57, Eli Zaretskii <eliz <at> gnu.org> wrote:
>
> Emacs doesn't OOTB support scripts whose characters are not in
> Unicode. When characters are not in Unicode, their properties and
> attributes aren't known, unless someone tells Emacs what they are.
That was my thought, too. However, I don’t think this is the root cause in this case.
1. Other applications (including Web browsers and the native GUI toolkit) render
the text just fine. This makes me believe the font file itself contains enough information.
2. Emacs also discovers some glyphs should be composed, such as intonation marks (e.g. #xed8c),
but not the vowels. I don’t know why this happens. I just tried copying character properties
from these good ones. It didn’t work.
But in general, this could just be Emacs not fully supporting OpenType features.
Reply sent to Eli Zaretskii <eliz <at> gnu.org>: You have taken responsibility. (Wed, 11 May 2022 16:14:02 GMT)
Notification sent to Kai Ma <justksqsf <at> gmail.com>: bug acknowledged by developer. (Wed, 11 May 2022 16:14:02 GMT)
Message #21 received at 55319-done <at> debbugs.gnu.org (full text, mbox):
> From: Kai Ma <justksqsf <at> gmail.com>
> Date: Wed, 11 May 2022 23:43:36 +0800
> Cc: 55319 <at> debbugs.gnu.org
>
> Emacs doesn't discover composition rules. The composition rules are
> part of the Emacs code, see the various *.el files in lisp/language/
> directory. Some of these composition rules are derived automatically
> from character properties, see composite.el and characters.el (which
> cannot happen without Emacs knowing up-front about the properties).
>
> Thanks for this. I didn’t know Emacs needed to manually compose characters.
>
> Feel free to close this report, since it is due to my misunderstanding, not a real problem nor a real “wishlist”.
Done.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#55319; Package emacs. (Wed, 11 May 2022 17:35:02 GMT)
Message #24 received at 55319 <at> debbugs.gnu.org (full text, mbox):
> On May 9, 2022, at 10:38, Eli Zaretskii <eliz <at> gnu.org <mailto:eliz <at> gnu.org>> wrote:
>
> Emacs doesn't discover composition rules. The composition rules are
> part of the Emacs code, see the various *.el files in lisp/language/
> directory. Some of these composition rules are derived automatically
> from character properties, see composite.el and characters.el (which
> cannot happen without Emacs knowing up-front about the properties).
Thanks for this. I didn’t know Emacs needed to manually compose characters.
Feel free to close this report, since it is due to my misunderstanding, not a real problem nor a real “wishlist”.
BTW,
I did try to follow language/*.el, and came up with the following code:
(let* ((c "[\uED80-\uED9F]\\|\uEDAA\\|\uEDAB") ; consonant
       (v "[\uEDA0-\uEDA9]")                    ; vowel
       (cv (concat v c)))
  (set-char-table-range
   composition-function-table '(#xeda0 . #xeda9)
   (list
    (vector cv 1 #'zbalermorna-shape-gstring)
    [nil 0 font-shape-gstring])))

(defun zbalermorna-shape-gstring (gstring direction)
  (message "shape %s" gstring) ; debugging
  gstring)
But it doesn’t work as expected. For example, “ka” should be composed, but the behavior here is “a” itself is composed, and when the first rule is matched, only the consonant “k” is sent to font-shape-gstring: only “k” is in the header.
Have you any pointers? Thanks!
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#55319; Package emacs. (Thu, 12 May 2022 08:12:01 GMT)
Message #27 received at 55319 <at> debbugs.gnu.org (full text, mbox):
>>>>> On Wed, 11 May 2022 23:43:36 +0800, Kai Ma <justksqsf <at> gmail.com> said:
>> On May 9, 2022, at 10:38, Eli Zaretskii <eliz <at> gnu.org <mailto:eliz <at> gnu.org>> wrote:
>>
>> Emacs doesn't discover composition rules. The composition rules are
>> part of the Emacs code, see the various *.el files in lisp/language/
>> directory. Some of these composition rules are derived automatically
>> from character properties, see composite.el and characters.el (which
>> cannot happen without Emacs knowing up-front about the properties).
Kai> Thanks for this. I didn’t know Emacs needed to manually compose characters.
Kai> Feel free to close this report, since it is due to my misunderstanding, not a real problem nor a real “wishlist”.
Kai> BTW,
Kai> I did try to follow language/*.el, and came up with the following code:
Kai> (let* ((c "[\uED80-\uED9F]\\|\uEDAA\\|\uEDAB") ; consonant
i.e.: "[\uED80-\uED9F\uEDAA\uEDAB]"
Kai> (v "[\uEDA0-\uEDA9]") ; vowel
Kai> (cv (concat v c)))
You've called this 'cv', but itʼs actually 'vc'.
Kai> (set-char-table-range
Kai> composition-function-table '(#xeda0 . #xeda9)
Kai> (list
Kai> (vector cv 1 #'zbalermorna-shape-gstring)
Kai> [nil 0 font-shape-gstring])))
Youʼre looking back from vowels, it might be easier to add entries for
the consonants and look forward.
Kai> (defun zbalermorna-shape-gstring (gstring direction)
Kai> (message "shape %s" gstring) ; debugging
Kai> gstring)
Kai> But it doesn’t work as expected. For example, “ka” should be
Kai> composed, but the behavior here is “a” itself is composed,
Kai> and when the first rule is matched, only the consonant “k” is
Kai> sent to font-shape-gstring: only “k” is in the header.
Kai> Have you any pointers? Thanks!
I think if you fix 'cv' this will work.
Robert
--
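The "look forward from the consonants" suggestion above could be sketched roughly as follows. This is an untested illustration rather than code from the thread: the function name `zbalermorna-setup-forward` is hypothetical, the code point ranges are copied from the attempt quoted above, and `compose-gstring-for-graphic` (from composite.el) is assumed to be a suitable composition function here.

```elisp
;; Hypothetical name; ranges are taken from the quoted attempt, not verified.
(defun zbalermorna-setup-forward ()
  "Install consonant-keyed composition rules (untested sketch)."
  (let* ((c "[\uED80-\uED9F\uEDAA\uEDAB]")   ; consonants, as one character class
         (v "[\uEDA0-\uEDA9]")               ; vowels
         ;; PREV-CHARS is 0 because the pattern starts at the consonant itself.
         (rules (list (vector (concat c v) 0 #'compose-gstring-for-graphic))))
    (set-char-table-range composition-function-table '(#xed80 . #xed9f) rules)
    (set-char-table-range composition-function-table #xedaa rules)
    (set-char-table-range composition-function-table #xedab rules)))
```

Keying the table on the first character of the cluster means the rule's pattern and the table key start at the same position, so no look-back count has to be kept in sync with the pattern.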
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#55319; Package emacs. (Thu, 12 May 2022 08:37:01 GMT)
Message #30 received at 55319 <at> debbugs.gnu.org (full text, mbox):
>>>>> On Thu, 12 May 2022 16:26:49 +0800, Kai Ma <justksqsf <at> gmail.com> said:
Kai> Thanks. I’ve got it to work.
Kai> Besides the pattern problem, there were two missing pieces:
Kai> (1) canonical-combining-class, and
Iʼm surprised you needed to override that, but composition has many
dark corners.
Kai> (2) `compose-' to actually compose it into one glyph. `font-shape-gstring' alone does not work.
Kai> This is the result:
Kai> (defun zbalermorna-setup ()
Kai>   "Set up the composition rules for zbalermorna."
Kai>   (interactive)
Kai>   (dolist (v (number-sequence #xeda0 #xeda9))
Kai>     (put-char-code-property v 'canonical-combining-class
Kai>                             (encode-composition-rule '(tc . bc))))
Kai>   (let* ((c "\\([\uED80-\uED97]\\|\uEDAA\\|\uEDAB\\)")
Kai>          (v "[\uEDA0-\uEDA9]")
Kai>          (dot "\uED89")
Kai>          (h "\uED8A")
Kai>          (pattern1 (concat c v))
Kai>          (pattern2 (concat v h v)))
Kai>     (set-char-table-range
Kai>      composition-function-table '(#xeda0 . #xeda9)
Kai>      (list (vector pattern2 2 #'compose-gstring-for-graphic)
Kai>            (vector pattern1 1 #'compose-gstring-for-graphic)
Kai>            [nil 0 font-shape-gstring]))))
Eli, since these are PUA, can we still add them to Emacs?
Thanks
Robert
--
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#55319; Package emacs. (Thu, 12 May 2022 09:38:02 GMT)
Message #33 received at 55319 <at> debbugs.gnu.org (full text, mbox):
> From: Robert Pluim <rpluim <at> gmail.com>
> Cc: 55319 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>
> Date: Thu, 12 May 2022 10:36:29 +0200
>
> Eli, since these are PUA, can we still add them to Emacs?
I don't think I understand what exactly you would like to add.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#55319; Package emacs. (Thu, 12 May 2022 09:43:02 GMT)
Message #36 received at 55319 <at> debbugs.gnu.org (full text, mbox):
>>>>> On Thu, 12 May 2022 12:37:52 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>> From: Robert Pluim <rpluim <at> gmail.com>
>> Cc: 55319 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>
>> Date: Thu, 12 May 2022 10:36:29 +0200
>>
>> Eli, since these are PUA, can we still add them to Emacs?
Eli> I don't think I understand what exactly you would like to add.
The composition rules that Kai Ma just produced.
Robert
--
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#55319; Package emacs. (Thu, 12 May 2022 09:55:01 GMT)
Message #39 received at 55319 <at> debbugs.gnu.org (full text, mbox):
> From: Robert Pluim <rpluim <at> gmail.com>
> Cc: justksqsf <at> gmail.com, 55319 <at> debbugs.gnu.org
> Date: Thu, 12 May 2022 11:42:24 +0200
>
> >>>>> On Thu, 12 May 2022 12:37:52 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>
> >> From: Robert Pluim <rpluim <at> gmail.com>
> >> Cc: 55319 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>
> >> Date: Thu, 12 May 2022 10:36:29 +0200
> >>
> >> Eli, since these are PUA, can we still add them to Emacs?
>
> Eli> I don't think I understand what exactly you would like to add.
>
> The composition rules that Kai Ma just produced.
Those composition rules assume a specific meaning for these PUA
codepoints. But if someone uses those same PUA codepoints to express
other characters, the composition rules will no longer be valid for
that someone.
This is a general problem with PUA codepoints: their meaning is in the
eyes of the beholder. We could perhaps provide some infrastructure
for making use of PUA codepoints easier than it is now. We could even
provide opt-in packages, which, when loaded, assign specific meanings
to specific PUA codepoints. But I don't see how we could _by_default_
assign some specific meaning to those codepoints, because there's no
basis for preferring one interpretation of them to another.
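The opt-in idea described above might look something like the following global minor mode. This is only a sketch under stated assumptions: the names `zbalermorna-compose-mode`, `zbalermorna--vowels`, `zbalermorna--saved` and `zbalermorna--rules` are hypothetical, the code point ranges are the ones used elsewhere in this thread, and the sketch leaves out the `canonical-combining-class` adjustment and the vowel-h-vowel rule for brevity.

```elisp
;;; -*- lexical-binding: t -*-
;; Hypothetical opt-in sketch; nothing here is enabled unless the mode is.

(defconst zbalermorna--vowels '(#xeda0 . #xeda9)
  "PUA range this thread treats as zbalermorna vowels.")

(defvar zbalermorna--saved nil
  "Alist of (CHAR . RULES) recorded before the mode changed anything.")

(defun zbalermorna--rules ()
  "Return a rule list: CONSONANT+VOWEL first, then a plain-shaping fallback."
  (let ((c "\\([\uED80-\uED97]\\|\uEDAA\\|\uEDAB\\)")
        (v "[\uEDA0-\uEDA9]"))
    (list (vector (concat c v) 1 #'compose-gstring-for-graphic)
          [nil 0 font-shape-gstring])))

(define-minor-mode zbalermorna-compose-mode
  "Toggle composition rules for the zbalermorna PUA code points."
  :global t
  (if zbalermorna-compose-mode
      (progn
        ;; Remember whatever was installed so that disabling can undo it.
        (setq zbalermorna--saved
              (mapcar (lambda (ch)
                        (cons ch (aref composition-function-table ch)))
                      (number-sequence (car zbalermorna--vowels)
                                       (cdr zbalermorna--vowels))))
        (set-char-table-range composition-function-table
                              zbalermorna--vowels (zbalermorna--rules)))
    ;; Put the previous per-character entries back.
    (dolist (entry zbalermorna--saved)
      (aset composition-function-table (car entry) (cdr entry)))
    (setq zbalermorna--saved nil)))
```

Because the previous table entries are restored when the mode is turned off, a user who assigns a different meaning to the same PUA code points is unaffected unless they enable the mode themselves.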
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#55319; Package emacs. (Thu, 12 May 2022 15:01:01 GMT)
Message #42 received at 55319 <at> debbugs.gnu.org (full text, mbox):
> On May 12, 2022, at 16:10, Robert Pluim <rpluim <at> gmail.com> wrote:
>
>>>>>> On Wed, 11 May 2022 23:43:36 +0800, Kai Ma <justksqsf <at> gmail.com> said:
>
>>> On May 9, 2022, at 10:38, Eli Zaretskii <eliz <at> gnu.org <mailto:eliz <at> gnu.org>> wrote:
>>>
>>> Emacs doesn't discover composition rules. The composition rules are
>>> part of the Emacs code, see the various *.el files in lisp/language/
>>> directory. Some of these composition rules are derived automatically
>>> from character properties, see composite.el and characters.el (which
>>> cannot happen without Emacs knowing up-front about the properties).
>
> Kai> Thanks for this. I didn’t know Emacs needed to manually compose characters.
>
> Kai> Feel free to close this report, since it is due to my misunderstanding, not a real problem nor a real “wishlist”.
>
> Kai> BTW,
>
> Kai> I did try to follow language/*.el, and came up with the following code:
>
> Kai> (let* ((c "[\uED80-\uED9F]\\|\uEDAA\\|\uEDAB") ; consonant
>
> i.e.: "[\uED80-\uED9F\uEDAA\uEDAB]"
>
> Kai> (v "[\uEDA0-\uEDA9]") ; vowel
> Kai> (cv (concat v c)))
>
> You've called this 'cv', but itʼs actually 'vc'.
>
> Kai> (set-char-table-range
> Kai> composition-function-table '(#xeda0 . #xeda9)
> Kai> (list
> Kai> (vector cv 1 #'zbalermorna-shape-gstring)
> Kai> [nil 0 font-shape-gstring])))
>
> Youʼre looking back from vowels, it might be easier to add entries for
> the consonants and look forward.
>
> Kai> (defun zbalermorna-shape-gstring (gstring direction)
> Kai> (message "shape %s" gstring) ; debugging
> Kai> gstring)
>
> Kai> But it doesn’t work as expected. For example, “ka” should be
> Kai> composed, but the behavior here is “a” itself is composed,
> Kai> and when the first rule is matched, only the consonant “k” is
> Kai> sent to font-shape-gstring: only “k” is in the header.
>
> Kai> Have you any pointers? Thanks!
>
> I think if you fix 'cv' this will work.
Thanks. I’ve got it to work.
Besides the pattern problem, there were two missing pieces:
(1) canonical-combining-class, and
(2) `compose-' to actually compose it into one glyph. `font-shape-gstring' alone does not work.
This is the result:
(defun zbalermorna-setup ()
  "Set up the composition rules for zbalermorna."
  (interactive)
  (dolist (v (number-sequence #xeda0 #xeda9))
    (put-char-code-property v 'canonical-combining-class
                            (encode-composition-rule '(tc . bc))))
  (let* ((c "\\([\uED80-\uED97]\\|\uEDAA\\|\uEDAB\\)") ; consonant
         (v "[\uEDA0-\uEDA9]")                          ; vowel
         (dot "\uED89")                                 ; defined but not used below
         (h "\uED8A")
         (pattern1 (concat c v))                        ; consonant + vowel
         (pattern2 (concat v h v)))                     ; vowel + h + vowel
    (set-char-table-range
     composition-function-table '(#xeda0 . #xeda9)
     (list (vector pattern2 2 #'compose-gstring-for-graphic)
           (vector pattern1 1 #'compose-gstring-for-graphic)
           [nil 0 font-shape-gstring]))))
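The "pattern problem" mentioned above can be checked without any font, because it is a plain regexp precedence issue: in the first attempt the alternation was not grouped, so concatenating a vowel onto it attached the vowel to the last alternative only. The sketch below is an illustration, not code from the thread; the helper name `my-prefix-match-length` is hypothetical, and the string "\uED80\uEDA0" is merely assumed to be a consonant followed by a vowel.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-prefix-match-length (regexp string)
  "Return how many characters of STRING match REGEXP at position 0, or nil."
  (and (eq 0 (string-match regexp string))
       (match-end 0)))

(let ((v "[\uEDA0-\uEDA9]")
      (ka "\uED80\uEDA0"))   ; assumed consonant + vowel pair
  (list
   ;; Ungrouped: parses as  [consonants] | EDAA | (EDAB followed by a vowel).
   (my-prefix-match-length (concat "[\uED80-\uED9F]\\|\uEDAA\\|\uEDAB" v) ka)
   ;; Grouped: (any consonant alternative) followed by a vowel.
   (my-prefix-match-length (concat "\\([\uED80-\uED97]\\|\uEDAA\\|\uEDAB\\)" v) ka)))
;; → (1 2)
```

The ungrouped pattern matches only the consonant, which is consistent with the earlier observation that only one character reached the shaping function.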
Bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Fri, 10 Jun 2022 11:24:05 GMT)
This bug report was last modified 1 year and 320 days ago.