GNU bug report logs -
#63731
[PATCH] Support Emoji Variation Sequence 16 (FE0F) where appropriate
Reported by: Steven Allen <steven <at> stebalien.com>
Date: Fri, 26 May 2023 03:19:01 UTC
Severity: normal
Tags: fixed, patch
Fixed in version 29.1
Done: Robert Pluim <rpluim <at> gmail.com>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 63731 in the body.
You can then email your comments to 63731 AT debbugs.gnu.org in the normal way.
Report forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Fri, 26 May 2023 03:19:01 GMT)
Acknowledgement sent to Steven Allen <steven <at> stebalien.com>: New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Fri, 26 May 2023 03:19:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
This patch imports the full list from unicode.org instead of
special-casing a few characters as was done previously.
With this patch, '👍️' (1F44D FE0F) should look the same as '👍' (1F44D).
Without it, it will look like '👍️' (the emoji followed by an empty box).
As a simple regression test, '✔' (2714) should still display as "text" while '✔️'
(2714 FE0F) should still display as an emoji.
Fixes https://github.com/alphapapa/ement.el/issues/137
NOTE: I'm not a Unicode expert, nor do I understand how Emacs handles
Unicode (beyond what was required to implement this patch). But this
patch appears to work and I can't find any regressions.
[0001-Support-Emoji-Variation-Sequence-16-FE0F-where-appro.patch (text/x-patch, attachment)]
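The test described in this report can be reproduced without pasting emoji by hand. The sketch below builds the four strings from explicit code points, so the invisible U+FE0F cannot be dropped by a mail client or terminal. The buffer name `*vs16-test*` and the variable `my/vs16-cases` are hypothetical names chosen for illustration; they are not part of the attached patch.

```elisp
;; Build each test string from explicit code points.
(defvar my/vs16-cases
  (list (cons "1F44D"      (string #x1F44D))
        (cons "1F44D FE0F" (string #x1F44D #xFE0F))
        (cons "2714"       (string #x2714))
        (cons "2714 FE0F"  (string #x2714 #xFE0F)))
  "Hypothetical alist of (LABEL . STRING) test cases for VS-16 display.")

;; Write one case per line into a scratch buffer for visual inspection.
(with-current-buffer (get-buffer-create "*vs16-test*")
  (erase-buffer)
  (dolist (entry my/vs16-cases)
    (insert (format "%-12s %s\n" (car entry) (cdr entry)))))
```

Switching to `*vs16-test*` in a graphical frame and comparing the first two lines, then the last two, gives the visual check; `C-u C-x =` on each character shows whether a composition was applied.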
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Fri, 26 May 2023 06:42:02 GMT)
Message #8 received at 63731 <at> debbugs.gnu.org (full text, mbox):
> From: Steven Allen <steven <at> stebalien.com>
> Date: Thu, 25 May 2023 20:18:02 -0700
>
> This patch imports the full list from unicode.org instead of
> special-casing a few characters as was done previously.
>
> With this patch, '👍️' (1F44D FE0F) should look the same as '👍' (1F44D).
> Without it, it will look like '👍️'.
>
> As a simple regression test, '✔' (2714) should still display as "text" while '✔️'
> (2714 FE0F) should still display as an emoji.
>
> Fixes https://github.com/alphapapa/ement.el/issues/137
>
> NOTE: I'm not a Unicode expert, nor do I understand how Emacs handles
> Unicode (beyond what was required to implement this patch). But this
> patch appears to work and I can't find any regressions.
AFAIU, this change will populate composition-function-table for many
"normal" characters, including ASCII digits and symbol/punctuation
characters from the 0x2xxx blocks. E.g., after you build Emacs with
this patch, what do the following evaluations yield:
M-: (aref composition-function-table ?0) RET
M-: (aref composition-function-table #x2122) RET
If they yield non-nil values, it could mean dramatic slowdown of
redisplay with these characters. Which is precisely what we wanted to
avoid when we made the decision which parts of the Unicode-defined
Emoji sequences to support in Emacs, and how to arrange for that
support to work.
The issue you cite is strange: according to the "C-u C-x =" display
there, Emacs did compose #x1f44d with VS-16 using the Noto Color Emoji
font, so I don't quite understand why VS-16 is then also shown as an
empty rectangle. On my system Noto Color Emoji doesn't work, and "C-u
C-x =" says this instead:
Composed with the following character(s) "️" using this font:
harfbuzz:-outline-Noto Emoji-regular-normal-normal-mono-15-*-*-*-c-*-iso10646-1
by these glyphs:
[0 1 128077 422 19 2 17 14 2 nil]
[0 1 65039 3 19 0 1 0 1 [0 0 0]]
with these character(s):
️ (#xfe0f) VARIATION SELECTOR-16
which explains why I see two glyphs and not 1. But in the display
shown in the above issue, I see
Composed with the following character(s) "️" using this font:
ftcrhb:-GOOG-Noto Color Emoji-regular-normal-normal-*-18-*-*-*-m-0-iso10646-1
by these glyphs:
[0 1 128077 569 22 0 23 17 5 [0 0 136]]
with these character(s):
️ (#xfe0f) VARIATION SELECTOR-16
which describes only one glyph, not two. So the result ought to be
what you expect.
Robert, what am I missing here?
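The two `M-:` checks above can be run in one go. This is a small sketch that prints the `composition-function-table` entry for a few code points; the code points beyond `?0` and `#x2122` are an arbitrary choice for illustration.

```elisp
;; A nil entry means no composition rule is keyed on that character.
(dolist (ch (list ?0 #x2122 #x2714 #x1F44D #xFE0F))
  (message "U+%04X -> %S" ch (aref composition-function-table ch)))
```

The results appear in the `*Messages*` buffer, one line per code point.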
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Fri, 26 May 2023 08:35:02 GMT)
Message #11 received at 63731 <at> debbugs.gnu.org (full text, mbox):
Disclaimer: I havenʼt looked at the patch yet
>>>>> On Fri, 26 May 2023 09:41:42 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>> From: Steven Allen <steven <at> stebalien.com>
>> Date: Thu, 25 May 2023 20:18:02 -0700
>>
>> This patch imports the full list from unicode.org instead of
>> special-casing a few characters as was done previously.
>>
>> With this patch, '👍️' (1F44D FE0F) should look the same as '👍' (1F44D).
>> Without it, it will look like '👍️'.
>>
>> As a simple regression test, '✔' (2714) should still display as "text" while '✔️'
>> (2714 FE0F) should still display as an emoji.
>>
>> Fixes https://github.com/alphapapa/ement.el/issues/137
>>
>> NOTE: I'm not a Unicode expert, nor do I understand how Emacs handles
>> Unicode (beyond what was required to implement this patch). But this
>> patch appears to work and I can't find any regressions.
Eli> AFAIU, this change will populate composition-function-table for many
Eli> "normal" characters, including ASCII digits and symbol/punctuation
Eli> characters from the 0x2xxx blocks. E.g., after you build Emacs with
Eli> this patch, what do the following evaluations yield:
Eli> M-: (aref composition-function-table ?0) RET
Eli> M-: (aref composition-function-table #x2122) RET
Eli> If they yield non-nil values, it could mean dramatic slowdown of
Eli> redisplay with these characters. Which is precisely what we wanted to
Eli> avoid when we made the decision which parts of the Unicode-defined
Eli> Emoji sequences to support in Emacs, and how to arrange for that
Eli> support to work.
Yes. We donʼt want to do composition checks for ASCII if we can avoid it.
Eli> The issue you cite is strange: according to the "C-u C-x =" display
Eli> there, Emacs did compose #x1f44d with VS-16 using the Noto Color Emoji
Eli> font, so I don't quite understand why VS-16 is then also shown as an
Eli> empty rectangle. On my system Noto Color Emoji doesn't work, and "C-u
Eli> C-x =" says this instead:
Eli> Composed with the following character(s) "️" using this font:
Eli> harfbuzz:-outline-Noto Emoji-regular-normal-normal-mono-15-*-*-*-c-*-iso10646-1
Eli> by these glyphs:
Eli> [0 1 128077 422 19 2 17 14 2 nil]
Eli> [0 1 65039 3 19 0 1 0 1 [0 0 0]]
Eli> with these character(s):
Eli> ️ (#xfe0f) VARIATION SELECTOR-16
Eli> which explains why I see two glyphs and not 1. But in the display
Eli> shown in the above issue, I see
Eli> Composed with the following character(s) "️" using this font:
Eli> ftcrhb:-GOOG-Noto Color Emoji-regular-normal-normal-*-18-*-*-*-m-0-iso10646-1
Eli> by these glyphs:
Eli> [0 1 128077 569 22 0 23 17 5 [0 0 136]]
Eli> with these character(s):
Eli> ️ (#xfe0f) VARIATION SELECTOR-16
Eli> which describes only one glyph, not two. So the result ought to be
Eli> what you expect.
I see the emoji followed by a blank box with Noto Color Emoji here. I
donʼt yet understand why.
Eli> Robert, what am I missing here?
1F44D FE0F is a valid sequence according to tr51
(aref composition-function-table #x1f44d)
=> (["\\(?:👍[🏻-🏿]\\)" 0 compose-gstring-for-graphic])
which means that the composition is being triggered by this entry:
(aref composition-function-table #xfe0f)
=> (["\\c.\\c^+" 1 compose-gstring-for-graphic] [nil 0 compose-gstring-for-graphic])
(time passes)
Ugh. The following fixes it for me:
diff --git a/lisp/composite.el b/lisp/composite.el
index fb8b76114f4..af86d1436d3 100644
--- a/lisp/composite.el
+++ b/lisp/composite.el
@@ -756,7 +756,7 @@ compose-gstring-for-dotted-circle
;; Allow for bootstrapping without uni-*.el.
(when unicode-category-table
(let ((elt `([,(purecopy "\\c.\\c^+") 1 compose-gstring-for-graphic]
- [nil 0 compose-gstring-for-graphic])))
+ )))
(map-char-table
#'(lambda (key val)
(if (memq val '(Mn Mc Me))
Although the following is less invasive:
diff --git a/lisp/composite.el b/lisp/composite.el
index fb8b76114f4..333428f008a 100644
--- a/lisp/composite.el
+++ b/lisp/composite.el
@@ -762,6 +762,11 @@ compose-gstring-for-dotted-circle
(if (memq val '(Mn Mc Me))
(set-char-table-range composition-function-table key elt)))
unicode-category-table))
+ ;; for Emoji presentation selector
+ (set-char-table-range
+ composition-function-table
+ #xFE0F
+ `([,(purecopy "\\c.\ufe0f") 1 compose-gstring-for-graphic]))
;; for dotted-circle
(aset composition-function-table #x25CC
`([,(purecopy ".\\c^") 0 compose-gstring-for-dotted-circle]))
Didnʼt we conclude that composition had some issues with multiple
entries for the same codepoint if there was a mix of forward- and
backward-looking regexps?
Robert
--
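Each element of a `composition-function-table` entry is a vector `[PATTERN PREV-CHARS FUNC]`, where PREV-CHARS says how many characters before the keyed character the pattern match starts. Assuming the second patch above, the same rule can be tried in a running session without rebuilding Emacs. This is only a sketch of that experiment: `my/vs16-saved-entry` is a hypothetical variable, `purecopy` is left out on the assumption that it only matters while Emacs is being built, and text that was already displayed may keep its old rendering.

```elisp
;; Remember the current entry so it can be restored afterwards.
(defvar my/vs16-saved-entry (aref composition-function-table #xFE0F)
  "Hypothetical variable holding the original U+FE0F entry.")

;; Same rule as the second patch: match a base character followed by
;; VS-16, starting one character before the U+FE0F being looked up.
(set-char-table-range
 composition-function-table
 #xFE0F
 (list (vector "\\c.\ufe0f" 1 'compose-gstring-for-graphic)))

;; To undo the experiment:
;; (set-char-table-range composition-function-table #xFE0F my/vs16-saved-entry)
```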
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Fri, 26 May 2023 08:46:01 GMT)
Message #14 received at 63731 <at> debbugs.gnu.org (full text, mbox):
> From: Robert Pluim <rpluim <at> gmail.com>
> Cc: Steven Allen <steven <at> stebalien.com>, 63731 <at> debbugs.gnu.org
> Date: Fri, 26 May 2023 10:34:02 +0200
>
> Ugh. The following fixes it for me:
>
> diff --git a/lisp/composite.el b/lisp/composite.el
> index fb8b76114f4..af86d1436d3 100644
> --- a/lisp/composite.el
> +++ b/lisp/composite.el
> @@ -756,7 +756,7 @@ compose-gstring-for-dotted-circle
> ;; Allow for bootstrapping without uni-*.el.
> (when unicode-category-table
> (let ((elt `([,(purecopy "\\c.\\c^+") 1 compose-gstring-for-graphic]
> - [nil 0 compose-gstring-for-graphic])))
> + )))
This is unacceptable, AFAIU. We cannot drop support for (or change) the
correct display of mark characters, can we?
> Although the following is less invasive:
>
> diff --git a/lisp/composite.el b/lisp/composite.el
> index fb8b76114f4..333428f008a 100644
> --- a/lisp/composite.el
> +++ b/lisp/composite.el
> @@ -762,6 +762,11 @@ compose-gstring-for-dotted-circle
> (if (memq val '(Mn Mc Me))
> (set-char-table-range composition-function-table key elt)))
> unicode-category-table))
> + ;; for Emoji presentation selector
> + (set-char-table-range
> + composition-function-table
> + #xFE0F
> + `([,(purecopy "\\c.\ufe0f") 1 compose-gstring-for-graphic]))
> ;; for dotted-circle
> (aset composition-function-table #x25CC
> `([,(purecopy ".\\c^") 0 compose-gstring-for-dotted-circle]))
Can you please explain why the current setup doesn't work in this
case, even though "C-u C-x =" says the composition was done? And how
the above patch fixes that?
> Didnʼt we conclude that composition had some issues with multiple
> entries for the same codepoint if there was a mix of forward- and
> backward-looking regexps?
Not sure I understand what this alludes to. What mix of forward- and
backward-looking regexps do you see?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Fri, 26 May 2023 11:15:01 GMT)
Message #17 received at 63731 <at> debbugs.gnu.org (full text, mbox):
>>>>> On Fri, 26 May 2023 11:46:05 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>> From: Robert Pluim <rpluim <at> gmail.com>
>> Cc: Steven Allen <steven <at> stebalien.com>, 63731 <at> debbugs.gnu.org
>> Date: Fri, 26 May 2023 10:34:02 +0200
>>
>> Ugh. The following fixes it for me:
>>
>> diff --git a/lisp/composite.el b/lisp/composite.el
>> index fb8b76114f4..af86d1436d3 100644
>> --- a/lisp/composite.el
>> +++ b/lisp/composite.el
>> @@ -756,7 +756,7 @@ compose-gstring-for-dotted-circle
>> ;; Allow for bootstrapping without uni-*.el.
>> (when unicode-category-table
>> (let ((elt `([,(purecopy "\\c.\\c^+") 1 compose-gstring-for-graphic]
>> - [nil 0 compose-gstring-for-graphic])))
>> + )))
Eli> This is unacceptable, AFAIU. We cannot drop support for (or change) the
Eli> correct display of mark characters, can we?
Right. Iʼll hold off pushing it 😃
>> Although the following is less invasive:
>>
>> diff --git a/lisp/composite.el b/lisp/composite.el
>> index fb8b76114f4..333428f008a 100644
>> --- a/lisp/composite.el
>> +++ b/lisp/composite.el
>> @@ -762,6 +762,11 @@ compose-gstring-for-dotted-circle
>> (if (memq val '(Mn Mc Me))
>> (set-char-table-range composition-function-table key elt)))
>> unicode-category-table))
>> + ;; for Emoji presentation selector
>> + (set-char-table-range
>> + composition-function-table
>> + #xFE0F
>> + `([,(purecopy "\\c.\ufe0f") 1 compose-gstring-for-graphic]))
>> ;; for dotted-circle
>> (aset composition-function-table #x25CC
>> `([,(purecopy ".\\c^") 0 compose-gstring-for-dotted-circle]))
Eli> Can you please explain why the current setup doesn't work in this
Eli> case, even though "C-u C-x =" says the composition was done? And how
Eli> the above patch fixes that?
Composition is done for 1f44d+fe0f, but I suspect that with the current
setup, composition is called again for FE0F, which results in the box
glyph. With the second patch we will only do backwards looking composition
for FE0F
>> Didnʼt we conclude that composition had some issues with multiple
>> entries for the same codepoint if there was a mix of forward- and
>> backward-looking regexps?
Eli> Not sure I understand what this alludes to. What mix of forward- and
Eli> backward-looking regexps do you see?
Youʼre right, thereʼs no forward looking regexp, only a backwards one
and a no-regexp. But itʼs undeniable that:
[nil 0 compose-gstring-for-graphic]
causes the issue. Iʼve never been clear on the semantics of that.
Robert
--
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Fri, 26 May 2023 12:07:01 GMT)
Message #20 received at 63731 <at> debbugs.gnu.org (full text, mbox):
> From: Robert Pluim <rpluim <at> gmail.com>
> Cc: steven <at> stebalien.com, 63731 <at> debbugs.gnu.org
> Date: Fri, 26 May 2023 13:14:27 +0200
>
> >> Although the following is less invasive:
> >>
> >> diff --git a/lisp/composite.el b/lisp/composite.el
> >> index fb8b76114f4..333428f008a 100644
> >> --- a/lisp/composite.el
> >> +++ b/lisp/composite.el
> >> @@ -762,6 +762,11 @@ compose-gstring-for-dotted-circle
> >> (if (memq val '(Mn Mc Me))
> >> (set-char-table-range composition-function-table key elt)))
> >> unicode-category-table))
> >> + ;; for Emoji presentation selector
> >> + (set-char-table-range
> >> + composition-function-table
> >> + #xFE0F
> >> + `([,(purecopy "\\c.\ufe0f") 1 compose-gstring-for-graphic]))
> >> ;; for dotted-circle
> >> (aset composition-function-table #x25CC
> >> `([,(purecopy ".\\c^") 0 compose-gstring-for-dotted-circle]))
>
> Eli> Can you please explain why the current setup doesn't work in this
> Eli> case, even though "C-u C-x =" says the composition was done? And how
> Eli> the above patch fixes that?
>
> Composition is done for 1f44d+fe0f, but I suspect that with the current
> setup, composition is called again for FE0F, which results in the box
> glyph. With the second patch we will only do backwards looking composition
> for FE0F
OK, then I think we should install this on the emacs-29 branch.
> Youʼre right, thereʼs no forward looking regexp, only a backwards one
> and a no-regexp. But itʼs undeniable that:
>
> [nil 0 compose-gstring-for-graphic]
>
> causes the issue. Iʼve never been clear on the semantics of that.
It has special support in compose-gstring-for-graphic, see there. The
doc string also says a few words about that. We use this, e.g., in
describe-char display, where we sometimes need to show a single
combining character with no base character to combine it with. I
think this is only relevant for accents and other such combining
characters, not for VS-n.
What does this issue mean for the other VS-n characters, though?
Should we perhaps install something similar for them as well?
Thanks.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Fri, 26 May 2023 14:03:02 GMT)
Message #23 received at 63731 <at> debbugs.gnu.org (full text, mbox):
>>>>> On Fri, 26 May 2023 15:06:40 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>> From: Robert Pluim <rpluim <at> gmail.com>
>>
>> Composition is done for 1f44d+fe0f, but I suspect that with the current
>> setup, composition is called again for FE0F, which results in the box
>> glyph. With the second patch we will only do backwards looking composition
>> for FE0F
Eli> OK, then I think we should install this on the emacs-29 branch.
>> Youʼre right, thereʼs no forward looking regexp, only a backwards one
>> and a no-regexp. But itʼs undeniable that:
>>
>> [nil 0 compose-gstring-for-graphic]
>>
>> causes the issue. Iʼve never been clear on the semantics of that.
Eli> It has special support in compose-gstring-for-graphic, see there. The
Eli> doc string also says a few words about that. We use this, e.g., in
Eli> describe-char display, where we sometimes need to show a single
Eli> combining character with no base character to combine it with. I
Eli> think this is only relevant for accents and other such combining
Eli> characters, not for VS-n.
OK
Eli> What does this issue mean for the other VS-n characters, though?
Eli> Should we perhaps install something similar for them as well?
For VS-15 maybe? The following gets me text-presentation composition
with CHAR+FE0E and emoji-presentation with CHAR+FE0F
diff --git a/lisp/composite.el b/lisp/composite.el
index fb8b76114f4..ada35010146 100644
--- a/lisp/composite.el
+++ b/lisp/composite.el
@@ -762,6 +762,11 @@ compose-gstring-for-dotted-circle
(if (memq val '(Mn Mc Me))
(set-char-table-range composition-function-table key elt)))
unicode-category-table))
+ ;; for Emoji presentation selector
+ (set-char-table-range
+ composition-function-table
+ '(#xFE0E . #xFE0F)
+ `([,(purecopy "\\c.[\ufe0f\ufe0e]") 1 compose-gstring-for-graphic]))
;; for dotted-circle
(aset composition-function-table #x25CC
`([,(purecopy ".\\c^") 0 compose-gstring-for-dotted-circle]))
@@ -861,7 +866,7 @@ compose-gstring-for-variation-glyph
;; handled in font_range, we end up choosing the Emoji presentation
;; rather than the Text presentation.
(let ((elt '([".." 1 compose-gstring-for-variation-glyph])))
- (set-char-table-range composition-function-table '(#xFE00 . #xFE0E) elt)
+ (set-char-table-range composition-function-table '(#xFE00 . #xFE0D) elt)
(set-char-table-range composition-function-table '(#xE0100 . #xE01EF) elt))
(defun auto-compose-chars (func from to font-object string direction)
although perhaps we could have both `compose-gstring-for-graphic' and
`compose-gstring-for-variation-glyph' for FE0E
Robert
--
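To see which handler currently covers each selector in the U+FE00..U+FE0F block, before or after applying a change like the one above, the entries can be listed side by side. A minimal sketch, relying only on `composition-function-table` and `get-char-code-property`:

```elisp
;; Print the Unicode name and the composition rules for VS-1 .. VS-16.
(let ((ch #xFE00))
  (while (<= ch #xFE0F)
    (message "U+%04X %s -> %S"
             ch
             (get-char-code-property ch 'name)
             (aref composition-function-table ch))
    (setq ch (1+ ch))))
```

In the output, the function named in the last slot of each rule shows whether a selector is handled by `compose-gstring-for-variation-glyph` or by `compose-gstring-for-graphic`.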
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Fri, 26 May 2023 14:56:01 GMT)
Message #26 received at 63731 <at> debbugs.gnu.org (full text, mbox):
> From: Robert Pluim <rpluim <at> gmail.com>
> Cc: steven <at> stebalien.com, 63731 <at> debbugs.gnu.org
> Date: Fri, 26 May 2023 16:02:40 +0200
>
> Eli> What does this issue mean for the other VS-n characters, though?
> Eli> Should we perhaps install something similar for them as well?
>
> For VS-15 maybe? The following gets me text-presentation composition
> with CHAR+FE0E and emoji-presentation with CHAR+FE0F
Actually, I forgot about compose-gstring-for-variation-glyph. My
question was actually whether the general setting in
(let ((elt `([,(purecopy "\\c.\\c^+") 1 compose-gstring-for-graphic]
[nil 0 compose-gstring-for-graphic])))
(map-char-table
#'(lambda (key val)
(if (memq val '(Mn Mc Me))
(set-char-table-range composition-function-table key elt)))
unicode-category-table))
affects also the VS-n selectors. But since the latter setting of
(let ((elt '([".." 1 compose-gstring-for-variation-glyph])))
(set-char-table-range composition-function-table '(#xFE00 . #xFE0E) elt)
(set-char-table-range composition-function-table '(#xE0100 . #xE01EF) elt))
takes care of all the VS-n selectors except VS-16, and your patch now
will take care of VS-16, it sounds like we don't need to care about
other VS-n selectors?
Or are you saying that without including VS-15, CHAR+FE0E is not
displayed using its text representation?
Did you test the proposed change with the admin/emoji-*.txt files, to
make sure they all still display OK?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Fri, 26 May 2023 15:07:01 GMT)
Message #29 received at 63731 <at> debbugs.gnu.org (full text, mbox):
Eli Zaretskii <eliz <at> gnu.org> writes:
> AFAIU, this change will populate composition-function-table for many
> "normal" characters, including ASCII digits and symbol/punctuation
> characters from the 0x2xxx blocks. E.g., after you build Emacs with
> this patch, what do the following evaluations yield:
>
> M-: (aref composition-function-table ?0) RET
> M-: (aref composition-function-table #x2122) RET
>
> If they yield non-nil values, it could mean dramatic slowdown of
> redisplay with these characters.
Both of these yield nil with this patch applied (and I haven't noticed
any performance regressions). But it looks like you and Robert have a
better patch so I'll leave you to it.
However, I'd like to draw your attention to the existing hard-coded
VS-16 table here:
https://git.savannah.gnu.org/cgit/emacs.git/tree/admin/unidata/emoji-zwj.awk?h=4b3de748b0b04407d2492500c77905de56de1180#n72
It feels like this should either be the full table (the one in the
patch) or it shouldn't exist at all. But again, I'm not the expert here.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Fri, 26 May 2023 15:26:01 GMT)
Message #32 received at 63731 <at> debbugs.gnu.org (full text, mbox):
>>>>> On Fri, 26 May 2023 17:55:26 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>> From: Robert Pluim <rpluim <at> gmail.com>
>> Cc: steven <at> stebalien.com, 63731 <at> debbugs.gnu.org
>> Date: Fri, 26 May 2023 16:02:40 +0200
>>
Eli> What does this issue mean for the other VS-n characters, though?
Eli> Should we perhaps install something similar for them as well?
>>
>> For VS-15 maybe? The following gets me text-presentation composition
>> with CHAR+FE0E and emoji-presentation with CHAR+FE0F
Eli> Actually, I forgot about compose-gstring-for-variation-glyph. My
Eli> question was actually whether the general setting in
Eli> (let ((elt `([,(purecopy "\\c.\\c^+") 1 compose-gstring-for-graphic]
Eli> [nil 0 compose-gstring-for-graphic])))
Eli> (map-char-table
Eli> #'(lambda (key val)
Eli> (if (memq val '(Mn Mc Me))
Eli> (set-char-table-range composition-function-table key elt)))
Eli> unicode-category-table))
Eli> affects also the VS-n selectors. But since the latter setting of
Eli> (let ((elt '([".." 1 compose-gstring-for-variation-glyph])))
Eli> (set-char-table-range composition-function-table '(#xFE00 . #xFE0E) elt)
Eli> (set-char-table-range composition-function-table '(#xE0100 . #xE01EF) elt))
Eli> takes care of all the VS-n selectors except VS-16, and your patch now
Eli> will take care of VS-16, it sounds like we don't need to care about
Eli> other VS-n selectors?
Eli> Or are you saying that without including VS-15, CHAR+FE0E is not
Eli> displayed using its text representation?
Not quite. If I donʼt have compose-gstring-for-graphic for VS-15, no
composition occurs for CHAR+FE0E. With my change youʼll get
composition, but itʼs still not 100% correct: CHAR+FE0E when CHAR is a
member of the emoji script will use emoji presentation, not text, but
the extra empty box will not show, so itʼs still an improvement.
Eli> Did you test the proposed change with the admin/emoji-*.txt files, to
Eli> make sure they all still display OK?
Yes. Iʼve also got a change that makes Emoji_Keycap_Sequence work, but
I think we can leave that for master.
Robert
--
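Whether CHAR counts as "a member of the emoji script" can be checked from Lisp. The sketch below assumes that the script recorded in `char-script-table` is the classification referred to above; the three code points are arbitrary examples.

```elisp
;; Print the script symbol Emacs assigns to each code point.
(dolist (ch (list #x1F44D #x2714 ?0))
  (message "U+%04X script: %S" ch (aref char-script-table ch)))
```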
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Fri, 26 May 2023 15:30:02 GMT)
Message #35 received at 63731 <at> debbugs.gnu.org (full text, mbox):
>>>>> On Fri, 26 May 2023 08:06:11 -0700, Steven Allen <steven <at> stebalien.com> said:
Steven> Eli Zaretskii <eliz <at> gnu.org> writes:
>> AFAIU, this change will populate composition-function-table for many
>> "normal" characters, including ASCII digits and symbol/punctuation
>> characters from the 0x2xxx blocks. E.g., after you build Emacs with
>> this patch, what do the following evaluations yield:
>>
>> M-: (aref composition-function-table ?0) RET
>> M-: (aref composition-function-table #x2122) RET
>>
>> If they yield non-nil values, it could mean dramatic slowdown of
>> redisplay with these characters.
Steven> Both of these yield nil with this patch applied (and I haven't noticed
Steven> any performance regressions). But it looks like you and Robert have a
Steven> better patch so I'll leave you to it.
Itʼs smaller, thatʼs for sure. And it will definitely be faster.
Steven> However, I'd like to draw your attention to the existing hard-coded
Steven> VS-16 table here:
Steven> https://git.savannah.gnu.org/cgit/emacs.git/tree/admin/unidata/emoji-zwj.awk?h=4b3de748b0b04407d2492500c77905de56de1180#n72
Steven> It feels like this should either be the full table (the one in the
Steven> patch) or it shouldn't exist at all. But again, I'm not the expert here.
Welcome to the wonderful world of Unicode. The reason the table exists
is that there are codepoints that are *not* emoji, but theyʼre part of
emoji sequences, so we still need to treat them as emoji in some
situations. Why Unicode didnʼt just make them emoji I donʼt know.
Robert
--
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Fri, 26 May 2023 15:52:02 GMT)
Message #38 received at 63731 <at> debbugs.gnu.org (full text, mbox):
> From: Robert Pluim <rpluim <at> gmail.com>
> Cc: steven <at> stebalien.com, 63731 <at> debbugs.gnu.org
> Date: Fri, 26 May 2023 17:25:24 +0200
>
> Eli> Or are you saying that without including VS-15, CHAR+FE0E is not
> Eli> displayed using its text representation?
>
> Not quite. If I donʼt have compose-gstring-for-graphic for VS-15, no
> composition occurs for CHAR+FE0E. With my change youʼll get
> composition, but itʼs still not 100% correct: CHAR+FE0E when CHAR is a
> member of the emoji script will use emoji presentation, not text, but
> the extra empty box will not show, so itʼs still an improvement.
OK. And what about CHAR+FE0E when CHAR is not an Emoji?
Anyway, I think you should install the patch on emacs-29, and we
should then try to fix the text-representation bug with VS-15 on
master. (I guess it requires a change to font.c or something?)
> Eli> Did you test the proposed change with the admin/emoji-*.txt files, to
> Eli> make sure they all still display OK?
>
> Yes. Iʼve also got a change that makes Emoji_Keycap_Sequence work, but
> I think we can leave that for master.
Depends on the solution, I guess. Isn't it just a change to the
VS-16's entry in composition-function-table? Or maybe a change in the
#x20e3's entry? (Did we discuss the Emoji_Keycap_Sequence case before?)
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Fri, 26 May 2023 16:04:02 GMT)
Message #41 received at 63731 <at> debbugs.gnu.org (full text, mbox):
Robert Pluim <rpluim <at> gmail.com> writes:
> Welcome to the wonderful world of Unicode. The reason the table exists
> is that there are codepoints that are *not* emoji, but theyʼre part of
> emoji sequences, so we still need to treat them as emoji in some
> situations. Why Unicode didnʼt just make them emoji I donʼt know.
Got it... It sounds like the "correct" solution is to download the full
list (emoji-variation-sequences.txt) and filter for non-emoji
characters, but I guess that's overkill.
Thanks!
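A rough sketch of that idea, for illustration only: extract the base code points that have an emoji-style (FE0F) entry, then keep those that Emacs does not already assign to the emoji script. The names `my/vs16-bases`, `my/vs16-non-emoji-bases` and `my/vs16-sample`, the sample lines (which imitate the layout of emoji-variation-sequences.txt), and the use of `char-script-table` as the "is it emoji" test are all assumptions; this is not what the Emacs build scripts do.

```elisp
(defvar my/vs16-sample
  "0023 FE0E  ; text style;  # (1.1) NUMBER SIGN
0023 FE0F  ; emoji style; # (1.1) NUMBER SIGN
2714 FE0E  ; text style;  # (1.1) HEAVY CHECK MARK
2714 FE0F  ; emoji style; # (1.1) HEAVY CHECK MARK
1F44D FE0E ; text style;  # (6.0) THUMBS UP SIGN
1F44D FE0F ; emoji style; # (6.0) THUMBS UP SIGN"
  "Hypothetical sample imitating lines of emoji-variation-sequences.txt.")

(defun my/vs16-bases (text)
  "Return base code points in TEXT that have an emoji-style FE0F line."
  (let (bases)
    (with-temp-buffer
      (insert text)
      (goto-char (point-min))
      ;; Capture the hex code point that precedes " FE0F ; emoji style".
      (while (re-search-forward
              "^\\([0-9A-F]+\\) FE0F *; *emoji style" nil t)
        (push (string-to-number (match-string 1) 16) bases)))
    (nreverse bases)))

(defun my/vs16-non-emoji-bases (text)
  "Return the bases in TEXT whose script in `char-script-table' is not emoji."
  (let (result)
    (dolist (ch (my/vs16-bases text))
      (unless (eq (aref char-script-table ch) 'emoji)
        (push ch result)))
    (nreverse result)))

;; Example: list the non-emoji bases found in the sample.
(message "%S" (my/vs16-non-emoji-bases my/vs16-sample))
```

For the real data, the contents of the downloaded file would be passed in place of `my/vs16-sample`, for example after reading it with `insert-file-contents` into a temporary buffer.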
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Fri, 26 May 2023 16:25:02 GMT)
Message #44 received at 63731 <at> debbugs.gnu.org (full text, mbox):
>>>>> On Fri, 26 May 2023 18:52:22 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>> From: Robert Pluim <rpluim <at> gmail.com>
>> Cc: steven <at> stebalien.com, 63731 <at> debbugs.gnu.org
>> Date: Fri, 26 May 2023 17:25:24 +0200
>>
Eli> Or are you saying that without including VS-15, CHAR+FE0E is not
Eli> displayed using its text representation?
>>
>> Not quite. If I donʼt have compose-gstring-for-graphic for VS-15, no
>> composition occurs for CHAR+FE0E. With my change youʼll get
>> composition, but itʼs still not 100% correct: CHAR+FE0E when CHAR is a
>> member of the emoji script will use emoji presentation, not text, but
>> the extra empty box will not show, so itʼs still an improvement.
Eli> OK. And what about CHAR+FE0E when CHAR is not an Emoji?
Then you get the (composed) text presentation (and the composed emoji
presentation when itʼs CHAR+FE0F).
Eli> Anyway, I think you should install the patch on emacs-29, and we
Eli> should then try to fix the text-representation bug with VS-15 on
Eli> master. (I guess it requires a change to font.c or something?)
It requires something that answers the question "what font would we
use for this codepoint if it was not an emoji?". Maybe we can have a
separate fontset that pretends that the emoji script is equivalent to
symbol? Or invent some kind of 'text-presentation-font' property to
put somewhere?
Eli> Did you test the proposed change with the admin/emoji-*.txt files, to
Eli> make sure they all still display OK?
>>
>> Yes. Iʼve also got a change that makes Emoji_Keycap_Sequence work, but
>> I think we can leave that for master.
Eli> Depends on the solution, I guess. Isn't it just a change to the
Eli> VS-16's entry in composition-function-table? Or maybe a change in the
Eli> #x20e3's entry? (Did we discuss the Emoji_Keycap_Sequence case before?)
Itʼs a change to the VS-16 entry. We did discuss it before, and
decided to put it aside because the solutions all involved adding
composition-function-table entries for 0-9 or similar. I donʼt
remember why we didnʼt consider adding to VS-16ʼs entry.
Iʼll do some more testing, and post a final version hopefully this
weekend sometime.
Robert
--
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Fri, 26 May 2023 17:28:02 GMT)
Message #47 received at 63731 <at> debbugs.gnu.org (full text, mbox):
> From: Robert Pluim <rpluim <at> gmail.com>
> Cc: 63731 <at> debbugs.gnu.org, steven <at> stebalien.com
> Date: Fri, 26 May 2023 18:24:02 +0200
>
> >>>>> On Fri, 26 May 2023 18:52:22 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>
> Eli> Anyway, I think you should install the patch on emacs-29, and we
> Eli> should then try to fix the text-representation bug with VS-15 on
> Eli> master. (I guess it requires a change to font.c or something?)
>
> It requires something that answers the question "what font would we
> use for this codepoint if it was not an emoji?". Maybe we can have a
> separate fontset that pretends that the emoji script is equivalent to
> symbol? Or invent some kind of 'text-presentation-font' property to
> put somewhere?
I'm not sure I understand why we don't select the right font by
default. Selecting a non-Emoji font for non-Emoji codepoints should
not need any special tricks.
> >> Yes. Iʼve also got a change that makes Emoji_Keycap_Sequence work, but
> >> I think we can leave that for master.
>
> Eli> Depends on the solution, I guess. Isn't it just a change to the
> Eli> VS-16's entry in composition-function-table? Or maybe a change in the
> Eli> #x20e3's entry? (Did we discus the Emoji_Keycap_Sequence case before?)
>
> Itʼs a change to the VS-16 entry. We did discuss it before, and
> decided to put it aside because the solutions all involved adding
> composition-function-table entries for 0-9 or similar. I donʼt
> remember why we didnʼt consider adding to VS-16ʼs entry.
>
> Iʼll do some more testing, and post a final version hopefully this
> weekend sometime.
OK, thanks.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Fri, 26 May 2023 17:37:01 GMT)
Message #50 received at 63731 <at> debbugs.gnu.org (full text, mbox):
>>>>> On Fri, 26 May 2023 20:27:26 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>> From: Robert Pluim <rpluim <at> gmail.com>
>> Cc: 63731 <at> debbugs.gnu.org, steven <at> stebalien.com
>> Date: Fri, 26 May 2023 18:24:02 +0200
>>
>> >>>>> On Fri, 26 May 2023 18:52:22 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>>
Eli> Anyway, I think you should install the patch on emacs-29, and we
Eli> should then try to fix the text-representation bug with VS-15 on
Eli> master. (I guess it requires a change to font.c or something?)
>>
>> It requires something that answers the question "what font would we
>> use for this codepoint if it was not an emoji?". Maybe we can have a
>> separate fontset that pretends that the emoji script is equivalent to
>> symbol? Or invent some kind of 'text-presentation-font' property to
>> put somewhere?
Eli> I'm not sure I understand why we don't select the right font by
Eli> default. Selecting a non-Emoji font for non-Emoji codepoints should
Eli> not need any special tricks.
It doesnʼt but in this case it *is* an emoji codepoint, so it displays
as emoji because of font.c, even when followed by VS-15.
Robert
--
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Fri, 26 May 2023 17:44:01 GMT)
Message #53 received at 63731 <at> debbugs.gnu.org (full text, mbox):
> Cc: 63731 <at> debbugs.gnu.org, steven <at> stebalien.com
> Date: Fri, 26 May 2023 20:27:26 +0300
> From: Eli Zaretskii <eliz <at> gnu.org>
>
> > From: Robert Pluim <rpluim <at> gmail.com>
> > Cc: 63731 <at> debbugs.gnu.org, steven <at> stebalien.com
> > Date: Fri, 26 May 2023 18:24:02 +0200
> >
> > It requires something that answers the question "what font would we
> > use for this codepoint if it was not an emoji?". Maybe we can have a
> > separate fontset that pretends that the emoji script is equivalent to
> > symbol? Or invent some kind of 'text-presentation-font' property to
> > put somewhere?
>
> I'm not sure I understand why we don't select the right font by
> default. Selecting a non-Emoji font for a non-Emoji codepoints should
> not need any special tricks.
Actually, I don't understand why there's an issue here with font
selection. Are you saying that using Noto Color Emoji with
CHAR+0xFE0E, when CHAR is an Emoji character, doesn't produce the
textual representation of CHAR? If so, isn't that a problem with the
font? I thought all we needed to do was to hand the combination to an
Emoji-aware font, and the font would do the rest. Now you seem to be
saying that we somehow need to select a non-Emoji font? But if so,
who'd guarantee that a font that cannot display Emoji will know what
to do with the combination CHAR+0xFE0E?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Fri, 26 May 2023 18:06:02 GMT)
Message #56 received at 63731 <at> debbugs.gnu.org (full text, mbox):
> From: Robert Pluim <rpluim <at> gmail.com>
> Cc: 63731 <at> debbugs.gnu.org, steven <at> stebalien.com
> Date: Fri, 26 May 2023 19:35:56 +0200
>
> in this case it *is* an emoji codepoint, so it displays
> as emoji because of font.c, even when followed by VS-15.
If we pass to an Emoji-capable font a sequence of a character followed
by VS-15, I'd expect the font to produce a glyph with the textual
representation of that character.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Sun, 28 May 2023 10:30:03 GMT)
Message #59 received at 63731 <at> debbugs.gnu.org (full text, mbox):
>>>>> On Fri, 26 May 2023 20:43:37 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>> Cc: 63731 <at> debbugs.gnu.org, steven <at> stebalien.com
>> Date: Fri, 26 May 2023 20:27:26 +0300
>> From: Eli Zaretskii <eliz <at> gnu.org>
>>
>> > From: Robert Pluim <rpluim <at> gmail.com>
>> > Cc: 63731 <at> debbugs.gnu.org, steven <at> stebalien.com
>> > Date: Fri, 26 May 2023 18:24:02 +0200
>> >
>> > It requires something that answers the question "what font would we
>> > use for this codepoint if it was not an emoji?". Maybe we can have a
>> > separate fontset that pretends that the emoji script is equivalent to
>> > symbol? Or invent some kind of 'text-presentation-font' property to
>> > put somewhere?
>>
>> I'm not sure I understand why we don't select the right font by
>> default. Selecting a non-Emoji font for a non-Emoji codepoints should
>> not need any special tricks.
Eli> Actually, I don't understand why there's an issue here with font
Eli> selection. Are you saying that using Noto Color Emoji with
Eli> CHAR+0xFE0E, when CHAR is an Emoji character, doesn't produce the
Eli> textual representation of CHAR? If so, isn't that a problem with the
Eli> font? I thought all we needed to do was to hand the combination to an
Eli> Emoji-aware font, and the font would do the rest. Now you seem to be
Eli> saying that we somehow need to select a non-Emoji font? But if so,
Eli> who'd guarantee that a font that cannot display Emoji will know what
Eli> to do with the combination CHAR+0xFE0E?
Iʼm not sure: gedit displays the text representation, and libreoffice
displays the emoji presentation. And the google color emoji website
only shows colour glyphs. So I think itʼs up to the application to
select the correct font.
Robert
--
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Sun, 28 May 2023 11:44:01 GMT)
Message #62 received at 63731 <at> debbugs.gnu.org (full text, mbox):
>>>>> On Fri, 26 May 2023 21:05:47 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>> From: Robert Pluim <rpluim <at> gmail.com>
>> Cc: 63731 <at> debbugs.gnu.org, steven <at> stebalien.com
>> Date: Fri, 26 May 2023 19:35:56 +0200
>>
>> in this case it *is* an emoji codepoint, so it displays
>> as emoji because of font.c, even when followed by VS-15.
Eli> If we pass to an Emoji-capable font a sequence of a character followed
Eli> by VS-15, I'd expect the font to produce a glyph with the textual
Eli> representation of that character.
But we donʼt do that: we ask the font "give me a glyph for this codepoint".
Robert
--
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Sun, 28 May 2023 11:58:02 GMT)
Message #65 received at 63731 <at> debbugs.gnu.org (full text, mbox):
>>>>> On Fri, 26 May 2023 20:27:26 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>>
>> Itʼs a change to the VS-16 entry. We did discuss it before, and
>> decided to put it aside because the solutions all involved adding
>> composition-function-table entries for 0-9 or similar. I donʼt
>> remember why we didnʼt consider adding to VS-16ʼs entry.
>>
>> Iʼll do some more testing, and post a final version hopefully this
>> weekend sometime.
Eli> OK, thanks.
Eli, if the 20e3 changes are too much for emacs-29, I can put them in
master.
Iʼll put some notes in admin/notes/unicode as well.
diff --git c/admin/unidata/emoji-zwj.awk i/admin/unidata/emoji-zwj.awk
index 7d2ff6cb900..0b6f1267205 100644
--- c/admin/unidata/emoji-zwj.awk
+++ i/admin/unidata/emoji-zwj.awk
@@ -82,6 +82,7 @@ END {
trigger_codepoints[11] = "1F574"
trigger_codepoints[12] = "1F575"
trigger_codepoints[13] = "1F590"
+ trigger_codepoints[14] = "20E3"
printf "(setq auto-composition-emoji-eligible-codepoints\n"
printf "'("
diff --git c/lisp/composite.el i/lisp/composite.el
index fb8b76114f4..acba4e73c17 100644
--- c/lisp/composite.el
+++ i/lisp/composite.el
@@ -762,6 +762,23 @@ compose-gstring-for-dotted-circle
(if (memq val '(Mn Mc Me))
(set-char-table-range composition-function-table key elt)))
unicode-category-table))
+ ;; for Emoji presentation selector
+ ;; We don't want the generic nil 0 entry because it causes display
+ ;; of an extra box for FE0F. (Bug#63731)
+ ;; This also covers the fully-qualified enclosing keycap case.
+ (set-char-table-range
+ composition-function-table
+ #xFE0E
+ `([,(purecopy "\\c.\ufe0e") 1 compose-gstring-for-graphic]))
+ (set-char-table-range
+ composition-function-table
+ #xFE0F
+ `([,(purecopy "\\c.\ufe0f\u20e3?") 1 compose-gstring-for-graphic]))
+ ;; for unqualified enclosing keycap
+ (set-char-table-range
+ composition-function-table
+ #x20E3
+ `([,(purecopy "[#*0-9]\u20e3") 1 compose-gstring-for-graphic]))
;; for dotted-circle
(aset composition-function-table #x25CC
`([,(purecopy ".\\c^") 0 compose-gstring-for-dotted-circle]))
@@ -857,11 +874,10 @@ compose-gstring-for-variation-glyph
;; taken care of by font_range in font.c, which will check for an
;; emoji font for codepoints used in compositions even if they're not
;; emoji themselves, and thus choose the Emoji presentation for them
-;; when followed by VS-16. VS-15 *is* handled here, because if it's
-;; handled in font_range, we end up choosing the Emoji presentation
-;; rather than the Text presentation.
+;; when followed by VS-16. VS-15 is handled by the setup around
+;; unicode-category-table above.
(let ((elt '([".." 1 compose-gstring-for-variation-glyph])))
- (set-char-table-range composition-function-table '(#xFE00 . #xFE0E) elt)
+ (set-char-table-range composition-function-table '(#xFE00 . #xFE0D) elt)
(set-char-table-range composition-function-table '(#xE0100 . #xE01EF) elt))
(defun auto-compose-chars (func from to font-object string direction)
Robert
--
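As a rough sanity check of the regexps in the patch above, the following sketch matches them against a few sample strings. It only exercises the patterns with `string-match'; it does not go through the display engine or composite.c, so it says nothing about what is actually drawn. The helper name `my/vs-pattern-matches-p' is made up for illustration.

```elisp
;; Hypothetical helper, not part of Emacs: check whether PATTERN
;; matches STRING from its first to its last character.
(defun my/vs-pattern-matches-p (pattern string)
  "Return non-nil if PATTERN matches the whole of STRING."
  (let ((case-fold-search nil))
    (and (string-match (concat "\\`\\(?:" pattern "\\)\\'") string) t)))

;; VS-16 rule from the patch: base character + FE0F, optional keycap.
(my/vs-pattern-matches-p "\\c.\ufe0f\u20e3?" "\U0001f44d\ufe0f")
(my/vs-pattern-matches-p "\\c.\ufe0f\u20e3?" "1\ufe0f\u20e3")

;; Unqualified keycap rule: digit, '#' or '*' directly followed by 20E3.
(my/vs-pattern-matches-p "[#*0-9]\u20e3" "1\u20e3")

;; VS-15 rule: base character + FE0E.
(my/vs-pattern-matches-p "\\c.\ufe0e" "\u2714\ufe0e")
```

The `\\c.' part relies on the standard category table, where `.' is the Base category (the same category that `C-u C-x =' reports for U+1F44D later in this thread).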
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Sun, 28 May 2023 12:38:01 GMT)
Message #68 received at 63731 <at> debbugs.gnu.org (full text, mbox):
> From: Robert Pluim <rpluim <at> gmail.com>
> Cc: 63731 <at> debbugs.gnu.org, steven <at> stebalien.com
> Date: Sun, 28 May 2023 12:29:48 +0200
>
> >>>>> On Fri, 26 May 2023 20:43:37 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>
> Eli> Actually, I don't understand why there's an issue here with font
> Eli> selection. Are you saying that using Noto Color Emoji with
> Eli> CHAR+0xFE0E, when CHAR is an Emoji character, doesn't produce the
> Eli> textual representation of CHAR? If so, isn't that a problem with the
> Eli> font? I thought all we needed to do was to hand the combination to an
> Eli> Emoji-aware font, and the font would do the rest. Now you seem to be
> Eli> saying that we somehow need to select a non-Emoji font? But if so,
> Eli> who'd guarantee that a font that cannot display Emoji will know what
> Eli> to do with the combination CHAR+0xFE0E?
>
> Iʼm not sure: gedit displays the text representation, and libreoffice
> displays the emoji presentation. And the google color emoji website
> only shows colour glyphs. So I think itʼs up to the application to
> select the correct font.
But what is "the correct font", when the sequence of codepoints is
CHAR+0xFE0E? How do we identify such a font? Do you know of a font
that produces the correct glyph for this sequence, when HarfBuzz is
used as the shaping engine?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Sun, 28 May 2023 12:44:02 GMT)
Message #71 received at 63731 <at> debbugs.gnu.org (full text, mbox):
> From: Robert Pluim <rpluim <at> gmail.com>
> Cc: 63731 <at> debbugs.gnu.org, steven <at> stebalien.com
> Date: Sun, 28 May 2023 13:43:13 +0200
>
> >>>>> On Fri, 26 May 2023 21:05:47 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>
> >> From: Robert Pluim <rpluim <at> gmail.com>
> >> Cc: 63731 <at> debbugs.gnu.org, steven <at> stebalien.com
> >> Date: Fri, 26 May 2023 19:35:56 +0200
> >>
> >> in this case it *is* an emoji codepoint, so it displays
> >> as emoji because of font.c, even when followed by VS-15.
>
> Eli> If we pass to an Emoji-capable font a sequence of a character followed
> Eli> by VS-15, I'd expect the font to produce a glyph with the textual
> Eli> representation of that character.
>
> But we donʼt do that: we ask the font "give me a glyph for this codepoint".
Is that because of the composition-function-table's entry for VS-15?
Maybe we should augment that, then?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Sun, 28 May 2023 12:47:01 GMT)
Message #74 received at 63731 <at> debbugs.gnu.org (full text, mbox):
> From: Robert Pluim <rpluim <at> gmail.com>
> Cc: 63731 <at> debbugs.gnu.org, steven <at> stebalien.com
> Date: Sun, 28 May 2023 13:57:49 +0200
>
> Eli, if the 20e3 changes are too much for emacs-29, I can put them in
> master.
Yeah, I think it should go to master for now.
Otherwise, LGTM, thanks.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Mon, 29 May 2023 10:46:02 GMT)
Message #77 received at 63731 <at> debbugs.gnu.org (full text, mbox):
>>>>> On Sun, 28 May 2023 15:47:11 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>> From: Robert Pluim <rpluim <at> gmail.com>
>> Cc: 63731 <at> debbugs.gnu.org, steven <at> stebalien.com
>> Date: Sun, 28 May 2023 13:57:49 +0200
>>
>> Eli, if the 20e3 changes are too much for emacs-29, I can put them in
>> master.
Eli> Yeah, I think it should go to master for now.
I pushed the doc changes, but not the code changes, because I now
think theyʼre papering over a deeper bug (which weʼve noticed before,
but didnʼt fix then).
In all these cases, consider the sequence U+1F44D U+FE0F
- emacs-29:
Displays as colour emoji, followed by an empty box
- emacs-29 with the following change in composite.el:
(set-char-table-range
composition-function-table
#xFE0F
`([,(purecopy "\\c.\ufe0f") 1 compose-gstring-for-graphic]))
Displays as colour emoji. Much rejoicing. If I follow my own
advice, and customize `glyphless-char-display-control' to show
hex-boxes for variation selectors, you then see that in actual
fact, we are still displaying the FE0F, but since it uses
thin-space by default, it wasnʼt obvious. Much sadness.
C-u C-x =:
display: composed to form "👍️" (see below)
Composed with the following character(s) "️" using this font:
ftcrhb:-GOOG-Noto Color Emoji-regular-normal-normal-*-13-*-*-*-m-0-iso10646-1
by these glyphs:
[0 1 128077 569 16 0 17 13 4 nil]
with these character(s):
️ (#xfe0f) VARIATION SELECTOR-16
Now I notice (via emoji-variation-sequences.txt), that this is only
happening for the following codepoints.
U+1F408
U+1F415
U+1F426
U+1F446
U+1F447
U+1F448
U+1F449
U+1F44D
U+1F44E
And if I look in lisp/international/emoji-zwj.el, I find:
(#x1F44D .
,(eval-when-compile (regexp-opt
'(
"\N{U+1F44D}\N{U+1F3FB}"
"\N{U+1F44D}\N{U+1F3FC}"
"\N{U+1F44D}\N{U+1F3FD}"
"\N{U+1F44D}\N{U+1F3FE}"
"\N{U+1F44D}\N{U+1F3FF}"
))))
If I add
"\N{U+1F44D}\N{U+FE0F}"
to that, and undo the composite.el change, then everything is
fine. Hurrah! This means that the
`([,(purecopy "\\c.\\c^+") 1 compose-gstring-for-graphic]
[nil 0 compose-gstring-for-graphic])
is not doing the right thing for this case.
I can change the emoji-zwj.awk script to add CHAR+FE0F for all emoji,
unless someone knows how to fix composition to do the right thing
here.
(there are similar issues with CHAR+FE0E)
Robert
--
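A minimal sketch of the experiment described above: extend the list of sequences for U+1F44D with the CHAR+FE0F pair and build the regexp with `regexp-opt', as the emoji-zwj.el excerpt does. The variable names are hypothetical, and the rule vector is only constructed here, not installed. Installing it would be done with `set-char-table-range', as in the other snippets in this thread.

```elisp
;; Sequences copied from the emoji-zwj.el excerpt above, plus the
;; CHAR+FE0F pair under discussion.  Names are illustrative only.
(defvar my/thumbs-up-sequences
  '("\N{U+1F44D}\N{U+1F3FB}"
    "\N{U+1F44D}\N{U+1F3FC}"
    "\N{U+1F44D}\N{U+1F3FD}"
    "\N{U+1F44D}\N{U+1F3FE}"
    "\N{U+1F44D}\N{U+1F3FF}"
    "\N{U+1F44D}\N{U+FE0F}"))

(defvar my/thumbs-up-regexp (regexp-opt my/thumbs-up-sequences)
  "Regexp matching any of `my/thumbs-up-sequences'.")

;; A forward-looking rule keyed on the base character: PREV-CHARS is 0,
;; so the pattern is matched starting at U+1F44D itself.  Which shaping
;; function emoji-zwj.el really uses is an assumption here.
(defvar my/thumbs-up-rule
  (list (vector my/thumbs-up-regexp 0 'compose-gstring-for-graphic)))
```

With this regexp, the sequence U+1F44D U+FE0F is matched starting at the base character, which is the "looks forwards" case in the later messages.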
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Mon, 29 May 2023 13:59:02 GMT)
Message #80 received at 63731 <at> debbugs.gnu.org (full text, mbox):
> From: Robert Pluim <rpluim <at> gmail.com>
> Cc: 63731 <at> debbugs.gnu.org, steven <at> stebalien.com
> Date: Mon, 29 May 2023 12:44:58 +0200
>
> In all these cases, consider the sequence U+1F44D U+FE0F
>
> - emacs-29:
>
> Displays as colour emoji, followed by an empty box
>
> - emacs-29 with the following change in composite.el:
>
> (set-char-table-range
> composition-function-table
> #xFE0F
> `([,(purecopy "\\c.\ufe0f") 1 compose-gstring-for-graphic]))
>
> Displays as colour emoji. Much rejoicing. If I follow my own
> advice, and customize `glyphless-char-display-control' to show
> hex-boxes for variation selectors, you then see that in actual
> fact, we are still displaying the FE0F, but since it uses
> thin-space by default, it wasnʼt obvious. Much sadness.
>
> C-u C-x =:
>
> display: composed to form "👍️" (see below)
This is not what I see. I didn't use the above set-char-table-range
expression literally, but instead started "emacs -Q", and then
evaluated in *scratch*:
(set-char-table-range
composition-function-table
#xFE0F
'(["\\c.\ufe0f" 1 compose-gstring-for-graphic]))
After that, the sequence U+1F44D U+FE0F displays as a single glyph,
and there's no thin space after it. What am I missing? Is this
somehow specific to ftcrhb font driver or something?
> Now I notice (via emoji-variation-sequences.txt), that this is only
> happening for the following codepoints.
>
> U+1F408
> U+1F415
> U+1F426
> U+1F446
> U+1F447
> U+1F448
> U+1F449
> U+1F44D
> U+1F44E
>
> And if I look in lisp/international/emoji-zwj.el, I find:
>
> (#x1F44D .
> ,(eval-when-compile (regexp-opt
> '(
> "\N{U+1F44D}\N{U+1F3FB}"
> "\N{U+1F44D}\N{U+1F3FC}"
> "\N{U+1F44D}\N{U+1F3FD}"
> "\N{U+1F44D}\N{U+1F3FE}"
> "\N{U+1F44D}\N{U+1F3FF}"
> ))))
>
> If I add
>
> "\N{U+1F44D}\N{U+FE0F}"
>
> to that, and undo the composite.el change, then everything is
> fine. Hurrah! This means that the
>
> `([,(purecopy "\\c.\\c^+") 1 compose-gstring-for-graphic]
> [nil 0 compose-gstring-for-graphic])
>
> is not doing the right thing for this case.
You are saying that the entry in composition-function-table for
U+1F44D (and other similar characters) is used in preference to the
entry for U+FE0F that follows it, even though there's no U+1F3FB
etc. after it to "steal" the composition? Did you try stepping
through composite.c to see whether and why this is the case?
> I can change the emoji-zwj.awk script to add CHAR+FE0F for all emoji,
> unless someone knows how to fix composition to do the right thing
> here.
I think we need first to understand the issue at hand better. There's
more here than meets the eye, I think.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Mon, 29 May 2023 14:44:02 GMT)
Message #83 received at 63731 <at> debbugs.gnu.org (full text, mbox):
>>>>> On Mon, 29 May 2023 16:58:43 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>> display: composed to form "👍️" (see below)
Eli> This is not what I see. I didn't use the above set-char-table-range
Eli> expression literally, but instead started "emacs -Q", and then
Eli> evaluated in *scratch*:
Eli> (set-char-table-range
Eli> composition-function-table
Eli> #xFE0F
Eli> '(["\\c.\ufe0f" 1 compose-gstring-for-graphic]))
Eli> After that, the sequence U+1F44D U+FE0F displays as a single glyph,
Eli> and there's no thin space after it. What am I missing? Is this
Eli> somehow specific to ftcrhb font driver or something?
Itʼs a single glyph, but that glyph contains a thin-space. I used this
to check, the second 'a' is slightly offset
👍️a
👍a
This persists if I disable HarfBuzz, and it behaves the same on macOS.
Eli> You are saying that the entry in composition-function-table for
Eli> U+1F44D (and other similar characters) is used in preference to the
Eli> entry for U+FE0F that follows it, even though there's no U+1F3FB
Eli> etc. after it to "steal" the composition? Did you try stepping
Eli> through composite.c to see whether and why this is the case?
Right. It looks like the FE0F entry is ignored. Iʼve not ventured into
composite.c yet.
>> I can change the emoji-zwj.awk script to add CHAR+FE0F for all emoji,
>> unless someone knows how to fix composition to do the right thing
>> here.
Eli> I think we need first to understand the issue at hand better. There's
Eli> more here than meets the eye, I think.
Absolutely
Robert
--
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Mon, 29 May 2023 14:56:02 GMT)
Message #86 received at 63731 <at> debbugs.gnu.org (full text, mbox):
> From: Robert Pluim <rpluim <at> gmail.com>
> Cc: 63731 <at> debbugs.gnu.org, steven <at> stebalien.com
> Date: Mon, 29 May 2023 16:43:00 +0200
>
> >>>>> On Mon, 29 May 2023 16:58:43 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>
> >> display: composed to form "👍️" (see below)
>
> Eli> This is not what I see. I didn't use the above set-char-table-range
> Eli> expression literally, but instead started "emacs -Q", and then
> Eli> evaluated in *scratch*:
>
> Eli> (set-char-table-range
> Eli> composition-function-table
> Eli> #xFE0F
> Eli> '(["\\c.\ufe0f" 1 compose-gstring-for-graphic]))
>
> Eli> After that, the sequence U+1F44D U+FE0F displays as a single glyph,
> Eli> and there's no thin space after it. What am I missing? Is this
> Eli> somehow specific to ftcrhb font driver or something?
>
> Itʼs a single glyph, but that glyph contains a thin-space. I used this
> to check, the second 'a' is slightly offset
>
> 👍️a
> 👍a
That's because the first one shows two glyphs that are
"pseudo-composed": not by the font, but by our hand-made "composition"
in compose-gstring-for-graphic. Try this instead:
(set-char-table-range
composition-function-table
#xFE0F
'(["\\c.\ufe0f" 1 font-shape-gstring]))
so that we only see a composition if the font indeed agrees to
compose. What do you see?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Mon, 29 May 2023 16:14:02 GMT)
Message #89 received at 63731 <at> debbugs.gnu.org (full text, mbox):
>>>>> On Mon, 29 May 2023 17:55:49 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>> From: Robert Pluim <rpluim <at> gmail.com>
>> Cc: 63731 <at> debbugs.gnu.org, steven <at> stebalien.com
>> Date: Mon, 29 May 2023 16:43:00 +0200
>>
>> >>>>> On Mon, 29 May 2023 16:58:43 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>>
>> >> display: composed to form "👍️" (see below)
>>
Eli> This is not what I see. I didn't use the above set-char-table-range
Eli> expression literally, but instead started "emacs -Q", and then
Eli> evaluated in *scratch*:
>>
Eli> (set-char-table-range
Eli> composition-function-table
Eli> #xFE0F
Eli> '(["\\c.\ufe0f" 1 compose-gstring-for-graphic]))
>>
Eli> After that, the sequence U+1F44D U+FE0F displays as a single glyph,
Eli> and there's no thin space after it. What am I missing? Is this
Eli> somehow specific to ftcrhb font driver or something?
>>
>> Itʼs a single glyph, but that glyph contains a thin-space. I used this
>> to check, the second 'a' is slightly offset
>>
>> 👍️a
>> 👍a
Eli> That's because the first one shows two glyphs that are
Eli> "pseudo-composed": not by the font, but by our hand-made "composition"
Eli> in compose-gstring-for-graphic. Try this instead:
Eli> (set-char-table-range
Eli> composition-function-table
Eli> #xFE0F
Eli> '(["\\c.\ufe0f" 1 font-shape-gstring]))
Eli> so that we only see a composition if the font indeed agrees to
Eli> compose. What do you see?
It still displays a single glyph with a thin-space. If I customize
`glyphless-char-display-control' to display hex codes for VS, then it
displays a hex box.
So I guess that means weʼre not composing?
Robert
--
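For anyone reproducing this, a sketch of the `glyphless-char-display-control' customization mentioned above. It assumes the `variation-selectors' group and the `hex-code' method are available in the Emacs version being tested, and it uses `customize-set-variable' so that the variable's setter gets a chance to run.

```elisp
;; Show variation selectors as hex boxes rather than the default
;; (thin-space, per the discussion above).  Assumes this Emacs knows
;; the `variation-selectors' group.
(customize-set-variable
 'glyphless-char-display-control
 (cons '(variation-selectors . hex-code)
       ;; Drop any existing entry for the group; work on a copy because
       ;; `assq-delete-all' is destructive.
       (assq-delete-all 'variation-selectors
                        (copy-alist glyphless-char-display-control))))
```

After this, a U+FE0F that is displayed as a separate glyph should become visible as a hex box, which is how the stray selector was spotted in the messages above.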
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Mon, 29 May 2023 17:19:02 GMT)
Message #92 received at 63731 <at> debbugs.gnu.org (full text, mbox):
> From: Robert Pluim <rpluim <at> gmail.com>
> Cc: 63731 <at> debbugs.gnu.org, steven <at> stebalien.com
> Date: Mon, 29 May 2023 18:13:14 +0200
>
> >>>>> On Mon, 29 May 2023 17:55:49 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>
> Eli> That's because the first one shows two glyphs that are
> Eli> "pseudo-composed": not by the font, but by our hand-made "composition"
> Eli> in compose-gstring-for-graphic. Try this instead:
>
> Eli> (set-char-table-range
> Eli> composition-function-table
> Eli> #xFE0F
> Eli> '(["\\c.\ufe0f" 1 font-shape-gstring]))
>
> Eli> so that we only see a composition if the font indeed agrees to
> Eli> compose. What do you see?
>
> It still displays a single glyph with a thin-space. If I customize
> `glyphless-char-display-control' to display hex codes for VS, then it
> display a hex box.
>
> So I guess that means weʼre not composing?
What does "C-u C-x =" say in this case?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Tue, 30 May 2023 07:27:01 GMT)
Message #95 received at 63731 <at> debbugs.gnu.org (full text, mbox):
>>>>> On Mon, 29 May 2023 20:18:41 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>> From: Robert Pluim <rpluim <at> gmail.com>
>> Cc: 63731 <at> debbugs.gnu.org, steven <at> stebalien.com
>> Date: Mon, 29 May 2023 18:13:14 +0200
>>
>> >>>>> On Mon, 29 May 2023 17:55:49 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>>
Eli> That's because the first one shows two glyphs that are
Eli> "pseudo-composed": not by the font, but by our hand-made "composition"
Eli> in compose-gstring-for-graphic. Try this instead:
>>
Eli> (set-char-table-range
Eli> composition-function-table
Eli> #xFE0F
Eli> '(["\\c.\ufe0f" 1 font-shape-gstring]))
>>
Eli> so that we only see a composition if the font indeed agrees to
Eli> compose. What do you see?
>>
>> It still displays a single glyph with a thin-space. If I customize
>> `glyphless-char-display-control' to display hex codes for VS, then it
>> display a hex box.
>>
>> So I guess that means weʼre not composing?
Eli> What does "C-u C-x =" say in this case?
It claims itʼs composed:
position: 146 of 251 (58%), column: 0
character: 👍 (displayed as 👍) (codepoint 128077, #o372115, #x1f44d)
charset: unicode (Unicode (ISO10646))
code point in charset: 0x1F44D
script: emoji
syntax: w which means: word
category: .:Base
to input: type "C-x 8 RET 1f44d" or "C-x 8 RET THUMBS UP SIGN"
buffer code: #xF0 #x9F #x91 #x8D
file code: #xF0 #x9F #x91 #x8D (encoded by coding system utf-8-unix)
display: composed to form "👍️" (see below)
Composed with the following character(s) "️" using this font:
ftcrhb:-GOOG-Noto Color Emoji-regular-normal-normal-*-13-*-*-*-m-0-iso10646-1
by these glyphs:
[0 1 128077 569 16 0 17 13 4 nil]
with these character(s):
️ (#xfe0f) VARIATION SELECTOR-16
Robert
--
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Tue, 30 May 2023 12:11:01 GMT)
Message #98 received at 63731 <at> debbugs.gnu.org (full text, mbox):
> From: Robert Pluim <rpluim <at> gmail.com>
> Cc: 63731 <at> debbugs.gnu.org, steven <at> stebalien.com
> Date: Tue, 30 May 2023 09:25:52 +0200
>
> >>>>> On Mon, 29 May 2023 20:18:41 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>
> Eli> (set-char-table-range
> Eli> composition-function-table
> Eli> #xFE0F
> Eli> '(["\\c.\ufe0f" 1 font-shape-gstring]))
> >>
> Eli> so that we only see a composition if the font indeed agrees to
> Eli> compose. What do you see?
> >>
> >> It still displays a single glyph with a thin-space. If I customize
> >> `glyphless-char-display-control' to display hex codes for VS, then it
> >> display a hex box.
> >>
> >> So I guess that means weʼre not composing?
>
> Eli> What does "C-u C-x =" say in this case?
>
> It claims itʼs composed:
>
> position: 146 of 251 (58%), column: 0
> character: 👍 (displayed as 👍) (codepoint 128077, #o372115, #x1f44d)
> charset: unicode (Unicode (ISO10646))
> code point in charset: 0x1F44D
> script: emoji
> syntax: w which means: word
> category: .:Base
> to input: type "C-x 8 RET 1f44d" or "C-x 8 RET THUMBS UP SIGN"
> buffer code: #xF0 #x9F #x91 #x8D
> file code: #xF0 #x9F #x91 #x8D (encoded by coding system utf-8-unix)
> display: composed to form "👍️" (see below)
>
> Composed with the following character(s) "️" using this font:
> ftcrhb:-GOOG-Noto Color Emoji-regular-normal-normal-*-13-*-*-*-m-0-iso10646-1
> by these glyphs:
> [0 1 128077 569 16 0 17 13 4 nil]
> with these character(s):
> ️ (#xfe0f) VARIATION SELECTOR-16
Which means it _is_ composed. Moreover, with Noto Color Emoji we get
a single glyph. On my system, I have Noto Emoji, from which I get two
glyphs:
[0 1 128077 422 17 1 15 12 2 nil]
[0 1 65039 3 17 0 1 0 1 [0 0 0]]
(in which case I can understand why the second one is displayed as a
hex box if I customize glyphless-char-display-control).
So, given that this is the case, why is this wrong, again? If the
font and the shaper produce two glyphs, or one glyph that looks like
two, why should we think it's an Emacs problem?
(I verified that Emacs 28 shows the same, so this is not a recent
regression.)
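To make the glyph vectors quoted above easier to read, here is a small sketch that names their fields using the `lglyph-*' accessors from composite.el. The helper `my/describe-lglyph' is hypothetical. The two vectors are the ones reported in this thread for Noto Color Emoji and Noto Emoji.

```elisp
;; Hypothetical helper: label the main fields of a glyph vector as
;; printed by `C-u C-x ='.
(defun my/describe-lglyph (glyph)
  "Return a plist naming the main fields of GLYPH."
  (list :from (lglyph-from glyph)       ; first character index covered
        :to (lglyph-to glyph)           ; last character index covered
        :char (lglyph-char glyph)       ; character code
        :code (lglyph-code glyph)       ; glyph code in the font
        :width (lglyph-width glyph)
        :adjustment (lglyph-adjustment glyph)))

;; Noto Color Emoji: one glyph covering both characters (indices 0..1).
(my/describe-lglyph [0 1 128077 569 16 0 17 13 4 nil])

;; Noto Emoji: a second glyph whose character is 65039, i.e. #xFE0F.
(my/describe-lglyph [0 1 65039 3 17 0 1 0 1 [0 0 0]])
```

Read this way, the first vector is a single glyph for U+1F44D spanning both characters, while the second font also yields a separate glyph for the variation selector.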
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Tue, 30 May 2023 13:32:02 GMT)
Message #101 received at 63731 <at> debbugs.gnu.org (full text, mbox):
>>>>> On Tue, 30 May 2023 15:10:45 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
Eli> Which means it _is_ composed. Moreover, with Noto Color Emoji we get
Eli> a single glyph. On my system, I have Noto Emoji, from which I get two
Eli> glyphs:
Eli> [0 1 128077 422 17 1 15 12 2 nil]
Eli> [0 1 65039 3 17 0 1 0 1 [0 0 0]]
Eli> (in which case I can understand why the second one is displayed as a
Eli> hex box if I customize glyphless-char-display-control).
But I also get a hex box if I customize
glyphless-char-display-control, even though 'C-u C-x =' claims thereʼs
only one glyph.
Eli> So, given that this is the case, why is this wrong, again? If the
Eli> font and the shaper produce two glyphs, or one glyph that looks like
Eli> two, why should we think it's an Emacs problem?
Because Emacs behaves differently depending on whether we have a
composition rule for FE0F that looks backwards or one for 1F44D that
looks forwards. The sequence in both cases is
U+1F44D U+FE0F U+7C U+61
U+1F44D U+7C U+61
(set-char-table-range
composition-function-table
#xFE0F
'(["\\c.\ufe0f" 1 font-shape-gstring]))
produces the following:
[backward-composition.png (image/png, inline)]
There is a (very) thin space that shouldnʼt be there between the 1f44d
and the '|' on the line that has the FE0F (and since it follows the
value of glyphless-char-display-control, I donʼt think
it comes from the shaping engine).
but
(set-char-table-range
composition-function-table
#x1F44D
'(["\U0001f44d\ufe0f" 0 font-shape-gstring]))
gives me this, where the two '|' align perfectly.
[forward-composition.png (image/png, inline)]
(as an experiment, I hacked 'produce_glyphless_glyph' to skip
displaying variation selectors, and the problem disappears).
thanks
Robert
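The "looks backwards" versus "looks forwards" distinction above corresponds to the PREV-CHARS element of a [PATTERN PREV-CHARS FUNC] rule: it says how many characters before the key character the pattern match should start. The sketch below mimics only that regexp check on a plain string. It is not what composite.c does in full, and the names `my/rule-matches-at-p' and `my/vs16-line' are made up for illustration.

```elisp
;; Hypothetical helper: emulate just the regexp part of a
;; `composition-function-table' rule on a plain string.
(defun my/rule-matches-at-p (rule string index)
  "Return non-nil if RULE's pattern matches STRING for the char at INDEX.
RULE is a vector [PATTERN PREV-CHARS FUNC].  The match must begin
PREV-CHARS characters before INDEX."
  (let ((pattern (aref rule 0))
        (start (- index (aref rule 1)))
        (case-fold-search nil))
    (and (stringp pattern)
         (>= start 0)
         (eq (string-match pattern string start) start))))

;; The first test line from the message: U+1F44D U+FE0F U+7C U+61.
(defconst my/vs16-line "\U0001f44d\ufe0f|a")

;; Backward-looking rule, keyed on U+FE0F (index 1), one preceding char.
(my/rule-matches-at-p ["\\c.\ufe0f" 1 font-shape-gstring] my/vs16-line 1)

;; Forward-looking rule, keyed on U+1F44D (index 0), no preceding chars.
(my/rule-matches-at-p ["\U0001f44d\ufe0f" 0 font-shape-gstring] my/vs16-line 0)
```

Both patterns match the same two characters, so under this simplified view the two rules are equivalent. That is consistent with the suspicion in this thread that the visible difference comes from somewhere other than the regexps themselves.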
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Tue, 30 May 2023 16:32:01 GMT)
Message #104 received at 63731 <at> debbugs.gnu.org (full text, mbox):
> From: Robert Pluim <rpluim <at> gmail.com>
> Cc: 63731 <at> debbugs.gnu.org, steven <at> stebalien.com
> Date: Tue, 30 May 2023 15:30:58 +0200
>
> >>>>> On Tue, 30 May 2023 15:10:45 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>
> Eli> Which means it _is_ composed. Moreover, with Noto Color Emoji we get
> Eli> a single glyph. On my system, I have Noto Emoji, from which I get two
> Eli> glyphs:
>
> Eli> [0 1 128077 422 17 1 15 12 2 nil]
> Eli> [0 1 65039 3 17 0 1 0 1 [0 0 0]]
>
> Eli> (in which case I can understand why the second one is displayed as a
> Eli> hex box if I customize glyphless-char-display-control).
>
> But I also get a hex box if I customize
> glyphless-char-display-control, even though 'C-u C-x =' claims thereʼs
> only one glyph.
>
> Eli> So, given that this is the case, why is this wrong, again? If the
> Eli> font and the shaper produce two glyphs, or one glyph that looks like
> Eli> two, why should we think it's an Emacs's problem?
>
> Because Emacs behaves differently depending on whether we have a
> composition rule for FE0F that looks backwards or one for 1F44D that
> looks forwards. The sequence in both cases is
>
> U+1F44D U+FE0F U+7C U+61
> U+1F44D U+7C U+61
>
> (set-char-table-range
> composition-function-table
> #xFE0F
> '(["\\c.\ufe0f" 1 font-shape-gstring]))
>
> produces the following:
>
> There is a (very) thin space that shouldnʼt be there between the 1f44d
> and the '|' on the line that has the FE0F (and since it follows the
> value of glyphless-char-display-control, I donʼt think
> it comes from the shaping engine).
OK, here's the scoop: there's no composition there. "C-u C-x =" says
there is, but that's a lie: when I look in GDB at the glyphs actually
shown there, there's no composition glyphs, only the glyph for U+1F44D
followed by a glyph for U+FE0F.
> but
>
> (set-char-table-range
> composition-function-table
> #x1F44D
> '(["\U0001f44d\ufe0f" 0 font-shape-gstring]))
>
> gives me this, where the two '|' align perfectly.
Here, there _is_ a composition.
So there are two issues here: (a) why there's no composition in the
first case, and (b) why "C-u C-x =" says there is when there
isn't.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Wed, 31 May 2023 16:12:02 GMT)
Message #107 received at 63731 <at> debbugs.gnu.org (full text, mbox):
>>>>> On Tue, 30 May 2023 19:32:23 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>> (set-char-table-range
>> composition-function-table
>> #x1F44D
>> '(["\U0001f44d\ufe0f" 0 font-shape-gstring]))
>>
>> gives me this, where the two '|' align perfectly.
Eli> Here, there _is_ a composition.
Eli> So there are two issues here: (a) why there's no composition in the
Eli> first case, and (b) why does "C-u C-x =" says there is when there
Eli> isn't.
OK. I can poke around in gdb if you give me some idea of what I should
be looking at.
Robert
--
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Wed, 31 May 2023 16:18:02 GMT)
Message #110 received at 63731 <at> debbugs.gnu.org (full text, mbox):
> From: Robert Pluim <rpluim <at> gmail.com>
> Cc: 63731 <at> debbugs.gnu.org, steven <at> stebalien.com
> Date: Wed, 31 May 2023 18:11:36 +0200
>
> >>>>> On Tue, 30 May 2023 19:32:23 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>
> >> (set-char-table-range
> >> composition-function-table
> >> #x1F44D
> >> '(["\U0001f44d\ufe0f" 0 font-shape-gstring]))
> >>
> >> gives me this, where the two '|' align perfectly.
>
> Eli> Here, there _is_ a composition.
>
> Eli> So there are two issues here: (a) why there's no composition in the
> Eli> first case, and (b) why does "C-u C-x =" says there is when there
> Eli> isn't.
>
> OK. I can poke around in gdb if you give me some idea of what I should
> be looking at.
I don't really know. I plan to just step through the code in
composite.c tomorrow, unless you beat me to it. Once we understand
issue (a), I think we will also understand issue (b).
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Thu, 01 Jun 2023 12:43:02 GMT)
Message #113 received at 63731 <at> debbugs.gnu.org (full text, mbox):
> Cc: 63731 <at> debbugs.gnu.org, steven <at> stebalien.com
> Date: Wed, 31 May 2023 19:18:22 +0300
> From: Eli Zaretskii <eliz <at> gnu.org>
>
> > From: Robert Pluim <rpluim <at> gmail.com>
> > Cc: 63731 <at> debbugs.gnu.org, steven <at> stebalien.com
> > Date: Wed, 31 May 2023 18:11:36 +0200
> >
> > Eli> So there are two issues here: (a) why there's no composition in the
> > Eli> first case, and (b) why does "C-u C-x =" says there is when there
> > Eli> isn't.
> >
> > OK. I can poke around in gdb if you give me some idea of what I should
> > be looking at.
>
> I don't really know. I plan to just step through the code in
> composite.c tomorrow, unless you beat me to it. Once we understand
> issue (a), I think we will also understand issue (b).
OK, the issue is quite clear even without stepping with a debugger.
Bottom line: we cannot support a situation where the same character
can be composed by more than one slot in composition-function-table.
If there is more than one slot for the same character, one of
them will be tried, and the rest will be ignored (not even tried).
In particular, if a character CH has a "forward" composition rule that
starts with itself, and also has a "backward" rule (one with non-zero
look-back parameter) triggered by a different character (which should
follow CH), the latter rule will never be tried.
This is what happens in this case: the character #x1F44D has several
rules that start with itself in emoji-zwj.el:
(#x1F44D .
 ,(eval-when-compile (regexp-opt
   '(
     "\N{U+1F44D}\N{U+1F3FB}"
     "\N{U+1F44D}\N{U+1F3FC}"
     "\N{U+1F44D}\N{U+1F3FD}"
     "\N{U+1F44D}\N{U+1F3FE}"
     "\N{U+1F44D}\N{U+1F3FF}"
     ))))
and it also has a "backward" rule:
(set-char-table-range
 composition-function-table
 #xFE0F '(["\\c.\ufe0f" 1 font-shape-gstring]))
The latter is triggered by #xFE0F and has a 1-character look-back,
which will match #x1F44D, since its category is '.' (it's a "base
character"). This latter rule is never tried. Why? Because the
former rules, anchored at #x1F44D, are tried first (Emacs redisplay
examines characters in the order of their buffer positions), and fail
to match. When those rules fail to match, due to how the
composition-related functions called by the display engine are
factored, we never again consider compositions triggered by a later
character which also "cover" #x1F44D: once that position was examined
and the attempted composition failed, we move to the next character.
IOW, we assume that the first set of composition rules we find for a
given character is the only one that could possibly be relevant for
that character.
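To check whether a given pair of characters falls into this trap, one can look at both slots of composition-function-table. The following is only a sketch: the helper name is made up, and it assumes each slot holds a list of [PATTERN PREV-CHARS FUNC] vectors, as the variable's doc string describes.

```elisp
(require 'seq)

;; Hypothetical helper (not part of Emacs): count the rules anchored at
;; CH and the look-back rules (PREV-CHARS > 0) stored under NEXT.
(defun my-composition-rule-summary (ch next)
  "Return a plist counting rules anchored at CH and look-back rules at NEXT."
  (let ((anchored (seq-filter #'vectorp
                              (aref composition-function-table ch)))
        (look-back (seq-filter
                    (lambda (rule)
                      (and (vectorp rule) (> (aref rule 1) 0)))
                    (aref composition-function-table next))))
    (list :anchored-at-ch (length anchored)
          :look-back-at-next (length look-back))))

;; Example call for the case discussed here:
;; (my-composition-rule-summary #x1F44D #xFE0F)
```

If both counts are non-zero, the look-back rules stored under NEXT are the ones that, per the explanation above, the display engine will not try for CH.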
Which means that to have #xFE0F compose correctly with Emoji
codepoints, we should include #xFE0F in the sequences in emoji-zwj.el.
The reason "C-u C-x =" lies to us, saying there's a composition
where really there isn't, is that descr-text.el uses the
find-composition primitive, whose implementation is parallel to and
separate from that of the display-engine routines, and is structured
differently. So find-composition does succeed in detecting the second
rule, the one triggered by #xFE0F, which the display engine ignores.
I will think whether this can be fixed, to avoid such false positives,
but if we accept that there can be only one set of composition rules
for a character, then we basically invoked undefined behavior here,
and we got what we deserved.
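What find-composition claims about a position can be queried directly, independently of what is actually drawn. A small sketch, with a made-up command name, that only reports the range find-composition returns at point:

```elisp
;; Hypothetical command (not part of Emacs): report what
;; `find-composition', the primitive used by descr-text.el, says about
;; the text at point.  This does not inspect the glyphs on screen.
(defun my-report-composition-at-point ()
  "Show the range `find-composition' reports at point, if any."
  (interactive)
  (let ((comp (find-composition (point))))
    (if comp
        (message "find-composition reports a composition covering %d..%d"
                 (nth 0 comp) (nth 1 comp))
      (message "find-composition reports no composition at %d" (point)))))
```

A reported range here is not proof that the display engine composed the text; as described above, the two code paths can disagree.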
Thanks.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Thu, 01 Jun 2023 13:31:03 GMT)
Message #116 received at 63731 <at> debbugs.gnu.org (full text, mbox):
>>>>> On Thu, 01 Jun 2023 15:43:26 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>> Cc: 63731 <at> debbugs.gnu.org, steven <at> stebalien.com
>> Date: Wed, 31 May 2023 19:18:22 +0300
>> From: Eli Zaretskii <eliz <at> gnu.org>
>>
>> > From: Robert Pluim <rpluim <at> gmail.com>
>> > Cc: 63731 <at> debbugs.gnu.org, steven <at> stebalien.com
>> > Date: Wed, 31 May 2023 18:11:36 +0200
>> >
>> > Eli> So there are two issues here: (a) why there's no composition in the
>> > Eli> first case, and (b) why does "C-u C-x =" says there is when there
>> > Eli> isn't.
>> >
>> > OK. I can poke around in gdb if you give me some idea of what I should
>> > be looking at.
>>
>> I don't really know. I plan to just step through the code in
>> composite.c tomorrow, unless you beat me to it. Once we understand
>> issue (a), I think we will also understand issue (b).
Eli> OK, the issue is quite clear even without stepping with a debugger.
Eli> Bottom line: we cannot support a situation where the same character
Eli> can be composed by more than one slot in composition-function-table.
Eli> If there are more than a single slot for the same character, one of
Eli> them will be tried, and the rest will be ignored (not even tried).
Eli> In particular, if a character CH has a "forward" composition rule that
Eli> starts with itself, and also has a "backward" rule (one with non-zero
Eli> look-back parameter) triggered by a different character (which should
Eli> follow CH), the latter rule will never be tried.
OK, that makes sense. Where would be a good place to document this?
Eli> This is what happens in this case: the character #x1F44D has several
Eli> rules that start with itself in emoji-zwj.el:
Eli> (#x1F44D .
Eli> ,(eval-when-compile (regexp-opt
Eli> '(
Eli> "\N{U+1F44D}\N{U+1F3FB}"
Eli> "\N{U+1F44D}\N{U+1F3FC}"
Eli> "\N{U+1F44D}\N{U+1F3FD}"
Eli> "\N{U+1F44D}\N{U+1F3FE}"
Eli> "\N{U+1F44D}\N{U+1F3FF}"
Eli> ))))
Eli> and it also has a "backward" rule:
Eli> (set-char-table-range
Eli> composition-function-table
Eli> #xFE0F '(["\\c.\ufe0f" 1 font-shape-gstring]))
Eli> The latter is triggered by #xFE0F and has a 1-character look-back,
Eli> which will match #x1F44D, since its category is '.' (it's a "base
Eli> character"). This latter rule is never tried. Why? because the
Eli> former rules, anchored at #X1F44D, are tried first (Emacs redisplay
Eli> examines characters in the order of their buffer positions), and fail
Eli> to match. When those rules fail to match, due to how the
Eli> composition-related functions called by the display engine are
Eli> factored, we never again consider compositions triggered by a later
Eli> character which "cover" also #x1F44D: once that position was examined
Eli> and the attempted composition failed, we move to the next character.
Eli> IOW, we assume that this first set of composition rules we find for a
Eli> given character are the only ones that could possibly be relevant for
Eli> that character.
Eli> Which means that to have #xFE0F compose correctly with Emoji
Eli> codepoints, we should include #xFE0F in the sequences in emoji-zwj.el.
Thatʼs easy enough:
diff --git a/admin/unidata/emoji-zwj.awk b/admin/unidata/emoji-zwj.awk
index 7d2ff6cb900..d1195ebbad8 100644
--- a/admin/unidata/emoji-zwj.awk
+++ b/admin/unidata/emoji-zwj.awk
@@ -106,7 +106,8 @@ END {
for (elt in ch)
{
- printf("(#x%s .\n,(eval-when-compile (regexp-opt\n'(\n%s\n))))\n", elt, vec[elt])
+ entries = sprintf("%s\n\"\\N{U+%s}\\N{U+FE0F}\"", vec[elt], elt)
+ printf("(#x%s .\n,(eval-when-compile (regexp-opt\n'(\n%s\n))))\n", elt, entries)
}
print "))"
print " (set-char-table-range composition-function-table"
That makes all the VS-16 sequences in
admin/unidata/emoji-variation-sequences.txt display with the emoji
font for me.
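As a sanity check of what the patched script would generate for U+1F44D, the resulting regexp can be built by hand. This is a sketch, not the generated emoji-zwj.el itself; the variable name is made up for illustration.

```elisp
;; Sketch of the regexp the patched emoji-zwj.awk would produce for
;; U+1F44D: the existing skin-tone sequences plus the new VS-16 entry.
(defvar my-thumbs-up-regexp
  (regexp-opt
   '("\N{U+1F44D}\N{U+1F3FB}"
     "\N{U+1F44D}\N{U+1F3FC}"
     "\N{U+1F44D}\N{U+1F3FD}"
     "\N{U+1F44D}\N{U+1F3FE}"
     "\N{U+1F44D}\N{U+1F3FF}"
     "\N{U+1F44D}\N{U+FE0F}"))          ; entry added by the patch
  "Hypothetical variable, for illustration only.")

;; With the added entry, the rule anchored at U+1F44D matches the
;; VS-16 sequence at the start of the string (returns 0).
(string-match-p my-thumbs-up-regexp "\N{U+1F44D}\N{U+FE0F}|a")

;; A bare U+1F44D followed by '|' still does not match (returns nil).
(string-match-p my-thumbs-up-regexp "\N{U+1F44D}|a")
```

Because the VS-16 sequence is now matched by a rule anchored at the base character, there is no need for the look-back rule on U+FE0F to be tried.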
Eli> The reason why "C-u C-x =" lies to us saying there's a composition
Eli> where really there isn't is because descr-text.el uses the
Eli> find-composition primitive, whose implementation is parallel and
Eli> separate from that of the display-engine routines, and is structured
Eli> differently. So find-composition does succeed to detect the second
Eli> rule, the one triggered by #xFE0F, which the display engine ignores.
Eli> I will think whether this can be fixed, to avoid such false positives,
Eli> but if we accept that there can be only one set of composition rules
Eli> for a character, then we basically invoked undefined behavior here,
Eli> and we got what we deserved.
If find-composition DTRT, could we not use it in the display engine?
Robert
--
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Thu, 01 Jun 2023 16:10:02 GMT)
Message #119 received at 63731 <at> debbugs.gnu.org (full text, mbox):
> From: Robert Pluim <rpluim <at> gmail.com>
> Cc: 63731 <at> debbugs.gnu.org, steven <at> stebalien.com
> Date: Thu, 01 Jun 2023 15:30:18 +0200
>
> Eli> OK, the issue is quite clear even without stepping with a debugger.
>
> Eli> Bottom line: we cannot support a situation where the same character
> Eli> can be composed by more than one slot in composition-function-table.
> Eli> If there are more than a single slot for the same character, one of
> Eli> them will be tried, and the rest will be ignored (not even tried).
> Eli> In particular, if a character CH has a "forward" composition rule that
> Eli> starts with itself, and also has a "backward" rule (one with non-zero
> Eli> look-back parameter) triggered by a different character (which should
> Eli> follow CH), the latter rule will never be tried.
>
> OK, that makes sense. Where would be a good place to document this?
In the doc string of composition-function-table, I think. We already
document there the caveat of arranging rules in descending order of
look-back, which is part of the same "misfeature".
> Eli> Which means that to have #xFE0F compose correctly with Emoji
> Eli> codepoints, we should include #xFE0F in the sequences in emoji-zwj.el.
>
> Thatʼs easy enough:
>
> diff --git a/admin/unidata/emoji-zwj.awk b/admin/unidata/emoji-zwj.awk
> index 7d2ff6cb900..d1195ebbad8 100644
> --- a/admin/unidata/emoji-zwj.awk
> +++ b/admin/unidata/emoji-zwj.awk
> @@ -106,7 +106,8 @@ END {
>
> for (elt in ch)
> {
> - printf("(#x%s .\n,(eval-when-compile (regexp-opt\n'(\n%s\n))))\n", elt, vec[elt])
> + entries = sprintf("%s\n\"\\N{U+%s}\\N{U+FE0F}\"", vec[elt], elt)
> + printf("(#x%s .\n,(eval-when-compile (regexp-opt\n'(\n%s\n))))\n", elt, entries)
> }
> print "))"
> print " (set-char-table-range composition-function-table"
>
> That makes all the VS-16 sequences in
> admin/unidata/emoji-variation-sequences.txt display with the emoji
> font for me.
Ready to install this on the emacs-29 branch?
> Eli> The reason why "C-u C-x =" lies to us saying there's a composition
> Eli> where really there isn't is because descr-text.el uses the
> Eli> find-composition primitive, whose implementation is parallel and
> Eli> separate from that of the display-engine routines, and is structured
> Eli> differently. So find-composition does succeed to detect the second
> Eli> rule, the one triggered by #xFE0F, which the display engine ignores.
> Eli> I will think whether this can be fixed, to avoid such false positives,
> Eli> but if we accept that there can be only one set of composition rules
> Eli> for a character, then we basically invoked undefined behavior here,
> Eli> and we got what we deserved.
>
> If find-composition DTRT, could we not use it in the display engine?
Not easily, because the display code calls subroutines of
find-composition in a certain order, and that's what causes the
behavior I described.
And even if we could make this happen, I'm not sure we should:
basically, having multiple matching slots would mean users and callers
will never be sure which one "wins".
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Thu, 01 Jun 2023 16:36:02 GMT)
Message #122 received at 63731 <at> debbugs.gnu.org (full text, mbox):
>>>>> On Thu, 01 Jun 2023 19:10:16 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>> From: Robert Pluim <rpluim <at> gmail.com>
>> Cc: 63731 <at> debbugs.gnu.org, steven <at> stebalien.com
>> Date: Thu, 01 Jun 2023 15:30:18 +0200
>>
Eli> OK, the issue is quite clear even without stepping with a debugger.
>>
Eli> Bottom line: we cannot support a situation where the same character
Eli> can be composed by more than one slot in composition-function-table.
Eli> If there are more than a single slot for the same character, one of
Eli> them will be tried, and the rest will be ignored (not even tried).
Eli> In particular, if a character CH has a "forward" composition rule that
Eli> starts with itself, and also has a "backward" rule (one with non-zero
Eli> look-back parameter) triggered by a different character (which should
Eli> follow CH), the latter rule will never be tried.
>>
>> OK, that makes sense. Where would be a good place to document this?
Eli> In the doc string of composition-function-table, I think. We already
Eli> document there the caveat of arranging rules in descending order of
Eli> look-back, which is part of the same "misfeature".
OK. Iʼll see if I can come up with something (or Iʼll just steal what
you wrote above :-)).
>> That makes all the VS-16 sequences in
>> admin/unidata/emoji-variation-sequences.txt display with the emoji
>> font for me.
Eli> Ready to install this on the emacs-29 branch?
Not today. My brain is fuzzy, and it needs more testing (the patch,
not my brain).
>> If find-composition DTRT, could we not use it in the display engine?
Eli> Not easily, because the display code calls subroutines of
Eli> find-composition in a certain order, and that's what causes the
Eli> behavior I described.
Eli> And even if we could make this happen, I'm not sure we should:
Eli> basically, having multiple matching slots would mean users and callers
Eli> will never be sure which one "wins".
Yes, at least the semantics are clear (now that we know what they
are).
Robert
--
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Fri, 02 Jun 2023 08:16:01 GMT)
Message #125 received at 63731 <at> debbugs.gnu.org (full text, mbox):
>>>>> On Thu, 01 Jun 2023 18:34:53 +0200, Robert Pluim <rpluim <at> gmail.com> said:
Eli> Ready to install this on the emacs-29 branch?
Robert> Not today. My brain is fuzzy, and it needs more testing (the patch,
Robert> not my brain).
So the minimal change to get CHAR+VS-15 and CHAR+VS-16 to compose in
all our emoji test files is below. I noticed that we donʼt compose all
the sequences in emoji-test.txt correctly, but Iʼll fix that on master
by stealing^Wdrawing inspiration from Larsʼ work.
Proper VS-15 support is harder, I need to think about that some more.
diff --git c/admin/unidata/emoji-zwj.awk i/admin/unidata/emoji-zwj.awk
index 7d2ff6cb900..f13f796bcac 100644
--- c/admin/unidata/emoji-zwj.awk
+++ i/admin/unidata/emoji-zwj.awk
@@ -106,7 +106,8 @@ END {
for (elt in ch)
{
- printf("(#x%s .\n,(eval-when-compile (regexp-opt\n'(\n%s\n))))\n", elt, vec[elt])
+ entries = sprintf("%s\n\"\\N{U+%s}\\N{U+FE0E}\"\n\"\\N{U+%s}\\N{U+FE0F}\"", vec[elt], elt, elt)
+ printf("(#x%s .\n,(eval-when-compile (regexp-opt\n'(\n%s\n))))\n", elt, entries)
}
print "))"
print " (set-char-table-range composition-function-table"
diff --git c/lisp/composite.el i/lisp/composite.el
index fb8b76114f4..9710c3c371b 100644
--- c/lisp/composite.el
+++ i/lisp/composite.el
@@ -861,7 +861,7 @@ compose-gstring-for-variation-glyph
;; handled in font_range, we end up choosing the Emoji presentation
;; rather than the Text presentation.
(let ((elt '([".." 1 compose-gstring-for-variation-glyph])))
- (set-char-table-range composition-function-table '(#xFE00 . #xFE0E) elt)
+ (set-char-table-range composition-function-table '(#xFE00 . #xFE0D) elt)
(set-char-table-range composition-function-table '(#xE0100 . #xE01EF) elt))
(defun auto-compose-chars (func from to font-object string direction)
Robert
--
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Fri, 02 Jun 2023 12:07:02 GMT)
Message #128 received at 63731 <at> debbugs.gnu.org (full text, mbox):
> From: Robert Pluim <rpluim <at> gmail.com>
> Cc: 63731 <at> debbugs.gnu.org, steven <at> stebalien.com
> Date: Fri, 02 Jun 2023 10:15:08 +0200
>
> >>>>> On Thu, 01 Jun 2023 18:34:53 +0200, Robert Pluim <rpluim <at> gmail.com> said:
>
> Eli> Ready to install this on the emacs-29 branch?
>
> Robert> Not today. My brain is fuzzy, and it needs more testing (the patch,
> Robert> not my brain).
>
> So the minimal change to get CHAR+VS-15 and CHAR+VS-16 to compose in
> all our emoji test files is below. I noticed that we donʼt compose all
> the sequences in emoji-test.txt correctly, but Iʼll fix that on master
> by stealing^Wdrawing inspiration from Larsʼ work.
Thanks, please install this on the emacs-29 branch.
> Proper VS-15 support is harder, I need to think about that some more.
Can you describe here the current problems with VS-15?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Fri, 02 Jun 2023 12:26:02 GMT)
Message #131 received at 63731 <at> debbugs.gnu.org (full text, mbox):
tags 63731 fixed
close 63731 29.1
quit
>>>>> On Fri, 02 Jun 2023 15:06:32 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
Eli> Thanks, please install this on the emacs-29 branch.
Closing.
Committed as 2f94f6de9d6
>> Proper VS-15 support is harder, I need to think about that some more.
Eli> Can you describe here the current problems with VS-15?
CHAR+VS-15 and CHAR+VS-16 correctly choose text and emoji
representation, but CHAR+VS-15 results in the text representation only
if CHAR is not an emoji. If it is an emoji, the font selected for it
will always be the emoji font.
Iʼve tried forcing font_range to use the font for the 'symbol' script
for EMOJI+VS-15, instead, but that resulted in composition
failing. Maybe there are some more dragons lurking in the composition
rules.
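Since the font choice here depends on which script a character belongs to, it can help to look at char-script-table for each member of a sequence. A small sketch with a made-up helper name:

```elisp
;; Hypothetical helper (not part of Emacs): show which script Emacs
;; assigns to each character, by looking it up in `char-script-table'.
(defun my-scripts-of (&rest chars)
  "Return an alist of (CHAR . SCRIPT) for CHARS."
  (mapcar (lambda (c) (cons c (aref char-script-table c))) chars))

;; Example call for an emoji followed by VS-15 or VS-16:
;; (my-scripts-of #x23E9 #xFE0E #xFE0F)
```

The result depends on the Emacs version, so no expected output is given here.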
Robert
--
Added tag(s) fixed. Request was from Robert Pluim <rpluim <at> gmail.com> to control <at> debbugs.gnu.org. (Fri, 02 Jun 2023 12:26:02 GMT)
bug marked as fixed in version 29.1, send any further explanations to 63731 <at> debbugs.gnu.org and Steven Allen <steven <at> stebalien.com>. Request was from Robert Pluim <rpluim <at> gmail.com> to control <at> debbugs.gnu.org. (Fri, 02 Jun 2023 12:26:02 GMT)
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Fri, 02 Jun 2023 12:58:01 GMT)
Message #138 received at 63731 <at> debbugs.gnu.org (full text, mbox):
> From: Robert Pluim <rpluim <at> gmail.com>
> Cc: 63731 <at> debbugs.gnu.org, steven <at> stebalien.com
> Date: Fri, 02 Jun 2023 14:25:28 +0200
>
> Eli> Thanks, please install this on the emacs-29 branch.
>
> Closing.
> Committed as 2f94f6de9d6
Thanks.
> >> Proper VS-15 support is harder, I need to think about that some more.
>
> Eli> Can you describe here the current problems with VS-15?
>
> CHAR+VS-15 and CHAR+VS-16 correctly choose text and emoji
> representation, but CHAR+VS-15 results in the text representation only
> if CHAR is not an emoji. If it is an emoji, the font selected for it
> will always be the emoji font.
And an Emoji font, when presented with a CHAR+VS-15 sequence, doesn't
produce a textual-representation glyph for CHAR? I'd expect it to.
If Emoji fonts don't produce textual-representation glyphs in this
case, I wonder how this can work at all. Because if we select some
non-Emoji font, it will probably not know about VS-15, so we will be
left with VS-15. Are we supposed to handle that ourselves, instead of
relying on the font and the shaping engine?
> Iʼve tried forcing font_range to use the font for the 'symbol' script
> for EMOJI+VS-15, instead, but that resulted in composition
> failing.
That's what I'd expect: non-Emoji fonts don't know about VS-15.
What does HarfBuzz's hb-view do with such sequences, when using Noto
Color Emoji font?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Fri, 02 Jun 2023 13:59:01 GMT)
Message #141 received at 63731 <at> debbugs.gnu.org (full text, mbox):
>>>>> On Fri, 02 Jun 2023 15:58:05 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>> CHAR+VS-15 and CHAR+VS-16 correctly choose text and emoji
>> representation, but CHAR+VS-15 results in the text representation only
>> if CHAR is not an emoji. If it is an emoji, the font selected for it
>> will always be the emoji font.
Eli> And an Emoji font, when presented with CHAR+VS-15 sequence doesn't
Eli> produce a textual-representation glyph for CHAR? I'd expect it to.
No.
Eli> If Emoji fonts don't produce textual-representation glyphs in this
Eli> case, I wonder how can this work at all. Because if we select some
Eli> non-Emoji font, it will probably not know about VS-15, so we will be
Eli> left with VS-15. Are we supposed to handle that ourselves, instead of
Eli> relying on the font and the shaping engine?
>> Iʼve tried forcing font_range to use the font for the 'symbol' script
>> for EMOJI+VS-15, instead, but that resulted in composition
>> failing.
Itʼs finding what appears to be the default system font, not whatʼs
specified in the fontset for 'symbol', so thatʼs one reason why
composition fails. Even with 'use-default-font-for-symbols' nil.
Eli> That's what I'd expect: non-Emoji fonts don't know about VS-15.
Right
Eli> What does HarfBuzz's hb-view do with such sequences, when using Noto
Eli> Color Emoji font?
Sequence   Font            Result
23e9 fe0e  system          black box
23e9 fe0e  Symbola         correct text representation
23e9 fe0e  NotoEmoji       correct text representation
23e9 fe0e  NotoColorEmoji  blank
And on emacs-29, Symbola and NotoEmoji compose that sequence
correctly. Now I just need to persuade emacs-30 to use one of them.
Robert
--
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Sat, 03 Jun 2023 05:37:02 GMT)
Message #144 received at 63731 <at> debbugs.gnu.org (full text, mbox):
> From: Robert Pluim <rpluim <at> gmail.com>
> Cc: 63731 <at> debbugs.gnu.org, steven <at> stebalien.com
> Date: Fri, 02 Jun 2023 15:58:37 +0200
>
> Eli> What does HarfBuzz's hb-view do with such sequences, when using Noto
> Eli> Color Emoji font?
>
> Sequence Font Result
> 23e9 fe0e system black box
> 23e9 fe0e Symbola correct text representation
> 23e9 fe0e NotoEmoji correct text representation
> 23e9 fe0e NotoColorEmoji blank
>
> And on emacs-29, Symbola and NotoEmoji compose that sequence
> correctly. Now I just need to persuade emacs-30 to use one of them.
So you are saying that, in our default fontset, we should specify that
#xFE0E should be displayed by Noto Emoji (with Symbola as fallback),
and then make sure that font_range uses the same font for the likes of
#x23E9? IOW, specify a different font for VS-15 even though its script
is 'emoji'?
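The fontset half of this idea can be tried from Lisp, as a rough sketch. It assumes fonts with the family names "Noto Emoji" and "Symbola" are installed, and it does nothing about font_range, so on its own it may not change how a sequence such as U+23E9 U+FE0E is displayed.

```elisp
;; Sketch only: ask the default fontset (FONTSET = t) to display VS-15
;; with Noto Emoji, and append Symbola as a fallback.  Both family
;; names are assumptions about what is installed on the system.
(set-fontset-font t #xFE0E (font-spec :family "Noto Emoji"))
(set-fontset-font t #xFE0E (font-spec :family "Symbola") nil 'append)
```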
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Mon, 05 Jun 2023 13:09:01 GMT)
Message #147 received at 63731 <at> debbugs.gnu.org (full text, mbox):
>>>>> On Sat, 03 Jun 2023 08:36:59 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>> From: Robert Pluim <rpluim <at> gmail.com>
>> Cc: 63731 <at> debbugs.gnu.org, steven <at> stebalien.com
>> Date: Fri, 02 Jun 2023 15:58:37 +0200
>>
Eli> What does HarfBuzz's hb-view do with such sequences, when using Noto
Eli> Color Emoji font?
>>
>> Sequence Font Result
>> 23e9 fe0e system black box
>> 23e9 fe0e Symbola correct text representation
>> 23e9 fe0e NotoEmoji correct text representation
>> 23e9 fe0e NotoColorEmoji blank
>>
>> And on emacs-29, Symbola and NotoEmoji compose that sequence
>> correctly. Now I just need to persuade emacs-30 to use one of them.
Eli> So you are saying that, in our default fontset, we should specify that
Eli> #xFE0E should be displayed by Noto Emoji (with Symbola as fallback),
Eli> and then make sure that font_range uses the same font for the likes of
Eli> #x23E9? IOW, specify a different font for VS-15 even though is script
Eli> is 'emoji'?
Yes, that works (and we can remove VS-15 and VS-16 from the emoji
script, so that theyʼll then be displayed via
`glyphless-char-display-control' when theyʼre on their own).
Thanks for the suggestion Eli, I was looking at it from the wrong
direction.
Robert
--
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Mon, 05 Jun 2023 13:13:02 GMT)
Message #150 received at 63731 <at> debbugs.gnu.org (full text, mbox):
> From: Robert Pluim <rpluim <at> gmail.com>
> Cc: 63731 <at> debbugs.gnu.org, steven <at> stebalien.com
> Date: Mon, 05 Jun 2023 15:08:08 +0200
>
> >>>>> On Sat, 03 Jun 2023 08:36:59 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>
> >> Sequence Font Result
> >> 23e9 fe0e system black box
> >> 23e9 fe0e Symbola correct text representation
> >> 23e9 fe0e NotoEmoji correct text representation
> >> 23e9 fe0e NotoColorEmoji blank
> >>
> >> And on emacs-29, Symbola and NotoEmoji compose that sequence
> >> correctly. Now I just need to persuade emacs-30 to use one of them.
>
> Eli> So you are saying that, in our default fontset, we should specify that
> Eli> #xFE0E should be displayed by Noto Emoji (with Symbola as fallback),
> Eli> and then make sure that font_range uses the same font for the likes of
> Eli> #x23E9? IOW, specify a different font for VS-15 even though is script
> Eli> is 'emoji'?
>
> Yes, that works (and we can remove VS-15 and VS-16 from the emoji
> script, so that theyʼll then be displayed via
> `glyphless-char-display-control' when theyʼre on their own).
What about the rest of VS-nn? Do they need to stay in the 'emoji' script,
and if so, why?
> Thanks for the suggestion Eli, I was looking at it from the wrong
> direction.
You are the one who did most of the footwork, so kudos to you.
This is simple enough to install on emacs-29, I think?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Mon, 05 Jun 2023 13:33:01 GMT)
Message #153 received at 63731 <at> debbugs.gnu.org (full text, mbox):
> Cc: 63731 <at> debbugs.gnu.org, steven <at> stebalien.com
> Date: Mon, 05 Jun 2023 16:12:20 +0300
> From: Eli Zaretskii <eliz <at> gnu.org>
>
> > Eli> So you are saying that, in our default fontset, we should specify that
> > Eli> #xFE0E should be displayed by Noto Emoji (with Symbola as fallback),
> > Eli> and then make sure that font_range uses the same font for the likes of
> > Eli> #x23E9? IOW, specify a different font for VS-15 even though is script
> > Eli> is 'emoji'?
> >
> > Yes, that works (and we can remove VS-15 and VS-16 from the emoji
> > script, so that theyʼll then be displayed via
> > `glyphless-char-display-control' when theyʼre on their own).
>
> What about the rest of VS-nn? do they need to stay in 'emoji' script,
> and if so, why?
And one more question: if we remove VS-16 from the emoji script, what
will happen to the sequences like U+23E9 U+FE0F? Isn't it true that
we use a color Emoji font for those because VS-16 is in emoji script?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Mon, 05 Jun 2023 13:37:02 GMT)
Message #156 received at 63731 <at> debbugs.gnu.org (full text, mbox):
>>>>> On Mon, 05 Jun 2023 16:12:20 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>> From: Robert Pluim <rpluim <at> gmail.com>
>> Cc: 63731 <at> debbugs.gnu.org, steven <at> stebalien.com
>> Date: Mon, 05 Jun 2023 15:08:08 +0200
>>
>> >>>>> On Sat, 03 Jun 2023 08:36:59 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>>
>> >> Sequence Font Result
>> >> 23e9 fe0e system black box
>> >> 23e9 fe0e Symbola correct text representation
>> >> 23e9 fe0e NotoEmoji correct text representation
>> >> 23e9 fe0e NotoColorEmoji blank
>> >>
>> >> And on emacs-29, Symbola and NotoEmoji compose that sequence
>> >> correctly. Now I just need to persuade emacs-30 to use one of them.
>>
Eli> So you are saying that, in our default fontset, we should specify that
Eli> #xFE0E should be displayed by Noto Emoji (with Symbola as fallback),
Eli> and then make sure that font_range uses the same font for the likes of
Eli> #x23E9? IOW, specify a different font for VS-15 even though is script
Eli> is 'emoji'?
>>
>> Yes, that works (and we can remove VS-15 and VS-16 from the emoji
>> script, so that theyʼll then be displayed via
>> `glyphless-char-display-control' when theyʼre on their own).
Eli> What about the rest of VS-nn? do they need to stay in 'emoji' script,
Eli> and if so, why?
They were never in the 'emoji' script anyway.
>> Thanks for the suggestion Eli, I was looking at it from the wrong
>> direction.
Eli> You are the one who did most of the footwork, so kudos to you.
Eli> This is simple enough to install on emacs-29, I think?
The main change is in font.c, and looks like this. I think itʼs too
big for emacs-29 (breaking composition is very easy, itʼs entirely
possible Iʼve missed a few cases :-) )
diff --git a/src/font.c b/src/font.c
index e586277a5d3..30b088c818e 100644
--- a/src/font.c
+++ b/src/font.c
@@ -3633,10 +3633,14 @@ font_at (int c, ptrdiff_t pos, struct face *face, struct window *w,
 /* Check if CH is a codepoint for which we should attempt to use the
    emoji font, even if the codepoint itself has Emoji_Presentation =
    No. Vauto_composition_emoji_eligible_codepoints is filled in for
-   us by admin/unidata/emoji-zwj.awk. */
+   us by admin/unidata/emoji-zwj.awk. We also check if there's a
+   VS-15 or VS-16 following CH, and select text/emoji presentation
+   respectively if so. */
 static bool
-codepoint_is_emoji_eligible (int ch)
+codepoint_is_font_change_eligible (int ch, int next_c)
 {
+  if (next_c == 0xFE0E || next_c == 0xFE0F)
+    return true;
   if (EQ (CHAR_TABLE_REF (Vchar_script_table, ch), Qemoji))
     return true;
@@ -3690,21 +3694,43 @@ font_range (ptrdiff_t pos, ptrdiff_t pos_byte, ptrdiff_t *limit,
         }
       face = FACE_FROM_ID (f, face_id);
     }
-
-  /* If the composition was triggered by an emoji, use a character
-     from 'script-representative-chars', rather than the first
-     character in the string, to determine the font to use. */
-  if (codepoint_is_emoji_eligible (ch))
+  int next_c = 0;
+  {
+    ptrdiff_t p = pos;
+    ptrdiff_t p_b = pos_byte;
+    int c;
+    c = (NILP (string)
+         ? fetch_char_advance_no_check (&p, &p_b)
+         : fetch_string_char_advance_no_check (string, &p, &p_b));
+    if (p < *limit)
+      {
+        c = (NILP (string)
+             ? fetch_char_advance_no_check (&p, &p_b)
+             : fetch_string_char_advance_no_check (string, &p, &p_b));
+        next_c = c;
+      }
+  }
+  if (codepoint_is_font_change_eligible (ch, next_c))
     {
-      Lisp_Object val = assq_no_quit (Qemoji, Vscript_representative_chars);
-      if (CONSP (val))
+      if (next_c == 0xFE0E)
         {
-          val = XCDR (val);
+          font_object = font_for_char (face, 0xFE0E, pos, string);
+        }
+      else
+        {
+          /* If the composition was triggered by an emoji, use a character
+             from 'script-representative-chars', rather than the first
+             character in the string, to determine the font to use. */
+          Lisp_Object val = assq_no_quit (Qemoji, Vscript_representative_chars);
           if (CONSP (val))
-            val = XCAR (val);
-          else if (VECTORP (val))
-            val = AREF (val, 0);
-          font_object = font_for_char (face, XFIXNAT (val), pos, string);
+            {
+              val = XCDR (val);
+              if (CONSP (val))
+                val = XCAR (val);
+              else if (VECTORP (val))
+                val = AREF (val, 0);
+              font_object = font_for_char (face, XFIXNAT (val), pos, string);
+            }
         }
     }
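
As an aside for readers, the lookahead in the patch can be modelled outside Emacs roughly as follows. This is only a sketch under stated assumptions: the text is a plain array of code points rather than a buffer or Lisp string, and `peek_next_char`, `requested_presentation` and the `PRES_*` names are hypothetical, not Emacs functions.

```c
#include <stddef.h>

enum { VS15 = 0xFE0E, VS16 = 0xFE0F };

enum presentation { PRES_DEFAULT, PRES_TEXT, PRES_EMOJI };

/* Hypothetical stand-in for the two fetch calls in the patch: return
   the code point following POS, or 0 when there is none before LIMIT.
   TEXT is assumed to be an array of already-decoded code points.  */
static int
peek_next_char (const int *text, size_t pos, size_t limit)
{
  return (pos + 1 < limit) ? text[pos + 1] : 0;
}

/* Report which presentation the character at POS asks for, based only
   on whether a variation selector follows it.  */
static enum presentation
requested_presentation (const int *text, size_t pos, size_t limit)
{
  int next_c = peek_next_char (text, pos, limit);

  if (next_c == VS15)
    return PRES_TEXT;   /* the patch looks up the font via 0xFE0E here */
  if (next_c == VS16)
    return PRES_EMOJI;  /* the patch falls back to the emoji script font */
  return PRES_DEFAULT;
}
```

With the array `{ 0x23E9, 0xFE0E }` and a limit of 2, position 0 asks for text presentation; with a limit of 1 nothing is peeked, which mirrors the `p < *limit` guard in the diff.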
Robert
--
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Mon, 05 Jun 2023 13:48:02 GMT)
Message #159 received at 63731 <at> debbugs.gnu.org (full text, mbox):
> From: Robert Pluim <rpluim <at> gmail.com>
> Cc: 63731 <at> debbugs.gnu.org, steven <at> stebalien.com
> Date: Mon, 05 Jun 2023 15:36:52 +0200
>
> >>>>> On Mon, 05 Jun 2023 16:12:20 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>
> Eli> This is simple enough to install on emacs-29, I think?
>
> The main change is in font.c, and looks like this. I think itʼs too
> big for emacs-29 (breaking composition is very easy, itʼs entirely
> possible Iʼve missed a few cases :-) )
Hmm... I thought just changing the fontset in fontset.el would be
enough.
OK, so I guess master it is, then.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Mon, 05 Jun 2023 14:07:01 GMT)
Message #162 received at 63731 <at> debbugs.gnu.org (full text, mbox):
>>>>> On Mon, 05 Jun 2023 16:31:58 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>> Cc: 63731 <at> debbugs.gnu.org, steven <at> stebalien.com
>> Date: Mon, 05 Jun 2023 16:12:20 +0300
>> From: Eli Zaretskii <eliz <at> gnu.org>
>>
>> > Eli> So you are saying that, in our default fontset, we should specify that
>> > Eli> #xFE0E should be displayed by Noto Emoji (with Symbola as fallback),
>> > Eli> and then make sure that font_range uses the same font for the likes of
>> > Eli> #x23E9? IOW, specify a different font for VS-15 even though its script
>> > Eli> is 'emoji'?
>> >
>> > Yes, that works (and we can remove VS-15 and VS-16 from the emoji
>> > script, so that theyʼll then be displayed via
>> > `glyphless-char-display-control' when theyʼre on their own).
>>
>> What about the rest of VS-nn? do they need to stay in 'emoji' script,
>> and if so, why?
Eli> And one more question: if we remove VS-16 from the emoji script, what
Eli> will happen to the sequences like U+23E9 U+FE0F? Isn't it true that
Eli> we use a color Emoji font for those because VS-16 is in emoji script?
Not anymore. Now we have a forward composition rule for U+23E9
U+FE0F that triggers because U+23E9 is in the emoji script, which is
why U+23E9 U+FE0E also uses the emoji font (currently).
For non-emoji codepoints like U+203C, adding U+FE0F uses the emoji
font because U+FE0F is in the emoji script (and thereʼs no composition
rule for U+203C, so the backwards looking one for U+FE0F is used).
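
The two cases described above can be written down as a small decision function. This is a toy model of the explanation in this message only, not of the real composition machinery; `route_before_patch` and the `ROUTE_*` names are hypothetical, and any case the message does not mention is reported as not covered rather than guessed.

```c
#include <stdbool.h>

enum { VS15 = 0xFE0E, VS16 = 0xFE0F };

enum font_route
{
  ROUTE_FORWARD_RULE_EMOJI_FONT,   /* rule keyed on the base character */
  ROUTE_BACKWARD_RULE_EMOJI_FONT,  /* rule keyed on the trailing U+FE0F */
  ROUTE_NOT_COVERED                /* not described in the message */
};

/* BASE_IN_EMOJI_SCRIPT says whether the base character is in the
   'emoji' script (true for U+23E9, false for U+203C in the examples
   above).  NEXT_C is the character following the base.  */
static enum font_route
route_before_patch (bool base_in_emoji_script, int next_c)
{
  bool has_vs = (next_c == VS15 || next_c == VS16);

  /* Emoji-script base: the forward rule fires for either selector,
     which is why U+23E9 U+FE0E currently gets the emoji font too.  */
  if (base_in_emoji_script && has_vs)
    return ROUTE_FORWARD_RULE_EMOJI_FONT;

  /* Non-emoji base followed by VS-16: no forward rule exists, so the
     backward-looking rule for U+FE0F is used.  */
  if (!base_in_emoji_script && next_c == VS16)
    return ROUTE_BACKWARD_RULE_EMOJI_FONT;

  return ROUTE_NOT_COVERED;
}
```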
Robert
--
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Mon, 05 Jun 2023 14:28:02 GMT)
Message #165 received at 63731 <at> debbugs.gnu.org (full text, mbox):
>>>>> On Mon, 05 Jun 2023 16:47:22 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>> From: Robert Pluim <rpluim <at> gmail.com>
>> Cc: 63731 <at> debbugs.gnu.org, steven <at> stebalien.com
>> Date: Mon, 05 Jun 2023 15:36:52 +0200
>>
>> >>>>> On Mon, 05 Jun 2023 16:12:20 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>>
Eli> This is simple enough to install on emacs-29, I think?
>>
>> The main change is in font.c, and looks like this. I think itʼs too
>> big for emacs-29 (breaking composition is very easy, itʼs entirely
>> possible Iʼve missed a few cases :-) )
Eli> Hmm... I thought just changing the fontset in fontset.el would be
Eli> enough.
Itʼs almost enough to do that, and to check if the triggering
character is U+FE0E, but then we fall foul of the composition rule
forward/backward issue again.
If we could have forward and backwards looking rules working together,
then font_range would get passed U+FE0F or U+FE0E as the triggering
character, it could choose the font, and there would be no need to
peek at the next character.
Robert
--
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Mon, 05 Jun 2023 15:36:01 GMT)
Message #168 received at 63731 <at> debbugs.gnu.org (full text, mbox):
> From: Robert Pluim <rpluim <at> gmail.com>
> Cc: 63731 <at> debbugs.gnu.org, steven <at> stebalien.com
> Date: Mon, 05 Jun 2023 16:27:28 +0200
>
> Eli> Hmm... I thought just changing the fontset in fontset.el would be
> Eli> enough.
>
> Itʼs almost enough to do that, and to check if the triggering
> character is U+FE0E, but then we fall foul of the composition rule
> forward/backward issue again.
Which forward rules would conflict with a backward rule triggered by
U+FE0E?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Mon, 05 Jun 2023 15:58:02 GMT)
Message #171 received at 63731 <at> debbugs.gnu.org (full text, mbox):
>>>>> On Mon, 05 Jun 2023 18:35:37 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>> From: Robert Pluim <rpluim <at> gmail.com>
>> Cc: 63731 <at> debbugs.gnu.org, steven <at> stebalien.com
>> Date: Mon, 05 Jun 2023 16:27:28 +0200
>>
Eli> Hmm... I thought just changing the fontset in fontset.el would be
Eli> enough.
>>
>> Itʼs almost enough to do that, and to check if the triggering
>> character is U+FE0E, but then we fall foul of the composition rule
>> forward/backward issue again.
Eli> Which forward rules would conflict with a backward rule triggered by
Eli> U+FE0E?
All the ones for the non-emoji codepoints that still need to be
composed as emoji sometimes, eg U+261D:
"\N{U+261D}"
"\N{U+261D}\N{U+1F3FB}"
"\N{U+261D}\N{U+1F3FC}"
"\N{U+261D}\N{U+1F3FD}"
"\N{U+261D}\N{U+1F3FE}"
"\N{U+261D}\N{U+1F3FF}"
to which we add:
"\N{U+261D}\N{U+FE0E}"
"\N{U+261D}\N{U+FE0F}"
(and not adding those doesnʼt help).
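
To make the shape of that list concrete, here is a sketch that enumerates the same eight sequences for one base code point. It is an illustration only: `struct emoji_seq` and `sequences_for_base` are hypothetical names, and the real rules are generated elsewhere in Emacs, not by code like this.

```c
#include <stddef.h>

#define SKIN_TONE_FIRST 0x1F3FB
#define SKIN_TONE_LAST  0x1F3FF

/* One sequence of at most two code points; LEN is 1 or 2.  */
struct emoji_seq
{
  int len;
  int cp[2];
};

/* Fill OUT (which must have room for 8 entries) with the sequences
   listed above for BASE: the bare base, base + each skin tone
   modifier, base + VS-15 and base + VS-16.  Returns the count.  */
static size_t
sequences_for_base (int base, struct emoji_seq *out)
{
  size_t n = 0;

  out[n++] = (struct emoji_seq) { 1, { base, 0 } };
  for (int m = SKIN_TONE_FIRST; m <= SKIN_TONE_LAST; m++)
    out[n++] = (struct emoji_seq) { 2, { base, m } };
  out[n++] = (struct emoji_seq) { 2, { base, 0xFE0E } };
  out[n++] = (struct emoji_seq) { 2, { base, 0xFE0F } };
  return n;
}
```

Every entry starts with the base character, so all eight would be forward rules keyed on U+261D, which is the situation described above.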
Robert
--
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Mon, 05 Jun 2023 16:21:01 GMT)
Message #174 received at 63731 <at> debbugs.gnu.org (full text, mbox):
>>>>> On Mon, 05 Jun 2023 17:57:04 +0200, Robert Pluim <rpluim <at> gmail.com> said:
>>>>> On Mon, 05 Jun 2023 18:35:37 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>>> From: Robert Pluim <rpluim <at> gmail.com>
>>> Cc: 63731 <at> debbugs.gnu.org, steven <at> stebalien.com
>>> Date: Mon, 05 Jun 2023 16:27:28 +0200
>>>
Eli> Hmm... I thought just changing the fontset in fontset.el would be
Eli> enough.
>>>
>>> Itʼs almost enough to do that, and to check if the triggering
>>> character is U+FE0E, but then we fall foul of the composition rule
>>> forward/backward issue again.
Eli> Which forward rules would conflict with a backward rule triggered by
Eli> U+FE0E?
Robert> All the ones for the non-emoji codepoints that still need to be
Robert> composed as emoji sometimes, eg U+261D:
Oh, and all the <foo>+skin tone ones. And probably more.
Robert
--
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Mon, 05 Jun 2023 16:40:02 GMT)
Message #177 received at 63731 <at> debbugs.gnu.org (full text, mbox):
> From: Robert Pluim <rpluim <at> gmail.com>
> Cc: 63731 <at> debbugs.gnu.org, steven <at> stebalien.com
> Date: Mon, 05 Jun 2023 17:57:04 +0200
>
> >>>>> On Mon, 05 Jun 2023 18:35:37 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>
> Eli> Which forward rules would conflict with a backward rule triggered by
> Eli> U+FE0E?
>
> All the ones for the non-emoji codepoints that still need to be
> composed as emoji sometimes, eg U+261D:
>
> "\N{U+261D}"
> "\N{U+261D}\N{U+1F3FB}"
> "\N{U+261D}\N{U+1F3FC}"
> "\N{U+261D}\N{U+1F3FD}"
> "\N{U+261D}\N{U+1F3FE}"
> "\N{U+261D}\N{U+1F3FF}"
Couldn't we put these in the slots of #x1F3FB..#x1F3FF instead, as
backward rules? As long as we don't have a forward rule starting with
#x261D, we could have backward rules for it triggered by #x1F3Fx and
#xFE0x, right?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Mon, 05 Jun 2023 16:43:02 GMT)
Message #180 received at 63731 <at> debbugs.gnu.org (full text, mbox):
> From: Robert Pluim <rpluim <at> gmail.com>
> Cc: 63731 <at> debbugs.gnu.org, steven <at> stebalien.com
> Date: Mon, 05 Jun 2023 18:20:08 +0200
>
> Eli> Which forward rules would conflict with a backward rule triggered by
> Eli> U+FE0E?
>
> Robert> All the ones for the non-emoji codepoints that still need to be
> Robert> composed as emoji sometimes, eg U+261D:
>
> Oh, and all the <foo>+skin tone ones. And probably more.
What do you mean by <foo>+skin? Can you give a few examples?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Tue, 06 Jun 2023 07:25:01 GMT)
Message #183 received at 63731 <at> debbugs.gnu.org (full text, mbox):
>>>>> On Mon, 05 Jun 2023 19:41:55 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>> From: Robert Pluim <rpluim <at> gmail.com>
>> Cc: 63731 <at> debbugs.gnu.org, steven <at> stebalien.com
>> Date: Mon, 05 Jun 2023 18:20:08 +0200
>>
Eli> Which forward rules would conflict with a backward rule triggered by
Eli> U+FE0E?
>>
Robert> All the ones for the non-emoji codepoints that still need to be
Robert> composed as emoji sometimes, eg U+261D:
>>
>> Oh, and all the <foo>+skin tone ones. And probably more.
Eli> What do you mean by <foo>+skin? Can you give a few examples?
Anything using 1F3FB..1F3FF, such as 1F44B 1F3FB or 1F3C4 1F3FB
Robert
--
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Tue, 06 Jun 2023 07:29:01 GMT)
Message #186 received at 63731 <at> debbugs.gnu.org (full text, mbox):
>>>>> On Mon, 05 Jun 2023 19:39:37 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>> From: Robert Pluim <rpluim <at> gmail.com>
>> Cc: 63731 <at> debbugs.gnu.org, steven <at> stebalien.com
>> Date: Mon, 05 Jun 2023 17:57:04 +0200
>>
>> >>>>> On Mon, 05 Jun 2023 18:35:37 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>>
Eli> Which forward rules would conflict with a backward rule triggered by
Eli> U+FE0E?
>>
>> All the ones for the non-emoji codepoints that still need to be
>> composed as emoji sometimes, eg U+261D:
>>
>> "\N{U+261D}"
>> "\N{U+261D}\N{U+1F3FB}"
>> "\N{U+261D}\N{U+1F3FC}"
>> "\N{U+261D}\N{U+1F3FD}"
>> "\N{U+261D}\N{U+1F3FE}"
>> "\N{U+261D}\N{U+1F3FF}"
Eli> Couldn't we put these in the slots of #x1F3FB..#x1F3FF instead, as
Eli> backward rules? As long as we don't have a forward rule starting with
Eli> #x261D, we could have backward rules for it triggered by #x1F3Fx and
Eli> #xFE0x, right?
Yes, we could invert the whole composition rules setup, and make them
all work backwards, but then it will almost certainly all break again
with the next release of Unicode. Adding a special case for FE0E in
font_range is going to be more robust.
Robert
--
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#63731; Package emacs. (Tue, 06 Jun 2023 11:54:02 GMT)
Message #189 received at 63731 <at> debbugs.gnu.org (full text, mbox):
> From: Robert Pluim <rpluim <at> gmail.com>
> Cc: 63731 <at> debbugs.gnu.org, steven <at> stebalien.com
> Date: Tue, 06 Jun 2023 09:28:04 +0200
>
> >>>>> On Mon, 05 Jun 2023 19:39:37 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>
> >> From: Robert Pluim <rpluim <at> gmail.com>
> >> Cc: 63731 <at> debbugs.gnu.org, steven <at> stebalien.com
> >> Date: Mon, 05 Jun 2023 17:57:04 +0200
> >>
> >> >>>>> On Mon, 05 Jun 2023 18:35:37 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
> >>
> Eli> Which forward rules would conflict with a backward rule triggered by
> Eli> U+FE0E?
> >>
> >> All the ones for the non-emoji codepoints that still need to be
> >> composed as emoji sometimes, eg U+261D:
> >>
> >> "\N{U+261D}"
> >> "\N{U+261D}\N{U+1F3FB}"
> >> "\N{U+261D}\N{U+1F3FC}"
> >> "\N{U+261D}\N{U+1F3FD}"
> >> "\N{U+261D}\N{U+1F3FE}"
> >> "\N{U+261D}\N{U+1F3FF}"
>
> Eli> Couldn't we put these in the slots of #x1F3FB..#x1F3FF instead, as
> Eli> backward rules? As long as we don't have a forward rule starting with
> Eli> #x261D, we could have backward rules for it triggered by #x1F3Fx and
> Eli> #xFE0x, right?
>
> Yes, we could invert the whole composition rules setup, and make them
> all work backwards, but then it will almost certainly all break again
> with the next release of Unicode. Adding a special case for FE0E in
> font_range is going to be more robust.
I don't think it could break, since such sequences are all likely to
be triggered by special codepoints that follow the U+2xxx characters.
Our win would be a much simpler setup.
But okay, let's try to do it this way.
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Wed, 05 Jul 2023 11:24:06 GMT)