GNU bug report logs - #33944
27.0.50; harfbuzz: Noto Sans Mandaic not rendered correctly

Package: emacs

Reported by: Benjamin Riefenstahl <b.riefenstahl <at> turtle-trading.net>

Date: Tue, 1 Jan 2019 14:37:02 UTC

Severity: normal

Found in version 27.0.50

Done: Benjamin Riefenstahl <b.riefenstahl <at> turtle-trading.net>

Bug is archived. No further changes may be made.


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#33944; Package emacs. (Tue, 01 Jan 2019 14:37:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Benjamin Riefenstahl <b.riefenstahl <at> turtle-trading.net>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Tue, 01 Jan 2019 14:37:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Benjamin Riefenstahl <b.riefenstahl <at> turtle-trading.net>
To: bug-gnu-emacs <at> gnu.org
Subject: 27.0.50; harfbuzz: Noto Sans Mandaic not rendered correctly
Date: Tue, 01 Jan 2019 15:36:40 +0100

This may be a more general problem, but Noto Sans Mandaic reproduces it
for me:

I am running Emacs from the Harfbuzz branch.

I have the font "Noto Sans Mandaic" installed (from
https://noto-website-2.storage.googleapis.com/pkgs/NotoSansMandaic-hinted.zip).

I save this code in a file "reproduce.el" and execute it with "emacs -Q
-l reproduce.el":

    (set-fontset-font t '(?\u0840 . ?\u085B) "Noto Sans Mandaic 20")

    (set-char-table-range
     composition-function-table '(?\u0840 . ?\u085B)
     (list ["[\u0840-\u085B]+" 0 arabic-shape-gstring]))

    (setq bidi-paragraph-direction t)
    (insert "\u0856\u0844\u0845")

The problem is that the second and third characters from the right are
not combined as they should be.  This works in hb-view, and it also works
if I remove the setting of bidi-paragraph-direction.

The commit that breaks this is the last one, 48776b7011 "Provide text
directionality and language to HarfBuzz shaper".  Before that commit it
works for me.

----

In GNU Emacs 27.0.50 (build 2, x86_64-pc-linux-gnu, GTK+ Version 3.22.11)
 of 2018-12-30 built on arrian
Repository revision: 48776b70115edf3775df19d80f734048dadff198
Repository branch: harfbuzz
Windowing system distributor 'The X.Org Foundation', version 11.0.11902000
System Description: Debian GNU/Linux 9 (stretch)

Configured using:
 'configure --with-harfbuzz --without-m17n-flt'

Configured features:
XPM JPEG TIFF GIF PNG RSVG IMAGEMAGICK SOUND GPM DBUS GSETTINGS GLIB
NOTIFY INOTIFY ACL LIBSELINUX GNUTLS LIBXML2 FREETYPE HARFBUZZ LIBOTF
XFT ZLIB TOOLKIT_SCROLL_BARS GTK3 X11 XDBE XIM THREADS LCMS2 GMP

Important settings:
  value of $LANG: en_US.UTF-8
  locale-coding-system: utf-8-unix

Major mode: Buffer Menu

Minor modes in effect:
  shell-dirtrack-mode: t
  desktop-save-mode: t
  display-time-mode: t
  diff-auto-refine-mode: t
  delete-selection-mode: t
  cua-mode: t
  tooltip-mode: t
  global-eldoc-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  buffer-read-only: t
  column-number-mode: t
  line-number-mode: t
  transient-mark-mode: t

Load-path shadows:
None found.

Features:
(shadow sort mail-extr emacsbug message rmc puny rfc822 mml mml-sec epa
mm-decode mm-bodies mm-encode mail-parse rfc2231 mailabbrev gmm-utils
mailheader sendmail pcmpl-unix tramp trampver tramp-compat
tramp-loaddefs ucs-normalize parse-time format-spec nnheader gnus-util
rmail rmail-loaddefs rfc2047 rfc2045 ietf-drums time-date mail-utils
dirtrack shell pcomplete eieio-opt speedbar sb-image ezimage dframe
find-func thingatpt help-fns tabify imenu elec-pair desktop frameset
highline benny-calendar-cfg ange-ftp comint ansi-color ring
benny-unicode generic-x cl autoinsert cc-mode cc-fonts cc-guess cc-menus
cc-styles cc-align cc-cmds cc-engine cc-vars cc-defs ps-print
ps-print-loaddefs ps-def lpr advice dired dired-loaddefs
benny-x-clipboard disp-table mm-util mail-prsvr time server protbuf
cal-china lunar solar cal-dst cal-bahai cal-islam cal-hebrew holidays
hol-loaddefs vc-git diff-mode easy-mmode diary-lib diary-loaddefs
cal-menu calendar cal-loaddefs delsel cua-base .loaddefs benny-tools
browse-url autoload radix-tree lisp-mnt mule-util cus-edit cus-start
cus-load wid-edit info finder-inf package let-alist derived pcase
cl-extra help-mode easymenu url-handlers url-parse auth-source cl-seq
eieio eieio-core cl-macs eieio-loaddefs password-cache json map url-vars
seq byte-opt gv bytecomp byte-compile cconv epg epg-config subr-x
cl-loaddefs cl-lib tooltip eldoc electric uniquify ediff-hook vc-hooks
lisp-float-type mwheel term/x-win x-win term/common-win x-dnd tool-bar
dnd fontset image regexp-opt fringe tabulated-list replace newcomment
text-mode elisp-mode lisp-mode prog-mode register page menu-bar
rfn-eshadow isearch timer select scroll-bar mouse jit-lock font-lock
syntax facemenu font-core term/tty-colors frame cl-generic cham georgian
utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao korean
japanese eucjp-ms cp51932 hebrew greek romanian slovak czech european
ethiopic indian cyrillic chinese composite charscript charprop
case-table epa-hook jka-cmpr-hook help simple abbrev obarray minibuffer
cl-preloaded nadvice loaddefs button faces cus-face macroexp files
text-properties overlay sha1 md5 base64 format env code-pages mule
custom widget hashtable-print-readable backquote threads dbusbind
inotify lcms2 dynamic-setting system-font-setting font-render-setting
move-toolbar gtk x-toolkit x multi-tty make-network-process emacs)




Message #8 received at 33944 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Benjamin Riefenstahl <b.riefenstahl <at> turtle-trading.net>
Cc: 33944 <at> debbugs.gnu.org
Subject: Re: bug#33944: 27.0.50;
 harfbuzz: Noto Sans Mandaic not rendered correctly
Date: Tue, 01 Jan 2019 17:28:42 +0200

> From: Benjamin Riefenstahl <b.riefenstahl <at> turtle-trading.net>
> Date: Tue, 01 Jan 2019 15:36:40 +0100
> 
> This may be a more general problem, but Noto Sans Mandaic reproduces it
> for me:

You mean, with other fonts that support Arabic shaping the problem
doesn't happen?

> I save this code in a file "reproduce.el" and execute it with "emacs -Q
> -l reproduce.el":
> 
>     (set-fontset-font t '(?\u0840 . ?\u085B) "Noto Sans Mandaic 20")
> 
>     (set-char-table-range
>      composition-function-table '(?\u0840 . ?\u085B)
>      (list ["[\u0840-\u085B]+" 0 arabic-shape-gstring]))
> 
>     (setq bidi-paragraph-direction t)

From the doc string of bidi-paragraph-direction:

  If this is nil (the default), the direction of each paragraph is
  determined by the first strong directional character of its text.
  The values of ‘right-to-left’ and ‘left-to-right’ override that.
  Any other value is treated as nil.
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Did you set it to t on purpose?  If so, can you explain why?

>     (insert "\u0856\u0844\u0845")
> 
> The problem is that the second and third characters from the right are
> not combined as they should be.  This works in hb-view, and it also works
> if I remove the setting of bidi-paragraph-direction.

What happens if bidi-paragraph-direction is set to one of the valid
values?

> The commit that breaks this is the last one, 48776b7011 "Provide text
> directionality and language to HarfBuzz shaper".  Before that commit it
> works for me.

Can you run Emacs under a debugger and see what value of 'dir' we
come up with in this snippet from ftfont.c:

  hb_direction_t dir = HB_DIRECTION_INVALID;
  if (EQ (direction, QL2R))
    dir = HB_DIRECTION_LTR;
  else if (EQ (direction, QR2L))
    dir = HB_DIRECTION_RTL;
  /* If the caller didn't provide a meaningful DIRECTION, let HarfBuzz
     guess it.  */
  if (dir != HB_DIRECTION_INVALID)
    hb_buffer_set_direction (hb_buffer, dir);

Do we call hb_buffer_set_direction, and if so, with what value?
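
For experimenting with this mapping outside Emacs, here is a minimal
self-contained sketch of the same if/else chain.  It is only a model: the
enum types and the names sketch_map_direction and sketch_sets_direction
are hypothetical stand-ins for the Lisp DIRECTION argument and the
HarfBuzz constants, and it links against neither Emacs nor HarfBuzz.

```c
/* Stand-ins for HB_DIRECTION_INVALID/LTR/RTL (assumption: the real
   constants come from <hb.h>; their numeric values are not modeled).  */
typedef enum { DIR_INVALID, DIR_LTR, DIR_RTL } sketch_direction_t;

/* Stand-in for the Lisp DIRECTION argument (hypothetical type).  */
typedef enum { SYM_NIL, SYM_L2R, SYM_R2L, SYM_OTHER } sketch_symbol_t;

/* Same shape as the snippet: only L2R and R2L produce a usable
   direction; every other value leaves DIR_INVALID in place.  */
sketch_direction_t
sketch_map_direction (sketch_symbol_t direction)
{
  sketch_direction_t dir = DIR_INVALID;
  if (direction == SYM_L2R)
    dir = DIR_LTR;
  else if (direction == SYM_R2L)
    dir = DIR_RTL;
  return dir;
}

/* Nonzero when the snippet would call hb_buffer_set_direction,
   zero when the shaper would be left to guess the direction.  */
int
sketch_sets_direction (sketch_symbol_t direction)
{
  return sketch_map_direction (direction) != DIR_INVALID;
}
```

In this model, a caller that passes neither L2R nor R2L never reaches
the explicit direction call, which is the "let HarfBuzz guess" case in
the comment above.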

Thanks.




Message #11 received at 33944 <at> debbugs.gnu.org (full text, mbox):

From: Benjamin Riefenstahl <b.riefenstahl <at> turtle-trading.net>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 33944 <at> debbugs.gnu.org
Subject: Re: bug#33944: 27.0.50;
 harfbuzz: Noto Sans Mandaic not rendered correctly
Date: Tue, 01 Jan 2019 19:34:15 +0100

Eli Zaretskii writes:
> You mean, with other fonts that support Arabic shaping the problem
> doesn't happen?

Now that you asked, I get the same problem with some words in Syriac,
like with this:

    (set-fontset-font t '(?\u0700 . ?\u07FF) "Serto Mardin 20")
    (setq bidi-paragraph-direction 'right-to-left)
    (insert "\u0718\u0726\u0720\u0713\u0717\u073F")

The characters should all be connected, but the third and fourth are not
in this case.

> From the doc string of bidi-paragraph-direction:
>
>   If this is nil (the default), the direction of each paragraph is
>   determined by the first strong directional character of its text.
>   The values of ‘right-to-left’ and ‘left-to-right’ override that.
>   Any other value is treated as nil.
>   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> Did you set it to t on purpose?  If so, can you explain why?

No, that was a mistake.  OTOH it seems that "is treated as nil" is not
true, because "t" has the same effect as "right-to-left".

> What happens if bidi-paragraph-direction is set to one of the valid
> values?

With right-to-left the same happens, with left-to-right it's good again
(same as with the default).

>> The commit that breaks this is the last one, 48776b7011 "Provide text
>> directionality and language to HarfBuzz shaper".  Before that commit it
>> works for me.
>
> Can you run Emacs under a debugger and see what value of 'dir' we
> come up with in this snippet from ftfont.c:
>
>   hb_direction_t dir = HB_DIRECTION_INVALID;
>   if (EQ (direction, QL2R))
>     dir = HB_DIRECTION_LTR;
>   else if (EQ (direction, QR2L))
>     dir = HB_DIRECTION_RTL;
>   /* If the caller didn't provide a meaningful DIRECTION, let HarfBuzz
>      guess it.  */
>   if (dir != HB_DIRECTION_INVALID)
>     hb_buffer_set_direction (hb_buffer, dir);
>
> Do we call hb_buffer_set_direction, and if so, with what value?

With "t" or right-to-left we have dir == HB_DIRECTION_LTR, and yes we go
into that function.  With the default (nil) or left-to-right we have dir
== HB_DIRECTION_RTL and we also call that function.  Is this switched
around somewhere?


Thanks for looking into this,
benny





Message #14 received at 33944 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Benjamin Riefenstahl <b.riefenstahl <at> turtle-trading.net>
Cc: 33944 <at> debbugs.gnu.org
Subject: Re: bug#33944: 27.0.50;
 harfbuzz: Noto Sans Mandaic not rendered correctly
Date: Tue, 01 Jan 2019 21:17:03 +0200

> From: Benjamin Riefenstahl <b.riefenstahl <at> turtle-trading.net>
> Cc: 33944 <at> debbugs.gnu.org
> Date: Tue, 01 Jan 2019 19:34:15 +0100
> 
> Eli Zaretskii writes:
> > You mean, with other fonts that support Arabic shaping the problem
> > doesn't happen?
> 
> Now that you asked, I get the same problem with some words in Syriac,
> like with this:
> 
>     (set-fontset-font t '(?\u0700 . ?\u07FF) "Serto Mardin 20")
>     (setq bidi-paragraph-direction 'right-to-left)
>     (insert "\u0718\u0726\u0720\u0713\u0717\u073F")
> 
> The characters should all be connected, but the third and fourth are not
> in this case.

I asked whether the problem happens with other fonts, not with other
characters.  Because you specifically mentioned the font.

> >   If this is nil (the default), the direction of each paragraph is
> >   determined by the first strong directional character of its text.
> >   The values of ‘right-to-left’ and ‘left-to-right’ override that.
> >   Any other value is treated as nil.
> >   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> >
> > Did you set it to t on purpose?  If so, can you explain why?
> 
> No, that was a mistake.  OTOH it seems that "is treated as nil" is not
> true, because "t" has the same effect as "right-to-left".

It does?  How do you see that?  Does the paragraph direction change to
R2L when you set bidi-paragraph-direction to t?

> > What happens if bidi-paragraph-direction is set to one of the valid
> > values?
> 
> With right-to-left the same happens, with left-to-right it's good again
> (same as with the default).
> 
> >> The commit that breaks this is the last one, 48776b7011 "Provide text
> >> directionality and language to HarfBuzz shaper".  Before that commit it
> >> works for me.
> >
> > Can you run Emacs under a debugger and see what value of 'dir' we
> > come up with in this snippet from ftfont.c:
> >
> >   hb_direction_t dir = HB_DIRECTION_INVALID;
> >   if (EQ (direction, QL2R))
> >     dir = HB_DIRECTION_LTR;
> >   else if (EQ (direction, QR2L))
> >     dir = HB_DIRECTION_RTL;
> >   /* If the caller didn't provide a meaningful DIRECTION, let HarfBuzz
> >      guess it.  */
> >   if (dir != HB_DIRECTION_INVALID)
> >     hb_buffer_set_direction (hb_buffer, dir);
> >
> > Do we call hb_buffer_set_direction, and if so, with what value?
> 
> With "t" or right-to-left we have dir == HB_DIRECTION_LTR, and yes we go
> into that function.  With the default (nil) or left-to-right we have dir
> == HB_DIRECTION_RTL and we also call that function.

When the value is nil, do you see the text that starts with Mandaic
or Syriac letters begin at the right margin of the window?  IOW, do
you see that the paragraph direction changes when the paragraph begins
with a strong Right to Left letter?  Or does the text still get laid
out starting at the left margin of the window?

> Is this switched around somewhere?

Yes, this was the whole point of the changeset that succeeded in
breaking the shaping.  But I have an idea why this happens, and will
try to fix it.

Thanks.




Message #17 received at 33944 <at> debbugs.gnu.org (full text, mbox):

From: Benjamin Riefenstahl <b.riefenstahl <at> turtle-trading.net>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 33944 <at> debbugs.gnu.org
Subject: Re: bug#33944: 27.0.50;
 harfbuzz: Noto Sans Mandaic not rendered correctly
Date: Tue, 01 Jan 2019 22:40:11 +0100

Eli Zaretskii writes:
> I asked whether the problem happens with other fonts, not with other
> characters.  Because you specifically mentioned the font.

Ah, I got confused by your mention of Arabic.

Actually Noto is the only freely available font for Mandaic that I am
aware of.  I just mentioned it and its source to make it easier to
reproduce the problem.

As I said, other fonts seem to have similar problems with other
characters.  In the Syriac example, the same problem with the same
characters happens with other fonts.  OTOH all the fonts I have for
Syriac are made by the same company in a single package, so I expect
them all to contain the same shaping information.

>> >   If this is nil (the default), the direction of each paragraph is
>> >   determined by the first strong directional character of its text.
>> >   The values of ‘right-to-left’ and ‘left-to-right’ override that.
>> >   Any other value is treated as nil.
>> >   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> >
>> > Did you set it to t on purpose?  If so, can you explain why?
>> 
>> No, that was a mistake.  OTOH it seems that "is treated as nil" is not
>> true, because "t" has the same effect as "right-to-left".
>
> It does?  How do you see that?  Does the paragraph direction change to
> R2L when you set bidi-paragraph-direction to t?

I got confused there, because I expected the scratch buffer to use the
default for b-p-d, when it actually has it explicitly set to ltr.  That
explains what I see.  Setting b-p-d to "t" in scratch changes the
behaviour, because then it uses default detection.
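
The doc string rule quoted above can be modeled as a small function,
which also shows why "t" behaves like nil rather than like
right-to-left.  This is only a sketch of the documented behaviour, not
the Emacs implementation: the type and function names are hypothetical,
and a paragraph with no strong directional character is assumed here to
come out left-to-right.

```c
/* Possible values of bidi-paragraph-direction as seen by the model;
   BPD_OTHER covers any value such as t that is not one of the two
   documented symbols.  */
typedef enum {
  BPD_NIL, BPD_LEFT_TO_RIGHT, BPD_RIGHT_TO_LEFT, BPD_OTHER
} sketch_bpd_t;

typedef enum { PARA_L2R, PARA_R2L } sketch_para_dir_t;

/* FIRST_STRONG_IS_RTL is nonzero when the first strong directional
   character of the paragraph is right-to-left (e.g. a Mandaic or
   Syriac letter), and zero otherwise.  */
sketch_para_dir_t
sketch_resolve_paragraph_direction (sketch_bpd_t setting,
                                    int first_strong_is_rtl)
{
  switch (setting)
    {
    case BPD_LEFT_TO_RIGHT:
      return PARA_L2R;          /* explicit override */
    case BPD_RIGHT_TO_LEFT:
      return PARA_R2L;          /* explicit override */
    default:
      /* nil and "any other value": decided by the first strong
         directional character of the text.  */
      return first_strong_is_rtl ? PARA_R2L : PARA_L2R;
    }
}
```

With the buffer-local value ltr, text starting with a Mandaic letter
resolves to PARA_L2R in this model; with "t" or nil the same text
resolves to PARA_R2L, matching what I saw.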

>> > Can you run Emacs under a debugger and see what value of 'dir' we
>> > come up with in this snippet from ftfont.c:
>> >
>> >   hb_direction_t dir = HB_DIRECTION_INVALID;
>> >   if (EQ (direction, QL2R))
>> >     dir = HB_DIRECTION_LTR;
>> >   else if (EQ (direction, QR2L))
>> >     dir = HB_DIRECTION_RTL;
>> >   /* If the caller didn't provide a meaningful DIRECTION, let HarfBuzz
>> >      guess it.  */
>> >   if (dir != HB_DIRECTION_INVALID)
>> >     hb_buffer_set_direction (hb_buffer, dir);
>> >
>> > Do we call hb_buffer_set_direction, and if so, with what value?
>> 
>> With "t" or right-to-left we have dir == HB_DIRECTION_LTR, and yes we go
>> into that function.  With the default (nil) or left-to-right we have dir
>> == HB_DIRECTION_RTL and we also call that function.
>
> When the value is nil, do you see the text that starts with Mandaic
> or Syriac letters begin at the right margin of the window?  IOW, do
> you see that the paragraph direction changes when the paragraph begins
> with a strong Right to Left letter?  Or does the text still get laid
> out starting at the left margin of the window?

Yes.  With b-p-d set to rtl or nil I get the text at the right margin
but with the shaping error.  With the default value ltr the text is at
the left, and the shaping error is gone.  For each of these tests I
changed the script and restarted Emacs.

When I add an "x" at the beginning in the script and set b-p-d to nil,
the text is at the left and correctly shaped.  If I set b-p-d to nil
and interactively add or remove an "x" at the beginning of the line, the
position of the text changes accordingly but the shaping does not: the
form that the script initially produced persists.  I executed
clear-font-cache, but that does not change the shaping in this
scenario either.

Can I test something else?

benny




Message #20 received at 33944 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Benjamin Riefenstahl <b.riefenstahl <at> turtle-trading.net>
Cc: 33944 <at> debbugs.gnu.org
Subject: Re: bug#33944: 27.0.50;
 harfbuzz: Noto Sans Mandaic not rendered correctly
Date: Wed, 02 Jan 2019 18:04:15 +0200

> From: Benjamin Riefenstahl <b.riefenstahl <at> turtle-trading.net>
> Cc: 33944 <at> debbugs.gnu.org
> Date: Tue, 01 Jan 2019 22:40:11 +0100
> 
> > When the value is nil, do you see the text that starts with Mandaic
> > or Syriac letters begin at the right margin of the window?  IOW, do
> > you see that the paragraph direction changes when the paragraph begins
> > with a strong Right to Left letter?  Or does the text still get laid
> > out starting at the left margin of the window?
> 
> Yes.  With b-p-d set to rtl or nil I get the text at the right margin
> but with the shaping error.  With the default value ltr the text is at
> the left, and the shaping error is gone.  For each of these tests I
> changed the script and restarted Emacs.
> 
> When I add an "x" at the beginning in the script and set b-p-d to nil,
> the text is at the left and correctly shaped.  If I set b-p-d to nil
> and interactively add or remove an "x" at the beginning of the line, the
> position of the text changes accordingly but the shaping does not: the
> form that the script initially produced persists.  I executed
> clear-font-cache, but that does not change the shaping in this
> scenario either.
> 
> Can I test something else?

Please test the latest branch, I tried to fix this problem.

Thanks.




Reply sent to Benjamin Riefenstahl <b.riefenstahl <at> turtle-trading.net>:
You have taken responsibility. (Wed, 02 Jan 2019 23:13:02 GMT) Full text and rfc822 format available.

Notification sent to Benjamin Riefenstahl <b.riefenstahl <at> turtle-trading.net>:
bug acknowledged by developer. (Wed, 02 Jan 2019 23:13:02 GMT) Full text and rfc822 format available.

Message #25 received at 33944-done <at> debbugs.gnu.org (full text, mbox):

From: Benjamin Riefenstahl <b.riefenstahl <at> turtle-trading.net>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 33944-done <at> debbugs.gnu.org
Subject: Re: bug#33944: 27.0.50;
 harfbuzz: Noto Sans Mandaic not rendered correctly
Date: Thu, 03 Jan 2019 00:12:14 +0100

Eli Zaretskii writes:
> Please test the latest branch, I tried to fix this problem.

That works for all my tests.  I'm closing the bug.

Thank you very much,
benny




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 31 Jan 2019 12:24:04 GMT) Full text and rfc822 format available.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.