GNU bug report logs - #28339
25.2; Emacs shows ZWNJ character (Zero Width non-Joiner) as Space


Package: emacs;

Reported by: Nima Aryan <nimawebgard <at> gmail.com>

Date: Sun, 3 Sep 2017 16:41:01 UTC

Severity: normal

Found in version 25.2

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 28339 in the body.
You can then email your comments to 28339 AT debbugs.gnu.org in the normal way.



Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#28339; Package emacs. (Sun, 03 Sep 2017 16:41:01 GMT) Full text and rfc822 format available.

Acknowledgement sent to Nima Aryan <nimawebgard <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Sun, 03 Sep 2017 16:41:01 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Nima Aryan <nimawebgard <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: 25.2; Emacs shows ZWNJ character (Zero Width non-Joiner) as Space
Date: Sun, 03 Sep 2017 15:57:34 +0000
[Message part 1 (text/plain, inline)]
Hi,

I'm trying to write a XeLaTeX document using Emacs+AUCTeX. Everything is
awesome except this issue, in which Emacs does not show the ZWNJ character.
However, it writes it correctly to the file when I save the document, and
I'm able to see the characters correctly when I open it in other editors.

I've tested in different conditions (fresh Emacs with default settings,
different fonts, even with a new user) but the problem persists. This problem
is specific to Emacs and I have no such problem in other editors.



Regards,
Nima



---------------

In GNU Emacs 25.2.1 (x86_64-unknown-linux-gnu, GTK+ Version 3.22.16)
 of 2017-07-16 built on arojas
Windowing system distributor 'The X.Org Foundation', version 11.0.11903000
Configured using:
 'configure --prefix=/usr --sysconfdir=/etc --libexecdir=/usr/lib
 --localstatedir=/var --with-x-toolkit=gtk3 --with-xft --with-modules
 'CFLAGS=-march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong
 -fno-plt' CPPFLAGS=-D_FORTIFY_SOURCE=2
 LDFLAGS=-Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now'

Configured features:
XPM JPEG TIFF GIF PNG RSVG IMAGEMAGICK SOUND GPM DBUS GCONF GSETTINGS
NOTIFY ACL GNUTLS LIBXML2 FREETYPE M17N_FLT LIBOTF XFT ZLIB
TOOLKIT_SCROLL_BARS GTK3 X11 MODULES

Important settings:
  value of $LC_COLLATE:
  value of $LC_CTYPE:
  value of $LC_MESSAGES:
  value of $LC_MONETARY:
  value of $LC_NUMERIC:
  value of $LC_TIME:
  value of $LANG: en_US.UTF-8
  locale-coding-system: utf-8-unix

Major mode: Lisp Interaction

Minor modes in effect:
  highlight-numbers-mode: t
  delete-selection-mode: t
  show-paren-mode: t
  cua-mode: t
  override-global-mode: t
  global-undo-tree-mode: t
  undo-tree-mode: t
  evil-mode: t
  evil-local-mode: t
  tooltip-mode: t
  global-eldoc-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  line-number-mode: t
  global-visual-line-mode: t
  visual-line-mode: t
  transient-mark-mode: t

Recent messages:
Loading cua-base...done
Loading paren...done
For information about GNU Emacs and the GNU system, type C-h C-a.

Load-path shadows:
None found.

Features:
(shadow sort mail-extr emacsbug message idna dired format-spec rfc822
mml mml-sec password-cache epg gnus-util mm-decode mm-bodies mm-encode
mail-parse rfc2231 mailabbrev gmm-utils mailheader sendmail rfc2047
rfc2045 ietf-drums mm-util help-fns mail-prsvr mail-utils
highlight-numbers parent-mode delsel paren cua-base cus-start cus-load
edit-indirect preview-latex tex-site auto-loads use-package diminish
bind-key easy-mmode evil evil-integration undo-tree diff evil-maps
evil-commands evil-jumps evil-command-window evil-types evil-search
evil-ex evil-macros evil-repeat evil-states evil-core advice evil-common
windmove thingatpt rect evil-digraphs evil-vars ring edmacro kmacro ido
finder-inf info package epg-config seq byte-opt gv bytecomp byte-compile
cl-extra help-mode easymenu cconv cl-loaddefs pcase cl-lib time-date
mule-util tooltip eldoc electric uniquify ediff-hook vc-hooks
lisp-float-type mwheel x-win term/common-win x-dnd tool-bar dnd fontset
image regexp-opt fringe tabulated-list newcomment elisp-mode lisp-mode
prog-mode register page menu-bar rfn-eshadow timer select scroll-bar
mouse jit-lock font-lock syntax facemenu font-core frame cl-generic cham
georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao
korean japanese eucjp-ms cp51932 hebrew greek romanian slovak czech
european ethiopic indian cyrillic chinese charscript case-table epa-hook
jka-cmpr-hook help simple abbrev minibuffer cl-preloaded nadvice
loaddefs button faces cus-face macroexp files text-properties overlay
sha1 md5 base64 format env code-pages mule custom widget
hashtable-print-readable backquote dbusbind inotify dynamic-setting
system-font-setting font-render-setting move-toolbar gtk x-toolkit x
multi-tty make-network-process emacs)

Memory information:
((conses 16 219413 13323)
 (symbols 48 29850 0)
 (miscs 40 54 148)
 (strings 32 50964 7936)
 (string-bytes 1 1419802)
 (vectors 16 23465)
 (vector-slots 8 580873 3493)
 (floats 8 235 72)
 (intervals 56 252 0)
 (buffers 976 18))

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#28339; Package emacs. (Sun, 03 Sep 2017 17:07:02 GMT) Full text and rfc822 format available.

Message #8 received at 28339 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Nima Aryan <nimawebgard <at> gmail.com>
Cc: 28339 <at> debbugs.gnu.org
Subject: Re: bug#28339: 25.2;
 Emacs shows ZWNJ character (Zero Width non-Joiner) as Space
Date: Sun, 03 Sep 2017 20:06:33 +0300
> From: Nima Aryan <nimawebgard <at> gmail.com>
> Date: Sun, 03 Sep 2017 15:57:34 +0000
> 
> I'm trying to write a XeLaTeX document using Emacs+AUCTeX. Everything is awesome except this issue,
> in which Emacs does not show the ZWNJ character. However, it writes it correctly to the file when I save
> the document, and I'm able to see the characters correctly when I open it in other editors.
> 
> I've tested in different conditions (fresh Emacs with default settings, different fonts, even with a new user) but
> the problem persists. This problem is specific to Emacs and I have no such problem in other editors.

Emacs traditionally tries not to hide characters from the user.
However, this is just the default, and you can customize it: the
variable 'glyphless-char-display-control' controls how this and other
similar characters are shown.

OK?
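
For reference, a minimal sketch of such a customization follows.  It
assumes that ZWNJ (U+200C) belongs to the `format-control' group of
'glyphless-char-display-control' and that `hex-code' is the method
behind the "hex box" choice in the Customize interface.  The variable
is set through `customize-set-variable' so that its setter function
runs; a plain `setq' may not take effect.

```elisp
;; Sketch: display format-control characters (the group assumed to
;; include U+200C ZWNJ) as a box containing the hex code.  Entries for
;; the other groups are kept as they currently are.
(customize-set-variable
 'glyphless-char-display-control
 (cons '(format-control . hex-code)
       (assq-delete-all 'format-control
                        (copy-alist glyphless-char-display-control))))
```

Replacing `hex-code' with `thin-space', `empty-box', `acronym' or
`zero-width' selects one of the other display methods.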




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#28339; Package emacs. (Sun, 03 Sep 2017 17:13:01 GMT) Full text and rfc822 format available.

Message #11 received at 28339 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: nimawebgard <at> gmail.com
Cc: 28339 <at> debbugs.gnu.org
Subject: Re: bug#28339: 25.2;
 Emacs shows ZWNJ character (Zero Width non-Joiner) as Space
Date: Sun, 03 Sep 2017 20:11:45 +0300
> Date: Sun, 03 Sep 2017 20:06:33 +0300
> From: Eli Zaretskii <eliz <at> gnu.org>
> Cc: 28339 <at> debbugs.gnu.org
> 
> Emacs traditionally tries not to hide characters from the user.

Maybe there's a misunderstanding on my part: are you saying that you
do NOT see ZWNJ on display?  In that case, it could be because the
character is by default displayed as a very thin (1-pixel) space.
When you move the cursor across it, you should see a very thin bar
instead of the normal cursor.

In any case, the variable I mentioned lets you change how this
character is displayed.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#28339; Package emacs. (Sun, 03 Sep 2017 22:45:01 GMT) Full text and rfc822 format available.

Message #14 received at 28339 <at> debbugs.gnu.org (full text, mbox):

From: Nima Aryan <nimawebgard <at> gmail.com>
To: 28339 <at> debbugs.gnu.org
Subject: Re: bug#28339: 25.2; Emacs shows ZWNJ character (Zero Width
 non-Joiner) as Space
Date: Sun, 03 Sep 2017 19:31:16 +0000
[Message part 1 (text/plain, inline)]
The problem is specific (as far as I could test) to non-English alphabets,
at least Persian.

I've tested 'glyphless-char-display-control' and I can confirm that it
seems to work as expected on English input. For example, choosing
`Hex box` causes ZWNJ to be displayed as a hex box. Other options
(including the defaults) seem to work as expected.

But the same option (in the same session) does not work as expected for
Persian (and possibly other similar) alphabets. With exactly the same
option, a clear normal space is shown between Persian characters instead of
a hex box (completely different behavior). No matter what the option is,
for a non-English alphabet the ZWNJ is shown as a normal space character.

In Persian, ZWNJ really matters. For example, while `A+ZWNJ+B` should be
displayed as `AB`, it's shown as `A  B` in Persian but `AB` in English. ZWNJ
might not have any application in English, but it's vital for some other
languages. It's a kind of end-immediate-start in continuous scripts. In the
Latin alphabet it might have some uses in German (fl). However, it is
very important for Persian, Hebrew, Arabic, Urdu, Hindi and some other
alphabets.
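
The two cases described here can be reproduced with a small test
buffer. This is only a sketch: the buffer name and the Persian sample
word are illustrative choices and are not taken from any attachment in
this report.

```elisp
;; Sketch: build a scratch buffer with ZWNJ (U+200C) between Latin
;; letters and inside a Persian word, then show it.
(with-current-buffer (get-buffer-create "*zwnj-test*")
  (erase-buffer)
  ;; Latin case: A, ZWNJ, B.
  (insert "A" (string #x200c) "B\n")
  ;; Persian case: MEEM, FARSI YEH, ZWNJ, KHAH, WAW, ALEF, HEH, MEEM.
  (insert (string #x0645 #x06cc #x200c #x062e #x0648 #x0627 #x0647 #x0645)
          "\n")
  (pop-to-buffer (current-buffer)))
```

Moving point over each line and running `C-u C-x =` (describe-char)
shows how each ZWNJ is being displayed.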

Thanks

P.S. Just as a hypothesis, BiDi might interfere here as well and cause such
behavior.


Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#28339; Package emacs. (Mon, 04 Sep 2017 04:30:01 GMT) Full text and rfc822 format available.

Message #17 received at 28339 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Nima Aryan <nimawebgard <at> gmail.com>
Cc: Kenichi Handa <handa <at> gnu.org>, 28339 <at> debbugs.gnu.org
Subject: Re: bug#28339: 25.2;
 Emacs shows ZWNJ character (Zero Width non-Joiner) as Space
Date: Mon, 04 Sep 2017 07:26:43 +0300
> From: Nima Aryan <nimawebgard <at> gmail.com>
> Date: Sun, 03 Sep 2017 19:31:16 +0000
> 
> The problem is specific (as far as I could test) to non-English alphabets, at least Persian.
> 
> I've tested 'glyphless-char-display-control' and I can confirm that it seems to work as expected on English
> input. For example, choosing `Hex box` causes ZWNJ to be displayed as a hex box. Other options
> (including the defaults) seem to work as expected.
> 
> But the same option (in the same session) does not work as expected for Persian (and possibly other
> similar) alphabets.

Ah, that changes everything.  When Emacs displays the Persian script,
it composes the ZWNJ character with surrounding characters to provide
correct shaping.  The rules for this character composition are in
lisp/language/misc-lang.el, near the end.  I don't read Persian, but
if the resulting shaping is incorrect, please show specific examples
with characters from the Persian script, and please show screenshots
of their correct display (in some other application) vs what Emacs
produces on your system.  Then we can investigate what could possibly
be wrong with the Emacs display.
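
To see which composition rules are in effect for these characters, one
can look them up in `composition-function-table'.  A minimal sketch,
assuming the Arabic-script rules have already been installed in that
char-table when the form is evaluated (e.g. with M-:):

```elisp
;; Sketch: return the composition rules registered for ZWNJ (U+200C)
;; and for ARABIC LETTER MEEM (U+0645), the latter chosen here only as
;; a sample Arabic-script letter.  An element is nil when no rule is
;; registered for that character.
(list (aref composition-function-table #x200c)
      (aref composition-function-table #x0645))
```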

> P.S. Just as a hypothesis, BiDi might interfere here as well and cause such behavior.

I'm not sure this is the reason, but I need a clear example to
investigate.

Thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#28339; Package emacs. (Mon, 04 Sep 2017 06:37:01 GMT) Full text and rfc822 format available.

Message #20 received at 28339 <at> debbugs.gnu.org (full text, mbox):

From: sadid sahami <sadidsahami <at> gmail.com>
To: 28339 <at> debbugs.gnu.org
Subject: Re: bug#28339: 25.2; Emacs shows ZWNJ character (Zero Width
 non-Joiner) as Space
Date: Mon, 04 Sep 2017 05:05:03 +0000
[Message part 1 (text/plain, inline)]
I've provided a minimal test text, written in Emacs (Test.text) and its
display for Gedit (Gedit_display.png) and Emacs (Emacs_display.png). The
Gedit display is the correct one.

Best Regards,




[Test.text (text/plain, attachment)]
[Gedit_display.png (image/png, attachment)]
[Emacs_display.png (image/png, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#28339; Package emacs. (Mon, 04 Sep 2017 09:16:02 GMT) Full text and rfc822 format available.

Message #23 received at 28339 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: sadid sahami <sadidsahami <at> gmail.com>,
    Kenichi Handa <handa <at> gnu.org>
Cc: 28339 <at> debbugs.gnu.org
Subject: Re: bug#28339: 25.2;
 Emacs shows ZWNJ character (Zero Width non-Joiner) as Space
Date: Mon, 04 Sep 2017 12:15:18 +0300
> From: sadid sahami <sadidsahami <at> gmail.com>
> Date: Mon, 04 Sep 2017 05:05:03 +0000
> 
> I've provided a minimal test text, written in Emacs (Test.text) and its display for Gedit (Gedit_display.png) and
> Emacs (Emacs_display.png). The Gedit display is the correct one. 

Hmm.. on my system I see a display that is almost identical to what
your "Gedit" display shows.

CC'ing Handa-san who might be able to help us with verifying the
composition rules for Persian.  Or maybe this is a problem with the
shaping engine used on GNU/Linux?

In any case, disabling bidi reordering doesn't fix the display (it
makes the display much worse for me), so it is not the problem.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#28339; Package emacs. (Mon, 04 Sep 2017 11:45:01 GMT) Full text and rfc822 format available.

Message #26 received at 28339 <at> debbugs.gnu.org (full text, mbox):

From: Nima Aryan <nimawebgard <at> gmail.com>
To: 28339 <at> debbugs.gnu.org
Cc: Kenichi Handa <handa <at> gnu.org>, Eli Zaretskii <eliz <at> gnu.org>
Subject: Re: bug#28339: 25.2; Emacs shows ZWNJ character (Zero Width
 non-Joiner) as Space
Date: Mon, 04 Sep 2017 11:43:40 +0000
[Message part 1 (text/plain, inline)]
It got interesting, and I've found a workaround for the issue. Whether ZWNJ
is displayed as SPACE or as some other character is a matter of font.
Different fonts use different characters. The default Emacs font shows '['
instead of a space, which is better and at least more readable.

The only minor problem I've seen so far is that, for the Persian alphabet,
the character displayed for ZWNJ is unaffected by
'glyphless-char-display-control'.

I've attached a screenshot which shows the different display behavior for
both English and Persian at the same time. I ran `emacs -q` to launch a
default Emacs, then opened the Test.text sample attached to a previous
email and set `glyphless-char-display-control` to show a hex box. It clearly
shows that the English one is replaced by a hex box but the Persian one
by a '[' (or SPACE).  No matter what 'glyphless-char-display-control' is
set to, the Persian case shows the same character.

Note: to type the ZWNJ for the English text, AB, I used the Persian input
(A, switch keyboard layout, SHIFT+Space, switch back to English, B). So when
I put ZWNJ between A and B it's shown as a hex box (and affected by
'glyphless-char-display-control' as expected), but when I type it between
Persian characters it's shown as a fixed '[' or SPACE (depending on the
font), no matter what the glyphless-char setting dictates.

Best Regards,


[glyphless_diff_behav_in_EngOrFa.png (image/png, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#28339; Package emacs. (Mon, 04 Sep 2017 11:50:01 GMT) Full text and rfc822 format available.

Message #29 received at 28339 <at> debbugs.gnu.org (full text, mbox):

From: Nima Aryan <nimawebgard <at> gmail.com>
To: 28339 <at> debbugs.gnu.org
Cc: Kenichi Handa <handa <at> gnu.org>, Eli Zaretskii <eliz <at> gnu.org>
Subject: Re: bug#28339: 25.2; Emacs shows ZWNJ character (Zero Width
 non-Joiner) as Space
Date: Mon, 04 Sep 2017 11:49:17 +0000
[Message part 1 (text/plain, inline)]
In the previous email I just opened the test file. Now that I've tried
typing the example again, I see the following behavior, which might be
useful:

When I type A[ZWNJ]B it's OK. In Persian, when I type the first character
and then ZWNJ, it's displayed exactly the same as in English, but as soon as
I type the next character it is replaced by '['. So some other mechanism
might be involved. I've provided a screenshot which now shows these three
cases:

Regards,

[2017-09-04-161616_888x354_scrot.png (image/png, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#28339; Package emacs. (Mon, 04 Sep 2017 12:13:02 GMT) Full text and rfc822 format available.

Message #32 received at 28339 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Nima Aryan <nimawebgard <at> gmail.com>
Cc: handa <at> gnu.org, 28339 <at> debbugs.gnu.org
Subject: Re: bug#28339: 25.2; Emacs shows ZWNJ character (Zero Width
 non-Joiner) as Space
Date: Mon, 04 Sep 2017 15:11:53 +0300
> From: Nima Aryan <nimawebgard <at> gmail.com>
> Date: Mon, 04 Sep 2017 11:43:40 +0000
> Cc: Eli Zaretskii <eliz <at> gnu.org>, Kenichi Handa <handa <at> gnu.org>
> 
> It got interesting, and I've found a workaround for the issue. Whether ZWNJ is displayed as SPACE or as
> some other character is a matter of font. Different fonts use different characters. The default Emacs font
> shows '[' instead of a space, which is better and at least more readable.
> 
> The only minor problem I've seen so far is that, for the Persian alphabet, the character displayed for ZWNJ
> is unaffected by 'glyphless-char-display-control'.
> 
> I've attached a screenshot which shows the different display behavior for both English and Persian at the
> same time. I ran `emacs -q` to launch a default Emacs, then opened the Test.text sample attached to a
> previous email and set `glyphless-char-display-control` to show a hex box. It clearly shows that the English
> one is replaced by a hex box but the Persian one by a '[' (or SPACE). No matter what
> 'glyphless-char-display-control' is set to, the Persian case shows the same character.
> 
> Note: to type the ZWNJ for the English text, AB, I used the Persian input (A, switch keyboard layout,
> SHIFT+Space, switch back to English, B). So when I put ZWNJ between A and B it's shown as a hex box (and
> affected by 'glyphless-char-display-control' as expected), but when I type it between Persian characters it's
> shown as a fixed '[' or SPACE (depending on the font), no matter what the glyphless-char setting dictates.

You don't need to customize glyphless-char-display-control at all for
the correct display of ZWNJ in Persian.  I pointed to that variable
before I knew you were talking about the Persian script.  When
characters in Persian script are displayed and ZWNJ among them, Emacs
combines the ZWNJ character with neighboring characters to produce the
correct shaping, as expected by users of Persian.

On my system, ZWNJ is not visible at all among Persian text, and
that's without any customizations of glyphless-char-display-control.

It's possible that the original display was incorrect because the font
you were using for Persian characters doesn't support shaping as Emacs
expects.  In that case, finding a better font and customizing your
default fontset to use it for Persian should be the solution.
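
A sketch of such a fontset customization follows.  It assumes that
Persian text is covered by the `arabic' script symbol, and the family
name "FreeSerif" is only a placeholder for whichever installed font
shapes the text correctly on the system in question.

```elisp
;; Sketch: use one font for all Arabic-script characters in the
;; default fontset (the first argument t).  "FreeSerif" is a
;; placeholder; substitute a font that is actually installed.
(set-fontset-font t 'arabic (font-spec :family "FreeSerif"))
```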




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#28339; Package emacs. (Wed, 06 Sep 2017 23:26:01 GMT) Full text and rfc822 format available.

Message #35 received at 28339 <at> debbugs.gnu.org (full text, mbox):

From: handa <handa <at> gnu.org>
To: sadid sahami <sadidsahami <at> gmail.com>
Cc: eliz <at> gnu.org, 28339 <at> debbugs.gnu.org
Subject: Re: bug#28339: 25.2;
 Emacs shows ZWNJ character (Zero Width non-Joiner) as Space
Date: Thu, 07 Sep 2017 08:25:11 +0900
[Message part 1 (text/plain, inline)]
I've just tried Test.text on my Emacs and Gedit, and got the attached
screenshot, which shows almost the same rendering in both; they are more
similar to your Emacs_display.png than to Gedit_display.png.

In my case, Emacs uses "Dejavu Sans".  I don't know how to find out which
font Gedit uses for Arabic, but as far as I can see from the glyph shapes,
it also uses "Dejavu Sans".

Do you know which font your Gedit uses?
Or, Eli, do you know which font your Emacs uses for Arabic?
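
One way to check this from within Emacs is to query the font at point.
A minimal sketch, assuming a graphical frame; the command name
`my-font-at-point' is made up for illustration:

```elisp
;; Sketch: report the font Emacs uses for the character at point.
(defun my-font-at-point ()
  "Show the name of the font used for the character at point."
  (interactive)
  (if (>= (point) (point-max))
      ;; `font-at' needs a character position inside the buffer.
      (message "No character at point")
    (let ((font (font-at (point))))
      (message "%s" (if font (font-xlfd-name font) "No font found")))))
```

The same information is part of what `C-u C-x =' (describe-char)
reports for the character at point.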

---
K. Handa
handa <at> gnu.org

[EmacsAndGedit.png (image/png, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#28339; Package emacs. (Thu, 07 Sep 2017 02:41:01 GMT) Full text and rfc822 format available.

Message #38 received at 28339 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: handa <handa <at> gnu.org>
Cc: 28339 <at> debbugs.gnu.org, sadidsahami <at> gmail.com
Subject: Re: bug#28339: 25.2;
 Emacs shows ZWNJ character (Zero Width non-Joiner) as Space
Date: Thu, 07 Sep 2017 05:40:17 +0300
> From: handa <handa <at> gnu.org>
> Cc: 28339 <at> debbugs.gnu.org, eliz <at> gnu.org
> Date: Thu, 07 Sep 2017 08:25:11 +0900
> 
> I've just tried Test.text on my Emacs and Gedit, and got the attached
> screenshot, which shows almost the same rendering in both; they are more
> similar to your Emacs_display.png than to Gedit_display.png.

That's what I see on my system as well.  It seems Emacs displays this
text correctly on both your system and mine.

> Or, Eli, do you know which font your Emacs uses for Arabic?

Courier New, the default font.

Btw, do you see some artifacts on display when you move cursor across
this text?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#28339; Package emacs. (Sun, 10 Sep 2017 23:09:02 GMT) Full text and rfc822 format available.

Message #41 received at 28339 <at> debbugs.gnu.org (full text, mbox):

From: handa <handa <at> gnu.org>
To: Nima Aryan <nimawebgard <at> gmail.com>
Cc: eliz <at> gnu.org, 28339 <at> debbugs.gnu.org
Subject: Re: bug#28339: 25.2; Emacs shows ZWNJ character (Zero Width
 non-Joiner) as Space
Date: Mon, 11 Sep 2017 08:08:08 +0900
[Message part 1 (text/plain, inline)]
Hi,

I found why Emacs shows ZWNJ as a space.
Emacs on GNU/Linux renders ZWNJ (unless it is absorbed by a rendering
engine) with a glyph defined in a font.  As Vazir Code (and Dejavu Sans)
defines a spacing glyph for ZWNJ, Emacs displays a space.  As Courier New
defines a vertical bar glyph for ZWNJ, Emacs displays a vertical bar.
And as Freeserif defines a zero-width glyph, Emacs displays a 1-dot
width space.

So, please try this:

First, load the attached code to tell Emacs that the glyph of ZWNJ has
1-dot width.

Then, tell Emacs to use the same font for Arabic and ZWNJ, like this:

(let ((spec (font-spec :family "Vazir Code")))
  (set-fontset-font nil 'arabic spec)
  (set-fontset-font nil #x200c spec))

One problem with this solution is that if a font has some actual glyph
for ZWNJ (e.g. a vertical bar, as in Courier New), that glyph is displayed
anyway.

---
K. Handa
handa <at> gnu.org

[arabic-shape.el (application/emacs-lisp, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#28339; Package emacs. (Mon, 11 Sep 2017 16:20:01 GMT) Full text and rfc822 format available.

Message #44 received at 28339 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: handa <handa <at> gnu.org>
Cc: nimawebgard <at> gmail.com, 28339 <at> debbugs.gnu.org
Subject: Re: bug#28339: 25.2; Emacs shows ZWNJ character (Zero Width
 non-Joiner) as Space
Date: Mon, 11 Sep 2017 19:19:43 +0300
> From: handa <handa <at> gnu.org>
> Cc: eliz <at> gnu.org, 28339 <at> debbugs.gnu.org
> Date: Mon, 11 Sep 2017 08:08:08 +0900
> 
> I found why Emacs shows ZWNJ as a space.
> Emacs on GNU/Linux renders ZWNJ (unless it is absorbed by a rendering
> engine) with a glyph defined in a font.  As Vazir Code (and Dejavu Sans)
> defines a spacing glyph for ZWNJ, Emacs displays a space.  As Courier New
> defines a vertical bar glyph for ZWNJ, Emacs displays a vertical bar.
> And as Freeserif defines a zero-width glyph, Emacs displays a 1-dot
> width space.
> 
> So, please try this:
> 
> First, load the attached code to tell Emacs that the glyph of ZWNJ has
> 1-dot width.
> 
> Then, tell Emacs to use the same font for Arabic and ZWNJ, like this:
> 
> (let ((spec (font-spec :family "Vazir Code")))
>   (set-fontset-font nil 'arabic spec)
>   (set-fontset-font nil #x200c spec))
> 
> One problem with this solution is that if a font has some actual glyph
> for ZWNJ (e.g. a vertical bar, as in Courier New), that glyph is displayed
> anyway.

Thanks.

What is the significance of using the same font for ZWNJ in this case?

And why do we need to tell Emacs that ZWNJ has a 1-pixel width?
Should ZWNJ be at all displayed in this case?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#28339; Package emacs. (Wed, 13 Sep 2017 14:04:02 GMT) Full text and rfc822 format available.

Message #47 received at 28339 <at> debbugs.gnu.org (full text, mbox):

From: handa <handa <at> gnu.org>
To: Nima Aryan <nimawebgard <at> gmail.com>
Cc: eliz <at> gnu.org, 28339 <at> debbugs.gnu.org
Subject: Re: bug#28339: 25.2; Emacs shows ZWNJ character (Zero Width
 non-Joiner) as Space
Date: Wed, 13 Sep 2017 23:02:56 +0900
[Message part 1 (text/plain, inline)]
In article <CALp2H_2w50RrBiaWV1dpg760cUpamy1nZdRgrwJKAjESq3no3Q <at> mail.gmail.com>, Nima Aryan <nimawebgard <at> gmail.com> writes:

> I can confirm that this code solves the problem for many fonts I tested
> including:
>    DejaVu Sans, Vazir Code, Inconsolata-g, Office Code Pro, Ubuntu, Meslo,
> ...

Thank you for testing my code.

> The fonts with which I still see some problems are 'Droid Sans Regular',
> which shows hollow boxes, and Noto Sans, which shows a narrow bar.

That's perhaps because they define those glyphs for ZWNJ.

To avoid that problem, there are two ways:
(1) display ZWNJ with a glyph for space (if the font has a glyph for space)
(2) do not generate a glyph for ZWNJ

Please try the attached new version.  It tries (1).  If you change the
value of arabic-font-shape-gstring to
`arabic-font-shape-gstring-ZWNJ-absorb', it tries (2).

For an editor, I think (1) is better, but an Arabic/Persian user may have
a different opinion.

---
K. Handa
handa <at> gnu.org

[arabic-shape.el (application/emacs-lisp, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#28339; Package emacs. (Wed, 13 Sep 2017 14:07:02 GMT) Full text and rfc822 format available.

Message #50 received at 28339 <at> debbugs.gnu.org (full text, mbox):

From: handa <handa <at> gnu.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: nimawebgard <at> gmail.com, 28339 <at> debbugs.gnu.org
Subject: Re: bug#28339: 25.2; Emacs shows ZWNJ character (Zero Width
 non-Joiner) as Space
Date: Wed, 13 Sep 2017 23:06:25 +0900
In article <83mv61ryw0.fsf <at> gnu.org>, Eli Zaretskii <eliz <at> gnu.org> writes:

> What is the significance of using the same font for ZWNJ in this case?

To be sure to include ZWNJ in an Arabic glyph string.

> And why do we need to tell Emacs that ZWNJ has a 1-pixel width?
> Should ZWNJ be at all displayed in this case?

I'm not sure.  As I wrote in the previous mail, for an editor, isn't it
better to notify the user of the existence of ZWNJ?

---
K. Handa
handa <at> gnu.org




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#28339; Package emacs. (Wed, 13 Sep 2017 15:03:01 GMT) Full text and rfc822 format available.

Message #53 received at 28339 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: handa <handa <at> gnu.org>
Cc: nimawebgard <at> gmail.com, 28339 <at> debbugs.gnu.org
Subject: Re: bug#28339: 25.2; Emacs shows ZWNJ character (Zero Width
 non-Joiner) as Space
Date: Wed, 13 Sep 2017 18:02:33 +0300
> From: handa <handa <at> gnu.org>
> Cc: nimawebgard <at> gmail.com, 28339 <at> debbugs.gnu.org
> Date: Wed, 13 Sep 2017 23:06:25 +0900
> 
> > And why do we need to tell Emacs that ZWNJ has a 1-pixel width?
> > Should ZWNJ be at all displayed in this case?
> 
> I'm not sure.  As I wrote in the previous mail, for an editor, isn't it
> better to notify the user of the existence of ZWNJ?

I thought that the shaping engine returns to us a series of grapheme
clusters that completely replaces ZWNJ and the neighboring characters,
and that therefore we only need to display the glyphs returned by the
shaper.  If one of the glyphs returned by the shaper is ZWNJ, then
isn't the shaper doing a poor job?

Or maybe I misunderstand something about this situation?

Thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#28339; Package emacs. (Thu, 14 Sep 2017 12:25:02 GMT) Full text and rfc822 format available.

Message #56 received at 28339 <at> debbugs.gnu.org (full text, mbox):

From: handa <handa <at> gnu.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: nimawebgard <at> gmail.com, 28339 <at> debbugs.gnu.org
Subject: Re: bug#28339: 25.2; Emacs shows ZWNJ character (Zero Width
 non-Joiner) as Space
Date: Thu, 14 Sep 2017 21:24:28 +0900
In article <83wp52od4m.fsf <at> gnu.org>, Eli Zaretskii <eliz <at> gnu.org> writes:

> I thought that the shaping engine returns to us a series of grapheme
> clusters that completely replaces ZWNJ and the neighboring characters,
> and that therefore we only need to display the glyphs returned by the
> shaper.  If one of the glyphs returned by the shaper is ZWNJ, then
> isn't the shaper doing a poor job?

Each Arabic character constitutes a grapheme cluster.  Then, for the
sequence "0646 0645 06CC 200C 0634 0648 062F", to which neighbor should
200C belong?  Does Unicode define it?

Anyway, is it convenient or inconvenient to be able to edit ZWNJ directly?
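
To make the code point sequence quoted above concrete, here is a minimal Python sketch (not from the thread) that decodes it and prints each character's Unicode general category and name. The helper name `decode` is hypothetical; only the standard `unicodedata` module is used.

```python
import unicodedata

# Code points exactly as written in the message above.
CODE_POINTS = "0646 0645 06CC 200C 0634 0648 062F"


def decode(hex_points: str) -> str:
    """Turn a space-separated list of hex code points into a string."""
    return "".join(chr(int(cp, 16)) for cp in hex_points.split())


text = decode(CODE_POINTS)

for ch in text:
    # ZWNJ (U+200C) is reported with general category "Cf" (format character).
    print(f"U+{ord(ch):04X}  {unicodedata.category(ch)}  {unicodedata.name(ch)}")
```

The fourth entry is the ZWNJ sitting between two runs of Arabic letters, which is the position the question about "which neighbor" refers to.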





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#28339; Package emacs. (Thu, 14 Sep 2017 17:17:02 GMT) Full text and rfc822 format available.

Message #59 received at 28339 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: handa <handa <at> gnu.org>
Cc: nimawebgard <at> gmail.com, 28339 <at> debbugs.gnu.org
Subject: Re: bug#28339: 25.2; Emacs shows ZWNJ character (Zero Width
 non-Joiner) as Space
Date: Thu, 14 Sep 2017 20:15:59 +0300
> From: handa <handa <at> gnu.org>
> Cc: nimawebgard <at> gmail.com, 28339 <at> debbugs.gnu.org
> Date: Thu, 14 Sep 2017 21:24:28 +0900
> 
> In article <83wp52od4m.fsf <at> gnu.org>, Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> > I thought that the shaping engine returns to us a series of grapheme
> > clusters that completely replaces ZWNJ and the neighboring characters,
> > and that therefore we only need to display the glyphs returned by the
> > shaper.  If one of the glyphs returned by the shaper is ZWNJ, then
> > isn't the shaper doing a poor job?
> 
> Each Arabic character constitutes a grapheme cluster.  Then, for the
> sequence "0646 0645 06CC 200C 0634 0648 062F", to which neighbor should
> 200C belong?  Does Unicode define it?

I don't think Unicode defines that, but I thought the shaping engine
gives us back glyphs that don't include ZWNJ itself.  Evidently,
that's not true, which I find strange.

> Anyway, is it convenient or inconvenient to be able to edit ZWNJ directly?

It's convenient.  But we already support deletion of composed
characters, so I didn't think it mattered.

Thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#28339; Package emacs. (Thu, 14 Sep 2017 21:15:02 GMT) Full text and rfc822 format available.

Message #62 received at 28339 <at> debbugs.gnu.org (full text, mbox):

From: Nima Aryan <nimawebgard <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>, handa <handa <at> gnu.org>
Cc: 28339 <at> debbugs.gnu.org
Subject: Re: bug#28339: 25.2; Emacs shows ZWNJ character (Zero Width
 non-Joiner) as Space
Date: Thu, 14 Sep 2017 21:13:57 +0000
[Message part 1 (text/plain, inline)]
I've no technical background in typography, but as a use case: when the user
types 'A[ZWNJ]B', the editor should show 'A[Discontinuation of continuous
script but without any space or kerning]B'. It can be translated to '[the
end shape of A][No space or kerning][the beginning shape of B]'.
Persian/Hebrew/Arabic scripts have different glyphs for the same character
based on their position in the word (beginning, middle, end), so the ZWNJ
is vital here. From the user's point of view in these scripts, ZWNJ
works exactly like 'Space' but without showing it.

This might be a misunderstanding on my part, but it is strange to me if the
font (or shaper?) replaces the ZWNJ with a space. It's OK to show nothing for
ZWNJ, but not a space. I've had no such experience with other editors such as
Gedit (or even with terminal emulators), and if this is the case, how do other
editors figure it out?

Thanks a lot,

P.S. Regarding the new patch, I'll test it as soon as possible.
[Message part 2 (text/html, inline)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#28339; Package emacs. (Sat, 16 Sep 2017 01:34:02 GMT) Full text and rfc822 format available.

Message #65 received at 28339 <at> debbugs.gnu.org (full text, mbox):

From: handa <handa <at> gnu.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: nimawebgard <at> gmail.com, 28339 <at> debbugs.gnu.org
Subject: Re: bug#28339: 25.2; Emacs shows ZWNJ character (Zero Width
 non-Joiner) as Space
Date: Sat, 16 Sep 2017 10:32:57 +0900
[Message part 1 (text/plain, inline)]
In article <83y3phmca8.fsf <at> gnu.org>, Eli Zaretskii <eliz <at> gnu.org> writes:

> > Each Arabic character constitutes a grapheme cluster.  Then, for the
> > sequence "0646 0645 06CC 200C 0634 0648 062F", to which neighbor should
> > 200C belong?  Does Unicode define it?

> I don't think Unicode defines that, but I thought the shaping engine
> gives us back glyphs that don't include ZWNJ itself.  Evidently,
> that's not true, which I find strange.

If ZWNJ is WITHIN a grapheme cluster (i.e. not at the edges
of the cluster), the m17n lib does not return a ZWNJ glyph.

> > Anyway, is it convenient or inconvenient to be able to edit ZWNJ directly?

> It's convenient.  But we already support deletion of composed
> characters, so I didn't think it mattered.

If Unicode does not have a rule for ZWNJ handling, how does a user know
which key to type to delete a ZWNJ: C-d or BS?  And while doing cut&paste
repeatedly, is there any chance of ending up with the second and third
lines of the attached file?  They have two and three consecutive ZWNJs.
How does a user notice such a (perhaps incorrect) situation?

---
K. Handa
handa <at> gnu.org

[arabic.txt (text/plain, inline)]
نمی‌شود
نمی‌‌شود
نمی‌‌‌شود
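
The question of how a user would notice two or three consecutive ZWNJs can at least be answered mechanically. The following is a minimal Python sketch (not from the thread); the function name `find_zwnj_runs` and the sample strings, built from the code points discussed earlier, are assumptions made for illustration.

```python
import re

ZWNJ = "\u200c"

# Two or more ZWNJs in a row; a single ZWNJ is legitimate and is not reported.
ZWNJ_RUN = re.compile(f"{ZWNJ}{{2,}}")


def find_zwnj_runs(line: str) -> list[tuple[int, int]]:
    """Return (start_index, run_length) for every run of repeated ZWNJs."""
    return [(m.start(), len(m.group())) for m in ZWNJ_RUN.finditer(line)]


# The same word with one, two and three ZWNJs between its two halves.
samples = [
    "\u0646\u0645\u06cc" + ZWNJ * n + "\u0634\u0648\u062f" for n in (1, 2, 3)
]

for sample in samples:
    print(find_zwnj_runs(sample))
```

A check like this could back a warning or a highlighting face, independently of whether the ZWNJ glyph itself is displayed.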

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#28339; Package emacs. (Sat, 16 Sep 2017 04:06:02 GMT) Full text and rfc822 format available.

Message #68 received at 28339 <at> debbugs.gnu.org (full text, mbox):

From: Nima Aryan <nimawebgard <at> gmail.com>
To: handa <handa <at> gnu.org>, Eli Zaretskii <eliz <at> gnu.org>
Cc: 28339 <at> debbugs.gnu.org
Subject: Re: bug#28339: 25.2; Emacs shows ZWNJ character (Zero Width
 non-Joiner) as Space
Date: Sat, 16 Sep 2017 04:05:15 +0000
[Message part 1 (text/plain, inline)]
> If Unicode does not have a rule for ZWNJ handling, how does a user know
> which key to type to delete a ZWNJ: C-d or BS? And while doing cut&paste
> repeatedly, is there any chance of ending up with the second and third lines
> of the attached file? They have two and three consecutive ZWNJs. How does a
> user notice such a (perhaps incorrect) situation?

As a user, I’ve been in this situation before: it simply doesn’t have
any visible effect, and the user simply can’t figure it out (unless
ZWNJ is represented as something else). This is why ZWNJ-as-Thin is a
workaround hack and not a solution. ZWNJ takes no space; it’s like 3x0=0.
To delete, some editors like Gedit and many more simply treat any number of
consecutive ZWNJs as one. I’ve seen some which count each ZWNJ, so the user
has to delete each one to reach the character before.

[Message part 2 (text/html, inline)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#28339; Package emacs. (Sat, 16 Sep 2017 07:25:01 GMT) Full text and rfc822 format available.

Message #71 received at 28339 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: handa <handa <at> gnu.org>
Cc: nimawebgard <at> gmail.com, 28339 <at> debbugs.gnu.org
Subject: Re: bug#28339: 25.2; Emacs shows ZWNJ character (Zero Width
 non-Joiner) as Space
Date: Sat, 16 Sep 2017 10:24:06 +0300
> From: handa <handa <at> gnu.org>
> Cc: nimawebgard <at> gmail.com, 28339 <at> debbugs.gnu.org
> Date: Sat, 16 Sep 2017 10:32:57 +0900
> 
> In article <83y3phmca8.fsf <at> gnu.org>, Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> > > Each Arabic character constitutes a grapheme cluster.  Then, for the
> > > sequence "0646 0645 06CC 200C 0634 0648 062F", to which neighbor should
> > > 200C belong?  Does Unicode define it?
> 
> > I don't think Unicode defines that, but I thought the shaping engine
> > gives us back glyphs that don't include ZWNJ itself.  Evidently,
> > that's not true, which I find strange.
> 
> If ZWNJ is WITHIN a grapheme cluster (i.e. not at the edges
> of the cluster), the m17n lib does not return a ZWNJ glyph.
> 
> > > Anyway, is it convenient or inconvenient to be able to edit ZWNJ directly?
> 
> > It's convenient.  But we already support deletion of composed
> > characters, so I didn't think it mattered.
> 
> If Unicode does not have a rule for ZWNJ handling, how does a user know
> which key to type to delete a ZWNJ: C-d or BS?

Above, you asked about a Unicode definition of which grapheme cluster
ZWNJ should belong to.  On that, I said I didn't think there's any
Unicode ruling (although to be sure, we should probably ask a question
on the Unicode mailing list).

But here, you are talking about deleting a ZWNJ from display, and
there Unicode does have a clear rule; see Section 23.2 of the Standard.  A
pertinent quote (Implementation Notes, p.849):

  As with all other alternate format characters, fonts should use an
  invisible zero-width glyph for representation of both ZWJ and ZWNJ.

This seems to be a requirement for fonts, but it does convey what
Unicode thinks about displaying ZWNJ.

Emacs generally tries to display such control characters, because
hiding them from users is un-Emacsy.  But in this case, it seems like
users expect us to hide it.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#28339; Package emacs. (Sat, 16 Sep 2017 12:38:02 GMT) Full text and rfc822 format available.

Message #74 received at 28339 <at> debbugs.gnu.org (full text, mbox):

From: handa <handa <at> gnu.org>
To: Nima Aryan <nimawebgard <at> gmail.com>
Cc: eliz <at> gnu.org, 28339 <at> debbugs.gnu.org
Subject: Re: bug#28339: 25.2; Emacs shows ZWNJ character (Zero Width
 non-Joiner) as Space
Date: Sat, 16 Sep 2017 21:36:47 +0900
[Message part 1 (text/plain, inline)]
In article <CALp2H_3tLC71X6-jvH2XD-6qX8O=KE5wHa561QPk-w2OoCX9HA <at> mail.gmail.com>, Nima Aryan <nimawebgard <at> gmail.com> writes:

>    With ZWNJ-as-space, the Droid Sans problem (showing box) is resolved but
>    Noto Sans still shows a small superscript bar (Arial also has a
>    similar problem).

I'm sorry.  The code for ZWNJ-as-space had a bug.  Please try the
attached new one.

Anyway, I also installed Noto Sans (i.e. NotoSans-Regular.ttf) from the
"fonts-noto" Debian package, but it seems that font does not support
Arabic.  If your "Noto Sans" font supports Arabic, please send it to me.

>    With ZWNJ-absorb, I couldn’t find any problem with any font. As a user,
>    it seems perfect.

>    With ZWNJ-thin-width, things also seem good and I couldn’t find any
>    problem with any font.

With this method, if you are using the same "Noto Sans" font as mine,
you should see a visible glyph that "Noto Sans" defines for ZWNJ.

---
K. Handa
handa <at> gnu.org

[arabic-shape.el (application/emacs-lisp, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#28339; Package emacs. (Sat, 16 Sep 2017 17:31:01 GMT) Full text and rfc822 format available.

Message #77 received at 28339 <at> debbugs.gnu.org (full text, mbox):

From: Benjamin Riefenstahl <b.riefenstahl <at> turtle-trading.net>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: handa <handa <at> gnu.org>, nimawebgard <at> gmail.com, 28339 <at> debbugs.gnu.org
Subject: Re: bug#28339: 25.2;
 Emacs shows ZWNJ character (Zero Width non-Joiner) as Space
Date: Sat, 16 Sep 2017 19:30:05 +0200
Hi all,

A few thoughts from an occasional user of this feature.

Eli Zaretskii writes:
> [...] I thought the shaping engine gives us back glyphs that don't
> include ZWNJ itself.  Evidently, that's not true, which I find
> strange.

I thought that with OpenType at least that depends on the font?  Not
that I trust that fonts do the right thing.  I think the right thing for
the font would be to just implement the behaviour (break up ligatures,
prevent shaping), but not show a glyph.  Emacs could of course work
around fonts that *do* show a glyph by rendering the characters before
and after the ZWNJ separately.

For read-only text (Info, Gnus) that is the behaviour that I would like.

For editing, I would like a hair-line type glyph to delete.  But I
personally can live with not showing a glyph, and deleting ZWNJ with the
character after it, so that X ZWNJ Y BACKSPACE results in "X".  I think
in this scenario multiple ZWNJs should be deleted as one.  This is
similar to composed characters, I think.
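
The deletion behaviour described above (X ZWNJ Y BACKSPACE results in "X", with multiple ZWNJs deleted as one) can be sketched in a few lines of Python. This is only an illustration of the proposed semantics on a plain string, not how Emacs implements deletion; the function name `backspace` is hypothetical.

```python
ZWNJ = "\u200c"


def backspace(text: str) -> str:
    """Delete the last character together with any ZWNJs directly before it.

    A trailing run of ZWNJs is treated as a single unit, so one call
    removes the whole run.
    """
    if not text:
        return text
    # Drop the last character, then every ZWNJ that is now at the end.
    return text[:-1].rstrip(ZWNJ)


print(repr(backspace("X" + ZWNJ + "Y")))
print(repr(backspace("X" + ZWNJ * 3 + "Y")))
```

With this rule the user never needs to know how many invisible ZWNJs precede the deleted character.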

We should also consider what to do about ZWJ and the bidi directional
control characters.  ZWJ handling must come from the font, I think, so
this really can only work when the font works right, but then ZWJ is
probably more rarely used, so it's ok not to try to work around bad
fonts.

Other issues (excuse the verbosity, most of you know all this already,
of course):

* Highlighting ZWNJ in read-only text while searching for it with
  incremental search.

* Read-only parts of buffers that are not completely read-only
  (Customize, minibuffer prompts).

* User-specified replacement via display tables. 

benny





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#28339; Package emacs. (Sat, 16 Sep 2017 17:44:01 GMT) Full text and rfc822 format available.

Message #80 received at 28339 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Benjamin Riefenstahl <b.riefenstahl <at> turtle-trading.net>
Cc: handa <at> gnu.org, nimawebgard <at> gmail.com, 28339 <at> debbugs.gnu.org
Subject: Re: bug#28339: 25.2;
 Emacs shows ZWNJ character (Zero Width non-Joiner) as Space
Date: Sat, 16 Sep 2017 20:42:54 +0300
> From: Benjamin Riefenstahl <b.riefenstahl <at> turtle-trading.net>
> Cc: handa <handa <at> gnu.org>,  nimawebgard <at> gmail.com,  28339 <at> debbugs.gnu.org
> Date: Sat, 16 Sep 2017 19:30:05 +0200
> 
> > [...] I thought the shaping engine gives us back glyphs that don't
> > include ZWNJ itself.  Evidently, that's not true, which I find
> > strange.
> 
> I thought that with OpenType at least that depends on the font?

It does, but Handa-san seems to say that even the best fonts don't
consider ZWNJ part of any grapheme cluster, and always leave it alone.

> For editing, I would like a hair-line type glyph to delete.

We already have a solution for deleting a character which was composed
with the preceding one(s).  So I think this aspect doesn't have to be
a factor in our decision how to display ZWNJ.

> We should also consider what to do about ZWJ and the bidi directional
> control characters.

Bidi controls are different in that they are never composed.  Their
effect is via the application of the UBA, and whether or not to
display them is explicitly left to the application to decide.

> * Highlighting ZWNJ in read-only text while searching for it with
>   incremental search.

In general, search should ignore ZWNJ and similar controls, at least
the "folding" search.

> * User-specified replacement via display tables. 

We already have the glyphless-char-display-control feature for that.
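
The remark above that a "folding" search should ignore ZWNJ and similar controls can be illustrated with a small Python sketch. It assumes, for the sake of the example, that "similar controls" means every character of Unicode general category Cf; the names `fold` and `folded_contains` are hypothetical and this is not Emacs's search code.

```python
import unicodedata


def fold(s: str) -> str:
    """Remove format characters (general category Cf), such as ZWNJ and ZWJ."""
    return "".join(ch for ch in s if unicodedata.category(ch) != "Cf")


def folded_contains(haystack: str, needle: str) -> bool:
    """Substring test that ignores format characters on both sides."""
    return fold(needle) in fold(haystack)


with_zwnj = "\u0646\u0645\u06cc\u200c\u0634\u0648\u062f"
without_zwnj = "\u0646\u0645\u06cc\u0634\u0648\u062f"

print(folded_contains(with_zwnj, without_zwnj))
print(without_zwnj in with_zwnj)
```

An explicit search for ZWNJ itself would have to bypass this folding and compare the raw strings.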




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#28339; Package emacs. (Sat, 16 Sep 2017 18:06:01 GMT) Full text and rfc822 format available.

Message #83 received at 28339 <at> debbugs.gnu.org (full text, mbox):

From: Benjamin Riefenstahl <b.riefenstahl <at> turtle-trading.net>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: handa <at> gnu.org, nimawebgard <at> gmail.com, 28339 <at> debbugs.gnu.org
Subject: Re: bug#28339: 25.2;
 Emacs shows ZWNJ character (Zero Width non-Joiner) as Space
Date: Sat, 16 Sep 2017 20:05:23 +0200
>> From: Benjamin Riefenstahl <b.riefenstahl <at> turtle-trading.net>
>> I thought that with OpenType at least that depends on the font?

Eli Zaretskii writes:
> It does, but Handa-san seems to say that even the best fonts don't
> consider ZWNJ part of any grapheme cluster, and always leave it alone.

It breaks shaping and ligatures, so the result is not a cluster by
definition, I think.  To the contrary, it breaks the cluster.  Or maybe
the terminology confuses me.

>> For editing, I would like a hair-line type glyph to delete.
>
> We already have a solution for deleting a character which was composed
> with the preceding one(s).  So I think this aspect doesn't have to be
> a factor in our decision how to display ZWNJ.

What I mean is, I would want to see something that I can delete (and
re-add) on its own, with the only other consequence that the neighboring
characters change shape.

>> * Highlighting ZWNJ in read-only text while searching for it with
>>   incremental search.
>
> In general, search should ignore ZWNJ and similar controls, at least
> the "folding" search.

I was thinking about searching explicitly for ZWNJ, e.g. to find and
delete wrong uses, or to find out how another author has achieved a
particular effect.

>> * User-specified replacement via display tables. 
>
> We already have the glyphless-char-display-control feature for that.

Right, I was thinking, if I wanted to implement ZWNJ as a glyph and I
did not want to rely on the font to provide a glyph, I would create my
own glyph and use that.  But I would want the user to be able to replace
it.

benny




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#28339; Package emacs. (Sat, 16 Sep 2017 18:22:02 GMT) Full text and rfc822 format available.

Message #86 received at 28339 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: b.riefenstahl <at> turtle-trading.net, nimawebgard <at> gmail.com
Cc: Kenichi Handa <handa <at> gnu.org>, 28339 <at> debbugs.gnu.org
Subject: Re: bug#28339: 25.2;
 Emacs shows ZWNJ character (Zero Width non-Joiner) as Space
Date: Sat, 16 Sep 2017 21:20:51 +0300
> Date: Sat, 16 Sep 2017 20:42:54 +0300
> From: Eli Zaretskii <eliz <at> gnu.org>
> Cc: nimawebgard <at> gmail.com, 28339 <at> debbugs.gnu.org
> 
> > From: Benjamin Riefenstahl <b.riefenstahl <at> turtle-trading.net>
> > Cc: handa <handa <at> gnu.org>,  nimawebgard <at> gmail.com,  28339 <at> debbugs.gnu.org
> > Date: Sat, 16 Sep 2017 19:30:05 +0200
> > 
> > > [...] I thought the shaping engine gives us back glyphs that don't
> > > include ZWNJ itself.  Evidently, that's not true, which I find
> > > strange.
> > 
> > I thought that with OpenType at least that depends on the font?
> 
> It does, but Handa-san seems to say that even the best fonts don't
> consider ZWNJ part of any grapheme cluster, and always leave it alone.

Well, "always" here means "always when Arabic script is being
rendered".




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#28339; Package emacs. (Sat, 16 Sep 2017 18:25:03 GMT) Full text and rfc822 format available.

Message #89 received at 28339 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Benjamin Riefenstahl <b.riefenstahl <at> turtle-trading.net>
Cc: handa <at> gnu.org, nimawebgard <at> gmail.com, 28339 <at> debbugs.gnu.org
Subject: Re: bug#28339: 25.2;
 Emacs shows ZWNJ character (Zero Width non-Joiner) as Space
Date: Sat, 16 Sep 2017 21:23:56 +0300
> From: Benjamin Riefenstahl <b.riefenstahl <at> turtle-trading.net>
> Cc: handa <at> gnu.org,  nimawebgard <at> gmail.com,  28339 <at> debbugs.gnu.org
> Date: Sat, 16 Sep 2017 20:05:23 +0200
> 
> >> From: Benjamin Riefenstahl <b.riefenstahl <at> turtle-trading.net>
> >> I thought that with OpenType at least that depends on the font?
> 
> Eli Zaretskii writes:
> > It does, but Handa-san seems to say that even the best fonts don't
> > consider ZWNJ part of any grapheme cluster, and always leave it alone.
> 
> It breaks shaping and ligatures, so the result is not a cluster by
> definition, I think.

It could be considered a cluster with the preceding character.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#28339; Package emacs. (Mon, 18 Sep 2017 01:54:02 GMT) Full text and rfc822 format available.

Message #92 received at 28339 <at> debbugs.gnu.org (full text, mbox):

From: handa <handa <at> gnu.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: b.riefenstahl <at> turtle-trading.net, nimawebgard <at> gmail.com,
 28339 <at> debbugs.gnu.org
Subject: Re: bug#28339: 25.2;
 Emacs shows ZWNJ character (Zero Width non-Joiner) as Space
Date: Mon, 18 Sep 2017 10:52:41 +0900
In article <83r2v6k09t.fsf <at> gnu.org>, Eli Zaretskii <eliz <at> gnu.org> writes:

> > > [...] I thought the shaping engine gives us back glyphs that don't
> > > include ZWNJ itself.  Evidently, that's not true, which I find
> > > strange.
> > 
> > I thought that with OpenType at least that depends on the font?

> It does, but Handa-san seems to say that even the best fonts don't
> consider ZWNJ part of any grapheme cluster, and always leave it alone.

I checked the GSUB table of the "Courier New" font (cour.ttf) using the
program ttx (included in the fonttools package of Ubuntu).  It surely
contains many rules with ZWNJ, but none of them involve Arabic
characters.  So, I suspect that the absorbing of ZWNJ for Arabic is done by
a layout engine (HarfBuzz? Uniscribe?) or by an application-level
library (Pango?).

> > For editing, I would like a hair-line type glyph to delete.

> We already have a solution for deleting a character which was composed
> with the preceding one(s).  So I think this aspect doesn't have to be
> a factor in our decision how to display ZWNJ.

Isn't there a case where ZWNJ is prepended to a character to change the
shape of the following character?

---
K. Handa
handa <at> gnu.org




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#28339; Package emacs. (Mon, 18 Sep 2017 15:17:02 GMT) Full text and rfc822 format available.

Message #95 received at 28339 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: handa <handa <at> gnu.org>
Cc: b.riefenstahl <at> turtle-trading.net, nimawebgard <at> gmail.com,
 28339 <at> debbugs.gnu.org
Subject: Re: bug#28339: 25.2;
 Emacs shows ZWNJ character (Zero Width non-Joiner) as Space
Date: Mon, 18 Sep 2017 18:16:32 +0300
> From: handa <handa <at> gnu.org>
> Cc: b.riefenstahl <at> turtle-trading.net, nimawebgard <at> gmail.com, 28339 <at> debbugs.gnu.org
> Date: Mon, 18 Sep 2017 10:52:41 +0900
> 
> > > I thought that with OpenType at least that depends on the font?
> 
> > It does, but Handa-san seems to say that even the best fonts don't
> > consider ZWNJ part of any grapheme cluster, and always leave it alone.
> 
> I checked the GSUB table of "Courier New" font (cour.ttf) using the
> program ttx (included in fonttools package of Ubuntu).  It surely
> contains many rules with ZWNJ, but none of them are with Arabic
> characters.  So, I suspect that absorbing of ZWNJ for Arabic is done by
> a layout engine (HarfBuzz? Uniscribe?) or by an application-level
> library (Pango?).
> 
> > > For editing, I would like a hair-line type glyph to delete.
> 
> > We already have a solution for deleting a character which was composed
> > with the preceding one(s).  So I think this aspect doesn't have to be
> > a factor in our decision how to display ZWNJ.
> 
> Isn't there a case where ZWNJ is prepended to a character to change the
> shape of the following character?

I don't see this in Unicode, but maybe I'm missing something.

Anyway, what would you suggest as a solution to this issue?  Should we
install the arabic-font-shape-gstring function into Emacs?  Do we need
to do something else in addition?  E.g., do we need to make the
display of ZWNJ optional?

Thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#28339; Package emacs. (Mon, 18 Sep 2017 15:23:02 GMT) Full text and rfc822 format available.

Message #98 received at 28339 <at> debbugs.gnu.org (full text, mbox):

From: Benjamin Riefenstahl <b.riefenstahl <at> turtle-trading.net>
To: handa <handa <at> gnu.org>
Cc: Eli Zaretskii <eliz <at> gnu.org>, nimawebgard <at> gmail.com, 28339 <at> debbugs.gnu.org
Subject: Re: bug#28339: 25.2;
 Emacs shows ZWNJ character (Zero Width non-Joiner) as Space
Date: Mon, 18 Sep 2017 17:22:04 +0200
Hi Handa,

NB: Take these remarks with the usual grain of salt, I have not read the
latest version of the standard about this.  I do have some practical
experience implementing my own fonts though, so I have thought about it
before from that perspective.

handa writes:
> I checked the GSUB table of "Courier New" font (cour.ttf) using the
> program ttx (included in fonttools package of Ubuntu).  It surely
> contains many rules with ZWNJ, but none of them are with Arabic
> characters.  So, I suspect that absorbing of ZWNJ for Arabic is done by
> a layout engine (HarfBuzz? Uniscribe?) or by an application-level
> library (Pango?).

The font doesn't need a specific rule, because "not shaping" is the
default in any combination that is not covered by rules.  It only needs
rules when it *does* want to do shaping.  ZWNJ can work just by keeping
the characters apart.  After the shaping rules have been applied the
question is, is ZWNJ just represented by an empty glyph, or do you have
a rule that drops the ZWNJ from the glyph list.

This is different from ZWJ, where you need specific rules to do shaping.
That is why I said we need the font for that, while we could get the
effect of ZWNJ by separate rendering.
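
The idea of getting the effect of ZWNJ by separate rendering can be sketched as splitting the text at each ZWNJ and handing every piece to the shaper on its own. This minimal Python example covers only the splitting step and assumes a shaper that gives each run its own initial and final forms; `shaping_runs` is a hypothetical name.

```python
ZWNJ = "\u200c"


def shaping_runs(text: str) -> list[str]:
    """Split text into runs that would each be shaped independently.

    The ZWNJs themselves are dropped, and empty runs (from leading,
    trailing or repeated ZWNJs) are discarded.
    """
    return [run for run in text.split(ZWNJ) if run]


word = "\u0646\u0645\u06cc" + ZWNJ + "\u0634\u0648\u062f"
print(shaping_runs(word))
```

Because no glyph is ever requested for ZWNJ, this approach would not depend on what the font defines for U+200C.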

> Isn't there a case where ZWNJ is prepended to a character to change the
> shape of the following character?

ZWNJ is only interesting between characters.  It does not have an effect
at the start or the end of a string.  It prevents changes which would
occur because of ligatures or shaping rules.  It should also prevent
composition with accents, although that is not the usual way to prevent
composition.  At least not in Western scripts; I don't know what Indic
scripts do or need.

ZWJ OTOH is before or after a character to force the application of
shaping.  It does not make sense with ligatures, because there the
effect depends on the other character, while with shaping there is a
standard effect.

benny




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#28339; Package emacs. (Tue, 19 Sep 2017 12:20:01 GMT) Full text and rfc822 format available.

Message #101 received at 28339 <at> debbugs.gnu.org (full text, mbox):

From: handa <handa <at> gnu.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: b.riefenstahl <at> turtle-trading.net, nimawebgard <at> gmail.com,
 28339 <at> debbugs.gnu.org
Subject: Re: bug#28339: 25.2;
 Emacs shows ZWNJ character (Zero Width non-Joiner) as Space
Date: Tue, 19 Sep 2017 21:18:31 +0900
In article <83h8w0hwa7.fsf <at> gnu.org>, Eli Zaretskii <eliz <at> gnu.org> writes:

> I don't see this in Unicode, but maybe I'm missing something.

> Anyway, what would you suggest as a solution to this issue?  Should we
> install the arabic-font-shape-gstring function into Emacs?  Do we need
> to do something else in addition?  E.g., do we need to make the
> display of ZWNJ optional?

As I don't know what the right thing is, I'm asking here.

If users never ever want to put the cursor on ZWNJ,
arabic-font-shape-gstring-ZWNJ-absorb is the solution.

If users want a thin space to be able to handle ZWNJ directly,
arabic-font-shape-gstring-ZWNJ-as-space is the solution.

If it depends on the situation or the user's preference, it is better to have
a user-customizable variable to switch between them.
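
The difference between the two behaviours can be shown on a toy model. The Python sketch below is not the code in arabic-shape.el: the `Glyph` class, the pixel widths, and the mode strings "absorb" and "as-space" (borrowed from the function name suffixes above) are all assumptions made for illustration.

```python
from dataclasses import dataclass, replace

ZWNJ = "\u200c"


@dataclass(frozen=True)
class Glyph:
    chars: str  # source characters covered by this glyph
    width: int  # advance width in pixels (made-up numbers)


def handle_zwnj(glyphs, mode=None):
    """Post-process a shaped glyph list according to a ZWNJ handling mode."""
    out = []
    for g in glyphs:
        if g.chars != ZWNJ:
            out.append(g)
        elif mode == "absorb" and out:
            # Merge ZWNJ into the previous cluster; nothing is drawn for it.
            prev = out.pop()
            out.append(replace(prev, chars=prev.chars + ZWNJ))
        elif mode == "as-space":
            # Keep ZWNJ editable on its own, shown as a 1-pixel space.
            out.append(Glyph(ZWNJ, 1))
        else:
            # No special handling: whatever glyph the font defines is shown.
            out.append(g)
    return out


shaped = [Glyph("a", 10), Glyph(ZWNJ, 6), Glyph("b", 10)]
for mode in (None, "absorb", "as-space"):
    print(mode, [g.width for g in handle_zwnj(shaped, mode)])
```

In the "absorb" case the cursor can no longer rest on ZWNJ, while "as-space" keeps it addressable at the cost of one pixel of width.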

By the way, I've just tried arabic-shape.el on Windows, and found that
arabic-font-shape-gstring-ZWNJ-as-space worked, which means the text
layout backend on Windows (Uniscribe?) also returns a ZWNJ glyph.  And,
without arabic-font-shape-gstring-ZWNJ-as-space, I see a strange cursor
display.  Is it the "glitch" you mentioned?

---
K. Handa
handa <at> gnu.org




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#28339; Package emacs. (Wed, 20 Sep 2017 07:27:02 GMT) Full text and rfc822 format available.

Message #104 received at 28339 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: handa <handa <at> gnu.org>
Cc: b.riefenstahl <at> turtle-trading.net, nimawebgard <at> gmail.com,
 28339 <at> debbugs.gnu.org
Subject: Re: bug#28339: 25.2;
 Emacs shows ZWNJ character (Zero Width non-Joiner) as Space
Date: Wed, 20 Sep 2017 10:25:57 +0300
> From: handa <handa <at> gnu.org>
> Cc: b.riefenstahl <at> turtle-trading.net, nimawebgard <at> gmail.com,
> 	28339 <at> debbugs.gnu.org
> Date: Tue, 19 Sep 2017 21:18:31 +0900
> 
> In article <83h8w0hwa7.fsf <at> gnu.org>, Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> > I don't see this in Unicode, but maybe I'm missing something.
> 
> > Anyway, what would you suggest as a solution to this issue?  Should we
> > install the arabic-font-shape-gstring function into Emacs?  Do we need
> > to do something else in addition?  E.g., do we need to make the
> > display of ZWNJ optional?
> 
> As I don't know what the right thing is, I'm asking here.
> 
> If users never want to put the cursor on ZWNJ,
> arabic-font-shape-gstring-ZWNJ-absorb is the solution.
> 
> If users want a thin space so that they can handle ZWNJ directly,
> arabic-font-shape-gstring-ZWNJ-as-space is the solution.
> 
> If it depends on the situation or the user's preference, it is better to have
> a user-customizable variable to switch between them.

Maybe we should go with an option.  I will try to come up with a patch
for that.

> By the way, I've just tried arabic-shape.el on Windows, and found that
> arabic-font-shape-gstring-ZWNJ-as-space worked, which means the text
> layout backend on Windows (Uniscribe?) also returns a ZWNJ glyph.

Yes, it does.

> And, without arabic-font-shape-gstring-ZWNJ-as-space, I see a
> strange cursor display.  Is it the "glitch" you mentioned?

Yes.  Do you understand the reason for that?

Thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#28339; Package emacs. (Fri, 06 Oct 2017 10:07:01 GMT)

Message #107 received at 28339 <at> debbugs.gnu.org:

From: handa <handa <at> gnu.org>
To: Nima Aryan <nimawebgard <at> gmail.com>
Cc: eliz <at> gnu.org, b.riefenstahl <at> turtle-trading.net, 28339 <at> debbugs.gnu.org
Subject: Re: bug#28339: 25.2;
 Emacs shows ZWNJ character (Zero Width non-Joiner) as Space
Date: Fri, 06 Oct 2017 19:05:41 +0900
In article <CALp2H_2MWgjoEEm6Rp5+5uOdMk-RbFWzaCrweo=pbdzAaq8btA <at> mail.gmail.com>, Nima Aryan <nimawebgard <at> gmail.com> writes:

> As a user I prefer absorb mode by default, but sometimes a thin space (and
> not a simple space) might be a good option to consider.

Attached patch introduces a customizable variable
arabic-shaper-ZWNJ-handling.  Shall I install it?

---
K. Handa
handa <at> gnu.org

------------------------------------------------------------
diff --git a/lisp/composite.el b/lisp/composite.el
index ab39e08..72b0ffc 100644
--- a/lisp/composite.el
+++ b/lisp/composite.el
@@ -442,8 +442,10 @@ lglyph-set-width
 (defsubst lglyph-set-adjustment (glyph &optional xoff yoff wadjust)
   (aset glyph 9 (vector (or xoff 0) (or yoff 0) (or wadjust 0))))
 
+;; Return a shallow copy of GLYPH.
 (defsubst lglyph-copy (glyph) (copy-sequence glyph))
 
+;; Insert GLYPH at the index IDX of GSTRING.
 (defun lgstring-insert-glyph (gstring idx glyph)
   (let ((nglyphs (lgstring-glyph-len gstring))
 	(i idx))
@@ -459,6 +461,18 @@ lgstring-insert-glyph
     (lgstring-set-glyph gstring i glyph)
     gstring))
 
+;; Remove glyph at IDX from GSTRING.
+(defun lgstring-remove-glyph (gstring idx)
+  (setq gstring (copy-sequence gstring))
+  (lgstring-set-id gstring nil)
+  (let ((len (length gstring)))
+    (setq idx (+ idx 3))
+    (while (< idx len)
+      (aset gstring (1- idx) (aref gstring idx))
+      (setq idx (1+ idx)))
+    (aset gstring (1- len) nil))
+  gstring)
+
 (defun compose-glyph-string (gstring from to)
   (let ((glyph (lgstring-glyph gstring from))
 	from-pos to-pos)
diff --git a/lisp/language/misc-lang.el b/lisp/language/misc-lang.el
index 2843c7c..4e10227 100644
--- a/lisp/language/misc-lang.el
+++ b/lisp/language/misc-lang.el
@@ -75,12 +75,72 @@ 'cp1256
 	    (sample-text . "Persian	فارسی")
 	    (documentation . "Bidirectional editing is supported.")))
 
+(defcustom arabic-shaper-ZWNJ-handling nil
+  "How to handle ZWNJ in Arabic text rendering.
+This variable controls the way to handle a glyph for ZWNJ
+returned by the underlying shaping engine.
+
+The default value is nil, which means that the ZWNJ glyph is
+displayed as is.
+
+If the value is `absorb', ZWNJ is absorbed into the previous
+grapheme cluster, and not displayed.
+
+If the value is `as-space', the glyph is displayed by a
+thin (i.e. 1-dot width) space.
+
+Customizing the value takes effect when you start Emacs next time."
+  :group 'mule
+  :version "27.1"
+  :type '(choice
+          (const :tag "default" nil)
+          (const :tag "as space" as-space)
+          (const :tag "absorb" absorb)))
+
+(defvar arabic-shape-log nil)
+
+(defun arabic-shape-gstring (gstring)
+  (setq gstring (font-shape-gstring gstring))
+  (push arabic-shaper-ZWNJ-handling arabic-shape-log)
+  (condition-case err
+      (when arabic-shaper-ZWNJ-handling
+        (let ((font (lgstring-font gstring))
+              (i 1)
+              (len (lgstring-glyph-len gstring))
+              (modified nil))
+          (while (< i len)
+            (let ((glyph (lgstring-glyph gstring i)))
+              (when (eq (lglyph-char glyph) #x200c)
+                (cond
+                 ((eq arabic-shaper-ZWNJ-handling 'as-space)
+                  (if (> (- (lglyph-rbearing glyph) (lglyph-lbearing glyph)) 0)
+                      (let ((space-glyph (aref (font-get-glyphs font 0 1 " ") 0)))
+                        (when space-glyph
+                          (lglyph-set-code glyph (aref space-glyph 3))
+                          (lglyph-set-width glyph (aref space-glyph 4)))))
+                  (lglyph-set-adjustment glyph 0 0 1)
+                  (setq modified t))
+                 ((eq arabic-shaper-ZWNJ-handling 'absorb)
+                  (let ((prev (lgstring-glyph gstring (1- i))))
+                    (lglyph-set-from-to prev (lglyph-from prev) (lglyph-to glyph))
+                    (push (cons "remove" (lgstring-glyph gstring i))
+                          arabic-shape-log)
+                    (setq gstring (lgstring-remove-glyph gstring i))
+                    (setq len (1- len)))
+                  (setq modified t)))))
+            (setq i (1+ i)))
+          (if modified
+              (lgstring-set-id gstring nil))))
+    (error (push err arabic-shape-log)))
+  gstring)
+
 (set-char-table-range
  composition-function-table
  '(#x600 . #x74F)
- (list (vector "[\u0600-\u074F\u200C\u200D]+" 0 'font-shape-gstring)
-       (vector "[\u200C\u200D][\u0600-\u074F\u200C\u200D]+"
-               1 'font-shape-gstring)))
+ (list (vector "[\u0600-\u074F\u200C\u200D]+" 0
+               'arabic-shape-gstring)
+       (vector "[\u200C\u200D][\u0600-\u074F\u200C\u200D]+" 1
+               'arabic-shape-gstring)))
 
 (provide 'misc-lang)
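
The index shifting done by lgstring-remove-glyph in the patch above can be
illustrated on a plain vector.  This is only a sketch and not part of the
patch: demo-vector-remove-at is a hypothetical name, and it operates on an
ordinary vector rather than a real gstring.  The patch starts its loop at
IDX + 3 rather than IDX + 1, presumably because the glyph slots of a gstring
come after two leading slots (the header and the ID that lgstring-set-id
resets).

```elisp
;; Hypothetical stand-alone illustration; not part of the posted patch.
(defun demo-vector-remove-at (vec idx)
  "Return a copy of VEC with the element at IDX removed.
Later elements shift left by one slot and the last slot becomes nil,
so the length of the returned vector equals the length of VEC."
  (let* ((copy (copy-sequence vec))   ; leave the caller's vector untouched
         (len (length copy))
         (i (1+ idx)))
    ;; Move every element after IDX one slot to the left.
    (while (< i len)
      (aset copy (1- i) (aref copy i))
      (setq i (1+ i)))
    ;; The final slot is now a duplicate; blank it out.
    (aset copy (1- len) nil)
    copy))

(demo-vector-remove-at [a b c d] 1)   ; evaluates to [a c d nil]
```

Like the function in the patch, the sketch keeps the vector length fixed and
pads with nil instead of allocating a shorter vector.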
 






Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#28339; Package emacs. (Fri, 06 Oct 2017 12:15:03 GMT)

Message #110 received at 28339 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: handa <handa <at> gnu.org>
Cc: b.riefenstahl <at> turtle-trading.net, nimawebgard <at> gmail.com,
 28339 <at> debbugs.gnu.org
Subject: Re: bug#28339: 25.2;
 Emacs shows ZWNJ character (Zero Width non-Joiner) as Space
Date: Fri, 06 Oct 2017 15:14:14 +0300
> From: handa <handa <at> gnu.org>
> Cc: b.riefenstahl <at> turtle-trading.net, eliz <at> gnu.org, 28339 <at> debbugs.gnu.org
> Date: Fri, 06 Oct 2017 19:05:41 +0900
> 
> > As a user I prefer absorb mode by default, but sometimes a thin space (and
> > not a simple space) might be a good option to consider.
> 
> Attached patch introduces a customizable variable
> arabic-shaper-ZWNJ-handling.  Shall I install it?

Yes, please install on the emacs-26 branch.  And please change the
:version tag to say "26.1" instead.

Thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#28339; Package emacs. (Sat, 07 Oct 2017 01:12:02 GMT)

Message #113 received at 28339 <at> debbugs.gnu.org:

From: handa <handa <at> gnu.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: b.riefenstahl <at> turtle-trading.net, nimawebgard <at> gmail.com,
 28339 <at> debbugs.gnu.org
Subject: Re: bug#28339: 25.2;
 Emacs shows ZWNJ character (Zero Width non-Joiner) as Space
Date: Sat, 07 Oct 2017 10:11:19 +0900
In article <83o9pkv5gp.fsf <at> gnu.org>, Eli Zaretskii <eliz <at> gnu.org> writes:
> > Attached patch introduces a customizable variable
> > arabic-shaper-ZWNJ-handling.  Shall I install it?

> Yes, please install on the emacs-26 branch.  And please change the
> :version tag to say "26.1" instead.

OK, but I haven't committed a change to Emacs in a long time, and I don't
know how changelog entries are handled these days.  Which document should
I read to learn the current practice?

---
K. Handa
handa <at> gnu.org




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#28339; Package emacs. (Fri, 04 Sep 2020 05:14:02 GMT)

Message #116 received at 28339 <at> debbugs.gnu.org:

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: handa <handa <at> gnu.org>, b.riefenstahl <at> turtle-trading.net,
 nimawebgard <at> gmail.com, 28339 <at> debbugs.gnu.org
Subject: Re: bug#28339: 25.2; Emacs shows ZWNJ character (Zero Width
 non-Joiner) as Space
Date: Fri, 04 Sep 2020 07:12:57 +0200
Eli Zaretskii <eliz <at> gnu.org> writes:

>> Attached patch introduces a customizable variable
>> arabic-shaper-ZWNJ-handling.  Shall I install it?
>
> Yes, please install on the emacs-26 branch.  And please change the
> :version tag to say "26.1" instead.

Reading this thread, it looks like the patch was applied and the bug was
fixed, so I'm closing this bug report now.
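
For readers who want to try the option discussed in this thread, a minimal
init-file sketch follows.  It assumes an Emacs build that includes the patch
as posted here; the variable name and its values (nil, `absorb', `as-space')
are taken from the patch's docstring, which also says that a changed value
takes effect the next time Emacs starts.

```elisp
;; In the init file.  Assumes the option from the patch in this thread exists.
;; `absorb' hides ZWNJ by merging it into the previous grapheme cluster;
;; use `as-space' for a thin space, or nil to display the glyph as is.
(setq arabic-shaper-ZWNJ-handling 'absorb)
```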

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




bug closed, send any further explanations to 28339 <at> debbugs.gnu.org and Nima Aryan <nimawebgard <at> gmail.com> Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Fri, 04 Sep 2020 05:14:02 GMT)

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Fri, 02 Oct 2020 11:24:06 GMT)

This bug report was last modified 3 years and 206 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.