GNU bug report logs - #29364
27.0.50; Word Wrap doesn't support CJK

Package: emacs

Reported by: Chunyang Xu <mail <at> xuchunyang.me>

Date: Mon, 20 Nov 2017 12:36:01 UTC

Severity: wishlist

Tags: fixed

Found in version 27.0.50

Fixed in version 28.1

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 29364 in the body.
You can then email your comments to 29364 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#29364; Package emacs. (Mon, 20 Nov 2017 12:36:01 GMT)

Acknowledgement sent to Chunyang Xu <mail <at> xuchunyang.me>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Mon, 20 Nov 2017 12:36:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Chunyang Xu <mail <at> xuchunyang.me>
To: bug-gnu-emacs <at> gnu.org
Subject: 27.0.50; Word Wrap doesn't support CJK
Date: Mon, 20 Nov 2017 20:34:48 +0800

Hello.

For example, here is some Chinese text:

文字文字 English 文字文字文字文字文字文字文字文字文字文字文字文字文字文字文字文字文字文字文字文字文字文字。

In Emacs, turn on word wrap via M-x visual-line-mode and set the window
width (e.g., to 87 columns); Emacs then displays it as:

---------------------------------------------------------------------------------------
文字文字 English
文字文字文字文字文字文字文字文字文字文字文字文字文字文字文字文字文字文字文字文字文字文
字。
---------------------------------------------------------------------------------------

But it should display:

---------------------------------------------------------------------------------------
文字文字 English 文字文字文字文字文字文字文字文字文字文字文字文字文字文字文字文字文字文
字文字文字文字文字。
---------------------------------------------------------------------------------------

The point is that Chinese does not rely on spaces to break lines: it can
break almost anywhere, with some exceptions. For example, a punctuation
mark must not land at the beginning of a line after breaking, so the
following is also incorrect:

-----------------------------------------------------
文字文字 English 文字文字文字文字文字文字文字文字文字
文字文字文字文字文字文字文字文字文字文字文字文字文字
。
-----------------------------------------------------

For this width, it should display:

-----------------------------------------------------
文字文字 English 文字文字文字文字文字文字文字文字文字
文字文字文字文字文字文字文字文字文字文字文字文字文
字。
-----------------------------------------------------


Some applications, such as Chrome (web browser) and Atom (text editor), can
wrap CJK text; Atom added CJK wrapping support in this pull request:

- https://github.com/atom/atom/pull/9162

There is an article from Wikipedia explaining how to break CJK text:

- https://en.wikipedia.org/wiki/Line_breaking_rules_in_East_Asian_languages

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#29364; Package emacs. (Mon, 20 Nov 2017 18:23:01 GMT)

Message #8 received at 29364 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Chunyang Xu <mail <at> xuchunyang.me>
Cc: 29364 <at> debbugs.gnu.org
Subject: Re: bug#29364: 27.0.50; Word Wrap doesn't support CJK
Date: Mon, 20 Nov 2017 20:22:09 +0200

> From: Chunyang Xu <mail <at> xuchunyang.me>
> Date: Mon, 20 Nov 2017 20:34:48 +0800
> 
> The point is that Chinese does not rely on spaces to break lines: it can
> break almost anywhere, with some exceptions. For example, a punctuation
> mark must not land at the beginning of a line after breaking, so the
> following is also incorrect:
> 
> -----------------------------------------------------
> 文字文字 English 文字文字文字文字文字文字文字文字文字
> 文字文字文字文字文字文字文字文字文字文字文字文字文字
> 。
> -----------------------------------------------------
> 
> For this width, it should display:
> 
> -----------------------------------------------------
> 文字文字 English 文字文字文字文字文字文字文字文字文字
> 文字文字文字文字文字文字文字文字文字文字文字文字文
> 字。
> -----------------------------------------------------
> 
> 
> Some applications, such as Chrome (web browser) and Atom (text editor), can
> wrap CJK text; Atom added CJK wrapping support in this pull request:
> 
> - https://github.com/atom/atom/pull/9162
> 
> There is an article from Wikipedia explaining how to break CJK text:
> 
> - https://en.wikipedia.org/wiki/Line_breaking_rules_in_East_Asian_languages

Emacs has an implementation of those rules in kinsoku.el.  Word-wrap
is done on the C level in the display engine; volunteers are welcome
to submit patches for honoring Kinsoku rules on that level.  The
challenge is to do so without slowing down redisplay.
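
As a rough illustration of what kinsoku.el exposes at the Lisp level, the sketch below checks whether a character carries the `>` category (must not start a line) or the `<` category (must not end a line). The helper names `my/kinsoku-bol-p` and `my/kinsoku-eol-p` are hypothetical, and the sketch assumes that loading kinsoku.el assigns these categories to common CJK punctuation. It says nothing about how the display engine itself would consult them.

```elisp
;; Sketch only: inspect the line-breaking categories that kinsoku.el
;; assigns to characters.  The helper names are hypothetical.
(require 'kinsoku)

(defun my/kinsoku-bol-p (char)
  "Return non-nil if CHAR has the `>' category (must not start a line)."
  (aref (char-category-set char) ?>))

(defun my/kinsoku-eol-p (char)
  "Return non-nil if CHAR has the `<' category (must not end a line)."
  (aref (char-category-set char) ?<))

;; Evaluate in *scratch* or with M-: to inspect individual characters.
(list (my/kinsoku-bol-p ?。)    ; IDEOGRAPHIC FULL STOP
      (my/kinsoku-bol-p ?字)    ; an ordinary ideograph
      (my/kinsoku-eol-p ?「))   ; LEFT CORNER BRACKET
```

A category set is a bool-vector indexed by the category's mnemonic character, which is why plain `aref` is enough here.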

Thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#29364; Package emacs. (Thu, 04 Feb 2021 14:06:01 GMT)

Message #11 received at 29364 <at> debbugs.gnu.org:

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Chunyang Xu <mail <at> xuchunyang.me>
Cc: 29364 <at> debbugs.gnu.org
Subject: Re: bug#29364: 27.0.50; Word Wrap doesn't support CJK
Date: Thu, 04 Feb 2021 15:05:33 +0100

Chunyang Xu <mail <at> xuchunyang.me> writes:

> For example, here is some Chinese text:
>
> 文字文字 English 文字文字文字文字文字文字文字文字文字文字文字文字文字
> 文字文字文字文字文字文字文字文字文字。
>
> In Emacs, turn on word wrap via M-x visual-line-mode and set the window
> width (e.g., to 87 columns); Emacs then displays it as:
>
> ---------------------------------------------------------------------------------------
> 文字文字 English
> 文字文字文字文字文字文字文字文字文字文字文字文字文字文字文字文字文字文
> 字文字文字文字文
> 字。

This has been fixed in Emacs 28.  You can now say
(setq word-wrap-by-category t) to make Emacs respect kinsoku rules when
wrapping lines visually.
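
A minimal init-file sketch of that setting follows. The `boundp` guard, the explicit `(require 'kinsoku)`, and the `text-mode-hook` line are assumptions added for illustration. The guard keeps the snippet harmless on Emacs versions older than 28, and the docstring of `word-wrap-by-category` appears to say that kinsoku.el is loaded automatically only when the option is set through Customize.

```elisp
;; Sketch only: enable category-based word wrap where it is available.
(when (boundp 'word-wrap-by-category)
  ;; Assumption: load kinsoku.el explicitly so that its rules apply even
  ;; though the variable is set with `setq' rather than through Customize.
  (require 'kinsoku)
  (setq word-wrap-by-category t))

;; Visual Line mode turns on `word-wrap' in the buffers where it is enabled.
(add-hook 'text-mode-hook #'visual-line-mode)
```

Setting the option through M-x customize-variable instead should make the explicit `require` unnecessary.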

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

Added tag(s) fixed. Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Thu, 04 Feb 2021 14:06:02 GMT)

bug marked as fixed in version 28.1, send any further explanations to 29364 <at> debbugs.gnu.org and Chunyang Xu <mail <at> xuchunyang.me>. Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Thu, 04 Feb 2021 14:06:02 GMT)

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Fri, 05 Mar 2021 12:24:07 GMT)

This bug report was last modified 3 years and 45 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.