GNU bug report logs - #47767
28.0.50; japanese-hankaku unnatural or misconversion results

Please note: This is a static page, with minimal formatting, updated once a day.
Click here to see this page with the latest information and nicer formatting.

Package: emacs; Reported by: Kazuhiro Ito <kzhr@HIDDEN>; Keywords: patch fixed; Done: Lars Ingebrigtsen <larsi@HIDDEN>; Maintainer for emacs is bug-gnu-emacs@HIDDEN.
bug marked as fixed in version 28.1, send any further explanations to 47767 <at> debbugs.gnu.org and Kazuhiro Ito <kzhr@HIDDEN> Request was from Lars Ingebrigtsen <larsi@HIDDEN> to control <at> debbugs.gnu.org. Full text available.
Added tag(s) fixed. Request was from Lars Ingebrigtsen <larsi@HIDDEN> to control <at> debbugs.gnu.org. Full text available.

Message received at 47767 <at> debbugs.gnu.org:


From: Lars Ingebrigtsen <larsi@HIDDEN>
To: Kazuhiro Ito <kzhr@HIDDEN>
Subject: Re: bug#47767: 28.0.50; japanese-hankaku unnatural or misconversion
 results
References: <86tuo96xz9.wl--xmue@HIDDEN>
X-Now-Playing: Chrome Hoof's _Pre-Emptive False Rapture_: "Spokes of Uridium"
Date: Wed, 05 May 2021 17:10:31 +0200
In-Reply-To: <86tuo96xz9.wl--xmue@HIDDEN> (Kazuhiro Ito's message of
 "Wed, 14 Apr 2021 18:01:46 +0900")
Message-ID: <87pmy543o8.fsf@HIDDEN>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)
Cc: 47767 <at> debbugs.gnu.org

Kazuhiro Ito <kzhr@HIDDEN> writes:

> 1. japanese-hankaku[-region] with ASCII-ONLY option converts Japanese
> punctuation characters and the prolonged sound mark to ASCII.
>
> (japanese-hankaku "ケーキ、ドーナツ。" t)
> -> "ケ-キ,ド-ナツ."
>
> The result is very unnatural because "ー", "、" and "。" are normally
> used among Japanese characters, which are not converted to hankaku in
> this case.  I hope they are kept as is (the result should be
> "ケーキ、ドーナツ。").
>
> 2. japanese-hankaku[-region] without ASCII-ONLY option and
> japanese-zenkaku[-region] fail to convert Latin punctuation.
>
> (japanese-zenkaku "A, B, C.")
> -> "Ａ、　Ｂ、　Ｃ。"
>
> (japanese-hankaku "Ａ，Ｂ，Ｃ．")
> -> "A､B､C｡"
>
> They should be "Ａ，　Ｂ，　Ｃ．" and "A,B,C." respectively.
>
> The patch below fixes these problems.

Thanks; applied to Emacs 28.

I don't know Japanese, but since nobody else has piped up about this
patch in three weeks, I'm applying it.  If other Japanese-speaking
people disagree with this patch, we can revert it.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Information forwarded to bug-gnu-emacs@HIDDEN:
bug#47767; Package emacs. Full text available.
Added tag(s) patch. Request was from Stefan Kangas <stefan@HIDDEN> to control <at> debbugs.gnu.org. Full text available.

Message received at submit <at> debbugs.gnu.org:


Date: Wed, 14 Apr 2021 18:01:46 +0900
Message-ID: <86tuo96xz9.wl--xmue@HIDDEN>
From: Kazuhiro Ito <kzhr@HIDDEN>
To: bug-gnu-emacs@HIDDEN
Subject: 28.0.50; japanese-hankaku unnatural or misconversion results
User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue)
 FLIM-LB/1.14.9 (Gojō) APEL-LB/10.8 EasyPG/1.0.0
 Emacs/28.0.50 (x86_64-w64-mingw32) MULE/6.0 (HANACHIRUSATO)


1. japanese-hankaku[-region] with ASCII-ONLY option converts Japanese
punctuation characters and the prolonged sound mark to ASCII.

(japanese-hankaku "ケーキ、ドーナツ。" t)
-> "ケ-キ,ド-ナツ."

The result is very unnatural because "ー", "、" and "。" are normally
used among Japanese characters, which are not converted to hankaku in
this case.  I hope they are kept as is (the result should be
"ケーキ、ドーナツ。").


2. japanese-hankaku[-region] without ASCII-ONLY option and
japanese-zenkaku[-region] fail to convert Latin punctuation.

(japanese-zenkaku "A, B, C.")
-> "Ａ、　Ｂ、　Ｃ。"

(japanese-hankaku "Ａ，Ｂ，Ｃ．")
-> "A､B､C｡"

They should be "Ａ，　Ｂ，　Ｃ．" and "A,B,C." respectively.


The patch below fixes these problems.


diff --git a/lisp/language/japan-util.el b/lisp/language/japan-util.el
index 3f1fb2b749..8b80599d99 100644
--- a/lisp/language/japan-util.el
+++ b/lisp/language/japan-util.el
@@ -96,9 +96,9 @@ japanese-kana-table
 	  (put-char-code-property jisx0201 'jisx0208 katakana)))))
 
 (defconst japanese-symbol-table
-  '((?\　 ?\ ) (?， ?, ?､) (?． ?. ?｡) (?、 ?, ?､) (?。 ?. ?｡) (?・ nil ?･)
+  '((?\　 ?\ ) (?， ?,) (?． ?.) (?、 nil ?､) (?。 nil ?｡) (?・ nil ?･)
     (?： ?:) (?； ?\;) (?？ ??) (?！ ?!) (?゛ nil ?ﾞ) (?゜ nil ?ﾟ)
-    (?´ ?') (?｀ ?`) (?＾ ?^) (?＿ ?_) (?ー ?- ?ｰ) (?— ?-) (?‐ ?-)
+    (?´ ?') (?｀ ?`) (?＾ ?^) (?＿ ?_) (?ー nil ?ｰ) (?— ?-) (?‐ ?-)
     (?／ ?/) (?＼ ?\\) (?〜 ?~)  (?｜ ?|) (?‘ ?`) (?’ ?') (?“ ?\") (?” ?\")
     (?\（ ?\() (?\） ?\)) (?\［ ?\[) (?\］ ?\]) (?\｛ ?{) (?\｝ ?})
     (?〈 ?<) (?〉 ?>) (?\「 nil ?\｢) (?\」 nil ?\｣)


-- 
Kazuhiro Ito
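
The effect of the table change can be illustrated with a small
standalone sketch. It is not the code in japan-util.el: the names
sketch-letters, sketch-table-before, sketch-table-after and
sketch-hankaku are hypothetical, only the entries touched by the patch
are modelled, and the lookup order (JISX0201 column before the ASCII
column when ASCII-ONLY is nil) is an assumption inferred from the
pre-patch output quoted in the report.

```elisp
;; Illustrative sketch only; not the code in lisp/language/japan-util.el.
;; Entries follow the (ZENKAKU ASCII JISX0201) layout visible in the diff.
;; The letter entries are added here only so the example strings convert;
;; they are not part of the symbol table hunk shown in the patch.

(defconst sketch-letters '((?Ａ ?A) (?Ｂ ?B) (?Ｃ ?C)))

(defconst sketch-table-before
  (append '((?， ?, ?､) (?． ?. ?｡) (?、 ?, ?､) (?。 ?. ?｡) (?ー ?- ?ｰ))
          sketch-letters)
  "Subset of the symbol entries as they were before the patch.")

(defconst sketch-table-after
  (append '((?， ?,) (?． ?.) (?、 nil ?､) (?。 nil ?｡) (?ー nil ?ｰ))
          sketch-letters)
  "The same entries as changed by the patch.")

(defun sketch-hankaku (string table &optional ascii-only)
  "Narrow the characters of STRING using TABLE (hypothetical helper).
Unless ASCII-ONLY is non-nil, the JISX0201 column is tried first; that
order is an assumption inferred from the pre-patch output in the report."
  (concat
   (mapcar (lambda (ch)
             (let ((entry (assq ch table)))
               (or (and (not ascii-only) (nth 2 entry))
                   (nth 1 entry)
                   ch)))
           string)))

;; Case 1 (ASCII-ONLY): Japanese punctuation is no longer forced to ASCII.
(sketch-hankaku "ケーキ、ドーナツ。" sketch-table-before t) ; => "ケ-キ,ド-ナツ."
(sketch-hankaku "ケーキ、ドーナツ。" sketch-table-after t)  ; => "ケーキ、ドーナツ。"

;; Case 2: full-width Latin punctuation now narrows to ASCII.
(sketch-hankaku "Ａ，Ｂ，Ｃ．" sketch-table-before)         ; => "A､B､C｡"
(sketch-hankaku "Ａ，Ｂ，Ｃ．" sketch-table-after)          ; => "A,B,C."
```

In this model, an entry whose ASCII column is nil is left untouched when
ASCII-ONLY is set, and an entry without a JISX0201 column falls back to
its ASCII form. Those two cases correspond to the two problems described
in the report.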




Acknowledgement sent to Kazuhiro Ito <kzhr@HIDDEN>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs@HIDDEN. Full text available.
Report forwarded to bug-gnu-emacs@HIDDEN:
bug#47767; Package emacs. Full text available.
Last modified: Wed, 5 May 2021 15:15:01 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.