GNU bug report logs - #3616
23.0.94; vc-bzr coding system bug

Package: emacs;

Reported by: 端瑞 <duanpanda <at> gmail.com>

Date: Fri, 19 Jun 2009 08:30:03 UTC

Severity: normal

Done: Sean Whitton <spwhitton <at> spwhitton.name>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 3616 in the body.
You can then email your comments to 3616 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-submit-list <at> lists.donarmstrong.com, Emacs Bugs <bug-gnu-emacs <at> gnu.org>:
bug#3616; Package emacs. (Fri, 19 Jun 2009 08:30:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to 端瑞 <duanpanda <at> gmail.com>:
New bug report received and forwarded. Copy sent to Emacs Bugs <bug-gnu-emacs <at> gnu.org>. (Fri, 19 Jun 2009 08:30:03 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> emacsbugs.donarmstrong.com (full text, mbox):

From: 端瑞 <duanpanda <at> gmail.com>
To: emacs-pretest-bug <at> gnu.org
Subject: 23.0.94; vc-bzr coding system bug
Date: Fri, 19 Jun 2009 16:24:37 +0800
In short: when the commit message is in Chinese, the commit fails.
My Bazaar version is 1.15.

Below are three Bazaar sessions with their bzr logs; sessions 1 and 3 show the bug.
Apparently the command string that Emacs passes to Bazaar is wrongly encoded.

1. When I ran bzr commit in the Emacs *shell* buffer with a commit
message written in Chinese, the result was:
--------------------------------
d:\ehome\5-Dev\Mobile\Brew\Ver0.9\ehome>bzr commit -m
"修正了联系人列表的焦点移动问题和按#键时的菜单更新问题。"
bzr commit -m "淇  浜����绯讳汉���琛ㄧ�������圭Щ��ㄩ��棰������?�� �剁����������存�伴��棰�� ?
Traceback (most recent call last):
 File "bzr", line 130, in <module>
 File "bzrlib\commands.pyo", line 969, in main
bzrlib.errors.BzrError: Parameter
''\xe4\xbf\xae\xe6\xad\xa3\xe4\xba\x86\xe8\x81\x94\xe7\xb3\xbb\xe4\xba\xba\xe5\x88\x97\xe8\xa1\xa8\xe7\x9a\x84\xe7\x84\xa6\xe7\x82\xb9\xe7\xa7\xbb\xe5\x8a\xa8\xe9\x97\xae\xe9\xa2\x98\xe5\x92\x8c\xe6\x8c?\xe9\x94\xae\xe6\x97\xb6\xe7\x9a\x84\xe8\x8f\x9c\xe5\x8d\x95\xe6\x9b\xb4\xe6\x96\xb0\xe9\x97\xae\xe9\xa2\x98\xe3\x80?''
is unsupported by the current encoding.
--------------------------------
I ran the above command twice, with the same result both times.
Neither run left an entry in the file .bzr.log.

2. The commit succeeded only after I changed the commit message to English:
--------------------------------
d:\ehome\5-Dev\Mobile\Brew\Ver0.9\ehome>bzr commit -m "Fixed the focus
move problems in the contact list and the update problem on the menu
bar when # key is clicked."
bzr commit -m "Fixed the focus move problems in the contact list and
the update problem on the menu bar when # key is clicked."
Committing to: D:/ehome/5-Dev/Mobile/Brew/Ver0.9/ehome/
modified .bzrignore
modified App/ChattingListBox.h
modified App/ChattingListBox.inl
modified App/EhomeAppLayer.h
modified App/EhomeAppLayer.inl
modified App/EhomeAppLayer_Session.inl
modified App/SipPriorityTable.h
modified App/StructInfo/Message.h
modified App/StructInfo/Message.inl
modified App/StructInfo/session.inl
modified AppUI/ChattingListBoxItem.inl
modified AppUI/ChattingTabPage.h
modified AppUI/ChattingTabPage.inl
modified AppUI/ContactsListTabPage.inl
modified AppUI/ContactsListTree.inl
modified AppUI/FormChat.inl
modified AppUI/MsgItemTrans.inl
modified AppUI/SendTakePic.inl
modified AppUI/SetInfoTakePic.inl
modified AppUI/TreeViewWithAds.h
modified AppUI/TreeViewWithAds.inl
modified AppUI/WithAdItemListbox.inl
modified AppUI/AdsUI/AdsAction.h
modified AppUI/AdsUI/AdsAction.inl
modified AppUI/AdsUI/AdsComm.inl
modified AppUI/AdsUI/AdsHttpRequest.h
modified AppUI/AdsUI/AdsHttpRequest.inl
modified AppUI/AdsUI/AdsItemPainter.h
modified AppUI/AdsUI/AdsItemPainter.inl
modified common/SaveRecord.inl
modified common/config.h
modified common/ehomesound.h
modified common/ehomesound.inl
modified common/logger.inl
modified common/sound.inl
modified common/util.h
modified common/util.inl
modified common/net/MySocket.inl
modified common/stl/astringbuilder.inl
modified common/stl/wstring.h
added doc/contacts_focus.txt
added doc/tilemgr_test_cases.html
Committed revision 56.
--------------------------------

The log in .bzr.log is:
--------------------------------
星期五 2009-06-19 15:46:19 +0800
0.125  bzr arguments: [u'commit', u'-m', u'Fixed the focus move
problems in the contact list and the update problem on the menu bar
when # key is clicked.']
0.140  looking for plugins in C:/Documents and
Settings/Ryan/Application Data/bazaar/2.0/plugins
0.140  looking for plugins in C:/Program Files/Bazaar/plugins
0.312  encoding stdout as osutils.get_user_encoding() 'cp936'
0.375  opening working tree 'D:/ehome/5-Dev/Mobile/Brew/Ver0.9/ehome'
0.437  preparing to commit
[ 4452] 2009-06-19 15:46:19.703 INFO: Committing to:
D:/ehome/5-Dev/Mobile/Brew/Ver0.9/ehome/
0.453  Selecting files for commit with filter []
[ 4452] 2009-06-19 15:46:19.875 INFO: modified .bzrignore
[ 4452] 2009-06-19 15:46:19.875 INFO: modified App/ChattingListBox.h
[ 4452] 2009-06-19 15:46:19.875 INFO: modified App/ChattingListBox.inl
[ 4452] 2009-06-19 15:46:19.875 INFO: modified App/EhomeAppLayer.h
[ 4452] 2009-06-19 15:46:19.875 INFO: modified App/EhomeAppLayer.inl
[ 4452] 2009-06-19 15:46:19.875 INFO: modified App/EhomeAppLayer_Session.inl
[ 4452] 2009-06-19 15:46:19.875 INFO: modified App/SipPriorityTable.h
[ 4452] 2009-06-19 15:46:19.890 INFO: modified App/StructInfo/Message.h
[ 4452] 2009-06-19 15:46:19.890 INFO: modified App/StructInfo/Message.inl
[ 4452] 2009-06-19 15:46:19.890 INFO: modified App/StructInfo/session.inl
[ 4452] 2009-06-19 15:46:19.890 INFO: modified AppUI/ChattingListBoxItem.inl
[ 4452] 2009-06-19 15:46:19.890 INFO: modified AppUI/ChattingTabPage.h
[ 4452] 2009-06-19 15:46:19.890 INFO: modified AppUI/ChattingTabPage.inl
[ 4452] 2009-06-19 15:46:19.890 INFO: modified AppUI/ContactsListTabPage.inl
[ 4452] 2009-06-19 15:46:19.890 INFO: modified AppUI/ContactsListTree.inl
[ 4452] 2009-06-19 15:46:19.890 INFO: modified AppUI/FormChat.inl
[ 4452] 2009-06-19 15:46:19.890 INFO: modified AppUI/MsgItemTrans.inl
[ 4452] 2009-06-19 15:46:19.890 INFO: modified AppUI/SendTakePic.inl
[ 4452] 2009-06-19 15:46:19.890 INFO: modified AppUI/SetInfoTakePic.inl
[ 4452] 2009-06-19 15:46:19.890 INFO: modified AppUI/TreeViewWithAds.h
[ 4452] 2009-06-19 15:46:19.890 INFO: modified AppUI/TreeViewWithAds.inl
[ 4452] 2009-06-19 15:46:19.890 INFO: modified AppUI/WithAdItemListbox.inl
[ 4452] 2009-06-19 15:46:19.890 INFO: modified AppUI/AdsUI/AdsAction.h
[ 4452] 2009-06-19 15:46:19.890 INFO: modified AppUI/AdsUI/AdsAction.inl
[ 4452] 2009-06-19 15:46:19.890 INFO: modified AppUI/AdsUI/AdsComm.inl
[ 4452] 2009-06-19 15:46:19.890 INFO: modified AppUI/AdsUI/AdsHttpRequest.h
[ 4452] 2009-06-19 15:46:19.890 INFO: modified AppUI/AdsUI/AdsHttpRequest.inl
[ 4452] 2009-06-19 15:46:19.890 INFO: modified AppUI/AdsUI/AdsItemPainter.h
[ 4452] 2009-06-19 15:46:19.890 INFO: modified AppUI/AdsUI/AdsItemPainter.inl
[ 4452] 2009-06-19 15:46:19.890 INFO: modified common/SaveRecord.inl
[ 4452] 2009-06-19 15:46:19.890 INFO: modified common/config.h
[ 4452] 2009-06-19 15:46:19.890 INFO: modified common/ehomesound.h
[ 4452] 2009-06-19 15:46:19.890 INFO: modified common/ehomesound.inl
[ 4452] 2009-06-19 15:46:19.890 INFO: modified common/logger.inl
[ 4452] 2009-06-19 15:46:19.890 INFO: modified common/sound.inl
[ 4452] 2009-06-19 15:46:19.905 INFO: modified common/util.h
[ 4452] 2009-06-19 15:46:19.905 INFO: modified common/util.inl
[ 4452] 2009-06-19 15:46:19.905 INFO: modified common/net/MySocket.inl
[ 4452] 2009-06-19 15:46:19.905 INFO: modified common/stl/astringbuilder.inl
[ 4452] 2009-06-19 15:46:19.905 INFO: modified common/stl/wstring.h
[ 4452] 2009-06-19 15:46:19.905 INFO: added doc/contacts_focus.txt
[ 4452] 2009-06-19 15:46:19.905 INFO: added doc/tilemgr_test_cases.html
[ 4452] 2009-06-19 15:46:20.733 INFO: Committed revision 56.
1.531  return code 0
--------------------------------

3. When I committed a single file from the *vc-dir* buffer using the v
shortcut and wrote the message in Chinese, it also failed and
complained as follows:
--------------------------------
Traceback (most recent call last):
 File "bzr", line 130, in <module>
 File "bzrlib\commands.pyo", line 969, in main
bzrlib.errors.BzrError: Parameter
''\xe8\xaf\x95\xe8\xaf\x95\xe7\x94\xa8\xe4\xb8\xad\xe6\x96\x87\xe3\x80?''
is unsupported by the current encoding.
--------------------------------
It didn't generate any bzr log, either.



In GNU Emacs 23.0.94.1 (i386-mingw-nt5.1.2600)
of 2009-05-24 on SOFT-MJASON
Windowing system distributor `Microsoft Corp.', version 5.1.2600
configured using `configure --with-gcc (3.4)'

Important settings:
 value of $LC_ALL: nil
 value of $LC_COLLATE: nil
 value of $LC_CTYPE: nil
 value of $LC_MESSAGES: nil
 value of $LC_MONETARY: nil
 value of $LC_NUMERIC: nil
 value of $LC_TIME: nil
 value of $LANG: CHS
 value of $XMODIFIERS: nil
 locale-coding-system: cp936
 default-enable-multibyte-characters: t

Major mode: Shell

Minor modes in effect:
 diff-auto-refine-mode: t
 shell-dirtrack-mode: t
 desktop-save-mode: t
 show-paren-mode: t
 tooltip-mode: t
 tool-bar-mode: t
 mouse-wheel-mode: t
 menu-bar-mode: t
 file-name-shadow-mode: t
 global-font-lock-mode: t
 font-lock-mode: t
 blink-cursor-mode: t
 global-auto-composition-mode: t
 auto-composition-mode: t
 auto-encryption-mode: t
 auto-compression-mode: t
 column-number-mode: t
 line-number-mode: t
 transient-mark-mode: t

Recent input:
p d a t e SPC p r <backspace> <backspace> p r o b l
e m SPC o n SPC t h e m <backspace> SPC m e n u SPC
b a r SPC w h e n SPC p r e s s i n g <backspace> <backspace>
<backspace> <backspace> <backspace> <backspace> <backspace>
<backspace> # SPC k e y SPC i s SPC p r e s s e d .
<backspace> <backspace> <backspace> <backspace> <backspace>
<backspace> <backspace> <backspace> c l i c k e d .
C-f <return> M-v M-v C-l C-h l C-x 1 C-h L <return>
C-x o v C-v M-v C-x k <return> C-x o C-x 1 C-h k C-x
<return> f C-x <return> C-h C-x <return> t C-g C-h
k C-x <return> t C-x o C-v M-v C-p C-p C-p C-p C-p
C-p C-p C-p C-p C-p C-p C-n C-n C-n C-p C-p C-p C-p
C-p C-p C-p C-n <tab> <return> C-v C-x b <return> C-h
v t e r m i n a l <tab> <tab> <backspace> <backspace>
<backspace> <backspace> <backspace> c o <tab> <backspace>
<backspace> <tab> C-g C-x b <return> C-x b <return>
C-h v d e f a u l t SPC t e r <tab> <return> C-x 1
<help-echo> <help-echo> <help-echo> <help-echo> <help-echo>
M-x f i n d SPC l i b <tab> <return> b z r <return>
M-x M-p <return> v c - v b <backspace> <backspace>
b a <backspace> z <tab> <return> C-v C-n C-n C-n C-n
C-n C-n C-n C-n C-f C-f C-f C-f C-f C-f <C-f2> <return>
M-v C-v C-v C-v C-v C-v C-v C-v C-v C-v C-v C-v C-v
C-v C-v C-v C-v C-v C-v C-v C-v C-v C-v C-v C-x k <return>
M-x r e p o r t SPC <tab> <return>

Recent messages:
History item: 2
Quit [3 times]
History item: 1 [2 times]
Type C-x 1 to delete the help window, C-M-v to scroll help.
Buffer is read-only: #<buffer *Help*>
Type C-x 1 to delete the help window, C-M-v to scroll help.
Quit
mouse-2, RET: find function's definition
Quit
find-library-name: Can't find library bzr
call-interactively: End of buffer



Message #10 received at 3616 <at> emacsbugs.donarmstrong.com (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: 端瑞 <duanpanda <at> gmail.com>,
        3616 <at> debbugs.gnu.org
Subject: Re: bug#3616: 23.0.94; vc-bzr coding system bug
Date: Fri, 19 Jun 2009 08:10:54 -0400
> Date: Fri, 19 Jun 2009 16:24:37 +0800
> From: 端瑞 <duanpanda <at> gmail.com>
> Cc: 
> Reply-To: 端瑞 <duanpanda <at> gmail.com>,
> 	3616 <at> emacsbugs.donarmstrong.com
> 
> In short: when the commit message is in Chinese, the commit fails.
> My Bazaar version is 1.15.

Does it work for you from the command line?  If it does, what encoding
of Chinese do you use in that case?

> Below are three Bazaar sessions with their bzr logs; sessions 1 and 3 show the bug.
> Apparently the command string that Emacs passes to Bazaar is wrongly encoded.
> 
> 1. When I ran bzr commit in the Emacs *shell* buffer with a commit
> message written in Chinese, the result was:

What is the value of buffer-file-coding-system in the *shell* buffer?
Does it help to change it to cp936?



Message #15 received at 3616 <at> emacsbugs.donarmstrong.com (full text, mbox):

From: Ryan Duan <duanpanda <at> gmail.com>
To: 3616 <at> debbugs.gnu.org
Subject: bug#3616: 23.0.94; vc-bzr coding system bug
Date: Mon, 22 Jun 2009 10:01:51 +0800
It works from the Windows XP command line, which uses the Windows ANSI
coding system; that appears to be cp936 here.
The value of buffer-file-coding-system in the *shell* buffer is
chinese-gbk-dos, one of whose aliases is cp936-dos.  Changing it to
cp936 or to chinese-iso-8bit does not help.

I observe that the *shell* and *VC-log* buffers pass UTF-8 encoded strings
(is Emacs's internal buffer encoding UTF-8?) to the Windows command line,
which might be the real cause of this bug and of other related bugs.
Three examples follow.

EXAMPLE 1
--------------------------------
In *shell*,
d:\code>bzr commit -m "第二"
bzr commit -m "绗 ��"
Traceback (most recent call last):
 File "bzr", line 130, in <module>
 File "bzrlib\commands.pyo", line 969, in main
bzrlib.errors.BzrError: Parameter ''\xe7\xac\xac\xe4\xba\x8c'' is
unsupported by the current encoding.

Notice ''\xe7\xac\xac\xe4\xba\x8c'', which is the UTF-8 encoding of the
Chinese characters I entered.  This UTF-8 string is what caused the
above error.

Applying C-u C-x = to the Chinese character "第" gives:
       character: 第 (31532, #o75454, #x7b2c)
preferred charset: chinese-gbk (GBK Chinese simplified.)
      code point: 0xB5DA
          syntax: w    which means: word
        category:
                  .:Base, C:2-byte han, c:Chinese, h:Korean,
j:Japanese, |:line breakable
     buffer code: #xE7 #xAC #xAC
       file code: #xB5 #xDA (encoded by coding system chinese-gbk-dos)
         display: by this font (glyph code)
   uniscribe:-outline-新宋体-normal-normal-normal-mono-13-*-*-*-c-*-gb2312.1980-0
(#x3100)

Notice that its buffer code is "\xe7\xac\xac", which is the first three bytes
of ''\xe7\xac\xac\xe4\xba\x8c''.  The file code "\xb5\xda" is
chinese-gbk encoded, and is what I expect to be passed to the Windows
command line, where it might work correctly.  Unfortunately,
instead of passing a Chinese GBK encoded string to the shell, Emacs passes
a UTF-8 encoded string.
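
The byte values quoted in EXAMPLE 1 can be checked outside Emacs.  The
following is a minimal Python 3 sketch, not part of the original report.
It assumes that Python's `cp936` codec is a reasonable stand-in for the
Windows ANSI code page on a GBK system, and the helper name
`encoded_bytes` is made up for illustration.

```python
# Hypothetical helper, for illustration only: compare the bytes the
# report says Emacs sent (UTF-8) with the bytes the reporter expected (GBK).

def encoded_bytes(text: str, encoding: str) -> bytes:
    """Return text encoded with the given encoding, raising if impossible."""
    return text.encode(encoding, errors="strict")


message = "第二"  # the two characters used in the examples

utf8_bytes = encoded_bytes(message, "utf-8")  # matches the bytes in the traceback
gbk_bytes = encoded_bytes(message, "cp936")   # assumed to be what a GBK shell expects

print(utf8_bytes)  # b'\xe7\xac\xac\xe4\xba\x8c'
print(gbk_bytes)   # b'\xb5\xda\xb6\xfe'
```

The two byte strings share no bytes at all, so a program that expects one
cannot silently accept the other.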

EXAMPLE 2
--------------------------------
In the *VC-log* buffer, I entered the same two Chinese characters "第二"
as in EXAMPLE 1.
After C-c C-c, the same error occurred: bzrlib.errors.BzrError:
Parameter ''\xe7\xac\xac\xe4\xba\x8c'' is unsupported by the current
encoding.
Applying C-u C-x = to "第" returned the same information as in EXAMPLE 1.

EXAMPLE 3 (Another related bug)
--------------------------------
In Windows, I created a directory (folder) named "第二".
In dired, it works all right.
But in *shell*,
d:\>cd 第二
cd 绗 ��
系统找不到指定的路径。

It complains that the system cannot find the specified path, because
"\xb5\xda\xb6\xfe" (Chinese GBK) is converted to
''\xe7\xac\xac\xe4\xba\x8c'' (UTF-8) before being passed to the shell, but the
shell can only process Chinese GBK characters.
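
The garbled echo in EXAMPLE 3 is consistent with the shell reading UTF-8
bytes as cp936.  Below is a rough Python 3 sketch of that
misinterpretation.  It is only an illustration: it assumes Python's
`cp936` codec approximates what the Windows console does, and
`as_seen_by_shell` is a hypothetical name.

```python
def as_seen_by_shell(sent: bytes, shell_encoding: str = "cp936") -> str:
    """Decode the bytes the way a shell using shell_encoding would read them.

    Byte sequences that are invalid in shell_encoding become U+FFFD
    instead of raising an exception.
    """
    return sent.decode(shell_encoding, errors="replace")


name = "第二"

# Bytes in the shell's own encoding are read back correctly.
print(as_seen_by_shell(name.encode("cp936")) == name)  # True

# UTF-8 bytes read as cp936 yield different text, so `cd` would look
# for a directory that does not exist.
print(as_seen_by_shell(name.encode("utf-8")) == name)  # False
```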

CONCLUSION
--------------------------------
When Emacs runs on Chinese Windows, Chinese GBK characters are
converted to UTF-8 before being passed to the Windows command line, but
the Windows command line cannot process UTF-8, which causes
this bug and other related bugs.

I feel that this is not a small problem.  Emacs should detect the OS's
locale, then use the correct coding system to interact with the OS.
It seems to do well on Linux but badly on Windows.  Dired seems to do
well on Windows, but shell.el and vc-bzr.el do badly.  I didn't test
other vc-* modes.

I hope the information above will help solve this problem.  Thank you!
HAPPY HACKING!

2009/6/19 Eli Zaretskii <eliz <at> gnu.org>:
>> Date: Fri, 19 Jun 2009 16:24:37 +0800
>> From: 端瑞 <duanpanda <at> gmail.com>
>> Cc:
>> Reply-To: 端瑞 <duanpanda <at> gmail.com>,
>>       3616 <at> emacsbugs.donarmstrong.com

> Does it work for you from the command line?  If it does, what encoding
> of Chinese do you use in that case?
>
> What is the value of buffer-file-coding-system in the *shell* buffer?
> Does it help to change it to cp936?



Message #20 received at 3616 <at> emacsbugs.donarmstrong.com (full text, mbox):

From: Andreas Schwab <schwab <at> linux-m68k.org>
To: Ryan Duan <duanpanda <at> gmail.com>
Cc: 3616 <at> debbugs.gnu.org
Subject: Re: bug#3616: 23.0.94; vc-bzr coding system bug
Date: Mon, 22 Jun 2009 19:59:29 +0200
Ryan Duan <duanpanda <at> gmail.com> writes:

> EXAMPLE 3 (Another related bug)
> --------------------------------
> In Windows, I created a directory (folder) named "第二".
> In dired, it works all right.
> But in *shell*,
> d:\>cd 第二
> cd 绗 簩
> 系统找不到指定的路径。
>
> It complains that the system cannot find the specified path, because
> "\xb5\xda\xb6\xfe" (Chinese GBK) is converted to
> ''\xe7\xac\xac\xe4\xba\x8c'' (UTF-8) before being passed to the shell, but the
> shell can only process Chinese GBK characters.

What does (process-coding-system (get-buffer-process "*shell*")) return?

Andreas.

-- 
Andreas Schwab, schwab <at> linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."



Message #25 received at 3616 <at> emacsbugs.donarmstrong.com (full text, mbox):

From: Ryan Duan <duanpanda <at> gmail.com>
To: Andreas Schwab <schwab <at> linux-m68k.org>
Cc: 3616 <at> debbugs.gnu.org
Subject: Re: bug#3616: 23.0.94; vc-bzr coding system bug
Date: Tue, 23 Jun 2009 10:39:57 +0800
(chinese-gbk-dos . undecided-dos)

2009/6/23 Andreas Schwab <schwab <at> linux-m68k.org>:
> Ryan Duan <duanpanda <at> gmail.com> writes:
>
>> EXAMPLE 3 (Another related bug)
>> --------------------------------
>> In Windows, I created a directory (folder) named "第二".
>> In dired, it works all right.
>> But in *shell*,
>> d:\>cd 第二
>> cd 绗 ��
>> 系统找不到指定的路径。
>>
>> It complains that the system cannot find the specified path, because
>> "\xb5\xda\xb6\xfe" (Chinese GBK) is converted to
>> ''\xe7\xac\xac\xe4\xba\x8c'' (UTF-8) before being passed to the shell, but the
>> shell can only process Chinese GBK characters.
>
> What does (process-coding-system (get-buffer-process "*shell*")) return?
>
> Andreas.
>
> --
> Andreas Schwab, schwab <at> linux-m68k.org
> GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
> "And now for something completely different."
>



Message #28 received at 3616 <at> debbugs.gnu.org (full text, mbox):

From: Sean Whitton <spwhitton <at> spwhitton.name>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 3616 <at> debbugs.gnu.org, Rui Duan <duanpanda <at> gmail.com>
Subject: Fwd: 3616 <at> debbugs.gnu.org
Date: Wed, 05 Mar 2025 20:13:59 +0800
Forwarding with permission.

Eli, this is a Windows thing, do you think there is something to
investigate and fix here, or perhaps this is already captured somewhere
else?

We should probably close this specific bug, but I wanted to check with
you.

Thanks.

-- 
Sean Whitton

-------------------- Start of forwarded message --------------------
From: Rui Duan <duanpanda <at> gmail.com>
Date: Tue, 4 Mar 2025 10:09:51 -0500
Message-ID: <CAPnO5hEFkD2kwP771ZnDhJwzGM8xz_xUeAh1UaVNpUk7m5POqQ <at> mail.gmail.com>
Subject: Re: 3616 <at> debbugs.gnu.org
To: Sean Whitton <spwhitton <at> spwhitton.name>

Hi Sean,

Thank you for asking! I guess the bug still exists, but you can close it
if not many people are complaining about this issue.

I don't have a GBK Chinese Windows system at hand, so I cannot test all three
example cases.

I've just tested my example 3 (`cd` into a Chinese-named directory) in a Shell mode
buffer on an English (Latin) Windows system with GNU Emacs 29.4. Emacs
passed whitespace (0x0A) to the Windows command line instead of a UTF-8
encoded string. Emacs detected that the process coding system is Latin;
since it cannot convert the Chinese characters to the Latin
encoding, it passes whitespace as placeholders to the shell process.
This behaviour is different from what I saw 15 years ago.

However, I don't think this is the right way to handle the encoding
conversion. The correct logic would be: try to determine the OS encoding;
if that succeeds, encode the string with that encoding and pass the
encoded string to the subprocess; if encoding fails, report an error.
The current behaviour instead uses whitespace as placeholders.
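
The policy proposed above can be sketched in a few lines of Python 3.
This is an illustration of the idea only, not how Emacs works or should
be implemented. The function name `encode_for_subprocess` is
hypothetical, and `locale.getpreferredencoding` is assumed to be an
adequate way to ask for the OS encoding. Note that Python's lossy mode
substitutes "?" rather than the whitespace observed in Emacs.

```python
import locale
from typing import Optional


def encode_for_subprocess(text: str, encoding: Optional[str] = None) -> bytes:
    """Encode text for a child process, raising instead of substituting.

    If encoding is None, fall back to the locale's preferred encoding.
    """
    if encoding is None:
        encoding = locale.getpreferredencoding(False)
    try:
        return text.encode(encoding, errors="strict")
    except UnicodeEncodeError as exc:
        raise ValueError(f"cannot represent {text!r} in {encoding}") from exc


# With a GBK process encoding the argument is encodable and passes intact.
print(encode_for_subprocess("第二", "cp936"))      # b'\xb5\xda\xb6\xfe'

# With a Latin-1 process encoding, the lossy alternative substitutes
# placeholders, while the strict policy reports the problem.
print("第二".encode("latin-1", errors="replace"))  # b'??'
try:
    encode_for_subprocess("第二", "latin-1")
except ValueError as err:
    print("error:", err)
```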

Best Regards,
Rui (Ryan) Duan

On Mon, Mar 3, 2025 at 11:14 PM Sean Whitton <spwhitton <at> spwhitton.name>
wrote:

> Hello Ryan Duan,
>
> Can you still reproduce <https://debbugs.gnu.org/3616> ?
>
> If not, I would propose we close this very old bug report.
>
> --
> Sean Whitton
>
-------------------- End of forwarded message --------------------

Message #31 received at 3616 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Sean Whitton <spwhitton <at> spwhitton.name>
Cc: 3616 <at> debbugs.gnu.org, duanpanda <at> gmail.com
Subject: Re: Fwd: 3616 <at> debbugs.gnu.org
Date: Wed, 05 Mar 2025 16:08:21 +0200
> From: Sean Whitton <spwhitton <at> spwhitton.name>
> Cc: Rui Duan <duanpanda <at> gmail.com>, 3616 <at> debbugs.gnu.org
> Date: Wed, 05 Mar 2025 20:13:59 +0800
> 
> Eli, this is a Windows thing, do you think there is something to
> investigate and fix here, or perhaps this is already captured somewhere
> else?

I think the doc string of 'shell' already spells this out:

  To specify a coding system for converting non-ASCII characters
  in the input and output to the shell, use C-x RET c
  before M-x shell.  You can also specify this with C-x RET p
  in the shell buffer, after you start the shell.
  The default comes from ‘process-coding-system-alist’ and
  ‘default-process-coding-system’.

Setting the correct encoding by default is a hard problem on Windows,
since it is hard to know which process will want what encoding, and
because UTF-8 cannot be easily used due to all kinds of subtle
problems with how we launch subprocesses on Windows.  So I think users
need either to set it manually or customize
process-coding-system-alist for the programs they invoke frequently.

> We should probably close this specific bug

Yes.




Reply sent to Sean Whitton <spwhitton <at> spwhitton.name>:
You have taken responsibility. (Thu, 06 Mar 2025 01:08:02 GMT) Full text and rfc822 format available.

Notification sent to 端瑞 <duanpanda <at> gmail.com>:
bug acknowledged by developer. (Thu, 06 Mar 2025 01:08:02 GMT) Full text and rfc822 format available.

Message #36 received at 3616-close <at> debbugs.gnu.org (full text, mbox):

From: Sean Whitton <spwhitton <at> spwhitton.name>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: duanpanda <at> gmail.com, 3616-close <at> debbugs.gnu.org
Subject: Re: bug#3616: Fwd: 3616 <at> debbugs.gnu.org
Date: Thu, 06 Mar 2025 09:07:04 +0800
Hello,

On Wed 05 Mar 2025 at 04:08pm +02, Eli Zaretskii wrote:

>> From: Sean Whitton <spwhitton <at> spwhitton.name>
>> Cc: Rui Duan <duanpanda <at> gmail.com>, 3616 <at> debbugs.gnu.org
>> Date: Wed, 05 Mar 2025 20:13:59 +0800
>>
>> Eli, this is a Windows thing, do you think there is something to
>> investigate and fix here, or perhaps this is already captured somewhere
>> else?
>
> I think the doc string of 'shell' already spells this out:
>
>   To specify a coding system for converting non-ASCII characters
>   in the input and output to the shell, use C-x RET c
>   before M-x shell.  You can also specify this with C-x RET p
>   in the shell buffer, after you start the shell.
>   The default comes from ‘process-coding-system-alist’ and
>   ‘default-process-coding-system’.
>
> Setting the correct encoding by default is a hard problem on Windows,
> since it is hard to know which process will want what encoding, and
> because UTF-8 cannot be easily used due to all kinds of subtle
> problems with how we launch subprocesses on Windows.  So I think users
> need either to set it manually or customize
> process-coding-system-alist for the programs they invoke frequently.

Thank you for confirming.

>> We should probably close this specific bug
>
> Yes.

Okay, doing so with this message.

-- 
Sean Whitton




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 03 Apr 2025 11:24:08 GMT) Full text and rfc822 format available.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.