GNU bug report logs - #6546
win32 grep/shell utf-8 encoding


Package: emacs;

Reported by: Laimonas Vėbra <laimonas.vebra <at> gmail.com>

Date: Thu, 1 Jul 2010 08:48:02 UTC

Severity: normal

Tags: moreinfo

Merged with 6705

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 6546 in the body.
You can then email your comments to 6546 AT debbugs.gnu.org in the normal way.



Report forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#6546; Package emacs. (Thu, 01 Jul 2010 08:48:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Laimonas Vėbra <laimonas.vebra <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Thu, 01 Jul 2010 08:48:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Laimonas Vėbra <laimonas.vebra <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: win32 grep/shell utf-8 encoding 
Date: Thu, 01 Jul 2010 11:46:37 +0300
Maybe this is not actually a bug (but missing functionality), but how
should one set up Emacs and .emacs to grep files in UTF-8
encoding?

grep 2.6.3 (Cygwin) at last works correctly (coloring multibyte matches)
from the win32 console (according to the LANG environment setting), but no
matter how I've tried to push Emacs (set-language-environment,
coding-system-for-(read|write), set-env in grep-setup-hook), it just
doesn't work, because somewhere inside the Emacs win32 code it sticks to
the Windows locale codepage and tries hard to convert to this
codepage/encoding before it passes arguments to the shell. No wonder it
fails when it comes to Unicode.

How to reproduce:

Create utf-8 file with some unicode characters (Cyrillic, Baltic, 
whatever; not only ascii) and try to grep for some utf-8 strings from 
Emacs (M-x grep).
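
The failure described above can be illustrated outside Emacs. The following is a minimal Python sketch, not Emacs code, of why converting an argument to a single Windows code page loses characters. It assumes cp1257 (Baltic) and cp1251 (Cyrillic) as example code pages; the helper name `survives_codepage` is made up for illustration.

```python
# Sketch: which strings survive conversion to a single Windows "ANSI" code page.
# Assumption: cp1257 and cp1251 are only example code pages; the report does
# not say which one the reporter's system used.

def survives_codepage(text, codepage):
    """Return True if every character of text is representable in codepage."""
    try:
        text.encode(codepage)
        return True
    except UnicodeEncodeError:
        return False


baltic = "Vėbra"   # Lithuanian text, representable in cp1257
cyrillic = "Ше"    # Cyrillic text, not representable in cp1257

print(survives_codepage(baltic, "cp1257"))    # Baltic text in the Baltic code page
print(survives_codepage(cyrillic, "cp1257"))  # Cyrillic text in the Baltic code page
print(survives_codepage(cyrillic, "cp1251"))  # Cyrillic text in the Cyrillic code page
print(survives_codepage(cyrillic, "utf-8"))   # UTF-8 can represent both
```

A search pattern only reaches the external program intact if all of its characters exist in the code page used for the conversion, which is why a pattern in a script foreign to the system locale cannot pass through unchanged.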




Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#6546; Package emacs. (Thu, 01 Jul 2010 17:25:01 GMT) Full text and rfc822 format available.

Message #8 received at 6546 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Laimonas Vėbra <laimonas.vebra <at> gmail.com>
Cc: 6546 <at> debbugs.gnu.org
Subject: Re: bug#6546: win32 grep/shell utf-8 encoding
Date: Thu, 01 Jul 2010 20:26:26 +0300
> Date: Thu, 01 Jul 2010 11:46:37 +0300
> From: Laimonas Vėbra <laimonas.vebra <at> gmail.com>
> Cc: 
> 
> Maybe this is not actually a bug (but missing functionality), but how
> should one set up Emacs and .emacs to grep files in UTF-8
> encoding?
> 
> grep 2.6.3 (Cygwin) at last works correctly (coloring multibyte matches)
> from the win32 console (according to the LANG environment setting), but no
> matter how I've tried to push Emacs (set-language-environment,
> coding-system-for-(read|write), set-env in grep-setup-hook), it just
> doesn't work, because somewhere inside the Emacs win32 code it sticks to
> the Windows locale codepage and tries hard to convert to this
> codepage/encoding before it passes arguments to the shell. No wonder it
> fails when it comes to Unicode.

Did you try set-process-coding-system?





Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#6546; Package emacs. (Thu, 01 Jul 2010 18:06:01 GMT) Full text and rfc822 format available.

Message #11 received at 6546 <at> debbugs.gnu.org (full text, mbox):

From: Laimonas Vėbra <laimonas.vebra <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 6546 <at> debbugs.gnu.org
Subject: Re: bug#6546: win32 grep/shell utf-8 encoding
Date: Thu, 01 Jul 2010 21:05:36 +0300
Eli Zaretskii wrote:

> Did you try set-process-coding-system?

No, but isn't it coding-system-for-(read|write) that specifies the
input/output coding system of a (synchronous) subprocess?
And how am I supposed to do that (set-process-coding-system) a priori
(when no process exists yet) for a single grep command that executes and
returns (the process terminates)?




Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#6546; Package emacs. (Thu, 22 Jul 2010 12:51:01 GMT) Full text and rfc822 format available.

Message #14 received at 6546 <at> debbugs.gnu.org (full text, mbox):

From: Juanma Barranquero <lekktu <at> gmail.com>
To: Laimonas Vėbra <laimonas.vebra <at> gmail.com>
Cc: 6546 <at> debbugs.gnu.org
Subject: Re: bug#6546: win32 grep/shell utf-8 encoding
Date: Thu, 22 Jul 2010 14:50:21 +0200
On Thu, Jul 1, 2010 at 10:46, Laimonas Vėbra <laimonas.vebra <at> gmail.com> wrote:

> Create utf-8 file with some unicode characters (Cyrillic, Baltic, whatever;
> not only ascii) and try to grep for some utf-8 strings from Emacs (M-x
> grep).

File 6546.txt (in utf-8, no BOM):

--------------------------------
Cyrillic follows:
ЁШејҘҘ
--------------------------------

M-x grep <RET> ШејҘ 6546.txt<RET>

=>

-*- mode: grep; default-directory: "c:/emacs/repo/" -*-
Grep started at Thu Jul 22 14:46:58

grep -nH -e ШејҘ 6546.txt
6546.txt:2:ЁШејҘҘ

Grep finished (matches found) at Thu Jul 22 14:46:58


so I cannot reproduce it. Could you please send a step-by-step recipe,
starting from emacs -Q?

Thanks,

    Juanma




Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#6546; Package emacs. (Thu, 22 Jul 2010 14:12:02 GMT) Full text and rfc822 format available.

Message #17 received at 6546 <at> debbugs.gnu.org (full text, mbox):

From: Laimonas Vėbra <laimonas.vebra <at> gmail.com>
To: Juanma Barranquero <lekktu <at> gmail.com>
Cc: 6546 <at> debbugs.gnu.org
Subject: Re: bug#6546: win32 grep/shell utf-8 encoding
Date: Thu, 22 Jul 2010 17:11:37 +0300
Juanma Barranquero wrote:

> --------------------------------
> Cyrillic follows:
> ЁШејҘҘ
> --------------------------------
>
> M-x grep<RET>  ШејҘ 6546.txt<RET>
>
> =>
>
> -*- mode: grep; default-directory: "c:/emacs/repo/" -*-
> Grep started at Thu Jul 22 14:46:58
>
> grep -nH -e ШејҘ 6546.txt
> 6546.txt:2:ЁШејҘҘ
>
> Grep finished (matches found) at Thu Jul 22 14:46:58
>
>
> so I cannot reproduce it. Could you please send a step-by-step recipe,
> starting from emacs -Q?

That means you are using gnu-win32 grep. Some older (2.5.4) and newer 
(2.6.3) cygwin greps won't work.

I don't believe Cygwin grep (or other Cygwin apps) is going to be fixed
to behave like the gnu-win32 apps, because it's a matter of how arguments
are passed through the winapi->cygwin (whole system) layers.

Besides, older (2.5.x) greps don't correctly color (multibyte) matches
(try grep -nH -e 'Ш*' 6546.txt)





Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#6546; Package emacs. (Thu, 22 Jul 2010 15:03:02 GMT) Full text and rfc822 format available.

Message #20 received at 6546 <at> debbugs.gnu.org (full text, mbox):

From: Juanma Barranquero <lekktu <at> gmail.com>
To: Laimonas Vėbra <laimonas.vebra <at> gmail.com>
Cc: 6546 <at> debbugs.gnu.org
Subject: Re: bug#6546: win32 grep/shell utf-8 encoding
Date: Thu, 22 Jul 2010 17:02:50 +0200
On Thu, Jul 22, 2010 at 16:11, Laimonas Vėbra <laimonas.vebra <at> gmail.com> wrote:

> That means you are using gnu-win32 grep. Some older (2.5.4) and newer
> (2.6.3) cygwin greps won't work.

Sorry, I missed that in your original report.

Did you try adding an entry to `process-coding-system-alist'?

    Juanma




Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#6546; Package emacs. (Thu, 22 Jul 2010 18:15:03 GMT) Full text and rfc822 format available.

Message #23 received at 6546 <at> debbugs.gnu.org (full text, mbox):

From: Laimonas Vėbra <laimonas.vebra <at> gmail.com>
Cc: 6546 <at> debbugs.gnu.org
Subject: Re: bug#6705: w32 cmdproxy.c pass args to cygwin; erroneous charset
	conversion (problem description, solution/suggestion)
Date: Thu, 22 Jul 2010 21:14:08 +0300
Jason Rumney wrote:

> Don't use cmdproxy with Cygwin programs. If you need a shell in
> between, use Cygwin bash.  cmdproxy is a wrapper to get around some
> problems with various versions of the Windows native cmd.exe and
> command.com shell programs.  Mixing Cygwin and native Windows is not
> advised.

That doesn't solve the problem (try to pass a UTF-8 string from Emacs to
cygwin/bin/(ba)sh.exe or any other Cygwin app), nor does cmdproxy
complicate the matter in any way (cmdproxy just passes the command line to
CreateProcess(); the same happens in w32proc.c when calling /bin/sh instead
of cmdproxy.exe). The problem is not cmdproxy itself, but the winapi/cygwin
layer and the way the args are passed/transcoded through CreateProcess(A)
-> cygwin layer.




Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#6546; Package emacs. (Thu, 22 Jul 2010 18:25:01 GMT) Full text and rfc822 format available.

Message #26 received at 6546 <at> debbugs.gnu.org (full text, mbox):

From: Laimonas Vėbra <laimonas.vebra <at> gmail.com>
To: Juanma Barranquero <lekktu <at> gmail.com>
Cc: 6546 <at> debbugs.gnu.org
Subject: Re: bug#6546: win32 grep/shell utf-8 encoding
Date: Thu, 22 Jul 2010 21:24:12 +0300
Juanma Barranquero wrote:
> On Thu, Jul 22, 2010 at 16:11, Laimonas Vėbra<laimonas.vebra <at> gmail.com>  wrote:
>
>> That means you are using gnu-win32 grep. Some older (2.5.4) and newer
>> (2.6.3) cygwin greps won't work.
>
> Sorry, I missed that in your original report.
>
> Did you try adding an entry to `process-coding-system-alist'?

The problem is not here. I can change the encoding of the command string 
(which is passed to external cygwin apps) using
coding-system-for-write. It works (converted correctly utf-8->cp1257, 
cp1251, etc), but it doesn't help, because of the way the args (command 
line) are passed/transcoded through the winapi (CreateProcessA) and 
cygwin layer.
This bug is related to bug#6705 (which has a detailed description of
what's happening).




Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#6546; Package emacs. (Thu, 22 Jul 2010 19:55:02 GMT) Full text and rfc822 format available.

Message #29 received at 6546 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Laimonas Vėbra <laimonas.vebra <at> gmail.com>
Cc: lekktu <at> gmail.com, 6546 <at> debbugs.gnu.org
Subject: Re: bug#6546: win32 grep/shell utf-8 encoding
Date: Thu, 22 Jul 2010 22:53:34 +0300
> Date: Thu, 22 Jul 2010 21:24:12 +0300
> From: Laimonas Vėbra <laimonas.vebra <at> gmail.com>
> Cc: 6546 <at> debbugs.gnu.org
> 
> The problem is not here. I can change the encoding of the command string 
> (which is passed to external cygwin apps) using
> coding-system-for-write. It works (converted correctly utf-8->cp1257, 
> cp1251, etc), but it doesn't help, because of the way the args (command 
> line) are passed/transcoded through the winapi (CreateProcessA) and 
> cygwin layer.

Did you try to add a suitably-valued LANG variable to
process-environment?  That would at least force Cygwin executables to
work in the Windows codepage.

> This bug is related to bug#6705 (there are detailed description of 
> what's happening)

Then please merge them.





Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#6546; Package emacs. (Thu, 22 Jul 2010 21:49:01 GMT) Full text and rfc822 format available.

Message #32 received at 6546 <at> debbugs.gnu.org (full text, mbox):

From: Laimonas Vėbra <laimonas.vebra <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 6546 <at> debbugs.gnu.org
Subject: Re: bug#6546: win32 grep/shell utf-8 encoding
Date: Fri, 23 Jul 2010 00:48:28 +0300
Eli Zaretskii wrote:
>> Date: Thu, 22 Jul 2010 21:24:12 +0300
>> From: Laimonas Vėbra<laimonas.vebra <at> gmail.com>
>> Cc: 6546 <at> debbugs.gnu.org
>>
>> The problem is not here. I can change the encoding of the command string
>> (which is passed to external cygwin apps) using
>> coding-system-for-write. It works (converted correctly utf-8->cp1257,
>> cp1251, etc), but it doesn't help, because of the way the args (command
>> line) are passed/transcoded through the winapi (CreateProcessA) and
>> cygwin layer.
>
> Did you try to add a suitably-valued LANG variable to
> process-environment?  That would at least force Cygwin executables to
> work in the Windows codepage.

The only way it works is when i set LANG process-environment variable to 
the current windows locale codepage and 'coding-system-for-write' to the 
encoding/charset in which i'd like to grep.
That way it works, but i'm not sure (seriously doubt) if LANG/locale 
codepage, which differs from the actual args encoding, won't result in 
any ugly problems/bugs (e.g. sorting, piping to other apps)
If it really won't and this setup is "as it should be, intended", then 
this bug could be closed.


>> This bug is related to bug#6705 (there are detailed description of
>> what's happening)
>
> Then please merge them.

How can i do that?




Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#6546; Package emacs. (Thu, 22 Jul 2010 23:01:02 GMT) Full text and rfc822 format available.

Message #35 received at 6546 <at> debbugs.gnu.org (full text, mbox):

From: Juanma Barranquero <lekktu <at> gmail.com>
To: Laimonas Vėbra <laimonas.vebra <at> gmail.com>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 6546 <at> debbugs.gnu.org
Subject: Re: bug#6546: win32 grep/shell utf-8 encoding
Date: Fri, 23 Jul 2010 01:00:20 +0200
On Thu, Jul 22, 2010 at 23:48, Laimonas Vėbra <laimonas.vebra <at> gmail.com> wrote:

> How can i do that?

You can send a message to control <at> debbugs.gnu.org, starting with

merge 6705 6546
quit

If both bugs aren't in the same state (open/closed, etc.), you can use
"forcemerge" instead.

Control commands for debbugs are documented in the file admin/notes/bugtracker.

    Juanma




Merged 6546 6705. Request was from Laimonas Vėbra <laimonas.vebra <at> gmail.com> to control <at> debbugs.gnu.org. (Fri, 23 Jul 2010 09:27:02 GMT) Full text and rfc822 format available.

Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#6546; Package emacs. (Fri, 23 Jul 2010 10:25:02 GMT) Full text and rfc822 format available.

Message #40 received at 6546 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Laimonas Vėbra <laimonas.vebra <at> gmail.com>
Cc: 6546 <at> debbugs.gnu.org
Subject: Re: bug#6546: win32 grep/shell utf-8 encoding
Date: Fri, 23 Jul 2010 13:24:02 +0300
> Date: Fri, 23 Jul 2010 00:48:28 +0300
> From: Laimonas Vėbra <laimonas.vebra <at> gmail.com>
> CC: 6546 <at> debbugs.gnu.org
> 
> > Did you try to add a suitably-valued LANG variable to
> > process-environment?  That would at least force Cygwin executables to
> > work in the Windows codepage.
> 
> The only way it works is when i set LANG process-environment variable to 
> the current windows locale codepage and 'coding-system-for-write' to the 
> encoding/charset in which i'd like to grep.

That's the only way it's _supposed_ to work.

> That way it works, but i'm not sure (seriously doubt) if LANG/locale 
> codepage, which differs from the actual args encoding, won't result in 
> any ugly problems/bugs (e.g. sorting, piping to other apps)

You should set LANG to the current codepage and make sure
locale-coding-system is set to the same codepage.  Then the Cygwin
programs invoked as Emacs subprocesses should do what you expect.

> If it really won't and this setup is "as it should be, intended", then 
> this bug could be closed.

Yes, this is the only setup that is supposed to work.
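
The setup discussed above (LANG set to the Windows code page, coding-system-for-write set to the encoding of the files being searched) can be modeled with a short Python sketch. It is a simplified model under stated assumptions, not a description taken from the Emacs or Cygwin sources: it assumes that the caller hands raw bytes to an ANSI API, that Windows decodes them using the ANSI code page, and that Cygwin re-encodes the result using the charset named in LANG. The function name `argument_seen_by_cygwin` and the choice of cp1257 are illustrative only.

```python
# Simplified model of how a command-line argument travels from a native
# Windows program to a Cygwin program. Assumptions (not from Emacs or Cygwin
# sources): the caller hands raw bytes to an ANSI API, Windows decodes them
# with the ANSI code page, and Cygwin re-encodes the result using the
# charset named in LANG.

def argument_seen_by_cygwin(text, caller_coding, ansi_codepage, lang_charset):
    """Return the bytes a Cygwin program would receive under this model."""
    raw = text.encode(caller_coding)                       # bytes written by the caller
    widened = raw.decode(ansi_codepage, errors="replace")  # ANSI API: bytes -> wide text
    return widened.encode(lang_charset, errors="replace")  # Cygwin: text -> LANG charset


pattern = "Ше"                    # Cyrillic pattern, stored as UTF-8 in the file
wanted = pattern.encode("utf-8")  # bytes grep must receive to match the file

# LANG names UTF-8 on a cp1257 system: the bytes are transcoded twice and mangled.
mismatched = argument_seen_by_cygwin(pattern, "utf-8", "cp1257", "utf-8")

# LANG names the Windows code page: the two conversions cancel out.
matched = argument_seen_by_cygwin(pattern, "utf-8", "cp1257", "cp1257")

print(mismatched == wanted)  # False under this model
print(matched == wanted)     # True under this model
```

In this model the round trip only preserves bytes that are defined in the chosen code page; byte values with no assignment there would still be damaged, which is consistent with the doubts raised about relying on this setup.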





Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#6546; Package emacs. (Fri, 23 Jul 2010 12:55:03 GMT) Full text and rfc822 format available.

Message #43 received at 6546 <at> debbugs.gnu.org (full text, mbox):

From: Laimonas Vėbra <laimonas.vebra <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 6546 <at> debbugs.gnu.org
Subject: Re: bug#6546: win32 grep/shell utf-8 encoding
Date: Fri, 23 Jul 2010 15:54:34 +0300
Eli Zaretskii wrote:
>> Date: Fri, 23 Jul 2010 00:48:28 +0300
>> From: Laimonas Vėbra<laimonas.vebra <at> gmail.com>
>> CC: 6546 <at> debbugs.gnu.org
>>
>>> Did you try to add a suitably-valued LANG variable to
>>> process-environment?  That would at least force Cygwin executables to
>>> work in the Windows codepage.
>>
>> The only way it works is when i set LANG process-environment variable to
>> the current windows locale codepage and 'coding-system-for-write' to the
>> encoding/charset in which i'd like to grep.
>
> That's the only way it's _supposed_ to work.

Then I suppose that the way it is supposed to work is itself wrong.

Why? Because correct behaviour should not require us (external app, Emacs)
to set the locale to some fixed value; it should be freely changeable, as
many Cygwin apps rely on that. For example, how do you sort data with
improper locale settings (which are required to stay fixed)? Look for yet
another workaround?

Example:
$ echo -e "-ĔĿİ-\n_ĔĿİ_\nELI\nĔĿİ" > file.txt

$ export LANG=lt_LT.cp1257
$ cat file.txt
-ĔĿİ-
_ĔĿİ_
ELI
ĔĿİ

$ cat file.txt | sort
_ĔĿİ_
ĔĿİ
-ĔĿİ-
ELI

$ export LANG=lt_LT.utf-8
$ cat file.txt
-ĔĿİ-
_ĔĿİ_
ELI
ĔĿİ

$ cat file.txt | sort
_ĔĿİ_
ELI
ĔĿİ
-ĔĿİ-

> Yes, this is the only setup that is supposed to work.

Maybe it is/was supposed to work like that as a workaround, but I doubt
it is/was supposed to be the correct behaviour.





Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#6546; Package emacs. (Fri, 23 Jul 2010 14:34:01 GMT) Full text and rfc822 format available.

Message #46 received at 6546 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Laimonas Vėbra <laimonas.vebra <at> gmail.com>
Cc: 6546 <at> debbugs.gnu.org
Subject: Re: bug#6546: win32 grep/shell utf-8 encoding
Date: Fri, 23 Jul 2010 17:23:47 +0300
> Date: Fri, 23 Jul 2010 15:54:34 +0300
> From: Laimonas Vėbra <laimonas.vebra <at> gmail.com>
> CC: 6546 <at> debbugs.gnu.org
> 
> >> The only way it works is when i set LANG process-environment variable to
> >> the current windows locale codepage and 'coding-system-for-write' to the
> >> encoding/charset in which i'd like to grep.
> >
> > That's the only way it's _supposed_ to work.
> 
> Then i suppose it's wrong/incorrect way of what is supposed to operate 
> like that.
> 
> Why? Because for the correct behaviour we (external app, Emacs) 
> shouldn't require to set locale to some fixed setting; it should be 
> freely changed as many cygwin apps relies on that.

You cannot easily change the locale of a Windows system by specifying
some environment variable.  You need to actually switch it
system-wide.  As long as we use ANSI APIs on Windows, we can only
support a single Windows locale, and that locale must be the current
user's locale.

> For example, how do you sort data with improper locale settings
> (which are required to be fixed)?

You can't, sorry.

> > Yes, this is the only setup that is supposed to work.
> 
> Maybe it is/was suppose to work (at all) like that in the sense of 
> workaround, but i doubt if it was/is supposed to be correct.

It cannot work in any other way with ANSI APIs.





Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#6546; Package emacs. (Fri, 23 Jul 2010 15:52:02 GMT) Full text and rfc822 format available.

Message #49 received at 6546 <at> debbugs.gnu.org (full text, mbox):

From: Laimonas Vėbra <laimonas.vebra <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 6546 <at> debbugs.gnu.org
Subject: Re: bug#6546: win32 grep/shell utf-8 encoding
Date: Fri, 23 Jul 2010 18:50:54 +0300
Eli Zaretskii wrote:

> You cannot easily change the locale of a Windows system by specifying
> some environment variable.  You need to actually switch it
> system-wide.  As long as we use ANSI APIs on Windows, we can only

I am talking about the LANG environment setting, which we can freely change
so that Cygwin apps act differently (as we need).

> You can't, sorry.

You can. That example was supposed to show that you can freely change the
LANG variable, and the Cygwin utils that rely on it act accordingly.

Well, you can't change it freely under the Emacs setup ("workaround"),
which requires that LANG be set to the same value as the current system
locale in order for Emacs to pass Unicode/non-system-encoding args.

So I'm asking the same question again: why do you think it's not worth
fixing this Emacs setup restriction, so that Cygwin apps work as they do
from the cygwin/cmd shell (setting whatever supported locale is needed on
the fly)?




Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#6546; Package emacs. (Fri, 23 Jul 2010 18:11:02 GMT) Full text and rfc822 format available.

Message #52 received at 6546 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Laimonas Vėbra <laimonas.vebra <at> gmail.com>
Cc: 6546 <at> debbugs.gnu.org
Subject: Re: bug#6546: win32 grep/shell utf-8 encoding
Date: Fri, 23 Jul 2010 21:09:31 +0300
> Date: Fri, 23 Jul 2010 18:50:54 +0300
> From: Laimonas Vėbra <laimonas.vebra <at> gmail.com>
> CC: 6546 <at> debbugs.gnu.org
> 
> Eli Zaretskii wrote:
> 
> > You cannot easily change the locale of a Windows system by specifying
> > some environment variable.  You need to actually switch it
> > system-wide.  As long as we use ANSI APIs on Windows, we can only
> 
> I am talking about LANG env settings, which we can freely change for the 
> cygwin apps to act differently (as we need).

You are talking about Cygwin programs, while I'm talking about the
native w32 build of Emacs.  The effect of LANG and the way to change
the locale is different for each one of these two.

> > You can't, sorry.
> 
> You can. That example was supposed to show, that you can freely change 
> LANG variable and cygwin utils, which relies on it, acts appropriately.

Again, I was not talking about Cygwin, I was talking about the native
w32 build of Emacs.  It doesn't use the Unicode (UTF-16) APIs, so it
can only support the current codepage when it invokes programs through
the Windows APIs.

> So, i'm asking the same question again -- why do you think it's not 
> worth to fix this Emacs setup restriction in order to work with cygwin 
> apps like it's intended from cygwin/cmd shell (setting on the fly as 
> needed whatever supported locale)?

I already answered that.  I have nothing to add to what I said.





Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#6546; Package emacs. (Fri, 23 Jul 2010 19:08:01 GMT) Full text and rfc822 format available.

Message #55 received at 6546 <at> debbugs.gnu.org (full text, mbox):

From: Laimonas Vėbra <laimonas.vebra <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 6546 <at> debbugs.gnu.org
Subject: Re: bug#6546: win32 grep/shell utf-8 encoding
Date: Fri, 23 Jul 2010 22:07:16 +0300
Eli Zaretskii wrote:
>> Date: Fri, 23 Jul 2010 18:50:54 +0300
>> From: Laimonas Vėbra<laimonas.vebra <at> gmail.com>
>> CC: 6546 <at> debbugs.gnu.org
>>
>> Eli Zaretskii wrote:
>>
>>> You cannot easily change the locale of a Windows system by specifying
>>> some environment variable.  You need to actually switch it
>>> system-wide.  As long as we use ANSI APIs on Windows, we can only
>>
>> I am talking about LANG env settings, which we can freely change for the
>> cygwin apps to act differently (as we need).
>
> You are talking about Cygwin programs, while I'm talking about the
> native w32 build of Emacs.  The effect of LANG and the way to change
> the locale is different for each one of these two.

I am talking about the restrictions on the LANG setting that Emacs
imposes. I think it shouldn't.

>
>>> You can't, sorry.
>>
>> You can. That example was supposed to show, that you can freely change
>> LANG variable and cygwin utils, which relies on it, acts appropriately.
>
> Again, I was not talking about Cygwin, I was talking about the native
> w32 build of Emacs.  It doesn't use the Unicode (UTF-16) APIs, so it
> can only support the current codepage when it invokes programs through
> the Windows APIs.

It *can* (try the mingw example that I posted) pass UTF-8 encoded
arguments (and arguments in other encodings) when it invokes external
programs, and for that it doesn't need to use the UTF-16 API _everywhere_.
Like I said, it now works (perfectly) with native/mingw apps without any
change.





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#6546; Package emacs. (Sun, 24 Apr 2022 12:02:02 GMT) Full text and rfc822 format available.

Message #58 received at 6546 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Laimonas Vėbra <laimonas.vebra <at> gmail.com>
Cc: 6546 <at> debbugs.gnu.org, 6705 <at> debbugs.gnu.org
Subject: Re: bug#6705: w32 cmdproxy.c pass args to cygwin; erroneous charset
 conversion (problem description, solution/suggestion)
Date: Sun, 24 Apr 2022 14:01:32 +0200
Laimonas Vėbra <laimonas.vebra <at> gmail.com> writes:

> Create utf-8 file with some unicode characters (Cyrillic, Baltic,
> whatever; not only ascii) and try to grep for some utf-8 strings from
> Emacs (M-x grep).

(I'm going through old bug reports that unfortunately weren't resolved
at the time.)

This was eleven years ago -- is this still an issue in recent
Emacs/Cygwin versions?  (I can't recall seeing any recent reports about
this.)

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Added tag(s) moreinfo. Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Sun, 24 Apr 2022 12:02:03 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#6546; Package emacs. (Sun, 24 Apr 2022 12:32:01 GMT) Full text and rfc822 format available.

Message #63 received at 6546 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: laimonas.vebra <at> gmail.com, 6546 <at> debbugs.gnu.org, 6705 <at> debbugs.gnu.org
Subject: Re: bug#6546: bug#6705: w32 cmdproxy.c pass args to cygwin;
 erroneous charset conversion (problem description,
 solution/suggestion)
Date: Sun, 24 Apr 2022 15:31:27 +0300
> From: Lars Ingebrigtsen <larsi <at> gnus.org>
> Date: Sun, 24 Apr 2022 14:01:32 +0200
> Cc: 6546 <at> debbugs.gnu.org, 6705 <at> debbugs.gnu.org
> 
> Laimonas Vėbra <laimonas.vebra <at> gmail.com> writes:
> 
> > Create utf-8 file with some unicode characters (Cyrillic, Baltic,
> > whatever; not only ascii) and try to grep for some utf-8 strings from
> > Emacs (M-x grep).
> 
> (I'm going through old bug reports that unfortunately weren't resolved
> at the time.)
> 
> This was eleven years ago -- is this still an issue in recent
> Emacs/Cygwin versions?  (I can't recall seeing any recent reports about
> this.)

I think this bug should be closed.  Support for mixing a native w32
Emacs with Cygwin external programs is limited where character
encoding is involved because of the limitations of the APIs we use in
Emacs to invoke external programs, and because native w32 builds of
external programs in most cases support only a single system codepage.

So people who want to be able to invoke Cygwin programs from Emacs and
play by Cygwin LANG and locale rules (which emulate quite well the
Posix environment) should use a Cygwin build of Emacs.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#6546; Package emacs. (Sun, 24 Apr 2022 13:26:02 GMT) Full text and rfc822 format available.

Message #66 received at 6546 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: laimonas.vebra <at> gmail.com, 6546 <at> debbugs.gnu.org, 6705 <at> debbugs.gnu.org
Subject: Re: bug#6546: bug#6705: w32 cmdproxy.c pass args to cygwin;
 erroneous charset conversion (problem description, solution/suggestion)
Date: Sun, 24 Apr 2022 15:25:32 +0200
Eli Zaretskii <eliz <at> gnu.org> writes:

> I think this bug should be closed.  Support for mixing a native w32
> Emacs with Cygwin external programs is limited where character
> encoding is involved because of the limitations of the APIs we use in
> Emacs to invoke external programs, and because native w32 builds of
> external programs in most cases support only a single system codepage.
>
> So people who want to be able to invoke Cygwin programs from Emacs and
> play by Cygwin LANG and locale rules (which emulate quite well the
> Posix environment) should use a Cygwin build of Emacs.

OK; closing this bug report, then.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




bug closed, send any further explanations to 6546 <at> debbugs.gnu.org and Laimonas Vėbra <laimonas.vebra <at> gmail.com> Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Sun, 24 Apr 2022 13:26:03 GMT) Full text and rfc822 format available.

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Mon, 23 May 2022 11:24:05 GMT) Full text and rfc822 format available.




GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.