GNU bug report logs - #19591
24.4; file & buffer compare failures


Package: emacs;

Reported by: Glenn Linderman <v+python <at> g.nevcal.com>

Date: Tue, 13 Jan 2015 20:59:03 UTC

Severity: wishlist

Found in version 24.4

Done: Stefan Kangas <stefan <at> marxist.se>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 19591 in the body.
You can then email your comments to 19591 AT debbugs.gnu.org in the normal way.



Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#19591; Package emacs. (Tue, 13 Jan 2015 20:59:03 GMT)

Acknowledgement sent to Glenn Linderman <v+python <at> g.nevcal.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Tue, 13 Jan 2015 20:59:03 GMT)

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Glenn Linderman <v+python <at> g.nevcal.com>
To: bug-gnu-emacs <at> gnu.org
Subject: 24.4; file & buffer compare failures
Date: Tue, 13 Jan 2015 11:56:54 -0800

I'm delighted that emacs 24.4 can now open all files, even those
that have characters in their names that are not part of the current
ANSI set.

However, the auxiliary program diff when launched by emacs still doesn't
accept files with such characters. The latest version of diff for
windows that I can find is 2.8.7. The error message from diff in the
error buffer seems to contain the proper characters for the file name,
but diff reports it cannot find the file, so I think it is a deficiency
in diff, like the one in emacs versions prior to 24.4: using the
"bytes" version of open instead of the "widechars" version.

While it may be somewhat inefficient, it would be possible for emacs to
work around the deficiency of diff by saving temporary copies of the
buffers to be compared using generated names in the ANSI subset.

Obviously I can achieve that myself, and have a number of times, but then
one must be careful to copy the fixed data back to the original file.
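
The workaround described above can be sketched outside Emacs. The following is a minimal Python illustration under stated assumptions, not Emacs code: `write_buffers_for_diff` is a hypothetical helper and the `cmp-buffer-` prefix is an arbitrary choice. It writes two in-memory texts to temporary files whose generated names are plain ASCII, so an external diff restricted to the ANSI codepage should be able to open them.

```python
import os
import tempfile


def write_buffers_for_diff(text_a, text_b, directory=None):
    """Write two texts to temp files with generated ASCII names.

    Returns the two paths; the caller is responsible for deleting them.
    """
    paths = []
    for text in (text_a, text_b):
        # mkstemp builds the random part of the name from ASCII characters.
        fd, path = tempfile.mkstemp(prefix="cmp-buffer-", suffix=".txt",
                                    dir=directory)
        # newline="" keeps the line endings of the text unchanged.
        with os.fdopen(fd, "w", encoding="utf-8", newline="") as handle:
            handle.write(text)
        paths.append(path)
    return tuple(paths)
```

The two returned paths could then be passed to diff. Because they are throwaway copies, any fix found during the comparison still has to be made in the original file.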


In GNU Emacs 24.4.1 (x86_64-w64-mingw32)
 of 2014-10-20 on KAEL
Windowing system distributor `Microsoft Corp.', version 6.1.7601
Configured using:
 `configure --prefix=/z/emacs --host=x86_64-w64-mingw32
 --target=x86_64-w64-mingw32 --build=x86_64-w64-mingw32 --with-wide-int
 --with-jpeg --with-xpm --with-png --with-tiff --with-rsvg --with-xml2
 --with-gnutls --with-xft --with-sound=yes --with-file-notification=yes
 --without-dbus --without-imagemagick 'CFLAGS=-Ofast
 -fomit-frame-pointer -funroll-loops -g0 -pipe' 'CPPFLAGS=-DNDEBUG
 -DDBUS_STATIC_BUILD' 'LDFLAGS=-static-libgcc -static-libstdc++ -static
 -s -Wl,-s''

Important settings:
  value of $LANG: ENU
  locale-coding-system: cp1252

Major mode: Emacs-Lisp

Minor modes in effect:
  shell-dirtrack-mode: t
  which-function-mode: t
  tooltip-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  size-indication-mode: t
  column-number-mode: t
  line-number-mode: t
  transient-mark-mode: t

Recent input:
<backspace> <backspace> o <backspace> <backspace> o
p e n s SPC o n e SPC t h a t SPC i t SPC p <backspace>
<backspace> <return> p r e v i o u s l y SPC c o u
l d n ' <backspace> ' t SPC ( b e c a u s e SPC t h
e SPC c h a a r <backspace> <backspace> r a c e <backspace>
t e r SPC i s SPC n o t SPC i n SPC t h e SPC c u r
r e n t A N S I <backspace> <backspace> <backspace>
<backspace> SPC A N S I <return> s e t ) , SPC t h
e SPC d i s p l a y SPC o f SPC t h e SPC f i l e SPC
n a m e SPC i n SPC t h e SPC t i t l e SPC b a r SPC
h a s SPC c h o s e <backspace> <backspace> <backspace>
<backspace> <backspace> s u c h SPC c h a r a c t e
r s <return> o m i t t e d . <help-echo> <down-mouse-1>
<mouse-1> C-n C-SPC C-b C-b C-b C-b C-b C-b C-b C-b
C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b
<escape> w <help-echo> <down-mouse-1> <mouse-1> C-SPC
C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b
C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b C-b
C-b C-b C-b C-b C-b <escape> w <help-echo> <down-mouse-1>
<mouse-1> C-SPC <escape> > <escape> w <down-mouse-1>
<mouse-1> <escape> < <help-echo> <down-mouse-1> <mouse-1>
C-x k <return> y e s <return> <escape> x r e p o r
t <tab> <return>

Recent messages:
Checking 151 files in d:/emacs/share/emacs/24.4/lisp/emacs-lisp...
Checking 24 files in d:/emacs/share/emacs/24.4/lisp/cedet...
Checking 57 files in d:/emacs/share/emacs/24.4/lisp/calendar...
Checking 87 files in d:/emacs/share/emacs/24.4/lisp/calc...
Checking 95 files in d:/emacs/share/emacs/24.4/lisp/obsolete...
Checking for load-path shadows...done
Auto-saving...done
Mark set [3 times]
Saved text from "
I'm delighted that emacs 24.4 can, at l"

Load-path shadows:
None found.

Features:
(pp shadow sort gnus-util mail-extr emacsbug message format-spec rfc822
mml mml-sec mm-decode mm-bodies mm-encode mail-parse rfc2231 mailabbrev
gmm-utils mailheader sendmail rfc2047 rfc2045 ietf-drums mm-util
mail-prsvr mail-utils add-log python-mode derived skeleton advice
help-fns edmacro kmacro cl-macs thingatpt flymake rx shell pcomplete
cc-cmds cc-engine cc-vars cc-defs compile cl gv cl-loaddefs cl-lib
comint ansi-color ring misearch multi-isearch help-mode easymenu
whitespace which-func imenu time-date tooltip electric uniquify
ediff-hook vc-hooks lisp-float-type mwheel dos-w32 ls-lisp
w32-common-fns disp-table w32-win w32-vars tool-bar dnd fontset image
regexp-opt fringe tabulated-list newcomment lisp-mode prog-mode register
page menu-bar rfn-eshadow timer select scroll-bar mouse jit-lock
font-lock syntax facemenu font-core frame cham georgian utf-8-lang
misc-lang vietnamese tibetan thai tai-viet lao korean japanese hebrew
greek romanian slovak czech european ethiopic indian cyrillic chinese
case-table epa-hook jka-cmpr-hook help simple abbrev minibuffer nadvice
loaddefs button faces cus-face macroexp files text-properties overlay
sha1 md5 base64 format env code-pages mule custom widget
hashtable-print-readable backquote make-network-process w32notify w32
multi-tty emacs)

Memory information:
((conses 16 169994 18131)
 (symbols 56 23492 0)
 (miscs 48 92 169)
 (strings 32 28274 6229)
 (string-bytes 1 978775)
 (vectors 16 21772)
 (vector-slots 8 1297449 203189)
 (floats 8 77 457)
 (intervals 56 1134 37)
 (buffers 960 17))


Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#19591; Package emacs. (Wed, 14 Jan 2015 18:29:02 GMT)

Message #8 received at 19591 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Glenn Linderman <v+python <at> g.nevcal.com>
Cc: 19591 <at> debbugs.gnu.org
Subject: Re: bug#19591: 24.4; file & buffer compare failures
Date: Wed, 14 Jan 2015 20:28:07 +0200

> Date: Tue, 13 Jan 2015 11:56:54 -0800
> From: Glenn Linderman <v+python <at> g.nevcal.com>
> 
> However, the auxiliary program diff when launched by emacs still doesn't
> accept files with such characters. The latest version of diff for
> windows that I can find is 2.8.7. The error message from diff in the
> error buffer seems to contain the proper characters for the file name,
> but diff reports it cannot find the file, so I think it is a deficiency
> in diff, like the one in emacs versions prior to 24.4: using the
> "bytes" version of open instead of the "widechars" version.

Yes, Diff, as all the other native ports of GNU software to Windows,
uses the ANSI APIs to access files and its command-line arguments.

It is hardly the job of the Emacs team to fix programs that are not
part of the Emacs package.  So I'm not sure what exactly you
expected of the Emacs project in this matter.

You should know that the Emacs support for non-ASCII characters
outside of the current system codepage stops short of extending that
support to subprocesses invoked by Emacs, for this very reason: there
are no native ports known to me of popular programs, such as Diff,
Grep, find/xargs, etc. that can handle such file names.  So being able
to pass such non-ASCII file names to those programs would be a waste
of effort, since they cannot handle them.

> While it may be somewhat inefficient, it would be possible for emacs to
> work around the deficiency of diff by saving temporary copies of the
> buffers to be compared using generated names in the ANSI subset.

This is not practical.  The place in Emacs sources where command-line
arguments of subprocesses are constructed and encoded has no idea
which of these arguments are file names and which aren't.  (There are
also additional technical difficulties to do that, too boring to go
into here.)  Only the application level -- the Lisp program that needs
to invoke Diff or whatever -- knows that.  So what you suggest would
mean we need to add this kind of work-around in each and every place
where some Lisp invokes some program, too many places to do that.  On
top of that, this would be inefficient: a file could be very large.
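
The point that only the calling program knows which arguments are file names can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the function names are hypothetical, and `cp1252` is used only because it is the `locale-coding-system` shown in the report. A caller that keeps options and file names separate can check just the file names before it flattens them into one argument list.

```python
def unencodable_file_names(file_names, codepage="cp1252"):
    """Return the file names that cannot be represented in `codepage`."""
    problematic = []
    for name in file_names:
        try:
            name.encode(codepage)
        except UnicodeEncodeError:
            problematic.append(name)
    return problematic


def build_diff_command(options, file_names):
    """Build an argv list; after this step options and files look alike."""
    return ["diff", *options, *file_names]
```

Once `build_diff_command` has returned, code that sees only the flat list can no longer tell `-u` from a file name, which is the situation described for the low-level subprocess code.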

So I don't think this problem could or should be solved in Emacs.  Let
people who produce the ports of Diff etc. add support for these
characters first, then there will be a good reason for Emacs to do the
same.

Thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#19591; Package emacs. (Wed, 14 Jan 2015 19:54:02 GMT)

Message #11 received at 19591 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Glenn Linderman <v+python <at> g.nevcal.com>
Cc: 19591 <at> debbugs.gnu.org
Subject: Re: bug#19591: 24.4; file & buffer compare failures
Date: Wed, 14 Jan 2015 21:53:23 +0200

[Please keep the bug address on the CC list.]

> Date: Wed, 14 Jan 2015 11:40:16 -0800
> From: Glenn Linderman <v+python <at> g.nevcal.com>
> 
> I didn't expect a fix for diff from the emacs team, but I do know that the
> excellent file comparison is one huge reason that people use emacs... I know
> people that use vi for most editing, but fire up emacs for file comparison...
> probably works on Unix even with funny names...

Most Unix systems use UTF-8 to encode file names, which is why Diff
doesn't have a problem on such systems.

> I was sort of thinking, though, that the case of buffer comparison is a case
> where emacs is creating the files to do the diff, and that it creates temp
> files with names derived from the buffer name, which is, I suppose somewhat
> mnemonic when looking at the error message, but temporary file names such as
> "compare-buffer-1.txt" and "compare-buffer-2.txt" would be just as useful. And
> the file has to be written before the compare can be done in that case anyway.

If you are talking about comparing buffers, not files, then yes,
perhaps Emacs can do something about the issue, if it exists.  But
please provide a reproducible recipe, starting from "emacs -Q", that
shows the problem.

> Of course, the other approach, since diff is invoked with very specific options
> by buffer/file compare, would be to reimplement that aspect of diff internally,
> which would actually be an optimization (no need to write the files, call
> the external program, and read its results) that would also sidestep the need
> for file names at all.

Emacs tries not to reinvent the wheels that already exist.
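
For readers unfamiliar with the in-process alternative being discussed, here is a rough Python sketch using the standard `difflib` module. It is only an illustration of comparing two texts without temporary files or an external program; it says nothing about how Emacs works, and `diff_texts` and the default labels are hypothetical names.

```python
import difflib


def diff_texts(text_a, text_b, name_a="buffer-a", name_b="buffer-b"):
    """Return a unified diff of two strings, computed in-process."""
    lines_a = text_a.splitlines(keepends=True)
    lines_b = text_b.splitlines(keepends=True)
    # The labels are only printed in the header; no file is ever opened.
    return "".join(difflib.unified_diff(lines_a, lines_b,
                                        fromfile=name_a, tofile=name_b))
```

Since the labels are never used to open anything, the characters they contain do not matter to this approach.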

> It does seem, though, that the correct file names are being passed to the
> external programs, at least, the error message seen in emacs has the correct
> file name... it is just that diff isn't smart enough to use the right API to
> open it. Or else the incorrect name being passed isn't being included in the
> error message.

I think just the error message, being generated inside Emacs, shows
the correct file names, what Diff gets are file names butchered by
conversion to the ANSI codepage.  Once again, if you show the command
you issued and the error message you've got in response, we could look
into that and tell what really happens in your case.
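
The effect of converting a file name to an ANSI codepage can be approximated in Python. This is a sketch under assumptions: `simulate_ansi_conversion` is a hypothetical helper, `cp1252` is taken from the report's `locale-coding-system`, and Python's `errors="replace"` substitutes `?`, whereas the actual Windows conversion may choose other substitute characters.

```python
def simulate_ansi_conversion(name, codepage="cp1252"):
    """Round-trip `name` through `codepage`, replacing unencodable chars."""
    return name.encode(codepage, errors="replace").decode(codepage)


# A name that fits in the codepage survives unchanged; a Cyrillic name
# does not, so the program receiving it would look for a different file.
unchanged = simulate_ansi_conversion("plain.txt")
butchered = simulate_ansi_conversion("файл.txt")
```

A program that receives the converted name looks for a file that does not exist, while a message built from the original string would still show the correct name.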




Reply sent to Stefan Kangas <stefan <at> marxist.se>:
You have taken responsibility. (Mon, 30 Sep 2019 01:10:02 GMT)

Notification sent to Glenn Linderman <v+python <at> g.nevcal.com>:
bug acknowledged by developer. (Mon, 30 Sep 2019 01:10:02 GMT)

Message #16 received at 19591-done <at> debbugs.gnu.org (full text, mbox):

From: Stefan Kangas <stefan <at> marxist.se>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 19591-done <at> debbugs.gnu.org, Glenn Linderman <v+python <at> g.nevcal.com>
Subject: Re: bug#19591: 24.4; file & buffer compare failures
Date: Mon, 30 Sep 2019 03:08:50 +0200

Eli Zaretskii <eliz <at> gnu.org> writes:

>> Date: Tue, 13 Jan 2015 11:56:54 -0800
>> From: Glenn Linderman <v+python <at> g.nevcal.com>
>>
>> However, the auxiliary program diff when launched by emacs still doesn't
>> accept files with such characters. The latest version of diff for
>> windows that I can find is 2.8.7. The error message from diff in the
>> error buffer seems to contain the proper characters for the file name,
>> but diff reports it cannot find the file, so I think it is a deficiency
>> in diff, like the one in emacs versions prior to 24.4: using the
>> "bytes" version of open instead of the "widechars" version.
>
> Yes, Diff, as all the other native ports of GNU software to Windows,
> uses the ANSI APIs to access files and its command-line arguments.
>
> It is hardly the job of the Emacs team to fix programs that are not
> part of the Emacs package.  So I'm not sure what exactly you
> expected of the Emacs project in this matter.
>
> You should know that the Emacs support for non-ASCII characters
> outside of the current system codepage stops short of extending that
> support to subprocesses invoked by Emacs, for this very reason: there
> are no native ports known to me of popular programs, such as Diff,
> Grep, find/xargs, etc. that can handle such file names.  So being able
> to pass such non-ASCII file names to those programs would be a waste
> of effort, since they cannot handle them.
>
>> While it may be somewhat inefficient, it would be possible for emacs to
>> work around the deficiency of diff by saving temporary copies of the
>> buffers to be compared using generated names in the ANSI subset.
>
> This is not practical.  The place in Emacs sources where command-line
> arguments of subprocesses are constructed and encoded has no idea
> which of these arguments are file names and which aren't.  (There are
> also additional technical difficulties to do that, too boring to go
> into here.)  Only the application level -- the Lisp program that needs
> to invoke Diff or whatever -- knows that.  So what you suggest would
> mean we need to add this kind of work-around in each and every place
> where some Lisp invokes some program, too many places to do that.  On
> top of that, this would be inefficient: a file could be very large.
>
> So I don't think this problem could or should be solved in Emacs.  Let
> people who produce the ports of Diff etc. add support for these
> characters first, then there will be a good reason for Emacs to do the
> same.

With the above explanation, I'm closing this bug report.

Best regards,
Stefan Kangas




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Mon, 28 Oct 2019 11:24:18 GMT)



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.