GNU bug report logs - #21816
elisp-mode-tests fails on a case-preserving filesystem

Previous Next

Package: emacs;

Reported by: Juanma Barranquero <lekktu <at> gmail.com>

Date: Mon, 2 Nov 2015 16:44:02 UTC

Severity: minor

Found in version 25.0.50

Done: Juanma Barranquero <lekktu <at> gmail.com>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 21816 in the body.
You can then email your comments to 21816 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to stephen_leake <at> stephe-leake.org, bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Mon, 02 Nov 2015 16:44:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Juanma Barranquero <lekktu <at> gmail.com>:
New bug report received and forwarded. Copy sent to stephen_leake <at> stephe-leake.org, bug-gnu-emacs <at> gnu.org. (Mon, 02 Nov 2015 16:44:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Juanma Barranquero <lekktu <at> gmail.com>
To: Bug-Gnu-Emacs <bug-gnu-emacs <at> gnu.org>
Subject: elisp-mode-tests fails on a case-preserving filesystem
Date: Mon, 2 Nov 2015 17:41:58 +0100
[Message part 1 (text/plain, inline)]
Package: emacs
Version: 25.0.50
Severity: minor
X-Debbugs-CC: stephen_leake <at> stephe-leake.org


Test xref-elisp-test-find-defs-feature-el fails on my Windows system
because of a case-sensitive filename comparison.

Basically

ELISP> (aref (aref (car (elisp--xref-find-definitions 'xref)) 2) 3)
"c:/devel/emacs/repo/trunk/lisp/progmodes/xref.el"

ELISP> (aref (xref-make-elisp-location
     'xref 'feature
     (expand-file-name "../../lisp/progmodes/xref.el"
emacs-test-dir))
    3)
"c:/Devel/emacs/repo/trunk/lisp/progmodes/xref.el"

and the test fails at the d/D in c:/Devel vs. c:/devel (as seen in the log
below).

The value returned by `expand-file-name' is the real name as stored in the
filesystem (the directory is called "Devel"). The value from
`elisp--xref-find-definitions' ultimately comes from `locate-file', and
reflects what's stored in `load-path' (which can be set by the user, so we
cannot force it to have "correct" casing).

ELISP> (locate-file "xref" load-path '(".el"))
"c:/devel/emacs/repo/trunk/lisp/progmodes/xref.el"
ELISP> (locate-file "xref" '("c:/Devel/emacs/repo/trunk/lisp/progmodes")
'(".el"))
"c:/Devel/emacs/repo/trunk/lisp/progmodes/xref.el"

The test code in elisp-mode-tests.el uses `equal' to compare the xref
structures. I can think a couple of ways to fix the test, but they are all
quite ugly....




Test xref-elisp-test-find-defs-feature-el backtrace:
  (if (unwind-protect (setq value-107 (apply fn-105 args-106)) (setq f
  (let (form-description-109) (if (unwind-protect (setq value-107 (app
  (let ((value-107 (quote ert-form-evaluation-aborted-108))) (let (for
  (let ((fn-105 (function equal)) (args-106 (list xref (or (if (consp
  (let ((xref (car-safe (prog1 xrefs (setq xrefs (cdr xrefs))))) (expe
  (while xrefs (let ((xref (car-safe (prog1 xrefs (setq xrefs (cdr xre
  xref-elisp-test-run(([eieio-class-tag--xref-item #("(feature xref)"
  (closure (cl-struct-xref-elisp-root-type-tags t) nil (xref-elisp-tes
  ert--run-test-internal([cl-struct-ert--test-execution-info [cl-struc
  ert-run-test([cl-struct-ert-test xref-elisp-test-find-defs-feature-e
  ert-run-or-rerun-test([cl-struct-ert--stats t [[cl-struct-ert-test e
  ert-run-tests(t #[385 "\306 \307\"\203G \211\211G\310U\203 \211@\20
  ert-run-tests-batch(nil)
  ert-run-tests-batch-and-exit()
  command-line-1(("-l" "ert" "-l" "elisp-mode-tests.el" "-f" "ert-run-
  command-line()
  normal-top-level()
Test xref-elisp-test-find-defs-feature-el condition:
    (ert-test-failed
     ((should
       (equal xref
              (or ... expected)))
      :form
      (equal
       [eieio-class-tag--xref-item
        #("(feature xref)" 1 8 ... 9 13 ...)
        [cl-struct-xref-elisp-location xref feature
"c:/devel/emacs/repo/trunk/lisp/progmodes/xref.el"]]
       [eieio-class-tag--xref-item "(feature xref)"
                                   [cl-struct-xref-elisp-location xref
feature "c:/Devel/emacs/repo/trunk/lisp/progmodes/xref.el"]])
      :value nil :explanation
      (array-elt 2
                 (array-elt 3
                            (array-elt 3 ...)))))
   FAILED  34/35  xref-elisp-test-find-defs-feature-el
   passed  35/35  xref-elisp-test-find-defs-feature-eval

Ran 35 tests, 34 results as expected, 1 unexpected (2015-11-02
17:11:45+0100)

1 unexpected results:
   FAILED  xref-elisp-test-find-defs-feature-el
[Message part 2 (text/html, inline)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Mon, 02 Nov 2015 17:58:01 GMT) Full text and rfc822 format available.

Message #8 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Juanma Barranquero <lekktu <at> gmail.com>
Cc: 21816 <at> debbugs.gnu.org, stephen_leake <at> stephe-leake.org
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Mon, 02 Nov 2015 19:57:01 +0200
> From: Juanma Barranquero <lekktu <at> gmail.com>
> Date: Mon, 2 Nov 2015 17:41:58 +0100
> Cc: stephen_leake <at> stephe-leake.org
> 
> Test xref-elisp-test-find-defs-feature-el fails on my Windows system because of
> a case-sensitive filename comparison.
> 
> Basically
> 
> ELISP> (aref (aref (car (elisp--xref-find-definitions 'xref)) 2) 3)
> "c:/devel/emacs/repo/trunk/lisp/progmodes/xref.el"
> 
> ELISP> (aref (xref-make-elisp-location
> 'xref 'feature
> (expand-file-name "../../lisp/progmodes/xref.el"
> emacs-test-dir))
> 3)
> "c:/Devel/emacs/repo/trunk/lisp/progmodes/xref.el"
> 
> and the test fails at the d/D in c:/Devel vs. c:/devel (as seen in the log
> below).
> 
> The value returned by `expand-file-name' is the real name as stored in the
> filesystem (the directory is called "Devel"). The value from
> `elisp--xref-find-definitions' ultimately comes from `locate-file', and
> reflects what's stored in `load-path' (which can be set by the user, so we
> cannot force it to have "correct" casing).
> 
> ELISP> (locate-file "xref" load-path '(".el"))
> "c:/devel/emacs/repo/trunk/lisp/progmodes/xref.el"
> ELISP> (locate-file "xref" '("c:/Devel/emacs/repo/trunk/lisp/progmodes") '
> (".el"))
> "c:/Devel/emacs/repo/trunk/lisp/progmodes/xref.el"
> 
> The test code in elisp-mode-tests.el uses `equal' to compare the xref
> structures. I can think a couple of ways to fix the test, but they are all
> quite ugly....

The comparison should use file-equal-p.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Mon, 02 Nov 2015 18:22:01 GMT) Full text and rfc822 format available.

Message #11 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Juanma Barranquero <lekktu <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 21816 <at> debbugs.gnu.org, Stephen Leake <stephen_leake <at> stephe-leake.org>
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Mon, 2 Nov 2015 19:20:31 +0100
[Message part 1 (text/plain, inline)]
On Mon, Nov 2, 2015 at 6:57 PM, Eli Zaretskii <eliz <at> gnu.org> wrote:

> The comparison should use file-equal-p.

Yes, that's one of the ugly ways to fix it that I mentioned. "Ugly" because
the test, through a macro, is not comparing file names, but two structures,
which happen to have a field that contains a filename. IOW, we'll have to
go from

     (should (equal xref
                     (or (when (consp expected) (car expected)) expected)))

to defining a new comparing function, specific to this test. Which I
suppose I'll eventually do.
[Message part 2 (text/html, inline)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Tue, 03 Nov 2015 01:21:02 GMT) Full text and rfc822 format available.

Message #14 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Juanma Barranquero <lekktu <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 21816 <at> debbugs.gnu.org, Stephen Leake <stephen_leake <at> stephe-leake.org>
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Tue, 3 Nov 2015 02:19:54 +0100
[Message part 1 (text/plain, inline)]
On Mon, Nov 2, 2015 at 7:20 PM, Juanma Barranquero <lekktu <at> gmail.com> wrote:

> to defining a new comparing function, specific to this test. Which I
suppose I'll eventually do.

An additional complication is that file-equal-p is intended to compare
existing files ("If FILE1 or FILE2 does not exist, the return value is
unspecified."), but the xref-elisp-location values do not seem to be so
constrained. In elisp-mode-tests.el there are examples like

   (xref-make "(defun xref-find-definitions)"
     (xref-make-elisp-location
      'xref-find-definitions nil
      (expand-file-name "../../lisp/progmodes/xref.el" emacs-test-dir)))))

but also

   (xref-make "(defun buffer-live-p)"
     (xref-make-elisp-location 'buffer-live-p nil "src/buffer.c"))))
[Message part 2 (text/html, inline)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Tue, 03 Nov 2015 03:43:01 GMT) Full text and rfc822 format available.

Message #17 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Juanma Barranquero <lekktu <at> gmail.com>
Cc: 21816 <at> debbugs.gnu.org, stephen_leake <at> stephe-leake.org
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Tue, 03 Nov 2015 05:42:28 +0200
> From: Juanma Barranquero <lekktu <at> gmail.com>
> Date: Tue, 3 Nov 2015 02:19:54 +0100
> Cc: 21816 <at> debbugs.gnu.org, Stephen Leake <stephen_leake <at> stephe-leake.org>
> 
> An additional complication is that file-equal-p is intended to compare existing
> files ("If FILE1 or FILE2 does not exist, the return value is unspecified."),
> but the xref-elisp-location values do not seem to be so constrained.

We could introduce a new function, file-name-equal-p, say, that would
be free of this limitation (for the price of not being able to compare
a file name with its 8+3 short variant and such likes).




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Tue, 03 Nov 2015 09:19:01 GMT) Full text and rfc822 format available.

Message #20 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Stephen Leake <stephen_leake <at> stephe-leake.org>
To: Juanma Barranquero <lekktu <at> gmail.com>
Cc: 21816 <at> debbugs.gnu.org
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Tue, 03 Nov 2015 03:17:42 -0600
Juanma Barranquero <lekktu <at> gmail.com> writes:

> Package: emacs
> Version: 25.0.50
> Severity: minor
> X-Debbugs-CC: stephen_leake <at> stephe-leake.org
>
>
> Test xref-elisp-test-find-defs-feature-el fails on my Windows system
> because of a case-sensitive filename comparison.
>
> Basically
>
> ELISP> (aref (aref (car (elisp--xref-find-definitions 'xref)) 2) 3)
> "c:/devel/emacs/repo/trunk/lisp/progmodes/xref.el"
>
> ELISP> (aref (xref-make-elisp-location
>      'xref 'feature
>      (expand-file-name "../../lisp/progmodes/xref.el"
> emacs-test-dir))
>     3)
> "c:/Devel/emacs/repo/trunk/lisp/progmodes/xref.el"
>
> and the test fails at the d/D in c:/Devel vs. c:/devel (as seen in the log
> below).
>
> The value returned by `expand-file-name' is the real name as stored in the
> filesystem (the directory is called "Devel"). 

This is not caused by expand-file-name, but by the values passed to it.
In this case, `emacs-test-dir' has the "c:/Devel"; it is set by:

(defconst emacs-test-dir (file-name-directory (or load-file-name (buffer-file-name))))

I can reproduce the problem by adding a "downcase" to this; my emacs
test dir is in c:/Projects/..., but my load-path has the correct case.

> The value from
> `elisp--xref-find-definitions' ultimately comes from `locate-file', and
> reflects what's stored in `load-path' (which can be set by the user, so we
> cannot force it to have "correct" casing).

Well, since this is only a problem in a test run by developers (not
general users), we _could_ insist on that, but I agree it would be nice
not to.

> The test code in elisp-mode-tests.el uses `equal' to compare the xref
> structures. I can think a couple of ways to fix the test, but they are all
> quite ugly....

The canonical solution for case-insensitive string comparisons is to use
`downcase' on both values before using `equal'. This works for me:

(defun xref-elisp-test-run (xrefs expected-xrefs)
  (should (= (length xrefs) (length expected-xrefs)))
  (while xrefs
    (let* ((xref (pop xrefs))
           (expected (pop expected-xrefs))
           (expected-xref (or (when (consp expected) (car expected)) expected))
           (expected-source (when (consp expected) (cdr expected))))

      ;; Downcase the filename for case-insensitive file systems.
      (setf (xref-elisp-location-file (oref xref :location))
            (downcase (xref-elisp-location-file (oref xref :location))))

      (setf (xref-elisp-location-file (oref expected-xref :location))
            (downcase (xref-elisp-location-file (oref expected-xref :location))))

      (should (equal xref expected-xref))

      (xref--goto-location (xref-item-location xref))
      (back-to-indentation)
      (should (looking-at (or expected-source
                              (xref-elisp-test-descr-to-target expected)))))
    ))

together with:

;; We add 'downcase' here to deliberately cause a potential problem on
;; case-insensitive file systems. On such systems, `load-file-name'
;; may not have the same case as the real file system, since the user
;; can set `load-path' to have the wrong case. This is handled in
;; `xref-elisp-test-run'.
(defconst emacs-test-dir (downcase (file-name-directory (or load-file-name (buffer-file-name)))))


If no one objects to this, I'll commit it.

-- 
-- Stephe




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Tue, 03 Nov 2015 10:08:01 GMT) Full text and rfc822 format available.

Message #23 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Juanma Barranquero <lekktu <at> gmail.com>
To: Stephen Leake <stephen_leake <at> stephe-leake.org>
Cc: 21816 <at> debbugs.gnu.org
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Tue, 3 Nov 2015 11:06:57 +0100
[Message part 1 (text/plain, inline)]
On Tue, Nov 3, 2015 at 10:17 AM, Stephen Leake <
stephen_leake <at> stephe-leake.org> wrote:

> Well, since this is only a problem in a test run by developers (not
> general users), we _could_ insist on that, but I agree it would be nice
> not to.

I think it points to a problem with xrefs in general. IMO xref-location
should have a generic compare function (specialized by default on
xref-file-location) so two of them can be compared other than by `equal'
and get the right result in case-insensitive systems. And xref-item should
also have a generic compare function, whose default implementation would
use xref-compare-locations (or whatever the name) to compare its location
slots.

> The canonical solution for case-insensitive string comparisons is to use
> `downcase' on both values before using `equal'. This works for me:

It works for me too, though I get a whole bunch of warnings running the
test:

c:/devel/emacs/repo/trunk/test/automated/elisp-mode-tests.el and
c:/Devel/emacs/repo/trunk/test/automated/elisp-mode-tests.el are the same
file

> ;; We add 'downcase' here to deliberately cause a potential problem on
> ;; case-insensitive file systems. On such systems, `load-file-name'
> ;; may not have the same case as the real file system, since the user
> ;; can set `load-path' to have the wrong case. This is handled in
> ;; `xref-elisp-test-run'.

In case-insensitive (or, as Windows, case-preserving) systems, that's not
the "wrong" case, just an alternative ;-)

> If no one objects to this, I'll commit it.

Please do, thanks.
[Message part 2 (text/html, inline)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Tue, 03 Nov 2015 14:52:01 GMT) Full text and rfc822 format available.

Message #26 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Stephen Leake <stephen_leake <at> stephe-leake.org>
To: Juanma Barranquero <lekktu <at> gmail.com>
Cc: 21816 <at> debbugs.gnu.org
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Tue, 03 Nov 2015 08:51:29 -0600
Juanma Barranquero <lekktu <at> gmail.com> writes:

> On Tue, Nov 3, 2015 at 10:17 AM, Stephen Leake <
> stephen_leake <at> stephe-leake.org> wrote:
>
>> Well, since this is only a problem in a test run by developers (not
>> general users), we _could_ insist on that, but I agree it would be nice
>> not to.
>
> I think it points to a problem with xrefs in general. IMO xref-location
> should have a generic compare function (specialized by default on
> xref-file-location) so two of them can be compared other than by `equal'
> and get the right result in case-insensitive systems. And xref-item should
> also have a generic compare function, whose default implementation would
> use xref-compare-locations (or whatever the name) to compare its location
> slots.

What is the use case for these compares, other than tests?

>> The canonical solution for case-insensitive string comparisons is to use
>> `downcase' on both values before using `equal'. This works for me:
>
> It works for me too, though I get a whole bunch of warnings running the
> test:
>
> c:/devel/emacs/repo/trunk/test/automated/elisp-mode-tests.el and
> c:/Devel/emacs/repo/trunk/test/automated/elisp-mode-tests.el are the same
> file

Which is why I always put the right case in `load-path' :).

There's probably a way (other than fixing your load-path) to suppress
those warnings; ert can redirect errors in general. I did not look into
it.

More worrisome is this from the byte compiler:

In xref-elisp-test-run:
../../../master/test/automated/elisp-mode-tests.el:185:57:Warning: Unknown
    slot `:location'
../../../master/test/automated/elisp-mode-tests.el:189:60:Warning: Unknown
    slot `:location'

yet the code runs and the tests pass. Changing ":location" to "location"
fixes the warning.

There are also several "unused lexical arg" warnings; see the WORKAROUND
comment.

>> If no one objects to this, I'll commit it.
>
> Please do, thanks.

Done.

--
-- Stephe




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Tue, 03 Nov 2015 15:00:09 GMT) Full text and rfc822 format available.

Message #29 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Juanma Barranquero <lekktu <at> gmail.com>
To: Stephen Leake <stephen_leake <at> stephe-leake.org>
Cc: 21816 <at> debbugs.gnu.org
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Tue, 3 Nov 2015 15:58:26 +0100
[Message part 1 (text/plain, inline)]
On Tue, Nov 3, 2015 at 3:51 PM, Stephen Leake <
stephen_leake <at> stephe-leake.org> wrote:

> What is the use case for these compares, other than tests?

Avoiding duplicates? When adding an xref to a list of xrefs, how do you
check that isn't it already there?

> Which is why I always put the right case in `load-path' :).

I haven't set my load-path. Its value comes straight of  autoconf /
configure / make bootstrap.

> There are also several "unused lexical arg" warnings; see the WORKAROUND
> comment.

Yes, I saw them. But these are warnings when byte-compiling the test file.
The "same file" warnings happen when running the tests.

I'll look into silencing them.

> Done.

Thanks.
[Message part 2 (text/html, inline)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Tue, 03 Nov 2015 15:41:03 GMT) Full text and rfc822 format available.

Message #32 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stephen Leake <stephen_leake <at> stephe-leake.org>
Cc: lekktu <at> gmail.com, 21816 <at> debbugs.gnu.org
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Tue, 03 Nov 2015 17:40:34 +0200
> From: Stephen Leake <stephen_leake <at> stephe-leake.org>
> Date: Tue, 03 Nov 2015 08:51:29 -0600
> Cc: 21816 <at> debbugs.gnu.org
> 
> >> The canonical solution for case-insensitive string comparisons is to use
> >> `downcase' on both values before using `equal'. This works for me:
> >
> > It works for me too, though I get a whole bunch of warnings running the
> > test:
> >
> > c:/devel/emacs/repo/trunk/test/automated/elisp-mode-tests.el and
> > c:/Devel/emacs/repo/trunk/test/automated/elisp-mode-tests.el are the same
> > file
> 
> Which is why I always put the right case in `load-path' :).

You cannot rely on that, in general.  File names don't have to come
from load-path, and load-path is not entirely under your control.

> There's probably a way (other than fixing your load-path) to suppress
> those warnings; ert can redirect errors in general. I did not look into
> it.
> 
> More worrisome is this from the byte compiler:
> 
> In xref-elisp-test-run:
> ../../../master/test/automated/elisp-mode-tests.el:185:57:Warning: Unknown
>     slot `:location'
> ../../../master/test/automated/elisp-mode-tests.el:189:60:Warning: Unknown
>     slot `:location'
> 
> yet the code runs and the tests pass. Changing ":location" to "location"
> fixes the warning.
> 
> There are also several "unused lexical arg" warnings; see the WORKAROUND
> comment.
> 
> >> If no one objects to this, I'll commit it.
> >
> > Please do, thanks.
> 
> Done.

Sorry to bust this party, guys, but I don't think this solution is
"good enough".  First, it won't DTRT with Windows file names that have
backslashes (yes, they happen in Emacs).  Second, 'downcase' has a
known gotcha of using the current buffer's case tables, which is
something you definitely don't want here.

Bottom line, doing this "right" is not so trivial, and a dedicated
function is a better solution, IMO.  If you put it in files-x.el, it
could be reused by other Lisp programs in the need of this.

IME, it doesn't pay off to use such incomplete solutions, the result
is an annoying, albeit minor, maintenance burden.  It's much better
IMO to fix this once and for all, then we could just forget about this
issue.

Thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Tue, 03 Nov 2015 15:47:01 GMT) Full text and rfc822 format available.

Message #35 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Juanma Barranquero <lekktu <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 21816 <at> debbugs.gnu.org, Stephen Leake <stephen_leake <at> stephe-leake.org>
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Tue, 3 Nov 2015 16:45:31 +0100
[Message part 1 (text/plain, inline)]
On Tue, Nov 3, 2015 at 4:40 PM, Eli Zaretskii <eliz <at> gnu.org> wrote:

> Sorry to bust this party, guys, but I don't think this solution is
> "good enough".  First, it won't DTRT with Windows file names that have
> backslashes (yes, they happen in Emacs).  Second, 'downcase' has a
> known gotcha of using the current buffer's case tables, which is
> something you definitely don't want here.

I take Stephen's commit as a fix for the current tests. The tests are not
about comparison.

But, as I've said, I think the real fix is adding generic comparison
functions to xrefs.
[Message part 2 (text/html, inline)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Tue, 03 Nov 2015 15:53:01 GMT) Full text and rfc822 format available.

Message #38 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Juanma Barranquero <lekktu <at> gmail.com>, Eli Zaretskii <eliz <at> gnu.org>
Cc: 21816 <at> debbugs.gnu.org, Stephen Leake <stephen_leake <at> stephe-leake.org>
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Tue, 3 Nov 2015 17:52:22 +0200
On 11/03/2015 05:45 PM, Juanma Barranquero wrote:

> But, as I've said, I think the real fix is adding generic comparison
> functions to xrefs.

Just to make it work better, in theory, on case-sensitive filesystems?

Please produce a real-life example of breakage first, please.





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Tue, 03 Nov 2015 16:06:02 GMT) Full text and rfc822 format available.

Message #41 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Juanma Barranquero <lekktu <at> gmail.com>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 21816 <at> debbugs.gnu.org,
 Stephen Leake <stephen_leake <at> stephe-leake.org>
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Tue, 3 Nov 2015 17:04:14 +0100
[Message part 1 (text/plain, inline)]
On Tue, Nov 3, 2015 at 4:52 PM, Dmitry Gutov <dgutov <at> yandex.ru> wrote:

> Just to make it work better, in theory, on case-sensitive filesystems?

I'm not sure where you're arguing that

a) xrefs never need to be compared
b) they are sometimes, but it's ok to use equal and fail in case-preserving
and case-insensitive systems

> Please produce a real-life example of breakage first, please.

We're here discussing a real one: a test that needs an ad hoc fix because
it couldn't just compare two xrefs. Or do you think that writing tests
isn't a real-life example?

I'm a bit surprised to be discussing that the basic type offered by an
infrastructure library as xref needs ways to be compared...
[Message part 2 (text/html, inline)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Tue, 03 Nov 2015 16:19:03 GMT) Full text and rfc822 format available.

Message #44 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Juanma Barranquero <lekktu <at> gmail.com>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 21816 <at> debbugs.gnu.org,
 Stephen Leake <stephen_leake <at> stephe-leake.org>
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Tue, 3 Nov 2015 18:18:18 +0200
On 11/03/2015 06:04 PM, Juanma Barranquero wrote:

> I'm not sure where you're arguing that
>
> a) xrefs never need to be compared
> b) they are sometimes, but it's ok to use equal and fail in
> case-preserving and case-insensitive systems

Why don't we canonicalize the file name somehow when an xref instance is 
created?

> We're here discussing a real one: a test that needs an ad hoc fix
> because it couldn't just compare two xrefs. Or do you think that writing
> tests isn't a real-life example?

It can be solved as above, for example.

> I'm a bit surprised to be discussing that the basic type offered by an
> infrastructure library as xref needs ways to be compared...

Compared in an impementation-specialized way? Yes, I'd like to see a use 
case.

It's very nontrivial to write a comparison function that would work 
beyond the type you're currently defining (just one xref subtype). 
Hence, its applicability will probably be rather limited.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Tue, 03 Nov 2015 16:35:02 GMT) Full text and rfc822 format available.

Message #47 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Juanma Barranquero <lekktu <at> gmail.com>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 21816 <at> debbugs.gnu.org,
 Stephen Leake <stephen_leake <at> stephe-leake.org>
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Tue, 3 Nov 2015 17:33:13 +0100
[Message part 1 (text/plain, inline)]
On Tue, Nov 3, 2015 at 5:18 PM, Dmitry Gutov <dgutov <at> yandex.ru> wrote:

> Why don't we canonicalize the file name somehow when an xref instance is
created?

I'm OK with that, but then we need to add a comment or something to xref
explicitly stating that, when deriving new types from xref-location (or
xref-item, if someone adds a file slot to it for whatever reason) all
filenames *must* be kept canonicalized.

And then, someone will derive from xref-location and add propertized
strings and will want to compare *including* the properties...

No, of course I cannot think of a use case right now, but again, it's
infrastructure. It's there to help generalization and reuse.

> It can be solved as above, for example.

Until someone adds tests for etags and wants to compare two
xref-etag-location instances (which include a file slot), and then they
will have to copy the code from elisp-mode-tests.el.

> It's very nontrivial to write a comparison function that would work
beyond the type
> you're currently defining (just one xref subtype). Hence, its
applicability will probably
> be rather limited.

Well, you define a generic function that defaults to using `equal', and let
subtypes do the refining. `equal' already is quite broad, so I'd bet many
subtypes will never require anything more complex than the default
xref-compare-locations.

Now that I think of it... How do you sort xref-items and xref-locations?
[Message part 2 (text/html, inline)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Tue, 03 Nov 2015 16:36:02 GMT) Full text and rfc822 format available.

Message #50 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: lekktu <at> gmail.com, 21816 <at> debbugs.gnu.org, stephen_leake <at> stephe-leake.org
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Tue, 03 Nov 2015 18:34:44 +0200
> Cc: Eli Zaretskii <eliz <at> gnu.org>, 21816 <at> debbugs.gnu.org,
>  Stephen Leake <stephen_leake <at> stephe-leake.org>
> From: Dmitry Gutov <dgutov <at> yandex.ru>
> Date: Tue, 3 Nov 2015 18:18:18 +0200
> 
> On 11/03/2015 06:04 PM, Juanma Barranquero wrote:
> 
> > I'm not sure where you're arguing that
> >
> > a) xrefs never need to be compared
> > b) they are sometimes, but it's ok to use equal and fail in
> > case-preserving and case-insensitive systems
> 
> Why don't we canonicalize the file name somehow when an xref instance is 
> created?

Both forms (and some additional ones) are perfectly canonical.

> > I'm a bit surprised to be discussing that the basic type offered by an
> > infrastructure library as xref needs ways to be compared...
> 
> Compared in an impementation-specialized way? Yes, I'd like to see a use 
> case.
> 
> It's very nontrivial to write a comparison function that would work 
> beyond the type you're currently defining (just one xref subtype). 
> Hence, its applicability will probably be rather limited.

This isn't a xref-specific problem, so I think it should have a more
general solution that any package could use.  It's not reasonable to
expect from every Emacs developer to be an expert on subtle aspects of
comparing Windows file names.  We should provide a tested solution for
that.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Tue, 03 Nov 2015 16:44:01 GMT) Full text and rfc822 format available.

Message #53 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Juanma Barranquero <lekktu <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 21816 <at> debbugs.gnu.org, Stephen Leake <stephen_leake <at> stephe-leake.org>,
 Dmitry Gutov <dgutov <at> yandex.ru>
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Tue, 3 Nov 2015 17:42:28 +0100
[Message part 1 (text/plain, inline)]
On Tue, Nov 3, 2015 at 5:34 PM, Eli Zaretskii <eliz <at> gnu.org> wrote:

> Both forms (and some additional ones) are perfectly canonical.

It would be a matter of choosing one.

> This isn't a xref-specific problem, so I think it should have a more
> general solution that any package could use.  It's not reasonable to
> expect from every Emacs developer to be an expert on subtle aspects of
> comparing Windows file names.  We should provide a tested solution for
> that.

That would be helpful in general, but wouldn't help with the case that now,
to compare two xref-location instances, you have to delve into their
innards. Different problems.

<rambling>Unless you're proposing a variant of `equal' with better
file-comparing abilities... which doesn't make sense because
`equal-super-duper` wouldn't have a way to know that some things are
filenames and not just arbitrary strings that match a filename just by
happenstance...</rambling>
[Message part 2 (text/html, inline)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Tue, 03 Nov 2015 16:54:02 GMT) Full text and rfc822 format available.

Message #56 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Juanma Barranquero <lekktu <at> gmail.com>
Cc: 21816 <at> debbugs.gnu.org, stephen_leake <at> stephe-leake.org, dgutov <at> yandex.ru
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Tue, 03 Nov 2015 18:53:49 +0200
> From: Juanma Barranquero <lekktu <at> gmail.com>
> Date: Tue, 3 Nov 2015 17:42:28 +0100
> Cc: Dmitry Gutov <dgutov <at> yandex.ru>, 21816 <at> debbugs.gnu.org, 
> 	Stephen Leake <stephen_leake <at> stephe-leake.org>
> 
> > This isn't a xref-specific problem, so I think it should have a more
> > general solution that any package could use. It's not reasonable to
> > expect from every Emacs developer to be an expert on subtle aspects of
> > comparing Windows file names. We should provide a tested solution for
> > that.
> 
> That would be helpful in general, but wouldn't help with the case that now, to
> compare two xref-location instances, you have to delve into their innards.
> Different problems.
> 
> <rambling>Unless you're proposing a variant of `equal' with better
> file-comparing abilities... which doesn't make sense because `equal-super-duper
> ` wouldn't have a way to know that some things are filenames and not just
> arbitrary strings that match a filename just by happenstance...</rambling>

You mean, ref doesn't know that some strings are file names and some
aren't?  Then how does it know they are strings and not symbols or
something else?  This is Lisp, not C.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Tue, 03 Nov 2015 18:13:02 GMT) Full text and rfc822 format available.

Message #59 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Juanma Barranquero <lekktu <at> gmail.com>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 21816 <at> debbugs.gnu.org,
 Stephen Leake <stephen_leake <at> stephe-leake.org>
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Tue, 3 Nov 2015 20:12:43 +0200
On 11/03/2015 06:33 PM, Juanma Barranquero wrote:

> I'm OK with that, but then we need to add a comment or something to xref
> explicitly stating that, when deriving new types from xref-location (or
> xref-item, if someone adds a file slot to it for whatever reason) all
> filenames *must* be kept canonicalized.

Yes, that would need to mentioned somewhere. And if someone were to 
point me to a "canonicalize file name" function (or suggest a quick 
implementation), I'd document that right away.

> And then, someone will derive from xref-location and add propertized
> strings and will want to compare *including* the properties...

Maybe they would, but you already understand that it's not a realistic 
example. Anything they might want to store in the properties, they can 
store inside their xref-derived instance.

> No, of course I cannot think of a use case right now, but again, it's
> infrastructure. It's there to help generalization and reuse.

When we add a new generic function, it's good to document exactly what 
is expected of its implementations. Which is hard to do without knowing 
use cases.

>  > It can be solved as above, for example.
>
> Until someone adds tests for etags and wants to compare two
> xref-etag-location instances (which include a file slot), and then they
> will have to copy the code from elisp-mode-tests.el.

Why *-tests.el? Canonicalization should happen inside the constructor 
function, if anywhere. E.g. inside xref-make-file-location and 
xref-make-elisp-location.

> `equal' already is quite broad, so I'd bet
> many subtypes will never require anything more complex than the default
> xref-compare-locations.

Which is my point exactly. Let's not complicate things until we have to.

> Now that I think of it... How do you sort xref-items and xref-locations?

We display the items in the order they were returned. Works pretty well.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Tue, 03 Nov 2015 18:15:02 GMT) Full text and rfc822 format available.

Message #62 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Eli Zaretskii <eliz <at> gnu.org>, Juanma Barranquero <lekktu <at> gmail.com>
Cc: 21816 <at> debbugs.gnu.org, stephen_leake <at> stephe-leake.org
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Tue, 3 Nov 2015 20:13:58 +0200
On 11/03/2015 06:53 PM, Eli Zaretskii wrote:

> You mean, ref doesn't know that some strings are file names and some
> aren't?  Then how does it know they are strings and not symbols or
> something else?  This is Lisp, not C.

Lisp has a "string" data type. It doesn't have a "file" data type. It's 
not Java.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Tue, 03 Nov 2015 18:20:03 GMT) Full text and rfc822 format available.

Message #65 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: lekktu <at> gmail.com, 21816 <at> debbugs.gnu.org, stephen_leake <at> stephe-leake.org
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Tue, 3 Nov 2015 20:18:48 +0200
On 11/03/2015 06:34 PM, Eli Zaretskii wrote:

> Both forms (and some additional ones) are perfectly canonical.

Like suggested, let's pick one? Speaking of personal preferences, it 
would be the form with forward slashes, and with path segments written 
with capitalization that the file system prefers. I know NTFS does 
prefer *one* of many way to write each file name, even if it accepts others.

> This isn't a xref-specific problem, so I think it should have a more
> general solution that any package could use.  It's not reasonable to
> expect from every Emacs developer to be an expert on subtle aspects of
> comparing Windows file names.  We should provide a tested solution for
> that.

I don't know which solution to pick, and I believe I don't know the 
problem well enough. We probably don't want to venture into Java-land 
here, and encapsulate any type of field into its separate data type, 
with its comparison function. But, well, we could do that.

In any case, it's not a hugely important problem, compared to others we 
still have.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Tue, 03 Nov 2015 19:26:01 GMT) Full text and rfc822 format available.

Message #68 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Stephen Leake <stephen_leake <at> stephe-leake.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: lekktu <at> gmail.com, 21816 <at> debbugs.gnu.org
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Tue, 03 Nov 2015 13:24:31 -0600
Eli Zaretskii <eliz <at> gnu.org> writes:

> Sorry to bust this party, guys, but I don't think this solution is
> "good enough".  

In general, or just for this test?

Let's fix the test first, and worry about the general case later.

> First, it won't DTRT with Windows file names that have
> backslashes (yes, they happen in Emacs).  

That's easy; add `expand-file-name'.

> Second, 'downcase' has a known gotcha of using the current buffer's
> case tables, which is something you definitely don't want here.

Ah; I had not considered non-ASCII stuff.

Is there some global variable we can let-bind to control that?

> Bottom line, doing this "right" is not so trivial, and a dedicated
> function is a better solution, IMO.  If you put it in files-x.el, it
> could be reused by other Lisp programs in the need of this.

Yes, that makes sense.

> IME, it doesn't pay off to use such incomplete solutions, the result
> is an annoying, albeit minor, maintenance burden.  

Not in this test.

-- 
-- Stephe




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Tue, 03 Nov 2015 19:45:02 GMT) Full text and rfc822 format available.

Message #71 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Stephen Leake <stephen_leake <at> stephe-leake.org>
To: Juanma Barranquero <lekktu <at> gmail.com>
Cc: 21816 <at> debbugs.gnu.org
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Tue, 03 Nov 2015 13:43:56 -0600
Juanma Barranquero <lekktu <at> gmail.com> writes:

> On Tue, Nov 3, 2015 at 3:51 PM, Stephen Leake <
> stephen_leake <at> stephe-leake.org> wrote:
>
>> What is the use case for these compares, other than tests?
>
> Avoiding duplicates? When adding an xref to a list of xrefs, how do you
> check that isn't it already there?

Currently we don't.

It may come up with some of the multi-backend projects I'm working on,
but in general the xref backends should cover disjoint sets of files, so
it should not happen.

It might happen with non-file xrefs; I haven't used those much.

You asked about "sorting xrefs"; the only sorting that is currently done
is in xref--analyze, which groups xrefs with the same filename. It uses
cl-assoc on the filename, which uses "equal" or "eq" for the string
comparison, I think. It does have a :test parameter we can use to
override that.

So that could fail due to filename case sensitivity; I'll see if I can
construct an example.

However, the fix for this is _not_ a "compare xref" function, since. A
canonicalized file name in the xref would be a fix, as would a "file
name case insensitive compare" function.

-- 
-- Stephe




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Tue, 03 Nov 2015 19:52:02 GMT) Full text and rfc822 format available.

Message #74 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Random832 <random832 <at> fastmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Tue, 03 Nov 2015 14:50:29 -0500
Eli Zaretskii <eliz <at> gnu.org> writes:
> Both forms (and some additional ones) are perfectly canonical.

That is a contradiction in terms. There can only be one canonical
form. If there is no viable way to privilege one form above all others,
then _no_ forms are canonical.





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Tue, 03 Nov 2015 19:55:02 GMT) Full text and rfc822 format available.

Message #77 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Stephen Leake <stephen_leake <at> stephe-leake.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: Juanma Barranquero <lekktu <at> gmail.com>, Eli Zaretskii <eliz <at> gnu.org>,
 21816 <at> debbugs.gnu.org
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Tue, 03 Nov 2015 13:54:21 -0600
Dmitry Gutov <dgutov <at> yandex.ru> writes:

> Why don't we canonicalize the file name somehow when an xref instance
> is created?

Yes; I have often thought Emacs needs a canonical file format. But first
we need a good definition for "canonical file name". The closest we have
now is "expand-file-name"; that does not fix case sensitivity.

I think a canonical name that includes case sensitivity would need a
rule like:

   If the file name identifies an existing file, convert the file name
   casing to match the actual file name casing. If not, no case
   conversion is done.

I'm not clear how to do the case conversion, but I bet Emacs does have
the right functions.

This does not handle comparing two file names for a not-yet-created file
that differ in case.

-- 
-- Stephe




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Tue, 03 Nov 2015 19:56:02 GMT) Full text and rfc822 format available.

Message #80 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Random832 <random832 <at> fastmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Tue, 03 Nov 2015 14:54:16 -0500
Dmitry Gutov <dgutov <at> yandex.ru> writes:

> On 11/03/2015 06:53 PM, Eli Zaretskii wrote:
>
>> You mean, ref doesn't know that some strings are file names and some
>> aren't?  Then how does it know they are strings and not symbols or
>> something else?  This is Lisp, not C.
>
> Lisp has a "string" data type. It doesn't have a "file" data
> type. It's not Java.

Yes, but that means you should not have any places (variables,
parameters, parts of data structures) where you have a string that may
or may not be a file name with no way of knowing which you want.

Ideally, any data structure which can be compared and may contain a file
name should provide a comparison function which will, if it contains a
filename, compare the filenames with a filename comparison function.





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Tue, 03 Nov 2015 20:04:01 GMT) Full text and rfc822 format available.

Message #83 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Random832 <random832 <at> fastmail.com>, 21816 <at> debbugs.gnu.org
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Tue, 3 Nov 2015 22:03:52 +0200
On 11/03/2015 09:54 PM, Random832 wrote:

> Ideally, any data structure which can be compared and may contain a file
> name should provide a comparison function which will, if it contains a
> filename, compare the filenames with a filename comparison function.

There's not much ideal about that. If we could get away with only 
canonicalizing file names, having to write comparison functions instead 
would involve an order of magnitude more code.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Tue, 03 Nov 2015 20:07:01 GMT) Full text and rfc822 format available.

Message #86 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Stephen Leake <stephen_leake <at> stephe-leake.org>
Cc: Juanma Barranquero <lekktu <at> gmail.com>, Eli Zaretskii <eliz <at> gnu.org>,
 21816 <at> debbugs.gnu.org
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Tue, 3 Nov 2015 22:05:57 +0200
On 11/03/2015 09:54 PM, Stephen Leake wrote:

> I'm not clear how to do the case conversion, but I bet Emacs does have
> the right functions.

I'd expect file-truename to return an appropriate value, but apparently not.

> This does not handle comparing two file names for a not-yet-created file
> that differ in case.

Indeed. We're unlikely to encounter this case with xref, though.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Tue, 03 Nov 2015 20:27:02 GMT) Full text and rfc822 format available.

Message #89 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Random832 <random832 <at> fastmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Tue, 03 Nov 2015 15:25:52 -0500
Dmitry Gutov <dgutov <at> yandex.ru> writes:

> On 11/03/2015 09:54 PM, Random832 wrote:
>
>> Ideally, any data structure which can be compared and may contain a file
>> name should provide a comparison function which will, if it contains a
>> filename, compare the filenames with a filename comparison function.
>
> There's not much ideal about that. If we could get away with only
> canonicalizing file names, having to write comparison functions
> instead would involve an order of magnitude more code.

How are you going to canonicalize though?

Case as it exists on the filesystem? What if the file doesn't exist?

All upper or lower case? What if the file doesn't exist and the
"canonicalized" path is used to create it, contrary to the user's
intentions?

As it exists for what does exist, then as entered for components that
don't exist? What if the file is created with different cases
afterwards?





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Tue, 03 Nov 2015 20:52:01 GMT) Full text and rfc822 format available.

Message #92 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: lekktu <at> gmail.com, 21816 <at> debbugs.gnu.org, stephen_leake <at> stephe-leake.org
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Tue, 03 Nov 2015 22:51:39 +0200
> Cc: 21816 <at> debbugs.gnu.org, stephen_leake <at> stephe-leake.org
> From: Dmitry Gutov <dgutov <at> yandex.ru>
> Date: Tue, 3 Nov 2015 20:13:58 +0200
> 
> On 11/03/2015 06:53 PM, Eli Zaretskii wrote:
> 
> > You mean, ref doesn't know that some strings are file names and some
> > aren't?  Then how does it know they are strings and not symbols or
> > something else?  This is Lisp, not C.
> 
> Lisp has a "string" data type. It doesn't have a "file" data type. It's 
> not Java.

File names are not strings.

If we need to know that a string is a file name, we can put some
property on it.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Tue, 03 Nov 2015 20:56:02 GMT) Full text and rfc822 format available.

Message #95 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: lekktu <at> gmail.com, 21816 <at> debbugs.gnu.org, stephen_leake <at> stephe-leake.org
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Tue, 03 Nov 2015 22:56:12 +0200
> Cc: lekktu <at> gmail.com, 21816 <at> debbugs.gnu.org, stephen_leake <at> stephe-leake.org
> From: Dmitry Gutov <dgutov <at> yandex.ru>
> Date: Tue, 3 Nov 2015 20:18:48 +0200
> 
> > This isn't a xref-specific problem, so I think it should have a more
> > general solution that any package could use.  It's not reasonable to
> > expect from every Emacs developer to be an expert on subtle aspects of
> > comparing Windows file names.  We should provide a tested solution for
> > that.
> 
> I don't know which solution to pick, and I believe I don't know the 
> problem well enough. We probably don't want to venture into Java-land 
> here, and encapsulate any type of field into its separate data type, 
> with its comparison function. But, well, we could do that.

I hoped Juanma will pick up the gauntlet.

> In any case, it's not a hugely important problem, compared to others we 
> still have.

One step at a time.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Tue, 03 Nov 2015 21:01:01 GMT) Full text and rfc822 format available.

Message #98 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stephen Leake <stephen_leake <at> stephe-leake.org>
Cc: lekktu <at> gmail.com, 21816 <at> debbugs.gnu.org
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Tue, 03 Nov 2015 23:00:32 +0200
> From: Stephen Leake <stephen_leake <at> stephe-leake.org>
> Cc: lekktu <at> gmail.com,  21816 <at> debbugs.gnu.org
> Date: Tue, 03 Nov 2015 13:24:31 -0600
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> > Sorry to bust this party, guys, but I don't think this solution is
> > "good enough".  
> 
> In general, or just for this test?

Both, IMO.

> Let's fix the test first, and worry about the general case later.

Why not kill two birds with the same stone?

> > First, it won't DTRT with Windows file names that have
> > backslashes (yes, they happen in Emacs).  
> 
> That's easy; add `expand-file-name'.

For this specific case, yes.  But there's no way to be sure no one
will ever use concat or some such.  Or just take default directory and
use it directly.

> > Second, 'downcase' has a known gotcha of using the current buffer's
> > case tables, which is something you definitely don't want here.
> 
> Ah; I had not considered non-ASCII stuff.
> 
> Is there some global variable we can let-bind to control that?

You want with-case-table and standard-case-table, I think.

> > Bottom line, doing this "right" is not so trivial, and a dedicated
> > function is a better solution, IMO.  If you put it in files-x.el, it
> > could be reused by other Lisp programs in the need of this.
> 
> Yes, that makes sense.
> 
> > IME, it doesn't pay off to use such incomplete solutions, the result
> > is an annoying, albeit minor, maintenance burden.  
> 
> Not in this test.

Not today, maybe.  But there's always tomorrow and the day after.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Tue, 03 Nov 2015 21:03:02 GMT) Full text and rfc822 format available.

Message #101 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Random832 <random832 <at> fastmail.com>
Cc: 21816 <at> debbugs.gnu.org
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Tue, 03 Nov 2015 23:02:31 +0200
> From: Random832 <random832 <at> fastmail.com>
> Date: Tue, 03 Nov 2015 14:50:29 -0500
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> > Both forms (and some additional ones) are perfectly canonical.
> 
> That is a contradiction in terms. There can only be one canonical
> form.

Not if "canonical" does not tell anything about letter-case or the
flavor of slashes.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Tue, 03 Nov 2015 21:07:01 GMT) Full text and rfc822 format available.

Message #104 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stephen Leake <stephen_leake <at> stephe-leake.org>
Cc: lekktu <at> gmail.com, 21816 <at> debbugs.gnu.org, dgutov <at> yandex.ru
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Tue, 03 Nov 2015 23:05:05 +0200
> From: Stephen Leake <stephen_leake <at> stephe-leake.org>
> Cc: Juanma Barranquero <lekktu <at> gmail.com>,  Eli Zaretskii <eliz <at> gnu.org>,  21816 <at> debbugs.gnu.org
> Date: Tue, 03 Nov 2015 13:54:21 -0600
> 
>    If the file name identifies an existing file, convert the file name
>    casing to match the actual file name casing. If not, no case
>    conversion is done.

For existing files we have file-equal-p, which is more general.  The
problem here is that files might not exist, so we need a weaker test
that still does a reasonably good job.

> I'm not clear how to do the case conversion, but I bet Emacs does have
> the right functions.

It does.  Actually, we have more than one such function.

> This does not handle comparing two file names for a not-yet-created file
> that differ in case.

Indeed.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Tue, 03 Nov 2015 21:08:02 GMT) Full text and rfc822 format available.

Message #107 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: lekktu <at> gmail.com, 21816 <at> debbugs.gnu.org, stephen_leake <at> stephe-leake.org
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Tue, 03 Nov 2015 23:07:21 +0200
> Cc: Juanma Barranquero <lekktu <at> gmail.com>, Eli Zaretskii <eliz <at> gnu.org>,
>  21816 <at> debbugs.gnu.org
> From: Dmitry Gutov <dgutov <at> yandex.ru>
> Date: Tue, 3 Nov 2015 22:05:57 +0200
> 
> On 11/03/2015 09:54 PM, Stephen Leake wrote:
> 
> > I'm not clear how to do the case conversion, but I bet Emacs does have
> > the right functions.
> 
> I'd expect file-truename to return an appropriate value, but apparently not.

file-truename works only for existing files.  See file-equal-p.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Tue, 03 Nov 2015 21:19:01 GMT) Full text and rfc822 format available.

Message #110 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Random832 <random832 <at> fastmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Tue, 03 Nov 2015 16:17:43 -0500
Eli Zaretskii <eliz <at> gnu.org> writes:
> Not if "canonical" does not tell anything about letter-case or the
> flavor of slashes.

It does, because to be canonical it has to be a _single_ sequence of
characters that is the preferred representation of the object over all
others. If no such sequence exists, then that means there is no
canonical form, not that there are two. That's what canonical means.





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Tue, 03 Nov 2015 21:23:02 GMT) Full text and rfc822 format available.

Message #113 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Random832 <random832 <at> fastmail.com>
Cc: 21816 <at> debbugs.gnu.org
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Tue, 03 Nov 2015 23:22:35 +0200
> From: Random832 <random832 <at> fastmail.com>
> Date: Tue, 03 Nov 2015 16:17:43 -0500
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> > Not if "canonical" does not tell anything about letter-case or the
> > flavor of slashes.
> 
> It does, because to be canonical it has to be a _single_ sequence of
> characters that is the preferred representation of the object over all
> others. If no such sequence exists, then that means there is no
> canonical form, not that there are two. That's what canonical means.

I don't see how this argument is useful for the issue at hand, so I
won't.





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Tue, 03 Nov 2015 21:26:02 GMT) Full text and rfc822 format available.

Message #116 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Random832 <random832 <at> fastmail.com>, 21816 <at> debbugs.gnu.org
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Tue, 3 Nov 2015 23:25:11 +0200
On 11/03/2015 10:25 PM, Random832 wrote:
> How are you going to canonicalize though?
>
> Case as it exists on the filesystem? What if the file doesn't exist?

Then we have no business using it together with xref.

In any case, we should put a pin in this discussion: equality 
comparison, aside from writing tests, is basically only useful for 
removing duplicates. As far as I know, of course.

This subject has come up before, and the decision at the time was to let 
the function that produces the list of xrefs, remove the duplicates.

Most likely, it's already familiar with what kind of items it's 
returning, the types of their fields, and etc. I understand that may not 
always be true, but we're yet to produce a realistic example of that 
situation.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Tue, 03 Nov 2015 21:53:01 GMT) Full text and rfc822 format available.

Message #119 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Stephen Leake <stephen_leake <at> stephe-leake.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: Juanma Barranquero <lekktu <at> gmail.com>, Eli Zaretskii <eliz <at> gnu.org>,
 21816 <at> debbugs.gnu.org
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Tue, 03 Nov 2015 15:52:15 -0600
Dmitry Gutov <dgutov <at> yandex.ru> writes:

> On 11/03/2015 09:54 PM, Stephen Leake wrote:
>
>> I'm not clear how to do the case conversion, but I bet Emacs does have
>> the right functions.
>
> I'd expect file-truename to return an appropriate value, but
> apparently not.

Right. But it does handle symbolic links; is that what we need for a
canonical file name? It is what we want for comparing filenames in
xrefs, so it at least needs to be an option.

-- 
-- Stephe




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Tue, 03 Nov 2015 22:00:02 GMT) Full text and rfc822 format available.

Message #122 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Stephen Leake <stephen_leake <at> stephe-leake.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: lekktu <at> gmail.com, 21816 <at> debbugs.gnu.org, dgutov <at> yandex.ru
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Tue, 03 Nov 2015 15:58:49 -0600
Eli Zaretskii <eliz <at> gnu.org> writes:

>> From: Stephen Leake <stephen_leake <at> stephe-leake.org>
>> Cc: Juanma Barranquero <lekktu <at> gmail.com>, Eli Zaretskii
>> <eliz <at> gnu.org>, 21816 <at> debbugs.gnu.org
>> Date: Tue, 03 Nov 2015 13:54:21 -0600
>> 
>>    If the file name identifies an existing file, convert the file name
>>    casing to match the actual file name casing. If not, no case
>>    conversion is done.
>
> For existing files we have file-equal-p, which is more general.  

You missed the context; we have switched from discussing how to compare
file names to how to canonicalize a file name, in order to simplify the
comparison.

The rationale here is that it is not possible/easy to always know
whether the string at hand is a file name, but it is always possible (or
at least much easier) to ensure that an xref contains a canonical file
name.

Part of the definition of "canonical file name" must also be:

    Two names for an existing file will convert to canonical filenames
    that will return t from `equal-string'".

-- 
-- Stephe




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Tue, 03 Nov 2015 22:08:02 GMT) Full text and rfc822 format available.

Message #125 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Stephen Leake <stephen_leake <at> stephe-leake.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: Juanma Barranquero <lekktu <at> gmail.com>, Eli Zaretskii <eliz <at> gnu.org>,
 21816 <at> debbugs.gnu.org
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Tue, 03 Nov 2015 16:07:00 -0600
Stephen Leake <stephen_leake <at> stephe-leake.org> writes:

> Dmitry Gutov <dgutov <at> yandex.ru> writes:
>
>> On 11/03/2015 09:54 PM, Stephen Leake wrote:
>>
>>> I'm not clear how to do the case conversion, but I bet Emacs does have
>>> the right functions.
>>
>> I'd expect file-truename to return an appropriate value, but
>> apparently not.
>
> Right. But it does handle symbolic links; is that what we need for a
> canonical file name? It is what we want for comparing filenames in
> xrefs, so it at least needs to be an option.

Actually, file-truename works for the cases we've discussed so far:

(file-truename "c:/Projects/emacs/master/lisp/elide-head.el")
"c:/Projects/emacs/master/lisp/elide-head.el"

(file-truename "c:\\Projects\\emacs\\master\\lisp\\elide-head.el")
"c:/Projects/emacs/master/lisp/elide-head.el"

(file-truename "c:\\projects\\emacs\\master\\lisp\\elide-head.el")
"c:/Projects/emacs/master/lisp/elide-head.el"

I did not try a non-ASCII file name; I'm not sure how to construct one
that has a non-ASCII casing issue.

What test did you run that failed?

-- 
-- Stephe




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Tue, 03 Nov 2015 22:19:02 GMT) Full text and rfc822 format available.

Message #128 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Stephen Leake <stephen_leake <at> stephe-leake.org>
Cc: Juanma Barranquero <lekktu <at> gmail.com>, 21816 <at> debbugs.gnu.org
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Wed, 4 Nov 2015 00:18:23 +0200
On 11/04/2015 12:07 AM, Stephen Leake wrote:

> What test did you run that failed?

I didn't run any real tests (sorry), because I don't have a Windows 
machine at hand. The description in file-truename's docstring just makes 
it look like it's unsuitable for our purpose (but there's a comment in 
its body about delegating to w32-long-file-name).

If it works well, that's great, and makes downcasing issues moot.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Tue, 03 Nov 2015 22:29:02 GMT) Full text and rfc822 format available.

Message #131 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Stephen Leake <stephen_leake <at> stephe-leake.org>
To: Juanma Barranquero <lekktu <at> gmail.com>
Cc: 21816 <at> debbugs.gnu.org
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Tue, 03 Nov 2015 16:24:35 -0600
Stephen Leake <stephen_leake <at> stephe-leake.org> writes:

> You asked about "sorting xrefs"; the only sorting that is currently done
> is in xref--analyze, which groups xrefs with the same filename. It uses
> cl-assoc on the filename, which uses "equal" or "eq" for the string
> comparison, I think. It does have a :test parameter we can use to
> override that.
>
> So that could fail due to filename case sensitivity; I'll see if I can
> construct an example.

Here's a simple example:

(let* ((xref1 (xref-make "(defvar foo"
			 (xref-make-elisp-location
			  'xref-elisp-location 'define-var
			  "c:/projects/emacs/master/lisp/progmodes/elisp-mode.el")))
       (xref2 (xref-make "(defvar bar"
			 (xref-make-elisp-location
			  'xref-elisp-location 'define-var
			  "c:/Projects/emacs/master/lisp/progmodes/elisp-mode.el")))
       (xref3 (xref-make "(defvar bar"
			 (xref-make-elisp-location
			  'xref-elisp-location 'define-var
			  "c:/projects/emacs/master/lisp/progmodes/elisp-mode.el")))
       (alist1 (xref--analyze (list xref1 xref2)))
       (alist2 (xref--analyze (list xref1 xref3))))
  (list (length alist1) (length alist2))
  )

returns (2 1); xref1 and xref2 are not grouped, because the file names
compare not equal.

If xref-make-elisp-location used file-truename, this problem would be fixed:

(let* ((xref1 (xref-make "(defvar foo"
			 (xref-make-elisp-location
			  'xref-elisp-location 'define-var
			  (file-truename
			   "c:/projects/emacs/master/lisp/progmodes/elisp-mode.el"))))
       (xref2 (xref-make "(defvar bar"
			 (xref-make-elisp-location
			  'xref-elisp-location 'define-var
			  (file-truename
			  "c:/Projects/emacs/master/lisp/progmodes/elisp-mode.el"))))
       (xref3 (xref-make "(defvar bar"
			 (xref-make-elisp-location
			  'xref-elisp-location 'define-var
			  (file-truename
			  "c:/projects/emacs/master/lisp/progmodes/elisp-mode.el"))))
       (alist1 (xref--analyze (list xref1 xref2)))
       (alist2 (xref--analyze (list xref1 xref3))))
  (list (length alist1) (length alist2))
  )

returns (1 1)
-- 
-- Stephe




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Tue, 03 Nov 2015 23:22:02 GMT) Full text and rfc822 format available.

Message #134 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Stephen Leake <stephen_leake <at> stephe-leake.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: Juanma Barranquero <lekktu <at> gmail.com>, Eli Zaretskii <eliz <at> gnu.org>,
 21816 <at> debbugs.gnu.org
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Tue, 03 Nov 2015 16:20:15 -0600
Stephen Leake <stephen_leake <at> stephe-leake.org> writes:

> Stephen Leake <stephen_leake <at> stephe-leake.org> writes:
>
>> Dmitry Gutov <dgutov <at> yandex.ru> writes:
>>
>>> On 11/03/2015 09:54 PM, Stephen Leake wrote:
>>>
>>>> I'm not clear how to do the case conversion, but I bet Emacs does have
>>>> the right functions.
>>>
>>> I'd expect file-truename to return an appropriate value, but
>>> apparently not.
>>
>> Right. But it does handle symbolic links; is that what we need for a
>> canonical file name? It is what we want for comparing filenames in
>> xrefs, so it at least needs to be an option.
>
> Actually, file-truename works for the cases we've discussed so far:
>
> (file-truename "c:/Projects/emacs/master/lisp/elide-head.el")
> "c:/Projects/emacs/master/lisp/elide-head.el"
>
> (file-truename "c:\\Projects\\emacs\\master\\lisp\\elide-head.el")
> "c:/Projects/emacs/master/lisp/elide-head.el"
>
> (file-truename "c:\\projects\\emacs\\master\\lisp\\elide-head.el")
> "c:/Projects/emacs/master/lisp/elide-head.el"
>
> I did not try a non-ASCII file name; I'm not sure how to construct one
> that has a non-ASCII casing issue.

It even works for non-existing files:

(file-truename "c:\\projects\\emacs\\master\\lisp\\foo")
"c:/Projects/emacs/master/lisp/foo"

--
-- Stephe




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Wed, 04 Nov 2015 03:43:01 GMT) Full text and rfc822 format available.

Message #137 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stephen Leake <stephen_leake <at> stephe-leake.org>
Cc: lekktu <at> gmail.com, 21816 <at> debbugs.gnu.org, dgutov <at> yandex.ru
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Wed, 04 Nov 2015 05:42:16 +0200
> From: Stephen Leake <stephen_leake <at> stephe-leake.org>
> Cc: dgutov <at> yandex.ru,  lekktu <at> gmail.com,  21816 <at> debbugs.gnu.org
> Date: Tue, 03 Nov 2015 15:58:49 -0600
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> >> From: Stephen Leake <stephen_leake <at> stephe-leake.org>
> >> Cc: Juanma Barranquero <lekktu <at> gmail.com>, Eli Zaretskii
> >> <eliz <at> gnu.org>, 21816 <at> debbugs.gnu.org
> >> Date: Tue, 03 Nov 2015 13:54:21 -0600
> >> 
> >>    If the file name identifies an existing file, convert the file name
> >>    casing to match the actual file name casing. If not, no case
> >>    conversion is done.
> >
> > For existing files we have file-equal-p, which is more general.  
> 
> You missed the context; we have switched from discussing how to compare
> file names to how to canonicalize a file name, in order to simplify the
> comparison.

I'm still trying to convince you that this is a dead end: you cannot
trust user-extensible code to comply with such requirements, on
filesystems where "canonicalized file name" is not well defined.

But if you already decided to do this regardless, then I guess it's
time for me to bail out of this discussion.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Wed, 04 Nov 2015 03:47:02 GMT) Full text and rfc822 format available.

Message #140 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stephen Leake <stephen_leake <at> stephe-leake.org>
Cc: lekktu <at> gmail.com, 21816 <at> debbugs.gnu.org, dgutov <at> yandex.ru
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Wed, 04 Nov 2015 05:46:46 +0200
> From: Stephen Leake <stephen_leake <at> stephe-leake.org>
> Cc: Juanma Barranquero <lekktu <at> gmail.com>,  Eli Zaretskii <eliz <at> gnu.org>,  21816 <at> debbugs.gnu.org
> Date: Tue, 03 Nov 2015 16:07:00 -0600
> 
> Actually, file-truename works for the cases we've discussed so far:
> 
> (file-truename "c:/Projects/emacs/master/lisp/elide-head.el")
> "c:/Projects/emacs/master/lisp/elide-head.el"
> 
> (file-truename "c:\\Projects\\emacs\\master\\lisp\\elide-head.el")
> "c:/Projects/emacs/master/lisp/elide-head.el"
> 
> (file-truename "c:\\projects\\emacs\\master\\lisp\\elide-head.el")
> "c:/Projects/emacs/master/lisp/elide-head.el"

It seems to work because the part that needs caseless comparison does
exist.  Try using it on a file name where case differences are in a
part that doesn't exist on disk, and you will see the problem.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Wed, 04 Nov 2015 05:25:01 GMT) Full text and rfc822 format available.

Message #143 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Alexis <flexibeast <at> gmail.com>
To: Stephen Leake <stephen_leake <at> stephe-leake.org>
Cc: Juanma Barranquero <lekktu <at> gmail.com>, 21816 <at> debbugs.gnu.org,
 Dmitry Gutov <dgutov <at> yandex.ru>
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Wed, 04 Nov 2015 16:24:27 +1100
Stephen Leake <stephen_leake <at> stephe-leake.org> writes:

> I did not try a non-ASCII file name; I'm not sure how to 
> construct one that has a non-ASCII casing issue.

Perhaps use German Eszett / sharp S / Unicode LATIN SMALL LETTER 
SHARP S? The words 'Strasse' and 'Straße' are equivalent, but 
typically the latter is up-cased to 'STRASSE', rather than 
'STRAẞE'. (Indeed, LATIN CAPITAL LETTER SHARP S was only added to 
Unicode in version 5.1.) In turn, 'STRASSE' is typically 
down-cased to 'strasse'. Further, even the latest version of 
Unicode notes that SpecialCasing.txt doesn't handle this 
situation:

http://www.unicode.org/versions/Unicode8.0.0/ch03.pdf [p. 81]


Alexis.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Wed, 04 Nov 2015 06:35:01 GMT) Full text and rfc822 format available.

Message #146 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Random832 <random832 <at> fastmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Wed, 04 Nov 2015 01:33:20 -0500
Alexis <flexibeast <at> gmail.com> writes:
> Perhaps use German Eszett / sharp S / Unicode LATIN SMALL LETTER SHARP
> S? The words 'Strasse' and 'Straße' are equivalent, but typically the
> latter is up-cased to 'STRASSE', rather than 'STRAẞE'.

Fsvo "typically". On Emacs, and in Windows filenames [fortuitously, they
match in this case. Unfortunately, I don't believe there's an easy way
to get at the table used in Windows], upcasing is done on a
character-by-character basis, and AIUI neither of them do this, so
(string= (upcase "Straße") "STRAßE").





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Wed, 04 Nov 2015 06:53:01 GMT) Full text and rfc822 format available.

Message #149 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Alexis <flexibeast <at> gmail.com>
To: Random832 <random832 <at> fastmail.com>
Cc: 21816 <at> debbugs.gnu.org
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Wed, 04 Nov 2015 17:52:31 +1100
Random832 <random832 <at> fastmail.com> writes:

> Alexis <flexibeast <at> gmail.com> writes: 
>> Perhaps use German Eszett / sharp S / Unicode LATIN SMALL 
>> LETTER SHARP S? The words 'Strasse' and 'Straße' are 
>> equivalent, but typically the latter is up-cased to 'STRASSE', 
>> rather than 'STRAẞE'.
>
> Fsvo "typically".

Oh, sorry, by "typically", i meant "when writing in German", not 
"when doing case-conversions in computing".

> On Emacs, and in Windows filenames [fortuitously, they match in 
> this case. Unfortunately, I don't believe there's an easy way to 
> get at the table used in Windows], upcasing is done on a 
> character-by-character basis, and AIUI neither of them do this, 
> so (string= (upcase "Straße") "STRAßE").

*nod* A good example of why i think the Eszett might be useful for 
testing any proposed equivalence-relation or canonicalisation code 
- there are a variety of approaches "in the wild" re. TRT. Take 
SRFI 75, for example:

https://www.cs.utah.edu/~mflatt/tmp/srfi-75.html

in the section "String Procedures":

   (string<? "z" "ß") => #t (string<? "z" "zz") => #t (string<? 
   "z" "Z") => #f (string=? "Straße" "Strasse") => #f 

   (string-upcase "Hi") => "HI" (string-downcase "Hi") => "hi" 
   (string-foldcase "Hi") => "hi" 

   (string-upcase "Straße") => "STRASSE" (string-downcase 
   "Straße") => "straße" (string-foldcase "Straße") => "strasse" 
   (string-downcase "STRASSE")  => "strasse"


Alexis.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Wed, 04 Nov 2015 08:21:03 GMT) Full text and rfc822 format available.

Message #152 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Juanma Barranquero <lekktu <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 21816 <at> debbugs.gnu.org, Stephen Leake <stephen_leake <at> stephe-leake.org>,
 dgutov <at> yandex.ru
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Wed, 4 Nov 2015 09:19:37 +0100
[Message part 1 (text/plain, inline)]
On Tue, Nov 3, 2015 at 5:53 PM, Eli Zaretskii <eliz <at> gnu.org> wrote:

> You mean, ref doesn't know that some strings are file names and some
> aren't?  Then how does it know they are strings and not symbols or
> something else?  This is Lisp, not C.

I meant a general-purpose comparison function. You cannot look at a struct
and know which strings are intended as filenames and which ones do not. But
xref-specific comparison functions obviously would.
[Message part 2 (text/html, inline)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Wed, 04 Nov 2015 08:30:06 GMT) Full text and rfc822 format available.

Message #155 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Stephen Leake <stephen_leake <at> stephe-leake.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: lekktu <at> gmail.com, 21816 <at> debbugs.gnu.org, dgutov <at> yandex.ru
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Wed, 04 Nov 2015 02:28:50 -0600
Eli Zaretskii <eliz <at> gnu.org> writes:

>> From: Stephen Leake <stephen_leake <at> stephe-leake.org>
>> Cc: Juanma Barranquero <lekktu <at> gmail.com>, Eli Zaretskii
>> <eliz <at> gnu.org>, 21816 <at> debbugs.gnu.org
>> Date: Tue, 03 Nov 2015 16:07:00 -0600
>> 
>> Actually, file-truename works for the cases we've discussed so far:
>> 
>> (file-truename "c:/Projects/emacs/master/lisp/elide-head.el")
>> "c:/Projects/emacs/master/lisp/elide-head.el"
>> 
>> (file-truename "c:\\Projects\\emacs\\master\\lisp\\elide-head.el")
>> "c:/Projects/emacs/master/lisp/elide-head.el"
>> 
>> (file-truename "c:\\projects\\emacs\\master\\lisp\\elide-head.el")
>> "c:/Projects/emacs/master/lisp/elide-head.el"
>
> It seems to work because the part that needs caseless comparison does
> exist.  Try using it on a file name where case differences are in a
> part that doesn't exist on disk, and you will see the problem.

Right. But I think that's not a solvable problem in principle, and not
a significant one in practice. 

The only time we would worry about the canonical name for a file that
does not yet exist is when we are about to create it; why would we then
have two different names for it that need to be compared?

We might be preparing to create a list of files with similar names; in
practice, each one will be created one at a time, at which point the
canonical name is well-defined, and can be compared to the proposed
name of the next file to be created. In fact, the OS "file create" will
do that comparison for us.

Hmm; that does mean the creation could not be done in parallel threads.
Again; not a significant problem in practice.

In tests we might invent non-existing names, but that's a special case
that is easily handled by paying attention to case, or creating the
files in the test.

-- 
-- Stephe




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Wed, 04 Nov 2015 08:42:02 GMT) Full text and rfc822 format available.

Message #158 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Juanma Barranquero <lekktu <at> gmail.com>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 21816 <at> debbugs.gnu.org,
 Stephen Leake <stephen_leake <at> stephe-leake.org>
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Wed, 4 Nov 2015 09:40:22 +0100
[Message part 1 (text/plain, inline)]
On Tue, Nov 3, 2015 at 7:12 PM, Dmitry Gutov <dgutov <at> yandex.ru> wrote:

> Maybe they would, but you already understand that it's not a realistic
example.  Anything
> they might want to store in the properties, they can store inside their
xref-derived instance.

Yup. Though, there's this comment at the beginning of xref.el:

;; Each identifier must be represented as a string.  Implementers can
;; use string properties to store additional information about the
;; identifier, [...]

> When we add a new generic function, it's good to document exactly what is
expected
> of its implementations. Which is hard to do without knowing use cases.

I've stopped arguing for a generic comparison function (I've read the rest
of the thread and it's obvious that Stephen and you don't consider it
useful). But as a general comment, I disagree with this. A generic
function, in most languages or frameworks that define interfaces, do not
document "exactly" what is expected of the implementations. They document
the behavior in general terms. Which in this case would be: "Compare two
xref-location items and return non-nil if they are considered equal, nil
otherwise." That's *all* a generic comparison function for xref-location
should need to document.

> Why *-tests.el? Canonicalization should happen inside the constructor
function, if anywhere. E.g. inside xref-make-file-location and
xref-make-elisp-location.

I though "solved as above" meant Stephen's fix for this particular bug.
[Message part 2 (text/html, inline)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Wed, 04 Nov 2015 08:50:04 GMT) Full text and rfc822 format available.

Message #161 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Juanma Barranquero <lekktu <at> gmail.com>
To: Stephen Leake <stephen_leake <at> stephe-leake.org>
Cc: 21816 <at> debbugs.gnu.org
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Wed, 4 Nov 2015 09:48:45 +0100
[Message part 1 (text/plain, inline)]
On Tue, Nov 3, 2015 at 8:43 PM, Stephen Leake <
stephen_leake <at> stephe-leake.org> wrote:

> You asked about "sorting xrefs"; the only sorting that is currently done
> is in xref--analyze, which groups xrefs with the same filename.

I suppose I view xrefs as potentially more general (a "bookmarkish"
object), while they are in fact designed for a very specific purpose that
doesn't need equality tests or total ordering.
[Message part 2 (text/html, inline)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Wed, 04 Nov 2015 08:55:03 GMT) Full text and rfc822 format available.

Message #164 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Juanma Barranquero <lekktu <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 21816 <at> debbugs.gnu.org, Stephen Leake <stephen_leake <at> stephe-leake.org>,
 Dmitry Gutov <dgutov <at> yandex.ru>
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Wed, 4 Nov 2015 09:53:53 +0100
[Message part 1 (text/plain, inline)]
On Tue, Nov 3, 2015 at 9:56 PM, Eli Zaretskii <eliz <at> gnu.org> wrote:
>
> > From: Dmitry Gutov <dgutov <at> yandex.ru>
>
> > I don't know which solution to pick, and I believe I don't know the
> > problem well enough. We probably don't want to venture into Java-land
> > here, and encapsulate any type of field into its separate data type,
> > with its comparison function. But, well, we could do that.
>
> I hoped Juanma will pick up the gauntlet.

At this point, I don't even know what is being proposed. A file-name data
type (in the cl-defstruct sense) with appropriate canonicalization and
comparison functions?
[Message part 2 (text/html, inline)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Wed, 04 Nov 2015 09:06:02 GMT) Full text and rfc822 format available.

Message #167 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Stephen Leake <stephen_leake <at> stephe-leake.org>
To: Alexis <flexibeast <at> gmail.com>
Cc: Juanma Barranquero <lekktu <at> gmail.com>, 21816 <at> debbugs.gnu.org,
 Dmitry Gutov <dgutov <at> yandex.ru>
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Wed, 04 Nov 2015 03:04:41 -0600
Alexis <flexibeast <at> gmail.com> writes:

> Stephen Leake <stephen_leake <at> stephe-leake.org> writes:
>
>> I did not try a non-ASCII file name; I'm not sure how to construct
>> one that has a non-ASCII casing issue.
>
> Perhaps use German Eszett / sharp S / Unicode LATIN SMALL LETTER SHARP
> S? The words 'Strasse' and 'Straße' are equivalent, , but typically the
> latter is up-cased to 'STRASSE', rather than 'STRAẞE'

Windows 8.1 lets me create:

"c:/tmp/STRAẞE.text" ;; LATIN CAPITAL LETTER SHARP S
"c:/tmp/Straße.text" ;; LATIN SMALL LETTER SHARP S
"c:/tmp/Strasse.text"

The font I'm using on Windows has no glyph for LATIN CAPITAL LETTER
SHARP S; on Debian, it's the same glyph as LATIN SMALL LETTER SHARP S,
which is confusing in this context.

file-truename follows Windows:

(file-truename "c:/tmp/Straẞe.text");; CAPITAL
"c:/tmp/STRAẞE.text";; CAPITAL

(file-truename "c:/tmp/Straße.text");; SMALL
"c:/tmp/Straße.text";; SMALL

(file-truename "c:/tmp/Strasse.text")
"c:/tmp/Strasse.text"

As do upcase, downcase:
(upcase "Straße.text") ;; SMALL
"STRAßE.TEXT" ;; SMALL

(downcase "Straẞe.text") ;; CAPITAL
"straẞe.text" ;; CAPITAL

I think changing CAPITAL ẞ to "SS" is not strictly a "casing" issue? But
perhaps it should be considered for a canonical file name on a
case-insensitive system?

Perhaps Windows has simply not yet implemented this recent addition to
Unicode? In which case, we need a better example to test with.

Debian testing (a few months old) running in VMWare under Windows 8.1
shows the same behavior. But Debian is case-sensitive, so that's not an
issue here, I think. The upcase/downcase is wrong, but I'm guessing
that's due to the CAPITAL being a recent addition to Unicode. 

-- 
-- Stephe




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Wed, 04 Nov 2015 10:02:02 GMT) Full text and rfc822 format available.

Message #170 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Stephen Leake <stephen_leake <at> stephe-leake.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: lekktu <at> gmail.com, 21816 <at> debbugs.gnu.org, dgutov <at> yandex.ru
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Wed, 04 Nov 2015 04:00:58 -0600
Eli Zaretskii <eliz <at> gnu.org> writes:

>> From: Stephen Leake <stephen_leake <at> stephe-leake.org>
>> Cc: dgutov <at> yandex.ru,  lekktu <at> gmail.com,  21816 <at> debbugs.gnu.org
>> Date: Tue, 03 Nov 2015 15:58:49 -0600
>> 
>> Eli Zaretskii <eliz <at> gnu.org> writes:
>> 
>> >> From: Stephen Leake <stephen_leake <at> stephe-leake.org>
>> >> Cc: Juanma Barranquero <lekktu <at> gmail.com>, Eli Zaretskii
>> >> <eliz <at> gnu.org>, 21816 <at> debbugs.gnu.org
>> >> Date: Tue, 03 Nov 2015 13:54:21 -0600
>> >> 
>> >>    If the file name identifies an existing file, convert the file name
>> >>    casing to match the actual file name casing. If not, no case
>> >>    conversion is done.
>> >
>> > For existing files we have file-equal-p, which is more general.  
>> 
>> You missed the context; we have switched from discussing how to compare
>> file names to how to canonicalize a file name, in order to simplify the
>> comparison.
>
> I'm still trying to convince you that this is a dead end: you cannot
> trust user-extensible code to comply with such requirements, on
> filesystems where "canonicalized file name" is not well defined.

We can provide a `file-canonical-name' that does something useful on all
systems.

I believe the notion of "canonical file name" is well defined for
existing files on all systems we care about; is that not true?

For non-existing files, I don't think there is a real problem in
practice (see other email); can you present a use case where there is?


The file-equal-p option does not fix the problem in xref--analyze, since
that is comparing the result of `xref-location-group', which is yet more
user-extensible code; for xref-elisp-location, that is the file name
slot of the location. Using file-canonical-name in xref-location-group
implemetations would fix this; using it in make-xref-location fixes it
in a more robust and consistent way.


To summarize:

The problem is comparing xrefs directly (for sorting and testing),
comparing the result of xref-location-group (for sorting), and possibly
comparing results of other xref access functions in the future. Note
that any filenames in xrefs identify existing files, except in
artifically constructed tests.

I believe the two alternatives being proposed are:

1) Use `file-equal-p'

- Provide a cl-defgeneric `xref-location-equal-p' for the root
  xref-location type.

- For each type that inherits from xref-location, provide an
  implementation of `xref-location-equal-p' that uses `file-equal-p' for
  each file name.

- Provide a cl-defmethod `xref-equal-p' for the root xref type that uses
  `xref-location-equal-p' for the location.

- For each type that inherits from xref, provide an implementation of
  `xref-equal-p' that uses `xref-location-equal-p' for each location.


2) Use `file-canonical-name'

- A canonical file name has the following features:

  It contains forward slashes for directory separators.

  If the file name identifies an existing file, the canonical file name
  casing matches the actual file name casing. If not, the case of the
  portions of the canonical file name that do not exist is arbitrary.
  Note that this matters only on case-insensitive file systems.

  Two different non-canonical names for an existing file convert to the
  same canonical file name. Note that this means symlinks are resolved.

  Canonical file names may be compared with `string-equal' and
  `string-lessp'.

- Provide a system-dependent `file-canonical-name' that converts
  user-provided file names to a canonical file name.

- Require each variant of make-xref-location to use it for file name
  slots.


Option 1) solves the problem of comparing xrefs and xref-locations, but
does not solve the problem of comparing results of xref-location-group.

Option 2) solves the problems of comparing xrefs, xref-locations,
results of xref-location-group, and results of future xref access
functions.

So far, `file-truename' seems adequate for `file-canonical-name', but it
might be best to define the alias, in case we find problems in the
future.

One problem with canonical file names is that the user can construct two
names for a non-existing file that differ only in case; they are
supposed to convert to the same canonical file name on a
case-insensitive file system, but `file-canonical-name' has no way to
know which might represent an actual file in the future. One solution is
for `file-canonical-name' to downcase the portions of the file name that
do not exist, but that's not very user-friendly (`file-truename' does
not do this). This problem does not seem likely to occur in practice.


Option 2) seems better to me. 

-- 
-- Stephe




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Wed, 04 Nov 2015 10:38:02 GMT) Full text and rfc822 format available.

Message #173 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Alexis <flexibeast <at> gmail.com>
To: stephen_leake <at> stephe-leake.org
Cc: Juanma Barranquero <lekktu <at> gmail.com>, 21816 <at> debbugs.gnu.org,
 Dmitry Gutov <dgutov <at> yandex.ru>
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Wed, 04 Nov 2015 21:37:02 +1100
Stephen Leake <stephen_leake <at> stephe-leake.org> writes:

> I think changing CAPITAL ẞ to "SS" is not strictly a "casing" 
> issue?

Oh, no, it is; a capital Eszett is not at all undebatedly 
canonical. E.g. Wikipedia calls capital Eszett 'contestable':

   https://en.wikipedia.org/wiki/Capital_%E1%BA%9E

This page:

   http://sites.psu.edu/gotunicode/2008/07/18/a_new_german_unicode_letter_ca/

has some more details, including noting that:

   "[i]n official German spelling convention, there is NO CAPITAL 
   SHARP S."

although in 2010 an exception was made:

   "However, in 2010 the use of the capital sharp s became 
   mandatory in official documentation when writing geographical 
   names in all-caps." 

Though i doubt that /all/ usage patterns have adjusted in response 
to this. Thus, the conservative approach is to assume that 'SS' is 
indeed /the/ capital form of ß. The above Wikipedia page goes on 
to note:

   "As of April 2008, typographers have yet to agree on a 
   standard form for the letter capital ẞ, as they did in 1903 
   when an association of German printers and type foundries 
   agreed on the 'Sulzbacher Form' as standard for the lowercase 
   ß."

Hence the capital Eszett only appearing in Unicode 5.1 - iiuc, due 
to the 2004 proposal arguing for a capital Eszett in Unicode on 
the basis that there are typefaces with a capital Eszett now out 
in the wild - and not appearing at all in ISO-8859-1, even though 
ß is there.


Alexis.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Wed, 04 Nov 2015 15:31:01 GMT) Full text and rfc822 format available.

Message #176 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Stephen Leake <stephen_leake <at> stephe-leake.org>
To: Juanma Barranquero <lekktu <at> gmail.com>
Cc: 21816 <at> debbugs.gnu.org
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Wed, 04 Nov 2015 09:29:42 -0600
Juanma Barranquero <lekktu <at> gmail.com> writes:

> ... though I get a whole bunch of warnings running the
> test:
>
> c:/devel/emacs/repo/trunk/test/automated/elisp-mode-tests.el and
> c:/Devel/emacs/repo/trunk/test/automated/elisp-mode-tests.el are the same
> file

Perhaps it would be appropriate to delete the code that generates this
warning.

If the policy is that `load-path' may contain directory names that are
different from the disk name, then taking advantage of that should not
produce warnings.

But that discussion should move to a separate bug/subject.

-- 
-- Stephe




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Wed, 04 Nov 2015 15:36:02 GMT) Full text and rfc822 format available.

Message #179 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Alexis <flexibeast <at> gmail.com>
Cc: lekktu <at> gmail.com, 21816 <at> debbugs.gnu.org, stephen_leake <at> stephe-leake.org,
 dgutov <at> yandex.ru
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Wed, 04 Nov 2015 17:34:03 +0200
> From: Alexis <flexibeast <at> gmail.com>
> Date: Wed, 04 Nov 2015 16:24:27 +1100
> Cc: Juanma Barranquero <lekktu <at> gmail.com>, 21816 <at> debbugs.gnu.org,
> 	Dmitry Gutov <dgutov <at> yandex.ru>
> 
>     I did not try a non-ASCII file name; I'm not sure how to construct one that has a non-ASCII casing issue. 
> 
> Perhaps use German Eszett / sharp S / Unicode LATIN SMALL LETTER SHARP S? The words 'Strasse' and 'Straße' are equivalent, but typically the latter is up-cased to 'STRASSE', rather than 'STRAẞE'. (Indeed, LATIN CAPITAL LETTER SHARP S was only added to Unicode in version 5.1.) In turn, 'STRASSE' is typically down-cased to 'strasse'. Further, even the latest version of Unicode notes that SpecialCasing.txt doesn't handle this situation: 

Thanks for bringing up this problematic issue.  However, this issue is
not relevant to file names, at least not on MS-Windows, because
Windows doesn't consider this case conversion either:

  (file-exists-p "Straße.txt")
   => t
  (file-exists-p "straße.txt")
   => t
  (file-exists-p "STRAßE.TXT")
   => t
  (file-exists-p "STRASSE.txt")
   => nil





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Wed, 04 Nov 2015 15:37:01 GMT) Full text and rfc822 format available.

Message #182 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Alexis <flexibeast <at> gmail.com>
Cc: random832 <at> fastmail.com, 21816 <at> debbugs.gnu.org
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Wed, 04 Nov 2015 17:35:34 +0200
> From: Alexis <flexibeast <at> gmail.com>
> Date: Wed, 04 Nov 2015 17:52:31 +1100
> Cc: 21816 <at> debbugs.gnu.org
> 
>     (string<? "z" "ß") => #t (string<? "z" "zz") => #t (string<? 
>     "z" "Z") => #f (string=? "Straße" "Strasse") => #f 
>  
>     (string-upcase "Hi") => "HI" (string-downcase "Hi") => "hi" 
>     (string-foldcase "Hi") => "hi" 
>  
>     (string-upcase "Straße") => "STRASSE" (string-downcase 
>     "Straße") => "straße" (string-foldcase "Straße") => "strasse" 
>     (string-downcase "STRASSE")  => "strasse"

That's locale-dependent (I'm guessing your locale uses the UTF-8
codeset and your system is glibc-based), whereas Emacs's case
conversions are independent of the current locale.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Wed, 04 Nov 2015 15:41:01 GMT) Full text and rfc822 format available.

Message #185 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Random832 <random832 <at> fastmail.com>
Cc: 21816 <at> debbugs.gnu.org
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Wed, 04 Nov 2015 17:38:55 +0200
> From: Random832 <random832 <at> fastmail.com>
> Date: Wed, 04 Nov 2015 01:33:20 -0500
> 
> Fsvo "typically". On Emacs, and in Windows filenames [fortuitously, they
> match in this case. Unfortunately, I don't believe there's an easy way
> to get at the table used in Windows], upcasing is done on a
> character-by-character basis, and AIUI neither of them do this, so
> (string= (upcase "Straße") "STRAßE").

Btw, patches to support what Unicode's SpecialCasing.txt describes are
welcome.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Wed, 04 Nov 2015 16:09:02 GMT) Full text and rfc822 format available.

Message #188 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stephen Leake <stephen_leake <at> stephe-leake.org>
Cc: lekktu <at> gmail.com, 21816 <at> debbugs.gnu.org, flexibeast <at> gmail.com,
 dgutov <at> yandex.ru
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Wed, 04 Nov 2015 18:08:20 +0200
> From: Stephen Leake <stephen_leake <at> stephe-leake.org>
> Date: Wed, 04 Nov 2015 03:04:41 -0600
> Cc: Juanma Barranquero <lekktu <at> gmail.com>, 21816 <at> debbugs.gnu.org,
> 	Dmitry Gutov <dgutov <at> yandex.ru>
> 
> Windows 8.1 lets me create:
> 
> "c:/tmp/STRAẞE.text" ;; LATIN CAPITAL LETTER SHARP S
> "c:/tmp/Straße.text" ;; LATIN SMALL LETTER SHARP S
> "c:/tmp/Strasse.text"

Windows is right: ẞ is not the upper-case variant of ß.  See
UnicodeData.txt and SpecialCasing.txt.

> Perhaps Windows has simply not yet implemented this recent addition to
> Unicode?

It's not recent: it appeared in Unicode 5.1, 7 years ago.

> Debian testing (a few months old) running in VMWare under Windows 8.1
> shows the same behavior. But Debian is case-sensitive, so that's not an
> issue here, I think. The upcase/downcase is wrong, but I'm guessing
> that's due to the CAPITAL being a recent addition to Unicode. 

No, it's because Unicode doesn't say these two letters are a case
pair, at least not in the usual sense.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Wed, 04 Nov 2015 19:40:02 GMT) Full text and rfc822 format available.

Message #191 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Stephen Leake <stephen_leake <at> stephe-leake.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: lekktu <at> gmail.com, 21816 <at> debbugs.gnu.org, dgutov <at> yandex.ru
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Wed, 04 Nov 2015 13:14:31 -0600
I've now looked at the definition of `file-equal-p':

  (let ((handler (or (find-file-name-handler file1 'file-equal-p)
                     (find-file-name-handler file2 'file-equal-p))))
    (if handler
        (funcall handler 'file-equal-p file1 file2)
      (let (f1-attr f2-attr)
        (and (setq f1-attr (file-attributes (file-truename file1)))
	     (setq f2-attr (file-attributes (file-truename file2)))
	     (equal f1-attr f2-attr))))))

This implies there are two reasons why `file-truename' (and by extension
`file-canonical-name') is not sufficient for determining if two names
identify the same file.

I think one issue is hardlinks; a file can have two distinct truenames
because of hardlinks, but the attributes contain the uid and the inode
number, which will identify the actual file.

I'm not clear on why a special handler for file-equal-p is needed; is
that just an optimization? file-truename respects file handlers.


This changes Option 2 as indicated below.

Stephen Leake <stephen_leake <at> stephe-leake.org> writes:

> To summarize:
>
> The problem is comparing xrefs directly (for sorting and testing),
> comparing the result of xref-location-group (for sorting), and possibly
> comparing results of other xref access functions in the future. Note
> that any filenames in xrefs identify existing files, except in
> artifically constructed tests.
>
> I believe the two alternatives being proposed are:
>
> 1) Use `file-equal-p'
>
> - Provide a cl-defgeneric `xref-location-equal-p' for the root
>   xref-location type.
>
> - For each type that inherits from xref-location, provide an
>   implementation of `xref-location-equal-p' that uses `file-equal-p' for
>   each file name.
>
> - Provide a cl-defmethod `xref-equal-p' for the root xref type that uses
>   `xref-location-equal-p' for the location.
>
> - For each type that inherits from xref, provide an implementation of
>   `xref-equal-p' that uses `xref-location-equal-p' for each location.
>
>
> 2) Use `file-canonical-name'

This changes to "Use `file-truename'.

> - A canonical file name has the following features:
>
>   It contains forward slashes for directory separators.
>
>   If the file name identifies an existing file, the canonical file name
>   casing matches the actual file name casing. If not, the case of the
>   portions of the canonical file name that do not exist is arbitrary.
>   Note that this matters only on case-insensitive file systems.

I believe this is true of the result of file-truename, and it could be
added to the docstring for that function.

The following two points cannot be satisfied in the presence of
hardlinks, and possibly other issues with file handlers; so they are deleted.

>   Two different non-canonical names for an existing file convert to the
>   same canonical file name. Note that this means symlinks are resolved.
>
>   Canonical file names may be compared with `string-equal' and
>   `string-lessp'.

> - Provide a system-dependent `file-canonical-name' that converts
>   user-provided file names to a canonical file name.

This is deleted.


> - Require each variant of make-xref-location to use it for file name
>   slots.

"it" now means "file-truename".

> Option 1) solves the problem of comparing xrefs and xref-locations, but
> does not solve the problem of comparing results of
> xref-location-group.

This remains true. And since xref--analyze compares the result of
xref-location-group, it is not just a theoretical problem; I provided an
example where it fails.

> Option 2) solves the problems of comparing xrefs, xref-locations,
> results of xref-location-group, and results of future xref access
> functions.

This is true only if there are no hard links (or other issues with file
handlers). 

> So far, `file-truename' seems adequate for `file-canonical-name', but it
> might be best to define the alias, in case we find problems in the
> future.

This is not true; I'm now convinced it is not possible to define a fully
correct `file-canonical-name'.

Hmm. If there is a way to iterate all the truenames for a file (I didn't
find one), then we could order them with string-lessp, and simply take
the first as the canonical name. I'm not sure that would be really
useful.

> Option 2) seems better to me. 

This remains true. If there are hardlinks on the disk, users will just
have to be aware of that, and cope. That's no worse than the current
situation.

-- 
-- Stephe




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Thu, 05 Nov 2015 12:28:01 GMT) Full text and rfc822 format available.

Message #194 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Juanma Barranquero <lekktu <at> gmail.com>
To: Stephen Leake <stephen_leake <at> stephe-leake.org>
Cc: 21816 <at> debbugs.gnu.org
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Thu, 5 Nov 2015 13:26:00 +0100
[Message part 1 (text/plain, inline)]
On Wed, Nov 4, 2015 at 4:29 PM, Stephen Leake <
stephen_leake <at> stephe-leake.org> wrote:

> If the policy is that `load-path' may contain directory names that are
> different from the disk name, then taking advantage of that should not
> produce warnings.
>
> But that discussion should move to a separate bug/subject.

For this specific test file, I could fix the problem with this patch, if
you and Dmitry are OK with it.

diff --git i/test/automated/elisp-mode-tests.el
w/test/automated/elisp-mode-tests.el
index 1085b54..38c0b3b 100644
--- i/test/automated/elisp-mode-tests.el
+++ w/test/automated/elisp-mode-tests.el
@@ -209,6 +209,7 @@ xref-elisp-deftest
            (debug (symbolp "name")))
   `(ert-deftest ,(intern (concat "xref-elisp-test-" (symbol-name name))) ()
-     (xref-elisp-test-run ,computed-xrefs ,expected-xrefs)
-     ))
+     (let ((find-file-suppress-same-file-warnings t))
+       (xref-elisp-test-run ,computed-xrefs ,expected-xrefs)
+       )))

 ;; When tests are run from the Makefile, 'default-directory' is $HOME,
[Message part 2 (text/html, inline)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Thu, 05 Nov 2015 13:25:02 GMT) Full text and rfc822 format available.

Message #197 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Stephen Leake <stephen_leake <at> stephe-leake.org>
To: Juanma Barranquero <lekktu <at> gmail.com>
Cc: 21816 <at> debbugs.gnu.org
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Thu, 05 Nov 2015 07:23:47 -0600
Juanma Barranquero <lekktu <at> gmail.com> writes:

> On Wed, Nov 4, 2015 at 4:29 PM, Stephen Leake <
> stephen_leake <at> stephe-leake.org> wrote:
>
>> If the policy is that `load-path' may contain directory names that are
>> different from the disk name, then taking advantage of that should not
>> produce warnings.
>>
>> But that discussion should move to a separate bug/subject.
>
> For this specific test file, I could fix the problem with this patch, if
> you and Dmitry are OK with it.
>
> diff --git i/test/automated/elisp-mode-tests.el
> w/test/automated/elisp-mode-tests.el
> index 1085b54..38c0b3b 100644
> --- i/test/automated/elisp-mode-tests.el
> +++ w/test/automated/elisp-mode-tests.el
> @@ -209,6 +209,7 @@ xref-elisp-deftest
>             (debug (symbolp "name")))
>    `(ert-deftest ,(intern (concat "xref-elisp-test-" (symbol-name name))) ()
> -     (xref-elisp-test-run ,computed-xrefs ,expected-xrefs)
> -     ))
> +     (let ((find-file-suppress-same-file-warnings t))
> +       (xref-elisp-test-run ,computed-xrefs ,expected-xrefs)
> +       )))
>

looks good to me


-- 
-- Stephe




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21816; Package emacs. (Mon, 09 Nov 2015 03:40:02 GMT) Full text and rfc822 format available.

Message #200 received at 21816 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Stephen Leake <stephen_leake <at> stephe-leake.org>,
 Juanma Barranquero <lekktu <at> gmail.com>
Cc: 21816 <at> debbugs.gnu.org
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Mon, 9 Nov 2015 05:38:57 +0200
On 11/03/2015 09:43 PM, Stephen Leake wrote:

> You asked about "sorting xrefs"; the only sorting that is currently done
> is in xref--analyze, which groups xrefs with the same filename.

Just to clarify: there's no sorting being done in xref--analyze, only 
grouping. If it sorted xrefs, it would've likely made things worse for 
project-find-regexp.




Reply sent to Juanma Barranquero <lekktu <at> gmail.com>:
You have taken responsibility. (Wed, 02 Oct 2019 00:59:01 GMT) Full text and rfc822 format available.

Notification sent to Juanma Barranquero <lekktu <at> gmail.com>:
bug acknowledged by developer. (Wed, 02 Oct 2019 00:59:02 GMT) Full text and rfc822 format available.

Message #205 received at 21816-done <at> debbugs.gnu.org (full text, mbox):

From: Juanma Barranquero <lekktu <at> gmail.com>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: Stephen Leake <stephen_leake <at> stephe-leake.org>, 21816-done <at> debbugs.gnu.org
Subject: Re: bug#21816: elisp-mode-tests fails on a case-preserving filesystem
Date: Wed, 2 Oct 2019 02:57:46 +0200
[Message part 1 (text/plain, inline)]
The tests pass, the bug was fixed long ago, and the rest is a very long
discussion about generalizing comparison of xrefs and canonicalizing
filenames.

I'm closing this one.
[Message part 2 (text/html, inline)]

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Wed, 30 Oct 2019 11:24:06 GMT) Full text and rfc822 format available.

This bug report was last modified 4 years and 151 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.