GNU bug report logs -
#57129
29.0.50; Improve behavior of conditionals in Eshell
Previous Next
Reported by: Jim Porter <jporterbugs <at> gmail.com>
Date: Thu, 11 Aug 2022 02:44:02 UTC
Severity: normal
Found in version 29.0.50
Done: Jim Porter <jporterbugs <at> gmail.com>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 57129 in the body.
You can then email your comments to 57129 AT debbugs.gnu.org in the normal way.
Toggle the display of automated, internal messages from the tracker.
Report forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Thu, 11 Aug 2022 02:44:02 GMT)
Full text and
rfc822 format available.
Acknowledgement sent
to
Jim Porter <jporterbugs <at> gmail.com>
:
New bug report received and forwarded. Copy sent to
bug-gnu-emacs <at> gnu.org
.
(Thu, 11 Aug 2022 02:44:02 GMT)
Full text and
rfc822 format available.
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
In Eshell, you can use conditionals pretty much like you'd expect from
other shells. Starting from 'emacs -Q -f eshell':
~ $ [ foo = foo ] && echo hi
hi
~ $ [ foo = bar ] && echo hi
~ $ [ foo = foo ] || echo hi
~ $ [ foo = bar ] || echo hi
hi
~ $ if {[ foo = foo ]} {echo yes} {echo no}
yes
~ $ if {[ foo = bar ]} {echo yes} {echo no}
no
However, that only works for external commands. If the command is
implemented in Lisp, it doesn't work:
~ $ (zerop 0) && echo hi
t
hi
~ $ (zerop 1) && echo hi
hi ;; Shouldn't say hi.
That's because the exit status is always 0 for Lisp commands. This works
correctly for external commands:
~ $ [ foo = foo ]; echo status $? result $$
("status" 0 "result" nil)
~ $ [ foo = bar ]; echo status $? result $$
("status" 1 "result" nil)
But not for Lisp commands:
~ $ (zerop 0); echo status $? result $$
t
("status" 0 "result" t)
~ $ (zerop 1); echo status $? result $$
("status" 0 "result" nil)
~ $ (zerop "foo"); echo status $? result $$
Wrong type argument: number-or-marker-p, "foo"
("status" 0 "result" nil)
The manual says that the status should be 1 when a Lisp command fails,
but it's 0 no matter what, even if it signaled an error. (Likewise, the
manual says that the "result" variable should be t for successful
external commands, but it's always nil.)
Similarly to the above, you can't use variable expansions for conditionals:
~ $ (setq foo t)
t
~ $ if $foo {echo yes} {echo no}
yes
~ $ (setq foo nil)
~ $ if $foo {echo yes} {echo no}
yes ;; Should say no.
Patch forthcoming. Just splitting it into two messages since it seemed
more readable that way...
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Thu, 11 Aug 2022 02:47:01 GMT)
Full text and
rfc822 format available.
Message #8 received at 57129 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
Here are some patches to fix this.
The first patch adds tests/documentation for the current state so that
the subsequent patches are clearer. It also improves
'eshell-close-handles' so that you can set the exit status elsewhere,
which makes some of the code simpler.
The second patch fixes the use of variables in conditionals (e.g. if
statements, while loops). The only non-test code change here is that
'eshell-structure-basic-command' needed to be taught that forms
beginning with 'eshell-escape-arg' should be treated as data, much like
'eshell-convert'.
The third patch fixes the behavior of the '$?' and '$$' variables to
match the manual: '$$' is t when an external command succeeds, and '$?'
is 1 when a Lisp command signals an error. I also added a new feature
that '$?' is 2 when an actual Lisp *form* returns nil. This lets you use
Lisp forms as conditionals in Eshell much like you would in regular
Lisp. However, it doesn't apply to the command syntax, even if it runs
Lisp code:
~ $ (zerop 1); echo $?
2
~ $ zerop 1; echo $?
0
That's because Eshell prints the return value of Lisp commands (unless
it's nil), so a Lisp command that wants to silently succeed would return
nil. For example, the Lisp version of 'cat' returns nil. We want that to
have an exit status of 0. However, when writing it as a Lisp form with
parentheses, I think it makes more sense that nil is treated as false
for conditionals. The overall goal is so that this prints hi:
cat foo.txt && echo hi
And that this doesn't:
(zerop 1) && echo hi
For people who don't like this behavior, they can customize the new
'eshell-lisp-form-nil-is-failure' variable.
[0001-Only-set-Eshell-execution-result-metavariables-when-.patch (text/plain, attachment)]
[0002-Allow-using-dollar-expansions-in-Eshell-conditionals.patch (text/plain, attachment)]
[0003-Make-and-variables-more-consistent-in-Eshell.patch (text/plain, attachment)]
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Fri, 12 Aug 2022 15:23:02 GMT)
Full text and
rfc822 format available.
Message #11 received at 57129 <at> debbugs.gnu.org (full text, mbox):
Jim Porter <jporterbugs <at> gmail.com> writes:
> Here are some patches to fix this.
I haven't tested the patches, but reading them, they make sense to me,
so please go ahead and push.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Sat, 13 Aug 2022 05:15:01 GMT)
Full text and
rfc822 format available.
Message #14 received at 57129 <at> debbugs.gnu.org (full text, mbox):
On 8/12/2022 8:22 AM, Lars Ingebrigtsen wrote:
> Jim Porter <jporterbugs <at> gmail.com> writes:
>
>> Here are some patches to fix this.
>
> I haven't tested the patches, but reading them, they make sense to me,
> so please go ahead and push.
Thanks for taking a look. Merged as
f3408af0a3251a744cb0b55b5e153565bfd57ea3.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Sat, 13 Aug 2022 07:02:01 GMT)
Full text and
rfc822 format available.
Message #17 received at 57129 <at> debbugs.gnu.org (full text, mbox):
> Cc: 57129 <at> debbugs.gnu.org
> From: Jim Porter <jporterbugs <at> gmail.com>
> Date: Fri, 12 Aug 2022 22:14:02 -0700
>
> On 8/12/2022 8:22 AM, Lars Ingebrigtsen wrote:
> > Jim Porter <jporterbugs <at> gmail.com> writes:
> >
> >> Here are some patches to fix this.
> >
> > I haven't tested the patches, but reading them, they make sense to me,
> > so please go ahead and push.
>
> Thanks for taking a look. Merged as
> f3408af0a3251a744cb0b55b5e153565bfd57ea3.
One of the tests in esh-var-tests.el failed on MS-Windows; I fixed it,
although I'm not sure it's a correct fix, because the Eshell manual
seems to say that a built-in implementation of a command should be
preferred by default?
One of the tests in eshell-tests.el also fails on MS-Windows, but I
have no idea how to fix it, nor even what are the details of the
test. Are the 'echo' commands in that test supposed to be external
shell commands or internal Eshell commands? If the former, it could
be a problem on MS-Windows.
(This is a good example of my gripes about the test suite, which I
described in bug#57150 yesterday: debugging a failure is unnecessarily
hard and tedious. There's no information about the specific intent of
the test and its inner workings and assumptions, except the general
idea described in the doc string.)
Here's the failure I get:
Test eshell-test/subcommand-reset-in-pipeline backtrace:
signal(ert-test-failed (((should (equal (eshell-test-command-result
ert-fail(((should (equal (eshell-test-command-result (format templat
(if (unwind-protect (setq value-38 (apply fn-36 args-37)) (setq form
(let (form-description-40) (if (unwind-protect (setq value-38 (apply
(let ((value-38 'ert-form-evaluation-aborted-39)) (let (form-descrip
(let* ((fn-36 #'equal) (args-37 (condition-case err (let ((signal-ho
(let ((template (car --dolist-tail--))) (let* ((fn-26 #'equal) (args
(while --dolist-tail-- (let ((template (car --dolist-tail--))) (let*
(let ((--dolist-tail-- '("echo {%s} | *cat" "echo ${%s} | *cat" "*ca
(closure (t) nil (let* ((fn-21 #'executable-find) (args-22 (conditio
ert--run-test-internal(#s(ert--test-execution-info :test #s(ert-test
ert-run-test(#s(ert-test :name eshell-test/subcommand-reset-in-pipel
ert-run-or-rerun-test(#s(ert--stats :selector ... :tests ... :test-m
ert-run-tests((not (or (tag :unstable) (tag :nativecomp))) #f(compil
ert-run-tests-batch((not (or (tag :unstable) (tag :nativecomp))))
ert-run-tests-batch-and-exit((not (or (tag :unstable) (tag :nativeco
eval((ert-run-tests-batch-and-exit '(not (or (tag :unstable) (tag :n
command-line-1(("-L" ";." "-l" "ert" "-l" "lisp/eshell/eshell-tests.
command-line()
normal-top-level()
Test eshell-test/subcommand-reset-in-pipeline condition:
(ert-test-failed
((should
(equal
(eshell-test-command-result ...)
"first"))
:form
(equal nil "first")
:value nil :explanation
(different-types nil "first")))
FAILED 18/18 eshell-test/subcommand-reset-in-pipeline (1.873920 sec) at lisp/eshell/eshell-tests.el:80
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Sat, 13 Aug 2022 18:57:02 GMT)
Full text and
rfc822 format available.
Message #20 received at 57129 <at> debbugs.gnu.org (full text, mbox):
On 8/13/2022 12:01 AM, Eli Zaretskii wrote:
> One of the tests in esh-var-tests.el failed on MS-Windows; I fixed it,
> although I'm not sure it's a correct fix, because the Eshell manual
> seems to say that a built-in implementation of a command should be
> preferred by default?
Sorry about that. I think that's the right fix for that case. Maybe it
would make sense to set 'eshell-prefer-lisp-functions' to t for most of
the Eshell tests to improve reproducibility on various platforms; tests
that want an external command can just put a "*" in front of the command
name.
> One of the tests in eshell-tests.el also fails on MS-Windows, but I
> have no idea how to fix it, nor even what are the details of the
> test. Are the 'echo' commands in that test supposed to be external
> shell commands or internal Eshell commands? If the former, it could
> be a problem on MS-Windows.
The echo commands should be internal Eshell commands (since there's an
'eshell/echo' Lisp function, that one should always be preferred by Eshell).
I'm surprised that test fails on MS Windows, since it *should* be
testing internal Eshell logic that's not platform-specific. Based on the
failure, it looks like one of the following commands is returning the
wrong value:
echo {echo $eshell-in-pipeline-p | echo} | *cat
echo ${echo $eshell-in-pipeline-p | echo} | *cat
*cat $<echo $eshell-in-pipeline-p | echo> | *cat
All of these should return 'first'. That test is just checking that,
when you're in a subcommand ({...}, ${...}, or $<...>), the value of
'eshell-in-pipeline-p' shouldn't be influenced by the pipeline in the
"super-command". Some built-in Eshell commands consult
'eshell-in-pipeline-p', and if it had the wrong value, they might do the
wrong thing.
If nothing else, it would probably be helpful to set up ERT explainers
so that the error messages are easier to understand. As it is now,
they're not very explanatory.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Sun, 14 Aug 2022 05:04:01 GMT)
Full text and
rfc822 format available.
Message #23 received at 57129 <at> debbugs.gnu.org (full text, mbox):
Hello,
On Fri 12 Aug 2022 at 10:14PM -07, Jim Porter wrote:
> On 8/12/2022 8:22 AM, Lars Ingebrigtsen wrote:
>> Jim Porter <jporterbugs <at> gmail.com> writes:
>>
>>> Here are some patches to fix this.
>> I haven't tested the patches, but reading them, they make sense to me,
>> so please go ahead and push.
>
> Thanks for taking a look. Merged as f3408af0a3251a744cb0b55b5e153565bfd57ea3.
Thanks for working on this, pretty cool.
--
Sean Whitton
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Sun, 14 Aug 2022 05:09:02 GMT)
Full text and
rfc822 format available.
Message #26 received at 57129 <at> debbugs.gnu.org (full text, mbox):
> Cc: larsi <at> gnus.org, 57129 <at> debbugs.gnu.org
> From: Jim Porter <jporterbugs <at> gmail.com>
> Date: Sat, 13 Aug 2022 11:56:20 -0700
>
> On 8/13/2022 12:01 AM, Eli Zaretskii wrote:
> > One of the tests in esh-var-tests.el failed on MS-Windows; I fixed it,
> > although I'm not sure it's a correct fix, because the Eshell manual
> > seems to say that a built-in implementation of a command should be
> > preferred by default?
>
> Sorry about that. I think that's the right fix for that case. Maybe it
> would make sense to set 'eshell-prefer-lisp-functions' to t for most of
> the Eshell tests to improve reproducibility on various platforms; tests
> that want an external command can just put a "*" in front of the command
> name.
But then this fragment from the Eshell manual:
The command can be either an Elisp function or an external command.
Eshell looks first for an alias (*note Aliases::) with the same name as
the command, then a built-in (*note Built-ins::) or a function with the
same name; if there is no match, it then tries to execute it as an
external command.
seems to be inaccurate? Since 'format' exists as a built-in command,
why did Eshell in this case invoke the external command instead?
> I'm surprised that test fails on MS Windows, since it *should* be
> testing internal Eshell logic that's not platform-specific. Based on the
> failure, it looks like one of the following commands is returning the
> wrong value:
>
> echo {echo $eshell-in-pipeline-p | echo} | *cat
> echo ${echo $eshell-in-pipeline-p | echo} | *cat
> *cat $<echo $eshell-in-pipeline-p | echo> | *cat
>
> All of these should return 'first'.
The first two do; the last one returns nothing.
> That test is just checking that,
> when you're in a subcommand ({...}, ${...}, or $<...>), the value of
> 'eshell-in-pipeline-p' shouldn't be influenced by the pipeline in the
> "super-command". Some built-in Eshell commands consult
> 'eshell-in-pipeline-p', and if it had the wrong value, they might do the
> wrong thing.
So how to investigate this failure further?
> If nothing else, it would probably be helpful to set up ERT explainers
> so that the error messages are easier to understand. As it is now,
> they're not very explanatory.
Indeed, more explanations in this and other tests will be most
welcome.
Thanks.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Sun, 14 Aug 2022 05:38:02 GMT)
Full text and
rfc822 format available.
Message #29 received at 57129 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
On 8/13/2022 10:07 PM, Eli Zaretskii wrote:
>> Cc: larsi <at> gnus.org, 57129 <at> debbugs.gnu.org
>> From: Jim Porter <jporterbugs <at> gmail.com>
>> Date: Sat, 13 Aug 2022 11:56:20 -0700
>>
>> Sorry about that. I think that's the right fix for that case. Maybe it
>> would make sense to set 'eshell-prefer-lisp-functions' to t for most of
>> the Eshell tests to improve reproducibility on various platforms; tests
>> that want an external command can just put a "*" in front of the command
>> name.
>
> But then this fragment from the Eshell manual:
>
> The command can be either an Elisp function or an external command.
> Eshell looks first for an alias (*note Aliases::) with the same name as
> the command, then a built-in (*note Built-ins::) or a function with the
> same name; if there is no match, it then tries to execute it as an
> external command.
>
> seems to be inaccurate? Since 'format' exists as a built-in command,
> why did Eshell in this case invoke the external command instead?
"Built-in" in that part of the manual refers to a function named
'eshell/FOO'. If you run "FOO" as an Eshell command, it will check for
'eshell/FOO' before an external command on your system. The manual could
probably stand to be improved here.
>> I'm surprised that test fails on MS Windows, since it *should* be
>> testing internal Eshell logic that's not platform-specific. Based on the
>> failure, it looks like one of the following commands is returning the
>> wrong value:
>>
>> echo {echo $eshell-in-pipeline-p | echo} | *cat
>> echo ${echo $eshell-in-pipeline-p | echo} | *cat
>> *cat $<echo $eshell-in-pipeline-p | echo> | *cat
>>
>> All of these should return 'first'.
>
> The first two do; the last one returns nothing.
[snip]
> So how to investigate this failure further?
I have access to an MS Windows system (but don't have a build
environment set up for native MS Windows builds), so I can try to
reproduce this using the Emacs 29 pretest binaries in the next few days.
Hopefully that works.
If you'd like to dig into this further yourself, you could try running
this command in Eshell:
eshell-parse-command '*cat $<echo $eshell-in-pipeline-p | echo> | *cat'
That will print the Lisp form that the command gets converted to. I've
attached the result that I get in a recent Emacs 29 build on GNU/Linux.
The two functions that set 'eshell-in-pipeline-p' in there are:
* 'eshell-as-subcommand', which resets it to nil so that the outer
command's pipeline state doesn't interfere with the subcommand.
* 'eshell-execute-pipeline', which calls 'eshell-do-pipelines' (except
on MS-DOS, I think) and should set 'eshell-in-pipeline-p' to 'first'
in the above case.
>> If nothing else, it would probably be helpful to set up ERT explainers
>> so that the error messages are easier to understand. As it is now,
>> they're not very explanatory.
>
> Indeed, more explanations in this and other tests will be most
> welcome.
Yeah, this test in particular is high up on the list of tests that needs
more explanation. It's pretty subtle, and the test doesn't really serve
as a good example of why someone would care that this behavior works the
way the test demands it.
[parsed-command.el (text/plain, attachment)]
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Sun, 14 Aug 2022 07:31:02 GMT)
Full text and
rfc822 format available.
Message #32 received at 57129 <at> debbugs.gnu.org (full text, mbox):
> Cc: larsi <at> gnus.org, 57129 <at> debbugs.gnu.org
> From: Jim Porter <jporterbugs <at> gmail.com>
> Date: Sat, 13 Aug 2022 22:37:12 -0700
>
> > But then this fragment from the Eshell manual:
> >
> > The command can be either an Elisp function or an external command.
> > Eshell looks first for an alias (*note Aliases::) with the same name as
> > the command, then a built-in (*note Built-ins::) or a function with the
> > same name; if there is no match, it then tries to execute it as an
> > external command.
> >
> > seems to be inaccurate? Since 'format' exists as a built-in command,
> > why did Eshell in this case invoke the external command instead?
>
> "Built-in" in that part of the manual refers to a function named
> 'eshell/FOO'. If you run "FOO" as an Eshell command, it will check for
> 'eshell/FOO' before an external command on your system. The manual could
> probably stand to be improved here.
The manual definitely should be clarified in that matter, because:
d:/gnu/git/emacs/trunk/src $ which format
format is a built-in function in ‘C source code’.
To me this says that 'format' is a built-in command, and the manual
says such commands should be invoked in preference to external
commands. The "eshell/" part is not mentioned anywhere.
> If you'd like to dig into this further yourself, you could try running
> this command in Eshell:
>
> eshell-parse-command '*cat $<echo $eshell-in-pipeline-p | echo> | *cat'
>
> That will print the Lisp form that the command gets converted to. I've
> attached the result that I get in a recent Emacs 29 build on GNU/Linux.
I get the same output, with 2 exceptions:
. the name of the temporary file is different
. instead of a simple string here:
(eshell-set-output-handle 1 'overwrite "/tmp/QqPFwo"))
I get a file name split into several parts, which are then
concatenated by eshell-concat:
(eshell-set-output-handle 1 'overwrite
(eshell-extended-glob
(eshell-concat nil "c:/DOCUME" "~" "1/Zaretzky/LOCALS" "~" "1/Temp/OIi8Wd")))
Any chance that the "~" parts here somehow get in the way?
If not, any other thoughts? My main worry is that there's something
here specific to how we invoke subprocesses, which you lately changed.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Sun, 14 Aug 2022 21:41:01 GMT)
Full text and
rfc822 format available.
Message #35 received at 57129 <at> debbugs.gnu.org (full text, mbox):
On 8/14/2022 12:30 AM, Eli Zaretskii wrote:
> The manual definitely should be clarified in that matter, because:
>
> d:/gnu/git/emacs/trunk/src $ which format
> format is a built-in function in ‘C source code’.
>
> To me this says that 'format' is a built-in command, and the manual
> says such commands should be invoked in preference to external
> commands. The "eshell/" part is not mentioned anywhere.
I'll work on a fix for that. This is a little more complex than
immediately apparent though, since Eshell also unfortunately uses the
term "alias" to refer to two kinds of entities:
1) Command abbreviations created by the user using the 'alias' command,
e.g. "alias ll 'ls -l'". These are sometimes called "command
aliases", and are implemented in em-alias.el.
2) Lisp functions with names like 'eshell/FOO'. These are sometimes
called "alias functions", and are implemented in esh-cmd.el. For
example, see 'eshell-find-alias-function', which checks if
'eshell/FOO' exists.
It would probably be good to fix all this terminology for once and for all.
> I get the same output, with 2 exceptions:
>
> . the name of the temporary file is different
> . instead of a simple string here:
>
> (eshell-set-output-handle 1 'overwrite "/tmp/QqPFwo"))
>
> I get a file name split into several parts, which are then
> concatenated by eshell-concat:
>
> (eshell-set-output-handle 1 'overwrite
> (eshell-extended-glob
> (eshell-concat nil "c:/DOCUME" "~" "1/Zaretzky/LOCALS" "~" "1/Temp/OIi8Wd")))
>
> Any chance that the "~" parts here somehow get in the way?
It's possible. I think Eshell is trying to ensure that the "~" doesn't
get parsed as "home directory" here. But it could be doing something
wrong there. I did also touch 'eshell-concat' not too long ago, so I
might have broken something there.
> If not, any other thoughts? My main worry is that there's something
> here specific to how we invoke subprocesses, which you lately changed.
You could try some Eshell commands that don't use subprocesses to see if
that changes anything. For example, these should only use Lisp functions:
;; Should print "first" in the Eshell buffer:
eshell/echo ${eshell/echo $eshell-in-pipeline-p | eshell/echo}
eshell/echo ${eshell/echo $eshell-in-pipeline-p | eshell/echo} |
eshell/echo
;; Should open a temp file containing "first" in a new buffer:
find-file $<eshell/echo $eshell-in-pipeline-p | eshell/echo>
find-file $<eshell/echo $eshell-in-pipeline-p | eshell/echo> |
eshell/echo
The "eshell/" prefix probably isn't necessary, but I figured it's worth
being extra-careful here while diagnosing the issue. I'm particularly
interested in whether the third and fourth commands work. If they work,
that would definitely point towards a bug in Eshell's subprocess
handling; if not, then that would point to a bug in the "$<subcommand>"
handling.
I also filed bug#57216 to add ERT explainers that will print better
error messages for most of the Eshell tests. That doesn't address the
sometimes-vague documentation of the tests, but I'll do that separately.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Mon, 15 Aug 2022 12:15:02 GMT)
Full text and
rfc822 format available.
Message #38 received at 57129 <at> debbugs.gnu.org (full text, mbox):
> Cc: larsi <at> gnus.org, 57129 <at> debbugs.gnu.org
> From: Jim Porter <jporterbugs <at> gmail.com>
> Date: Sun, 14 Aug 2022 14:40:06 -0700
>
> On 8/14/2022 12:30 AM, Eli Zaretskii wrote:
> > The manual definitely should be clarified in that matter, because:
> >
> > d:/gnu/git/emacs/trunk/src $ which format
> > format is a built-in function in ‘C source code’.
> >
> > To me this says that 'format' is a built-in command, and the manual
> > says such commands should be invoked in preference to external
> > commands. The "eshell/" part is not mentioned anywhere.
>
> I'll work on a fix for that. This is a little more complex than
> immediately apparent though, since Eshell also unfortunately uses the
> term "alias" to refer to two kinds of entities:
>
> 1) Command abbreviations created by the user using the 'alias' command,
> e.g. "alias ll 'ls -l'". These are sometimes called "command
> aliases", and are implemented in em-alias.el.
>
> 2) Lisp functions with names like 'eshell/FOO'. These are sometimes
> called "alias functions", and are implemented in esh-cmd.el. For
> example, see 'eshell-find-alias-function', which checks if
> 'eshell/FOO' exists.
>
> It would probably be good to fix all this terminology for once and for all.
Yes, that could well be a source of a lot of confusion.
> You could try some Eshell commands that don't use subprocesses to see if
> that changes anything. For example, these should only use Lisp functions:
>
> ;; Should print "first" in the Eshell buffer:
> eshell/echo ${eshell/echo $eshell-in-pipeline-p | eshell/echo}
> eshell/echo ${eshell/echo $eshell-in-pipeline-p | eshell/echo} |
> eshell/echo
>
> ;; Should open a temp file containing "first" in a new buffer:
> find-file $<eshell/echo $eshell-in-pipeline-p | eshell/echo>
> find-file $<eshell/echo $eshell-in-pipeline-p | eshell/echo> |
> eshell/echo
>
> The "eshell/" prefix probably isn't necessary, but I figured it's worth
> being extra-careful here while diagnosing the issue. I'm particularly
> interested in whether the third and fourth commands work. If they work,
> that would definitely point towards a bug in Eshell's subprocess
> handling; if not, then that would point to a bug in the "$<subcommand>"
> handling.
All the 4 commands do what you say they should do.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Mon, 15 Aug 2022 16:15:01 GMT)
Full text and
rfc822 format available.
Message #41 received at 57129 <at> debbugs.gnu.org (full text, mbox):
On 8/15/2022 5:13 AM, Eli Zaretskii wrote:
>> Cc: larsi <at> gnus.org, 57129 <at> debbugs.gnu.org
>> From: Jim Porter <jporterbugs <at> gmail.com>
>> Date: Sun, 14 Aug 2022 14:40:06 -0700
>>
>> You could try some Eshell commands that don't use subprocesses to see if
>> that changes anything. For example, these should only use Lisp functions:
>>
>> ;; Should print "first" in the Eshell buffer:
>> eshell/echo ${eshell/echo $eshell-in-pipeline-p | eshell/echo}
>> eshell/echo ${eshell/echo $eshell-in-pipeline-p | eshell/echo} |
>> eshell/echo
>>
>> ;; Should open a temp file containing "first" in a new buffer:
>> find-file $<eshell/echo $eshell-in-pipeline-p | eshell/echo>
>> find-file $<eshell/echo $eshell-in-pipeline-p | eshell/echo> |
>> eshell/echo
>>
>> The "eshell/" prefix probably isn't necessary, but I figured it's worth
>> being extra-careful here while diagnosing the issue. I'm particularly
>> interested in whether the third and fourth commands work. If they work,
>> that would definitely point towards a bug in Eshell's subprocess
>> handling; if not, then that would point to a bug in the "$<subcommand>"
>> handling.
>
> All the 4 commands do what you say they should do.
Hm, and I can't reproduce this on the prebuilt Emacs 29 snapshot from
July 18, which is from just before I merged the changes to
'make-process', so it's definitely possible that this is due to those
commits.
Does reverting 4e59830bc0ab17cdbd85748b133c97837bed99e3 and
d7b89ea4077d4fe677ba0577245328819ee79cdc fix the issue? I checked
locally, and they should revert cleanly at least.
It might also be interesting to revert just the changes from d7b89ea in
lisp/eshell/esh-proc.el. I don't think any of the changes I made to the
C sources should impact this, since a) I was careful to keep the logic
as close to before as I could, and b) it just changes the logic for
creating a PTY, which doesn't work on MS Windows anyway.
I did notice one weird issue though, and maybe this has something to do
with the problem: on MS Windows, if I run any command with a
$<subcommand>, the second and subsequent times, it warns me about
changing the temp file, e.g.:
3Zz80A has changed since visited or saved. Save anyway? (yes or no)
This probably shouldn't do that, and it could definitely cause issues in
the tests, which can't prompt the user for things like that.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Mon, 15 Aug 2022 16:50:02 GMT)
Full text and
rfc822 format available.
Message #44 received at 57129 <at> debbugs.gnu.org (full text, mbox):
> Cc: larsi <at> gnus.org, 57129 <at> debbugs.gnu.org
> From: Jim Porter <jporterbugs <at> gmail.com>
> Date: Mon, 15 Aug 2022 09:14:04 -0700
>
> > All the 4 commands do what you say they should do.
>
> Hm, and I can't reproduce this on the prebuilt Emacs 29 snapshot from
> July 18, which is from just before I merged the changes to
> 'make-process', so it's definitely possible that this is due to those
> commits.
>
> Does reverting 4e59830bc0ab17cdbd85748b133c97837bed99e3 and
> d7b89ea4077d4fe677ba0577245328819ee79cdc fix the issue? I checked
> locally, and they should revert cleanly at least.
>
> It might also be interesting to revert just the changes from d7b89ea in
> lisp/eshell/esh-proc.el. I don't think any of the changes I made to the
> C sources should impact this, since a) I was careful to keep the logic
> as close to before as I could, and b) it just changes the logic for
> creating a PTY, which doesn't work on MS Windows anyway.
None of these reverts fixes the issue. I tried both the tests in
eshell-tests.el and the problematic command:
*cat $<echo $eshell-in-pipeline-p | echo> | *cat
> I did notice one weird issue though, and maybe this has something to do
> with the problem: on MS Windows, if I run any command with a
> $<subcommand>, the second and subsequent times, it warns me about
> changing the temp file, e.g.:
>
> 3Zz80A has changed since visited or saved. Save anyway? (yes or no)
>
> This probably shouldn't do that, and it could definitely cause issues in
> the tests, which can't prompt the user for things like that.
I guess this means you somehow reuse the same temporary file on
MS-Windows? Maybe change the test a bit so that the file name
definitely differs?
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Mon, 15 Aug 2022 18:09:02 GMT)
Full text and
rfc822 format available.
Message #47 received at 57129 <at> debbugs.gnu.org (full text, mbox):
On 8/15/2022 9:49 AM, Eli Zaretskii wrote:
> None of these reverts fixes the issue. I tried both the tests in
> eshell-tests.el and the problematic command:
>
> *cat $<echo $eshell-in-pipeline-p | echo> | *cat
Thanks for testing. I might have to get an MS-Windows dev environment
set up to dig into this further. I'm a bit worried that this is related
to bug#55590, where I had to jump through some awkward hoops to fix a
somewhat-related issue with Eshell subcommands. I think that's due to
'eshell-do-eval' being written before lexical binding existed, and so
the variable scoping doesn't work quite as expected. It could be that
you only see this failure when using external commands due to some
differences in timing.
If that's the problem here too, it would be a pain to fix (but it would
likely also mean that this bug has been present for a long time, and my
tests/fixes have just brought it to the surface). As mentioned in
bug#55590, I think it'd be nice to rip out 'eshell-do-eval' and replace
it with the generator.el machinery, since they're conceptually pretty
similar. That's easier said than done though...
>> I did notice one weird issue though, and maybe this has something to do
>> with the problem: on MS Windows, if I run any command with a
>> $<subcommand>, the second and subsequent times, it warns me about
>> changing the temp file, e.g.:
>>
>> 3Zz80A has changed since visited or saved. Save anyway? (yes or no)
>>
>> This probably shouldn't do that, and it could definitely cause issues in
>> the tests, which can't prompt the user for things like that.
>
> I guess this means you somehow reuse the same temporary file on
> MS-Windows? Maybe change the test a bit so that the file name
> definitely differs?
I need to investigate this a bit further, since on GNU/Linux, I get a
new temp file every time. I think that's the behavior we want, or
failing that, we could at least kill the temp file's buffer after we're
done with it. I don't think there's much value in leaving it around.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Mon, 15 Aug 2022 18:22:01 GMT)
Full text and
rfc822 format available.
Message #50 received at 57129 <at> debbugs.gnu.org (full text, mbox):
> Cc: larsi <at> gnus.org, 57129 <at> debbugs.gnu.org
> From: Jim Porter <jporterbugs <at> gmail.com>
> Date: Mon, 15 Aug 2022 11:08:45 -0700
>
> >> I did notice one weird issue though, and maybe this has something to do
> >> with the problem: on MS Windows, if I run any command with a
> >> $<subcommand>, the second and subsequent times, it warns me about
> >> changing the temp file, e.g.:
> >>
> >> 3Zz80A has changed since visited or saved. Save anyway? (yes or no)
> >>
> >> This probably shouldn't do that, and it could definitely cause issues in
> >> the tests, which can't prompt the user for things like that.
> >
> > I guess this means you somehow reuse the same temporary file on
> > MS-Windows? Maybe change the test a bit so that the file name
> > definitely differs?
>
> I need to investigate this a bit further, since on GNU/Linux, I get a
> new temp file every time.
Can you tell how you create these temporary files? Or point me to the
code which does that?
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Mon, 15 Aug 2022 18:31:01 GMT)
Full text and
rfc822 format available.
Message #53 received at 57129 <at> debbugs.gnu.org (full text, mbox):
On 8/15/2022 11:21 AM, Eli Zaretskii wrote:
>> Cc: larsi <at> gnus.org, 57129 <at> debbugs.gnu.org
>> From: Jim Porter <jporterbugs <at> gmail.com>
>> Date: Mon, 15 Aug 2022 11:08:45 -0700
>>
>> I need to investigate this a bit further, since on GNU/Linux, I get a
>> new temp file every time.
>
> Can you tell how you create these temporary files? Or point me to the
> code which does that?
The temp files are created by Eshell in lisp/eshell/esh-var.el in the
function 'eshell-parse-variable-ref', specifically in the part starting
with:
(eq (char-after) ?\<)
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Mon, 15 Aug 2022 18:59:02 GMT)
Full text and
rfc822 format available.
Message #56 received at 57129 <at> debbugs.gnu.org (full text, mbox):
> Cc: larsi <at> gnus.org, 57129 <at> debbugs.gnu.org
> From: Jim Porter <jporterbugs <at> gmail.com>
> Date: Mon, 15 Aug 2022 11:30:07 -0700
>
> >> I need to investigate this a bit further, since on GNU/Linux, I get a
> >> new temp file every time.
> >
> > Can you tell how you create these temporary files? Or point me to the
> > code which does that?
>
> The temp files are created by Eshell in lisp/eshell/esh-var.el in the
> function 'eshell-parse-variable-ref', specifically in the part starting
> with:
>
> (eq (char-after) ?\<)
Ah, okay. It's a (mis)feature of Gnulib's gen_tempname function
(which is the guts of make-temp-file) in its implementation for
MS-Windows (and maybe other platforms?): it always begins from the
same "random" characters in the file name, and only generates other
random characters if there's already a file by that name. So if you
are careful and delete the temporary file each time after usage, and
never need more than one temporary file at the same time, you will get
the same name every call.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Mon, 15 Aug 2022 20:56:02 GMT)
Full text and
rfc822 format available.
Message #59 received at 57129 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
On 8/15/22 11:58, Eli Zaretskii wrote:
> Ah, okay. It's a (mis)feature of Gnulib's gen_tempname function
> (which is the guts of make-temp-file) in its implementation for
> MS-Windows (and maybe other platforms?): it always begins from the
> same "random" characters in the file name, and only generates other
> random characters if there's already a file by that name.
Not sure I'd call it a misfeature, as gen_tempname is generating a
uniquely-named file that is exclusive to Emacs, which is all it's
supposed to do.
I do see a comment saying that gen_tempname generates "hard-to-predict"
names, which as you note is not correct on MS-DOS, nor even strictly
speaking on all POSIX platforms. I installed the first attached patch
into Gnulib to fix that comment.
I'm not sure I'm entirely understanding the Emacs problem, but it
appears to be that Emacs has its own set of filenames that it thinks it
knows about, and Emacs wants the new temporary file's name to not be a
member of that set. If I'm right, does the second attached patch (this
patch is to Emacs) address the problem? I haven't tested or installed it.
[0001-tempname-remove-incorrect-comment.patch (text/x-patch, attachment)]
[0001-Don-t-create-temp-file-with-same-name-as-visited.patch (text/x-patch, attachment)]
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Mon, 15 Aug 2022 21:23:02 GMT)
Full text and
rfc822 format available.
Message #62 received at 57129 <at> debbugs.gnu.org (full text, mbox):
Paul Eggert wrote:
> I do see a comment saying that gen_tempname generates "hard-to-predict"
> names, which as you note is not correct on MS-DOS, nor even strictly
> speaking on all POSIX platforms. I installed the first attached patch
> into Gnulib to fix that comment.
Another comment fix is as below. Note that both comment fixes need to be
propagated to glibc.
2022-08-15 Bruno Haible <bruno <at> clisp.org>
tempname: Fix a comment.
* lib/tempname.c (try_tempname_len): Use of entropy makes the function
more, not less, secure.
diff --git a/lib/tempname.c b/lib/tempname.c
index 75a939e571..e6520191d7 100644
--- a/lib/tempname.c
+++ b/lib/tempname.c
@@ -273,7 +273,7 @@ try_tempname_len (char *tmpl, int suffixlen, void *args,
/* Whether to consume entropy when acquiring random bits. On the
first try it's worth the entropy cost with __GT_NOCREATE, which
is inherently insecure and can use the entropy to make it a bit
- less secure. On the (rare) second and later attempts it might
+ more secure. On the (rare) second and later attempts it might
help against DoS attacks. */
bool use_getrandom = tryfunc == try_nocreate;
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Tue, 16 Aug 2022 11:05:02 GMT)
Full text and
rfc822 format available.
Message #65 received at 57129 <at> debbugs.gnu.org (full text, mbox):
> Date: Mon, 15 Aug 2022 13:55:35 -0700
> Cc: larsi <at> gnus.org, 57129 <at> debbugs.gnu.org, Jim Porter
> <jporterbugs <at> gmail.com>, Gnulib bugs <bug-gnulib <at> gnu.org>
> From: Paul Eggert <eggert <at> cs.ucla.edu>
>
> On 8/15/22 11:58, Eli Zaretskii wrote:
>
> > Ah, okay. It's a (mis)feature of Gnulib's gen_tempname function
> > (which is the guts of make-temp-file) in its implementation for
> > MS-Windows (and maybe other platforms?): it always begins from the
> > same "random" characters in the file name, and only generates other
> > random characters if there's already a file by that name.
>
> Not sure I'd call it a misfeature
I didn't say "misfeature". Evidently, for you and perhaps others it's
a feature.
> as gen_tempname is generating a uniquely-named file that is
> exclusive to Emacs, which is all it's supposed to do.
Then perhaps calling it from make-temp-file in Emacs is a mistake,
because that function's doc string says
Create a temporary file.
The returned file name (created by appending some random characters at the end
of PREFIX, and expanding against ‘temporary-file-directory’ if necessary),
is guaranteed to point to a newly created file.
When you tell someone that the file name will include "some random
characters", they don't expect that they will be the same "random
characters" every call, just like saying that some function generates
random numbers doesn't imply it will be the same number every call.
> I'm not sure I'm entirely understanding the Emacs problem, but it
> appears to be that Emacs has its own set of filenames that it thinks it
> knows about, and Emacs wants the new temporary file's name to not be a
> member of that set. If I'm right, does the second attached patch (this
> patch is to Emacs) address the problem? I haven't tested or installed it.
I don't think that's the problem, no. The problem is that callers of
make-temp-file reasonably expect it to return a new name every call.
And evidently, it does that on GNU/Linux (not sure how, given that the
tempname.c code is supposed to be in glibc?). So if there's no
intention to change gen_tempname on non-glibc platforms so that it
doesn't repeat the same "random characters" upon each call, I think
Emacs should call a different function to generate "random" names of
temporary files, for consistency across platforms if not for other
reasons.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Tue, 16 Aug 2022 13:36:01 GMT)
Full text and
rfc822 format available.
Message #68 received at 57129 <at> debbugs.gnu.org (full text, mbox):
Eli Zaretskii wrote:
> The problem is that callers of
> make-temp-file reasonably expect it to return a new name every call.
> And evidently, it does that on GNU/Linux (not sure how, given that the
> tempname.c code is supposed to be in glibc?). So if there's no
> intention to change gen_tempname on non-glibc platforms so that it
> doesn't repeat the same "random characters" upon each call, I think
> Emacs should call a different function to generate "random" names of
> temporary files, for consistency across platforms if not for other
> reasons.
You are making incorrect assumptions or hypotheses. I am adding the unit
test below. It proves that (at least) on Linux, FreeBSD 11, NetBSD 7,
OpenBSD 6.0, macOS, AIX 7.1, Solaris 10, Cygwin, and native Windows,
gen_tempname *does* return a new file name on each call, with a very high
probability.
So, gen_tempname does *not* repeat the same "random characters" upon each call.
2022-08-16 Bruno Haible <bruno <at> clisp.org>
tempname: Add tests.
* tests/test-tempname.c: New file.
* modules/tempname-tests: New file.
=========================== tests/test-tempname.c ==========================
/* Test of creating a temporary file.
Copyright (C) 2022 Free Software Foundation, Inc.
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <https://www.gnu.org/licenses/>. */
/* Written by Bruno Haible <bruno <at> clisp.org>, 2022. */
#include <config.h>
#include "tempname.h"
#include <string.h>
#include <unistd.h>
#include "macros.h"
int
main ()
{
/* Verify that two consecutive calls to gen_tempname return two different
file names, with a high probability. */
const char *templ18 = "gl-temp-XXXXXX.xyz";
char filename1[18 + 1];
char filename2[18 + 1];
strcpy (filename1, templ18);
int fd1 = gen_tempname (filename1, strlen (".xyz"), 0, GT_FILE);
ASSERT (fd1 >= 0);
strcpy (filename2, templ18);
int fd2 = gen_tempname (filename2, strlen (".xyz"), 0, GT_FILE);
ASSERT (fd2 >= 0);
/* With 6 'X' and a good pseudo-random number generator behind the scenes,
the probability of getting the same file name twice in a row should be
1/62^6 < 1/10^10. */
ASSERT (strcmp (filename1, filename2) != 0);
/* Clean up. */
close (fd1);
close (fd2);
unlink (filename1);
unlink (filename2);
return 0;
}
=========================== modules/tempname-tests =========================
Files:
tests/test-tempname.c
tests/macros.h
Depends-on:
unlink
configure.ac:
Makefile.am:
TESTS += test-tempname
check_PROGRAMS += test-tempname
test_tempname_LDADD = $(LDADD) $(LIB_GETRANDOM) $(LIB_CLOCK_GETTIME)
============================================================================
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Tue, 16 Aug 2022 13:52:01 GMT)
Full text and
rfc822 format available.
Message #71 received at 57129 <at> debbugs.gnu.org (full text, mbox):
> From: Bruno Haible <bruno <at> clisp.org>
> Cc: larsi <at> gnus.org, 57129 <at> debbugs.gnu.org, jporterbugs <at> gmail.com
> Date: Tue, 16 Aug 2022 15:35:17 +0200
>
> Eli Zaretskii wrote:
> > The problem is that callers of
> > make-temp-file reasonably expect it to return a new name every call.
> > And evidently, it does that on GNU/Linux (not sure how, given that the
> > tempname.c code is supposed to be in glibc?). So if there's no
> > intention to change gen_tempname on non-glibc platforms so that it
> > doesn't repeat the same "random characters" upon each call, I think
> > Emacs should call a different function to generate "random" names of
> > temporary files, for consistency across platforms if not for other
> > reasons.
>
> You are making incorrect assumptions or hypotheses.
They are neither assumptions nor hypotheses. They are what I actually
saw by stepping with a debugger into the code.
> I am adding the unit test below. It proves that (at least) on Linux,
> FreeBSD 11, NetBSD 7, OpenBSD 6.0, macOS, AIX 7.1, Solaris 10,
> Cygwin, and native Windows, gen_tempname *does* return a new file
> name on each call, with a very high probability.
>
> So, gen_tempname does *not* repeat the same "random characters" upon each call.
Well, it does here, when called from Emacs in the particular scenario
of the unit test I was looking into.
Looking at your test program, I see that you generate the seconds file
name without deleting the first one. When a file by the first
generated name already exists, gen_tempname will indeed generate a
different name. But in the scenario I described, Emacs creates one
temporary file, uses it, then deletes it, and only after that creates
another file. In terms of your test program, you need to move the
first unlink call to before the second call to gen_tempname. Then
your test program will faithfully reproduce what Emacs does in the
case in point.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Tue, 16 Aug 2022 14:41:02 GMT)
Full text and
rfc822 format available.
Message #74 received at 57129 <at> debbugs.gnu.org (full text, mbox):
Eli Zaretskii wrote:
> Looking at your test program, I see that you generate the seconds file
> name without deleting the first one. When a file by the first
> generated name already exists, gen_tempname will indeed generate a
> different name. But in the scenario I described, Emacs creates one
> temporary file, uses it, then deletes it, and only after that creates
> another file.
Why would it be important for the second temporary file to bear a different
name, once the first one is gone? I mean, process IDs get reused over time;
socket numbers get reused over time; what is wrong with reusing a file name,
once the older file is gone?
Maybe the problem is that the file gets deleted too early, when some parts
of Emacs still reference it?
Bruno
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Tue, 16 Aug 2022 16:26:02 GMT)
Full text and
rfc822 format available.
Message #77 received at 57129 <at> debbugs.gnu.org (full text, mbox):
> From: Bruno Haible <bruno <at> clisp.org>
> Cc: eggert <at> cs.ucla.edu, bug-gnulib <at> gnu.org, larsi <at> gnus.org, 57129 <at> debbugs.gnu.org, jporterbugs <at> gmail.com
> Date: Tue, 16 Aug 2022 16:40:16 +0200
>
> Eli Zaretskii wrote:
> > Looking at your test program, I see that you generate the seconds file
> > name without deleting the first one. When a file by the first
> > generated name already exists, gen_tempname will indeed generate a
> > different name. But in the scenario I described, Emacs creates one
> > temporary file, uses it, then deletes it, and only after that creates
> > another file.
>
> Why would it be important for the second temporary file to bear a different
> name, once the first one is gone? I mean, process IDs get reused over time;
> socket numbers get reused over time; what is wrong with reusing a file name,
> once the older file is gone?
Because the Emacs Lisp program expected that, based on what
make-temp-file in Emacs promises (see the "random characters" part),
and because that's how it behaves on GNU/Linux. Then the same program
behaved differently on MS-Windows, and the programmer was stumped.
Sure, it's possible to write the same program somewhat differently,
carefully working around this issue, but my point is that such
functions should ideally present the same behavior traits on all
systems, otherwise the portability is hampered.
> Maybe the problem is that the file gets deleted too early, when some parts
> of Emacs still reference it?
The buffer which visited that file still exists, and still records the
name of the file, yes. And then, when the program writes to another
file created by another call to make-temp-file, it actually writes to
a file that some buffer still visits, and in Emacs that triggers a
prompt to the user to refresh the buffer. The programmer didn't
expect that because it is natural to expect each call to a function
that creates a temporary file to create a file under a new name, not
reuse the same name. E.g., similar facilities in the Standard C
library exhibit that behavior. So the program was written assuming
that the second write was to an entirely different and unrelated file.
And again, my main point is: why does this work differently on
different platforms? If you consider it OK to return the same file
name each time, provided that no such file exists, then why doesn't
that function do it on GNU/Linux?
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Tue, 16 Aug 2022 16:55:02 GMT)
Full text and
rfc822 format available.
Message #80 received at 57129 <at> debbugs.gnu.org (full text, mbox):
On 8/16/22 09:25, Eli Zaretskii wrote:
> The programmer didn't
> expect that because it is natural to expect each call to a function
> that creates a temporary file to create a file under a new name, not
> reuse the same name.
Regardless of whether the programmer expects a random name or a
predictable-but-unique name, there are only a finite number of names to
choose from so the programmer must take into account the possibility
that the chosen name clashes with names that the programmer would prefer
not to be chosen.
This means an Emacs patch such as [1] is needed regardless of whether
Gnulib's gen_tempname is changed to be more random than it is. Although
it's true that the bug fixed by [1] is less likely if gen_tempname
generates more-random names, the bug can occur no matter what we do
about gen_tempname.
[1]
https://debbugs.gnu.org/cgi/bugreport.cgi?filename=0001-Don-t-create-temp-file-with-same-name-as-visited.patch;bug=57129;msg=59;att=2
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Tue, 16 Aug 2022 17:06:01 GMT)
Full text and
rfc822 format available.
Message #83 received at 57129 <at> debbugs.gnu.org (full text, mbox):
> Date: Tue, 16 Aug 2022 09:54:25 -0700
> Cc: bug-gnulib <at> gnu.org, larsi <at> gnus.org, 57129 <at> debbugs.gnu.org,
> jporterbugs <at> gmail.com, Bruno Haible <bruno <at> clisp.org>
> From: Paul Eggert <eggert <at> cs.ucla.edu>
>
> On 8/16/22 09:25, Eli Zaretskii wrote:
> > The programmer didn't
> > expect that because it is natural to expect each call to a function
> > that creates a temporary file to create a file under a new name, not
> > reuse the same name.
>
> Regardless of whether the programmer expects a random name or a
> predictable-but-unique name, there are only a finite number of names to
> choose from so the programmer must take into account the possibility
> that the chosen name clashes with names that the programmer would prefer
> not to be chosen.
Then what is this comment and the following assertion in Bruno's code
about?
/* With 6 'X' and a good pseudo-random number generator behind the scenes,
the probability of getting the same file name twice in a row should be
1/62^6 < 1/10^10. */
ASSERT (strcmp (filename1, filename2) != 0);
That "finite number" of 62^6 sounds pretty close to infinity to me!
> This means an Emacs patch such as [1] is needed regardless of whether
> Gnulib's gen_tempname is changed to be more random than it is. Although
> it's true that the bug fixed by [1] is less likely if gen_tempname
> generates more-random names, the bug can occur no matter what we do
> about gen_tempname.
No, sorry. I object to this patch, because it hides the problem under
the carpet. That there's a file-visiting buffer that records the name
of a temporary file is just one case where this issue gets in the way.
The general problem is still there. And it isn't an Emacs problem, so
Emacs is not the place to solve it.
Therefore, if there's no intention to fix this in Gnulib, I'm going to
update the documentation of make-temp-file so that Emacs users and
programmers will be informed about that and will be careful enough to
side-step these issues in all the situations. (Not that I understand
why won't Gnulib provide consistent behavior on all platforms, but I
guess it's above my pay grade.)
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Tue, 16 Aug 2022 17:27:02 GMT)
Full text and
rfc822 format available.
Message #86 received at 57129 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
On 8/16/22 09:25, Eli Zaretskii wrote:
> why does this work differently on
> different platforms?
Platforms that lack clock_gettime and CLOCK_MONOTONIC fall back on a
deterministic algorithm. Presumably MS-DOS one of the few such
platforms, which is why the problem is observed only on MS-DOS.
How about something like the attached patch to Gnulib's lib/tempname.c?
If I understand things correctly this should make the names random
enough on MS-DOS, though Emacs itself still needs a patch as I mentioned
a few minutes ago.
If this patch isn't good enough for MS-DOS, what sort of random bits are
easily available on that platform? We don't need anything
cryptographically secure; a higher-res clock should suffice.
[rand.diff (text/x-patch, attachment)]
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Tue, 16 Aug 2022 17:30:02 GMT)
Full text and
rfc822 format available.
Message #89 received at 57129 <at> debbugs.gnu.org (full text, mbox):
Thanks for the explanations, Eli.
Eli Zaretskii wrote:
> > Maybe the problem is that the file gets deleted too early, when some parts
> > of Emacs still reference it?
>
> The buffer which visited that file still exists, and still records the
> name of the file, yes. And then, when the program writes to another
> file created by another call to make-temp-file, it actually writes to
> a file that some buffer still visits, and in Emacs that triggers a
> prompt to the user to refresh the buffer.
That is a reasonable behaviour for a text editor — but only for file
names that are explicitly controlled by some program or by the user,
not for temporary files.
> The programmer didn't
> expect that because it is natural to expect each call to a function
> that creates a temporary file to create a file under a new name, not
> reuse the same name.
This is an incorrect assumption. Just like socket numbers are randomly
generated in some situations, temp file names are random, and you can't
make assumptions about the resulting file name.
How about storing, as an attribute of the buffer, a boolean that tells
whether the underlying file name is randomly generated (i.e. a temp file),
and query this boolean before prompting to the user to refresh the buffer?
Bruno
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Tue, 16 Aug 2022 17:32:02 GMT)
Full text and
rfc822 format available.
Message #92 received at 57129 <at> debbugs.gnu.org (full text, mbox):
On 8/16/22 10:04, Eli Zaretskii wrote:
> That "finite number" of 62^6 sounds pretty close to infinity to me!
It's only 57 billion or so. That's not infinity, considering all the
Emacs sessions out there. If Emacs's success grows, almost surely
someone will run into this unlikely bug. Since we have the patch to
prevent it, why not install it?
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Tue, 16 Aug 2022 17:42:02 GMT)
Full text and
rfc822 format available.
Message #95 received at 57129 <at> debbugs.gnu.org (full text, mbox):
> Cc: jporterbugs <at> gmail.com, larsi <at> gnus.org, bug-gnulib <at> gnu.org,
> 57129 <at> debbugs.gnu.org, bruno <at> clisp.org
> Date: Tue, 16 Aug 2022 20:04:49 +0300
> From: Eli Zaretskii <eliz <at> gnu.org>
>
> Therefore, if there's no intention to fix this in Gnulib, I'm going to
> update the documentation of make-temp-file so that Emacs users and
> programmers will be informed about that and will be careful enough to
> side-step these issues in all the situations. (Not that I understand
> why won't Gnulib provide consistent behavior on all platforms, but I
> guess it's above my pay grade.)
And btw, looking closer, I see that this is a regression in Emacs 28,
which was caused by a change in Gnulib: the versions of tempname.c
that we used up to and including Emacs 27.2 used gettimeofday and the
PID to generate the file name, so it was more random than the code in
current Gnulib, and in particular the sequence of
generate file-name1
delete file-name1
generate file-name2
would produce file-name2 that is different from file-name1. With the
current code in Gnulib, something similar happens only on glibc
systems. So I hope the code in Gnulib's tempname.c will be amended to
call clock_gettime or something similar, so that the names on
non-glibc platforms could also be random. Or, failing that, at least
that gen_tempname in glibc would behave identically to other systems,
i.e. start from a fixed "random" value.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Tue, 16 Aug 2022 17:48:02 GMT)
Full text and
rfc822 format available.
Message #98 received at 57129 <at> debbugs.gnu.org (full text, mbox):
> Date: Tue, 16 Aug 2022 10:25:57 -0700
> Cc: bug-gnulib <at> gnu.org, larsi <at> gnus.org, 57129 <at> debbugs.gnu.org,
> jporterbugs <at> gmail.com
> From: Paul Eggert <eggert <at> cs.ucla.edu>
>
> On 8/16/22 09:25, Eli Zaretskii wrote:
> > why does this work differently on
> > different platforms?
>
> Platforms that lack clock_gettime and CLOCK_MONOTONIC fall back on a
> deterministic algorithm. Presumably MS-DOS one of the few such
> platforms, which is why the problem is observed only on MS-DOS.
(Why are you talking about MS-DOS?)
MinGW does have clock_gettime, but we don't want to use it, because
one of the MinGW flavors (MinGW64) implements that function in
libwinpthreads, and that makes Emacs depend on libwinpthreads DLL,
which we want to avoid.
> How about something like the attached patch to Gnulib's lib/tempname.c?
Thanks, but why not use 'random' instead? Emacs does have it on all
platforms, including MS-Windows. AFAIU, it's better than 'rand'.
> If I understand things correctly this should make the names random
> enough on MS-DOS, though Emacs itself still needs a patch as I mentioned
> a few minutes ago.
Why would Emacs need that patch?
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Tue, 16 Aug 2022 17:54:01 GMT)
Full text and
rfc822 format available.
Message #101 received at 57129 <at> debbugs.gnu.org (full text, mbox):
> From: Bruno Haible <bruno <at> clisp.org>
> Cc: eggert <at> cs.ucla.edu, bug-gnulib <at> gnu.org, larsi <at> gnus.org, 57129 <at> debbugs.gnu.org, jporterbugs <at> gmail.com
> Date: Tue, 16 Aug 2022 19:29:33 +0200
>
> > The buffer which visited that file still exists, and still records the
> > name of the file, yes. And then, when the program writes to another
> > file created by another call to make-temp-file, it actually writes to
> > a file that some buffer still visits, and in Emacs that triggers a
> > prompt to the user to refresh the buffer.
>
> That is a reasonable behaviour for a text editor — but only for file
> names that are explicitly controlled by some program or by the user,
> not for temporary files.
Why not for temporary files?
Emacs has a with-temp-file macro, which generates a temporary file
name, executes the body of the macro with a variable bound to the file
name, then deletes the file. Very handy for writing test suites. We
don't usually bother deleting the buffers where the file was processed
in those tests, because the test suite runs in batch mode, and Emacs
exits after running the tests, thus cleaning up.
Having to carefully make sure no such buffers were left from the
previous test is a nuisance.
> > The programmer didn't
> > expect that because it is natural to expect each call to a function
> > that creates a temporary file to create a file under a new name, not
> > reuse the same name.
>
> This is an incorrect assumption. Just like socket numbers are randomly
> generated in some situations, temp file names are random, and you can't
> make assumptions about the resulting file name.
People get their ideas from other similar APIs, and functions like
tmpfile in C do generate a different name each call.
> How about storing, as an attribute of the buffer, a boolean that tells
> whether the underlying file name is randomly generated (i.e. a temp file),
> and query this boolean before prompting to the user to refresh the buffer?
That's a terrible complications, especially since the solution is so
close at hand, and it makes gen_tempname behave on Windows as it does
on GNU/Linux, and as it behaved just a year or two ago.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Tue, 16 Aug 2022 17:57:02 GMT)
Full text and
rfc822 format available.
Message #104 received at 57129 <at> debbugs.gnu.org (full text, mbox):
> Date: Tue, 16 Aug 2022 10:30:54 -0700
> Cc: bug-gnulib <at> gnu.org, larsi <at> gnus.org, 57129 <at> debbugs.gnu.org,
> jporterbugs <at> gmail.com, bruno <at> clisp.org
> From: Paul Eggert <eggert <at> cs.ucla.edu>
>
> On 8/16/22 10:04, Eli Zaretskii wrote:
> > That "finite number" of 62^6 sounds pretty close to infinity to me!
>
> It's only 57 billion or so. That's not infinity
Yeah, right.
> Since we have the patch to prevent it, why not install it?
Because it isn't needed, and is a weird thing to do. It also slows
down make-temp-file, especially when a session has a lot of buffers
and with remote file names. So thanks, but no, thanks.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Tue, 16 Aug 2022 19:12:01 GMT)
Full text and
rfc822 format available.
Message #107 received at 57129 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
On 8/16/22 10:47, Eli Zaretskii wrote:
> (Why are you talking about MS-DOS?)
I mistakenly thought it was an MS-DOS problem because tempname.c
ordinarily would use clock_gettime on MinGW. I didn't know Emacs
'configure' deliberately suppresses MinGW's clock_gettime.
> Thanks, but why not use 'random' instead? Emacs does have it on all
> platforms, including MS-Windows. AFAIU, it's better than 'rand'.
If the code used 'random' then the Gnulib 'tempname' module would need
to add a dependency on the Gnulib 'random' module, which would in turn
add a cascading dependency on Gnulib's 'random_r' module. It's better to
avoid this dependency if we can easily do so.
Come to think of it, we don't need to use 'rand' either, since
tempname.c already has a good-enough pseudorandom generator. I installed
into Gnulib the attached patch, which I hope fixes the Emacs problem
without changing glibc's generated code (once this gets migrated back
into glibc).
>> If I understand things correctly this should make the names random
>> enough on MS-DOS, though Emacs itself still needs a patch as I mentioned
>> a few minutes ago.
>
> Why would Emacs need that patch?
In another part of this thread you rejected that patch, on the grounds
that fixing the unlikely Emacs bug is more trouble than it's worth. So
I'll drop that suggestion.
[0001-tempname-generate-better-names-for-MinGW-Emacs.patch (text/x-patch, attachment)]
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Tue, 16 Aug 2022 20:00:02 GMT)
Full text and
rfc822 format available.
Message #110 received at 57129 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
Eli Zaretskii wrote:
> Looking at your test program, I see that you generate the seconds file
> name without deleting the first one. When a file by the first
> generated name already exists, gen_tempname will indeed generate a
> different name. But in the scenario I described, Emacs creates one
> temporary file, uses it, then deletes it, and only after that creates
> another file.
I'm adding this scenario to the unit test. Still, the unit test succeeds
when run once on:
Linux, FreeBSD, NetBSD, OpenBSD, macOS, Solaris, Cygwin, native Windows.
Since this contradicts what you debugged on mingw, I ran the test 10000
times on native Windows. Result:
- on 32-bit mingw, no failure.
- on 64-bit mingw, around 30 failures (or around 10 failures with Paul's
newest patch). That is, a probability of ca. 0.3% of getting the same
file name as on the previous call.
Bruno
[0001-tempname-Add-more-tests.patch (text/x-patch, attachment)]
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Tue, 16 Aug 2022 20:07:02 GMT)
Full text and
rfc822 format available.
Message #113 received at 57129 <at> debbugs.gnu.org (full text, mbox):
Eli Zaretskii wrote:
> Emacs has a with-temp-file macro, which generates a temporary file
> name, executes the body of the macro with a variable bound to the file
> name, then deletes the file. Very handy for writing test suites.
Since, even with Paul's newest patch, gen_tempname on 64-bit mingw
produces the same file name as on the previous call with a probability
of about 0.1%, you still need to be prepared to see the original problem
occasionally.
Bruno
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Tue, 16 Aug 2022 20:13:01 GMT)
Full text and
rfc822 format available.
Message #116 received at 57129 <at> debbugs.gnu.org (full text, mbox):
Paul Eggert wrote:
> Emacs 'configure' deliberately suppresses MinGW's clock_gettime.
Ah, this explains why Eli saw the problem in Emacs in a deterministic
manner, whereas I see it only with 0.3% probability in my tests on 64-bit
mingw.
> I installed
> into Gnulib the attached patch, which I hope fixes the Emacs problem
> without changing glibc's generated code (once this gets migrated back
> into glibc).
In 10000 runs on 64-bit mingw, your patch went from 27 test failures
to 11 test failures. So, we're still at ca. 0.1% probability (*), far
above the 62^-6 or 10^-10 probability that one would have hoped for.
(*) I wouldn't consider these measurements reliable. Maybe the probability
did not change, and what I'm seeing is merely a measurement fluke.
Bruno
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Tue, 16 Aug 2022 20:54:01 GMT)
Full text and
rfc822 format available.
Message #119 received at 57129 <at> debbugs.gnu.org (full text, mbox):
On 8/16/22 12:59, Bruno Haible wrote:
> Since this contradicts what you debugged on mingw, I ran the test 10000
> times on native Windows. Result:
> - on 32-bit mingw, no failure.
> - on 64-bit mingw, around 30 failures (or around 10 failures with Paul's
> newest patch). That is, a probability of ca. 0.3% of getting the same
> file name as on the previous call.
That's odd, for two reasons. First, 64-bit and 32-bit mingw shouldn't
differ, as they both use uint_fast64_t which should be the same width on
both platforms. Second, I could not reproduce the problem on 64-bit
Ubuntu x86-64 (after altering tempname.c to always define
HAS_CLOCK_ENTROPY to false) test-tempname always succeeded in 10000 tries.
Could you investigate further why mingw 64-bit fails?
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Wed, 17 Aug 2022 11:10:01 GMT)
Full text and
rfc822 format available.
Message #122 received at 57129 <at> debbugs.gnu.org (full text, mbox):
> Date: Tue, 16 Aug 2022 12:11:24 -0700
> Cc: bruno <at> clisp.org, bug-gnulib <at> gnu.org, larsi <at> gnus.org,
> 57129 <at> debbugs.gnu.org, jporterbugs <at> gmail.com
> From: Paul Eggert <eggert <at> cs.ucla.edu>
>
> > Thanks, but why not use 'random' instead? Emacs does have it on all
> > platforms, including MS-Windows. AFAIU, it's better than 'rand'.
>
> If the code used 'random' then the Gnulib 'tempname' module would need
> to add a dependency on the Gnulib 'random' module, which would in turn
> add a cascading dependency on Gnulib's 'random_r' module. It's better to
> avoid this dependency if we can easily do so.
>
> Come to think of it, we don't need to use 'rand' either, since
> tempname.c already has a good-enough pseudorandom generator. I installed
> into Gnulib the attached patch, which I hope fixes the Emacs problem
> without changing glibc's generated code (once this gets migrated back
> into glibc).
Thanks.
Would it be possible to cherry-pick this to the emacs-28 branch, so
that Emacs 28.2 would have this fixed?
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Wed, 17 Aug 2022 11:31:02 GMT)
Full text and
rfc822 format available.
Message #125 received at 57129 <at> debbugs.gnu.org (full text, mbox):
> From: Bruno Haible <bruno <at> clisp.org>
> Cc: eggert <at> cs.ucla.edu, bug-gnulib <at> gnu.org, larsi <at> gnus.org, 57129 <at> debbugs.gnu.org, jporterbugs <at> gmail.com
> Date: Tue, 16 Aug 2022 22:06:13 +0200
>
> Eli Zaretskii wrote:
> > Emacs has a with-temp-file macro, which generates a temporary file
> > name, executes the body of the macro with a variable bound to the file
> > name, then deletes the file. Very handy for writing test suites.
>
> Since, even with Paul's newest patch, gen_tempname on 64-bit mingw
> produces the same file name as on the previous call with a probability
> of about 0.1%, you still need to be prepared to see the original problem
> occasionally.
I'll take that risk, thank you.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Thu, 18 Aug 2022 06:06:02 GMT)
Full text and
rfc822 format available.
Message #128 received at 57129 <at> debbugs.gnu.org (full text, mbox):
On 8/17/22 04:08, Eli Zaretskii wrote:
> Would it be possible to cherry-pick this to the emacs-28 branch, so
> that Emacs 28.2 would have this fixed?
I installed that.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Thu, 18 Aug 2022 06:46:01 GMT)
Full text and
rfc822 format available.
Message #131 received at 57129 <at> debbugs.gnu.org (full text, mbox):
> Date: Wed, 17 Aug 2022 23:05:33 -0700
> Cc: bruno <at> clisp.org, bug-gnulib <at> gnu.org, larsi <at> gnus.org,
> 57129 <at> debbugs.gnu.org, jporterbugs <at> gmail.com
> From: Paul Eggert <eggert <at> cs.ucla.edu>
>
> On 8/17/22 04:08, Eli Zaretskii wrote:
> > Would it be possible to cherry-pick this to the emacs-28 branch, so
> > that Emacs 28.2 would have this fixed?
>
> I installed that.
Thanks!
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Sat, 20 Aug 2022 18:04:01 GMT)
Full text and
rfc822 format available.
Message #134 received at 57129 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
On 8/15/2022 11:58 AM, Eli Zaretskii wrote:
>> Cc: larsi <at> gnus.org, 57129 <at> debbugs.gnu.org
>> From: Jim Porter <jporterbugs <at> gmail.com>
>> Date: Mon, 15 Aug 2022 11:30:07 -0700
>>
>> The temp files are created by Eshell in lisp/eshell/esh-var.el in the
>> function 'eshell-parse-variable-ref', specifically in the part starting
>> with:
>>
>> (eq (char-after) ?\<)
>
> Ah, okay. It's a (mis)feature of Gnulib's gen_tempname function
> (which is the guts of make-temp-file) in its implementation for
> MS-Windows (and maybe other platforms?): it always begins from the
> same "random" characters in the file name, and only generates other
> random characters if there's already a file by that name. So if you
> are careful and delete the temporary file each time after usage, and
> never need more than one temporary file at the same time, you will get
> the same name every call.
In addition to the changes to temporary file name generation, I think it
would be useful for Eshell to kill the temporary buffer too. If you use
this feature in Eshell a lot, the temporary buffers could pile up,
consuming memory for no real benefit. (A user who wanted the buffer to
stick around would probably redirect to a non-temporary file, or even
just to a buffer.) Attached is a patch to do this.
[0001-Kill-the-buffer-associated-with-a-temp-file-when-usi.patch (text/plain, attachment)]
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Sat, 20 Aug 2022 18:15:01 GMT)
Full text and
rfc822 format available.
Message #137 received at 57129 <at> debbugs.gnu.org (full text, mbox):
> Cc: larsi <at> gnus.org, 57129 <at> debbugs.gnu.org
> From: Jim Porter <jporterbugs <at> gmail.com>
> Date: Sat, 20 Aug 2022 11:03:27 -0700
>
> In addition to the changes to temporary file name generation, I think it
> would be useful for Eshell to kill the temporary buffer too. If you use
> this feature in Eshell a lot, the temporary buffers could pile up,
> consuming memory for no real benefit. (A user who wanted the buffer to
> stick around would probably redirect to a non-temporary file, or even
> just to a buffer.) Attached is a patch to do this.
That makes sense, of course, but why not use with-temp-buffer in the
first place?
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Sat, 20 Aug 2022 18:50:01 GMT)
Full text and
rfc822 format available.
Message #140 received at 57129 <at> debbugs.gnu.org (full text, mbox):
On 8/20/2022 11:14 AM, Eli Zaretskii wrote:
>> Cc: larsi <at> gnus.org, 57129 <at> debbugs.gnu.org
>> From: Jim Porter <jporterbugs <at> gmail.com>
>> Date: Sat, 20 Aug 2022 11:03:27 -0700
>>
>> In addition to the changes to temporary file name generation, I think it
>> would be useful for Eshell to kill the temporary buffer too. If you use
>> this feature in Eshell a lot, the temporary buffers could pile up,
>> consuming memory for no real benefit. (A user who wanted the buffer to
>> stick around would probably redirect to a non-temporary file, or even
>> just to a buffer.) Attached is a patch to do this.
>
> That makes sense, of course, but why not use with-temp-buffer in the
> first place?
I'm not sure that would work in this case. 'with-temp-buffer' (or
probably 'with-temp-file', since we want a real file) would clean
everything up once BODY has executed, but we want to perform cleanup in
a hook, which might execute quite a while later (e.g. if passing this
temp file to an external process).
If there's a way to use that though, I'm all for it.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Sun, 21 Aug 2022 16:21:01 GMT)
Full text and
rfc822 format available.
Message #143 received at 57129 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
Paul Eggert wrote:
> > Since this contradicts what you debugged on mingw, I ran the test 10000
> > times on native Windows. Result:
> > - on 32-bit mingw, no failure.
> > - on 64-bit mingw, around 30 failures (or around 10 failures with Paul's
> > newest patch). That is, a probability of ca. 0.3% of getting the same
> > file name as on the previous call.
>
> That's odd, for two reasons. First, 64-bit and 32-bit mingw shouldn't
> differ, as they both use uint_fast64_t which should be the same width on
> both platforms. Second, I could not reproduce the problem on 64-bit
> Ubuntu x86-64 (after altering tempname.c to always define
> HAS_CLOCK_ENTROPY to false) test-tempname always succeeded in 10000 tries.
>
> Could you investigate further why mingw 64-bit fails?
That's odd indeed, and here are more detailed results.
I changed test-tempname.c to skip the case 1 and execute only case 2, and
added printf statements per the attached tempname.c.diff. Then ran
$ for i in `seq 1000`; do ./test-tempname.exe; done 2>&1 | tee out1000
In 32-bit mingw, the result is fully deterministic: each run behaves the same.
The first file name is always gl-temp-3FXzHa.xyz;
the second file name is always gl-temp-HgtmVy.xyz.
Thus, for a single Emacs process, things will look fine, but as soon as someone
starts to use temporary files in two different Emacs processes, in the way Eli
described, there will be massive collisions.
In 64-bit mingw, the 'tempname 1' value is deterministic. This simply shows
that Windows 10 does not use address space randomization (ASLR).
The 'tempname 2' value is unique 99% of the time:
$ grep 'tempname 2' out1000-mingw64 | sort | uniq -d | wc -l
8
The interesting thing is that each of the duplicate values of v is occurring
in the same process:
$ grep 'tempname 2' out1000-mingw64 | sort | uniq -d
tempname 2 v=0x00c1efa91fb60900
tempname 2 v=0x32ccb2946974f2f6
tempname 2 v=0x5cafcc69e359a147
tempname 2 v=0x6d0676018e27d771
tempname 2 v=0x6d95bd6083168079
tempname 2 v=0x7cb95116ffae8ece
tempname 2 v=0xe0afc7086808ce33
tempname 2 v=0xe46a60c28fb0ec7f
$ grep 'tempname 2' out1000-mingw64 | grep -n 0x00c1efa91fb60900
560:tempname 2 v=0x00c1efa91fb60900
561:tempname 2 v=0x00c1efa91fb60900
$ grep 'tempname 2' out1000-mingw64 | grep -n 0x32ccb2946974f2f6
1129:tempname 2 v=0x32ccb2946974f2f6
1130:tempname 2 v=0x32ccb2946974f2f6
etc.
So, in this environment, different Emacs processes will not conflict. But
within a single Emacs process, with 1% probability, the two resulting file
names are the same.
Bruno
[tempname.c.diff (text/x-patch, attachment)]
[out1000-mingw32.gz (application/gzip, attachment)]
[out1000-mingw64.gz (application/gzip, attachment)]
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Sun, 21 Aug 2022 16:33:02 GMT)
Full text and
rfc822 format available.
Message #146 received at 57129 <at> debbugs.gnu.org (full text, mbox):
> From: Bruno Haible <bruno <at> clisp.org>
> Cc: bug-gnulib <at> gnu.org, larsi <at> gnus.org, 57129 <at> debbugs.gnu.org, jporterbugs <at> gmail.com
> Date: Sun, 21 Aug 2022 18:20:52 +0200
>
> I changed test-tempname.c to skip the case 1 and execute only case 2, and
> added printf statements per the attached tempname.c.diff. Then ran
> $ for i in `seq 1000`; do ./test-tempname.exe; done 2>&1 | tee out1000
>
> In 32-bit mingw, the result is fully deterministic: each run behaves the same.
> The first file name is always gl-temp-3FXzHa.xyz;
> the second file name is always gl-temp-HgtmVy.xyz.
>
> Thus, for a single Emacs process, things will look fine, but as soon as someone
> starts to use temporary files in two different Emacs processes, in the way Eli
> described, there will be massive collisions.
>
> In 64-bit mingw, the 'tempname 1' value is deterministic. This simply shows
> that Windows 10 does not use address space randomization (ASLR).
Some of these problems could be alleviated if the initial value of 'v'
depended on something non-deterministic, like the program's PID and/or
the current wallclock time (no need to use anything as fancy as
clock_gettime, a simple call to 'clock' should be enough).
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Sun, 21 Aug 2022 17:18:01 GMT)
Full text and
rfc822 format available.
Message #149 received at 57129 <at> debbugs.gnu.org (full text, mbox):
Paul Eggert asked:
> > Could you investigate further why mingw 64-bit fails?
Some words on the "why". You seem to have expectations regarding the
distribution quality of the resulting file names, but these expectations
are not warranted.
Donald E. Knuth wrote in TAOCP vol 2 "Seminumerical algorithms" that
an arbitrary sequence of number manipulating operations usually has a
bad quality as a pseudo-random number generator, and that in order
to get good quality pseudo-random numbers, one needs to use *specific*
pseudo-random number generators and then *prove* mathematical properties
of it. (The same is true, btw, for crypto algorithms.)
Then he started discussing linear congruential generators [1] as the
simplest example for which something could be proved.
The current source code of tempname.c generates the pseudo-random numbers
— assuming no HAS_CLOCK_ENTROPY and no ASLR — using a mix of three
operations:
(A) A linear congruential generator [2] with m = 2^64,
a = 2862933555777941757, c = 3037000493.
(B) A floor operation: v ← floor(v / 62^6)
(C) A xor operation: v ^= prev_v
There are three different questions:
(Q1) What is the expected quality inside a single gen_tempname call?
(Q2) What is the expected quality of multiple gen_tempname calls in a
single process?
(Q3) What is the expected quality when considering different processes
that call gen_tempname?
Answers:
(Q1) For just 6 'X'es, there is essentially a single (A) operation.
Therefore the quality will be good.
If someone uses more than 10 'X'es, for example, 50 'X'es, there
will be 5 (A) and 5 (B), interleaved: (A), (B), (A), (B), ...
This is *not* a linear congruential generator, therefore the
expected quality is BAD.
In order to fix this case, what I would do is to get back to
a linear congruential generator: (A), (A), ..., (A), (B).
In other words, feed into (A) exactly the output from the
previous (A). This means, do the statements
XXXXXX[i] = letters[v % 62];
v /= 62;
not on v itself, but on a copy of v.
But wait, there is also the desire to have predictability!
This means to not consume all the possible bits the
random_bits() call, but only part of it.
What I would do here, is to reduce BASE_62_DIGITS from 10 to 6.
So that in each round of the loop, 6 base-62 digits are consumed
and more than 4 base-62 digits are left in v, for predictability.
In the usual calls with 6 'X'es the loop will still end after a
single round.
(Q2) First of all, the multiple gen_tempname calls can occur in
different threads. Since no locking is involved, it is undefined
behaviour to access the 'prev_v' variable from different threads.
On machines with an IA-64 CPU, the 'prev_v' variable's value may
not be propagated from one thread to the other. [3][4][5]
The fix is simple, though: Mark 'prev_v' as 'volatile'.
Then, what the code does, is a mix of (A), (B), (C). Again, this
is *not* a linear congruential generator, therefore the expected
quality is BAD.
To get good quality, I would suggest to use a linear congruential
generator across *all* gen_tempname calls of the entire thread.
This means:
- Move out the (B) invocations out, like explained above in (Q1).
- Remove the (C) code that you added last week.
- Store the v value in a per-thread variable. Using '__thread'
on glibc systems and a TLS variable (#include "glthread/tls.h")
on the other platforms.
(Q3) Here, to force differences between different processes, I would
suggest to use a fine-grained clock value. In terms of platforms,
#if defined CLOCK_MONOTONIC && HAVE_CLOCK_GETTIME
is way too restrictive.
How about
- using CLOCK_REALTIME when CLOCK_MONOTONIC is not available,
- using gettimeofday() as fallback, especially on native Windows.
If one does (Q3), then the suggestions for (Q2) (other than the 'volatile')
may not be needed.
Bruno
[1] https://en.wikipedia.org/wiki/Linear_congruential_generator
[2] https://en.wikipedia.org/wiki/Linear_congruential_generator#c_%E2%89%A0_0
[3] https://es.cs.uni-kl.de/publications/datarsg/Geor16.pdf
[4] https://db.in.tum.de/teaching/ws1718/dataprocessing/chapter3.pdf?lang=de
page 18
[5] https://os.inf.tu-dresden.de/Studium/DOS/SS2014/04-Memory-Consistency.pdf
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Mon, 22 Aug 2022 20:48:01 GMT)
Full text and
rfc822 format available.
Message #152 received at 57129 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
Thanks for the detailed diagnosis, Bruno. To try to fix the problems I
installed the attached patches into Gnulib. If I understand things
correctly, these patches should fix the 0.1% failure rate you observed
on 64-bit mingw. They also fix a minor security leak I discovered: in
rare cases, ASLR entropy was used to generate publicly visible file
names, which is a no-no as that might help an attacker infer the
randomized layout of a victim process.
These fixes follow some but not all the suggestions you made. The basic
problem I saw was that tempname.c was using too much belt-and-suspenders
code, so much so that the combination of belts and suspenders
misbehaved. I simplified it a bit and this removed the need for some of
the suggestions.
These fixes should be merged into glibc upstream since they fix glibc
bugs; I plan to follow up on that shortly.
[0001-tempname-merge-64-bit-time_t-fix-from-glibc.patch (text/x-patch, attachment)]
[0002-tempname-fix-multithreading-ASLR-leak-etc.patch (text/x-patch, attachment)]
[0003-tempname-don-t-lose-entropy-in-seed.patch (text/x-patch, attachment)]
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Tue, 23 Aug 2022 00:14:02 GMT)
Full text and
rfc822 format available.
Message #155 received at 57129 <at> debbugs.gnu.org (full text, mbox):
Paul Eggert wrote:
> Thanks for the detailed diagnosis, Bruno. To try to fix the problems I
> installed the attached patches into Gnulib.
These look all good. Here's again my evaluation of the three scenarios:
(Q1) What is the expected quality inside a single gen_tempname call?
It depends on quality of random_bits(). Usually on the quality of
getrandom(), which is good on most systems. And for the other systems,
the "ersatz" in random_bits() gives reasonable quality.
(Q2) What is the expected quality of multiple gen_tempname calls in a
single process?
(Q3) What is the expected quality when considering different processes
that call gen_tempname?
Both have the same answer: The file name essentially depends on the
first call to random_bits(). Two different calls to random_bits()
can be correlated only if
- getrandom() is not well supported, and
- CLOCK_REALTIME is not defined, and
- we're in the same process, less than 0.01 sec apart.
IMO, that's good enough.
> If I understand things
> correctly, these patches should fix the 0.1% failure rate you observed
> on 64-bit mingw.
Yes. Running the "case 2" part 1000 times again, among the 2000 generated
file names, there are no duplicates — neither on 32-bit mingw, nor on 64-bit
mingw.
> They also fix a minor security leak I discovered: in
> rare cases, ASLR entropy was used to generate publicly visible file
> names, which is a no-no as that might help an attacker infer the
> randomized layout of a victim process.
Excellent observation :-)
> These fixes follow some but not all the suggestions you made. The basic
> problem I saw was that tempname.c was using too much belt-and-suspenders
> code, so much so that the combination of belts and suspenders
> misbehaved. I simplified it a bit and this removed the need for some of
> the suggestions.
Thanks. The major quality boost comes from invoking getrandom() always.
Bruno
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Tue, 23 Aug 2022 11:19:02 GMT)
Full text and
rfc822 format available.
Message #158 received at 57129 <at> debbugs.gnu.org (full text, mbox):
> From: Bruno Haible <bruno <at> clisp.org>
> Cc: Eli Zaretskii <eliz <at> gnu.org>, bug-gnulib <at> gnu.org, larsi <at> gnus.org, 57129 <at> debbugs.gnu.org, jporterbugs <at> gmail.com
> Date: Tue, 23 Aug 2022 02:13:02 +0200
>
> > If I understand things
> > correctly, these patches should fix the 0.1% failure rate you observed
> > on 64-bit mingw.
>
> Yes. Running the "case 2" part 1000 times again, among the 2000 generated
> file names, there are no duplicates — neither on 32-bit mingw, nor on 64-bit
> mingw.
Was that with or without using clock_gettime? If it was with
clock_gettime, what happens without it (i.e. when 'clock' is used
instead)?
Thanks.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Tue, 23 Aug 2022 14:50:02 GMT)
Full text and
rfc822 format available.
Message #161 received at 57129 <at> debbugs.gnu.org (full text, mbox):
Eli Zaretskii asked:
> > Yes. Running the "case 2" part 1000 times again, among the 2000 generated
> > file names, there are no duplicates — neither on 32-bit mingw, nor on 64-bit
> > mingw.
>
> Was that with or without using clock_gettime? If it was with
> clock_gettime, what happens without it (i.e. when 'clock' is used
> instead)?
That was with clock_gettime. Without clock_gettime, i.e. without the
dependency to libwinpthread, the result is the same: no duplicates — neither
on 32-bit mingw, nor on 64-bit mingw.
Bruno
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Tue, 23 Aug 2022 16:08:02 GMT)
Full text and
rfc822 format available.
Message #164 received at 57129 <at> debbugs.gnu.org (full text, mbox):
> From: Bruno Haible <bruno <at> clisp.org>
> Cc: eggert <at> cs.ucla.edu, bug-gnulib <at> gnu.org, larsi <at> gnus.org, 57129 <at> debbugs.gnu.org, jporterbugs <at> gmail.com
> Date: Tue, 23 Aug 2022 16:49:36 +0200
>
> Eli Zaretskii asked:
> > > Yes. Running the "case 2" part 1000 times again, among the 2000 generated
> > > file names, there are no duplicates — neither on 32-bit mingw, nor on 64-bit
> > > mingw.
> >
> > Was that with or without using clock_gettime? If it was with
> > clock_gettime, what happens without it (i.e. when 'clock' is used
> > instead)?
>
> That was with clock_gettime. Without clock_gettime, i.e. without the
> dependency to libwinpthread, the result is the same: no duplicates — neither
> on 32-bit mingw, nor on 64-bit mingw.
Great, thanks.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Wed, 24 Aug 2022 21:42:01 GMT)
Full text and
rfc822 format available.
Message #167 received at 57129 <at> debbugs.gnu.org (full text, mbox):
On 8/20/2022 11:03 AM, Jim Porter wrote:
> In addition to the changes to temporary file name generation, I think it
> would be useful for Eshell to kill the temporary buffer too.
[snip]
> Attached is a patch to do this.
Assuming there are no remaining objections, I'll merge this in a day or two.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#57129
; Package
emacs
.
(Fri, 26 Aug 2022 05:11:02 GMT)
Full text and
rfc822 format available.
Message #170 received at 57129 <at> debbugs.gnu.org (full text, mbox):
On 8/24/2022 2:41 PM, Jim Porter wrote:
> On 8/20/2022 11:03 AM, Jim Porter wrote:
>> In addition to the changes to temporary file name generation, I think
>> it would be useful for Eshell to kill the temporary buffer too.
> [snip]
>> Attached is a patch to do this.
>
> Assuming there are no remaining objections, I'll merge this in a day or
> two.
Merged as a457aa62577284333c7d25d48a49704788b25a04.
Reply sent
to
Jim Porter <jporterbugs <at> gmail.com>
:
You have taken responsibility.
(Thu, 16 Mar 2023 05:36:02 GMT)
Full text and
rfc822 format available.
Notification sent
to
Jim Porter <jporterbugs <at> gmail.com>
:
bug acknowledged by developer.
(Thu, 16 Mar 2023 05:36:02 GMT)
Full text and
rfc822 format available.
Message #175 received at 57129-done <at> debbugs.gnu.org (full text, mbox):
On 8/25/2022 10:10 PM, Jim Porter wrote:
> On 8/24/2022 2:41 PM, Jim Porter wrote:
>> On 8/20/2022 11:03 AM, Jim Porter wrote:
>>> In addition to the changes to temporary file name generation, I think
>>> it would be useful for Eshell to kill the temporary buffer too.
>> [snip]
>>> Attached is a patch to do this.
>>
>> Assuming there are no remaining objections, I'll merge this in a day
>> or two.
>
> Merged as a457aa62577284333c7d25d48a49704788b25a04.
I know there was a quite-lengthy discussion about the tempname function,
but I think those patches were merged, along with my patch for the
original bug topic. Therefore, I'm going to close this.
Of course, if there's still anything to do with the tempname stuff,
let's do it (though it might help to give it a separate bug for tracking
purposes).
bug archived.
Request was from
Debbugs Internal Request <help-debbugs <at> gnu.org>
to
internal_control <at> debbugs.gnu.org
.
(Thu, 13 Apr 2023 11:24:05 GMT)
Full text and
rfc822 format available.
This bug report was last modified 2 years and 31 days ago.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.