GNU bug report logs - #57129
29.0.50; Improve behavior of conditionals in Eshell

Package: emacs;

Reported by: Jim Porter <jporterbugs <at> gmail.com>

Date: Thu, 11 Aug 2022 02:44:02 UTC

Severity: normal

Found in version 29.0.50

Done: Jim Porter <jporterbugs <at> gmail.com>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 57129 in the body.
You can then email your comments to 57129 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox

Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Thu, 11 Aug 2022 02:44:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Jim Porter <jporterbugs <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Thu, 11 Aug 2022 02:44:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Jim Porter <jporterbugs <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: 29.0.50; Improve behavior of conditionals in Eshell
Date: Wed, 10 Aug 2022 19:43:22 -0700

In Eshell, you can use conditionals pretty much like you'd expect from 
other shells. Starting from 'emacs -Q -f eshell':

  ~ $ [ foo = foo ] && echo hi
  hi
  ~ $ [ foo = bar ] && echo hi
  ~ $ [ foo = foo ] || echo hi
  ~ $ [ foo = bar ] || echo hi
  hi
  ~ $ if {[ foo = foo ]} {echo yes} {echo no}
  yes
  ~ $ if {[ foo = bar ]} {echo yes} {echo no}
  no

However, that only works for external commands. If the command is 
implemented in Lisp, it doesn't work:

  ~ $ (zerop 0) && echo hi
  t
  hi
  ~ $ (zerop 1) && echo hi
  hi  ;; Shouldn't say hi.

That's because the exit status is always 0 for Lisp commands. This works 
correctly for external commands:

  ~ $ [ foo = foo ]; echo status $? result $$
  ("status" 0 "result" nil)
  ~ $ [ foo = bar ]; echo status $? result $$
  ("status" 1 "result" nil)

But not for Lisp commands:

  ~ $ (zerop 0); echo status $? result $$
  t
  ("status" 0 "result" t)
  ~ $ (zerop 1); echo status $? result $$
  ("status" 0 "result" nil)
  ~ $ (zerop "foo"); echo status $? result $$
  Wrong type argument: number-or-marker-p, "foo"
  ("status" 0 "result" nil)

The manual says that the status should be 1 when a Lisp command fails, 
but it's 0 no matter what, even if it signaled an error. (Likewise, the 
manual says that the "result" variable should be t for successful 
external commands, but it's always nil.)

Similarly to the above, you can't use variable expansions for conditionals:

  ~ $ (setq foo t)
  t
  ~ $ if $foo {echo yes} {echo no}
  yes
  ~ $ (setq foo nil)
  ~ $ if $foo {echo yes} {echo no}
  yes  ;; Should say no.

Patch forthcoming. Just splitting it into two messages since it seemed 
more readable that way...

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Thu, 11 Aug 2022 02:47:01 GMT) Full text and rfc822 format available.

Message #8 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Jim Porter <jporterbugs <at> gmail.com>
To: 57129 <at> debbugs.gnu.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Wed, 10 Aug 2022 19:46:41 -0700

[Message part 1 (text/plain, inline)]

Here are some patches to fix this.

The first patch adds tests/documentation for the current state so that 
the subsequent patches are clearer. It also improves 
'eshell-close-handles' so that you can set the exit status elsewhere, 
which makes some of the code simpler.

The second patch fixes the use of variables in conditionals (e.g. if 
statements, while loops). The only non-test code change here is that 
'eshell-structure-basic-command' needed to be taught that forms 
beginning with 'eshell-escape-arg' should be treated as data, much like 
'eshell-convert'.

The third patch fixes the behavior of the '$?' and '$$' variables to 
match the manual: '$$' is t when an external command succeeds, and '$?' 
is 1 when a Lisp command signals an error. I also added a new feature 
that '$?' is 2 when an actual Lisp *form* returns nil. This lets you use 
Lisp forms as conditionals in Eshell much like you would in regular 
Lisp. However, it doesn't apply to the command syntax, even if it runs 
Lisp code:

  ~ $ (zerop 1); echo $?
  2
  ~ $ zerop 1; echo $?
  0

That's because Eshell prints the return value of Lisp commands (unless 
it's nil), so a Lisp command that wants to silently succeed would return 
nil. For example, the Lisp version of 'cat' returns nil. We want that to 
have an exit status of 0. However, when writing it as a Lisp form with 
parentheses, I think it makes more sense that nil is treated as false 
for conditionals. The overall goal is so that this prints hi:

  cat foo.txt && echo hi

And that this doesn't:

  (zerop 1) && echo hi

For people who don't like this behavior, they can customize the new 
'eshell-lisp-form-nil-is-failure' variable.

[0001-Only-set-Eshell-execution-result-metavariables-when-.patch (text/plain, attachment)]

[0002-Allow-using-dollar-expansions-in-Eshell-conditionals.patch (text/plain, attachment)]

[0003-Make-and-variables-more-consistent-in-Eshell.patch (text/plain, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Fri, 12 Aug 2022 15:23:02 GMT) Full text and rfc822 format available.

Message #11 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Jim Porter <jporterbugs <at> gmail.com>
Cc: 57129 <at> debbugs.gnu.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Fri, 12 Aug 2022 17:22:32 +0200

Jim Porter <jporterbugs <at> gmail.com> writes:

> Here are some patches to fix this.

I haven't tested the patches, but reading them, they make sense to me,
so please go ahead and push.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Sat, 13 Aug 2022 05:15:01 GMT) Full text and rfc822 format available.

Message #14 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Jim Porter <jporterbugs <at> gmail.com>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: 57129 <at> debbugs.gnu.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Fri, 12 Aug 2022 22:14:02 -0700

On 8/12/2022 8:22 AM, Lars Ingebrigtsen wrote:
> Jim Porter <jporterbugs <at> gmail.com> writes:
> 
>> Here are some patches to fix this.
> 
> I haven't tested the patches, but reading them, they make sense to me,
> so please go ahead and push.

Thanks for taking a look. Merged as 
f3408af0a3251a744cb0b55b5e153565bfd57ea3.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Sat, 13 Aug 2022 07:02:01 GMT) Full text and rfc822 format available.

Message #17 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Jim Porter <jporterbugs <at> gmail.com>
Cc: larsi <at> gnus.org, 57129 <at> debbugs.gnu.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Sat, 13 Aug 2022 10:01:21 +0300

> Cc: 57129 <at> debbugs.gnu.org
> From: Jim Porter <jporterbugs <at> gmail.com>
> Date: Fri, 12 Aug 2022 22:14:02 -0700
> 
> On 8/12/2022 8:22 AM, Lars Ingebrigtsen wrote:
> > Jim Porter <jporterbugs <at> gmail.com> writes:
> > 
> >> Here are some patches to fix this.
> > 
> > I haven't tested the patches, but reading them, they make sense to me,
> > so please go ahead and push.
> 
> Thanks for taking a look. Merged as 
> f3408af0a3251a744cb0b55b5e153565bfd57ea3.

One of the tests in esh-var-tests.el failed on MS-Windows; I fixed it,
although I'm not sure it's a correct fix, because the Eshell manual
seems to say that a built-in implementation of a command should be
preferred by default?

One of the tests in eshell-tests.el also fails on MS-Windows, but I
have no idea how to fix it, nor even what are the details of the
test.  Are the 'echo' commands in that test supposed to be external
shell commands or internal Eshell commands?  If the former, it could
be a problem on MS-Windows.

(This is a good example of my gripes about the test suite, which I
described in bug#57150 yesterday: debugging a failure is unnecessarily
hard and tedious.  There's no information about the specific intent of
the test and its inner workings and assumptions, except the general
idea described in the doc string.)

Here's the failure I get:

  Test eshell-test/subcommand-reset-in-pipeline backtrace:
    signal(ert-test-failed (((should (equal (eshell-test-command-result
    ert-fail(((should (equal (eshell-test-command-result (format templat
    (if (unwind-protect (setq value-38 (apply fn-36 args-37)) (setq form
    (let (form-description-40) (if (unwind-protect (setq value-38 (apply
    (let ((value-38 'ert-form-evaluation-aborted-39)) (let (form-descrip
    (let* ((fn-36 #'equal) (args-37 (condition-case err (let ((signal-ho
    (let ((template (car --dolist-tail--))) (let* ((fn-26 #'equal) (args
    (while --dolist-tail-- (let ((template (car --dolist-tail--))) (let*
    (let ((--dolist-tail-- '("echo {%s} | *cat" "echo ${%s} | *cat" "*ca
    (closure (t) nil (let* ((fn-21 #'executable-find) (args-22 (conditio
    ert--run-test-internal(#s(ert--test-execution-info :test #s(ert-test
    ert-run-test(#s(ert-test :name eshell-test/subcommand-reset-in-pipel
    ert-run-or-rerun-test(#s(ert--stats :selector ... :tests ... :test-m
    ert-run-tests((not (or (tag :unstable) (tag :nativecomp))) #f(compil
    ert-run-tests-batch((not (or (tag :unstable) (tag :nativecomp))))
    ert-run-tests-batch-and-exit((not (or (tag :unstable) (tag :nativeco
    eval((ert-run-tests-batch-and-exit '(not (or (tag :unstable) (tag :n
    command-line-1(("-L" ";." "-l" "ert" "-l" "lisp/eshell/eshell-tests.
    command-line()
    normal-top-level()
  Test eshell-test/subcommand-reset-in-pipeline condition:
      (ert-test-failed
       ((should
	 (equal
	  (eshell-test-command-result ...)
	  "first"))
	:form
	(equal nil "first")
	:value nil :explanation
	(different-types nil "first")))
     FAILED  18/18  eshell-test/subcommand-reset-in-pipeline (1.873920 sec) at lisp/eshell/eshell-tests.el:80

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Sat, 13 Aug 2022 18:57:02 GMT) Full text and rfc822 format available.

Message #20 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Jim Porter <jporterbugs <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: larsi <at> gnus.org, 57129 <at> debbugs.gnu.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Sat, 13 Aug 2022 11:56:20 -0700

On 8/13/2022 12:01 AM, Eli Zaretskii wrote:
> One of the tests in esh-var-tests.el failed on MS-Windows; I fixed it,
> although I'm not sure it's a correct fix, because the Eshell manual
> seems to say that a built-in implementation of a command should be
> preferred by default?

Sorry about that. I think that's the right fix for that case. Maybe it 
would make sense to set 'eshell-prefer-lisp-functions' to t for most of 
the Eshell tests to improve reproducibility on various platforms; tests 
that want an external command can just put a "*" in front of the command 
name.

> One of the tests in eshell-tests.el also fails on MS-Windows, but I
> have no idea how to fix it, nor even what are the details of the
> test.  Are the 'echo' commands in that test supposed to be external
> shell commands or internal Eshell commands?  If the former, it could
> be a problem on MS-Windows.

The echo commands should be internal Eshell commands (since there's an 
'eshell/echo' Lisp function, that one should always be preferred by Eshell).

I'm surprised that test fails on MS Windows, since it *should* be 
testing internal Eshell logic that's not platform-specific. Based on the 
failure, it looks like one of the following commands is returning the 
wrong value:

  echo {echo $eshell-in-pipeline-p | echo} | *cat
  echo ${echo $eshell-in-pipeline-p | echo} | *cat
  *cat $<echo $eshell-in-pipeline-p | echo> | *cat

All of these should return 'first'. That test is just checking that, 
when you're in a subcommand ({...}, ${...}, or $<...>), the value of 
'eshell-in-pipeline-p' shouldn't be influenced by the pipeline in the 
"super-command". Some built-in Eshell commands consult 
'eshell-in-pipeline-p', and if it had the wrong value, they might do the 
wrong thing.

If nothing else, it would probably be helpful to set up ERT explainers 
so that the error messages are easier to understand. As it is now, 
they're not very explanatory.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Sun, 14 Aug 2022 05:04:01 GMT) Full text and rfc822 format available.

Message #23 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Sean Whitton <spwhitton <at> spwhitton.name>
To: Jim Porter <jporterbugs <at> gmail.com>
Cc: 57129 <at> debbugs.gnu.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Sat, 13 Aug 2022 22:03:46 -0700

Hello,

On Fri 12 Aug 2022 at 10:14PM -07, Jim Porter wrote:

> On 8/12/2022 8:22 AM, Lars Ingebrigtsen wrote:
>> Jim Porter <jporterbugs <at> gmail.com> writes:
>>
>>> Here are some patches to fix this.
>> I haven't tested the patches, but reading them, they make sense to me,
>> so please go ahead and push.
>
> Thanks for taking a look. Merged as f3408af0a3251a744cb0b55b5e153565bfd57ea3.

Thanks for working on this, pretty cool.

-- 
Sean Whitton

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Sun, 14 Aug 2022 05:09:02 GMT) Full text and rfc822 format available.

Message #26 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Jim Porter <jporterbugs <at> gmail.com>
Cc: larsi <at> gnus.org, 57129 <at> debbugs.gnu.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Sun, 14 Aug 2022 08:07:49 +0300

> Cc: larsi <at> gnus.org, 57129 <at> debbugs.gnu.org
> From: Jim Porter <jporterbugs <at> gmail.com>
> Date: Sat, 13 Aug 2022 11:56:20 -0700
> 
> On 8/13/2022 12:01 AM, Eli Zaretskii wrote:
> > One of the tests in esh-var-tests.el failed on MS-Windows; I fixed it,
> > although I'm not sure it's a correct fix, because the Eshell manual
> > seems to say that a built-in implementation of a command should be
> > preferred by default?
> 
> Sorry about that. I think that's the right fix for that case. Maybe it 
> would make sense to set 'eshell-prefer-lisp-functions' to t for most of 
> the Eshell tests to improve reproducibility on various platforms; tests 
> that want an external command can just put a "*" in front of the command 
> name.

But then this fragment from the Eshell manual:

     The command can be either an Elisp function or an external command.
  Eshell looks first for an alias (*note Aliases::) with the same name as
  the command, then a built-in (*note Built-ins::) or a function with the
  same name; if there is no match, it then tries to execute it as an
  external command.

seems to be inaccurate?  Since 'format' exists as a built-in command,
why did Eshell in this case invoke the external command instead?

> I'm surprised that test fails on MS Windows, since it *should* be 
> testing internal Eshell logic that's not platform-specific. Based on the 
> failure, it looks like one of the following commands is returning the 
> wrong value:
> 
>    echo {echo $eshell-in-pipeline-p | echo} | *cat
>    echo ${echo $eshell-in-pipeline-p | echo} | *cat
>    *cat $<echo $eshell-in-pipeline-p | echo> | *cat
> 
> All of these should return 'first'.

The first two do; the last one returns nothing.

> That test is just checking that, 
> when you're in a subcommand ({...}, ${...}, or $<...>), the value of 
> 'eshell-in-pipeline-p' shouldn't be influenced by the pipeline in the 
> "super-command". Some built-in Eshell commands consult 
> 'eshell-in-pipeline-p', and if it had the wrong value, they might do the 
> wrong thing.

So how to investigate this failure further?

> If nothing else, it would probably be helpful to set up ERT explainers 
> so that the error messages are easier to understand. As it is now, 
> they're not very explanatory.

Indeed, more explanations in this and other tests will be most
welcome.

Thanks.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Sun, 14 Aug 2022 05:38:02 GMT) Full text and rfc822 format available.

Message #29 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Jim Porter <jporterbugs <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: larsi <at> gnus.org, 57129 <at> debbugs.gnu.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Sat, 13 Aug 2022 22:37:12 -0700

[Message part 1 (text/plain, inline)]

On 8/13/2022 10:07 PM, Eli Zaretskii wrote:
>> Cc: larsi <at> gnus.org, 57129 <at> debbugs.gnu.org
>> From: Jim Porter <jporterbugs <at> gmail.com>
>> Date: Sat, 13 Aug 2022 11:56:20 -0700
>>
>> Sorry about that. I think that's the right fix for that case. Maybe it
>> would make sense to set 'eshell-prefer-lisp-functions' to t for most of
>> the Eshell tests to improve reproducibility on various platforms; tests
>> that want an external command can just put a "*" in front of the command
>> name.
> 
> But then this fragment from the Eshell manual:
> 
>       The command can be either an Elisp function or an external command.
>    Eshell looks first for an alias (*note Aliases::) with the same name as
>    the command, then a built-in (*note Built-ins::) or a function with the
>    same name; if there is no match, it then tries to execute it as an
>    external command.
> 
> seems to be inaccurate?  Since 'format' exists as a built-in command,
> why did Eshell in this case invoke the external command instead?

"Built-in" in that part of the manual refers to a function named 
'eshell/FOO'. If you run "FOO" as an Eshell command, it will check for 
'eshell/FOO' before an external command on your system. The manual could 
probably stand to be improved here.

>> I'm surprised that test fails on MS Windows, since it *should* be
>> testing internal Eshell logic that's not platform-specific. Based on the
>> failure, it looks like one of the following commands is returning the
>> wrong value:
>>
>>     echo {echo $eshell-in-pipeline-p | echo} | *cat
>>     echo ${echo $eshell-in-pipeline-p | echo} | *cat
>>     *cat $<echo $eshell-in-pipeline-p | echo> | *cat
>>
>> All of these should return 'first'.
> 
> The first two do; the last one returns nothing.
[snip]
> So how to investigate this failure further?

I have access to an MS Windows system (but don't have a build 
environment set up for native MS Windows builds), so I can try to 
reproduce this using the Emacs 29 pretest binaries in the next few days. 
Hopefully that works.

If you'd like to dig into this further yourself, you could try running 
this command in Eshell:

  eshell-parse-command '*cat $<echo $eshell-in-pipeline-p | echo> | *cat'

That will print the Lisp form that the command gets converted to. I've 
attached the result that I get in a recent Emacs 29 build on GNU/Linux. 
The two functions that set 'eshell-in-pipeline-p' in there are:

* 'eshell-as-subcommand', which resets it to nil so that the outer
  command's pipeline state doesn't interfere with the subcommand.

* 'eshell-execute-pipeline', which calls 'eshell-do-pipelines' (except
  on MS-DOS, I think) and should set 'eshell-in-pipeline-p' to 'first'
  in the above case.

>> If nothing else, it would probably be helpful to set up ERT explainers
>> so that the error messages are easier to understand. As it is now,
>> they're not very explanatory.
> 
> Indeed, more explanations in this and other tests will be most
> welcome.

Yeah, this test in particular is high up on the list of tests that needs 
more explanation. It's pretty subtle, and the test doesn't really serve 
as a good example of why someone would care that this behavior works the 
way the test demands it.

[parsed-command.el (text/plain, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Sun, 14 Aug 2022 07:31:02 GMT) Full text and rfc822 format available.

Message #32 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Jim Porter <jporterbugs <at> gmail.com>
Cc: larsi <at> gnus.org, 57129 <at> debbugs.gnu.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Sun, 14 Aug 2022 10:30:02 +0300

> Cc: larsi <at> gnus.org, 57129 <at> debbugs.gnu.org
> From: Jim Porter <jporterbugs <at> gmail.com>
> Date: Sat, 13 Aug 2022 22:37:12 -0700
> 
> > But then this fragment from the Eshell manual:
> > 
> >       The command can be either an Elisp function or an external command.
> >    Eshell looks first for an alias (*note Aliases::) with the same name as
> >    the command, then a built-in (*note Built-ins::) or a function with the
> >    same name; if there is no match, it then tries to execute it as an
> >    external command.
> > 
> > seems to be inaccurate?  Since 'format' exists as a built-in command,
> > why did Eshell in this case invoke the external command instead?
> 
> "Built-in" in that part of the manual refers to a function named 
> 'eshell/FOO'. If you run "FOO" as an Eshell command, it will check for 
> 'eshell/FOO' before an external command on your system. The manual could 
> probably stand to be improved here.

The manual definitely should be clarified in that matter, because:

  d:/gnu/git/emacs/trunk/src $ which format
  format is a built-in function in ‘C source code’.

To me this says that 'format' is a built-in command, and the manual
says such commands should be invoked in preference to external
commands.  The "eshell/" part is not mentioned anywhere.

> If you'd like to dig into this further yourself, you could try running 
> this command in Eshell:
> 
>    eshell-parse-command '*cat $<echo $eshell-in-pipeline-p | echo> | *cat'
> 
> That will print the Lisp form that the command gets converted to. I've 
> attached the result that I get in a recent Emacs 29 build on GNU/Linux. 

I get the same output, with 2 exceptions:

  . the name of the temporary file is different
  . instead of a simple string here:

                     (eshell-set-output-handle 1 'overwrite "/tmp/QqPFwo"))

    I get a file name split into several parts, which are then
    concatenated by eshell-concat:

		     (eshell-set-output-handle 1 'overwrite
					       (eshell-extended-glob
						(eshell-concat nil "c:/DOCUME" "~" "1/Zaretzky/LOCALS" "~" "1/Temp/OIi8Wd")))

Any chance that the "~" parts here somehow get in the way?

If not, any other thoughts?  My main worry is that there's something
here specific to how we invoke subprocesses, which you lately changed.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Sun, 14 Aug 2022 21:41:01 GMT) Full text and rfc822 format available.

Message #35 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Jim Porter <jporterbugs <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: larsi <at> gnus.org, 57129 <at> debbugs.gnu.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Sun, 14 Aug 2022 14:40:06 -0700

On 8/14/2022 12:30 AM, Eli Zaretskii wrote:
> The manual definitely should be clarified in that matter, because:
> 
>    d:/gnu/git/emacs/trunk/src $ which format
>    format is a built-in function in ‘C source code’.
> 
> To me this says that 'format' is a built-in command, and the manual
> says such commands should be invoked in preference to external
> commands.  The "eshell/" part is not mentioned anywhere.

I'll work on a fix for that. This is a little more complex than 
immediately apparent though, since Eshell also unfortunately uses the 
term "alias" to refer to two kinds of entities:

1) Command abbreviations created by the user using the 'alias' command,
   e.g. "alias ll 'ls -l'". These are sometimes called "command
   aliases", and are implemented in em-alias.el.

2) Lisp functions with names like 'eshell/FOO'. These are sometimes
   called "alias functions", and are implemented in esh-cmd.el. For
   example, see 'eshell-find-alias-function', which checks if
   'eshell/FOO' exists.

It would probably be good to fix all this terminology for once and for all.

> I get the same output, with 2 exceptions:
> 
>    . the name of the temporary file is different
>    . instead of a simple string here:
> 
>                       (eshell-set-output-handle 1 'overwrite "/tmp/QqPFwo"))
> 
>      I get a file name split into several parts, which are then
>      concatenated by eshell-concat:
> 
> 		     (eshell-set-output-handle 1 'overwrite
> 					       (eshell-extended-glob
> 						(eshell-concat nil "c:/DOCUME" "~" "1/Zaretzky/LOCALS" "~" "1/Temp/OIi8Wd")))
> 
> Any chance that the "~" parts here somehow get in the way?

It's possible. I think Eshell is trying to ensure that the "~" doesn't 
get parsed as "home directory" here. But it could be doing something 
wrong there. I did also touch 'eshell-concat' not too long ago, so I 
might have broken something there.

> If not, any other thoughts?  My main worry is that there's something
> here specific to how we invoke subprocesses, which you lately changed.

You could try some Eshell commands that don't use subprocesses to see if 
that changes anything. For example, these should only use Lisp functions:

  ;; Should print "first" in the Eshell buffer:
  eshell/echo ${eshell/echo $eshell-in-pipeline-p | eshell/echo}
  eshell/echo ${eshell/echo $eshell-in-pipeline-p | eshell/echo} | 
eshell/echo

  ;; Should open a temp file containing "first" in a new buffer:
  find-file $<eshell/echo $eshell-in-pipeline-p | eshell/echo>
  find-file $<eshell/echo $eshell-in-pipeline-p | eshell/echo> | 
eshell/echo

The "eshell/" prefix probably isn't necessary, but I figured it's worth 
being extra-careful here while diagnosing the issue. I'm particularly 
interested in whether the third and fourth commands work. If they work, 
that would definitely point towards a bug in Eshell's subprocess 
handling; if not, then that would point to a bug in the "$<subcommand>" 
handling.

I also filed bug#57216 to add ERT explainers that will print better 
error messages for most of the Eshell tests. That doesn't address the 
sometimes-vague documentation of the tests, but I'll do that separately.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Mon, 15 Aug 2022 12:15:02 GMT) Full text and rfc822 format available.

Message #38 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Jim Porter <jporterbugs <at> gmail.com>
Cc: larsi <at> gnus.org, 57129 <at> debbugs.gnu.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Mon, 15 Aug 2022 15:13:55 +0300

> Cc: larsi <at> gnus.org, 57129 <at> debbugs.gnu.org
> From: Jim Porter <jporterbugs <at> gmail.com>
> Date: Sun, 14 Aug 2022 14:40:06 -0700
> 
> On 8/14/2022 12:30 AM, Eli Zaretskii wrote:
> > The manual definitely should be clarified in that matter, because:
> > 
> >    d:/gnu/git/emacs/trunk/src $ which format
> >    format is a built-in function in ‘C source code’.
> > 
> > To me this says that 'format' is a built-in command, and the manual
> > says such commands should be invoked in preference to external
> > commands.  The "eshell/" part is not mentioned anywhere.
> 
> I'll work on a fix for that. This is a little more complex than 
> immediately apparent though, since Eshell also unfortunately uses the 
> term "alias" to refer to two kinds of entities:
> 
> 1) Command abbreviations created by the user using the 'alias' command,
>     e.g. "alias ll 'ls -l'". These are sometimes called "command
>     aliases", and are implemented in em-alias.el.
> 
> 2) Lisp functions with names like 'eshell/FOO'. These are sometimes
>     called "alias functions", and are implemented in esh-cmd.el. For
>     example, see 'eshell-find-alias-function', which checks if
>     'eshell/FOO' exists.
> 
> It would probably be good to fix all this terminology for once and for all.

Yes, that could well be a source of a lot of confusion.

> You could try some Eshell commands that don't use subprocesses to see if 
> that changes anything. For example, these should only use Lisp functions:
> 
>    ;; Should print "first" in the Eshell buffer:
>    eshell/echo ${eshell/echo $eshell-in-pipeline-p | eshell/echo}
>    eshell/echo ${eshell/echo $eshell-in-pipeline-p | eshell/echo} | 
> eshell/echo
> 
>    ;; Should open a temp file containing "first" in a new buffer:
>    find-file $<eshell/echo $eshell-in-pipeline-p | eshell/echo>
>    find-file $<eshell/echo $eshell-in-pipeline-p | eshell/echo> | 
> eshell/echo
> 
> The "eshell/" prefix probably isn't necessary, but I figured it's worth 
> being extra-careful here while diagnosing the issue. I'm particularly 
> interested in whether the third and fourth commands work. If they work, 
> that would definitely point towards a bug in Eshell's subprocess 
> handling; if not, then that would point to a bug in the "$<subcommand>" 
> handling.

All the 4 commands do what you say they should do.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Mon, 15 Aug 2022 16:15:01 GMT) Full text and rfc822 format available.

Message #41 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Jim Porter <jporterbugs <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: larsi <at> gnus.org, 57129 <at> debbugs.gnu.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Mon, 15 Aug 2022 09:14:04 -0700

On 8/15/2022 5:13 AM, Eli Zaretskii wrote:
>> Cc: larsi <at> gnus.org, 57129 <at> debbugs.gnu.org
>> From: Jim Porter <jporterbugs <at> gmail.com>
>> Date: Sun, 14 Aug 2022 14:40:06 -0700
>>
>> You could try some Eshell commands that don't use subprocesses to see if
>> that changes anything. For example, these should only use Lisp functions:
>>
>>     ;; Should print "first" in the Eshell buffer:
>>     eshell/echo ${eshell/echo $eshell-in-pipeline-p | eshell/echo}
>>     eshell/echo ${eshell/echo $eshell-in-pipeline-p | eshell/echo} |
>> eshell/echo
>>
>>     ;; Should open a temp file containing "first" in a new buffer:
>>     find-file $<eshell/echo $eshell-in-pipeline-p | eshell/echo>
>>     find-file $<eshell/echo $eshell-in-pipeline-p | eshell/echo> |
>> eshell/echo
>>
>> The "eshell/" prefix probably isn't necessary, but I figured it's worth
>> being extra-careful here while diagnosing the issue. I'm particularly
>> interested in whether the third and fourth commands work. If they work,
>> that would definitely point towards a bug in Eshell's subprocess
>> handling; if not, then that would point to a bug in the "$<subcommand>"
>> handling.
> 
> All the 4 commands do what you say they should do.

Hm, and I can't reproduce this on the prebuilt Emacs 29 snapshot from 
July 18, which is from just before I merged the changes to 
'make-process', so it's definitely possible that this is due to those 
commits.

Does reverting 4e59830bc0ab17cdbd85748b133c97837bed99e3 and 
d7b89ea4077d4fe677ba0577245328819ee79cdc fix the issue? I checked 
locally, and they should revert cleanly at least.

It might also be interesting to revert just the changes from d7b89ea in 
lisp/eshell/esh-proc.el. I don't think any of the changes I made to the 
C sources should impact this, since a) I was careful to keep the logic 
as close to before as I could, and b) it just changes the logic for 
creating a PTY, which doesn't work on MS Windows anyway.

I did notice one weird issue though, and maybe this has something to do 
with the problem: on MS Windows, if I run any command with a 
$<subcommand>, the second and subsequent times, it warns me about 
changing the temp file, e.g.:

  3Zz80A has changed since visited or saved.  Save anyway? (yes or no)

This probably shouldn't do that, and it could definitely cause issues in 
the tests, which can't prompt the user for things like that.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Mon, 15 Aug 2022 16:50:02 GMT) Full text and rfc822 format available.

Message #44 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Jim Porter <jporterbugs <at> gmail.com>
Cc: larsi <at> gnus.org, 57129 <at> debbugs.gnu.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Mon, 15 Aug 2022 19:49:00 +0300

> Cc: larsi <at> gnus.org, 57129 <at> debbugs.gnu.org
> From: Jim Porter <jporterbugs <at> gmail.com>
> Date: Mon, 15 Aug 2022 09:14:04 -0700
> 
> > All the 4 commands do what you say they should do.
> 
> Hm, and I can't reproduce this on the prebuilt Emacs 29 snapshot from 
> July 18, which is from just before I merged the changes to 
> 'make-process', so it's definitely possible that this is due to those 
> commits.
> 
> Does reverting 4e59830bc0ab17cdbd85748b133c97837bed99e3 and 
> d7b89ea4077d4fe677ba0577245328819ee79cdc fix the issue? I checked 
> locally, and they should revert cleanly at least.
> 
> It might also be interesting to revert just the changes from d7b89ea in 
> lisp/eshell/esh-proc.el. I don't think any of the changes I made to the 
> C sources should impact this, since a) I was careful to keep the logic 
> as close to before as I could, and b) it just changes the logic for 
> creating a PTY, which doesn't work on MS Windows anyway.

None of these reverts fixes the issue.  I tried both the tests in
eshell-tests.el and the problematic command:

   *cat $<echo $eshell-in-pipeline-p | echo> | *cat

> I did notice one weird issue though, and maybe this has something to do 
> with the problem: on MS Windows, if I run any command with a 
> $<subcommand>, the second and subsequent times, it warns me about 
> changing the temp file, e.g.:
> 
>    3Zz80A has changed since visited or saved.  Save anyway? (yes or no)
> 
> This probably shouldn't do that, and it could definitely cause issues in 
> the tests, which can't prompt the user for things like that.

I guess this means you somehow reuse the same temporary file on
MS-Windows?  Maybe change the test a bit so that the file name
definitely differs?

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Mon, 15 Aug 2022 18:09:02 GMT) Full text and rfc822 format available.

Message #47 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Jim Porter <jporterbugs <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: larsi <at> gnus.org, 57129 <at> debbugs.gnu.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Mon, 15 Aug 2022 11:08:45 -0700

On 8/15/2022 9:49 AM, Eli Zaretskii wrote:
> None of these reverts fixes the issue.  I tried both the tests in
> eshell-tests.el and the problematic command:
> 
>     *cat $<echo $eshell-in-pipeline-p | echo> | *cat

Thanks for testing. I might have to get an MS-Windows dev environment 
set up to dig into this further. I'm a bit worried that this is related 
to bug#55590, where I had to jump through some awkward hoops to fix a 
somewhat-related issue with Eshell subcommands. I think that's due to 
'eshell-do-eval' being written before lexical binding existed, and so 
the variable scoping doesn't work quite as expected. It could be that 
you only see this failure when using external commands due to some 
differences in timing.

If that's the problem here too, it would be a pain to fix (but it would 
likely also mean that this bug has been present for a long time, and my 
tests/fixes have just brought it to the surface). As mentioned in 
bug#55590, I think it'd be nice to rip out 'eshell-do-eval' and replace 
it with the generator.el machinery, since they're conceptually pretty 
similar. That's easier said than done though...

>> I did notice one weird issue though, and maybe this has something to do
>> with the problem: on MS Windows, if I run any command with a
>> $<subcommand>, the second and subsequent times, it warns me about
>> changing the temp file, e.g.:
>>
>>     3Zz80A has changed since visited or saved.  Save anyway? (yes or no)
>>
>> This probably shouldn't do that, and it could definitely cause issues in
>> the tests, which can't prompt the user for things like that.
> 
> I guess this means you somehow reuse the same temporary file on
> MS-Windows?  Maybe change the test a bit so that the file name
> definitely differs?

I need to investigate this a bit further, since on GNU/Linux, I get a 
new temp file every time. I think that's the behavior we want, or 
failing that, we could at least kill the temp file's buffer after we're 
done with it. I don't think there's much value in leaving it around.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Mon, 15 Aug 2022 18:22:01 GMT) Full text and rfc822 format available.

Message #50 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Jim Porter <jporterbugs <at> gmail.com>
Cc: larsi <at> gnus.org, 57129 <at> debbugs.gnu.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Mon, 15 Aug 2022 21:21:16 +0300

> Cc: larsi <at> gnus.org, 57129 <at> debbugs.gnu.org
> From: Jim Porter <jporterbugs <at> gmail.com>
> Date: Mon, 15 Aug 2022 11:08:45 -0700
> 
> >> I did notice one weird issue though, and maybe this has something to do
> >> with the problem: on MS Windows, if I run any command with a
> >> $<subcommand>, the second and subsequent times, it warns me about
> >> changing the temp file, e.g.:
> >>
> >>     3Zz80A has changed since visited or saved.  Save anyway? (yes or no)
> >>
> >> This probably shouldn't do that, and it could definitely cause issues in
> >> the tests, which can't prompt the user for things like that.
> > 
> > I guess this means you somehow reuse the same temporary file on
> > MS-Windows?  Maybe change the test a bit so that the file name
> > definitely differs?
> 
> I need to investigate this a bit further, since on GNU/Linux, I get a 
> new temp file every time.

Can you tell how you create these temporary files?  Or point me to the
code which does that?

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Mon, 15 Aug 2022 18:31:01 GMT) Full text and rfc822 format available.

Message #53 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Jim Porter <jporterbugs <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: larsi <at> gnus.org, 57129 <at> debbugs.gnu.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Mon, 15 Aug 2022 11:30:07 -0700

On 8/15/2022 11:21 AM, Eli Zaretskii wrote:
>> Cc: larsi <at> gnus.org, 57129 <at> debbugs.gnu.org
>> From: Jim Porter <jporterbugs <at> gmail.com>
>> Date: Mon, 15 Aug 2022 11:08:45 -0700
>>
>> I need to investigate this a bit further, since on GNU/Linux, I get a
>> new temp file every time.
> 
> Can you tell how you create these temporary files?  Or point me to the
> code which does that?

The temp files are created by Eshell in lisp/eshell/esh-var.el in the 
function 'eshell-parse-variable-ref', specifically in the part starting 
with:

  (eq (char-after) ?\<)

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Mon, 15 Aug 2022 18:59:02 GMT) Full text and rfc822 format available.

Message #56 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Jim Porter <jporterbugs <at> gmail.com>, Paul Eggert <eggert <at> cs.ucla.edu>
Cc: larsi <at> gnus.org, 57129 <at> debbugs.gnu.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Mon, 15 Aug 2022 21:58:28 +0300

> Cc: larsi <at> gnus.org, 57129 <at> debbugs.gnu.org
> From: Jim Porter <jporterbugs <at> gmail.com>
> Date: Mon, 15 Aug 2022 11:30:07 -0700
> 
> >> I need to investigate this a bit further, since on GNU/Linux, I get a
> >> new temp file every time.
> > 
> > Can you tell how you create these temporary files?  Or point me to the
> > code which does that?
> 
> The temp files are created by Eshell in lisp/eshell/esh-var.el in the 
> function 'eshell-parse-variable-ref', specifically in the part starting 
> with:
> 
>    (eq (char-after) ?\<)

Ah, okay.  It's a (mis)feature of Gnulib's gen_tempname function
(which is the guts of make-temp-file) in its implementation for
MS-Windows (and maybe other platforms?): it always begins from the
same "random" characters in the file name, and only generates other
random characters if there's already a file by that name.  So if you
are careful and delete the temporary file each time after usage, and
never need more than one temporary file at the same time, you will get
the same name every call.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Mon, 15 Aug 2022 20:56:02 GMT) Full text and rfc822 format available.

Message #59 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: Jim Porter <jporterbugs <at> gmail.com>, larsi <at> gnus.org,
 Gnulib bugs <bug-gnulib <at> gnu.org>, 57129 <at> debbugs.gnu.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Mon, 15 Aug 2022 13:55:35 -0700

[Message part 1 (text/plain, inline)]

On 8/15/22 11:58, Eli Zaretskii wrote:

> Ah, okay.  It's a (mis)feature of Gnulib's gen_tempname function
> (which is the guts of make-temp-file) in its implementation for
> MS-Windows (and maybe other platforms?): it always begins from the
> same "random" characters in the file name, and only generates other
> random characters if there's already a file by that name.

Not sure I'd call it a misfeature, as gen_tempname is generating a 
uniquely-named file that is exclusive to Emacs, which is all it's 
supposed to do.

I do see a comment saying that gen_tempname generates "hard-to-predict" 
names, which as you note is not correct on MS-DOS, nor even strictly 
speaking on all POSIX platforms. I installed the first attached patch 
into Gnulib to fix that comment.

I'm not sure I'm entirely understanding the Emacs problem, but it 
appears to be that Emacs has its own set of filenames that it thinks it 
knows about, and Emacs wants the new temporary file's name to not be a 
member of that set. If I'm right, does the second attached patch (this 
patch is to Emacs) address the problem? I haven't tested or installed it.

[0001-tempname-remove-incorrect-comment.patch (text/x-patch, attachment)]

[0001-Don-t-create-temp-file-with-same-name-as-visited.patch (text/x-patch, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Mon, 15 Aug 2022 21:23:02 GMT) Full text and rfc822 format available.

Message #62 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Bruno Haible <bruno <at> clisp.org>
To: bug-gnulib <at> gnu.org, Paul Eggert <eggert <at> cs.ucla.edu>
Cc: Jim Porter <jporterbugs <at> gmail.com>, Eli Zaretskii <eliz <at> gnu.org>,
 larsi <at> gnus.org, 57129 <at> debbugs.gnu.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Mon, 15 Aug 2022 23:22:27 +0200

Paul Eggert wrote:
> I do see a comment saying that gen_tempname generates "hard-to-predict" 
> names, which as you note is not correct on MS-DOS, nor even strictly 
> speaking on all POSIX platforms. I installed the first attached patch 
> into Gnulib to fix that comment.

Another comment fix is as below. Note that both comment fixes need to be
propagated to glibc.


2022-08-15  Bruno Haible  <bruno <at> clisp.org>

	tempname: Fix a comment.
	* lib/tempname.c (try_tempname_len): Use of entropy makes the function
	more, not less, secure.

diff --git a/lib/tempname.c b/lib/tempname.c
index 75a939e571..e6520191d7 100644
--- a/lib/tempname.c
+++ b/lib/tempname.c
@@ -273,7 +273,7 @@ try_tempname_len (char *tmpl, int suffixlen, void *args,
   /* Whether to consume entropy when acquiring random bits.  On the
      first try it's worth the entropy cost with __GT_NOCREATE, which
      is inherently insecure and can use the entropy to make it a bit
-     less secure.  On the (rare) second and later attempts it might
+     more secure.  On the (rare) second and later attempts it might
      help against DoS attacks.  */
   bool use_getrandom = tryfunc == try_nocreate;

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Tue, 16 Aug 2022 11:05:02 GMT) Full text and rfc822 format available.

Message #65 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: jporterbugs <at> gmail.com, larsi <at> gnus.org, bug-gnulib <at> gnu.org,
 57129 <at> debbugs.gnu.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Tue, 16 Aug 2022 14:04:10 +0300

> Date: Mon, 15 Aug 2022 13:55:35 -0700
> Cc: larsi <at> gnus.org, 57129 <at> debbugs.gnu.org, Jim Porter
>  <jporterbugs <at> gmail.com>, Gnulib bugs <bug-gnulib <at> gnu.org>
> From: Paul Eggert <eggert <at> cs.ucla.edu>
> 
> On 8/15/22 11:58, Eli Zaretskii wrote:
> 
> > Ah, okay.  It's a (mis)feature of Gnulib's gen_tempname function
> > (which is the guts of make-temp-file) in its implementation for
> > MS-Windows (and maybe other platforms?): it always begins from the
> > same "random" characters in the file name, and only generates other
> > random characters if there's already a file by that name.
> 
> Not sure I'd call it a misfeature

I didn't say "misfeature".  Evidently, for you and perhaps others it's
a feature.

> as gen_tempname is generating a uniquely-named file that is
> exclusive to Emacs, which is all it's supposed to do.

Then perhaps calling it from make-temp-file in Emacs is a mistake,
because that function's doc string says

  Create a temporary file.
  The returned file name (created by appending some random characters at the end
  of PREFIX, and expanding against ‘temporary-file-directory’ if necessary),
  is guaranteed to point to a newly created file.

When you tell someone that the file name will include "some random
characters", they don't expect that they will be the same "random
characters" every call, just like saying that some function generates
random numbers doesn't imply it will be the same number every call.

> I'm not sure I'm entirely understanding the Emacs problem, but it 
> appears to be that Emacs has its own set of filenames that it thinks it 
> knows about, and Emacs wants the new temporary file's name to not be a 
> member of that set. If I'm right, does the second attached patch (this 
> patch is to Emacs) address the problem? I haven't tested or installed it.

I don't think that's the problem, no.  The problem is that callers of
make-temp-file reasonably expect it to return a new name every call.
And evidently, it does that on GNU/Linux (not sure how, given that the
tempname.c code is supposed to be in glibc?).  So if there's no
intention to change gen_tempname on non-glibc platforms so that it
doesn't repeat the same "random characters" upon each call, I think
Emacs should call a different function to generate "random" names of
temporary files, for consistency across platforms if not for other
reasons.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Tue, 16 Aug 2022 13:36:01 GMT) Full text and rfc822 format available.

Message #68 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Bruno Haible <bruno <at> clisp.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>, bug-gnulib <at> gnu.org,
 Eli Zaretskii <eliz <at> gnu.org>
Cc: jporterbugs <at> gmail.com, larsi <at> gnus.org, 57129 <at> debbugs.gnu.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Tue, 16 Aug 2022 15:35:17 +0200

Eli Zaretskii wrote:
> The problem is that callers of
> make-temp-file reasonably expect it to return a new name every call.
> And evidently, it does that on GNU/Linux (not sure how, given that the
> tempname.c code is supposed to be in glibc?).  So if there's no
> intention to change gen_tempname on non-glibc platforms so that it
> doesn't repeat the same "random characters" upon each call, I think
> Emacs should call a different function to generate "random" names of
> temporary files, for consistency across platforms if not for other
> reasons.

You are making incorrect assumptions or hypotheses. I am adding the unit
test below. It proves that (at least) on Linux, FreeBSD 11, NetBSD 7,
OpenBSD 6.0, macOS, AIX 7.1, Solaris 10, Cygwin, and native Windows,
gen_tempname *does* return a new file name on each call, with a very high
probability.

So, gen_tempname does *not* repeat the same "random characters" upon each call.

2022-08-16  Bruno Haible  <bruno <at> clisp.org>

	tempname: Add tests.
	* tests/test-tempname.c: New file.
	* modules/tempname-tests: New file.

=========================== tests/test-tempname.c ==========================
/* Test of creating a temporary file.
   Copyright (C) 2022 Free Software Foundation, Inc.

   This program is free software: you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation, either version 3 of the License, or
   (at your option) any later version.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program.  If not, see <https://www.gnu.org/licenses/>.  */

/* Written by Bruno Haible <bruno <at> clisp.org>, 2022.  */

#include <config.h>

#include "tempname.h"

#include <string.h>
#include <unistd.h>

#include "macros.h"

int
main ()
{
  /* Verify that two consecutive calls to gen_tempname return two different
     file names, with a high probability.  */
  const char *templ18 = "gl-temp-XXXXXX.xyz";
  char filename1[18 + 1];
  char filename2[18 + 1];

  strcpy (filename1, templ18);
  int fd1 = gen_tempname (filename1, strlen (".xyz"), 0, GT_FILE);
  ASSERT (fd1 >= 0);

  strcpy (filename2, templ18);
  int fd2 = gen_tempname (filename2, strlen (".xyz"), 0, GT_FILE);
  ASSERT (fd2 >= 0);

  /* With 6 'X' and a good pseudo-random number generator behind the scenes,
     the probability of getting the same file name twice in a row should be
     1/62^6 < 1/10^10.  */
  ASSERT (strcmp (filename1, filename2) != 0);

  /* Clean up.  */
  close (fd1);
  close (fd2);
  unlink (filename1);
  unlink (filename2);

  return 0;
}
=========================== modules/tempname-tests =========================
Files:
tests/test-tempname.c
tests/macros.h

Depends-on:
unlink

configure.ac:

Makefile.am:
TESTS += test-tempname
check_PROGRAMS += test-tempname
test_tempname_LDADD = $(LDADD) $(LIB_GETRANDOM) $(LIB_CLOCK_GETTIME)
============================================================================

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Tue, 16 Aug 2022 13:52:01 GMT) Full text and rfc822 format available.

Message #71 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Bruno Haible <bruno <at> clisp.org>
Cc: jporterbugs <at> gmail.com, larsi <at> gnus.org, eggert <at> cs.ucla.edu,
 bug-gnulib <at> gnu.org, 57129 <at> debbugs.gnu.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Tue, 16 Aug 2022 16:51:06 +0300

> From: Bruno Haible <bruno <at> clisp.org>
> Cc: larsi <at> gnus.org, 57129 <at> debbugs.gnu.org, jporterbugs <at> gmail.com
> Date: Tue, 16 Aug 2022 15:35:17 +0200
> 
> Eli Zaretskii wrote:
> > The problem is that callers of
> > make-temp-file reasonably expect it to return a new name every call.
> > And evidently, it does that on GNU/Linux (not sure how, given that the
> > tempname.c code is supposed to be in glibc?).  So if there's no
> > intention to change gen_tempname on non-glibc platforms so that it
> > doesn't repeat the same "random characters" upon each call, I think
> > Emacs should call a different function to generate "random" names of
> > temporary files, for consistency across platforms if not for other
> > reasons.
> 
> You are making incorrect assumptions or hypotheses.

They are neither assumptions nor hypotheses.  They are what I actually
saw by stepping with a debugger into the code.

> I am adding the unit test below. It proves that (at least) on Linux,
> FreeBSD 11, NetBSD 7, OpenBSD 6.0, macOS, AIX 7.1, Solaris 10,
> Cygwin, and native Windows, gen_tempname *does* return a new file
> name on each call, with a very high probability.
> 
> So, gen_tempname does *not* repeat the same "random characters" upon each call.

Well, it does here, when called from Emacs in the particular scenario
of the unit test I was looking into.

Looking at your test program, I see that you generate the seconds file
name without deleting the first one.  When a file by the first
generated name already exists, gen_tempname will indeed generate a
different name.  But in the scenario I described, Emacs creates one
temporary file, uses it, then deletes it, and only after that creates
another file.  In terms of your test program, you need to move the
first unlink call to before the second call to gen_tempname.  Then
your test program will faithfully reproduce what Emacs does in the
case in point.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Tue, 16 Aug 2022 14:41:02 GMT) Full text and rfc822 format available.

Message #74 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Bruno Haible <bruno <at> clisp.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: jporterbugs <at> gmail.com, larsi <at> gnus.org, eggert <at> cs.ucla.edu,
 bug-gnulib <at> gnu.org, 57129 <at> debbugs.gnu.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Tue, 16 Aug 2022 16:40:16 +0200

Eli Zaretskii wrote:
> Looking at your test program, I see that you generate the seconds file
> name without deleting the first one.  When a file by the first
> generated name already exists, gen_tempname will indeed generate a
> different name.  But in the scenario I described, Emacs creates one
> temporary file, uses it, then deletes it, and only after that creates
> another file.

Why would it be important for the second temporary file to bear a different
name, once the first one is gone? I mean, process IDs get reused over time;
socket numbers get reused over time; what is wrong with reusing a file name,
once the older file is gone?

Maybe the problem is that the file gets deleted too early, when some parts
of Emacs still reference it?

Bruno

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Tue, 16 Aug 2022 16:26:02 GMT) Full text and rfc822 format available.

Message #77 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Bruno Haible <bruno <at> clisp.org>
Cc: jporterbugs <at> gmail.com, larsi <at> gnus.org, eggert <at> cs.ucla.edu,
 bug-gnulib <at> gnu.org, 57129 <at> debbugs.gnu.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Tue, 16 Aug 2022 19:25:16 +0300

> From: Bruno Haible <bruno <at> clisp.org>
> Cc: eggert <at> cs.ucla.edu, bug-gnulib <at> gnu.org, larsi <at> gnus.org, 57129 <at> debbugs.gnu.org, jporterbugs <at> gmail.com
> Date: Tue, 16 Aug 2022 16:40:16 +0200
> 
> Eli Zaretskii wrote:
> > Looking at your test program, I see that you generate the seconds file
> > name without deleting the first one.  When a file by the first
> > generated name already exists, gen_tempname will indeed generate a
> > different name.  But in the scenario I described, Emacs creates one
> > temporary file, uses it, then deletes it, and only after that creates
> > another file.
> 
> Why would it be important for the second temporary file to bear a different
> name, once the first one is gone? I mean, process IDs get reused over time;
> socket numbers get reused over time; what is wrong with reusing a file name,
> once the older file is gone?

Because the Emacs Lisp program expected that, based on what
make-temp-file in Emacs promises (see the "random characters" part),
and because that's how it behaves on GNU/Linux.  Then the same program
behaved differently on MS-Windows, and the programmer was stumped.

Sure, it's possible to write the same program somewhat differently,
carefully working around this issue, but my point is that such
functions should ideally present the same behavior traits on all
systems, otherwise the portability is hampered.

> Maybe the problem is that the file gets deleted too early, when some parts
> of Emacs still reference it?

The buffer which visited that file still exists, and still records the
name of the file, yes.  And then, when the program writes to another
file created by another call to make-temp-file, it actually writes to
a file that some buffer still visits, and in Emacs that triggers a
prompt to the user to refresh the buffer.  The programmer didn't
expect that because it is natural to expect each call to a function
that creates a temporary file to create a file under a new name, not
reuse the same name.  E.g., similar facilities in the Standard C
library exhibit that behavior.  So the program was written assuming
that the second write was to an entirely different and unrelated file.

And again, my main point is: why does this work differently on
different platforms?  If you consider it OK to return the same file
name each time, provided that no such file exists, then why doesn't
that function do it on GNU/Linux?

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Tue, 16 Aug 2022 16:55:02 GMT) Full text and rfc822 format available.

Message #80 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: jporterbugs <at> gmail.com, larsi <at> gnus.org, bug-gnulib <at> gnu.org,
 57129 <at> debbugs.gnu.org, Bruno Haible <bruno <at> clisp.org>
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Tue, 16 Aug 2022 09:54:25 -0700

On 8/16/22 09:25, Eli Zaretskii wrote:
> The programmer didn't
> expect that because it is natural to expect each call to a function
> that creates a temporary file to create a file under a new name, not
> reuse the same name.

Regardless of whether the programmer expects a random name or a 
predictable-but-unique name, there are only a finite number of names to 
choose from so the programmer must take into account the possibility 
that the chosen name clashes with names that the programmer would prefer 
not to be chosen.

This means an Emacs patch such as [1] is needed regardless of whether 
Gnulib's gen_tempname is changed to be more random than it is. Although 
it's true that the bug fixed by [1] is less likely if gen_tempname 
generates more-random names, the bug can occur no matter what we do 
about gen_tempname.

[1] 
https://debbugs.gnu.org/cgi/bugreport.cgi?filename=0001-Don-t-create-temp-file-with-same-name-as-visited.patch;bug=57129;msg=59;att=2

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Tue, 16 Aug 2022 17:06:01 GMT) Full text and rfc822 format available.

Message #83 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: jporterbugs <at> gmail.com, larsi <at> gnus.org, bug-gnulib <at> gnu.org,
 57129 <at> debbugs.gnu.org, bruno <at> clisp.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Tue, 16 Aug 2022 20:04:49 +0300

> Date: Tue, 16 Aug 2022 09:54:25 -0700
> Cc: bug-gnulib <at> gnu.org, larsi <at> gnus.org, 57129 <at> debbugs.gnu.org,
>  jporterbugs <at> gmail.com, Bruno Haible <bruno <at> clisp.org>
> From: Paul Eggert <eggert <at> cs.ucla.edu>
> 
> On 8/16/22 09:25, Eli Zaretskii wrote:
> > The programmer didn't
> > expect that because it is natural to expect each call to a function
> > that creates a temporary file to create a file under a new name, not
> > reuse the same name.
> 
> Regardless of whether the programmer expects a random name or a 
> predictable-but-unique name, there are only a finite number of names to 
> choose from so the programmer must take into account the possibility 
> that the chosen name clashes with names that the programmer would prefer 
> not to be chosen.

Then what is this comment and the following assertion in Bruno's code
about?

  /* With 6 'X' and a good pseudo-random number generator behind the scenes,
     the probability of getting the same file name twice in a row should be
     1/62^6 < 1/10^10.  */
  ASSERT (strcmp (filename1, filename2) != 0);

That "finite number" of 62^6 sounds pretty close to infinity to me!

> This means an Emacs patch such as [1] is needed regardless of whether 
> Gnulib's gen_tempname is changed to be more random than it is. Although 
> it's true that the bug fixed by [1] is less likely if gen_tempname 
> generates more-random names, the bug can occur no matter what we do 
> about gen_tempname.

No, sorry.  I object to this patch, because it hides the problem under
the carpet.  That there's a file-visiting buffer that records the name
of a temporary file is just one case where this issue gets in the way.
The general problem is still there.  And it isn't an Emacs problem, so
Emacs is not the place to solve it.

Therefore, if there's no intention to fix this in Gnulib, I'm going to
update the documentation of make-temp-file so that Emacs users and
programmers will be informed about that and will be careful enough to
side-step these issues in all the situations.  (Not that I understand
why won't Gnulib provide consistent behavior on all platforms, but I
guess it's above my pay grade.)

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Tue, 16 Aug 2022 17:27:02 GMT) Full text and rfc822 format available.

Message #86 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Eli Zaretskii <eliz <at> gnu.org>, Bruno Haible <bruno <at> clisp.org>
Cc: jporterbugs <at> gmail.com, larsi <at> gnus.org, bug-gnulib <at> gnu.org,
 57129 <at> debbugs.gnu.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Tue, 16 Aug 2022 10:25:57 -0700

[Message part 1 (text/plain, inline)]

On 8/16/22 09:25, Eli Zaretskii wrote:
> why does this work differently on
> different platforms?

Platforms that lack clock_gettime and CLOCK_MONOTONIC fall back on a 
deterministic algorithm. Presumably MS-DOS one of the few such 
platforms, which is why the problem is observed only on MS-DOS.

How about something like the attached patch to Gnulib's lib/tempname.c? 
If I understand things correctly this should make the names random 
enough on MS-DOS, though Emacs itself still needs a patch as I mentioned 
a few minutes ago.

If this patch isn't good enough for MS-DOS, what sort of random bits are 
easily available on that platform? We don't need anything 
cryptographically secure; a higher-res clock should suffice.

[rand.diff (text/x-patch, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Tue, 16 Aug 2022 17:30:02 GMT) Full text and rfc822 format available.

Message #89 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Bruno Haible <bruno <at> clisp.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: jporterbugs <at> gmail.com, larsi <at> gnus.org, eggert <at> cs.ucla.edu,
 bug-gnulib <at> gnu.org, 57129 <at> debbugs.gnu.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Tue, 16 Aug 2022 19:29:33 +0200

Thanks for the explanations, Eli.

Eli Zaretskii wrote:
> > Maybe the problem is that the file gets deleted too early, when some parts
> > of Emacs still reference it?
> 
> The buffer which visited that file still exists, and still records the
> name of the file, yes.  And then, when the program writes to another
> file created by another call to make-temp-file, it actually writes to
> a file that some buffer still visits, and in Emacs that triggers a
> prompt to the user to refresh the buffer.

That is a reasonable behaviour for a text editor — but only for file
names that are explicitly controlled by some program or by the user,
not for temporary files.

> The programmer didn't
> expect that because it is natural to expect each call to a function
> that creates a temporary file to create a file under a new name, not
> reuse the same name.

This is an incorrect assumption. Just like socket numbers are randomly
generated in some situations, temp file names are random, and you can't
make assumptions about the resulting file name.

How about storing, as an attribute of the buffer, a boolean that tells
whether the underlying file name is randomly generated (i.e. a temp file),
and query this boolean before prompting to the user to refresh the buffer?

Bruno

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Tue, 16 Aug 2022 17:32:02 GMT) Full text and rfc822 format available.

Message #92 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: jporterbugs <at> gmail.com, larsi <at> gnus.org, bug-gnulib <at> gnu.org,
 57129 <at> debbugs.gnu.org, bruno <at> clisp.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Tue, 16 Aug 2022 10:30:54 -0700

On 8/16/22 10:04, Eli Zaretskii wrote:
> That "finite number" of 62^6 sounds pretty close to infinity to me!

It's only 57 billion or so. That's not infinity, considering all the 
Emacs sessions out there. If Emacs's success grows, almost surely 
someone will run into this unlikely bug. Since we have the patch to 
prevent it, why not install it?

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Tue, 16 Aug 2022 17:42:02 GMT) Full text and rfc822 format available.

Message #95 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: eggert <at> cs.ucla.edu, bruno <at> clisp.org
Cc: jporterbugs <at> gmail.com, larsi <at> gnus.org, bug-gnulib <at> gnu.org,
 57129 <at> debbugs.gnu.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Tue, 16 Aug 2022 20:41:06 +0300

> Cc: jporterbugs <at> gmail.com, larsi <at> gnus.org, bug-gnulib <at> gnu.org,
>  57129 <at> debbugs.gnu.org, bruno <at> clisp.org
> Date: Tue, 16 Aug 2022 20:04:49 +0300
> From: Eli Zaretskii <eliz <at> gnu.org>
> 
> Therefore, if there's no intention to fix this in Gnulib, I'm going to
> update the documentation of make-temp-file so that Emacs users and
> programmers will be informed about that and will be careful enough to
> side-step these issues in all the situations.  (Not that I understand
> why won't Gnulib provide consistent behavior on all platforms, but I
> guess it's above my pay grade.)

And btw, looking closer, I see that this is a regression in Emacs 28,
which was caused by a change in Gnulib: the versions of tempname.c
that we used up to and including Emacs 27.2 used gettimeofday and the
PID to generate the file name, so it was more random than the code in
current Gnulib, and in particular the sequence of

   generate file-name1
   delete file-name1
   generate file-name2

would produce file-name2 that is different from file-name1.  With the
current code in Gnulib, something similar happens only on glibc
systems.  So I hope the code in Gnulib's tempname.c will be amended to
call clock_gettime or something similar, so that the names on
non-glibc platforms could also be random.  Or, failing that, at least
that gen_tempname in glibc would behave identically to other systems,
i.e. start from a fixed "random" value.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Tue, 16 Aug 2022 17:48:02 GMT) Full text and rfc822 format available.

Message #98 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: bug-gnulib <at> gnu.org, larsi <at> gnus.org, bruno <at> clisp.org, 57129 <at> debbugs.gnu.org,
 jporterbugs <at> gmail.com
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Tue, 16 Aug 2022 20:47:10 +0300

> Date: Tue, 16 Aug 2022 10:25:57 -0700
> Cc: bug-gnulib <at> gnu.org, larsi <at> gnus.org, 57129 <at> debbugs.gnu.org,
>  jporterbugs <at> gmail.com
> From: Paul Eggert <eggert <at> cs.ucla.edu>
> 
> On 8/16/22 09:25, Eli Zaretskii wrote:
> > why does this work differently on
> > different platforms?
> 
> Platforms that lack clock_gettime and CLOCK_MONOTONIC fall back on a 
> deterministic algorithm. Presumably MS-DOS one of the few such 
> platforms, which is why the problem is observed only on MS-DOS.

(Why are you talking about MS-DOS?)

MinGW does have clock_gettime, but we don't want to use it, because
one of the MinGW flavors (MinGW64) implements that function in
libwinpthreads, and that makes Emacs depend on libwinpthreads DLL,
which we want to avoid.

> How about something like the attached patch to Gnulib's lib/tempname.c? 

Thanks, but why not use 'random' instead?  Emacs does have it on all
platforms, including MS-Windows.  AFAIU, it's better than 'rand'.

> If I understand things correctly this should make the names random 
> enough on MS-DOS, though Emacs itself still needs a patch as I mentioned 
> a few minutes ago.

Why would Emacs need that patch?

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Tue, 16 Aug 2022 17:54:01 GMT) Full text and rfc822 format available.

Message #101 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Bruno Haible <bruno <at> clisp.org>
Cc: jporterbugs <at> gmail.com, larsi <at> gnus.org, eggert <at> cs.ucla.edu,
 bug-gnulib <at> gnu.org, 57129 <at> debbugs.gnu.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Tue, 16 Aug 2022 20:52:57 +0300

> From: Bruno Haible <bruno <at> clisp.org>
> Cc: eggert <at> cs.ucla.edu, bug-gnulib <at> gnu.org, larsi <at> gnus.org, 57129 <at> debbugs.gnu.org, jporterbugs <at> gmail.com
> Date: Tue, 16 Aug 2022 19:29:33 +0200
> 
> > The buffer which visited that file still exists, and still records the
> > name of the file, yes.  And then, when the program writes to another
> > file created by another call to make-temp-file, it actually writes to
> > a file that some buffer still visits, and in Emacs that triggers a
> > prompt to the user to refresh the buffer.
> 
> That is a reasonable behaviour for a text editor — but only for file
> names that are explicitly controlled by some program or by the user,
> not for temporary files.

Why not for temporary files?

Emacs has a with-temp-file macro, which generates a temporary file
name, executes the body of the macro with a variable bound to the file
name, then deletes the file.  Very handy for writing test suites.  We
don't usually bother deleting the buffers where the file was processed
in those tests, because the test suite runs in batch mode, and Emacs
exits after running the tests, thus cleaning up.

Having to carefully make sure no such buffers were left from the
previous test is a nuisance.

> > The programmer didn't
> > expect that because it is natural to expect each call to a function
> > that creates a temporary file to create a file under a new name, not
> > reuse the same name.
> 
> This is an incorrect assumption. Just like socket numbers are randomly
> generated in some situations, temp file names are random, and you can't
> make assumptions about the resulting file name.

People get their ideas from other similar APIs, and functions like
tmpfile in C do generate a different name each call.

> How about storing, as an attribute of the buffer, a boolean that tells
> whether the underlying file name is randomly generated (i.e. a temp file),
> and query this boolean before prompting to the user to refresh the buffer?

That's a terrible complications, especially since the solution is so
close at hand, and it makes gen_tempname behave on Windows as it does
on GNU/Linux, and as it behaved just a year or two ago.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Tue, 16 Aug 2022 17:57:02 GMT) Full text and rfc822 format available.

Message #104 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: jporterbugs <at> gmail.com, larsi <at> gnus.org, bug-gnulib <at> gnu.org,
 57129 <at> debbugs.gnu.org, bruno <at> clisp.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Tue, 16 Aug 2022 20:56:10 +0300

> Date: Tue, 16 Aug 2022 10:30:54 -0700
> Cc: bug-gnulib <at> gnu.org, larsi <at> gnus.org, 57129 <at> debbugs.gnu.org,
>  jporterbugs <at> gmail.com, bruno <at> clisp.org
> From: Paul Eggert <eggert <at> cs.ucla.edu>
> 
> On 8/16/22 10:04, Eli Zaretskii wrote:
> > That "finite number" of 62^6 sounds pretty close to infinity to me!
> 
> It's only 57 billion or so. That's not infinity

Yeah, right.

> Since we have the patch to prevent it, why not install it?

Because it isn't needed, and is a weird thing to do.  It also slows
down make-temp-file, especially when a session has a lot of buffers
and with remote file names.  So thanks, but no, thanks.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Tue, 16 Aug 2022 19:12:01 GMT) Full text and rfc822 format available.

Message #107 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: bug-gnulib <at> gnu.org, larsi <at> gnus.org, bruno <at> clisp.org, 57129 <at> debbugs.gnu.org,
 jporterbugs <at> gmail.com
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Tue, 16 Aug 2022 12:11:24 -0700

[Message part 1 (text/plain, inline)]

On 8/16/22 10:47, Eli Zaretskii wrote:

> (Why are you talking about MS-DOS?)

I mistakenly thought it was an MS-DOS problem because tempname.c 
ordinarily would use clock_gettime on MinGW. I didn't know Emacs 
'configure' deliberately suppresses MinGW's clock_gettime.

> Thanks, but why not use 'random' instead?  Emacs does have it on all
> platforms, including MS-Windows.  AFAIU, it's better than 'rand'.

If the code used 'random' then the Gnulib 'tempname' module would need 
to add a dependency on the Gnulib 'random' module, which would in turn 
add a cascading dependency on Gnulib's 'random_r' module. It's better to 
avoid this dependency if we can easily do so.

Come to think of it, we don't need to use 'rand' either, since 
tempname.c already has a good-enough pseudorandom generator. I installed 
into Gnulib the attached patch, which I hope fixes the Emacs problem 
without changing glibc's generated code (once this gets migrated back 
into glibc).

>> If I understand things correctly this should make the names random
>> enough on MS-DOS, though Emacs itself still needs a patch as I mentioned
>> a few minutes ago.
> 
> Why would Emacs need that patch?

In another part of this thread you rejected that patch, on the grounds 
that fixing the unlikely Emacs bug is more trouble than it's worth. So 
I'll drop that suggestion.

[0001-tempname-generate-better-names-for-MinGW-Emacs.patch (text/x-patch, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Tue, 16 Aug 2022 20:00:02 GMT) Full text and rfc822 format available.

Message #110 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Bruno Haible <bruno <at> clisp.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: jporterbugs <at> gmail.com, larsi <at> gnus.org, eggert <at> cs.ucla.edu,
 bug-gnulib <at> gnu.org, 57129 <at> debbugs.gnu.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Tue, 16 Aug 2022 21:59:10 +0200

[Message part 1 (text/plain, inline)]

Eli Zaretskii wrote:
> Looking at your test program, I see that you generate the seconds file
> name without deleting the first one.  When a file by the first
> generated name already exists, gen_tempname will indeed generate a
> different name.  But in the scenario I described, Emacs creates one
> temporary file, uses it, then deletes it, and only after that creates
> another file.

I'm adding this scenario to the unit test. Still, the unit test succeeds
when run once on:
  Linux, FreeBSD, NetBSD, OpenBSD, macOS, Solaris, Cygwin, native Windows.

Since this contradicts what you debugged on mingw, I ran the test 10000
times on native Windows. Result:
  - on 32-bit mingw, no failure.
  - on 64-bit mingw, around 30 failures (or around 10 failures with Paul's
    newest patch). That is, a probability of ca. 0.3% of getting the same
    file name as on the previous call.

Bruno

[0001-tempname-Add-more-tests.patch (text/x-patch, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Tue, 16 Aug 2022 20:07:02 GMT) Full text and rfc822 format available.

Message #113 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Bruno Haible <bruno <at> clisp.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: jporterbugs <at> gmail.com, larsi <at> gnus.org, eggert <at> cs.ucla.edu,
 bug-gnulib <at> gnu.org, 57129 <at> debbugs.gnu.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Tue, 16 Aug 2022 22:06:13 +0200

Eli Zaretskii wrote:
> Emacs has a with-temp-file macro, which generates a temporary file
> name, executes the body of the macro with a variable bound to the file
> name, then deletes the file.  Very handy for writing test suites.

Since, even with Paul's newest patch, gen_tempname on 64-bit mingw
produces the same file name as on the previous call with a probability
of about 0.1%, you still need to be prepared to see the original problem
occasionally.

Bruno

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Tue, 16 Aug 2022 20:13:01 GMT) Full text and rfc822 format available.

Message #116 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Bruno Haible <bruno <at> clisp.org>
To: Eli Zaretskii <eliz <at> gnu.org>, Paul Eggert <eggert <at> cs.ucla.edu>
Cc: jporterbugs <at> gmail.com, larsi <at> gnus.org, bug-gnulib <at> gnu.org,
 57129 <at> debbugs.gnu.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Tue, 16 Aug 2022 22:12:51 +0200

Paul Eggert wrote:
> Emacs 'configure' deliberately suppresses MinGW's clock_gettime.

Ah, this explains why Eli saw the problem in Emacs in a deterministic
manner, whereas I see it only with 0.3% probability in my tests on 64-bit
mingw.

> I installed 
> into Gnulib the attached patch, which I hope fixes the Emacs problem 
> without changing glibc's generated code (once this gets migrated back 
> into glibc).

In 10000 runs on 64-bit mingw, your patch went from 27 test failures
to 11 test failures. So, we're still at ca. 0.1% probability (*), far
above the 62^-6 or 10^-10 probability that one would have hoped for.

(*) I wouldn't consider these measurements reliable. Maybe the probability
did not change, and what I'm seeing is merely a measurement fluke.

Bruno

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Tue, 16 Aug 2022 20:54:01 GMT) Full text and rfc822 format available.

Message #119 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Bruno Haible <bruno <at> clisp.org>, Eli Zaretskii <eliz <at> gnu.org>
Cc: jporterbugs <at> gmail.com, larsi <at> gnus.org, bug-gnulib <at> gnu.org,
 57129 <at> debbugs.gnu.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Tue, 16 Aug 2022 13:53:16 -0700

On 8/16/22 12:59, Bruno Haible wrote:
> Since this contradicts what you debugged on mingw, I ran the test 10000
> times on native Windows. Result:
>    - on 32-bit mingw, no failure.
>    - on 64-bit mingw, around 30 failures (or around 10 failures with Paul's
>      newest patch). That is, a probability of ca. 0.3% of getting the same
>      file name as on the previous call.

That's odd, for two reasons. First, 64-bit and 32-bit mingw shouldn't 
differ, as they both use uint_fast64_t which should be the same width on 
both platforms. Second, I could not reproduce the problem on 64-bit 
Ubuntu x86-64 (after altering tempname.c to always define 
HAS_CLOCK_ENTROPY to false) test-tempname always succeeded in 10000 tries.

Could you investigate further why mingw 64-bit fails?

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Wed, 17 Aug 2022 11:10:01 GMT) Full text and rfc822 format available.

Message #122 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: bug-gnulib <at> gnu.org, larsi <at> gnus.org, bruno <at> clisp.org, 57129 <at> debbugs.gnu.org,
 jporterbugs <at> gmail.com
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Wed, 17 Aug 2022 14:08:44 +0300

> Date: Tue, 16 Aug 2022 12:11:24 -0700
> Cc: bruno <at> clisp.org, bug-gnulib <at> gnu.org, larsi <at> gnus.org,
>  57129 <at> debbugs.gnu.org, jporterbugs <at> gmail.com
> From: Paul Eggert <eggert <at> cs.ucla.edu>
> 
> > Thanks, but why not use 'random' instead?  Emacs does have it on all
> > platforms, including MS-Windows.  AFAIU, it's better than 'rand'.
> 
> If the code used 'random' then the Gnulib 'tempname' module would need 
> to add a dependency on the Gnulib 'random' module, which would in turn 
> add a cascading dependency on Gnulib's 'random_r' module. It's better to 
> avoid this dependency if we can easily do so.
> 
> Come to think of it, we don't need to use 'rand' either, since 
> tempname.c already has a good-enough pseudorandom generator. I installed 
> into Gnulib the attached patch, which I hope fixes the Emacs problem 
> without changing glibc's generated code (once this gets migrated back 
> into glibc).

Thanks.

Would it be possible to cherry-pick this to the emacs-28 branch, so
that Emacs 28.2 would have this fixed?

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Wed, 17 Aug 2022 11:31:02 GMT) Full text and rfc822 format available.

Message #125 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Bruno Haible <bruno <at> clisp.org>
Cc: jporterbugs <at> gmail.com, larsi <at> gnus.org, eggert <at> cs.ucla.edu,
 bug-gnulib <at> gnu.org, 57129 <at> debbugs.gnu.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Wed, 17 Aug 2022 14:29:49 +0300

> From: Bruno Haible <bruno <at> clisp.org>
> Cc: eggert <at> cs.ucla.edu, bug-gnulib <at> gnu.org, larsi <at> gnus.org, 57129 <at> debbugs.gnu.org, jporterbugs <at> gmail.com
> Date: Tue, 16 Aug 2022 22:06:13 +0200
> 
> Eli Zaretskii wrote:
> > Emacs has a with-temp-file macro, which generates a temporary file
> > name, executes the body of the macro with a variable bound to the file
> > name, then deletes the file.  Very handy for writing test suites.
> 
> Since, even with Paul's newest patch, gen_tempname on 64-bit mingw
> produces the same file name as on the previous call with a probability
> of about 0.1%, you still need to be prepared to see the original problem
> occasionally.

I'll take that risk, thank you.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Thu, 18 Aug 2022 06:06:02 GMT) Full text and rfc822 format available.

Message #128 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: bug-gnulib <at> gnu.org, larsi <at> gnus.org, bruno <at> clisp.org, 57129 <at> debbugs.gnu.org,
 jporterbugs <at> gmail.com
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Wed, 17 Aug 2022 23:05:33 -0700

On 8/17/22 04:08, Eli Zaretskii wrote:
> Would it be possible to cherry-pick this to the emacs-28 branch, so
> that Emacs 28.2 would have this fixed?

I installed that.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Thu, 18 Aug 2022 06:46:01 GMT) Full text and rfc822 format available.

Message #131 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: bug-gnulib <at> gnu.org, larsi <at> gnus.org, bruno <at> clisp.org, 57129 <at> debbugs.gnu.org,
 jporterbugs <at> gmail.com
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Thu, 18 Aug 2022 09:44:48 +0300

> Date: Wed, 17 Aug 2022 23:05:33 -0700
> Cc: bruno <at> clisp.org, bug-gnulib <at> gnu.org, larsi <at> gnus.org,
>  57129 <at> debbugs.gnu.org, jporterbugs <at> gmail.com
> From: Paul Eggert <eggert <at> cs.ucla.edu>
> 
> On 8/17/22 04:08, Eli Zaretskii wrote:
> > Would it be possible to cherry-pick this to the emacs-28 branch, so
> > that Emacs 28.2 would have this fixed?
> 
> I installed that.

Thanks!

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Sat, 20 Aug 2022 18:04:01 GMT) Full text and rfc822 format available.

Message #134 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Jim Porter <jporterbugs <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>, Paul Eggert <eggert <at> cs.ucla.edu>
Cc: larsi <at> gnus.org, 57129 <at> debbugs.gnu.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Sat, 20 Aug 2022 11:03:27 -0700

[Message part 1 (text/plain, inline)]

On 8/15/2022 11:58 AM, Eli Zaretskii wrote:
>> Cc: larsi <at> gnus.org, 57129 <at> debbugs.gnu.org
>> From: Jim Porter <jporterbugs <at> gmail.com>
>> Date: Mon, 15 Aug 2022 11:30:07 -0700
>>
>> The temp files are created by Eshell in lisp/eshell/esh-var.el in the
>> function 'eshell-parse-variable-ref', specifically in the part starting
>> with:
>>
>>     (eq (char-after) ?\<)
> 
> Ah, okay.  It's a (mis)feature of Gnulib's gen_tempname function
> (which is the guts of make-temp-file) in its implementation for
> MS-Windows (and maybe other platforms?): it always begins from the
> same "random" characters in the file name, and only generates other
> random characters if there's already a file by that name.  So if you
> are careful and delete the temporary file each time after usage, and
> never need more than one temporary file at the same time, you will get
> the same name every call.

In addition to the changes to temporary file name generation, I think it 
would be useful for Eshell to kill the temporary buffer too. If you use 
this feature in Eshell a lot, the temporary buffers could pile up, 
consuming memory for no real benefit. (A user who wanted the buffer to 
stick around would probably redirect to a non-temporary file, or even 
just to a buffer.) Attached is a patch to do this.

[0001-Kill-the-buffer-associated-with-a-temp-file-when-usi.patch (text/plain, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Sat, 20 Aug 2022 18:15:01 GMT) Full text and rfc822 format available.

Message #137 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Jim Porter <jporterbugs <at> gmail.com>
Cc: larsi <at> gnus.org, eggert <at> cs.ucla.edu, 57129 <at> debbugs.gnu.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Sat, 20 Aug 2022 21:14:07 +0300

> Cc: larsi <at> gnus.org, 57129 <at> debbugs.gnu.org
> From: Jim Porter <jporterbugs <at> gmail.com>
> Date: Sat, 20 Aug 2022 11:03:27 -0700
> 
> In addition to the changes to temporary file name generation, I think it 
> would be useful for Eshell to kill the temporary buffer too. If you use 
> this feature in Eshell a lot, the temporary buffers could pile up, 
> consuming memory for no real benefit. (A user who wanted the buffer to 
> stick around would probably redirect to a non-temporary file, or even 
> just to a buffer.) Attached is a patch to do this.

That makes sense, of course, but why not use with-temp-buffer in the
first place?

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Sat, 20 Aug 2022 18:50:01 GMT) Full text and rfc822 format available.

Message #140 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Jim Porter <jporterbugs <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: larsi <at> gnus.org, eggert <at> cs.ucla.edu, 57129 <at> debbugs.gnu.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Sat, 20 Aug 2022 11:49:34 -0700

On 8/20/2022 11:14 AM, Eli Zaretskii wrote:
>> Cc: larsi <at> gnus.org, 57129 <at> debbugs.gnu.org
>> From: Jim Porter <jporterbugs <at> gmail.com>
>> Date: Sat, 20 Aug 2022 11:03:27 -0700
>>
>> In addition to the changes to temporary file name generation, I think it
>> would be useful for Eshell to kill the temporary buffer too. If you use
>> this feature in Eshell a lot, the temporary buffers could pile up,
>> consuming memory for no real benefit. (A user who wanted the buffer to
>> stick around would probably redirect to a non-temporary file, or even
>> just to a buffer.) Attached is a patch to do this.
> 
> That makes sense, of course, but why not use with-temp-buffer in the
> first place?

I'm not sure that would work in this case. 'with-temp-buffer' (or 
probably 'with-temp-file', since we want a real file) would clean 
everything up once BODY has executed, but we want to perform cleanup in 
a hook, which might execute quite a while later (e.g. if passing this 
temp file to an external process).

If there's a way to use that though, I'm all for it.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Sun, 21 Aug 2022 16:21:01 GMT) Full text and rfc822 format available.

Message #143 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Bruno Haible <bruno <at> clisp.org>
To: Eli Zaretskii <eliz <at> gnu.org>, Paul Eggert <eggert <at> cs.ucla.edu>
Cc: jporterbugs <at> gmail.com, larsi <at> gnus.org, bug-gnulib <at> gnu.org,
 57129 <at> debbugs.gnu.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Sun, 21 Aug 2022 18:20:52 +0200

[Message part 1 (text/plain, inline)]

Paul Eggert wrote:
> > Since this contradicts what you debugged on mingw, I ran the test 10000
> > times on native Windows. Result:
> >    - on 32-bit mingw, no failure.
> >    - on 64-bit mingw, around 30 failures (or around 10 failures with Paul's
> >      newest patch). That is, a probability of ca. 0.3% of getting the same
> >      file name as on the previous call.
> 
> That's odd, for two reasons. First, 64-bit and 32-bit mingw shouldn't 
> differ, as they both use uint_fast64_t which should be the same width on 
> both platforms. Second, I could not reproduce the problem on 64-bit 
> Ubuntu x86-64 (after altering tempname.c to always define 
> HAS_CLOCK_ENTROPY to false) test-tempname always succeeded in 10000 tries.
> 
> Could you investigate further why mingw 64-bit fails?

That's odd indeed, and here are more detailed results.

I changed test-tempname.c to skip the case 1 and execute only case 2, and
added printf statements per the attached tempname.c.diff. Then ran
$ for i in `seq 1000`; do ./test-tempname.exe; done 2>&1 | tee out1000

In 32-bit mingw, the result is fully deterministic: each run behaves the same.
The first file name is always gl-temp-3FXzHa.xyz;
the second file name is always gl-temp-HgtmVy.xyz.

Thus, for a single Emacs process, things will look fine, but as soon as someone
starts to use temporary files in two different Emacs processes, in the way Eli
described, there will be massive collisions.

In 64-bit mingw, the 'tempname 1' value is deterministic. This simply shows
that Windows 10 does not use address space randomization (ASLR).
The 'tempname 2' value is unique 99% of the time:
  $ grep 'tempname 2' out1000-mingw64 | sort | uniq -d | wc -l
  8
The interesting thing is that each of the duplicate values of v is occurring
in the same process:
  $ grep 'tempname 2' out1000-mingw64 | sort | uniq -d
  tempname 2 v=0x00c1efa91fb60900
  tempname 2 v=0x32ccb2946974f2f6
  tempname 2 v=0x5cafcc69e359a147
  tempname 2 v=0x6d0676018e27d771
  tempname 2 v=0x6d95bd6083168079
  tempname 2 v=0x7cb95116ffae8ece
  tempname 2 v=0xe0afc7086808ce33
  tempname 2 v=0xe46a60c28fb0ec7f
  $ grep 'tempname 2' out1000-mingw64 | grep -n 0x00c1efa91fb60900
  560:tempname 2 v=0x00c1efa91fb60900
  561:tempname 2 v=0x00c1efa91fb60900
  $ grep 'tempname 2' out1000-mingw64 | grep -n 0x32ccb2946974f2f6
  1129:tempname 2 v=0x32ccb2946974f2f6
  1130:tempname 2 v=0x32ccb2946974f2f6
  etc.

So, in this environment, different Emacs processes will not conflict. But
within a single Emacs process, with 1% probability, the two resulting file
names are the same.

Bruno

[tempname.c.diff (text/x-patch, attachment)]

[out1000-mingw32.gz (application/gzip, attachment)]

[out1000-mingw64.gz (application/gzip, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Sun, 21 Aug 2022 16:33:02 GMT) Full text and rfc822 format available.

Message #146 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Bruno Haible <bruno <at> clisp.org>
Cc: jporterbugs <at> gmail.com, larsi <at> gnus.org, eggert <at> cs.ucla.edu,
 bug-gnulib <at> gnu.org, 57129 <at> debbugs.gnu.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Sun, 21 Aug 2022 19:32:09 +0300

> From: Bruno Haible <bruno <at> clisp.org>
> Cc: bug-gnulib <at> gnu.org, larsi <at> gnus.org, 57129 <at> debbugs.gnu.org, jporterbugs <at> gmail.com
> Date: Sun, 21 Aug 2022 18:20:52 +0200
> 
> I changed test-tempname.c to skip the case 1 and execute only case 2, and
> added printf statements per the attached tempname.c.diff. Then ran
> $ for i in `seq 1000`; do ./test-tempname.exe; done 2>&1 | tee out1000
> 
> In 32-bit mingw, the result is fully deterministic: each run behaves the same.
> The first file name is always gl-temp-3FXzHa.xyz;
> the second file name is always gl-temp-HgtmVy.xyz.
> 
> Thus, for a single Emacs process, things will look fine, but as soon as someone
> starts to use temporary files in two different Emacs processes, in the way Eli
> described, there will be massive collisions.
> 
> In 64-bit mingw, the 'tempname 1' value is deterministic. This simply shows
> that Windows 10 does not use address space randomization (ASLR).

Some of these problems could be alleviated if the initial value of 'v'
depended on something non-deterministic, like the program's PID and/or
the current wallclock time (no need to use anything as fancy as
clock_gettime, a simple call to 'clock' should be enough).

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Sun, 21 Aug 2022 17:18:01 GMT) Full text and rfc822 format available.

Message #149 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Bruno Haible <bruno <at> clisp.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: jporterbugs <at> gmail.com, Eli Zaretskii <eliz <at> gnu.org>, bug-gnulib <at> gnu.org,
 larsi <at> gnus.org, 57129 <at> debbugs.gnu.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Sun, 21 Aug 2022 19:17:08 +0200

Paul Eggert asked:
> > Could you investigate further why mingw 64-bit fails?

Some words on the "why". You seem to have expectations regarding the
distribution quality of the resulting file names, but these expectations
are not warranted.

Donald E. Knuth wrote in TAOCP vol 2 "Seminumerical algorithms" that
an arbitrary sequence of number manipulating operations usually has a
bad quality as a pseudo-random number generator, and that in order
to get good quality pseudo-random numbers, one needs to use *specific*
pseudo-random number generators and then *prove* mathematical properties
of it. (The same is true, btw, for crypto algorithms.)
Then he started discussing linear congruential generators [1] as the
simplest example for which something could be proved.

The current source code of tempname.c generates the pseudo-random numbers
— assuming no HAS_CLOCK_ENTROPY and no ASLR — using a mix of three
operations:
  (A) A linear congruential generator [2] with m = 2^64,
      a = 2862933555777941757, c = 3037000493.
  (B) A floor operation: v ← floor(v / 62^6)
  (C) A xor operation: v ^= prev_v

There are three different questions:
(Q1) What is the expected quality inside a single gen_tempname call?
(Q2) What is the expected quality of multiple gen_tempname calls in a
     single process?
(Q3) What is the expected quality when considering different processes
     that call gen_tempname?

Answers:
(Q1) For just 6 'X'es, there is essentially a single (A) operation.
     Therefore the quality will be good.
     If someone uses more than 10 'X'es, for example, 50 'X'es, there
     will be 5 (A) and 5 (B), interleaved: (A), (B), (A), (B), ...
     This is *not* a linear congruential generator, therefore the
     expected quality is BAD.
     In order to fix this case, what I would do is to get back to
     a linear congruential generator: (A), (A), ..., (A), (B).
     In other words, feed into (A) exactly the output from the
     previous (A). This means, do the statements
          XXXXXX[i] = letters[v % 62];
          v /= 62;
     not on v itself, but on a copy of v.
     But wait, there is also the desire to have predictability!
     This means to not consume all the possible bits the
     random_bits() call, but only part of it.
     What I would do here, is to reduce BASE_62_DIGITS from 10 to 6.
     So that in each round of the loop, 6 base-62 digits are consumed
     and more than 4 base-62 digits are left in v, for predictability.
     In the usual calls with 6 'X'es the loop will still end after a
     single round.

(Q2) First of all, the multiple gen_tempname calls can occur in
     different threads. Since no locking is involved, it is undefined
     behaviour to access the 'prev_v' variable from different threads.
     On machines with an IA-64 CPU, the 'prev_v' variable's value may
     not be propagated from one thread to the other. [3][4][5]
     The fix is simple, though: Mark 'prev_v' as 'volatile'.

     Then, what the code does, is a mix of (A), (B), (C). Again, this
     is *not* a linear congruential generator, therefore the expected
     quality is BAD.

     To get good quality, I would suggest to use a linear congruential
     generator across *all* gen_tempname calls of the entire thread.
     This means:
       - Move out the (B) invocations out, like explained above in (Q1).
       - Remove the (C) code that you added last week.
       - Store the v value in a per-thread variable. Using '__thread'
         on glibc systems and a TLS variable (#include "glthread/tls.h")
         on the other platforms.

(Q3) Here, to force differences between different processes, I would
     suggest to use a fine-grained clock value. In terms of platforms,
       #if defined CLOCK_MONOTONIC && HAVE_CLOCK_GETTIME
     is way too restrictive.
     How about
       - using CLOCK_REALTIME when CLOCK_MONOTONIC is not available,
       - using gettimeofday() as fallback, especially on native Windows.

If one does (Q3), then the suggestions for (Q2) (other than the 'volatile')
may not be needed.

Bruno

[1] https://en.wikipedia.org/wiki/Linear_congruential_generator
[2] https://en.wikipedia.org/wiki/Linear_congruential_generator#c_%E2%89%A0_0
[3] https://es.cs.uni-kl.de/publications/datarsg/Geor16.pdf
[4] https://db.in.tum.de/teaching/ws1718/dataprocessing/chapter3.pdf?lang=de
    page 18
[5] https://os.inf.tu-dresden.de/Studium/DOS/SS2014/04-Memory-Consistency.pdf

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Mon, 22 Aug 2022 20:48:01 GMT) Full text and rfc822 format available.

Message #152 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Bruno Haible <bruno <at> clisp.org>
Cc: jporterbugs <at> gmail.com, Eli Zaretskii <eliz <at> gnu.org>, bug-gnulib <at> gnu.org,
 larsi <at> gnus.org, 57129 <at> debbugs.gnu.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Mon, 22 Aug 2022 15:47:28 -0500

[Message part 1 (text/plain, inline)]

Thanks for the detailed diagnosis, Bruno. To try to fix the problems I 
installed the attached patches into Gnulib. If I understand things 
correctly, these patches should fix the 0.1% failure rate you observed 
on 64-bit mingw. They also fix a minor security leak I discovered: in 
rare cases, ASLR entropy was used to generate publicly visible file 
names, which is a no-no as that might help an attacker infer the 
randomized layout of a victim process.

These fixes follow some but not all the suggestions you made. The basic 
problem I saw was that tempname.c was using too much belt-and-suspenders 
code, so much so that the combination of belts and suspenders 
misbehaved. I simplified it a bit and this removed the need for some of 
the suggestions.

These fixes should be merged into glibc upstream since they fix glibc 
bugs; I plan to follow up on that shortly.

[0001-tempname-merge-64-bit-time_t-fix-from-glibc.patch (text/x-patch, attachment)]

[0002-tempname-fix-multithreading-ASLR-leak-etc.patch (text/x-patch, attachment)]

[0003-tempname-don-t-lose-entropy-in-seed.patch (text/x-patch, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Tue, 23 Aug 2022 00:14:02 GMT) Full text and rfc822 format available.

Message #155 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Bruno Haible <bruno <at> clisp.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: jporterbugs <at> gmail.com, Eli Zaretskii <eliz <at> gnu.org>, bug-gnulib <at> gnu.org,
 larsi <at> gnus.org, 57129 <at> debbugs.gnu.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Tue, 23 Aug 2022 02:13:02 +0200

Paul Eggert wrote:
> Thanks for the detailed diagnosis, Bruno. To try to fix the problems I 
> installed the attached patches into Gnulib.

These look all good. Here's again my evaluation of the three scenarios:

(Q1) What is the expected quality inside a single gen_tempname call?

It depends on quality of random_bits(). Usually on the quality of
getrandom(), which is good on most systems. And for the other systems,
the "ersatz" in random_bits() gives reasonable quality.

(Q2) What is the expected quality of multiple gen_tempname calls in a
     single process?
(Q3) What is the expected quality when considering different processes
     that call gen_tempname?

Both have the same answer: The file name essentially depends on the
first call to random_bits(). Two different calls to random_bits()
can be correlated only if
  - getrandom() is not well supported, and
  - CLOCK_REALTIME is not defined, and
  - we're in the same process, less than 0.01 sec apart.
IMO, that's good enough.

> If I understand things 
> correctly, these patches should fix the 0.1% failure rate you observed 
> on 64-bit mingw.

Yes. Running the "case 2" part 1000 times again, among the 2000 generated
file names, there are no duplicates — neither on 32-bit mingw, nor on 64-bit
mingw.

> They also fix a minor security leak I discovered: in 
> rare cases, ASLR entropy was used to generate publicly visible file 
> names, which is a no-no as that might help an attacker infer the 
> randomized layout of a victim process.

Excellent observation :-)

> These fixes follow some but not all the suggestions you made. The basic 
> problem I saw was that tempname.c was using too much belt-and-suspenders 
> code, so much so that the combination of belts and suspenders 
> misbehaved. I simplified it a bit and this removed the need for some of 
> the suggestions.

Thanks. The major quality boost comes from invoking getrandom() always.

Bruno

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Tue, 23 Aug 2022 11:19:02 GMT) Full text and rfc822 format available.

Message #158 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Bruno Haible <bruno <at> clisp.org>
Cc: jporterbugs <at> gmail.com, larsi <at> gnus.org, eggert <at> cs.ucla.edu,
 bug-gnulib <at> gnu.org, 57129 <at> debbugs.gnu.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Tue, 23 Aug 2022 14:18:08 +0300

> From: Bruno Haible <bruno <at> clisp.org>
> Cc: Eli Zaretskii <eliz <at> gnu.org>, bug-gnulib <at> gnu.org, larsi <at> gnus.org, 57129 <at> debbugs.gnu.org, jporterbugs <at> gmail.com
> Date: Tue, 23 Aug 2022 02:13:02 +0200
> 
> > If I understand things 
> > correctly, these patches should fix the 0.1% failure rate you observed 
> > on 64-bit mingw.
> 
> Yes. Running the "case 2" part 1000 times again, among the 2000 generated
> file names, there are no duplicates — neither on 32-bit mingw, nor on 64-bit
> mingw.

Was that with or without using clock_gettime?  If it was with
clock_gettime, what happens without it (i.e. when 'clock' is used
instead)?

Thanks.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Tue, 23 Aug 2022 14:50:02 GMT) Full text and rfc822 format available.

Message #161 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Bruno Haible <bruno <at> clisp.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: jporterbugs <at> gmail.com, larsi <at> gnus.org, eggert <at> cs.ucla.edu,
 bug-gnulib <at> gnu.org, 57129 <at> debbugs.gnu.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Tue, 23 Aug 2022 16:49:36 +0200

Eli Zaretskii asked:
> > Yes. Running the "case 2" part 1000 times again, among the 2000 generated
> > file names, there are no duplicates — neither on 32-bit mingw, nor on 64-bit
> > mingw.
> 
> Was that with or without using clock_gettime?  If it was with
> clock_gettime, what happens without it (i.e. when 'clock' is used
> instead)?

That was with clock_gettime. Without clock_gettime, i.e. without the
dependency to libwinpthread, the result is the same: no duplicates — neither
on 32-bit mingw, nor on 64-bit mingw.

Bruno

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Tue, 23 Aug 2022 16:08:02 GMT) Full text and rfc822 format available.

Message #164 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Bruno Haible <bruno <at> clisp.org>
Cc: jporterbugs <at> gmail.com, larsi <at> gnus.org, eggert <at> cs.ucla.edu,
 bug-gnulib <at> gnu.org, 57129 <at> debbugs.gnu.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Tue, 23 Aug 2022 19:07:02 +0300

> From: Bruno Haible <bruno <at> clisp.org>
> Cc: eggert <at> cs.ucla.edu, bug-gnulib <at> gnu.org, larsi <at> gnus.org, 57129 <at> debbugs.gnu.org, jporterbugs <at> gmail.com
> Date: Tue, 23 Aug 2022 16:49:36 +0200
> 
> Eli Zaretskii asked:
> > > Yes. Running the "case 2" part 1000 times again, among the 2000 generated
> > > file names, there are no duplicates — neither on 32-bit mingw, nor on 64-bit
> > > mingw.
> > 
> > Was that with or without using clock_gettime?  If it was with
> > clock_gettime, what happens without it (i.e. when 'clock' is used
> > instead)?
> 
> That was with clock_gettime. Without clock_gettime, i.e. without the
> dependency to libwinpthread, the result is the same: no duplicates — neither
> on 32-bit mingw, nor on 64-bit mingw.

Great, thanks.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Wed, 24 Aug 2022 21:42:01 GMT) Full text and rfc822 format available.

Message #167 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Jim Porter <jporterbugs <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>, Paul Eggert <eggert <at> cs.ucla.edu>
Cc: larsi <at> gnus.org, 57129 <at> debbugs.gnu.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Wed, 24 Aug 2022 14:41:37 -0700

On 8/20/2022 11:03 AM, Jim Porter wrote:
> In addition to the changes to temporary file name generation, I think it 
> would be useful for Eshell to kill the temporary buffer too.
[snip]
> Attached is a patch to do this.

Assuming there are no remaining objections, I'll merge this in a day or two.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#57129; Package emacs. (Fri, 26 Aug 2022 05:11:02 GMT) Full text and rfc822 format available.

Message #170 received at 57129 <at> debbugs.gnu.org (full text, mbox):

From: Jim Porter <jporterbugs <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>, Paul Eggert <eggert <at> cs.ucla.edu>
Cc: larsi <at> gnus.org, 57129 <at> debbugs.gnu.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Thu, 25 Aug 2022 22:10:27 -0700

On 8/24/2022 2:41 PM, Jim Porter wrote:
> On 8/20/2022 11:03 AM, Jim Porter wrote:
>> In addition to the changes to temporary file name generation, I think 
>> it would be useful for Eshell to kill the temporary buffer too.
> [snip]
>> Attached is a patch to do this.
> 
> Assuming there are no remaining objections, I'll merge this in a day or 
> two.

Merged as a457aa62577284333c7d25d48a49704788b25a04.

Reply sent to Jim Porter <jporterbugs <at> gmail.com>:
You have taken responsibility. (Thu, 16 Mar 2023 05:36:02 GMT) Full text and rfc822 format available.

Notification sent to Jim Porter <jporterbugs <at> gmail.com>:
bug acknowledged by developer. (Thu, 16 Mar 2023 05:36:02 GMT) Full text and rfc822 format available.

Message #175 received at 57129-done <at> debbugs.gnu.org (full text, mbox):

From: Jim Porter <jporterbugs <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>, Paul Eggert <eggert <at> cs.ucla.edu>
Cc: larsi <at> gnus.org, 57129-done <at> debbugs.gnu.org
Subject: Re: bug#57129: 29.0.50; Improve behavior of conditionals in Eshell
Date: Wed, 15 Mar 2023 22:35:24 -0700

On 8/25/2022 10:10 PM, Jim Porter wrote:
> On 8/24/2022 2:41 PM, Jim Porter wrote:
>> On 8/20/2022 11:03 AM, Jim Porter wrote:
>>> In addition to the changes to temporary file name generation, I think 
>>> it would be useful for Eshell to kill the temporary buffer too.
>> [snip]
>>> Attached is a patch to do this.
>>
>> Assuming there are no remaining objections, I'll merge this in a day 
>> or two.
> 
> Merged as a457aa62577284333c7d25d48a49704788b25a04.

I know there was a quite-lengthy discussion about the tempname function, 
but I think those patches were merged, along with my patch for the 
original bug topic. Therefore, I'm going to close this.

Of course, if there's still anything to do with the tempname stuff, 
let's do it (though it might help to give it a separate bug for tracking 
purposes).

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 13 Apr 2023 11:24:05 GMT) Full text and rfc822 format available.

This bug report was last modified 2 years and 90 days ago.

Previous Next

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.

GNU bug report logs - #57129 29.0.50; Improve behavior of conditionals in Eshell

GNU bug report logs - #57129
29.0.50; Improve behavior of conditionals in Eshell