GNU bug report logs - #64216
Proc open-input-pipe returns #eof when reading on macOS, works as expected on Linux

Package: guile;

Reported by: Jose Ortiz <kotshie <at> gmail.com>

Date: Thu, 22 Jun 2023 05:48:02 UTC

Severity: normal

To reply to this bug, email your comments to 64216 AT debbugs.gnu.org.


Report forwarded to bug-guile <at> gnu.org:
bug#64216; Package guile. (Thu, 22 Jun 2023 05:48:03 GMT)

Acknowledgement sent to Jose Ortiz <kotshie <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-guile <at> gnu.org. (Thu, 22 Jun 2023 05:48:03 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Jose Ortiz <kotshie <at> gmail.com>
To: bug-guile <at> gnu.org
Subject: Proc open-input-pipe returns #eof when reading on macOS, works as
 expected on Linux
Date: Wed, 21 Jun 2023 17:31:18 -0700
Reading from a port returned by open-input-pipe yields #<eof> on macOS, while it works properly on Linux.

```
(use-modules (ice-9 popen)
             (ice-9 rdelim))
(let* ((port (open-input-pipe "date --utc"))
       (str  (read-line port)))
  (close-pipe port)
  str)
```
macOS> #<eof>
linux> "Thu Jun 22 12:30:26 AM UTC 2023"


Building the latest source and running popen.test also fails:

m1 :: Projects/guile/test-suite ? guile -L . -e main -s guile-test tests/popen.test
Running tests/popen.test
FAIL: tests/popen.test: open-input-pipe: echo hello
FAIL: tests/popen.test: open-input-pipe: open-input-pipe process gets (current-input-port) as stdin
ERROR: tests/popen.test: open-output-pipe: no duplicate - arguments: ((wrong-type-arg "string-append" "Wrong type (expecting ~A): ~S" ("string" #f) (#f)))



- Jose

Information forwarded to bug-guile <at> gnu.org:
bug#64216; Package guile. (Tue, 08 Aug 2023 11:31:01 GMT)

Message #8 received at 64216 <at> debbugs.gnu.org:

From: Torrekie <me <at> torrekie.dev>
To: 64216 <at> debbugs.gnu.org
Subject: "ice-9 popen" cannot open process on Darwin
Date: Tue, 8 Aug 2023 19:30:46 +0800
I can confirm and reproduce this bug on macOS and iOS with Guile 3.0.9. guile-config also appears to be affected, which causes its "pkg-config" calls to always fail.

```
TorrekiedeMacBook-Pro:pv4 torrekie$ guile-config link
error: ("/opt/homebrew/opt/pkg-config/bin/pkg-config" "--libs" "guile-3.0") exited with non-zero error code 127
```

In guile-config, pkg-config is called through "open-pipe*":
```
(define (pkg-config . args)
  (let* ((real-args (cons %pkg-config-program args))
         (pipe (apply open-pipe* OPEN_READ real-args))
         (output (read-delimited "" pipe))
         (ret (close-pipe pipe)))
    (case (status:exit-val ret)
      ((0) (if (eof-object? output) "" output))
      (else (display-line-error
             (format #f "error: ~s exited with non-zero error code ~A"
                     (cons %pkg-config-program args) (status:exit-val ret)))
            ;; assume pkg-config sent diagnostics to stdout
            (exit (status:exit-val ret))))))
```

After attaching LLDB, I did not see any exec/posix_spawn/popen being called. Inspecting the source code, I see that Guile's popen is implemented through `posix_spawn`:

```
// libguile/posix.c
static SCM
scm_piped_process (SCM prog, SCM args, SCM from, SCM to)
#define FUNC_NAME "piped-process"
{
  pid_t pid;

  (void) piped_process (&pid, prog, args, from, to);
  if (pid == -1)
    {
      /* Create a dummy process that exits with value 127 to mimic the
         previous fork + exec implementation.  TODO: This is a
         compatibility shim to remove in the next stable series.  */
      pid = fork ();
      if (pid == -1)
        SCM_SYSERROR;
      if (pid == 0)
        _exit (127);
    }

  return scm_from_int (pid);
}
#undef FUNC_NAME
```

The function "piped_process" is a wrapper for "do_spawn". "do_spawn" returns -1 when `posix_spawn` or `posix_spawnp` fails, which ultimately leads to the `_exit (127)` in the code above. But posix_spawn calls evidently work as expected in other programs. Running `nm` shows that libguile does not actually reference the system posix_spawn; instead, it defines an internal implementation of it:
```
TorrekiedeMacBook-Pro:pv4 torrekie$ nm /opt/homebrew/opt/guile/lib/libguile-3.0.dylib | grep posix_spawn
00000000000bfdb4 t _gl_posix_spawn_internal
00000000000bfc90 t _rpl_posix_spawn_file_actions_addclose
00000000000bfd18 t _rpl_posix_spawn_file_actions_adddup2
```

According to the logic in m4/posix_spawn.m4, Guile does not trust Darwin's posix_spawn* for some reason, which causes REPLACE_POSIX_SPAWN to be defined. I am trying to rebuild without replacing posix_spawn to see what happens.



Information forwarded to bug-guile <at> gnu.org:
bug#64216; Package guile. (Fri, 11 Aug 2023 10:31:01 GMT)

Message #11 received at 64216 <at> debbugs.gnu.org:

From: Torrekie <me <at> torrekie.dev>
To: 64216 <at> debbugs.gnu.org
Subject: libgnu __spawni fail on Darwin
Date: Fri, 11 Aug 2023 18:29:57 +0800
Because LLDB cannot attach to a child process, I cannot see what happens after a fork() call. One thing is clear, though: the replacement `posix_spawn` provided by lib/spawni.c does not work as expected on Darwin systems (this might only happen on arm64).

https://github.com/NanoComp/meep/issues/2495

The same bug is mentioned in the GitHub issue above. I also tried undefining REPLACE_POSIX_SPAWN to use the system implementation, but that still failed because of malformed file actions: the call returned 22 (EINVAL) and errno was set to ENOENT, which may be a complaint about the dup2-ed non-standard stdio fds.

One thing is certain: when REPLACE_POSIX_SPAWN is defined, posix_spawn calls resolve to the __spawni function. This function fails before actually calling `execve` (or I did not manage to catch the call in LLDB).

These were my test results on macOS 11.2 arm64:

/Applications/Xcode.app/Contents/Developer/usr/bin/make  check-TESTS
test-system-cmds: system* exit status was 127 rather than 42
FAIL: test-system-cmds
PASS: test-bad-identifiers
PASS: test-require-extension
PASS: test-guile-snarf
PASS: test-import-order
PASS: test-command-line-encoding
PASS: test-command-line-encoding2
PASS: test-language
error: interrupted by the user
PASS: test-guild-compile
wrote `/Users/torrekie/proj/guile-3.0.9/cache/guile/ccache/3.0-LE-8-4.6/Users/torrekie/proj/guile-3.0.9/test-suite/standalone/test-signal-fork.go'
parent: 53087
....child: ..53133
.............................................
completed
PASS: test-signal-fork
PASS: test-num2integral
PASS: test-round
PASS: test-asmobs
PASS: test-ffi
PASS: test-foreign-object-scm
PASS: test-foreign-object-c
PASS: test-list
PASS: test-unwind
PASS: test-conversion
PASS: test-loose-ends
PASS: test-fast-slot-ref
PASS: test-mb-regexp
PASS: test-use-srfi
PASS: test-scm-c-read
PASS: test-scm-take-locale-symbol
PASS: test-scm-take-u8vector
PASS: test-scm-to-latin1-string
PASS: test-scm-values
PASS: test-scm-c-bind-keyword-arguments
PASS: test-srfi-4
PASS: test-extensions
PASS: test-with-guile-module
PASS: test-scm-with-guile
PASS: test-scm-spawn-thread
PASS: test-pthread-create
SKIP: test-pthread-create-secondary
PASS: test-smob-mark
PASS: test-smob-mark-race
wrote `/Users/torrekie/proj/guile-3.0.9/cache/guile/ccache/3.0-LE-8-4.6/Users/torrekie/proj/guile-3.0.9/test-suite/standalone/test-stack-overflow.go'
SKIP: test-stack-overflow
wrote `/Users/torrekie/proj/guile-3.0.9/cache/guile/ccache/3.0-LE-8-4.6/Users/torrekie/proj/guile-3.0.9/test-suite/standalone/test-out-of-memory.go'
SKIP: test-out-of-memory

;;; (child-exception ("scm_fdes_to_port" "~A" ("Bad file descriptor") (9)))

;;; (child-status 256)
PASS: test-close-on-exec
==================================
1 of 38 tests failed
(3 tests were not run)
Please report to bug-guile <at> gnu.org
==================================
make[5]: *** [check-TESTS] Error 1
make[4]: *** [check-am] Error 2
make[3]: *** [check] Error 2
make[2]: *** [check-recursive] Error 1
make[1]: *** [check-recursive] Error 1
make: *** [check] Error 2

Information forwarded to bug-guile <at> gnu.org:
bug#64216; Package guile. (Sat, 12 Aug 2023 21:36:02 GMT)

Message #14 received at 64216 <at> debbugs.gnu.org:

From: Torrekie <me <at> torrekie.dev>
To: 64216 <at> debbugs.gnu.org
Subject: *spawn calls fixed in latest commit
Date: Sun, 13 Aug 2023 05:34:51 +0800
Looking at commit ccd7400fdbebca73fc4340ad4ca0248655009f04, I see that this issue has been fixed for both the BSD posix_spawn and the internal `__spawni`.

diff --git a/libguile/posix.c b/libguile/posix.c
index 3adc743c4..6776a7744 100644
--- a/libguile/posix.c
+++ b/libguile/posix.c
@@ -1390,12 +1390,19 @@ do_spawn (char *exec_file, char **exec_argv, char **exec_env,
   /* Move the fds out of the way, so that duplicate fds or fds equal
      to 0, 1, 2 don't trample each other */
 
-  posix_spawn_file_actions_adddup2 (&actions, in, fd_slot[0]);
-  posix_spawn_file_actions_adddup2 (&actions, out, fd_slot[1]);
-  posix_spawn_file_actions_adddup2 (&actions, err, fd_slot[2]);
-  posix_spawn_file_actions_adddup2 (&actions, fd_slot[0], 0);
-  posix_spawn_file_actions_adddup2 (&actions, fd_slot[1], 1);
-  posix_spawn_file_actions_adddup2 (&actions, fd_slot[2], 2);
+  int dup2_action_from[] = {in, out, err,
+                            fd_slot[0], fd_slot[1], fd_slot[2]};
+  int dup2_action_to  [] = {fd_slot[0], fd_slot[1], fd_slot[2],
+                            0, 1, 2};
+
+  errno = 0;
+  for (int i = 0; i < sizeof (dup2_action_from) / sizeof (int); i++)
+    {
+      errno = posix_spawn_file_actions_adddup2 (&actions, dup2_action_from[i],
+                                                dup2_action_to[i]);
+      if (errno != 0)
+        return -1;
+    }
 
 #ifdef HAVE_ADDCLOSEFROM
   /* This function appears in glibc 2.34.  It's both free from race

I can confirm that FreeBSD's patch (https://github.com/freebsd/freebsd-ports/raw/6e9b9fd9dd69cbbfabd780e2cec9e2b98d5aef1e/lang/guile3/files/extra-patch-upstream-fixes.patch), which is based on the latest commits, fixes the problem on Darwin:

iPad:/buildroot/guile-3.0.9 root# guile-config link
-lguile-3.0 -lgc -lpthread
iPad:/buildroot/guile-3.0.9 root# guile-config compile
-D_THREAD_SAFE -I/usr/include/guile/3.0 -I/usr
iPad:/buildroot/guile-3.0.9 root# /usr/bin/uname -a
Darwin iPad 20.4.0 Darwin Kernel Version 20.4.0: Sun Feb 28 21:05:09 PST 2021; root:xnu-7195.100.367~3/RELEASE_ARM64_T8101 arm64
iPad:/buildroot/guile-3.0.9 root# sw_vers
ProductName:    iPhone OS
ProductVersion: 14.5.1
BuildVersion:   18E212

This bug report was last modified 265 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.