GNU bug report logs - #61095
possible misuse of posix_spawn API on non-linux OSes

Please note: This is a static page, with minimal formatting, updated once a day.

Package: guile; Reported by: Omar Polo <op@HIDDEN>; merged with #61079; dated Fri, 27 Jan 2023 11:53:01 UTC; Maintainer for guile is bug-guile@HIDDEN.
Merged 61079 61095. Request was from Ludovic Courtès <ludo@HIDDEN> to control <at> debbugs.gnu.org.

Message received at 61095 <at> debbugs.gnu.org:


Date: Fri, 27 Jan 2023 13:25:09 +0100
To: 61095 <at> debbugs.gnu.org
Subject: Re: possible misuse of posix_spawn API on non-linux OSes
From: Omar Polo <op@HIDDEN>
Message-Id: <3F1FNOS0VFO9X.356V67A0RSKPT@venera>

Actually I can avoid the EBADF by checking that the fd is 'live' with
something like fstat:

[[[
Index: libguile/posix.c
--- libguile/posix.c.orig
+++ libguile/posix.c
@@ -1325,8 +1325,12 @@ SCM_DEFINE (scm_fork, "primitive-fork", 0, 0, 0,
 static void
 close_inherited_fds_slow (posix_spawn_file_actions_t *actions, int max_fd)
 {
-  while (--max_fd > 2)
-    posix_spawn_file_actions_addclose (actions, max_fd);
+  struct stat sb;
+  max_fd = getdtablecount();
+  while (--max_fd > 2) {
+    if (fstat(max_fd, &sb) != -1)
+      posix_spawn_file_actions_addclose (actions, max_fd);
+  }
 }
 
 static void

]]]

The regress passes, and while this workaround may be temporarily
acceptable I personally don't like it much.  Is there a reason guile
can't set CLOEXEC on all the file descriptors > 2 obtained via open,
socket, pipe, ... as perl, for example, does?
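
For reference, marking a descriptor close-on-exec with fcntl(2) could
look roughly like the sketch below.  The helper names set_cloexec,
has_cloexec and pipe_cloexec are made up for illustration; this is not
guile's or perl's actual code, only one way the idea might be written.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: mark fd close-on-exec while preserving any
   other descriptor flags.  Returns 0 on success, -1 on error. */
int set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1 ? -1 : 0;
}

/* Hypothetical helper: 1 if FD_CLOEXEC is set on fd, 0 if it is not,
   -1 if fd cannot be queried. */
int has_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1)
        return -1;
    return (flags & FD_CLOEXEC) != 0;
}

/* Hypothetical helper: create a pipe whose two ends are both
   close-on-exec.  Returns 0 on success, -1 on error. */
int pipe_cloexec(int p[2])
{
    if (pipe(p) == -1)
        return -1;
    if (set_cloexec(p[0]) == -1 || set_cloexec(p[1]) == -1) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    return 0;
}
```

Descriptors marked this way are closed by the kernel at exec time, so
no explicit close action would be needed for them in the child.  Note
that setting the flag in a separate step after pipe(2) leaves a small
window in multi-threaded programs.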




Information forwarded to bug-guile@HIDDEN:
bug#61095; Package guile.

Message received at submit <at> debbugs.gnu.org:


Date: Fri, 27 Jan 2023 12:51:32 +0100
To: bug-guile@HIDDEN
Subject: possible misuse of posix_spawn API on non-linux OSes
From: Omar Polo <op@HIDDEN>
Message-Id: <26OIN3L5D4V9L.2M0KM95K0YSNM@venera>

Hello,

I've noticed that test-system-cmds fails on OpenBSD-CURRENT while
testing the update to guile 3.0.9:

    test-system-cmds: system* exit status was 127 rather than 42
    FAIL: test-system-cmds

Here's an excerpt of the ktrace of the child process while executing
that specific test: (the first fork() is the one implicitly done by
posix_spawn(3))

  5590 guile RET   fork 0
  [...]
  5590 guile CALL  dup2(0,3)
  5590 guile RET   dup2 3
  5590 guile CALL  dup2(1,4)
  5590 guile RET   dup2 4
  5590 guile CALL  dup2(2,5)
  5590 guile RET   dup2 5
  5590 guile CALL  dup2(3,0)
  5590 guile RET   dup2 0
  5590 guile CALL  dup2(4,1)
  5590 guile RET   dup2 1
  5590 guile CALL  dup2(5,2)
  5590 guile RET   dup2 2
  5590 guile CALL  close(1023)
  5590 guile RET   close -1 errno 9 Bad file descriptor
  5590 guile CALL  kbind(0x7f7ffffd51f8,24,0x2b5c5ced59893fa9)
  5590 guile RET   kbind 0
  5590 guile CALL  exit(127)

(if you prefer I can provide a full ktrace of guile executing that
test case)

My interpretation is that the sequence of dup2(2) is from
posix_spawn_file_actions_adddup2 in do_spawn, while the strange
close(1023) is from close_inherited_fds_slow.  That file descriptor is
not open, so close(2) fails with EBADF and the posix_spawn machinery
exits prematurely.  My current RLIMIT_NOFILE is 1024, so the number
would make sense.
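
The failure seen in the trace can be illustrated in isolation: close(2)
on a descriptor that is not open returns -1 with errno set to EBADF.
The sketch below is only a demonstration of that behaviour; the helper
name close_errno is hypothetical and does not appear in guile.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: close fd and return 0 on success, or the
   errno value reported by close(2) on failure. */
int close_errno(int fd)
{
    if (close(fd) == -1)
        return errno;
    return 0;
}
```

Closing one end of a fresh pipe twice shows the effect: the first call
returns 0 and the second returns EBADF.  As the trace above suggests,
on OpenBSD a failing close action makes the spawned child exit with
status 127 before the program is executed.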

On OpenBSD I've tried to use the following patch to work around the
issue:

[[[
Index: libguile/posix.c
--- libguile/posix.c.orig
+++ libguile/posix.c
@@ -1325,6 +1325,7 @@ SCM_DEFINE (scm_fork, "primitive-fork", 0, 0, 0,
 static void
 close_inherited_fds_slow (posix_spawn_file_actions_t *actions, int max_fd)
 {
+  max_fd = getdtablecount();
   while (--max_fd > 2)
     posix_spawn_file_actions_addclose (actions, max_fd);
 }
]]]

getdtablecount(2) returns the number of file descriptors currently open
by the process.  Unfortunately it doesn't seem to be portable.  (Well,
tbf /proc/self/fd is not portable either.)

However, while this pleases the system* test, it breaks the pipe
tests:

    Running popen.test
    FAIL: popen.test: open-input-pipe: echo hello
    FAIL: popen.test: pipeline - arguments: (expected-value ("HELLO WORLD\n" (0 0)) actual-value ("" (127 0)))

The reason seems to be similar:

 74865 guile    CALL  dup2(7,3)
 74865 guile    RET   dup2 3
 74865 guile    CALL  dup2(10,4)
 74865 guile    RET   dup2 4
 74865 guile    CALL  dup2(2,5)
 74865 guile    RET   dup2 5
 74865 guile    CALL  dup2(3,0)
 74865 guile    RET   dup2 0
 74865 guile    CALL  dup2(4,1)
 74865 guile    RET   dup2 1
 74865 guile    CALL  dup2(5,2)
 74865 guile    RET   dup2 2
 74865 guile    CALL  close(8)
 74865 guile    RET   close -1 errno 9 Bad file descriptor
 74865 guile    CALL  kbind(0x7f7ffffcfa88,24,0x2125923bdf2ca9e)
 74865 guile    RET   kbind 0
 74865 guile    CALL  exit(127)

I guess it's trying to close the fd of the pipe that was closed.
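
This is consistent with the descriptor table having gaps: the number of
open descriptors says nothing about which numbers are in use.  The
sketch below probes each descriptor with fcntl(2) instead of relying on
a count.  The helper names fd_is_open, count_open_fds and
highest_open_fd are hypothetical, and probing every number up to a
limit is only one possible approach, not what guile does.

```c
#include <errno.h>
#include <fcntl.h>

/* Hypothetical helper: 1 if fd is an open descriptor, 0 otherwise. */
int fd_is_open(int fd)
{
    return fcntl(fd, F_GETFD) != -1 || errno != EBADF;
}

/* Hypothetical helper: number of open descriptors in [0, limit). */
int count_open_fds(int limit)
{
    int n = 0;
    for (int fd = 0; fd < limit; fd++)
        if (fd_is_open(fd))
            n++;
    return n;
}

/* Hypothetical helper: highest open descriptor in [0, limit),
   or -1 if none is open. */
int highest_open_fd(int limit)
{
    for (int fd = limit - 1; fd >= 0; fd--)
        if (fd_is_open(fd))
            return fd;
    return -1;
}
```

With a descriptor duplicated onto a high number such as 100, the count
of open descriptors stays small while the highest open descriptor is
100, so a loop bounded by the count would never reach it, and a loop
over every number below the count may still hit a closed one.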

I'm not sure what to do from here; I'm not used to the posix_spawn_*
APIs.  I'm happy to help by testing diffs or providing more info if
needed.


Thanks,

Omar Polo




Acknowledgement sent to Omar Polo <op@HIDDEN>:
New bug report received and forwarded. Copy sent to bug-guile@HIDDEN.
Report forwarded to bug-guile@HIDDEN:
bug#61095; Package guile.
Last modified: Mon, 27 Mar 2023 13:45:01 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.