Received: (at 79334) by debbugs.gnu.org; 2 Sep 2025 11:18:44 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Tue Sep 02 07:18:44 2025
Received: from localhost ([127.0.0.1]:60513 helo=debbugs.gnu.org)
by debbugs.gnu.org with esmtp (Exim 4.84_2)
(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
id 1utP2C-0005py-3B
for submit <at> debbugs.gnu.org; Tue, 02 Sep 2025 07:18:44 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10]:43994)
by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
(Exim 4.84_2) (envelope-from <eliz@HIDDEN>) id 1utP29-0005pl-8p
for 79334 <at> debbugs.gnu.org; Tue, 02 Sep 2025 07:18:42 -0400
Received: from fencepost.gnu.org ([2001:470:142:3::e])
by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
(Exim 4.90_1) (envelope-from <eliz@HIDDEN>)
id 1utP23-0007eA-54; Tue, 02 Sep 2025 07:18:35 -0400
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org;
s=fencepost-gnu-org; h=References:Subject:In-Reply-To:To:From:Date:
mime-version; bh=3EsvoDTrpakJXeHmJ3bLk+DwCSBxCLY4p2VHUqJV6Sg=; b=LRAICAfuZ3Qd
QU35EOgQxmviHX4+85qHzv/gylG/MLJNBGKC6DPIz5prhiYgXkTGoUCt+9a4jkgdOOw6XllElpXAZ
RqJm87xOHWUdIzp6D1kjO1SvSUEmWjLoMDbiBuc7Gpi4GzhqAKXdzgxdkypl5/FTZlPnijxxn7zae
B7ix97S59YRc9NoXzBKc+8AZql6xtnCKoCZq/244/yJ8kOBwCLddTwKR3f1xZaLFZIggxz1muVXoX
oHjNTp+5JukA9C2MPYrvs/7La4MBVW0CaAzTgbr0m9QT+2TtfsTI8pWufXQOAMwsVWUGveZZudYsQ
klBGsSCRfBpc1HTmlDg04Q==;
Date: Tue, 02 Sep 2025 14:18:31 +0300
Message-Id: <86bjntm7aw.fsf@HIDDEN>
From: Eli Zaretskii <eliz@HIDDEN>
To: Spencer Baugh <sbaugh@HIDDEN>
In-Reply-To: <CAO=BR8PurEeiexhE-_enihOrJBoXSBQ0pQyH9eiC-SYe+HZtOg@HIDDEN>
(message from Spencer Baugh on Mon, 1 Sep 2025 15:40:37 -0400)
Subject: Re: bug#79334: [PATCH] Don't release thread select lock
unnecessarily, [PATCH] Defer closing file descriptors used by other threads
References: <ierwm6n6tno.fsf@HIDDEN> <86cy8es4bo.fsf@HIDDEN>
<ierqzwu6ztl.fsf@HIDDEN> <86a53is0zw.fsf@HIDDEN>
<ier8qj26nlj.fsf@HIDDEN>
<ier5xe57vrr.fsf@HIDDEN> <ier34997u4c.fsf_-_@HIDDEN>
<86zfbhqptt.fsf@HIDDEN> <ierldn0olcg.fsf@HIDDEN>
<86jz2kpy1w.fsf@HIDDEN>
<iera53eno2q.fsf@HIDDEN> <86ldmym0ux.fsf@HIDDEN>
<CAO=BR8PurEeiexhE-_enihOrJBoXSBQ0pQyH9eiC-SYe+HZtOg@HIDDEN>
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: 79334
Cc: 79334 <at> debbugs.gnu.org, eggert@HIDDEN, dmitry@HIDDEN
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>,
<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>,
<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -3.3 (---)
> From: Spencer Baugh <sbaugh@HIDDEN>
> Date: Mon, 1 Sep 2025 15:40:37 -0400
> Cc: 79334 <at> debbugs.gnu.org, eggert@HIDDEN, Dmitry Gutov <dmitry@HIDDEN>
>
> > In particular, Lisp programs should be able to call
> > delete-process on a process without getting an error just because some
> > thread is currently calling select and happens to be monitoring that
> > process. To emit such an error would make programming with processes
> > and threads extremely difficult.
>
> This is a problem two orders of magnitude smaller than what we are
> trying to fix here. So my suggestion is to leave it for later.
> There's no special reason why Lisp programs "should be able" to delete
> a process locked to another thread. If we don't want to signal an
> error (though I don't see why not), we could perhaps do nothing
> silently, until we get to dealing with that later.
>
> Silently doing nothing is also an unacceptably confusing result, in my opinion.
Why? Ignoring invalid actions is not something new to Emacs.
> > > Hmm... I don't understand: why do we need to do this? The calls to
> > > compute_input_wait_mask, compute_non_keyboard_wait_mask etc., which
> > > set bits in the Available mask passed top pselect already avoid
> > > setting bits for descriptors on which some other thread waits. So
> > > pselect cannot possibly indicate any output ready on any descriptors
> > > belonging to other threads, and this loop you suggest to add:
> > >
> > >> + /* Save the fds we're currently waiting_thread for. */
> > >> + fd_set waiting_fds;
> > >> + FD_ZERO (&waiting_fds);
> > >> + int max_waiting_fds = 0;
> > >> + for (int fd = 0; fd <= max_desc; ++fd)
> > >> + {
> > >> + if (fd_callback_info[fd].waiting_thread == current_thread)
> > >> + {
> > >> + FD_SET (fd, &waiting_fds);
> > >> + max_waiting_fds = fd;
> > >> + }
> > >> + }
> > >
> > > should be unnecessary. Or what am I missing?
> >
> > That's exactly the problem. A file descriptor for which we set
> > waiting_thread can have delete_read_fd or delete_write_fd called on it
> > by another thread, or by a signal handler, while we're waiting in
> > pselect.
>
> We need to prevent that from happening, not to go along with it. But
> anyway, I don't see why you need another fd_set for this. Why not
> just look at the descriptors in the Available set?
>
> The Available set is cleared by the select call, and afterwards only contains descriptors which has events. In
> that case we could just look at the descriptors in Available and check waiting_thread for each of them.
Yes, and we don't care about descriptors that are not ready to be
read.
> But if there was an EBADF, then Available is left in an undefined state (see pselect(1)) and we no longer
> know what descriptors we originally set waiting_thread on.
In that case pselect returns a negative result, no? And if that
happens, we don't examine the descriptors at all, right? So why is
this situation a problem we need to handle with this additional fd_set?
> Though, I guess we could alternatively just assume that EBADF is always spurious.
Assume and do what?
> > BTW, the fact that this problematic case is also possible via signal
> > handlers (namely the SIGCHLD handler) is a separate bug introduced by
> > the refactoring which added fd_callback_info. It means that with a
> > certain interleaving, I think it should be possible to get buggy
> > behavior by just using processes without threads. It's quite rare
> > though and I haven't been able to figure out what exact path that would
> > cause a problem.
>
> Let's not attempt to solve too many problems at the same time. How to
> make SIGCHLD safer when threads are present is an outstanding issue,
> and the solution will IMO depend in whether signals are delivered to
> the main thread and whether or not the build uses the child signal
> pipe.
>
> You misread me: I said SIGCHLD is unsafe even without threads.
Please describe how.
bug-gnu-emacs@HIDDEN:bug#79334; Package emacs.
Full text available.
Received: (at 79334) by debbugs.gnu.org; 1 Sep 2025 19:41:00 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Mon Sep 01 15:41:00 2025
Received: from localhost ([127.0.0.1]:58577 helo=debbugs.gnu.org)
by debbugs.gnu.org with esmtp (Exim 4.84_2)
(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
id 1utAOh-0003Gt-2Y
for submit <at> debbugs.gnu.org; Mon, 01 Sep 2025 15:40:59 -0400
Received: from mxout1.mail.janestreet.com ([38.105.200.78]:55107)
by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
(Exim 4.84_2) (envelope-from <sbaugh@HIDDEN>)
id 1utAOd-0003Ge-LX
for 79334 <at> debbugs.gnu.org; Mon, 01 Sep 2025 15:40:57 -0400
Received: from mail-lj1-f200.google.com ([209.85.208.200])
by mxgoog2.mail.janestreet.com with esmtps (TLS1.3:TLS_AES_128_GCM_SHA256:128)
(Exim 4.98.2) id 1utAOY-00000005pgk-1xPJ for 79334 <at> debbugs.gnu.org;
Mon, 01 Sep 2025 15:40:50 -0400
Received: by mail-lj1-f200.google.com with SMTP id
38308e7fff4ca-336c3108badso20997601fa.3
for <79334 <at> debbugs.gnu.org>; Mon, 01 Sep 2025 12:40:50 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
d=janestreet.com; s=google; t=1756755649; x=1757360449; darn=debbugs.gnu.org;
h=cc:to:subject:message-id:date:from:in-reply-to:references
:mime-version:from:to:cc:subject:date:message-id:reply-to;
bh=hsXJ+hPsvWAeH4Zyfx3PMMW94ejI9xBkVjJV8rYzITw=;
b=m4q2qgaR1H6xyqZryrkHxSwBcQXgDs7ty5La0z8CH1TyakncVg2GDyjmKm+k5Hzpd2
zCAHqBNFF8YU1oSCm7s6vQIXUPsNUyGh0ZH/Sg37tGUEuQm9fS2jclt0NptCfVE9MEi1
Vu5Y9Qufapxj97dM9ZoeQUwNOfwzGTtZuBL3U=
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=janestreet.com;
s=waixah; t=1756755650;
bh=hsXJ+hPsvWAeH4Zyfx3PMMW94ejI9xBkVjJV8rYzITw=;
h=References:In-Reply-To:From:Date:Subject:To:Cc;
b=UshhT9A7/S5jTOWoeWD5Ja7JW4TfDHeKKrsZvcH0id+L2zmpVJnZdv/etalJ45uk7
tvG3RmRG52gqSai1pFxrTbEFVv08OUaEnX31ZnAUBrW1ypunqsqdYnsyWiBaNbT7j8
XY0kw69uUBEZeDigT1lLMXpVgIQlE/56z24wqCLKWOtRgeNZA+/UP8FIqRPQEg3u9S
R/rPGm4CZtKh1Nvhcy7MMeHtDGEqZ4ACUdXPes0mL+NCi+2XUv8NZXhAA//5QZ2tqu
Ls1fNs7pffBd68zgEiZVtwwVr/x5kiWWLl+81n4i5bdjySqZL7bMVkc8Yu5fJp7JqU
DlLASARlkd8xw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
d=1e100.net; s=20230601; t=1756755649; x=1757360449;
h=cc:to:subject:message-id:date:from:in-reply-to:references
:mime-version:x-gm-message-state:from:to:cc:subject:date:message-id
:reply-to;
bh=hsXJ+hPsvWAeH4Zyfx3PMMW94ejI9xBkVjJV8rYzITw=;
b=tuUGl2En6lpsPLP8T7IFfMC/b8DBocjkrvdGo7DA+BRdExoM5IcANX/0KCBfI2KImf
zfC0HefRtRbHdoh0ShIJNJTnj0RFT9ewCfQlZheY2T+WqltTqDvisZsEAJDsQys68LAP
nWygoMvvIo4fYXsb9quWtuY/y4md2uWdDfaBZCqjz6QWvW5mzYq4R9EkkrGzER85FMyV
h/Eb8h656P4aA6eHP0Tj4/SaXOnvE1aibA6e2tyqIY0kOvz8WOKpvG4LBrPkzoqPrmH3
VzJ8E/J/xmJiP8USJp6M+61w4fUKqapmXSNslNIEYAE1XGSzrUErrloS5q/ag3Kz2XeO
qkhQ==
X-Gm-Message-State: AOJu0Yy0YmOhQi3sr0PrZv6P7GldF8K94D6JHQQf3PuXz4L0pHBrfZ/J
xkdVLwdX8isqP4RHXBg4ZZZfXTeAk1fHMJ6bx3qgF5c0NfxtPRpBoVa4B/KcWNuWLcRdcP+kSUo
09xvfbvKYaW2ECsrjmaDbgELaXDbxDPyJkJHnagqUz5oxCsXu4SyxgqOXITmu+XGHOhnFOZsJBw
5XQYi8omb1pGglfr/csdC0XyHB0gadLA==
X-Gm-Gg: ASbGncsIsR0WijZC6kC2ZH18bvOMhooosBFhIFlO0Ih4KjxIZ5Tit+hJ8QTmPdi/o8c
rb6TDO75jL2mXYMsUJS/V52U+8cJqfQ6DFqeOzVz4rQGXBJ1jUmziwRrR0llzPztidsT00ute+G
XMsko+epwda+ToKdrL3ap8
X-Received: by 2002:a2e:be23:0:b0:332:1de5:c513 with SMTP id
38308e7fff4ca-336ca901440mr24870951fa.4.1756755648653;
Mon, 01 Sep 2025 12:40:48 -0700 (PDT)
X-Google-Smtp-Source: AGHT+IGUlSYGavU+sYo+tMJYoR07G5m3Q0RE9H92MDwv0DKAbvD4XFEgyipfkkE1Yt5VPmP2dcSijLJB2faDAruRMMg=
X-Received: by 2002:a2e:be23:0:b0:332:1de5:c513 with SMTP id
38308e7fff4ca-336ca901440mr24870891fa.4.1756755648163; Mon, 01 Sep 2025
12:40:48 -0700 (PDT)
MIME-Version: 1.0
References: <ierwm6n6tno.fsf@HIDDEN> <86cy8es4bo.fsf@HIDDEN>
<ierqzwu6ztl.fsf@HIDDEN> <86a53is0zw.fsf@HIDDEN>
<ier8qj26nlj.fsf@HIDDEN>
<ier5xe57vrr.fsf@HIDDEN> <ier34997u4c.fsf_-_@HIDDEN>
<86zfbhqptt.fsf@HIDDEN> <ierldn0olcg.fsf@HIDDEN>
<86jz2kpy1w.fsf@HIDDEN>
<iera53eno2q.fsf@HIDDEN> <86ldmym0ux.fsf@HIDDEN>
In-Reply-To: <86ldmym0ux.fsf@HIDDEN>
From: Spencer Baugh <sbaugh@HIDDEN>
Date: Mon, 1 Sep 2025 15:40:37 -0400
X-Gm-Features: Ac12FXyUblflpNO1YUj2f8dXcmOzgfch-ujFAZH6okGf-3Y__6g8my__JThELP0
Message-ID: <CAO=BR8PurEeiexhE-_enihOrJBoXSBQ0pQyH9eiC-SYe+HZtOg@HIDDEN>
Subject: Re: bug#79334: [PATCH] Don't release thread select lock
unnecessarily, [PATCH] Defer closing file descriptors used by other threads
To: Eli Zaretskii <eliz@HIDDEN>
Content-Type: multipart/alternative; boundary="0000000000005dc3f4063dc28cb0"
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: 79334
Cc: 79334 <at> debbugs.gnu.org, eggert@HIDDEN, Dmitry Gutov <dmitry@HIDDEN>
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>,
<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>,
<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -3.3 (---)
--0000000000005dc3f4063dc28cb0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
On Mon, Sep 1, 2025, 3:25=E2=80=AFPM Eli Zaretskii <eliz@HIDDEN> wrote:
> > From: Spencer Baugh <sbaugh@HIDDEN>
> > Cc: 79334 <at> debbugs.gnu.org, eggert@HIDDEN, dmitry@HIDDEN
> > Date: Mon, 01 Sep 2025 12:18:37 -0400
> >
> > > You assume that the waiting_thread field will be cleared when a threa=
d
> > > dies, but that is not guaranteed. Why isn't it better to be defensiv=
e
> > > and make sure the thread is still alive? If you are right, that test
> > > will always be true, but what if you are wrong in some rare cases?
> >
> > Fair enough; the patch has changed a lot so this
> > other_thread_is_waiting_for function doesn't exist anymore, but if I
> > re-add it or similar checks, I will check that the thread is alive.
>
> Thanks.
>
> > > What I had in mind was neither delete it nor deactivate it. IOW, the
> > > loop which goes through the list of processes should only pay
> > > attention to processes locked to this thread and those that are not
> > > locked to any live thread. It should leave the other processes to
> > > when their threads call status_notify.
> >
> > There are too many ways in which we delete or deactivate processes to d=
o
> > that.
>
> We should audit all of them and prevent that from happening when the
> process is locked to another thread.
>
> > In particular, Lisp programs should be able to call
> > delete-process on a process without getting an error just because some
> > thread is currently calling select and happens to be monitoring that
> > process. To emit such an error would make programming with processes
> > and threads extremely difficult.
>
> This is a problem two orders of magnitude smaller than what we are
> trying to fix here. So my suggestion is to leave it for later.
> There's no special reason why Lisp programs "should be able" to delete
> a process locked to another thread. If we don't want to signal an
> error (though I don't see why not), we could perhaps do nothing
> silently, until we get to dealing with that later.
>
Silently doing nothing is also an unacceptably confusing result, in my
opinion.
But it's okay: I believe it's possible to solve the problem without
introducing such bad behavior.
For now, sabotaging threads that wait for some sub-process is a much
> more grave problem, and we must fix it if we want threads and
> processes to live in peace in Emacs.
>
Of course.
> > Hmm... I don't understand: why do we need to do this? The calls to
> > > compute_input_wait_mask, compute_non_keyboard_wait_mask etc., which
> > > set bits in the Available mask passed top pselect already avoid
> > > setting bits for descriptors on which some other thread waits. So
> > > pselect cannot possibly indicate any output ready on any descriptors
> > > belonging to other threads, and this loop you suggest to add:
> > >
> > >> + /* Save the fds we're currently waiting_thread for. */
> > >> + fd_set waiting_fds;
> > >> + FD_ZERO (&waiting_fds);
> > >> + int max_waiting_fds =3D 0;
> > >> + for (int fd =3D 0; fd <=3D max_desc; ++fd)
> > >> + {
> > >> + if (fd_callback_info[fd].waiting_thread =3D=3D current_thread)
> > >> + {
> > >> + FD_SET (fd, &waiting_fds);
> > >> + max_waiting_fds =3D fd;
> > >> + }
> > >> + }
> > >
> > > should be unnecessary. Or what am I missing?
> >
> > That's exactly the problem. A file descriptor for which we set
> > waiting_thread can have delete_read_fd or delete_write_fd called on it
> > by another thread, or by a signal handler, while we're waiting in
> > pselect.
>
> We need to prevent that from happening, not to go along with it. But
> anyway, I don't see why you need another fd_set for this. Why not
> just look at the descriptors in the Available set?
>
The Available set is cleared by the select call, and afterwards only
contains descriptors which has events. In that case we could just look at
the descriptors in Available and check waiting_thread for each of them.
But if there was an EBADF, then Available is left in an undefined state
(see pselect(1)) and we no longer know what descriptors we originally set
waiting_thread on. So we can't check to see if one of our descriptors was
"stolen" (had waiting_thread set to a different value), which would mean
the EBADF is spurious and we should continue running.
Though, I guess we could alternatively just assume that EBADF is always
spurious.
> BTW, the fact that this problematic case is also possible via signal
> > handlers (namely the SIGCHLD handler) is a separate bug introduced by
> > the refactoring which added fd_callback_info. It means that with a
> > certain interleaving, I think it should be possible to get buggy
> > behavior by just using processes without threads. It's quite rare
> > though and I haven't been able to figure out what exact path that would
> > cause a problem.
>
> Let's not attempt to solve too many problems at the same time. How to
> make SIGCHLD safer when threads are present is an outstanding issue,
> and the solution will IMO depend in whether signals are delivered to
> the main thread and whether or not the build uses the child signal
> pipe.
>
You misread me: I said SIGCHLD is unsafe even without threads.
>
--0000000000005dc3f4063dc28cb0
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
<div dir=3D"auto"><div><br><br><div class=3D"gmail_quote gmail_quote_contai=
ner"><div dir=3D"ltr" class=3D"gmail_attr">On Mon, Sep 1, 2025, 3:25=E2=80=
=AFPM Eli Zaretskii <<a href=3D"mailto:eliz@HIDDEN">eliz@HIDDEN</a>>=
; wrote:<br></div><blockquote class=3D"gmail_quote" style=3D"margin:0 0 0 .=
8ex;border-left:1px #ccc solid;padding-left:1ex">> From: Spencer Baugh &=
lt;<a href=3D"mailto:sbaugh@HIDDEN" target=3D"_blank" rel=3D"norefe=
rrer">sbaugh@HIDDEN</a>><br>
> Cc: <a href=3D"mailto:79334 <at> debbugs.gnu.org" target=3D"_blank" rel=3D"=
noreferrer">79334 <at> debbugs.gnu.org</a>,=C2=A0 <a href=3D"mailto:eggert@HIDDEN=
la.edu" target=3D"_blank" rel=3D"noreferrer">eggert@HIDDEN</a>,=C2=A0 =
<a href=3D"mailto:dmitry@HIDDEN" target=3D"_blank" rel=3D"noreferrer">dm=
itry@HIDDEN</a><br>
> Date: Mon, 01 Sep 2025 12:18:37 -0400<br>
> <br>
> > You assume that the waiting_thread field will be cleared when a t=
hread<br>
> > dies, but that is not guaranteed.=C2=A0 Why isn't it better t=
o be defensive<br>
> > and make sure the thread is still alive?=C2=A0 If you are right, =
that test<br>
> > will always be true, but what if you are wrong in some rare cases=
?<br>
> <br>
> Fair enough; the patch has changed a lot so this<br>
> other_thread_is_waiting_for function doesn't exist anymore, but if=
I<br>
> re-add it or similar checks, I will check that the thread is alive.<br=
>
<br>
Thanks.<br>
<br>
> > What I had in mind was neither delete it nor deactivate it.=C2=A0=
IOW, the<br>
> > loop which goes through the list of processes should only pay<br>
> > attention to processes locked to this thread and those that are n=
ot<br>
> > locked to any live thread.=C2=A0 It should leave the other proces=
ses to<br>
> > when their threads call status_notify.<br>
> <br>
> There are too many ways in which we delete or deactivate processes to =
do<br>
> that.<br>
<br>
We should audit all of them and prevent that from happening when the<br>
process is locked to another thread.<br>
<br>
> In particular, Lisp programs should be able to call<br>
> delete-process on a process without getting an error just because some=
<br>
> thread is currently calling select and happens to be monitoring that<b=
r>
> process.=C2=A0 To emit such an error would make programming with proce=
sses<br>
> and threads extremely difficult.<br>
<br>
This is a problem two orders of magnitude smaller than what we are<br>
trying to fix here.=C2=A0 So my suggestion is to leave it for later.<br>
There's no special reason why Lisp programs "should be able" =
to delete<br>
a process locked to another thread.=C2=A0 If we don't want to signal an=
<br>
error (though I don't see why not), we could perhaps do nothing<br>
silently, until we get to dealing with that later.<br></blockquote></div></=
div><div dir=3D"auto"><br></div><div dir=3D"auto">Silently doing nothing is=
also an unacceptably confusing result, in my opinion.</div><div dir=3D"aut=
o"><br></div><div dir=3D"auto">But it's okay: I believe it's possib=
le to solve the problem without introducing such bad behavior.</div><div di=
r=3D"auto"><br></div><div dir=3D"auto"><div class=3D"gmail_quote gmail_quot=
e_container"><blockquote class=3D"gmail_quote" style=3D"margin:0 0 0 .8ex;b=
order-left:1px #ccc solid;padding-left:1ex">
For now, sabotaging threads that wait for some sub-process is a much<br>
more grave problem, and we must fix it if we want threads and<br>
processes to live in peace in Emacs.<br></blockquote></div></div><div dir=
=3D"auto"><br></div><div dir=3D"auto">Of course.</div><div dir=3D"auto"><br=
></div><div dir=3D"auto"><div class=3D"gmail_quote gmail_quote_container"><=
blockquote class=3D"gmail_quote" style=3D"margin:0 0 0 .8ex;border-left:1px=
#ccc solid;padding-left:1ex">
> > Hmm... I don't understand: why do we need to do this?=C2=A0 T=
he calls to<br>
> > compute_input_wait_mask, compute_non_keyboard_wait_mask etc., whi=
ch<br>
> > set bits in the Available mask passed top pselect already avoid<b=
r>
> > setting bits for descriptors on which some other thread waits.=C2=
=A0 So<br>
> > pselect cannot possibly indicate any output ready on any descript=
ors<br>
> > belonging to other threads, and this loop you suggest to add:<br>
> ><br>
> >> +=C2=A0 =C2=A0 =C2=A0 /* Save the fds we're currently wai=
ting_thread for.=C2=A0 */<br>
> >> +=C2=A0 =C2=A0 =C2=A0 fd_set waiting_fds;<br>
> >> +=C2=A0 =C2=A0 =C2=A0 FD_ZERO (&waiting_fds);<br>
> >> +=C2=A0 =C2=A0 =C2=A0 int max_waiting_fds =3D 0;<br>
> >> +=C2=A0 =C2=A0 =C2=A0 for (int fd =3D 0; fd <=3D max_desc;=
++fd)<br>
> >> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 {<br>
> >> +=C2=A0 =C2=A0 if (fd_callback_info[fd].waiting_thread =3D=3D=
current_thread)<br>
> >> +=C2=A0 =C2=A0 =C2=A0 {<br>
> >> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 FD_SET (fd, &waiting_fds);<b=
r>
> >> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 max_waiting_fds =3D fd;<br>
> >> +=C2=A0 =C2=A0 =C2=A0 }<br>
> >> +=C2=A0 }<br>
> ><br>
> > should be unnecessary.=C2=A0 Or what am I missing?<br>
> <br>
> That's exactly the problem.=C2=A0 A file descriptor for which we s=
et<br>
> waiting_thread can have delete_read_fd or delete_write_fd called on it=
<br>
> by another thread, or by a signal handler, while we're waiting in<=
br>
> pselect.<br>
<br>
We need to prevent that from happening, not to go along with it.=C2=A0 But<=
br>
anyway, I don't see why you need another fd_set for this.=C2=A0 Why not=
<br>
just look at the descriptors in the Available set?<br></blockquote></div></=
div><div dir=3D"auto"><br></div><div dir=3D"auto">The Available set is clea=
red by the select call, and afterwards only contains descriptors which has =
events.=C2=A0 In that case we could just look at the descriptors in Availab=
le and check waiting_thread for each of them.</div><div dir=3D"auto"><br></=
div><div dir=3D"auto">But if there was an EBADF, then Available is left in =
an undefined state (see pselect(1)) and we no longer know what descriptors =
we originally set waiting_thread on.=C2=A0 So we can't check to see if =
one of our descriptors was "stolen" (had waiting_thread set to a =
different value), which would mean the EBADF is spurious and we should cont=
inue running.</div><div dir=3D"auto"><br></div><div dir=3D"auto">Though, I =
guess we could alternatively just assume that EBADF is always spurious.</di=
v><div dir=3D"auto"><br></div><div dir=3D"auto"><div class=3D"gmail_quote g=
mail_quote_container"><blockquote class=3D"gmail_quote" style=3D"margin:0 0=
0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
> BTW, the fact that this problematic case is also possible via signal<b=
r>
> handlers (namely the SIGCHLD handler) is a separate bug introduced by<=
br>
> the refactoring which added fd_callback_info.=C2=A0 It means that with=
a<br>
> certain interleaving, I think it should be possible to get buggy<br>
> behavior by just using processes without threads.=C2=A0 It's quite=
rare<br>
> though and I haven't been able to figure out what exact path that =
would<br>
> cause a problem.<br>
<br>
Let's not attempt to solve too many problems at the same time.=C2=A0 Ho=
w to<br>
make SIGCHLD safer when threads are present is an outstanding issue,<br>
and the solution will IMO depend in whether signals are delivered to<br>
the main thread and whether or not the build uses the child signal<br>
pipe.<br></blockquote></div></div><div dir=3D"auto"><br></div><div dir=3D"a=
uto">You misread me: I said SIGCHLD is unsafe even without threads.</div><d=
iv dir=3D"auto"><div class=3D"gmail_quote gmail_quote_container"><blockquot=
e class=3D"gmail_quote" style=3D"margin:0 0 0 .8ex;border-left:1px #ccc sol=
id;padding-left:1ex">
</blockquote></div></div></div>
--0000000000005dc3f4063dc28cb0--
bug-gnu-emacs@HIDDEN:bug#79334; Package emacs.
Full text available.
Received: (at 79334) by debbugs.gnu.org; 1 Sep 2025 19:25:46 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Mon Sep 01 15:25:45 2025
Received: from localhost ([127.0.0.1]:58537 helo=debbugs.gnu.org)
by debbugs.gnu.org with esmtp (Exim 4.84_2)
(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
id 1utA9x-0002bo-A8
for submit <at> debbugs.gnu.org; Mon, 01 Sep 2025 15:25:45 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10]:41010)
by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
(Exim 4.84_2) (envelope-from <eliz@HIDDEN>) id 1utA9u-0002bU-Mv
for 79334 <at> debbugs.gnu.org; Mon, 01 Sep 2025 15:25:43 -0400
Received: from fencepost.gnu.org ([2001:470:142:3::e])
by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
(Exim 4.90_1) (envelope-from <eliz@HIDDEN>)
id 1utA9l-0001HC-8M; Mon, 01 Sep 2025 15:25:33 -0400
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org;
s=fencepost-gnu-org; h=References:Subject:In-Reply-To:To:From:Date:
mime-version; bh=9B0ZZ/jkgQ7hrvUpL46jOl1XsDU80uUi/rpDHxM6vbs=; b=i3DyP9NmmLbu
uGLvTZ/KnSy+hJe/SQNAgQlGKE6Ajsc9XbeIHECP+fTHITp0KGqkjla7QvHSGDFievkhi3nRBAtMM
tZF25njzMhxbr3wOFyjhU5J4BCKhMNGh8J+r7TpZazA41r9KxAipjvsDSAr7FlVyeFgwid0YKthts
ffiZoI75sxKN9IOobVeuV/vm6hLEOOOsAuYTsgmiz4IhMDkXReUN4JJObqCq5Af8E0tkXR+4du/DS
WC4YwBMPQiSTbFahlNOwObQVD9PdE+Acj9upRZ3yiUIBRTCP5PwhF0v02+H7feGcRCutZ4fJt+qX+
522hdddkfidmicw//knukw==;
Date: Mon, 01 Sep 2025 22:25:26 +0300
Message-Id: <86ldmym0ux.fsf@HIDDEN>
From: Eli Zaretskii <eliz@HIDDEN>
To: Spencer Baugh <sbaugh@HIDDEN>
In-Reply-To: <iera53eno2q.fsf@HIDDEN> (message from Spencer Baugh on
Mon, 01 Sep 2025 12:18:37 -0400)
Subject: Re: bug#79334: [PATCH] Don't release thread select lock
unnecessarily, [PATCH] Defer closing file descriptors used by other
threads
References: <ierwm6n6tno.fsf@HIDDEN> <86cy8es4bo.fsf@HIDDEN>
<ierqzwu6ztl.fsf@HIDDEN> <86a53is0zw.fsf@HIDDEN>
<ier8qj26nlj.fsf@HIDDEN> <ier5xe57vrr.fsf@HIDDEN>
<ier34997u4c.fsf_-_@HIDDEN> <86zfbhqptt.fsf@HIDDEN>
<ierldn0olcg.fsf@HIDDEN> <86jz2kpy1w.fsf@HIDDEN>
<iera53eno2q.fsf@HIDDEN>
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: 79334
Cc: 79334 <at> debbugs.gnu.org, eggert@HIDDEN, dmitry@HIDDEN
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>,
<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>,
<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -3.3 (---)
> From: Spencer Baugh <sbaugh@HIDDEN>
> Cc: 79334 <at> debbugs.gnu.org, eggert@HIDDEN, dmitry@HIDDEN
> Date: Mon, 01 Sep 2025 12:18:37 -0400
>
> > You assume that the waiting_thread field will be cleared when a thread
> > dies, but that is not guaranteed. Why isn't it better to be defensive
> > and make sure the thread is still alive? If you are right, that test
> > will always be true, but what if you are wrong in some rare cases?
>
> Fair enough; the patch has changed a lot so this
> other_thread_is_waiting_for function doesn't exist anymore, but if I
> re-add it or similar checks, I will check that the thread is alive.
Thanks.
> > What I had in mind was neither delete it nor deactivate it. IOW, the
> > loop which goes through the list of processes should only pay
> > attention to processes locked to this thread and those that are not
> > locked to any live thread. It should leave the other processes to
> > when their threads call status_notify.
>
> There are too many ways in which we delete or deactivate processes to do
> that.
We should audit all of them and prevent that from happening when the
process is locked to another thread.
> In particular, Lisp programs should be able to call
> delete-process on a process without getting an error just because some
> thread is currently calling select and happens to be monitoring that
> process. To emit such an error would make programming with processes
> and threads extremely difficult.
This is a problem two orders of magnitude smaller than what we are
trying to fix here. So my suggestion is to leave it for later.
There's no special reason why Lisp programs "should be able" to delete
a process locked to another thread. If we don't want to signal an
error (though I don't see why not), we could perhaps do nothing
silently, until we get to dealing with that later.
For now, sabotaging threads that wait for some sub-process is a much
more grave problem, and we must fix it if we want threads and
processes to live in peace in Emacs.
> > Hmm... I don't understand: why do we need to do this? The calls to
> > compute_input_wait_mask, compute_non_keyboard_wait_mask etc., which
> > set bits in the Available mask passed top pselect already avoid
> > setting bits for descriptors on which some other thread waits. So
> > pselect cannot possibly indicate any output ready on any descriptors
> > belonging to other threads, and this loop you suggest to add:
> >
> >> + /* Save the fds we're currently waiting_thread for. */
> >> + fd_set waiting_fds;
> >> + FD_ZERO (&waiting_fds);
> >> + int max_waiting_fds = 0;
> >> + for (int fd = 0; fd <= max_desc; ++fd)
> >> + {
> >> + if (fd_callback_info[fd].waiting_thread == current_thread)
> >> + {
> >> + FD_SET (fd, &waiting_fds);
> >> + max_waiting_fds = fd;
> >> + }
> >> + }
> >
> > should be unnecessary. Or what am I missing?
>
> That's exactly the problem. A file descriptor for which we set
> waiting_thread can have delete_read_fd or delete_write_fd called on it
> by another thread, or by a signal handler, while we're waiting in
> pselect.
We need to prevent that from happening, not to go along with it. But
anyway, I don't see why you need another fd_set for this. Why not
just look at the descriptors in the Available set?
> BTW, the fact that this problematic case is also possible via signal
> handlers (namely the SIGCHLD handler) is a separate bug introduced by
> the refactoring which added fd_callback_info. It means that with a
> certain interleaving, I think it should be possible to get buggy
> behavior by just using processes without threads. It's quite rare
> though and I haven't been able to figure out what exact path that would
> cause a problem.
Let's not attempt to solve too many problems at the same time. How to
make SIGCHLD safer when threads are present is an outstanding issue,
and the solution will IMO depend in whether signals are delivered to
the main thread and whether or not the build uses the child signal
pipe.
bug-gnu-emacs@HIDDEN:bug#79334; Package emacs.
Full text available.
Received: (at 79334) by debbugs.gnu.org; 1 Sep 2025 16:18:46 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Mon Sep 01 12:18:46 2025
Received: from localhost ([127.0.0.1]:58313 helo=debbugs.gnu.org)
by debbugs.gnu.org with esmtp (Exim 4.84_2)
(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
id 1ut7F0-0008Nb-08
for submit <at> debbugs.gnu.org; Mon, 01 Sep 2025 12:18:46 -0400
Received: from mxout5.mail.janestreet.com ([64.215.233.18]:38151)
by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
(Exim 4.84_2) (envelope-from <sbaugh@HIDDEN>)
id 1ut7Ex-0008NM-4X
for 79334 <at> debbugs.gnu.org; Mon, 01 Sep 2025 12:18:43 -0400
From: Spencer Baugh <sbaugh@HIDDEN>
To: Eli Zaretskii <eliz@HIDDEN>
Subject: Re: bug#79334: [PATCH] Don't release thread select lock
unnecessarily, [PATCH] Defer closing file descriptors used by other
threads
In-Reply-To: <86jz2kpy1w.fsf@HIDDEN> (Eli Zaretskii's message of "Sat, 30 Aug
2025 19:35:39 +0300")
References: <ierwm6n6tno.fsf@HIDDEN> <86cy8es4bo.fsf@HIDDEN>
<ierqzwu6ztl.fsf@HIDDEN> <86a53is0zw.fsf@HIDDEN>
<ier8qj26nlj.fsf@HIDDEN> <ier5xe57vrr.fsf@HIDDEN>
<ier34997u4c.fsf_-_@HIDDEN> <86zfbhqptt.fsf@HIDDEN>
<ierldn0olcg.fsf@HIDDEN> <86jz2kpy1w.fsf@HIDDEN>
Date: Mon, 01 Sep 2025 12:18:37 -0400
Message-ID: <iera53eno2q.fsf@HIDDEN>
User-Agent: Gnus/5.13 (Gnus v5.13)
MIME-Version: 1.0
Content-Type: text/plain
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=janestreet.com;
s=waixah; t=1756743517;
bh=thdIrYshUBoOQVmp2I25ATjxrErnxWZ7ivf5EY1vQNc=;
h=From:To:Cc:Subject:In-Reply-To:References:Date;
b=3Dq5Z4ZTjnYJbtNpiOdCmv5jzIAhFXJNU4uUQc+o8f0MdOykWOnDLTErnoonsytY8
P4yFt8AfpwduaD+HXAJ6itz8AwRLZ9MnhNpQhC+lLCyf2q0EKB2BgN+NXz20NLGcQs
fsR4Ca0tcLATHS3oX7fLUjANkcOPk8f4lTqDHFRR/pc7hYEDs6evR1++qGevKHxv/e
EWCo52T4sS+BKlFDP++v6lB3dfwr5wvkHyLp65S7nJR5HjTf42b435rgNxXCjopqv/
71ClIv21i+qUBV8C56OItVcnMkBETLdPb99BrAhatkYMwmIWgsGtMGOrXA+G39QNtd
vLPb+NIkdGS/g==
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: 79334
Cc: 79334 <at> debbugs.gnu.org, eggert@HIDDEN, dmitry@HIDDEN
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>,
<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>,
<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -3.3 (---)
Eli Zaretskii <eliz@HIDDEN> writes:
>> From: Spencer Baugh <sbaugh@HIDDEN>
>> >> +static bool
>> >> +other_thread_is_waiting_for (struct fd_callback_data* elem)
>> >> +{
>> >> + return elem->waiting_thread != NULL && elem->waiting_thread != current_thread;
>> >> +}
>> >
>> > This should also check that the waiting thread is still alive. If it
>> > isn't, it's okay to close the descriptor. Otherwise, descriptors
>> > belonging to threads that exited might never be closed.
>>
>> waiting_thread always points to a living thread, because it's only set
>> while a thread is inside wait_reading_process_output.
>>
>> (In fact, when a thread sees waiting_thread set to a value which is
>> neither NULL nor current_thread, it can only ever point to a thread
>> which is currently inside thread_select)
>
> You assume that the waiting_thread field will be cleared when a thread
> dies, but that is not guaranteed. Why isn't it better to be defensive
> and make sure the thread is still alive? If you are right, that test
> will always be true, but what if you are wrong in some rare cases?
Fair enough; the patch has changed a lot so this
other_thread_is_waiting_for function doesn't exist anymore, but if I
re-add it or similar checks, I will check that the thread is alive.
>> >> + if (fd_callback_info[fd].waiting_thread == current_thread
>> >> + && (fd_callback_info[fd].flags == WAITING_FOR_CLOSE_FD))
>> >> + {
>> >> + fprintf (stderr, "closing deferred fd %d\n", fd);
>> >> + emacs_close (fd);
>> >> + /* Ignore any events which happened on this fd. */
>> >> + FD_CLR (fd, &Available);
>> >> + FD_CLR (fd, &Writeok);
>> >> + fd_callback_info[fd].flags = 0;
>> >
>> > Why do it here and not when we deactivate processes?
>>
>> Because that would close file descriptors that another thread is
>> currently selecting on, causing EBADF.
>
> No, I mean to deactivate only the processes whose I/O channels are
> locked to the current thread.
>
>> > Until now we never closed any descriptors inside
>> > wait_reading_process_output.
>>
>> Just to be clear, we can call deactivate_process from
>> wait_reading_process_output, through status_notify, through process
>> filters, or other similar. That closes descriptors.
>
> Yes, and so I suggest doing this logic of deactivating only the
> processes that are locked to the current thread there.
>
>> > More generally, why this design and not a simpler change which only
>> > deactivates processes whose I/O is waited by the current thread, as
>> > discussed previously?
>>
>> I looked into that, but I determined that it's too complicated. It
>> would require us to keep track of a new kind of process state: a process
>> which has been deleted, but hasn't been deactivated yet.
>
> What I had in mind was neither delete it nor deactivate it. IOW, the
> loop which goes through the list of processes should only pay
> attention to processes locked to this thread and those that are not
> locked to any live thread. It should leave the other processes to
> when their threads call status_notify.
There are too many ways in which we delete or deactivate processes to do
that. In particular, Lisp programs should be able to call
delete-process on a process without getting an error just because some
thread is currently calling select and happens to be monitoring that
process. To emit such an error would make programming with processes
and threads extremely difficult.
>> > Looking at all the callers of close_process_fd, it sounds like it's
>> > too low-level to do this. For example, it is called from
>> > create_process (including in the forked child) to close the unneeded
>> > ends of the pipe, where we probably don't want this. And
>> > clear_fd_callback_data is also used for the keyboard input descriptor,
>> > which is probably also not relevant.
>>
>> True.
>>
>> I looked into fixing these, and that suggested to me an altogether
>> simpler approach. We can keep the logic entirely contained in
>> wait_reading_process_output, just adding a bit of code around the select
>> call.
>>
>> And we don't need to defer deleting the fds at all: just recognize when
>> it's happened, and be careful to not touch those fds afterwards.
>
> Hmm... I don't understand: why do we need to do this? The calls to
> compute_input_wait_mask, compute_non_keyboard_wait_mask etc., which
> set bits in the Available mask passed top pselect already avoid
> setting bits for descriptors on which some other thread waits. So
> pselect cannot possibly indicate any output ready on any descriptors
> belonging to other threads, and this loop you suggest to add:
>
>> + /* Save the fds we're currently waiting_thread for. */
>> + fd_set waiting_fds;
>> + FD_ZERO (&waiting_fds);
>> + int max_waiting_fds = 0;
>> + for (int fd = 0; fd <= max_desc; ++fd)
>> + {
>> + if (fd_callback_info[fd].waiting_thread == current_thread)
>> + {
>> + FD_SET (fd, &waiting_fds);
>> + max_waiting_fds = fd;
>> + }
>> + }
>
> should be unnecessary. Or what am I missing?
That's exactly the problem. A file descriptor for which we set
waiting_thread can have delete_read_fd or delete_write_fd called on it
by another thread, or by a signal handler, while we're waiting in
pselect.
We need to catch the fact that the file descriptor has been removed from
the set of file descriptors we're supposed to monitor, and avoid
reacting to events on it. We can catch this easily by seeing that
waiting_thread has been cleared or changed while we were blocked in
select.
BTW, the fact that this problematic case is also possible via signal
handlers (namely the SIGCHLD handler) is a separate bug introduced by
the refactoring which added fd_callback_info. It means that with a
certain interleaving, I think it should be possible to get buggy
behavior by just using processes without threads. It's quite rare
though and I haven't been able to figure out what exact path that would
cause a problem.
bug-gnu-emacs@HIDDEN:bug#79334; Package emacs.
Full text available.Received: (at 79334) by debbugs.gnu.org; 1 Sep 2025 00:40:05 +0000 From debbugs-submit-bounces <at> debbugs.gnu.org Sun Aug 31 20:40:05 2025 Received: from localhost ([127.0.0.1]:55810 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1ussaa-000670-Uw for submit <at> debbugs.gnu.org; Sun, 31 Aug 2025 20:40:05 -0400 Received: from fhigh-b4-smtp.messagingengine.com ([202.12.124.155]:41797) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from <dmitry@HIDDEN>) id 1ussaW-00065X-Mg for 79334 <at> debbugs.gnu.org; Sun, 31 Aug 2025 20:40:02 -0400 Received: from phl-compute-10.internal (phl-compute-10.internal [10.202.2.50]) by mailfhigh.stl.internal (Postfix) with ESMTP id 2EE427A00E5; Sun, 31 Aug 2025 20:39:55 -0400 (EDT) Received: from phl-mailfrontend-02 ([10.202.2.163]) by phl-compute-10.internal (MEProxy); Sun, 31 Aug 2025 20:39:55 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gutov.dev; h=cc :cc:content-transfer-encoding:content-type:content-type:date :date:from:from:in-reply-to:in-reply-to:message-id:mime-version :references:reply-to:subject:subject:to:to; s=fm2; t=1756687195; x=1756773595; bh=8YCpyn73utj8PCl/bcalfnUy6XzbxM41ctQwwXN7Y/k=; b= AVORwJqvW96dssj4L65bo2MiHWyN+Tk5CTwo/K3aWosHRCc8iIcB6XfOHGgNSQ53 FijsWEBhPDn+E4UyqHyVIgQAusKHFouh7JYOzLvU1zwShYxEnHmPt1oSi2KjQK/g txvyU0kKhT+C66Q3U3SXwhQX4GR33wBtEDiIv2lJxP2nc30//P/FyzaRbb2BHED3 4RPIfRKP/G4tkcob0ekyOd3ia1rt/CCC2RgeDiMV/AseDxVvXi4H8Kld2GU6hhZe bB6hw1siAPpdE0JsOC/7yrUT6VXVrsnKH4qutbV7/XCOHFCmLt5GJcdO1k4uS1hs iHrcL0p/qyD3SMzfTR5UUg== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-transfer-encoding :content-type:content-type:date:date:feedback-id:feedback-id :from:from:in-reply-to:in-reply-to:message-id:mime-version :references:reply-to:subject:subject:to:to:x-me-proxy :x-me-sender:x-me-sender:x-sasl-enc; s=fm1; t=1756687195; x= 1756773595; bh=8YCpyn73utj8PCl/bcalfnUy6XzbxM41ctQwwXN7Y/k=; b=h qEHRczFVR9ZRcLfxiFabSf6U9l01g/yYCVxg8tWTwIXo9aLz4ZMzZ7o97b1rXEs3 OtaODJ5im61pA74AwBGvjT9OjfdC/QR1mk9y/LKDI7VQSI2nx0Y2C4jaZzbjGNfo pxcUO7wXASuaqBMA/7ouaF5RZL/RR0rBFnslu9bXIBakWi6eiDnc9XLBdOT+XH6+ T6UcH+hFQA/ScvqiCyMcI8BeJA8DRZnNXldtLCT8h4APi45jR1WlMKXhIHx2HhP+ mvudFQ4AoWhmjdwrFnnfl+nlyd2rS6NCmhs0WAdCzPc5d21Q16+hPE5rBWIznPsg vDUS9piQScYSnfJ1NYlXA== X-ME-Sender: <xms:Wuu0aGVsSvztFosLuCmujArUjhwuZEzEliX_EpIwrpXL5sXopeLobQ> <xme:Wuu0aMAacHks8W-AAfwG3dGtgzQfeAVB6Qyr3SG2OX2gWz2MO3E65ZhuKkMh8KrAX FDEQy-Fabmyl-8H1Ds> X-ME-Received: <xmr:Wuu0aJ8xWuTKqf-thVNTIkfGNtN_aSd9UJY6e1TSe-_i39O3mjWqQi8DMp69jxteo2M> X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeeffedrtdefgdduledtjeefucetufdoteggodetrf dotffvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfurfetoffkrfgpnffqhgenuceu rghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmnecujf gurhepkfffgggfuffvvehfhfgjtgfgsehtjeertddtvdejnecuhfhrohhmpeffmhhithhr hicuifhuthhovhcuoegumhhithhrhiesghhuthhovhdruggvvheqnecuggftrfgrthhtvg hrnhepteduleejgeehtefgheegjeekueehvdevieekueeftddvtdevfefhvdevgedujeeh necuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehmrghilhhfrhhomhepughmih htrhihsehguhhtohhvrdguvghvpdhnsggprhgtphhtthhopeegpdhmohguvgepshhmthhp ohhuthdprhgtphhtthhopegvlhhiiiesghhnuhdrohhrghdprhgtphhtthhopehssggruh hghhesjhgrnhgvshhtrhgvvghtrdgtohhmpdhrtghpthhtohepjeelfeefgeesuggvsggs uhhgshdrghhnuhdrohhrghdprhgtphhtthhopegvghhgvghrthestghsrdhutghlrgdrvg guuh X-ME-Proxy: <xmx:Wuu0aDEOe2tiZckX0ajNshpE-EP9oJS0R0LfkBhM7LrQ3rA_dwUlnQ> <xmx:Wuu0aHP3fQD5xTaFimlwpclVQCJPISrFKpxC75AfbKlVXOe96zikrw> <xmx:Wuu0aNdACt23rE7himh7yXxrcC5iSHGmautYPK7oHqs5hyZoz3LCxQ> <xmx:Wuu0aMvvcxGI_cxk97VJCk7s_dSFveANGfoFFanEdvKRXJcE9AKKtg> <xmx:W-u0aPz_Db3Z-Rz71tsA3AATxekg3Da_5mECrtxhXsTou41J7vFQXOVX> Feedback-ID: i07de48aa:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Sun, 31 Aug 2025 20:39:53 -0400 (EDT) Message-ID: <ea775e4c-d182-48ce-b7c1-76ba2051cd0f@HIDDEN> Date: Mon, 1 Sep 2025 03:39:50 +0300 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: bug#79334: [PATCH] Don't release thread select lock unnecessarily To: Eli Zaretskii <eliz@HIDDEN>, Spencer Baugh <sbaugh@HIDDEN> References: <ierwm6n6tno.fsf@HIDDEN> <86cy8es4bo.fsf@HIDDEN> <ierqzwu6ztl.fsf@HIDDEN> <86a53is0zw.fsf@HIDDEN> <ieriki66y3g.fsf@HIDDEN> <867bymrzvr.fsf@HIDDEN> <ierfrda6v41.fsf@HIDDEN> <865xe6rv8z.fsf@HIDDEN> Content-Language: en-US From: Dmitry Gutov <dmitry@HIDDEN> In-Reply-To: <865xe6rv8z.fsf@HIDDEN> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Score: -0.7 (/) X-Debbugs-Envelope-To: 79334 Cc: 79334 <at> debbugs.gnu.org, eggert@HIDDEN X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -1.7 (-) On 29/08/2025 18:41, Eli Zaretskii wrote: > For example, try to run something in a non-main > thread and then attempt to interact with Emacs in the main thread. Could that be code that does straight computation and does not switch threads at all, unless sleep-for is added? Perhaps relatedly, diff-hl in "threaded" mode has a much better interactivity since switching to 'make-process' from 'call-process' for its external program interaction. But you're probably thinking of different kind of examples.
bug-gnu-emacs@HIDDEN:bug#79334; Package emacs.
Full text available.Received: (at 79334) by debbugs.gnu.org; 31 Aug 2025 03:20:39 +0000 From debbugs-submit-bounces <at> debbugs.gnu.org Sat Aug 30 23:20:39 2025 Received: from localhost ([127.0.0.1]:51030 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1usYcR-0003dk-BX for submit <at> debbugs.gnu.org; Sat, 30 Aug 2025 23:20:39 -0400 Received: from fout-b4-smtp.messagingengine.com ([202.12.124.147]:59649) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from <dmitry@HIDDEN>) id 1usYcN-0003dW-P7 for 79334 <at> debbugs.gnu.org; Sat, 30 Aug 2025 23:20:37 -0400 Received: from phl-compute-06.internal (phl-compute-06.internal [10.202.2.46]) by mailfout.stl.internal (Postfix) with ESMTP id 0AFD41D00079; Sat, 30 Aug 2025 23:20:30 -0400 (EDT) Received: from phl-mailfrontend-02 ([10.202.2.163]) by phl-compute-06.internal (MEProxy); Sat, 30 Aug 2025 23:20:30 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gutov.dev; h=cc :cc:content-transfer-encoding:content-type:content-type:date :date:from:from:in-reply-to:in-reply-to:message-id:mime-version :references:reply-to:subject:subject:to:to; s=fm2; t=1756610429; x=1756696829; bh=kBqdDm/J9GTebJbtyuW2zpOIxB1rhdME0Tm8bp3Ab7I=; b= fOPGnDhG0mGXGFYEBlMFdy8CpNsCzgDZzndt6JI5CKk+TdK3Pt3MqGgOsChq56An qjc/44gE8+2oRItg4kJyyrht/6GF1yMIUxQ0SPkrXSw6QUtV3OfXaABaf8rALsrl U1rcy3VyTdnRP1FrEFdDBQgDrsPCzLW3qvCQX1PBmi76rIjSaC5Ai+mDYIyXUuYK 8prpcK5IeiGOiA03Gwl6FG5i6xHr/vNbIY9MQhf0Vwjp8AcvT2SFE8uf88T7XHLj Hs+s3BSOIdH6/JpzhsGzIVFxt3oO2aBw6phKV39mzyJZwPeFCiXDAJkABqDplJ6s SdnrxETMTl6CMzBE5PtlMw== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-transfer-encoding :content-type:content-type:date:date:feedback-id:feedback-id :from:from:in-reply-to:in-reply-to:message-id:mime-version :references:reply-to:subject:subject:to:to:x-me-proxy :x-me-sender:x-me-sender:x-sasl-enc; s=fm1; t=1756610429; x= 1756696829; bh=kBqdDm/J9GTebJbtyuW2zpOIxB1rhdME0Tm8bp3Ab7I=; b=D K7+vxNeR0ntkaLpqbYvcLK93lYK7VhaHi8FxuBg0EfnRKcayUcBSWUUkmDHJY7Sm 5I4VMun8tK8eDWgSzw7kXCm4GEZ9NcTMP0nps4i0+0ZJUy+YZUSQNFyG9ZqDDwGM APzu5eqjDmj2gVX04PkVjVrSVdyyJbKdqwtf37UarBOEuxIyS5Jk1OG+hLh4f/2y geKq9lRPLh8hEqSzpunA/kKIvAM37VisqG5kXKkfLTJhb9Mv5dcHJj/sjktGBeTR DE05TtgS5p5/aSmMdkPY8IoFiVPV60zb9Z1m3wwCzBcURduyzViZMG46gM4tiCt/ 0wwymAt7G9tO6IBufSksg== X-ME-Sender: <xms:fb-zaJJbRM1arfSVIYRbEaUAur_y4aDQJlCOXH93lT8_IMQAkz9vIQ> <xme:fb-zaOleOYQDvsTPClHTVbiZYo6DID-JbGZJ6K4b8ya2vVH5E0JgUZJ5bdrHzFxBm iCCOWuzcdMPLdKaEVU> X-ME-Received: <xmr:fb-zaBQNFbODelBz5qdq3KehI0CuTbDGblceis-v9HV2KiI5F8HfuH6mK7R1R0_tqFQ> X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeeffedrtdefgddukeekudekucetufdoteggodetrf dotffvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfurfetoffkrfgpnffqhgenuceu rghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmnecujf gurhepkfffgggfuffvvehfhfgjtgfgsehtkeertddtvdejnecuhfhrohhmpeffmhhithhr hicuifhuthhovhcuoegumhhithhrhiesghhuthhovhdruggvvheqnecuggftrfgrthhtvg hrnhepgeelfeetkefghfdvhfdtgeevveevteetgeetveegtedthefhudekteehffeukeek necuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehmrghilhhfrhhomhepughmih htrhihsehguhhtohhvrdguvghvpdhnsggprhgtphhtthhopeegpdhmohguvgepshhmthhp ohhuthdprhgtphhtthhopehssggruhhghhesjhgrnhgvshhtrhgvvghtrdgtohhmpdhrtg hpthhtohepvghlihiisehgnhhurdhorhhgpdhrtghpthhtohepjeelfeefgeesuggvsggs uhhgshdrghhnuhdrohhrghdprhgtphhtthhopegvghhgvghrthestghsrdhutghlrgdrvg guuh X-ME-Proxy: <xmx:fb-zaAKdERhbPcEsPrHURpFs9IM7n0M1ICubOYAXLr49_L7wJxovmA> <xmx:fb-zaHCMlKeBgkYIscewssvW5Msd56C1d6_KaH8uYVz3uyLMYzPLFQ> <xmx:fb-zaJD-vgq6tUJ60TYsMvEhvCRx8XHfO0Be5Hma9T7cVIulqX8dLA> <xmx:fb-zaJDtaD7241TZECoZm8_rEU0IEDil7Xo5n7RCBmY8TSR3ztsyEg> <xmx:fb-zaBGSNoHz0B8JbfbYzhbIKAAx7IYzArrsZU5XwahikBcmbYOtidVH> Feedback-ID: i07de48aa:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Sat, 30 Aug 2025 23:20:27 -0400 (EDT) Message-ID: <55676972-950d-440c-a291-0adefced4e6f@HIDDEN> Date: Sun, 31 Aug 2025 06:20:24 +0300 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: bug#79334: [PATCH] Don't release thread select lock unnecessarily To: Spencer Baugh <sbaugh@HIDDEN>, Eli Zaretskii <eliz@HIDDEN> References: <ierwm6n6tno.fsf@HIDDEN> <86cy8es4bo.fsf@HIDDEN> <ierqzwu6ztl.fsf@HIDDEN> <86a53is0zw.fsf@HIDDEN> <ieriki66y3g.fsf@HIDDEN> <867bymrzvr.fsf@HIDDEN> <ierfrda6v41.fsf@HIDDEN> <865xe6rv8z.fsf@HIDDEN> <CAO=BR8PCwypg+7TMbbQueQh82drqQTeYNU0LGqVQ=oDv58v5NQ@HIDDEN> <86349aruhe.fsf@HIDDEN> <CAO=BR8OFBKbTht7_uQsKOsfNOBz_EqTPBrrF9fizv8qb5K8KJw@HIDDEN> Content-Language: en-US From: Dmitry Gutov <dmitry@HIDDEN> In-Reply-To: <CAO=BR8OFBKbTht7_uQsKOsfNOBz_EqTPBrrF9fizv8qb5K8KJw@HIDDEN> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Spam-Score: -0.7 (/) X-Debbugs-Envelope-To: 79334 Cc: 79334 <at> debbugs.gnu.org, eggert@HIDDEN X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -1.7 (-) On 29/08/2025 19:11, Spencer Baugh wrote: > On Fri, Aug 29, 2025, 11:57 AM Eli Zaretskii <eliz@HIDDEN > <mailto:eliz@HIDDEN>> wrote: > > > From: Spencer Baugh <sbaugh@HIDDEN > <mailto:sbaugh@HIDDEN>> > > Date: Fri, 29 Aug 2025 11:43:21 -0400 > > Cc: 79334 <at> debbugs.gnu.org <mailto:79334 <at> debbugs.gnu.org>, > eggert@HIDDEN <mailto:eggert@HIDDEN>, Dmitry Gutov > <dmitry@HIDDEN <mailto:dmitry@HIDDEN>> > > > > Okay, so then how about we delete this unsafe thread_select and > put some thread-yield calls around > > wait_reading_process_output? > > I'm not objected to not calling thread_select there, but can we have > tests that will show us the gains? > > > The primary gain is that it becomes easier to reason about locking. I'm > not sure how to write a test for that :) I think what could help in this area is some benchmarking scenario, and the goal would be to show that it doesn't regress (or within the margin of error) when 'pselect' is used directly. A script which would create a bunch of threads and some amount of processes spawned by them with certain characteristics. Depending on what workload switching in that thread_select could help optimize.
bug-gnu-emacs@HIDDEN:bug#79334; Package emacs.
Full text available.
Received: (at 79334) by debbugs.gnu.org; 30 Aug 2025 18:39:05 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Sat Aug 30 14:39:05 2025
Received: from localhost ([127.0.0.1]:49777 helo=debbugs.gnu.org)
by debbugs.gnu.org with esmtp (Exim 4.84_2)
(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
id 1usQTg-0000rM-HA
for submit <at> debbugs.gnu.org; Sat, 30 Aug 2025 14:39:04 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10]:56948)
by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
(Exim 4.84_2) (envelope-from <eliz@HIDDEN>) id 1usQTe-0000qs-5m
for 79334 <at> debbugs.gnu.org; Sat, 30 Aug 2025 14:39:02 -0400
Received: from fencepost.gnu.org ([2001:470:142:3::e])
by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
(Exim 4.90_1) (envelope-from <eliz@HIDDEN>)
id 1usQTX-0000q6-Ks; Sat, 30 Aug 2025 14:38:55 -0400
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org;
s=fencepost-gnu-org; h=References:Subject:In-Reply-To:To:From:Date:
mime-version; bh=8LEoYm1gbNW2XxsqKaotoKk4KLK8AoTVTw7bWqdpKeI=; b=D7IEz5XMQNSF
b+BSITnsDKLNYKblAPBqkvKWdsgGGPC84kbTgOyhunHlP1VLUW/hhQtoeuztQGUJfiwIDeR5Vl+v4
q2rsS3RfFZpxQT4ZDsZnMLEPjgVOEcTIw46QnohG5bTI/hHHDVldZKfuFBLlAGNkqwVDYXlEPVFWV
I9HTaqd1p18pSjHnkBpYjYeRXeso8ZLHO0Jwh0X6LdAXo45uf+ahCvJaSEWOlf/PmH7ZrZLUJz58m
VuKtstb7J/jhDRhLIwZk5x7V+dKYVg7HF2W8FTsV8k48yA+L9n6hDRXt9hFTn7nMNwDZY/E//IOgi
a/4vxTrHYgN0dW1nDX+wYg==;
Date: Sat, 30 Aug 2025 21:38:52 +0300
Message-Id: <86a53gpscj.fsf@HIDDEN>
From: Eli Zaretskii <eliz@HIDDEN>
To: sbaugh@HIDDEN
In-Reply-To: <86jz2kpy1w.fsf@HIDDEN> (message from Eli Zaretskii on Sat, 30
Aug 2025 19:35:39 +0300)
Subject: Re: bug#79334: [PATCH] Don't release thread select lock unnecessarily,
[PATCH] Defer closing file descriptors used by other threads
References: <ierwm6n6tno.fsf@HIDDEN> <86cy8es4bo.fsf@HIDDEN>
<ierqzwu6ztl.fsf@HIDDEN> <86a53is0zw.fsf@HIDDEN>
<ier8qj26nlj.fsf@HIDDEN> <ier5xe57vrr.fsf@HIDDEN>
<ier34997u4c.fsf_-_@HIDDEN> <86zfbhqptt.fsf@HIDDEN>
<ierldn0olcg.fsf@HIDDEN> <86jz2kpy1w.fsf@HIDDEN>
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: 79334
Cc: 79334 <at> debbugs.gnu.org, eggert@HIDDEN, dmitry@HIDDEN
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>,
<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>,
<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -3.3 (---)
> Cc: 79334 <at> debbugs.gnu.org, eggert@HIDDEN, dmitry@HIDDEN
> Date: Sat, 30 Aug 2025 19:35:39 +0300
> From: Eli Zaretskii <eliz@HIDDEN>
>
> > >> + if (fd_callback_info[fd].waiting_thread == current_thread
> > >> + && (fd_callback_info[fd].flags == WAITING_FOR_CLOSE_FD))
> > >> + {
> > >> + fprintf (stderr, "closing deferred fd %d\n", fd);
> > >> + emacs_close (fd);
> > >> + /* Ignore any events which happened on this fd. */
> > >> + FD_CLR (fd, &Available);
> > >> + FD_CLR (fd, &Writeok);
> > >> + fd_callback_info[fd].flags = 0;
> > >
> > > Why do it here and not when we deactivate processes?
> >
> > Because that would close file descriptors that another thread is
> > currently selecting on, causing EBADF.
>
> No, I mean to deactivate only the processes whose I/O channels are
> locked to the current thread.
>
> > > Until now we never closed any descriptors inside
> > > wait_reading_process_output.
> >
> > Just to be clear, we can call deactivate_process from
> > wait_reading_process_output, through status_notify, through process
> > filters, or other similar. That closes descriptors.
>
> Yes, and so I suggest doing this logic of deactivating only the
> processes that are locked to the current thread there.
Btw, as a nice bonus of that approach, process sentinel will run in
the context of the thread to which the process is locked.
bug-gnu-emacs@HIDDEN:bug#79334; Package emacs.
Full text available.
Received: (at 79334) by debbugs.gnu.org; 30 Aug 2025 16:36:18 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Sat Aug 30 12:36:17 2025
Received: from localhost ([127.0.0.1]:49502 helo=debbugs.gnu.org)
by debbugs.gnu.org with esmtp (Exim 4.84_2)
(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
id 1usOYq-0003EB-T7
for submit <at> debbugs.gnu.org; Sat, 30 Aug 2025 12:36:17 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10]:52644)
by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
(Exim 4.84_2) (envelope-from <eliz@HIDDEN>) id 1usOYk-0003Do-CI
for 79334 <at> debbugs.gnu.org; Sat, 30 Aug 2025 12:36:13 -0400
Received: from fencepost.gnu.org ([2001:470:142:3::e])
by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
(Exim 4.90_1) (envelope-from <eliz@HIDDEN>)
id 1usOYb-0001nr-6T; Sat, 30 Aug 2025 12:36:01 -0400
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org;
s=fencepost-gnu-org; h=References:Subject:In-Reply-To:To:From:Date:
mime-version; bh=raezIEp48dBi7BFRaXVbPHQaQPAaSVCnHRK66D8sQDQ=; b=rUlwx0ctE15Z
2iusHfN27meVn8FZwbHrkdjQNfkRZJSg9I8ISUhaC/cu/VJyqAxPQsa/MKwf550Jzv6owE8RZopJJ
BWnYBa6FI5nDcDnYzFIWNFcWChRRO6KZl42OmI0l6ODwLmW+ZB30fD7ACzekdPk3vJ9SvmQJZBFx9
62sUeElm29Nm3M+7vDYE4S4URdQCtGHOJRs1df/JT4ZAL74sjnBzEZPLD9yF2+VxRFneCoVC41mtW
M4ivDbEpopFnm2H2no78I46/0Ta4wJAtz1vGgCEnvcxe4parbbLZfNw87tqcFpXWQX0a6VViQMbxy
7CZLvWwANgOm21ydcL4w4A==;
Date: Sat, 30 Aug 2025 19:35:39 +0300
Message-Id: <86jz2kpy1w.fsf@HIDDEN>
From: Eli Zaretskii <eliz@HIDDEN>
To: Spencer Baugh <sbaugh@HIDDEN>
In-Reply-To: <ierldn0olcg.fsf@HIDDEN> (message from Spencer Baugh on
Sat, 30 Aug 2025 11:55:27 -0400)
Subject: Re: bug#79334: [PATCH] Don't release thread select lock
unnecessarily, [PATCH] Defer closing file descriptors used by other
threads
References: <ierwm6n6tno.fsf@HIDDEN> <86cy8es4bo.fsf@HIDDEN>
<ierqzwu6ztl.fsf@HIDDEN> <86a53is0zw.fsf@HIDDEN>
<ier8qj26nlj.fsf@HIDDEN> <ier5xe57vrr.fsf@HIDDEN>
<ier34997u4c.fsf_-_@HIDDEN> <86zfbhqptt.fsf@HIDDEN>
<ierldn0olcg.fsf@HIDDEN>
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: 79334
Cc: 79334 <at> debbugs.gnu.org, eggert@HIDDEN, dmitry@HIDDEN
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>,
<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>,
<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -3.3 (---)
> From: Spencer Baugh <sbaugh@HIDDEN>
> Cc: 79334 <at> debbugs.gnu.org, eggert@HIDDEN, dmitry@HIDDEN
> Date: Sat, 30 Aug 2025 11:55:27 -0400
>
> >> + /* Ignore any events which happened on this fd. */
> >> + FD_CLR (fd, &Available);
> >> + FD_CLR (fd, &Writeok);
> >
> > Is this wise? How can we be sure that these events were already
> > handled (presumably, in another thread)? If they were not, we will
> > lose events. What will happen if we don't ignore them here?
>
> Any events which might appear at this point were already events we were
> going to lose by closing this file descriptor. For example, if a
> process produces output just as we do delete-process on it, we never
> read that output. The only change is that we now have to explicitly
> ignore those events; before, we silently dropped them at the time we
> closed the file descriptor.
Where did we close it? There's this loop _after_ the place where you
add your loop which clears the bits in Available:
for (channel = 0; channel <= max_desc; ++channel)
{
struct fd_callback_data *d = &fd_callback_info[channel];
if (d->func
&& ((d->flags & FOR_READ
&& FD_ISSET (channel, &Available))
|| ((d->flags & FOR_WRITE)
&& FD_ISSET (channel, &Writeok))))
d->func (channel, d->data);
}
AFAIU, it would notice the set bits and act upon them. Now with this
new addition we are clearing them ahead of that loop. That doesn't
necessarily sound right to me.
> >> +static bool
> >> +other_thread_is_waiting_for (struct fd_callback_data* elem)
> >> +{
> >> + return elem->waiting_thread != NULL && elem->waiting_thread != current_thread;
> >> +}
> >
> > This should also check that the waiting thread is still alive. If it
> > isn't, it's okay to close the descriptor. Otherwise, descriptors
> > belonging to threads that exited might never be closed.
>
> waiting_thread always points to a living thread, because it's only set
> while a thread is inside wait_reading_process_output.
>
> (In fact, when a thread sees waiting_thread set to a value which is
> neither NULL nor current_thread, it can only ever point to a thread
> which is currently inside thread_select)
You assume that the waiting_thread field will be cleared when a thread
dies, but that is not guaranteed. Why isn't it better to be defensive
and make sure the thread is still alive? If you are right, that test
will always be true, but what if you are wrong in some rare cases?
> >> + if (fd_callback_info[fd].waiting_thread == current_thread
> >> + && (fd_callback_info[fd].flags == WAITING_FOR_CLOSE_FD))
> >> + {
> >> + fprintf (stderr, "closing deferred fd %d\n", fd);
> >> + emacs_close (fd);
> >> + /* Ignore any events which happened on this fd. */
> >> + FD_CLR (fd, &Available);
> >> + FD_CLR (fd, &Writeok);
> >> + fd_callback_info[fd].flags = 0;
> >
> > Why do it here and not when we deactivate processes?
>
> Because that would close file descriptors that another thread is
> currently selecting on, causing EBADF.
No, I mean to deactivate only the processes whose I/O channels are
locked to the current thread.
> > Until now we never closed any descriptors inside
> > wait_reading_process_output.
>
> Just to be clear, we can call deactivate_process from
> wait_reading_process_output, through status_notify, through process
> filters, or other similar. That closes descriptors.
Yes, and so I suggest doing this logic of deactivating only the
processes that are locked to the current thread there.
> > More generally, why this design and not a simpler change which only
> > deactivates processes whose I/O is waited by the current thread, as
> > discussed previously?
>
> I looked into that, but I determined that it's too complicated. It
> would require us to keep track of a new kind of process state: a process
> which has been deleted, but hasn't been deactivated yet.
What I had in mind was neither delete it nor deactivate it. IOW, the
loop which goes through the list of processes should only pay
attention to processes locked to this thread and those that are not
locked to any live thread. It should leave the other processes to
when their threads call status_notify.
> > Looking at all the callers of close_process_fd, it sounds like it's
> > too low-level to do this. For example, it is called from
> > create_process (including in the forked child) to close the unneeded
> > ends of the pipe, where we probably don't want this. And
> > clear_fd_callback_data is also used for the keyboard input descriptor,
> > which is probably also not relevant.
>
> True.
>
> I looked into fixing these, and that suggested to me an altogether
> simpler approach. We can keep the logic entirely contained in
> wait_reading_process_output, just adding a bit of code around the select
> call.
>
> And we don't need to defer deleting the fds at all: just recognize when
> it's happened, and be careful to not touch those fds afterwards.
Hmm... I don't understand: why do we need to do this? The calls to
compute_input_wait_mask, compute_non_keyboard_wait_mask etc., which
set bits in the Available mask passed top pselect already avoid
setting bits for descriptors on which some other thread waits. So
pselect cannot possibly indicate any output ready on any descriptors
belonging to other threads, and this loop you suggest to add:
> + /* Save the fds we're currently waiting_thread for. */
> + fd_set waiting_fds;
> + FD_ZERO (&waiting_fds);
> + int max_waiting_fds = 0;
> + for (int fd = 0; fd <= max_desc; ++fd)
> + {
> + if (fd_callback_info[fd].waiting_thread == current_thread)
> + {
> + FD_SET (fd, &waiting_fds);
> + max_waiting_fds = fd;
> + }
> + }
should be unnecessary. Or what am I missing?
bug-gnu-emacs@HIDDEN:bug#79334; Package emacs.
Full text available.
Received: (at 79334) by debbugs.gnu.org; 30 Aug 2025 15:55:36 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Sat Aug 30 11:55:36 2025
Received: from localhost ([127.0.0.1]:49100 helo=debbugs.gnu.org)
by debbugs.gnu.org with esmtp (Exim 4.84_2)
(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
id 1usNvT-0000cx-To
for submit <at> debbugs.gnu.org; Sat, 30 Aug 2025 11:55:36 -0400
Received: from mxout5.mail.janestreet.com ([64.215.233.18]:59135)
by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
(Exim 4.84_2) (envelope-from <sbaugh@HIDDEN>)
id 1usNvR-0000cj-7S
for 79334 <at> debbugs.gnu.org; Sat, 30 Aug 2025 11:55:34 -0400
From: Spencer Baugh <sbaugh@HIDDEN>
To: Eli Zaretskii <eliz@HIDDEN>
Subject: Re: bug#79334: [PATCH] Don't release thread select lock
unnecessarily, [PATCH] Defer closing file descriptors used by other
threads
In-Reply-To: <86zfbhqptt.fsf@HIDDEN> (Eli Zaretskii's message of "Sat, 30 Aug
2025 09:35:42 +0300")
References: <ierwm6n6tno.fsf@HIDDEN> <86cy8es4bo.fsf@HIDDEN>
<ierqzwu6ztl.fsf@HIDDEN> <86a53is0zw.fsf@HIDDEN>
<ier8qj26nlj.fsf@HIDDEN> <ier5xe57vrr.fsf@HIDDEN>
<ier34997u4c.fsf_-_@HIDDEN> <86zfbhqptt.fsf@HIDDEN>
Date: Sat, 30 Aug 2025 11:55:27 -0400
Message-ID: <ierldn0olcg.fsf@HIDDEN>
User-Agent: Gnus/5.13 (Gnus v5.13)
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=janestreet.com;
s=waixah; t=1756569327;
bh=2yHKk7CzlQRcNALXv3lUnlD9q8K1fQQENU6tF+gPOBM=;
h=From:To:Cc:Subject:In-Reply-To:References:Date;
b=P0AG8w3L+sbckKKhzOvJtoqacOkE6VTvYb2XE5/gUSq0I8tT0PYSHXW3grhb7w85U
VdtXsTZ72i3IYxR2mJPDUgTdKXfamBEmEvrUW0DD/JotkNdB58MzND51cYaAi1zCmL
9hBNXEB5qwdrWMmV+75Fi4atpaEOxUCPe3k4Ya9IJDD/2hCMzkJblOQne2gRUHm7ol
tVCge504xv4fjGv/1grhnu9a7xnRAvBA8PDuIevLpMFwyEfsMOgCt1c85iToX4mQLf
JtBqzMXV6XFP05JQkLEztgTkOHFjeU7Ri2eG2ZK3z4eXITx0zaj/O7BGtcyQdn4HI1
6WfLZjmkDXL9g==
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: 79334
Cc: 79334 <at> debbugs.gnu.org, eggert@HIDDEN, dmitry@HIDDEN
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>,
<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>,
<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -3.3 (---)
--=-=-=
Content-Type: text/plain
(I'm responding to both your emails with review feedback here)
Eli Zaretskii <eliz@HIDDEN> writes:
>> From: Spencer Baugh <sbaugh@HIDDEN>
>> Cc: 79334 <at> debbugs.gnu.org, eggert@HIDDEN, dmitry@HIDDEN
>> Date: Fri, 29 Aug 2025 16:25:39 -0400
>> + for (int fd = 0; fd <= max_desc; ++fd)
>> + {
>> + if (fd_callback_info[fd].waiting_thread == current_thread
>> + && (fd_callback_info[fd].flags == WAITING_FOR_CLOSE_FD))
> ^^^^^^^^^^^^^^^^^^^^^^^
> Shouldn't this test that the WAITING_FOR_CLOSE_FD bit is set, not that
> it is the _only_ bit set? Even if the code is correct today, it might
> not be future-proof if we add some additional flags at some point.
True. Fixed.
>> + /* Ignore any events which happened on this fd. */
>> + FD_CLR (fd, &Available);
>> + FD_CLR (fd, &Writeok);
>
> Is this wise? How can we be sure that these events were already
> handled (presumably, in another thread)? If they were not, we will
> lose events. What will happen if we don't ignore them here?
Any events which might appear at this point were already events we were
going to lose by closing this file descriptor. For example, if a
process produces output just as we do delete-process on it, we never
read that output. The only change is that we now have to explicitly
ignore those events; before, we silently dropped them at the time we
closed the file descriptor.
>> Many different pieces of code in Emacs can close file descriptors. We
>> need to be generically careful to not close file descriptors which some
>> other thread is currently waiting on. In my patch, we do this by
>> letting the waiting_thread itself close those file descriptors, after it
>> returns from select.
>
> Not literally "when returning from pselect", I hope, but when the
> process is being deleted _and_ the waiting thread returns from
> pselect, right? IOW, we don't close the descriptor each time pselecft
> returns in the right thread, then reopen it before the next call to
> pselect, because that won't work.
Right.
>> +static bool
>> +other_thread_is_waiting_for (struct fd_callback_data* elem)
>> +{
>> + return elem->waiting_thread != NULL && elem->waiting_thread != current_thread;
>> +}
>
> This should also check that the waiting thread is still alive. If it
> isn't, it's okay to close the descriptor. Otherwise, descriptors
> belonging to threads that exited might never be closed.
waiting_thread always points to a living thread, because it's only set
while a thread is inside wait_reading_process_output.
(In fact, when a thread sees waiting_thread set to a value which is
neither NULL nor current_thread, it can only ever point to a thread
which is currently inside thread_select)
>> @@ -2113,7 +2131,11 @@ close_process_fd (int *fd_addr)
>> if (0 <= fd)
>> {
>> *fd_addr = -1;
>> - emacs_close (fd);
>> + if (other_thread_is_waiting_for (&fd_callback_info[fd]))
>> + /* Let waiting_thread handle closing the fd. */
> ^^
> Style: 2 spaces between the period and "*/".
Fixed.
>> + /* Close fds which other threads wanted to close. */
>> + for (int fd = 0; fd <= max_desc; ++fd)
>> + {
>> + if (fd_callback_info[fd].waiting_thread == current_thread
>> + && (fd_callback_info[fd].flags == WAITING_FOR_CLOSE_FD))
>> + {
>> + fprintf (stderr, "closing deferred fd %d\n", fd);
>> + emacs_close (fd);
>> + /* Ignore any events which happened on this fd. */
>> + FD_CLR (fd, &Available);
>> + FD_CLR (fd, &Writeok);
>> + fd_callback_info[fd].flags = 0;
>
> Why do it here and not when we deactivate processes?
Because that would close file descriptors that another thread is
currently selecting on, causing EBADF.
> Until now we never closed any descriptors inside
> wait_reading_process_output.
Just to be clear, we can call deactivate_process from
wait_reading_process_output, through status_notify, through process
filters, or other similar. That closes descriptors.
> More generally, why this design and not a simpler change which only
> deactivates processes whose I/O is waited by the current thread, as
> discussed previously?
I looked into that, but I determined that it's too complicated. It
would require us to keep track of a new kind of process state: a process
which has been deleted, but hasn't been deactivated yet. This would
need to be supported for each kind of process, and we'd need to record
new kinds of state to finish the deactivation later.
It's simpler to keep the ability to delete any process at any time, and
have that fully deactivate the process as it does today. Doing that
only causes one issue: if another thread is selecting on the process's
fds, that thread will break if it tries to handle events on the fds, or
the thread will simply get EBADF from select. So, we detect that by
checking waiting_thread, and marking the fds so that the other thread
knows not to touch them, and to instead just close them to finish
deleting the process.
> I think this alternative would be easier to understand and reason
> about, because (as established in another discussion)
> wait_reading_process_output frequently loops more than we expect it
> to.
I'm not sure what you mean by that, why does wait_reading_process_output
looping more than we expect cause issues? I don't think it would.
> Looking at all the callers of close_process_fd, it sounds like it's
> too low-level to do this. For example, it is called from
> create_process (including in the forked child) to close the unneeded
> ends of the pipe, where we probably don't want this. And
> clear_fd_callback_data is also used for the keyboard input descriptor,
> which is probably also not relevant.
True.
I looked into fixing these, and that suggested to me an altogether
simpler approach. We can keep the logic entirely contained in
wait_reading_process_output, just adding a bit of code around the select
call.
And we don't need to defer deleting the fds at all: just recognize when
it's happened, and be careful to not touch those fds afterwards.
Attached.
--=-=-=
Content-Type: text/x-patch
Content-Disposition: inline;
filename=0001-Ignore-descriptors-deleted-during-select.patch
From b37e71f47913e1011c4a7857e166e78eddc2a184 Mon Sep 17 00:00:00 2001
From: Spencer Baugh <sbaugh@HIDDEN>
Date: Sat, 30 Aug 2025 11:42:19 -0400
Subject: [PATCH] Ignore descriptors deleted during select
* src/process.c (wait_reading_process_output): Save the fds
we're waiting on and check that we're still waiting after
select. (bug#79334)
---
src/process.c | 37 +++++++++++++++++++++++++++++++++++--
1 file changed, 35 insertions(+), 2 deletions(-)
diff --git a/src/process.c b/src/process.c
index 1612248a11d..11af2f55f63 100644
--- a/src/process.c
+++ b/src/process.c
@@ -5643,6 +5643,7 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
nfds = read_kbd ? 0 : 1;
no_avail = 1;
FD_ZERO (&Available);
+ xerrno = 0;
}
else
{
@@ -5756,6 +5757,18 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
&& timespec_cmp (short_timeout, timeout) < 0)
timeout = short_timeout;
#endif
+ /* Save the fds we're currently waiting_thread for. */
+ fd_set waiting_fds;
+ FD_ZERO (&waiting_fds);
+ int max_waiting_fds = 0;
+ for (int fd = 0; fd <= max_desc; ++fd)
+ {
+ if (fd_callback_info[fd].waiting_thread == current_thread)
+ {
+ FD_SET (fd, &waiting_fds);
+ max_waiting_fds = fd;
+ }
+ }
/* Android requires using a replacement for pselect in
android.c to poll for events. */
@@ -5809,10 +5822,30 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
FD_SET (channel, &Available);
}
#endif
+ xerrno = errno;
+ /* Check if any of waiting_fds was closed by another thread or
+ a signal handler. */
+ for (int fd = 0; fd <= max_waiting_fds; ++fd)
+ {
+ if (FD_ISSET (fd, &waiting_fds)
+ && fd_callback_info[fd].waiting_thread != current_thread)
+ {
+ /* Ignore any events which we happened to see on this
+ fd before it was closed; we can't do anything for
+ it now, since it's closed. */
+ FD_CLR (fd, &Available);
+ FD_CLR (fd, &Writeok);
+ /* If we got an EBADF and no events, this is probably
+ why; clear it so we can try again. */
+ if (nfds < 0 && xerrno == EBADF)
+ {
+ nfds = 0;
+ xerrno = 0;
+ }
+ }
+ }
}
- xerrno = errno;
-
/* Make C-g and alarm signals set flags again. */
clear_waiting_for_input ();
--
2.43.7
--=-=-=--
bug-gnu-emacs@HIDDEN:bug#79334; Package emacs.
Full text available.
Received: (at 79334) by debbugs.gnu.org; 30 Aug 2025 06:36:06 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Sat Aug 30 02:36:06 2025
Received: from localhost ([127.0.0.1]:44312 helo=debbugs.gnu.org)
by debbugs.gnu.org with esmtp (Exim 4.84_2)
(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
id 1usFBx-0001uC-GY
for submit <at> debbugs.gnu.org; Sat, 30 Aug 2025 02:36:06 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10]:45716)
by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
(Exim 4.84_2) (envelope-from <eliz@HIDDEN>) id 1usFBt-0001sp-4A
for 79334 <at> debbugs.gnu.org; Sat, 30 Aug 2025 02:35:59 -0400
Received: from fencepost.gnu.org ([2001:470:142:3::e])
by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
(Exim 4.90_1) (envelope-from <eliz@HIDDEN>)
id 1usFBi-00031Y-1v; Sat, 30 Aug 2025 02:35:47 -0400
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org;
s=fencepost-gnu-org; h=References:Subject:In-Reply-To:To:From:Date:
mime-version; bh=/P4xvteEzG7SA+MVTLWp3uJca9UwfYBherFneiGL2Xs=; b=oH6m3zEqU/Hj
jf5TfcvCz42AIq+luViPRKDd453t8dG/Ce+ScJlvf3YUaiR+sHYqylmyeufeB99QJ/O75EdAY3Xe3
zTLeUg2zDdrFxPuscrPMXgkDXHM/xZxjVIISVL0qgP1s58NMbNT6xmUchKRJvzfe/azQwFTOBU5bD
XRI/T73eU3ySz0WoiU4DvD6aKoeLjfA+5eKgGHqrMrCoReBboCefcpxkKE9RUIqvkMGKW4ikG/Y0i
zo19wl0k+4XKDyKYLZsYVfayuR7O1HyqclJq/lQX5nCn5XWH4WCErAh7/X/yKQJOtfUKk5Z1G0rja
LUjY1+XTef/A4p36sgwjGg==;
Date: Sat, 30 Aug 2025 09:35:42 +0300
Message-Id: <86zfbhqptt.fsf@HIDDEN>
From: Eli Zaretskii <eliz@HIDDEN>
To: Spencer Baugh <sbaugh@HIDDEN>
In-Reply-To: <ier34997u4c.fsf_-_@HIDDEN> (message from Spencer Baugh
on Fri, 29 Aug 2025 16:25:39 -0400)
Subject: Re: bug#79334: [PATCH] Don't release thread select lock
unnecessarily, [PATCH] Defer closing file descriptors used by other
threads
References: <ierwm6n6tno.fsf@HIDDEN> <86cy8es4bo.fsf@HIDDEN>
<ierqzwu6ztl.fsf@HIDDEN> <86a53is0zw.fsf@HIDDEN>
<ier8qj26nlj.fsf@HIDDEN> <ier5xe57vrr.fsf@HIDDEN>
<ier34997u4c.fsf_-_@HIDDEN>
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: 79334
Cc: 79334 <at> debbugs.gnu.org, eggert@HIDDEN, dmitry@HIDDEN
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>,
<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>,
<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -3.3 (---)
> From: Spencer Baugh <sbaugh@HIDDEN>
> Cc: 79334 <at> debbugs.gnu.org, eggert@HIDDEN, dmitry@HIDDEN
> Date: Fri, 29 Aug 2025 16:25:39 -0400
>
> In related news: running process-tests.el in a loop also triggers my
> "Closing deferred fd" message. Which suggests that src/process-tests.el
> was probably also suffering from bugs in this area, which could have
> been causing hangs or flaky failures. Which should now hopefully be
> fixed.
process-tests uses questionable setup, apart of being based on
non-portable assumptions. At the time, I was unable to convince the
author of the tests to spell out the assumptions and expectations from
what should happen in those situations, without which it was
impossible to decide whether some failures on MS-Windows are bugs or
just non-portable code that cannot work on Windows. Some of those
tests always fail for me on Windows.
> + for (int fd = 0; fd <= max_desc; ++fd)
> + {
> + if (fd_callback_info[fd].waiting_thread == current_thread
> + && (fd_callback_info[fd].flags == WAITING_FOR_CLOSE_FD))
^^^^^^^^^^^^^^^^^^^^^^^
Shouldn't this test that the WAITING_FOR_CLOSE_FD bit is set, not that
it is the _only_ bit set? Even if the code is correct today, it might
not be future-proof if we add some additional flags at some point.
> + /* Ignore any events which happened on this fd. */
> + FD_CLR (fd, &Available);
> + FD_CLR (fd, &Writeok);
Is this wise? How can we be sure that these events were already
handled (presumably, in another thread)? If they were not, we will
lose events. What will happen if we don't ignore them here?
bug-gnu-emacs@HIDDEN:bug#79334; Package emacs.
Full text available.
Received: (at 79334) by debbugs.gnu.org; 30 Aug 2025 06:18:27 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Sat Aug 30 02:18:27 2025
Received: from localhost ([127.0.0.1]:44269 helo=debbugs.gnu.org)
by debbugs.gnu.org with esmtp (Exim 4.84_2)
(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
id 1usEuv-0007ZO-F4
for submit <at> debbugs.gnu.org; Sat, 30 Aug 2025 02:18:26 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10]:55540)
by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
(Exim 4.84_2) (envelope-from <eliz@HIDDEN>) id 1usEus-0007YD-VV
for 79334 <at> debbugs.gnu.org; Sat, 30 Aug 2025 02:18:24 -0400
Received: from fencepost.gnu.org ([2001:470:142:3::e])
by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
(Exim 4.90_1) (envelope-from <eliz@HIDDEN>)
id 1usEul-0000T5-Vv; Sat, 30 Aug 2025 02:18:16 -0400
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org;
s=fencepost-gnu-org; h=References:Subject:In-Reply-To:To:From:Date:
mime-version; bh=37HdJREzOX0Ff/ItCAWw+CoUcHE8596w5562m4Crwjc=; b=rRiUTqsbfbdG
DNuiQG2SrdwiXrjLxp3wy/x0KhLeGYwN3lTZf0QWki/PaWdAi2xHOQjIuDghxLM5EEV20F0DwNt3S
A3MEnt/qTQbw3Mn4t0ckSXGDKquEPdmMhcodCmhEiQxIp7Vrl73whdrurxT7WLvzzTgHVG6mWYfjS
dF+Zi7JWK46YyhX1YOCmAYX2dn/vA/BbDHr7d+Fj2PHV6SMtCTITaag0SVG2yE3rtvD1JAKKilpLH
Tq50GJCTfMotAcCFNyjsYSiQGTyaW0O6xyDadzI/JK1K1hVDcq56TSU9aSsHtKk7ShUj1/64YxUIw
bCQnJBUKC1SOkeLTzPjrbg==;
Date: Sat, 30 Aug 2025 09:18:12 +0300
Message-Id: <861pots57f.fsf@HIDDEN>
From: Eli Zaretskii <eliz@HIDDEN>
To: Spencer Baugh <sbaugh@HIDDEN>
In-Reply-To: <ier5xe57vrr.fsf@HIDDEN> (message from Spencer Baugh on
Fri, 29 Aug 2025 15:50:00 -0400)
Subject: Re: bug#79334: [PATCH] Don't release thread select lock unnecessarily
References: <ierwm6n6tno.fsf@HIDDEN> <86cy8es4bo.fsf@HIDDEN>
<ierqzwu6ztl.fsf@HIDDEN> <86a53is0zw.fsf@HIDDEN>
<ier8qj26nlj.fsf@HIDDEN> <ier5xe57vrr.fsf@HIDDEN>
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: 79334
Cc: 79334 <at> debbugs.gnu.org, eggert@HIDDEN, dmitry@HIDDEN
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>,
<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>,
<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -3.3 (---)
> From: Spencer Baugh <sbaugh@HIDDEN>
> Cc: 79334 <at> debbugs.gnu.org, eggert@HIDDEN, dmitry@HIDDEN
> Date: Fri, 29 Aug 2025 15:50:00 -0400
>
> Many different pieces of code in Emacs can close file descriptors. We
> need to be generically careful to not close file descriptors which some
> other thread is currently waiting on. In my patch, we do this by
> letting the waiting_thread itself close those file descriptors, after it
> returns from select.
Not literally "when returning from pselect", I hope, but when the
process is being deleted _and_ the waiting thread returns from
pselect, right? IOW, we don't close the descriptor each time pselecft
returns in the right thread, then reopen it before the next call to
pselect, because that won't work.
> +static bool
> +other_thread_is_waiting_for (struct fd_callback_data* elem)
> +{
> + return elem->waiting_thread != NULL && elem->waiting_thread != current_thread;
> +}
This should also check that the waiting thread is still alive. If it
isn't, it's okay to close the descriptor. Otherwise, descriptors
belonging to threads that exited might never be closed.
> @@ -2113,7 +2131,11 @@ close_process_fd (int *fd_addr)
> if (0 <= fd)
> {
> *fd_addr = -1;
> - emacs_close (fd);
> + if (other_thread_is_waiting_for (&fd_callback_info[fd]))
> + /* Let waiting_thread handle closing the fd. */
^^
Style: 2 spaces between the period and "*/".
> + /* Close fds which other threads wanted to close. */
> + for (int fd = 0; fd <= max_desc; ++fd)
> + {
> + if (fd_callback_info[fd].waiting_thread == current_thread
> + && (fd_callback_info[fd].flags == WAITING_FOR_CLOSE_FD))
> + {
> + fprintf (stderr, "closing deferred fd %d\n", fd);
> + emacs_close (fd);
> + /* Ignore any events which happened on this fd. */
> + FD_CLR (fd, &Available);
> + FD_CLR (fd, &Writeok);
> + fd_callback_info[fd].flags = 0;
Why do it here and not when we deactivate processes? Until now we
never closed any descriptors inside wait_reading_process_output. More
generally, why this design and not a simpler change which only
deactivates processes whose I/O is waited by the current thread, as
discussed previously? I think this alternative would be easier to
understand and reason about, because (as established in another
discussion) wait_reading_process_output frequently loops more than we
expect it to.
Looking at all the callers of close_process_fd, it sounds like it's
too low-level to do this. For example, it is called from
create_process (including in the forked child) to close the unneeded
ends of the pipe, where we probably don't want this. And
clear_fd_callback_data is also used for the keyboard input descriptor,
which is probably also not relevant. So I wonder whether it's clean
to have this logic at such a low level, where the purpose of the
descriptor is barely known, and not at a higher level, where we know
what each descriptor is used for, and therefore the purpose and the
intent of the logic is much more clear.
bug-gnu-emacs@HIDDEN:bug#79334; Package emacs.
Full text available.
Received: (at 79334) by debbugs.gnu.org; 29 Aug 2025 20:25:49 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Fri Aug 29 16:25:49 2025
Received: from localhost ([127.0.0.1]:43447 helo=debbugs.gnu.org)
by debbugs.gnu.org with esmtp (Exim 4.84_2)
(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
id 1us5fQ-00029S-CD
for submit <at> debbugs.gnu.org; Fri, 29 Aug 2025 16:25:49 -0400
Received: from mxout5.mail.janestreet.com ([64.215.233.18]:52051)
by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
(Exim 4.84_2) (envelope-from <sbaugh@HIDDEN>)
id 1us5fM-00028M-RH
for 79334 <at> debbugs.gnu.org; Fri, 29 Aug 2025 16:25:46 -0400
From: Spencer Baugh <sbaugh@HIDDEN>
To: Eli Zaretskii <eliz@HIDDEN>
Subject: Re: bug#79334: [PATCH] Don't release thread select lock
unnecessarily, [PATCH] Defer closing file descriptors used by other
threads
In-Reply-To: <ier5xe57vrr.fsf@HIDDEN> (Spencer Baugh's message of
"Fri, 29 Aug 2025 15:50:00 -0400")
References: <ierwm6n6tno.fsf@HIDDEN> <86cy8es4bo.fsf@HIDDEN>
<ierqzwu6ztl.fsf@HIDDEN> <86a53is0zw.fsf@HIDDEN>
<ier8qj26nlj.fsf@HIDDEN> <ier5xe57vrr.fsf@HIDDEN>
Date: Fri, 29 Aug 2025 16:25:39 -0400
Message-ID: <ier34997u4c.fsf_-_@HIDDEN>
User-Agent: Gnus/5.13 (Gnus v5.13)
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=janestreet.com;
s=waixah; t=1756499139;
bh=phY/bhhTrwbAP0bEN7nHJIJsOHPuoxkPKqn5Fz+RGJk=;
h=From:To:Cc:Subject:In-Reply-To:References:Date;
b=He+3LH3Y86QjxzJUJNDXkvoR4yQviSTLniSzpanDyeN55+cFXfZ0luTJGz5VEiM6M
ZZwPdWhyP72jOT4SxO9ablJ2NenDTTyb3DXOFAHQtyhn1rM2xoXEZU2Rl+LFeYNNfI
ZXZxBqnEG41H9ntVVI7W0m3UFDDAYYmDn9JZhGBlyWuknq59UGqJm2XKnWXGpVLN9H
eWbGHyuUI5NXVAdLsZNLfN0J8O8GwIOy08dDE9synpYMPob6ZHLU0y6xFflKpgYFtx
uUW40U0Y5kK6MOK1udCuV29AMqHRZwpeekAojziNge4NkJJbly8s6EXB4ublqA2is3
VqyahkqRoPSqQ==
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: 79334
Cc: 79334 <at> debbugs.gnu.org, eggert@HIDDEN, dmitry@HIDDEN
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>,
<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>,
<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -3.3 (---)
--=-=-=
Content-Type: text/plain
Spencer Baugh <sbaugh@HIDDEN> writes:
> Spencer Baugh <sbaugh@HIDDEN> writes:
>> While doing this, I had another thought...
>>
>> Do you know if we have any code which handles the fact that
>> delete-process might be called on a process while another thread is
>> calling pselect on its fds?
>>
>> I don't think we do, which suggests that might also be problematic.
>> I'll work on a fix which encompasses that as well.
>
> OK, I think this patch fixes the issue.
>
> Many different pieces of code in Emacs can close file descriptors. We
> need to be generically careful to not close file descriptors which some
> other thread is currently waiting on. In my patch, we do this by
> letting the waiting_thread itself close those file descriptors, after it
> returns from select.
>
> Unfortunately I don't have a reliable test which produces the failure
> case, but in this patch I added a fprintf when we close a deferred fd.
> This prints whenever some thread would have closed an fd that another
> thread was waiting on. If I run the following program (with emacs -Q
> --batch):
>
> (defun my-break-thread ()
> (let ((proc (make-process
> :name "foo"
> :command '("sleep" "1"))))
> (while (process-live-p proc)
> (accept-process-output proc 0.05))))
> (while t
> (make-thread #'my-break-thread "thread1")
> (thread-join (make-thread #'my-break-thread "thread2")))
>
> I get about one print every 10 seconds. Each print is a problem case
> which has been averted, so I do think this patch fixes the bug.
>
> (Obviously, we should remove the fprintf before applying the change)
Of course, I immediately found a bug in my patch (I wasn't clearing
waiting_thread after closing a deferred fd). Fixed in the attached.
In related news: running process-tests.el in a loop also triggers my
"Closing deferred fd" message. Which suggests that src/process-tests.el
was probably also suffering from bugs in this area, which could have
been causing hangs or flaky failures. Which should now hopefully be
fixed.
--=-=-=
Content-Type: text/x-patch
Content-Disposition: inline;
filename=0001-Defer-closing-file-descriptors-used-by-other-threads.patch
From 9e46aa50f5de7d719a6e427fe44bb6b8b8e226fc Mon Sep 17 00:00:00 2001
From: Spencer Baugh <sbaugh@HIDDEN>
Date: Fri, 29 Aug 2025 15:39:00 -0400
Subject: [PATCH] Defer closing file descriptors used by other threads
* src/process.c (other_thread_is_waiting_for): Add.
(clear_fd_callback_data): Don't clear WAITING_FOR_CLOSE_FD.
(close_process_fd): Set WAITING_FOR_CLOSE_FD for fds with a
waiting_thread.
(wait_reading_process_output): Close fds with
WAITING_FOR_CLOSE_FD. (bug#79334)
---
src/process.c | 46 ++++++++++++++++++++++++++++++++++++++++++----
1 file changed, 42 insertions(+), 4 deletions(-)
diff --git a/src/process.c b/src/process.c
index b33bebb9a35..9180b92c61f 100644
--- a/src/process.c
+++ b/src/process.c
@@ -448,7 +448,9 @@ make_lisp_proc (struct Lisp_Process *p)
/* This descriptor refers to a process. */
PROCESS_FD = 8,
/* A non-blocking connect. Only valid if FOR_WRITE is set. */
- NON_BLOCKING_CONNECT_FD = 16
+ NON_BLOCKING_CONNECT_FD = 16,
+ /* This fd is waiting to be closed. */
+ WAITING_FOR_CLOSE_FD = 32,
};
static struct fd_callback_data
@@ -466,14 +468,30 @@ make_lisp_proc (struct Lisp_Process *p)
struct thread_state *waiting_thread;
} fd_callback_info[FD_SETSIZE];
+static bool
+other_thread_is_waiting_for (struct fd_callback_data* elem)
+{
+ return elem->waiting_thread != NULL && elem->waiting_thread != current_thread;
+}
+
static void
clear_fd_callback_data(struct fd_callback_data* elem)
{
elem->func = NULL;
elem->data = NULL;
- elem->flags = 0;
elem->thread = NULL;
- elem->waiting_thread = NULL;
+ if (other_thread_is_waiting_for (elem))
+ {
+ /* Preserve WAITING_FOR_CLOSE_FD; see process_close_fd. Leaving
+ flags non-zero means recompute_max_desc won't reduce max_desc
+ past this element. */
+ elem->flags &= WAITING_FOR_CLOSE_FD;
+ }
+ else
+ {
+ elem->flags = 0;
+ elem->waiting_thread = NULL;
+ }
}
@@ -2113,7 +2131,11 @@ close_process_fd (int *fd_addr)
if (0 <= fd)
{
*fd_addr = -1;
- emacs_close (fd);
+ if (other_thread_is_waiting_for (&fd_callback_info[fd]))
+ /* Let waiting_thread handle closing the fd. */
+ fd_callback_info[fd].flags = WAITING_FOR_CLOSE_FD;
+ else
+ emacs_close (fd);
}
}
@@ -5816,6 +5838,22 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
/* Make C-g and alarm signals set flags again. */
clear_waiting_for_input ();
+ /* Close fds which other threads wanted to close. */
+ for (int fd = 0; fd <= max_desc; ++fd)
+ {
+ if (fd_callback_info[fd].waiting_thread == current_thread
+ && (fd_callback_info[fd].flags == WAITING_FOR_CLOSE_FD))
+ {
+ fprintf (stderr, "closing deferred fd %d\n", fd);
+ emacs_close (fd);
+ /* Ignore any events which happened on this fd. */
+ FD_CLR (fd, &Available);
+ FD_CLR (fd, &Writeok);
+ clear_fd_callback_data (&fd_callback_info[fd]);
+ }
+ }
+ recompute_max_desc ();
+
/* If we woke up due to SIGWINCH, actually change size now. */
do_pending_window_change (0);
--
2.43.7
--=-=-=--
bug-gnu-emacs@HIDDEN:bug#79334; Package emacs.
Full text available.
Received: (at 79334) by debbugs.gnu.org; 29 Aug 2025 19:50:13 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Fri Aug 29 15:50:13 2025
Received: from localhost ([127.0.0.1]:43404 helo=debbugs.gnu.org)
by debbugs.gnu.org with esmtp (Exim 4.84_2)
(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
id 1us56w-0005GM-3i
for submit <at> debbugs.gnu.org; Fri, 29 Aug 2025 15:50:13 -0400
Received: from mxout5.mail.janestreet.com ([64.215.233.18]:41087)
by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
(Exim 4.84_2) (envelope-from <sbaugh@HIDDEN>)
id 1us56s-0005CF-3S
for 79334 <at> debbugs.gnu.org; Fri, 29 Aug 2025 15:50:07 -0400
From: Spencer Baugh <sbaugh@HIDDEN>
To: Eli Zaretskii <eliz@HIDDEN>
Subject: Re: bug#79334: [PATCH] Don't release thread select lock unnecessarily
In-Reply-To: <ier8qj26nlj.fsf@HIDDEN> (Spencer Baugh's message of
"Fri, 29 Aug 2025 13:31:52 -0400")
References: <ierwm6n6tno.fsf@HIDDEN> <86cy8es4bo.fsf@HIDDEN>
<ierqzwu6ztl.fsf@HIDDEN> <86a53is0zw.fsf@HIDDEN>
<ier8qj26nlj.fsf@HIDDEN>
Date: Fri, 29 Aug 2025 15:50:00 -0400
Message-ID: <ier5xe57vrr.fsf@HIDDEN>
User-Agent: Gnus/5.13 (Gnus v5.13)
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=janestreet.com;
s=waixah; t=1756497000;
bh=atD57mlVj0ftflEqlmrlBDIpkjm+CYOS/wHD8+LsTVs=;
h=From:To:Cc:Subject:In-Reply-To:References:Date;
b=J8xwHa4+rzs2O/EReG8Ki2LzkjFCVjKVo+0nOOZkcgPk78tP7wxaZlSVgit5S0d8k
TraIACQIKoCa5xPTsIRCO6/53L8HaTU9r5vfOu6ORiItWwcJ/ZOm4n7iynW2USyaM1
Ppc/rGE7fWV4AGfCSGxPHOYitc0sg/8GHwiQj644yj4YOPxEa5ym55xodDU2hW9z/J
m5/bqQFVbzz8llGaLzP7cDxKe31BLQ48Rjq8YqQpzBuPkY9o+FW8vLjYgeuw3Fhc98
tyDaGjy0LnSCjeJ2ziYF0PbtR9DEHHmQDYnRflkenpZqLE8L2ABJB/JVQzDS33dpD/
b+kHdp0CO9nPA==
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: 79334
Cc: 79334 <at> debbugs.gnu.org, eggert@HIDDEN, dmitry@HIDDEN
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>,
<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>,
<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -3.3 (---)
--=-=-=
Content-Type: text/plain
Spencer Baugh <sbaugh@HIDDEN> writes:
> While doing this, I had another thought...
>
> Do you know if we have any code which handles the fact that
> delete-process might be called on a process while another thread is
> calling pselect on its fds?
>
> I don't think we do, which suggests that might also be problematic.
> I'll work on a fix which encompasses that as well.
OK, I think this patch fixes the issue.
Many different pieces of code in Emacs can close file descriptors. We
need to be generically careful to not close file descriptors which some
other thread is currently waiting on. In my patch, we do this by
letting the waiting_thread itself close those file descriptors, after it
returns from select.
Unfortunately I don't have a reliable test which produces the failure
case, but in this patch I added a fprintf when we close a deferred fd.
This prints whenever some thread would have closed an fd that another
thread was waiting on. If I run the following program (with emacs -Q
--batch):
(defun my-break-thread ()
(let ((proc (make-process
:name "foo"
:command '("sleep" "1"))))
(while (process-live-p proc)
(accept-process-output proc 0.05))))
(while t
(make-thread #'my-break-thread "thread1")
(thread-join (make-thread #'my-break-thread "thread2")))
I get about one print every 10 seconds. Each print is a problem case
which has been averted, so I do think this patch fixes the bug.
(Obviously, we should remove the fprintf before applying the change)
--=-=-=
Content-Type: text/x-patch
Content-Disposition: inline;
filename=0001-Defer-closing-file-descriptors-used-by-other-threads.patch
From 4dda5e4fff3067cac53d6273ed3bd327437d00c8 Mon Sep 17 00:00:00 2001
From: Spencer Baugh <sbaugh@HIDDEN>
Date: Fri, 29 Aug 2025 15:39:00 -0400
Subject: [PATCH] Defer closing file descriptors used by other threads
* src/process.c (other_thread_is_waiting_for): Add.
(clear_fd_callback_data): Don't clear WAITING_FOR_CLOSE_FD.
(close_process_fd): Set WAITING_FOR_CLOSE_FD for fds with a
waiting_thread.
(wait_reading_process_output): Close fds with
WAITING_FOR_CLOSE_FD. (bug#79334)
---
src/process.c | 46 ++++++++++++++++++++++++++++++++++++++++++----
1 file changed, 42 insertions(+), 4 deletions(-)
diff --git a/src/process.c b/src/process.c
index b33bebb9a35..8273ba78095 100644
--- a/src/process.c
+++ b/src/process.c
@@ -448,7 +448,9 @@ make_lisp_proc (struct Lisp_Process *p)
/* This descriptor refers to a process. */
PROCESS_FD = 8,
/* A non-blocking connect. Only valid if FOR_WRITE is set. */
- NON_BLOCKING_CONNECT_FD = 16
+ NON_BLOCKING_CONNECT_FD = 16,
+ /* This fd is waiting to be closed. */
+ WAITING_FOR_CLOSE_FD = 32,
};
static struct fd_callback_data
@@ -466,14 +468,30 @@ make_lisp_proc (struct Lisp_Process *p)
struct thread_state *waiting_thread;
} fd_callback_info[FD_SETSIZE];
+static bool
+other_thread_is_waiting_for (struct fd_callback_data* elem)
+{
+ return elem->waiting_thread != NULL && elem->waiting_thread != current_thread;
+}
+
static void
clear_fd_callback_data(struct fd_callback_data* elem)
{
elem->func = NULL;
elem->data = NULL;
- elem->flags = 0;
elem->thread = NULL;
- elem->waiting_thread = NULL;
+ if (other_thread_is_waiting_for (elem))
+ {
+ /* Preserve WAITING_FOR_CLOSE_FD; see process_close_fd. Leaving
+ flags non-zero means recompute_max_desc won't reduce max_desc
+ past this element. */
+ elem->flags &= WAITING_FOR_CLOSE_FD;
+ }
+ else
+ {
+ elem->flags = 0;
+ elem->waiting_thread = NULL;
+ }
}
@@ -2113,7 +2131,11 @@ close_process_fd (int *fd_addr)
if (0 <= fd)
{
*fd_addr = -1;
- emacs_close (fd);
+ if (other_thread_is_waiting_for (&fd_callback_info[fd]))
+ /* Let waiting_thread handle closing the fd. */
+ fd_callback_info[fd].flags = WAITING_FOR_CLOSE_FD;
+ else
+ emacs_close (fd);
}
}
@@ -5816,6 +5838,22 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
/* Make C-g and alarm signals set flags again. */
clear_waiting_for_input ();
+ /* Close fds which other threads wanted to close. */
+ for (int fd = 0; fd <= max_desc; ++fd)
+ {
+ if (fd_callback_info[fd].waiting_thread == current_thread
+ && (fd_callback_info[fd].flags == WAITING_FOR_CLOSE_FD))
+ {
+ fprintf (stderr, "closing deferred fd %d\n", fd);
+ emacs_close (fd);
+ /* Ignore any events which happened on this fd. */
+ FD_CLR (fd, &Available);
+ FD_CLR (fd, &Writeok);
+ fd_callback_info[fd].flags = 0;
+ }
+ }
+ recompute_max_desc ();
+
/* If we woke up due to SIGWINCH, actually change size now. */
do_pending_window_change (0);
--
2.43.7
--=-=-=--
bug-gnu-emacs@HIDDEN:bug#79334; Package emacs.
Full text available.Received: (at 79334) by debbugs.gnu.org; 29 Aug 2025 17:32:06 +0000 From debbugs-submit-bounces <at> debbugs.gnu.org Fri Aug 29 13:32:06 2025 Received: from localhost ([127.0.0.1]:43228 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1us2xJ-0001j8-MA for submit <at> debbugs.gnu.org; Fri, 29 Aug 2025 13:32:06 -0400 Received: from mxout5.mail.janestreet.com ([64.215.233.18]:52887) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from <sbaugh@HIDDEN>) id 1us2xB-0001hA-Uo for 79334 <at> debbugs.gnu.org; Fri, 29 Aug 2025 13:32:02 -0400 From: Spencer Baugh <sbaugh@HIDDEN> To: Eli Zaretskii <eliz@HIDDEN> Subject: Re: bug#79334: [PATCH] Don't release thread select lock unnecessarily In-Reply-To: <86a53is0zw.fsf@HIDDEN> (Eli Zaretskii's message of "Fri, 29 Aug 2025 16:36:51 +0300") References: <ierwm6n6tno.fsf@HIDDEN> <86cy8es4bo.fsf@HIDDEN> <ierqzwu6ztl.fsf@HIDDEN> <86a53is0zw.fsf@HIDDEN> Date: Fri, 29 Aug 2025 13:31:52 -0400 Message-ID: <ier8qj26nlj.fsf@HIDDEN> User-Agent: Gnus/5.13 (Gnus v5.13) MIME-Version: 1.0 Content-Type: text/plain DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=janestreet.com; s=waixah; t=1756488712; bh=9IUO6GfrnOPEvuCYo1nlgXF+A7ZmqGgLuMpkqCAo3nM=; h=From:To:Cc:Subject:In-Reply-To:References:Date; b=vE7fSDsWizdWfI94QcCdb8RDPOqYKXXQaQ/V1dkkf00Gy6rBxi6iEJEUbTvUAx93A HE2uu9YbFyBYKb3gEWFbGAUTcLd37/YKUMEkqH4s3esxBX8XnwcTWdnHa1NHLW/aEk RlMJjYnR44jypuoroJBF5l7frNcuAfJEs/m6tvO/CDNWjn/xM2CSyd42T8xMaf2jZY PIsQBfKl8JnHWJsmY04bYVTDoC3aAgumuf+TX2jm/lPf1lwFLpb4FWQLIvZ3XQlpWo 8EHxiekZtOkXiBnid6yGiO0VJTIDt6UEw/CXS4+H4+8itpS4JH2xzfnH4h43ARDc1t 97Sj7OaesfPsw== X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 79334 Cc: 79334 <at> debbugs.gnu.org, eggert@HIDDEN, dmitry@HIDDEN X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -3.3 (---) Eli Zaretskii <eliz@HIDDEN> writes: >> From: Spencer Baugh <sbaugh@HIDDEN> >> Cc: Paul Eggert <eggert@HIDDEN>, 79334 <at> debbugs.gnu.org, Dmitry Gutov >> <dmitry@HIDDEN> >> Date: Fri, 29 Aug 2025 09:07:50 -0400 >> >> Eli Zaretskii <eliz@HIDDEN> writes: >> > Which brings us to status_notify. It indeed can close the descriptors >> > of processes which terminated, so we should probably make it safer. >> > For example, we could leave alone descriptors marked with non-NULL >> > waiting_thread if that thread is still alive and is different from the >> > current thread. Would that fix the problem? If yes, I think we >> > should install such a change. >> >> That was my initial idea for a fix as well. But I'm not sure about the >> specific details. Namely, perhaps we need to be checking waiting_thread >> everywhere that we're doing FOR_EACH_PROCESS, not just in status_notify. >> And maybe some new functions/macros should be added to simplify this; >> maybe a version of FOR_EACH_PROCESS which only iterates over processes >> for which waiting_thread == current_thread. > > You may be right, we should audit all the users of FOR_EACH_PROCESS. > At least one of them (get-buffer-process) is probably okay as it is, > but others might need to avoid looking at processes locked to other > threads, indeed. While doing this, I had another thought... Do you know if we have any code which handles the fact that delete-process might be called on a process while another thread is calling pselect on its fds? I don't think we do, which suggests that might also be problematic. I'll work on a fix which encompasses that as well.
bug-gnu-emacs@HIDDEN:bug#79334; Package emacs.
Full text available.Received: (at 79334) by debbugs.gnu.org; 29 Aug 2025 16:11:24 +0000 From debbugs-submit-bounces <at> debbugs.gnu.org Fri Aug 29 12:11:23 2025 Received: from localhost ([127.0.0.1]:42987 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1us1hD-0006yr-0J for submit <at> debbugs.gnu.org; Fri, 29 Aug 2025 12:11:23 -0400 Received: from mxout1.mail.janestreet.com ([38.105.200.78]:49049) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from <sbaugh@HIDDEN>) id 1us1h9-0006xb-S5 for 79334 <at> debbugs.gnu.org; Fri, 29 Aug 2025 12:11:21 -0400 Received: from mail-lf1-f70.google.com ([209.85.167.70]) by mxgoog2.mail.janestreet.com with esmtps (TLS1.3:TLS_AES_128_GCM_SHA256:128) (Exim 4.98.2) id 1us1h4-000000021Z0-2Ff0 for 79334 <at> debbugs.gnu.org; Fri, 29 Aug 2025 12:11:14 -0400 Received: by mail-lf1-f70.google.com with SMTP id 2adb3069b0e04-55f41cbbb89so1219956e87.3 for <79334 <at> debbugs.gnu.org>; Fri, 29 Aug 2025 09:11:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=janestreet.com; s=google; t=1756483873; x=1757088673; darn=debbugs.gnu.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=rbb9gJC/mW9AXUi1WMgjOjVqpA/9OaqtV8CQbxbkb3A=; b=IfcwY7FJDWXMoNbEPl6wjXhOR+cYr/AiVpvbh84/+4KzICFbvmXvNfAPlsxhepMX/W twLSVPalPOW90X1nxRdoRkOdYZwZpqhb/8ZszWv0BkZSWhHocXyKmjD1tbfJZBK0095t NqNZd3mfijcnAP4rS+gEQD6FUwmqoR2JCq0Yc= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=janestreet.com; s=waixah; t=1756483874; bh=rbb9gJC/mW9AXUi1WMgjOjVqpA/9OaqtV8CQbxbkb3A=; h=References:In-Reply-To:From:Date:Subject:To:Cc; b=XilS+77jhf0lQTP8tVj+Wwl1tm1nn3qiRecuGZi9eTfOQgDlweFoamGV/3Duj6aVS XX3RtWm9JcFxuOB3EduE5i/4SIvh6OOZc//Kuf3dXEemesMu8tZVK9OSGQzeU3UMT6 mzOnxqHtzJuiveA8arFBPSWoMD+Q4VYK2ywuoxp0bTmN6O+pWePWZE7SyBcrprxE7M 5Y8TjbZ2ebh4CEuY5lDr6Ah3t5iIXRz2DaeoSRZxGQJrTVhmXAg+DhnunRNyek63Mj tWU4WOEa2URi7vnPx0j/d/ZUuv+JeerVkVwDXS0v4Q3CfzCYRrosq4q4o21+7aIY4W 2FCiAGyHrW0wA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1756483873; x=1757088673; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=rbb9gJC/mW9AXUi1WMgjOjVqpA/9OaqtV8CQbxbkb3A=; b=uODLg178U8C1oexbDL1u6ujkfjq2xbkfuhPoQJQOWTZLs/SdkNeCwQJ4LaZkTHJ1s3 ATGmd0QAMoN8CVknIC97Tc1fbqYrO6ng7G+TbR0iHokUOE7GscW0u07zXtlWJd6qk58L eAaN4i+1l9hvE2lc7XQz+uanx5OVVS66IkvWYiV15TCWSs07uNefk3tiK6qr+xZQQEWv 5hLSqqlDI2ccCbnbCK8LJsHFQ6HQqKA0uaKZxRA0BbW4CT545RoH8PeSFO12mR2R+Hw2 lDHY5wNfJMcw5awlr8yOEfobqj2u7CfsurWJPtZcAhU9xRj6dVeajbrXqN9PKn54+2rm GaSg== X-Gm-Message-State: AOJu0Yx77Z7pCWaurDM9IP4LkJtEI+QWc64jFdpW/Vt2XVTMRz/F72nE 2HWAyWwc734/vEiTx0+xLhESbGqXe1hkK4Bl2tYm5OocmNBuHstMxyrbk4+GiP+FaM22ai2E0fM tKMQrPuXRV2lgmkp8iv+B9JBfhI2Qtku/AlnwWzMpFhglM79QWE111JcQMDBvAeXeEQQwUqT3L1 Ypm2MjNT/oRnytfvX+tEVft2O3jOuW4w== X-Gm-Gg: ASbGnctGClBsIJG9bAmTGEatLb0m9UlVZmFSMGHFjw1/n8tRDswSNOI2ApqoJw1pv2R 8bV9kgQg2MvmEBMfrpxndsP4kJlKXYP47xigEEklyWeKuIkOvVmBSistSIEDAUFNOcHSKvaP1k/ ev2Sue9ybbtgERGGQ6uUvG X-Received: by 2002:a05:6512:3e09:b0:55f:4746:6202 with SMTP id 2adb3069b0e04-55f47466584mr5235077e87.11.1756483872988; Fri, 29 Aug 2025 09:11:12 -0700 (PDT) X-Google-Smtp-Source: AGHT+IGrhRSkqk/ar9SzOpgIyqc89bzt5v1+WVzT6yswIwUra87PecZsquU6TBama02ql/SfNmrrNWkT3yc/BuKRKNw= X-Received: by 2002:a05:6512:3e09:b0:55f:4746:6202 with SMTP id 2adb3069b0e04-55f47466584mr5235070e87.11.1756483872567; Fri, 29 Aug 2025 09:11:12 -0700 (PDT) MIME-Version: 1.0 References: <ierwm6n6tno.fsf@HIDDEN> <86cy8es4bo.fsf@HIDDEN> <ierqzwu6ztl.fsf@HIDDEN> <86a53is0zw.fsf@HIDDEN> <ieriki66y3g.fsf@HIDDEN> <867bymrzvr.fsf@HIDDEN> <ierfrda6v41.fsf@HIDDEN> <865xe6rv8z.fsf@HIDDEN> <CAO=BR8PCwypg+7TMbbQueQh82drqQTeYNU0LGqVQ=oDv58v5NQ@HIDDEN> <86349aruhe.fsf@HIDDEN> In-Reply-To: <86349aruhe.fsf@HIDDEN> From: Spencer Baugh <sbaugh@HIDDEN> Date: Fri, 29 Aug 2025 12:11:02 -0400 X-Gm-Features: Ac12FXwaXqkWr-aOstM-1dCSxGLjByIF83ucZOF8ZBjfKxfctAJxVfqp7lW67PU Message-ID: <CAO=BR8OFBKbTht7_uQsKOsfNOBz_EqTPBrrF9fizv8qb5K8KJw@HIDDEN> Subject: Re: bug#79334: [PATCH] Don't release thread select lock unnecessarily To: Eli Zaretskii <eliz@HIDDEN> Content-Type: multipart/alternative; boundary="000000000000478d8d063d8345d0" X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 79334 Cc: 79334 <at> debbugs.gnu.org, eggert@HIDDEN, Dmitry Gutov <dmitry@HIDDEN> X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -3.3 (---) --000000000000478d8d063d8345d0 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Fri, Aug 29, 2025, 11:57=E2=80=AFAM Eli Zaretskii <eliz@HIDDEN> wrote: > > From: Spencer Baugh <sbaugh@HIDDEN> > > Date: Fri, 29 Aug 2025 11:43:21 -0400 > > Cc: 79334 <at> debbugs.gnu.org, eggert@HIDDEN, Dmitry Gutov < > dmitry@HIDDEN> > > > > Okay, so then how about we delete this unsafe thread_select and put som= e > thread-yield calls around > > wait_reading_process_output? > > I'm not objected to not calling thread_select there, but can we have > tests that will show us the gains? > The primary gain is that it becomes easier to reason about locking. I'm not sure how to write a test for that :) Also, I think we should fix the problems with status_notify and other > users of FOR_EACH_PROCESS before we make the change with > thread_select, because current code in status_notify is definitely > wrong, and affects the related behavior. > Yes, I'll work on a patch which does both. (I'm not going to bother making a change which doesn't rely on the change to remove the thread_select, because this code is hard enough to reason about already) > --000000000000478d8d063d8345d0 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable <div dir=3D"auto"><div><br><br><div class=3D"gmail_quote gmail_quote_contai= ner"><div dir=3D"ltr" class=3D"gmail_attr">On Fri, Aug 29, 2025, 11:57=E2= =80=AFAM Eli Zaretskii <<a href=3D"mailto:eliz@HIDDEN">eliz@HIDDEN</a>= > wrote:<br></div><blockquote class=3D"gmail_quote" style=3D"margin:0 0 = 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">> From: Spencer Baug= h <<a href=3D"mailto:sbaugh@HIDDEN" target=3D"_blank" rel=3D"nor= eferrer">sbaugh@HIDDEN</a>><br> > Date: Fri, 29 Aug 2025 11:43:21 -0400<br> > Cc: <a href=3D"mailto:79334 <at> debbugs.gnu.org" target=3D"_blank" rel=3D"= noreferrer">79334 <at> debbugs.gnu.org</a>, <a href=3D"mailto:eggert@HIDDEN= " target=3D"_blank" rel=3D"noreferrer">eggert@HIDDEN</a>, Dmitry Gutov= <<a href=3D"mailto:dmitry@HIDDEN" target=3D"_blank" rel=3D"noreferre= r">dmitry@HIDDEN</a>><br> > <br> > Okay, so then how about we delete this unsafe thread_select and put so= me thread-yield calls around<br> > wait_reading_process_output?<br> <br> I'm not objected to not calling thread_select there, but can we have<br= > tests that will show us the gains?<br></blockquote></div></div><div dir=3D"= auto"><br></div><div dir=3D"auto">The primary gain is that it becomes easie= r to reason about locking.=C2=A0 I'm not sure how to write a test for t= hat :)</div><div dir=3D"auto"><br></div><div dir=3D"auto"><div class=3D"gma= il_quote gmail_quote_container"><blockquote class=3D"gmail_quote" style=3D"= margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> Also, I think we should fix the problems with status_notify and other<br> users of FOR_EACH_PROCESS before we make the change with<br> thread_select, because current code in status_notify is definitely<br> wrong, and affects the related behavior.<br></blockquote></div></div><div d= ir=3D"auto"><br></div><div dir=3D"auto">Yes, I'll work on a patch which= does both.=C2=A0 (I'm not going to bother making a change which doesn&= #39;t rely on the change to remove the thread_select, because this code is = hard enough to reason about already)</div><div dir=3D"auto"><div class=3D"g= mail_quote gmail_quote_container"><blockquote class=3D"gmail_quote" style= =3D"margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> </blockquote></div></div></div> --000000000000478d8d063d8345d0--
bug-gnu-emacs@HIDDEN:bug#79334; Package emacs.
Full text available.Received: (at 79334) by debbugs.gnu.org; 29 Aug 2025 15:57:57 +0000 From debbugs-submit-bounces <at> debbugs.gnu.org Fri Aug 29 11:57:57 2025 Received: from localhost ([127.0.0.1]:42949 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1us1UC-0004xJ-KN for submit <at> debbugs.gnu.org; Fri, 29 Aug 2025 11:57:57 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:49534) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from <eliz@HIDDEN>) id 1us1U9-0004wG-R2 for 79334 <at> debbugs.gnu.org; Fri, 29 Aug 2025 11:57:54 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from <eliz@HIDDEN>) id 1us1U3-00067b-3f; Fri, 29 Aug 2025 11:57:47 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org; s=fencepost-gnu-org; h=References:Subject:In-Reply-To:To:From:Date: mime-version; bh=hAvlzZKbhrFJaTcWfOhliEnhTcPlkM3JMVMJn+TMCQw=; b=mO+8W4S7kOBI 6CsXgPXsD/ExRCz6g8TCZV5IPLe2TauKfAfcFzydn8IIWPFSpZb8UE3wBcDeiX2KAD2t5sqtbHWyf W6VjlyguEPrpNhWLrcmRUd/HDAvCSc2M0V4U6gpovI46vtbjs39cIstNVdv1GUYXNyDe0UTaWMrlo tDDaKwPK+bvirV3tGbC3L0S6iTS7op0Kr301jrjlqNzDERmZ4XTIy1DcA72J7xylYDG8xM1tV0VRv eOyQRgLL25kiqlQsQ9RhQPn3KbS1fc27vOPYBZ0wDJ+SxGnBrnavj86M7k9QUx1FM61QyEnlHKjwG J3+kmVD40s/7V+6anVQU2w==; Date: Fri, 29 Aug 2025 18:57:33 +0300 Message-Id: <86349aruhe.fsf@HIDDEN> From: Eli Zaretskii <eliz@HIDDEN> To: Spencer Baugh <sbaugh@HIDDEN> In-Reply-To: <CAO=BR8PCwypg+7TMbbQueQh82drqQTeYNU0LGqVQ=oDv58v5NQ@HIDDEN> (message from Spencer Baugh on Fri, 29 Aug 2025 11:43:21 -0400) Subject: Re: bug#79334: [PATCH] Don't release thread select lock unnecessarily References: <ierwm6n6tno.fsf@HIDDEN> <86cy8es4bo.fsf@HIDDEN> <ierqzwu6ztl.fsf@HIDDEN> <86a53is0zw.fsf@HIDDEN> <ieriki66y3g.fsf@HIDDEN> <867bymrzvr.fsf@HIDDEN> <ierfrda6v41.fsf@HIDDEN> <865xe6rv8z.fsf@HIDDEN> <CAO=BR8PCwypg+7TMbbQueQh82drqQTeYNU0LGqVQ=oDv58v5NQ@HIDDEN> X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 79334 Cc: 79334 <at> debbugs.gnu.org, eggert@HIDDEN, dmitry@HIDDEN X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -3.3 (---) > From: Spencer Baugh <sbaugh@HIDDEN> > Date: Fri, 29 Aug 2025 11:43:21 -0400 > Cc: 79334 <at> debbugs.gnu.org, eggert@HIDDEN, Dmitry Gutov <dmitry@HIDDEN> > > Okay, so then how about we delete this unsafe thread_select and put some thread-yield calls around > wait_reading_process_output? I'm not objected to not calling thread_select there, but can we have tests that will show us the gains? Also, I think we should fix the problems with status_notify and other users of FOR_EACH_PROCESS before we make the change with thread_select, because current code in status_notify is definitely wrong, and affects the related behavior.
bug-gnu-emacs@HIDDEN:bug#79334; Package emacs.
Full text available.Received: (at 79334) by debbugs.gnu.org; 29 Aug 2025 15:43:44 +0000 From debbugs-submit-bounces <at> debbugs.gnu.org Fri Aug 29 11:43:44 2025 Received: from localhost ([127.0.0.1]:42926 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1us1GQ-0002rZ-PX for submit <at> debbugs.gnu.org; Fri, 29 Aug 2025 11:43:44 -0400 Received: from mxout5.mail.janestreet.com ([64.215.233.18]:56661) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from <sbaugh@HIDDEN>) id 1us1GM-0002qE-Nu for 79334 <at> debbugs.gnu.org; Fri, 29 Aug 2025 11:43:40 -0400 Received: from mail-lj1-f199.google.com ([209.85.208.199]) by mxgoog2.mail.janestreet.com with esmtps (TLS1.3:TLS_AES_128_GCM_SHA256:128) (Exim 4.98.2) id 1us1GH-00000001yfd-1bVq for 79334 <at> debbugs.gnu.org; Fri, 29 Aug 2025 11:43:33 -0400 Received: by mail-lj1-f199.google.com with SMTP id 38308e7fff4ca-336b13923d4so7739521fa.1 for <79334 <at> debbugs.gnu.org>; Fri, 29 Aug 2025 08:43:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=janestreet.com; s=google; t=1756482212; x=1757087012; darn=debbugs.gnu.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=fxdnxATSoWkbDJ8mPjtVVQCjjd9ivKg0YuV9rA5ZMKw=; b=ECsf5kpY0xQv9N0U88hPeXxvpEGt8E/cwTpcg7ct7qDnh92x/G6EN3yv2VzvXFvVOv W3wTdn93/SRAA0B4wU3XkNxoGzgIcDSsmo5xVYOApFDlE719+7f7PnvO9s2q6BSGUF8Q IhBOBvRal9aldN0E/zDdGavyDakuepecxAQ6Q= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=janestreet.com; s=waixah; t=1756482213; bh=fxdnxATSoWkbDJ8mPjtVVQCjjd9ivKg0YuV9rA5ZMKw=; h=References:In-Reply-To:From:Date:Subject:To:Cc; b=xMqRpmQve88dSNKv6PVyL7TjBXCCVfhpLSqDYawmPEQyodIt7LMCGtGuUbJhGPEWp hw3d81higT2hffSrXZiq6ldKc9qLz7qxmdsHHeGV9vF2McZ4h7A/FQRdVxYgD2Y3o6 ayWxdVX9y8zSnBS5ZtfyiGYgA8b2GJ78qJ/FkdIfM1zMrT3dcu6jYwTI1/VylbtkuC Ak2A+cttvfkYl/Ta0AM3M96CVYHBV6qT3BY0jji+W1/2tPvi8Ypzl/xPnN3LcfVADm E+ACyVvf7v0CXNAaMeYIrhLS3gRlk6dAMZTQnCfNcGsepvDElzhMfIe+6y+BOepsvh hXebMUPsIiw6g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1756482212; x=1757087012; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=fxdnxATSoWkbDJ8mPjtVVQCjjd9ivKg0YuV9rA5ZMKw=; b=lDXrUd3RuUHcRDSuxs8ThdLu9Q7eAZYzTZvPmMludupKnfcAv7C1RxF+hLYvwwc0v3 G4o9sowt0xpocGFd5PZZkl23RP1ISDHZWVhQMrJBty/ie6iv0EiVuXX1yO1WBDLPXEyR ypAX4x7mIfR3uK3SHbQzW5t1LuUJcguAn02HMp0lJfnSYkz3dpiGKGz0Ws+LeMY2qgOQ hriVjvJAmgeA/cud/gsnswioXOxXe1KCsrPDXw7JSqTFr1nA7LiBWr6J1mCJuGedGaQk WrAX8zipcT47v73KQ2aAtblm0hF72TpPJNVE9JWBx/ObFJWefXE6sbSF+tU/LG9kPF84 ZBpw== X-Gm-Message-State: AOJu0YwBpImaUj+B4W+Ugxy44X95Z9d4tUKy7fDH554uILDatSQLJ9F4 VMNLMoLUvOc6wdxfMULJ/+LZZTmS5xYkJUsXLj8/xcxN1kpx/rAw7ULfeTtj+mLefjmCh5nTsnn zhgGbbfebwiCHo7PLmw3ocA2C0uHyfqslfOszSevu5L7o/389sPV36A7yXZSXMz83US/DfRPrYj l1/VVk5lz9K8zHu8Sa5KHXf8g2dzQlMw== X-Gm-Gg: ASbGncuezZgq3UlqFR+C2EjosvwgTGCpsXpd6wxP2zUm4RF7e20+Ywa68umD0Cf0dOv Qv0XbQ7wspRRAdobRXLZJ6objAd6bxAN8gKTomGy1qb0mAl/RRUC/cTOS2i+1MvzsIaXU5+pFb4 Mu/jFGVxPGf07e/h8GcVwH X-Received: by 2002:a05:651c:f13:b0:32b:7165:d0 with SMTP id 38308e7fff4ca-33650e3f8e8mr63959121fa.10.1756482211833; Fri, 29 Aug 2025 08:43:31 -0700 (PDT) X-Google-Smtp-Source: AGHT+IGvUPUj75CSkNFySI8hcG5Yj7udCRWw2P0gSKoXWkcKfDUIIHklgfxFk5HuT0NEsyS4awNj3cZ98+uWLOIOumk= X-Received: by 2002:a05:651c:f13:b0:32b:7165:d0 with SMTP id 38308e7fff4ca-33650e3f8e8mr63959021fa.10.1756482211398; Fri, 29 Aug 2025 08:43:31 -0700 (PDT) MIME-Version: 1.0 References: <ierwm6n6tno.fsf@HIDDEN> <86cy8es4bo.fsf@HIDDEN> <ierqzwu6ztl.fsf@HIDDEN> <86a53is0zw.fsf@HIDDEN> <ieriki66y3g.fsf@HIDDEN> <867bymrzvr.fsf@HIDDEN> <ierfrda6v41.fsf@HIDDEN> <865xe6rv8z.fsf@HIDDEN> In-Reply-To: <865xe6rv8z.fsf@HIDDEN> From: Spencer Baugh <sbaugh@HIDDEN> Date: Fri, 29 Aug 2025 11:43:21 -0400 X-Gm-Features: Ac12FXw3CwyQ9uwCdLgAJcsfw4YfJVjUGXJeXMiq2js0UPK1KUrL3hbsvWwwhaE Message-ID: <CAO=BR8PCwypg+7TMbbQueQh82drqQTeYNU0LGqVQ=oDv58v5NQ@HIDDEN> Subject: Re: bug#79334: [PATCH] Don't release thread select lock unnecessarily To: Eli Zaretskii <eliz@HIDDEN> Content-Type: multipart/alternative; boundary="00000000000043dbae063d82e2d0" X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 79334 Cc: 79334 <at> debbugs.gnu.org, eggert@HIDDEN, Dmitry Gutov <dmitry@HIDDEN> X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -3.3 (---) --00000000000043dbae063d82e2d0 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Fri, Aug 29, 2025, 11:41=E2=80=AFAM Eli Zaretskii <eliz@HIDDEN> wrote: > > From: Spencer Baugh <sbaugh@HIDDEN> > > Cc: 79334 <at> debbugs.gnu.org, eggert@HIDDEN, dmitry@HIDDEN > > Date: Fri, 29 Aug 2025 10:49:34 -0400 > > > > Eli Zaretskii <eliz@HIDDEN> writes: > > > > >> Primarily: Calling thread_select here complicates reasoning about th= e > > >> code. We should not release the lock when not necessary, because it > > >> makes it harder to think about possible thread interleavings. And > it's > > >> not necessary here because this pselect call returns immediately. > > > > > > How do we define "necessary" in this context? > > > > It's necessary to release the lock around long pselect calls because > > otherwise other threads will not run while Emacs is waiting for input, > > which means they probably won't run at all. > > > > >> (The only reason the pselect call wouldn't return immediately is if = we > > >> released the lock and then switched to another thread) > > > > > > You seem to assume that a call to thread_select is only necessary whe= n > > > some potentially prolonged operation or wait is expected? But I'm no= t > > > sure this is correct, because that could prevent other threads from > > > running for too long, and force programmers to inject sleep-for > > > etc. in their programs. By allowing thread switches as frequently as > > > possible lets other threads more opportunities to run (to some extent > > > simulating preemptive scheduling), which I think is a Good Thing > > > overall, no? > > > > Three points: > > > > - Most importantly: We can only allow thread switches at known safe > > points. It's not clear that this is a safe point. (Actually, we kno= w > > it currently isn't, because of the status_notify issue. But I suspec= t > > there may be other issues which are harder to analyze) > > This reason will be removed once we fix status_notify. > > > - Adding more thread switches is not necessarily a good thing. It can > > degrade whole-system performance, because context switching is slow. > > > > - wait_reading_process_output already does a thread switch around the > > pselect, so it's not important for one to happen here. > > You basically give the same single reason in several different > wordings. > > I see your point, but my experience with threads in Emacs is the > opposite: there seem to be too few thread switch opportunities, so > many programs don't work as expected unless one sprinkles sleep-for in > various places. For example, try to run something in a non-main > thread and then attempt to interact with Emacs in the main thread. > > IOW, it is quite easy for a Lisp program to hog the global lock and > starve the other threads, so each additional thread-switch opportunity > is IME a blessing, not a curse. > Okay, so then how about we delete this unsafe thread_select and put some thread-yield calls around wait_reading_process_output? > --00000000000043dbae063d82e2d0 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable <div dir=3D"auto"><div><br><br><div class=3D"gmail_quote gmail_quote_contai= ner"><div dir=3D"ltr" class=3D"gmail_attr">On Fri, Aug 29, 2025, 11:41=E2= =80=AFAM Eli Zaretskii <<a href=3D"mailto:eliz@HIDDEN">eliz@HIDDEN</a>= > wrote:<br></div><blockquote class=3D"gmail_quote" style=3D"margin:0 0 = 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">> From: Spencer Baug= h <<a href=3D"mailto:sbaugh@HIDDEN" target=3D"_blank" rel=3D"nor= eferrer">sbaugh@HIDDEN</a>><br> > Cc: <a href=3D"mailto:79334 <at> debbugs.gnu.org" target=3D"_blank" rel=3D"= noreferrer">79334 <at> debbugs.gnu.org</a>,=C2=A0 <a href=3D"mailto:eggert@HIDDEN= la.edu" target=3D"_blank" rel=3D"noreferrer">eggert@HIDDEN</a>,=C2=A0 = <a href=3D"mailto:dmitry@HIDDEN" target=3D"_blank" rel=3D"noreferrer">dm= itry@HIDDEN</a><br> > Date: Fri, 29 Aug 2025 10:49:34 -0400<br> > <br> > Eli Zaretskii <<a href=3D"mailto:eliz@HIDDEN" target=3D"_blank" re= l=3D"noreferrer">eliz@HIDDEN</a>> writes:<br> > <br> > >> Primarily: Calling thread_select here complicates reasoning a= bout the<br> > >> code.=C2=A0 We should not release the lock when not necessary= , because it<br> > >> makes it harder to think about possible thread interleavings.= =C2=A0 And it's<br> > >> not necessary here because this pselect call returns immediat= ely.<br> > ><br> > > How do we define "necessary" in this context?<br> > <br> > It's necessary to release the lock around long pselect calls becau= se<br> > otherwise other threads will not run while Emacs is waiting for input,= <br> > which means they probably won't run at all.<br> > <br> > >> (The only reason the pselect call wouldn't return immedia= tely is if we<br> > >> released the lock and then switched to another thread)<br> > ><br> > > You seem to assume that a call to thread_select is only necessary= when<br> > > some potentially prolonged operation or wait is expected?=C2=A0 B= ut I'm not<br> > > sure this is correct, because that could prevent other threads fr= om<br> > > running for too long, and force programmers to inject sleep-for<b= r> > > etc. in their programs.=C2=A0 By allowing thread switches as freq= uently as<br> > > possible lets other threads more opportunities to run (to some ex= tent<br> > > simulating preemptive scheduling), which I think is a Good Thing<= br> > > overall, no?<br> > <br> > Three points:<br> > <br> > - Most importantly: We can only allow thread switches at known safe<br= > >=C2=A0 =C2=A0points.=C2=A0 It's not clear that this is a safe point= .=C2=A0 (Actually, we know<br> >=C2=A0 =C2=A0it currently isn't, because of the status_notify issue= .=C2=A0 But I suspect<br> >=C2=A0 =C2=A0there may be other issues which are harder to analyze)<br> <br> This reason will be removed once we fix status_notify.<br> <br> > - Adding more thread switches is not necessarily a good thing.=C2=A0 I= t can<br> >=C2=A0 =C2=A0degrade whole-system performance, because context switchin= g is slow.<br> > <br> > - wait_reading_process_output already does a thread switch around the<= br> >=C2=A0 =C2=A0pselect, so it's not important for one to happen here.= <br> <br> You basically give the same single reason in several different<br> wordings.<br> <br> I see your point, but my experience with threads in Emacs is the<br> opposite: there seem to be too few thread switch opportunities, so<br> many programs don't work as expected unless one sprinkles sleep-for in<= br> various places.=C2=A0 For example, try to run something in a non-main<br> thread and then attempt to interact with Emacs in the main thread.<br> <br> IOW, it is quite easy for a Lisp program to hog the global lock and<br> starve the other threads, so each additional thread-switch opportunity<br> is IME a blessing, not a curse.<br></blockquote></div></div><div dir=3D"aut= o"><br></div><div dir=3D"auto">Okay, so then how about we delete this unsaf= e thread_select and put some thread-yield calls around wait_reading_process= _output?</div><div dir=3D"auto"><div class=3D"gmail_quote gmail_quote_conta= iner"><blockquote class=3D"gmail_quote" style=3D"margin:0 0 0 .8ex;border-l= eft:1px #ccc solid;padding-left:1ex"> </blockquote></div></div></div> --00000000000043dbae063d82e2d0--
bug-gnu-emacs@HIDDEN:bug#79334; Package emacs.
Full text available.Received: (at 79334) by debbugs.gnu.org; 29 Aug 2025 15:41:53 +0000 From debbugs-submit-bounces <at> debbugs.gnu.org Fri Aug 29 11:41:53 2025 Received: from localhost ([127.0.0.1]:42921 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1us1Eb-0002by-Ve for submit <at> debbugs.gnu.org; Fri, 29 Aug 2025 11:41:53 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:49032) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from <eliz@HIDDEN>) id 1us1EX-0002ae-9Q for 79334 <at> debbugs.gnu.org; Fri, 29 Aug 2025 11:41:48 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from <eliz@HIDDEN>) id 1us1EP-0003An-4X; Fri, 29 Aug 2025 11:41:37 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org; s=fencepost-gnu-org; h=References:Subject:In-Reply-To:To:From:Date: mime-version; bh=wwmcgsbcK/TljZ988x0eaBWcKQ36ow9LSegB+I8w3xw=; b=quytDHwMnKjS Li9dCNei+sCJ4ySJXAJVVImUueaEhCqBUs2c2tf1Wh9pbmuwC4D4SmQVdAEHLVhnmQQc5u1KmvauJ xqXQiOfCZNgileh5pdDbacVGJBTteI3hlljEMt0sdgP/m6dc+uUAh4mjzrUmx73zK1ptkfHgX8Z8g AoOHFBKtDl9ePPYHmxikn1VrIiWZ8RZwVGjd3G+cdGNlSfA0xlxPuk597gxccLm0iEnpsoWeS0csO X4I2Yeheyvu/BUXMRcqqGbAzNfiRe9kluGYYJ56HwNP//eBrLChSiAv2DIUhNNZfX1EaNzjjhr5j1 ztvQYLbwCIawrEXPKaP5qA==; Date: Fri, 29 Aug 2025 18:41:00 +0300 Message-Id: <865xe6rv8z.fsf@HIDDEN> From: Eli Zaretskii <eliz@HIDDEN> To: Spencer Baugh <sbaugh@HIDDEN> In-Reply-To: <ierfrda6v41.fsf@HIDDEN> (message from Spencer Baugh on Fri, 29 Aug 2025 10:49:34 -0400) Subject: Re: bug#79334: [PATCH] Don't release thread select lock unnecessarily References: <ierwm6n6tno.fsf@HIDDEN> <86cy8es4bo.fsf@HIDDEN> <ierqzwu6ztl.fsf@HIDDEN> <86a53is0zw.fsf@HIDDEN> <ieriki66y3g.fsf@HIDDEN> <867bymrzvr.fsf@HIDDEN> <ierfrda6v41.fsf@HIDDEN> X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 79334 Cc: 79334 <at> debbugs.gnu.org, eggert@HIDDEN, dmitry@HIDDEN X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -3.3 (---) > From: Spencer Baugh <sbaugh@HIDDEN> > Cc: 79334 <at> debbugs.gnu.org, eggert@HIDDEN, dmitry@HIDDEN > Date: Fri, 29 Aug 2025 10:49:34 -0400 > > Eli Zaretskii <eliz@HIDDEN> writes: > > >> Primarily: Calling thread_select here complicates reasoning about the > >> code. We should not release the lock when not necessary, because it > >> makes it harder to think about possible thread interleavings. And it's > >> not necessary here because this pselect call returns immediately. > > > > How do we define "necessary" in this context? > > It's necessary to release the lock around long pselect calls because > otherwise other threads will not run while Emacs is waiting for input, > which means they probably won't run at all. > > >> (The only reason the pselect call wouldn't return immediately is if we > >> released the lock and then switched to another thread) > > > > You seem to assume that a call to thread_select is only necessary when > > some potentially prolonged operation or wait is expected? But I'm not > > sure this is correct, because that could prevent other threads from > > running for too long, and force programmers to inject sleep-for > > etc. in their programs. By allowing thread switches as frequently as > > possible lets other threads more opportunities to run (to some extent > > simulating preemptive scheduling), which I think is a Good Thing > > overall, no? > > Three points: > > - Most importantly: We can only allow thread switches at known safe > points. It's not clear that this is a safe point. (Actually, we know > it currently isn't, because of the status_notify issue. But I suspect > there may be other issues which are harder to analyze) This reason will be removed once we fix status_notify. > - Adding more thread switches is not necessarily a good thing. It can > degrade whole-system performance, because context switching is slow. > > - wait_reading_process_output already does a thread switch around the > pselect, so it's not important for one to happen here. You basically give the same single reason in several different wordings. I see your point, but my experience with threads in Emacs is the opposite: there seem to be too few thread switch opportunities, so many programs don't work as expected unless one sprinkles sleep-for in various places. For example, try to run something in a non-main thread and then attempt to interact with Emacs in the main thread. IOW, it is quite easy for a Lisp program to hog the global lock and starve the other threads, so each additional thread-switch opportunity is IME a blessing, not a curse.
bug-gnu-emacs@HIDDEN:bug#79334; Package emacs.
Full text available.Received: (at 79334) by debbugs.gnu.org; 29 Aug 2025 14:49:46 +0000 From debbugs-submit-bounces <at> debbugs.gnu.org Fri Aug 29 10:49:45 2025 Received: from localhost ([127.0.0.1]:42774 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1us0QC-0003Mn-5H for submit <at> debbugs.gnu.org; Fri, 29 Aug 2025 10:49:45 -0400 Received: from mxout5.mail.janestreet.com ([64.215.233.18]:49683) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from <sbaugh@HIDDEN>) id 1us0Q8-0003LE-FM for 79334 <at> debbugs.gnu.org; Fri, 29 Aug 2025 10:49:42 -0400 From: Spencer Baugh <sbaugh@HIDDEN> To: Eli Zaretskii <eliz@HIDDEN> Subject: Re: bug#79334: [PATCH] Don't release thread select lock unnecessarily In-Reply-To: <867bymrzvr.fsf@HIDDEN> (Eli Zaretskii's message of "Fri, 29 Aug 2025 17:00:56 +0300") References: <ierwm6n6tno.fsf@HIDDEN> <86cy8es4bo.fsf@HIDDEN> <ierqzwu6ztl.fsf@HIDDEN> <86a53is0zw.fsf@HIDDEN> <ieriki66y3g.fsf@HIDDEN> <867bymrzvr.fsf@HIDDEN> Date: Fri, 29 Aug 2025 10:49:34 -0400 Message-ID: <ierfrda6v41.fsf@HIDDEN> User-Agent: Gnus/5.13 (Gnus v5.13) MIME-Version: 1.0 Content-Type: text/plain DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=janestreet.com; s=waixah; t=1756478974; bh=at7cROYOKajFHWykh5sJEnUx1DFD2SfPsFq6+h/bbDg=; h=From:To:Cc:Subject:In-Reply-To:References:Date; b=yQS/z161DuInozzR2waFgcA8AGX5HLUBko+wVk7IFgK/1hbBAMU0cXPr9Ys5X1s/O 04kvkug3gAcz6tTHYs2jXcuWGohpqzlDqeU8dN6i10xtCGADsNCHsty2ws+CNb6U7X YPFO1MMFWLf1TCaxG53oR76pent60LcjOJC8WenQYqkQhxKM28+irVSS/NpXCctgnN G1jkMmwRpLxri80vtqTywZAT+qPRwmuwWy+yLk4S05lhO39OPn7WFXcRnBBjkJWd+N mzE6RkZ+zcn7sHuDBnCclS2IvkKsRa0XLZqSPIc7r3a7z/ConeQFB6FYOmGbkKvHH6 KqPPqXWj/q/PQ== X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 79334 Cc: 79334 <at> debbugs.gnu.org, eggert@HIDDEN, dmitry@HIDDEN X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -3.3 (---) Eli Zaretskii <eliz@HIDDEN> writes: >> From: Spencer Baugh <sbaugh@HIDDEN> >> Cc: 79334 <at> debbugs.gnu.org, eggert@HIDDEN, dmitry@HIDDEN >> Date: Fri, 29 Aug 2025 09:45:07 -0400 >> >> Eli Zaretskii <eliz@HIDDEN> writes: >> >> >> That being said, I'm confident that it's incorrect and unnecessary to be >> >> using thread_select instead of pselect here, so we should stop doing >> >> that in addition to any other fix. >> > >> > If we fix status_notify, what other reasons remain for not calling >> > thread_select there? >> >> Primarily: Calling thread_select here complicates reasoning about the >> code. We should not release the lock when not necessary, because it >> makes it harder to think about possible thread interleavings. And it's >> not necessary here because this pselect call returns immediately. > > How do we define "necessary" in this context? It's necessary to release the lock around long pselect calls because otherwise other threads will not run while Emacs is waiting for input, which means they probably won't run at all. >> (The only reason the pselect call wouldn't return immediately is if we >> released the lock and then switched to another thread) > > You seem to assume that a call to thread_select is only necessary when > some potentially prolonged operation or wait is expected? But I'm not > sure this is correct, because that could prevent other threads from > running for too long, and force programmers to inject sleep-for > etc. in their programs. By allowing thread switches as frequently as > possible lets other threads more opportunities to run (to some extent > simulating preemptive scheduling), which I think is a Good Thing > overall, no? Three points: - Most importantly: We can only allow thread switches at known safe points. It's not clear that this is a safe point. (Actually, we know it currently isn't, because of the status_notify issue. But I suspect there may be other issues which are harder to analyze) - Adding more thread switches is not necessarily a good thing. It can degrade whole-system performance, because context switching is slow. - wait_reading_process_output already does a thread switch around the pselect, so it's not important for one to happen here.
bug-gnu-emacs@HIDDEN:bug#79334; Package emacs.
Full text available.Received: (at 79334) by debbugs.gnu.org; 29 Aug 2025 14:01:08 +0000 From debbugs-submit-bounces <at> debbugs.gnu.org Fri Aug 29 10:01:08 2025 Received: from localhost ([127.0.0.1]:42576 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1urzf9-0005DB-HL for submit <at> debbugs.gnu.org; Fri, 29 Aug 2025 10:01:08 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:56786) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from <eliz@HIDDEN>) id 1urzf6-0005BZ-5e for 79334 <at> debbugs.gnu.org; Fri, 29 Aug 2025 10:01:05 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from <eliz@HIDDEN>) id 1urzf0-0003np-GE; Fri, 29 Aug 2025 10:00:58 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org; s=fencepost-gnu-org; h=References:Subject:In-Reply-To:To:From:Date: mime-version; bh=BZTU3E0kIckzj2s1rgae/vZF9A3NjXstMJzQXgeDSvw=; b=WT5Qr6gjYOFx Sa7uxPlt9FnFagWnOsC92XGG84RxXWGRVF3tedq89VcxgAQdD/NHyslgS0i9+30R3ghRtSpyEnhA1 q24O88vOsy3mVlXSlhNueeOkzfWjiYvBstvy9TQrNzKvwXro9FGh6KlSxhMVUQ13i2igeaZqE8J+6 fByhtVApuHYHWJkXVtnLtoi9B4hMedFloXRLmvjTYFO2NRbU2O4m/dh3q7kZCnuqLIIHwXxY1sFOj sOyaLqF+3NtKWNMVePr6JjZWkBmeNaZI1vLp+EI9cvIijQ5eZfrpHMK4VBttP7k1NO5z0TeUSJIWt Hl46DInfexcNJm4KCOerCQ==; Date: Fri, 29 Aug 2025 17:00:56 +0300 Message-Id: <867bymrzvr.fsf@HIDDEN> From: Eli Zaretskii <eliz@HIDDEN> To: Spencer Baugh <sbaugh@HIDDEN> In-Reply-To: <ieriki66y3g.fsf@HIDDEN> (message from Spencer Baugh on Fri, 29 Aug 2025 09:45:07 -0400) Subject: Re: bug#79334: [PATCH] Don't release thread select lock unnecessarily References: <ierwm6n6tno.fsf@HIDDEN> <86cy8es4bo.fsf@HIDDEN> <ierqzwu6ztl.fsf@HIDDEN> <86a53is0zw.fsf@HIDDEN> <ieriki66y3g.fsf@HIDDEN> X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 79334 Cc: 79334 <at> debbugs.gnu.org, eggert@HIDDEN, dmitry@HIDDEN X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -3.3 (---) > From: Spencer Baugh <sbaugh@HIDDEN> > Cc: 79334 <at> debbugs.gnu.org, eggert@HIDDEN, dmitry@HIDDEN > Date: Fri, 29 Aug 2025 09:45:07 -0400 > > Eli Zaretskii <eliz@HIDDEN> writes: > > >> That being said, I'm confident that it's incorrect and unnecessary to be > >> using thread_select instead of pselect here, so we should stop doing > >> that in addition to any other fix. > > > > If we fix status_notify, what other reasons remain for not calling > > thread_select there? > > Primarily: Calling thread_select here complicates reasoning about the > code. We should not release the lock when not necessary, because it > makes it harder to think about possible thread interleavings. And it's > not necessary here because this pselect call returns immediately. How do we define "necessary" in this context? > (The only reason the pselect call wouldn't return immediately is if we > released the lock and then switched to another thread) You seem to assume that a call to thread_select is only necessary when some potentially prolonged operation or wait is expected? But I'm not sure this is correct, because that could prevent other threads from running for too long, and force programmers to inject sleep-for etc. in their programs. By allowing thread switches as frequently as possible lets other threads more opportunities to run (to some extent simulating preemptive scheduling), which I think is a Good Thing overall, no?
bug-gnu-emacs@HIDDEN:bug#79334; Package emacs.
Full text available.Received: (at 79334) by debbugs.gnu.org; 29 Aug 2025 13:45:18 +0000 From debbugs-submit-bounces <at> debbugs.gnu.org Fri Aug 29 09:45:18 2025 Received: from localhost ([127.0.0.1]:42186 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1urzPo-0002nF-CV for submit <at> debbugs.gnu.org; Fri, 29 Aug 2025 09:45:17 -0400 Received: from mxout5.mail.janestreet.com ([64.215.233.18]:50375) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from <sbaugh@HIDDEN>) id 1urzPk-0002iS-RG for 79334 <at> debbugs.gnu.org; Fri, 29 Aug 2025 09:45:13 -0400 From: Spencer Baugh <sbaugh@HIDDEN> To: Eli Zaretskii <eliz@HIDDEN> Subject: Re: bug#79334: [PATCH] Don't release thread select lock unnecessarily In-Reply-To: <86a53is0zw.fsf@HIDDEN> (Eli Zaretskii's message of "Fri, 29 Aug 2025 16:36:51 +0300") References: <ierwm6n6tno.fsf@HIDDEN> <86cy8es4bo.fsf@HIDDEN> <ierqzwu6ztl.fsf@HIDDEN> <86a53is0zw.fsf@HIDDEN> Date: Fri, 29 Aug 2025 09:45:07 -0400 Message-ID: <ieriki66y3g.fsf@HIDDEN> User-Agent: Gnus/5.13 (Gnus v5.13) MIME-Version: 1.0 Content-Type: text/plain DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=janestreet.com; s=waixah; t=1756475107; bh=PKJZnrnixgrqu84Y0iPcA3eO5ttQIxkzUmsbGEvFx7Y=; h=From:To:Cc:Subject:In-Reply-To:References:Date; b=XkLlJR7CpUGqhmAmALvGFx+vAZ8igtC0n0YVZL27IfVAziqh1wUZ/U7bdKX84+YuG BREehTmNSK7gZdyvk4/m5DT/OpaafOf6chcU/txzj3S8P8ICLxKTeSA6uBAd7h3FHo cLXlEp81QQrmbzyzpj7qOFljExbzQw8lMSlMhQkUEg1fuFHrWS7P2aWZRlpQY5/jKn fsT/TD1aA6ry1Ej+FgeSAE68eVV/ZZUS2N+HIgNLFGhRa9TGZyXYTTf3YLYnAttr7t cbJmrYwcT2qvYAh3IuBnFEiz2MqarWqDyYuEVo10zaZPZFQpvOS53pZFNmU2XsTz1r Vktp7mDa83HhA== X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 79334 Cc: 79334 <at> debbugs.gnu.org, eggert@HIDDEN, dmitry@HIDDEN X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -3.3 (---) Eli Zaretskii <eliz@HIDDEN> writes: >> From: Spencer Baugh <sbaugh@HIDDEN> >> Cc: Paul Eggert <eggert@HIDDEN>, 79334 <at> debbugs.gnu.org, Dmitry Gutov >> <dmitry@HIDDEN> >> Date: Fri, 29 Aug 2025 09:07:50 -0400 >> >> Eli Zaretskii <eliz@HIDDEN> writes: >> >> > However, I don't think I follow the logic behind this change. I agree >> > that status_notify does things that can be dangerous (more about that >> > below), but the call to status_notify will happen whether we call >> > thread_select or pselect directly, so if status_notify can do some >> > damage, it will do that regardless of whether we call pselect directly >> > or not. I think you assume that the problematic call to status_notify >> > will happen in another thread, if we allow switching threads at this >> > point, is that right? IOW, I think you assume that, if we switch to >> > another thread, that other thread could call status_notify, and thus >> > close some descriptors on which the current thread, the one which >> > calls thread_select at this point, is waiting. If that is your >> > assumption, then the actual situation could also be the opposite: the >> > current thread, the one where you want to call pselect, will call >> > status_notify after pselect returns, and that will close some >> > descriptors used by other threads. In that case, calling pselect >> > directly will make things worse, not better, because it makes this >> > closing of descriptors imminent. >> >> Yes. I think there's multiple problems in this code and I haven't >> tracked them all down yet. >> >> That being said, I'm confident that it's incorrect and unnecessary to be >> using thread_select instead of pselect here, so we should stop doing >> that in addition to any other fix. > > If we fix status_notify, what other reasons remain for not calling > thread_select there? Primarily: Calling thread_select here complicates reasoning about the code. We should not release the lock when not necessary, because it makes it harder to think about possible thread interleavings. And it's not necessary here because this pselect call returns immediately. (The only reason the pselect call wouldn't return immediately is if we released the lock and then switched to another thread) Also, avoiding thread_select here will slightly improve performance by avoiding lock operations.
bug-gnu-emacs@HIDDEN:bug#79334; Package emacs.
Full text available.Received: (at 79334) by debbugs.gnu.org; 29 Aug 2025 13:37:06 +0000 From debbugs-submit-bounces <at> debbugs.gnu.org Fri Aug 29 09:37:06 2025 Received: from localhost ([127.0.0.1]:42175 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1urzHs-0001bt-RO for submit <at> debbugs.gnu.org; Fri, 29 Aug 2025 09:37:06 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:51086) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from <eliz@HIDDEN>) id 1urzHo-0001aY-6i for 79334 <at> debbugs.gnu.org; Fri, 29 Aug 2025 09:37:02 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from <eliz@HIDDEN>) id 1urzHh-0003xm-H4; Fri, 29 Aug 2025 09:36:53 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org; s=fencepost-gnu-org; h=References:Subject:In-Reply-To:To:From:Date: mime-version; bh=GM5lVVYcgn+bIJ0Q1L7OTwvAGXgIiPlgD6RkljjB9NY=; b=Z5i4iAXaT7Kx YNGedJ9YlYpyNUbBLas4ku5Fv3m/NsibpHqCXsPg/UbGi0u1nujxPcF2VxJ8X+rBHLwhsgfMc0x1l 2jJ1qPR7iH6kfSKo25CTPMR+lQjhK7J7u+I64t4u9LajMgIHH8PwWfSqDx3cmV943YuOmgyM+qY97 nMHDHiFD6AqJxGQxRbK/JXkOrsOotemKFe4ShiqiiBvDBbR65N+Cba85LheByffH2urBe3nQE4QY6 VgMMesX6BA+rGFIki+qnNFO7Zxg8FX6snkK2Ittu/DYWhMrnOiOMfOBy6pCeh4vSsRwS4hGz27Q4t dyu0+FDOk3wkWJkKGDm8gA==; Date: Fri, 29 Aug 2025 16:36:51 +0300 Message-Id: <86a53is0zw.fsf@HIDDEN> From: Eli Zaretskii <eliz@HIDDEN> To: Spencer Baugh <sbaugh@HIDDEN> In-Reply-To: <ierqzwu6ztl.fsf@HIDDEN> (message from Spencer Baugh on Fri, 29 Aug 2025 09:07:50 -0400) Subject: Re: bug#79334: [PATCH] Don't release thread select lock unnecessarily References: <ierwm6n6tno.fsf@HIDDEN> <86cy8es4bo.fsf@HIDDEN> <ierqzwu6ztl.fsf@HIDDEN> X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 79334 Cc: 79334 <at> debbugs.gnu.org, eggert@HIDDEN, dmitry@HIDDEN X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -3.3 (---) > From: Spencer Baugh <sbaugh@HIDDEN> > Cc: Paul Eggert <eggert@HIDDEN>, 79334 <at> debbugs.gnu.org, Dmitry Gutov > <dmitry@HIDDEN> > Date: Fri, 29 Aug 2025 09:07:50 -0400 > > Eli Zaretskii <eliz@HIDDEN> writes: > > > However, I don't think I follow the logic behind this change. I agree > > that status_notify does things that can be dangerous (more about that > > below), but the call to status_notify will happen whether we call > > thread_select or pselect directly, so if status_notify can do some > > damage, it will do that regardless of whether we call pselect directly > > or not. I think you assume that the problematic call to status_notify > > will happen in another thread, if we allow switching threads at this > > point, is that right? IOW, I think you assume that, if we switch to > > another thread, that other thread could call status_notify, and thus > > close some descriptors on which the current thread, the one which > > calls thread_select at this point, is waiting. If that is your > > assumption, then the actual situation could also be the opposite: the > > current thread, the one where you want to call pselect, will call > > status_notify after pselect returns, and that will close some > > descriptors used by other threads. In that case, calling pselect > > directly will make things worse, not better, because it makes this > > closing of descriptors imminent. > > Yes. I think there's multiple problems in this code and I haven't > tracked them all down yet. > > That being said, I'm confident that it's incorrect and unnecessary to be > using thread_select instead of pselect here, so we should stop doing > that in addition to any other fix. If we fix status_notify, what other reasons remain for not calling thread_select there? > > So, while I agree we have a problem here, I think the place and the > > way we should solve it are different. > > > > Which brings us to status_notify. It indeed can close the descriptors > > of processes which terminated, so we should probably make it safer. > > For example, we could leave alone descriptors marked with non-NULL > > waiting_thread if that thread is still alive and is different from the > > current thread. Would that fix the problem? If yes, I think we > > should install such a change. > > That was my initial idea for a fix as well. But I'm not sure about the > specific details. Namely, perhaps we need to be checking waiting_thread > everywhere that we're doing FOR_EACH_PROCESS, not just in status_notify. > And maybe some new functions/macros should be added to simplify this; > maybe a version of FOR_EACH_PROCESS which only iterates over processes > for which waiting_thread == current_thread. You may be right, we should audit all the users of FOR_EACH_PROCESS. At least one of them (get-buffer-process) is probably okay as it is, but others might need to avoid looking at processes locked to other threads, indeed. > > We should perhaps also understand better what happens with the current > > code if and when descriptors are closed by some other thread in this > > scenario. A thread using such a descriptor could be either stuck in > > the pselect call, or it could be before or after the call to pselect. > > Paul, what happens if a descriptor passed to pselect becomes closed > > while pselect waits? does pselect return with EBADF, or does pselect > > handle that gracefully, like by considering such descriptors to be > > never "ready"? > > > > If the thread is before or after pselect, then either calling pselect > > or attempting to read from a closed descriptor after pselect returns > > will cause EBADF. Is our code prepared to deal with such a situation > > (e.g., by ignoring the failed read), or will that cause some Lisp API > > to signal an error or produce unexpected results? If the latter, we > > should probably handle EBADF specially, because I don't believe we > > could completely prevent this from happening. > > (I'm more hopeful: I think we can completely prevent this from > happening, we just need to be more careful about our locking.) If we can prevent that completely, it's even better.
bug-gnu-emacs@HIDDEN:bug#79334; Package emacs.
Full text available.Received: (at 79334) by debbugs.gnu.org; 29 Aug 2025 13:08:01 +0000 From debbugs-submit-bounces <at> debbugs.gnu.org Fri Aug 29 09:08:01 2025 Received: from localhost ([127.0.0.1]:42079 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1urypk-0005qe-1B for submit <at> debbugs.gnu.org; Fri, 29 Aug 2025 09:08:01 -0400 Received: from mxout5.mail.janestreet.com ([64.215.233.18]:44385) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from <sbaugh@HIDDEN>) id 1urypg-0005pS-Cw for 79334 <at> debbugs.gnu.org; Fri, 29 Aug 2025 09:07:57 -0400 From: Spencer Baugh <sbaugh@HIDDEN> To: Eli Zaretskii <eliz@HIDDEN> Subject: Re: bug#79334: [PATCH] Don't release thread select lock unnecessarily In-Reply-To: <86cy8es4bo.fsf@HIDDEN> (Eli Zaretskii's message of "Fri, 29 Aug 2025 15:24:59 +0300") References: <ierwm6n6tno.fsf@HIDDEN> <86cy8es4bo.fsf@HIDDEN> Date: Fri, 29 Aug 2025 09:07:50 -0400 Message-ID: <ierqzwu6ztl.fsf@HIDDEN> User-Agent: Gnus/5.13 (Gnus v5.13) MIME-Version: 1.0 Content-Type: text/plain DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=janestreet.com; s=waixah; t=1756472870; bh=9nk+1fqBMoMPmgli+EJG7Ou/LigbjYj07QqRvUfgPB0=; h=From:To:Cc:Subject:In-Reply-To:References:Date; b=ZHDmYO/AcuoBh9mXcr+309GYVkxFuBbmY9RKc6Fyv6LkVsz+4Ep5zcuwQiriBX5JB +yCu7+yPTZxV9SHKNds/9+JIU/ik8D/k1rIa+7zmNCyigRDAWT5v7IJM8L1dGrL1Jn 74Jn52/evTYEgcBJjnEc6rvZnXOf7VV7lJmuGrwP+KVwynS6myX549r4HXexTXMJJT vcDmWlNNKPMyCefu4cJOlZiUMuiyNB8923qb2EdAPCvJTbYUO7Vu8w53AB2wnbFrgz 7C4lAH+okQetca2iI6Xx0KXWLBCYFPHuthcv6tuBT7RjI3X3f8e0qc+/SCJPrOvZz/ A7+pTM0z1lgJw== X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 79334 Cc: 79334 <at> debbugs.gnu.org, Paul Eggert <eggert@HIDDEN>, Dmitry Gutov <dmitry@HIDDEN> X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -3.3 (---) Eli Zaretskii <eliz@HIDDEN> writes: >> From: Spencer Baugh <sbaugh@HIDDEN> >> Date: Thu, 28 Aug 2025 17:08:43 -0400 >> >> Previously, we were using thread_select to release the thread >> lock around two different calls to select in >> wait_reading_process_output. The main call to select was fine, >> but we also used thread_select for a pselect call which returns >> immediately (with a timeout of 0) and is supposed to just check >> if any file descriptors are active. >> >> If we actually thread switch at that pselect, it's likely to >> break internal state for Emacs: namely, the call to >> status_notify immediately after can close file descriptors which >> another thread is selecting on, causing that thread to get EBADF >> from select and then call emacs_abort. >> >> We don't need to thread switch here, so don't. >> >> * src/process.c (wait_reading_process_output): Remove >> unnecessary thread_select wrapper. > > Thanks. > > However, I don't think I follow the logic behind this change. I agree > that status_notify does things that can be dangerous (more about that > below), but the call to status_notify will happen whether we call > thread_select or pselect directly, so if status_notify can do some > damage, it will do that regardless of whether we call pselect directly > or not. I think you assume that the problematic call to status_notify > will happen in another thread, if we allow switching threads at this > point, is that right? IOW, I think you assume that, if we switch to > another thread, that other thread could call status_notify, and thus > close some descriptors on which the current thread, the one which > calls thread_select at this point, is waiting. If that is your > assumption, then the actual situation could also be the opposite: the > current thread, the one where you want to call pselect, will call > status_notify after pselect returns, and that will close some > descriptors used by other threads. In that case, calling pselect > directly will make things worse, not better, because it makes this > closing of descriptors imminent. Yes. I think there's multiple problems in this code and I haven't tracked them all down yet. That being said, I'm confident that it's incorrect and unnecessary to be using thread_select instead of pselect here, so we should stop doing that in addition to any other fix. > So, while I agree we have a problem here, I think the place and the > way we should solve it are different. > > Which brings us to status_notify. It indeed can close the descriptors > of processes which terminated, so we should probably make it safer. > For example, we could leave alone descriptors marked with non-NULL > waiting_thread if that thread is still alive and is different from the > current thread. Would that fix the problem? If yes, I think we > should install such a change. That was my initial idea for a fix as well. But I'm not sure about the specific details. Namely, perhaps we need to be checking waiting_thread everywhere that we're doing FOR_EACH_PROCESS, not just in status_notify. And maybe some new functions/macros should be added to simplify this; maybe a version of FOR_EACH_PROCESS which only iterates over processes for which waiting_thread == current_thread. > We should perhaps also understand better what happens with the current > code if and when descriptors are closed by some other thread in this > scenario. A thread using such a descriptor could be either stuck in > the pselect call, or it could be before or after the call to pselect. > Paul, what happens if a descriptor passed to pselect becomes closed > while pselect waits? does pselect return with EBADF, or does pselect > handle that gracefully, like by considering such descriptors to be > never "ready"? > > If the thread is before or after pselect, then either calling pselect > or attempting to read from a closed descriptor after pselect returns > will cause EBADF. Is our code prepared to deal with such a situation > (e.g., by ignoring the failed read), or will that cause some Lisp API > to signal an error or produce unexpected results? If the latter, we > should probably handle EBADF specially, because I don't believe we > could completely prevent this from happening. (I'm more hopeful: I think we can completely prevent this from happening, we just need to be more careful about our locking.)
bug-gnu-emacs@HIDDEN:bug#79334; Package emacs.
Full text available.Received: (at 79334) by debbugs.gnu.org; 29 Aug 2025 12:25:18 +0000 From debbugs-submit-bounces <at> debbugs.gnu.org Fri Aug 29 08:25:17 2025 Received: from localhost ([127.0.0.1]:41981 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1uryAL-00085W-Te for submit <at> debbugs.gnu.org; Fri, 29 Aug 2025 08:25:17 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:52872) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from <eliz@HIDDEN>) id 1uryAF-00083q-TB for 79334 <at> debbugs.gnu.org; Fri, 29 Aug 2025 08:25:11 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from <eliz@HIDDEN>) id 1uryA9-0008SE-FW; Fri, 29 Aug 2025 08:25:01 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org; s=fencepost-gnu-org; h=References:Subject:In-Reply-To:To:From:Date: mime-version; bh=A94arxGB1YohBT1rPRd0MPJzmHkTduP0TX8/MPX2v5o=; b=ZdbZ5Nn/xJrU aEk4NFcaVzfyRtuxuqfgddmGDvvJI4fKv05DORjADj5ov2y5jhfQ/olqwx2jXpBDDOvlZXmPF9lZ0 kGsWEXWSVSCcanJtawOz4IovlmlyUP1uHYbyAxDhiukGI+DbvOSSFAh3xYtYT3e/ojkwrD3Kr/PLQ PuOHu8r6vTCl6HfoAVtW15yVDcFLbL+DJeyCb0CAOmvnzDuUCIVSefIi58FRDRxBGMB4p6R5zqJTA /JqHx4tAMWdWyK6Kg9mzaLDEVohACjlV1+u9axBAsEg6+USHxU2SliFLN2mP0ytk+Tl+jw9VzOzzi oDR/mVAWikbfFGOmP5kHrA==; Date: Fri, 29 Aug 2025 15:24:59 +0300 Message-Id: <86cy8es4bo.fsf@HIDDEN> From: Eli Zaretskii <eliz@HIDDEN> To: Spencer Baugh <sbaugh@HIDDEN>, Paul Eggert <eggert@HIDDEN> In-Reply-To: <ierwm6n6tno.fsf@HIDDEN> (message from Spencer Baugh on Thu, 28 Aug 2025 17:08:43 -0400) Subject: Re: [PATCH] Don't release thread select lock unnecessarily References: <ierwm6n6tno.fsf@HIDDEN> X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 79334 Cc: 79334 <at> debbugs.gnu.org X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -3.3 (---) > From: Spencer Baugh <sbaugh@HIDDEN> > Date: Thu, 28 Aug 2025 17:08:43 -0400 > > Previously, we were using thread_select to release the thread > lock around two different calls to select in > wait_reading_process_output. The main call to select was fine, > but we also used thread_select for a pselect call which returns > immediately (with a timeout of 0) and is supposed to just check > if any file descriptors are active. > > If we actually thread switch at that pselect, it's likely to > break internal state for Emacs: namely, the call to > status_notify immediately after can close file descriptors which > another thread is selecting on, causing that thread to get EBADF > from select and then call emacs_abort. > > We don't need to thread switch here, so don't. > > * src/process.c (wait_reading_process_output): Remove > unnecessary thread_select wrapper. Thanks. However, I don't think I follow the logic behind this change. I agree that status_notify does things that can be dangerous (more about that below), but the call to status_notify will happen whether we call thread_select or pselect directly, so if status_notify can do some damage, it will do that regardless of whether we call pselect directly or not. I think you assume that the problematic call to status_notify will happen in another thread, if we allow switching threads at this point, is that right? IOW, I think you assume that, if we switch to another thread, that other thread could call status_notify, and thus close some descriptors on which the current thread, the one which calls thread_select at this point, is waiting. If that is your assumption, then the actual situation could also be the opposite: the current thread, the one where you want to call pselect, will call status_notify after pselect returns, and that will close some descriptors used by other threads. In that case, calling pselect directly will make things worse, not better, because it makes this closing of descriptors imminent. So, while I agree we have a problem here, I think the place and the way we should solve it are different. Which brings us to status_notify. It indeed can close the descriptors of processes which terminated, so we should probably make it safer. For example, we could leave alone descriptors marked with non-NULL waiting_thread if that thread is still alive and is different from the current thread. Would that fix the problem? If yes, I think we should install such a change. We should perhaps also understand better what happens with the current code if and when descriptors are closed by some other thread in this scenario. A thread using such a descriptor could be either stuck in the pselect call, or it could be before or after the call to pselect. Paul, what happens if a descriptor passed to pselect becomes closed while pselect waits? does pselect return with EBADF, or does pselect handle that gracefully, like by considering such descriptors to be never "ready"? If the thread is before or after pselect, then either calling pselect or attempting to read from a closed descriptor after pselect returns will cause EBADF. Is our code prepared to deal with such a situation (e.g., by ignoring the failed read), or will that cause some Lisp API to signal an error or produce unexpected results? If the latter, we should probably handle EBADF specially, because I don't believe we could completely prevent this from happening.
bug-gnu-emacs@HIDDEN:bug#79334; Package emacs.
Full text available.
Received: (at submit) by debbugs.gnu.org; 28 Aug 2025 21:09:11 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Thu Aug 28 17:09:11 2025
Received: from localhost ([127.0.0.1]:40303 helo=debbugs.gnu.org)
by debbugs.gnu.org with esmtp (Exim 4.84_2)
(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
id 1urjrn-0000Mb-CD
for submit <at> debbugs.gnu.org; Thu, 28 Aug 2025 17:09:10 -0400
Received: from lists.gnu.org ([2001:470:142::17]:53598)
by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
(Exim 4.84_2) (envelope-from <sbaugh@HIDDEN>)
id 1urjrb-0000K2-2Z
for submit <at> debbugs.gnu.org; Thu, 28 Aug 2025 17:09:02 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10])
by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
(Exim 4.90_1) (envelope-from <sbaugh@HIDDEN>)
id 1urjrV-0007cx-G4
for bug-gnu-emacs@HIDDEN; Thu, 28 Aug 2025 17:08:49 -0400
Received: from mxout5.mail.janestreet.com ([64.215.233.18])
by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
(Exim 4.90_1) (envelope-from <sbaugh@HIDDEN>)
id 1urjrS-0006yi-Hf
for bug-gnu-emacs@HIDDEN; Thu, 28 Aug 2025 17:08:49 -0400
From: Spencer Baugh <sbaugh@HIDDEN>
To: bug-gnu-emacs@HIDDEN
Subject: [PATCH] Don't release thread select lock unnecessarily
X-Debbugs-Cc: Dmitry Gutov <dmitry@HIDDEN>
Date: Thu, 28 Aug 2025 17:08:43 -0400
Message-ID: <ierwm6n6tno.fsf@HIDDEN>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=janestreet.com;
s=waixah; t=1756415324;
bh=oG6k7arttFbIy05aU6tB9IVuhq7uUKY56UKJ/ZvUbhw=;
h=From:To:Subject:Date;
b=FmijEXYwTC3zYww9Ot1knKLIt5dOgbBu5C8n6nB4eThxiXbt2u065WfO/Ihhf2c0M
FNUhK6MaFrjBI8ZbDs44PcAzaLwzBPnJ8YePPd/roklIEnY59NV5k2y7R5VaYzsIyZ
4sa3UmX8bdr5zaYwq4c1n701LUiUnqXzFzPts6GvHy+ymxlDy8qZINZjvz/Z5AG0Fj
fUmKia1R5AcjseqKlhnlyhKaE2LHC6zu4nqClpYW4KLXkTfgmsrPEr5v0yqrxKm8YC
FeRi3erFJz6f5FI+G7sCARZRXfwDVVBXq3bfmzbEjKOnG63r4UeFSlt7iCY8MF6xmZ
b6G8cyW2Zvzxw==
Received-SPF: pass client-ip=64.215.233.18; envelope-from=sbaugh@HIDDEN;
helo=mxout5.mail.janestreet.com
X-Spam_score_int: -43
X-Spam_score: -4.4
X-Spam_bar: ----
X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1,
DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1,
RCVD_IN_DNSWL_MED=-2.3, RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001,
RCVD_IN_VALIDITY_CERTIFIED_BLOCKED=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001,
SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-Spam-Score: 0.9 (/)
X-Debbugs-Envelope-To: submit
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>,
<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>,
<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -0.1 (/)
--=-=-=
Content-Type: text/plain
Tags: patch
Previously, we were using thread_select to release the thread
lock around two different calls to select in
wait_reading_process_output. The main call to select was fine,
but we also used thread_select for a pselect call which returns
immediately (with a timeout of 0) and is supposed to just check
if any file descriptors are active.
If we actually thread switch at that pselect, it's likely to
break internal state for Emacs: namely, the call to
status_notify immediately after can close file descriptors which
another thread is selecting on, causing that thread to get EBADF
from select and then call emacs_abort.
We don't need to thread switch here, so don't.
* src/process.c (wait_reading_process_output): Remove
unnecessary thread_select wrapper.
In GNU Emacs 30.1.90 (build 29, x86_64-pc-linux-gnu, X toolkit, cairo
version 1.15.12, Xaw scroll bars) of 2025-08-26 built on
igm-qws-u22796a
Repository revision: 54857fe2fb0ed033afcba231f920f8f7fa185333
Repository branch: emacs-30
Windowing system distributor 'The X.Org Foundation', version 11.0.12011000
System Description: Rocky Linux 8.10 (Green Obsidian)
Configured using:
'configure --with-x-toolkit=lucid --without-gpm --without-gconf
--without-selinux --without-imagemagick --with-modules --with-gif=no
--with-cairo --with-rsvg --without-compress-install --with-tree-sitter
--with-native-compilation=aot
PKG_CONFIG_PATH=/usr/local/home/garnish/libtree-sitter/0.22.6-1/lib/pkgconfig/'
--=-=-=
Content-Type: text/patch
Content-Disposition: attachment;
filename=0001-Don-t-release-thread-select-lock-unnecessarily.patch
From 68e7ed37036498a0df0232196c50cced38cdb751 Mon Sep 17 00:00:00 2001
From: Spencer Baugh <sbaugh@HIDDEN>
Date: Thu, 28 Aug 2025 17:06:56 -0400
Subject: [PATCH] Don't release thread select lock unnecessarily
Previously, we were using thread_select to release the thread
lock around two different calls to select in
wait_reading_process_output. The main call to select was fine,
but we also used thread_select for a pselect call which returns
immediately (with a timeout of 0) and is supposed to just check
if any file descriptors are active.
If we actually thread switch at that pselect, it's likely to
break internal state for Emacs: namely, the call to
status_notify immediately after can close file descriptors which
another thread is selecting on, causing that thread to get EBADF
from select and then call emacs_abort.
We don't need to thread switch here, so don't.
* src/process.c (wait_reading_process_output): Remove
unnecessary thread_select wrapper.
---
src/process.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/src/process.c b/src/process.c
index c8b70a4174c..b33bebb9a35 100644
--- a/src/process.c
+++ b/src/process.c
@@ -5513,11 +5513,13 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
if (0 <= fd)
FD_CLR (fd, &Atemp);
+ /* We're just checking to see if any of these fds are active,
+ so make pselect return immediately. */
timeout = make_timespec (0, 0);
- if ((thread_select (pselect, max_desc + 1,
- &Atemp,
- (num_pending_connects > 0 ? &Ctemp : NULL),
- NULL, &timeout, NULL)
+ if ((pselect (max_desc + 1,
+ &Atemp,
+ (num_pending_connects > 0 ? &Ctemp : NULL),
+ NULL, &timeout, NULL)
<= 0))
{
/* It's okay for us to do this and then continue with
--
2.43.7
--=-=-=--
Spencer Baugh <sbaugh@HIDDEN>:dmitry@HIDDEN, bug-gnu-emacs@HIDDEN.
Full text available.dmitry@HIDDEN, bug-gnu-emacs@HIDDEN:bug#79334; Package emacs.
Full text available.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997 nCipher Corporation Ltd,
1994-97 Ian Jackson.