GNU bug report logs - #48468
substitute server connection timeout


Package: guix; Severity: important; Reported by: Mathieu Othacehe <othacehe@HIDDEN>; dated Sun, 16 May 2021 17:58:02 UTC; Maintainer for guix is bug-guix@HIDDEN.
Severity set to 'important' from 'normal'. Request was from Ludovic Courtès <ludo@HIDDEN> to control <at> debbugs.gnu.org.

Message received at 48468 <at> debbugs.gnu.org:


Received: (at 48468) by debbugs.gnu.org; 21 May 2021 13:30:31 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Fri May 21 09:30:30 2021
Received: from localhost ([127.0.0.1]:36065 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1lk5EU-0001jq-Jk
	for submit <at> debbugs.gnu.org; Fri, 21 May 2021 09:30:30 -0400
Received: from eggs.gnu.org ([209.51.188.92]:38876)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <othacehe@HIDDEN>) id 1lk5EM-0001jV-EN
 for 48468 <at> debbugs.gnu.org; Fri, 21 May 2021 09:30:28 -0400
Received: from fencepost.gnu.org ([2001:470:142:3::e]:43390)
 by eggs.gnu.org with esmtp (Exim 4.90_1)
 (envelope-from <othacehe@HIDDEN>)
 id 1lk5EH-00057D-04; Fri, 21 May 2021 09:30:17 -0400
Received: from [2a01:e0a:19b:d9a0:f2f7:a404:c3d3:f8b4] (port=50402 helo=meije)
 by fencepost.gnu.org with esmtpsa
 (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1)
 (envelope-from <othacehe@HIDDEN>)
 id 1lk5EG-0003ny-MQ; Fri, 21 May 2021 09:30:16 -0400
From: Mathieu Othacehe <othacehe@HIDDEN>
To: Christopher Baines <mail@HIDDEN>
Subject: Re: bug#48468: substitute server connection timeout
References: <87lf8e4l42.fsf@HIDDEN> <87eee6cz75.fsf@HIDDEN>
 <87eee59zzm.fsf@HIDDEN> <878s4c85zu.fsf@HIDDEN>
Date: Fri, 21 May 2021 15:30:15 +0200
In-Reply-To: <878s4c85zu.fsf@HIDDEN> (Mathieu Othacehe's message of "Tue, 18
 May 2021 16:35:17 +0200")
Message-ID: <87cztkqkns.fsf@HIDDEN>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: 48468
Cc: ludo@HIDDEN, 48468 <at> debbugs.gnu.org
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -3.3 (---)


Hey,

I posted a patchset adding keep-alive support to guix publish earlier:
https://issues.guix.gnu.org/48556.

Thanks,

Mathieu




Information forwarded to bug-guix@HIDDEN:
bug#48468; Package guix. Full text available.

Message received at 48468 <at> debbugs.gnu.org:


Received: (at 48468) by debbugs.gnu.org; 18 May 2021 14:35:34 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Tue May 18 10:35:34 2021
Received: from localhost ([127.0.0.1]:56847 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1lj0oo-0007lD-C0
	for submit <at> debbugs.gnu.org; Tue, 18 May 2021 10:35:34 -0400
Received: from eggs.gnu.org ([209.51.188.92]:57368)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <othacehe@HIDDEN>) id 1lj0ol-0007kl-Bf
 for 48468 <at> debbugs.gnu.org; Tue, 18 May 2021 10:35:31 -0400
Received: from fencepost.gnu.org ([2001:470:142:3::e]:37814)
 by eggs.gnu.org with esmtp (Exim 4.90_1)
 (envelope-from <othacehe@HIDDEN>)
 id 1lj0of-0005tX-Go; Tue, 18 May 2021 10:35:25 -0400
Received: from [2a01:e0a:19b:d9a0:f2f7:a404:c3d3:f8b4] (port=34756 helo=meije)
 by fencepost.gnu.org with esmtpsa
 (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1)
 (envelope-from <othacehe@HIDDEN>)
 id 1lj0oe-0002Rn-CL; Tue, 18 May 2021 10:35:24 -0400
From: Mathieu Othacehe <othacehe@HIDDEN>
To: Christopher Baines <mail@HIDDEN>
Subject: Re: bug#48468: substitute server connection timeout
References: <87lf8e4l42.fsf@HIDDEN> <87eee6cz75.fsf@HIDDEN>
 <87eee59zzm.fsf@HIDDEN>
Date: Tue, 18 May 2021 16:35:17 +0200
In-Reply-To: <87eee59zzm.fsf@HIDDEN> (Mathieu Othacehe's message of "Mon, 17
 May 2021 16:49:49 +0200")
Message-ID: <878s4c85zu.fsf@HIDDEN>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: 48468
Cc: Ludovic Courtès <ludo@HIDDEN>,
 bug-guix@HIDDEN, 48468 <at> debbugs.gnu.org


Hey,

> I'll have a closer look, thanks for your help.

So this snippet in the http-write procedure of the (guix scripts
publish) module:

--8<---------------cut here---------------start------------->8---
          (swallow-zlib-error
           (close-port port))
--8<---------------cut here---------------end--------------->8---

is closing the client port unconditionally, which means that guix
publish cannot keep connections alive, unless sitting behind an Nginx
proxy.

I'm trying to turn the close-port call into a maybe-close-port with the
following procedure:

--8<---------------cut here---------------start------------->8---
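;; Meant as an inner definition of `http-write': `server' and `response'
;; are assumed to be bound by the enclosing procedure.
(define (maybe-close-port port)
  (cond
   ((keep-alive? response)
    ;; Hand the port back to the server's poll set so it can be reused.
    (poll-set-add! (http-poll-set server) port *events*))
   (else
    (close-port port))))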
--8<---------------cut here---------------end--------------->8---

However, this is terribly hacky, as I need to access the private poll-set
from (web server http).
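
For readers following along, the idea behind maybe-close-port can be
sketched outside Guile. The following is a minimal Python illustration,
not the actual patch: `finish_response` and its arguments are
hypothetical names, and Python's `selectors` module stands in for the
poll set of (web server http). It assumes the event loop stopped
watching the client socket while the request was being handled.

```python
import selectors
import socket


def finish_response(selector, conn, keep_alive):
    """Called once a response has been fully written to CONN.

    Either hand the client socket back to the event loop (keep-alive)
    or close it, which is what an unconditional close always does.
    """
    if keep_alive:
        # Watch the socket again so the next request on the same
        # connection wakes up the event loop.
        selector.register(conn, selectors.EVENT_READ)
    else:
        conn.close()


if __name__ == "__main__":
    sel = selectors.DefaultSelector()
    client, served = socket.socketpair()

    # Keep-alive: a second request on the same socket is noticed.
    finish_response(sel, served, keep_alive=True)
    client.sendall(b"GET /next HTTP/1.1\r\n\r\n")
    ready = [key.fileobj for key, _ in sel.select(timeout=1)]
    print("socket re-armed:", ready == [served])

    # No keep-alive: stop watching the socket, then close it.
    sel.unregister(served)
    finish_response(sel, served, keep_alive=False)
    print("socket closed:", served.fileno() == -1)
    client.close()
    sel.close()
```

The point of the sketch is only that the "keep" branch needs a handle
on the event loop's watch set, which is exactly the piece that is
private in the Guile module.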

Ludo, do you have a better idea?

Thanks,

Mathieu




Information forwarded to bug-guix@HIDDEN:
bug#48468; Package guix. Full text available.

Message received at 48468 <at> debbugs.gnu.org:


Received: (at 48468) by debbugs.gnu.org; 17 May 2021 14:50:10 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Mon May 17 10:50:10 2021
Received: from localhost ([127.0.0.1]:53399 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1lieZK-00085F-9k
	for submit <at> debbugs.gnu.org; Mon, 17 May 2021 10:50:10 -0400
Received: from eggs.gnu.org ([209.51.188.92]:40592)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <othacehe@HIDDEN>) id 1lieZE-00084J-FE
 for 48468 <at> debbugs.gnu.org; Mon, 17 May 2021 10:50:04 -0400
Received: from fencepost.gnu.org ([2001:470:142:3::e]:45226)
 by eggs.gnu.org with esmtp (Exim 4.90_1)
 (envelope-from <othacehe@HIDDEN>)
 id 1lieZ8-0003sN-6g; Mon, 17 May 2021 10:49:55 -0400
Received: from [109.190.253.11] (port=55430 helo=meije)
 by fencepost.gnu.org with esmtpsa (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <othacehe@HIDDEN>)
 id 1lieZ7-0002KA-Mq; Mon, 17 May 2021 10:49:54 -0400
From: Mathieu Othacehe <othacehe@HIDDEN>
To: Christopher Baines <mail@HIDDEN>
Subject: Re: bug#48468: substitute server connection timeout
References: <87lf8e4l42.fsf@HIDDEN> <87eee6cz75.fsf@HIDDEN>
Date: Mon, 17 May 2021 16:49:49 +0200
In-Reply-To: <87eee6cz75.fsf@HIDDEN> (Christopher Baines's message of
 "Sun, 16 May 2021 19:26:22 +0100")
Message-ID: <87eee59zzm.fsf@HIDDEN>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: 48468
Cc: bug-guix@HIDDEN, 48468 <at> debbugs.gnu.org


Hey,

> That's on the server side, the actual problem is probably on the client
> side, as I guess there are possibly places where closed connections
> aren't handled properly. This reminds me I sent some patches relating to
> closing connections, this could well be related [1].

Oh, you're right, the Nginx server probably makes the difference
here. The http-write procedure of the Guile (web server http) module
seems to handle keep-alive connections. However, the wrapping http-write
procedure of (guix scripts publish) may not.
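
For reference, the rule a server has to apply before deciding whether to
close the client port can be sketched as follows. This is a minimal
Python illustration, not the Guile code: the function name
`wants_keep_alive` and its arguments are hypothetical, and it only
encodes the usual HTTP/1.x persistence defaults (HTTP/1.1 is persistent
unless `Connection: close` is given, HTTP/1.0 is persistent only with
`Connection: keep-alive`).

```python
def wants_keep_alive(version, connection_header=""):
    """Return True if the connection should stay open after the response.

    version: (major, minor) tuple, e.g. (1, 1).
    connection_header: raw value of the Connection header, possibly empty.
    """
    # Connection is a comma-separated, case-insensitive list of tokens.
    tokens = {t.strip().lower()
              for t in connection_header.split(",") if t.strip()}
    if version == (1, 1):
        # HTTP/1.1: persistent by default, unless "close" is requested.
        return "close" not in tokens
    if version == (1, 0):
        # HTTP/1.0: closed by default, unless keep-alive is requested.
        return "keep-alive" in tokens
    # Other versions are outside the scope of this sketch.
    return False
```

A wrapper that closes the port without consulting such a predicate
turns every exchange into a one-shot connection, whatever the client
asked for.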

I'll have a closer look, thanks for your help.

Mathieu




Information forwarded to bug-guix@HIDDEN:
bug#48468; Package guix. Full text available.

Message received at 48468 <at> debbugs.gnu.org:


Received: (at 48468) by debbugs.gnu.org; 16 May 2021 18:26:30 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Sun May 16 14:26:30 2021
Received: from localhost ([127.0.0.1]:51455 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1liLTB-0002mu-Ma
	for submit <at> debbugs.gnu.org; Sun, 16 May 2021 14:26:30 -0400
Received: from mira.cbaines.net ([212.71.252.8]:38702)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <mail@HIDDEN>) id 1liLT9-0002ml-B0
 for 48468 <at> debbugs.gnu.org; Sun, 16 May 2021 14:26:27 -0400
Received: from localhost (unknown [IPv6:2a02:8010:68c1:0:8ac0:b4c7:f5c8:7caa])
 by mira.cbaines.net (Postfix) with ESMTPSA id 64FE527BC78;
 Sun, 16 May 2021 19:26:26 +0100 (BST)
Received: from capella (localhost [127.0.0.1])
 by localhost (OpenSMTPD) with ESMTP id 835715d7;
 Sun, 16 May 2021 18:26:25 +0000 (UTC)
References: <87lf8e4l42.fsf@HIDDEN>
User-agent: mu4e 1.4.15; emacs 27.1
From: Christopher Baines <mail@HIDDEN>
To: Mathieu Othacehe <othacehe@HIDDEN>
Subject: Re: bug#48468: substitute server connection timeout
In-reply-to: <87lf8e4l42.fsf@HIDDEN>
Date: Sun, 16 May 2021 19:26:22 +0100
Message-ID: <87eee6cz75.fsf@HIDDEN>
MIME-Version: 1.0
Content-Type: multipart/signed; boundary="=-=-=";
 micalg=pgp-sha512; protocol="application/pgp-signature"
X-Spam-Score: 0.0 (/)
X-Debbugs-Envelope-To: 48468
Cc: bug-guix@HIDDEN, 48468 <at> debbugs.gnu.org


Mathieu Othacehe <othacehe@HIDDEN> writes:

> Hello,
>
> We have recently been seeing a lot of these errors on Cuirass:
>
> --8<---------------cut here---------------start------------->8---
> guix substitute: warning: while fetching http://141.80.167.131:5557/nar/g7ka09613k5v1vlznh87yg35905ggw51-python2-scipy-1.2.2-guile-builder: server is somewhat slow
> guix substitute: warning: try `--no-substitutes' if the problem persists
> guix substitute: error: connect*: Connection timed out
> --8<---------------cut here---------------end--------------->8---
>
> which means that the workers are failing to connect to the Cuirass
> remote-server publish server on berlin at 141.80.167.131:5557.
>
> Stracing this publish server shows that connection reuse seems to be
> broken:
>
> --8<---------------cut here---------------start------------->8---
> accept4(9, {sa_family=AF_INET, sin_port=htons(41742), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
> accept4(9, {sa_family=AF_INET, sin_port=htons(41744), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
> accept4(9, {sa_family=AF_INET, sin_port=htons(41746), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 25
> accept4(9, {sa_family=AF_INET, sin_port=htons(41748), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 24
> accept4(9, {sa_family=AF_INET, sin_port=htons(41750), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
> accept4(9, {sa_family=AF_INET, sin_port=htons(41752), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
> accept4(9, {sa_family=AF_INET, sin_port=htons(41754), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 25
> accept4(9, {sa_family=AF_INET, sin_port=htons(41756), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
> accept4(9, {sa_family=AF_INET, sin_port=htons(41758), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 26
> accept4(9, {sa_family=AF_INET, sin_port=htons(41760), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 24
> accept4(9, {sa_family=AF_INET, sin_port=htons(41762), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
> accept4(9, {sa_family=AF_INET, sin_port=htons(41764), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
> accept4(9, {sa_family=AF_INET, sin_port=htons(41766), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
> accept4(9, {sa_family=AF_INET, sin_port=htons(41768), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 22
> accept4(9, {sa_family=AF_INET, sin_port=htons(41770), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
> accept4(9, {sa_family=AF_INET, sin_port=htons(41772), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
> accept4(9, {sa_family=AF_INET, sin_port=htons(41774), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
> accept4(9, {sa_family=AF_INET, sin_port=htons(41776), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
> accept4(9, {sa_family=AF_INET, sin_port=htons(41778), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
> accept4(9, {sa_family=AF_INET, sin_port=htons(41780), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
> accept4(9, {sa_family=AF_INET, sin_port=htons(41782), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
> --8<---------------cut here---------------end--------------->8---
>
> Investigating this, I found that the connection is closed and reopened
> multiple times in the call-with-cached-connection procedure of the (guix
> scripts substitute) module.
>
> It looks like it's because a 'bad-header exception is raised when trying
> to parse an eof object:
>
> --8<---------------cut here---------------start------------->8---
> ;;; (error bad-header (read-header-line #<eof>))
> --8<---------------cut here---------------end--------------->8---
>
> I'm not sure where this eof comes from. There is this comment in the
> http-multiple-get procedure in (guix http-client):
>
> --8<---------------cut here---------------start------------->8---
> ;; Swallow networking errors that could occur due to connection reuse
> ;; and the like; they will be handled down the road when trying to
> ;; read responses.
> (false-if-networking-error
>  (begin
>    (for-each (cut write-request <> buffer) batch)
>    (put-bytevector p (get))
>    (force-output p))))
> --8<---------------cut here---------------end--------------->8---
>
> which would suggest that connection reuse could cause networking errors?
>
> What also puzzles me is that the main guix publish server on berlin does
> not seem to present this issue. That would indicate that this error is
> caused by how the Cuirass remote-server publish server is started or
> configured.
>
> Ludo, Chris, any idea?

While I've been working in this area, I've actually been trying to pick
apart the connection caching, since the single thread assumption doesn't
hold in the Guix Build Coordinator.

Anyway, I do have a theory. Assuming I'm correct in saying that there's
no nginx between the client and publish server here, I think that's your
configuration difference.

For ci.guix.gnu.org, as well as data.guix.gnu.org, it's Nginx which is
keeping connections alive. I'm not sure the Guile code for the publish
server does similarly, so talking to it directly might be different.

That's on the server side, the actual problem is probably on the client
side, as I guess there are possibly places where closed connections
aren't handled properly. This reminds me I sent some patches relating to
closing connections, this could well be related [1].

1: https://issues.guix.gnu.org/47174
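
To make the client-side half of this theory concrete, here is a minimal
Python sketch of a connection cache talking to a server that closes
every connection after one response. It is an illustration only, not
the code of (guix scripts substitute) nor the patches in [1]: the names
`read_header`, `one_shot_server` and `CachedConnection` are all
hypothetical. Reading from the stale cached socket yields end-of-file
(or a reset), which is the situation behind the bad-header error on
`#<eof>` quoted above; the sketch drops the socket and retries once on
a fresh connection.

```python
import socket
import threading


def read_header(sock):
    """Read up to the blank line ending an HTTP header; None means EOF."""
    data = b""
    while b"\r\n\r\n" not in data:
        chunk = sock.recv(4096)
        if not chunk:
            return None  # peer closed the connection
        data += chunk
    return data


def one_shot_server(listener, n_connections):
    """Serve one request per connection, then close it (no keep-alive)."""
    for _ in range(n_connections):
        conn, _addr = listener.accept()
        with conn:
            if read_header(conn) is not None:
                conn.sendall(b"HTTP/1.1 204 No Content\r\n\r\n")


class CachedConnection:
    """Hypothetical client-side cache holding at most one open socket."""

    def __init__(self, address):
        self.address = address
        self.sock = None
        self.connects = 0  # number of TCP connections opened so far

    def get(self, path):
        request = f"GET {path} HTTP/1.1\r\nHost: example\r\n\r\n".encode()
        for _attempt in range(2):
            if self.sock is None:
                self.sock = socket.create_connection(self.address, timeout=5)
                self.connects += 1
            try:
                self.sock.sendall(request)
                header = read_header(self.sock)
            except OSError:
                header = None  # reset or broken pipe on a stale socket
            if header is not None:
                return header
            # EOF or error: the cached socket was stale, drop it and retry.
            self.sock.close()
            self.sock = None
        raise ConnectionError("no response after reconnecting")


if __name__ == "__main__":
    listener = socket.create_server(("127.0.0.1", 0))
    threading.Thread(target=one_shot_server, args=(listener, 2),
                     daemon=True).start()
    cache = CachedConnection(listener.getsockname())
    for path in ("/a", "/b"):
        print(path, cache.get(path).split(b"\r\n")[0])
    print("connections opened:", cache.connects)
```

Against such a server, every request after the first costs a failed
attempt plus a new TCP connection, which is consistent with the stream
of accept4 calls in the strace output.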





Information forwarded to bug-guix@HIDDEN:
bug#48468; Package guix. Full text available.

Message received at submit <at> debbugs.gnu.org:


Received: (at submit) by debbugs.gnu.org; 16 May 2021 18:26:32 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Sun May 16 14:26:32 2021
Received: from localhost ([127.0.0.1]:51458 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1liLTE-0002n9-5G
	for submit <at> debbugs.gnu.org; Sun, 16 May 2021 14:26:32 -0400
Received: from lists.gnu.org ([209.51.188.17]:40602)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <mail@HIDDEN>) id 1liLTD-0002n2-2G
 for submit <at> debbugs.gnu.org; Sun, 16 May 2021 14:26:31 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10]:51284)
 by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <mail@HIDDEN>) id 1liLTC-00028q-TL
 for bug-guix@HIDDEN; Sun, 16 May 2021 14:26:30 -0400
Received: from mira.cbaines.net ([212.71.252.8]:56692)
 by eggs.gnu.org with esmtp (Exim 4.90_1)
 (envelope-from <mail@HIDDEN>)
 id 1liLTA-0005Ms-Op; Sun, 16 May 2021 14:26:30 -0400
Received: from localhost (unknown [IPv6:2a02:8010:68c1:0:8ac0:b4c7:f5c8:7caa])
 by mira.cbaines.net (Postfix) with ESMTPSA id 64FE527BC78;
 Sun, 16 May 2021 19:26:26 +0100 (BST)
Received: from capella (localhost [127.0.0.1])
 by localhost (OpenSMTPD) with ESMTP id 835715d7;
 Sun, 16 May 2021 18:26:25 +0000 (UTC)
References: <87lf8e4l42.fsf@HIDDEN>
User-agent: mu4e 1.4.15; emacs 27.1
From: Christopher Baines <mail@HIDDEN>
To: Mathieu Othacehe <othacehe@HIDDEN>
Subject: Re: bug#48468: substitute server connection timeout
In-reply-to: <87lf8e4l42.fsf@HIDDEN>
Date: Sun, 16 May 2021 19:26:22 +0100
Message-ID: <87eee6cz75.fsf@HIDDEN>
MIME-Version: 1.0
Content-Type: multipart/signed; boundary="=-=-=";
 micalg=pgp-sha512; protocol="application/pgp-signature"
Received-SPF: pass client-ip=212.71.252.8; envelope-from=mail@HIDDEN;
 helo=mira.cbaines.net
X-Spam_score_int: -6
X-Spam_score: -0.7
X-Spam_bar: /
X-Spam_report: (-0.7 / 5.0 requ) BAYES_00=-1.9, NORMAL_HTTP_TO_IP=0.001,
 NUMERIC_HTTP_ADDR=1.242, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001,
 WEIRD_PORT=0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-Spam-Score: -1.4 (-)
X-Debbugs-Envelope-To: submit
Cc: bug-guix@HIDDEN, 48468 <at> debbugs.gnu.org
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -2.4 (--)

--=-=-=
Content-Type: text/plain


Mathieu Othacehe <othacehe@HIDDEN> writes:

> Hello,
>
> We recently have a lot of those errors on Cuirass:
>
> --8<---------------cut here---------------start------------->8---
> guix substitute: warning: while fetching http://141.80.167.131:5557/nar/g7ka09613k5v1vlznh87yg35905ggw51-python2-scipy-1.2.2-guile-builder: server is somewhat slow
> guix substitute: warning: try `--no-substitutes' if the problem persists
> guix substitute: error: connect*: Connection timed out
> --8<---------------cut here---------------end--------------->8---
>
> which means that the workers are failing to connect to the Cuirass
> remote-server publish server on berlin at 141.80.167.131:5557.
>
> Stracing this publish server shows that connection reuse seems to be
> broken:
>
> --8<---------------cut here---------------start------------->8---
> accept4(9, {sa_family=AF_INET, sin_port=htons(41742), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
> accept4(9, {sa_family=AF_INET, sin_port=htons(41744), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
> accept4(9, {sa_family=AF_INET, sin_port=htons(41746), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 25
> accept4(9, {sa_family=AF_INET, sin_port=htons(41748), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 24
> accept4(9, {sa_family=AF_INET, sin_port=htons(41750), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
> accept4(9, {sa_family=AF_INET, sin_port=htons(41752), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
> accept4(9, {sa_family=AF_INET, sin_port=htons(41754), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 25
> accept4(9, {sa_family=AF_INET, sin_port=htons(41756), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
> accept4(9, {sa_family=AF_INET, sin_port=htons(41758), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 26
> accept4(9, {sa_family=AF_INET, sin_port=htons(41760), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 24
> accept4(9, {sa_family=AF_INET, sin_port=htons(41762), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
> accept4(9, {sa_family=AF_INET, sin_port=htons(41764), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
> accept4(9, {sa_family=AF_INET, sin_port=htons(41766), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
> accept4(9, {sa_family=AF_INET, sin_port=htons(41768), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 22
> accept4(9, {sa_family=AF_INET, sin_port=htons(41770), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
> accept4(9, {sa_family=AF_INET, sin_port=htons(41772), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
> accept4(9, {sa_family=AF_INET, sin_port=htons(41774), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
> accept4(9, {sa_family=AF_INET, sin_port=htons(41776), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
> accept4(9, {sa_family=AF_INET, sin_port=htons(41778), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
> accept4(9, {sa_family=AF_INET, sin_port=htons(41780), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
> accept4(9, {sa_family=AF_INET, sin_port=htons(41782), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
> --8<---------------cut here---------------end--------------->8---
>
> Investigating it, I found that the connection is closed and opened
> multiple times in the call-with-cached-connection procedure of the (guix
> scripts substitute) module.
>
> It looks like it's because a 'bad-header exception is raised when trying
> to parse an eof object:
>
> --8<---------------cut here---------------start------------->8---
> ;;; (error bad-header (read-header-line #<eof>))
> --8<---------------cut here---------------end--------------->8---
>
> I'm not sure where this eof comes from. There is this comment in the
> http-multiple-get procedure in (guix http-client):
>
> --8<---------------cut here---------------start------------->8---
> ;; Swallow networking errors that could occur due to connection reuse
> ;; and the like; they will be handled down the road when trying to
> ;; read responses.
> (false-if-networking-error
>  (begin
>    (for-each (cut write-request <> buffer) batch)
>    (put-bytevector p (get))
>    (force-output p))))
> --8<---------------cut here---------------end--------------->8---
>
> which would suggest that connection reuse could cause networking errors?
>
> What also puzzles me is that the main guix publish server on berlin does
> not seem to present this issue. That would indicate that this error is
> caused by how the Cuirass remote-server publish server is started or
> configured.
>
> Ludo, Chris, any idea?

While I've been working in this area, I've actually been trying to pick
apart the connection caching, since the single-thread assumption doesn't
hold in the Guix Build Coordinator.

Anyway, I do have a theory. Assuming I'm correct in saying that there's
no nginx between the client and publish server here, I think that's your
configuration difference.

For ci.guix.gnu.org, as well as data.guix.gnu.org, it's nginx which is
keeping connections alive. I'm not sure the Guile code for the publish
server does the same, so talking to it directly might behave differently.

That's on the server side. The actual problem is probably on the client
side, as I guess there are places where closed connections aren't
handled properly. This reminds me that I sent some patches relating to
closing connections, which could well be related [1].

1: https://issues.guix.gnu.org/47174
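
To make the client-side point concrete, here is a minimal Python sketch
(not Guix code) of a client that caches its connection and treats EOF or
a reset on a reused socket as "stale, reconnect and retry once".
Everything in it is an assumption made for illustration: the toy
line-based protocol stands in for HTTP, and the names read_line,
serve_once_per_connection, CachedClient and demo are hypothetical.

```python
import socket
import threading

def read_line(sock):
    """Read one newline-terminated line; return b"" if the peer closed first."""
    data = bytearray()
    while not data.endswith(b"\n"):
        chunk = sock.recv(1)
        if not chunk:          # EOF: the other side closed the connection
            return b""
        data += chunk
    return bytes(data[:-1])

def serve_once_per_connection(listener, accepted):
    """Toy server without keep-alive: one reply per connection, then close."""
    while True:
        try:
            conn, peer = listener.accept()
        except OSError:        # listener was closed: stop serving
            return
        accepted.append(peer)
        with conn:
            line = read_line(conn)
            if line:
                conn.sendall(b"OK " + line + b"\n")

class CachedClient:
    """Client that keeps its socket around and retries once when it is stale."""

    def __init__(self, address):
        self.address = address
        self.sock = None
        self.reconnects = 0

    def _exchange(self, name):
        self.sock.sendall(name + b"\n")
        return read_line(self.sock)

    def fetch(self, name):
        reused = self.sock is not None
        if not reused:
            self.sock = socket.create_connection(self.address)
        try:
            reply = self._exchange(name)
        except ConnectionError:     # reset or broken pipe on a dead socket
            reply = b""
        if not reply and reused:
            # The cached connection was closed by the server: reopen, retry once.
            self.sock.close()
            self.sock = socket.create_connection(self.address)
            self.reconnects += 1
            reply = self._exchange(name)
        if not reply:
            raise ConnectionError("server closed the connection without replying")
        return reply

    def close(self):
        if self.sock is not None:
            self.sock.close()
            self.sock = None

def demo(names=(b"a", b"b", b"c")):
    """Return (replies, number of accepted connections, client reconnects)."""
    listener = socket.create_server(("127.0.0.1", 0))
    accepted = []
    threading.Thread(target=serve_once_per_connection,
                     args=(listener, accepted), daemon=True).start()
    client = CachedClient(listener.getsockname())
    try:
        replies = [client.fetch(name) for name in names]
    finally:
        client.close()
        listener.close()
    return replies, len(accepted), client.reconnects
```

Calling demo() performs three fetches. Because the toy server never
keeps a connection open, each fetch ends up on a newly accepted
connection, which is the same pattern as the accept4 trace in the
report. Whether the real publish server behaves this way is still only
my theory.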

--=-=-=
--=-=-=--




Information forwarded to bug-guix@HIDDEN:
bug#48468; Package guix. Full text available.

Message received at submit <at> debbugs.gnu.org:


Received: (at submit) by debbugs.gnu.org; 16 May 2021 17:57:56 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Sun May 16 13:57:56 2021
Received: from localhost ([127.0.0.1]:51426 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1liL1Y-00024J-5i
	for submit <at> debbugs.gnu.org; Sun, 16 May 2021 13:57:56 -0400
Received: from lists.gnu.org ([209.51.188.17]:56062)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <othacehe@HIDDEN>) id 1liL1W-00024A-Al
 for submit <at> debbugs.gnu.org; Sun, 16 May 2021 13:57:54 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10]:44674)
 by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <othacehe@HIDDEN>) id 1liL1V-0000JG-NQ
 for bug-guix@HIDDEN; Sun, 16 May 2021 13:57:53 -0400
Received: from fencepost.gnu.org ([2001:470:142:3::e]:57356)
 by eggs.gnu.org with esmtp (Exim 4.90_1)
 (envelope-from <othacehe@HIDDEN>) id 1liL1V-0004Qj-Ga
 for bug-guix@HIDDEN; Sun, 16 May 2021 13:57:53 -0400
Received: from [2a01:e34:ed27:e500:a880:d241:574d:84e1] (port=51494 helo=meije)
 by fencepost.gnu.org with esmtpsa (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <othacehe@HIDDEN>) id 1liL1T-0007vp-Vn
 for bug-guix@HIDDEN; Sun, 16 May 2021 13:57:52 -0400
From: Mathieu Othacehe <othacehe@HIDDEN>
To: bug-guix@HIDDEN
Subject: substitute server connection timeout
Date: Sun, 16 May 2021 19:57:49 +0200
Message-ID: <87lf8e4l42.fsf@HIDDEN>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: submit
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -3.3 (---)


Hello,

We have recently been seeing a lot of these errors on Cuirass:

--8<---------------cut here---------------start------------->8---
guix substitute: warning: while fetching http://141.80.167.131:5557/nar/g7ka09613k5v1vlznh87yg35905ggw51-python2-scipy-1.2.2-guile-builder: server is somewhat slow
guix substitute: warning: try `--no-substitutes' if the problem persists
guix substitute: error: connect*: Connection timed out
--8<---------------cut here---------------end--------------->8---

which means that the workers are failing to connect to the Cuirass
remote-server publish server on berlin at 141.80.167.131:5557.

Stracing this publish server shows that connection reuse seems to be
broken:

--8<---------------cut here---------------start------------->8---
accept4(9, {sa_family=AF_INET, sin_port=htons(41742), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
accept4(9, {sa_family=AF_INET, sin_port=htons(41744), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
accept4(9, {sa_family=AF_INET, sin_port=htons(41746), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 25
accept4(9, {sa_family=AF_INET, sin_port=htons(41748), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 24
accept4(9, {sa_family=AF_INET, sin_port=htons(41750), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
accept4(9, {sa_family=AF_INET, sin_port=htons(41752), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
accept4(9, {sa_family=AF_INET, sin_port=htons(41754), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 25
accept4(9, {sa_family=AF_INET, sin_port=htons(41756), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
accept4(9, {sa_family=AF_INET, sin_port=htons(41758), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 26
accept4(9, {sa_family=AF_INET, sin_port=htons(41760), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 24
accept4(9, {sa_family=AF_INET, sin_port=htons(41762), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
accept4(9, {sa_family=AF_INET, sin_port=htons(41764), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
accept4(9, {sa_family=AF_INET, sin_port=htons(41766), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
accept4(9, {sa_family=AF_INET, sin_port=htons(41768), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 22
accept4(9, {sa_family=AF_INET, sin_port=htons(41770), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
accept4(9, {sa_family=AF_INET, sin_port=htons(41772), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
accept4(9, {sa_family=AF_INET, sin_port=htons(41774), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
accept4(9, {sa_family=AF_INET, sin_port=htons(41776), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
accept4(9, {sa_family=AF_INET, sin_port=htons(41778), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
accept4(9, {sa_family=AF_INET, sin_port=htons(41780), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
accept4(9, {sa_family=AF_INET, sin_port=htons(41782), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
--8<---------------cut here---------------end--------------->8---
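
One way to read this trace: every accept4 line carries a different
client source port, so each one is a brand new TCP connection from the
same worker. A small Python sketch that tallies this from strace output
follows. It is not part of Guix; the regular expression and the helper
names accepted_peers and connections_per_client are assumptions, and
SAMPLE holds only the first three lines of the trace above.

```python
import re
from collections import Counter

# Matches the peer port and address in strace's rendering of accept4().
ACCEPT_RE = re.compile(
    r'accept4\(\d+, \{sa_family=AF_INET, sin_port=htons\((\d+)\), '
    r'sin_addr=inet_addr\("([\d.]+)"\)\}'
)

def accepted_peers(trace):
    """Return a list of (address, port) pairs, one per accept4() line."""
    return [(m.group(2), int(m.group(1))) for m in ACCEPT_RE.finditer(trace)]

def connections_per_client(trace):
    """Count accepted connections per client address."""
    return Counter(addr for addr, _port in accepted_peers(trace))

# First three lines of the trace above.
SAMPLE = '''\
accept4(9, {sa_family=AF_INET, sin_port=htons(41742), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
accept4(9, {sa_family=AF_INET, sin_port=htons(41744), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 21
accept4(9, {sa_family=AF_INET, sin_port=htons(41746), sin_addr=inet_addr("141.80.167.185")}, [112->16], 0) = 25
'''

if __name__ == "__main__":
    for addr, count in connections_per_client(SAMPLE).items():
        print(f"{addr}: {count} new connections")
```

With working connection reuse, the count per client address should stay
well below the number of requests that client made.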

Investigating it, I found that the connection is closed and opened
multiple times in the call-with-cached-connection procedure of the (guix
scripts substitute) module.

It looks like it's because a 'bad-header exception is raised when trying
to parse an eof object:

--8<---------------cut here---------------start------------->8---
;;; (error bad-header (read-header-line #<eof>))
--8<---------------cut here---------------end--------------->8---

I'm not sure where this eof comes from. There is this comment in the
http-multiple-get procedure in (guix http-client):

--8<---------------cut here---------------start------------->8---
;; Swallow networking errors that could occur due to connection reuse
;; and the like; they will be handled down the road when trying to
;; read responses.
(false-if-networking-error
 (begin
   (for-each (cut write-request <> buffer) batch)
   (put-bytevector p (get))
   (force-output p))))
--8<---------------cut here---------------end--------------->8---

which would suggest that connection reuse could cause networking errors?
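
The shape that comment describes (the write goes through, the failure
only appears when reading the response) can be reproduced outside Guix.
Below is a minimal Python sketch under stated assumptions: a local toy
server that reads the request and hangs up without answering, and a
client using Python's standard http.client. The names
hang_up_after_request and failing_step, and the /nar/example path, are
made up for the illustration.

```python
import http.client
import socket
import threading

def hang_up_after_request(listener):
    """Accept one connection, read the request headers, close without replying."""
    conn, _peer = listener.accept()
    with conn:
        data = b""
        while b"\r\n\r\n" not in data:
            chunk = conn.recv(4096)
            if not chunk:
                break
            data += chunk
    # Leaving the "with" block closes the socket, so the client reads EOF.

def failing_step():
    """Return (step, error name) telling where the client notices the hang-up."""
    listener = socket.create_server(("127.0.0.1", 0))
    host, port = listener.getsockname()
    server = threading.Thread(target=hang_up_after_request,
                              args=(listener,), daemon=True)
    server.start()
    client = http.client.HTTPConnection(host, port, timeout=5)
    try:
        try:
            client.request("GET", "/nar/example")  # hypothetical path
        except OSError as err:
            return "write", type(err).__name__
        try:
            client.getresponse()
        except (http.client.HTTPException, OSError) as err:
            return "read", type(err).__name__
        return "none", None
    finally:
        client.close()
        server.join(timeout=5)
        listener.close()

if __name__ == "__main__":
    print(failing_step())
```

In this setup the request is written without error and the problem is
only reported when the status line is read; http.client raises
RemoteDisconnected for that case. That is the same shape as the
(read-header-line #<eof>) error above, though it does not by itself
explain where the eof comes from here.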

What also puzzles me is that the main guix publish server on berlin does
not seem to present this issue. That would indicate that this error is
caused by how the Cuirass remote-server publish server is started or
configured.

Ludo, Chris, any idea?

I will keep searching anyway :)

Thanks,

Mathieu




Acknowledgement sent to Mathieu Othacehe <othacehe@HIDDEN>:
New bug report received and forwarded. Copy sent to bug-guix@HIDDEN. Full text available.
Report forwarded to bug-guix@HIDDEN:
bug#48468; Package guix. Full text available.
Last modified: Sat, 29 May 2021 21:15:01 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.