GNU bug report logs - #66684
[shepherd] Altering system time renders herd unresponsive

Please note: This is a static page, with minimal formatting, updated once a day.
Click here to see this page with the latest information and nicer formatting.

Package: guix; Severity: important; Reported by: Vladilen Kozin <vladilen.kozin@HIDDEN>; merged with #68476; dated Sun, 22 Oct 2023 16:42:02 UTC; Maintainer for guix is bug-guix@HIDDEN.
Merged 66684 68476. Request was from Sergey Trofimov <sarg@HIDDEN> to control <at> debbugs.gnu.org. Full text available.
Severity set to 'important' from 'normal'. Request was from Ludovic Courtès <ludo@HIDDEN> to control <at> debbugs.gnu.org. Full text available.

Message received at 66684 <at> debbugs.gnu.org:


From: Ludovic Courtès <ludo@HIDDEN>
To: Vladilen Kozin <vladilen.kozin@HIDDEN>
Subject: Re: bug#66684: [shepherd] Altering system time renders herd
 unresponsive
In-Reply-To: <CACw=CXN8dbRb8RmiHimqTs6J_QtSz5HuXaxf0mkRJeEEX1Wy7w@HIDDEN>
 (Vladilen Kozin's message of "Sun, 22 Oct 2023 14:43:28 +0100")
References: <CACw=CXN8dbRb8RmiHimqTs6J_QtSz5HuXaxf0mkRJeEEX1Wy7w@HIDDEN>
Date: Mon, 23 Oct 2023 21:58:33 +0200
Message-ID: <874jihf6vq.fsf@HIDDEN>
Cc: 66684 <at> debbugs.gnu.org

Hi Vladilen,

Vladilen Kozin <vladilen.kozin@HIDDEN> skribis:

> My server consistently ran with its system time 1 hour ahead of the
> actual time. Both `date` and `hwclock` showed the same time, off by
> 1 hour, while the BIOS showed the correct time. I'm not sure why, but
> some services won't run if the time difference is over, e.g.,
> 15 minutes, so I ran:
>
> $ sudo date -s '-1 hour'
>
> This fixes the time but causes `herd` to become unresponsive: you type
> a command, any command, and stare at a stuck tty. SSH-ing into the
> system also becomes impossible.

Thanks for your report.  This issue comes from Fibers 1.3.1:

  https://github.com/wingo/fibers/issues/89

There’s currently no bug fix in sight, though.

Ludo’.
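
The linked Fibers issue is not quoted in this log, so the following is
only a generic illustration of one way a backward clock step can stall
timers. It is a minimal sketch, assuming a timer whose deadline is
computed from the wall clock; it is not claimed to be the actual
mechanism in Fibers or the Shepherd. The `FakeClocks` class and the
`remaining` helper are hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass
class FakeClocks:
    """Simulated clock pair, in seconds (hypothetical helper)."""
    wall: float = 1_000_000.0   # stands in for a wall clock (CLOCK_REALTIME)
    monotonic: float = 500.0    # stands in for CLOCK_MONOTONIC

    def advance(self, seconds: float) -> None:
        # Real time passing moves both clocks forward.
        self.wall += seconds
        self.monotonic += seconds

    def step_wall(self, seconds: float) -> None:
        # Stepping the system time (e.g. with `date -s`) moves only
        # the wall clock; the monotonic clock is unaffected.
        self.wall += seconds


def remaining(deadline: float, now: float) -> float:
    """Seconds left until a deadline, never negative."""
    return max(0.0, deadline - now)


clocks = FakeClocks()
timeout = 5.0
wall_deadline = clocks.wall + timeout        # deadline on the wall clock
mono_deadline = clocks.monotonic + timeout   # deadline on the monotonic clock

clocks.step_wall(-3600.0)  # analogous to: date -s '-1 hour'
clocks.advance(5.0)        # five real seconds elapse

wall_left = remaining(wall_deadline, clocks.wall)
mono_left = remaining(mono_deadline, clocks.monotonic)
print(f"wall-clock timer: {wall_left:.0f} s left")
print(f"monotonic timer:  {mono_left:.0f} s left")
```

In this simulation the wall-clock timer still has 3600 s to go after
its 5 s timeout has really elapsed, while the monotonic timer has
expired on schedule.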




Information forwarded to bug-guix@HIDDEN:
bug#66684; Package guix. Full text available.

Message received at submit <at> debbugs.gnu.org:


From: Vladilen Kozin <vladilen.kozin@HIDDEN>
Date: Sun, 22 Oct 2023 14:43:28 +0100
Message-ID: <CACw=CXN8dbRb8RmiHimqTs6J_QtSz5HuXaxf0mkRJeEEX1Wy7w@HIDDEN>
Subject: [shepherd] Altering system time renders herd unresponsive
To: bug-guix@HIDDEN

Hello guix.

My server consistently ran with its system time 1 hour ahead of the
actual time. Both `date` and `hwclock` showed the same time, off by
1 hour, while the BIOS showed the correct time. I'm not sure why, but
some services won't run if the time difference is over, e.g.,
15 minutes, so I ran:

$ sudo date -s '-1 hour'

This fixes the time but causes `herd` to become unresponsive: you type
a command, any command, and stare at a stuck tty. SSH-ing into the
system also becomes impossible. Each attempt gets logged in
/var/log/messages (I can see that), but again you just stare at an
unresponsive terminal. Initially I thought it had fried shepherd
completely, so I power-cycled the system to get it back. `sudo reboot`,
being an alias for a `herd` command, will of course not work, so you
have to do it physically. That is annoying but feasible on a desktop
system, and a complete nightmare on a physical server, which may take
up to 20 minutes to reboot due to inventory lifecycle and such.

By chance, I got distracted this time and just left it hanging. Lo and
behold, it unfroze some 15-20 minutes later. I have no clue why.

I hope I won't be seeing this particular issue again, because I
followed the system clock change with:
$ sudo hwclock -w
and the system now shows the correct time after a reboot.
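
One possible explanation for the one-hour offset described at the start
of this message, offered purely as an assumption, is that the hardware
clock held local time while the system interpreted it as UTC. The
sketch below works through that arithmetic for a local zone of UTC+1
(matching the +0100 in this message's Date header). The specific
14:00 reading is hypothetical, and nothing in the report confirms that
this is what actually happened.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the local zone is UTC+1, as suggested by the Date header.
local_zone = timezone(timedelta(hours=1))

# Hypothetical actual moment, expressed in local time.
actual_local = datetime(2023, 10, 22, 14, 0, tzinfo=local_zone)

# Assumption: the hardware clock stores the local wall time (14:00),
# with no zone information attached.
rtc_reading = actual_local.replace(tzinfo=None)

# Assumption: the system interprets that raw reading as UTC.
system_time = rtc_reading.replace(tzinfo=timezone.utc)

# How far ahead of the actual time the system clock ends up.
offset = system_time - actual_local
print(offset)
```

Under these assumptions the system clock ends up exactly one hour ahead
of the actual time, while firmware that displays the raw hardware clock
reading would still appear correct.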

In general, my experience with shepherd has been less than stellar.
IMO, this just shouldn't happen with PID 1, ever, because there isn't
anything you can do at that point. This is not the first time it has
become unresponsive. On occasion, after a pull that changes some user
service code followed by a system reconfigure, those services would
start failing to find their binaries. My best guess is that those
specific services depend on the user-home service or some such, and
something happens that prevents discovery of said binaries in PATH;
the binaries in those services aren't referenced by absolute path in
the GNU store. That is a separate issue.

Generation 8 Oct 14 2023 00:22:53 (current)
  file name: /var/guix/profiles/system-8-link
  canonical file name: /gnu/store/j9i2w1zacw7sl8vlb7k1g7p0vnd58ns7-system
  label: GNU with Linux 6.4.16
  bootloader: grub
  root device: label: "r720-guix-0"
  kernel: /gnu/store/cbc7x9in2dnjrnh840c21ivgygnndp1c-linux-6.4.16/bzImage
  channels:
    guix:
      repository URL: https://git.savannah.gnu.org/git/guix.git
      branch: master
      commit: 3963fa1a465708690cd1554d911613f1c92f5eef

Thank you

-- 
Best regards
Vlad Kozin




Acknowledgement sent to Vladilen Kozin <vladilen.kozin@HIDDEN>:
New bug report received and forwarded. Copy sent to bug-guix@HIDDEN. Full text available.
Report forwarded to bug-guix@HIDDEN:
bug#66684; Package guix. Full text available.
Last modified: Sat, 20 Jan 2024 12:30:02 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.