GNU bug report logs - #39970
guix commands broken on Azerbaijani 'az_AZ' and Turkish 'tr_TR' locales

Package: guix; Reported by: "pelzflorian (Florian Pelz)" <pelzflorian@HIDDEN>; dated Sat, 7 Mar 2020 12:02:01 UTC; Maintainer for guix is bug-guix@HIDDEN.
Did not alter fixed versions and reopened. Request was from Debbugs Internal Request <help-debbugs@HIDDEN> to internal_control <at> debbugs.gnu.org.

Message received at 39970-done <at> debbugs.gnu.org:


Received: (at 39970-done) by debbugs.gnu.org; 5 May 2021 09:23:01 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Wed May 05 05:23:01 2021
Received: from localhost ([127.0.0.1]:58948 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1leDkC-0006sx-Pl
	for submit <at> debbugs.gnu.org; Wed, 05 May 2021 05:23:00 -0400
Received: from pelzflorian.de ([5.45.111.108]:48978 helo=mail.pelzflorian.de)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <pelzflorian@HIDDEN>) id 1leDkA-0006sr-1r
 for 39970-done <at> debbugs.gnu.org; Wed, 05 May 2021 05:22:59 -0400
Received: from pelzflorian.localdomain (unknown [5.45.111.108])
 by mail.pelzflorian.de (Postfix) with ESMTPSA id 80B1736063D;
 Wed,  5 May 2021 11:22:56 +0200 (CEST)
DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=pelzflorian.de;
 s=mail; t=1620206576;
 bh=KEhqQkFSkkAeYNZuU86/q05KwcOH/kcuvopjaQ5p5UQ=;
 h=Date:From:To:Cc:Subject:References:In-Reply-To;
 b=olC2ulWlv2S6qOuZdhka4XtmIGTw4PXTHYz/tYkpthJ8b37ATFzea2xMxW0zjLaru
 xPRJOwzxqNIc1V1YQARJwnbG+k/Xtl4x3sao+zi98Pk3pQFPgyfeaqpvWUPglXdv7s
 Gi02O+knJepLUPUCO/JnG9UIOjnaSlXadxvCYqSw=
Date: Wed, 5 May 2021 11:22:48 +0200
From: "pelzflorian (Florian Pelz)" <pelzflorian@HIDDEN>
To: Maxim Cournoyer <maxim.cournoyer@HIDDEN>
Subject: Re: bug#39970: guix commands broken on Azerbaijani 'az_AZ' and
 Turkish 'tr_TR' locales
Message-ID: <20210505092248.i3qfrteekhkwhd3y@HIDDEN>
References: <20200307120052.ocwzphlvemvmb2ts@HIDDEN>
 <20200307152003.myj7jkjthokbmark@HIDDEN>
 <20200308070804.ylpb5yrwpgbc3p3w@HIDDEN>
 <8736ah1mxb.fsf@HIDDEN>
 <20200312110206.2hsinzejnmcefmot@HIDDEN>
 <874kutsgmx.fsf@HIDDEN>
 <20200317094443.cnajoi4yuzvxaafe@HIDDEN>
 <87wo7ibrwe.fsf@HIDDEN>
 <20200318064712.iycghze5nnr7q777@HIDDEN>
 <875yzxlrcp.fsf@HIDDEN>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <875yzxlrcp.fsf@HIDDEN>
X-Spam-Score: 0.0 (/)
X-Debbugs-Envelope-To: 39970-done
Cc: Ludovic =?utf-8?Q?Court=C3=A8s?= <ludo@HIDDEN>, 39970-done <at> debbugs.gnu.org
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -1.0 (-)

On Wed, May 05, 2021 at 12:47:02AM -0400, Maxim Cournoyer wrote:
> Closing.
> 
> Thank you,
> 
> Maxim

Sorry for forgetting about this bug.  The above

LC_ALL=tr_TR.utf8 make check TESTS=tests/cran.scm

is *not* fixed, but I won’t take the time to really understand and fix
the few remaining troubles, I think.  Possibly libc bug
<https://sourceware.org/bugzilla/show_bug.cgi?id=23393> is the real
issue.

Regards,
Florian




Information forwarded to bug-guix@HIDDEN:
bug#39970; Package guix.

Message received at 39970 <at> debbugs.gnu.org:


Received: (at 39970) by debbugs.gnu.org; 5 May 2021 07:04:43 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Wed May 05 03:04:43 2021
Received: from localhost ([127.0.0.1]:58131 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1leBaM-0004O2-Np
	for submit <at> debbugs.gnu.org; Wed, 05 May 2021 03:04:42 -0400
Received: from mail-ed1-f51.google.com ([209.85.208.51]:35389)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <taylan.kammer@HIDDEN>) id 1leBaJ-0004Nw-Hp
 for 39970 <at> debbugs.gnu.org; Wed, 05 May 2021 03:04:41 -0400
Received: by mail-ed1-f51.google.com with SMTP id di13so705123edb.2
 for <39970 <at> debbugs.gnu.org>; Wed, 05 May 2021 00:04:39 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=subject:to:cc:references:from:message-id:date:user-agent
 :mime-version:in-reply-to:content-language:content-transfer-encoding;
 bh=YIgexaiMINSlMsO56JoCOrDkjonXjwvuxJe2hBC3oes=;
 b=Xe87cw8LiT/n2apPx65n0ZXxdOCXrDq3jduvg0So++5C0tlodORuIqdNkEFBjCLFvG
 Ba6daTARJL4Gx5oDnBzAhCbHtNPAXJBsd0iBihngjrkMc8eaJjXfQdTjt3aQG9HjbK7r
 Z6jJO+yKyweiPGvi1ZdgLuiHkP0pNw6tLI2OEPyXotiTmiNLmOL6Sl3u9T71L6yyd4hl
 5osvEB18hmw3PBGFi1uYEWwYkk27PT1dZSXVQq0vp6xELuqCn92nQbbEzl1Ch1tbP5UL
 YVsxYjNVincQKALidqicGRN5+e2QsmWGZMcMfD+QIAUOaTaBRa33iKOlwCzJTtKJ2rr3
 6OWg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:subject:to:cc:references:from:message-id:date
 :user-agent:mime-version:in-reply-to:content-language
 :content-transfer-encoding;
 bh=YIgexaiMINSlMsO56JoCOrDkjonXjwvuxJe2hBC3oes=;
 b=rrDTYI4f1lAxoFRsukphhER064NNtlIyhXNL40ggj0JegMeelufv+Mc2jWhEi7/qpb
 MOfFMftwWK5rCl6UDLmj5723mt55zeao2xDKFuSsSvH3ncITDaMmSCCdrwPZUWpmq9z8
 OTZUKQo+3X1vCj+SDeowXIlwRg9xK0wy1ed6kHL1WjfD1YMBR8MbKfrFhXg755uUReRJ
 rVtyKg7IOqqVd9kfo8niv+daDTpB7sUIYUbUloiIF2KOJJF44/QA8f4wc/TOraX+6aYk
 6dSC4C0ATIdJgIzK1dM7NQFWPcTWLNhEhVUQt+AThe34euPP8rzGqMzbkukJfMQEsR40
 r13A==
X-Gm-Message-State: AOAM533bgYXhQ/cIbzaPSdCsSiZ1uZFDXqK+QOqDvl1PW9DVV94IrFFl
 NmByWJZJMDyQolbxRKqDHbfiiv99QjN84Q==
X-Google-Smtp-Source: ABdhPJwRs0OyADg8AJYKFZsNrHOiOb/piZfFIk1/O1z3iVGvTeqz5bHUaIr3mxwr3i/pnEhkZV+3Jg==
X-Received: by 2002:a05:6402:3090:: with SMTP id
 de16mr30587624edb.177.1620198273670; 
 Wed, 05 May 2021 00:04:33 -0700 (PDT)
Received: from [192.168.178.20] (b2b-109-90-125-150.unitymedia.biz.
 [109.90.125.150])
 by smtp.gmail.com with ESMTPSA id r17sm3560987edo.48.2021.05.05.00.04.33
 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128);
 Wed, 05 May 2021 00:04:33 -0700 (PDT)
Subject: Re: bug#39970: guix commands broken on Azerbaijani 'az_AZ' and
 Turkish 'tr_TR' locales
To: "pelzflorian (Florian Pelz)" <pelzflorian@HIDDEN>,
 =?UTF-8?Q?Ludovic_Court=c3=a8s?= <ludo@HIDDEN>
References: <20200307120052.ocwzphlvemvmb2ts@HIDDEN>
 <20200307152003.myj7jkjthokbmark@HIDDEN>
 <20200308070804.ylpb5yrwpgbc3p3w@HIDDEN>
 <8736ah1mxb.fsf@HIDDEN>
 <20200312110206.2hsinzejnmcefmot@HIDDEN>
From: Taylan Kammer <taylan.kammer@HIDDEN>
Message-ID: <e3929738-588c-e27a-69b4-bb405f536e75@HIDDEN>
Date: Wed, 5 May 2021 09:04:32 +0200
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101
 Thunderbird/78.10.0
MIME-Version: 1.0
In-Reply-To: <20200312110206.2hsinzejnmcefmot@HIDDEN>
Content-Type: text/plain; charset=UTF-8
Content-Language: en-US
Content-Transfer-Encoding: 8bit
X-Spam-Score: -0.0 (/)
X-Debbugs-Envelope-To: 39970
Cc: 39970 <at> debbugs.gnu.org
X-Spam-Score: -1.0 (-)

On 12.03.2020 12:02, pelzflorian (Florian Pelz) wrote:
> 
> Guile’s behavior that i is not among [a-z] has been confirmed as
> unexpected by a natively Turkish friend of mine.  It is different from
> the behavior of current glibc:
> 
> florian@florianmacbook ~$ cat iyiyim.c
> #include <regex.h>
> #include <stdio.h>
> #include <stdlib.h>
> #define STR "iyiyım"
> int main (int    argc,
>           char** argv)
> {
>   regex_t only_letters;
>   int r = regcomp (&only_letters, "[a-z]+", REG_EXTENDED);
>   if (r != 0)
>     printf ("This error does not happen.\n");
>   r = regexec (&only_letters, STR, 1, malloc (sizeof (regmatch_t)), 0);
>   if (r == 0)
>     printf ("The string " STR " matched!\n");
>   else
>     printf ("No match for " STR ".\n");
> }
> florian@florianmacbook ~$ gcc -o iyiyim iyiyim.c
> florian@florianmacbook ~$ LANG=tr_TR.utf8 ./iyiyim 
> The string iyiyım matched!
> 
> Apparently Guile uses a bundled regular expression library rather than
> glibc.  I can try making Guile use a newer GNUlib for its regular
> expressions, maybe that helps.  Shall I file a separate bug for Guile?
> 
Also native Turkish speaker here, and yeah that seems like a clear bug.

By the way, Turkish doesn't have q, w, or x.  So if [a-z] is interpreted
by locale, it would fail to match those letters.  I suppose that doesn't
matter for the patch you guys used but it might have been part of the
original problem.
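
A side note on the quoted test program, offered as a sketch rather than
a correction of the record: a C program starts in the "C" locale until
it calls setlocale(), and an unanchored pattern such as "[a-z]+"
succeeds as soon as a single character matches.  The helper below
(letters_only is a made-up name, not something from Guix or Guile)
anchors the pattern so the whole string has to match, and frees the
compiled regexp.  It assumes a POSIX <regex.h> such as glibc's.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Return 1 if TEXT consists only of characters in [a-z], as interpreted
   by whatever locale is in effect when this function runs; 0 if it does
   not; -1 if the pattern cannot be compiled.  */
int letters_only(const char *text)
{
  regex_t re;
  int r;

  assert(text != NULL);

  /* ^ and $ force the whole string to match, so one stray character is
     enough to make the match fail.  */
  if (regcomp(&re, "^[a-z]+$", REG_EXTENDED | REG_NOSUB) != 0)
    return -1;

  r = regexec(&re, text, 0, NULL, 0);
  regfree(&re);               /* release what regcomp allocated */
  return r == 0 ? 1 : 0;
}
```

To exercise a locale such as tr_TR.utf8, the caller would first run
setlocale (LC_ALL, ""); whether that locale is installed on a given
system is a separate question.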

The dotless lowercase i / dotted uppercase I mostly bites programmers in
case conversion.  The uppercase of i is İ and the lowercase of I is ı.
There was even an exploit in GitHub related to this:

  https://eng.getwisdom.io/hacking-github-with-unicode-dotless-i/
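
For identifiers that are defined in ASCII (package names, header fields
and the like), one common way to sidestep this is to fold case by ASCII
rules only and never ask the locale.  A minimal sketch in C; both
function names are invented for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Lowercase one character using ASCII rules only.  The locale is never
   consulted, so 'I' always becomes 'i'.  */
int ascii_tolower(int c)
{
  return (c >= 'A' && c <= 'Z') ? c + ('a' - 'A') : c;
}

/* Return 1 if A and B are equal when ASCII case is ignored, else 0.
   Bytes outside A-Z are compared as they are.  */
int ascii_equal_ignore_case(const char *a, const char *b)
{
  assert(a != NULL && b != NULL);

  while (*a != '\0' && *b != '\0') {
    if (ascii_tolower((unsigned char) *a) != ascii_tolower((unsigned char) *b))
      return 0;
    a++;
    b++;
  }
  /* Equal only if both strings ended at the same position.  */
  return *a == *b;
}
```

This is deliberately wrong for Turkish text; it is meant for strings
whose case rules are fixed by a specification rather than by a language.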


- Taylan




Information forwarded to bug-guix@HIDDEN:
bug#39970; Package guix.

Message received at 39970-done <at> debbugs.gnu.org:


Received: (at 39970-done) by debbugs.gnu.org; 5 May 2021 04:47:11 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Wed May 05 00:47:11 2021
Received: from localhost ([127.0.0.1]:57506 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1le9RH-0006vY-HT
	for submit <at> debbugs.gnu.org; Wed, 05 May 2021 00:47:11 -0400
Received: from mail-qk1-f178.google.com ([209.85.222.178]:35486)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <maxim.cournoyer@HIDDEN>) id 1le9RF-0006vG-3T
 for 39970-done <at> debbugs.gnu.org; Wed, 05 May 2021 00:47:10 -0400
Received: by mail-qk1-f178.google.com with SMTP id x8so460036qkl.2
 for <39970-done <at> debbugs.gnu.org>; Tue, 04 May 2021 21:47:09 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=from:to:cc:subject:references:date:in-reply-to:message-id
 :user-agent:mime-version:content-transfer-encoding;
 bh=vo0tZzzv8fQCm4ReTeScxZf2p4aHnNx5GuV1w2Id5fo=;
 b=Rwz/V/8qNrCuNjus/ksXpsbvH6QNRoFApsLqRSzYGOpjAJoH1C2IsPWfFNY+0AyhEO
 9INjwBLruTZTPTUqBgvq47uHnaokzSNo6/SVnMR5Xsl4Gg++ihf6v+dQsdOY4tSklv9E
 jG9LAzkWSRpIZuD6dQOOkAFuAMPtNOWDhspHBb1hG7HCwg2ANbXHGH7NGWI9cb7evJB+
 4jXWQVXYsL+iOClSzkOnXPefv8SxZyJCB73/IU75mz2uqCMGkk9xuXKX6vQqh72ZYz+G
 Kq7Skkr7SQYCXxBrLuSpoXbs6rrUOntfTcnX9deHmw6M1yeAag0MbNNJZGRaUMMw49ZX
 2WGg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:from:to:cc:subject:references:date:in-reply-to
 :message-id:user-agent:mime-version:content-transfer-encoding;
 bh=vo0tZzzv8fQCm4ReTeScxZf2p4aHnNx5GuV1w2Id5fo=;
 b=E9XgPjlZanY9MxuvQR7heAh1wx01K7+mcyg6yjYAQAyRfIvxFb8zc+MAvLVO5nYxAa
 0QrDA/PrCwOoWXrMKCrXFemnLK3qF7pGn3L9EfwcL3P7DAX6QKKK/z1r0ILAhm1lbZrB
 XXAf37u8zTemiv5mtts2e8ZhJQuc8awKWs2eiXnfQ84kI8CaWMh5enXPZWW+xeqCqJMy
 vcqWCq24sJ1ZdAyyB9czrPn8/UROz9ZUsGaHY8kxRfjNeh6KH/5sGIUv1Jfq37hlA4zn
 7jTXGhLE9wOFifwqZ8QZeXv6K7svAv57Pf4nWUWfBaPKwHyg5Omx/Jb23X32zMqcAhju
 T2Xw==
X-Gm-Message-State: AOAM5315LEJcViOPkfm0IQMUdw9O/cScXzb7WAPPLpg1w1xVcUvgVY9z
 9LX04IaYn83KXcqjYp81QDq2BC1ya5Rtl+Q2
X-Google-Smtp-Source: ABdhPJycxDjYxDWsCaQ9H3taydZa2LVAmEKSsyiXc/I4BPiqwV1Hf5Hog9F0VGNhPAuQSo3ezjcNFA==
X-Received: by 2002:ae9:ed85:: with SMTP id c127mr6072913qkg.288.1620190023594; 
 Tue, 04 May 2021 21:47:03 -0700 (PDT)
Received: from hurd (mtl.savoirfairelinux.net. [208.88.110.46])
 by smtp.gmail.com with ESMTPSA id p1sm3981063qtq.12.2021.05.04.21.47.02
 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Tue, 04 May 2021 21:47:03 -0700 (PDT)
From: Maxim Cournoyer <maxim.cournoyer@HIDDEN>
To: "pelzflorian (Florian Pelz)" <pelzflorian@HIDDEN>
Subject: Re: bug#39970: guix commands broken on Azerbaijani 'az_AZ' and
 Turkish 'tr_TR' locales
References: <20200307120052.ocwzphlvemvmb2ts@HIDDEN>
 <20200307152003.myj7jkjthokbmark@HIDDEN>
 <20200308070804.ylpb5yrwpgbc3p3w@HIDDEN>
 <8736ah1mxb.fsf@HIDDEN>
 <20200312110206.2hsinzejnmcefmot@HIDDEN>
 <874kutsgmx.fsf@HIDDEN>
 <20200317094443.cnajoi4yuzvxaafe@HIDDEN>
 <87wo7ibrwe.fsf@HIDDEN>
 <20200318064712.iycghze5nnr7q777@HIDDEN>
Date: Wed, 05 May 2021 00:47:02 -0400
In-Reply-To: <20200318064712.iycghze5nnr7q777@HIDDEN>
 (pelzflorian@HIDDEN's message of "Wed, 18 Mar 2020 07:47:12
 +0100")
Message-ID: <875yzxlrcp.fsf@HIDDEN>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-Spam-Score: 0.0 (/)
X-Debbugs-Envelope-To: 39970-done
Cc: Ludovic =?utf-8?Q?Court=C3=A8s?= <ludo@HIDDEN>, 39970-done <at> debbugs.gnu.org
X-Spam-Score: -1.0 (-)

"pelzflorian (Florian Pelz)" <pelzflorian@HIDDEN> writes:

> On Tue, Mar 17, 2020 at 10:20:01PM +0100, Ludovic Courtès wrote:
>> "pelzflorian (Florian Pelz)" <pelzflorian@HIDDEN> skribis:
>> > `LC_ALL=tr_TR.utf8 make check` is still very unhappy though.
>> > There are many failures.  I will continue to investigate later today.
>>
>> OK.
>
> The tests fail due to many other uses of [a-z] in regexps.  I will look;
> e.g. for guix/import/cran.scm
>
> (if (string-match "^[A-Za-z][^ :]+:( |\n|$)" line)
>     …)
>
> it would be easier and clearer to just list [a-z] explicitly:
>
>
>> LGTM, thank you!
>
> :) Pushed as 771c5e155d7862ed91a5d503eecc00c1db1150ad.

Closing.

Thank you,

Maxim




Notification sent to "pelzflorian (Florian Pelz)" <pelzflorian@HIDDEN>:
bug acknowledged by developer.
Reply sent to Maxim Cournoyer <maxim.cournoyer@HIDDEN>:
You have taken responsibility.

Message received at 39970 <at> debbugs.gnu.org:


Received: (at 39970) by debbugs.gnu.org; 18 Mar 2020 08:40:33 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Wed Mar 18 04:40:33 2020
Received: from localhost ([127.0.0.1]:39804 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1jEUFd-0005mM-9T
	for submit <at> debbugs.gnu.org; Wed, 18 Mar 2020 04:40:33 -0400
Received: from eggs.gnu.org ([209.51.188.92]:33987)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <ludo@HIDDEN>) id 1jEUFb-0005m9-Mb
 for 39970 <at> debbugs.gnu.org; Wed, 18 Mar 2020 04:40:32 -0400
Received: from fencepost.gnu.org ([2001:470:142:3::e]:49628)
 by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from <ludo@HIDDEN>)
 id 1jEUFW-0007e8-3U; Wed, 18 Mar 2020 04:40:26 -0400
Received: from 91-160-117-201.subs.proxad.net ([91.160.117.201]:53706
 helo=ribbon)
 by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256)
 (Exim 4.82) (envelope-from <ludo@HIDDEN>)
 id 1jEUFV-0003H8-Ff; Wed, 18 Mar 2020 04:40:25 -0400
From: =?utf-8?Q?Ludovic_Court=C3=A8s?= <ludo@HIDDEN>
To: "pelzflorian \(Florian Pelz\)" <pelzflorian@HIDDEN>
Subject: Re: bug#39970: guix commands broken on Azerbaijani 'az_AZ' and
 Turkish 'tr_TR' locales
References: <20200307120052.ocwzphlvemvmb2ts@HIDDEN>
 <20200307152003.myj7jkjthokbmark@HIDDEN>
 <20200308070804.ylpb5yrwpgbc3p3w@HIDDEN>
 <8736ah1mxb.fsf@HIDDEN>
 <20200312110206.2hsinzejnmcefmot@HIDDEN>
 <874kutsgmx.fsf@HIDDEN>
 <20200317094443.cnajoi4yuzvxaafe@HIDDEN>
 <87wo7ibrwe.fsf@HIDDEN>
 <20200318064712.iycghze5nnr7q777@HIDDEN>
X-URL: http://www.fdn.fr/~lcourtes/
X-Revolutionary-Date: 29 =?utf-8?Q?Vent=C3=B4se?= an 228 de la =?utf-8?Q?R?=
 =?utf-8?Q?=C3=A9volution?=
X-PGP-Key-ID: 0x090B11993D9AEBB5
X-PGP-Key: http://www.fdn.fr/~lcourtes/ludovic.asc
X-PGP-Fingerprint: 3CE4 6455 8A84 FDC6 9DB4  0CFB 090B 1199 3D9A EBB5
X-OS: x86_64-pc-linux-gnu
Date: Wed, 18 Mar 2020 09:40:07 +0100
In-Reply-To: <20200318064712.iycghze5nnr7q777@HIDDEN>
 (pelzflorian@HIDDEN's message of "Wed, 18 Mar 2020 07:47:12
 +0100")
Message-ID: <87a74eawew.fsf@HIDDEN>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic]
X-Spam-Score: -0.7 (/)
X-Debbugs-Envelope-To: 39970
Cc: 39970 <at> debbugs.gnu.org
X-Spam-Score: -1.7 (-)

"pelzflorian (Florian Pelz)" <pelzflorian@HIDDEN> skribis:

> On Tue, Mar 17, 2020 at 10:20:01PM +0100, Ludovic Courtès wrote:
>> "pelzflorian (Florian Pelz)" <pelzflorian@HIDDEN> skribis:
>> > `LC_ALL=tr_TR.utf8 make check` is still very unhappy though.
>> > There are many failures.  I will continue to investigate later today.
>>
>> OK.
>
> The tests fail due to many other uses of [a-z] in regexps.  I will look;
> e.g. for guix/import/cran.scm
>
> (if (string-match "^[A-Za-z][^ :]+:( |\n|$)" line)
>     …)
>
> it would be easier and clearer to just list [a-z] explicitly:

Yes, agreed.

It would be nice if ‘string-match’ & co. could take an optional locale
object (info "(guile) i18n Introduction") but that’s not the case
currently.
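
At the libc level, something close to an optional locale argument can be
sketched with the POSIX 2008 per-thread locale functions.  This is only
an illustration in C, not how Guile's string-match works.  The name
regex_match_in_locale is hypothetical, and the sketch assumes the regexp
implementation honours the thread locale set by uselocale(), as glibc's
is expected to:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <locale.h>
#include <regex.h>
#include <stddef.h>

/* Match TEXT against the extended regexp PATTERN while LOCALE_NAME is
   the calling thread's locale.  Return 1 on a match, 0 on no match, and
   -1 if the locale is unavailable or the pattern does not compile.  */
int regex_match_in_locale(const char *locale_name,
                          const char *pattern,
                          const char *text)
{
  locale_t loc, previous;
  regex_t re;
  int result = -1;

  assert(locale_name != NULL && pattern != NULL && text != NULL);

  loc = newlocale(LC_ALL_MASK, locale_name, (locale_t) 0);
  if (loc == (locale_t) 0)
    return -1;                      /* locale not installed */

  /* Switch this thread only; other threads keep their locale.  */
  previous = uselocale(loc);

  if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) == 0) {
    result = regexec(&re, text, 0, NULL, 0) == 0 ? 1 : 0;
    regfree(&re);
  }

  uselocale(previous);              /* restore before freeing LOC */
  freelocale(loc);
  return result;
}
```

Passing "C" as the locale name keeps [a-z] to the 26 ASCII lowercase
letters whatever LC_ALL says in the environment.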

Thanks,
Ludo’.




Information forwarded to bug-guix@HIDDEN:
bug#39970; Package guix.

Message received at 39970 <at> debbugs.gnu.org:


Received: (at 39970) by debbugs.gnu.org; 18 Mar 2020 06:47:20 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Wed Mar 18 02:47:20 2020
Received: from localhost ([127.0.0.1]:39781 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1jESU4-0003Ay-Gh
	for submit <at> debbugs.gnu.org; Wed, 18 Mar 2020 02:47:20 -0400
Received: from pelzflorian.de ([5.45.111.108]:49362 helo=mail.pelzflorian.de)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <pelzflorian@HIDDEN>) id 1jESU1-0003Am-2y
 for 39970 <at> debbugs.gnu.org; Wed, 18 Mar 2020 02:47:17 -0400
Received: from pelzflorian.localdomain (unknown [5.45.111.108])
 by mail.pelzflorian.de (Postfix) with ESMTPSA id A271A3604F7;
 Wed, 18 Mar 2020 07:47:13 +0100 (CET)
DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=pelzflorian.de;
 s=mail; t=1584514033;
 bh=3GnQS6tAV0pgqhkc3J0XTvNUR5mSVCH/+ujgCyksJ0Y=;
 h=Date:From:To:Cc:Subject:References:In-Reply-To;
 b=rWqen7RfBOTvMa9ho7YL63vWsSfl2EGb/V7YhzLxtAkXMTAUud4zQ2aN7O6ZngSSg
 EZklTW5JBer5S70DtQ/pqXGIwuq9bREQbmhg2uVNplMpjlrgYepsv0mGR6GpMGtiFt
 mtWtc3XvRavatjUloCEebEikwJ9EsAvyeR/FKfQY=
Date: Wed, 18 Mar 2020 07:47:12 +0100
From: "pelzflorian (Florian Pelz)" <pelzflorian@HIDDEN>
To: Ludovic =?utf-8?Q?Court=C3=A8s?= <ludo@HIDDEN>
Subject: Re: bug#39970: guix commands broken on Azerbaijani 'az_AZ' and
 Turkish 'tr_TR' locales
Message-ID: <20200318064712.iycghze5nnr7q777@HIDDEN>
References: <20200307120052.ocwzphlvemvmb2ts@HIDDEN>
 <20200307152003.myj7jkjthokbmark@HIDDEN>
 <20200308070804.ylpb5yrwpgbc3p3w@HIDDEN>
 <8736ah1mxb.fsf@HIDDEN>
 <20200312110206.2hsinzejnmcefmot@HIDDEN>
 <874kutsgmx.fsf@HIDDEN>
 <20200317094443.cnajoi4yuzvxaafe@HIDDEN>
 <87wo7ibrwe.fsf@HIDDEN>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <87wo7ibrwe.fsf@HIDDEN>
X-Spam-Score: 0.0 (/)
X-Debbugs-Envelope-To: 39970
Cc: 39970 <at> debbugs.gnu.org
X-Spam-Score: -1.0 (-)

On Tue, Mar 17, 2020 at 10:20:01PM +0100, Ludovic Courtès wrote:
> "pelzflorian (Florian Pelz)" <pelzflorian@HIDDEN> skribis:
> > `LC_ALL=tr_TR.utf8 make check` is still very unhappy though.
> > There are many failures.  I will continue to investigate later today.
> 
> OK.

The tests fail due to many other uses of [a-z] in regexps.  I will look;
e.g. for guix/import/cran.scm

(if (string-match "^[A-Za-z][^ :]+:( |\n|$)" line)
    …)

it would be easier and clearer to just list [a-z] explicitly:


> LGTM, thank you!

:) Pushed as 771c5e155d7862ed91a5d503eecc00c1db1150ad.

Regards,
Florian




Information forwarded to bug-guix@HIDDEN:
bug#39970; Package guix.

Message received at 39970 <at> debbugs.gnu.org:


Received: (at 39970) by debbugs.gnu.org; 17 Mar 2020 21:20:14 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Tue Mar 17 17:20:13 2020
Received: from localhost ([127.0.0.1]:39619 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1jEJdF-0000Ee-Lv
	for submit <at> debbugs.gnu.org; Tue, 17 Mar 2020 17:20:13 -0400
Received: from eggs.gnu.org ([209.51.188.92]:55022)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <ludo@HIDDEN>) id 1jEJdD-0000EK-Fq
 for 39970 <at> debbugs.gnu.org; Tue, 17 Mar 2020 17:20:12 -0400
Received: from fencepost.gnu.org ([2001:470:142:3::e]:42019)
 by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from <ludo@HIDDEN>)
 id 1jEJd8-0003XZ-7F; Tue, 17 Mar 2020 17:20:06 -0400
Received: from [2a01:e0a:1d:7270:af76:b9b:ca24:c465] (port=58074 helo=ribbon)
 by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256)
 (Exim 4.82) (envelope-from <ludo@HIDDEN>)
 id 1jEJd7-0008BH-3n; Tue, 17 Mar 2020 17:20:05 -0400
From: =?utf-8?Q?Ludovic_Court=C3=A8s?= <ludo@HIDDEN>
To: "pelzflorian \(Florian Pelz\)" <pelzflorian@HIDDEN>
Subject: Re: bug#39970: guix commands broken on Azerbaijani 'az_AZ' and
 Turkish 'tr_TR' locales
References: <20200307120052.ocwzphlvemvmb2ts@HIDDEN>
 <20200307152003.myj7jkjthokbmark@HIDDEN>
 <20200308070804.ylpb5yrwpgbc3p3w@HIDDEN>
 <8736ah1mxb.fsf@HIDDEN>
 <20200312110206.2hsinzejnmcefmot@HIDDEN>
 <874kutsgmx.fsf@HIDDEN>
 <20200317094443.cnajoi4yuzvxaafe@HIDDEN>
X-URL: http://www.fdn.fr/~lcourtes/
X-Revolutionary-Date: 28 =?utf-8?Q?Vent=C3=B4se?= an 228 de la =?utf-8?Q?R?=
 =?utf-8?Q?=C3=A9volution?=
X-PGP-Key-ID: 0x090B11993D9AEBB5
X-PGP-Key: http://www.fdn.fr/~lcourtes/ludovic.asc
X-PGP-Fingerprint: 3CE4 6455 8A84 FDC6 9DB4  0CFB 090B 1199 3D9A EBB5
X-OS: x86_64-pc-linux-gnu
Date: Tue, 17 Mar 2020 22:20:01 +0100
In-Reply-To: <20200317094443.cnajoi4yuzvxaafe@HIDDEN>
 (pelzflorian@HIDDEN's message of "Tue, 17 Mar 2020 10:44:43
 +0100")
Message-ID: <87wo7ibrwe.fsf@HIDDEN>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic]
X-Spam-Score: -0.7 (/)
X-Debbugs-Envelope-To: 39970
Cc: 39970 <at> debbugs.gnu.org
X-Spam-Score: -1.7 (-)

Hi,

"pelzflorian (Florian Pelz)" <pelzflorian@HIDDEN> skribis:

> On Thu, Mar 12, 2020 at 05:05:26PM +0100, Ludovic Courtès wrote:
>> "pelzflorian (Florian Pelz)" <pelzflorian@HIDDEN> skribis:
>> > Why would not using a regexp be better?
>>
>> It reduces reliance on libc, reduces complexity, and performs better as
>> noted in the commit log of 35eb77b09d957019b2437e7681bd88013d67d3cd.
>
> Thank you for your wisdom.  I hope the attached patch is OK.
>
> `LC_ALL=en_US.utf8 make check` is mostly fine (except tests/pack.scm,
> which also failed before).
>
> Manual testing of `./pre-inst-env guix environment` works.

Good!

> `LC_ALL=tr_TR.utf8 make check` is still very unhappy though.
> There are many failures.  I will continue to investigate later today.

OK.

> From: Florian Pelz <pelzflorian@HIDDEN>
> Date: Thu, 12 Mar 2020 11:08:16 +0100
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
> Subject: [PATCH] store: Fix many guix commands failing on some locales.
>
> Partly fixes bug #39970 (see: https://bugs.gnu.org/39970).

I’d just write:

  Partly fixes <https://bugs.gnu.org/39970>.

Concise, clear, greppable.  :-)

> At least 'guix environment', 'guix install' and 'guix pull'
> on 'az_AZ.utf8' and 'tr_TR.utf8' were affected.
>
> * guix/store.scm (store-path-hash-part): Move base path detection to ...
> (store-path-base): ... this new exported procedure.
> (store-path-package-name): Use it instead of locale-dependent regexps.
> (store-regexp*): Remove.

LGTM, thank you!
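
The idea in the commit log above, replacing a regexp by plain string
operations, can be sketched as follows.  This is C rather than the
Scheme of guix/store.scm, it is not the code from the patch, and
package_name_from_store_path is a made-up name.  It assumes a direct
store item looks like STORE-DIR/HASH-NAME with a 32-character hash part:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HASH_LENGTH 32   /* assumed length of the hash part */

/* Return a pointer to the part of PATH that follows STORE_DIR, the hash
   and the dash, or NULL if PATH does not have that shape.  No regexp and
   no locale-dependent call is involved.  */
const char *package_name_from_store_path(const char *store_dir,
                                         const char *path)
{
  size_t dir_len;
  const char *base;

  assert(store_dir != NULL && path != NULL);
  dir_len = strlen(store_dir);

  /* PATH must start with STORE_DIR followed by a slash.  */
  if (strncmp(path, store_dir, dir_len) != 0 || path[dir_len] != '/')
    return NULL;

  base = path + dir_len + 1;

  /* Need the hash, a dash, and at least one character of name.  */
  if (strlen(base) < HASH_LENGTH + 2 || base[HASH_LENGTH] != '-')
    return NULL;

  return base + HASH_LENGTH + 1;
}
```

The checks are byte comparisons, so the result is the same under
tr_TR.utf8, az_AZ.utf8 or any other locale.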

Ludo’.




Information forwarded to bug-guix@HIDDEN:
bug#39970; Package guix.

Message received at 39970 <at> debbugs.gnu.org:


Received: (at 39970) by debbugs.gnu.org; 17 Mar 2020 09:44:48 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Tue Mar 17 05:44:48 2020
Received: from localhost ([127.0.0.1]:38190 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1jE8mF-000712-Lx
	for submit <at> debbugs.gnu.org; Tue, 17 Mar 2020 05:44:47 -0400
Received: from pelzflorian.de ([5.45.111.108]:48028 helo=mail.pelzflorian.de)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <pelzflorian@HIDDEN>) id 1jE8mD-00070u-T4
 for 39970 <at> debbugs.gnu.org; Tue, 17 Mar 2020 05:44:46 -0400
Received: from pelzflorian.localdomain (unknown [5.45.111.108])
 by mail.pelzflorian.de (Postfix) with ESMTPSA id A8BA93604F7;
 Tue, 17 Mar 2020 10:44:44 +0100 (CET)
DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=pelzflorian.de;
 s=mail; t=1584438284;
 bh=FViUB2onlkxlPJz3sgsG6UPdaGGq0G4v3H7AS4KvKyE=;
 h=Date:From:To:Cc:Subject:References:In-Reply-To;
 b=jyJPlRWCiRjQWeh4WnD5jE61H6owflDzHayCPGP3TubmHm36cPtBU78kT6zhnLC1p
 1hBxMFeK1iS1eUKKhNmJxlumMSEBQeU8Q7gpsZCt3A9oOcxSv+OT1ot28ZMj8bnbOI
 8Ix8siXRlGRJ521x+9OrbwbnkYHsu5NBCbvvzMXM=
Date: Tue, 17 Mar 2020 10:44:43 +0100
From: "pelzflorian (Florian Pelz)" <pelzflorian@HIDDEN>
To: Ludovic =?utf-8?Q?Court=C3=A8s?= <ludo@HIDDEN>
Subject: Re: bug#39970: guix commands broken on Azerbaijani 'az_AZ' and
 Turkish 'tr_TR' locales
Message-ID: <20200317094443.cnajoi4yuzvxaafe@HIDDEN>
References: <20200307120052.ocwzphlvemvmb2ts@HIDDEN>
 <20200307152003.myj7jkjthokbmark@HIDDEN>
 <20200308070804.ylpb5yrwpgbc3p3w@HIDDEN>
 <8736ah1mxb.fsf@HIDDEN>
 <20200312110206.2hsinzejnmcefmot@HIDDEN>
 <874kutsgmx.fsf@HIDDEN>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="2sl72hxfrvw7sfvd"
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <874kutsgmx.fsf@HIDDEN>
X-Spam-Score: 0.0 (/)
X-Debbugs-Envelope-To: 39970
Cc: 39970 <at> debbugs.gnu.org
X-Spam-Score: -1.0 (-)


--2sl72hxfrvw7sfvd
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit

On Thu, Mar 12, 2020 at 05:05:26PM +0100, Ludovic Courtès wrote:
> "pelzflorian (Florian Pelz)" <pelzflorian@HIDDEN> skribis:
> > Why would not using a regexp be better?
> 
> It reduces reliance on libc, reduces complexity, and performs better as
> noted in the commit log of 35eb77b09d957019b2437e7681bd88013d67d3cd.

Thank you for your wisdom.  I hope the attached patch is OK.

`LC_ALL=en_US.utf8 make check` is mostly fine (except tests/pack.scm,
which also failed before).

Manual testing of `./pre-inst-env guix environment` works.

`LC_ALL=tr_TR.utf8 make check` is still very unhappy though.
There are many failures.  I will continue to investigate later today.

Regards,
Florian

--2sl72hxfrvw7sfvd
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: attachment;
	filename="0001-store-Fix-many-guix-commands-failing-on-some-locales.patch"
Content-Transfer-Encoding: 8bit

From: Florian Pelz <pelzflorian@HIDDEN>
Date: Thu, 12 Mar 2020 11:08:16 +0100
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Subject: [PATCH] store: Fix many guix commands failing on some locales.

Partly fixes bug #39970 (see: https://bugs.gnu.org/39970).

At least 'guix environment', 'guix install' and 'guix pull'
on 'az_AZ.utf8' and 'tr_TR.utf8' were affected.

* guix/store.scm (store-path-hash-part): Move base path detection to ...
(store-path-base): ... this new exported procedure.
(store-path-package-name): Use it instead of locale-dependent regexps.
(store-regexp*): Remove.
---
 guix/store.scm | 32 +++++++++++++++-----------------
 1 file changed, 15 insertions(+), 17 deletions(-)

diff --git a/guix/store.scm b/guix/store.scm
index f99fa581a8..5465204f5f 100644
--- a/guix/store.scm
+++ b/guix/store.scm
@@ -2,6 +2,7 @@
 ;;; Copyright © 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019 Ludovic Courtès <ludo@HIDDEN>
 ;;; Copyright © 2018 Jan Nieuwenhuizen <janneke@HIDDEN>
 ;;; Copyright © 2019 Mathieu Othacehe <m.othacehe@HIDDEN>
+;;; Copyright © 2020 Florian Pelz <pelzflorian@HIDDEN>
 ;;;
 ;;; This file is part of GNU Guix.
 ;;;
@@ -43,7 +44,6 @@
   #:use-module (srfi srfi-35)
   #:use-module (srfi srfi-39)
   #:use-module (ice-9 match)
-  #:use-module (ice-9 regex)
   #:use-module (ice-9 vlist)
   #:use-module (ice-9 popen)
   #:use-module (ice-9 threads)
@@ -172,6 +172,7 @@
             store-path?
             direct-store-path?
             derivation-path?
+            store-path-base
             store-path-package-name
             store-path-hash-part
             direct-store-path
@@ -1943,29 +1944,26 @@ valid inputs."
   "Return #t if PATH is a derivation path."
   (and (store-path? path) (string-suffix? ".drv" path)))
 
-(define store-regexp*
-  ;; The substituter makes repeated calls to 'store-path-hash-part', hence
-  ;; this optimization.
-  (mlambda (store)
-    "Return a regexp matching a file in STORE."
-    (make-regexp (string-append "^" (regexp-quote store)
-                                "/([0-9a-df-np-sv-z]{32})-([^/]+)$"))))
+(define (store-path-base path)
+  "Return the base path of a path in the store."
+  (and (string-prefix? (%store-prefix) path)
+       (let ((base (string-drop path (+ 1 (string-length (%store-prefix))))))
+         (and (> (string-length base) 33)
+              (not (string-index base #\/))
+              base))))
 
 (define (store-path-package-name path)
   "Return the package name part of PATH, a file name in the store."
-  (let ((path-rx (store-regexp* (%store-prefix))))
-    (and=> (regexp-exec path-rx path)
-           (cut match:substring <> 2))))
+  (let ((base (store-path-base path)))
+    (string-drop base (+ 32 1)))) ;32 hash part + 1 hyphen
 
 (define (store-path-hash-part path)
   "Return the hash part of PATH as a base32 string, or #f if PATH is not a
 syntactically valid store path."
-  (and (string-prefix? (%store-prefix) path)
-       (let ((base (string-drop path (+ 1 (string-length (%store-prefix))))))
-         (and (> (string-length base) 33)
-              (let ((hash (string-take base 32)))
-                (and (string-every %nix-base32-charset hash)
-                     hash))))))
+  (let* ((base (store-path-base path))
+         (hash (string-take base 32)))
+    (and (string-every %nix-base32-charset hash)
+         hash)))
 
 (define (derivation-log-file drv)
   "Return the build log file for DRV, a derivation file name, or #f if it
-- 
2.25.1


--2sl72hxfrvw7sfvd--




Information forwarded to bug-guix@HIDDEN:
bug#39970; Package guix. Full text available.

Message received at 39970 <at> debbugs.gnu.org:


Received: (at 39970) by debbugs.gnu.org; 12 Mar 2020 16:05:35 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Thu Mar 12 12:05:35 2020
Received: from localhost ([127.0.0.1]:57731 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1jCQL1-0007Li-Bx
	for submit <at> debbugs.gnu.org; Thu, 12 Mar 2020 12:05:35 -0400
Received: from eggs.gnu.org ([209.51.188.92]:39236)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <ludo@HIDDEN>) id 1jCQL0-0007LN-4z
 for 39970 <at> debbugs.gnu.org; Thu, 12 Mar 2020 12:05:34 -0400
Received: from fencepost.gnu.org ([2001:470:142:3::e]:60468)
 by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from <ludo@HIDDEN>)
 id 1jCQKu-0001YT-MO; Thu, 12 Mar 2020 12:05:28 -0400
Received: from [2001:660:6102:320:e120:2c8f:8909:cdfe] (port=49886 helo=ribbon)
 by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256)
 (Exim 4.82) (envelope-from <ludo@HIDDEN>)
 id 1jCQKu-0006Cm-1q; Thu, 12 Mar 2020 12:05:28 -0400
From: =?utf-8?Q?Ludovic_Court=C3=A8s?= <ludo@HIDDEN>
To: "pelzflorian \(Florian Pelz\)" <pelzflorian@HIDDEN>
Subject: Re: bug#39970: guix commands broken on Azerbaijani 'az_AZ' and
 Turkish 'tr_TR' locales
References: <20200307120052.ocwzphlvemvmb2ts@HIDDEN>
 <20200307152003.myj7jkjthokbmark@HIDDEN>
 <20200308070804.ylpb5yrwpgbc3p3w@HIDDEN>
 <8736ah1mxb.fsf@HIDDEN>
 <20200312110206.2hsinzejnmcefmot@HIDDEN>
X-URL: http://www.fdn.fr/~lcourtes/
X-Revolutionary-Date: 23 =?utf-8?Q?Vent=C3=B4se?= an 228 de la =?utf-8?Q?R?=
 =?utf-8?Q?=C3=A9volution?=
X-PGP-Key-ID: 0x090B11993D9AEBB5
X-PGP-Key: http://www.fdn.fr/~lcourtes/ludovic.asc
X-PGP-Fingerprint: 3CE4 6455 8A84 FDC6 9DB4  0CFB 090B 1199 3D9A EBB5
X-OS: x86_64-pc-linux-gnu
Date: Thu, 12 Mar 2020 17:05:26 +0100
In-Reply-To: <20200312110206.2hsinzejnmcefmot@HIDDEN>
 (pelzflorian@HIDDEN's message of "Thu, 12 Mar 2020 12:02:06
 +0100")
Message-ID: <874kutsgmx.fsf@HIDDEN>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic]
X-Spam-Score: -0.7 (/)
X-Debbugs-Envelope-To: 39970
Cc: 39970 <at> debbugs.gnu.org
X-Spam-Score: -1.7 (-)

Hi Florian,

"pelzflorian (Florian Pelz)" <pelzflorian@HIDDEN> skribis:

> On Mon, Mar 09, 2020 at 06:02:40PM +0100, Ludovic Courtès wrote:
>> To me it’s not a bug in Guile, but simply the fact that regexps, as
>> implemented by the C library, are locale-dependent.
>>
>
> (use-modules (ice-9 regex))
> (regexp-exec (make-regexp "^([a-z]+)$")
>              "iyiyim")
> ⇒ #f
>
> Guile’s behavior that i is not among [a-z] has been confirmed as
> unexpected by a natively Turkish friend of mine.  It is different from
> the behavior of current glibc:
>
> florian@florianmacbook ~$ cat iyiyim.c
> #include <regex.h>
> #include <stdio.h>
> #include <stdlib.h>
> #define STR "iyiyım"
> int main (int    argc,
>           char** argv)
> {

You’re seeing a different behavior because you forgot a:

  setlocale (LC_ALL, "");

call here.
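
As a rough illustration of this point (a sketch only, not part of the
original exchange): a C program that never calls setlocale runs in the
"C" locale, so its regexps do not see the Turkish locale at all.  The
helper below, with the made-up name matches_az_range, switches to the
requested locale before compiling the range; passing "" takes the
locale from the environment, like the call shown above.

```c
#include <locale.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: return 1 if S consists only of characters that
   the range [a-z] matches under LOCALE, 0 if it does not, and -1 if the
   locale or the regexp could not be set up.  */
int
matches_az_range (const char *locale, const char *s)
{
  regex_t rx;
  int result;

  /* Without this call the process stays in the "C" locale.  */
  if (setlocale (LC_ALL, locale) == NULL)
    return -1;
  if (regcomp (&rx, "^[a-z]+$", REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  result = (regexec (&rx, s, 0, NULL, 0) == 0) ? 1 : 0;
  regfree (&rx);
  return result;
}
```

Calling matches_az_range ("", "iyiyim") with LANG=tr_TR.utf8 exercises
the locale-dependent path; what it returns there depends on the
installed libc and its locale data.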

>> The patch you proposed looks good to me, though perhaps we could
>> explicitly list all the alphabet in the regexp?
>>
>> A better option is to reimplement ‘store-path-package-name’ in a way
>> similar to ‘store-path-hash-part’, as in commit
>> 35eb77b09d957019b2437e7681bd88013d67d3cd.
>
> I suppose it would be better to cache the compiled regexp.  What is
> this mcached syntax inside (guix store)?  Or do I use Scheme’s 'delay'
> and 'force' for caching?

I lean towards avoiding regexps altogether, as I wrote above.

WDYT?

> The attached patch fixes the regexp.  Shall I push the attached patch
> and then try making it cache the compiled regexp or do you still
> prefer an implementation without regexps?  Why would not using a
> regexp be better?

It reduces reliance on libc, reduces complexity, and performs better as
noted in the commit log of 35eb77b09d957019b2437e7681bd88013d67d3cd.
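
To make the regexp-free idea concrete, here is a minimal sketch,
written in C rather than Scheme and not taken from that commit.  It
splits a store file name using plain string operations only, so no
locale-dependent range is involved.  The names NIX_BASE32,
store_path_base and store_path_package_name are chosen for this
example, and requiring the hyphen after the hash is an assumption
carried over from the old regexp.

```c
#include <stddef.h>
#include <string.h>

/* Alphabet as enumerated in this thread: digits plus lowercase ASCII
   letters without e, o, t and u.  */
static const char NIX_BASE32[] = "0123456789abcdfghijklmnpqrsvwxyz";

/* Return the part of PATH after "STORE/", or NULL if PATH is not a
   direct entry of STORE long enough to hold hash, hyphen and name.  */
static const char *
store_path_base (const char *store, const char *path)
{
  size_t n = strlen (store);
  const char *base;

  if (strncmp (path, store, n) != 0 || path[n] != '/')
    return NULL;
  base = path + n + 1;
  if (strlen (base) <= 33 || strchr (base, '/') != NULL)
    return NULL;
  return base;
}

/* Return the package name part of PATH, or NULL if PATH does not look
   like STORE/<32 hash characters>-<name>.  */
const char *
store_path_package_name (const char *store, const char *path)
{
  const char *base = store_path_base (store, path);
  size_t i;

  if (base == NULL)
    return NULL;
  for (i = 0; i < 32; i++)
    if (strchr (NIX_BASE32, base[i]) == NULL)
      return NULL;
  /* Assumption: insist on the hyphen, as the old regexp did.  */
  if (base[32] != '-')
    return NULL;
  return base + 33;  /* skip 32 hash characters and 1 hyphen */
}
```

Returning NULL for anything that is not a well-formed store file name
lets the caller handle the failure case explicitly instead of passing a
non-string on to later string operations.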

Thanks,
Ludo’.




Information forwarded to bug-guix@HIDDEN:
bug#39970; Package guix. Full text available.

Message received at 39970 <at> debbugs.gnu.org:


Received: (at 39970) by debbugs.gnu.org; 12 Mar 2020 11:02:12 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Thu Mar 12 07:02:12 2020
Received: from localhost ([127.0.0.1]:56026 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1jCLbQ-0004ld-7R
	for submit <at> debbugs.gnu.org; Thu, 12 Mar 2020 07:02:12 -0400
Received: from pelzflorian.de ([5.45.111.108]:38946 helo=mail.pelzflorian.de)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <pelzflorian@HIDDEN>) id 1jCLbN-0004lT-AH
 for 39970 <at> debbugs.gnu.org; Thu, 12 Mar 2020 07:02:10 -0400
Received: from pelzflorian.localdomain (unknown [5.45.111.108])
 by mail.pelzflorian.de (Postfix) with ESMTPSA id 9E70436055C;
 Thu, 12 Mar 2020 12:02:07 +0100 (CET)
DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=pelzflorian.de;
 s=mail; t=1584010927;
 bh=AANki3xAx4PXrO6y9Mh4j/xrDg8K+0BCPVEVHgXefPM=;
 h=Date:From:To:Cc:Subject:References:In-Reply-To;
 b=bcjRaZB9HOT37d3pXA3fe7cwv7ITMAhWwTIyfkpOifroB94rWUS91kjYKVozCpZSE
 GjZcHr4ffH4n/pE7pmlzw70yQwgNQ0/oNOuT5yl0ybdNMbEA4lVDAVnMlvm2d+wrYD
 Rz9Ntw8MHDHXfVygdz5EIrJdk7rR+1wGmvfjfjvQ=
Date: Thu, 12 Mar 2020 12:02:06 +0100
From: "pelzflorian (Florian Pelz)" <pelzflorian@HIDDEN>
To: Ludovic =?utf-8?Q?Court=C3=A8s?= <ludo@HIDDEN>
Subject: Re: bug#39970: guix commands broken on Azerbaijani 'az_AZ' and
 Turkish 'tr_TR' locales
Message-ID: <20200312110206.2hsinzejnmcefmot@HIDDEN>
References: <20200307120052.ocwzphlvemvmb2ts@HIDDEN>
 <20200307152003.myj7jkjthokbmark@HIDDEN>
 <20200308070804.ylpb5yrwpgbc3p3w@HIDDEN>
 <8736ah1mxb.fsf@HIDDEN>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="n32ce3wcv3t3334v"
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <8736ah1mxb.fsf@HIDDEN>
X-Spam-Score: 0.0 (/)
X-Debbugs-Envelope-To: 39970
Cc: 39970 <at> debbugs.gnu.org
X-Spam-Score: -1.0 (-)


--n32ce3wcv3t3334v
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit

On Mon, Mar 09, 2020 at 06:02:40PM +0100, Ludovic Courtès wrote:
> To me it’s not a bug in Guile, but simply the fact that regexps, as
> implemented by the C library, are locale-dependent.
> 

(use-modules (ice-9 regex))
(regexp-exec (make-regexp "^([a-z]+)$")
             "iyiyim")
⇒ #f

Guile’s behavior that i is not among [a-z] has been confirmed as
unexpected by a natively Turkish friend of mine.  It is different from
the behavior of current glibc:

florian@florianmacbook ~$ cat iyiyim.c
#include <regex.h>
#include <stdio.h>
#include <stdlib.h>
#define STR "iyiyım"
int main (int    argc,
          char** argv)
{
  regex_t only_letters;
  int r = regcomp (&only_letters, "[a-z]+", REG_EXTENDED);
  if (r != 0)
    printf ("This error does not happen.\n");
  r = regexec (&only_letters, STR, 1, malloc (sizeof (regmatch_t)), 0);
  if (r == 0)
    printf ("The string " STR " matched!\n");
  else
    printf ("No match for " STR ".\n");
}
florian@florianmacbook ~$ gcc -o iyiyim iyiyim.c
florian@florianmacbook ~$ LANG=tr_TR.utf8 ./iyiyim 
The string iyiyım matched!

Apparently Guile uses a bundled regular expression library rather than
glibc.  I can try making Guile use a newer GNUlib for its regular
expressions, maybe that helps.  Shall I file a separate bug for Guile?

> The patch you proposed looks good to me, though perhaps we could
> explicitly list all the alphabet in the regexp?
> 
> A better option is to reimplement ‘store-path-package-name’ in a way
> similar to ‘store-path-hash-part’, as in commit
> 35eb77b09d957019b2437e7681bd88013d67d3cd.

I suppose it would be better to cache the compiled regexp.  What is
this mcached syntax inside (guix store)?  Or do I use Scheme’s 'delay'
and 'force' for caching?

The attached patch fixes the regexp.  Shall I push the attached patch
and then try making it cache the compiled regexp or do you still
prefer an implementation without regexps?  Why would not using a
regexp be better?

Regards,
Florian

--n32ce3wcv3t3334v
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment;
	filename="0001-store-Fix-many-guix-commands-failing-on-some-locales.patch"

From: Florian Pelz <pelzflorian@HIDDEN>
Date: Thu, 12 Mar 2020 11:08:16 +0100
Subject: [PATCH] store: Fix many guix commands failing on some locales.

Fixes bug #39970 (see: https://bugs.gnu.org/39970).

At least 'guix environment', 'guix install' and 'guix pull'
on 'az_AZ.utf8' and 'tr_TR.utf8' are affected.

* guix/store.scm (store-regexp*): Avoid dependence on locale.
---
 guix/store.scm | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/guix/store.scm b/guix/store.scm
index f99fa581a8..82d7403bb6 100644
--- a/guix/store.scm
+++ b/guix/store.scm
@@ -1949,7 +1949,8 @@ valid inputs."
   (mlambda (store)
     "Return a regexp matching a file in STORE."
     (make-regexp (string-append "^" (regexp-quote store)
-                                "/([0-9a-df-np-sv-z]{32})-([^/]+)$"))))
+                                "\
+/([0-9abcdfghijklmnpqrsvwxyz]{32})-([^/]+)$"))))
 
 (define (store-path-package-name path)
   "Return the package name part of PATH, a file name in the store."
-- 
2.25.1


--n32ce3wcv3t3334v--




Information forwarded to bug-guix@HIDDEN:
bug#39970; Package guix. Full text available.

Message received at 39970 <at> debbugs.gnu.org:


Received: (at 39970) by debbugs.gnu.org; 9 Mar 2020 17:02:51 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Mon Mar 09 13:02:51 2020
Received: from localhost ([127.0.0.1]:51406 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1jBLnn-0002U5-0g
	for submit <at> debbugs.gnu.org; Mon, 09 Mar 2020 13:02:51 -0400
Received: from eggs.gnu.org ([209.51.188.92]:40113)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <ludo@HIDDEN>) id 1jBLnl-0002Tq-DK
 for 39970 <at> debbugs.gnu.org; Mon, 09 Mar 2020 13:02:49 -0400
Received: from fencepost.gnu.org ([2001:470:142:3::e]:46764)
 by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from <ludo@HIDDEN>)
 id 1jBLng-00075B-2X; Mon, 09 Mar 2020 13:02:44 -0400
Received: from [2001:660:6102:320:e120:2c8f:8909:cdfe] (port=56274 helo=ribbon)
 by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256)
 (Exim 4.82) (envelope-from <ludo@HIDDEN>)
 id 1jBLnf-0000Ws-Cx; Mon, 09 Mar 2020 13:02:43 -0400
From: =?utf-8?Q?Ludovic_Court=C3=A8s?= <ludo@HIDDEN>
To: "pelzflorian \(Florian Pelz\)" <pelzflorian@HIDDEN>
Subject: Re: bug#39970: guix commands broken on Azerbaijani 'az_AZ' and
 Turkish 'tr_TR' locales
References: <20200307120052.ocwzphlvemvmb2ts@HIDDEN>
 <20200307152003.myj7jkjthokbmark@HIDDEN>
 <20200308070804.ylpb5yrwpgbc3p3w@HIDDEN>
X-URL: http://www.fdn.fr/~lcourtes/
X-Revolutionary-Date: 20 =?utf-8?Q?Vent=C3=B4se?= an 228 de la =?utf-8?Q?R?=
 =?utf-8?Q?=C3=A9volution?=
X-PGP-Key-ID: 0x090B11993D9AEBB5
X-PGP-Key: http://www.fdn.fr/~lcourtes/ludovic.asc
X-PGP-Fingerprint: 3CE4 6455 8A84 FDC6 9DB4  0CFB 090B 1199 3D9A EBB5
X-OS: x86_64-pc-linux-gnu
Date: Mon, 09 Mar 2020 18:02:40 +0100
In-Reply-To: <20200308070804.ylpb5yrwpgbc3p3w@HIDDEN>
 (pelzflorian@HIDDEN's message of "Sun, 8 Mar 2020 08:08:04
 +0100")
Message-ID: <8736ah1mxb.fsf@HIDDEN>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic]
X-Spam-Score: -0.7 (/)
X-Debbugs-Envelope-To: 39970
Cc: 39970 <at> debbugs.gnu.org
X-Spam-Score: -1.7 (-)

Hi Florian,

"pelzflorian (Florian Pelz)" <pelzflorian@HIDDEN> skribis:

> This seems similar to <https://bugs.gnu.org/35785>.

Yes, same story.

> I think enumerating all characters explicitly is a similar fix,
> whether or not there is a bug in Guile.

To me it’s not a bug in Guile, but simply the fact that regexps, as
implemented by the C library, are locale-dependent.

The patch you proposed looks good to me, though perhaps we could
explicitly list all the alphabet in the regexp?

A better option is to reimplement ‘store-path-package-name’ in a way
similar to ‘store-path-hash-part’, as in commit
35eb77b09d957019b2437e7681bd88013d67d3cd.

Thoughts?

Ludo’.




Information forwarded to bug-guix@HIDDEN:
bug#39970; Package guix. Full text available.

Message received at 39970 <at> debbugs.gnu.org:


Received: (at 39970) by debbugs.gnu.org; 8 Mar 2020 07:08:13 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Sun Mar 08 03:08:13 2020
Received: from localhost ([127.0.0.1]:47843 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1jAq2l-0004Ys-M2
	for submit <at> debbugs.gnu.org; Sun, 08 Mar 2020 03:08:12 -0400
Received: from pelzflorian.de ([5.45.111.108]:60416 helo=mail.pelzflorian.de)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <pelzflorian@HIDDEN>) id 1jAq2h-0004Yh-K4
 for 39970 <at> debbugs.gnu.org; Sun, 08 Mar 2020 03:08:08 -0400
Received: from pelzflorian.localdomain (unknown [5.45.111.108])
 by mail.pelzflorian.de (Postfix) with ESMTPSA id D7B393604F7
 for <39970 <at> debbugs.gnu.org>; Sun,  8 Mar 2020 08:08:05 +0100 (CET)
DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=pelzflorian.de;
 s=mail; t=1583651285;
 bh=H8wLe+1JHgwFsr5xYovoRoQ+aLGhou+WW6jlY2STzyk=;
 h=Date:From:To:Subject:References:In-Reply-To;
 b=tdGtzU34p0QiUvjmrjI80kbccdKcnEti0vCVBf0DpXc91zUUpaOD67+7bT7byTpqk
 yuyRClWE9X7e6J+WbpqbmbxxxFHEQWmZ+gBqsWPQwi3DpXLUD6Z3JOtBYRskyHnzEc
 wMuUMiijyC4AwZ4rphNv872mNhMmk0qxutS9GbjM=
Date: Sun, 8 Mar 2020 08:08:04 +0100
From: "pelzflorian (Florian Pelz)" <pelzflorian@HIDDEN>
To: 39970 <at> debbugs.gnu.org
Subject: Re: bug#39970: guix commands broken on Azerbaijani 'az_AZ' and
 Turkish 'tr_TR' locales
Message-ID: <20200308070804.ylpb5yrwpgbc3p3w@HIDDEN>
References: <20200307120052.ocwzphlvemvmb2ts@HIDDEN>
 <20200307152003.myj7jkjthokbmark@HIDDEN>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20200307152003.myj7jkjthokbmark@HIDDEN>
X-Spam-Score: 0.0 (/)
X-Debbugs-Envelope-To: 39970
X-Spam-Score: -1.0 (-)

This seems similar to <https://bugs.gnu.org/35785>.  I think
enumerating all characters explicitly is a similar fix, whether or not
there is a bug in Guile.

Regards,
Florian




Information forwarded to bug-guix@HIDDEN:
bug#39970; Package guix. Full text available.

Message received at 39970 <at> debbugs.gnu.org:


Received: (at 39970) by debbugs.gnu.org; 7 Mar 2020 15:20:08 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Sat Mar 07 10:20:08 2020
Received: from localhost ([127.0.0.1]:47389 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1jAbFH-0007eJ-NG
	for submit <at> debbugs.gnu.org; Sat, 07 Mar 2020 10:20:07 -0500
Received: from pelzflorian.de ([5.45.111.108]:59254 helo=mail.pelzflorian.de)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <pelzflorian@HIDDEN>) id 1jAbFG-0007eA-9Q
 for 39970 <at> debbugs.gnu.org; Sat, 07 Mar 2020 10:20:07 -0500
Received: from pelzflorian.localdomain (unknown [5.45.111.108])
 by mail.pelzflorian.de (Postfix) with ESMTPSA id D844A3604F7
 for <39970 <at> debbugs.gnu.org>; Sat,  7 Mar 2020 16:20:04 +0100 (CET)
DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=pelzflorian.de;
 s=mail; t=1583594404;
 bh=gsrcqTx00cvuN+3NQyTQ4jOfeqHmeAVV0iEcKHbpTXY=;
 h=Date:From:To:Subject:References:In-Reply-To;
 b=aR3ZZKVVSFQp0XKfQ3OR6jzNJMifbmY+LVVLaFgZcELMB9ZkiYN/InKMe5UAOeEZs
 2HcwhP0t74ySv+cYTOtuFLfriWx2J+FaETOU1/Fgob4jPFRljv7pNqY1g5MRxeuOmr
 f7WR5f/DhyY+L3Uqy0OnWWyqwIH7vSUw6Ayewa+0=
Date: Sat, 7 Mar 2020 16:20:03 +0100
From: "pelzflorian (Florian Pelz)" <pelzflorian@HIDDEN>
To: 39970 <at> debbugs.gnu.org
Subject: Re: bug#39970: guix commands broken on Azerbaijani 'az_AZ' and
 Turkish 'tr_TR' locales
Message-ID: <20200307152003.myj7jkjthokbmark@HIDDEN>
References: <20200307120052.ocwzphlvemvmb2ts@HIDDEN>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <20200307120052.ocwzphlvemvmb2ts@HIDDEN>
X-Spam-Score: 0.0 (/)
X-Debbugs-Envelope-To: 39970
X-Spam-Score: -1.0 (-)

On Sat, Mar 07, 2020 at 01:00:52PM +0100, pelzflorian (Florian Pelz) wrote:
> Running guix via ./pre-inst-env gives a more useful backtrace.  The
> reason is that in guix/store.scm
> 
> (use-modules (ice-9 regex))
> (regexp-exec (make-regexp "^/gnu/store/([0-9a-df-np-sv-z]{32})-([^/]+)$")
>              "/gnu/store/bv9py3f2dsa5iw0aijqjv9zxwprcy1nb-fontconfig-2.13.1.drv")
> 
> evaluates to #f in Turkish, possibly because of the presence of
> dotless i (ı) in the range.
> 

Actually it seems the issue is that i is missing from the range [a-z].
ı and ğ are missing as well, as are non-Turkish letters like ä that
are included when using the en_US.utf8 locale, even though they are
not English letters either.

(use-modules (ice-9 regex))
(regexp-exec (make-regexp "^([a-z]+)$")
             "iyiyim")

fails.

But running a glibc C program

florian@florianmacbook ~$ cat iyiyim.c
#include <regex.h>
#include <stdio.h>
#define STR "iyiyim"
int main (int    argc,
          char** argv)
{
  regex_t only_letters;
  int r = regcomp (&only_letters, "[a-z]", 0);
  if (r != 0)
    printf ("This error does not happen.\n");
  r = regexec (&only_letters, STR, 0, NULL, 0);
  if (r == 0)
    printf ("The string " STR " matched!\n");
  else
    printf ("No match for " STR ".\n");
}
florian@florianmacbook ~$ gcc -o iyiyim iyiyim.c 
florian@florianmacbook ~$ LANG=tr_TR.utf8 ./iyiyim 
The string iyiyim matched!

succeeds on tr_TR.utf8 and en_US.utf8 locales (and a native Turkish
speaker confirmed to me that ı and i should be in the alphabet right
after h).  Maybe this is a bug in Guile, somehow?

> […]
> I wonder what else is affected; the installer maybe?  I have not
> tested yet.
>

I checked; the graphical installer appears unaffected, but the issue
appears on the installed system.

Regards,
Florian




Information forwarded to bug-guix@HIDDEN:
bug#39970; Package guix. Full text available.

Message received at submit <at> debbugs.gnu.org:


Received: (at submit) by debbugs.gnu.org; 7 Mar 2020 12:01:03 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Sat Mar 07 07:01:03 2020
Received: from localhost ([127.0.0.1]:46326 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1jAY8d-00076x-Au
	for submit <at> debbugs.gnu.org; Sat, 07 Mar 2020 07:01:03 -0500
Received: from lists.gnu.org ([209.51.188.17]:51469)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <pelzflorian@HIDDEN>) id 1jAY8b-00076e-LQ
 for submit <at> debbugs.gnu.org; Sat, 07 Mar 2020 07:01:02 -0500
Received: from eggs.gnu.org ([2001:470:142:3::10]:54923)
 by lists.gnu.org with esmtp (Exim 4.90_1)
 (envelope-from <pelzflorian@HIDDEN>) id 1jAY8Y-0005KP-5q
 for bug-guix@HIDDEN; Sat, 07 Mar 2020 07:01:00 -0500
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org
X-Spam-Level: 
X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50,URIBL_BLOCKED
 autolearn=disabled version=3.3.2
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from <pelzflorian@HIDDEN>) id 1jAY8X-0001G5-17
 for bug-guix@HIDDEN; Sat, 07 Mar 2020 07:00:58 -0500
Received: from pelzflorian.de ([5.45.111.108]:38970 helo=mail.pelzflorian.de)
 by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32)
 (Exim 4.71) (envelope-from <pelzflorian@HIDDEN>)
 id 1jAY8W-0001Bf-G9
 for bug-guix@HIDDEN; Sat, 07 Mar 2020 07:00:56 -0500
Received: from pelzflorian.localdomain (unknown [5.45.111.108])
 by mail.pelzflorian.de (Postfix) with ESMTPSA id A388A3604F7
 for <bug-guix@HIDDEN>; Sat,  7 Mar 2020 13:00:53 +0100 (CET)
DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=pelzflorian.de;
 s=mail; t=1583582453;
 bh=ZZun/EedyaYOGn1SbwQJGv0StjxN5hjB1YzmJSzwN+Y=;
 h=Date:From:To:Subject;
 b=zqJFLa89uJhi1xo68JZj4DNwZnk89W5Lya7qX9jkpwpDDWr7vf3vKkavuECD15JaU
 nKxJ8Rqpw6Ga7OXzqG9BRXjLn+Osvpdj7H9QxX9sDRyrqaogerINeXzIlVjCNr5kdX
 QffRTL1k7WN9bB4NrG+usmqv9WuNVugZNAWyMY+Y=
Date: Sat, 7 Mar 2020 13:00:52 +0100
From: "pelzflorian (Florian Pelz)" <pelzflorian@HIDDEN>
To: bug-guix@HIDDEN
Subject: guix commands broken on Azerbaijani 'az_AZ' and Turkish 'tr_TR'
 locales
Message-ID: <20200307120052.ocwzphlvemvmb2ts@HIDDEN>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="n7jv6se5e47ytj2s"
Content-Disposition: inline
Content-Transfer-Encoding: 7bit
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic]
 [fuzzy]
X-Received-From: 5.45.111.108
X-Spam-Score: 0.2 (/)
X-Debbugs-Envelope-To: submit
X-Spam-Score: -0.8 (/)


--n7jv6se5e47ytj2s
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

After running

export LC_ALL=tr_TR.utf8

many important Guix commands like 'guix environment', 'guix install'
and 'guix pull' fail.

$ guix environment --ad-hoc hello
Backtrace:
           1 (primitive-load "/home/florian/.config/guix/current/bin…")
In guix/ui.scm:
  1826:12  0 (run-guix-command _ . _)

guix/ui.scm:1826:12: In procedure run-guix-command:
In procedure string-length: Wrong type argument in position 1 (expecting string): #f


Running guix via ./pre-inst-env gives a more useful backtrace.  The
reason is that in guix/store.scm

(use-modules (ice-9 regex))
(regexp-exec (make-regexp "^/gnu/store/([0-9a-df-np-sv-z]{32})-([^/]+)$")
             "/gnu/store/bv9py3f2dsa5iw0aijqjv9zxwprcy1nb-fontconfig-2.13.1.drv")

evaluates to #f in Turkish, possibly because of the presence of
dotless i (ı) in the range.

The attached patch fixes the issue by including i explicitly, but I
believe enumerating all of [0-9abcdfghijklmnpqrsvwxyz] explicitly
might be more future-proof.

Shall I push the patch modified to list all letters in
[0-9abcdfghijklmnpqrsvwxyz] explicitly?  Numbers too?  I suppose there
is no downside to listing all without ranges.
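
For illustration, a small C sketch of the fully enumerated variant (the
helper name is_store_file_name and the hard-coded /gnu/store prefix are
assumptions for this example; this is not the attached patch).  With
every digit and letter spelled out, the bracket expression contains no
range, so it should not depend on the collation order of the locale:

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: return 1 if PATH looks like
   /gnu/store/<32 hash characters>-<name>, 0 if it does not, and -1 if
   the regexp could not be compiled.  */
int
is_store_file_name (const char *path)
{
  /* Every allowed hash character is listed explicitly; no ranges.  */
  static const char pattern[] =
    "^/gnu/store/([0123456789abcdfghijklmnpqrsvwxyz]{32})-([^/]+)$";
  regex_t rx;
  int matched;

  if (regcomp (&rx, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  matched = (regexec (&rx, path, 0, NULL, 0) == 0);
  regfree (&rx);
  return matched;
}
```

The sketch compiles the pattern on every call for brevity; code that
matches many paths would want to compile it once and reuse it.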

I wonder what else is affected; the installer maybe?  I have not
tested yet.

Regards,
Florian

--n7jv6se5e47ytj2s
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment;
	filename="0001-store-Fix-many-guix-commands-failing-on-some-locales.patch"

From 4445284e9fd40b3e271fa7b511d2856c03c8ccfb Mon Sep 17 00:00:00 2001
From: Florian Pelz <pelzflorian@HIDDEN>
Date: Sat, 7 Mar 2020 11:38:59 +0100
Subject: [PATCH] store: Fix many guix commands failing on some locales.

At least 'guix environment', 'guix install' and 'guix pull'
on 'az_AZ.utf8' and 'tr_TR.utf8' are affected.

* guix/store.scm (store-regexp*): Avoid dependence on locale.
---
 guix/store.scm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/guix/store.scm b/guix/store.scm
index f99fa581a8..a1d9713c24 100644
--- a/guix/store.scm
+++ b/guix/store.scm
@@ -1949,7 +1949,7 @@ valid inputs."
   (mlambda (store)
     "Return a regexp matching a file in STORE."
     (make-regexp (string-append "^" (regexp-quote store)
-                                "/([0-9a-df-np-sv-z]{32})-([^/]+)$"))))
+                                "/([0-9a-df-hij-np-sv-z]{32})-([^/]+)$"))))
 
 (define (store-path-package-name path)
   "Return the package name part of PATH, a file name in the store."
-- 
2.25.0


--n7jv6se5e47ytj2s--




Acknowledgement sent to "pelzflorian (Florian Pelz)" <pelzflorian@HIDDEN>:
New bug report received and forwarded. Copy sent to bug-guix@HIDDEN. Full text available.
Report forwarded to bug-guix@HIDDEN:
bug#39970; Package guix. Full text available.
Last modified: Wed, 5 May 2021 15:15:01 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.