GNU bug report logs - #20822
environment mangled by locale

Package: guile; Reported by: Zefram <zefram@HIDDEN>; dated Tue, 16 Jun 2015 04:18:02 UTC; Maintainer for guile is bug-guile@HIDDEN.

Message received at 20822 <at> debbugs.gnu.org:


Received: (at 20822) by debbugs.gnu.org; 26 Jun 2016 10:33:55 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Sun Jun 26 06:33:55 2016
Received: from localhost ([127.0.0.1]:56098 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1bH7Nz-00013E-Lb
	for submit <at> debbugs.gnu.org; Sun, 26 Jun 2016 06:33:55 -0400
Received: from river.fysh.org ([87.98.248.19]:36101 ident=Debian-exim)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <zefram@HIDDEN>) id 1bH7Nx-000135-RB
 for 20822 <at> debbugs.gnu.org; Sun, 26 Jun 2016 06:33:54 -0400
Received: from zefram by river.fysh.org with local (Exim 4.84_2 #1 (Debian))
 id 1bH7Nt-0006ec-Vt; Sun, 26 Jun 2016 11:33:49 +0100
Date: Sun, 26 Jun 2016 11:33:49 +0100
From: Zefram <zefram@HIDDEN>
To: Mark H Weaver <mhw@HIDDEN>
Subject: Re: bug#20822: environment mangled by locale
Message-ID: <20160626103349.GK1170@HIDDEN>
References: <20150616041736.GA2718@HIDDEN> <87eg7njfhk.fsf@HIDDEN>
 <87wplcpxev.fsf@HIDDEN>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <87wplcpxev.fsf@HIDDEN>
X-Spam-Score: -1.4 (-)
X-Debbugs-Envelope-To: 20822
Cc: Andy Wingo <wingo@HIDDEN>, 20822 <at> debbugs.gnu.org, ludo@HIDDEN
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -1.4 (-)

Mark H Weaver wrote:
>                                           by convention they are
>supposed to be encoded in the locale encoding.

This convention is bunk.  The encoding aspect of the locale system is
fundamentally broken: the model is that every string in the universe
(every file content, filename, command line argument, etc.) is encoded
in the same way, and the locale environment variable tells you which
universe you're in.  But in the real universe, files, filenames, and so
on turn up encoded how their authors liked to encode them, and that's
not always the same.  In the real universe we have to cope with data
that is not encoded in our preferred way.

>                                             If that convention is
>violated, I don't see what a program could do about it.

If the convention is violated, then there is some difficulty in presenting
correctly-encoded (or even consistently-encoded) output to the user, but
it is not insuperable.  Perhaps the program knows by some non-locale means
how a string is encoded, and can explicitly convert.  Perhaps it doesn't
know the real encoding, but can trust that the user will understand the
octet string if it is passed through with neither decoding of input nor
encoding for output.  Or perhaps the program doesn't need to put the
string into textual output at all, but only to use it in some API or file
format that's expecting an encodingless octet string.

So there are many things a program can reasonably do about it, and which
one to do depends on the application.
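
As a rough illustration of the first two options, here is a minimal Python
sketch (Python is used only for illustration, it is not Guile code, and the
sample octets are made up). It contrasts a lossy decode, as would happen under
a UTF-8 locale, with explicit conversion and with octet-preserving
pass-through.

```python
# Octets that are valid ISO-8859-1 but not valid UTF-8 (made-up sample).
raw = b"caf\xe9"

# Locale-style decoding in a UTF-8 locale, substituting U+FFFD for bad input.
mangled = raw.decode("utf-8", errors="replace")
lossy_roundtrip = mangled.encode("utf-8")

# Option 1: the program knows the real encoding by some non-locale means.
text = raw.decode("iso-8859-1")

# Option 2: pass the octets through untouched. ISO-8859-1 maps every octet
# to exactly one code point, so decoding and re-encoding is lossless.
passthrough_roundtrip = raw.decode("iso-8859-1").encode("iso-8859-1")

print(lossy_roundtrip == raw)        # False: the original octet is gone
print(passthrough_roundtrip == raw)  # True: octets preserved
```

The third option (handing the octets straight to an API) needs no decoding
step at all, so there is nothing to get wrong.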

>Can someone show me a realistic example of how this would be used in
>practice?

Looking specifically at environment variables: an environment
variable could give the name of a file that is to be consulted under
specified circumstances, and the right file may happen to have a name
that is inconsistent with the encoding used by the user's terminal.
(The filename is not required for output; it only needs to be passed as
an uninterpreted octet string to the open(2) syscall.)  An environment
variable could specify a Unicode-using name of a language module to be
loaded, while the user doesn't otherwise use Unicode, or doesn't use
an encoding encompassing enough of it.  (Name not required on output,
again; will be either transformed into a filename or looked up in a file
format that specifies its own encoding.)  The program could be env(1), not
interpreting the environment but needing to output the octets correctly.
The program could be saving an uninterpreted environment, for a cron
job to later run some other program with equivalent settings.
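
A hedged Python sketch of two of these cases follows: opening a file named by
an environment variable, and saving the environment uninterpreted. The helper
names `open_from_env` and `dump_environment` are hypothetical, and
`os.environb` is only available where Python supports a bytes environment
(typically Unix). This is an illustration of the idea, not a proposal for
Guile's interface.

```python
import os


def open_from_env(name: bytes) -> int:
    """Open, read-only, the file named by environment variable `name`.

    The value is taken from os.environb as raw bytes and handed to
    os.open() unchanged, so no locale decoding is involved.
    Raises KeyError if the variable is unset.
    """
    path = os.environb[name]
    return os.open(path, os.O_RDONLY)


def dump_environment() -> bytes:
    """Serialize the environment as NUL-terminated NAME=value records.

    The octets are copied as-is, so a later job can restore exactly the
    same environment whatever encodings the values happen to use.
    """
    return b"".join(k + b"=" + v + b"\0" for k, v in os.environb.items())
```

Neither helper ever produces a character string, which is the point: the
program does not need to know the encoding to do its job.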

-zefram




Information forwarded to bug-guile@HIDDEN:
bug#20822; Package guile. Full text available.

Message received at 20822 <at> debbugs.gnu.org:


Received: (at 20822) by debbugs.gnu.org; 26 Jun 2016 01:11:07 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Sat Jun 25 21:11:07 2016
Received: from localhost ([127.0.0.1]:55863 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1bGybL-0002I4-LT
	for submit <at> debbugs.gnu.org; Sat, 25 Jun 2016 21:11:07 -0400
Received: from world.peace.net ([50.252.239.5]:58669)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <mhw@HIDDEN>) id 1bGybK-0002HZ-Ai
 for 20822 <at> debbugs.gnu.org; Sat, 25 Jun 2016 21:11:06 -0400
Received: from pool-71-174-35-80.bstnma.east.verizon.net ([71.174.35.80]
 helo=jojen)
 by world.peace.net with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128)
 (Exim 4.84_2) (envelope-from <mhw@HIDDEN>)
 id 1bGybD-0001Ko-RP; Sat, 25 Jun 2016 21:10:59 -0400
From: Mark H Weaver <mhw@HIDDEN>
To: Andy Wingo <wingo@HIDDEN>
Subject: Re: bug#20822: environment mangled by locale
References: <20150616041736.GA2718@HIDDEN> <87eg7njfhk.fsf@HIDDEN>
Date: Sat, 25 Jun 2016 21:10:48 -0400
In-Reply-To: <87eg7njfhk.fsf@HIDDEN> (Andy Wingo's message of "Fri, 24 Jun
 2016 07:57:43 +0200")
Message-ID: <87wplcpxev.fsf@HIDDEN>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.0.95 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-Spam-Score: 0.0 (/)
X-Debbugs-Envelope-To: 20822
Cc: 20822 <at> debbugs.gnu.org, Zefram <zefram@HIDDEN>, ludo@HIDDEN
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
X-Spam-Score: 0.0 (/)

Andy Wingo <wingo@HIDDEN> writes:

> On Tue 16 Jun 2015 06:17, Zefram <zefram@HIDDEN> writes:
>
>> When guile-2.0 is asked to read environment variables, via getenv,
>> it always decodes the underlying octet string according to the current
>> locale's nominal character encoding.  This is a problem, because the
>> environment variable's value is not necessarily encoded that way, and
>> may not even be an encoding of a character string at all.  The decoding
>> is lossy, where the octet string isn't consistent with the character
>> encoding, so the original octet string cannot be recovered from the
>> mangled form.  I don't see any Scheme interface that retrieves the
>> environment without locale decoding.
>
> Options:
>
>   Add optional "encoding" arg to scm_getenv; encoding is a string
>
>   Add alternate getenv interface that returns a bytevector
>
> We'll have to do the same for setenv too, I think.
>
> I think I would go with adding an encoding argument to getenv.  WDYT
> Mark and Ludovic?

I just don't see how this could be used sanely in actual practice.
These things are conceptually strings, and by convention they are
supposed to be encoded in the locale encoding.  If that convention is
violated, I don't see what a program could do about it.

Can someone show me a realistic example of how this would be used in
practice?

      Mark




Information forwarded to bug-guile@HIDDEN:
bug#20822; Package guile. Full text available.

Message received at 20822 <at> debbugs.gnu.org:


Received: (at 20822) by debbugs.gnu.org; 24 Jun 2016 05:57:57 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Fri Jun 24 01:57:56 2016
Received: from localhost ([127.0.0.1]:53226 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1bGK7o-0006K9-Mq
	for submit <at> debbugs.gnu.org; Fri, 24 Jun 2016 01:57:56 -0400
Received: from pb-sasl2.pobox.com ([64.147.108.67]:55740
 helo=sasl.smtp.pobox.com) by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <wingo@HIDDEN>) id 1bGK7m-0006Jz-0m
 for 20822 <at> debbugs.gnu.org; Fri, 24 Jun 2016 01:57:55 -0400
Received: from sasl.smtp.pobox.com (unknown [127.0.0.1])
 by pb-sasl2.pobox.com (Postfix) with ESMTP id 80BFB26B92;
 Fri, 24 Jun 2016 01:57:52 -0400 (EDT)
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=pobox.com; h=from:to:cc
 :subject:references:date:in-reply-to:message-id:mime-version
 :content-type; s=sasl; bh=uAvPY3snXKgih0syRmBkl3nRjqQ=; b=CwaQMz
 6ef532+YQJzwzBRAweBGUsReIvR/lqt1eGYviWb4Su8fXRdKNFy6wv0xO7XCFQk1
 YnJDSrhHTYYmHXd+AmpoxSr39ihw1ifNrwMS2wUlHf9clA0YcWwjr1uqfRyHOmym
 ccW1Dp8D3FC4zGh6JCndsp/7Dbevm89miKZ/4=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=pobox.com; h=from:to:cc
 :subject:references:date:in-reply-to:message-id:mime-version
 :content-type; q=dns; s=sasl; b=rm+tyMZzamT4e8W84Y8/u4dh2WnzTBkK
 sN7Txc6/gVgtAXVF3cJ4x7HTXHbYZHCmqW+yZfaS25L/hlgKFsMw/8La3/OcGtpu
 +VzKzVc7QBYLNoyit70RgEx1OYcqOxUxdkBK6o/I1lwfSBB6j9cdbPot4/WI30kT
 YKCffEq/7YY=
Received: from pb-sasl2.nyi.icgroup.com (unknown [127.0.0.1])
 by pb-sasl2.pobox.com (Postfix) with ESMTP id 6A07426B91;
 Fri, 24 Jun 2016 01:57:52 -0400 (EDT)
Received: from clucks (unknown [88.160.190.192])
 (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by pb-sasl2.pobox.com (Postfix) with ESMTPSA id 7E6C426B8F;
 Fri, 24 Jun 2016 01:57:51 -0400 (EDT)
From: Andy Wingo <wingo@HIDDEN>
To: Zefram <zefram@HIDDEN>
Subject: Re: bug#20822: environment mangled by locale
References: <20150616041736.GA2718@HIDDEN>
Date: Fri, 24 Jun 2016 07:57:43 +0200
In-Reply-To: <20150616041736.GA2718@HIDDEN> (zefram@HIDDEN's message of
 "Tue, 16 Jun 2015 05:17:36 +0100")
Message-ID: <87eg7njfhk.fsf@HIDDEN>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-Pobox-Relay-ID: 962A85CC-39D0-11E6-899B-28A6F1301B6D-02397024!pb-sasl2.pobox.com
X-Spam-Score: -1.4 (-)
X-Debbugs-Envelope-To: 20822
Cc: 20822 <at> debbugs.gnu.org, ludo@HIDDEN, mhw@HIDDEN
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
X-Spam-Score: -1.4 (-)

On Tue 16 Jun 2015 06:17, Zefram <zefram@HIDDEN> writes:

> When guile-2.0 is asked to read environment variables, via getenv,
> it always decodes the underlying octet string according to the current
> locale's nominal character encoding.  This is a problem, because the
> environment variable's value is not necessarily encoded that way, and
> may not even be an encoding of a character string at all.  The decoding
> is lossy, where the octet string isn't consistent with the character
> encoding, so the original octet string cannot be recovered from the
> mangled form.  I don't see any Scheme interface that retrieves the
> environment without locale decoding.

Options:

  Add optional "encoding" arg to scm_getenv; encoding is a string

  Add alternate getenv interface that returns a bytevector

We'll have to do the same for setenv too, I think.

I think I would go with adding an encoding argument to getenv.  WDYT
Mark and Ludovic?
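
For comparison, the two options could look roughly like this. It is a Python
sketch of the interface shapes only: `getenv_encoded`, `getenv_bytes` and
`setenv_encoded` are hypothetical names, not Guile procedures, and
`os.environb` exists only on platforms where Python supports a bytes
environment (typically Unix).

```python
import os
from typing import Optional


def getenv_encoded(name: bytes, encoding: str) -> Optional[str]:
    """First option: decode with a caller-chosen encoding, not the locale's."""
    raw = os.environb.get(name)
    return None if raw is None else raw.decode(encoding)


def getenv_bytes(name: bytes) -> Optional[bytes]:
    """Second option: return the raw octets (the bytevector analogue)."""
    return os.environb.get(name)


def setenv_encoded(name: bytes, value: str, encoding: str) -> None:
    """Counterpart for setting: encode explicitly before storing."""
    os.environb[name] = value.encode(encoding)
```

With the first shape, passing "iso-8859-1" as the encoding gives a lossless
round trip for any octet string, which makes the second shape recoverable from
the first.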

Andy




Information forwarded to bug-guile@HIDDEN:
bug#20822; Package guile. Full text available.

Message received at 20822 <at> debbugs.gnu.org:


Received: (at 20822) by debbugs.gnu.org; 4 Mar 2016 23:22:36 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Fri Mar 04 18:22:36 2016
Received: from localhost ([127.0.0.1]:34180 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1abz3M-0005Qa-Fj
	for submit <at> debbugs.gnu.org; Fri, 04 Mar 2016 18:22:36 -0500
Received: from river.fysh.org ([87.98.248.19]:53708 ident=Debian-exim)
 by debbugs.gnu.org with esmtp (Exim 4.84)
 (envelope-from <zefram@HIDDEN>) id 1abz3K-0005QS-Mq
 for 20822 <at> debbugs.gnu.org; Fri, 04 Mar 2016 18:22:35 -0500
Received: from zefram by river.fysh.org with local (Exim 4.80 #2 (Debian))
 id 1abz3G-00042J-N7; Fri, 04 Mar 2016 23:22:30 +0000
Date: Fri, 4 Mar 2016 23:22:30 +0000
From: Zefram <zefram@HIDDEN>
To: 20822 <at> debbugs.gnu.org
Subject: Re: bug#20822: environment mangled by locale
Message-ID: <20160304232230.GA13009@HIDDEN>
References: <20150616041736.GA2718@HIDDEN>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20150616041736.GA2718@HIDDEN>
X-Spam-Score: -0.0 (/)
X-Debbugs-Envelope-To: 20822
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
X-Spam-Score: -0.0 (/)

I wrote:
>There's an obvious parallel with reading data from an input port.
>If setlocale is called, then input is by default decoded according
>to locale, including the very lossy ASCII decode for C/POSIX.  But if
>setlocale has not been called, then input is by default decoded according
>to ISO-8859-1, preserving the actual octets.  It would probably be most
>sensible that, if setlocale hasn't been called, getenv should likewise
>decode according to ISO-8859-1.  It might also be sensible to offer
>some explicit control over the encoding to be used with the environment,
>just as I/O ports have a concept of per-port selected encoding.

In the light of what I've learned recently about Guile's locale handling,
this needs some revision.  What I thought was a well-defined "setlocale
not called" state is a mirage.  The encoding of ports is not reliably
fixed at ISO-8859-1; per bug#22910 it can be affected by ostensibly
read-only calls to setlocale, and seems to be only accidentally
ISO-8859-1 until that's done.  So that's not a good model.  Due to the
GUILE_INSTALL_LOCALE mechanism, a program wanting no locale selected
can't just never call setlocale in write mode.  So setlocale not having
been called is not really available as a way to control anything.

So it would seem to be necessary to use some explicit control of character
encoding for environment access.  (This must be control of encoding
per se, not merely of which locale to use for environment access,
because, as I noted in the original report, there's no guarantee of a
locale with a suitable encoding.)  This could be an optional parameter
to the environment access functions, or a settable variable that takes
precedence over locale to determine encoding for all environment access.
The latter would match the encoding model used by ports.
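
A minimal sketch of the settable-variable model, written in Python purely for
illustration. The names `set_environment_encoding`, `effective_encoding` and
`getenv` are hypothetical, and falling back to
`locale.getpreferredencoding(False)` is my assumption about what "locale
encoding" would mean here.

```python
import locale
import os
from typing import Optional

# Hypothetical process-wide setting; None means "fall back to the locale".
_environment_encoding: Optional[str] = None


def set_environment_encoding(encoding: Optional[str]) -> None:
    """Select the encoding for all later environment access."""
    global _environment_encoding
    _environment_encoding = encoding


def effective_encoding() -> str:
    """The explicit setting takes precedence over the locale."""
    if _environment_encoding is not None:
        return _environment_encoding
    return locale.getpreferredencoding(False)


def getenv(name: bytes) -> Optional[str]:
    """Decode a variable's octets using the effective encoding."""
    raw = os.environb.get(name)
    if raw is None:
        return None
    return raw.decode(effective_encoding(), errors="replace")
```

A program that wants the octets preserved would call
`set_environment_encoding("iso-8859-1")` once at startup, independently of
whatever setlocale has or has not done.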

-zefram




Information forwarded to bug-guile@HIDDEN:
bug#20822; Package guile. Full text available.

Message received at 20822 <at> debbugs.gnu.org:


Received: (at 20822) by debbugs.gnu.org; 16 Jun 2015 20:50:25 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Tue Jun 16 16:50:24 2015
Received: from localhost ([127.0.0.1]:56339 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.80)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1Z4xoN-0005tW-Hr
	for submit <at> debbugs.gnu.org; Tue, 16 Jun 2015 16:50:24 -0400
Received: from de.cellform.com ([88.217.224.109]:55491 helo=jocasta.intra)
 by debbugs.gnu.org with esmtp (Exim 4.80)
 (envelope-from <john@HIDDEN>) id 1Z4xoJ-0005tM-P3
 for 20822 <at> debbugs.gnu.org; Tue, 16 Jun 2015 16:50:22 -0400
Received: from jocasta.intra (localhost [127.0.0.1])
 by jocasta.intra (8.14.4/8.14.4/Debian-4) with ESMTP id t5GKoILv023503
 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT);
 Tue, 16 Jun 2015 22:50:18 +0200
Received: (from john@localhost)
 by jocasta.intra (8.14.4/8.14.4/Submit) id t5GKoH0k023502;
 Tue, 16 Jun 2015 22:50:17 +0200
Date: Tue, 16 Jun 2015 22:50:17 +0200
From: John Darrington <john@HIDDEN>
To: Andreas Rottmann <mail@HIDDEN>
Subject: Re: bug#20822: environment mangled by locale
Message-ID: <20150616205017.GA23390@HIDDEN>
References: <20150616041736.GA2718@HIDDEN>
 <20150616062631.GA700@HIDDEN>
 <874mm7whcr.fsf@HIDDEN>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
 protocol="application/pgp-signature"; boundary="GvXjxJ+pjyke8COw"
Content-Disposition: inline
In-Reply-To: <874mm7whcr.fsf@HIDDEN>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Spam-Score: -0.0 (/)
X-Debbugs-Envelope-To: 20822
Cc: 20822 <at> debbugs.gnu.org, Zefram <zefram@HIDDEN>,
 John Darrington <john@HIDDEN>
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.15
X-Spam-Score: -0.0 (/)


--GvXjxJ+pjyke8COw
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Tue, Jun 16, 2015 at 10:03:48PM +0200, Andreas Rottmann wrote:
     John Darrington <john@HIDDEN> writes:

     > Can we configure this mailing list better?
     >
     > Many (all?) of the messages posted have no obvious indication of which
     > mailing list they are coming from.
     >
     > The subject line is something like "bug#12345: description"
     > The To: field is 12354 <at> debbugs.gnu.org
     >
     > In general, it takes a lot of detective work to discover that message
     > relates to guile.
     >
     No, it doesn't, there's a List-Id header in all messages sent out via
     the list:

     List-Id: "Bug reports for GUILE, GNU's Ubiquitous Extension Language" <bug-guile.gnu.org>

OK Thanks.  If that is invariant, then I'll set a rule accordingly.
Now that I know, I can do that.

     Putting an
     identifier into the subject just clutters the view for people who have
     set up their email clients appropriately for use with mailing lists.

I don't agree.  In fact, I set my email client for use with mailing lists.  That is why
I made the suggestion.  I like to know if I'm receiving personally addressed mail, or
mail via a list (without having to explicitly check the envelope and all headers).

     From the email headers of your post, it seems you use mutt; I don't know
     if mutt has built-in support for grouping based on List-Id (I'd guess
     no), but you can use a tool (MDA) like "maildrop"[1], "scmail"[2] or
     "procmail"[3] to automatically put the email you receive via mailing
     lists into different (e.g.) IMAP mailboxes.

I do know how to use my computer - I just didn't know what field this list
used to identify itself.  But thanks for reminding me anyway.

     > Can it not be configured to Prepend the Subject: line with Bug-Guile
     > or something similar?  That way it'd be easier to manage - either
     > manually or automatically.
     >
     As mentioned above, this is not a good idea.

There are a lot of email conventions which are not good ideas.  They are nevertheless
ubiquitous, and refusing to conform to them is also not a good idea.

Anyway, I'll set a rule on the List-Id field as you suggested and hopefully that'll
fix the problem.

Sorry for the noise.

J'

-- 
PGP Public key ID: 1024D/2DE827B3
fingerprint = 8797 A26D 0854 2EAB 0285  A290 8A67 719C 2DE8 27B3
See http://sks-keyservers.net or any PGP keyserver for public key.


--GvXjxJ+pjyke8COw
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: Digital signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iEYEARECAAYFAlWAjAkACgkQimdxnC3oJ7N0MgCbBLaG64kqxAMeUO4QVA8d/YJo
Vb8AnR1xLfgThm/BA9kXgmzmtHPvfSvd
=vtRu
-----END PGP SIGNATURE-----

--GvXjxJ+pjyke8COw--




Information forwarded to bug-guile@HIDDEN:
bug#20822; Package guile. Full text available.

Message received at 20822 <at> debbugs.gnu.org:


Received: (at 20822) by debbugs.gnu.org; 16 Jun 2015 20:05:50 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Tue Jun 16 16:05:50 2015
Received: from localhost ([127.0.0.1]:56306 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.80)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1Z4x7E-0004qV-Be
	for submit <at> debbugs.gnu.org; Tue, 16 Jun 2015 16:05:49 -0400
Received: from yade.xx.vu ([78.47.92.94]:43822 ident=postfix)
 by debbugs.gnu.org with esmtp (Exim 4.80)
 (envelope-from <mail@HIDDEN>) id 1Z4x5M-0004nL-Ah
 for 20822 <at> debbugs.gnu.org; Tue, 16 Jun 2015 16:03:57 -0400
Received: from delenn.home.rotty.xx.vu
 (85-127-115-72.dynamic.xdsl-line.inode.at [85.127.115.72])
 by yade.xx.vu (Postfix) with ESMTPSA id ED7D7230DA4;
 Tue, 16 Jun 2015 22:03:47 +0200 (CEST)
Received: by delenn.home.rotty.xx.vu (Postfix, from userid 1000)
 id 97B813258E5; Tue, 16 Jun 2015 22:03:48 +0200 (CEST)
From: Andreas Rottmann <mail@HIDDEN>
To: John Darrington <john@HIDDEN>
Subject: Re: bug#20822: environment mangled by locale
References: <20150616041736.GA2718@HIDDEN>
 <20150616062631.GA700@HIDDEN>
Date: Tue, 16 Jun 2015 22:03:48 +0200
In-Reply-To: <20150616062631.GA700@HIDDEN> (John Darrington's message
 of "Tue, 16 Jun 2015 08:26:32 +0200")
Message-ID: <874mm7whcr.fsf@HIDDEN>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.4 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-Spam-Score: -0.6 (/)
X-Debbugs-Envelope-To: 20822
X-Mailman-Approved-At: Tue, 16 Jun 2015 16:05:46 -0400
Cc: 20822 <at> debbugs.gnu.org, Zefram <zefram@HIDDEN>
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.15
X-Spam-Score: -0.6 (/)

John Darrington <john@HIDDEN> writes:

> Can we configure this mailing list better?
>
> Many (all?) of the messages posted have no obvious indication of which
> mailing list they are coming from.
>
> The subject line is something like "bug#12345: description"
> The To: field is 12354 <at> debbugs.gnu.org
>
> In general, it takes a lot of detective work to discover that message
> relates to guile.
>
No, it doesn't, there's a List-Id header in all messages sent out via
the list:

List-Id: "Bug reports for GUILE, GNU's Ubiquitous Extension Language" <bug-guile.gnu.org>

You should use an email client (MUA) or mail delivery agent (MDA) which
can act on that header, and group all (e.g.) bug-guile mails into their
own folder (or view, or whatever your client calls that). Putting an
identifier into the subject just clutters the view for people who have
set up their email clients appropriately for use with mailing lists.

From the email headers of your post, it seems you use mutt; I don't know
if mutt has built-in support for grouping based on List-Id (I'd guess
no), but you can use a tool (MDA) like "maildrop"[1], "scmail"[2] or
"procmail"[3] to automatically put the email you receive via mailing
lists into different (e.g.) IMAP mailboxes.

[1] http://www.courier-mta.org/maildrop/
[2] http://0xcc.net/scmail/index.html.en
[3] http://www.procmail.org/

Personally, I use scmail, as I quite like its Scheme-based configuration
file format; here's what I've done on a Debian system to set this up
(I'm hopefully not forgetting something here):

Create a .forward file containing the following single line in your
$HOME, to process incoming mail with scmail:

| /usr/bin/scmail-deliver

Then, follow the instructions on the scmail homepage[2], creating
~/.scmail/config and ~/.scmail/deliver-rules to split your incoming mail
into multiple mailboxes; I use the following rules for the Guile lists:

(add-filter-rule!
  '(list-id (#/guile-devel\.gnu\.org/i "lists/guile-devel"))
  '(list-id (#/guile-user\.gnu\.org/i "lists/guile-user"))
  '(list-id (#/bug-guile\.gnu\.org/i "lists/guile-bug")))

The exact destinations you can use (e.g. "lists/guile-devel") depend on
which program accesses your mail (IMAP server, local MUA). For a mutt
instance running on the same system, my config looks like [4]. Note that
this config is not polished at all; I use mutt on the server only as a
fallback.

[4] http://rotty.xx.vu/git/dotfiles/mutt/tree/.muttrc
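
For readers without scmail, the same List-Id matching can be sketched with
Python's standard `email` module. This is only an illustration of the idea:
`folder_for` and the "inbox" default are made-up names, the rules mirror the
three above, and actually delivering the message into the chosen folder is
left out.

```python
import re
from email import message_from_string

# (pattern, destination folder) pairs, mirroring the scmail rules.
RULES = [
    (re.compile(r"guile-devel\.gnu\.org", re.I), "lists/guile-devel"),
    (re.compile(r"guile-user\.gnu\.org", re.I), "lists/guile-user"),
    (re.compile(r"bug-guile\.gnu\.org", re.I), "lists/guile-bug"),
]


def folder_for(raw_message: str, default: str = "inbox") -> str:
    """Pick a folder from the List-Id header, or `default` if none matches."""
    msg = message_from_string(raw_message)
    list_id = msg.get("List-Id", "")
    for pattern, folder in RULES:
        if pattern.search(list_id):
            return folder
    return default
```

Mail addressed to you personally carries no List-Id header, so it falls
through to the default folder.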

> Can it not be configured to Prepend the Subject: line with Bug-Guile
> or something similar?  That way it'd be easier to manage - either
> manually or automatically.
>
As mentioned above, this is not a good idea.

Kind regards, Rotty
-- 
Andreas Rottmann -- <http://rotty.xx.vu/>




Information forwarded to bug-guile@HIDDEN:
bug#20822; Package guile. Full text available.

Message received at 20822 <at> debbugs.gnu.org:


Received: (at 20822) by debbugs.gnu.org; 16 Jun 2015 06:26:42 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Tue Jun 16 02:26:42 2015
Received: from localhost ([127.0.0.1]:55170 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.80)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1Z4kKX-0004tw-8a
	for submit <at> debbugs.gnu.org; Tue, 16 Jun 2015 02:26:41 -0400
Received: from de.cellform.com ([88.217.224.109]:55470 helo=jocasta.intra)
 by debbugs.gnu.org with esmtp (Exim 4.80)
 (envelope-from <john@HIDDEN>) id 1Z4kKR-0004tj-MU
 for 20822 <at> debbugs.gnu.org; Tue, 16 Jun 2015 02:26:38 -0400
Received: from jocasta.intra (localhost [127.0.0.1])
 by jocasta.intra (8.14.4/8.14.4/Debian-4) with ESMTP id t5G6QW50001212
 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT);
 Tue, 16 Jun 2015 08:26:32 +0200
Received: (from john@localhost)
 by jocasta.intra (8.14.4/8.14.4/Submit) id t5G6QWeB001210;
 Tue, 16 Jun 2015 08:26:32 +0200
Date: Tue, 16 Jun 2015 08:26:32 +0200
From: John Darrington <john@HIDDEN>
To: Zefram <zefram@HIDDEN>
Subject: Re: bug#20822: environment mangled by locale
Message-ID: <20150616062631.GA700@HIDDEN>
References: <20150616041736.GA2718@HIDDEN>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
 protocol="application/pgp-signature"; boundary="HlL+5n6rz5pIUxbD"
Content-Disposition: inline
In-Reply-To: <20150616041736.GA2718@HIDDEN>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Spam-Score: -0.0 (/)
X-Debbugs-Envelope-To: 20822
Cc: 20822 <at> debbugs.gnu.org
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.15
X-Spam-Score: -0.0 (/)


--HlL+5n6rz5pIUxbD
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

Can we configure this mailing list better?

Many (all?) of the messages posted have no obvious indication of which
mailing list they are coming from.

The subject line is something like "bug#12345: description"
The To: field is 12354 <at> debbugs.gnu.org

In general, it takes a lot of detective work to discover that message relates to guile.


Can it not be configured to Prepend the Subject: line with Bug-Guile or something similar?
That way it'd be easier to manage - either manually or automatically.


J'


-- 
PGP Public key ID: 1024D/2DE827B3
fingerprint = 8797 A26D 0854 2EAB 0285  A290 8A67 719C 2DE8 27B3
See http://sks-keyservers.net or any PGP keyserver for public key.


--HlL+5n6rz5pIUxbD
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: Digital signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iEYEARECAAYFAlV/wZcACgkQimdxnC3oJ7OXXwCeP7mrnnsaqaKvyxs7NMAItYyS
2/8An3+syT3wnBicFO1Z6h1VMeH476+m
=iZ3h
-----END PGP SIGNATURE-----

--HlL+5n6rz5pIUxbD--




Information forwarded to bug-guile@HIDDEN:
bug#20822; Package guile. Full text available.

Message received at submit <at> debbugs.gnu.org:


Received: (at submit) by debbugs.gnu.org; 16 Jun 2015 04:17:58 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Tue Jun 16 00:17:58 2015
Received: from localhost ([127.0.0.1]:55109 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.80)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1Z4iJx-0001mU-ST
	for submit <at> debbugs.gnu.org; Tue, 16 Jun 2015 00:17:58 -0400
Received: from eggs.gnu.org ([208.118.235.92]:40382)
 by debbugs.gnu.org with esmtp (Exim 4.80)
 (envelope-from <zefram@HIDDEN>) id 1Z4iJv-0001mH-UH
 for submit <at> debbugs.gnu.org; Tue, 16 Jun 2015 00:17:56 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from <zefram@HIDDEN>) id 1Z4iJp-0000vg-Px
 for submit <at> debbugs.gnu.org; Tue, 16 Jun 2015 00:17:50 -0400
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org
X-Spam-Level: 
X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50 autolearn=disabled
 version=3.3.2
Received: from lists.gnu.org ([2001:4830:134:3::11]:54996)
 by eggs.gnu.org with esmtp (Exim 4.71)
 (envelope-from <zefram@HIDDEN>) id 1Z4iJp-0000vZ-MI
 for submit <at> debbugs.gnu.org; Tue, 16 Jun 2015 00:17:49 -0400
Received: from eggs.gnu.org ([2001:4830:134:3::10]:53883)
 by lists.gnu.org with esmtp (Exim 4.71)
 (envelope-from <zefram@HIDDEN>) id 1Z4iJo-0004cE-FL
 for bug-guile@HIDDEN; Tue, 16 Jun 2015 00:17:49 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from <zefram@HIDDEN>) id 1Z4iJl-0000ui-8i
 for bug-guile@HIDDEN; Tue, 16 Jun 2015 00:17:48 -0400
Received: from river.fysh.org ([5.135.154.127]:60991)
 by eggs.gnu.org with esmtp (Exim 4.71)
 (envelope-from <zefram@HIDDEN>) id 1Z4iJl-0000tK-2r
 for bug-guile@HIDDEN; Tue, 16 Jun 2015 00:17:45 -0400
Received: from zefram by river.fysh.org with local (Exim 4.80 #2 (Debian))
 id 1Z4iJc-0002F5-VS; Tue, 16 Jun 2015 05:17:37 +0100
Date: Tue, 16 Jun 2015 05:17:36 +0100
From: Zefram <zefram@HIDDEN>
To: bug-guile@HIDDEN
Subject: environment mangled by locale
Message-ID: <20150616041736.GA2718@HIDDEN>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x
X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address
 (bad octet value).
X-Received-From: 2001:4830:134:3::11
X-Spam-Score: -4.0 (----)
X-Debbugs-Envelope-To: submit
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.15
Precedence: list

When guile-2.0 is asked to read environment variables, via getenv,
it always decodes the underlying octet string according to the current
locale's nominal character encoding.  This is a problem, because the
environment variable's value is not necessarily encoded that way, and
may not even be an encoding of a character string at all.  Where the
octet string isn't valid in that encoding, the decoding is lossy, so the
original octet string cannot be recovered from the mangled form.  I don't
see any Scheme interface that retrieves the environment without locale
decoding.

The decoding is governed by the currently selected locale at the time that
getenv is called, so this can be controlled to some extent by setlocale.
However, this doesn't provide a way round the lossy decoding problem,
because there is no guarantee of a cooperative locale being available
(and especially being available under a predictable name).  On my Debian
system here, the "POSIX" and "C" locales' nominal character encoding is
ASCII, so decoding under these locales results in all high-half octets
being turned into question marks.  Retrieving the environment without
calling setlocale at all also yields this lossy ASCII decode.

Demos:

$ env - FOO=$'L\xc3\xa9on' guile-2.0 -c '(write (map char->integer (string->list (getenv "FOO")))) (newline)'
(76 63 63 111 110)
$ env - FOO=$'L\xc3\xa9on' guile-2.0 -c '(setlocale LC_ALL "POSIX") (write (map char->integer (string->list (getenv "FOO")))) (newline)'
(76 63 63 111 110)
$ env - FOO=$'L\xc3\xa9on' guile-2.0 -c '(setlocale LC_ALL "de_DE.utf8") (write (map char->integer (string->list (getenv "FOO")))) (newline)'
(76 233 111 110)
$ env - FOO=$'L\xc3\xa9on' guile-2.0 -c '(setlocale LC_ALL "de_DE.iso88591") (write (map char->integer (string->list (getenv "FOO")))) (newline)'
(76 195 169 111 110)
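
The four results above can be reproduced outside Guile.  The following
Python sketch is only a model of the observed behaviour, not Guile's
implementation: the helper name decode_like_getenv is hypothetical, and
the ASCII case is written by hand on the assumption that each
undecodable octet becomes a question mark, as in the demos.

```python
# Model of the demos: decode the same five octets under each encoding.
RAW = b"L\xc3\xa9on"  # the octets passed in FOO


def decode_like_getenv(octets, encoding):
    """Return the code points obtained by decoding octets.

    Hypothetical helper.  For ASCII, each high-half octet is replaced
    by '?' (code point 63), mirroring the substitution in the demos.
    """
    if encoding == "ascii":
        text = "".join(chr(b) if b < 0x80 else "?" for b in octets)
    else:
        text = octets.decode(encoding)
    return [ord(ch) for ch in text]


for enc in ("ascii", "utf-8", "iso-8859-1"):
    print(enc, decode_like_getenv(RAW, enc))
```

Of the three results, only the ASCII one has discarded information: both
high-half octets collapse to 63, so the original value cannot be rebuilt.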

The actual data passed between processes is an octet string, and there
really needs to be some reliable way to access that octet string.
There's an obvious parallel with reading data from an input port.
If setlocale is called, then input is by default decoded according
to locale, including the very lossy ASCII decode for C/POSIX.  But if
setlocale has not been called, then input is by default decoded according
to ISO-8859-1, preserving the actual octets.  It would probably be most
sensible for getenv likewise to decode according to ISO-8859-1 if
setlocale hasn't been called.  It might also be sensible to offer
some explicit control over the encoding to be used with the environment,
just as I/O ports have a concept of per-port selected encoding.
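
The reason ISO-8859-1 preserves the octets is that it maps each of the
256 octet values to exactly one code point, so decoding and re-encoding
is an identity.  A minimal Python sketch of that property follows; the
helper name roundtrips is hypothetical, and Python's strict codecs stand
in for whatever conversion Guile performs.

```python
# Check which encodings can carry arbitrary octets through a string.
ALL_OCTETS = bytes(range(256))


def roundtrips(octets, encoding):
    """True if decoding then re-encoding reproduces the octets exactly."""
    try:
        return octets.decode(encoding).encode(encoding) == octets
    except UnicodeError:
        # Strict decoding failed: the octets are not valid in this encoding.
        return False


print(roundtrips(ALL_OCTETS, "iso-8859-1"))   # every octet value survives
print(roundtrips(ALL_OCTETS, "ascii"))        # high-half octets do not
print(roundtrips(b"L\xc3\xa9on", "utf-8"))    # valid UTF-8 survives
print(roundtrips(b"L\xe9on", "utf-8"))        # a lone 0xE9 is not valid UTF-8
```

Only an encoding with this identity property is safe as a default for
data that might not be text at all.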

The same issue applies to other environment access functions too.
For setenv the corresponding problem is the inability to *write* an
arbitrary octet string to an environment variable.  Obviously all the
functions should have mutually consistent behaviour.

-zefram




Acknowledgement sent to Zefram <zefram@HIDDEN>:
New bug report received and forwarded. Copy sent to bug-guile@HIDDEN. Full text available.
Report forwarded to bug-guile@HIDDEN:
bug#20822; Package guile. Full text available.
Last modified: Mon, 25 Nov 2019 12:00:02 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.