GNU logs - #31343, boring messages


Message sent to bug-guile@HIDDEN:


X-Loop: help-debbugs@HIDDEN
Subject: bug#31343: scm_c_primitive_load behavior/documentation bug
Resent-From: Tom Balzer <niebieskitrociny@HIDDEN>
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-guile@HIDDEN
Resent-Date: Wed, 02 May 2018 17:56:01 +0000
Resent-Message-ID: <handler.31343.B.152528373028185 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: report 31343
X-GNU-PR-Package: guile
X-GNU-PR-Keywords: 
To: 31343 <at> debbugs.gnu.org
Cc: Tom Balzer <niebieskitrociny@HIDDEN>
X-Debbugs-Original-To: bug-guile@HIDDEN
User-agent: mu4e 1.0; emacs 25.3.1
From: Tom Balzer <niebieskitrociny@HIDDEN>
Message-ID: <877eom2c63.fsf@HIDDEN>
Date: Wed, 02 May 2018 12:35:54 -0500
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not
 recognized.
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x
X-Received-From: 2001:4830:134:3::11
X-Spam-Score: -4.0 (----)
X-Mailman-Approved-At: Wed, 02 May 2018 13:55:29 -0400
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -5.0 (-----)


Hello -

In ./guile/libguile/load.c, the function scm_c_primitive_load converts a
C string to a SCM string via scm_from_locale_string. I was reading the
manual and in section 6.6.5.14, it says:

> C Function: SCM scm_from_locale_string (const char *str)
> C Function: SCM scm_from_locale_stringn (const char *str, size_t
>
> [...]
>
> Note that these functions should _not_ be used to convert C string
> constants, because there is no guarantee that the current locale
> will match that of the execution character set, used for string and
> character constants.  Most modern C compilers use UTF-8 by default,
> so to convert C string constants we recommend
> ‘scm_from_utf8_string’.

This implies to me that you should not use scm_c_primitive_load with any
constant, like this:

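#include <libguile.h>
#include <stdlib.h>

/* Named STATES_FILE rather than FILE to avoid clashing with stdio's FILE type. */
#define STATES_FILE "/home/niebie/sc/sdl/states.scm"

void *some_func(void *arg){
  /* The path is converted internally with scm_from_locale_string. */
  scm_c_primitive_load(STATES_FILE);

  return NULL;
}

int main(int argc, char **argv){
  scm_with_guile(some_func, NULL);
  return EXIT_SUCCESS;
}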

I saw this only by reading the source for this function, as from the
documentation it isn't obvious. I am sending this to bug-guile because I
think that either this is a documentation bug or an implementation
bug. In either case I am happy to send a patch that fixes whichever is
at fault.

A counterexample is scm_c_public_variable in modules.c, which calls
scm_from_utf8_symbol on its inputs and therefore precludes passing
dynamic (locale-encoded) C strings. Again, this is not documented. I
would think both of these functions would handle encodings the same way.

Thanks,
Tom




Message sent:


Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
X-Mailer: MIME-tools 5.505 (Entity 5.505)
Content-Type: text/plain; charset=utf-8
X-Loop: help-debbugs@HIDDEN
From: help-debbugs@HIDDEN (GNU bug Tracking System)
To: Tom Balzer <niebieskitrociny@HIDDEN>
Subject: bug#31343: Acknowledgement (scm_c_primitive_load behavior/documentation
 bug)
Message-ID: <handler.31343.B.152528373028185.ack <at> debbugs.gnu.org>
References: <877eom2c63.fsf@HIDDEN>
X-Gnu-PR-Message: ack 31343
X-Gnu-PR-Package: guile
Reply-To: 31343 <at> debbugs.gnu.org
Date: Wed, 02 May 2018 17:56:01 +0000

Thank you for filing a new bug report with debbugs.gnu.org.

This is an automatically generated reply to let you know your message
has been received.

Your message is being forwarded to the package maintainers and other
interested parties for their attention; they will reply in due course.

Your message has been sent to the package maintainer(s):
 bug-guile@HIDDEN

If you wish to submit further information on this problem, please
send it to 31343 <at> debbugs.gnu.org.

Please do not send mail to help-debbugs@HIDDEN unless you wish
to report a problem with the Bug-tracking system.

-- 
31343: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=31343
GNU Bug Tracking System
Contact help-debbugs@HIDDEN with problems


Message sent to bug-guile@HIDDEN:


X-Loop: help-debbugs@HIDDEN
Subject: bug#31343: scm_c_primitive_load behavior/documentation bug
Resent-From: Mark H Weaver <mhw@HIDDEN>
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-guile@HIDDEN
Resent-Date: Mon, 28 May 2018 12:58:01 +0000
Resent-Message-ID: <handler.31343.B31343.152751223825803 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: followup 31343
X-GNU-PR-Package: guile
X-GNU-PR-Keywords: 
To: Tom Balzer <niebieskitrociny@HIDDEN>
Cc: 31343 <at> debbugs.gnu.org
From: Mark H Weaver <mhw@HIDDEN>
References: <877eom2c63.fsf@HIDDEN>
Date: Mon, 28 May 2018 08:55:56 -0400
In-Reply-To: <877eom2c63.fsf@HIDDEN> (Tom Balzer's message of "Wed, 02 May
 2018 12:35:54 -0500")
Message-ID: <87bmd0x7kz.fsf@HIDDEN>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.3 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-Spam-Score: 0.0 (/)
X-Spam-Score: -1.0 (-)

Hi Tom,

Sorry for taking so long to respond.

Tom Balzer <niebieskitrociny@HIDDEN> writes:

> In ./guile/libguile/load.c, the function scm_c_primitive_load converts a
> C string to a SCM string via scm_from_locale_string. I was reading the
> manual and in section 6.6.5.14, it says:
>
>> C Function: SCM scm_from_locale_string (const char *str)
>> C Function: SCM scm_from_locale_stringn (const char *str, size_t
>>
>> [...]
>>
>> Note that these functions should _not_ be used to convert C string
>> constants, because there is no guarantee that the current locale
>> will match that of the execution character set, used for string and
>> character constants.  Most modern C compilers use UTF-8 by default,
>> so to convert C string constants we recommend
>> ‘scm_from_utf8_string’.
>
> This implies to me that you should not use scm_c_primitive_load with any
> constant, like this:
>
> #include <libguile.h>
> #include <stdlib.h>
>
> #define STATES_FILE "/home/niebie/sc/sdl/states.scm"
>
> void *some_func(void *arg){
>   scm_c_primitive_load(STATES_FILE);

If the C string literal contains only ASCII characters, then it doesn't
matter either way, because all C locale encodings are ASCII-compatible.
Perhaps we should make that clearer in the documentation that you
quoted above.
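
To make the ASCII-only case concrete, here is a minimal sketch in
standard C. The helper name is_ascii_only is hypothetical and is not
part of libguile; it only checks that every byte of a path is 7-bit
ASCII, which is the situation where the locale-versus-UTF-8 question
does not arise.

```c
#include <stdbool.h>

/* Hypothetical helper (not a libguile function): returns true when
   every byte of the NUL-terminated string is 7-bit ASCII. */
bool is_ascii_only(const char *s)
{
    for (const unsigned char *p = (const unsigned char *)s; *p != '\0'; p++) {
        if (*p > 0x7F)
            return false;   /* found a byte outside the ASCII range */
    }
    return true;
}
```

A caller could run such a check on a path before handing it to
scm_c_primitive_load, and treat a false result as a reminder to think
about which encoding the bytes are actually in.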

A related question is whether we should change the API of
'scm_c_primitive_load' to expect a UTF-8 encoded file name instead of a
locale encoded one.

If the file name comes from a C string literal, then it will probably be
UTF-8 encoded, because that's what modern compilers tend to do.  On the
other hand, if the file name comes from somewhere else, e.g. from user
input, POSIX command line arguments, or environment variables, then it
should probably be the locale encoding.
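
As an illustration of why the source of the bytes matters, the
following sketch re-encodes an ISO-8859-1 string as UTF-8 in plain C.
It assumes, purely for the example, that the locale encoding is
ISO-8859-1; the function name latin1_to_utf8 is hypothetical and not
part of Guile. The same text ends up as different byte sequences in
the two encodings, so a function that expects one will misread the
other for any non-ASCII file name.

```c
#include <stddef.h>

/* Hypothetical helper (not a Guile function): re-encode a
   NUL-terminated ISO-8859-1 string as UTF-8.  The caller must provide
   an output buffer of at least 2 * strlen(in) + 1 bytes.  Returns the
   number of bytes written, not counting the terminating NUL. */
size_t latin1_to_utf8(const char *in, char *out)
{
    const unsigned char *p = (const unsigned char *)in;
    size_t n = 0;

    for (; *p != '\0'; p++) {
        if (*p < 0x80) {
            /* ASCII bytes are identical in both encodings. */
            out[n++] = (char)*p;
        } else {
            /* Code points U+0080..U+00FF take two bytes in UTF-8. */
            out[n++] = (char)(0xC0 | (*p >> 6));
            out[n++] = (char)(0x80 | (*p & 0x3F));
        }
    }
    out[n] = '\0';
    return n;
}
```

For instance, the ISO-8859-1 byte 0xE9 (é) becomes the two UTF-8 bytes
0xC3 0xA9, while an ASCII-only path passes through unchanged.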

I'm inclined to leave 'scm_c_primitive_load' as it is, because the
expected encoding is effectively part of the API.  Some programs might
depend on its current behavior, and file names are reasonably likely to
come from sources like environment variables or command-line arguments.
Furthermore, file names in C string literals are quite likely to be
ASCII-only anyway.

> I saw this only by reading the source for this function, as from the
> documentation it isn't obvious. I am sending this to bug-guile because I
> think that either this is a documentation bug or an implementation
> bug. In either case I am happy to send a patch that fixes whichever is
> at fault.
>
> A counterexample is scm_c_public_variable in modules.c, which calls
> scm_from_utf8_symbol on its inputs and therefore precludes passing
> dynamic (locale-encoded) C strings. Again, this is not documented. I
> would think both of these functions would handle encodings the same way.

I'm not sure, because file names are reasonably likely to come from
external sources that are likely to be locale encoded, whereas Scheme
variable names are overwhelmingly likely to be C string literals.

In any case, these are longstanding APIs, so I don't think we should
change them.

So, I think the proper fixes here are to the documentation.  As you
suggested, the documentation for 'scm_c_public_variable',
'scm_c_primitive_load', and all other C functions in our API should
specify the encoding for C string arguments.

If you'd like to work on it, I'd be glad to accept documentation fixes
along these lines.

     Thanks!
       Mark





Last modified: Mon, 25 Nov 2019 12:00:02 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.