Mark H Weaver <mhw@HIDDEN>
to control <at> debbugs.gnu.org
.
From: David Kastrup <dak@HIDDEN>
To: Mark H Weaver <mhw@HIDDEN>
Cc: 18520 <at> debbugs.gnu.org
Subject: Re: bug#18520: string ports should not have an encoding
Date: Wed, 24 Sep 2014 14:00:38 +0200
In-Reply-To: <87oau5h9f0.fsf@HIDDEN> (Mark H. Weaver's message of "Wed, 24 Sep 2014 01:30:59 -0400")
Message-ID: <87k34ti5y1.fsf@HIDDEN>

Mark H Weaver <mhw@HIDDEN> writes:

> David Kastrup <dak@HIDDEN> writes:
>
>> In Guile 2.0, at the time a string port is opened, the value of the
>> fluid %default-port-encoding is used for deciding how to encode the
>> string into a byte stream, [...]
>
> I agree that this was a mistake. The issue is fixed on the master
> branch.

The mistake is having a string port use a different
sequence-of-character encoding than a string.

>> Ports fundamentally deliver characters, and so reading and writing
>> from a string source/sink should not involve _any_ coding system.
>
> David, you know as well as I that internally, there is always a coding
> system. Strings have a coding system too, even if it's UCS-4. Emacs
> uses something based on UTF-8, and I'd like Guile to do something
> similar in the future.
>
> I guess you don't like the fact that it is possible to expose the
> internal representation via 'set-port-encoding!', 'ftell' or 'seek'.
> I don't see this as a problem, and arguably it's a benefit.

Shrug. That arguable benefit went down in flames in Emacs 20. It
triggered the last great migration of Emacs users to XEmacs. It took
until Emacs 20.4 for the horrible mistake of exposing byte offsets to
the user in either strings or buffers to be corrected.

You write above "Emacs uses something based on UTF-8", and it's worth
pointing out that it does so starting with Emacs 23. Previously Emacs
used its own peculiar multibyte encoding that existed long before
UTF-8. The important thing to note is that it was _completely_ hidden
from sight from Elisp users when the Emacs 20 tribulations were over.
Emacs was able to swap out this multibyte encoding for the Emacs 23
coding rather transparently, and the main reason to do that was to make
UTF-8 a favored encoding regarding performance of encoding/decoding and
processing of Elisp source files.

Emacs' internal encoding is not proper UTF-8. You can take a random
byte string, tell Emacs that it is encoded in UTF-8, and decode it into
Emacs' internal representation. All passages that happen to be proper
uniquely represented UTF-8 will pass the transcoding unchanged, but
everything else will be transcoded into a UTF-8-like representation of
"unencodable byte". I think Emacs uses the UTF-8 forbidden code points
from 0xd800 to 0xd880 for encoding stray bytes, or something like that.
So if you reencode the unchanged "UTF-8" Emacs uses internally, the
result will again faithfully reproduce the random byte stream.

Garbage in, _same_ garbage out. A very important property that many of
Emacs' supported file encodings share. Notable exceptions are various
Japanese encodings based on escape characters.
At any rate, unless you are using explicit conversions like
string-as-unibyte or _encoding_ to Emacs' internal representation (it
is available as a named coding system), the representation is not
exposed. Strings are indexed per character, and buffers (which are at
their heart random-access string ports) are indexed per character.

Emacs has both unibyte and multibyte strings and unibyte and multibyte
buffers, and unibyte strings and buffers are the source for decoding
into, and the target for encoding from, multibyte strings and buffers.
XEmacs does not have unibyte strings/buffers, so a lot of string
internals do not need to make the distinction.

GUILE could probably get away without unibyte strings as well because
it has bytevectors. This would imply that if you wanted to do stuff
akin to string operations on unibyte strings, you'd have to first
convert bytevectors to multibyte strings, do your operations, convert
back. XEmacs chose _not_ to have unibyte strings (and the
corresponding complications to support both in the primitives), Emacs
chose to have them. I think both approaches are defensible. Since
GUILE presents itself as an extension language and since strings will
need to get passed in and out of extension languages all the time, the
implementation cost of offering a low-cost unibyte string is probably
even more defensible than with Elisp where Elisp is the main processing
language.

> First I'll address the non-standard 'set-port-encoding!'. As you say,
> it doesn't even make sense on string ports, and arguably should be an
> error. So why do you care if some internal details leak out when you
> do this nonsensical thing? Admittedly, we're missing an opportunity
> to report a possible bug to the user, but that's the only problem I
> see here.
>
> Regarding 'ftell' and 'seek', it's not entirely clear to me what's the
> best representation of those positions.
> In some situations, I guess it would be convenient for them to count
> Unicode code points or string indices. In other situations, I could
> imagine it being more convenient for them to count grapheme clusters
> or UTF-8 bytes.
>
> R6RS, the only Scheme standard that supports getting or setting file
> positions, gives us complete freedom to choose our representation of
> positions on textual ports. The R6RS is explicit that they don't even
> have to be integers, and if they are, they don't have to correspond to
> bytes or characters.

R6RS gives you the freedom to match your semantics to your
implementation. String ports are strings-in-progress (and Emacs
buffers are strings-in-progress on steroids), so it makes sense to
match the fseek/ftell semantics of string ports to those of strings and
the implementation to those of strings. You don't have anything to
gain from converting characters to bytes and back just because you can.

> For better or for worse, Guile's ports are fundamentally based on
> bytes,

Seriously? The whole point of this issue was that fundamentally basing
GUILE's string ports on bytes is for worse.

> and allow mixed binary and textual operations on all ports.

I'll go out on a limb here and state "they don't". They work with
bytes (either located on file or in some internally generated or
consumed byte vector) and they input/output characters on their Scheme
side, and you can change the en/decoding system with which characters
are put into the stream or consumed. Their external side is identical
to their internal side, and the Scheme/character/string side is
fundamentally different. By changing the port encoding, you can change
the conversion between Scheme on the one side and internal/external on
the other. All operations are binary on the internal side, and textual
on the Scheme side. That there are encodings which are less costly
does not fundamentally change this.

> Sometimes this is very helpful, for example when implementing HTTP.
> I can think of one other case where it's very helpful:
>
> I don't know how deeply you've looked at UTF-8,

It is a somewhat safe bet that a person who is the head maintainer of
an application conversing in UTF-8 while using GUILE-1.8 in its
internals has had some basic amount of exposure to UTF-8.

In general, the working assumption "David just has little clue about
computing" is rarely helpful for dismissing matters since David tends
to have picked up tidbits occasionally since he started computing on
systems where lowercase letters already needed a multi-sextet
representation in their 60-bit words.

So it is a reasonably safe bet that when David has some problems with
matters, chances are that a non-negligible percentage of other users
will not fare significantly better, so it is a somewhat relevant
indicator of what to avoid.

> but it has some unusual properties that allow many (most?) string
> algorithms to be most naturally (and efficiently) implemented by
> operating on bytes rather than code points. Much of the time, you
> don't even have to be aware of the code point boundaries, which is a
> great savings. Efficient lookup tables based on bytes are also much
> cheaper than ones based on code points, etc.

That's all very nice but totally irrelevant for this issue. If you
like UTF-8, by all means base the internal string representation of
GUILE on it. It comes at a cost since strings in Scheme are writable
(and there are more operations for doing so than in Elisp) and indexed
by character. Emacs has paid this cost: I think the basic speed of
Emacs dropped by a factor of 2 when indexing was moved from bytes to
characters around Emacs 20.2 or similar.

But this issue is about not using different internal coding and exposed
interfaces for strings and string ports. Whatever internal string
representation you choose, it does not make sense to pick a different
representation and indexing for string ports.
> In fact, I intend to propose that in a future version of Guile,
> strings will not only be based on UTF-8 internally, but that this fact
> should be exposed in the API, allowing users to implement UTF-8 string
> operations that operate on bytes not code points.

This experiment has been tried and crashed and burnt with the initial
MULE versions in Emacs 20. Current versions _do_ offer conversion-less
reinterpretations string-as-unibyte and string-as-multibyte and offer
working with either string type. As explained, that comes at the cost
of having to make all primitives able to work with either. They are
actually rarely used by application level programmers, so most
applications do not have this as a porting problem between Emacs and
XEmacs (XEmacs has only multibyte strings).

Personally, I'd consider that worth the cost in the case of GUILE.
While XEmacs gets along without this addition, it seems important for
efficient passing of data in and out of GUILE. It would also make
sense to distinguish between multibyte (internal form of UTF-8,
anything may happen if it is not properly formed) and external UTF-8
(reading/writing it uses a conversion process turning all illegal UTF-8
bytes into some reproducible representation).

> I'd also like lightweight, fast string ports that allow access to
> these bytes when desired.

Any string port that does not involve encoding/decoding will be
lightweight and fast, lighter and faster than any implementation having
to code/decode gratuitously. Which is one of the points of this issue,
even though I am more concerned with the conceptual cost than the
runtime cost. But both have an impact.

> This leads me to believe that it's a feature, not a bug, that string
> ports use UTF-8 internally, and that it's possible (via non-standard
> extensions) to get access to the underlying bytes.

Getting confused about bytes and characters and introducing unnecessary
conversions is not a feature.
Even if you at one time use a UTF-8 based string representation,
working with external UTF-8 will involve encoding/decoding processes.
Forcing a string port to encode/decode during operation will remain
expensive. Exposing string internals beyond quite special-purpose
functions will be hard to deal with.

All those lessons have already been learnt with Emacs. If you want to
relearn them from scratch, the available developer power will not make
basing Emacs on GUILE realistic in the next 10 years: Emacs
fundamentally operates with texts. Too many reliability or efficiency
problems doing that (or having to implement them as foreign datatypes
altogether) will not make Guilemacs acceptable.

So even in cases where multiple strategies are feasible, it may make
sense to lean towards Emacs' choices. One choice that has served Emacs
well is to hide its internal encoding system well from the external
ones. That way its switch to an internal coding system based on UTF-8
affected almost no existing Elisp packages, and the programming model
was conceptually clean.

-- 
David Kastrup
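The cost David describes for character-indexed strings on top of UTF-8 can be made concrete with a small sketch. This is plain C, not Guile or Emacs code, and the function name utf8_char_to_byte_offset is invented for the illustration. It assumes valid, NUL-terminated UTF-8 input. Finding where character number N starts requires scanning from the beginning of the string, which is the price of hiding the byte layout behind a per-character index.

```c
#include <assert.h>
#include <stddef.h>

/* Return the byte offset at which character number CHAR_INDEX
   (0-based) starts in the valid, NUL-terminated UTF-8 string S, or
   (size_t) -1 if S has fewer characters.  The scan is linear in the
   length of S, unlike indexing into a fixed-width representation.
   (Hypothetical helper, for illustration only.)  */
size_t
utf8_char_to_byte_offset (const char *s, size_t char_index)
{
  size_t seen = 0;              /* characters fully skipped so far */

  assert (s != NULL);
  for (size_t i = 0; s[i] != '\0'; i++)
    {
      /* Bytes of the form 10xxxxxx continue the current character.  */
      if (((unsigned char) s[i] & 0xC0) == 0x80)
        continue;
      if (seen == char_index)
        return i;
      seen++;
    }
  return (size_t) -1;
}
```

For the three-character string "été" (five bytes in UTF-8), the characters start at byte offsets 0, 2 and 3, so a position reported in bytes and a position reported in characters disagree as soon as the text leaves ASCII.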
bug-guile@HIDDEN: bug#18520; Package guile.
From: Mark H Weaver <mhw@HIDDEN>
To: David Kastrup <dak@HIDDEN>
Cc: 18520 <at> debbugs.gnu.org
Subject: Re: bug#18520: string ports should not have an encoding
Date: Wed, 24 Sep 2014 01:30:59 -0400
In-Reply-To: <87iokgmttc.fsf@HIDDEN> (David Kastrup's message of "Mon, 22 Sep 2014 01:34:39 +0200")
Message-ID: <87oau5h9f0.fsf@HIDDEN>

David Kastrup <dak@HIDDEN> writes:

> In Guile 2.0, at the time a string port is opened, the value of the
> fluid %default-port-encoding is used for deciding how to encode the
> string into a byte stream, [...]

I agree that this was a mistake. The issue is fixed on the master
branch.

> Ports fundamentally deliver characters, and so reading and writing from
> a string source/sink should not involve _any_ coding system.

David, you know as well as I that internally, there is always a coding
system. Strings have a coding system too, even if it's UCS-4. Emacs
uses something based on UTF-8, and I'd like Guile to do something
similar in the future.

I guess you don't like the fact that it is possible to expose the
internal representation via 'set-port-encoding!', 'ftell' or 'seek'.
I don't see this as a problem, and arguably it's a benefit.

First I'll address the non-standard 'set-port-encoding!'. As you say,
it doesn't even make sense on string ports, and arguably should be an
error. So why do you care if some internal details leak out when you
do this nonsensical thing? Admittedly, we're missing an opportunity to
report a possible bug to the user, but that's the only problem I see
here.

Regarding 'ftell' and 'seek', it's not entirely clear to me what's the
best representation of those positions. In some situations, I guess it
would be convenient for them to count Unicode code points or string
indices. In other situations, I could imagine it being more convenient
for them to count grapheme clusters or UTF-8 bytes.

R6RS, the only Scheme standard that supports getting or setting file
positions, gives us complete freedom to choose our representation of
positions on textual ports.
The R6RS is explicit that they don't even have to be integers, and if
they are, they don't have to correspond to bytes or characters.

For better or for worse, Guile's ports are fundamentally based on
bytes, and allow mixed binary and textual operations on all ports.
Sometimes this is very helpful, for example when implementing HTTP. I
can think of one other case where it's very helpful:

I don't know how deeply you've looked at UTF-8, but it has some unusual
properties that allow many (most?) string algorithms to be most
naturally (and efficiently) implemented by operating on bytes rather
than code points. Much of the time, you don't even have to be aware of
the code point boundaries, which is a great savings. Efficient lookup
tables based on bytes are also much cheaper than ones based on code
points, etc.

In fact, I intend to propose that in a future version of Guile, strings
will not only be based on UTF-8 internally, but that this fact should
be exposed in the API, allowing users to implement UTF-8 string
operations that operate on bytes not code points. I'd also like
lightweight, fast string ports that allow access to these bytes when
desired.

This leads me to believe that it's a feature, not a bug, that string
ports use UTF-8 internally, and that it's possible (via non-standard
extensions) to get access to the underlying bytes.

      Mark
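The UTF-8 property Mark refers to can be illustrated with a short sketch. This is standard C only, not Guile code; the names utf8_find_bytes and utf8_count_code_points are made up for the example, and both functions assume valid, NUL-terminated UTF-8. Because the encoding of one character never occurs inside the encoding of another, a byte-wise strstr finds substrings without decoding anything. The second function shows the other side of the argument in this thread: the byte offset that comes back is not the character index.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Byte offset of the first occurrence of NEEDLE in HAYSTACK, or -1.
   The search is purely byte-wise: UTF-8 is self-synchronizing, so on
   valid input a match always starts at a code point boundary.
   (Hypothetical helper, for illustration only.)  */
long
utf8_find_bytes (const char *haystack, const char *needle)
{
  const char *hit;

  assert (haystack != NULL && needle != NULL);
  hit = strstr (haystack, needle);
  return hit ? (long) (hit - haystack) : -1;
}

/* Number of code points in the first NBYTES bytes of S, counted by
   skipping continuation bytes (those of the form 10xxxxxx).
   (Hypothetical helper, for illustration only.)  */
size_t
utf8_count_code_points (const char *s, size_t nbytes)
{
  size_t count = 0;

  assert (s != NULL);
  for (size_t i = 0; i < nbytes; i++)
    if (((unsigned char) s[i] & 0xC0) != 0x80)
      count++;
  return count;
}
```

Searching for "café" in "naïve café" this way yields byte offset 7, while the match starts at character index 6, because "ï" occupies two bytes.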
From: ludo@HIDDEN (Ludovic Courtès)
To: David Kastrup <dak@HIDDEN>
Cc: 18520 <at> debbugs.gnu.org
Subject: Re: bug#18520: string ports should not have an encoding
Date: Tue, 23 Sep 2014 21:33:29 +0200
In-Reply-To: <871tr2joiu.fsf@HIDDEN> (David Kastrup's message of "Tue, 23 Sep 2014 18:21:45 +0200")
Message-ID: <8761gew2ra.fsf@HIDDEN>

David Kastrup <dak@HIDDEN> writes:

> I stated quite definitely that I am perfectly capable of dealing with
> the mess GUILE made of string ports.

Good to know, this was not my understanding until now.

The intent of the change in 2.2 is to hide the very fact that string
ports “have an encoding.” So from that viewpoint, that bug is closed.

If the bug is about ‘ftell’, that’s a different story. I would tend to
suggest that ‘ftell’ and ‘seek’ for string ports operate on an abstract
notion of position within the string port data.
This is in fact the path that the R6RS takes:

  For a binary port, the port-position procedure returns the index of
  the position at which the next byte would be read from or written to
  the port as an exact non-negative integer object. For a textual
  port, port-position returns a value of some implementation-dependent
  type representing the port's position; this value may be useful only
  as the pos argument to set-port-position!, if the latter is supported
  on the port (see below).

Thus, I would suggest a clarification along these lines:

diff --git a/doc/ref/api-io.texi b/doc/ref/api-io.texi
index 02d92a2..8331378 100644
--- a/doc/ref/api-io.texi
+++ b/doc/ref/api-io.texi
@@ -443,8 +443,12 @@ open.
 @deffn {Scheme Procedure} seek fd_port offset whence
 @deffnx {C Function} scm_seek (fd_port, offset, whence)
 Sets the current position of @var{fd_port} to the integer
-@var{offset}, which is interpreted according to the value of
-@var{whence}.
+@var{offset}. For a file port, @var{offset} is expressed
+as a number of bytes; for other types of ports, such as string
+ports, @var{offset} is an abstract representation of the
+position within the port's data, not necessarily expressed
+as a number of bytes. @var{offset} is interpreted according to
+the value of @var{whence}.
 
 One of the following variables should be supplied for
 @var{whence}:
@@ -460,7 +464,7 @@ Seek from the end of the file.
 If @var{fd_port} is a file descriptor, the underlying system
 call is @code{lseek}. @var{port} may be a string port.
 
-The value returned is the new position in the file. This means
+The value returned is the new position in @var{fd_port}. This means
 that the current position of a port can be obtained using:
 @lisp
 (seek port 0 SEEK_CUR)

Thoughts?

Thanks,
Ludo’.
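The idea of a position that is only meaningful when handed back to the seek operation has a close analogue in ISO C, where fgetpos stores the stream position in an opaque fpos_t that is meant to be passed to fsetpos. The sketch below is plain stdio, not Guile's port API, and the function name opaque_position_round_trip is invented here. It writes some text to a temporary file, remembers a position without ever interpreting it, and checks that seeking back reproduces the same remaining data.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write TEXT to a temporary file, read SKIP bytes, remember the
   position as an opaque fpos_t, read the rest, seek back to the
   remembered position and read the rest again.  Returns 1 if both
   reads agree, 0 if they differ, -1 on I/O error.
   (Hypothetical helper, for illustration only.)  */
int
opaque_position_round_trip (const char *text, size_t skip)
{
  char first[256], second[256];
  size_t n1, n2;
  fpos_t pos;
  int result = -1;
  FILE *f;

  assert (text != NULL);
  f = tmpfile ();
  if (f == NULL)
    return -1;
  if (fputs (text, f) == EOF)
    goto done;
  rewind (f);
  for (size_t i = 0; i < skip; i++)
    if (fgetc (f) == EOF)
      goto done;
  /* The position is stored but never inspected or used in arithmetic.  */
  if (fgetpos (f, &pos) != 0)
    goto done;
  n1 = fread (first, 1, sizeof first, f);
  if (fsetpos (f, &pos) != 0)
    goto done;
  n2 = fread (second, 1, sizeof second, f);
  result = (n1 == n2 && memcmp (first, second, n1) == 0);
done:
  fclose (f);
  return result;
}
```

Code written against such an interface keeps working whether the implementation counts bytes, characters, or something else entirely, which is the freedom the proposed documentation change reserves for string ports.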
From: David Kastrup <dak@HIDDEN>
To: ludo@HIDDEN (Ludovic Courtès)
Cc: 18520 <at> debbugs.gnu.org
Subject: Re: bug#18520: string ports should not have an encoding
Date: Tue, 23 Sep 2014 18:21:45 +0200
In-Reply-To: <87tx3yfhrb.fsf@HIDDEN> (Ludovic Courtès's message of "Tue, 23 Sep 2014 18:01:28 +0200")
Message-ID: <871tr2joiu.fsf@HIDDEN>

ludo@HIDDEN (Ludovic Courtès) writes:

> Does this help for LilyPond?

I stated quite definitely that I am perfectly capable of dealing with
the mess GUILE made of string ports. The issue is that I should not
have to, nor should anybody else.

This issue _is_ _not_ _about_ _LilyPond_. Working on LilyPond merely
shines a light on it. So please stop painting this as a request for
help. It isn't. It is a request for change. The subject line is
"string ports should not have an encoding". It isn't "help, I don't
understand string ports".

-- 
David Kastrup
From: ludo@HIDDEN (Ludovic Courtès)
To: David Kastrup <dak@HIDDEN>
Cc: 18520 <at> debbugs.gnu.org
Subject: Re: bug#18520: string ports should not have an encoding
Date: Tue, 23 Sep 2014 18:01:28 +0200
Message-ID: <87tx3yfhrb.fsf@HIDDEN>
In-Reply-To: <87d2amjxq9.fsf@HIDDEN>

David Kastrup <dak@HIDDEN> writes:

> They result in code like
>
>   // we do our own utf8 encoding and verification in the parser, so we
>   // use the no-conversion equivalent of latin1
>   SCM str = scm_from_latin1_string (c_str ());
>   scm_dynwind_begin ((scm_t_dynwind_flags)0);
>   // Why doesn't scm_set_port_encoding_x work here?
>   scm_dynwind_fluid (ly_lily_module_constant ("%default-port-encoding"), SCM_BOOL_F);
>   str_port_ = scm_open_input_string (str);
>   scm_dynwind_end ();
>   scm_set_port_filename_x (str_port_, ly_string2scm (name_));
> }

So here ‘c_str’ returns a char * that is a UTF-8-encoded string, right?

In that case, it should be enough to do:

  /* Get a Scheme string from its UTF-8 representation.  */
  str = scm_from_utf8_string (c_str ());

  /* Create an input string port.  ‘read-char’ & co. will return each
     character from STR, one at a time.  */
  str_port = scm_open_input_string (str);

  scm_set_port_filename_x (str_port, file);

As long as textual I/O procedures are used on ‘str_port’, there’s no
need to worry about its encoding.

Now, to be able to use ‘ftell’ and assume it returns the position as a
number of bytes in the UTF-8 sequence, something like this should work
(for 2.0; for 2.2 nothing special is needed):

  /* Get a Scheme string from its UTF-8 representation.  */
  str = scm_from_utf8_string (c_str ());

  scm_dynwind_begin (0);

  /* Make sure the following string port uses UTF-8 as the internal
     encoding of its buffer.  */
  scm_dynwind_fluid (scm_c_public_ref ("guile", "%default-port-encoding"),
                     scm_from_latin1_string ("UTF-8"));

  /* Create an input string port.  ‘read-char’ & co. will return each
     character from STR, one at a time.  */
  str_port = scm_open_input_string (str);

  scm_dynwind_end ();

  scm_set_port_filename_x (str_port, file);

Does this help for LilyPond?

Ludo’.
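The message above relies on ‘ftell’ reporting a byte position inside a
UTF-8 buffer.  As a minimal sketch of how such a byte position relates
to a character index, here is a plain C helper that does not use
libguile at all.  The name utf8_offset_of_char is hypothetical and the
input is assumed to be well-formed UTF-8; this is an illustration, not
what Guile does internally.

```c
#include <assert.h>
#include <stddef.h>

/* Return the byte offset at which the character with 0-based index
   CHAR_INDEX starts in the well-formed UTF-8 buffer BUF of LEN bytes,
   or LEN if the buffer holds fewer characters than that.
   (Hypothetical helper, for illustration only.)  */
size_t
utf8_offset_of_char (const unsigned char *buf, size_t len, size_t char_index)
{
  size_t seen = 0;              /* character starts seen so far */

  assert (buf != NULL || len == 0);
  for (size_t i = 0; i < len; i++)
    /* Bytes of the form 10xxxxxx continue the previous character;
       every other byte starts a new one.  */
    if ((buf[i] & 0xC0) != 0x80)
      {
        if (seen == char_index)
          return i;
        seen++;
      }
  return len;
}
```

For the four bytes 0x61 0xc3 0xbc 0x78 ("aüx" in UTF-8), the character
with index 2 (‘x’) starts at byte offset 3, so a byte-based position and
a character-based position already differ after one non-ASCII character.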
bug-guile@HIDDEN: bug#18520; Package guile.
From: David Kastrup <dak@HIDDEN>
To: ludo@HIDDEN (Ludovic Courtès)
Cc: 18520 <at> debbugs.gnu.org
Subject: Re: bug#18520: string ports should not have an encoding
Date: Tue, 23 Sep 2014 15:02:54 +0200
Message-ID: <87d2amjxq9.fsf@HIDDEN>
In-Reply-To: <87tx3yjzzw.fsf@HIDDEN>

ludo@HIDDEN (Ludovic Courtès) writes:

> David Kastrup <dak@HIDDEN> writes:
>
>> ludo@HIDDEN (Ludovic Courtès) writes:
>>
>>> David Kastrup <dak@HIDDEN> writes:
>>>
>>>>> Line/column info remains identical regardless of the encoding, so I tend
>>>>> to think it’s more robust to use that.
>>>>
>>>> Column info remains identical regardless of the encoding?  Since when?
>>>
>>> The character on line L and column M is always there, regardless of
>>> whether the file is encoded in UTF-8, Latin-1, etc.
>>>
>>> Would that work for LilyPond?
>>
>> Last time I looked, in the following line x was in column 3 in latin-1
>> encoding and in column 2 in utf-8 encoding:
>>
>> üx
>
> I’m not sure what you mean.  This line contains two characters: ‘u’ with
> umlaut followed by ‘x’.  ‘ü’ is in the first column, and ‘x’ in the
> second column.

It contains three bytes: 0xc3, 0xbc, 0x78.  In utf-8, this is üx, in
Latin-1 it is Ã¼x.  This whole issue is about string ports _not_ being
represented in terms of characters but bytes.

> Is there a simple way to reproduce the issue with LilyPond?

This issue is at best marginally about LilyPond, in that the semantics
chosen for GUILE-2.0 (and switched again in GUILE-2.2) are both
surprising and a source of headaches.  They result in code like

  // we do our own utf8 encoding and verification in the parser, so we
  // use the no-conversion equivalent of latin1
  SCM str = scm_from_latin1_string (c_str ());
  scm_dynwind_begin ((scm_t_dynwind_flags)0);
  // Why doesn't scm_set_port_encoding_x work here?
  scm_dynwind_fluid (ly_lily_module_constant ("%default-port-encoding"), SCM_BOOL_F);
  str_port_ = scm_open_input_string (str);
  scm_dynwind_end ();
  scm_set_port_filename_x (str_port_, ly_string2scm (name_));
}

which will, incidentally, stop working in GUILE-2.2 at which time
another workaround will be found.

GUILE is an extension language.  The stance that every kind of dealing
with characters/strings must be under the control of GUILE and its
character model is simply inappropriate.  It is not the job of GUILE to
dictate how an application has to organize matters internally.

For that reason, its behavior needs to be straightforward and
unsurprising.  That includes sane boundaries between strings as
character vectors, byte vectors, and encoding and decoding operations.

Going through a byte-based encoding when copying a character-based
string to a string, even when going through a string port, does not
make sense.

As a sign that this does not make sense, the effects of
%default-port-encoding and set-port-encoding! on input and output
string ports are asymmetric.  More so in GUILE-2.2 than in GUILE-2.0,
but already in GUILE-2.0.

That inconsistency (and its effects on overall performance) is what
this issue is about.  That I am tripping all over GUILE in the course
of working with LilyPond is at best incidental to this issue.  I could
equally well be tripping over it when working with TeXmacs.

I am not going to reply further to this issue since this is _not_, I
repeat _not_ some complaint that I am too stupid to understand what
GUILE is doing here.  I understand it perfectly well, and I am
perfectly able to hack around GUILE's deficiencies and inconsistencies.

One consequence of design problems like this is that the chosen
semantics under such a fundamental design problem are arbitrary and
thus more likely to change to different semantics in future versions.
That means a higher likelihood of future maintenance.

When I am going to have to redo this for GUILE-2.2 anyway, I prefer
doing it in a sane manner that will stick around for good.  I don't see
that here.  That does not mean that I am too stupid to work with the
GUILE 2.0 behavior or the GUILE 2.2 behavior or the GUILE 1.8 behavior
(in fact, the first port to GUILE 2 will set LC_CTYPE to C and just
stick with GUILE 1.8 behavior, but that's not a long-term perspective
since working with characters rather than bytes as string constituents
_is_ nicer for the user).

-- 
David Kastrup
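The column disagreement over the three bytes 0xc3, 0xbc, 0x78 can be
checked mechanically.  The following is a minimal sketch in plain C
(no libguile); the function names are hypothetical, columns are taken
to be 1-based as in the discussion above, and the UTF-8 variant assumes
well-formed input.

```c
#include <assert.h>
#include <stddef.h>

/* 1-based column of the character that starts at BYTE_OFFSET when each
   byte of LINE is one character (Latin-1).  */
size_t
column_latin1 (const unsigned char *line, size_t byte_offset)
{
  (void) line;                  /* byte values do not matter here */
  return byte_offset + 1;
}

/* 1-based column of the character that starts at BYTE_OFFSET when LINE
   holds well-formed UTF-8.  Continuation bytes (bit pattern 10xxxxxx)
   belong to the preceding character and do not advance the column.  */
size_t
column_utf8 (const unsigned char *line, size_t byte_offset)
{
  size_t column = 1;

  assert (line != NULL);
  for (size_t i = 0; i < byte_offset; i++)
    if ((line[i] & 0xC0) != 0x80)
      column++;
  return column;
}
```

With line = { 0xc3, 0xbc, 0x78 } and byte_offset = 2 (the ‘x’),
column_utf8 gives 2 and column_latin1 gives 3: the same bytes, two
different columns, depending only on which decoding is assumed.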
bug-guile@HIDDEN: bug#18520; Package guile.
From: ludo@HIDDEN (Ludovic Courtès)
To: David Kastrup <dak@HIDDEN>
Cc: 18520 <at> debbugs.gnu.org
Subject: Re: bug#18520: string ports should not have an encoding
Date: Tue, 23 Sep 2014 14:13:55 +0200
Message-ID: <87tx3yjzzw.fsf@HIDDEN>
In-Reply-To: <87h9zyk0wo.fsf@HIDDEN>

David Kastrup <dak@HIDDEN> writes:

> ludo@HIDDEN (Ludovic Courtès) writes:
>
>> David Kastrup <dak@HIDDEN> writes:
>>
>>>> Line/column info remains identical regardless of the encoding, so I tend
>>>> to think it’s more robust to use that.
>>>
>>> Column info remains identical regardless of the encoding?  Since when?
>>
>> The character on line L and column M is always there, regardless of
>> whether the file is encoded in UTF-8, Latin-1, etc.
>>
>> Would that work for LilyPond?
>
> Last time I looked, in the following line x was in column 3 in latin-1
> encoding and in column 2 in utf-8 encoding:
>
> üx

I’m not sure what you mean.  This line contains two characters: ‘u’ with
umlaut followed by ‘x’.  ‘ü’ is in the first column, and ‘x’ in the
second column.

If we get a different column number, that means we’re looking at a
different line.  It could be because the encoding of the input port from
which that line was read was incorrectly specified.  This is the issue
that would need to be fixed.

Is there a simple way to reproduce the issue with LilyPond?

Thanks,
Ludo’.
bug-guile@HIDDEN: bug#18520; Package guile.
From: David Kastrup <dak@HIDDEN>
To: ludo@HIDDEN (Ludovic Courtès)
Cc: 18520 <at> debbugs.gnu.org
Subject: Re: bug#18520: string ports should not have an encoding
Date: Tue, 23 Sep 2014 13:54:15 +0200
Message-ID: <87h9zyk0wo.fsf@HIDDEN>
In-Reply-To: <87bnq6oelf.fsf@HIDDEN>

ludo@HIDDEN (Ludovic Courtès) writes:

> David Kastrup <dak@HIDDEN> writes:
>
>>> Line/column info remains identical regardless of the encoding, so I tend
>>> to think it’s more robust to use that.
>>
>> Column info remains identical regardless of the encoding?  Since when?
>
> The character on line L and column M is always there, regardless of
> whether the file is encoded in UTF-8, Latin-1, etc.
>
> Would that work for LilyPond?

Last time I looked, in the following line x was in column 3 in latin-1
encoding and in column 2 in utf-8 encoding:

üx

At any rate, we are missing the point of the issue.  The issue is not
whether a workaround may be designed for every way in which GUILE tries
tripping up its users.  The question is how GUILE may provide the least
amount of surprise to its users without sacrificing functionality.

GUILE's current implementation uses two character set conversions for
string ports.  For input string ports, the first is a batch encoding
when the string port is opened (using %default-port-encoding in
GUILE-2.0 and "UTF-8" in GUILE-2.2).  This encoding is set as the
port's encoding (I hope).  The second happens on every read operation,
which decodes using whatever encoding is current for the port at that
time.

Accompanying the opening of a string port with an encoding operation
(whether using a forced encoding or %default-port-encoding) is
expensive (not least because everything needs to be decoded again),
leads to arbitrary semantics for port positioning, and is asymmetric
since the port encoding is only used for reading on an input string and
for writing on an output string.

Oh, and for writing on an input string using unread-string, of course.
No kidding.  There is also a conversion in there.

Would it be worth ditching this sort of unnecessary conversion?  Well,
just look at:

commit be7ecef05c1eea66f30360f658c610710c5cb22e
Author: Andy Wingo <wingo@HIDDEN>
Date:   Sat Aug 31 10:44:07 2013 +0200

    unread-char: inline conversion from codepoint to bytes

    * libguile/ports.c (scm_ungetc_unlocked): Inline the conversion from
      codepoint to bytes for UTF-8 and latin-1 ports.  Speeds up a
      numbers-reading test case by 100% (!).

That sounds like quite some gain just for _simplifying_ the
back-and-forth conversion, and we could just be forgoing it instead
(yes, peek-char as getc+ungetc presents a challenge in connection with
encoding switches: I think that declaring the first impression of
peek-char as sticky would be reasonable).

At any rate, the above commit looks like it would make a hash out of

(with-input-from-string "Huh\""
  (lambda ()
    (unread-string "\"ä" (current-input-port))
    (read)))

because of a broken character range check (I cannot currently check
with a compilation of master since that takes about a day on my
computer, but I would be surprised if the above worked fine).

So yes, the required complexity to deal with GUILE's current behavior
can introduce problems.

-- 
David Kastrup
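For readers unfamiliar with what a "conversion from codepoint to bytes"
involves when a character is pushed back onto a UTF-8 port, here is a
minimal, self-contained sketch in plain C.  It is not the code from
libguile/ports.c; the name utf8_encode and its return convention are
assumptions made for this illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode the Unicode code point CP as UTF-8 into OUT, which must have
   room for 4 bytes.  Return the number of bytes written, or 0 if CP is
   a surrogate or lies beyond U+10FFFF.  (Hypothetical helper.)  */
size_t
utf8_encode (uint32_t cp, unsigned char *out)
{
  assert (out != NULL);

  if (cp < 0x80)                        /* ASCII: one byte */
    {
      out[0] = (unsigned char) cp;
      return 1;
    }
  if (cp < 0x800)                       /* two bytes: 110xxxxx 10xxxxxx */
    {
      out[0] = (unsigned char) (0xC0 | (cp >> 6));
      out[1] = (unsigned char) (0x80 | (cp & 0x3F));
      return 2;
    }
  if (cp >= 0xD800 && cp <= 0xDFFF)     /* surrogates are not characters */
    return 0;
  if (cp < 0x10000)                     /* three bytes */
    {
      out[0] = (unsigned char) (0xE0 | (cp >> 12));
      out[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
      out[2] = (unsigned char) (0x80 | (cp & 0x3F));
      return 3;
    }
  if (cp <= 0x10FFFF)                   /* four bytes */
    {
      out[0] = (unsigned char) (0xF0 | (cp >> 18));
      out[1] = (unsigned char) (0x80 | ((cp >> 12) & 0x3F));
      out[2] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
      out[3] = (unsigned char) (0x80 | (cp & 0x3F));
      return 4;
    }
  return 0;
}
```

Pushing back ‘"’ (U+0022) costs one byte, while ‘ä’ (U+00E4) becomes the
two bytes 0xc3 0xa4.  Any range check that treats the two cases alike
would mishandle the second character, which is the kind of breakage the
unread-string example above is probing for.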
bug-guile@HIDDEN: bug#18520; Package guile.
From: ludo@HIDDEN (Ludovic Courtès)
To: David Kastrup <dak@HIDDEN>
Cc: 18520 <at> debbugs.gnu.org
Subject: Re: bug#18520: string ports should not have an encoding
Date: Tue, 23 Sep 2014 11:45:00 +0200
Message-ID: <87bnq6oelf.fsf@HIDDEN>
In-Reply-To: <87lhpak8ye.fsf@HIDDEN>

David Kastrup <dak@HIDDEN> writes:

>> Line/column info remains identical regardless of the encoding, so I tend
>> to think it’s more robust to use that.
>
> Column info remains identical regardless of the encoding?  Since when?

The character on line L and column M is always there, regardless of
whether the file is encoded in UTF-8, Latin-1, etc.

Would that work for LilyPond?

Ludo’.
bug-guile@HIDDEN: bug#18520; Package guile.
From: David Kastrup <dak@HIDDEN>
To: ludo@HIDDEN (Ludovic Courtès)
Cc: 18520 <at> debbugs.gnu.org
Subject: Re: bug#18520: string ports should not have an encoding
Date: Tue, 23 Sep 2014 11:00:25 +0200
Message-ID: <87lhpak8ye.fsf@HIDDEN>
In-Reply-To: <87egv2pwv5.fsf@HIDDEN>

ludo@HIDDEN (Ludovic Courtès) writes:

> David Kastrup <dak@HIDDEN> writes:
>
>> ludo@HIDDEN (Ludovic Courtès) writes:
>>
>>> David Kastrup <dak@HIDDEN> writes:
>>>>
>>>> For error messages, yes.  For associating a position in a string with a
>>>> previously parsed closure, no.
>>>
>>> But wouldn’t a line/column pair be as suitable as a unique identifier as
>>> the position in the file?
>>
>> As long as the reencoded UTF-8 is byte-identical to the original.
>
> Sorry, what do you mean by “reencoded UTF-8”?  The internal string port
> buffer?

Sure.  That's where ftell gets its info from.

> Line/column info remains identical regardless of the encoding, so I tend
> to think it’s more robust to use that.

Column info remains identical regardless of the encoding?  Since when?

-- 
David Kastrup
bug-guile@HIDDEN: bug#18520; Package guile.
From: ludo@HIDDEN (Ludovic Courtès)
To: David Kastrup <dak@HIDDEN>
Cc: 18520 <at> debbugs.gnu.org
Subject: Re: bug#18520: string ports should not have an encoding
Date: Tue, 23 Sep 2014 10:25:02 +0200
Message-ID: <87egv2pwv5.fsf@HIDDEN>
In-Reply-To: <87tx3zjod1.fsf@HIDDEN>

David Kastrup <dak@HIDDEN> writes:

> ludo@HIDDEN (Ludovic Courtès) writes:
>
>> David Kastrup <dak@HIDDEN> writes:
>>>
>>> For error messages, yes.  For associating a position in a string with a
>>> previously parsed closure, no.
>>
>> But wouldn’t a line/column pair be as suitable as a unique identifier as
>> the position in the file?
>
> As long as the reencoded UTF-8 is byte-identical to the original.

Sorry, what do you mean by “reencoded UTF-8”?  The internal string port
buffer?

Line/column info remains identical regardless of the encoding, so I tend
to think it’s more robust to use that.

Thanks,
Ludo’.
bug-guile@HIDDEN: bug#18520; Package guile.
From: David Kastrup <dak@HIDDEN>
To: ludo@HIDDEN (Ludovic Courtès)
Subject: Re: bug#18520: string ports should not have an encoding
Date: Tue, 23 Sep 2014 00:12:58 +0200

ludo@HIDDEN (Ludovic Courtès) writes:

> David Kastrup <dak@HIDDEN> skribis:
>>
>> For error messages, yes.  For associating a position in a string with a
>> previously parsed closure, no.
>
> But wouldn’t a line/column pair be as suitable as a unique identifier as
> the position in the file?

As long as the reencoded UTF-8 is byte-identical to the original.  At
the current point of time, we flag non-UTF-8 sequences with a warning
and continue.  People complained previously about things like Latin-1
characters (most likely to occur in comments or lyrics where they cause
little or well-identifiable havoc) leading to unceremonious aborts
without identifiable cause.

At any rate, the current behavior does not make sense.  Guile 2.0 might
refuse to turn a string into a port, and for Guile 2.2 the port encoding
may be used to have a UTF-8 rendition of the string characters be
interpreted in another encoding (like latin-1) but not the other way
round.  Both versions make only some half-baked sense.

Most resulting problems can probably be worked around in some manner,
but string ports are actually the main stringbuf-like mechanism that
Scheme has (dynamically growing strings that are more compact than a
list of characters).  Wedging a compulsory code conversion into it that
is mirrored in the port positions seems like a distraction.

> Also, if the result of ‘ftell’ is used as a unique identifier, does it
> really matter whether it’s an offset measured in bytes or in
> character?

In the LilyPond lexer, stuff is usually measured with byte offsets.
Yes, one can certainly parse the UTF-8 character distances and hope to
arrive at the same results as the UTF-8 reencoding.  But the point of
GUILE's character set support was not really to make everything more
complicated, was it?

-- 
David Kastrup
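Editorial illustration (not part of the original message): the following is a minimal sketch of what "parsing the UTF-8 character distances" could look like, written in Python rather than Guile. The function name byte_offset_to_char_index is hypothetical. It assumes the bytes are valid UTF-8, which is exactly the condition the message warns about: with stray Latin-1 bytes in the input, the mapping no longer matches a re-encoded string.

```python
def byte_offset_to_char_index(data: bytes, byte_offset: int) -> int:
    """Count the characters that start before byte_offset in UTF-8 data.

    Assumes data is valid UTF-8 (an assumption of this sketch).
    """
    if not 0 <= byte_offset <= len(data):
        raise ValueError("byte offset out of range")
    # UTF-8 continuation bytes have the bit pattern 10xxxxxx; every other
    # byte starts a new character.
    return sum(1 for b in data[:byte_offset] if (b & 0xC0) != 0x80)


text = "naïve café"
data = text.encode("utf-8")
offset = data.index(b"caf")  # byte offset, as a byte-based lexer would see it
print(offset, byte_offset_to_char_index(data, offset))
```

The conversion is linear in the offset, so doing it at every hand-over between a byte-based lexer and a character-based string would add cost on top of the bookkeeping.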
bug-guile@HIDDEN: bug#18520; Package guile.
From: ludo@HIDDEN (Ludovic Courtès)
To: David Kastrup <dak@HIDDEN>
Subject: Re: bug#18520: string ports should not have an encoding
Date: Mon, 22 Sep 2014 22:39:29 +0200

David Kastrup <dak@HIDDEN> skribis:

> ludo@HIDDEN (Ludovic Courtès) writes:
>
>> David Kastrup <dak@HIDDEN> skribis:
>>
>>> I'm currently migrating LilyPond over to GUILE 2.0.  LilyPond has its
>>> own UTF-8 verification, error flagging, processing and indexing.
>>
>> Do I understand correctly that LilyPond expects Guile strings to be byte
>> vectors, which it can feed with UTF-8 byte sequences that it built by
>> itself?
>
> Not really.  LilyPond reads and parses its own files but it does divert
> parts through GUILE occasionally in the process.  Some stuff is passed
> through GUILE with time delays and parts wrapped into closures and
> flagged with machine-identifiable source locations.

OK.

>>> If you take a look at
>>> <URL:http://git.savannah.gnu.org/cgit/lilypond.git/tree/scm/parser-ly-from-scheme.scm>,
>>> ftell on a string port is here used for correlating the positions of
>>> parsed subexpressions with the original data.  Reencoding strings in
>>> utf-8 is not going to make this work with string indexing since ftell
>>> does not bear a useful relation to string positions.
>>
>> AIUI the result of ‘ftell’ is used in only one place, while
>> ‘port-line’ and ‘port-column’ are used in other places.
>
> The ftell information is wrapped into an alist together with a closure
> corresponding to the source location.  At a later point of time, the
> surrounding string may be interpreted, and the source location is
> correlated with the closure and the closure used instead of a call to
> local-eval (which does not have the same power of evaluating materials
> in a preserved lexical environment as a closure has).
>
>> The latter seems more appropriate to me when it comes to tracking
>> source location.
>
> For error messages, yes.  For associating a position in a string with a
> previously parsed closure, no.

But wouldn’t a line/column pair be as suitable as a unique identifier as
the position in the file?

Also, if the result of ‘ftell’ is used as a unique identifier, does it
really matter whether it’s an offset measured in bytes or in character?

(Trying to make sure I understand the problem.)

Thanks,
Ludo’.
From: David Kastrup <dak@HIDDEN>
To: ludo@HIDDEN (Ludovic Courtès)
Subject: Re: bug#18520: string ports should not have an encoding
Date: Mon, 22 Sep 2014 19:20:54 +0200

ludo@HIDDEN (Ludovic Courtès) writes:

> David Kastrup <dak@HIDDEN> skribis:
>
>> I'm currently migrating LilyPond over to GUILE 2.0.  LilyPond has its
>> own UTF-8 verification, error flagging, processing and indexing.
>
> Do I understand correctly that LilyPond expects Guile strings to be byte
> vectors, which it can feed with UTF-8 byte sequences that it built by
> itself?

Not really.  LilyPond reads and parses its own files but it does divert
parts through GUILE occasionally in the process.  Some stuff is passed
through GUILE with time delays and parts wrapped into closures and
flagged with machine-identifiable source locations.

>> If you take a look at
>> <URL:http://git.savannah.gnu.org/cgit/lilypond.git/tree/scm/parser-ly-from-scheme.scm>,
>> ftell on a string port is here used for correlating the positions of
>> parsed subexpressions with the original data.  Reencoding strings in
>> utf-8 is not going to make this work with string indexing since ftell
>> does not bear a useful relation to string positions.
>
> AIUI the result of ‘ftell’ is used in only one place, while
> ‘port-line’ and ‘port-column’ are used in other places.

The ftell information is wrapped into an alist together with a closure
corresponding to the source location.  At a later point of time, the
surrounding string may be interpreted, and the source location is
correlated with the closure and the closure used instead of a call to
local-eval (which does not have the same power of evaluating materials
in a preserved lexical environment as a closure has).

> The latter seems more appropriate to me when it comes to tracking
> source location.

For error messages, yes.  For associating a position in a string with a
previously parsed closure, no.

> How is the result of ‘ftell’ used by callers of
> ‘read-lily-expression’?

See above.

>> I have more than enough crashes and obscure errors to contend with as
>> it stands,
>
> Could you open a separate bug with the backtrace of such crashes, if you
> think it may be Guile’s fault?

The backtraces are usually quite useless for diagnosing the crashes.
For example, there are crashes in scm_sloppy_assq.  If you look at the
code, it is clear that they can only happen for pairs that have already
been collected by garbage collection.  So the bug has occurred quite a
bit previously to the crash.  So one has to figure out how the
collection could possibly have happened (naturally, it didn't with GUILE
1.8).  You can try doing that with the rather expensive process of
"reverse execution" (which basically traces and keeps a history you can
then explore backwards from the crash), but that requires that the bugs
are reproducible, and with collection in a separate thread, that is not
really the case.  Sometimes a crash segfaults, more often you get
std::exception triggered.  All with the same input and executable.

-- 
David Kastrup
From: ludo@HIDDEN (Ludovic Courtès)
To: David Kastrup <dak@HIDDEN>
Subject: Re: bug#18520: string ports should not have an encoding
Date: Mon, 22 Sep 2014 19:08:16 +0200

David Kastrup <dak@HIDDEN> skribis:

> I'm currently migrating LilyPond over to GUILE 2.0.  LilyPond has its
> own UTF-8 verification, error flagging, processing and indexing.

Do I understand correctly that LilyPond expects Guile strings to be byte
vectors, which it can feed with UTF-8 byte sequences that it built by
itself?

> If you take a look at
> <URL:http://git.savannah.gnu.org/cgit/lilypond.git/tree/scm/parser-ly-from-scheme.scm>,
> ftell on a string port is here used for correlating the positions of
> parsed subexpressions with the original data.  Reencoding strings in
> utf-8 is not going to make this work with string indexing since ftell
> does not bear a useful relation to string positions.

AIUI the result of ‘ftell’ is used in only one place, while ‘port-line’
and ‘port-column’ are used in other places.  The latter seems more
appropriate to me when it comes to tracking source location.

How is the result of ‘ftell’ used by callers of ‘read-lily-expression’?

> I have more than enough crashes and obscure errors to contend with as
> it stands,

Could you open a separate bug with the backtrace of such crashes, if you
think it may be Guile’s fault?

Thanks,
Ludo’.
From: David Kastrup <dak@HIDDEN>
To: ludo@HIDDEN (Ludovic Courtès)
Subject: Re: bug#18520: string ports should not have an encoding
Date: Mon, 22 Sep 2014 15:34:51 +0200

ludo@HIDDEN (Ludovic Courtès) writes:

> David Kastrup <dak@HIDDEN> skribis:
>
>> Guile-2.2 does not consult %default-port-encoding but uses UTF-8
>> consistently (I guess, overriding set-port-encoding! will again change
>> that).
>>
>> That still is not satisfactory.  For example, using ftell on the input
>> port will not report the string index of the string connected to the
>> string port but rather a byte index into a UTF-8 encoded version of the
>> string.  This is a number that has nothing to do with the original
>> string and cannot be used for correlating string and port.
>
> Right.
>
>> Ports fundamentally deliver characters, and so reading and writing from
>> a string source/sink should not involve _any_ coding system.
>>
>> Files fundamentally deliver bytes, a conversion is required.  The same
>> would be the case when opening a port on a _bytevector_.  Here an
>> encoding would make equally make sense, and ftell/fseek offsets would
>> naturally be in bytes.  But a port on a string delivers and consumes
>> characters.  Any conversion, even a fixed UTF-8 conversion, will destroy
>> the predictable nature of with-output-to-string and
>> with-input-from-string and the respective uses of string ports.
>
> Guile ports can be mixed textual/binary (unlike R6 ports, which are
> either textual or binary.)  Thus, they fundamentally deliver bytes,
> possibly with a textual conversion.

I think that is a mischaracterization.  GUILE ports at the current point
of time can _only_ be binary, to the degree that strings/texts first
have to be encoded into a binary stream before they can be passed
through a port.  Which is what this issue is about.

> Although the manual isn’t clear about it, ‘ftell’, when available,
> returns a position in bytes.

Which is not helpful if the input does not consist of bytes.

> The situation for string ports here is comparable to that of other
> ports used for textual I/O.

No.  The situation for file ports is that ftell refers to identifiable
and reproducible byte offsets of the input, the input being a file
consisting of bytes and indexed using bytes.

The situation for string ports is that ftell refers to unidentifiable
and incidental byte offsets of a temporary inaccessible ad-hoc encoding
of the input, the input being a string consisting of characters and
indexed using characters.

> Do you have a situation where you were relying on 1.8’s behavior in
> that regard?  Could we see whether this can be solved differently?

I'm currently migrating LilyPond over to GUILE 2.0.  LilyPond has its
own UTF-8 verification, error flagging, processing and indexing.  I have
more than enough crashes and obscure errors to contend with as it
stands, so the first port will use LC_CTYPE=C (LC_CTYPE=ISO-8859-1 does
not work since then GUILE/iconv considers itself entitled to complain
about improper Latin-1) and will keep GUILE 2.0 from thinking about
UTF-8 at all.

Moving string processing to UTF-8 will be a gradual process, and a
separate project involving programmer choices about what to represent
where how: much of LilyPond is written in C++ and so UTF-8 encoded
strings (rather than GUILE's strings consisting of either UCS-8 or
UCS-32) are ubiquitous, with most of LilyPond's core literals fitting in
the common ASCII subset.

Whenever GUILE chooses to take decisions from the user and programmer,
problems are likely to result, and workarounds will abound.  For
efficiency reasons, it is not realistic to demand that any string data
passed between GUILE and LilyPond will have to be encoded and reencoded
at every call gate: there is a real lot of them.

-- 
David Kastrup
From: David Kastrup <dak@HIDDEN>
To: ludo@HIDDEN (Ludovic Courtès)
Subject: Re: bug#18520: string ports should not have an encoding
Date: Mon, 22 Sep 2014 15:09:25 +0200

ludo@HIDDEN (Ludovic Courtès) writes:

> This has been addressed in two ways:

No, it hasn't.

> 1. In 2.0, (srfi srfi-6) uses Unicode-capable string ports (commit
>    ecb48dc.)

This issue report is not about adding more optional functionality on
top.  It is about _removing_ unwarranted redirection and complication
from existing core functionality.  The artifacts of making
with-input-from-string and with-output-to-string go through an
additional character->bytevector->character encoding/recoding layer are
not invisible.

> 2. In 2.2, string ports are always Unicode-capable, and
>    ‘%default-port-encoding’ is ignored (commit 6dce942.)

String ports should not be "Unicode capable" but transparent.
Characters in, characters out.  ftell/fseek should be based on character
position in strings rather than offsets in a magically created
bytestream of some particular encoding.

> So for 2.0, the workaround is to either use (srfi srfi-6), or force
> ‘%default-port-encoding’ to "UTF-8".

Which is what the latter _only_ does.  It still interprets
set-port-encoding! with respect to a byte stream meaning, and it still
calculates positions according to a byte stream meaning not related to
string positions:

    (use-modules (srfi srfi-6))
    (define s (list->string (map integer->char '(20 200 2000 20000))))
    (let ((port (open-input-string s)))
      (let loop ((ch (read-char port)))
        (if (not (eof-object? ch))
            (begin
              (format #t "~d, pos=~d\n" (char->integer ch) (ftell port))
              (loop (read-char port))))))

    20, pos=1
    200, pos=3
    2000, pos=5
    20000, pos=8

Tying string ports to an artificial bytevector presentation in a manner
bleeding through like that means that it is not possible to synchronize
string positions and stream positions when parts of the source string
are _not_ processed from within the stream.

Which is precisely the problem I am currently dealing with while porting
LilyPond: it has its own lexer working on an (utf-8 encoded) byte stream
which is at the same time available as a string port.  Whenever embedded
Scheme is interpreted, the string port is moved to the proper position,
GUILE reads an expression and is told what to do with it, the string
port position is picked off and the LilyPond lexer is moved to the
respective position to continue.

If you take a look at
<URL:http://git.savannah.gnu.org/cgit/lilypond.git/tree/scm/parser-ly-from-scheme.scm>,
ftell on a string port is here used for correlating the positions of
parsed subexpressions with the original data.  Reencoding strings in
utf-8 is not going to make this work with string indexing since ftell
does not bear a useful relation to string positions.

The behavior of ftell and port-encoding is perfectly fine for reading
from bytevectors or files, and reading from bytevectors or files also
does not incur a encode-when-open action governed by
%default-port-encoding in GUILE-2.0 and by hardwired UTF-8 in GUILE-2.2.

But strings are already decoded characters.  Reencoding makes no sense
and detaches things like ftell and fseek from the actual input into the
port.

-- 
David Kastrup
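Editorial illustration (not part of the original message): the positions reported in the message (1, 3, 5, 8) are consistent with cumulative UTF-8 byte lengths of the four characters. The sketch below recomputes them in Python, independently of Guile; the helper name utf8_positions is hypothetical, and the sketch says nothing about how Guile implements ftell internally.

```python
def utf8_positions(text):
    """Return (code point, byte offset just past the character) pairs,
    where offsets refer to the UTF-8 encoding of text."""
    positions = []
    offset = 0
    for ch in text:
        offset += len(ch.encode("utf-8"))  # 1 to 4 bytes per character
        positions.append((ord(ch), offset))
    return positions


s = "".join(chr(n) for n in (20, 200, 2000, 20000))
for code, pos in utf8_positions(s):
    print(f"{code}, pos={pos}")
```

A character-based position would instead read 1, 2, 3, 4 for the same string, which is the number that matches string indexing.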
From: ludo@HIDDEN (Ludovic Courtès)
To: David Kastrup <dak@HIDDEN>
Subject: Re: bug#18520: string ports should not have an encoding
Date: Mon, 22 Sep 2014 14:21:21 +0200

I see my reply failed to address some of the points raised.

David Kastrup <dak@HIDDEN> skribis:

> Guile-2.2 does not consult %default-port-encoding but uses UTF-8
> consistently (I guess, overriding set-port-encoding! will again change
> that).
>
> That still is not satisfactory.  For example, using ftell on the input
> port will not report the string index of the string connected to the
> string port but rather a byte index into a UTF-8 encoded version of the
> string.  This is a number that has nothing to do with the original
> string and cannot be used for correlating string and port.

Right.

> Ports fundamentally deliver characters, and so reading and writing from
> a string source/sink should not involve _any_ coding system.
>
> Files fundamentally deliver bytes, a conversion is required.  The same
> would be the case when opening a port on a _bytevector_.  Here an
> encoding would make equally make sense, and ftell/fseek offsets would
> naturally be in bytes.  But a port on a string delivers and consumes
> characters.  Any conversion, even a fixed UTF-8 conversion, will destroy
> the predictable nature of with-output-to-string and
> with-input-from-string and the respective uses of string ports.

Guile ports can be mixed textual/binary (unlike R6 ports, which are
either textual or binary.)  Thus, they fundamentally deliver bytes,
possibly with a textual conversion.

Although the manual isn’t clear about it, ‘ftell’, when available,
returns a position in bytes.  The situation for string ports here is
comparable to that of other ports used for textual I/O.

Do you have a situation where you were relying on 1.8’s behavior in that
regard?  Could we see whether this can be solved differently?

Thanks,
Ludo’.
From: ludo@HIDDEN (Ludovic Courtès)
To: David Kastrup <dak@HIDDEN>
Cc: 18520 <at> debbugs.gnu.org
Subject: Re: bug#18520: string ports should not have an encoding
Date: Mon, 22 Sep 2014 13:54:29 +0200
Message-ID: <87k34vrhu2.fsf@HIDDEN>
In-Reply-To: <87iokgmttc.fsf@HIDDEN> (David Kastrup's message of "Mon, 22 Sep 2014 01:34:39 +0200")

This has been addressed in two ways:

  1. In 2.0, (srfi srfi-6) uses Unicode-capable string ports (commit
     ecb48dc.)

  2. In 2.2, string ports are always Unicode-capable, and
     ‘%default-port-encoding’ is ignored (commit 6dce942.)

So for 2.0, the workaround is to either use (srfi srfi-6), or force
‘%default-port-encoding’ to "UTF-8".

HTH,
Ludo’.
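The two Guile 2.0 workarounds mentioned in the message above could look roughly like the following. This is only a sketch: the helper names `roundtrip-srfi-6` and `roundtrip-utf8` and the test string are made up for illustration and do not come from the thread or from Guile itself.

```scheme
;; Sketch only: two ways to get a Unicode-capable string port on Guile 2.0.
;; The procedure names below are hypothetical helpers, not Guile API.
(use-modules (srfi srfi-6))

;; A test string containing code points outside ISO-8859-1.
(define s (list->string (map integer->char '(955 20000))))

;; Workaround 1: use the (srfi srfi-6) string port procedures.
(define (roundtrip-srfi-6 str)
  (let ((port (open-output-string)))
    (display str port)
    (get-output-string port)))

;; Workaround 2: bind %default-port-encoding to "UTF-8" while the
;; string port is opened and written.
(define (roundtrip-utf8 str)
  (with-fluids ((%default-port-encoding "UTF-8"))
    (call-with-output-string
      (lambda (port) (display str port)))))

;; Both should give back the original string unchanged.
(display (string=? s (roundtrip-srfi-6 s)))
(newline)
(display (string=? s (roundtrip-utf8 s)))
(newline)
```

Using `with-fluids` rather than `fluid-set!` keeps the encoding change local to the dynamic extent of the call, so other ports opened elsewhere are not affected.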
bug-guile@HIDDEN: bug#18520; Package guile.
From: David Kastrup <dak@HIDDEN>
To: bug-guile@HIDDEN
Subject: string ports should not have an encoding
Date: Mon, 22 Sep 2014 01:34:39 +0200
Message-ID: <87iokgmttc.fsf@HIDDEN>

In Guile 2.0, at the time a string port is opened, the value of the
fluid %default-port-encoding is used for deciding how to encode the
string into a byte stream, and set-port-encoding! may then be used for
deciding how to decode that byte stream back into characters.

This does not make sense as ports deliver characters, and strings
contain characters.  There is no point in going through bytes.

Guile-2.2 does not consult %default-port-encoding but uses UTF-8
consistently (I guess, overriding set-port-encoding! will again change
that).

That still is not satisfactory.  For example, using ftell on the input
port will not report the string index of the string connected to the
string port but rather a byte index into a UTF-8 encoded version of the
string.  This is a number that has nothing to do with the original
string and cannot be used for correlating string and port.

Ports fundamentally deliver characters, and so reading and writing from
a string source/sink should not involve _any_ coding system.

Files fundamentally deliver bytes, so a conversion is required.  The
same would be the case when opening a port on a _bytevector_.  Here an
encoding would equally make sense, and ftell/fseek offsets would
naturally be in bytes.

But a port on a string delivers and consumes characters.  Any
conversion, even a fixed UTF-8 conversion, will destroy the predictable
nature of with-output-to-string and with-input-from-string and the
respective uses of string ports.

In code like the following, the results should not depend on either the
fluid-set! or the set-port-encoding!, and the ftell should always output
successive integers independent from either fluid-set! or
set-port-encoding!.  set-port-encoding! should probably flag an error,
like an fseek on an unseekable device.

    (fluid-set! %default-port-encoding "UTF-8")

    (define s (list->string (map integer->char '(20 200 2000 20000))))

    (with-input-from-string s
      (lambda ()
        (set-port-encoding! (current-input-port) "ISO-8859-1")
        (let loop ((ch (read-char (current-input-port))))
          (if (not (eof-object? ch))
              (begin
                (format #t "~d, pos=~d\n"
                        (char->integer ch)
                        (ftell (current-input-port)))
                (loop (read-char (current-input-port))))))))

Again, things are quite different from bytevectors which could be
accepted instead of a string for opening ports with the string-port
commands, or could have their own port open/close commands, and the
respective ports then definitely would want to obey set-port-encoding!
(defaulting to %default-port-encoding) for _decoding_ the bytevector.

I don't know what r7rs might think here.  But for me, associating
encodings for connecting strings to ports does not make sense.  The
relation is one of characters to characters.

-- 
David Kastrup
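To make the index mismatch described above concrete, the following sketch computes, for the same four-character test string, the character index next to the byte offset into a UTF-8 encoding of the string. It does not call ftell at all; it only does the arithmetic, so it says nothing about what a particular Guile version actually reports. The helper names `utf8-width` and `offsets` are invented for this illustration.

```scheme
;; Sketch only: character index vs. UTF-8 byte offset for the test string.
;; utf8-width and offsets are hypothetical helpers, not Guile API.
(use-modules (rnrs bytevectors))

(define s (list->string (map integer->char '(20 200 2000 20000))))

;; Number of bytes a single character occupies when encoded as UTF-8.
(define (utf8-width ch)
  (bytevector-length (string->utf8 (string ch))))

;; For each character, return (code-point char-index byte-offset), where
;; both positions are measured just after that character has been read.
(define (offsets str)
  (let loop ((chars (string->list str)) (index 0) (bytes 0) (acc '()))
    (if (null? chars)
        (reverse acc)
        (let ((index (+ index 1))
              (bytes (+ bytes (utf8-width (car chars)))))
          (loop (cdr chars) index bytes
                (cons (list (char->integer (car chars)) index bytes)
                      acc))))))

(for-each
 (lambda (row)
   (simple-format #t "~a: char index ~a, UTF-8 byte offset ~a\n"
                  (car row) (cadr row) (caddr row)))
 (offsets s))
```

The character indices run 1, 2, 3, 4, while the UTF-8 byte offsets run 1, 3, 5, 8, since the four code points need 1, 2, 2 and 3 bytes respectively.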
David Kastrup <dak@HIDDEN>
:bug-guile@HIDDEN
.
bug-guile@HIDDEN: bug#18520; Package guile.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997 nCipher Corporation Ltd,
1994-97 Ian Jackson.