GNU bug report logs - #62290
Error when handling invalid unicode with suspendable ports
Report forwarded to bug-guile <at> gnu.org: bug#62290; Package guile. (Mon, 20 Mar 2023 09:13:02 GMT)
Acknowledgement sent to Christopher Baines <mail <at> cbaines.net>: New bug report received and forwarded. Copy sent to bug-guile <at> gnu.org. (Mon, 20 Mar 2023 09:13:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org:
Here's a simple reproducer:
(use-modules (ice-9 binary-ports)
             (ice-9 suspendable-ports)
             (rnrs bytevectors))

(define (test)
  (let* ((sequence
          '(#xf4 #xa4 #xbd #xa4))
         (p (open-bytevector-input-port
             (u8-list->bytevector sequence))))
    (set-port-encoding! p "UTF-8")
    (set-port-conversion-strategy! p 'substitute)
    (peek (read-char p))))

(test)

(install-suspendable-ports!)

(test)
If you run it, the first call to (test) outputs #\� as expected, but the
second call, made after (install-suspendable-ports!), raises an exception.
The behaviour should be the same in both cases.
;;; (#\�)
Backtrace:
In ice-9/boot-9.scm:
1752:10 8 (with-exception-handler _ _ #:unwind? _ # _)
In unknown file:
7 (apply-smob/0 #<thunk 7f3f09cbe300>)
In ice-9/boot-9.scm:
724:2 6 (call-with-prompt ("prompt") #<procedure 7f3f09ccb320 …> …)
In ice-9/eval.scm:
619:8 5 (_ #(#(#<directory (guile-user) 7f3f09cc1c80>)))
In ice-9/boot-9.scm:
2836:4 4 (save-module-excursion #<procedure 7f3f09cb2300 at ice-…>)
4388:12 3 (_)
In /home/chris/Projects/Guile/guile/bad-unicode.scm:
12:10 2 (test)
In ice-9/suspendable-ports.scm:
591:33 1 (read-char _)
499:12 0 (peek-char-and-next-cur/utf8 _ _ _ _)
ice-9/suspendable-ports.scm:499:12: In procedure peek-char-and-next-cur/utf8:
In procedure integer->char: Argument 1 out of range: 1199972
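The number in the error message can be reproduced by hand. The following is only a sketch, assuming the standard 4-byte UTF-8 bit layout; the helper name combine-4-byte-utf8 is hypothetical and is not part of suspendable-ports.scm. Combining the four bytes of the reproducer without rejecting the second byte yields exactly the value that integer->char refuses, and that value lies above #x10ffff, the largest Unicode code point.

```scheme
;; Sketch: merge the payload bits of a 4-byte UTF-8 sequence, with no
;; validation of the continuation bytes (hypothetical helper).
(define (combine-4-byte-utf8 u8_0 u8_1 u8_2 u8_3)
  (logior (ash (logand u8_0 #x07) 18)   ; 3 payload bits from the lead byte
          (ash (logand u8_1 #x3f) 12)   ; 6 payload bits per continuation byte
          (ash (logand u8_2 #x3f) 6)
          (logand u8_3 #x3f)))

(define code-point (combine-4-byte-utf8 #xf4 #xa4 #xbd #xa4))

(display code-point) (newline)               ; prints 1199972
(display (> code-point #x10ffff)) (newline)  ; prints #t
```

Since #x10ffff is 1114111, the sequence has to be rejected before the bytes are combined, which is what the 'substitute strategy relies on.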
Information forwarded to bug-guile <at> gnu.org: bug#62290; Package guile. (Mon, 20 Mar 2023 09:16:02 GMT)
Message #8 received at 62290 <at> debbugs.gnu.org:
Based on the implementation in ports.c. I don't understand what this
code is really doing, but the suspendable ports implementation differs
from the similar C code for a couple of inequalities.
* module/ice-9/suspendable-ports.scm (decode-utf8, bad-utf8-len): Flip a
couple of inequalities.
* test-suite/tests/ports.test ("string ports"): Add additional invalid
UTF-8 test case.
---
module/ice-9/suspendable-ports.scm | 8 ++++----
test-suite/tests/ports.test | 7 +++++++
2 files changed, 11 insertions(+), 4 deletions(-)
diff --git a/module/ice-9/suspendable-ports.scm b/module/ice-9/suspendable-ports.scm
index a823f1d37..9fac1df62 100644
--- a/module/ice-9/suspendable-ports.scm
+++ b/module/ice-9/suspendable-ports.scm
@@ -419,7 +419,7 @@
(= (logand u8_2 #xc0) #x80)
(case u8_0
((#xe0) (>= u8_1 #xa0))
- ((#xed) (>= u8_1 #x9f))
+ ((#xed) (<= u8_1 #x9f))
(else #t)))
(kt (integer->char
(logior (ash (logand u8_0 #x0f) 12)
@@ -436,7 +436,7 @@
(= (logand u8_3 #xc0) #x80)
(case u8_0
((#xf0) (>= u8_1 #x90))
- ((#xf4) (>= u8_1 #x8f))
+ ((#xf4) (<= u8_1 #x8f))
(else #t)))
(kt (integer->char
(logior (ash (logand u8_0 #x07) 18)
@@ -462,7 +462,7 @@
((< buffering 2) 1)
((not (= (logand (ref 1) #xc0) #x80)) 1)
((and (eq? first-byte #xe0) (< (ref 1) #xa0)) 1)
- ((and (eq? first-byte #xed) (< (ref 1) #x9f)) 1)
+ ((and (eq? first-byte #xed) (> (ref 1) #x9f)) 1)
((< buffering 3) 2)
((not (= (logand (ref 2) #xc0) #x80)) 2)
(else 0)))
@@ -471,7 +471,7 @@
((< buffering 2) 1)
((not (= (logand (ref 1) #xc0) #x80)) 1)
((and (eq? first-byte #xf0) (< (ref 1) #x90)) 1)
- ((and (eq? first-byte #xf4) (< (ref 1) #x8f)) 1)
+ ((and (eq? first-byte #xf4) (> (ref 1) #x8f)) 1)
((< buffering 3) 2)
((not (= (logand (ref 2) #xc0) #x80)) 2)
((< buffering 4) 3)
diff --git a/test-suite/tests/ports.test b/test-suite/tests/ports.test
index 66e10e3dd..1b30e1a68 100644
--- a/test-suite/tests/ports.test
+++ b/test-suite/tests/ports.test
@@ -1059,6 +1059,13 @@
eof))
(test-decoding-error (#xf0 #x88 #x88 #x88) "UTF-8"
+ (error ;; 2nd byte should be in the 90..BF range
+ error ;; 88: not a valid starting byte
+ error ;; 88: not a valid starting byte
+ error ;; 88: not a valid starting byte
+ eof))
+
+ (test-decoding-error (#xf4 #xa4 #xbd #xa4) "UTF-8"
(error ;; 2nd byte should be in the 90..BF range
error ;; 88: not a valid starting byte
error ;; 88: not a valid starting byte
--
2.39.1
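The second-byte constraints that the patched comparisons express can be sketched as a single predicate. This is an illustration only: valid-second-byte? is a hypothetical name, and the actual module performs these checks inline in decode-utf8 and bad-utf8-len. The comments give the usual UTF-8 rationale for each bound, which the thread itself does not spell out.

```scheme
;; Sketch (hypothetical helper): is U8_1 an acceptable second byte after
;; lead byte U8_0 in a 3- or 4-byte UTF-8 sequence?
(define (valid-second-byte? u8_0 u8_1)
  (and (= (logand u8_1 #xc0) #x80)          ; must be a continuation byte
       (case u8_0
         ((#xe0) (>= u8_1 #xa0))            ; below A0: overlong 3-byte form
         ((#xed) (<= u8_1 #x9f))            ; above 9F: surrogates D800..DFFF
         ((#xf0) (>= u8_1 #x90))            ; below 90: overlong 4-byte form
         ((#xf4) (<= u8_1 #x8f))            ; above 8F: beyond U+10FFFF
         (else #t))))

(display (valid-second-byte? #xf4 #xa4)) (newline)  ; prints #f
(display (valid-second-byte? #xf4 #x8f)) (newline)  ; prints #t
(display (valid-second-byte? #xed #xa0)) (newline)  ; prints #f
(display (valid-second-byte? #xf0 #x88)) (newline)  ; prints #f
```

With the pre-patch comparison (>= u8_1 #x8f), the pair #xf4 #xa4 from the reproducer passed the check, so the bytes were combined and handed to integer->char.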
Reply sent to Ludovic Courtès <ludo <at> gnu.org>: You have taken responsibility. (Mon, 20 Mar 2023 22:28:02 GMT)
Notification sent to Christopher Baines <mail <at> cbaines.net>: bug acknowledged by developer. (Mon, 20 Mar 2023 22:28:02 GMT)
Message #13 received at 62290-done <at> debbugs.gnu.org:
Hello,
Christopher Baines <mail <at> cbaines.net> writes:
> Based on the implementation in ports.c. I don't understand what this
> code is really doing, but the suspendable ports implementation differs
> from the similar C code for a couple of inequalities.
>
> * module/ice-9/suspendable-ports.scm (decode-utf8, bad-utf8-len): Flip a
> couple of inequalities.
> * test-suite/tests/ports.test ("string ports"): Add additional invalid
> UTF-8 test case.
Pushed as cba2e7e3fec3c781230570f5d1ef070625eeeda8.
Thanks for documenting the problem and providing a perfect patch!
Ludo’.
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 18 Apr 2023 11:24:08 GMT)