From: Andrew Gierth <andrew@HIDDEN>
To: 38269 <at> debbugs.gnu.org
Subject: bug#38269: SSAX incorrect handling of &gt; in CDATA
Date: Tue, 19 Nov 2019 13:41:54 +0000
Message-ID: <87zhgsyost.fsf@HIDDEN>
X-GNU-PR-Package: guile

The bug:

> (xml->sxml "<e><![CDATA[&gt;]]></e>")
$2 = (*TOP* (e ">"))

The expected result is (*TOP* (e "&gt;")).

In upstream/SSAX.scm:

; procedure+:	ssax:read-cdata-body PORT STR-HANDLER SEED
[...]
; Within a CDATA section all characters are taken at their face value,
; with only three exceptions:
[..]
;	&gt; is treated as an embedded #\> character

This handling of &gt; is contrary to the XML specification, in which
there are no special character sequences inside CDATA except newline
and the "]]>" closing tag. I have confirmed this by checking other XML
parsers. The code seems to be based on a wild misreading of another
section of the specification that does not apply here. (And
unfortunately, the W3C validation suite for XML happens not to contain
any instances of &gt; inside CDATA.)

I believe the fix should be as simple as removing the entire (#\&) case
from the function (and fixing the test cases).

This bug seems to exist in all versions of SSAX.

-- 
Andrew.
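
The report's claim that other XML parsers leave the four characters `&gt;` untouched inside a CDATA section can be cross-checked with a small sketch. This one uses Python's standard-library `xml.etree.ElementTree` (backed by expat), which is an assumption of this note and not necessarily one of the parsers the reporter tried. The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# Same document as in the report: a CDATA section whose content is the
# four characters '&', 'g', 't', ';'.
doc = "<e><![CDATA[&gt;]]></e>"

# Inside CDATA the parser should not expand entity references, so the
# text is expected to come back unchanged.
cdata_text = ET.fromstring(doc).text
print(repr(cdata_text))

# For contrast: outside CDATA the same characters form an entity
# reference and are expanded to a single '>' character.
plain_text = ET.fromstring("<e>&gt;</e>").text
print(repr(plain_text))
```

The first value keeps all four characters, matching the result the report expects from `xml->sxml`. The second shows that entity expansion applies only outside the CDATA section.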
From: help-debbugs@HIDDEN (GNU bug Tracking System)
To: Andrew Gierth <andrew@HIDDEN>
Subject: bug#38269: Acknowledgement (SSAX incorrect handling of &gt; in CDATA)
Date: Tue, 19 Nov 2019 14:50:01 +0000
Message-ID: <handler.38269.B.157417499422816.ack <at> debbugs.gnu.org>
References: <87zhgsyost.fsf@HIDDEN>
Reply-To: 38269 <at> debbugs.gnu.org

Thank you for filing a new bug report with debbugs.gnu.org.

This is an automatically generated reply to let you know your message
has been received.

Your message is being forwarded to the package maintainers and other
interested parties for their attention; they will reply in due course.

Your message has been sent to the package maintainer(s):
 bug-guile@HIDDEN

If you wish to submit further information on this problem, please
send it to 38269 <at> debbugs.gnu.org.

Please do not send mail to help-debbugs@HIDDEN unless you wish
to report a problem with the Bug-tracking system.

-- 
38269: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=38269
GNU Bug Tracking System
Contact help-debbugs@HIDDEN with problems