GNU bug report logs - #38269
SSAX incorrect handling of &gt; in CDATA

Please note: This is a static page, with minimal formatting, updated once a day.

Package: guile; Reported by: Andrew Gierth <andrew@HIDDEN>; dated Tue, 19 Nov 2019 14:50:01 UTC; Maintainer for guile is bug-guile@HIDDEN.

Message received at submit <at> debbugs.gnu.org:


From: Andrew Gierth <andrew@HIDDEN>
To: bug-guile@HIDDEN
Subject: SSAX incorrect handling of &gt; in CDATA
Message-ID: <87zhgsyost.fsf@HIDDEN>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (berkeley-unix)
Date: Tue, 19 Nov 2019 13:41:54 +0000
MIME-Version: 1.0
Content-Type: text/plain

The bug:

> (xml->sxml "<e><![CDATA[&gt;]]></e>")
$2 = (*TOP* (e ">"))

The expected result is (*TOP* (e "&gt;")).

In upstream/SSAX.scm:

; procedure+: 	ssax:read-cdata-body PORT STR-HANDLER SEED
[...]
; Within a CDATA section all characters are taken at their face value,
; with only three exceptions:
[...]
;	&gt; is treated as an embedded #\> character

This handling of &gt; is contrary to the XML specification, under which
nothing inside a CDATA section is treated specially except line-end
normalization and the "]]>" closing delimiter. I have confirmed this by checking other XML
parsers. The code seems to be based on a wild misreading of another
section of the specification that does not apply here. (And
unfortunately, the W3C validation suite for XML happens not to contain
any instances of &gt; inside CDATA.)
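
The cross-check against another parser can be reproduced along the following lines. This is a minimal sketch using Python's standard-library xml.etree.ElementTree, picked here only as an example of an independent parser; it is an assumption of this sketch and not necessarily one of the parsers consulted for the report.

```python
import xml.etree.ElementTree as ET

# Same document as in the report: the CDATA section holds the four
# characters '&', 'g', 't', ';'.
doc = "<e><![CDATA[&gt;]]></e>"

# Inside CDATA, "&gt;" is not an entity reference, so the parser
# hands it back unchanged as character data.
cdata_text = ET.fromstring(doc).text
print(repr(cdata_text))

# For contrast: outside CDATA the same sequence is an entity reference
# and is expanded to a single '>' character.
plain_text = ET.fromstring("<e>&gt;</e>").text
print(repr(plain_text))
```

The first value matches the result the report expects from xml->sxml; the second shows the expansion that applies only outside CDATA.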

I believe the fix should be as simple as removing the entire (#\&) case
from the function (and fixing the test cases).
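
To illustrate the behaviour the proposed fix is aiming for, here is a rough sketch of a CDATA body reader with no ampersand case at all. It is written in Python over a plain string rather than a port, and the name read_cdata_body and its signature are hypothetical; it is not the SSAX code and not a patch for it.

```python
def read_cdata_body(text, pos=0):
    """Read a CDATA body from text, starting just after '<![CDATA['.

    Returns (content, next_pos), where next_pos is the index right
    after the closing ']]>'. Hypothetical helper for illustration.
    """
    # The first occurrence of "]]>" ends the section.
    end = text.find("]]>", pos)
    if end < 0:
        raise ValueError("unterminated CDATA section")
    body = text[pos:end]
    # Line-end normalization: CR LF and a lone CR both become LF.
    body = body.replace("\r\n", "\n").replace("\r", "\n")
    # No other rewriting: '&' and "&gt;" are passed through untouched.
    return body, end + 3


content, next_pos = read_cdata_body("&gt;]]></e>")
print(repr(content), next_pos)
```

The only two things the sketch reacts to are line ends and the "]]>" delimiter, which is what remains once the ampersand branch is dropped.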

This bug seems to exist in all versions of SSAX.

-- 
Andrew.




Acknowledgement sent to Andrew Gierth <andrew@HIDDEN>:
New bug report received and forwarded. Copy sent to bug-guile@HIDDEN. Full text available.
Report forwarded to bug-guile@HIDDEN:
bug#38269; Package guile. Full text available.
Last modified: Mon, 25 Nov 2019 12:00:02 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.