X-Loop: help-debbugs@HIDDEN Subject: bug#20339: sxml simple: sxml->xml mishandles namespaces? Resent-From: tomas@HIDDEN Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> Resent-CC: bug-guile@HIDDEN Resent-Date: Wed, 15 Apr 2015 19:48:02 +0000 Resent-Message-ID: <handler.20339.B.142912725311595 <at> debbugs.gnu.org> Resent-Sender: help-debbugs@HIDDEN X-GNU-PR-Message: report 20339 X-GNU-PR-Package: guile X-GNU-PR-Keywords: To: 20339 <at> debbugs.gnu.org X-Debbugs-Original-To: bug-guile@HIDDEN Received: via spool by submit <at> debbugs.gnu.org id=B.142912725311595 (code B ref -1); Wed, 15 Apr 2015 19:48:02 +0000 Received: (at submit) by debbugs.gnu.org; 15 Apr 2015 19:47:33 +0000 Received: from localhost ([127.0.0.1]:57415 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1YiTHY-00030w-92 for submit <at> debbugs.gnu.org; Wed, 15 Apr 2015 15:47:32 -0400 Received: from eggs.gnu.org ([208.118.235.92]:33015) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from <tomas@HIDDEN>) id 1YiTHU-00030b-Pc for submit <at> debbugs.gnu.org; Wed, 15 Apr 2015 15:47:29 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from <tomas@HIDDEN>) id 1YiTHO-0001CM-Gx for submit <at> debbugs.gnu.org; Wed, 15 Apr 2015 15:47:23 -0400 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50 autolearn=disabled version=3.3.2 Received: from lists.gnu.org ([2001:4830:134:3::11]:36257) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from <tomas@HIDDEN>) id 1YiTHO-0001CI-EO for submit <at> debbugs.gnu.org; Wed, 15 Apr 2015 15:47:22 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:46510) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from <tomas@HIDDEN>) id 1YiTHM-0007IB-MS for bug-guile@HIDDEN; Wed, 15 Apr 2015 15:47:22 -0400 Received: from Debian-exim by 
eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from <tomas@HIDDEN>) id 1YiTHH-0001BI-SO for bug-guile@HIDDEN; Wed, 15 Apr 2015 15:47:20 -0400 Received: from mail.tuxteam.de ([5.199.139.25]:44167 helo=tomasium.tuxteam.de) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from <tomas@HIDDEN>) id 1YiTHH-0001BE-Ls for bug-guile@HIDDEN; Wed, 15 Apr 2015 15:47:15 -0400 Received: from tomas by tomasium.tuxteam.de with local (Exim 4.80) (envelope-from <tomas@HIDDEN>) id 1YiTHG-0008Cw-LG for bug-guile@HIDDEN; Wed, 15 Apr 2015 21:47:14 +0200 Date: Wed, 15 Apr 2015 21:47:14 +0200 From: tomas@HIDDEN Message-ID: <20150415194714.GA30295@HIDDEN> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8; x-action=pgp-signed Content-Transfer-Encoding: 8bit User-Agent: Mutt/1.5.21 (2010-09-15) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address (bad octet value). X-Received-From: 2001:4830:134:3::11 X-Spam-Score: -5.0 (-----) X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -5.0 (-----) -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi, I posted more details on guile-devel. Perhaps this was the wrong list? 
When transforming SXML to XML, namespaces don't seem to be handled
properly:

#!/usr/bin/guile -s
!#

(use-modules (sxml simple))

;; An XML with two namespaces (one default)
(define the-svg
  "<svg xmlns='http://www.w3.org/2000/svg'
        xmlns:xlink='http://www.w3.org/1999/xlink'>
     <rect x='5' y='5' width='20' height='20'
           stroke-width='2' stroke='purple' fill='yellow'
           id='rect1' />
     <rect x='30' y='5' width='20' height='20' ry='5' rx='8'
           stroke-width='2' stroke='purple' fill='blue'
           xlink:href='#rect1' />
   </svg>")

;; Note how SXML handles QNames (just concatenating NS and
;; local-name with a colon):
(define the-sxml (with-input-from-string the-svg xml->sxml))
(format #t "~A\n" the-sxml)

;; If we try to serialize this: kaboom!
(sxml->xml the-sxml)

The parsing into SXML goes well, the (format ...) outputs what I'd
expect. But the (sxml->xml ...) dies with:

ERROR: In procedure scm-error:
ERROR: Invalid QName: more than one colon http://www.w3.org/2000/svg:svg

The problem is that SXML used the concatenated (full) namespace with the
name as tag (and attribute) names for namespaced items. When serializing
to XML it should try to find abbreviations for those namespaces and issue
the corresponding namespace declarations.

Instead, sxml->xml tries to split the (namespace:name) combination
at the first colon and to check the name -- and fails miserably at
(namespace:name) combinations à la "http://www.w3.org/1999/xlink:href"
(procedure check-name). Since there are two colons, the name part
has now a colon.

There are more details at:

  http://lists.gnu.org/archive/html/guile-devel/2015-04/msg00000.html

with a first attempt at a patch against guile (GNU Guile) 2.0.5-deb+1-3.

I'm more than willing to beat the patch into shape, but will possibly
need some guidance. Perhaps I'd need to sign papers with the FSF, which
I'd gladly do.
Regards - -- tomás -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iEYEARECAAYFAlUuwEIACgkQBcgs9XrR2kbJWQCfQ/ALFQrf0crOK47SbaOlJlMv MwAAn3fxDBWOhgNF0L7E35k0skol2T0V =FIId -----END PGP SIGNATURE-----
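The failure described in the report above comes down to where the QName string is cut. The following is a minimal sketch in Guile Scheme, not code taken from (sxml simple); the helper names split-at-first-colon and split-at-last-colon are made up for illustration. It contrasts cutting at the first colon (what the report says check-name does) with cutting at the last one:

```scheme
;; Hypothetical helpers, for illustration only; not part of (sxml simple).

(define (split-at-first-colon str)
  ;; Cut at the first colon, as the report describes for check-name.
  (let ((i (string-index str #\:)))
    (if i
        (cons (substring str 0 i) (substring str (1+ i)))
        (cons #f str))))

(define (split-at-last-colon str)
  ;; Cut at the last colon, so a namespace URI that itself
  ;; contains colons stays in one piece.
  (let ((i (string-rindex str #\:)))
    (if i
        (cons (substring str 0 i) (substring str (1+ i)))
        (cons #f str))))

(define qname "http://www.w3.org/1999/xlink:href")

(split-at-first-colon qname)
;; => ("http" . "//www.w3.org/1999/xlink:href")

(split-at-last-colon qname)
;; => ("http://www.w3.org/1999/xlink" . "href")
```

With the first-colon cut, the "name" part still contains a colon, which is what triggers the "more than one colon" error.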
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
X-Mailer: MIME-tools 5.503 (Entity 5.503)
Content-Type: text/plain; charset=utf-8
X-Loop: help-debbugs@HIDDEN
From: help-debbugs@HIDDEN (GNU bug Tracking System)
To: tomas@HIDDEN
Subject: bug#20339: Acknowledgement (sxml simple: sxml->xml mishandles namespaces?)
Message-ID: <handler.20339.B.142912725311595.ack <at> debbugs.gnu.org>
References: <20150415194714.GA30295@HIDDEN>
X-Gnu-PR-Message: ack 20339
X-Gnu-PR-Package: guile
Reply-To: 20339 <at> debbugs.gnu.org
Date: Wed, 15 Apr 2015 19:48:02 +0000

Thank you for filing a new bug report with debbugs.gnu.org.

This is an automatically generated reply to let you know your message
has been received.

Your message is being forwarded to the package maintainers and other
interested parties for their attention; they will reply in due course.

Your message has been sent to the package maintainer(s):
 bug-guile@HIDDEN

If you wish to submit further information on this problem, please
send it to 20339 <at> debbugs.gnu.org.

Please do not send mail to help-debbugs@HIDDEN unless you wish
to report a problem with the Bug-tracking system.

-- 
20339: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=20339
GNU Bug Tracking System
Contact help-debbugs@HIDDEN with problems
X-Loop: help-debbugs@HIDDEN Subject: bug#20339: [PATCH] sxml->xml and namespaces: updated patch References: <20150415194714.GA30295@HIDDEN> In-Reply-To: <20150415194714.GA30295@HIDDEN> Resent-From: <tomas@HIDDEN> Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> Resent-CC: bug-guile@HIDDEN Resent-Date: Mon, 20 Apr 2015 07:46:02 +0000 Resent-Message-ID: <handler.20339.B20339.142951592327523 <at> debbugs.gnu.org> Resent-Sender: help-debbugs@HIDDEN X-GNU-PR-Message: followup 20339 X-GNU-PR-Package: guile X-GNU-PR-Keywords: To: 20339 <at> debbugs.gnu.org Received: via spool by 20339-submit <at> debbugs.gnu.org id=B20339.142951592327523 (code B ref 20339); Mon, 20 Apr 2015 07:46:02 +0000 Received: (at 20339) by debbugs.gnu.org; 20 Apr 2015 07:45:23 +0000 Received: from localhost ([127.0.0.1]:32934 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1Yk6OQ-00079q-5X for submit <at> debbugs.gnu.org; Mon, 20 Apr 2015 03:45:23 -0400 Received: from mail.tuxteam.de ([5.199.139.25]:51574 helo=tomasium.tuxteam.de) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from <tomas@HIDDEN>) id 1Yk6ON-00079g-KW for 20339 <at> debbugs.gnu.org; Mon, 20 Apr 2015 03:45:21 -0400 Received: from tomas by tomasium.tuxteam.de with local (Exim 4.80) (envelope-from <tomas@HIDDEN>) id 1Yk6OL-0008LR-7s for 20339 <at> debbugs.gnu.org; Mon, 20 Apr 2015 09:45:17 +0200 Date: Mon, 20 Apr 2015 09:45:17 +0200 Message-ID: <20150420074517.GA31087@HIDDEN> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="da4uJneut+ArUgXk" Content-Disposition: inline User-Agent: Mutt/1.5.21 (2010-09-15) From: <tomas@HIDDEN> X-Spam-Score: -0.0 (/) X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: 
<https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -0.0 (/)

--da4uJneut+ArUgXk
Content-Type: multipart/mixed; boundary="l76fUT7nc3MelDdI"
Content-Disposition: inline

--l76fUT7nc3MelDdI
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

Hi,

I've embellished my proposed patch a bit:

 - use values and call-with-values instead of passing around
   lists. This was one thing I didn't like about my first patch
   candidate: the namespace --> ns abbreviation lookup had two
   things to return, for one the abbreviation, and whether this
   abbreviation was "new" (for convenience in the form of a
   (namespace . abbreviation) pair). Instead of returning a list,
   now it returns multiple values.

 - patch is now against current stable instead of against "whatever
   Debian stable packages", i.e. against

     d680713 2015-04-03 16:35:54 +0200 Ludovic Courtès (stable-2.0) doc: Update libgc URL.

I'm still not sure whether this is the way to go (i.e. mixing the
abbreviation stuff into the serialization), or whether a pre-pass
(replacing namespaces by abbreviations and generating the namespace
declaration "attributes") would be the way to go.

Besides, I'd like to have some input on whether it'd be worth following
the usual convention and putting the namespace declarations before
regular attributes (forcing us to do two passes on a tag node's
attribute list). The generated XML looks pretty weird as is now.

What I'd still like to introduce is a "mapping preference" as an
optional argument by the user, possibly per-node (like "I'd like
'http://www.w3.org/1999/xlink' to be abbreviated as 'xlink'" or
something like that). Other XML serializers offer that. I envision
this as a function; the library would fall back to generating the
abbreviation whenever the function returns #f.

The question on whether this patch (or whatever it evolves into) has
a chance of getting into Guile is still open: I'd have to get my papers
from the FSF in this case.

Inputs?

--l76fUT7nc3MelDdI
Content-Type: text/x-diff; charset=us-ascii
Content-Disposition: attachment; filename="abbreviate-and-declare-namespaces.patch"
Content-Transfer-Encoding: quoted-printable

diff --git a/module/sxml/simple.scm b/module/sxml/simple.scm
index 703ad91..86b0784 100644
--- a/module/sxml/simple.scm
+++ b/module/sxml/simple.scm
@@ -215,29 +215,38 @@ port."
              (elements (reverse (parser port '()))))
         `(*TOP* ,@elements)))
 
-(define check-name
-  (let ((*good-cache* (make-hash-table)))
-    (lambda (name)
-      (if (not (hashq-ref *good-cache* name))
-          (let* ((str (symbol->string name))
-                 (i (string-index str #\:))
-                 (head (or (and i (substring str 0 i)) str))
-                 (tail (and i (substring str (1+ i)))))
-            (and i (string-index (substring str (1+ i)) #\:)
-                 (error "Invalid QName: more than one colon" name))
-            (for-each
-             (lambda (s)
-               (and s
-                    (or (char-alphabetic? (string-ref s 0))
-                        (eq? (string-ref s 0) #\_)
-                        (error "Invalid name starting character" s name))
-                    (string-for-each
-                     (lambda (c)
-                       (or (char-alphabetic? c) (string-index "0123456789.-_" c)
-                           (error "Invalid name character" c s name)))
-                     s)))
-             (list head tail))
-            (hashq-set! *good-cache* name #t))))))
+(define (ns-lookup ns nsmap)
+  "Look up namespace ns in nsmap. Return its abbreviation or #f"
+  (assoc-ref nsmap ns))
+
+(define ns-abbr-new
+  (let ((*nscounter* 0))
+    (lambda ()
+      (set! *nscounter* (1+ *nscounter*))
+      (string-append "ns" (number->string *nscounter*)))))
+
+(define (ns-abbr name nsmap)
+  "Takes a QName, SXML style (i.e a symbol whose string value is either a
+clean local name or a colon-concatenated pair of namespace:name, and returns
+two values: the string <nsabbrev>:<local-name> and either a pair (<namespace> .
+nsabbrev) whenever <namespace> wasn't in nsmap, or #f when it was"
+  ;; FIXME check for empty ns (e.g ":foo")
+  ;;       check (worse!) for empty locname (e.g. "foo:")
+  (let* ((str (symbol->string name))
+         (i (string-rindex str #\:))
+         (ns (and i (substring str 0 i)))
+         (locname (or (and i (substring str (1+ i))) str)))
+    (if ns
+        (let ((nsabbr (ns-lookup ns nsmap)))
+          (if nsabbr
+              ;; known namespace:
+              (values (string-append nsabbr ":" locname) #f)
+              ;; unknown namespace
+              (let ((nsabbr (ns-abbr-new)))
+                (values (string-append nsabbr ":" locname)
+                        (cons ns nsabbr)))))
+        ;; empty namespace: clean local-name:
+        (values locname #f))))
 
 ;; The following two functions serialize tags and attributes. They are
 ;; being used in the node handlers for the post-order function, see
@@ -260,42 +269,58 @@ port."
       port))))
 
 (define (attribute->xml attr value port)
-  (check-name attr)
   (display attr port)
   (display "=\"" port)
   (attribute-value->xml value port)
   (display #\" port))
 
-(define (element->xml tag attrs body port)
-  (check-name tag)
-  (display #\< port)
-  (display tag port)
-  (if attrs
-      (let lp ((attrs attrs))
-        (if (pair? attrs)
-            (let ((attr (car attrs)))
+(define (element->xml tag attrs body port nsmap)
+  (let ((new-namespaces '()))
+    (call-with-values (lambda () (ns-abbr tag nsmap))
+      (lambda (abname new-ns)
+        (when new-ns
+          (set! new-namespaces (cons new-ns new-namespaces)))
+        (display #\< port)
+        (display abname port)
+        (if attrs
+            (let lp ((attrs attrs))
+              (if (pair? attrs)
+                  (let ((attr (car attrs)))
+                    (display #\space port)
+                    (if (pair? attr)
+                        (call-with-values (lambda () (ns-abbr (car attr) nsmap))
+                          (lambda (abname new-ns)
+                            (when new-ns
+                              (set! new-namespaces (cons new-ns new-namespaces)))
+                            (attribute->xml abname (cdr attr) port)))
+                        (error "bad attribute" tag attr))
+                    (lp (cdr attrs)))
+                  (if (not (null? attrs))
+                      (error "bad attributes" tag attrs)))))
+        ;; Output namespace declarations
+        (let lp ((new-namespaces new-namespaces))
+          (unless (null? new-namespaces)
+            ;; remember: car is namespace, cdr is abbrev
+            (let ((ns (caar new-namespaces))
+                  (nsabbr (cdar new-namespaces)))
               (display #\space port)
-              (if (pair? attr)
-                  (attribute->xml (car attr) (cdr attr) port)
-                  (error "bad attribute" tag attr))
-              (lp (cdr attrs)))
-            (if (not (null? attrs))
-                (error "bad attributes" tag attrs)))))
-  (if (pair? body)
-      (begin
-        (display #\> port)
-        (let lp ((body body))
-          (cond
-           ((pair? body)
-            (sxml->xml (car body) port)
-            (lp (cdr body)))
-           ((null? body)
-            (display "</" port)
-            (display tag port)
-            (display ">" port))
-           (else
-            (error "bad element body" tag body)))))
-      (display " />" port)))
+              (attribute->xml (string-append "xmlns:" nsabbr) ns port))
+            (lp (cdr new-namespaces))))
+        (if (pair? body)
+            (begin
+              (display #\> port)
+              (let lp ((body body))
+                (cond
+                 ((pair? body)
+                  (sxml->xml (car body) port (append new-namespaces nsmap))
+                  (lp (cdr body)))
+                 ((null? body)
+                  (display "</" port)
+                  (display abname port)
+                  (display ">" port))
+                 (else
+                  (error "bad element body" tag body)))))
+            (display " />" port))))))
 
 ;; FIXME: ensure name is valid
 (define (entity->xml name port)
@@ -311,7 +336,8 @@ port."
     (display str port)
     (display "?>" port))
 
-(define* (sxml->xml tree #:optional (port (current-output-port)))
+(define* (sxml->xml tree #:optional (port (current-output-port))
+                    (nsmap '()))
   "Serialize the sxml tree @var{tree} as XML. The output will be written to
 the current output port, unless the optional argument @var{port} is
 present."
@@ -322,7 +348,7 @@ present."
         (let ((tag (car tree)))
           (case tag
             ((*TOP*)
-             (sxml->xml (cdr tree) port))
+             (sxml->xml (cdr tree) port nsmap))
             ((*ENTITY*)
              (if (and (list? (cdr tree)) (= (length (cdr tree)) 1))
                  (entity->xml (cadr tree) port)
@@ -336,9 +362,9 @@ present."
                     (attrs (and (pair? elems) (pair? (car elems))
                                 (eq? '@ (caar elems))
                                 (cdar elems))))
-               (element->xml tag attrs (if attrs (cdr elems) elems) port)))))
+               (element->xml tag attrs (if attrs (cdr elems) elems) port nsmap)))))
         ;; A nodelist.
-        (for-each (lambda (x) (sxml->xml x port)) tree)))
+        (for-each (lambda (x) (sxml->xml x port nsmap)) tree)))
    ((string? tree)
     (string->escaped-xml tree port))
    ((null? tree) *unspecified*)

--l76fUT7nc3MelDdI--

--da4uJneut+ArUgXk
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: Digital signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iEYEARECAAYFAlU0rowACgkQBcgs9XrR2kZRwACffTrZx5cCTIr7pMETu2kLbqvZ
H8kAnAq9DYpMgKjL7sRpox496i/QN7Dl
=Yxx8
-----END PGP SIGNATURE-----

--da4uJneut+ArUgXk--
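The "mapping preference" idea from the message above (a user-supplied function that may return a prefix, with a generated abbreviation as fallback) could look roughly like the following. This is only a sketch under assumptions: make-abbreviator, preferred and abbreviate are hypothetical names that appear in neither the patch nor Guile, and the ns1, ns2 naming merely imitates what ns-abbr-new does in the patch.

```scheme
;; Hypothetical sketch of the "mapping preference" idea; not part of the patch.

(define (make-abbreviator prefer)
  ;; PREFER takes a namespace URI (a string) and returns either a
  ;; preferred prefix (a string) or #f to decline.
  (let ((counter 0))
    (lambda (ns)
      (or (prefer ns)
          ;; Fall back to a generated prefix: ns1, ns2, ...
          (begin
            (set! counter (1+ counter))
            (string-append "ns" (number->string counter)))))))

;; Example preference table (an assumption for this sketch).
(define preferred
  '(("http://www.w3.org/1999/xlink" . "xlink")))

(define abbreviate
  (make-abbreviator (lambda (ns) (assoc-ref preferred ns))))

(abbreviate "http://www.w3.org/1999/xlink")  ; => "xlink"
(abbreviate "http://www.w3.org/2000/svg")    ; => "ns1"
```

The sketch does not remember generated prefixes, so asking twice for the same unknown namespace yields two different abbreviations; in the patch that bookkeeping is done by the nsmap alist.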
X-Loop: help-debbugs@HIDDEN Subject: bug#20339: sxml simple: sxml->xml mishandles namespaces? Resent-From: Ricardo Wurmus <rekado@HIDDEN> Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> Resent-CC: bug-guile@HIDDEN Resent-Date: Tue, 21 Apr 2015 09:25:02 +0000 Resent-Message-ID: <handler.20339.B20339.142960825926865 <at> debbugs.gnu.org> Resent-Sender: help-debbugs@HIDDEN X-GNU-PR-Message: followup 20339 X-GNU-PR-Package: guile X-GNU-PR-Keywords: To: tomas@HIDDEN Cc: 20339 <at> debbugs.gnu.org Received: via spool by 20339-submit <at> debbugs.gnu.org id=B20339.142960825926865 (code B ref 20339); Tue, 21 Apr 2015 09:25:02 +0000 Received: (at 20339) by debbugs.gnu.org; 21 Apr 2015 09:24:19 +0000 Received: from localhost ([127.0.0.1]:34180 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1YkUPi-0006zF-S1 for submit <at> debbugs.gnu.org; Tue, 21 Apr 2015 05:24:19 -0400 Received: from sender1.zohomail.com ([74.201.84.162]:52621) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from <rekado@HIDDEN>) id 1YkUPe-0006z2-Or for 20339 <at> debbugs.gnu.org; Tue, 21 Apr 2015 05:24:17 -0400 Received: from localhost (141.80.115.59 [141.80.115.59]) by mx.zohomail.com with SMTPS id 1429608247649710.8344961909249; Tue, 21 Apr 2015 02:24:07 -0700 (PDT) References: <20150415194714.GA30295@HIDDEN> From: Ricardo Wurmus <rekado@HIDDEN> In-reply-to: <20150415194714.GA30295@HIDDEN> Date: Tue, 21 Apr 2015 11:24:03 +0200 Message-ID: <87oamh25sc.fsf@HIDDEN> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Spam-Score: 1.0 (+) X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: 
<https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: 1.0 (+)

Hi Tomás,

tomas@HIDDEN writes:

> When transforming SXML to XML, namespaces don't seem to be handled
> properly:
>
[...]
>
> The problem is that SXML used the concatenated (full) namespace with the
> name as tag (and attribute) names for namespaced items. When serializing
> to XML it should try to find abbreviations for those namespaces and issue
> the corresponding namespace declarations.
>
> Instead, sxml->xml tries to split the (namespace:name) combination
> at the first colon and to check the name -- and fails miserably at
> (namespace:name) combinations à la "http://www.w3.org/1999/xlink:href"
> (procedure check-name). Since there are two colons, the name part
> has now a colon.

xml->sxml has an optional #:namespaces argument, where you can pass an
alist of keys to URLs to be used in the sxml output:

    (let* ((ns '((svg . "http://www.w3.org/2000/svg")
                 (xlink . "http://www.w3.org/1999/xlink")))
           (the-sxml (xml->sxml the-svg #:namespaces ns)))
      (display the-sxml))

    => (*TOP* (svg:svg
                (svg:rect (@ (y 5) (x 5)
                             (width 20)
                             (stroke-width 2)
                             (stroke purple)
                             (id rect1)
                             (height 20)
                             (fill yellow)))
                (svg:rect (@ (xlink:href #rect1)
                             (y 5) (x 30)
                             (width 20)
                             (stroke-width 2)
                             (stroke purple)
                             (ry 5) (rx 8)
                             (height 20)
                             (fill blue)))))

Passing this to sxml->xml yields:

    <svg:svg>
      <svg:rect y="5" x="5"
                width="20"
                stroke-width="2"
                stroke="purple"
                id="rect1"
                height="20"
                fill="yellow" />
      <svg:rect xlink:href="#rect1"
                y="5" x="30"
                width="20"
                stroke-width="2"
                stroke="purple"
                ry="5" rx="8"
                height="20"
                fill="blue" />
    </svg:svg>

Unfortunately, sxml->xml will not replace the namespace abbreviations,
nor will it add appropriate xmlns attributes, so "svg" and "xlink" are
devoid of any meaning.

Since xml->sxml accepts a namespace alist I suppose it would make sense
to extend sxml->xml to do the same.

~~ Ricardo
X-Loop: help-debbugs@HIDDEN Subject: bug#20339: sxml simple: sxml->xml mishandles namespaces? Resent-From: tomas@HIDDEN Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> Resent-CC: bug-guile@HIDDEN Resent-Date: Tue, 21 Apr 2015 09:46:02 +0000 Resent-Message-ID: <handler.20339.B20339.142960952028915 <at> debbugs.gnu.org> Resent-Sender: help-debbugs@HIDDEN X-GNU-PR-Message: followup 20339 X-GNU-PR-Package: guile X-GNU-PR-Keywords: To: Ricardo Wurmus <rekado@HIDDEN> Cc: tomas@HIDDEN, 20339 <at> debbugs.gnu.org Received: via spool by 20339-submit <at> debbugs.gnu.org id=B20339.142960952028915 (code B ref 20339); Tue, 21 Apr 2015 09:46:02 +0000 Received: (at 20339) by debbugs.gnu.org; 21 Apr 2015 09:45:20 +0000 Received: from localhost ([127.0.0.1]:34266 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1YkUk3-0007WF-Ts for submit <at> debbugs.gnu.org; Tue, 21 Apr 2015 05:45:20 -0400 Received: from mail.tuxteam.de ([5.199.139.25]:54735 helo=tomasium.tuxteam.de) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from <tomas@HIDDEN>) id 1YkUjQ-0007Uo-TH for 20339 <at> debbugs.gnu.org; Tue, 21 Apr 2015 05:44:41 -0400 Received: from tomas by tomasium.tuxteam.de with local (Exim 4.80) (envelope-from <tomas@HIDDEN>) id 1YkUjO-000623-OP; Tue, 21 Apr 2015 11:44:38 +0200 Date: Tue, 21 Apr 2015 11:44:38 +0200 From: tomas@HIDDEN Message-ID: <20150421094438.GA22715@HIDDEN> References: <20150415194714.GA30295@HIDDEN> <87oamh25sc.fsf@HIDDEN> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8; x-action=pgp-signed Content-Transfer-Encoding: 8bit In-Reply-To: <87oamh25sc.fsf@HIDDEN> User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Score: -0.0 (/) X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -0.0 (/)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Tue, Apr 21, 2015 at 11:24:03AM +0200, Ricardo Wurmus wrote:
> Hi Tomás,
>
> tomas@HIDDEN writes:
>
> > When transforming SXML to XML, namespaces don't seem to be handled
> > properly:
> >
> [...]
> >
> > The problem is that SXML used the concatenated (full) namespace with the
> > name as tag (and attribute) names for namespaced items. When serializing
> > to XML it should try to find abbreviations for those namespaces and issue
> > the corresponding namespace declarations.
> >
> > Instead, sxml->xml tries to split the (namespace:name) combination
> > at the first colon and to check the name -- and fails miserably at
> > (namespace:name) combinations à la "http://www.w3.org/1999/xlink:href"
> > (procedure check-name). Since there are two colons, the name part
> > has now a colon.
>
> xml->sxml has an optional #:namespaces argument, where you can pass an
> alist of keys to URLs to be used in the sxml output:

Aha. Didn't know about this one, thanks. Yes, the problem is that SXML
loses the link to the "real" namespaces: the application around it has
to keep track of that.

> Passing this to sxml->xml yields:
>
>     <svg:svg>
>       <svg:rect y="5" x="5"
>                 width="20"
>                 stroke-width="2"
>                 stroke="purple"
>                 id="rect1"
>                 height="20"
>                 fill="yellow" />
>       <svg:rect xlink:href="#rect1"
>                 y="5" x="30"
>                 width="20"
>                 stroke-width="2"
>                 stroke="purple"
>                 ry="5" rx="8"
>                 height="20"
>                 fill="blue" />
>     </svg:svg>

Yes, this looks "nearly" right, except...

> Unfortunately, sxml->xml will not replace the namespace abbreviations,
> nor will it add appropriate xmlns attributes, so "svg" and "xlink" are
> devoid of any meaning.

exactly.

> Since xml->sxml accepts a namespace alist I suppose it would make sense
> to extend sxml->xml to do the same.

This is more or less what I do in my proposed patch (it's in the bugs
mailing list as 20339 <at> debbugs.gnu.org). It passes around an alist
of (namespace . abbrev) associations (it's inverted wrt #:namespaces in
xml->sxml). The difference is that the abbreviations are "generated" as
ns1, ns2 and so on (and the namespace declarations are woven into the
attributes list).

So far no reply to my bug report, but this gives me the chance to
bikeshed my patch to death :-P

Thanks for looking into that -- and for prodding me into looking at more
sources :)

Regards
- -- t
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iEYEARECAAYFAlU2HAYACgkQBcgs9XrR2kYq+gCfexhJ5qFyN4QmIf4TfddPqyfT
434An3BSVKtyovRJdg8MGHzAY8I0/NTD
=O9Kj
-----END PGP SIGNATURE-----
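The reply above notes that the patch's alist is inverted with respect to the #:namespaces alist of xml->sxml. As a small illustration, converting one form into the other might look like this; invert-namespaces is a hypothetical helper name, and using strings for the abbreviations follows what the posted patch appears to store in nsmap.

```scheme
;; Hypothetical helper, for illustration only.

(define (invert-namespaces namespaces)
  ;; ((prefix-symbol . uri-string) ...)  as taken by xml->sxml #:namespaces
  ;; -> ((uri-string . prefix-string) ...)  as kept in the patch's nsmap
  (map (lambda (entry)
         (cons (cdr entry) (symbol->string (car entry))))
       namespaces))

(define ns '((svg . "http://www.w3.org/2000/svg")
             (xlink . "http://www.w3.org/1999/xlink")))

(invert-namespaces ns)
;; => (("http://www.w3.org/2000/svg" . "svg")
;;     ("http://www.w3.org/1999/xlink" . "xlink"))
```

Seeding nsmap with such an alist would choose the prefixes, but in the patch as posted only newly generated abbreviations lead to xmlns declarations, so the declarations for pre-seeded namespaces would still have to be written some other way.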
X-Loop: help-debbugs@HIDDEN Subject: bug#20339: sxml simple: sxml->xml mishandles namespaces? Resent-From: Ricardo Wurmus <rekado@HIDDEN> Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> Resent-CC: bug-guile@HIDDEN Resent-Date: Wed, 22 Apr 2015 14:31:02 +0000 Resent-Message-ID: <handler.20339.B20339.14297130154707 <at> debbugs.gnu.org> Resent-Sender: help-debbugs@HIDDEN X-GNU-PR-Message: followup 20339 X-GNU-PR-Package: guile X-GNU-PR-Keywords: To: tomas@HIDDEN Cc: 20339 <at> debbugs.gnu.org Received: via spool by 20339-submit <at> debbugs.gnu.org id=B20339.14297130154707 (code B ref 20339); Wed, 22 Apr 2015 14:31:02 +0000 Received: (at 20339) by debbugs.gnu.org; 22 Apr 2015 14:30:15 +0000 Received: from localhost ([127.0.0.1]:36626 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1YkvfK-0001Dp-DS for submit <at> debbugs.gnu.org; Wed, 22 Apr 2015 10:30:15 -0400 Received: from sender1.zohomail.com ([74.201.84.162]:52364) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from <rekado@HIDDEN>) id 1YkvfF-0001Dd-DZ for 20339 <at> debbugs.gnu.org; Wed, 22 Apr 2015 10:30:11 -0400 Received: from localhost (89.15.238.113 [89.15.238.113]) by mx.zohomail.com with SMTPS id 1429712981639612.5748451576998; Wed, 22 Apr 2015 07:29:41 -0700 (PDT) References: <20150415194714.GA30295@HIDDEN> <87oamh25sc.fsf@HIDDEN> <20150421094438.GA22715@HIDDEN> From: Ricardo Wurmus <rekado@HIDDEN> In-reply-to: <20150421094438.GA22715@HIDDEN> Date: Wed, 22 Apr 2015 16:29:32 +0200 Message-ID: <87fv7s1bjn.fsf@HIDDEN> MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" X-Zoho-Virus-Status: 1 X-Spam-Score: 1.0 (+) X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> 
debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: 1.0 (+)

--=-=-=
Content-Type: text/plain

>> Since xml->sxml accepts a namespace alist I suppose it would make sense
>> to extend sxml->xml to do the same.

Attached is a minimal patch to extend "sxml->xml" such that it accepts an
optional keyword argument "namespaces" with an alist of prefixes to
URLs, analogous to "xml->sxml".

When the namespaces alist is provided, "xmlns:prefix=url" attributes are
prepended to the element's list of attributes.

    ;; Define SVG document with namespaces
    (define the-svg "<svg xmlns='http://www.w3.org/2000/svg'
         xmlns:xlink='http://www.w3.org/1999/xlink'>
      <rect x='5' y='5' width='20' height='20'
            stroke-width='2' stroke='purple' fill='yellow'
            id='rect1' />
      <rect x='30' y='5' width='20' height='20'
            ry='5' rx='8'
            stroke-width='2' stroke='purple' fill='blue'
            xlink:href='#rect1' />
    </svg>")

    ;; Define alist of namespaces
    (define ns '((svg . "http://www.w3.org/2000/svg")
                 (xlink . "http://www.w3.org/1999/xlink")))

    ;; Convert to SXML, abbreviate namespaces according to ns alist
    (define the-sxml (xml->sxml the-svg #:namespaces ns))

    ;; Convert back to XML
    (sxml->xml the-sxml #:namespaces ns)

    => <svg:svg xmlns:svg="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
      <svg:rect y="5" x="5" width="20" stroke-width="2" stroke="purple" id="rect1" height="20" fill="yellow" />
      <svg:rect xlink:href="#rect1" y="5" x="30" width="20" stroke-width="2" stroke="purple" ry="5" rx="8" height="20" fill="blue" />
    </svg:svg>

Does this do what you want?

~~ Ricardo

--=-=-=
Content-Type: text/x-patch
Content-Disposition: inline;
 filename=0001-Write-XML-namespaces-when-serializing.patch

From 81fa92ad0c5537c41419fa1e55c6130bf0558c9f Mon Sep 17 00:00:00 2001
From: rekado <rekado@HIDDEN>
Date: Wed, 22 Apr 2015 13:09:27 +0200
Subject: [PATCH] Write XML namespaces when serializing.

* module/sxml/simple.scm (sxml->xml): Add optional keyword argument
  "namespaces".
---
 module/sxml/simple.scm | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/module/sxml/simple.scm b/module/sxml/simple.scm
index 703ad91..8cc20dd 100644
--- a/module/sxml/simple.scm
+++ b/module/sxml/simple.scm
@@ -311,7 +311,8 @@ port."
   (display str port)
   (display "?>" port))
 
-(define* (sxml->xml tree #:optional (port (current-output-port)))
+(define* (sxml->xml tree #:optional (port (current-output-port)) #:key
+                    (namespaces '()))
   "Serialize the sxml tree @var{tree} as XML. The output will be written
 to the current output port, unless the optional argument @var{port} is
 present."
@@ -322,7 +323,7 @@ present."
         (let ((tag (car tree)))
           (case tag
             ((*TOP*)
-             (sxml->xml (cdr tree) port))
+             (sxml->xml (cdr tree) port #:namespaces namespaces))
             ((*ENTITY*)
              (if (and (list? (cdr tree)) (= (length (cdr tree)) 1))
                  (entity->xml (cadr tree) port)
@@ -335,10 +336,16 @@ present."
              (let* ((elems (cdr tree))
                     (attrs (and (pair? elems) (pair? (car elems))
                                 (eq? '@ (caar elems))
-                                (cdar elems))))
-               (element->xml tag attrs (if attrs (cdr elems) elems) port)))))
+                                (cdar elems)))
+                    (xmlns (map (lambda (x)
+                                  (cons (symbol-append 'xmlns: (car x))
+                                        (cdr x)))
+                                namespaces)))
+               (element->xml tag
+                             (if attrs (append xmlns attrs) xmlns)
+                             (if attrs (cdr elems) elems) port)))))
         ;; A nodelist.
-        (for-each (lambda (x) (sxml->xml x port)) tree)))
+        (for-each (lambda (x) (sxml->xml x port #:namespaces namespaces)) tree)))
    ((string? tree)
     (string->escaped-xml tree port))
    ((null? tree) *unspecified*)
-- 
2.1.0

--=-=-=--
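
To see what the `xmlns` binding in the patch evaluates to, the `map` expression can be tried on its own at a Guile REPL. This is only a sketch: the helper name `namespaces->xmlns-attrs` is made up for illustration and is not part of (sxml simple); the body is the expression from the diff.

```scheme
;; Hypothetical helper (not in Guile): the same `map' as in the patch,
;; pulled out so it can be evaluated in isolation.
(define (namespaces->xmlns-attrs namespaces)
  (map (lambda (x)
         (cons (symbol-append 'xmlns: (car x))
               (cdr x)))
       namespaces))

(define ns '((svg . "http://www.w3.org/2000/svg")
             (xlink . "http://www.w3.org/1999/xlink")))

(namespaces->xmlns-attrs ns)
;; => ((xmlns:svg . "http://www.w3.org/2000/svg")
;;     (xmlns:xlink . "http://www.w3.org/1999/xlink"))
```

Each entry keeps the alist's prefix and only gains the `xmlns:` part in front, which is why the prefixes in the serialized output match the ones passed to `xml->sxml`.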
X-Loop: help-debbugs@HIDDEN Subject: bug#20339: sxml simple: sxml->xml mishandles namespaces? Resent-From: tomas@HIDDEN Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> Resent-CC: bug-guile@HIDDEN Resent-Date: Thu, 23 Apr 2015 06:58:01 +0000 Resent-Message-ID: <handler.20339.B20339.14297722401723 <at> debbugs.gnu.org> Resent-Sender: help-debbugs@HIDDEN X-GNU-PR-Message: followup 20339 X-GNU-PR-Package: guile X-GNU-PR-Keywords: To: Ricardo Wurmus <rekado@HIDDEN> Cc: 20339 <at> debbugs.gnu.org Received: via spool by 20339-submit <at> debbugs.gnu.org id=B20339.14297722401723 (code B ref 20339); Thu, 23 Apr 2015 06:58:01 +0000 Received: (at 20339) by debbugs.gnu.org; 23 Apr 2015 06:57:20 +0000 Received: from localhost ([127.0.0.1]:37024 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1YlB4a-0000Rj-4q for submit <at> debbugs.gnu.org; Thu, 23 Apr 2015 02:57:20 -0400 Received: from mail.tuxteam.de ([5.199.139.25]:60241 helo=tomasium.tuxteam.de) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from <tomas@HIDDEN>) id 1YlB4X-0000RT-G8 for 20339 <at> debbugs.gnu.org; Thu, 23 Apr 2015 02:57:18 -0400 Received: from tomas by tomasium.tuxteam.de with local (Exim 4.80) (envelope-from <tomas@HIDDEN>) id 1YlB4V-0005BL-4W; Thu, 23 Apr 2015 08:57:15 +0200 Date: Thu, 23 Apr 2015 08:57:14 +0200 From: tomas@HIDDEN Message-ID: <20150423065714.GB19410@HIDDEN> References: <20150415194714.GA30295@HIDDEN> <87oamh25sc.fsf@HIDDEN> <20150421094438.GA22715@HIDDEN> <87fv7s1bjn.fsf@HIDDEN> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8; x-action=pgp-signed Content-Transfer-Encoding: 8bit In-Reply-To: <87fv7s1bjn.fsf@HIDDEN> User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Score: -0.0 (/) X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: 
<https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -0.0 (/)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Wed, Apr 22, 2015 at 04:29:32PM +0200, Ricardo Wurmus wrote:
>
> >> Since xml->sxml accepts a namespace alist I suppose it would make sense
> >> to extend sxml->xml to do the same.
>
> Attached is a minimal patch to extend "sxml->xml" such that it accepts an
> optional keyword argument "namespaces" with an alist of prefixes to
> URLs, analogous to "xml->sxml".

Thanks, I'll have a look at this this afternoon. Your code is far
prettier than mine, that's for sure :-)

What's yet missing (as far as I can read off the diff) is a way to
"dream up" an abbreviation when it's not in the namespaces alist.

Thanks again and regards
- --
tomás
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iEYEARECAAYFAlU4l8oACgkQBcgs9XrR2kb7SwCeNO0Z+RJZy6VUeQotm3+qX5rd
nXMAn2QeowgVnEj+9Zh3gMIBZW99Y3bx
=BrEt
-----END PGP SIGNATURE-----
X-Loop: help-debbugs@HIDDEN Subject: bug#20339: sxml simple: sxml->xml mishandles namespaces? Resent-From: Ricardo Wurmus <rekado@HIDDEN> Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> Resent-CC: bug-guile@HIDDEN Resent-Date: Thu, 23 Apr 2015 07:06:02 +0000 Resent-Message-ID: <handler.20339.B20339.14297727103193 <at> debbugs.gnu.org> Resent-Sender: help-debbugs@HIDDEN X-GNU-PR-Message: followup 20339 X-GNU-PR-Package: guile X-GNU-PR-Keywords: To: tomas@HIDDEN Cc: 20339 <at> debbugs.gnu.org Received: via spool by 20339-submit <at> debbugs.gnu.org id=B20339.14297727103193 (code B ref 20339); Thu, 23 Apr 2015 07:06:02 +0000 Received: (at 20339) by debbugs.gnu.org; 23 Apr 2015 07:05:10 +0000 Received: from localhost ([127.0.0.1]:37029 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1YlBC9-0000pL-39 for submit <at> debbugs.gnu.org; Thu, 23 Apr 2015 03:05:09 -0400 Received: from sender1.zohomail.com ([74.201.84.162]:53632) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from <rekado@HIDDEN>) id 1YlBC5-0000p4-OK for 20339 <at> debbugs.gnu.org; Thu, 23 Apr 2015 03:05:07 -0400 Received: from localhost (xd933f8e5.dyn.telefonica.de [217.51.248.229]) by mx.zohomail.com with SMTPS id 1429772699616412.097398393296; Thu, 23 Apr 2015 00:04:59 -0700 (PDT) References: <20150415194714.GA30295@HIDDEN> <87oamh25sc.fsf@HIDDEN> <20150421094438.GA22715@HIDDEN> <87fv7s1bjn.fsf@HIDDEN> <20150423065714.GB19410@HIDDEN> From: Ricardo Wurmus <rekado@HIDDEN> In-reply-to: <20150423065714.GB19410@HIDDEN> Date: Thu, 23 Apr 2015 09:04:46 +0200 Message-ID: <878udj1g1d.fsf@HIDDEN> MIME-Version: 1.0 Content-Type: text/plain X-Spam-Score: 1.0 (+) X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: 1.0 (+)

tomas@HIDDEN writes:

> What's yet missing (as far as I can read off the diff) is a way to
> "dream up" an abbreviation when it's not in the namespaces alist.

True.

Ideally, this should work even without passing a namespaces alist at all
in both "xml->sxml" and "sxml->xml". The non-abbreviated namespaces
should not cause "sxml->xml" to fail.

Passing around a namespaces alist to both these procedures is the least
invasive approach I could think of, but I still think that it *should*
be made to work without explicitly declaring namespaces.
X-Loop: help-debbugs@HIDDEN Subject: bug#20339: sxml simple: sxml->xml mishandles namespaces? Resent-From: tomas@HIDDEN Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> Resent-CC: bug-guile@HIDDEN Resent-Date: Thu, 23 Apr 2015 07:41:03 +0000 Resent-Message-ID: <handler.20339.B20339.14297748409886 <at> debbugs.gnu.org> Resent-Sender: help-debbugs@HIDDEN X-GNU-PR-Message: followup 20339 X-GNU-PR-Package: guile X-GNU-PR-Keywords: To: Ricardo Wurmus <rekado@HIDDEN> Cc: tomas@HIDDEN, 20339 <at> debbugs.gnu.org Received: via spool by 20339-submit <at> debbugs.gnu.org id=B20339.14297748409886 (code B ref 20339); Thu, 23 Apr 2015 07:41:03 +0000 Received: (at 20339) by debbugs.gnu.org; 23 Apr 2015 07:40:40 +0000 Received: from localhost ([127.0.0.1]:37069 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1YlBkW-0002ZO-Fr for submit <at> debbugs.gnu.org; Thu, 23 Apr 2015 03:40:40 -0400 Received: from mail.tuxteam.de ([5.199.139.25]:60331 helo=tomasium.tuxteam.de) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from <tomas@HIDDEN>) id 1YlBkT-0002Z9-5X for 20339 <at> debbugs.gnu.org; Thu, 23 Apr 2015 03:40:38 -0400 Received: from tomas by tomasium.tuxteam.de with local (Exim 4.80) (envelope-from <tomas@HIDDEN>) id 1YlBkQ-0005Va-JE; Thu, 23 Apr 2015 09:40:34 +0200 Date: Thu, 23 Apr 2015 09:40:34 +0200 From: tomas@HIDDEN Message-ID: <20150423074034.GA20961@HIDDEN> References: <20150415194714.GA30295@HIDDEN> <87oamh25sc.fsf@HIDDEN> <20150421094438.GA22715@HIDDEN> <87fv7s1bjn.fsf@HIDDEN> <20150423065714.GB19410@HIDDEN> <878udj1g1d.fsf@HIDDEN> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8; x-action=pgp-signed Content-Transfer-Encoding: 8bit In-Reply-To: <878udj1g1d.fsf@HIDDEN> User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Score: -0.0 (/) X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: 
<debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -0.0 (/)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Thu, Apr 23, 2015 at 09:04:46AM +0200, Ricardo Wurmus wrote:
>
> tomas@HIDDEN writes:
>
> > What's yet missing (as far as I can read off the diff) is a way to
> > "dream up" an abbreviation when it's not in the namespaces alist.
>
> True.
>
> Ideally, this should work even without passing a namespaces alist at all
> in both "xml->sxml" and "sxml->xml". The non-abbreviated namespaces
> should not cause "sxml->xml" to fail.
>
> Passing around a namespaces alist to both these procedures is the least
> invasive approach I could think of, but I still think that it *should*
> be made to work without explicitly declaring namespaces.

I think a combination of our approaches could work: the only difference
(apart from the code elegance) is that my patch grows this alist on its
way down the tree as it encounters new namespaces. This meshes well with
the namespace declaration, which scopes recursively down the XML tree.

This afternoon, while I sit at the e-Lok waiting for the FSFE meeting,
is a very good moment for me to look into it. I'll report tonight :-)

Thanks & later (dayjob calling)
- --
tomás
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iEYEARECAAYFAlU4ofIACgkQBcgs9XrR2kaFNwCfWzPunxHiiDJIJean02rx7pMT
92IAn2IGYW01Cx7aJt32MLRDQYuY9FbP
=owfk
-----END PGP SIGNATURE-----
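
The idea of growing the alist on the way down the tree might look roughly like the sketch below. It is not tomás's patch (which is not shown in this thread). It assumes the SXML was produced without `#:namespaces`, so that qualified tags carry the full URI before the last colon. The procedure names `tag-namespace` and `collect-namespaces` and the invented `nsN` prefixes are all hypothetical.

```scheme
(use-modules (srfi srfi-1))

;; Namespace part of an SXML tag symbol (everything before the last
;; colon), or #f when the tag is unqualified.
(define (tag-namespace tag)
  (let* ((s (symbol->string tag))
         (i (string-rindex s #\:)))
    (and i (substring s 0 i))))

;; Walk TREE and return KNOWN (an alist of prefix symbol . URI string)
;; extended with an invented prefix for every URI not yet listed.
;; Attribute names are visited too, since they look like (name value).
(define (collect-namespaces tree known)
  (cond
   ((and (pair? tree) (symbol? (car tree)))
    (let* ((uri (tag-namespace (car tree)))
           (known (if (and uri
                           (not (find (lambda (p) (equal? (cdr p) uri))
                                      known)))
                      (acons (string->symbol
                              (string-append
                               "ns" (number->string (length known))))
                             uri
                             known)
                      known)))
      (fold collect-namespaces known (cdr tree))))
   ((pair? tree) (fold collect-namespaces known tree))
   (else known)))

(define tree
  '(*TOP*
    (http://www.w3.org/2000/svg:svg
     (http://www.w3.org/2000/svg:rect
      (@ (http://www.w3.org/1999/xlink:href "#rect1"))))))

(collect-namespaces tree '())
;; => ((ns1 . "http://www.w3.org/1999/xlink")
;;     (ns0 . "http://www.w3.org/2000/svg"))
```

The sketch does not guard against an invented prefix colliding with one the user already supplied, and it treats anything before the last colon as a URI, so tags that are already abbreviated (such as `svg:rect`) would need separate handling.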
X-Loop: help-debbugs@HIDDEN Subject: bug#20339: sxml simple: sxml->xml mishandles namespaces? Resent-From: tomas@HIDDEN Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> Resent-CC: bug-guile@HIDDEN Resent-Date: Sat, 25 Apr 2015 20:26:01 +0000 Resent-Message-ID: <handler.20339.B20339.142999351510435 <at> debbugs.gnu.org> Resent-Sender: help-debbugs@HIDDEN X-GNU-PR-Message: followup 20339 X-GNU-PR-Package: guile X-GNU-PR-Keywords: To: Ricardo Wurmus <rekado@HIDDEN> Cc: tomas@HIDDEN, 20339 <at> debbugs.gnu.org Received: via spool by 20339-submit <at> debbugs.gnu.org id=B20339.142999351510435 (code B ref 20339); Sat, 25 Apr 2015 20:26:01 +0000 Received: (at 20339) by debbugs.gnu.org; 25 Apr 2015 20:25:15 +0000 Received: from localhost ([127.0.0.1]:39987 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1Ym6dW-0002iE-8w for submit <at> debbugs.gnu.org; Sat, 25 Apr 2015 16:25:14 -0400 Received: from mail.tuxteam.de ([5.199.139.25]:39513 helo=tomasium.tuxteam.de) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from <tomas@HIDDEN>) id 1Ym6dT-0002i3-If for 20339 <at> debbugs.gnu.org; Sat, 25 Apr 2015 16:25:12 -0400 Received: from tomas by tomasium.tuxteam.de with local (Exim 4.80) (envelope-from <tomas@HIDDEN>) id 1Ym6dR-0001BN-An; Sat, 25 Apr 2015 22:25:09 +0200 Date: Sat, 25 Apr 2015 22:25:09 +0200 From: tomas@HIDDEN Message-ID: <20150425202509.GA3544@HIDDEN> References: <20150415194714.GA30295@HIDDEN> <87oamh25sc.fsf@HIDDEN> <20150421094438.GA22715@HIDDEN> <87fv7s1bjn.fsf@HIDDEN> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8; x-action=pgp-signed Content-Transfer-Encoding: 8bit In-Reply-To: <87fv7s1bjn.fsf@HIDDEN> User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Score: -0.0 (/) X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: 
<https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -0.0 (/)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Wed, Apr 22, 2015 at 04:29:32PM +0200, Ricardo Wurmus wrote:
>
> >> Since xml->sxml accepts a namespace alist I suppose it would make sense
> >> to extend sxml->xml to do the same.
>
> Attached is a minimal patch to extend "sxml->xml" such that it accepts an
> optional keyword argument "namespaces" with an alist of prefixes to
> URLs, analogous to "xml->sxml".

Thank you again for the patch. I applied it against 2.0.11, and can
confirm that it works as advertised :-)

I didn't see that xml->sxml has an optional parameter #:namespaces -- to
be honest, I didn't expect it there. So if one knows beforehand what
namespaces are used in the XML in question, it's possible to use the
pair xml->sxml and sxml->xml this way (with your patch, of course,
because otherwise sxml->xml "forgets" to output the relevant XML
namespace declarations).

Reading Oleg Kiselyov's paper[1] again, I understand that SXML can, as
XML does, have namespace abbreviations (called user-ns-shortcut there).
It's not exactly the same thing, but somehow isomorphic. One might use
the XML's abbreviations in the SXML representation, of course.

The problem with this approach is that you have to carry the namespace
associations "out-of-band", and that you have to know which namespaces
to expect before parsing the XML.

A (more cosmetic) problem is that all namespace declarations are "moved"
to the top-level, because the SXML keeps no "memory" of which node the
namespace declarations were attached to in the original XML.

In [1], there is a mechanism for stashing namespace mappings in the
"attributes list" (strictly, in the annotations, which are optionally
tacked to the tail of the attributes list, under the tag *NAMESPACES*).

Anyway -- what would be a good way forward here? I could imagine taking
note of the namespace abbreviations in the *NAMESPACES* list (during
xml->sxml) and issuing the corresponding declarations in sxml->xml.

Makes sense?

Regards

[1] <http://okmij.org/ftp/papers/SXML-paper.pdf>
- --
tomás
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iEYEARECAAYFAlU7+CUACgkQBcgs9XrR2kaSxACfdljxbGyVNILgombB3jYWjeOq
1zwAn2RzIEHcJbJIlIMRkaEAIjNFcH7M
=MSYu
-----END PGP SIGNATURE-----
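
As a rough illustration of the *NAMESPACES* idea, the sketch below reads namespace associations from an annotation at the tail of an element's attribute list and turns them into `xmlns:` attributes. The node layout follows my reading of the annotation syntax in [1]; the procedure names are hypothetical, and nothing here claims that Guile's `xml->sxml` or `sxml->xml` currently produce or understand such annotations.

```scheme
(use-modules (srfi srfi-1))

;; Attribute list of ELEM (the tail of its leading (@ ...)), or '().
(define (element-attr-list elem)
  (let ((rest (cdr elem)))
    (if (and (pair? rest) (pair? (car rest)) (eq? '@ (caar rest)))
        (cdar rest)
        '())))

;; Namespace associations stored under (@ (*NAMESPACES* ...)) inside
;; the attribute list, or '() when the element has none.
(define (element-namespaces elem)
  (let* ((annotations (find (lambda (a) (and (pair? a) (eq? '@ (car a))))
                            (element-attr-list elem)))
         (nss (and annotations (assq '*NAMESPACES* (cdr annotations)))))
    (if nss (cdr nss) '())))

;; Turn each (id "uri" ...) association into an (xmlns:id "uri")
;; attribute.
(define (namespace-assocs->xmlns-attrs assocs)
  (map (lambda (a)
         (list (symbol-append 'xmlns: (car a)) (cadr a)))
       assocs))

(define elem
  '(svg:svg (@ (width "60")
               (@ (*NAMESPACES*
                   (svg "http://www.w3.org/2000/svg")
                   (xlink "http://www.w3.org/1999/xlink"))))
            (svg:rect (@ (x "5")))))

(namespace-assocs->xmlns-attrs (element-namespaces elem))
;; => ((xmlns:svg "http://www.w3.org/2000/svg")
;;     (xmlns:xlink "http://www.w3.org/1999/xlink"))
```

Because the associations live on the element itself, a serializer built this way could emit each declaration on the node it came from instead of lumping them all at the top.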
X-Loop: help-debbugs@HIDDEN Subject: bug#20339: sxml simple: sxml->xml mishandles namespaces? Resent-From: tomas@HIDDEN Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> Resent-CC: bug-guile@HIDDEN Resent-Date: Sun, 26 Apr 2015 10:29:01 +0000 Resent-Message-ID: <handler.20339.B20339.14300440971117 <at> debbugs.gnu.org> Resent-Sender: help-debbugs@HIDDEN X-GNU-PR-Message: followup 20339 X-GNU-PR-Package: guile X-GNU-PR-Keywords: To: Ricardo Wurmus <rekado@HIDDEN> Cc: tomas@HIDDEN, 20339 <at> debbugs.gnu.org Received: via spool by 20339-submit <at> debbugs.gnu.org id=B20339.14300440971117 (code B ref 20339); Sun, 26 Apr 2015 10:29:01 +0000 Received: (at 20339) by debbugs.gnu.org; 26 Apr 2015 10:28:17 +0000 Received: from localhost ([127.0.0.1]:40171 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1YmJnM-0000Hx-SG for submit <at> debbugs.gnu.org; Sun, 26 Apr 2015 06:28:17 -0400 Received: from mail.tuxteam.de ([5.199.139.25]:41210 helo=tomasium.tuxteam.de) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from <tomas@HIDDEN>) id 1YmJnK-0000Hm-65 for 20339 <at> debbugs.gnu.org; Sun, 26 Apr 2015 06:28:15 -0400 Received: from tomas by tomasium.tuxteam.de with local (Exim 4.80) (envelope-from <tomas@HIDDEN>) id 1YmJnG-0001iD-Uq; Sun, 26 Apr 2015 12:28:10 +0200 Date: Sun, 26 Apr 2015 12:28:10 +0200 From: tomas@HIDDEN Message-ID: <20150426102810.GB5922@HIDDEN> References: <20150415194714.GA30295@HIDDEN> <87oamh25sc.fsf@HIDDEN> <20150421094438.GA22715@HIDDEN> <87fv7s1bjn.fsf@HIDDEN> <20150425202509.GA3544@HIDDEN> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8; x-action=pgp-signed Content-Transfer-Encoding: 8bit In-Reply-To: <20150425202509.GA3544@HIDDEN> User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Score: -0.0 (/) X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> 
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -0.0 (/)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sat, Apr 25, 2015 at 10:25:09PM +0200, tomas@HIDDEN wrote:

[...]

> Reading again Oleg Kiselyov's paper[1] I understand that SXML can, as does
> XML have namespace abbreviations (called there user-ns-shortcut). It's not
> exactly the same thing, but somehow isomorphic. One might use the XML's
> abbreviations in the SXML representation, of course.

I take that back: as far as I understand the paper, the (SXML-side)
shortcuts are global to the document, whereas the (XML-side)
abbreviations are subtree-scoped (i.e. they hold for the whole subtree
of the element to which the declaration is attached. I don't know ATM
whether shadowing is allowed, but I'll look that up).

So there *is* a subtle difference between "user-ns-shortcut" (the one
you were manipulating with #:namespaces) and the XML "namespace
abbreviation" (the official jargon is "namespace prefix").

Regards

[1] <http://okmij.org/ftp/papers/SXML-paper.pdf>
- --
tomás
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iEYEARECAAYFAlU8vboACgkQBcgs9XrR2kadlACeI+p4W8N/dJ49cGBypYNEP/ta
l6MAn3exlNUpj6Z4cYG0Dcb1ltyuQQBB
=x74j
-----END PGP SIGNATURE-----
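
On the shadowing question: XML namespaces do let a descendant element redeclare a prefix that an ancestor already bound. In Scheme, subtree scoping of that kind falls out naturally from consing new bindings onto the front of an alist while descending. The sketch below only shows the lookup behaviour; the second URI is a made-up placeholder.

```scheme
;; Bindings in scope at some outer element: prefix -> namespace URI.
(define outer-scope
  '((svg . "http://www.w3.org/2000/svg")))

;; A descendant redeclares the same prefix.  Consing onto the front
;; shadows the outer binding without mutating it.
;; "http://example.org/other" is a made-up URI for illustration.
(define inner-scope
  (acons 'svg "http://example.org/other" outer-scope))

(assq-ref inner-scope 'svg)  ;; => "http://example.org/other"
(assq-ref outer-scope 'svg)  ;; => "http://www.w3.org/2000/svg"
```

A single global alist, as passed via #:namespaces, has no such layering, which is the difference described above.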
X-Loop: help-debbugs@HIDDEN Subject: bug#20339: sxml simple: sxml->xml mishandles namespaces? Resent-From: Andy Wingo <wingo@HIDDEN> Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> Resent-CC: bug-guile@HIDDEN Resent-Date: Thu, 23 Jun 2016 19:33:01 +0000 Resent-Message-ID: <handler.20339.B20339.146671034717357 <at> debbugs.gnu.org> Resent-Sender: help-debbugs@HIDDEN X-GNU-PR-Message: followup 20339 X-GNU-PR-Package: guile X-GNU-PR-Keywords: To: tomas@HIDDEN Cc: 20339 <at> debbugs.gnu.org Received: via spool by 20339-submit <at> debbugs.gnu.org id=B20339.146671034717357 (code B ref 20339); Thu, 23 Jun 2016 19:33:01 +0000 Received: (at 20339) by debbugs.gnu.org; 23 Jun 2016 19:32:27 +0000 Received: from localhost ([127.0.0.1]:53006 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1bGAMV-0004Vs-Hz for submit <at> debbugs.gnu.org; Thu, 23 Jun 2016 15:32:27 -0400 Received: from pb-sasl2.pobox.com ([64.147.108.67]:65092 helo=sasl.smtp.pobox.com) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <wingo@HIDDEN>) id 1bGAMT-0004Vj-Kt for 20339 <at> debbugs.gnu.org; Thu, 23 Jun 2016 15:32:26 -0400 Received: from sasl.smtp.pobox.com (unknown [127.0.0.1]) by pb-sasl2.pobox.com (Postfix) with ESMTP id 5109024111; Thu, 23 Jun 2016 15:32:25 -0400 (EDT) DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=pobox.com; h=from:to:cc :subject:references:date:in-reply-to:message-id:mime-version :content-type; s=sasl; bh=4YlpFEfUufNoTQ6bmCt/Ajn2Jyg=; b=jBffSu NLCIcbvErtcRfOefXEkdflyGHUhgaW1PykivVz/ioM0TRvkZ2uDG1PLvVNUNPgRB zrYL/xN9u3uOVrEMd/1zwJ0KkuL/5R98gFVFnNUHebl2kNk1O8SZCHwupiR0wxFP Vts7A6uRHRhPrDafCthkAHFwQHPFaIJcy+NNo= DomainKey-Signature: a=rsa-sha1; c=nofws; d=pobox.com; h=from:to:cc :subject:references:date:in-reply-to:message-id:mime-version :content-type; q=dns; s=sasl; b=ZUgkGbEjm90diz08WVOMrS1nD05ltdqP 
cPRnzxleFR4oEZT8givxzbATtsyyntguoBOYp8lM1B3p6gWxgzrzuJ3x4RmkrJMM X4ldMyYa75kHnffQe1sPPDMbZCwFEe7DQklA/DZ6M5aUkLNyYC6pcjF6bAGArtlk vEaQRO7AFr4= Received: from pb-sasl2.nyi.icgroup.com (unknown [127.0.0.1]) by pb-sasl2.pobox.com (Postfix) with ESMTP id 4A39424110; Thu, 23 Jun 2016 15:32:25 -0400 (EDT) Received: from clucks (unknown [88.160.190.192]) (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by pb-sasl2.pobox.com (Postfix) with ESMTPSA id 605DA2410F; Thu, 23 Jun 2016 15:32:24 -0400 (EDT) From: Andy Wingo <wingo@HIDDEN> References: <20150415194714.GA30295@HIDDEN> Date: Thu, 23 Jun 2016 21:32:16 +0200 In-Reply-To: <20150415194714.GA30295@HIDDEN> (tomas@HIDDEN's message of "Wed, 15 Apr 2015 21:47:14 +0200") Message-ID: <87y45vln0f.fsf@HIDDEN> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Pobox-Relay-ID: 365DE7E4-3979-11E6-B008-28A6F1301B6D-02397024!pb-sasl2.pobox.com X-Spam-Score: -1.4 (-) X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -1.4 (-) See thread here as well: http://thread.gmane.org/gmane.lisp.guile.devel/17709 I like Ricardo's patch but have some comments here: http://article.gmane.org/gmane.lisp.guile.devel/18384 
Andy
X-Loop: help-debbugs@HIDDEN Subject: bug#20339: sxml simple: sxml->xml mishandles namespaces? Resent-From: tomas@HIDDEN Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> Resent-CC: bug-guile@HIDDEN Resent-Date: Wed, 13 Jul 2016 13:25:01 +0000 Resent-Message-ID: <handler.20339.B20339.14684162575157 <at> debbugs.gnu.org> Resent-Sender: help-debbugs@HIDDEN X-GNU-PR-Message: followup 20339 X-GNU-PR-Package: guile X-GNU-PR-Keywords: To: 20339 <at> debbugs.gnu.org Cc: Andy Wingo <wingo@HIDDEN>, Ricardo Wurmus <rekado@HIDDEN> Received: via spool by 20339-submit <at> debbugs.gnu.org id=B20339.14684162575157 (code B ref 20339); Wed, 13 Jul 2016 13:25:01 +0000 Received: (at 20339) by debbugs.gnu.org; 13 Jul 2016 13:24:17 +0000 Received: from localhost ([127.0.0.1]:49200 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1bNK97-0001L2-4j for submit <at> debbugs.gnu.org; Wed, 13 Jul 2016 09:24:17 -0400 Received: from mail.tuxteam.de ([5.199.139.25]:57950 helo=tomasium.tuxteam.de) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <tomas@HIDDEN>) id 1bNK91-0001Ko-AI for 20339 <at> debbugs.gnu.org; Wed, 13 Jul 2016 09:24:11 -0400 Received: from tomas by tomasium.tuxteam.de with local (Exim 4.80) (envelope-from <tomas@HIDDEN>) id 1bNK8y-00019w-FF; Wed, 13 Jul 2016 15:24:04 +0200 Date: Wed, 13 Jul 2016 15:24:03 +0200 From: tomas@HIDDEN Message-ID: <20160713132403.GA2349@HIDDEN> References: <20150415194714.GA30295@HIDDEN> <87y45vln0f.fsf@HIDDEN> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8; x-action=pgp-signed Content-Transfer-Encoding: 8bit In-Reply-To: <87y45vln0f.fsf@HIDDEN> User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Score: -1.3 (-) X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -1.3 (-)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Thu, Jun 23, 2016 at 09:32:16PM +0200, Andy Wingo wrote:
> See thread here as well:
> http://thread.gmane.org/gmane.lisp.guile.devel/17709
>
> I like Ricardo's patch but have some comments here:
> http://article.gmane.org/gmane.lisp.guile.devel/18384

(sorry for cc'ing both of you, but I don't know whether you are
subscribed to the bug. Two copies seemed more polite than none).

Sorry folks for not coming back earlier. Real Life and things.

Since I'm going to be off the 'net for one month starting next Friday,
I thought I'd write a short note. I'll be back on the 15th of August and
am really willing to do whatever it takes to bring this forward. OTOH,
if any of you decides to pick it up, I'm sure the results will be
better :-)

Referring to Oleg Kiselyov's paper [1], there are actually three things
involved:

 - the namespace. This is an XML thing and will typically be a URI
   (I don't quite remember whether it *must* be a URI, but that's
   irrelevant). It may contain nasty characters (to XML: it isn't
   an XML "Name"; and potentially to Scheme: there may be parentheses
   and things in there, so some Schemes won't make a symbol of that;
   Guile doesn't mind).

 - the namespace prefix. Again, an XML thing, basically giving a
   non-nasty abbreviation for the namespace, to stick it to the
   Name, making a "QName". The association prefix -> namespace
   is scoped to a node and its descendants, and can be shadowed
   at some node below.

 - the namespace-id, an SXML thing. In [1], this is typically the
   namespace, but Oleg Kiselyov made provisions in [1] for a
   similar "abbreviation" (the user-ns-shortcut in [1], page 3),
   whose mapping can be attached to any node via the pseudo-attribute
   *NAMESPACES* [2], which can also carry the original (XML)
   namespace prefix.

   As far as I understand the paper, most of the time this
   namespace-id will be identical to the URI, but it is this that
   will be prefixed to the tag name symbols in the SXML
   representation.

What Ricardo's patch does is to conflate namespace prefix and
namespace-id and provide a mapping (namespace-id aka prefix) ->
namespace. This is actually quite elegant, since we don't need the
distinction between (XML) prefix and (SXML) namespace-id. I think that
we can, at least as far as (sxml simple) is concerned, ignore this
distinction.

What is missing? From my point of view:

 - At xml->sxml time, the user doesn't know which namespaces
   are in the xml. So it would be nice if the XML parser
   could provide that.

 - It would be super-nice if the XML parser could put that
   into the same nodes it found it, as described in [1]
   (i.e. in the (*NAMESPACES* ...) pseudo-attribute).
   This way we wouldn't have a global mapping, but one
   that resembles the original XML, even with the same
   prefixes. Fewer surprises overall. The round trip
   xml -> sxml -> xml would be (nearly) the identity.

   With Ricardo's patch it would lump all the namespace
   declarations up in the top node, which formally is
   correct, but might scare XML people a bit :-)

 - At sxml->xml time there should be a way to somehow
   generate prefixes for "new" namespaces. I don't know
   at the moment how this would work, that depends on
   how the user is supposed to insert new nodes in the
   SXML. Does she specify the namespace? Both prefix
   (aka namespace-id, under my current assumption) *and*
   namespace? (note that the namespace-id/prefix alone
   wouldn't be sufficient).

Sorry for this wall of text. I hope it makes some sense.

Regards

[1] http://okmij.org/ftp/papers/SXML-paper.pdf
[2] Actually, I'm cheating here: the thing is part of an
    "annotations" part, which according to the grammar comes
    *last*, after all the attributes. But it looks a bit like
    an attribute, with a strange name and a more complex value.
- --
tomás
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iEYEARECAAYFAleGQPMACgkQBcgs9XrR2kaMfgCeKbA4pWFrCZoxofDF4n9utgnZ
IzYAn1gozFwBLPd/rmNkZvJYDTJ9cIvr
=etJd
-----END PGP SIGNATURE-----
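
To make the namespace-id notion concrete: in SXML the namespace-id is whatever precedes the last colon of a tag symbol, be it an abbreviation or the full URI. A minimal sketch of pulling the two parts apart follows; `split-tag` is a hypothetical name, not something (sxml simple) exports.

```scheme
;; Split an SXML tag symbol at its last colon into
;; (namespace-id . local-name); namespace-id is #f if unqualified.
(define (split-tag tag)
  (let* ((s (symbol->string tag))
         (i (string-rindex s #\:)))
    (if i
        (cons (string->symbol (substring s 0 i))
              (string->symbol (substring s (+ i 1))))
        (cons #f tag))))

(split-tag 'svg:rect)
;; => (svg . rect)
(split-tag 'rect)
;; => (#f . rect)
(split-tag 'http://www.w3.org/2000/svg:rect)
;; => (http://www.w3.org/2000/svg . rect)
```

Splitting at the last colon rather than the first matters for the URI case, since the URI itself contains a colon after the scheme.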
X-Loop: help-debbugs@HIDDEN Subject: bug#20339: sxml simple: sxml->xml mishandles namespaces? Resent-From: tomas@HIDDEN Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> Resent-CC: bug-guile@HIDDEN Resent-Date: Wed, 13 Jul 2016 18:09:02 +0000 Resent-Message-ID: <handler.20339.B20339.14684333377585 <at> debbugs.gnu.org> Resent-Sender: help-debbugs@HIDDEN X-GNU-PR-Message: followup 20339 X-GNU-PR-Package: guile X-GNU-PR-Keywords: To: 20339 <at> debbugs.gnu.org Received: via spool by 20339-submit <at> debbugs.gnu.org id=B20339.14684333377585 (code B ref 20339); Wed, 13 Jul 2016 18:09:02 +0000 Received: (at 20339) by debbugs.gnu.org; 13 Jul 2016 18:08:57 +0000 Received: from localhost ([127.0.0.1]:50115 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1bNOaf-0001yH-Jk for submit <at> debbugs.gnu.org; Wed, 13 Jul 2016 14:08:57 -0400 Received: from mail.tuxteam.de ([5.199.139.25]:58527 helo=tomasium.tuxteam.de) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <tomas@HIDDEN>) id 1bNOae-0001y9-4u for 20339 <at> debbugs.gnu.org; Wed, 13 Jul 2016 14:08:56 -0400 Received: from tomas by tomasium.tuxteam.de with local (Exim 4.80) (envelope-from <tomas@HIDDEN>) id 1bNOac-0003Jl-HB for 20339 <at> debbugs.gnu.org; Wed, 13 Jul 2016 20:08:54 +0200 Date: Wed, 13 Jul 2016 20:08:54 +0200 From: tomas@HIDDEN Message-ID: <20160713180854.GA12635@HIDDEN> References: <20150415194714.GA30295@HIDDEN> <87y45vln0f.fsf@HIDDEN> <20160713132403.GA2349@HIDDEN> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8; x-action=pgp-signed Content-Transfer-Encoding: 8bit In-Reply-To: <20160713132403.GA2349@HIDDEN> User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Score: -1.3 (-) X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: 
<https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -1.3 (-) -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Wed, Jul 13, 2016 at 03:24:03PM +0200, tomas@HIDDEN wrote: [...] > What is missing? From my point of view: > > - At xml->sxml time, the user doesn't know which namespaces > are in the xml. So it would be nice if the XML parser > could provide that. > > - It would be super-nice if the XML parser could put that > into the same nodes it found it, as described in [1] > (i.e. in the (*NAMESPACES* ...) pseudo-attribute). > This way we wouldn't have a global mapping, but one > that resembles the original XML, even with the same > prefixes. Less surprises overall. The round trip > xml -> sxml -> xml would be (nearly) the identity. > > With Ricardo's patch it would lump all the namespace > declarations up in the top node, which formally is > correct, but might scare XML people a bit :-) > > - At sxml->xml time there should be a way to somehow > generate prefixex for "new" namespaces. I don't know > at the moment how this would work, that depends on > how the user is supposed to insert new nodes in the > SXML. Does she specify the namespace? Both prefix > (aka namespace-id, under my current assumption) *and* > namespace? (note that the namespace-id/prefix alone > wouldn't be sufficient). Argh. First post, then think, sorry. Actually ditch the last point. 
I think it would be OK to make the user responsible to keep the *NAMESPACES* pseudo-attribute up-to-date whenever she adds nodes with new namespaces to the SXML. regards [1] http://okmij.org/ftp/papers/SXML-paper.pdf - -- tomás -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iEYEARECAAYFAleGg7YACgkQBcgs9XrR2kY7hACdG5drjpPVlzB4wW6sXhuRKliv h3cAnAmHC5RxiEc6RXi0tu5U3yF4YYbx =7uGa -----END PGP SIGNATURE-----
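As an illustration of the *NAMESPACES* pseudo-attribute discussed above, here is a minimal sketch in Guile Scheme. It assumes the annotation grammar of the SXML paper [1] as read in this thread: the annotations sit last inside the attribute list, and each association has the shape (namespace-id "URI" original-prefix). The node, the URI, and the helper `node-namespaces` are hypothetical and are not part of (sxml simple).

```scheme
(use-modules (ice-9 match) (srfi srfi-1))

;; A hypothetical element that carries its own namespace associations.
;; Each association is (namespace-id "URI" original-prefix); the
;; original prefix is optional in the grammar.
(define example-node
  '(atom:entry
    (@ (id "e1")
       (@ (*NAMESPACES* (atom "http://www.w3.org/2005/Atom" a))))
    (atom:title "Hello")))

;; Hypothetical helper: return the namespace associations declared on
;; NODE itself, or '() when it declares none.
(define (node-namespaces node)
  (match node
    ((_ ('@ . attrs) . _)
     (append-map
      (match-lambda
        ;; A nested (@ ...) inside the attribute list holds annotations.
        (('@ . annotations)
         (append-map (match-lambda
                       (('*NAMESPACES* . assocs) assocs)
                       (_ '()))
                     annotations))
        ;; Ordinary attributes contribute nothing.
        (_ '()))
      attrs))
    (_ '())))

(node-namespaces example-node)
;; => ((atom "http://www.w3.org/2005/Atom" a))
```

Under the proposal in this message, a user who adds a node in a new namespace would also add the matching association to the nearest enclosing *NAMESPACES* list by hand.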
X-Loop: help-debbugs@HIDDEN Subject: bug#20339: sxml simple: sxml->xml mishandles namespaces? Resent-From: Andy Wingo <wingo@HIDDEN> Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> Resent-CC: bug-guile@HIDDEN Resent-Date: Thu, 14 Jul 2016 10:11:02 +0000 Resent-Message-ID: <handler.20339.B20339.14684910297110 <at> debbugs.gnu.org> Resent-Sender: help-debbugs@HIDDEN X-GNU-PR-Message: followup 20339 X-GNU-PR-Package: guile X-GNU-PR-Keywords: To: tomas@HIDDEN Cc: Ricardo Wurmus <rekado@HIDDEN>, 20339 <at> debbugs.gnu.org Received: via spool by 20339-submit <at> debbugs.gnu.org id=B20339.14684910297110 (code B ref 20339); Thu, 14 Jul 2016 10:11:02 +0000 Received: (at 20339) by debbugs.gnu.org; 14 Jul 2016 10:10:29 +0000 Received: from localhost ([127.0.0.1]:50557 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1bNdbB-0001qc-Iu for submit <at> debbugs.gnu.org; Thu, 14 Jul 2016 06:10:29 -0400 Received: from pb-sasl2.pobox.com ([64.147.108.67]:52537 helo=sasl.smtp.pobox.com) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <wingo@HIDDEN>) id 1bNdb9-0001qS-Q2 for 20339 <at> debbugs.gnu.org; Thu, 14 Jul 2016 06:10:28 -0400 Received: from sasl.smtp.pobox.com (unknown [127.0.0.1]) by pb-sasl2.pobox.com (Postfix) with ESMTP id 2CEE924B17; Thu, 14 Jul 2016 06:10:25 -0400 (EDT) DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=pobox.com; h=from:to:cc :subject:references:date:in-reply-to:message-id:mime-version :content-type; s=sasl; bh=BMXjZYHe7FoMrTAHw4gJAuxvmcE=; b=HsGvJo 7tmpzP013YDbLVJEvYrvReCyRoPTVqGTLzKpYCmMzXHfgP/nu+K3egJruWscjamU k+Mml/xoz30gUU9OUWINloXU9Ohh/OnpZz07i7HpB6wRxVpR1QdhvBeOpxLoTKFW t5yAzcB4rXqz0+kpVwXh9ZR5fxM4rHyvJZ0n8= DomainKey-Signature: a=rsa-sha1; c=nofws; d=pobox.com; h=from:to:cc :subject:references:date:in-reply-to:message-id:mime-version :content-type; q=dns; s=sasl; b=NQjFYk/wG3MBm4KXR+7/Nj8Y/QDZ/1A9 
GW2Tb9P6EItfZeC2xLoTg9TXnpkSFpxbVEJmLAbCWzrZgb9Dt6IeUwbTA01y19Zs q4TPEPAWdaGr+Q5+7sFyoJGkOSSn2Auo+WW8KkM2DwNM2jUTmXMCpKuRCkbRZaj1 92eOR0D8YxI= Received: from pb-sasl2.nyi.icgroup.com (unknown [127.0.0.1]) by pb-sasl2.pobox.com (Postfix) with ESMTP id 1674C24B13; Thu, 14 Jul 2016 06:10:25 -0400 (EDT) Received: from clucks (unknown [88.160.190.192]) (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by pb-sasl2.pobox.com (Postfix) with ESMTPSA id 2C8C524B12; Thu, 14 Jul 2016 06:10:24 -0400 (EDT) From: Andy Wingo <wingo@HIDDEN> References: <20150415194714.GA30295@HIDDEN> <87y45vln0f.fsf@HIDDEN> <20160713132403.GA2349@HIDDEN> Date: Thu, 14 Jul 2016 12:10:17 +0200 In-Reply-To: <20160713132403.GA2349@HIDDEN> (tomas@HIDDEN's message of "Wed, 13 Jul 2016 15:24:03 +0200") Message-ID: <87furc1qeu.fsf@HIDDEN> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Pobox-Relay-ID: 2E2ADE00-49AB-11E6-9089-28A6F1301B6D-02397024!pb-sasl2.pobox.com X-Spam-Score: -1.3 (-) X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -1.3 (-) Hi :) On Wed 13 Jul 2016 15:24, tomas@HIDDEN writes: > Referring to Oleg Kiseliov's paper [1], there are actually three > things 
involved: This summary is helpful, thanks. > What is missing? From my point of view: > > - At xml->sxml time, the user doesn't know which namespaces > are in the xml. So it would be nice if the XML parser > could provide that. For some documents you do know, of course. And for larger perspective, I think that SSAX gives you all the tools you need to build specialist and very flexible XML parsers. So to an extent solving the general problem isn't necessary -- we can always point people to SSAX. But that's a bit rude ;) so if there are common patterns we should try to capture them in xml->sxml. I see this bug as being a search for those patterns, but without the requirement of solving the problem in its most general form. > - It would be super-nice if the XML parser could put that > into the same nodes it found it, as described in [1] > (i.e. in the (*NAMESPACES* ...) pseudo-attribute). > This way we wouldn't have a global mapping, but one > that resembles the original XML, even with the same > prefixes. Less surprises overall. The round trip > xml -> sxml -> xml would be (nearly) the identity. > > With Ricardo's patch it would lump all the namespace > declarations up in the top node, which formally is > correct, but might scare XML people a bit :-) ACK. > - At sxml->xml time there should be a way to somehow > generate prefixex for "new" namespaces. I don't know > at the moment how this would work, that depends on > how the user is supposed to insert new nodes in the > SXML. Does she specify the namespace? Both prefix > (aka namespace-id, under my current assumption) *and* > namespace? (note that the namespace-id/prefix alone > wouldn't be sufficient). ACK. What do you think the next step is? I am happy to wait FWIW, dunno if Ricardo has any feelings here. Enjoy your holiday :) Andy
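For the case where the namespaces of a document are known in advance, the existing `#:namespaces` keyword of `xml->sxml` already covers the pattern. The sketch below follows the example in the Guile reference manual; the prefix `ns1` and the example.org URI are placeholders.

```scheme
(use-modules (sxml simple))

(define doc "<foo xmlns=\"http://example.org/ns1\">text</foo>")

;; Without a mapping, the namespace URI itself becomes the prefix of
;; the tag symbol.
(define unmapped (xml->sxml doc))
;; => (*TOP* (http://example.org/ns1:foo "text"))

;; With a mapping from a short prefix to the URI, the short prefix is
;; used instead.
(define mapped
  (xml->sxml doc #:namespaces '((ns1 . "http://example.org/ns1"))))
;; => (*TOP* (ns1:foo "text"))
```

The first form is the one this bug is about: its tag symbol contains the whole URI, so it does not serialize back to a usable XML element name. The second form only helps when the caller already knows which URIs to expect.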
X-Loop: help-debbugs@HIDDEN Subject: bug#20339: sxml simple: sxml->xml mishandles namespaces? Resent-From: tomas@HIDDEN Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> Resent-CC: bug-guile@HIDDEN Resent-Date: Thu, 14 Jul 2016 10:27:01 +0000 Resent-Message-ID: <handler.20339.B20339.14684919999133 <at> debbugs.gnu.org> Resent-Sender: help-debbugs@HIDDEN X-GNU-PR-Message: followup 20339 X-GNU-PR-Package: guile X-GNU-PR-Keywords: To: Andy Wingo <wingo@HIDDEN> Cc: Ricardo Wurmus <rekado@HIDDEN>, tomas@HIDDEN, 20339 <at> debbugs.gnu.org Received: via spool by 20339-submit <at> debbugs.gnu.org id=B20339.14684919999133 (code B ref 20339); Thu, 14 Jul 2016 10:27:01 +0000 Received: (at 20339) by debbugs.gnu.org; 14 Jul 2016 10:26:39 +0000 Received: from localhost ([127.0.0.1]:50576 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1bNdqo-0002ND-P9 for submit <at> debbugs.gnu.org; Thu, 14 Jul 2016 06:26:38 -0400 Received: from mail.tuxteam.de ([5.199.139.25]:60509 helo=tomasium.tuxteam.de) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <tomas@HIDDEN>) id 1bNdqk-0002N0-SR for 20339 <at> debbugs.gnu.org; Thu, 14 Jul 2016 06:26:37 -0400 Received: from tomas by tomasium.tuxteam.de with local (Exim 4.80) (envelope-from <tomas@HIDDEN>) id 1bNdqh-0001hN-JM; Thu, 14 Jul 2016 12:26:31 +0200 Date: Thu, 14 Jul 2016 12:26:31 +0200 From: tomas@HIDDEN Message-ID: <20160714102631.GB5611@HIDDEN> References: <20150415194714.GA30295@HIDDEN> <87y45vln0f.fsf@HIDDEN> <20160713132403.GA2349@HIDDEN> <87furc1qeu.fsf@HIDDEN> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8; x-action=pgp-signed Content-Transfer-Encoding: 8bit In-Reply-To: <87furc1qeu.fsf@HIDDEN> User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Score: -1.3 (-) X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: 
<https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -1.3 (-) -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Thu, Jul 14, 2016 at 12:10:17PM +0200, Andy Wingo wrote: > Hi :) > > On Wed 13 Jul 2016 15:24, tomas@HIDDEN writes: > > > Referring to Oleg Kiseliov's paper [1], there are actually three > > things involved: > > This summary is helpful, thanks. > > What is missing? From my point of view: > > > > - At xml->sxml time, the user doesn't know which namespaces > > are in the xml. So it would be nice if the XML parser > > could provide that. > > For some documents you do know, of course. > > And for larger perspective, I think that SSAX gives you all the tools > you need to build specialist and very flexible XML parsers. So to an > extent solving the general problem isn't necessary -- we can always > point people to SSAX. But that's a bit rude ;) so if there are common > patterns we should try to capture them in xml->sxml. I see this bug as > being a search for those patterns, but without the requirement of > solving the problem in its most general form. It's (sxml simple), after all. I too hesitate to stuff too much into it. For me, a documented "no, we don't do namespaces" would be one valid pattern. > > - It would be super-nice if the XML parser could put that > > into the same nodes it found it [...] > ACK. 
> > > - At sxml->xml time there should be a way to somehow > > generate prefixex [...] > ACK. > > What do you think the next step is? I am happy to wait FWIW, dunno if > Ricardo has any feelings here. We meet this afternoon anyway. On my side, I'd be happy to try something along the sketched lines when I'm back. If someone who cares beats me at it, I'd be as happy. > Enjoy your holiday :) Looking forward to. BTW: if I understood properly the area you're living in, we'll cycle past you (somewhat to the West) on our way to the north. Regards - -- tomás -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iEYEARECAAYFAleHaNcACgkQBcgs9XrR2kaQgQCaAzyyBkI3w0XGJ0HUI9Dz/YXa 7yQAni4CWIDE5ezu+x0DwanoAjfH4Wr2 =DEuD -----END PGP SIGNATURE-----
X-Loop: help-debbugs@HIDDEN Subject: bug#20339: sxml simple: sxml->xml mishandles namespaces? Resent-From: Ricardo Wurmus <rekado@HIDDEN> Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> Resent-CC: bug-guile@HIDDEN Resent-Date: Mon, 04 Feb 2019 20:45:01 +0000 Resent-Message-ID: <handler.20339.B20339.154931307929681 <at> debbugs.gnu.org> Resent-Sender: help-debbugs@HIDDEN X-GNU-PR-Message: followup 20339 X-GNU-PR-Package: guile X-GNU-PR-Keywords: To: Andy Wingo <wingo@HIDDEN> Cc: tomas@HIDDEN, 20339 <at> debbugs.gnu.org Received: via spool by 20339-submit <at> debbugs.gnu.org id=B20339.154931307929681 (code B ref 20339); Mon, 04 Feb 2019 20:45:01 +0000 Received: (at 20339) by debbugs.gnu.org; 4 Feb 2019 20:44:39 +0000 Received: from localhost ([127.0.0.1]:59743 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1gql6a-0007id-TL for submit <at> debbugs.gnu.org; Mon, 04 Feb 2019 15:44:38 -0500 Received: from sender-of-o51.zoho.com ([135.84.80.216]:21001) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <rekado@HIDDEN>) id 1gql6W-0007iQ-Nv for 20339 <at> debbugs.gnu.org; Mon, 04 Feb 2019 15:44:35 -0500 ARC-Seal: i=1; a=rsa-sha256; t=1549313048; cv=none; d=zoho.com; s=zohoarc; b=aEM43GiCOLAoo3/H7giGfKs8upF4UZi0os8gj4YEBc5z2rKyMvllEkEQwuGu3/ISB4LjfNczJUX5lfhn6rJKXxyon8g3DnHHkjyzaWn5J4G9WCKAe2JTW2M/K4v6VN+4LlTBFbS1kCaR/ZnTNQxbMgjkZNih7xcLGCUHRNSpHqU= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zoho.com; s=zohoarc; t=1549313048; h=Content-Type:Cc:Date:From:In-Reply-To:MIME-Version:Message-ID:References:Subject:To:ARC-Authentication-Results; bh=bCrbMDqVhJDVvpeXEH9CHrQC9t2ZELPeqFUK2NaOzFE=; b=msWUNiweN4x1qkx4lIm9WPMK12EWZWP69cSnC5PEAc70QWTM91Wx4GIdcPLqlVpKEl6+EYmAtt3FLS+qxDH6rjdVV1ycShC5aj0pxF6BRV2+sBW9yan2BFtEc/MhHWdbsBW+cVJvSj2VBnhUz68tNNqDWBg7u1XdDcGQ/eB9JdA= ARC-Authentication-Results: i=1; mx.zoho.com; dkim=pass 
header.i=elephly.net; spf=pass smtp.mailfrom=rekado@HIDDEN; dmarc=pass header.from=<rekado@HIDDEN> header.from=<rekado@HIDDEN> DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; t=1549313048; s=zoho; d=elephly.net; i=rekado@HIDDEN; h=References:From:To:Cc:Subject:In-reply-to:Date:Message-ID:MIME-Version:Content-Type; l=7804; bh=bCrbMDqVhJDVvpeXEH9CHrQC9t2ZELPeqFUK2NaOzFE=; b=PEZScoHbQjYgL9ONk/wwysvvK2XpWEPfkdc6yFBKyEYEhTgrdIVCaRS3/dxo+Pgc 4dOSEcoEPyKEco85HXYPHuNuI+iNMP4nVrVl4HRVsfJrpNc2gWteiLgRLgaHTJwVw/D e+zDWvHtSapihB1fbxq5y/6ZhJzfLNrZ+NrzhT4w= Received: from localhost (p578E68C8.dip0.t-ipconnect.de [87.142.104.200]) by mx.zohomail.com with SMTPS id 1549313046910741.4151801955618; Mon, 4 Feb 2019 12:44:06 -0800 (PST) References: <20150415194714.GA30295@HIDDEN> <87y45vln0f.fsf@HIDDEN> <20160713132403.GA2349@HIDDEN> <87furc1qeu.fsf@HIDDEN> User-agent: mu4e 1.0; emacs 26.1 From: Ricardo Wurmus <rekado@HIDDEN> In-reply-to: <87furc1qeu.fsf@HIDDEN> X-URL: https://elephly.net X-PGP-Key: https://elephly.net/rekado.pubkey X-PGP-Fingerprint: BCA6 89B6 3655 3801 C3C6 2150 197A 5888 235F ACAC Date: Mon, 04 Feb 2019 21:44:02 +0100 Message-ID: <87a7jbi8rx.fsf@HIDDEN> MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" X-ZohoMailClient: External X-Zoho-Virus-Status: 1 X-Spam-Score: -0.0 (/) X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: 
debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -1.0 (-)

--=-=-= Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable

Hello!

I just looked at this again and I think I came up with something useful.
Here’s some context:

Andy Wingo <wingo@HIDDEN> writes:

> Hi :)
>
> On Wed 13 Jul 2016 15:24, tomas@HIDDEN writes:
>
>> Referring to Oleg Kiseliov's paper [1], there are actually three
>> things involved:
>
> This summary is helpful, thanks.
>
>> What is missing? From my point of view:
>>
>>  - At xml->sxml time, the user doesn't know which namespaces
>>    are in the xml. So it would be nice if the XML parser
>>    could provide that.
>
> For some documents you do know, of course.
>
> And for larger perspective, I think that SSAX gives you all the tools
> you need to build specialist and very flexible XML parsers. So to an
> extent solving the general problem isn't necessary -- we can always
> point people to SSAX. But that's a bit rude ;) so if there are common
> patterns we should try to capture them in xml->sxml. I see this bug as
> being a search for those patterns, but without the requirement of
> solving the problem in its most general form.
>
>>  - It would be super-nice if the XML parser could put that
>>    into the same nodes it found it, as described in [1]
>>    (i.e. in the (*NAMESPACES* ...) pseudo-attribute).
>>    This way we wouldn't have a global mapping, but one
>>    that resembles the original XML, even with the same
>>    prefixes. Less surprises overall. The round trip
>>    xml -> sxml -> xml would be (nearly) the identity.
>>
>>    With Ricardo's patch it would lump all the namespace
>>    declarations up in the top node, which formally is
>>    correct, but might scare XML people a bit :-)
>
> ACK.
>
>>  - At sxml->xml time there should be a way to somehow
>>    generate prefixex for "new" namespaces. I don't know
>>    at the moment how this would work, that depends on
>>    how the user is supposed to insert new nodes in the
>>    SXML. Does she specify the namespace? Both prefix
>>    (aka namespace-id, under my current assumption) *and*
>>    namespace? (note that the namespace-id/prefix alone
>>    wouldn't be sufficient).
>
> ACK.
>
> What do you think the next step is? I am happy to wait FWIW, dunno if
> Ricardo has any feelings here.

Attached is a patch that does the requested things. The parser
procedures like FINISH-ELEMENT have access to all the namespaces, so I
changed the FINISH-ELEMENT procedure to return the list of namespaces in
addition to its SXML tree return value.

I changed name->sxml to use only the namespace aliases / abbreviations
instead of the namespace URIs. (This is not very efficient because we
need to traverse the list of namespaces every time. Maybe we could
memoize this. On the other hand, the length of the namespaces list may
not be large enough to affect performance too much.)

In the end we get both the namespace list and the SXML tree from running
the parser. Before wrapping this up in *TOP* we generate xmlns
attributes for all abbreviations and “patch” the first proper element’s
attribute list (i.e. we skip over a *PI* element if it exists).

The result is an SXML tree that begins with namespace declarations,
mapping abbreviations to URIs. Within the SXML tree we’re only using
abbreviations, so there are no more invalid characters when converting
SXML to a string.

I would be happy if you could test this as I’m not 100% confident that
this is correct. Here are questions I wasn’t able to answer
conclusively:

* Is the value for “namespaces” that’s passed in to the
  FINISH-ELEMENT procedure always the same?

* Will the second return value of the final call to FINISH-ELEMENT
  really always be the complete list of *all* namespaces that have been
  encountered?
* Are there valid XML documents for which the match patterns to inject
  namespace declarations would not apply? (e.g. documents with a PI
  element and two separate XML trees)

--
Ricardo

--=-=-=
Content-Type: text/x-patch
Content-Disposition: inline; filename=0001-sxml-xml-sxml-Record-and-use-namespace-abbreviations.patch

From 83ee9de18a0ecaa237eb73e1b75d0b21e3e8d321 Mon Sep 17 00:00:00 2001
From: Ricardo Wurmus <rekado@HIDDEN>
Date: Mon, 4 Feb 2019 21:39:06 +0100
Subject: [PATCH] sxml: xml->sxml: Record and use namespace abbreviations.

* module/sxml/simple.scm (xml->sxml): Add namespace declarations to the
attribute list of the first XML element.
[name->sxml]: Accept namespaces argument to look up abbreviation. Return
name with abbreviation prefix.
[parser]: Let FINISH-ELEMENT procedure return namespaces in addition to
SXML tree.
---
 module/sxml/simple.scm | 50 +++++++++++++++++++++++++++++++++---------
 1 file changed, 40 insertions(+), 10 deletions(-)

diff --git a/module/sxml/simple.scm b/module/sxml/simple.scm
index 703ad9137..52dd9af12 100644
--- a/module/sxml/simple.scm
+++ b/module/sxml/simple.scm
@@ -1,7 +1,8 @@
 ;;;; (sxml simple) -- a simple interface to the SSAX parser
 ;;;;
-;;;; Copyright (C) 2009, 2010, 2013 Free Software Foundation, Inc.
+;;;; Copyright (C) 2009, 2010, 2013, 2019 Free Software Foundation, Inc.
 ;;;; Modified 2004 by Andy Wingo <wingo at pobox dot com>.
+;;;; Modified 2019 by Ricardo Wurmus <rekado@HIDDEN>.
 ;;;; Originally written by Oleg Kiselyov <oleg at pobox dot com> as SXML-to-HTML.scm.
 ;;;;
 ;;;; This library is free software; you can redistribute it and/or
@@ -30,6 +31,7 @@
   #:use-module (sxml ssax)
   #:use-module (sxml transform)
   #:use-module (ice-9 match)
+  #:use-module (srfi srfi-1)
   #:use-module (srfi srfi-13)
   #:export (xml->sxml sxml->xml sxml->string))

@@ -123,10 +125,15 @@ port."
         (acons '*DEFAULT* default-entity-handler entities)
         entities))

-  (define (name->sxml name)
+  (define (name->sxml name namespaces)
     (match name
       ((prefix . local-part)
-       (symbol-append prefix (string->symbol ":") local-part))
+       (let ((abbrev (and=> (find (match-lambda
+                                    ((abbrev uri . rest)
+                                     (and (eq? uri prefix) abbrev)))
+                                  namespaces)
+                            first)))
+         (symbol-append abbrev (string->symbol ":") local-part)))
       (_ name)))

   (define (doctype-continuation seed)
@@ -152,14 +159,16 @@ port."
                        (ssax:reverse-collect-str seed)))
              (attrs (attlist-fold
                      (lambda (attr accum)
-                       (cons (list (name->sxml (car attr)) (cdr attr))
+                       (cons (list (name->sxml (car attr) namespaces)
+                                   (cdr attr))
                              accum))
                      '() attributes)))
-         (acons (name->sxml elem-gi)
-                (if (null? attrs)
-                    seed
-                    (cons (cons '@ attrs) seed))
-                parent-seed)))
+         (values (acons (name->sxml elem-gi namespaces)
+                        (if (null? attrs)
+                            seed
+                            (cons (cons '@ attrs) seed))
+                        parent-seed)
+                 namespaces)))

      CHAR-DATA-HANDLER ; fhere
      (lambda (string1 string2 seed)
@@ -212,7 +221,28 @@ port."
   (let* ((port (if (string? string-or-port)
                    (open-input-string string-or-port)
                    string-or-port))
-         (elements (reverse (parser port '()))))
+         (elements (call-with-values
+                       (lambda () (parser port '()))
+                     (lambda (elements namespaces)
+                       ;; Generate namespace declarations mapping
+                       ;; abbreviations to URLs.
+                       (let ((ns-declarations
+                              (filter-map (match-lambda
+                                            (('*DEFAULT* . _) #f)
+                                            ((abbrev uri . _)
+                                             (list (symbol-append 'xmlns: abbrev)
+                                                   (symbol->string uri))))
+                                          namespaces)))
+                         ;; Inject namespace declarations into the first
+                         ;; proper element.
+                         (match (reverse elements)
+                           (((and pi-elem ('*PI* . _))
+                             (tag ('@ . attrs) . children))
+                            `(,pi-elem (,tag (@ ,@ns-declarations ,attrs)
+                                             ,@children)))
+                           (((tag ('@ . attrs) . children))
+                            `(,tag (@ ,@ns-declarations ,attrs)
+                                   ,@children))))))))
     `(*TOP* ,@elements)))

 (define check-name
--
2.20.1

--=-=-=--
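The "generate xmlns attributes and patch the first element" step described in this message can be tried on its own, outside the parser. The following is a simplified sketch and not the attached patch: it assumes the namespaces arrive as a plain alist of (abbreviation . "URI") pairs rather than the list SSAX hands to FINISH-ELEMENT, and the names `namespace-declarations` and `inject-declarations` are made up for illustration.

```scheme
(use-modules (ice-9 match) (srfi srfi-1))

;; Turn an alist of (abbrev . "uri") pairs into SXML attributes of the
;; form (xmlns:abbrev "uri"), skipping a *DEFAULT* entry.
(define (namespace-declarations namespaces)
  (filter-map (match-lambda
                (('*DEFAULT* . _) #f)
                ((abbrev . uri)
                 (list (string->symbol
                        (string-append "xmlns:" (symbol->string abbrev)))
                       uri)))
              namespaces))

;; Prepend DECLARATIONS to the attribute list of ELEMENT, creating the
;; attribute list when the element has none.
(define (inject-declarations element declarations)
  (match element
    ((tag ('@ . attrs) . children)
     `(,tag (@ ,@declarations ,@attrs) ,@children))
    ((tag . children)
     `(,tag (@ ,@declarations) ,@children))))

(inject-declarations '(a:foo (@ (id "1")) "x")
                     (namespace-declarations
                      '((*DEFAULT* . "http://example.org/d")
                        (a . "http://example.org/a"))))
;; => (a:foo (@ (xmlns:a "http://example.org/a") (id "1")) "x")
```

Two details differ from the patch on purpose. The existing attributes are spliced in with `,@attrs`, since a plain `,attrs` inside the quasiquote would nest them as a sublist of the attribute list. The second `match` clause handles a root element that has no attribute list at all.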
X-Loop: help-debbugs@HIDDEN Subject: bug#20339: sxml simple: sxml->xml mishandles namespaces? Resent-From: John Cowan <cowan@HIDDEN> Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> Resent-CC: bug-guile@HIDDEN Resent-Date: Mon, 04 Feb 2019 22:56:03 +0000 Resent-Message-ID: <handler.20339.B20339.154932093417966 <at> debbugs.gnu.org> Resent-Sender: help-debbugs@HIDDEN X-GNU-PR-Message: followup 20339 X-GNU-PR-Package: guile X-GNU-PR-Keywords: To: Ricardo Wurmus <rekado@HIDDEN> Cc: Andy Wingo <wingo@HIDDEN>, 20339 <at> debbugs.gnu.org Received: via spool by 20339-submit <at> debbugs.gnu.org id=B20339.154932093417966 (code B ref 20339); Mon, 04 Feb 2019 22:56:03 +0000 Received: (at 20339) by debbugs.gnu.org; 4 Feb 2019 22:55:34 +0000 Received: from localhost ([127.0.0.1]:59857 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1gqn9K-0004fi-80 for submit <at> debbugs.gnu.org; Mon, 04 Feb 2019 17:55:34 -0500 Received: from mail-wr1-f41.google.com ([209.85.221.41]:45143) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <cowan@HIDDEN>) id 1gqn9I-0004fT-58 for 20339 <at> debbugs.gnu.org; Mon, 04 Feb 2019 17:55:33 -0500 Received: by mail-wr1-f41.google.com with SMTP id q15so1644863wro.12 for <20339 <at> debbugs.gnu.org>; Mon, 04 Feb 2019 14:55:32 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ccil-org.20150623.gappssmtp.com; s=20150623; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=p6f67KqRUdU6E3bC/b3cXEtW5JAbEk+91l+ztTzDdwo=; b=RBMSXSaeWIYhikNG46EakI1+YbXb3Kj6tiT0h5C9ZVs4rX3WZixLtUtvKozcmy1bcU Outss4EZQIFjkEjIWz8YAGvFj5/MLab0vEPlrNDjozznySlWUngHXdZb7xOsafU4CE78 7ZSG/cI93XgIF5wkKQC7qD8afHjYnQaR4+7mN4nwB0kfTqMBO3Tcxyqd0KZwi8xnAijL NnyUwsxLXCE1C42oDBnyTBlk+tzflnD3wAd/WglRnfVXPXyV1oj2jS99Ntu4fqQRTRPQ i87B1cQnzENwu4FXsOzxCNhxNtpiD3yU4SLQ/oA63DcgGGQ4YCf8QuiuBiOCKgop64Hi 31Fw== X-Google-DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=p6f67KqRUdU6E3bC/b3cXEtW5JAbEk+91l+ztTzDdwo=; b=BdBo8yG0G1YOalKlctSbajdQZgcSs5gxa34tMNkh30w0VnfGFjM+gWRfa6w9MaGTED KgIcBLOAmRxIXgVrZr7sswm/dcVL67BZQdkAYT1QZFPCCNr3I6CHEqdtzVvGD1J1jG1X AOfa/4sY7WzH99+YvhCViVuyuOhLPgHXcLVgIZ2eoYcgAxuwhC/It1ORhczOp0S9IeKw F1jtRr0VNJ+tt7sKDWDyoEagdCj8Vv9l8lgDIrmwU1vDrOaQjSb1pao40SoKb2xdjpQj JJR8hQVBzVgk9+fhEfKZKP0Dt0pJt8zHV0tSKfABsBgVX+R+EEyy4tcCxjUbuHLpVU1w Tiuw== X-Gm-Message-State: AHQUAuZnv0sBQiTm79ThUx1WDF5GOatA0RQuoNPHuNv3Gj/bYthZ9uvw lxal/N5/LgIeNw8B9ST38tc9gW9foBFqNRGDSWG7qQ== X-Google-Smtp-Source: AHgI3IZ17fj5ok1cXdgK20K427VetXcFg57DGLwZ1Tblozmy5bPKBSP0u9BJKYAAtNZ9zW/vDjutehUQAgix5oU7lQk= X-Received: by 2002:a5d:5101:: with SMTP id s1mr1206131wrt.89.1549320926241; Mon, 04 Feb 2019 14:55:26 -0800 (PST) MIME-Version: 1.0 References: <20150415194714.GA30295@HIDDEN> <87y45vln0f.fsf@HIDDEN> <20160713132403.GA2349@HIDDEN> <87furc1qeu.fsf@HIDDEN> <87a7jbi8rx.fsf@HIDDEN> In-Reply-To: <87a7jbi8rx.fsf@HIDDEN> From: John Cowan <cowan@HIDDEN> Date: Mon, 4 Feb 2019 17:55:14 -0500 Message-ID: <CAD2gp_ScjmURZ7yTFronxyR9r4P4P2L91mXNHguXpZG86chdVA@HIDDEN> Content-Type: multipart/alternative; boundary="00000000000073c594058119636f" X-Spam-Score: 0.0 (/) X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> 
debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -1.0 (-)

--00000000000073c594058119636f Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable

On Mon, Feb 4, 2019 at 3:45 PM Ricardo Wurmus <rekado@HIDDEN> wrote:

> I changed name->sxml to use only the namespace aliases / abbreviations
> instead of the namespace URIs.

The trouble with that is that XML namespaces are lexically scoped, like
Scheme local variables. It is perfectly valid to map a prefix to more
than one URL, as long as the namespace declarations are in either
disjoint or nested elements. So you don't know what the absolute name of
the element or attribute is from just the prefix and the local part.

Furthermore, it is also legal to define more than one prefix for the
same URL, in which case names using either prefix are normally treated
as equivalent (however, you can't have elements like
<a:foo>...</b:foo> even if a and b map to the same namespace).

> * Is the value for “namespaces” that’s passed in to the
>   FINISH-ELEMENT procedure always the same?
>
> * Will the second return value of the final call to FINISH-ELEMENT
>   really always be the complete list of *all* namespaces that have been
>   encountered?

Definitely not, only the namespaces that are currently in scope.

> * Are there valid XML documents for which the match patterns to inject
>   namespace declarations would not apply? (e.g. documents with a PI
>   element and two separate XML trees)

That's not well-formed: you can only have a single element tree per XML
document, although you can have any number of PIs, comments, and
whitespace (which is normally ignored) before and after.
--
John Cowan          http://vrici.lojban.org/~cowan        cowan@HIDDEN
If I have seen farther than others, it is because I was looking through a
spyglass with my one good eye, with a parrot standing on my shoulder. --"Y"

--00000000000073c594058119636f--
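A minimal sketch of the scoping rule described in the message above, in plain Guile Scheme. The procedure name resolve-prefix and the representation of scopes as a list of alists (innermost element first) are assumptions made for illustration; neither is part of (sxml simple) or SSAX.

```scheme
;; Hypothetical helper: resolve PREFIX against the namespace bindings
;; that are in scope.  SCOPES is a list of alists ((prefix . uri) ...),
;; one alist per enclosing element, innermost element first.
(define (resolve-prefix prefix scopes)
  (let loop ((scopes scopes))
    (cond ((null? scopes) #f)                     ; prefix is unbound
          ((assq prefix (car scopes)) => cdr)     ; innermost binding wins
          (else (loop (cdr scopes))))))

;; The outer element binds a; a nested element rebinds a and adds b.
(define outer '((a . "http://example.org/one")))
(define inner '((a . "http://example.org/two")
                (b . "http://example.org/one")))

(resolve-prefix 'a (list outer))        ; => "http://example.org/one"
(resolve-prefix 'a (list inner outer))  ; => "http://example.org/two"
(resolve-prefix 'b (list inner outer))  ; => "http://example.org/one"
```

The same prefix a resolves to two different URIs depending on where it is used, and two different prefixes (outer a, inner b) resolve to the same URI, which is why a prefix plus a local part does not identify a name on its own.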
X-Loop: help-debbugs@HIDDEN Subject: bug#20339: sxml simple: sxml->xml mishandles namespaces? Resent-From: Ricardo Wurmus <rekado@HIDDEN> Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> Resent-CC: bug-guile@HIDDEN Resent-Date: Tue, 05 Feb 2019 10:58:02 +0000 Resent-Message-ID: <handler.20339.B20339.154936428023663 <at> debbugs.gnu.org> Resent-Sender: help-debbugs@HIDDEN X-GNU-PR-Message: followup 20339 X-GNU-PR-Package: guile X-GNU-PR-Keywords: To: John Cowan <cowan@HIDDEN> Cc: Andy Wingo <wingo@HIDDEN>, 20339 <at> debbugs.gnu.org Received: via spool by 20339-submit <at> debbugs.gnu.org id=B20339.154936428023663 (code B ref 20339); Tue, 05 Feb 2019 10:58:02 +0000 Received: (at 20339) by debbugs.gnu.org; 5 Feb 2019 10:58:00 +0000 Received: from localhost ([127.0.0.1]:60232 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1gqyQS-00069a-Ew for submit <at> debbugs.gnu.org; Tue, 05 Feb 2019 05:58:00 -0500 Received: from sender-of-o51.zoho.com ([135.84.80.216]:21117) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <rekado@HIDDEN>) id 1gqyQO-00069O-2H for 20339 <at> debbugs.gnu.org; Tue, 05 Feb 2019 05:57:57 -0500 ARC-Seal: i=1; a=rsa-sha256; t=1549357932; cv=none; d=zoho.com; s=zohoarc; b=LxINLeuV0tR7uuQtUndIcJVTPdE1zA5Ck1IDc1ECK1fcujLyNAu3Yaeq8rpheviw/sGAkcI/rjD1/5Qhl9xjrqqhb5ZGH2YBSNH7IgS2e1DGYh0qxKDPRk0b6vfQU9+4eTdwfEZPT/jBRF3o+sTYoeQVdMM68hql3paMZNmCQtQ= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zoho.com; s=zohoarc; t=1549357932; h=Content-Type:Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:MIME-Version:Message-ID:References:Subject:To:ARC-Authentication-Results; bh=O5PO701Bi01oap6qC8cU7pcn0SPa5EGfUldj9TalWf8=; b=GoijDIlcY6xXeClzLgvJBFU18Hm9I5ZTPRQKwiEgYYNsE+UHDARuZRapPgDmUX1pgelfCkvjhLwgC9nOfRs+5YGC9tl6DrWvuPNmSTDe7dAwqThcGtCD3fA8r7AJ1bmDu4zbB3DBvPVopKnjA/KGmjWBB7xZcJfTGv8ofuXota0= ARC-Authentication-Results: i=1; 
mx.zoho.com; dkim=pass header.i=elephly.net; spf=pass smtp.mailfrom=rekado@HIDDEN; dmarc=pass header.from=<rekado@HIDDEN> header.from=<rekado@HIDDEN> DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; t=1549357932; s=zoho; d=elephly.net; i=rekado@HIDDEN; h=References:From:To:Cc:Subject:In-reply-to:Date:Message-ID:MIME-Version:Content-Type:Content-Transfer-Encoding; l=1303; bh=O5PO701Bi01oap6qC8cU7pcn0SPa5EGfUldj9TalWf8=; b=S7033f83I9pPS/E3suPCIxdcWzenD82ZAfgJ678d1bHzOrpenImyBeAwlS6fAZJR c3qA4WbIuz5xPqze3ZxE8GKP/XXdlomj0a7FWmhkcxhl1xdPCvX4Li1dmjXbJ4jTd2s mZavg4iJXgZk4X8OiP0biIKnrWWBWU7o9RwOCQlc= Received: from localhost (p3E9E957E.dip0.t-ipconnect.de [62.158.149.126]) by mx.zohomail.com with SMTPS id 1549357930133206.1824205767324; Tue, 5 Feb 2019 01:12:10 -0800 (PST) References: <20150415194714.GA30295@HIDDEN> <87y45vln0f.fsf@HIDDEN> <20160713132403.GA2349@HIDDEN> <87furc1qeu.fsf@HIDDEN> <87a7jbi8rx.fsf@HIDDEN> <CAD2gp_ScjmURZ7yTFronxyR9r4P4P2L91mXNHguXpZG86chdVA@HIDDEN> User-agent: mu4e 1.0; emacs 26.1 From: Ricardo Wurmus <rekado@HIDDEN> In-reply-to: <CAD2gp_ScjmURZ7yTFronxyR9r4P4P2L91mXNHguXpZG86chdVA@HIDDEN> X-URL: https://elephly.net X-PGP-Key: https://elephly.net/rekado.pubkey X-PGP-Fingerprint: BCA6 89B6 3655 3801 C3C6 2150 197A 5888 235F ACAC Date: Tue, 05 Feb 2019 10:12:06 +0100 Message-ID: <874l9iiopl.fsf@HIDDEN> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-ZohoMailClient: External X-Spam-Score: -0.0 (/) X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> 
debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -1.0 (-)

Hi John,

> The trouble with that is that XML namespaces are lexically scoped, like
> Scheme local variables.  It is perfectly valid to map a prefix to more
> than one URL, as long as the namespace declarations are in either
> disjoint or nested elements.  So you don't know what the absolute name
> of the element or attribute is from just the prefix and the local part.
>
> Furthermore, it is also legal to define more than one prefix for
> the same URL, in which case names using either prefix are normally
> treated as equivalent (however, you can't have elements like
> <a:foo>...</b:foo> even if a and b map to the same namespace).
>
>> * Is the value for “namespaces” that’s passed in to the
>>   FINISH-ELEMENT procedure always the same?
>>
>> * Will the second return value of the final call to FINISH-ELEMENT
>>   really always be the complete list of *all* namespaces that have been
>>   encountered?
>
> Definitely not, only the namespaces that are currently in scope.

Thanks for the clarifications!

In that case we could have FINISH-ELEMENT add all namespace declarations
that are in scope to the current node that is about to be returned.  It
would be a little verbose, but more correct.

What do you think?

--
Ricardo
X-Loop: help-debbugs@HIDDEN Subject: bug#20339: sxml simple: sxml->xml mishandles namespaces? Resent-From: Ricardo Wurmus <rekado@HIDDEN> Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> Resent-CC: bug-guile@HIDDEN Resent-Date: Wed, 06 Feb 2019 04:45:01 +0000 Resent-Message-ID: <handler.20339.B20339.15494282682283 <at> debbugs.gnu.org> Resent-Sender: help-debbugs@HIDDEN X-GNU-PR-Message: followup 20339 X-GNU-PR-Package: guile X-GNU-PR-Keywords: To: John Cowan <cowan@HIDDEN> Cc: 20339 <at> debbugs.gnu.org Received: via spool by 20339-submit <at> debbugs.gnu.org id=B20339.15494282682283 (code B ref 20339); Wed, 06 Feb 2019 04:45:01 +0000 Received: (at 20339) by debbugs.gnu.org; 6 Feb 2019 04:44:28 +0000 Received: from localhost ([127.0.0.1]:33924 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1grF4V-0000al-CD for submit <at> debbugs.gnu.org; Tue, 05 Feb 2019 23:44:27 -0500 Received: from sender-of-o51.zoho.com ([135.84.80.216]:21058) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <rekado@HIDDEN>) id 1grF4S-0000aZ-Ip for 20339 <at> debbugs.gnu.org; Tue, 05 Feb 2019 23:44:25 -0500 ARC-Seal: i=1; a=rsa-sha256; t=1549371437; cv=none; d=zoho.com; s=zohoarc; b=QvC9tolm152kf1f/XXQR9an58hT30JiReaKmWp2o4BPfkHyCmuIvPrUnQkkpgWooRwiXQrtRsetxYrr+885dpo6h41PiR4wNgEc+IXa/MZ3vdEOP26N58O5j1uE+UIU8szZMhF++vxuEnhx8+aZVwvZI6YxoHk63COnwjSoPK+4= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zoho.com; s=zohoarc; t=1549371437; h=Content-Type:Cc:Date:From:In-Reply-To:MIME-Version:Message-ID:References:Subject:To:ARC-Authentication-Results; bh=FvyclqO+OIQmmvjA2uzVA79hbzeOuP1ZDkiOv4qVpys=; b=fvlDG+x9OxlpANi6g0bHefRJ0UT/esKc5t6YPPF/JNtGvLrWkQdJ0ZJVUsatP5NKgUrcGQYfdtZANQ7OLNKVyh5DaEHfAyXUNJqdseRmBNePwmt98dt2hHx7rg6IELHE1/2/YHTBiVPag1Ms+yF69n9mRihZvjE8LScBsUkWD+U= ARC-Authentication-Results: i=1; mx.zoho.com; dkim=pass header.i=elephly.net; spf=pass 
smtp.mailfrom=rekado@HIDDEN; dmarc=pass header.from=<rekado@HIDDEN> header.from=<rekado@HIDDEN> DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; t=1549371437; s=zoho; d=elephly.net; i=rekado@HIDDEN; h=References:From:To:Cc:Subject:In-reply-to:Date:Message-ID:MIME-Version:Content-Type; l=3449; bh=FvyclqO+OIQmmvjA2uzVA79hbzeOuP1ZDkiOv4qVpys=; b=bBtnED2NwAp8U1v48HfmCcB47tdCJriQ1bmFcc9tBCIzi/lvBLrKF4irUA6iz+Bq 3ZEVWYRNhHcng6MF0kWBJRJQu2/ni4Ph7qUG+FQSRFC2TBOkyHT9/7VeQFYM+w6V0wX IKq+NKmIqpZSJKvbfqc36sXnrfW9YI9CO0QSjbC8= Received: from localhost (141.80.247.165 [141.80.247.165]) by mx.zohomail.com with SMTPS id 1549371435332230.93305749332058; Tue, 5 Feb 2019 04:57:15 -0800 (PST) References: <20150415194714.GA30295@HIDDEN> <87y45vln0f.fsf@HIDDEN> <20160713132403.GA2349@HIDDEN> <87furc1qeu.fsf@HIDDEN> <87a7jbi8rx.fsf@HIDDEN> <CAD2gp_ScjmURZ7yTFronxyR9r4P4P2L91mXNHguXpZG86chdVA@HIDDEN> <874l9iiopl.fsf@HIDDEN> User-agent: mu4e 1.0; emacs 26.1 From: Ricardo Wurmus <rekado@HIDDEN> In-reply-to: <874l9iiopl.fsf@HIDDEN> X-URL: https://elephly.net X-PGP-Key: https://elephly.net/rekado.pubkey X-PGP-Fingerprint: BCA6 89B6 3655 3801 C3C6 2150 197A 5888 235F ACAC Date: Tue, 05 Feb 2019 13:57:11 +0100 Message-ID: <87r2cmgzq0.fsf@HIDDEN> MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" X-ZohoMailClient: External X-Zoho-Virus-Status: 1 X-Spam-Score: -0.0 (/) X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -1.0 (-)

--=-=-=
Content-Type: text/plain

Ricardo Wurmus <rekado@HIDDEN> writes:

> In that case we could have FINISH-ELEMENT add all namespace declarations
> that are in scope to the current node that is about to be returned.  It
> would be a little verbose, but more correct.

Like this:

--=-=-=
Content-Type: text/x-patch
Content-Disposition: inline; filename=0001-sxml-xml-sxml-Record-and-use-namespace-abbreviations.patch

From d44c702718baea4c4557d12ca8dd7dab724c7fb6 Mon Sep 17 00:00:00 2001
From: Ricardo Wurmus <rekado@HIDDEN>
Date: Mon, 4 Feb 2019 21:39:06 +0100
Subject: [PATCH] sxml: xml->sxml: Record and use namespace abbreviations.

* module/sxml/simple.scm (xml->sxml)
[name->sxml]: Accept namespaces argument to look up abbreviation.  Return
name with abbreviation prefix.
[parser]: Let FINISH-ELEMENT procedure return namespaces in addition to the
SXML tree's attributes.
---
 module/sxml/simple.scm | 34 +++++++++++++++++++++++++---------
 1 file changed, 25 insertions(+), 9 deletions(-)

diff --git a/module/sxml/simple.scm b/module/sxml/simple.scm
index 703ad9137..2bb332c83 100644
--- a/module/sxml/simple.scm
+++ b/module/sxml/simple.scm
@@ -1,7 +1,8 @@
 ;;;; (sxml simple) -- a simple interface to the SSAX parser
 ;;;;
-;;;; Copyright (C) 2009, 2010, 2013 Free Software Foundation, Inc.
+;;;; Copyright (C) 2009, 2010, 2013, 2019 Free Software Foundation, Inc.
 ;;;; Modified 2004 by Andy Wingo <wingo at pobox dot com>.
+;;;; Modified 2019 by Ricardo Wurmus <rekado@HIDDEN>.
 ;;;; Originally written by Oleg Kiselyov <oleg at pobox dot com> as SXML-to-HTML.scm.
 ;;;;
 ;;;; This library is free software; you can redistribute it and/or
@@ -30,6 +31,7 @@
   #:use-module (sxml ssax)
   #:use-module (sxml transform)
   #:use-module (ice-9 match)
+  #:use-module (srfi srfi-1)
   #:use-module (srfi srfi-13)
   #:export (xml->sxml sxml->xml sxml->string))

@@ -123,10 +125,15 @@ port."
             (acons '*DEFAULT* default-entity-handler entities)
             entities))

-  (define (name->sxml name)
+  (define (name->sxml name namespaces)
     (match name
       ((prefix . local-part)
-       (symbol-append prefix (string->symbol ":") local-part))
+       (let ((abbrev (and=> (find (match-lambda
+                                    ((abbrev uri . rest)
+                                     (and (eq? uri prefix) abbrev)))
+                                  namespaces)
+                            first)))
+         (symbol-append abbrev (string->symbol ":") local-part)))
       (_ name)))

   (define (doctype-continuation seed)
@@ -150,12 +157,21 @@ port."
         (let ((seed (if trim-whitespace?
                         (ssax:reverse-collect-str-drop-ws seed)
                         (ssax:reverse-collect-str seed)))
-              (attrs (attlist-fold
-                      (lambda (attr accum)
-                        (cons (list (name->sxml (car attr)) (cdr attr))
-                              accum))
-                      '() attributes)))
-          (acons (name->sxml elem-gi)
+              (attrs (append
+                      ;; Namespace declarations
+                      (filter-map (match-lambda
+                                    (('*DEFAULT* . _) #f)
+                                    ((abbrev uri . _)
+                                     (list (symbol-append 'xmlns: abbrev)
+                                           (symbol->string uri))))
+                                  namespaces)
+                      (attlist-fold
+                       (lambda (attr accum)
+                         (cons (list (name->sxml (car attr) namespaces)
+                                     (cdr attr))
+                               accum))
+                       '() attributes))))
+          (acons (name->sxml elem-gi namespaces)
                  (if (null? attrs)
                      seed
                      (cons (cons '@ attrs) seed))
--
2.20.1

--=-=-=
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable

It’s quite verbose because it doesn’t check if a namespace
declaration is the same in a parent.

--
Ricardo

--=-=-=--
X-Loop: help-debbugs@HIDDEN Subject: bug#20339: sxml simple: sxml->xml mishandles namespaces? Resent-From: tomas@HIDDEN Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> Resent-CC: bug-guile@HIDDEN Resent-Date: Tue, 12 Feb 2019 09:57:02 +0000 Resent-Message-ID: <handler.20339.B20339.15499653739393 <at> debbugs.gnu.org> Resent-Sender: help-debbugs@HIDDEN X-GNU-PR-Message: followup 20339 X-GNU-PR-Package: guile X-GNU-PR-Keywords: To: Ricardo Wurmus <rekado@HIDDEN> Cc: Andy Wingo <wingo@HIDDEN>, 20339 <at> debbugs.gnu.org Received: via spool by 20339-submit <at> debbugs.gnu.org id=B20339.15499653739393 (code B ref 20339); Tue, 12 Feb 2019 09:57:02 +0000 Received: (at 20339) by debbugs.gnu.org; 12 Feb 2019 09:56:13 +0000 Received: from localhost ([127.0.0.1]:44428 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1gtUnV-0002RQ-4H for submit <at> debbugs.gnu.org; Tue, 12 Feb 2019 04:56:13 -0500 Received: from mail.tuxteam.de ([5.199.139.25]:52178) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <tomas@HIDDEN>) id 1gtUnQ-0002RF-BX for 20339 <at> debbugs.gnu.org; Tue, 12 Feb 2019 04:56:11 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=tuxteam.de; s=mail; h=In-Reply-To:Content-Type:MIME-Version:References:Message-ID:Subject:Cc:To:From:Date; bh=Qyl3/uqExotiKF+0PI1Es89oxwo4tCnHaYWu6Vuhh80=; b=WDD6Mg/C9iHhNgIbwJ96LxsoW9aPFk6Lr3JU9cGUR5mocNgdavgHebttNr9+x/WUA7WZX5FwOV97kQoFLQ5xk4ZiAmNfgrAd9hzDBcG56z7VVurp9OQExPNTVFjcSGr0GI50APbxkADQcTy54eFA0ZodxKZZcac6Ky2wabTT54tLFGBT0JPqY/QGHDS3L1YhWf21THRpyo5ylS6Fn+IJL1HmT8GXW8cownU1CYO1F8YwEkoxPF3WZQ+sqiSq1//UxFfBy42qFI25Kafgi2atZFoNEi8kNBfJ2zL669bCmiPOCRrs8uSL9Jc8ie3n8GXTPsZTHw87U7vHns+Vb8ex9Q==; Received: from tomas by mail.tuxteam.de with local (Exim 4.80) (envelope-from <tomas@HIDDEN>) id 1gtUnK-0004XC-Lt; Tue, 12 Feb 2019 10:56:02 +0100 Date: Tue, 12 Feb 2019 10:56:02 +0100 From: 
tomas@HIDDEN Message-ID: <20190212095602.GD13448@HIDDEN> References: <20150415194714.GA30295@HIDDEN> <87y45vln0f.fsf@HIDDEN> <20160713132403.GA2349@HIDDEN> <87furc1qeu.fsf@HIDDEN> <87a7jbi8rx.fsf@HIDDEN> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="Pk6IbRAofICFmK5e" Content-Disposition: inline In-Reply-To: <87a7jbi8rx.fsf@HIDDEN> User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Score: 0.0 (/) X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -1.0 (-)

--Pk6IbRAofICFmK5e
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Mon, Feb 04, 2019 at 09:44:02PM +0100, Ricardo Wurmus wrote:
> Hello!
>
> I just looked at this again and I think I came up with something useful.
> Here’s some context:

[...]

> Attached is a patch that does the requested things.  The parser
> procedures like FINISH-ELEMENT have access to all the namespaces, so
> I changed the FINISH-ELEMENT procedure to return the list of namespaces
> in addition to its SXML tree return value.

It's great that you picked that up, I'm excited :-)

I have lost a bit of contact with Guile as of late. But I'm preparing
some tooling to give your patches a whirl; in the meantime a couple
of comments from the peanut gallery:

As John has noted, the namespace mappings (i.e. the prefix -> namespace
URI binding) are kind of lexically scoped (I'd call it subtree scoped,
but structurally it is the same). While parsing is "easy" (assuming
well-formed XML), serializing is not unambiguous.

In a way, the library might want to be prepared to take hints from the
application (as far as the XML is to be read by humans, there might be
"better" and "worse" serializations).

It may take me a couple of days to come up to speed.

Thanks a lot & cheers
-- t

--Pk6IbRAofICFmK5e
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: Digital signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iEUEARECAAYFAlximDIACgkQBcgs9XrR2kbicwCWNOloNf1OUTw7vsDBAlmuxDLi
egCffA4PYlxxVDtlzgdSZ4HqlUTN1o4=
=DZql
-----END PGP SIGNATURE-----

--Pk6IbRAofICFmK5e--
X-Loop: help-debbugs@HIDDEN Subject: bug#20339: sxml simple: sxml->xml mishandles namespaces? Resent-From: Ricardo Wurmus <rekado@HIDDEN> Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> Resent-CC: bug-guile@HIDDEN Resent-Date: Wed, 13 Feb 2019 00:17:01 +0000 Resent-Message-ID: <handler.20339.B20339.155001698524126 <at> debbugs.gnu.org> Resent-Sender: help-debbugs@HIDDEN X-GNU-PR-Message: followup 20339 X-GNU-PR-Package: guile X-GNU-PR-Keywords: To: tomas@HIDDEN Cc: Andy Wingo <wingo@HIDDEN>, 20339 <at> debbugs.gnu.org Received: via spool by 20339-submit <at> debbugs.gnu.org id=B20339.155001698524126 (code B ref 20339); Wed, 13 Feb 2019 00:17:01 +0000 Received: (at 20339) by debbugs.gnu.org; 13 Feb 2019 00:16:25 +0000 Received: from localhost ([127.0.0.1]:45563 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1gtiDw-0006Go-W2 for submit <at> debbugs.gnu.org; Tue, 12 Feb 2019 19:16:25 -0500 Received: from sender-of-o51.zoho.com ([135.84.80.216]:21146) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <rekado@HIDDEN>) id 1gtiDt-0006Dl-H3 for 20339 <at> debbugs.gnu.org; Tue, 12 Feb 2019 19:16:24 -0500 ARC-Seal: i=1; a=rsa-sha256; t=1550003411; cv=none; d=zoho.com; s=zohoarc; b=BzPTtnVsdtKj91FW2mGqniG3l3iowKk35iqDYogZPgAqd+YnojHAGc+bTfrVpmvsMBnjEwgKKDhUH4jr0i/ynsQxB8DZ/cIRPRmkaE2FPNI8A7bPQT9Y6Vw5EmtI9Ncp/Fcq93HaJZoAz8YkOjs6td3NDV0+9G5CyFynNjmimB4= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zoho.com; s=zohoarc; t=1550003411; h=Content-Type:Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:MIME-Version:Message-ID:References:Subject:To:ARC-Authentication-Results; bh=rgwjLTIHDGg8TKkjlK2JEo2aDgrYMkop9+qsNXVTdFg=; b=aaOxbRkEyXmyBHLQ5h0c6OIJvpCr8VWPgCjA7C8A/8abTz4dZMt3X/moVE1D/ywQzYdy9zvdjL/x2ok8hBr95i/I3fS7pLOiekTDLmSuReO0t5D5ggxpKUIbKxzW6f6vwftIstkHjXt8ZqPMiXuQs3lwXu/K44jC7h4wsqE2PIA= ARC-Authentication-Results: i=1; mx.zoho.com; 
dkim=pass header.i=elephly.net; spf=pass smtp.mailfrom=rekado@HIDDEN; dmarc=pass header.from=<rekado@HIDDEN> header.from=<rekado@HIDDEN> DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; t=1550003411; s=zoho; d=elephly.net; i=rekado@HIDDEN; h=References:From:To:Cc:Subject:In-reply-to:Date:Message-ID:MIME-Version:Content-Type:Content-Transfer-Encoding; l=1960; bh=rgwjLTIHDGg8TKkjlK2JEo2aDgrYMkop9+qsNXVTdFg=; b=W310mK9tZOQquXjTd4oWbgz0S99qWH3Hoeh97mk6kbSvXUzlrmsvhKeiM+JgzZK8 MWW2BqeIq38Ti42F44MQCtBRZVqt7cyVSoYNGTZuMD6HteMtMPM13B6Z6dGv1Z13o6q nQek8J4Ap5p2J0tvuj0+/mXnTYYsp6l9u107/eNA= Received: from localhost (p3E9E9E6F.dip0.t-ipconnect.de [62.158.158.111]) by mx.zohomail.com with SMTPS id 1550003408876634.6826516030922; Tue, 12 Feb 2019 12:30:08 -0800 (PST) References: <20150415194714.GA30295@HIDDEN> <87y45vln0f.fsf@HIDDEN> <20160713132403.GA2349@HIDDEN> <87furc1qeu.fsf@HIDDEN> <87a7jbi8rx.fsf@HIDDEN> <20190212095602.GD13448@HIDDEN> User-agent: mu4e 1.0; emacs 26.1 From: Ricardo Wurmus <rekado@HIDDEN> In-reply-to: <20190212095602.GD13448@HIDDEN> X-URL: https://elephly.net X-PGP-Key: https://elephly.net/rekado.pubkey X-PGP-Fingerprint: BCA6 89B6 3655 3801 C3C6 2150 197A 5888 235F ACAC Date: Tue, 12 Feb 2019 21:30:04 +0100 Message-ID: <87wom4iwc3.fsf@HIDDEN> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-ZohoMailClient: External X-Spam-Score: 0.0 (/) X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: 
<https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -1.0 (-)

tomas@HIDDEN writes:

> As John has noted, the namespace mappings (i.e. the prefix -> namespace
> URI binding) are kind of lexically scoped (I'd call it subtree scoped,
> but structurally it is the same). While parsing is "easy" (assuming
> well-formed XML), serializing is not unambiguous.

The “fup” handler of the parser visits every element and has a list of
namespaces that are in scope at this point.  Its purpose is to return
the SXML representation of that element.  At this point we can record
the namespaces as attributes.  (That’s what the patch does.)

When baking XML from SXML we don’t need to do anything special; we
only need to convert everything to text, including the recorded
namespace attributes.

This isn’t pretty SXML (nor is it pretty XML), but it appears to be
correct as none of the namespace information is lost.  To get a better
serialized representation the parser needs to do a better job of
identifying “new” namespaces.

> In a way, the library might want to be prepared to take hints from the
> application (as far as the XML is to be read by humans, there might be
> "better" and "worse" serializations).

The XML produced when this patch is applied will not be pretty.  To
generate minimal/pretty XML, knowledge of the parent elements’
namespaces is required, and the parser’s “fup” handler does not have
that knowledge.

We could try to alter the parser so that it not only passes the list of
namespaces that are currently in scope, but also a list of namespaces
that are in scope for the parent node.  This would allow us to determine
the list of *new* namespaces that absolutely must be declared for the
current node.  If there are no new namespaces we can simply ignore them
and produce minimal SXML (and thus minimal XML later when the SXML is
serialized).

--
Ricardo
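The comparison proposed in the message above (namespaces in scope for the current node versus those in scope for its parent) could be sketched roughly as follows. This is only an illustration under stated assumptions: the procedure names effective-bindings and new-namespaces are hypothetical, the entry shape (abbrev uri . rest) is taken from the match patterns in the patch, and both lists are assumed to hold the innermost binding first. The symbols uri-a and uri-b stand in for namespace URI symbols.

```scheme
(use-modules (srfi srfi-1) (ice-9 match))

;; Hypothetical helper: keep only the first (innermost) entry for each
;; abbreviation, so shadowed outer bindings are ignored.
(define (effective-bindings namespaces)
  (delete-duplicates namespaces
                     (lambda (x y) (eq? (car x) (car y)))))

;; Hypothetical helper: the bindings of CURRENT that the parent does not
;; already provide, i.e. the ones that must be declared on this node.
(define (new-namespaces current parent)
  (filter (match-lambda
            ((abbrev uri . _)
             (match (assq abbrev parent)
               ;; Parent binds the same abbreviation: new only if the
               ;; URI differs.
               ((_ parent-uri . _) (not (eq? uri parent-uri)))
               ;; Parent has no binding for this abbreviation.
               (#f #t))))
          (effective-bindings current)))

;; b is introduced on the current node; a is inherited unchanged.
(new-namespaces '((b uri-b) (a uri-a)) '((a uri-a)))
;; => ((b uri-b))

;; a is rebound to the URI it had in the grandparent; the parent's
;; shadowing binding differs, so it must be declared again.
(new-namespaces '((a uri-a) (a uri-b) (a uri-a)) '((a uri-b) (a uri-a)))
;; => ((a uri-a))
```

Comparing effective bindings rather than raw list membership matters for the second case: a plain "is this entry somewhere in the parent list" test would miss a rebinding back to a shadowed URI.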
X-Loop: help-debbugs@HIDDEN Subject: bug#20339: sxml simple: sxml->xml mishandles namespaces? Resent-From: <tomas@HIDDEN> Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> Resent-CC: bug-guile@HIDDEN Resent-Date: Mon, 08 Apr 2019 12:15:02 +0000 Resent-Message-ID: <handler.20339.B20339.155472565528064 <at> debbugs.gnu.org> Resent-Sender: help-debbugs@HIDDEN X-GNU-PR-Message: followup 20339 X-GNU-PR-Package: guile X-GNU-PR-Keywords: To: Ricardo Wurmus <rekado@HIDDEN> Cc: John Cowan <cowan@HIDDEN>, 20339 <at> debbugs.gnu.org Received: via spool by 20339-submit <at> debbugs.gnu.org id=B20339.155472565528064 (code B ref 20339); Mon, 08 Apr 2019 12:15:02 +0000 Received: (at 20339) by debbugs.gnu.org; 8 Apr 2019 12:14:15 +0000 Received: from localhost ([127.0.0.1]:49075 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1hDTAE-0007IZ-GN for submit <at> debbugs.gnu.org; Mon, 08 Apr 2019 08:14:14 -0400 Received: from mail.tuxteam.de ([5.199.139.25]:47211) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <tomas@HIDDEN>) id 1hDTAB-0007IL-3J for 20339 <at> debbugs.gnu.org; Mon, 08 Apr 2019 08:14:13 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=tuxteam.de; s=mail; h=From:In-Reply-To:Content-Type:MIME-Version:References:Message-ID:Subject:Cc:To:Date; bh=9rttPn+Z2+suDJb3cpPrezQSV7+57zD/mpt/BOrh6pg=; b=KNHTOkZxHdP+4ohMcWgeG29Lp9zSZNZqFFkIUR6sD9u63tEVA+2bqq6udxH0xOdUiUGVnADSKxgAwtcd4P+08keYwrPWzt6iv8p+aD1dwBsvwevUWDPIYbz/c55UbPMmqBwO0OITP9RiohTLTiLHLtO1oNmAA6zR1CgqDLRzb2lk5WTcVJ/6PPksOkYZlBIJVar6QDOeO6wUS7WJ7tA62G8fxitrClwh3RjAMIR6NIQApU2IymX1pZj5sNn0EOjEqIM+tlQTXlIHDabHLSQEXnCw94BdxPonDTbFWGID+d+cY6KMARXIaeIu1cLGuS9OHTRAkreb9lEdMaRBndPA/g==; Received: from tomas by mail.tuxteam.de with local (Exim 4.80) (envelope-from <tomas@HIDDEN>) id 1hDTA3-0000Vy-WB; Mon, 08 Apr 2019 14:14:04 +0200 Date: Mon, 8 Apr 2019 14:14:03 +0200 Message-ID: 
<20190408121403.GA781@HIDDEN> References: <20150415194714.GA30295@HIDDEN> <87y45vln0f.fsf@HIDDEN> <20160713132403.GA2349@HIDDEN> <87furc1qeu.fsf@HIDDEN> <87a7jbi8rx.fsf@HIDDEN> <CAD2gp_ScjmURZ7yTFronxyR9r4P4P2L91mXNHguXpZG86chdVA@HIDDEN> <874l9iiopl.fsf@HIDDEN> <87r2cmgzq0.fsf@HIDDEN> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="yrj/dFKFPuw6o+aM" Content-Disposition: inline In-Reply-To: <87r2cmgzq0.fsf@HIDDEN> User-Agent: Mutt/1.5.21 (2010-09-15) From: <tomas@HIDDEN> X-Spam-Score: 0.0 (/) X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -1.0 (-)

--yrj/dFKFPuw6o+aM
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Tue, Feb 05, 2019 at 01:57:11PM +0100, Ricardo Wurmus wrote:
>
> Ricardo Wurmus <rekado@HIDDEN> writes:
>
> > In that case we could have FINISH-ELEMENT add all namespace declarations
> > that are in scope to the current node that is about to be returned.  It
> > would be a little verbose, but more correct.
>
> Like this:

Thanks again for your patch, and sorry for my glacial pace.

I have now gotten around to testing it (against Guile 2.2.4, commit
791cae940afcb2b2eb2c167fe438be1dc1008a73).

TL;DR:

 - The default namespace is still a problem (see below)
 - It would be nice to inhibit the down-inheritance of namespace
   declarations at xml->sxml time. Then the sxml representation
   would closely mimic the XML. This has obvious advantages,
   since it'd give the user much more control over the generated XML.

I'd be willing to prepare a patch along these lines, but for that, I'd
like to get an idea of which direction we want to take this whole
thing.

To see what's going on, I tried with a small XML example:

First with explicit (aka non-default) namespace:

#+NAME: minimal-explicit
#+BEGIN_EXAMPLE
<?xml version="1.0"?>
<myns:root xmlns:myns="http://example.org/namespaces/myns">
  <myns:subnode/>
</myns:root>
#+END_EXAMPLE

Before your patch:

#+NAME: minimal-explicit-before
#+BEGIN_SRC scheme :results output verbatim :var the-xml=minimal-explicit
  (use-modules (sxml simple))
  (use-modules (ice-9 pretty-print))
  (pretty-print (xml->sxml the-xml))
#+END_SRC

#+RESULTS: minimal-explicit-before
: <stdin>:12:0: warning: possibly unbound variable `pretty-print'
: <stdin>:12:14: warning: possibly unbound variable `xml->sxml'
: (*TOP* (*PI* xml "version=\"1.0\"")
:        (http://example.org/namespaces/myns:root
:          "\n "
:          (http://example.org/namespaces/myns:subnode)
:          "\n"))

As we know, this replaces the namespace prefixes with the namespace URIs.

After your patch:

#+NAME: minimal-explicit-after
#+BEGIN_SRC scheme :results output verbatim :var the-xml=minimal-explicit
  (set! %load-path (cons "." %load-path))
  (use-modules (sxml simple))
  (use-modules (ice-9 pretty-print))
  (pretty-print (xml->sxml the-xml))
#+END_SRC

#+RESULTS: minimal-explicit-after
#+begin_example
<stdin>:13:0: warning: possibly unbound variable `pretty-print'
<stdin>:13:14: warning: possibly unbound variable `xml->sxml'
;;; note: source file ./sxml/simple.scm
;;;       newer than compiled /usr/local/lib/guile/2.2/ccache/sxml/simple.go
;;; found fresh local cache at /home/tomas/.cache/guile/ccache/2.2-LE-8-3.A/home/tomas/guile/sxml-fix/sxml/simple.scm.go
(*TOP* (*PI* xml "version=\"1.0\"")
       (myns:root
         (@ (xmlns:myns "http://example.org/namespaces/myns"))
         "\n "
         (myns:subnode
           (@ (xmlns:myns "http://example.org/namespaces/myns")))
         "\n"))
#+end_example

(I've put sxml/simple.scm in the current directory, thus the manipulation
of %load-path).

This mimics the XML more closely, using namespace prefixes instead of
namespaces in the sxml. This is compelling :-)

The only difference from the XML is that the namespace declaration is
inherited by lower-level nodes (that's why sxml->xml propagates them, too).

This works, with the above downside, which you noted too. It doesn't work
with a default namespace, though:

#+NAME: minimal-implicit
#+BEGIN_EXAMPLE
<?xml version="1.0"?>
<root xmlns="http://example.org/namespaces/myns">
  <subnode/>
</root>
#+END_EXAMPLE

With your patch:

#+NAME: minimal-implicit-after
#+BEGIN_SRC scheme :results output verbatim :var the-xml=minimal-implicit
  (set! %load-path (cons "." %load-path))
  (use-modules (sxml simple))
  (use-modules (ice-9 pretty-print))
  (pretty-print (xml->sxml the-xml))
#+END_SRC

#+RESULTS: minimal-implicit-after
: <stdin>:13:0: warning: possibly unbound variable `pretty-print'
: <stdin>:13:14: warning: possibly unbound variable `xml->sxml'
: ;;; note: source file ./sxml/simple.scm
: ;;;       newer than compiled /usr/local/lib/guile/2.2/ccache/sxml/simple.go
: ;;; found fresh local cache at /home/tomas/.cache/guile/ccache/2.2-LE-8-3.A/home/tomas/guile/sxml-fix/sxml/simple.scm.go
: (*TOP* (*PI* xml "version=\"1.0\"")
:        (*DEFAULT*:root "\n " (*DEFAULT*:subnode) "\n"))

Note that the namespace declaration for *DEFAULT* is missing, so we lost
that bit of information. Besides, this is not serializable:

#+NAME: reserialize-implicit
#+BEGIN_SRC scheme :results output verbatim
  (set! %load-path (cons "." %load-path))
  (use-modules (sxml simple))
  (define the-sxml
    '(*TOP* (*PI* xml "version=\"1.0\"")
            (*DEFAULT*:root "\n " (*DEFAULT*:subnode) "\n")))
  (sxml->xml the-sxml)
#+END_SRC

It catches the bad (xml) name starting with a star:

#+RESULTS: reserialize-implicit
: ERROR: In procedure scm-error:
: Invalid name starting character "*DEFAULT*" *DEFAULT*:root
:
: Entering a new prompt.  Type `,bt' for a backtrace or `,q' to continue.
: scheme@(guile-user) [1]>

Cheers
-- tomás

--yrj/dFKFPuw6o+aM
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: Digital signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iEYEARECAAYFAlyrOwsACgkQBcgs9XrR2kabQgCeIvJGAfCZb5KnVNe7M7VFapAY
l9kAn110JNoUb3XRLxV8nCAk4ihppgsF
=bnBc
-----END PGP SIGNATURE-----

--yrj/dFKFPuw6o+aM--
X-Loop: help-debbugs@HIDDEN Subject: bug#20339: Taking a step back (was: sxml simple: sxml->xml mishandles namespaces?) Resent-From: tomas@HIDDEN Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> Resent-CC: bug-guile@HIDDEN Resent-Date: Fri, 03 May 2019 10:47:02 +0000 Resent-Message-ID: <handler.20339.B20339.15568803991733 <at> debbugs.gnu.org> Resent-Sender: help-debbugs@HIDDEN X-GNU-PR-Message: followup 20339 X-GNU-PR-Package: guile X-GNU-PR-Keywords: To: Ricardo Wurmus <rekado@HIDDEN> Cc: Andy Wingo <wingo@HIDDEN>, 20339 <at> debbugs.gnu.org Received: via spool by 20339-submit <at> debbugs.gnu.org id=B20339.15568803991733 (code B ref 20339); Fri, 03 May 2019 10:47:02 +0000 Received: (at 20339) by debbugs.gnu.org; 3 May 2019 10:46:39 +0000 Received: from localhost ([127.0.0.1]:47827 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1hMViA-0000Rr-JF for submit <at> debbugs.gnu.org; Fri, 03 May 2019 06:46:39 -0400 Received: from mail.tuxteam.de ([5.199.139.25]:36828) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <tomas@HIDDEN>) id 1hMVi5-0000Rf-PQ for 20339 <at> debbugs.gnu.org; Fri, 03 May 2019 06:46:36 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=tuxteam.de; s=mail; h=In-Reply-To:Content-Type:MIME-Version:References:Message-ID:Subject:Cc:To:From:Date; bh=JsUyCYj5dp82jov8QJTMxBPXPEXdwUHlTIQYdfEpU+E=; b=l7AiYLeYETz7SyflbCLPIbNsDFf6JM41FkZjcbfshK++mwhT7RGL9flGapjF2b9dCLUSO9szcgoyxfdMXuF+ui9eTZlVP4LhBHZTxB9JiYxfrAalINzkp/4r4Dlrgh5ikwi3klo7Rs/s44ecP2F/ltsWXeKoNZwP0U90r5FMkUWfjbluHvu+pIz3ORlLnzpOz8HFEhBCFqSpWztzW50rhECHfSuqSdbYK6X+EViO8Ia0Qy6dUtX10vHLsDOVJgMc331Il0gaTm2rlefd/XLiCyv/7i1MYmaD+vAA8/PBHpiKJCHqiCDQ8vYXIb5N4eoNTzB/Lse3wsgD5UmA8vejMw==; Received: from tomas by mail.tuxteam.de with local (Exim 4.80) (envelope-from <tomas@HIDDEN>) id 1hMVhz-0001i0-Ci; Fri, 03 May 2019 12:46:27 +0200 Date: Fri, 3 May 2019 
12:46:27 +0200 From: tomas@HIDDEN Message-ID: <20190503104627.GE31083@HIDDEN> References: <20150415194714.GA30295@HIDDEN> <87y45vln0f.fsf@HIDDEN> <20160713132403.GA2349@HIDDEN> <87furc1qeu.fsf@HIDDEN> <87a7jbi8rx.fsf@HIDDEN> <20190212095602.GD13448@HIDDEN> <87wom4iwc3.fsf@HIDDEN> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="imjhCm/Pyz7Rq5F2" Content-Disposition: inline In-Reply-To: <87wom4iwc3.fsf@HIDDEN> User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Score: 0.0 (/) X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -1.0 (-) --imjhCm/Pyz7Rq5F2 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Hi, after mulling over it for a while, I think it's time to take a step back and think a bit about where we'd like to go with this. Note that I'm ignoring technical details (the fact that the SXML, and thus the XML serialization now has namespace declarations everywhere down the path instead of just at the corresponding root node, and the thing with the default namespaces, as noted in [1], seem to me "fixable" technical details). Your patch, Ricardo, takes a new approach wrt. 
the SXML resulting from an XML parse: the full tag names (the QNAMEs, in
XML parlance) are now composed of <prefix>:<name> (mimicking the XML)
instead of <namespace uri>:<name>, as (sxml simple) used to do.

This has upsides and downsides. I'll call your approach the "prefix"
approach (since it uses the prefixes to qualify the tag names) and the
approach followed by (sxml simple) up to now the "URI" approach, which
has the full namespace URI qualifying the name.

In the URI approach, a qualified tag name would look like

  "http://example.org/namespaces/myns:node"

whereas in the prefix approach, it'd look like

  "myns:node"

plus the knowledge somewhere that the prefix "myns" stands for

  myns -> http://example.org/namespaces/myns

Upsides of the prefix approach:

 + it mimics the XML syntax more closely. Since that is what the XML
   folks see, it follows the "principle of least astonishment"
   (aka POLA)

 + it is forced to keep the prefix -> namespace associations
   (it would be semantically incomplete otherwise, since what counts
   semantically is the namespace URI)

Downsides

 - it contradicts the current documentation

   "All namespaces in the XML document must be declared, via xmlns
    attributes. SXML elements built from non-default namespaces will
    have their tags prefixed with their URI. Users can specify custom
    prefixes for certain namespaces with the #:namespaces keyword
    argument to xml->sxml." [2]

   This can be changed, of course :-) But perhaps someone is already
   relying on it?

 - working on the resulting SXML becomes harder, because to compare
   two qualified names, we'd have to resolve the namespace
   associations.

Upsides of the URI approach

 + it is what the documentation says

 + it follows the XML semantics more closely (the namespace prefix
   in itself is irrelevant, after all). As a corollary, working on
   the SXML becomes easier: a comparison of two qualified names
   becomes a simple string comparison, etc. I think that is why
   (sxml simple)'s original design followed this path.

Downsides

 Well, negate the "prefix approach" upsides :-)

Let me just say that there seem to be precedents for the prefix
approach out there on the 'net: the Wikipedia article [3] (yes, there's
a Wikipedia article on that!) follows the prefix approach. This nice
blog post [4] does too.

I think I'll stop here. My fingers itch to do some hacking, but I think
we should pause and ponder before hacking. Perhaps we should take this
to guile-devel?

OTOH, if someone knows The Way Forward (TM), I'm willing to hack in
this direction.

Cheers & thanks

[1] Message ID <20190408121403.GA781@HIDDEN>
    http://lists.gnu.org/archive/html/bug-guile/2019-04/msg00001.html
[2] https://www.gnu.org/software/guile/manual/guile.html#SXML
[3] https://en.wikipedia.org/wiki/SXML
[4] https://www.more-magic.net/posts/lispy-dsl-sxml.html

-- tomás

--imjhCm/Pyz7Rq5F2
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: Digital signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iEYEARECAAYFAlzMHAMACgkQBcgs9XrR2karLACdFBBbZnzvLF3kxFuyGiO1LdFl
7a8An3REZ122yhfCev5iLBMuQTKWSwMH
=m2+q
-----END PGP SIGNATURE-----

--imjhCm/Pyz7Rq5F2--