GNU logs - #20339, boring messages


Message sent to bug-guile@HIDDEN:


X-Loop: help-debbugs@HIDDEN
Subject: bug#20339: sxml simple: sxml->xml mishandles namespaces?
Resent-From: tomas@HIDDEN
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-guile@HIDDEN
Resent-Date: Wed, 15 Apr 2015 19:48:02 +0000
Resent-Message-ID: <handler.20339.B.142912725311595 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: report 20339
X-GNU-PR-Package: guile
X-GNU-PR-Keywords: 
To: 20339 <at> debbugs.gnu.org
X-Debbugs-Original-To: bug-guile@HIDDEN
Received: via spool by submit <at> debbugs.gnu.org id=B.142912725311595
          (code B ref -1); Wed, 15 Apr 2015 19:48:02 +0000
Received: (at submit) by debbugs.gnu.org; 15 Apr 2015 19:47:33 +0000
Received: from localhost ([127.0.0.1]:57415 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.80)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1YiTHY-00030w-92
	for submit <at> debbugs.gnu.org; Wed, 15 Apr 2015 15:47:32 -0400
Received: from eggs.gnu.org ([208.118.235.92]:33015)
 by debbugs.gnu.org with esmtp (Exim 4.80)
 (envelope-from <tomas@HIDDEN>) id 1YiTHU-00030b-Pc
 for submit <at> debbugs.gnu.org; Wed, 15 Apr 2015 15:47:29 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from <tomas@HIDDEN>) id 1YiTHO-0001CM-Gx
 for submit <at> debbugs.gnu.org; Wed, 15 Apr 2015 15:47:23 -0400
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org
X-Spam-Level: 
X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50 autolearn=disabled
 version=3.3.2
Received: from lists.gnu.org ([2001:4830:134:3::11]:36257)
 by eggs.gnu.org with esmtp (Exim 4.71)
 (envelope-from <tomas@HIDDEN>) id 1YiTHO-0001CI-EO
 for submit <at> debbugs.gnu.org; Wed, 15 Apr 2015 15:47:22 -0400
Received: from eggs.gnu.org ([2001:4830:134:3::10]:46510)
 by lists.gnu.org with esmtp (Exim 4.71)
 (envelope-from <tomas@HIDDEN>) id 1YiTHM-0007IB-MS
 for bug-guile@HIDDEN; Wed, 15 Apr 2015 15:47:22 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from <tomas@HIDDEN>) id 1YiTHH-0001BI-SO
 for bug-guile@HIDDEN; Wed, 15 Apr 2015 15:47:20 -0400
Received: from mail.tuxteam.de ([5.199.139.25]:44167 helo=tomasium.tuxteam.de)
 by eggs.gnu.org with esmtp (Exim 4.71)
 (envelope-from <tomas@HIDDEN>) id 1YiTHH-0001BE-Ls
 for bug-guile@HIDDEN; Wed, 15 Apr 2015 15:47:15 -0400
Received: from tomas by tomasium.tuxteam.de with local (Exim 4.80)
 (envelope-from <tomas@HIDDEN>) id 1YiTHG-0008Cw-LG
 for bug-guile@HIDDEN; Wed, 15 Apr 2015 21:47:14 +0200
Date: Wed, 15 Apr 2015 21:47:14 +0200
From: tomas@HIDDEN
Message-ID: <20150415194714.GA30295@HIDDEN>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; x-action=pgp-signed
Content-Transfer-Encoding: 8bit
User-Agent: Mutt/1.5.21 (2010-09-15)
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x
X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address
 (bad octet value).
X-Received-From: 2001:4830:134:3::11
X-Spam-Score: -5.0 (-----)
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -5.0 (-----)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi,

I posted more details on guile-devel. Perhaps this was the wrong list?

When transforming SXML to XML, namespaces don't seem to be handled
properly:


  #!/usr/bin/guile -s
  !#
  (use-modules (sxml simple))
  
  ;; An XML with two namespaces (one default)
  (define the-svg "<svg xmlns='http://www.w3.org/2000/svg'
       xmlns:xlink='http://www.w3.org/1999/xlink'>
    <rect x='5' y='5' width='20' height='20'
          stroke-width='2' stroke='purple' fill='yellow'
          id='rect1' />
    <rect x='30' y='5' width='20' height='20'
          ry='5' rx='8' stroke-width='2' stroke='purple' fill='blue'
          xlink:href='#rect1' />
  </svg>")
  
  ;; Note how SXML handles QNames (just concatenating NS and
  ;; local-name with a colon):
  (define the-sxml
    (with-input-from-string the-svg xml->sxml))
  (format #t "~A\n" the-sxml)
  
  ;; If we try to serialize this: kaboom!
  (sxml->xml the-sxml)

The parsing into SXML goes well, the (format ...) outputs what
I'd expect. But the (sxml->xml ...) dies with:

  ERROR: In procedure scm-error:
  ERROR: Invalid QName: more than one colon http://www.w3.org/2000/svg:svg

The problem is that SXML used the concatenated (full) namespace with the
name as tag (and attribute) names for namespaced items. When serializing
to XML it should try to find abbreviations for those namespaces and issue
the corresponding namespace declarations.

Instead, sxml->xml tries to split the (namespace:name) combination
at the first colon and to check the name -- and fails miserably at
(namespace:name) combinations à la "http://www.w3.org/1999/xlink:href"
(procedure check-name). Since there are two colons, the name part
has now a colon.
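
To illustrate the split described above, here is a minimal sketch (not the
patch itself) that cuts the name at the last colon instead of the first.
The helper name split-qname is hypothetical and not part of (sxml simple);
string-rindex is the only non-trivial call and is available in core Guile.

```scheme
;; Sketch only: split an SXML-style name at the LAST colon, so that
;; colons inside the namespace URI stay with the namespace part.
;; `split-qname' is a hypothetical helper, not part of (sxml simple).
(define (split-qname name)
  (let* ((str (symbol->string name))
         (i (string-rindex str #\:)))
    (if i
        ;; Namespaced name: everything before the last colon is the
        ;; namespace, everything after it is the local name.
        (values (substring str 0 i) (substring str (1+ i)))
        ;; No colon at all: a plain local name without namespace.
        (values #f str))))

(call-with-values
    (lambda ()
      (split-qname (string->symbol "http://www.w3.org/1999/xlink:href")))
  (lambda (ns local)
    (format #t "namespace: ~A, local name: ~A\n" ns local)))
```

Splitting from the right works here because a local name may not contain a
colon, whereas a namespace URI usually does.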

There are more details at:

http://lists.gnu.org/archive/html/guile-devel/2015-04/msg00000.html

with a first attempt at a patch against guile (GNU Guile) 2.0.5-deb+1-3.
I'm more than willing to beat the patch into shape, but will possibly
need some guidance. Perhaps I'd need to sign papers with the FSF, which
I'd gladly do.

Regards
- -- tomás
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iEYEARECAAYFAlUuwEIACgkQBcgs9XrR2kbJWQCfQ/ALFQrf0crOK47SbaOlJlMv
MwAAn3fxDBWOhgNF0L7E35k0skol2T0V
=FIId
-----END PGP SIGNATURE-----




Message sent:


Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
X-Mailer: MIME-tools 5.503 (Entity 5.503)
Content-Type: text/plain; charset=utf-8
X-Loop: help-debbugs@HIDDEN
From: help-debbugs@HIDDEN (GNU bug Tracking System)
To: tomas@HIDDEN
Subject: bug#20339: Acknowledgement (sxml simple: sxml->xml mishandles
 namespaces?)
Message-ID: <handler.20339.B.142912725311595.ack <at> debbugs.gnu.org>
References: <20150415194714.GA30295@HIDDEN>
X-Gnu-PR-Message: ack 20339
X-Gnu-PR-Package: guile
Reply-To: 20339 <at> debbugs.gnu.org
Date: Wed, 15 Apr 2015 19:48:02 +0000

Thank you for filing a new bug report with debbugs.gnu.org.

This is an automatically generated reply to let you know your message
has been received.

Your message is being forwarded to the package maintainers and other
interested parties for their attention; they will reply in due course.

Your message has been sent to the package maintainer(s):
 bug-guile@HIDDEN

If you wish to submit further information on this problem, please
send it to 20339 <at> debbugs.gnu.org.

Please do not send mail to help-debbugs@HIDDEN unless you wish
to report a problem with the Bug-tracking system.

-- 
20339: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=20339
GNU Bug Tracking System
Contact help-debbugs@HIDDEN with problems


Message sent to bug-guile@HIDDEN:


X-Loop: help-debbugs@HIDDEN
Subject: bug#20339: [PATCH] sxml->xml and namespaces: updated patch
References: <20150415194714.GA30295@HIDDEN>
In-Reply-To: <20150415194714.GA30295@HIDDEN>
Resent-From: <tomas@HIDDEN>
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-guile@HIDDEN
Resent-Date: Mon, 20 Apr 2015 07:46:02 +0000
Resent-Message-ID: <handler.20339.B20339.142951592327523 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: followup 20339
X-GNU-PR-Package: guile
X-GNU-PR-Keywords: 
To: 20339 <at> debbugs.gnu.org
Received: via spool by 20339-submit <at> debbugs.gnu.org id=B20339.142951592327523
          (code B ref 20339); Mon, 20 Apr 2015 07:46:02 +0000
Received: (at 20339) by debbugs.gnu.org; 20 Apr 2015 07:45:23 +0000
Received: from localhost ([127.0.0.1]:32934 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.80)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1Yk6OQ-00079q-5X
	for submit <at> debbugs.gnu.org; Mon, 20 Apr 2015 03:45:23 -0400
Received: from mail.tuxteam.de ([5.199.139.25]:51574 helo=tomasium.tuxteam.de)
 by debbugs.gnu.org with esmtp (Exim 4.80)
 (envelope-from <tomas@HIDDEN>) id 1Yk6ON-00079g-KW
 for 20339 <at> debbugs.gnu.org; Mon, 20 Apr 2015 03:45:21 -0400
Received: from tomas by tomasium.tuxteam.de with local (Exim 4.80)
 (envelope-from <tomas@HIDDEN>) id 1Yk6OL-0008LR-7s
 for 20339 <at> debbugs.gnu.org; Mon, 20 Apr 2015 09:45:17 +0200
Date: Mon, 20 Apr 2015 09:45:17 +0200
Message-ID: <20150420074517.GA31087@HIDDEN>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
 protocol="application/pgp-signature"; boundary="da4uJneut+ArUgXk"
Content-Disposition: inline
User-Agent: Mutt/1.5.21 (2010-09-15)
From: <tomas@HIDDEN>
X-Spam-Score: -0.0 (/)
X-Spam-Score: -0.0 (/)


--da4uJneut+ArUgXk
Content-Type: multipart/mixed; boundary="l76fUT7nc3MelDdI"
Content-Disposition: inline


--l76fUT7nc3MelDdI
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

Hi,

I've embellished my proposed patch a bit:

 - use values resp. call-with-values instead of passing around
   lists.

   This was one thing I didn't like about my first patch candidate:
   the namespace --> ns abbreviation lookup had two things to return,
   for one the abbreviation, and whether this abbreviation was "new"
   (for convenience in the form of a (namespace . abbreviation) pair).
   Instead of returning a list, now it returns multiple values.

 - patch is now against current stable instead of against "whatever
   Debian stable packages", i.e. against

   d680713 2015-04-03 16:35:54 +0200 Ludovic Courtès (stable-2.0) doc: Update libgc URL.

I'm still not sure whether this is the way to go (i.e. mixing the
abbreviation stuff into the serialization), or whether a pre-pass
(replacing namespaces by abbreviations and generating the namespace
declaration "attributes") would be the way to go.

Besides, I'd like to have some input on whether it'd be worth to
follow the usual convention and to put the namespace declarations
before regular attributes (forcing us to do two passes on a tag
node's attribute list). The generated XML looks pretty weird as
is now.

What I'd still like to introduce is a "mapping preference" as an
optional argument by the user, possibly per-node (like "I'd like
'http://www.w3.org/1999/xlink' to be abbreviated as 'xlink' or
something like that). Other XML serializers offer that. I envision
this as a function, the library would fall back to generate the
abbreviation whenever the function returns #f.
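
A rough sketch of how such a preference function with fallback could look.
Every name below (make-abbreviator, my-preference, abbreviate) is
hypothetical; none of this exists in (sxml simple) or in the attached
patch, and the generated names simply follow the ns1, ns2 scheme.

```scheme
;; Sketch only: all names here are hypothetical.
(define (make-abbreviator prefer)
  "Return a procedure mapping a namespace URI to an abbreviation.
PREFER is a user-supplied procedure; when it returns #f, a generated
abbreviation (ns1, ns2, ...) is used instead."
  (let ((counter 0))
    (lambda (ns)
      (or (prefer ns)
          (begin
            (set! counter (1+ counter))
            (string-append "ns" (number->string counter)))))))

;; A preference that only knows about XLink and declines everything else.
(define (my-preference ns)
  (and (string=? ns "http://www.w3.org/1999/xlink") "xlink"))

(define abbreviate (make-abbreviator my-preference))

(abbreviate "http://www.w3.org/1999/xlink")   ; the user's choice
(abbreviate "http://www.w3.org/2000/svg")     ; generated fallback
```

Keeping the counter inside the closure means each serialization could get
its own numbering, rather than sharing one global counter.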

The question on whether this patch (or whatever it evolves into)
has a chance of getting into Guile is still open: I'd have to
get my papers from the FSF in this case.

Inputs?

--l76fUT7nc3MelDdI
Content-Type: text/x-diff; charset=us-ascii
Content-Disposition: attachment; filename="abbreviate-and-declare-namespaces.patch"
Content-Transfer-Encoding: quoted-printable

diff --git a/module/sxml/simple.scm b/module/sxml/simple.scm
index 703ad91..86b0784 100644
--- a/module/sxml/simple.scm
+++ b/module/sxml/simple.scm
@@ -215,29 +215,38 @@ port."
          (elements (reverse (parser port '()))))
     `(*TOP* ,@elements)))
 
-(define check-name
-  (let ((*good-cache* (make-hash-table)))
-    (lambda (name)
-      (if (not (hashq-ref *good-cache* name))
-          (let* ((str (symbol->string name))
-                 (i (string-index str #\:))
-                 (head (or (and i (substring str 0 i)) str))
-                 (tail (and i (substring str (1+ i)))))
-            (and i (string-index (substring str (1+ i)) #\:)
-                 (error "Invalid QName: more than one colon" name))
-            (for-each
-             (lambda (s)
-               (and s
-                    (or (char-alphabetic? (string-ref s 0))
-                        (eq? (string-ref s 0) #\_)
-                        (error "Invalid name starting character" s name))
-                    (string-for-each
-                     (lambda (c)
-                       (or (char-alphabetic? c) (string-index "0123456789.-_" c)
-                           (error "Invalid name character" c s name)))
-                     s)))
-             (list head tail))
-            (hashq-set! *good-cache* name #t))))))
+(define (ns-lookup ns nsmap)
+  "Look up namespace ns in nsmap. Return its abbreviation or #f"
+  (assoc-ref nsmap ns))
+
+(define ns-abbr-new
+  (let ((*nscounter* 0))
+    (lambda ()
+      (set! *nscounter* (1+ *nscounter*))
+      (string-append "ns" (number->string *nscounter*)))))
+
+(define (ns-abbr name nsmap)
+  "Takes a QName, SXML style (i.e a symbol whose string value is either a
+clean local name or a colon-concatenated pair of namespace:name, and returns
+two values: the string  <nsabbrev>:<local-name> and either a pair (<namespace> .
+nsabbrev) whenever <namespace> wasn't in nsmap, or #f when it was"
+  ;; FIXME check for empty ns (e.g ":foo")
+  ;; check (worse!) for empty locname (e.g. "foo:")
+  (let* ((str (symbol->string name))
+         (i (string-rindex str #\:))
+         (ns (and i (substring str 0 i)))
+         (locname (or (and i (substring str (1+ i))) str)))
+    (if ns
+        (let ((nsabbr (ns-lookup ns nsmap)))
+          (if nsabbr
+              ;; known namespace:
+              (values (string-append nsabbr ":" locname) #f)
+              ;; unknown namespace
+              (let ((nsabbr (ns-abbr-new)))
+                (values (string-append nsabbr ":" locname)
+                      (cons ns nsabbr)))))
+        ;; empty namespace: clean local-name:
+        (values locname #f))))
 
 ;; The following two functions serialize tags and attributes. They are
 ;; being used in the node handlers for the post-order function, see
@@ -260,42 +269,58 @@ port."
      port))))
 
 (define (attribute->xml attr value port)
-  (check-name attr)
   (display attr port)
   (display "=\"" port)
   (attribute-value->xml value port)
   (display #\" port))
 
-(define (element->xml tag attrs body port)
-  (check-name tag)
-  (display #\< port)
-  (display tag port)
-  (if attrs
-      (let lp ((attrs attrs))
-        (if (pair? attrs)
-            (let ((attr (car attrs)))
+(define (element->xml tag attrs body port nsmap)
+  (let ((new-namespaces '()))
+    (call-with-values (lambda () (ns-abbr tag  nsmap))
+      (lambda (abname new-ns)
+        (when new-ns
+          (set! new-namespaces (cons new-ns new-namespaces)))
+        (display #\< port)
+        (display abname port)
+        (if attrs
+            (let lp ((attrs attrs))
+              (if (pair? attrs)
+                  (let ((attr (car attrs)))
+                    (display #\space port)
+                    (if (pair? attr)
+                        (call-with-values (lambda () (ns-abbr (car attr) nsmap))
+                          (lambda (abname new-ns)
+                            (when new-ns
+                              (set! new-namespaces (cons new-ns new-namespaces)))
+                            (attribute->xml abname (cdr attr) port)))
+                        (error "bad attribute" tag attr))
+                    (lp (cdr attrs)))
+                  (if (not (null? attrs))
+                      (error "bad attributes" tag attrs)))))
+        ;; Output namespace declarations
+        (let lp ((new-namespaces new-namespaces))
+          (unless (null? new-namespaces)
+            ;; remember: car is namespace, cdr is abbrev
+            (let ((ns (caar new-namespaces))
+                  (nsabbr (cdar new-namespaces)))
               (display #\space port)
-              (if (pair? attr)
-                  (attribute->xml (car attr) (cdr attr) port)
-                  (error "bad attribute" tag attr))
-              (lp (cdr attrs)))
-            (if (not (null? attrs))
-                (error "bad attributes" tag attrs)))))
-  (if (pair? body)
-      (begin
-        (display #\> port)
-        (let lp ((body body))
-          (cond
-           ((pair? body)
-            (sxml->xml (car body) port)
-            (lp (cdr body)))
-           ((null? body)
-            (display "</" port)
-            (display tag port)
-            (display ">" port))
-           (else
-            (error "bad element body" tag body)))))
-      (display " />" port)))
+              (attribute->xml (string-append "xmlns:" nsabbr) ns port))
+            (lp (cdr new-namespaces))))
+        (if (pair? body)
+            (begin
+              (display #\> port)
+              (let lp ((body body))
+                (cond
+                 ((pair? body)
+                  (sxml->xml (car body) port (append new-namespaces nsmap))
+                  (lp (cdr body)))
+                 ((null? body)
+                  (display "</" port)
+                  (display abname port)
+                  (display ">" port))
+                 (else
+                  (error "bad element body" tag body)))))
+            (display " />" port))))))
 
 ;; FIXME: ensure name is valid
 (define (entity->xml name port)
@@ -311,7 +336,8 @@ port."
   (display str port)
   (display "?>" port))
 
-(define* (sxml->xml tree #:optional (port (current-output-port)))
+(define* (sxml->xml tree #:optional (port (current-output-port))
+                    (nsmap '()))
   "Serialize the sxml tree @var{tree} as XML. The output will be written
 to the current output port, unless the optional argument @var{port} is
 present."
@@ -322,7 +348,7 @@ present."
         (let ((tag (car tree)))
           (case tag
             ((*TOP*)
-             (sxml->xml (cdr tree) port))
+             (sxml->xml (cdr tree) port nsmap))
             ((*ENTITY*)
              (if (and (list? (cdr tree)) (= (length (cdr tree)) 1))
                  (entity->xml (cadr tree) port)
@@ -336,9 +362,9 @@ present."
                     (attrs (and (pair? elems) (pair? (car elems))
                                 (eq? '@ (caar elems))
                                 (cdar elems))))
-               (element->xml tag attrs (if attrs (cdr elems) elems) port)))))
+               (element->xml tag attrs (if attrs (cdr elems) elems) port nsmap)))))
         ;; A nodelist.
-        (for-each (lambda (x) (sxml->xml x port)) tree)))
+        (for-each (lambda (x) (sxml->xml x port nsmap)) tree)))
    ((string? tree)
     (string->escaped-xml tree port))
    ((null? tree) *unspecified*)

--l76fUT7nc3MelDdI--

--da4uJneut+ArUgXk
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: Digital signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iEYEARECAAYFAlU0rowACgkQBcgs9XrR2kZRwACffTrZx5cCTIr7pMETu2kLbqvZ
H8kAnAq9DYpMgKjL7sRpox496i/QN7Dl
=Yxx8
-----END PGP SIGNATURE-----

--da4uJneut+ArUgXk--




Message sent to bug-guile@HIDDEN:


X-Loop: help-debbugs@HIDDEN
Subject: bug#20339: sxml simple: sxml->xml mishandles namespaces?
Resent-From: Ricardo Wurmus <rekado@HIDDEN>
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-guile@HIDDEN
Resent-Date: Tue, 21 Apr 2015 09:25:02 +0000
Resent-Message-ID: <handler.20339.B20339.142960825926865 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: followup 20339
X-GNU-PR-Package: guile
X-GNU-PR-Keywords: 
To: tomas@HIDDEN
Cc: 20339 <at> debbugs.gnu.org
Received: via spool by 20339-submit <at> debbugs.gnu.org id=B20339.142960825926865
          (code B ref 20339); Tue, 21 Apr 2015 09:25:02 +0000
Received: (at 20339) by debbugs.gnu.org; 21 Apr 2015 09:24:19 +0000
Received: from localhost ([127.0.0.1]:34180 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.80)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1YkUPi-0006zF-S1
	for submit <at> debbugs.gnu.org; Tue, 21 Apr 2015 05:24:19 -0400
Received: from sender1.zohomail.com ([74.201.84.162]:52621)
 by debbugs.gnu.org with esmtp (Exim 4.80)
 (envelope-from <rekado@HIDDEN>) id 1YkUPe-0006z2-Or
 for 20339 <at> debbugs.gnu.org; Tue, 21 Apr 2015 05:24:17 -0400
Received: from localhost (141.80.115.59 [141.80.115.59]) by mx.zohomail.com
 with SMTPS id 1429608247649710.8344961909249;
 Tue, 21 Apr 2015 02:24:07 -0700 (PDT)
References: <20150415194714.GA30295@HIDDEN>
From: Ricardo Wurmus <rekado@HIDDEN>
In-reply-to: <20150415194714.GA30295@HIDDEN>
Date: Tue, 21 Apr 2015 11:24:03 +0200
Message-ID: <87oamh25sc.fsf@HIDDEN>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Spam-Score: 1.0 (+)
X-Spam-Score: 1.0 (+)

Hi Tomás,

tomas@HIDDEN writes:

> When transforming SXML to XML, namespaces don't seem to be handled
> properly:
>
[...]
>
> The problem is that SXML used the concatenated (full) namespace with the
> name as tag (and attribute) names for namespaced items. When serializing
> to XML it should try to find abbreviations for those namespaces and issue
> the corresponding namespace declarations.
>
> Instead, sxml->xml tries to split the (namespace:name) combination
> at the first colon and to check the name -- and fails miserably at
> (namespace:name) combinations à la "http://www.w3.org/1999/xlink:href"
> (procedure check-name). Since there are two colons, the name part
> has now a colon.

xml->sxml has an optional #:namespaces argument, where you can pass an
alist of keys to URLs to be used in the sxml output:

   (let* ((ns '((svg . "http://www.w3.org/2000/svg")
                (xlink . "http://www.w3.org/1999/xlink")))
          (the-sxml (xml->sxml the-svg #:namespaces ns)))
     (display the-sxml))

=> (*TOP*
     (svg:svg
       (svg:rect (@ (y 5)
                    (x 5)
                    (width 20)
                    (stroke-width 2)
                    (stroke purple)
                    (id rect1)
                    (height 20)
                    (fill yellow)))
       (svg:rect (@ (xlink:href #rect1)
                    (y 5)
                    (x 30)
                    (width 20)
                    (stroke-width 2)
                    (stroke purple)
                    (ry 5)
                    (rx 8)
                    (height 20)
                    (fill blue)))))

Passing this to sxml->xml yields:

  <svg:svg>
    <svg:rect y="5" x="5"
              width="20"
              stroke-width="2"
              stroke="purple"
              id="rect1"
              height="20"
              fill="yellow" />
    <svg:rect xlink:href="#rect1"
              y="5" x="30"
              width="20"
              stroke-width="2"
              stroke="purple"
              ry="5" rx="8"
              height="20"
              fill="blue" />
  </svg:svg>

Unfortunately, sxml->xml will not replace the namespace abbreviations,
nor will it add appropriate xmlns attributes, so "svg" and "xlink" are
devoid of any meaning.

Since xml->sxml accepts a namespace alist I suppose it would make sense
to extend sxml->xml to do the same.
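
Until sxml->xml does this itself, one possible workaround is a small
pre-pass on the SXML that adds the xmlns declarations to the root element
before serializing. This is only a sketch under that assumption; the
helpers namespaces->attributes and declare-namespaces are hypothetical and
not part of (sxml simple).

```scheme
;; Sketch only: both helpers are hypothetical, not part of (sxml simple).

;; Turn ((svg . "http://...") ...) into ((xmlns:svg "http://...") ...).
(define (namespaces->attributes namespaces)
  (map (lambda (entry)
         (list (string->symbol
                (string-append "xmlns:" (symbol->string (car entry))))
               (cdr entry)))
       namespaces))

;; ELEMENT is (tag (@ attr ...) child ...) or (tag child ...).
;; Return the same element with the namespace declarations prepended
;; to its attribute list.
(define (declare-namespaces element namespaces)
  (let* ((tag (car element))
         (rest (cdr element))
         (has-attrs? (and (pair? rest)
                          (pair? (car rest))
                          (eq? '@ (caar rest))))
         (attrs (if has-attrs? (cdar rest) '()))
         (body (if has-attrs? (cdr rest) rest)))
    `(,tag (@ ,@(namespaces->attributes namespaces) ,@attrs) ,@body)))

(define ns '((svg . "http://www.w3.org/2000/svg")
             (xlink . "http://www.w3.org/1999/xlink")))

(write (declare-namespaces '(svg:svg (svg:rect (@ (x "5")))) ns))
(newline)
```

The result still uses the short prefixes as element and attribute names, so
it should pass through the existing name check in sxml->xml unchanged.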

~~ Ricardo





Message sent to bug-guile@HIDDEN:


X-Loop: help-debbugs@HIDDEN
Subject: bug#20339: sxml simple: sxml->xml mishandles namespaces?
Resent-From: tomas@HIDDEN
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-guile@HIDDEN
Resent-Date: Tue, 21 Apr 2015 09:46:02 +0000
Resent-Message-ID: <handler.20339.B20339.142960952028915 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: followup 20339
X-GNU-PR-Package: guile
X-GNU-PR-Keywords: 
To: Ricardo Wurmus <rekado@HIDDEN>
Cc: tomas@HIDDEN, 20339 <at> debbugs.gnu.org
Received: via spool by 20339-submit <at> debbugs.gnu.org id=B20339.142960952028915
          (code B ref 20339); Tue, 21 Apr 2015 09:46:02 +0000
Received: (at 20339) by debbugs.gnu.org; 21 Apr 2015 09:45:20 +0000
Received: from localhost ([127.0.0.1]:34266 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.80)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1YkUk3-0007WF-Ts
	for submit <at> debbugs.gnu.org; Tue, 21 Apr 2015 05:45:20 -0400
Received: from mail.tuxteam.de ([5.199.139.25]:54735 helo=tomasium.tuxteam.de)
 by debbugs.gnu.org with esmtp (Exim 4.80)
 (envelope-from <tomas@HIDDEN>) id 1YkUjQ-0007Uo-TH
 for 20339 <at> debbugs.gnu.org; Tue, 21 Apr 2015 05:44:41 -0400
Received: from tomas by tomasium.tuxteam.de with local (Exim 4.80)
 (envelope-from <tomas@HIDDEN>)
 id 1YkUjO-000623-OP; Tue, 21 Apr 2015 11:44:38 +0200
Date: Tue, 21 Apr 2015 11:44:38 +0200
From: tomas@HIDDEN
Message-ID: <20150421094438.GA22715@HIDDEN>
References: <20150415194714.GA30295@HIDDEN>
 <87oamh25sc.fsf@HIDDEN>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; x-action=pgp-signed
Content-Transfer-Encoding: 8bit
In-Reply-To: <87oamh25sc.fsf@HIDDEN>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Spam-Score: -0.0 (/)
X-Spam-Score: -0.0 (/)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Tue, Apr 21, 2015 at 11:24:03AM +0200, Ricardo Wurmus wrote:
> Hi Tomás,
> 
> tomas@HIDDEN writes:
> 
> > When transforming SXML to XML, namespaces don't seem to be handled
> > properly:
> >
> [...]
> >
> > The problem is that SXML used the concatenated (full) namespace with the
> > name as tag (and attribute) names for namespaced items. When serializing
> > to XML it should try to find abbreviations for those namespaces and issue
> > the corresponding namespace declarations.
> >
> > Instead, sxml->xml tries to split the (namespace:name) combination
> > at the first colon and to check the name -- and fails miserably at
> > (namespace:name) combinations à la "http://www.w3.org/1999/xlink:href"
> > (procedure check-name). Since there are two colons, the name part
> > has now a colon.
> 
> xml->sxml has an optional #:namespaces argument, where you can pass an
> alist of keys to URLs to be used in the sxml output:

Aha. Didn't know about this one, thanks. Yes, the problem is that SXML
loses the link to the "real" namespaces: the application around it has
to keep track of that.

> Passing this to sxml->xml yields:
> 
>   <svg:svg>
>     <svg:rect y="5" x="5"
>               width="20"
>               stroke-width="2"
>               stroke="purple"
>               id="rect1"
>               height="20"
>               fill="yellow" />
>     <svg:rect xlink:href="#rect1"
>               y="5" x="30"
>               width="20"
>               stroke-width="2"
>               stroke="purple"
>               ry="5" rx="8"
>               height="20"
>               fill="blue" />
>   </svg:svg>

Yes, this looks "nearly" right, except...

> Unfortunately, sxml->xml will not replace the namespace abbreviations,
> nor will it add appropriate xmlns attributes, so "svg" and "xlink" are
> devoid of any meaning.

exactly.

> Since xml->sxml accepts a namespace alist I suppose it would make sense
> to extend sxml->xml to do the same.

This is more or less what I do in my proposed patch (it's in the bugs
mailing list as 20339 <at> debbugs.gnu.org). It passes around an alist of
(namespace . abbrev) associations (it's inverted wrt #:namespaces in
xml->sxml). Only that the abbreviations are "generated" as ns1, ns2
and so on (and the namespace declarations are woven into the attributes
list).

So far no reply to my bug report, but this gives me the chance to
bikeshed my patch to death :-P

Thanks for looking into that -- and for prodding me into looking at
more sources :)

Regards
- -- t
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iEYEARECAAYFAlU2HAYACgkQBcgs9XrR2kYq+gCfexhJ5qFyN4QmIf4TfddPqyfT
434An3BSVKtyovRJdg8MGHzAY8I0/NTD
=O9Kj
-----END PGP SIGNATURE-----




Message sent to bug-guile@HIDDEN:


X-Loop: help-debbugs@HIDDEN
Subject: bug#20339: sxml simple: sxml->xml mishandles namespaces?
Resent-From: Ricardo Wurmus <rekado@HIDDEN>
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-guile@HIDDEN
Resent-Date: Wed, 22 Apr 2015 14:31:02 +0000
Resent-Message-ID: <handler.20339.B20339.14297130154707 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: followup 20339
X-GNU-PR-Package: guile
X-GNU-PR-Keywords: 
To: tomas@HIDDEN
Cc: 20339 <at> debbugs.gnu.org
Received: via spool by 20339-submit <at> debbugs.gnu.org id=B20339.14297130154707
          (code B ref 20339); Wed, 22 Apr 2015 14:31:02 +0000
Received: (at 20339) by debbugs.gnu.org; 22 Apr 2015 14:30:15 +0000
Received: from localhost ([127.0.0.1]:36626 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.80)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1YkvfK-0001Dp-DS
	for submit <at> debbugs.gnu.org; Wed, 22 Apr 2015 10:30:15 -0400
Received: from sender1.zohomail.com ([74.201.84.162]:52364)
 by debbugs.gnu.org with esmtp (Exim 4.80)
 (envelope-from <rekado@HIDDEN>) id 1YkvfF-0001Dd-DZ
 for 20339 <at> debbugs.gnu.org; Wed, 22 Apr 2015 10:30:11 -0400
Received: from localhost (89.15.238.113 [89.15.238.113]) by mx.zohomail.com
 with SMTPS id 1429712981639612.5748451576998;
 Wed, 22 Apr 2015 07:29:41 -0700 (PDT)
References: <20150415194714.GA30295@HIDDEN>
 <87oamh25sc.fsf@HIDDEN> <20150421094438.GA22715@HIDDEN>
From: Ricardo Wurmus <rekado@HIDDEN>
In-reply-to: <20150421094438.GA22715@HIDDEN>
Date: Wed, 22 Apr 2015 16:29:32 +0200
Message-ID: <87fv7s1bjn.fsf@HIDDEN>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
X-Zoho-Virus-Status: 1
X-Spam-Score: 1.0 (+)
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: 1.0 (+)

--=-=-=
Content-Type: text/plain

>> Since xml->sxml accepts a namespace alist I suppose it would make sense
>> to extend sxml->xml to do the same.

Attached is a minimal patch to extend "sxml->xml" such that it accepts an
optional keyword argument "namespaces" with an alist of prefixes to
URLs, analogous to "xml->sxml".

When the namespaces alist is provided, "xmlns:prefix=url" attributes are
prepended to the element's list of attributes.


    ;; Define SVG document with namespaces
    (define the-svg "<svg xmlns='http://www.w3.org/2000/svg'
       xmlns:xlink='http://www.w3.org/1999/xlink'>
    <rect x='5' y='5' width='20' height='20'
          stroke-width='2' stroke='purple' fill='yellow'
          id='rect1' />
    <rect x='30' y='5' width='20' height='20'
          ry='5' rx='8' stroke-width='2' stroke='purple' fill='blue'
          xlink:href='#rect1' />
    </svg>")

    ;; Define alist of namespaces
    (define ns '((svg . "http://www.w3.org/2000/svg")
                 (xlink . "http://www.w3.org/1999/xlink")))

    ;; Convert to SXML, abbreviate namespaces according to ns alist
    (define the-sxml (xml->sxml the-svg #:namespaces ns))

    ;; Convert back to XML
    (sxml->xml the-sxml #:namespaces ns)

    => <svg:svg xmlns:svg="http://www.w3.org/2000/svg"
                xmlns:xlink="http://www.w3.org/1999/xlink">
         <svg:rect y="5" x="5"
                   width="20"
                   stroke-width="2"
                   stroke="purple"
                   id="rect1"
                   height="20"
                   fill="yellow" />
         <svg:rect xlink:href="#rect1"
                   y="5" x="30"
                   width="20"
                   stroke-width="2"
                   stroke="purple"
                   ry="5" rx="8"
                   height="20"
                   fill="blue" />
       </svg:svg>

Does this do what you want?

~~ Ricardo


--=-=-=
Content-Type: text/x-patch
Content-Disposition: inline;
 filename=0001-Write-XML-namespaces-when-serializing.patch

From 81fa92ad0c5537c41419fa1e55c6130bf0558c9f Mon Sep 17 00:00:00 2001
From: rekado <rekado@HIDDEN>
Date: Wed, 22 Apr 2015 13:09:27 +0200
Subject: [PATCH] Write XML namespaces when serializing.

* module/sxml/simple.scm (sxml->xml): Add optional keyword argument
  "namespaces".
---
 module/sxml/simple.scm | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/module/sxml/simple.scm b/module/sxml/simple.scm
index 703ad91..8cc20dd 100644
--- a/module/sxml/simple.scm
+++ b/module/sxml/simple.scm
@@ -311,7 +311,8 @@ port."
   (display str port)
   (display "?>" port))
 
-(define* (sxml->xml tree #:optional (port (current-output-port)))
+(define* (sxml->xml tree #:optional (port (current-output-port)) #:key
+                    (namespaces '()))
   "Serialize the sxml tree @var{tree} as XML. The output will be written
 to the current output port, unless the optional argument @var{port} is
 present."
@@ -322,7 +323,7 @@ present."
         (let ((tag (car tree)))
           (case tag
             ((*TOP*)
-             (sxml->xml (cdr tree) port))
+             (sxml->xml (cdr tree) port #:namespaces namespaces))
             ((*ENTITY*)
              (if (and (list? (cdr tree)) (= (length (cdr tree)) 1))
                  (entity->xml (cadr tree) port)
@@ -335,10 +336,16 @@ present."
              (let* ((elems (cdr tree))
                     (attrs (and (pair? elems) (pair? (car elems))
                                 (eq? '@ (caar elems))
-                                (cdar elems))))
-               (element->xml tag attrs (if attrs (cdr elems) elems) port)))))
+                                (cdar elems)))
+                    (xmlns (map (lambda (x)
+                                  (cons (symbol-append 'xmlns: (car x))
+                                        (cdr x)))
+                                namespaces)))
+               (element->xml tag
+                             (if attrs (append xmlns attrs) xmlns)
+                             (if attrs (cdr elems) elems) port)))))
         ;; A nodelist.
-        (for-each (lambda (x) (sxml->xml x port)) tree)))
+        (for-each (lambda (x) (sxml->xml x port #:namespaces namespaces)) tree)))
    ((string? tree)
     (string->escaped-xml tree port))
    ((null? tree) *unspecified*)
-- 
2.1.0


--=-=-=--





Message sent to bug-guile@HIDDEN:


X-Loop: help-debbugs@HIDDEN
Subject: bug#20339: sxml simple: sxml->xml mishandles namespaces?
Resent-From: tomas@HIDDEN
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-guile@HIDDEN
Resent-Date: Thu, 23 Apr 2015 06:58:01 +0000
Resent-Message-ID: <handler.20339.B20339.14297722401723 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: followup 20339
X-GNU-PR-Package: guile
X-GNU-PR-Keywords: 
To: Ricardo Wurmus <rekado@HIDDEN>
Cc: 20339 <at> debbugs.gnu.org
Received: via spool by 20339-submit <at> debbugs.gnu.org id=B20339.14297722401723
          (code B ref 20339); Thu, 23 Apr 2015 06:58:01 +0000
Received: (at 20339) by debbugs.gnu.org; 23 Apr 2015 06:57:20 +0000
Received: from localhost ([127.0.0.1]:37024 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.80)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1YlB4a-0000Rj-4q
	for submit <at> debbugs.gnu.org; Thu, 23 Apr 2015 02:57:20 -0400
Received: from mail.tuxteam.de ([5.199.139.25]:60241 helo=tomasium.tuxteam.de)
 by debbugs.gnu.org with esmtp (Exim 4.80)
 (envelope-from <tomas@HIDDEN>) id 1YlB4X-0000RT-G8
 for 20339 <at> debbugs.gnu.org; Thu, 23 Apr 2015 02:57:18 -0400
Received: from tomas by tomasium.tuxteam.de with local (Exim 4.80)
 (envelope-from <tomas@HIDDEN>)
 id 1YlB4V-0005BL-4W; Thu, 23 Apr 2015 08:57:15 +0200
Date: Thu, 23 Apr 2015 08:57:14 +0200
From: tomas@HIDDEN
Message-ID: <20150423065714.GB19410@HIDDEN>
References: <20150415194714.GA30295@HIDDEN>
 <87oamh25sc.fsf@HIDDEN>
 <20150421094438.GA22715@HIDDEN>
 <87fv7s1bjn.fsf@HIDDEN>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; x-action=pgp-signed
Content-Transfer-Encoding: 8bit
In-Reply-To: <87fv7s1bjn.fsf@HIDDEN>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Spam-Score: -0.0 (/)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Wed, Apr 22, 2015 at 04:29:32PM +0200, Ricardo Wurmus wrote:
> >> Since xml->sxml accepts a namespace alist I suppose it would make sense
> >> to extend sxml->xml to do the same.
> 
> Attached is a minimal patch to extend "sxml->xml" such that it accepts an
> optional keyword argument "namespaces" with an alist of prefixes to
> URLs, analogous to "xml->sxml".

Thanks, I'll have a look at this this afternoon.

Your code is far prettier than mine, that's for sure :-)

What's yet missing (as far as I can read off the diff) is a way to
"dream up" an abbreviation when it's not in the namespaces alist.

Thanks again and regards
- -- tomás
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iEYEARECAAYFAlU4l8oACgkQBcgs9XrR2kb7SwCeNO0Z+RJZy6VUeQotm3+qX5rd
nXMAn2QeowgVnEj+9Zh3gMIBZW99Y3bx
=BrEt
-----END PGP SIGNATURE-----




Message sent to bug-guile@HIDDEN:


X-Loop: help-debbugs@HIDDEN
Subject: bug#20339: sxml simple: sxml->xml mishandles namespaces?
Resent-From: Ricardo Wurmus <rekado@HIDDEN>
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-guile@HIDDEN
Resent-Date: Thu, 23 Apr 2015 07:06:02 +0000
Resent-Message-ID: <handler.20339.B20339.14297727103193 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: followup 20339
X-GNU-PR-Package: guile
X-GNU-PR-Keywords: 
To: tomas@HIDDEN
Cc: 20339 <at> debbugs.gnu.org
Received: via spool by 20339-submit <at> debbugs.gnu.org id=B20339.14297727103193
          (code B ref 20339); Thu, 23 Apr 2015 07:06:02 +0000
Received: (at 20339) by debbugs.gnu.org; 23 Apr 2015 07:05:10 +0000
Received: from localhost ([127.0.0.1]:37029 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.80)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1YlBC9-0000pL-39
	for submit <at> debbugs.gnu.org; Thu, 23 Apr 2015 03:05:09 -0400
Received: from sender1.zohomail.com ([74.201.84.162]:53632)
 by debbugs.gnu.org with esmtp (Exim 4.80)
 (envelope-from <rekado@HIDDEN>) id 1YlBC5-0000p4-OK
 for 20339 <at> debbugs.gnu.org; Thu, 23 Apr 2015 03:05:07 -0400
Received: from localhost (xd933f8e5.dyn.telefonica.de [217.51.248.229]) by
 mx.zohomail.com with SMTPS id 1429772699616412.097398393296;
 Thu, 23 Apr 2015 00:04:59 -0700 (PDT)
References: <20150415194714.GA30295@HIDDEN>
 <87oamh25sc.fsf@HIDDEN> <20150421094438.GA22715@HIDDEN>
 <87fv7s1bjn.fsf@HIDDEN> <20150423065714.GB19410@HIDDEN>
From: Ricardo Wurmus <rekado@HIDDEN>
In-reply-to: <20150423065714.GB19410@HIDDEN>
Date: Thu, 23 Apr 2015 09:04:46 +0200
Message-ID: <878udj1g1d.fsf@HIDDEN>
MIME-Version: 1.0
Content-Type: text/plain
X-Spam-Score: 1.0 (+)


tomas@HIDDEN writes:

> What's yet missing (as far as I can read off the diff) is a way to
> "dream up" an abbreviation when it's not in the namespaces alist.

True.

Ideally, this should work even without passing a namespaces alist at all
in both "xml->sxml" and "sxml->xml".  The non-abbreviated namespaces
should not cause "sxml->xml" to fail.

Passing around a namespaces alist to both these procedures is the least
invasive approach I could think of, but I still think that it *should*
be made to work without explicitly declaring namespaces.





Message sent to bug-guile@HIDDEN:


X-Loop: help-debbugs@HIDDEN
Subject: bug#20339: sxml simple: sxml->xml mishandles namespaces?
Resent-From: tomas@HIDDEN
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-guile@HIDDEN
Resent-Date: Thu, 23 Apr 2015 07:41:03 +0000
Resent-Message-ID: <handler.20339.B20339.14297748409886 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: followup 20339
X-GNU-PR-Package: guile
X-GNU-PR-Keywords: 
To: Ricardo Wurmus <rekado@HIDDEN>
Cc: tomas@HIDDEN, 20339 <at> debbugs.gnu.org
Received: via spool by 20339-submit <at> debbugs.gnu.org id=B20339.14297748409886
          (code B ref 20339); Thu, 23 Apr 2015 07:41:03 +0000
Received: (at 20339) by debbugs.gnu.org; 23 Apr 2015 07:40:40 +0000
Received: from localhost ([127.0.0.1]:37069 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.80)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1YlBkW-0002ZO-Fr
	for submit <at> debbugs.gnu.org; Thu, 23 Apr 2015 03:40:40 -0400
Received: from mail.tuxteam.de ([5.199.139.25]:60331 helo=tomasium.tuxteam.de)
 by debbugs.gnu.org with esmtp (Exim 4.80)
 (envelope-from <tomas@HIDDEN>) id 1YlBkT-0002Z9-5X
 for 20339 <at> debbugs.gnu.org; Thu, 23 Apr 2015 03:40:38 -0400
Received: from tomas by tomasium.tuxteam.de with local (Exim 4.80)
 (envelope-from <tomas@HIDDEN>)
 id 1YlBkQ-0005Va-JE; Thu, 23 Apr 2015 09:40:34 +0200
Date: Thu, 23 Apr 2015 09:40:34 +0200
From: tomas@HIDDEN
Message-ID: <20150423074034.GA20961@HIDDEN>
References: <20150415194714.GA30295@HIDDEN>
 <87oamh25sc.fsf@HIDDEN>
 <20150421094438.GA22715@HIDDEN>
 <87fv7s1bjn.fsf@HIDDEN>
 <20150423065714.GB19410@HIDDEN>
 <878udj1g1d.fsf@HIDDEN>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; x-action=pgp-signed
Content-Transfer-Encoding: 8bit
In-Reply-To: <878udj1g1d.fsf@HIDDEN>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Spam-Score: -0.0 (/)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Thu, Apr 23, 2015 at 09:04:46AM +0200, Ricardo Wurmus wrote:
> 
> tomas@HIDDEN writes:
> 
> > What's yet missing (as far as I can read off the diff) is a way to
> > "dream up" an abbreviation when it's not in the namespaces alist.
> 
> True.
> 
> Ideally, this should work even without passing a namespaces alist at all
> in both "xml->sxml" and "sxml->xml".  The non-abbreviated namespaces
> should not cause "sxml->xml" to fail.
> 
> Passing around a namespaces alist to both these procedures is the least
> invasive approach I could think of, but I still think that it *should*
> be made to work without explicitly declaring namespaces.

I think a combination of our approaches could work: the only difference
(apart from code elegance) is that my patch grows this alist on its
way down the tree as it encounters new namespaces. This meshes well with
namespace declarations, which are scoped recursively down the XML tree.
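
Roughly, as a sketch only: this is not the actual patch, and the name
lookup-or-invent-prefix is made up for illustration. It assumes the alist
maps namespace URIs to prefix strings and that the "ns1, ns2, ..." naming
is acceptable. Growing the alist while descending could look like this:

```scheme
;; Hypothetical helper, not part of (sxml simple).
;; KNOWN is an alist of (uri . prefix) pairs collected so far.
;; Returns two values: the prefix for URI and the (possibly grown) alist.
(define (lookup-or-invent-prefix uri known)
  (let ((hit (assoc uri known)))
    (if hit
        ;; Already declared somewhere above us: reuse the prefix.
        (values (cdr hit) known)
        ;; New namespace: dream up "nsN" and remember it for the subtree.
        (let ((prefix (string-append
                       "ns" (number->string (+ 1 (length known))))))
          (values prefix (acons uri prefix known))))))

;; Usage: the grown alist is what gets passed down to the children.
(call-with-values
    (lambda ()
      (lookup-or-invent-prefix "http://www.w3.org/2000/svg" '()))
  (lambda (prefix known)
    (display prefix)
    (newline)))
```

The sketch does not check for clashes with prefixes the user supplied, so
a real version would need a more careful name generator.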

This afternoon, while I sit at the e-Lok waiting for the FSFE meeting,
will be a very good moment for me to look into it. I'll report tonight :-)

Thanks & later (dayjob calling)
- -- tomás
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iEYEARECAAYFAlU4ofIACgkQBcgs9XrR2kaFNwCfWzPunxHiiDJIJean02rx7pMT
92IAn2IGYW01Cx7aJt32MLRDQYuY9FbP
=owfk
-----END PGP SIGNATURE-----




Message sent to bug-guile@HIDDEN:


X-Loop: help-debbugs@HIDDEN
Subject: bug#20339: sxml simple: sxml->xml mishandles namespaces?
Resent-From: tomas@HIDDEN
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-guile@HIDDEN
Resent-Date: Sat, 25 Apr 2015 20:26:01 +0000
Resent-Message-ID: <handler.20339.B20339.142999351510435 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: followup 20339
X-GNU-PR-Package: guile
X-GNU-PR-Keywords: 
To: Ricardo Wurmus <rekado@HIDDEN>
Cc: tomas@HIDDEN, 20339 <at> debbugs.gnu.org
Received: via spool by 20339-submit <at> debbugs.gnu.org id=B20339.142999351510435
          (code B ref 20339); Sat, 25 Apr 2015 20:26:01 +0000
Received: (at 20339) by debbugs.gnu.org; 25 Apr 2015 20:25:15 +0000
Received: from localhost ([127.0.0.1]:39987 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.80)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1Ym6dW-0002iE-8w
	for submit <at> debbugs.gnu.org; Sat, 25 Apr 2015 16:25:14 -0400
Received: from mail.tuxteam.de ([5.199.139.25]:39513 helo=tomasium.tuxteam.de)
 by debbugs.gnu.org with esmtp (Exim 4.80)
 (envelope-from <tomas@HIDDEN>) id 1Ym6dT-0002i3-If
 for 20339 <at> debbugs.gnu.org; Sat, 25 Apr 2015 16:25:12 -0400
Received: from tomas by tomasium.tuxteam.de with local (Exim 4.80)
 (envelope-from <tomas@HIDDEN>)
 id 1Ym6dR-0001BN-An; Sat, 25 Apr 2015 22:25:09 +0200
Date: Sat, 25 Apr 2015 22:25:09 +0200
From: tomas@HIDDEN
Message-ID: <20150425202509.GA3544@HIDDEN>
References: <20150415194714.GA30295@HIDDEN>
 <87oamh25sc.fsf@HIDDEN>
 <20150421094438.GA22715@HIDDEN>
 <87fv7s1bjn.fsf@HIDDEN>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; x-action=pgp-signed
Content-Transfer-Encoding: 8bit
In-Reply-To: <87fv7s1bjn.fsf@HIDDEN>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Spam-Score: -0.0 (/)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Wed, Apr 22, 2015 at 04:29:32PM +0200, Ricardo Wurmus wrote:
> >> Since xml->sxml accepts a namespace alist I suppose it would make sense
> >> to extend sxml->xml to do the same.
> 
> Attached is a minimal patch to extend "sxml->xml" such that it accepts an
> optional keyword argument "namespaces" with an alist of prefixes to
> URLs, analogous to "xml->sxml".

Thank you again for the patch. I applied it against 2.0.11, and can confirm
that it works as advertised :-)

I didn't see that xml->sxml has an optional parameter #:namespaces --
to be honest, I didn't expect it there.

So if one knows beforehand what namespaces are used in the XML in question,
it's possible to use the pair xml->sxml and sxml->xml this way (with your
patch, of course, because otherwise sxml->xml "forgets" to output the
relevant XML namespace declarations).

Reading Oleg Kiselyov's paper[1] again, I understand that SXML can, like
XML, have namespace abbreviations (called user-ns-shortcut there). It's not
exactly the same thing, but somehow isomorphic. One might use the XML's
abbreviations in the SXML representation, of course.

The problem with this approach is that you have to carry the
namespace associations "out-of-band", and that you have to know which
namespaces to expect before parsing the XML.

A (more cosmetic) problem is that all namespace declarations are "moved"
to the top-level, because the SXML keeps no "memory" of which node the
namespace declarations were attached to in the original XML.

In [1], there is a mechanism for stashing namespace mappings in the
"attributes list" (strictly in the annotations, which are optionally
tacked to the tail of the attributes list), under the tag *NAMESPACES*.

Anyway -- what would be a good way forward here?

I could imagine taking note of the namespace abbreviations in the
*NAMESPACES* list (while xml->sxml) and issuing the corresponding
declarations in sxml->xml.
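
To make that concrete, here is a minimal sketch under my reading of [1].
The element layout in the comment is an assumption, and the procedure
names element-namespaces and namespaces->xmlns-attrs are invented here;
nothing like them exists in (sxml simple) today:

```scheme
;; Assumed layout (my reading of the SXML paper):
;;   (tag (@ attr ... (@ (*NAMESPACES* (ns-id "uri") ...))) child ...)

;; Hypothetical: return the namespace associations stashed on ELEM,
;; or the empty list if there are none.
(define (element-namespaces elem)
  (let* ((attrs (and (pair? (cdr elem))
                     (pair? (cadr elem))
                     (eq? '@ (caadr elem))
                     (cdadr elem)))
         ;; The annotations are the nested (@ ...) inside the attributes.
         (annots (and attrs (assq '@ attrs)))
         (nss (and annots (assq '*NAMESPACES* (cdr annots)))))
    (if nss (cdr nss) '())))

;; Hypothetical: turn ((svg "uri") ...) into ((xmlns:svg "uri") ...),
;; i.e. the attributes sxml->xml would have to emit.
(define (namespaces->xmlns-attrs nss)
  (map (lambda (entry)
         (list (symbol-append 'xmlns: (car entry)) (cadr entry)))
       nss))

(define example
  '(svg:svg (@ (width "10")
               (@ (*NAMESPACES* (svg "http://www.w3.org/2000/svg"))))
            (svg:rect)))

(write (namespaces->xmlns-attrs (element-namespaces example)))
(newline)
```

Whether the current serializer copes with the nested (@ ...) inside the
attribute list is a separate question that the sketch does not answer.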

Makes sense?

Regards

[1] <http://okmij.org/ftp/papers/SXML-paper.pdf>

- -- tomás
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iEYEARECAAYFAlU7+CUACgkQBcgs9XrR2kaSxACfdljxbGyVNILgombB3jYWjeOq
1zwAn2RzIEHcJbJIlIMRkaEAIjNFcH7M
=MSYu
-----END PGP SIGNATURE-----




Message sent to bug-guile@HIDDEN:


X-Loop: help-debbugs@HIDDEN
Subject: bug#20339: sxml simple: sxml->xml mishandles namespaces?
Resent-From: tomas@HIDDEN
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-guile@HIDDEN
Resent-Date: Sun, 26 Apr 2015 10:29:01 +0000
Resent-Message-ID: <handler.20339.B20339.14300440971117 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: followup 20339
X-GNU-PR-Package: guile
X-GNU-PR-Keywords: 
To: Ricardo Wurmus <rekado@HIDDEN>
Cc: tomas@HIDDEN, 20339 <at> debbugs.gnu.org
Received: via spool by 20339-submit <at> debbugs.gnu.org id=B20339.14300440971117
          (code B ref 20339); Sun, 26 Apr 2015 10:29:01 +0000
Received: (at 20339) by debbugs.gnu.org; 26 Apr 2015 10:28:17 +0000
Received: from localhost ([127.0.0.1]:40171 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.80)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1YmJnM-0000Hx-SG
	for submit <at> debbugs.gnu.org; Sun, 26 Apr 2015 06:28:17 -0400
Received: from mail.tuxteam.de ([5.199.139.25]:41210 helo=tomasium.tuxteam.de)
 by debbugs.gnu.org with esmtp (Exim 4.80)
 (envelope-from <tomas@HIDDEN>) id 1YmJnK-0000Hm-65
 for 20339 <at> debbugs.gnu.org; Sun, 26 Apr 2015 06:28:15 -0400
Received: from tomas by tomasium.tuxteam.de with local (Exim 4.80)
 (envelope-from <tomas@HIDDEN>)
 id 1YmJnG-0001iD-Uq; Sun, 26 Apr 2015 12:28:10 +0200
Date: Sun, 26 Apr 2015 12:28:10 +0200
From: tomas@HIDDEN
Message-ID: <20150426102810.GB5922@HIDDEN>
References: <20150415194714.GA30295@HIDDEN>
 <87oamh25sc.fsf@HIDDEN>
 <20150421094438.GA22715@HIDDEN>
 <87fv7s1bjn.fsf@HIDDEN>
 <20150425202509.GA3544@HIDDEN>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; x-action=pgp-signed
Content-Transfer-Encoding: 8bit
In-Reply-To: <20150425202509.GA3544@HIDDEN>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Spam-Score: -0.0 (/)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sat, Apr 25, 2015 at 10:25:09PM +0200, tomas@HIDDEN wrote:

[...]

> Reading Oleg Kiselyov's paper[1] again, I understand that SXML can, like
> XML, have namespace abbreviations (called user-ns-shortcut there). It's not
> exactly the same thing, but somehow isomorphic. One might use the XML's
> abbreviations in the SXML representation, of course.

I take that back: as far as I understand the paper, the (SXML-side) shortcuts
are global to the document, whereas the (XML-side) abbreviations are
subtree-scoped (i.e. valid for the whole subtree of the element where the
declaration is attached). I don't know ATM whether shadowing is allowed, but
I'll look that up.

So there *is* a subtle difference between "user-ns-shortcut" (the one
you were manipulating with #:namespaces) and the XML "namespace abbreviation"
(the official jargon is "namespace prefix").
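
A small sketch of what subtree scoping means on the XML side, assuming
an inner declaration may rebind a prefix declared further up. The names
bind-prefix and resolve-prefix are made up for illustration:

```scheme
;; Hypothetical helpers. A scope is an alist of (prefix . uri) pairs;
;; the innermost binding comes first.

;; Entering an element that declares PREFIX pushes a binding in front;
;; the outer scope is left untouched, so siblings never see it.
(define (bind-prefix prefix uri scope)
  (acons prefix uri scope))

;; assq-ref returns the first match, i.e. the innermost declaration.
(define (resolve-prefix prefix scope)
  (assq-ref scope prefix))

(define outer (bind-prefix 'a "urn:outer" '()))
(define inner (bind-prefix 'a "urn:inner" outer))

(write (list (resolve-prefix 'a inner)    ; binding seen inside the subtree
             (resolve-prefix 'a outer)))  ; binding seen outside of it
(newline)
```

A document-global table such as the #:namespaces alist has only one
scope, which is where the two notions part ways.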

Regards

[1] <http://okmij.org/ftp/papers/SXML-paper.pdf>

- -- tomás
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iEYEARECAAYFAlU8vboACgkQBcgs9XrR2kadlACeI+p4W8N/dJ49cGBypYNEP/ta
l6MAn3exlNUpj6Z4cYG0Dcb1ltyuQQBB
=x74j
-----END PGP SIGNATURE-----




Message sent to bug-guile@HIDDEN:


X-Loop: help-debbugs@HIDDEN
Subject: bug#20339: sxml simple: sxml->xml mishandles namespaces?
Resent-From: Andy Wingo <wingo@HIDDEN>
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-guile@HIDDEN
Resent-Date: Thu, 23 Jun 2016 19:33:01 +0000
Resent-Message-ID: <handler.20339.B20339.146671034717357 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: followup 20339
X-GNU-PR-Package: guile
X-GNU-PR-Keywords: 
To: tomas@HIDDEN
Cc: 20339 <at> debbugs.gnu.org
Received: via spool by 20339-submit <at> debbugs.gnu.org id=B20339.146671034717357
          (code B ref 20339); Thu, 23 Jun 2016 19:33:01 +0000
Received: (at 20339) by debbugs.gnu.org; 23 Jun 2016 19:32:27 +0000
Received: from localhost ([127.0.0.1]:53006 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1bGAMV-0004Vs-Hz
	for submit <at> debbugs.gnu.org; Thu, 23 Jun 2016 15:32:27 -0400
Received: from pb-sasl2.pobox.com ([64.147.108.67]:65092
 helo=sasl.smtp.pobox.com) by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <wingo@HIDDEN>) id 1bGAMT-0004Vj-Kt
 for 20339 <at> debbugs.gnu.org; Thu, 23 Jun 2016 15:32:26 -0400
Received: from sasl.smtp.pobox.com (unknown [127.0.0.1])
 by pb-sasl2.pobox.com (Postfix) with ESMTP id 5109024111;
 Thu, 23 Jun 2016 15:32:25 -0400 (EDT)
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=pobox.com; h=from:to:cc
 :subject:references:date:in-reply-to:message-id:mime-version
 :content-type; s=sasl; bh=4YlpFEfUufNoTQ6bmCt/Ajn2Jyg=; b=jBffSu
 NLCIcbvErtcRfOefXEkdflyGHUhgaW1PykivVz/ioM0TRvkZ2uDG1PLvVNUNPgRB
 zrYL/xN9u3uOVrEMd/1zwJ0KkuL/5R98gFVFnNUHebl2kNk1O8SZCHwupiR0wxFP
 Vts7A6uRHRhPrDafCthkAHFwQHPFaIJcy+NNo=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=pobox.com; h=from:to:cc
 :subject:references:date:in-reply-to:message-id:mime-version
 :content-type; q=dns; s=sasl; b=ZUgkGbEjm90diz08WVOMrS1nD05ltdqP
 cPRnzxleFR4oEZT8givxzbATtsyyntguoBOYp8lM1B3p6gWxgzrzuJ3x4RmkrJMM
 X4ldMyYa75kHnffQe1sPPDMbZCwFEe7DQklA/DZ6M5aUkLNyYC6pcjF6bAGArtlk
 vEaQRO7AFr4=
Received: from pb-sasl2.nyi.icgroup.com (unknown [127.0.0.1])
 by pb-sasl2.pobox.com (Postfix) with ESMTP id 4A39424110;
 Thu, 23 Jun 2016 15:32:25 -0400 (EDT)
Received: from clucks (unknown [88.160.190.192])
 (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by pb-sasl2.pobox.com (Postfix) with ESMTPSA id 605DA2410F;
 Thu, 23 Jun 2016 15:32:24 -0400 (EDT)
From: Andy Wingo <wingo@HIDDEN>
References: <20150415194714.GA30295@HIDDEN>
Date: Thu, 23 Jun 2016 21:32:16 +0200
In-Reply-To: <20150415194714.GA30295@HIDDEN> (tomas@HIDDEN's message
 of "Wed, 15 Apr 2015 21:47:14 +0200")
Message-ID: <87y45vln0f.fsf@HIDDEN>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-Pobox-Relay-ID: 365DE7E4-3979-11E6-B008-28A6F1301B6D-02397024!pb-sasl2.pobox.com
X-Spam-Score: -1.4 (-)

See thread here as well:
http://thread.gmane.org/gmane.lisp.guile.devel/17709

I like Ricardo's patch but have some comments here:
http://article.gmane.org/gmane.lisp.guile.devel/18384

Andy




Message sent to bug-guile@HIDDEN:


X-Loop: help-debbugs@HIDDEN
Subject: bug#20339: sxml simple: sxml->xml mishandles namespaces?
Resent-From: tomas@HIDDEN
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-guile@HIDDEN
Resent-Date: Wed, 13 Jul 2016 13:25:01 +0000
Resent-Message-ID: <handler.20339.B20339.14684162575157 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: followup 20339
X-GNU-PR-Package: guile
X-GNU-PR-Keywords: 
To: 20339 <at> debbugs.gnu.org
Cc: Andy Wingo <wingo@HIDDEN>, Ricardo Wurmus <rekado@HIDDEN>
Received: via spool by 20339-submit <at> debbugs.gnu.org id=B20339.14684162575157
          (code B ref 20339); Wed, 13 Jul 2016 13:25:01 +0000
Received: (at 20339) by debbugs.gnu.org; 13 Jul 2016 13:24:17 +0000
Received: from localhost ([127.0.0.1]:49200 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1bNK97-0001L2-4j
	for submit <at> debbugs.gnu.org; Wed, 13 Jul 2016 09:24:17 -0400
Received: from mail.tuxteam.de ([5.199.139.25]:57950 helo=tomasium.tuxteam.de)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <tomas@HIDDEN>) id 1bNK91-0001Ko-AI
 for 20339 <at> debbugs.gnu.org; Wed, 13 Jul 2016 09:24:11 -0400
Received: from tomas by tomasium.tuxteam.de with local (Exim 4.80)
 (envelope-from <tomas@HIDDEN>)
 id 1bNK8y-00019w-FF; Wed, 13 Jul 2016 15:24:04 +0200
Date: Wed, 13 Jul 2016 15:24:03 +0200
From: tomas@HIDDEN
Message-ID: <20160713132403.GA2349@HIDDEN>
References: <20150415194714.GA30295@HIDDEN>
 <87y45vln0f.fsf@HIDDEN>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; x-action=pgp-signed
Content-Transfer-Encoding: 8bit
In-Reply-To: <87y45vln0f.fsf@HIDDEN>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Spam-Score: -1.3 (-)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Thu, Jun 23, 2016 at 09:32:16PM +0200, Andy Wingo wrote:
> See thread here as well:
> http://thread.gmane.org/gmane.lisp.guile.devel/17709
> 
> I like Ricardo's patch but have some comments here:
> http://article.gmane.org/gmane.lisp.guile.devel/18384

(sorry for cc'ing both of you, but I don't know whether you are
subscribed to the bug. Two copies seemed more polite than none).

Sorry folks for not coming back earlier. Real Life and things.

Since I'm going to be off the 'net for one month starting next Friday,
I thought I'd write a short note.

I'll be back the 15th of August and am really willing to do whatever
it takes to bring this forward. OTOH, if any of you decides to pick
it up, I'm sure the results will be better :-)

Referring to Oleg Kiselyov's paper [1], there are actually three
things involved:

 - the namespace. This is an XML thing and will typically be
   a URI (I don't quite remember whether it *must* be a
   URI, but that's irrelevant). It may contain nasty characters
   (to XML: it isn't an XML "Name", and potentially to Scheme:
   there may be parentheses and things in there, so some
   Schemes won't make a symbol of that; Guile doesn't mind).

 - the namespace prefix. Again, an XML thing, basically giving
   a non-nasty abbreviation for the namespace, to stick it to
   the Name, making a "QName". The association prefix -> namespace
   is scoped to a node and its descendants, and can be shadowed
   at some node below

 - the namespace-id, an SXML thing. In [1], this is typically
   the namespace, but Oleg Kiselyov made provisions in [1] for a
   similar "abbreviation" (the user-ns-shortcut in [1], page 3),
   whose mapping can be attached to any node via the
   pseudo-attribute *NAMESPACES* [2], which can also carry the
   original (XML) namespace prefix.

   As far as I understand the paper, most of the time this
   namespace-id will be identical to the URI, but it is this
   namespace-id that will be prefixed to the tag name symbols
   in the SXML representation.
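
To make the three notions concrete, here is a minimal sketch using
(sxml simple). The document string and the shortcut ns2 are made up
for illustration, and it assumes a Guile whose xml->sxml accepts the
#:namespaces keyword (an alist of shortcut . URI).

```scheme
(use-modules (sxml simple))

;; Illustrative document: the XML prefix "bar" abbreviates the
;; namespace "http://example.org/ns2" for <foo> and its descendants.
(define doc
  "<foo xmlns:bar=\"http://example.org/ns2\"><bar:baz/></foo>")

;; No shortcut given: the namespace-id is the namespace URI itself,
;; so the URI is what ends up prefixed to the tag symbol.
(define plain (xml->sxml doc))

;; With a user-chosen shortcut (ns2, hypothetical) as namespace-id:
(define short
  (xml->sxml doc #:namespaces '((ns2 . "http://example.org/ns2"))))
;; short => (*TOP* (foo (ns2:baz)))
```

Note that the original XML prefix "bar" appears in neither result;
only the namespace-id (URI or shortcut) survives in the symbols.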

What Ricardo's patch does is to conflate namespace prefix and
namespace-id and provide a mapping (namespace-id aka prefix) ->
namespace. This is actually quite elegant, since we don't need
the distinction between (XML) prefix and (SXML) namespace-id.

I think that we can, at least as far as (sxml simple) is concerned,
ignore this distinction.

What is missing? From my point of view:

 - At xml->sxml time, the user doesn't know which namespaces
   are in the xml. So it would be nice if the XML parser
   could provide that.

 - It would be super-nice if the XML parser could put that
   into the same nodes where it found it, as described in [1]
   (i.e. in the (*NAMESPACES* ...) pseudo-attribute).
   This way we wouldn't have a global mapping, but one
   that resembles the original XML, even with the same
   prefixes. Fewer surprises overall. The round trip
   xml -> sxml -> xml would be (nearly) the identity.

   With Ricardo's patch it would lump all the namespace
   declarations up in the top node, which formally is
   correct, but might scare XML people a bit :-)

 - At sxml->xml time there should be a way to somehow
   generate prefixes for "new" namespaces. I don't know
   at the moment how this would work, that depends on
   how the user is supposed to insert new nodes in the
   SXML. Does she specify the namespace? Both prefix
   (aka namespace-id, under my current assumption) *and*
   namespace? (note that the namespace-id/prefix alone
   wouldn't be sufficient).
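
The second point could look roughly like the sketch below. It is only
an illustration: the tree is hand-written (no parser produces it
today), *NAMESPACES* is treated as a plain pseudo-attribute as in the
text above (simplified relative to the grammar in [1]), and
node-namespaces and in-scope are hypothetical helper names.

```scheme
(use-modules (ice-9 match))

;; Hypothetical parser output: each declaration stays on the node
;; where the XML declared it.  Entries are (namespace-id uri prefix).
(define tree
  '(doc (@ (*NAMESPACES* (a "http://example.org/a" a)))
        (a:item (@ (id "1")
                   (*NAMESPACES* (b "http://example.org/b" b)))
                (b:note "hi"))))

;; Namespace associations declared directly on NODE, or '().
(define (node-namespaces node)
  (match node
    ((_ ('@ . attrs) . _)
     (match (assq '*NAMESPACES* attrs)
       ((_ . assocs) assocs)
       (#f '())))
    (_ '())))

;; Scoping: declarations on an inner node come first in the result,
;; so a lookup with assq finds them before the inherited ones.
(define (in-scope node inherited)
  (append (node-namespaces node) inherited))
```

For the inner node, (in-scope (caddr tree) (node-namespaces tree))
lists the declaration for b before the inherited one for a, which
mirrors how an inner xmlns declaration shadows an outer one.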

Sorry for this wall of text. I hope it makes some sense.

Regards

[1] http://okmij.org/ftp/papers/SXML-paper.pdf
[2] Actually, I'm cheating here: the thing is part of an
   "annotations" part, which according to the grammar comes
   *last*, after all the attributes. But it looks a bit
   like an attribute, with a strange name and a more
   complex value.

- -- tomás
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iEYEARECAAYFAleGQPMACgkQBcgs9XrR2kaMfgCeKbA4pWFrCZoxofDF4n9utgnZ
IzYAn1gozFwBLPd/rmNkZvJYDTJ9cIvr
=etJd
-----END PGP SIGNATURE-----




Message sent to bug-guile@HIDDEN:


X-Loop: help-debbugs@HIDDEN
Subject: bug#20339: sxml simple: sxml->xml mishandles namespaces?
Resent-From: tomas@HIDDEN
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-guile@HIDDEN
Resent-Date: Wed, 13 Jul 2016 18:09:02 +0000
Resent-Message-ID: <handler.20339.B20339.14684333377585 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: followup 20339
X-GNU-PR-Package: guile
X-GNU-PR-Keywords: 
To: 20339 <at> debbugs.gnu.org
Received: via spool by 20339-submit <at> debbugs.gnu.org id=B20339.14684333377585
          (code B ref 20339); Wed, 13 Jul 2016 18:09:02 +0000
Received: (at 20339) by debbugs.gnu.org; 13 Jul 2016 18:08:57 +0000
Received: from localhost ([127.0.0.1]:50115 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1bNOaf-0001yH-Jk
	for submit <at> debbugs.gnu.org; Wed, 13 Jul 2016 14:08:57 -0400
Received: from mail.tuxteam.de ([5.199.139.25]:58527 helo=tomasium.tuxteam.de)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <tomas@HIDDEN>) id 1bNOae-0001y9-4u
 for 20339 <at> debbugs.gnu.org; Wed, 13 Jul 2016 14:08:56 -0400
Received: from tomas by tomasium.tuxteam.de with local (Exim 4.80)
 (envelope-from <tomas@HIDDEN>) id 1bNOac-0003Jl-HB
 for 20339 <at> debbugs.gnu.org; Wed, 13 Jul 2016 20:08:54 +0200
Date: Wed, 13 Jul 2016 20:08:54 +0200
From: tomas@HIDDEN
Message-ID: <20160713180854.GA12635@HIDDEN>
References: <20150415194714.GA30295@HIDDEN> <87y45vln0f.fsf@HIDDEN>
 <20160713132403.GA2349@HIDDEN>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; x-action=pgp-signed
Content-Transfer-Encoding: 8bit
In-Reply-To: <20160713132403.GA2349@HIDDEN>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Spam-Score: -1.3 (-)
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -1.3 (-)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Wed, Jul 13, 2016 at 03:24:03PM +0200, tomas@HIDDEN wrote:

[...]

> What is missing? From my point of view:
> 
>  - At xml->sxml time, the user doesn't know which namespaces
>    are in the xml. So it would be nice if the XML parser
>    could provide that.
> 
>  - It would be super-nice if the XML parser could put that
>    into the same nodes it found it, as described in [1]
>    (i.e. in the (*NAMESPACES* ...) pseudo-attribute).
>    This way we wouldn't have a global mapping, but one
>    that resembles the original XML, even with the same
>    prefixes. Less surprises overall. The round trip
>    xml -> sxml -> xml would be (nearly) the identity.
> 
>    With Ricardo's patch it would lump all the namespace
>    declarations up in the top node, which formally is
>    correct, but might scare XML people a bit :-)
> 
>  - At sxml->xml time there should be a way to somehow
>    generate prefixex for "new" namespaces. I don't know
>    at the moment how this would work, that depends on
>    how the user is supposed to insert new nodes in the
>    SXML. Does she specify the namespace? Both prefix
>    (aka namespace-id, under my current assumption) *and*
>    namespace? (note that the namespace-id/prefix alone
>    wouldn't be sufficient).

Argh. First post, then think, sorry.

Actually, ditch the last point. I think it would be OK
to make the user responsible for keeping the *NAMESPACES*
pseudo-attribute up-to-date whenever she adds nodes
with new namespaces to the SXML.

regards

[1] http://okmij.org/ftp/papers/SXML-paper.pdf

- -- tomás
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iEYEARECAAYFAleGg7YACgkQBcgs9XrR2kY7hACdG5drjpPVlzB4wW6sXhuRKliv
h3cAnAmHC5RxiEc6RXi0tu5U3yF4YYbx
=7uGa
-----END PGP SIGNATURE-----




Message sent to bug-guile@HIDDEN:


X-Loop: help-debbugs@HIDDEN
Subject: bug#20339: sxml simple: sxml->xml mishandles namespaces?
Resent-From: Andy Wingo <wingo@HIDDEN>
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-guile@HIDDEN
Resent-Date: Thu, 14 Jul 2016 10:11:02 +0000
Resent-Message-ID: <handler.20339.B20339.14684910297110 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: followup 20339
X-GNU-PR-Package: guile
X-GNU-PR-Keywords: 
To: tomas@HIDDEN
Cc: Ricardo Wurmus <rekado@HIDDEN>, 20339 <at> debbugs.gnu.org
Received: via spool by 20339-submit <at> debbugs.gnu.org id=B20339.14684910297110
          (code B ref 20339); Thu, 14 Jul 2016 10:11:02 +0000
Received: (at 20339) by debbugs.gnu.org; 14 Jul 2016 10:10:29 +0000
Received: from localhost ([127.0.0.1]:50557 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1bNdbB-0001qc-Iu
	for submit <at> debbugs.gnu.org; Thu, 14 Jul 2016 06:10:29 -0400
Received: from pb-sasl2.pobox.com ([64.147.108.67]:52537
 helo=sasl.smtp.pobox.com) by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <wingo@HIDDEN>) id 1bNdb9-0001qS-Q2
 for 20339 <at> debbugs.gnu.org; Thu, 14 Jul 2016 06:10:28 -0400
Received: from sasl.smtp.pobox.com (unknown [127.0.0.1])
 by pb-sasl2.pobox.com (Postfix) with ESMTP id 2CEE924B17;
 Thu, 14 Jul 2016 06:10:25 -0400 (EDT)
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=pobox.com; h=from:to:cc
 :subject:references:date:in-reply-to:message-id:mime-version
 :content-type; s=sasl; bh=BMXjZYHe7FoMrTAHw4gJAuxvmcE=; b=HsGvJo
 7tmpzP013YDbLVJEvYrvReCyRoPTVqGTLzKpYCmMzXHfgP/nu+K3egJruWscjamU
 k+Mml/xoz30gUU9OUWINloXU9Ohh/OnpZz07i7HpB6wRxVpR1QdhvBeOpxLoTKFW
 t5yAzcB4rXqz0+kpVwXh9ZR5fxM4rHyvJZ0n8=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=pobox.com; h=from:to:cc
 :subject:references:date:in-reply-to:message-id:mime-version
 :content-type; q=dns; s=sasl; b=NQjFYk/wG3MBm4KXR+7/Nj8Y/QDZ/1A9
 GW2Tb9P6EItfZeC2xLoTg9TXnpkSFpxbVEJmLAbCWzrZgb9Dt6IeUwbTA01y19Zs
 q4TPEPAWdaGr+Q5+7sFyoJGkOSSn2Auo+WW8KkM2DwNM2jUTmXMCpKuRCkbRZaj1
 92eOR0D8YxI=
Received: from pb-sasl2.nyi.icgroup.com (unknown [127.0.0.1])
 by pb-sasl2.pobox.com (Postfix) with ESMTP id 1674C24B13;
 Thu, 14 Jul 2016 06:10:25 -0400 (EDT)
Received: from clucks (unknown [88.160.190.192])
 (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by pb-sasl2.pobox.com (Postfix) with ESMTPSA id 2C8C524B12;
 Thu, 14 Jul 2016 06:10:24 -0400 (EDT)
From: Andy Wingo <wingo@HIDDEN>
References: <20150415194714.GA30295@HIDDEN> <87y45vln0f.fsf@HIDDEN>
 <20160713132403.GA2349@HIDDEN>
Date: Thu, 14 Jul 2016 12:10:17 +0200
In-Reply-To: <20160713132403.GA2349@HIDDEN> (tomas@HIDDEN's message of
 "Wed, 13 Jul 2016 15:24:03 +0200")
Message-ID: <87furc1qeu.fsf@HIDDEN>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-Pobox-Relay-ID: 2E2ADE00-49AB-11E6-9089-28A6F1301B6D-02397024!pb-sasl2.pobox.com
X-Spam-Score: -1.3 (-)
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -1.3 (-)

Hi :)

On Wed 13 Jul 2016 15:24, tomas@HIDDEN writes:

> Referring to Oleg Kiseliov's paper [1], there are actually three
> things involved:

This summary is helpful, thanks.
> What is missing? From my point of view:
>
>  - At xml->sxml time, the user doesn't know which namespaces
>    are in the xml. So it would be nice if the XML parser
>    could provide that.

For some documents you do know, of course.

And for a larger perspective, I think that SSAX gives you all the tools
you need to build specialist and very flexible XML parsers.  So to an
extent solving the general problem isn't necessary -- we can always
point people to SSAX.  But that's a bit rude ;) so if there are common
patterns we should try to capture them in xml->sxml.  I see this bug as
being a search for those patterns, but without the requirement of
solving the problem in its most general form.

>  - It would be super-nice if the XML parser could put that
>    into the same nodes it found it, as described in [1]
>    (i.e. in the (*NAMESPACES* ...) pseudo-attribute).
>    This way we wouldn't have a global mapping, but one
>    that resembles the original XML, even with the same
>    prefixes. Less surprises overall. The round trip
>    xml -> sxml -> xml would be (nearly) the identity.
>
>    With Ricardo's patch it would lump all the namespace
>    declarations up in the top node, which formally is
>    correct, but might scare XML people a bit :-)

ACK.

>  - At sxml->xml time there should be a way to somehow
>    generate prefixex for "new" namespaces. I don't know
>    at the moment how this would work, that depends on
>    how the user is supposed to insert new nodes in the
>    SXML. Does she specify the namespace? Both prefix
>    (aka namespace-id, under my current assumption) *and*
>    namespace? (note that the namespace-id/prefix alone
>    wouldn't be sufficient).

ACK.

What do you think the next step is?  I am happy to wait FWIW, dunno if
Ricardo has any feelings here.

Enjoy your holiday :)

Andy




Message sent to bug-guile@HIDDEN:


X-Loop: help-debbugs@HIDDEN
Subject: bug#20339: sxml simple: sxml->xml mishandles namespaces?
Resent-From: tomas@HIDDEN
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-guile@HIDDEN
Resent-Date: Thu, 14 Jul 2016 10:27:01 +0000
Resent-Message-ID: <handler.20339.B20339.14684919999133 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: followup 20339
X-GNU-PR-Package: guile
X-GNU-PR-Keywords: 
To: Andy Wingo <wingo@HIDDEN>
Cc: Ricardo Wurmus <rekado@HIDDEN>, tomas@HIDDEN, 20339 <at> debbugs.gnu.org
Received: via spool by 20339-submit <at> debbugs.gnu.org id=B20339.14684919999133
          (code B ref 20339); Thu, 14 Jul 2016 10:27:01 +0000
Received: (at 20339) by debbugs.gnu.org; 14 Jul 2016 10:26:39 +0000
Received: from localhost ([127.0.0.1]:50576 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1bNdqo-0002ND-P9
	for submit <at> debbugs.gnu.org; Thu, 14 Jul 2016 06:26:38 -0400
Received: from mail.tuxteam.de ([5.199.139.25]:60509 helo=tomasium.tuxteam.de)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <tomas@HIDDEN>) id 1bNdqk-0002N0-SR
 for 20339 <at> debbugs.gnu.org; Thu, 14 Jul 2016 06:26:37 -0400
Received: from tomas by tomasium.tuxteam.de with local (Exim 4.80)
 (envelope-from <tomas@HIDDEN>)
 id 1bNdqh-0001hN-JM; Thu, 14 Jul 2016 12:26:31 +0200
Date: Thu, 14 Jul 2016 12:26:31 +0200
From: tomas@HIDDEN
Message-ID: <20160714102631.GB5611@HIDDEN>
References: <20150415194714.GA30295@HIDDEN> <87y45vln0f.fsf@HIDDEN>
 <20160713132403.GA2349@HIDDEN> <87furc1qeu.fsf@HIDDEN>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; x-action=pgp-signed
Content-Transfer-Encoding: 8bit
In-Reply-To: <87furc1qeu.fsf@HIDDEN>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Spam-Score: -1.3 (-)
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -1.3 (-)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Thu, Jul 14, 2016 at 12:10:17PM +0200, Andy Wingo wrote:
> Hi :)
> 
> On Wed 13 Jul 2016 15:24, tomas@HIDDEN writes:
> 
> > Referring to Oleg Kiseliov's paper [1], there are actually three
> > things involved:
> 
> This summary is helpful, thanks.
> > What is missing? From my point of view:
> >
> >  - At xml->sxml time, the user doesn't know which namespaces
> >    are in the xml. So it would be nice if the XML parser
> >    could provide that.
> 
> For some documents you do know, of course.
> 
> And for larger perspective, I think that SSAX gives you all the tools
> you need to build specialist and very flexible XML parsers.  So to an
> extent solving the general problem isn't necessary -- we can always
> point people to SSAX.  But that's a bit rude ;) so if there are common
> patterns we should try to capture them in xml->sxml.  I see this bug as
> being a search for those patterns, but without the requirement of
> solving the problem in its most general form.

It's (sxml simple), after all. I too hesitate to stuff too much into
it. For me, a documented "no, we don't do namespaces" would be one
valid pattern.

> >  - It would be super-nice if the XML parser could put that
> >    into the same nodes it found it [...]

> ACK.
> 
> >  - At sxml->xml time there should be a way to somehow
> >    generate prefixex [...]

> ACK.
> 
> What do you think the next step is?  I am happy to wait FWIW, dunno if
> Ricardo has any feelings here.

We meet this afternoon anyway. On my side, I'd be happy to try
something along the sketched lines when I'm back. If someone
who cares beats me to it, I'd be just as happy.

> Enjoy your holiday :)

Looking forward to it. BTW: if I understood properly the area you're
living in, we'll cycle past you (somewhat to the West) on our
way to the north.

Regards
- -- tomás
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iEYEARECAAYFAleHaNcACgkQBcgs9XrR2kaQgQCaAzyyBkI3w0XGJ0HUI9Dz/YXa
7yQAni4CWIDE5ezu+x0DwanoAjfH4Wr2
=DEuD
-----END PGP SIGNATURE-----




Message sent to bug-guile@HIDDEN:


X-Loop: help-debbugs@HIDDEN
Subject: bug#20339: sxml simple: sxml->xml mishandles namespaces?
Resent-From: Ricardo Wurmus <rekado@HIDDEN>
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-guile@HIDDEN
Resent-Date: Mon, 04 Feb 2019 20:45:01 +0000
Resent-Message-ID: <handler.20339.B20339.154931307929681 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: followup 20339
X-GNU-PR-Package: guile
X-GNU-PR-Keywords: 
To: Andy Wingo <wingo@HIDDEN>
Cc: tomas@HIDDEN, 20339 <at> debbugs.gnu.org
Received: via spool by 20339-submit <at> debbugs.gnu.org id=B20339.154931307929681
          (code B ref 20339); Mon, 04 Feb 2019 20:45:01 +0000
Received: (at 20339) by debbugs.gnu.org; 4 Feb 2019 20:44:39 +0000
Received: from localhost ([127.0.0.1]:59743 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1gql6a-0007id-TL
	for submit <at> debbugs.gnu.org; Mon, 04 Feb 2019 15:44:38 -0500
Received: from sender-of-o51.zoho.com ([135.84.80.216]:21001)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <rekado@HIDDEN>) id 1gql6W-0007iQ-Nv
 for 20339 <at> debbugs.gnu.org; Mon, 04 Feb 2019 15:44:35 -0500
ARC-Seal: i=1; a=rsa-sha256; t=1549313048; cv=none; d=zoho.com; s=zohoarc; 
 b=aEM43GiCOLAoo3/H7giGfKs8upF4UZi0os8gj4YEBc5z2rKyMvllEkEQwuGu3/ISB4LjfNczJUX5lfhn6rJKXxyon8g3DnHHkjyzaWn5J4G9WCKAe2JTW2M/K4v6VN+4LlTBFbS1kCaR/ZnTNQxbMgjkZNih7xcLGCUHRNSpHqU=
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zoho.com;
 s=zohoarc; t=1549313048;
 h=Content-Type:Cc:Date:From:In-Reply-To:MIME-Version:Message-ID:References:Subject:To:ARC-Authentication-Results;
 bh=bCrbMDqVhJDVvpeXEH9CHrQC9t2ZELPeqFUK2NaOzFE=; 
 b=msWUNiweN4x1qkx4lIm9WPMK12EWZWP69cSnC5PEAc70QWTM91Wx4GIdcPLqlVpKEl6+EYmAtt3FLS+qxDH6rjdVV1ycShC5aj0pxF6BRV2+sBW9yan2BFtEc/MhHWdbsBW+cVJvSj2VBnhUz68tNNqDWBg7u1XdDcGQ/eB9JdA=
ARC-Authentication-Results: i=1; mx.zoho.com; dkim=pass  header.i=elephly.net;
 spf=pass  smtp.mailfrom=rekado@HIDDEN;
 dmarc=pass header.from=<rekado@HIDDEN> header.from=<rekado@HIDDEN>
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; t=1549313048; 
 s=zoho; d=elephly.net; i=rekado@HIDDEN;
 h=References:From:To:Cc:Subject:In-reply-to:Date:Message-ID:MIME-Version:Content-Type;
 l=7804; bh=bCrbMDqVhJDVvpeXEH9CHrQC9t2ZELPeqFUK2NaOzFE=;
 b=PEZScoHbQjYgL9ONk/wwysvvK2XpWEPfkdc6yFBKyEYEhTgrdIVCaRS3/dxo+Pgc
 4dOSEcoEPyKEco85HXYPHuNuI+iNMP4nVrVl4HRVsfJrpNc2gWteiLgRLgaHTJwVw/D
 e+zDWvHtSapihB1fbxq5y/6ZhJzfLNrZ+NrzhT4w=
Received: from localhost (p578E68C8.dip0.t-ipconnect.de [87.142.104.200]) by
 mx.zohomail.com with SMTPS id 1549313046910741.4151801955618;
 Mon, 4 Feb 2019 12:44:06 -0800 (PST)
References: <20150415194714.GA30295@HIDDEN> <87y45vln0f.fsf@HIDDEN>
 <20160713132403.GA2349@HIDDEN> <87furc1qeu.fsf@HIDDEN>
User-agent: mu4e 1.0; emacs 26.1
From: Ricardo Wurmus <rekado@HIDDEN>
In-reply-to: <87furc1qeu.fsf@HIDDEN>
X-URL: https://elephly.net
X-PGP-Key: https://elephly.net/rekado.pubkey
X-PGP-Fingerprint: BCA6 89B6 3655 3801 C3C6  2150 197A 5888 235F ACAC
Date: Mon, 04 Feb 2019 21:44:02 +0100
Message-ID: <87a7jbi8rx.fsf@HIDDEN>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
X-ZohoMailClient: External
X-Zoho-Virus-Status: 1
X-Spam-Score: -0.0 (/)
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -1.0 (-)

--=-=-=
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable

Hello!

I just looked at this again and I think I came up with something useful.
Here’s some context:

Andy Wingo <wingo@HIDDEN> writes:

> Hi :)
>
> On Wed 13 Jul 2016 15:24, tomas@HIDDEN writes:
>
>> Referring to Oleg Kiseliov's paper [1], there are actually three
>> things involved:
>
> This summary is helpful, thanks.
>> What is missing? From my point of view:
>>
>>  - At xml->sxml time, the user doesn't know which namespaces
>>    are in the xml. So it would be nice if the XML parser
>>    could provide that.
>
> For some documents you do know, of course.
>
> And for larger perspective, I think that SSAX gives you all the tools
> you need to build specialist and very flexible XML parsers.  So to an
> extent solving the general problem isn't necessary -- we can always
> point people to SSAX.  But that's a bit rude ;) so if there are common
> patterns we should try to capture them in xml->sxml.  I see this bug as
> being a search for those patterns, but without the requirement of
> solving the problem in its most general form.
>
>>  - It would be super-nice if the XML parser could put that
>>    into the same nodes it found it, as described in [1]
>>    (i.e. in the (*NAMESPACES* ...) pseudo-attribute).
>>    This way we wouldn't have a global mapping, but one
>>    that resembles the original XML, even with the same
>>    prefixes. Less surprises overall. The round trip
>>    xml -> sxml -> xml would be (nearly) the identity.
>>
>>    With Ricardo's patch it would lump all the namespace
>>    declarations up in the top node, which formally is
>>    correct, but might scare XML people a bit :-)
>
> ACK.
>
>>  - At sxml->xml time there should be a way to somehow
>>    generate prefixex for "new" namespaces. I don't know
>>    at the moment how this would work, that depends on
>>    how the user is supposed to insert new nodes in the
>>    SXML. Does she specify the namespace? Both prefix
>>    (aka namespace-id, under my current assumption) *and*
>>    namespace? (note that the namespace-id/prefix alone
>>    wouldn't be sufficient).
>
> ACK.
>
> What do you think the next step is?  I am happy to wait FWIW, dunno if
> Ricardo has any feelings here.

Attached is a patch that does the requested things.  The parser
procedures like FINISH-ELEMENT have access to all the namespaces, so
I changed the FINISH-ELEMENT procedure to return the list of namespaces
in addition to its SXML tree return value.

I changed name->sxml to use only the namespace aliases / abbreviations
instead of the namespace URIs.  (This is not very efficient because we
need to traverse the list of namespaces every time.  Maybe we could
memoize this.  On the other hand, the length of the namespaces list may
not be large enough to affect performance too much.)
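
A rough sketch of what such memoization might look like, kept
separate from the patch: lookup-abbrev and abbrev-cache are
hypothetical names, and the namespace entries are assumed to have the
SSAX shape (abbrev uri-id . uri).

```scheme
(use-modules (srfi srfi-1) (ice-9 match))

;; Cache from namespace-id symbol to abbreviation symbol.
(define abbrev-cache (make-hash-table))

;; Return the abbreviation bound to URI-ID in NAMESPACES, or #f.
;; The list is only traversed on a cache miss.
(define (lookup-abbrev uri-id namespaces)
  (or (hashq-ref abbrev-cache uri-id)
      (let ((abbrev (and=> (find (match-lambda
                                   ((_ id . _) (eq? id uri-id)))
                                 namespaces)
                           first)))
        (when abbrev
          (hashq-set! abbrev-cache uri-id abbrev))
        abbrev)))
```

One caveat: a cache keyed on the URI alone hands back the first
abbreviation it saw, so it would be wrong for documents that bind the
same URI to different prefixes in different scopes. A per-parse table
instead of a global one would also be needed in practice.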

In the end we get both namespace list and SXML tree from running the
parser.  Before wrapping this up in *TOP* we generate xmlns attributes
for all abbreviations and “patch” the first proper element’s attribute
list (i.e. we skip over a *PI* element if it exists).

The result is an SXML tree that begins with namespace declarations,
mapping abbreviations to URIs.  Within the SXML tree we’re only using
abbreviations, so there are no more invalid characters when converting
SXML to a string.

I would be happy if you could test this as I’m not 100% confident that
this is correct.  Here are questions I wasn’t able to answer
conclusively:

* Is the value for “namespaces” that’s passed in to the
  FINISH-ELEMENT procedure always the same?

* Will the second return value of the final call to FINISH-ELEMENT
  really always be the complete list of *all* namespaces that have been
  encountered?

* Are there valid XML documents for which the match patterns to inject
  namespace declarations would not apply?  (e.g. documents with a PI
  element and two separate XML trees)
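
On the last question, one case that seems worth checking is a first
element without any attribute list, e.g. (foo (bar:baz)): as far as I
can tell, a pattern that requires (@ . attrs) in second position would
not match it. Below is a minimal sketch of a more forgiving variant;
inject-declarations is a hypothetical helper name, not part of
(sxml simple).

```scheme
(use-modules (ice-9 match))

;; Add NS-DECLARATIONS to the attribute list of ELEMENT, creating the
;; (@ ...) list when the element has none.  Existing attributes are
;; spliced in after the declarations.
(define (inject-declarations element ns-declarations)
  (match element
    ((tag ('@ . attrs) . children)
     `(,tag (@ ,@ns-declarations ,@attrs) ,@children))
    ((tag . children)
     (if (null? ns-declarations)
         element
         `(,tag (@ ,@ns-declarations) ,@children)))))
```

With declarations ((xmlns:bar "http://example.org/ns2")), the element
(foo (bar:baz)) gains a fresh attribute list, while an element that
already has attributes keeps them after the declarations.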

--
Ricardo



--=-=-=
Content-Type: text/x-patch
Content-Disposition: inline;
 filename=0001-sxml-xml-sxml-Record-and-use-namespace-abbreviations.patch

From 83ee9de18a0ecaa237eb73e1b75d0b21e3e8d321 Mon Sep 17 00:00:00 2001
From: Ricardo Wurmus <rekado@HIDDEN>
Date: Mon, 4 Feb 2019 21:39:06 +0100
Subject: [PATCH] sxml: xml->sxml: Record and use namespace abbreviations.

* module/sxml/simple.scm (xml->sxml): Add namespace declarations to the
attribute list of the first XML element.
[name->sxml]: Accept namespaces argument to look up abbreviation.
Return name with abbreviation prefix.
[parser]: Let FINISH-ELEMENT procedure return namespaces in addition to
SXML tree.
---
 module/sxml/simple.scm | 50 +++++++++++++++++++++++++++++++++---------
 1 file changed, 40 insertions(+), 10 deletions(-)

diff --git a/module/sxml/simple.scm b/module/sxml/simple.scm
index 703ad9137..52dd9af12 100644
--- a/module/sxml/simple.scm
+++ b/module/sxml/simple.scm
@@ -1,7 +1,8 @@
 ;;;; (sxml simple) -- a simple interface to the SSAX parser
 ;;;;
-;;;; 	Copyright (C) 2009, 2010, 2013  Free Software Foundation, Inc.
+;;;; 	Copyright (C) 2009, 2010, 2013, 2019  Free Software Foundation, Inc.
 ;;;;    Modified 2004 by Andy Wingo <wingo at pobox dot com>.
+;;;;    Modified 2019 by Ricardo Wurmus <rekado@HIDDEN>.
 ;;;;    Originally written by Oleg Kiselyov <oleg at pobox dot com> as SXML-to-HTML.scm.
 ;;;; 
 ;;;; This library is free software; you can redistribute it and/or
@@ -30,6 +31,7 @@
   #:use-module (sxml ssax)
   #:use-module (sxml transform)
   #:use-module (ice-9 match)
+  #:use-module (srfi srfi-1)
   #:use-module (srfi srfi-13)
   #:export (xml->sxml sxml->xml sxml->string))
 
@@ -123,10 +125,15 @@ port."
         (acons '*DEFAULT* default-entity-handler entities)
         entities))
 
-  (define (name->sxml name)
+  (define (name->sxml name namespaces)
     (match name
       ((prefix . local-part)
-       (symbol-append prefix (string->symbol ":") local-part))
+       (let ((abbrev (and=> (find (match-lambda
+                                    ((abbrev uri . rest)
+                                     (and (eq? uri prefix) abbrev)))
+                                  namespaces)
+                            first)))
+         (symbol-append abbrev (string->symbol ":") local-part)))
       (_ name)))
 
   (define (doctype-continuation seed)
@@ -152,14 +159,16 @@ port."
                        (ssax:reverse-collect-str seed)))
              (attrs (attlist-fold
                      (lambda (attr accum)
-                       (cons (list (name->sxml (car attr)) (cdr attr))
+                       (cons (list (name->sxml (car attr) namespaces)
+                                   (cdr attr))
                              accum))
                      '() attributes)))
-         (acons (name->sxml elem-gi)
-                (if (null? attrs)
-                    seed
-                    (cons (cons '@ attrs) seed))
-                parent-seed)))
+         (values (acons (name->sxml elem-gi namespaces)
+                        (if (null? attrs)
+                            seed
+                            (cons (cons '@ attrs) seed))
+                        parent-seed)
+                 namespaces)))
 
      CHAR-DATA-HANDLER ; fhere
      (lambda (string1 string2 seed)
@@ -212,7 +221,28 @@ port."
   (let* ((port (if (string? string-or-port)
                    (open-input-string string-or-port)
                    string-or-port))
-         (elements (reverse (parser port '()))))
+         (elements (call-with-values
+                       (lambda () (parser port '()))
+                     (lambda (elements namespaces)
+                       ;; Generate namespace declarations mapping
+                       ;; abbreviations to URLs.
+                       (let ((ns-declarations
+                              (filter-map (match-lambda
+                                            (('*DEFAULT* . _) #f)
+                                            ((abbrev uri . _)
+                                             (list (symbol-append 'xmlns: abbrev)
+                                                   (symbol->string uri))))
+                                          namespaces)))
+                         ;; Inject namespace declarations into the first
+                         ;; proper element.
+                         (match (reverse elements)
+                           (((and pi-elem ('*PI* . _))
+                             (tag ('@ . attrs) . children))
+                            `(,pi-elem (,tag (@ ,@ns-declarations ,@attrs)
+                                             ,@children)))
+                           (((tag ('@ . attrs) . children))
+                            `(,tag (@ ,@ns-declarations ,@attrs)
+                                   ,@children))))))))
     `(*TOP* ,@elements)))
 
 (define check-name
-- 
2.20.1


--=-=-=--





Message sent to bug-guile@HIDDEN:


X-Loop: help-debbugs@HIDDEN
Subject: bug#20339: sxml simple: sxml->xml mishandles namespaces?
Resent-From: John Cowan <cowan@HIDDEN>
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-guile@HIDDEN
Resent-Date: Mon, 04 Feb 2019 22:56:03 +0000
Resent-Message-ID: <handler.20339.B20339.154932093417966 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: followup 20339
X-GNU-PR-Package: guile
X-GNU-PR-Keywords: 
To: Ricardo Wurmus <rekado@HIDDEN>
Cc: Andy Wingo <wingo@HIDDEN>, 20339 <at> debbugs.gnu.org
Received: via spool by 20339-submit <at> debbugs.gnu.org id=B20339.154932093417966
          (code B ref 20339); Mon, 04 Feb 2019 22:56:03 +0000
Received: (at 20339) by debbugs.gnu.org; 4 Feb 2019 22:55:34 +0000
Received: from localhost ([127.0.0.1]:59857 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1gqn9K-0004fi-80
	for submit <at> debbugs.gnu.org; Mon, 04 Feb 2019 17:55:34 -0500
Received: from mail-wr1-f41.google.com ([209.85.221.41]:45143)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <cowan@HIDDEN>) id 1gqn9I-0004fT-58
 for 20339 <at> debbugs.gnu.org; Mon, 04 Feb 2019 17:55:33 -0500
Received: by mail-wr1-f41.google.com with SMTP id q15so1644863wro.12
 for <20339 <at> debbugs.gnu.org>; Mon, 04 Feb 2019 14:55:32 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=ccil-org.20150623.gappssmtp.com; s=20150623;
 h=mime-version:references:in-reply-to:from:date:message-id:subject:to
 :cc; bh=p6f67KqRUdU6E3bC/b3cXEtW5JAbEk+91l+ztTzDdwo=;
 b=RBMSXSaeWIYhikNG46EakI1+YbXb3Kj6tiT0h5C9ZVs4rX3WZixLtUtvKozcmy1bcU
 Outss4EZQIFjkEjIWz8YAGvFj5/MLab0vEPlrNDjozznySlWUngHXdZb7xOsafU4CE78
 7ZSG/cI93XgIF5wkKQC7qD8afHjYnQaR4+7mN4nwB0kfTqMBO3Tcxyqd0KZwi8xnAijL
 NnyUwsxLXCE1C42oDBnyTBlk+tzflnD3wAd/WglRnfVXPXyV1oj2jS99Ntu4fqQRTRPQ
 i87B1cQnzENwu4FXsOzxCNhxNtpiD3yU4SLQ/oA63DcgGGQ4YCf8QuiuBiOCKgop64Hi
 31Fw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:mime-version:references:in-reply-to:from:date
 :message-id:subject:to:cc;
 bh=p6f67KqRUdU6E3bC/b3cXEtW5JAbEk+91l+ztTzDdwo=;
 b=BdBo8yG0G1YOalKlctSbajdQZgcSs5gxa34tMNkh30w0VnfGFjM+gWRfa6w9MaGTED
 KgIcBLOAmRxIXgVrZr7sswm/dcVL67BZQdkAYT1QZFPCCNr3I6CHEqdtzVvGD1J1jG1X
 AOfa/4sY7WzH99+YvhCViVuyuOhLPgHXcLVgIZ2eoYcgAxuwhC/It1ORhczOp0S9IeKw
 F1jtRr0VNJ+tt7sKDWDyoEagdCj8Vv9l8lgDIrmwU1vDrOaQjSb1pao40SoKb2xdjpQj
 JJR8hQVBzVgk9+fhEfKZKP0Dt0pJt8zHV0tSKfABsBgVX+R+EEyy4tcCxjUbuHLpVU1w
 Tiuw==
X-Gm-Message-State: AHQUAuZnv0sBQiTm79ThUx1WDF5GOatA0RQuoNPHuNv3Gj/bYthZ9uvw
 lxal/N5/LgIeNw8B9ST38tc9gW9foBFqNRGDSWG7qQ==
X-Google-Smtp-Source: AHgI3IZ17fj5ok1cXdgK20K427VetXcFg57DGLwZ1Tblozmy5bPKBSP0u9BJKYAAtNZ9zW/vDjutehUQAgix5oU7lQk=
X-Received: by 2002:a5d:5101:: with SMTP id s1mr1206131wrt.89.1549320926241;
 Mon, 04 Feb 2019 14:55:26 -0800 (PST)
MIME-Version: 1.0
References: <20150415194714.GA30295@HIDDEN> <87y45vln0f.fsf@HIDDEN>
 <20160713132403.GA2349@HIDDEN> <87furc1qeu.fsf@HIDDEN>
 <87a7jbi8rx.fsf@HIDDEN>
In-Reply-To: <87a7jbi8rx.fsf@HIDDEN>
From: John Cowan <cowan@HIDDEN>
Date: Mon, 4 Feb 2019 17:55:14 -0500
Message-ID: <CAD2gp_ScjmURZ7yTFronxyR9r4P4P2L91mXNHguXpZG86chdVA@HIDDEN>
Content-Type: multipart/alternative; boundary="00000000000073c594058119636f"
X-Spam-Score: 0.0 (/)
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -1.0 (-)

--00000000000073c594058119636f
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

On Mon, Feb 4, 2019 at 3:45 PM Ricardo Wurmus <rekado@HIDDEN> wrote:

> I changed name->sxml to use only the namespace aliases / abbreviations
> instead of the namespace URIs.


The trouble with that is that XML namespaces are lexically scoped, like
Scheme local variables.  It is perfectly valid to map a prefix to more
than one URL, as long as the namespace declarations are in either
disjoint or nested elements.  So you don't know what the absolute name
of the element or attribute is from just the prefix and the local part.

Furthermore, it is also legal to define more than one prefix for
the same URL, in which case names using either prefix are normally
treated as equivalent (however, you can't have elements like
<a:foo>...</b:foo>
even if a and b map to the same namespace).
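
To make the scoping rule concrete, here is a minimal sketch in Scheme.
The procedure resolve-prefix and the example.org URIs are made up for
illustration; they are not part of SSAX or (sxml simple).  Scopes are
modelled as a list of alists, innermost first.

```scheme
;; Hypothetical helper, not part of (sxml simple) or SSAX.
;; SCOPES is a list of alists, innermost scope first; each alist maps
;; a prefix symbol to a namespace URI string.
(define (resolve-prefix prefix scopes)
  (and (pair? scopes)
       (let ((binding (assq prefix (car scopes))))
         (if binding
             (cdr binding)
             (resolve-prefix prefix (cdr scopes))))))

;; Declarations on an outer element and on one nested inside it.
(define outer '((a . "http://example.org/one")))
(define inner '((a . "http://example.org/two")
                (b . "http://example.org/one")))

;; Outside the nested element, a: means the first namespace.
(resolve-prefix 'a (list outer))        ; "http://example.org/one"
;; Inside it, the same prefix names a different namespace ...
(resolve-prefix 'a (list inner outer))  ; "http://example.org/two"
;; ... and b: is a second prefix for the first namespace.
(resolve-prefix 'b (list inner outer))  ; "http://example.org/one"
```

So the prefix alone identifies nothing; only prefix plus scope does.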

> * Is the value for =E2=80=9Cnamespaces=E2=80=9D that=E2=80=99s passed in =
to the
>   FINISH-ELEMENT procedure always the same?
>
> * Will the second return value of the final call to FINISH-ELEMENT
>   really always be the complete list of *all* namespaces that have been
>   encountered?
>

Definitely not, only the namespaces that are currently in scope.

> * Are there valid XML documents for which the match patterns to inject
>   namespace declarations would not apply?  (e.g. documents with a PI
>   element and two separate XML trees)
>

That's not well-formed: you can only have a single element tree per XML
document, although you can have any number of PIs, comments, and
whitespace (which is normally ignored) before and after.
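
On the SXML side this means a parsed document has exactly one element
below *TOP*.  A rough sketch of picking it out, assuming the usual SXML
markers *PI* and *COMMENT*; document-element is a hypothetical name,
not something (sxml simple) provides:

```scheme
(use-modules (srfi srfi-1))

;; Hypothetical helper.  TOP is an SXML document, (*TOP* item ...).
;; Returns the single document element, or #f if there is not exactly
;; one.  PIs, comments and stray strings are skipped.
(define (document-element top)
  (let ((elements
         (filter (lambda (item)
                   (and (pair? item)
                        (symbol? (car item))
                        (not (memq (car item) '(*PI* *COMMENT* @)))))
                 (cdr top))))
    (and (pair? elements)
         (null? (cdr elements))
         (car elements))))

(document-element '(*TOP* (*PI* xml "version") (root (child "x"))))
;; returns (root (child "x"))
```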

--=20
John Cowan          http://vrici.lojban.org/~cowan        cowan@HIDDEN
If I have seen farther than others, it is because I was looking through a
spyglass with my one good eye, with a parrot standing on my shoulder. --"Y"


--00000000000073c594058119636f--




Message sent to bug-guile@HIDDEN:


X-Loop: help-debbugs@HIDDEN
Subject: bug#20339: sxml simple: sxml->xml mishandles namespaces?
Resent-From: Ricardo Wurmus <rekado@HIDDEN>
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-guile@HIDDEN
Resent-Date: Tue, 05 Feb 2019 10:58:02 +0000
Resent-Message-ID: <handler.20339.B20339.154936428023663 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: followup 20339
X-GNU-PR-Package: guile
X-GNU-PR-Keywords: 
To: John Cowan <cowan@HIDDEN>
Cc: Andy Wingo <wingo@HIDDEN>, 20339 <at> debbugs.gnu.org
Received: via spool by 20339-submit <at> debbugs.gnu.org id=B20339.154936428023663
          (code B ref 20339); Tue, 05 Feb 2019 10:58:02 +0000
Received: (at 20339) by debbugs.gnu.org; 5 Feb 2019 10:58:00 +0000
Received: from localhost ([127.0.0.1]:60232 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1gqyQS-00069a-Ew
	for submit <at> debbugs.gnu.org; Tue, 05 Feb 2019 05:58:00 -0500
Received: from sender-of-o51.zoho.com ([135.84.80.216]:21117)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <rekado@HIDDEN>) id 1gqyQO-00069O-2H
 for 20339 <at> debbugs.gnu.org; Tue, 05 Feb 2019 05:57:57 -0500
ARC-Seal: i=1; a=rsa-sha256; t=1549357932; cv=none; d=zoho.com; s=zohoarc; 
 b=LxINLeuV0tR7uuQtUndIcJVTPdE1zA5Ck1IDc1ECK1fcujLyNAu3Yaeq8rpheviw/sGAkcI/rjD1/5Qhl9xjrqqhb5ZGH2YBSNH7IgS2e1DGYh0qxKDPRk0b6vfQU9+4eTdwfEZPT/jBRF3o+sTYoeQVdMM68hql3paMZNmCQtQ=
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zoho.com;
 s=zohoarc; t=1549357932;
 h=Content-Type:Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:MIME-Version:Message-ID:References:Subject:To:ARC-Authentication-Results;
 bh=O5PO701Bi01oap6qC8cU7pcn0SPa5EGfUldj9TalWf8=; 
 b=GoijDIlcY6xXeClzLgvJBFU18Hm9I5ZTPRQKwiEgYYNsE+UHDARuZRapPgDmUX1pgelfCkvjhLwgC9nOfRs+5YGC9tl6DrWvuPNmSTDe7dAwqThcGtCD3fA8r7AJ1bmDu4zbB3DBvPVopKnjA/KGmjWBB7xZcJfTGv8ofuXota0=
ARC-Authentication-Results: i=1; mx.zoho.com; dkim=pass  header.i=elephly.net;
 spf=pass  smtp.mailfrom=rekado@HIDDEN;
 dmarc=pass header.from=<rekado@HIDDEN> header.from=<rekado@HIDDEN>
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; t=1549357932; 
 s=zoho; d=elephly.net; i=rekado@HIDDEN;
 h=References:From:To:Cc:Subject:In-reply-to:Date:Message-ID:MIME-Version:Content-Type:Content-Transfer-Encoding;
 l=1303; bh=O5PO701Bi01oap6qC8cU7pcn0SPa5EGfUldj9TalWf8=;
 b=S7033f83I9pPS/E3suPCIxdcWzenD82ZAfgJ678d1bHzOrpenImyBeAwlS6fAZJR
 c3qA4WbIuz5xPqze3ZxE8GKP/XXdlomj0a7FWmhkcxhl1xdPCvX4Li1dmjXbJ4jTd2s
 mZavg4iJXgZk4X8OiP0biIKnrWWBWU7o9RwOCQlc=
Received: from localhost (p3E9E957E.dip0.t-ipconnect.de [62.158.149.126]) by
 mx.zohomail.com with SMTPS id 1549357930133206.1824205767324;
 Tue, 5 Feb 2019 01:12:10 -0800 (PST)
References: <20150415194714.GA30295@HIDDEN> <87y45vln0f.fsf@HIDDEN>
 <20160713132403.GA2349@HIDDEN> <87furc1qeu.fsf@HIDDEN>
 <87a7jbi8rx.fsf@HIDDEN>
 <CAD2gp_ScjmURZ7yTFronxyR9r4P4P2L91mXNHguXpZG86chdVA@HIDDEN>
User-agent: mu4e 1.0; emacs 26.1
From: Ricardo Wurmus <rekado@HIDDEN>
In-reply-to: <CAD2gp_ScjmURZ7yTFronxyR9r4P4P2L91mXNHguXpZG86chdVA@HIDDEN>
X-URL: https://elephly.net
X-PGP-Key: https://elephly.net/rekado.pubkey
X-PGP-Fingerprint: BCA6 89B6 3655 3801 C3C6  2150 197A 5888 235F ACAC
Date: Tue, 05 Feb 2019 10:12:06 +0100
Message-ID: <874l9iiopl.fsf@HIDDEN>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-ZohoMailClient: External
X-Spam-Score: -0.0 (/)
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -1.0 (-)


Hi John,

> The trouble with that is that XML namespaces are lexically scoped, like
> Scheme local variables.  It is perfectly valid to map a prefix to more
> than one URL, as long as the namespace declarations are in either
> disjoint or nested elements.  So you don't know what the absolute name
> of the element or attribute is from just the prefix and the local part.
>
> Furthermore, it is also legal to define more than one prefix for
> the same URL, in which case names using either prefix are normally
> treated as equivalent (however, you can't have elements like
> <a:foo>...</b:foo>
> even if a and b map to the same namespace).
>
> * Is the value for =E2=80=9Cnamespaces=E2=80=9D that=E2=80=99s passed in =
to the
>>   FINISH-ELEMENT procedure always the same?
>>
>> * Will the second return value of the final call to FINISH-ELEMENT
>>   really always be the complete list of *all* namespaces that have been
>>   encountered?
>>
>
> Definitely not, only the namespaces that are currently in scope.

Thanks for the clarifications!

In that case we could have FINISH-ELEMENT add all namespace declarations
that are in scope to the current node that is about to be returned.  It
would be a little verbose, but more correct.

What do you think?

--=20
Ricardo





Message sent to bug-guile@HIDDEN:


X-Loop: help-debbugs@HIDDEN
Subject: bug#20339: sxml simple: sxml->xml mishandles namespaces?
Resent-From: Ricardo Wurmus <rekado@HIDDEN>
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-guile@HIDDEN
Resent-Date: Wed, 06 Feb 2019 04:45:01 +0000
Resent-Message-ID: <handler.20339.B20339.15494282682283 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: followup 20339
X-GNU-PR-Package: guile
X-GNU-PR-Keywords: 
To: John Cowan <cowan@HIDDEN>
Cc: 20339 <at> debbugs.gnu.org
Received: via spool by 20339-submit <at> debbugs.gnu.org id=B20339.15494282682283
          (code B ref 20339); Wed, 06 Feb 2019 04:45:01 +0000
Received: (at 20339) by debbugs.gnu.org; 6 Feb 2019 04:44:28 +0000
Received: from localhost ([127.0.0.1]:33924 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1grF4V-0000al-CD
	for submit <at> debbugs.gnu.org; Tue, 05 Feb 2019 23:44:27 -0500
Received: from sender-of-o51.zoho.com ([135.84.80.216]:21058)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <rekado@HIDDEN>) id 1grF4S-0000aZ-Ip
 for 20339 <at> debbugs.gnu.org; Tue, 05 Feb 2019 23:44:25 -0500
ARC-Seal: i=1; a=rsa-sha256; t=1549371437; cv=none; d=zoho.com; s=zohoarc; 
 b=QvC9tolm152kf1f/XXQR9an58hT30JiReaKmWp2o4BPfkHyCmuIvPrUnQkkpgWooRwiXQrtRsetxYrr+885dpo6h41PiR4wNgEc+IXa/MZ3vdEOP26N58O5j1uE+UIU8szZMhF++vxuEnhx8+aZVwvZI6YxoHk63COnwjSoPK+4=
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zoho.com;
 s=zohoarc; t=1549371437;
 h=Content-Type:Cc:Date:From:In-Reply-To:MIME-Version:Message-ID:References:Subject:To:ARC-Authentication-Results;
 bh=FvyclqO+OIQmmvjA2uzVA79hbzeOuP1ZDkiOv4qVpys=; 
 b=fvlDG+x9OxlpANi6g0bHefRJ0UT/esKc5t6YPPF/JNtGvLrWkQdJ0ZJVUsatP5NKgUrcGQYfdtZANQ7OLNKVyh5DaEHfAyXUNJqdseRmBNePwmt98dt2hHx7rg6IELHE1/2/YHTBiVPag1Ms+yF69n9mRihZvjE8LScBsUkWD+U=
ARC-Authentication-Results: i=1; mx.zoho.com; dkim=pass  header.i=elephly.net;
 spf=pass  smtp.mailfrom=rekado@HIDDEN;
 dmarc=pass header.from=<rekado@HIDDEN> header.from=<rekado@HIDDEN>
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; t=1549371437; 
 s=zoho; d=elephly.net; i=rekado@HIDDEN;
 h=References:From:To:Cc:Subject:In-reply-to:Date:Message-ID:MIME-Version:Content-Type;
 l=3449; bh=FvyclqO+OIQmmvjA2uzVA79hbzeOuP1ZDkiOv4qVpys=;
 b=bBtnED2NwAp8U1v48HfmCcB47tdCJriQ1bmFcc9tBCIzi/lvBLrKF4irUA6iz+Bq
 3ZEVWYRNhHcng6MF0kWBJRJQu2/ni4Ph7qUG+FQSRFC2TBOkyHT9/7VeQFYM+w6V0wX
 IKq+NKmIqpZSJKvbfqc36sXnrfW9YI9CO0QSjbC8=
Received: from localhost (141.80.247.165 [141.80.247.165]) by mx.zohomail.com
 with SMTPS id 1549371435332230.93305749332058;
 Tue, 5 Feb 2019 04:57:15 -0800 (PST)
References: <20150415194714.GA30295@HIDDEN> <87y45vln0f.fsf@HIDDEN>
 <20160713132403.GA2349@HIDDEN> <87furc1qeu.fsf@HIDDEN>
 <87a7jbi8rx.fsf@HIDDEN>
 <CAD2gp_ScjmURZ7yTFronxyR9r4P4P2L91mXNHguXpZG86chdVA@HIDDEN>
 <874l9iiopl.fsf@HIDDEN>
User-agent: mu4e 1.0; emacs 26.1
From: Ricardo Wurmus <rekado@HIDDEN>
In-reply-to: <874l9iiopl.fsf@HIDDEN>
X-URL: https://elephly.net
X-PGP-Key: https://elephly.net/rekado.pubkey
X-PGP-Fingerprint: BCA6 89B6 3655 3801 C3C6  2150 197A 5888 235F ACAC
Date: Tue, 05 Feb 2019 13:57:11 +0100
Message-ID: <87r2cmgzq0.fsf@HIDDEN>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
X-ZohoMailClient: External
X-Zoho-Virus-Status: 1
X-Spam-Score: -0.0 (/)
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -1.0 (-)

--=-=-=
Content-Type: text/plain


Ricardo Wurmus <rekado@HIDDEN> writes:

> In that case we could have FINISH-ELEMENT add all namespace declarations
> that are in scope to the current node that is about to be returned.  It
> would be a little verbose, but more correct.

Like this:


--=-=-=
Content-Type: text/x-patch
Content-Disposition: inline;
 filename=0001-sxml-xml-sxml-Record-and-use-namespace-abbreviations.patch

From d44c702718baea4c4557d12ca8dd7dab724c7fb6 Mon Sep 17 00:00:00 2001
From: Ricardo Wurmus <rekado@HIDDEN>
Date: Mon, 4 Feb 2019 21:39:06 +0100
Subject: [PATCH] sxml: xml->sxml: Record and use namespace abbreviations.

* module/sxml/simple.scm (xml->sxml)
[name->sxml]: Accept namespaces argument to look up abbreviation.
Return name with abbreviation prefix.
[parser]: Let FINISH-ELEMENT procedure return namespaces in addition to
the SXML tree's attributes.
---
 module/sxml/simple.scm | 34 +++++++++++++++++++++++++---------
 1 file changed, 25 insertions(+), 9 deletions(-)

diff --git a/module/sxml/simple.scm b/module/sxml/simple.scm
index 703ad9137..2bb332c83 100644
--- a/module/sxml/simple.scm
+++ b/module/sxml/simple.scm
@@ -1,7 +1,8 @@
 ;;;; (sxml simple) -- a simple interface to the SSAX parser
 ;;;;
-;;;; 	Copyright (C) 2009, 2010, 2013  Free Software Foundation, Inc.
+;;;; 	Copyright (C) 2009, 2010, 2013, 2019  Free Software Foundation, Inc.
 ;;;;    Modified 2004 by Andy Wingo <wingo at pobox dot com>.
+;;;;    Modified 2019 by Ricardo Wurmus <rekado@HIDDEN>.
 ;;;;    Originally written by Oleg Kiselyov <oleg at pobox dot com> as SXML-to-HTML.scm.
 ;;;; 
 ;;;; This library is free software; you can redistribute it and/or
@@ -30,6 +31,7 @@
   #:use-module (sxml ssax)
   #:use-module (sxml transform)
   #:use-module (ice-9 match)
+  #:use-module (srfi srfi-1)
   #:use-module (srfi srfi-13)
   #:export (xml->sxml sxml->xml sxml->string))
 
@@ -123,10 +125,15 @@ port."
         (acons '*DEFAULT* default-entity-handler entities)
         entities))
 
-  (define (name->sxml name)
+  (define (name->sxml name namespaces)
     (match name
       ((prefix . local-part)
-       (symbol-append prefix (string->symbol ":") local-part))
+       (let ((abbrev (and=> (find (match-lambda
+                                    ((abbrev uri . rest)
+                                     (and (eq? uri prefix) abbrev)))
+                                  namespaces)
+                            first)))
+         (symbol-append abbrev (string->symbol ":") local-part)))
       (_ name)))
 
   (define (doctype-continuation seed)
@@ -150,12 +157,21 @@ port."
        (let ((seed (if trim-whitespace?
                        (ssax:reverse-collect-str-drop-ws seed)
                        (ssax:reverse-collect-str seed)))
-             (attrs (attlist-fold
-                     (lambda (attr accum)
-                       (cons (list (name->sxml (car attr)) (cdr attr))
-                             accum))
-                     '() attributes)))
-         (acons (name->sxml elem-gi)
+             (attrs (append
+                     ;; Namespace declarations
+                     (filter-map (match-lambda
+                                   (('*DEFAULT* . _) #f)
+                                   ((abbrev uri . _)
+                                    (list (symbol-append 'xmlns: abbrev)
+                                          (symbol->string uri))))
+                                 namespaces)
+                     (attlist-fold
+                      (lambda (attr accum)
+                        (cons (list (name->sxml (car attr) namespaces)
+                                    (cdr attr))
+                              accum))
+                      '() attributes))))
+         (acons (name->sxml elem-gi namespaces)
                 (if (null? attrs)
                     seed
                     (cons (cons '@ attrs) seed))
-- 
2.20.1


--=-=-=
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable


It=E2=80=99s quite verbose because it doesn=E2=80=99t check if a namespace =
declaration
is the same in a parent.
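
One way to trim that would be to emit only the declarations that differ
from the parent's.  A small sketch, assuming each scope is a plain alist
of (abbrev . uri) pairs, which is simpler than the namespace entries
SSAX passes around; new-declarations is a made-up name and not part of
the patch:

```scheme
(use-modules (srfi srfi-1))

;; Hypothetical helper.  IN-SCOPE and PARENT-SCOPE are alists of
;; (abbrev . uri) pairs.  Keep only the bindings the parent lacks.
(define (new-declarations in-scope parent-scope)
  (remove (lambda (binding) (member binding parent-scope))
          in-scope))

(define parent '((a . "http://example.org/one")))

;; An inherited binding is dropped, a new one is kept.
(new-declarations '((b . "http://example.org/two")
                    (a . "http://example.org/one"))
                  parent)
;; returns ((b . "http://example.org/two"))

;; A prefix rebound to another URI counts as new.
(new-declarations '((a . "http://example.org/three")) parent)
;; returns ((a . "http://example.org/three"))
```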

--
Ricardo

--=-=-=--





Message sent to bug-guile@HIDDEN:


X-Loop: help-debbugs@HIDDEN
Subject: bug#20339: sxml simple: sxml->xml mishandles namespaces?
Resent-From: tomas@HIDDEN
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-guile@HIDDEN
Resent-Date: Tue, 12 Feb 2019 09:57:02 +0000
Resent-Message-ID: <handler.20339.B20339.15499653739393 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: followup 20339
X-GNU-PR-Package: guile
X-GNU-PR-Keywords: 
To: Ricardo Wurmus <rekado@HIDDEN>
Cc: Andy Wingo <wingo@HIDDEN>, 20339 <at> debbugs.gnu.org
Received: via spool by 20339-submit <at> debbugs.gnu.org id=B20339.15499653739393
          (code B ref 20339); Tue, 12 Feb 2019 09:57:02 +0000
Received: (at 20339) by debbugs.gnu.org; 12 Feb 2019 09:56:13 +0000
Received: from localhost ([127.0.0.1]:44428 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1gtUnV-0002RQ-4H
	for submit <at> debbugs.gnu.org; Tue, 12 Feb 2019 04:56:13 -0500
Received: from mail.tuxteam.de ([5.199.139.25]:52178)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <tomas@HIDDEN>) id 1gtUnQ-0002RF-BX
 for 20339 <at> debbugs.gnu.org; Tue, 12 Feb 2019 04:56:11 -0500
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=tuxteam.de;
 s=mail; 
 h=In-Reply-To:Content-Type:MIME-Version:References:Message-ID:Subject:Cc:To:From:Date;
 bh=Qyl3/uqExotiKF+0PI1Es89oxwo4tCnHaYWu6Vuhh80=; 
 b=WDD6Mg/C9iHhNgIbwJ96LxsoW9aPFk6Lr3JU9cGUR5mocNgdavgHebttNr9+x/WUA7WZX5FwOV97kQoFLQ5xk4ZiAmNfgrAd9hzDBcG56z7VVurp9OQExPNTVFjcSGr0GI50APbxkADQcTy54eFA0ZodxKZZcac6Ky2wabTT54tLFGBT0JPqY/QGHDS3L1YhWf21THRpyo5ylS6Fn+IJL1HmT8GXW8cownU1CYO1F8YwEkoxPF3WZQ+sqiSq1//UxFfBy42qFI25Kafgi2atZFoNEi8kNBfJ2zL669bCmiPOCRrs8uSL9Jc8ie3n8GXTPsZTHw87U7vHns+Vb8ex9Q==;
Received: from tomas by mail.tuxteam.de with local (Exim 4.80)
 (envelope-from <tomas@HIDDEN>)
 id 1gtUnK-0004XC-Lt; Tue, 12 Feb 2019 10:56:02 +0100
Date: Tue, 12 Feb 2019 10:56:02 +0100
From: tomas@HIDDEN
Message-ID: <20190212095602.GD13448@HIDDEN>
References: <20150415194714.GA30295@HIDDEN> <87y45vln0f.fsf@HIDDEN>
 <20160713132403.GA2349@HIDDEN> <87furc1qeu.fsf@HIDDEN>
 <87a7jbi8rx.fsf@HIDDEN>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
 protocol="application/pgp-signature"; boundary="Pk6IbRAofICFmK5e"
Content-Disposition: inline
In-Reply-To: <87a7jbi8rx.fsf@HIDDEN>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Spam-Score: 0.0 (/)
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -1.0 (-)


--Pk6IbRAofICFmK5e
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Mon, Feb 04, 2019 at 09:44:02PM +0100, Ricardo Wurmus wrote:
> Hello!
>=20
> I just looked at this again and I think I came up with something useful.
> Here=E2=80=99s some context:

[...]

> Attached is a patch that does the requested things.  The parser
> procedures like FINISH-ELEMENT have access to all the namespaces, so
> I changed the FINISH-ELEMENT procedure to return the list of namespaces
> in addition to its SXML tree return value.

It's great that you pick that up, I'm excited :-)

I have lost a bit of contact to Guile as of late. But I'm preparing
some tooling to give your patches a whirl; in the meantime a couple
of comments from the peanut gallery:

As John has noted, the namespace mappings (i.e. the prefix -> namespace
URI binding) are kind of lexically scoped (I'd call it subtree scoped,
but structurally it is the same). While parsing is "easy" (assuming
well-formed XML), serializing is not unambiguous. In a way, the library
might want to be prepared to take hints from the application (as far
as the XML is to be read by humans, there might be "better" and "worse"
serializations).

It may take me a couple of days to come up to speed.

Thanks a lot & cheers
-- t

--Pk6IbRAofICFmK5e
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: Digital signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iEUEARECAAYFAlximDIACgkQBcgs9XrR2kbicwCWNOloNf1OUTw7vsDBAlmuxDLi
egCffA4PYlxxVDtlzgdSZ4HqlUTN1o4=
=DZql
-----END PGP SIGNATURE-----

--Pk6IbRAofICFmK5e--




Message sent to bug-guile@HIDDEN:


X-Loop: help-debbugs@HIDDEN
Subject: bug#20339: sxml simple: sxml->xml mishandles namespaces?
Resent-From: Ricardo Wurmus <rekado@HIDDEN>
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-guile@HIDDEN
Resent-Date: Wed, 13 Feb 2019 00:17:01 +0000
Resent-Message-ID: <handler.20339.B20339.155001698524126 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: followup 20339
X-GNU-PR-Package: guile
X-GNU-PR-Keywords: 
To: tomas@HIDDEN
Cc: Andy Wingo <wingo@HIDDEN>, 20339 <at> debbugs.gnu.org
Received: via spool by 20339-submit <at> debbugs.gnu.org id=B20339.155001698524126
          (code B ref 20339); Wed, 13 Feb 2019 00:17:01 +0000
Received: (at 20339) by debbugs.gnu.org; 13 Feb 2019 00:16:25 +0000
Received: from localhost ([127.0.0.1]:45563 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1gtiDw-0006Go-W2
	for submit <at> debbugs.gnu.org; Tue, 12 Feb 2019 19:16:25 -0500
Received: from sender-of-o51.zoho.com ([135.84.80.216]:21146)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <rekado@HIDDEN>) id 1gtiDt-0006Dl-H3
 for 20339 <at> debbugs.gnu.org; Tue, 12 Feb 2019 19:16:24 -0500
ARC-Seal: i=1; a=rsa-sha256; t=1550003411; cv=none; d=zoho.com; s=zohoarc; 
 b=BzPTtnVsdtKj91FW2mGqniG3l3iowKk35iqDYogZPgAqd+YnojHAGc+bTfrVpmvsMBnjEwgKKDhUH4jr0i/ynsQxB8DZ/cIRPRmkaE2FPNI8A7bPQT9Y6Vw5EmtI9Ncp/Fcq93HaJZoAz8YkOjs6td3NDV0+9G5CyFynNjmimB4=
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zoho.com;
 s=zohoarc; t=1550003411;
 h=Content-Type:Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:MIME-Version:Message-ID:References:Subject:To:ARC-Authentication-Results;
 bh=rgwjLTIHDGg8TKkjlK2JEo2aDgrYMkop9+qsNXVTdFg=; 
 b=aaOxbRkEyXmyBHLQ5h0c6OIJvpCr8VWPgCjA7C8A/8abTz4dZMt3X/moVE1D/ywQzYdy9zvdjL/x2ok8hBr95i/I3fS7pLOiekTDLmSuReO0t5D5ggxpKUIbKxzW6f6vwftIstkHjXt8ZqPMiXuQs3lwXu/K44jC7h4wsqE2PIA=
ARC-Authentication-Results: i=1; mx.zoho.com; dkim=pass  header.i=elephly.net;
 spf=pass  smtp.mailfrom=rekado@HIDDEN;
 dmarc=pass header.from=<rekado@HIDDEN> header.from=<rekado@HIDDEN>
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; t=1550003411; 
 s=zoho; d=elephly.net; i=rekado@HIDDEN;
 h=References:From:To:Cc:Subject:In-reply-to:Date:Message-ID:MIME-Version:Content-Type:Content-Transfer-Encoding;
 l=1960; bh=rgwjLTIHDGg8TKkjlK2JEo2aDgrYMkop9+qsNXVTdFg=;
 b=W310mK9tZOQquXjTd4oWbgz0S99qWH3Hoeh97mk6kbSvXUzlrmsvhKeiM+JgzZK8
 MWW2BqeIq38Ti42F44MQCtBRZVqt7cyVSoYNGTZuMD6HteMtMPM13B6Z6dGv1Z13o6q
 nQek8J4Ap5p2J0tvuj0+/mXnTYYsp6l9u107/eNA=
Received: from localhost (p3E9E9E6F.dip0.t-ipconnect.de [62.158.158.111]) by
 mx.zohomail.com with SMTPS id 1550003408876634.6826516030922;
 Tue, 12 Feb 2019 12:30:08 -0800 (PST)
References: <20150415194714.GA30295@HIDDEN> <87y45vln0f.fsf@HIDDEN>
 <20160713132403.GA2349@HIDDEN> <87furc1qeu.fsf@HIDDEN>
 <87a7jbi8rx.fsf@HIDDEN> <20190212095602.GD13448@HIDDEN>
User-agent: mu4e 1.0; emacs 26.1
From: Ricardo Wurmus <rekado@HIDDEN>
In-reply-to: <20190212095602.GD13448@HIDDEN>
X-URL: https://elephly.net
X-PGP-Key: https://elephly.net/rekado.pubkey
X-PGP-Fingerprint: BCA6 89B6 3655 3801 C3C6  2150 197A 5888 235F ACAC
Date: Tue, 12 Feb 2019 21:30:04 +0100
Message-ID: <87wom4iwc3.fsf@HIDDEN>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-ZohoMailClient: External
X-Spam-Score: 0.0 (/)


tomas@HIDDEN writes:

> As John has noted, the namespace mappings (i.e. the prefix -> namespace
> URI binding) are kind of lexically scoped (I'd call it subtree scoped,
> but structurally it is the same). While parsing is "easy" (assuming
> well-formed XML), serializing is not unambiguous.

The “fup” handler of the parser visits every element and has a list of
namespaces that are in scope at this point.  Its purpose is to return
the SXML representation of that element.  At this point we can record
the namespaces as attributes.  (That’s what the patch does.)

When baking XML from SXML we don’t need to do anything special; we only
need to convert everything to text, including the recorded namespace
attributes.  This isn’t pretty SXML (nor is it pretty XML), but it
appears to be correct as none of the namespace information is lost.

To get a better serialized representation the parser needs to do a
better job of identifying “new” namespaces.

> In a way, the library might want to be prepared to take hints from the
> application (as far as the XML is to be read by humans, there might be
> "better" and "worse" serializations).

The XML produced when this patch is applied will not be pretty.  To
generate minimal/pretty XML, knowledge of the parent elements’ namespaces
is required, and the parser’s “fup” handler does not have that knowledge.

We could try to alter the parser so that it not only passes the list of
namespaces that are currently in scope, but also a list of namespaces
that are in scope for the parent node.  This would allow us to determine
the list of *new* namespaces that absolutely must be declared for the
current node.  If there are no new namespaces we can simply ignore them
and produce minimal SXML (and thus minimal XML later when the SXML is
serialized).
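
A minimal sketch of that idea, assuming namespace bindings are modelled
as alists of (prefix . uri) pairs.  The real parser uses its own internal
representation, and both helper names below are hypothetical; they are
not part of (sxml simple) or of the patch:

```scheme
(use-modules (srfi srfi-1))   ; for `remove'

;; Hypothetical helper: keep only the bindings of IN-SCOPE that the
;; parent does not already provide.  A prefix rebound to a different
;; URI counts as new, because the whole (prefix . uri) pair is compared.
(define (new-namespaces in-scope parent-in-scope)
  (remove (lambda (binding) (member binding parent-in-scope))
          in-scope))

;; Hypothetical helper: turn bindings into SXML attribute entries such
;; as (xmlns:myns "http://example.org/namespaces/myns").
(define (namespaces->attributes bindings)
  (map (lambda (binding)
         (list (string->symbol
                (string-append "xmlns:" (symbol->string (car binding))))
               (cdr binding)))
       bindings))

;; Assumed example data: the child node sees one extra namespace.
(define parent-scope '((myns . "http://example.org/namespaces/myns")))
(define child-scope  '((other . "http://example.org/namespaces/other")
                       (myns . "http://example.org/namespaces/myns")))

;; Only the binding for `other' is new on the child node.
(namespaces->attributes (new-namespaces child-scope parent-scope))
;; => ((xmlns:other "http://example.org/namespaces/other"))
```

If the list of new bindings is empty, no attributes would be added to
the node at all, which is the minimal case described above.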

--
Ricardo





Message sent to bug-guile@HIDDEN:


X-Loop: help-debbugs@HIDDEN
Subject: bug#20339: sxml simple: sxml->xml mishandles namespaces?
Resent-From: <tomas@HIDDEN>
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-guile@HIDDEN
Resent-Date: Mon, 08 Apr 2019 12:15:02 +0000
Resent-Message-ID: <handler.20339.B20339.155472565528064 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: followup 20339
X-GNU-PR-Package: guile
X-GNU-PR-Keywords: 
To: Ricardo Wurmus <rekado@HIDDEN>
Cc: John Cowan <cowan@HIDDEN>, 20339 <at> debbugs.gnu.org
Received: via spool by 20339-submit <at> debbugs.gnu.org id=B20339.155472565528064
          (code B ref 20339); Mon, 08 Apr 2019 12:15:02 +0000
Received: (at 20339) by debbugs.gnu.org; 8 Apr 2019 12:14:15 +0000
Received: from localhost ([127.0.0.1]:49075 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1hDTAE-0007IZ-GN
	for submit <at> debbugs.gnu.org; Mon, 08 Apr 2019 08:14:14 -0400
Received: from mail.tuxteam.de ([5.199.139.25]:47211)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <tomas@HIDDEN>) id 1hDTAB-0007IL-3J
 for 20339 <at> debbugs.gnu.org; Mon, 08 Apr 2019 08:14:13 -0400
Received: from tomas by mail.tuxteam.de with local (Exim 4.80)
 (envelope-from <tomas@HIDDEN>)
 id 1hDTA3-0000Vy-WB; Mon, 08 Apr 2019 14:14:04 +0200
Date: Mon, 8 Apr 2019 14:14:03 +0200
Message-ID: <20190408121403.GA781@HIDDEN>
References: <20150415194714.GA30295@HIDDEN> <87y45vln0f.fsf@HIDDEN>
 <20160713132403.GA2349@HIDDEN> <87furc1qeu.fsf@HIDDEN>
 <87a7jbi8rx.fsf@HIDDEN>
 <CAD2gp_ScjmURZ7yTFronxyR9r4P4P2L91mXNHguXpZG86chdVA@HIDDEN>
 <874l9iiopl.fsf@HIDDEN> <87r2cmgzq0.fsf@HIDDEN>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
 protocol="application/pgp-signature"; boundary="yrj/dFKFPuw6o+aM"
Content-Disposition: inline
In-Reply-To: <87r2cmgzq0.fsf@HIDDEN>
User-Agent: Mutt/1.5.21 (2010-09-15)
From: <tomas@HIDDEN>
X-Spam-Score: 0.0 (/)


--yrj/dFKFPuw6o+aM
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Tue, Feb 05, 2019 at 01:57:11PM +0100, Ricardo Wurmus wrote:
>
> Ricardo Wurmus <rekado@HIDDEN> writes:
>
> > In that case we could have FINISH-ELEMENT add all namespace declarations
> > that are in scope to the current node that is about to be returned.  It
> > would be a little verbose, but more correct.
>
> Like this:

Thanks again for your patch, and sorry for my glacial pace.

I finally got around to testing it (against Guile 2.2.4, commit
791cae940afcb2b2eb2c167fe438be1dc1008a73).

TL;DR:

 - The default namespace is still a problem (see below)
 - It would be nice to inhibit the down-inheritance of
   namespace declarations at xml->sxml time. Then the
   sxml representation would closely mimic the XML, which
   has an obvious advantage: it would give the user much
   more control over the generated XML.

I'd be willing to prepare a patch along these lines, but
for that, I'd like to get an idea of which direction we
want to take this whole thing in.

To see what's going on, I tried with a small XML example:

First with explicit (aka non-default) namespace:

  #+NAME: minimal-explicit
  #+BEGIN_EXAMPLE
  <?xml version="1.0"?>
  <myns:root xmlns:myns="http://example.org/namespaces/myns">
    <myns:subnode/>
  </myns:root>
  #+END_EXAMPLE

Before your patch:

  #+NAME: minimal-explicit-before
  #+BEGIN_SRC scheme :results output verbatim :var the-xml=minimal-explicit
  (use-modules (sxml simple))
  (use-modules (ice-9 pretty-print))
  (pretty-print (xml->sxml the-xml))
  #+END_SRC

  #+RESULTS: minimal-explicit-before
  : <stdin>:12:0: warning: possibly unbound variable `pretty-print'
  : <stdin>:12:14: warning: possibly unbound variable `xml->sxml'
  : (*TOP* (*PI* xml "version=\"1.0\"")
  :        (http://example.org/namespaces/myns:root
  :          "\n  "
  :          (http://example.org/namespaces/myns:subnode)
  :          "\n"))

As we know, this replaces the namespace prefixes with the namespace URIs.
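
For comparison, here is a sketch of the documented #:namespaces keyword
of xml->sxml, which lets the caller choose the prefix used in the SXML
tags instead of the full URI. The XML string is the example above without
the XML declaration; the variable names are only for illustration:

```scheme
(use-modules (sxml simple))

;; The example document from above, minus the XML declaration.
(define the-xml
  "<myns:root xmlns:myns=\"http://example.org/namespaces/myns\">
  <myns:subnode/>
</myns:root>")

;; Map the caller-chosen prefix `myns' to the namespace URI.  Tags from
;; that namespace should then be qualified as myns:... rather than
;; with the URI.
(define the-sxml
  (xml->sxml the-xml
             #:namespaces
             '((myns . "http://example.org/namespaces/myns"))))
```

This only affects parsing: the resulting SXML carries no xmlns attribute,
so it does not by itself tell a serializer which namespace the prefix
stands for.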

After your patch:

  #+NAME: minimal-explicit-after
  #+BEGIN_SRC scheme :results output verbatim :var the-xml=minimal-explicit
  (set! %load-path (cons "." %load-path))
  (use-modules (sxml simple))
  (use-modules (ice-9 pretty-print))
  (pretty-print (xml->sxml the-xml))
  #+END_SRC

  #+RESULTS: minimal-explicit-after
  #+begin_example
  <stdin>:13:0: warning: possibly unbound variable `pretty-print'
  <stdin>:13:14: warning: possibly unbound variable `xml->sxml'
  ;;; note: source file ./sxml/simple.scm
  ;;;       newer than compiled /usr/local/lib/guile/2.2/ccache/sxml/simple.go
  ;;; found fresh local cache at /home/tomas/.cache/guile/ccache/2.2-LE-8-3.A/home/tomas/guile/sxml-fix/sxml/simple.scm.go
  (*TOP* (*PI* xml "version=\"1.0\"")
         (myns:root
           (@ (xmlns:myns "http://example.org/namespaces/myns"))
           "\n  "
           (myns:subnode
             (@ (xmlns:myns "http://example.org/namespaces/myns")))
           "\n"))
  #+end_example

(I've put sxml/simple.scm in the current directory, hence the manipulation
of %load-path.) This mimics the XML more closely, using namespace prefixes
instead of namespace URIs in the sxml. This is compelling :-)

The only difference from the XML is that the namespace declaration is
propagated down to lower-level nodes (which is why sxml->xml emits it on
them, too).

This works, with the above downside, which you noted too.

It doesn't work with a default namespace, though:

  #+NAME: minimal-implicit
  #+BEGIN_EXAMPLE
  <?xml version="1.0"?>
  <root xmlns="http://example.org/namespaces/myns">
    <subnode/>
  </root>
  #+END_EXAMPLE

With your patch:

  #+NAME: minimal-implicit-after
  #+BEGIN_SRC scheme :results output verbatim :var the-xml=minimal-implicit
  (set! %load-path (cons "." %load-path))
  (use-modules (sxml simple))
  (use-modules (ice-9 pretty-print))
  (pretty-print (xml->sxml the-xml))
  #+END_SRC

  #+RESULTS: minimal-implicit-after
  : <stdin>:13:0: warning: possibly unbound variable `pretty-print'
  : <stdin>:13:14: warning: possibly unbound variable `xml->sxml'
  : ;;; note: source file ./sxml/simple.scm
  : ;;;       newer than compiled /usr/local/lib/guile/2.2/ccache/sxml/simple.go
  : ;;; found fresh local cache at /home/tomas/.cache/guile/ccache/2.2-LE-8-3.A/home/tomas/guile/sxml-fix/sxml/simple.scm.go
  : (*TOP* (*PI* xml "version=\"1.0\"")
  :        (*DEFAULT*:root "\n  " (*DEFAULT*:subnode) "\n"))

Note that the namespace declaration for *DEFAULT* is missing,
so we lost that bit of information. Besides, this is not
serializable:

  #+NAME: reserialize-implicit
  #+BEGIN_SRC scheme :results output verbatim
  (set! %load-path (cons "." %load-path))
  (use-modules (sxml simple))
  (define the-sxml
    '(*TOP* (*PI* xml "version=\"1.0\"")
       (*DEFAULT*:root "\n  " (*DEFAULT*:subnode) "\n")))
  (sxml->xml the-sxml)
  #+END_SRC

It catches the bad (xml) name starting with a star:

  #+RESULTS: reserialize-implicit
  : ERROR: In procedure scm-error:
  : Invalid name starting character "*DEFAULT*" *DEFAULT*:root
  :
  : Entering a new prompt.  Type `,bt' for a backtrace or `,q' to continue.
  : scheme@(guile-user) [1]>
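
As a stopgap, and only as a sketch: since the *DEFAULT*: prefix is what
sxml->xml rejects, the tags could be rewritten to bare local names before
serializing, with the default namespace re-declared by hand on the root
element. strip-default-prefix is a hypothetical helper, not part of
(sxml simple), and the namespace URI has to come from the caller because
the SXML above no longer contains it:

```scheme
(use-modules (sxml simple))

(define default-prefix "*DEFAULT*:")

;; Hypothetical helper: walk the whole tree and drop the "*DEFAULT*:"
;; prefix from every symbol that carries it.  Strings and other atoms
;; are returned unchanged.
(define (strip-default-prefix tree)
  (cond
   ((symbol? tree)
    (let ((name (symbol->string tree)))
      (if (string-prefix? default-prefix name)
          (string->symbol (substring name (string-length default-prefix)))
          tree)))
   ((pair? tree)
    (cons (strip-default-prefix (car tree))
          (strip-default-prefix (cdr tree))))
   (else tree)))

(define stripped
  (strip-default-prefix
   '(*DEFAULT*:root "\n  " (*DEFAULT*:subnode) "\n")))
;; stripped is now (root "\n  " (subnode) "\n")

;; Re-declare the default namespace by hand on the root element.
;; The URI is an assumption supplied by the caller.
(define serializable
  `(,(car stripped)
    (@ (xmlns "http://example.org/namespaces/myns"))
    ,@(cdr stripped)))

(sxml->xml serializable)
```

This loses any distinction between default-namespace tags and tags in no
namespace, so it is only reasonable when the whole document lives in one
default namespace, as in the example.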

Cheers
-- tomás

--yrj/dFKFPuw6o+aM
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: Digital signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iEYEARECAAYFAlyrOwsACgkQBcgs9XrR2kabQgCeIvJGAfCZb5KnVNe7M7VFapAY
l9kAn110JNoUb3XRLxV8nCAk4ihppgsF
=bnBc
-----END PGP SIGNATURE-----

--yrj/dFKFPuw6o+aM--




Message sent to bug-guile@HIDDEN:


X-Loop: help-debbugs@HIDDEN
Subject: bug#20339: Taking a step back (was: sxml simple: sxml->xml mishandles namespaces?)
Resent-From: tomas@HIDDEN
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-guile@HIDDEN
Resent-Date: Fri, 03 May 2019 10:47:02 +0000
Resent-Message-ID: <handler.20339.B20339.15568803991733 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: followup 20339
X-GNU-PR-Package: guile
X-GNU-PR-Keywords: 
To: Ricardo Wurmus <rekado@HIDDEN>
Cc: Andy Wingo <wingo@HIDDEN>, 20339 <at> debbugs.gnu.org
Received: via spool by 20339-submit <at> debbugs.gnu.org id=B20339.15568803991733
          (code B ref 20339); Fri, 03 May 2019 10:47:02 +0000
Received: (at 20339) by debbugs.gnu.org; 3 May 2019 10:46:39 +0000
Received: from localhost ([127.0.0.1]:47827 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1hMViA-0000Rr-JF
	for submit <at> debbugs.gnu.org; Fri, 03 May 2019 06:46:39 -0400
Received: from mail.tuxteam.de ([5.199.139.25]:36828)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <tomas@HIDDEN>) id 1hMVi5-0000Rf-PQ
 for 20339 <at> debbugs.gnu.org; Fri, 03 May 2019 06:46:36 -0400
Received: from tomas by mail.tuxteam.de with local (Exim 4.80)
 (envelope-from <tomas@HIDDEN>)
 id 1hMVhz-0001i0-Ci; Fri, 03 May 2019 12:46:27 +0200
Date: Fri, 3 May 2019 12:46:27 +0200
From: tomas@HIDDEN
Message-ID: <20190503104627.GE31083@HIDDEN>
References: <20150415194714.GA30295@HIDDEN> <87y45vln0f.fsf@HIDDEN>
 <20160713132403.GA2349@HIDDEN> <87furc1qeu.fsf@HIDDEN>
 <87a7jbi8rx.fsf@HIDDEN> <20190212095602.GD13448@HIDDEN>
 <87wom4iwc3.fsf@HIDDEN>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
 protocol="application/pgp-signature"; boundary="imjhCm/Pyz7Rq5F2"
Content-Disposition: inline
In-Reply-To: <87wom4iwc3.fsf@HIDDEN>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Spam-Score: 0.0 (/)


--imjhCm/Pyz7Rq5F2
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

Hi,

after mulling it over for a while, I think it's time to take
a step back and think a bit about where we'd like to go with
this.

Note that I'm ignoring technical details here: the fact that the
SXML, and thus the XML serialization, now has namespace declarations
everywhere down the path instead of just at the corresponding root
node, and the problem with default namespaces noted in [1]. Both
seem to me "fixable".

Your patch, Ricardo, takes a new approach wrt. the SXML resulting
from an XML parse: the full tag names (the QNAMEs, in XML parlance)
are now composed of <prefix>:<name> (mimicking the XML) instead
of <namespace uri>:<name>, as (sxml simple) has done up to now.
This has upsides and downsides.

I'll call your approach the "prefix" approach (since it uses the
prefixes to qualify the tag names) and the approach followed
by (sxml simple) up to now the "URI" approach, which has
the full namespace URI qualifying the name.

In the URI approach, a qualified tag name would look like

  "http://example.org/namespaces/myns:node"

whereas in the prefix approach, it'd look like

  "myns:node"

plus the knowledge somewhere that the prefix "myns" stands for

  myns -> http://example.org/namespaces/myns

Upsides of the prefix approach:

 + it mimics more closely the XML syntax. Since that
   is what the XML folks see, that follows the "principle
   of least astonishment" (aka POLA)
 + it is forced to keep the prefix -> namespace associations
   (it would be semantically incomplete if not, since what
   counts semantically is the namespace URI)

Downsides

 - it contradicts current documentation:
   "All namespaces in the XML document must be declared, via
    xmlns attributes. SXML elements built from non-default
    namespaces will have their tags prefixed with their
    URI. Users can specify custom prefixes for certain
    namespaces with the #:namespaces keyword argument to
    xml->sxml." [2]

   This can be changed, of course :-)
   But perhaps someone is already relying on it?

 - working on the resulting SXML becomes harder, because
   to compare two qualified names, we'd have to resolve
   the namespace associations.

Upsides of the URI approach

 + it is what the documentation says
 + it follows more closely the XML semantics (the namespace
   prefix in itself is irrelevant after all). As a corollary,
   working on the SXML becomes easier: a comparison of two
   qualified names becomes a simple string comparison, etc.

   I think that is why (sxml simple)'s original design followed
   this path.

Downsides

   Well, negate the "prefix approach" upsides :-)
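
To make the comparison point concrete, here is a small sketch, assuming
the prefix bindings are available as an alist of (prefix . uri) pairs.
Both helpers are hypothetical and not part of (sxml simple):

```scheme
;; Hypothetical helper: expand a prefixed tag such as myns:node into
;; its URI-qualified form, using BINDINGS, an alist of (prefix . uri).
;; Tags without a colon are returned unchanged.
(define (expand-qname qname bindings)
  (let* ((name (symbol->string qname))
         (colon (string-index name #\:)))
    (if (not colon)
        qname
        (let ((uri (assq-ref bindings
                             (string->symbol (substring name 0 colon)))))
          (unless uri
            (error "unbound namespace prefix in" qname))
          (string->symbol
           (string-append uri (substring name colon)))))))

;; Under the prefix approach, two names are equal only after both have
;; been resolved against the bindings in scope where each one occurs.
(define (qname=? a a-bindings b b-bindings)
  (eq? (expand-qname a a-bindings)
       (expand-qname b b-bindings)))

;; Assumed example: two documents bind different prefixes to one URI.
(define doc1 '((myns . "http://example.org/namespaces/myns")))
(define doc2 '((m . "http://example.org/namespaces/myns")))

(eq? 'myns:node 'm:node)               ; => #f, the prefixes differ
(qname=? 'myns:node doc1 'm:node doc2) ; => #t, same namespace URI
```

Under the URI approach the tags are already in expanded form, so the
plain eq? on the two symbols is all that is needed.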

Let me just say that there seem to be precedents for the
prefix approach out there on the 'net: the Wikipedia
article [3] (yes, there's a Wikipedia article on that!) follows
the prefix approach. This nice blog post [4] does too.

I think I'll stop here. My fingers are itching to do some hacking,
but I think we should pause and ponder before hacking.

Perhaps we should take this to guile-devel? OTOH, if someone
knows The Way Forward (TM), I'm willing to hack in this
direction.

Cheers & thanks

[1] Message ID <20190408121403.GA781@HIDDEN>
    http://lists.gnu.org/archive/html/bug-guile/2019-04/msg00001.html
[2] https://www.gnu.org/software/guile/manual/guile.html#SXML
[3] https://en.wikipedia.org/wiki/SXML
[4] https://www.more-magic.net/posts/lispy-dsl-sxml.html

-- tomás

--imjhCm/Pyz7Rq5F2
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: Digital signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iEYEARECAAYFAlzMHAMACgkQBcgs9XrR2karLACdFBBbZnzvLF3kxFuyGiO1LdFl
7a8An3REZ122yhfCev5iLBMuQTKWSwMH
=m2+q
-----END PGP SIGNATURE-----

--imjhCm/Pyz7Rq5F2--





Last modified: Mon, 25 Nov 2019 12:00:02 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.