GNU bug report logs - #7172
emacs 23.2; xml.el: xml-parse-file hangs when DOCTYPE element names contain _ (underscore)

Package: emacs

Reported by: Jose Marino <marinoj <at> astro.ufl.edu>

Date: Thu, 7 Oct 2010 17:05:02 UTC

Severity: normal

Done: Chong Yidong <cyd <at> gnu.org>

Bug is archived. No further changes may be made.


Report forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#7172; Package emacs. (Thu, 07 Oct 2010 17:05:02 GMT)

Acknowledgement sent to Jose Marino <marinoj <at> astro.ufl.edu>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Thu, 07 Oct 2010 17:05:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Jose Marino <marinoj <at> astro.ufl.edu>
To: bug-gnu-emacs <at> gnu.org
Subject: emacs 23.2; xml.el: xml-parse-file hangs when DOCTYPE element names contain _ (underscore)
Date: Thu, 07 Oct 2010 11:07:52 -0600

In a DOCTYPE declaration, whenever an ELEMENT name contains an
underscore, the function xml-parse-file makes Emacs unresponsive
and use 100% CPU. Emacs recovers nicely with C-g, but no
error is printed.

To reproduce this behavior, I set up these two simple XML files:

------------ output --------------
$ cat example-good.xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE EXAMPLE [
   <!ELEMENT EXAMPLE EMPTY>
]>
<EXAMPLE>
</EXAMPLE>

$ cat example-bad.xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE EXAMPLE [
   <!ELEMENT EXAM_PLE EMPTY>
]>
<EXAM_PLE>
</EXAM_PLE>
------------ output --------------

Then, from Emacs, I run:
(xml-parse-file "example-good.xml")
which, as expected, produces:
((EXAMPLE nil "
"))

But when I do the same for the other file:
(xml-parse-file "example-bad.xml")
no output is produced and Emacs becomes unresponsive.

Attaching strace to the running emacs process prints:
brk(0x267b000)                          = 0x267b000
brk(0x269d000)                          = 0x269d000
brk(0x2637000)                          = 0x2637000
brk(0x2659000)                          = 0x2659000
brk(0x267b000)                          = 0x267b000
brk(0x269d000)                          = 0x269d000
brk(0x2637000)                          = 0x2637000
brk(0x2659000)                          = 0x2659000
brk(0x267b000)                          = 0x267b000
brk(0x269d000)                          = 0x269d000
brk(0x2637000)                          = 0x2637000
brk(0x2659000)                          = 0x2659000

These messages repeat over and over.

I should mention that this behavior seems to be triggered by the 
underscore in the DOCTYPE ELEMENT name, and is not affected by the 
underscore in the actual element's name. Thus, this file also triggers 
the bug:
$ cat example-bad2.xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE EXAMPLE [
   <!ELEMENT EXAM_PLE EMPTY>
]>
<EXAMPLE>
</EXAMPLE>
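
For a reproduction that does not depend on files on disk, the same input can be parsed from a temporary buffer. This is only a sketch under assumptions: it assumes xml.el can be loaded as the `xml` feature, it calls `xml-parse-region` directly with just the region bounds, and the variable name `my-bad-xml` is hypothetical and not part of the report.

```elisp
(require 'xml)

;; Hypothetical name; the string is the content of example-bad2.xml above.
(defvar my-bad-xml
  "<?xml version=\"1.0\" encoding=\"utf-8\"?>
<!DOCTYPE EXAMPLE [
   <!ELEMENT EXAM_PLE EMPTY>
]>
<EXAMPLE>
</EXAMPLE>
")

;; On an affected Emacs (23.2 in this report) this call is expected to
;; hang until interrupted with C-g.
(with-temp-buffer
  (insert my-bad-xml)
  (xml-parse-region (point-min) (point-max)))
```

Replacing EXAM_PLE with EXAMPLE in the ELEMENT declaration should give the non-hanging case corresponding to example-good.xml.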




Information forwarded to owner <at> debbugs.gnu.org, bug-gnu-emacs <at> gnu.org:
bug#7172; Package emacs. (Fri, 08 Oct 2010 00:37:02 GMT)

Message #8 received at 7172 <at> debbugs.gnu.org:

From: Glenn Morris <rgm <at> gnu.org>
To: Jose Marino <marinoj <at> astro.ufl.edu>
Cc: 7172 <at> debbugs.gnu.org
Subject: Re: bug#7172: emacs 23.2; xml.el: xml-parse-file hangs when DOCTYPE element names contain _ (underscore)
Date: Thu, 07 Oct 2010 20:39:54 -0400

Jose Marino wrote:

> Attaching strace to the running emacs process prints:
> brk(0x267b000)                          = 0x267b000

A much more useful thing to do in such cases is to
M-x toggle-debug-on-quit
beforehand, then interrupt Emacs with C-g when it hangs. Resulting backtrace:

Debugger entered--Lisp error: (quit)
  looking-at("<!ATTLIST[ 	\n
]*\\([[:alpha:]:_][-[:digit:].[:alpha:]:_]*\\)[ 	\n
]*\\(\\(?:[ 	\n
]*[[:alpha:]:_][-[:digit:].[:alpha:]:_]*[ 	\n
]*\\(?:CDATA\\|\\(?:ID\\|IDREF\\|IDREFS\\|ENTITY\\|ENTITIES\\|NMTOKEN\\|NMTOKENS\\)\\|\\(?:NOTATION[ 	\n
]([ 	\n
]*[[:alpha:]:_][-[:digit:].[:alpha:]:_]*\\(?:[ 	\n
]*|[ 	\n
]*[[:alpha:]:_][-[:digit:].[:alpha:]:_]*\\)*[ 	\n
]*)\\)\\|\\(?:\\(?:NOTATION[ 	\n
]([ 	\n
]*[[:alpha:]:_][-[:digit:].[:alpha:]:_]*\\(?:[ 	\n
]*|[ 	\n
]*[[:alpha:]:_][-[:digit:].[:alpha:]:_]*\\)*[ 	\n
]*)\\)\\|\\(?:([ 	\n
]*[-[:digit:].[:alpha:]:_]+\\(?:[ 	\n
]*|[ 	\n
]*[-[:digit:].[:alpha:]:_]+\\)*[ 	\n
])\\)\\)\\)[ 	\n
]*\\(?:#REQUIRED\\|#IMPLIED\\|\\(?:#FIXED[ 	\n
]\\)*\\(?:\"\\(?:[^&\"]\\|\\(?:&[[:alpha:]:_][-[:digit:].[:alpha:]:_]*;\\|\\(?:&#[0-9]+;\\|&#x[0-9a-fA-F]+;\\)\\)\\)*\"\\|'\\(?:[^&']\\|\\(?:&[[:alpha:]:_][-[:digit:].[:alpha:]:_]*;\\|\\(?:&#[0-9]+;\\|&#x[0-9a-fA-F]+;\\)\\)\\)*'\\)\\)\\)\\)*[ 	\n
]*>")
  xml-parse-dtd(nil)
  xml-parse-tag(nil nil)
  xml-parse-tag(nil nil)
  xml-parse-region(1 116 #<buffer  *temp*> nil nil)
  xml-parse-file("example-bad.xml")


That certainly is a regexp.
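
The debugging step described above can also be driven from Lisp rather than with M-x. The following is a minimal sketch, assuming example-bad.xml sits in the current `default-directory`; it sets the variable `debug-on-quit`, which is what `toggle-debug-on-quit` toggles.

```elisp
(require 'xml)

;; Same effect as turning the option on with M-x toggle-debug-on-quit:
;; C-g now enters the debugger instead of silently quitting.
(setq debug-on-quit t)

;; Path assumed relative to `default-directory'.  On an affected Emacs
;; this call hangs; pressing C-g should then show a *Backtrace* buffer
;; like the one quoted above.
(xml-parse-file "example-bad.xml")
```

Once the backtrace has been captured, evaluating (setq debug-on-quit nil) restores the default, so that C-g quits without entering the debugger.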




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#7172; Package emacs. (Sun, 01 Jul 2012 11:05:02 GMT)

Message #11 received at 7172 <at> debbugs.gnu.org:

From: Chong Yidong <cyd <at> gnu.org>
To: Jose Marino <marinoj <at> astro.ufl.edu>
Cc: 7172 <at> debbugs.gnu.org
Subject: Re: bug#7172: emacs 23.2; xml.el: xml-parse-file hangs when DOCTYPE element names contain _ (underscore)
Date: Sun, 01 Jul 2012 18:59:32 +0800

Jose Marino <marinoj <at> astro.ufl.edu> writes:

> In a DOCTYPE declaration, whenever an ELEMENT name contains an
> underscore, the function xml-parse-file makes Emacs unresponsive
> and use 100% CPU. Emacs recovers nicely with C-g, but no
> error is printed.
>
> $ cat example-bad.xml
> <?xml version="1.0" encoding="utf-8"?>
> <!DOCTYPE EXAMPLE [
>    <!ELEMENT EXAM_PLE EMPTY>
> ]>
> <EXAM_PLE>
> </EXAM_PLE>

This is fixed in trunk.  Thanks for the bug report.




bug closed, send any further explanations to 7172 <at> debbugs.gnu.org and Jose Marino <marinoj <at> astro.ufl.edu>. Request was from Chong Yidong <cyd <at> gnu.org> to control <at> debbugs.gnu.org. (Sun, 01 Jul 2012 11:05:03 GMT)

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#7172; Package emacs. (Sun, 01 Jul 2012 11:06:01 GMT)

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 29 Jul 2012 11:24:04 GMT)
