GNU bug report logs - #36233
26.2; Tokenization error in rcirc parser

Previous Next

Package: emacs;

Reported by: "Jeff Johnson" <trnsz <at> pobox.com>

Date: Sat, 15 Jun 2019 22:09:01 UTC

Severity: minor

Tags: fixed, patch

Found in version 26.2

Fixed in version 27.1

Done: Noam Postavsky <npostavs <at> gmail.com>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 36233 in the body.
You can then email your comments to 36233 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#36233; Package emacs. (Sat, 15 Jun 2019 22:09:01 GMT) Full text and rfc822 format available.

Acknowledgement sent to "Jeff Johnson" <trnsz <at> pobox.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Sat, 15 Jun 2019 22:09:01 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: "Jeff Johnson" <trnsz <at> pobox.com>
To: bug-gnu-emacs <at> gnu.org
Cc: jhj <at> trnsz.com
Subject: 26.2; Tokenization error in rcirc parser
Date: Sat, 15 Jun 2019 18:01:38 -0400
I'm reporting a bug that has to do with unused IRC
protocol colon escaping combined with the use of
certain unusual but not disallowed characters in 
channel keys.

It should be noted that certain IRC servers that
always colon escape the last token, (pedantic, but
certainly valid per the IRC RFC's) often triggers a class
of bugs, to most of which rcirc is immune, except when
combing the use of ':' in a channel key, which leads the
to the channel key being stripped of the ':' but not
final argument.

Example, if the server sends:
"MODE #cchan +kl a:b :999" 

That is somehow parsed by rcirc as:
"MODE +kl a b :999"

Clearly incorrect - it should be "MODE +kl a:b 999".

The colon *is* allowed as part of the channel key (as it
is not disallowed in the RFC), except, of course, as the
first character of the key, where it would break protocol.

ircrc client does parse the case of "MODE #chan +k :a:b" 
correctly, yet fails for "MODE #chan +kl a:b :999".

An example of an ircd that always escapes the final token 
and does not disallow colons in channel keys is inspircd, 
but there are many others.  inspircd exactly follows the
RFC - pedantically so - but does not conform to historical
behavior - only standards compliance. 

Inspircd has a testnet at testnet.inspircd.org which can be    
easily used to reproduce the behavior reported.

I've tested that this bugs exists and is reproducible on both
Emacs 26.2 and 25.1 with the included rcirc.

--  
Jeff Johnson
trnsz <at> pobox.com




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#36233; Package emacs. (Sun, 16 Jun 2019 20:11:02 GMT) Full text and rfc822 format available.

Message #8 received at 36233 <at> debbugs.gnu.org (full text, mbox):

From: Noam Postavsky <npostavs <at> gmail.com>
To: "Jeff Johnson" <trnsz <at> pobox.com>
Cc: jhj <at> trnsz.com, 36233 <at> debbugs.gnu.org
Subject: Re: bug#36233: 26.2; Tokenization error in rcirc parser
Date: Sun, 16 Jun 2019 16:10:08 -0400
[Message part 1 (text/plain, inline)]
severity 36233 minor
tags 36233 + patch
quit

"Jeff Johnson" <trnsz <at> pobox.com> writes:

> Example, if the server sends:
> "MODE #cchan +kl a:b :999" 
>
> That is somehow parsed by rcirc as:
> "MODE +kl a b :999"

Yeah, it was using [^:]* to match the middle args.  Patch below.  I
added a test for this case in the patch, although this could probably
use some more testing to make sure I haven't broken other cases.

[0001-Make-rcirc-parsing-more-RFC2812-compliant-Bug-36233.patch (text/x-diff, inline)]
From 9b0cb9e737e3c68f6553f1995983f524bdd92453 Mon Sep 17 00:00:00 2001
From: Noam Postavsky <npostavs <at> gmail.com>
Date: Sun, 16 Jun 2019 13:48:56 -0400
Subject: [PATCH] Make rcirc parsing more RFC2812 compliant (Bug#36233)

Do continue to allow multiple spaces between arguments, even though
that is technically not allowed by the RFC.
* lisp/net/rcirc.el (rcirc-process-server-response-1): Fix parsing of
arguments which contain colons.
* test/lisp/net/rcirc-tests.el: New test.
---
 lisp/net/rcirc.el            | 25 ++++++++++++++-------
 test/lisp/net/rcirc-tests.el | 52 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 69 insertions(+), 8 deletions(-)
 create mode 100644 test/lisp/net/rcirc-tests.el

diff --git a/lisp/net/rcirc.el b/lisp/net/rcirc.el
index 8926772b94..d81fa939b9 100644
--- a/lisp/net/rcirc.el
+++ b/lisp/net/rcirc.el
@@ -774,22 +774,31 @@ (defun rcirc-process-server-response (process text)
     (rcirc-process-server-response-1 process text)))
 
 (defun rcirc-process-server-response-1 (process text)
-  (if (string-match "^\\(:\\([^ ]+\\) \\)?\\([^ ]+\\) \\(.+\\)$" text)
+  ;; See https://tools.ietf.org/html/rfc2812#section-2.3.1.  We accept
+  ;; multiple spaces between args, even though the RFC doesn't allow
+  ;; that.
+  (if (string-match "^\\(:\\([^ ]+\\) \\)?\\([^ ]+\\)" text)
       (let* ((user (match-string 2 text))
 	     (sender (rcirc-user-nick user))
              (cmd (match-string 3 text))
-             (args (match-string 4 text))
+             (cmd-end (match-end 3))
+             (args nil)
              (handler (intern-soft (concat "rcirc-handler-" cmd))))
-        (string-match "^\\([^:]*\\):?\\(.+\\)?$" args)
-        (let* ((args1 (match-string 1 args))
-               (args2 (match-string 2 args))
-               (args (delq nil (append (split-string args1 " " t)
-				       (list args2)))))
+        (cl-loop with i = cmd-end
+                 repeat 14
+                 while (eql i (string-match " +\\([^: ][^ ]*\\)" text i))
+                 do (progn (push (match-string 1 text) args)
+                           (setq i (match-end 0)))
+                 finally
+                 (progn (if (eql i (string-match " +:?" text i))
+                            (push (substring text (match-end 0)) args)
+                          (cl-assert (= i (length text))))
+                        (cl-callf nreverse args)))
         (if (not (fboundp handler))
             (rcirc-handler-generic process cmd sender args text)
           (funcall handler process sender args text))
         (run-hook-with-args 'rcirc-receive-message-functions
-                            process cmd sender args text)))
+                            process cmd sender args text))
     (message "UNHANDLED: %s" text)))
 
 (defvar rcirc-responses-no-activity '("305" "306")
diff --git a/test/lisp/net/rcirc-tests.el b/test/lisp/net/rcirc-tests.el
new file mode 100644
index 0000000000..128cb2e754
--- /dev/null
+++ b/test/lisp/net/rcirc-tests.el
@@ -0,0 +1,52 @@
+;;; rcirc-tests.el --- Tests for rcirc -*- lexical-binding:t -*-
+
+;; Copyright (C) 2019 Free Software Foundation, Inc.
+
+;; This program is free software: you can redistribute it and/or
+;; modify it under the terms of the GNU General Public License as
+;; published by the Free Software Foundation, either version 3 of the
+;; License, or (at your option) any later version.
+;;
+;; This program is distributed in the hope that it will be useful, but
+;; WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+;; General Public License for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with this program.  If not, see `https://www.gnu.org/licenses/'.
+
+;;; Code:
+
+(require 'ert)
+(require 'rcirc)
+(require 'cl-lib)
+
+(defun rcirc-tests--parse-server-response (cmd text)
+  (cl-letf* ((received-args nil)
+             ((symbol-function (intern (concat "rcirc-handler-" cmd)))
+              (lambda (_process sender args text)
+                (setq received-args (list sender cmd args text))))
+             (rcirc-receive-message-functions nil)
+             (rcirc-trap-errors-flag nil))
+    (rcirc-process-server-response nil text)
+    received-args))
+
+(defmacro rcirc-tests--server-response-parse-should-be
+    (text expected-sender expected-cmd expected-args)
+  (declare (debug t))
+  (macroexp-let2* nil ((cmd expected-cmd))
+    `(should (equal (rcirc-tests--parse-server-response ,cmd ,text)
+                    (list ,expected-sender ,cmd ,expected-args ,text)))))
+
+(ert-deftest rcirc-process-server-response ()
+  (rcirc-tests--server-response-parse-should-be
+   "MODE #cchan +kl a:b :999"
+   nil "MODE" '("#cchan" "+kl" "a:b" "999"))
+  (rcirc-tests--server-response-parse-should-be
+   "MODE #cchan +kl a:b 999"
+   nil "MODE" '("#cchan" "+kl" "a:b" "999"))
+  (rcirc-tests--server-response-parse-should-be
+    "MODE #cchan +kl :a:b"
+    nil "MODE" '("#cchan" "+kl" "a:b")))
+
+;;; rcirc-tests.el ends here
-- 
2.11.0


Severity set to 'minor' from 'normal' Request was from Noam Postavsky <npostavs <at> gmail.com> to control <at> debbugs.gnu.org. (Sun, 16 Jun 2019 20:11:04 GMT) Full text and rfc822 format available.

Added tag(s) patch. Request was from Noam Postavsky <npostavs <at> gmail.com> to control <at> debbugs.gnu.org. (Sun, 16 Jun 2019 20:11:04 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#36233; Package emacs. (Mon, 17 Jun 2019 02:16:01 GMT) Full text and rfc822 format available.

Message #15 received at 36233 <at> debbugs.gnu.org (full text, mbox):

From: trnsz <at> pobox.com
To: Noam Postavsky <npostavs <at> gmail.com>
Cc: 36233 <at> debbugs.gnu.org
Subject: Re: bug#36233: 26.2; Tokenization error in rcirc parser
Date: Sun, 16 Jun 2019 21:53:59 -0400
[Message part 1 (text/plain, inline)]
On Jun 16, 2019, 4:10 PM -0400, Noam Postavsky , wrote:
> Yeah, it was using [^:]* to match the middle args. Patch below. I added a test for this case in the patch, although this could probably use some more testing to make sure I haven't broken other cases.

I can confirm this fix works, but I've not done extensive testing (I'm not normally an rcirc user).

Thanks for addressing this so quickly and I look forward to seeing the fix in a future release.

--
Jeffrey H. Johnson
jhj <at> trnsz.com
[Message part 2 (text/html, inline)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#36233; Package emacs. (Sat, 22 Jun 2019 23:31:02 GMT) Full text and rfc822 format available.

Message #18 received at 36233 <at> debbugs.gnu.org (full text, mbox):

From: Noam Postavsky <npostavs <at> gmail.com>
To: trnsz <at> pobox.com
Cc: 36233 <at> debbugs.gnu.org
Subject: Re: bug#36233: 26.2; Tokenization error in rcirc parser
Date: Sat, 22 Jun 2019 19:30:13 -0400
tags 36233 fixed
close 36233 27.1
quit

trnsz <at> pobox.com writes:

> On Jun 16, 2019, 4:10 PM -0400, Noam Postavsky , wrote:

> I can confirm this fix works, but I've not done extensive testing (I'm not normally an rcirc user).

Okay, I've run with it for a week and nothing bad happened (though I'm
not a very advanced IRC user, I just hang out in #emacs on freenode).
I'll push to master and we'll soon see if there are any prolems.

f46b16b9fb 2019-06-22T19:25:44-04:00 "Make rcirc parsing more RFC2812 compliant (Bug#36233)"
https://git.savannah.gnu.org/cgit/emacs.git/commit/?id=f46b16b9fb00d341f222422a9514f5bd62f29971





Added tag(s) fixed. Request was from Noam Postavsky <npostavs <at> gmail.com> to control <at> debbugs.gnu.org. (Sat, 22 Jun 2019 23:31:04 GMT) Full text and rfc822 format available.

bug marked as fixed in version 27.1, send any further explanations to 36233 <at> debbugs.gnu.org and "Jeff Johnson" <trnsz <at> pobox.com> Request was from Noam Postavsky <npostavs <at> gmail.com> to control <at> debbugs.gnu.org. (Sat, 22 Jun 2019 23:31:04 GMT) Full text and rfc822 format available.

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 21 Jul 2019 11:24:05 GMT) Full text and rfc822 format available.

This bug report was last modified 4 years and 252 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.