From: Arun Isaac <arunisaac@HIDDEN>
To: 30076 <at> debbugs.gnu.org
Cc: Arun Isaac <arunisaac@HIDDEN>
X-Debbugs-Original-To: bug-guile@HIDDEN
Subject: bug#30076: [PATCH] web: Recognize JSON content type as text.
Date: Thu, 11 Jan 2018 11:01:17 +0530
Message-Id: <20180111053117.4597-1-arunisaac@HIDDEN>
X-Mailer: git-send-email 2.15.1
X-GNU-PR-Package: guile
X-GNU-PR-Keywords: patch

* module/web/response.scm (text-content-type?): Recognize JSON content
type as text.
---
 module/web/response.scm | 1 +
 1 file changed, 1 insertion(+)

diff --git a/module/web/response.scm b/module/web/response.scm
index 06e1c6dc1..679304c4d 100644
--- a/module/web/response.scm
+++ b/module/web/response.scm
@@ -184,6 +184,7 @@ reason phrase for the response's code."
 represents a textual type such as `text/plain'."
   (let ((type (symbol->string type)))
     (or (string-prefix? "text/" type)
+        (string-suffix? "/json" type)
         (string-suffix? "/xml" type)
         (string-suffix? "+xml" type))))
-- 
2.15.1
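
For trying the change at a REPL, here is a minimal sketch: a standalone
copy of the predicate body from the hunk above with the added line
applied. The name `patched-text-content-type?' is hypothetical and used
only so that it does not shadow the real `text-content-type?' in
(web response).

```scheme
;; Standalone copy of the predicate with the one-line patch applied.
;; `patched-text-content-type?' is a hypothetical name for illustration;
;; the real procedure is `text-content-type?' in (web response).
(define (patched-text-content-type? type)
  (let ((type (symbol->string type)))
    (or (string-prefix? "text/" type)
        (string-suffix? "/json" type)   ; the line added by the patch
        (string-suffix? "/xml" type)
        (string-suffix? "+xml" type))))

(patched-text-content-type? 'text/plain)          ; => #t
(patched-text-content-type? 'application/json)    ; => #t
(patched-text-content-type? 'application/ld+json) ; => #f
(patched-text-content-type? 'image/png)           ; => #f
```

Note that the patch matches only the "/json" suffix, so a type such as
application/ld+json is still not treated as text, whereas the existing
"+xml" clause does cover the corresponding XML case.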

From: help-debbugs@HIDDEN (GNU bug Tracking System)
To: Arun Isaac <arunisaac@HIDDEN>
Subject: bug#30076: Acknowledgement ([PATCH] web: Recognize JSON content
 type as text.)
Date: Thu, 11 Jan 2018 05:33:02 +0000
Message-ID: <handler.30076.B.151564872429232.ack <at> debbugs.gnu.org>
References: <20180111053117.4597-1-arunisaac@HIDDEN>
Reply-To: 30076 <at> debbugs.gnu.org

Thank you for filing a new bug report with debbugs.gnu.org.

This is an automatically generated reply to let you know your message
has been received.

Your message is being forwarded to the package maintainers and other
interested parties for their attention; they will reply in due course.

Your message has been sent to the package maintainer(s):
 bug-guile@HIDDEN

If you wish to submit further information on this problem, please
send it to 30076 <at> debbugs.gnu.org.

Please do not send mail to help-debbugs@HIDDEN unless you wish
to report a problem with the Bug-tracking system.

-- 
30076: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=30076
GNU Bug Tracking System
Contact help-debbugs@HIDDEN with problems

From: Mark H Weaver <mhw@HIDDEN>
To: Arun Isaac <arunisaac@HIDDEN>
Cc: 30076 <at> debbugs.gnu.org
Subject: bug#30076: [PATCH] web: Recognize JSON content type as text.
Date: Tue, 30 Jan 2018 22:31:04 -0500
Message-ID: <87y3kevh53.fsf@HIDDEN>
In-Reply-To: <20180111053117.4597-1-arunisaac@HIDDEN> (Arun Isaac's
 message of "Thu, 11 Jan 2018 11:01:17 +0530")
References: <20180111053117.4597-1-arunisaac@HIDDEN>

Hi Arun,

Arun Isaac <arunisaac@HIDDEN> writes:

> * module/web/response.scm (text-content-type?): Recognize JSON content
> type as text.

While this would seem reasonable at first glance, it seems to me that
this will result in JSON texts with non-ASCII characters being
mishandled in many cases.

Within Guile, 'text-content-type?' is currently used in two places:

* 'decode-response-body' in (web client), and
* 'response-body-port' in (web response).

In both places, if 'text-content-type?' returns true, the encoding of
the response is assumed to be "ISO-8859-1" if not otherwise specified by
an explicit 'charset' parameter.  This is what RFC 2616 specifies for
text/plain, although RFC 6657 would change the default to US-ASCII, as
it was in RFC 2046, and maybe we should look into that.

However, things are quite different for the application/json MIME type,
as specified in RFCs 4627 and 7159.  Those RFCs specify that JSON text
"SHALL" (i.e. MUST) be encoded in Unicode (UTF-8, UTF-16 or UTF-32),
that the default encoding is UTF-8, and furthermore that no charset
parameter is defined for application/json.

So, we can expect at least some conforming implementations to omit the
'charset' parameter, and yet in that case we must assume that the
encoding is Unicode, and most definitely not ISO-8859-1.

RFC 4627 makes the additional interesting observation (in section 3,
"encoding") that since the first two characters of JSON text will always
be ASCII, and since UTF-8/UTF-16/UTF-32 are the only valid encodings for
JSON text, we can reliably determine the encoding by looking at the
pattern of nul bytes in the first four octets:

   00 00 00 xx  UTF-32BE
   00 xx 00 xx  UTF-16BE
   xx 00 00 00  UTF-32LE
   xx 00 xx 00  UTF-16LE
   xx xx xx xx  UTF-8

Given that any of these encodings above are possible, and that there is
no 'charset' parameter defined for "application/json", it seems to me
that we have no choice but to be prepared to auto-detect the encoding,
as described in RFC 4627 section 3 if the 'charset' parameter is
missing.

What do you think?

      Mark
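
The nul-byte table above could be turned into code roughly as follows.
This is only a sketch under stated assumptions, not something proposed
in the thread: `detect-json-encoding' is a hypothetical helper name, the
input is assumed to be the raw body as a bytevector, and the check
relies on the RFC 4627 guarantee that the first two characters are
ASCII. Bodies shorter than four octets simply fall back to UTF-8, the
default encoding.

```scheme
(use-modules (rnrs bytevectors))

;; Hypothetical helper, not part of Guile: guess the Unicode encoding
;; of a JSON text from the pattern of nul bytes in its first four
;; octets, following the table in RFC 4627 section 3.  Assumes the
;; first two characters of the text are ASCII.
(define (detect-json-encoding bv)
  (if (< (bytevector-length bv) 4)
      "UTF-8"                          ; too short to tell; use the default
      (let ((nul? (lambda (i) (zero? (bytevector-u8-ref bv i)))))
        (cond
         ;; 00 00 00 xx
         ((and (nul? 0) (nul? 1) (nul? 2) (not (nul? 3))) "UTF-32BE")
         ;; 00 xx 00 xx
         ((and (nul? 0) (not (nul? 1)) (nul? 2) (not (nul? 3))) "UTF-16BE")
         ;; xx 00 00 00
         ((and (not (nul? 0)) (nul? 1) (nul? 2) (nul? 3)) "UTF-32LE")
         ;; xx 00 xx 00
         ((and (not (nul? 0)) (nul? 1) (not (nul? 2)) (nul? 3)) "UTF-16LE")
         ;; xx xx xx xx, or anything else
         (else "UTF-8")))))

;; The same three-character JSON text in several encodings.
(detect-json-encoding (string->utf8 "[1]x"))                      ; => "UTF-8"
(detect-json-encoding (string->utf16 "[1]" (endianness big)))     ; => "UTF-16BE"
(detect-json-encoding (string->utf16 "[1]" (endianness little)))  ; => "UTF-16LE"
(detect-json-encoding (string->utf32 "[1]" (endianness big)))     ; => "UTF-32BE"
(detect-json-encoding (string->utf32 "[1]" (endianness little)))  ; => "UTF-32LE"
```

The sketch returns encoding names as strings so that the result could be
handed to something like `set-port-encoding!' or `bytevector->string';
how (or whether) it would be wired into (web client) is left open.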

From: Mark H Weaver <mhw@HIDDEN>
To: Arun Isaac <arunisaac@HIDDEN>
Cc: 30076 <at> debbugs.gnu.org
Subject: bug#30076: [PATCH] web: Recognize JSON content type as text.
Date: Wed, 31 Jan 2018 01:04:32 -0500
Message-ID: <87lggeva1b.fsf@HIDDEN>
In-Reply-To: <87y3kevh53.fsf@HIDDEN> (Mark H. Weaver's message of
 "Tue, 30 Jan 2018 22:31:04 -0500")
References: <20180111053117.4597-1-arunisaac@HIDDEN>
 <87y3kevh53.fsf@HIDDEN>

Mark H Weaver <mhw@HIDDEN> writes:

> RFC 4627 makes the additional interesting observation (in section 3,
> "encoding") that since the first two characters of JSON text will always
> be ASCII,

Sorry, it turns out that's no longer the case.  RFC 4627 specified that
a JSON text must be either an object or array, but in RFC 7159 a JSON
text can be any JSON value.  So only the first character is guaranteed
to be ASCII.

Having looked into this a bit more, I wonder if Guile should even try to
set the port encoding itself.  As far as I can tell, there's no way to
know the encoding of the response payload in the general case, without
knowledge of the specific MIME media type.  We could teach Guile about
"application/json", but if we follow that path, it would lead to us
teaching Guile's web library about more media types over time, but we
cannot hope to know about all of them.

The 'charset' parameter is not universal.  Whether it is a valid
parameter, and how its value is to be interpreted, depends on the media
type.  For "application/json", technically there is no 'charset'
parameter at all.

Since it's not feasible for Guile to reliably choose the right encoding
for arbitrary media types, perhaps it would be better for Guile to
explicitly say that it's the application programmer's job to set the
encoding of the port, if it contains textual data.

What do you think?

      Mark
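
To make the "application programmer's job" option concrete, here is a
minimal sketch of decoding a JSON body by hand. It is an illustration
only: the bytevector `raw-body' is a hypothetical stand-in for an
undecoded response body (which, as an assumption about typical usage,
could be obtained by passing #:decode-body? #f to `http-get' in
(web client)), and the choice of UTF-8 is the application's own
knowledge about application/json, not something Guile infers.

```scheme
(use-modules (ice-9 iconv)
             (rnrs bytevectors))

;; Hypothetical stand-in for the raw octets of an application/json
;; response body; \u00f6 is a non-ASCII character (o with diaeresis),
;; which UTF-8 encodes as the two octets C3 B6.
(define raw-body (string->utf8 "{\"name\": \"G\u00f6del\"}"))

;; The application knows the media type is JSON, so it decodes the
;; octets as UTF-8 itself and gets the original text back.
(define decoded (bytevector->string raw-body "UTF-8"))

;; Decoding the same octets as ISO-8859-1 turns each octet into one
;; character, so the single non-ASCII character becomes two
;; (U+00C3 followed by U+00B6) and the string grows by one.
(define misdecoded (bytevector->string raw-body "ISO-8859-1"))

(string-length decoded)     ; => 17
(string-length misdecoded)  ; => 18
```

The second decoding illustrates the mishandling of non-ASCII JSON text
that an ISO-8859-1 default would cause when no 'charset' parameter is
present.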

From: Arun Isaac <arunisaac@HIDDEN>
To: Mark H Weaver <mhw@HIDDEN>
Cc: 30076 <at> debbugs.gnu.org
Subject: bug#30076: [PATCH] web: Recognize JSON content type as text.
Date: Fri, 02 Feb 2018 13:01:34 +0530
Message-ID: <cu7tvuzzw2x.fsf@HIDDEN>
In-Reply-To: <87lggeva1b.fsf@HIDDEN>
References: <20180111053117.4597-1-arunisaac@HIDDEN>
 <87y3kevh53.fsf@HIDDEN> <87lggeva1b.fsf@HIDDEN>

> Having looked into this a bit more, I wonder if Guile should even try to
> set the port encoding itself.  As far as I can tell, there's no way to
> know the encoding of the response payload in the general case, without
> knowledge of the specific MIME media type.  We could teach Guile about
> "application/json", but if we follow that path, it would lead to us
> teaching Guile's web library about more media types over time, but we
> cannot hope to know about all of them.

> Since it's not feasible for Guile to reliably choose the right encoding
> for arbitrary media types, perhaps it would be better for Guile to
> explicitly say that it's the application programmer's job to set the
> encoding of the port, if it contains textual data.

"application/json" is common enough that it would be convenient for the
application programmer to have Guile know about it.  But, as a Guile
maintainer, this is your call.  I don't have strong opinions this way or
that.