From: Lars Ingebrigtsen <larsi@HIDDEN>
To: Eli Zaretskii <eliz@HIDDEN>
Cc: handa@HIDDEN, gregory@HIDDEN, 46933 <at> debbugs.gnu.org
Subject: Re: bug#46933: Possible bugs in filepos-to-bufferpos / bufferpos-to-filepos
Date: Wed, 22 Jun 2022 06:17:09 +0200
Message-ID: <87r13huybe.fsf@HIDDEN>
In-Reply-To: <835ykukybi.fsf@HIDDEN> (Eli Zaretskii's message of "Tue, 21 Jun 2022 15:14:57 +0300")

Eli Zaretskii <eliz@HIDDEN> writes:

> The GDB manual, if you have it, is generated in split form.

Thanks; I could test with that.

>> But there's one new in-tree usage for this -- in
>> hexl.el. (In hexl-mode-exit and hexl-maybe-dehexlify-buffer.) I don't
>> know whether that has the problem described in this bug report, though
>> (I'm not familiar with hexl.el at all).
>
> There's no reason why this won't be relevant to hexl: it is a
> general-purpose hex editor, so editing a file encoded in one of those
> problematic ISO-2022 encodings should bump into the same issues.

I meant more that I don't really know what it wants to achieve, or what
kinds of files are typically used by hexl users. Do people use that on
huge hex dumps or something else?

And on hex dumps, there's typically no coding system, so loading the
file into memory just to determine that would be inefficient.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no
bug-gnu-emacs@HIDDEN: bug#46933; Package emacs.

From: Eli Zaretskii <eliz@HIDDEN>
To: Lars Ingebrigtsen <larsi@HIDDEN>
Cc: handa@HIDDEN, gregory@HIDDEN, 46933 <at> debbugs.gnu.org
Subject: Re: bug#46933: Possible bugs in filepos-to-bufferpos / bufferpos-to-filepos
Date: Tue, 21 Jun 2022 15:14:57 +0300
Message-Id: <835ykukybi.fsf@HIDDEN>
In-Reply-To: <877d5a1er4.fsf@HIDDEN> (message from Lars Ingebrigtsen on Tue, 21 Jun 2022 12:40:15 +0200)

> From: Lars Ingebrigtsen <larsi@HIDDEN>
> Cc: handa@HIDDEN, gregory@HIDDEN, 46933 <at> debbugs.gnu.org
> Date: Tue, 21 Jun 2022 12:40:15 +0200
>
> > The original (and so far the only) use case was an Info manual
> > separated into several files, where the tag table at the end of the
> > main file specifies offsets in bytes. See the function
> > Info-find-node-2 in info.el.
>
> We no longer split up .info files into several files, so that's a bit
> difficult to test.

The GDB manual, if you have it, is generated in split form.

> But there's one new in-tree usage for this -- in
> hexl.el. (In hexl-mode-exit and hexl-maybe-dehexlify-buffer.) I don't
> know whether that has the problem described in this bug report, though
> (I'm not familiar with hexl.el at all).

There's no reason why this won't be relevant to hexl: it is a
general-purpose hex editor, so editing a file encoded in one of those
problematic ISO-2022 encodings should bump into the same issues.

From: Lars Ingebrigtsen <larsi@HIDDEN>
To: Eli Zaretskii <eliz@HIDDEN>
Cc: handa@HIDDEN, gregory@HIDDEN, 46933 <at> debbugs.gnu.org
Subject: Re: bug#46933: Possible bugs in filepos-to-bufferpos / bufferpos-to-filepos
Date: Tue, 21 Jun 2022 12:40:15 +0200
Message-ID: <877d5a1er4.fsf@HIDDEN>
In-Reply-To: <838rprmu1f.fsf@HIDDEN> (Eli Zaretskii's message of "Mon, 20 Jun 2022 14:52:12 +0300")

Eli Zaretskii <eliz@HIDDEN> writes:

>> If I understand correctly, the suggestion here is to use this function
>> in Info-find-node-2 instead of `filepos-to-bufferpos'?
>
> "Unless it's too slow".

Let's see...

> The original (and so far the only) use case was an Info manual
> separated into several files, where the tag table at the end of the
> main file specifies offsets in bytes. See the function
> Info-find-node-2 in info.el.

We no longer split up .info files into several files, so that's a bit
difficult to test.

But there's one new in-tree usage for this -- in hexl.el. (In
hexl-mode-exit and hexl-maybe-dehexlify-buffer.) I don't know whether
that has the problem described in this bug report, though (I'm not
familiar with hexl.el at all).
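
For readers unfamiliar with the two functions under discussion, here is a minimal sketch of how they map between buffer positions and file byte offsets for plain UTF-8 text. It is not taken from hexl.el or info.el; the function name `my/utf-8-position-round-trip` is hypothetical, and `utf-8-unix` is passed explicitly so that the result does not depend on the buffer's own coding system.

```elisp
;; Sketch: convert a buffer position to a zero-based file byte offset
;; and back again, for UTF-8 text held in a temporary buffer.
(defun my/utf-8-position-round-trip ()
  "Return (BYTE-OFFSET BUFFER-POSITION) for the end of the text \"aé\"."
  (with-temp-buffer
    (insert ?a #xe9)                  ; "a" is 1 byte in UTF-8, "é" is 2
    (let* ((bufpos 3)                 ; position just after both characters
           (byte (bufferpos-to-filepos bufpos 'exact 'utf-8-unix))
           (back (filepos-to-bufferpos byte 'exact 'utf-8-unix)))
      (list byte back))))
```

The UTF-8 case is the easy one; the thread is concerned with stateful encodings such as ISO-2022, where such a round trip is not guaranteed to be this simple.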

From: Eli Zaretskii <eliz@HIDDEN>
To: Lars Ingebrigtsen <larsi@HIDDEN>
Cc: handa@HIDDEN, gregory@HIDDEN, 46933 <at> debbugs.gnu.org
Subject: Re: bug#46933: Possible bugs in filepos-to-bufferpos / bufferpos-to-filepos
Date: Mon, 20 Jun 2022 14:52:12 +0300
Message-Id: <838rprmu1f.fsf@HIDDEN>
In-Reply-To: <87r13kqhdj.fsf@HIDDEN> (message from Lars Ingebrigtsen on Mon, 20 Jun 2022 02:59:52 +0200)

> From: Lars Ingebrigtsen <larsi@HIDDEN>
> Cc: Eli Zaretskii <eliz@HIDDEN>, gregory@HIDDEN, 46933 <at> debbugs.gnu.org
> Date: Mon, 20 Jun 2022 02:59:52 +0200
>
> handa <handa@HIDDEN> writes:
>
> > For the latter case, perhaps something like the following code works.
> >
> > ;; Return the buffer position corresponding to the byte position
> > ;; FILEPOS in FILE provided that FILE is decoded by CODING-SYSTEM.
> > (defun temp (file filepos coding-system)
> >   (with-temp-buffer
> >     (set-buffer-multibyte nil)
> >     (insert-file-contents-literally file)
> >     (let ((full (decode-coding-region 1 (point-max) coding-system t))
> >           partial)
> >       (while (and (setq partial (decode-coding-region 1 (1+ filepos)
> >                                                       coding-system t))
> >                   (not (eq (compare-strings full 0 (length partial)
> >                                             partial 0 (length partial))
> >                            t)))
> >         (setq filepos (1+ filepos)))
> >       (1+ (length partial)))))
> >
> > If it is too slow, there are a few ways to make it faster.
>
> (I'm going through old bug reports that unfortunately weren't resolved
> at the time.)
>
> If I understand correctly, the suggestion here is to use this function
> in Info-find-node-2 instead of `filepos-to-bufferpos'?

"Unless it's too slow".

From: Lars Ingebrigtsen <larsi@HIDDEN>
To: handa <handa@HIDDEN>
Cc: Eli Zaretskii <eliz@HIDDEN>, gregory@HIDDEN, 46933 <at> debbugs.gnu.org
Subject: Re: bug#46933: Possible bugs in filepos-to-bufferpos / bufferpos-to-filepos
Date: Mon, 20 Jun 2022 02:59:52 +0200
Message-ID: <87r13kqhdj.fsf@HIDDEN>
In-Reply-To: <87im53ny95.fsf@HIDDEN> (handa@HIDDEN's message of "Sun, 04 Apr 2021 01:12:06 +0900")

handa <handa@HIDDEN> writes:

> But it seems that the usage in Info-find-node-2 is:
>   a byte position in an existing file that may not be created by Emacs
>
> There's a case where they are different. The method I wrote in the
> previous mail works only in the former case. And it seems that the
> current implementation of filepos-to-bufferpos is the same because it
> tries to get the byte sequence by encode-coding-region.
>
> For the latter case, perhaps something like the following code works.
>
> ;; Return the buffer position corresponding to the byte position
> ;; FILEPOS in FILE provided that FILE is decoded by CODING-SYSTEM.
> (defun temp (file filepos coding-system)
>   (with-temp-buffer
>     (set-buffer-multibyte nil)
>     (insert-file-contents-literally file)
>     (let ((full (decode-coding-region 1 (point-max) coding-system t))
>           partial)
>       (while (and (setq partial (decode-coding-region 1 (1+ filepos)
>                                                       coding-system t))
>                   (not (eq (compare-strings full 0 (length partial)
>                                             partial 0 (length partial))
>                            t)))
>         (setq filepos (1+ filepos)))
>       (1+ (length partial)))))
>
> If it is too slow, there are a few ways to make it faster.

(I'm going through old bug reports that unfortunately weren't resolved
at the time.)

If I understand correctly, the suggestion here is to use this function
in Info-find-node-2 instead of `filepos-to-bufferpos'?

From: handa <handa@HIDDEN>
To: Eli Zaretskii <eliz@HIDDEN>
Cc: handa@HIDDEN, gregory@HIDDEN, 46933 <at> debbugs.gnu.org
Subject: Re: bug#46933: Possible bugs in filepos-to-bufferpos / bufferpos-to-filepos
Date: Sun, 04 Apr 2021 01:12:06 +0900
Message-ID: <87im53ny95.fsf@HIDDEN>
In-Reply-To: <83zgyif2aq.fsf@HIDDEN> (message from Eli Zaretskii on Thu, 01 Apr 2021 18:32:45 +0300)

In article <83zgyif2aq.fsf@HIDDEN>, Eli Zaretskii <eliz@HIDDEN> writes:

> Leaving the :pre-write/:post-read-conversion use case aside, do we
> have some means of finding where an ISO-2022 shift-in/out sequence
> begins and ends, so that we never try to decode a partial sequence (and
> produce "characters" that are not really in the original buffer)?
> If not, where can I find the description of every kind of such
> sequences, i.e. sequences that modify the decoder state without
> producing any characters?

The official definition is in the standard ISO/IEC 2022, but it seems
that this wiki page:
  https://en.wikipedia.org/wiki/ISO/IEC_2022
is more concise. Emacs implements all control sequences shown in the
sections "Shift functions", "Character set designations", and
"Interaction with other coding systems".

> > By the way, what is the intention of filepos-to-bufferpos? Why was
> > that function introduced?

> The original (and so far the only) use case was an Info manual
> separated into several files, where the tag table at the end of the
> main file specifies offsets in bytes. See the function
> Info-find-node-2 in info.el.

As filepos-to-bufferpos accepts the optional arg CODING-SYSTEM, I
thought the BYTE arg was:
  a byte position in a file that will be created by encoding
  the current buffer by CODING-SYSTEM
But it seems that the usage in Info-find-node-2 is:
  a byte position in an existing file that may not be created by Emacs

There's a case where they are different. The method I wrote in the
previous mail works only in the former case. And it seems that the
current implementation of filepos-to-bufferpos is the same because it
tries to get the byte sequence by encode-coding-region.

For the latter case, perhaps something like the following code works.

;; Return the buffer position corresponding to the byte position
;; FILEPOS in FILE provided that FILE is decoded by CODING-SYSTEM.
(defun temp (file filepos coding-system)
  (with-temp-buffer
    (set-buffer-multibyte nil)
    (insert-file-contents-literally file)
    (let ((full (decode-coding-region 1 (point-max) coding-system t))
          partial)
      (while (and (setq partial (decode-coding-region 1 (1+ filepos)
                                                      coding-system t))
                  (not (eq (compare-strings full 0 (length partial)
                                            partial 0 (length partial))
                           t)))
        (setq filepos (1+ filepos)))
      (1+ (length partial)))))

If it is too slow, there are a few ways to make it faster.

---
K. Handa
handa@HIDDEN
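
The distinction drawn above, between a file that Emacs would write and a file that already exists, can be made concrete with a small sketch. The following is only an illustration under assumptions not stated in the thread: it supposes that some other tool wrote ISO-2022-JP bytes containing a redundant (but valid) return to ASCII, and the function name `my/iso-2022-jp-divergence` is hypothetical.

```elisp
;; Sketch: two different ISO-2022-JP byte sequences for the same text.
(defun my/iso-2022-jp-divergence ()
  "Return (ORIGINAL-LENGTH REENCODED-LENGTH SAME-TEXT-P) for sample bytes."
  (let* (;; Bytes as a non-Emacs tool might write them: after the first
         ;; character the encoder returns to ASCII (ESC ( B) and then
         ;; designates JIS X 0208 again (ESC $ B).
         (original "\e$B$\"\e(B\e$B$$\e(B")
         (decoded (decode-coding-string original 'iso-2022-jp))
         ;; What Emacs itself produces when encoding the same text.
         (reencoded (encode-coding-string decoded 'iso-2022-jp)))
    (list (length original)
          (length reencoded)
          ;; Both byte sequences stand for HIRAGANA A followed by I.
          (string= decoded (string #x3042 #x3044)))))
```

Because the two byte sequences differ in length, a byte offset taken from the existing file need not correspond to the same character in the bytes that encode-coding-region would produce, which is the mismatch described in the message.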

From: Eli Zaretskii <eliz@HIDDEN>
To: handa <handa@HIDDEN>
Cc: gregory@HIDDEN, 46933 <at> debbugs.gnu.org
Subject: Re: bug#46933: Possible bugs in filepos-to-bufferpos / bufferpos-to-filepos
Date: Thu, 01 Apr 2021 18:32:45 +0300
Message-Id: <83zgyif2aq.fsf@HIDDEN>
In-Reply-To: <87sg4arq9x.fsf@HIDDEN> (message from handa on Fri, 02 Apr 2021 00:14:02 +0900)

> From: handa <handa@HIDDEN>
> Cc: gregory@HIDDEN, 46933 <at> debbugs.gnu.org
> Date: Fri, 02 Apr 2021 00:14:02 +0900
>
> In article <83tuovmivc.fsf@HIDDEN>, Eli Zaretskii <eliz@HIDDEN> writes:
>
> > > Any coding system can have :post-read-conversion and
> > > :pre-write-conversion functions, it is not guaranteed that encoded byte
> > > length is greater than the number of characters.
>
> > Agreed, but AFAICT, ISO-2022-JP doesn't have any of these attributes,
> > right?
>
> Yes, but one can add them by coding-system-put.

Leaving the :pre-write/:post-read-conversion use case aside, do we
have some means of finding where an ISO-2022 shift-in/out sequence
begins and ends, so that we never try to decode a partial sequence (and
produce "characters" that are not really in the original buffer)?
If not, where can I find the description of every kind of such
sequences, i.e. sequences that modify the decoder state without
producing any characters?

(UTF-8 has the same issue, btw, but in that case we have a simpler
solution.)
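
As a rough illustration of the kind of sequence asked about here (a sketch only, not code from the thread), the snippet below encodes a single Japanese character with Emacs's stock `iso-2022-jp` coding system and reports how many bytes result. The function name `my/iso-2022-jp-overhead` is hypothetical.

```elisp
;; Sketch: ISO-2022-JP surrounds non-ASCII text with designation escape
;; sequences, which change the decoder state but produce no characters.
(defun my/iso-2022-jp-overhead ()
  "Return (CHARS BYTES STARTS-WITH-DESIGNATION-P ROUND-TRIP-P)."
  (let* ((text (string #x3042))        ; one character, HIRAGANA LETTER A
         (bytes (encode-coding-string text 'iso-2022-jp)))
    (list (length text)                ; number of characters
          (length bytes)               ; number of encoded bytes
          ;; ESC $ B designates JIS X 0208 before any payload byte.
          (string-prefix-p "\e$B" bytes)
          ;; Decoding the complete byte string gives the text back.
          (string= (decode-coding-string bytes 'iso-2022-jp) text))))
```

The byte count exceeds the character count by more than the two payload bytes, so a byte offset that lands inside or just after an escape sequence does not line up with any character boundary in the decoded text.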
From: Eli Zaretskii <eliz@HIDDEN>
To: handa <handa@HIDDEN>
Cc: gregory@HIDDEN, 46933 <at> debbugs.gnu.org
Subject: Re: bug#46933: Possible bugs in filepos-to-bufferpos / bufferpos-to-filepos
Date: Thu, 01 Apr 2021 18:25:24 +0300
Message-Id: <831rbugh7f.fsf@HIDDEN>
In-Reply-To: <87sg4arq9x.fsf@HIDDEN> (message from handa on Fri, 02 Apr 2021 00:14:02 +0900)

> From: handa <handa@HIDDEN>
> Cc: gregory@HIDDEN, 46933 <at> debbugs.gnu.org
> Date: Fri, 02 Apr 2021 00:14:02 +0900
>
> By the way, what is the intention of filepos-to-bufferpos?  Why was
> that function introduced?

The original (and so far the only) use case was an Info manual
separated into several files, where the tag table at the end of the
main file specifies offsets in bytes.  See the function
Info-find-node-2 in info.el.
From: handa <handa@HIDDEN>
To: Eli Zaretskii <eliz@HIDDEN>
Cc: gregory@HIDDEN, 46933 <at> debbugs.gnu.org
Subject: Re: bug#46933: Possible bugs in filepos-to-bufferpos / bufferpos-to-filepos
Date: Fri, 02 Apr 2021 00:14:02 +0900
Message-ID: <87sg4arq9x.fsf@HIDDEN>
In-Reply-To: <83tuovmivc.fsf@HIDDEN> (message from Eli Zaretskii on Sun, 28 Mar 2021 17:51:35 +0300)

In article <83tuovmivc.fsf@HIDDEN>, Eli Zaretskii <eliz@HIDDEN> writes:

> > Any coding system can have :post-read-conversion and
> > :pre-write-conversion functions, it is not guaranteed that encoded byte
> > length is greater than the number of characters.

> Agreed, but AFAICT, ISO-2022-JP doesn't have any of these attributes,
> right?

Yes, but one can add them by coding-system-put.

By the way, what is the intention of filepos-to-bufferpos?  Why was
that function introduced?

---
K. Handa
handa@HIDDEN
From: Eli Zaretskii <eliz@HIDDEN>
To: handa <handa@HIDDEN>
Cc: gregory@HIDDEN, 46933 <at> debbugs.gnu.org
Subject: Re: bug#46933: Possible bugs in filepos-to-bufferpos / bufferpos-to-filepos
Date: Sun, 28 Mar 2021 17:51:35 +0300
Message-Id: <83tuovmivc.fsf@HIDDEN>
In-Reply-To: <87y2e7s65m.fsf@HIDDEN> (message from handa on Sun, 28 Mar 2021 23:29:41 +0900)

> From: handa <handa@HIDDEN>
> Cc: gregory@HIDDEN, 46933 <at> debbugs.gnu.org
> Date: Sun, 28 Mar 2021 23:29:41 +0900
>
> > In any case, the problem is not with encoding, the problem is with
> > decoding.  Encoding doesn't have this problem because we always encode
> > more than enough (we use the value of BYTE as the count of
> > _characters_ to encode, so for ISO-2022 encoding it is usually much
> > more than needed).  By contrast, when decoding, we decode exactly
> > BYTE+1 bytes, which then hits the problem if that offset is inside a
> > shift sequence.
>
> Then, that implementation should be changed.
>
> Any coding system can have :post-read-conversion and
> :pre-write-conversion functions, it is not guaranteed that encoded byte
> length is greater than the number of characters.

Agreed, but AFAICT, ISO-2022-JP doesn't have any of these attributes,
right?
From: handa <handa@HIDDEN>
To: Eli Zaretskii <eliz@HIDDEN>
Cc: gregory@HIDDEN, 46933 <at> debbugs.gnu.org
Subject: Re: bug#46933: Possible bugs in filepos-to-bufferpos / bufferpos-to-filepos
Date: Sun, 28 Mar 2021 23:29:41 +0900
Message-ID: <87y2e7s65m.fsf@HIDDEN>
In-Reply-To: <83pmzkog6x.fsf@HIDDEN> (message from Eli Zaretskii on Sat, 27 Mar 2021 16:54:14 +0300)

In article <83pmzkog6x.fsf@HIDDEN>, Eli Zaretskii <eliz@HIDDEN> writes:

> > How about something like this method:
> >   1. Encode the buffer text one line at a time until we get a longer
> >      byte sequence than BYTE.
> >   2. Delete the result of encoding the last line above.
> >   3. Provided that the above last line has chars C1 C2 ... Cn,
> >      encode characters C1...Cn, C1...Cn-1, C1...Cn-2 until we get a
> >      shorter byte sequence than BYTE.
> >
> > The first step may be optimized by encoding multiple lines instead of
> > a single line.

> Even if we do optimize, this would be very slow, I think.

Whether it is too slow or not depends on what filepos-to-bufferpos is
used for.  Do you know why filepos-to-bufferpos (and
bufferpos-to-filepos) was introduced?

> And what if the buffer has no newlines?

In that case, just do the step 2.  Or, we can use the bi-sectioning
technique.

> In any case, the problem is not with encoding, the problem is with
> decoding.  Encoding doesn't have this problem because we always encode
> more than enough (we use the value of BYTE as the count of
> _characters_ to encode, so for ISO-2022 encoding it is usually much
> more than needed).  By contrast, when decoding, we decode exactly
> BYTE+1 bytes, which then hits the problem if that offset is inside a
> shift sequence.

Then, that implementation should be changed.

Any coding system can have :post-read-conversion and
:pre-write-conversion functions, it is not guaranteed that encoded byte
length is greater than the number of characters.

---
K. Handa
handa@HIDDEN
From: Gregory Heytings <gregory@HIDDEN>
To: handa <handa@HIDDEN>
Cc: Eli Zaretskii <eliz@HIDDEN>, 46933 <at> debbugs.gnu.org
Subject: Re: bug#46933: Possible bugs in filepos-to-bufferpos / bufferpos-to-filepos
Date: Sat, 27 Mar 2021 14:24:58 +0000
Message-ID: <b2996462741af4c4cb0a@HIDDEN>
In-Reply-To: <874kgxtatr.fsf@HIDDEN>

>> Kenichi, why are these 6 bytes inserted by encode-coding-region, but
>> not when we encode the same text as part of saving the buffer to its
>> file?  And why does it happen near the end of the text, between those 2
>> particular letters?
>
> There surely exists a bug.  Could you please try the attached patch?
>
> The reason why that bug did not happen on file writing is that the code
> in write_region calls encoding routine repeatedly without
> CODING_MODE_LAST_BLOCK flag, and only in the case that flushing is
> required (e.g. the case of iso-2022-jp), just for flushing, it calls
> encoding routine again with CODING_MODE_LAST_BLOCK flag.  In that case,
> carryover does not happen in encode_coding ().
>

Thank you.  I tried the patch, and it seems to fix the
bufferpos-to-filepos bug, but not the filepos-to-bufferpos one.

On the example file, (bufferpos-to-filepos (- (point-max) POS) 'exact)
gives the expected results:

POS = 0: 2997
POS = 1: 2995
POS = 2: 2993
POS = 3 (IDEOGRAPHIC FULL STOP): 2991
POS = 4 (HIRAGANA LETTER RU): 2989

But (goto-char (filepos-to-bufferpos POS 'exact)) gives:

POS = 2985, 2986: last but one visible character (HIRAGANA LETTER RU)
POS = 2987, 2988: last visible character (IDEOGRAPHIC FULL STOP)
POS = 2989, 2990: first CRLF
POS = 2991: second CRLF
POS = 2992: point-max
POS = 2993: first CRLF
POS = 2994, 2995: second CRLF
POS >= 2996: point-max

where I would have expected:

POS = 2989, 2990: last but one visible character (HIRAGANA LETTER RU)
POS = 2991, 2992: last visible character (IDEOGRAPHIC FULL STOP)
POS = 2993, 2994: first CRLF
POS = 2995, 2996: second CRLF
POS >= 2997: point-max
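A reference table of this kind (which byte offsets belong to which character) can also be produced outside Emacs, which may help when checking results like the ones above. The following is only a sketch: it uses Python's `iso2022_jp` codec as a stand-in for Emacs's iso-2022-jp coding system, so its byte offsets need not match the ones Emacs computes for the example file, and the helper name `byte_ranges` is hypothetical.

```python
import codecs


def byte_ranges(text, codec="iso2022_jp"):
    """Return a list of (character, first_byte, end_byte) tuples for TEXT.

    END_BYTE is exclusive.  Bytes of an escape sequence emitted just
    before a character are counted as part of that character's range.
    """
    # Assumption: Python's codec stands in for the Emacs coding system.
    encoder = codecs.getincrementalencoder(codec)()
    ranges = []
    position = 0
    for char in text:
        # Non-final encode: state carries over, nothing is flushed.
        chunk = encoder.encode(char)
        ranges.append((char, position, position + len(chunk)))
        position += len(chunk)
    return ranges


for char, first, end in byte_ranges("る。\r\n\r\n"):
    print(f"{char!r}: bytes {first}..{end - 1}")
```

Attributing an escape sequence to the character that follows it is a design choice made for this sketch; the thread does not say which character such bytes should map to.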
From: Eli Zaretskii <eliz@HIDDEN>
To: handa <handa@HIDDEN>
Cc: gregory@HIDDEN, 46933 <at> debbugs.gnu.org
Subject: Re: bug#46933: Possible bugs in filepos-to-bufferpos / bufferpos-to-filepos
Date: Sat, 27 Mar 2021 16:54:14 +0300
Message-Id: <83pmzkog6x.fsf@HIDDEN>
In-Reply-To: <871rc0u3v4.fsf@HIDDEN> (message from handa on Sat, 27 Mar 2021 22:23:59 +0900)

> From: handa <handa@HIDDEN>
> Cc: gregory@HIDDEN, 46933 <at> debbugs.gnu.org
> Date: Sat, 27 Mar 2021 22:23:59 +0900
>
> How about something like this method:
>   1. Encode the buffer text one line at a time until we get a longer
>      byte sequence than BYTE.
>   2. Delete the result of encoding the last line above.
>   3. Provided that the above last line has chars C1 C2 ... Cn,
>      encode characters C1...Cn, C1...Cn-1, C1...Cn-2 until we get a
>      shorter byte sequence than BYTE.
>
> The first step may be optimized by encoding multiple lines instead of
> a single line.

Even if we do optimize, this would be very slow, I think.  And what if
the buffer has no newlines?

In any case, the problem is not with encoding, the problem is with
decoding.  Encoding doesn't have this problem because we always encode
more than enough (we use the value of BYTE as the count of
_characters_ to encode, so for ISO-2022 encoding it is usually much
more than needed).  By contrast, when decoding, we decode exactly
BYTE+1 bytes, which then hits the problem if that offset is inside a
shift sequence.
From: handa <handa@HIDDEN>
To: Eli Zaretskii <eliz@HIDDEN>
Cc: gregory@HIDDEN, 46933 <at> debbugs.gnu.org
Subject: Re: bug#46933: Possible bugs in filepos-to-bufferpos / bufferpos-to-filepos
Date: Sat, 27 Mar 2021 22:23:59 +0900
Message-ID: <871rc0u3v4.fsf@HIDDEN>
In-Reply-To: <8335whowuj.fsf@HIDDEN> (message from Eli Zaretskii on Sat, 27 Mar 2021 10:54:28 +0300)

In article <8335whowuj.fsf@HIDDEN>, Eli Zaretskii <eliz@HIDDEN> writes:

> Thanks.  The patch fixes the problem with the extra 6 bytes, so I
> installed it.

Thank you for the improved concise comment in the code.

> The results of filepos-to-bufferpos with the file attached by Gregory
> are better now, but there are still problems for some values of BYTE
> argument.  The problem is that ISO-2022 encoding (and others like it)
> include shift-in and shift-out sequences, used to switch between
> character sets.  As a trivial example, each CR+LF sequence has the
> "ESC ( B" sequence before it and "ESC $ B" sequence after it, to
> switch to ASCII before the newline, then switch to Japanese after it.
> And likewise whenever there's Latin text within Japanese (there are
> quite a lot of them in this particular file).  These shift-in and
> shift-out sequences consume bytes, but don't produce any characters.
> So if the BYTE argument of filepos-to-bufferpos specifies a byte in
> the middle of one of these shift sequences, the result will be
> incorrect, because decoding a partial sequence produces the bytes of
> that sequence verbatim, and the logic in filepos-to-bufferpos of using
> the length of the decoded text breaks.  We need special handling of
> this and other similar coding-systems to fix these corner use cases,
> similarly to what we do in filepos-to-bufferpos--dos.  Patches
> welcome.

How about something like this method:
  1. Encode the buffer text one line at a time until we get a longer
     byte sequence than BYTE.
  2. Delete the result of encoding the last line above.
  3. Provided that the above last line has chars C1 C2 ... Cn,
     encode characters C1...Cn, C1...Cn-1, C1...Cn-2 until we get a
     shorter byte sequence than BYTE.

The first step may be optimized by encoding multiple lines instead of
a single line.

---
K. Handa
handa@HIDDEN
From: Eli Zaretskii <eliz@HIDDEN>
To: handa <handa@HIDDEN>
Cc: gregory@HIDDEN, 46933 <at> debbugs.gnu.org
Subject: Re: bug#46933: Possible bugs in filepos-to-bufferpos / bufferpos-to-filepos
Date: Sat, 27 Mar 2021 10:54:28 +0300
Message-Id: <8335whowuj.fsf@HIDDEN>
In-Reply-To: <874kgxtatr.fsf@HIDDEN> (message from handa on Sat, 27 Mar 2021 14:38:56 +0900)

> From: handa <handa@HIDDEN>
> Cc: gregory@HIDDEN, 46933 <at> debbugs.gnu.org
> Date: Sat, 27 Mar 2021 14:38:56 +0900
>
> In article <83ft0obk7i.fsf@HIDDEN>, Eli Zaretskii <eliz@HIDDEN> writes:
>
> > Kenichi, why are these 6 bytes inserted by encode-coding-region, but
> > not when we encode the same text as part of saving the buffer to its
> > file?  And why does it happen near the end of the text, between those
> > 2 particular letters?
>
> There surely exists a bug.  Could you please try the attached patch?
>
> The reason why that bug did not happen on file writing is that the code
> in write_region calls encoding routine repeatedly without
> CODING_MODE_LAST_BLOCK flag, and only in the case that flushing is
> required (e.g. the case of iso-2022-jp), just for flushing, it calls
> encoding routine again with CODING_MODE_LAST_BLOCK flag.  In that case,
> carryover does not happen in encode_coding ().

Thanks.  The patch fixes the problem with the extra 6 bytes, so I
installed it.

The results of filepos-to-bufferpos with the file attached by Gregory
are better now, but there are still problems for some values of BYTE
argument.  The problem is that ISO-2022 encoding (and others like it)
include shift-in and shift-out sequences, used to switch between
character sets.  As a trivial example, each CR+LF sequence has the
"ESC ( B" sequence before it and "ESC $ B" sequence after it, to
switch to ASCII before the newline, then switch to Japanese after it.
And likewise whenever there's Latin text within Japanese (there are
quite a lot of them in this particular file).  These shift-in and
shift-out sequences consume bytes, but don't produce any characters.
So if the BYTE argument of filepos-to-bufferpos specifies a byte in
the middle of one of these shift sequences, the result will be
incorrect, because decoding a partial sequence produces the bytes of
that sequence verbatim, and the logic in filepos-to-bufferpos of using
the length of the decoded text breaks.  We need special handling of
this and other similar coding-systems to fix these corner use cases,
similarly to what we do in filepos-to-bufferpos--dos.  Patches
welcome.

I'm leaving this bug open because not all of the problem was fixed.
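To make the byte accounting described above concrete, here is a small sketch outside Emacs. It uses Python's `iso2022_jp` codec as a stand-in for Emacs's iso-2022-jp coding system (an assumption: the two need not agree byte for byte), and the helper names `escape_spans` and `splits_escape` are hypothetical. The four escape sequences in the pattern are the ones RFC 1468 defines for ISO-2022-JP; other ISO-2022 variants use more.

```python
import re

# The four 3-byte escape sequences RFC 1468 defines for ISO-2022-JP.
# Each one switches the active character set and yields no character.
ESCAPE = re.compile(rb"\x1b(?:\(B|\(J|\$@|\$B)")


def escape_spans(data):
    """Return the (start, end) byte range of every escape sequence in DATA."""
    return [m.span() for m in ESCAPE.finditer(data)]


def splits_escape(data, offset):
    """Return True if cutting DATA at OFFSET would split an escape sequence."""
    return any(start < offset < end for start, end in escape_spans(data))


text = "日本語 abc\r\nです"
encoded = text.encode("iso2022_jp")

print(len(text), "characters,", len(encoded), "bytes")
for start, end in escape_spans(encoded):
    print(f"escape at bytes {start}..{end - 1}: {encoded[start:end]!r}")
```

A check like `splits_escape` is one way a caller could avoid handing a decoder a prefix that ends inside a shift sequence; it is not how filepos-to-bufferpos is implemented.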
From: handa <handa@HIDDEN>
To: Eli Zaretskii <eliz@HIDDEN>
Cc: gregory@HIDDEN, 46933 <at> debbugs.gnu.org
Subject: Re: bug#46933: Possible bugs in filepos-to-bufferpos / bufferpos-to-filepos
In-Reply-To: <83ft0obk7i.fsf@HIDDEN> (message from Eli Zaretskii on Sun, 21 Mar 2021 17:27:45 +0200)
Date: Sat, 27 Mar 2021 14:38:56 +0900
Message-ID: <874kgxtatr.fsf@HIDDEN>
MIME-Version: 1.0
Content-Type: text/plain

In article <83ft0obk7i.fsf@HIDDEN>, Eli Zaretskii <eliz@HIDDEN> writes:

> Kenichi, why are these 6 bytes inserted by encode-coding-region, but
> not when we encode the same text as part of saving the buffer to its
> file?  And why does it happen near the end of the text, between those
> 2 particular letters?

There surely is a bug.  Could you please try the attached patch?

The reason the bug did not show up when writing a file is that the
code in write_region calls the encoding routine repeatedly without the
CODING_MODE_LAST_BLOCK flag.  Only when flushing is required (e.g. for
iso-2022-jp) does it call the encoding routine once more with the
CODING_MODE_LAST_BLOCK flag, just to flush.  In that case, no carryover
happens in encode_coding ().

---
K. Handa
handa@HIDDEN

diff --git a/src/coding.c b/src/coding.c
index 221a9cad89..a9d5a7ccdc 100644
--- a/src/coding.c
+++ b/src/coding.c
@@ -7799,7 +7799,14 @@ encode_coding (struct coding_system *coding)
       coding_set_source (coding);
       consume_chars (coding, translation_table, max_lookup);
       coding_set_destination (coding);
+      /* If consume_chars did not consume all source chars, we call
+         coding->encoder again in the next iteration, and thus, for this
+         iteration, we must clear CODING_MODE_LAST_BLOCK flag.  */
+      unsigned saved_mode = coding->mode;
+      if (coding->consumed_char < coding->src_chars)
+        coding->mode &= ~CODING_MODE_LAST_BLOCK;
       (*(coding->encoder)) (coding);
+      coding->mode = saved_mode;
     } while (coding->consumed_char < coding->src_chars);
 
   if (BUFFERP (coding->dst_object) && coding->produced_char > 0)
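To make the effect of the flag concrete, here is a minimal toy model in C of the loop that the patch touches. It is only a sketch under stated assumptions: toy_coding, toy_encoder, toy_encode, TOY_LAST_BLOCK and TOY_CHUNK are hypothetical stand-ins and not names from src/coding.c, and the toy encoder does nothing more than emit ESC $ B before the first two-byte character and ESC ( B whenever it is called with the last-block flag set. Under those assumptions, leaving the flag set in every iteration produces an "ESC ( B ESC $ B" pair at each chunk boundary, while clearing it for all but the final iteration does not.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_LAST_BLOCK 0x1u  /* hypothetical stand-in for CODING_MODE_LAST_BLOCK */
#define TOY_CHUNK 3          /* chars taken per iteration; arbitrary for the demo */

/* Hypothetical, heavily reduced stand-in for struct coding_system.  */
struct toy_coding
{
  unsigned mode;         /* flag bits */
  size_t src_chars;      /* total characters to encode */
  size_t consumed_char;  /* characters consumed so far */
  size_t chunk_chars;    /* characters waiting for the encoder */
  int in_kanji;          /* nonzero while a two-byte set is designated */
  unsigned char *out;    /* output buffer, may be NULL when cap is 0 */
  size_t cap;            /* capacity of out */
  size_t produced;       /* bytes produced (counted even past cap) */
};

static void
toy_emit (struct toy_coding *c, const char *bytes, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      if (c->produced < c->cap)
        c->out[c->produced] = (unsigned char) bytes[i];
      c->produced++;
    }
}

/* Toy encoder: every pending character becomes the two bytes "$3".
   If the last-block flag is set, reset to ASCII at the end.  */
static void
toy_encoder (struct toy_coding *c)
{
  if (c->chunk_chars > 0 && !c->in_kanji)
    {
      toy_emit (c, "\033$B", 3);
      c->in_kanji = 1;
    }
  for (size_t i = 0; i < c->chunk_chars; i++)
    toy_emit (c, "$3", 2);
  c->chunk_chars = 0;
  if ((c->mode & TOY_LAST_BLOCK) && c->in_kanji)
    {
      toy_emit (c, "\033(B", 3);
      c->in_kanji = 0;
    }
}

/* Encode NCHARS characters and return the number of bytes produced.
   APPLY_FIX selects whether the flag is cleared for non-final chunks,
   mirroring the shape of the patch above.  */
size_t
toy_encode (size_t nchars, int apply_fix, unsigned char *out, size_t cap)
{
  assert (out != NULL || cap == 0);
  struct toy_coding c = { .mode = TOY_LAST_BLOCK, .src_chars = nchars,
                          .out = out, .cap = cap };
  do
    {
      /* Stand-in for consume_chars: take at most TOY_CHUNK characters.  */
      size_t left = c.src_chars - c.consumed_char;
      c.chunk_chars = left < TOY_CHUNK ? left : TOY_CHUNK;
      c.consumed_char += c.chunk_chars;

      unsigned saved_mode = c.mode;
      if (apply_fix && c.consumed_char < c.src_chars)
        c.mode &= ~TOY_LAST_BLOCK;
      toy_encoder (&c);
      c.mode = saved_mode;
    }
  while (c.consumed_char < c.src_chars);
  return c.produced;
}
```

With 5 characters and a chunk size of 3, toy_encode (5, 0, NULL, 0) counts 22 bytes and toy_encode (5, 1, NULL, 0) counts 16. The 6-byte difference is the spurious pair at the single chunk boundary.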
bug-gnu-emacs@HIDDEN: bug#46933; Package emacs.
Date: Sun, 21 Mar 2021 17:27:45 +0200
Message-Id: <83ft0obk7i.fsf@HIDDEN>
From: Eli Zaretskii <eliz@HIDDEN>
To: Gregory Heytings <gregory@HIDDEN>, Kenichi Handa <handa@HIDDEN>
Cc: 46933 <at> debbugs.gnu.org
In-Reply-To: <9cff0f8894f167925251@HIDDEN> (message from Gregory Heytings on Thu, 04 Mar 2021 21:21:24 +0000)
Subject: Re: bug#46933: Possible bugs in filepos-to-bufferpos / bufferpos-to-filepos
References: <9cff0f8894f167925251@HIDDEN>
MIME-version: 1.0
Content-type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit

> Date: Thu, 04 Mar 2021 21:21:24 +0000
> From: Gregory Heytings <gregory@HIDDEN>
>
> (Disclaimer: I have no knowledge whatsoever about the ISO-2022-JP
> encoding, and although this looks like a bug, I'm not sure this is
> actually a bug; I report this at the suggestion of Eli in bug#46859.)
>
> I downloaded the file [1], and converted it to the ISO-2022-JP encoding
> with iconv -t iso-2022-jp one.txt > iso-2022-jp.txt.  The resulting file
> is attached to this bug report.  It ends with two CRLFs, at byte offsets
> 2993 and 2995.  However, after emacs -Q iso-2022-jp.txt, with M-:
> (goto-char (filepos-to-bufferpos POS 'exact)) we get:
>
> POS = 2991, 2992: last but one visible character (HIRAGANA LETTER RU)
> POS = 2993, 2994: last visible character (IDEOGRAPHIC FULL STOP)
> POS = 2995, 2996: first CRLF
> POS = 2997: second CRLF
> POS = 2998: point-max
> POS = 2999: first CRLF
> POS = 3000, 3001: second CRLF
> POS >= 3002: point-max
>
> I would have expected:
>
> POS = 2989, 2990: last but one visible character (HIRAGANA LETTER RU)
> POS = 2991, 2992: last visible character (IDEOGRAPHIC FULL STOP)
> POS = 2993, 2994: first CRLF
> POS = 2995, 2996: second CRLF
> POS >= 2997: point-max
>
> The opposite operation M-: (bufferpos-to-filepos (- (point) POS) 'exact)
> apparently also has bugs; its return values are not consistent with the
> ones above:
>
> POS = 0: 3003
> POS = 1: 3001
> POS = 2: 2999
> POS = 3 (IDEOGRAPHIC FULL STOP): 2997
> POS = 4 (HIRAGANA LETTER RU): 2995
>
> I would have expected:
>
> POS = 0: 2997
> POS = 1: 2995
> POS = 2: 2993
> POS = 3 (IDEOGRAPHIC FULL STOP): 2991
> POS = 4 (HIRAGANA LETTER RU): 2989
>
> [1] https://darza.com/ecbackend/vendor/symfony/mime/Tests/Fixtures/samples/charsets/iso-2022-jp/one.txt

There's something strange going on here with encoding of the buffer
using iso-2022-jp-dos: near the end of the encoded bytestream, between
the encoded HIRAGANA LETTER KO (こ) and HIRAGANA LETTER TO (と), we get
6 extra bytes: "ESC ( B ESC $ B".  AFAIU, this sequence means "switch
to ASCII and then switch back to Japanese".  So together these 6 bytes
are a no-op with regard to their effect on the text, but they disrupt
the logic of filepos-to-bufferpos because they introduce extra bytes
that aren't there in the original file.

Kenichi, why are these 6 bytes inserted by encode-coding-region, but
not when we encode the same text as part of saving the buffer to its
file?  And why does it happen near the end of the text, between those
2 particular letters?
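The observation about the 6 extra bytes can be illustrated with a small C sketch that scans an ISO-2022-JP byte stream and counts the bytes belonging to "ESC ( B ESC $ B" pairs that occur while JIS X 0208 is already designated, i.e. pairs that change nothing in the decoded text but shift every later byte offset. The function name noop_switch_bytes is hypothetical, and the scanner assumes that all escape sequences are 3 bytes long, which holds for plain ISO-2022-JP but not necessarily for its extensions. It is not how Emacs computes file positions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the number of bytes in BUF[0..LEN) that belong to
   "ESC ( B ESC $ B" pairs seen while JIS X 0208 (ESC $ B) is already
   designated.  Such a pair switches to ASCII and straight back, so it
   has no effect on the text.  Hypothetical helper for illustration.  */
size_t
noop_switch_bytes (const unsigned char *buf, size_t len)
{
  static const unsigned char pair[6] = { 0x1b, '(', 'B', 0x1b, '$', 'B' };
  int in_jisx0208 = 0;   /* the stream starts in ASCII */
  size_t extra = 0;
  size_t i = 0;

  assert (buf != NULL || len == 0);
  while (i < len)
    {
      /* Ordinary bytes, or a truncated escape at the very end.  */
      if (buf[i] != 0x1b || len - i < 3)
        {
          i++;
          continue;
        }
      /* A redundant round trip: count it and keep the current state.  */
      if (in_jisx0208 && len - i >= 6 && memcmp (buf + i, pair, 6) == 0)
        {
          extra += 6;
          i += 6;
          continue;
        }
      /* Any other 3-byte escape: record whether it designates
         JIS X 0208 and skip over it.  */
      in_jisx0208 = (buf[i + 1] == '$' && buf[i + 2] == 'B');
      i += 3;
    }
  return extra;
}
```

For the two letters named above, こ and と encode as the byte pairs "$3" and "$H" in JIS X 0208, so a stream of the form ESC $ B $3 ESC ( B ESC $ B $H ESC ( B yields 6, while ESC $ B $3 $H ESC ( B yields 0.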
bug-gnu-emacs@HIDDEN: bug#46933; Package emacs.
Date: Thu, 04 Mar 2021 21:21:24 +0000
From: Gregory Heytings <gregory@HIDDEN>
To: bug-gnu-emacs@HIDDEN
Subject: Possible bugs in filepos-to-bufferpos / bufferpos-to-filepos
Message-ID: <9cff0f8894f167925251@HIDDEN>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="QDJK79UVpA"

--QDJK79UVpA
Content-Type: text/plain; format=flowed; charset=us-ascii

(Disclaimer: I have no knowledge whatsoever about the ISO-2022-JP
encoding, and although this looks like a bug, I'm not sure this is
actually a bug; I report this at the suggestion of Eli in bug#46859.)

I downloaded the file [1], and converted it to the ISO-2022-JP encoding
with iconv -t iso-2022-jp one.txt > iso-2022-jp.txt.  The resulting file
is attached to this bug report.  It ends with two CRLFs, at byte offsets
2993 and 2995.  However, after emacs -Q iso-2022-jp.txt, with M-:
(goto-char (filepos-to-bufferpos POS 'exact)) we get:

POS = 2991, 2992: last but one visible character (HIRAGANA LETTER RU)
POS = 2993, 2994: last visible character (IDEOGRAPHIC FULL STOP)
POS = 2995, 2996: first CRLF
POS = 2997: second CRLF
POS = 2998: point-max
POS = 2999: first CRLF
POS = 3000, 3001: second CRLF
POS >= 3002: point-max

I would have expected:

POS = 2989, 2990: last but one visible character (HIRAGANA LETTER RU)
POS = 2991, 2992: last visible character (IDEOGRAPHIC FULL STOP)
POS = 2993, 2994: first CRLF
POS = 2995, 2996: second CRLF
POS >= 2997: point-max

The opposite operation M-: (bufferpos-to-filepos (- (point) POS) 'exact)
apparently also has bugs; its return values are not consistent with the
ones above:

POS = 0: 3003
POS = 1: 3001
POS = 2: 2999
POS = 3 (IDEOGRAPHIC FULL STOP): 2997
POS = 4 (HIRAGANA LETTER RU): 2995

I would have expected:

POS = 0: 2997
POS = 1: 2995
POS = 2: 2993
POS = 3 (IDEOGRAPHIC FULL STOP): 2991
POS = 4 (HIRAGANA LETTER RU): 2989

[1] https://darza.com/ecbackend/vendor/symfony/mime/Tests/Fixtures/samples/charsets/iso-2022-jp/one.txt

In GNU Emacs 27.1 (build 1, x86_64-pc-linux-gnu, GTK+ Version 3.24.23, cairo version 1.16.0)
 of 2020-11-08, modified by Debian built on x86-ubc-01
Windowing system distributor 'The X.Org Foundation', version 11.0.12010000
System Description: Debian GNU/Linux bullseye/sid

Configured using:
 'configure --build x86_64-linux-gnu --prefix=/usr --sharedstatedir=/var/lib --libexecdir=/usr/lib --localstatedir=/var/lib --infodir=/usr/share/info --mandir=/usr/share/man --enable-libsystemd --with-pop=yes --enable-locallisppath=/etc/emacs:/usr/local/share/emacs/27.1/site-lisp:/usr/local/share/emacs/site-lisp:/usr/share/emacs/27.1/site-lisp:/usr/share/emacs/site-lisp --with-sound=alsa --without-gconf --with-mailutils --build x86_64-linux-gnu --prefix=/usr --sharedstatedir=/var/lib --libexecdir=/usr/lib --localstatedir=/var/lib --infodir=/usr/share/info --mandir=/usr/share/man --enable-libsystemd --with-pop=yes --enable-locallisppath=/etc/emacs:/usr/local/share/emacs/27.1/site-lisp:/usr/local/share/emacs/site-lisp:/usr/share/emacs/27.1/site-lisp:/usr/share/emacs/site-lisp --with-sound=alsa --without-gconf --with-mailutils --with-cairo --with-x=yes --with-x-toolkit=gtk3 --with-toolkit-scroll-bars 'CFLAGS=-g -O2 -fdebug-prefix-map=/build/emacs-6jKC2B/emacs-27.1+1=. -fstack-protector-strong -Wformat -Werror=format-security -Wall' 'CPPFLAGS=-Wdate-time -D_FORTIFY_SOURCE=2' LDFLAGS=-Wl,-z,relro'

Configured features:
XPM JPEG TIFF GIF PNG RSVG CAIRO SOUND GPM DBUS GSETTINGS GLIB NOTIFY INOTIFY ACL LIBSELINUX GNUTLS LIBXML2 FREETYPE HARFBUZZ M17N_FLT LIBOTF ZLIB TOOLKIT_SCROLL_BARS GTK3 X11 XDBE XIM MODULES THREADS LIBSYSTEMD JSON PDUMPER LCMS2 GMP

Important settings:
  value of $LANG: en_US.UTF-8
  locale-coding-system: utf-8-unix

Major mode: Lisp Interaction

Minor modes in effect:
  tooltip-mode: t
  global-eldoc-mode: t
  eldoc-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  line-number-mode: t
  transient-mark-mode: t

Load-path shadows:
None found.

Features: (shadow sort mail-extr emacsbug message rmc puny dired dired-loaddefs format-spec rfc822 mml easymenu mml-sec password-cache epa derived epg epg-config gnus-util rmail rmail-loaddefs text-property-search time-date subr-x seq byte-opt gv bytecomp byte-compile cconv mm-decode mm-bodies mm-encode mail-parse rfc2231 mailabbrev gmm-utils mailheader cl-loaddefs cl-lib sendmail rfc2047 rfc2045 ietf-drums mm-util mail-prsvr mail-utils tooltip eldoc electric uniquify ediff-hook vc-hooks lisp-float-type mwheel term/x-win x-win term/common-win x-dnd tool-bar dnd fontset image regexp-opt fringe tabulated-list replace newcomment text-mode elisp-mode lisp-mode prog-mode register page tab-bar menu-bar rfn-eshadow isearch timer select scroll-bar mouse jit-lock font-lock syntax facemenu font-core term/tty-colors frame minibuffer cl-generic cham georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms cp51932 hebrew greek romanian slovak czech european ethiopic indian cyrillic chinese composite charscript charprop case-table epa-hook jka-cmpr-hook help simple abbrev obarray cl-preloaded nadvice loaddefs button faces cus-face macroexp files text-properties overlay sha1 md5 base64 format env code-pages mule custom widget hashtable-print-readable backquote threads dbusbind inotify lcms2 dynamic-setting system-font-setting font-render-setting cairo move-toolbar gtk x-toolkit x multi-tty make-network-process emacs) --QDJK79UVpA Content-Type: text/plain; name=iso-2022-jp.txt Content-Transfer-Encoding: BASE64 Content-ID: <9cff0f889484674cce7f@HIDDEN> Content-Description: Content-Disposition: attachment; filename=iso-2022-jp.txt SVNPLTIwMjItSlAbJEIkTyEiJSQlcyU/ITwlTSVDJUg+ZRsoQigbJEJGQyRL RUU7UiVhITwlaxsoQikbJEIkSiRJJEc7SCRvJGwka0Z8S1wkTko4O3pNUSRO Sjg7eklkOWYyPUp9PDAhIxsoQklTTy9JRUMgMjAyMhskQiROJSglOSUxITwl VyU3ITwlMSVzJTkkck14TVEkNyRGSjg7ej04OWckckBaJGpCWCQoJGsbKEI3 GyRCJVMlQyVIJE4lMyE8JUkkRyQiJGskMyRIJHJGQ0QnJEgkOSRrGyhCICgb 
JEIlIiVKJSYlcyU5NSFHPSROJSglOSUxITwlVyU3ITwlMSVzJTkkTz5KTiwk NSRsJGsbKEIpGyRCISNCLyRLIVYbKEJKSVMbJEIlMyE8JUkhVyRIOEYkUCRs JGskMyRIJGIkIiRrISMbKEINCg0KGyRCMzVNVxsoQg0KGyRCRnxLXDhsST01 LSRYJE5NeE1RJCxBW0RqJDUkbCRGJCQka0o4O3olMyE8JUkkRyQiJGohIkZ8 S1w4bCROTXhNUSQ1JGwkayVNJUMlSCVvITwlLyRLJCokJCRGISJGfEtcJE41 LDNKJHIxfk1RJDckPyRiJE4kRyQiJGshIyReJD9KODt6PTg5ZyRIJDckRiRP ISJGfEtcOGwkR01RJCQkaSRsJGs0QTt6ISIkUiRpJCwkSiEiJSslPyUrJUok TyRiJEEkbSRzISIlaSVGJXNKODt6ISIlLiVqJTclIko4O3ohIiUtJWola0o4 O3okSiRJJGI0XiRzJEckKiRqISIzWD1RJGQ7OjZIJE5KLExuJEckTk14TVEk YjlNTjgkPyRiJE4kSCRKJEMkRiQkJGshIzUsM0pMPiRLISIbKEJJU08bJEIk TkZ8S1w4bCROOEA4bCUzITwlSSRHJCIkaxsoQmphGyRCJEckTyRKJC8hIjlx ISZDTzBoTD4lMyE8JUkkThsoQkpQGyRCJCw8KCQ1JGwkRiQkJGskZiQoJHMk RyQiJGshIxsoQg0KGyRCSjg7ej04OWckSCQ3JEYbKEJKSVMgWCAwMjAxGyRC JE4bKEJDMBskQj04OWchSkApOGZKODt6IUshIhsoQkpJUyBYIDAyMDEbJEIk TiVpJUYlc0o4O3o9ODlnISIbKEJJU08gNjQ2GyRCJE45cTpdNHA9YEhHP143 QUo4O3ohIhsoQkpJUyBYIDAyMDgbJEIkThsoQjE5NzgbJEJHL0hHIUobKEJK SVMgQyA2MjI2LTE5NzgbJEIhSyRIGyhCMTk4MxskQkcvJCokaCRTGyhCMTk5 MBskQkcvSEckLE14TVEkRyQtJGshIxsoQkpJUyBYIDAyMDEbJEIkTkpSMj5M Pko4O3o9ODlnJE9NeE1RJEckLSRKJCQhIxsoQjE5ODYbJEJHLzBKOV8hIkZ8 S1wkTkVFO1IlYSE8JWskR01RJCQkaSRsJEYkLSQ/GyhCSlVORVQbJEIlMyE8 JUkkciEiQjwwZj1jISYbKEJNYXJrIENyaXNwaW4bJEIhJhsoQkVyaWsgdmFu IGRlciBQb2VsGyRCJCwbKEIxOTkzGyRCRy8kSxsoQlJGQxskQjI9JDckPyRi JE4bKEIoUkZDIDE0NjgpGyRCISM4ZSRLGyhCSklTIFggMDIwODoxOTk3GyRC JE5JbUIwPXEbKEIyGyRCJEgkNyRGGyhCSklTGyRCJEs1LERqJDUkbCQ/ISMb KEJNSU1FGyRCJEskKiQxJGtKODt6SWQ5ZjI9Sn08MCROPDFKTE1RJE5MPkEw JEgkNyRGGyhCIElBTkEgGyRCJEtFUE8/JDUkbCRGJCQkayEjGyhCDQobJEIk SiQqISJJZDlmMj0kTjtFTU0kSyREJCQkRiRPGyhCSVNPL0lFQyAyMDIyI0lT Ty0yMDIyLUpQGyRCJGI7Mj5IISMbKEINCg0KSVNPLTIwMjItSlAbJEIkSEhz STg9YEUqM0hEJTtITVEbKEINChskQiFWGyhCSklTGyRCJTMhPCVJIVchSiRe JD8kTyFWGyhCSVNPLTIwMjItSlAbJEIhVyFLJEgkJCQmJTMhPCVJTD4kTjUs RGoyPCRHJE8hIiQ9JE47RU1NREwkaiROO0hNUSQsNWEkYSRpJGwkayEjJDck KyQ3ISIbKEJXaW5kb3dzIE9TGyRCPmUkRyRPISI8QjpdJEskTxsoQkNQOTMy 
GyRCJTMhPCVJGyhCIChNaWNyb3NvZnQbJEIkSyRoJGsbKEJTaGlmdCBKSVMb JEIkcjNIRCUkNyQ/MCE8byEjGyhCSVNPLTIwMjItSlAbJEI1LERqMzBKODt6 JCxESTJDJDUkbCRGJCQkayEjIUskSyRoJGtGSDwrM0hEJSFKJE5KODt6IUsk ckNHJGokSiQvO0gkJiUiJVclaiUxITwlNyVnJXMkLEI/JCQhIyQzJE5OYyRI JDckRhsoQkludGVybmV0IEV4cGxvcmVyGyRCJGQbKEJPdXRsb29rIEV4cHJl c3MbJEIkLCQiJGshIyReJD8hIhsoQkVtRWRpdG9yGyRCISI9KDRdJSglRyUj JT8kZBsoQlRodW5kZXJiaXJkGyRCJE4kaCQmJEobKEJNaWNyb3NvZnQbJEI8 UjBKMzAkThsoQldpbmRvd3MbJEIlIiVXJWolMSE8JTclZyVzJEckYkYxTU0k Tj5sOWckLCQiJGshIyQzJE4+bDlnISIbKEJJU08tMjAyMi1KUBskQiROSE8w TzMwJE5KODt6JHI7SCRDJEYkNyReJCYkSCEiMFskSiRrQD1JSjRWJEckT0wk RGo1QUlUTEBKODt6JEgkNyRGRyc8MSQ1JGwkayQrISIkYiQ3JC8kT0o4O3oy PSQxJHI1LyQzJDk4NjB4JEgkSiRrISMkPSROJD8kYSEiGyhCV2luZG93cxsk Qk1RJE5FRTtSJWEhPCVrJS8laSUkJSIlcyVIJEckIiRDJEYkYkZIPCszSEQl JE5KODt6JHI7SE1RJDkkayRIN1k5cCRyPVAkNyQ/JGohIiQiJCgkRjtIJCgk SiQkJGgkJiRLQCk4QiQ3JEYkJCRrJGIkTiRiQjg6XyQ5JGshIyQ1JGkkSyRP GyhCSVNPLTIwMjItSlAbJEIkTkhPME9GYiRHJCIkQyRGJGIbKEJDUDkzMhsk QiRPSHNJOD1gSjg7eiFKGyhCRlVMTFdJRFRIIFRJTERFGyRCRXkhSyRyO30k RCROJEdKODt6Mj0kMSROODYweCRLJEokakZAJGshIxsoQg0KGyRCJF4kPyEi SWQ5ZjI9Sn08MEw+JHIbKEJJU08tMjAyMi1KUBskQiRIJDckRiQkJGskTiRL ISJKODt6PTg5ZyRIJDckRiRPGyhCSklTIFggMDIxMiAoGyRCJCQkbyRmJGtK ZD11NEE7ehsoQikgGyRCJGQbKEJKSVMgWCAwMjAxGyRCJE5KUjI+TD5KODt6 PTg5ZxsoQiAoGyRCJCQkbyRmJGtIPjNRJSslShsoQikgGyRCJHIkYklkOWYy PSQ3JEYkJCRrTmMkLCQiJGskLCEiGyhCSVNPLTIwMjItSlAbJEIkRyRPJDMk bCRpJE5KODt6JHI1dk1GJDckRiQkJEokJCEjJDMkbCRpJE5JZDlmMj0kT0ZI PCszSEQlJE48QkF1JEckIiRqISJDZiRLJE8bKEJJU08vSUVDIDIwMjIbJEIk TjtFTU0kSz1gNXIkOSRpJDckRiQkJEokJCRiJE4kYiQiJGsbKEJbMl0bJEIh Iz0+JEMkRjx1Py5CJiRORUU7UiVhITwlayUvJWklJCUiJXMlSCQsJDMkbCRp JE5GSDwrM0hEJSRLQlAxfiQ3JEYkJCRKJCQ+bDlnISIkPSROSjg7eiQiJGsk JCRPJD0kTko4O3okcjReJGA5VCEiO34kSyRPJUYlLSU5JUhBNEJOJCxKODt6 Mj0kMSQ5JGskMyRIJCwkIiRrISMbKEINCg0K --QDJK79UVpA--
Gregory Heytings <gregory@HIDDEN>: bug-gnu-emacs@HIDDEN.
bug-gnu-emacs@HIDDEN: bug#46933; Package emacs.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997 nCipher Corporation Ltd,
1994-97 Ian Jackson.