From: Stephane Zermatten <szermatt@HIDDEN>
To: bug-gnu-emacs@HIDDEN
Cc: szermatt@HIDDEN
Subject: term.el sometimes prints undecoded multibyte UTF-8 chars
Date: Mon, 31 Mar 2025 17:18:35 +0300
Message-ID: <m2iknpthac.fsf@HIDDEN>

Tags: patch

If I run a shell in a terminal with M-x term, with a very Unicode-heavy
prompt (fish 3.6 + tide), sometimes the Unicode characters are printed
undecoded.

One possible cause of this might be unfortunate chunking in the middle
of a character, which the attached patch fixes.

Without the patch, if I type this in M-x term /usr/bin/bash

  for j in $(seq 0 3); do for i in $(seq 0 30); do printf '\xf0\x9f'; sleep 0.1; printf '\x98\x80'; done; echo; done

I get

  \360\237\203\022\360\...

Instead of:

  😀😀😀😀😀😀😀...

With the patch included, I get the correct output.

The issue comes from an incorrect check (> count partial 0), which
should really be (and (>= count partial) (> partial 0)), but I
simplified that to (> partial 0) in the patch, because the while loop
guarantees (>= count partial).

I rewrote the existing test to cover this case and to try out several
different combinations of chunks.

I'm still looking into other causes of the issue, but this, at least,
seems like an easy fix.

In GNU Emacs 30.1 (build 2, x86_64-apple-darwin23.6.0, NS appkit-2487.70
 Version 14.7.4 (Build 23H420)) of 2025-03-24 built on boomer.zia
Windowing system distributor 'Apple', version 10.3.2487
System Description:  macOS 14.7.4

Configured using:
 'configure --disable-dependency-tracking --disable-silent-rules
 --enable-locallisppath=/usr/local/share/emacs/site-lisp
 --infodir=/usr/local/Cellar/emacs-plus@30/30.1/share/info/emacs
 --prefix=/usr/local/Cellar/emacs-plus@30/30.1
 --with-native-compilation=aot --with-xml2 --with-gnutls
 --without-compress-install --without-dbus --without-imagemagick
 --with-modules --with-rsvg --with-webp --with-ns
 --disable-ns-self-contained 'CFLAGS=-O2 -DFD_SETSIZE=10000
 -DDARWIN_UNLIMITED_SELECT -I/usr/local/opt/sqlite/include
 -I/usr/local/opt/gcc/include -I/usr/local/opt/libgccjit/include'
 'LDFLAGS=-L/usr/local/opt/sqlite/lib -L/usr/local/lib/gcc/14
 -I/usr/local/opt/gcc/include -I/usr/local/opt/libgccjit/include''

[0001-Fix-issue-with-very-short-multibyte-character-chunk.patch (text/patch, attachment)]

From 2bb6cec8f4f72009bcde1edab367f90ab82e5e2a Mon Sep 17 00:00:00 2001
From: Stephane Zermatten <szermatt@HIDDEN>
Date: Mon, 31 Mar 2025 16:41:08 +0300
Subject: [PATCH] Fix issue with very short multibyte character chunk.

Before this change, a chunk containing only a part of a multibyte
character would be discarded and displayed undecoded on the terminal.

* lisp/term.el
---
 lisp/term.el            |  2 +-
 test/lisp/term-tests.el | 15 ++++++++-------
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/lisp/term.el b/lisp/term.el
index 862103d88e6..a971300c055 100644
--- a/lisp/term.el
+++ b/lisp/term.el
@@ -3116,7 +3116,7 @@ term-emulate-terminal
                                                          (- count 1 partial)))
                                      'eight-bit))
                        (incf partial))
-                     (when (> count partial 0)
+                     (when (> partial 0)
                        (setq term-terminal-undecoded-bytes
                              (substring decoded-substring (- partial)))
                        (setq decoded-substring
diff --git a/test/lisp/term-tests.el b/test/lisp/term-tests.el
index 5ef8c1174df..aad84e171b2 100644
--- a/test/lisp/term-tests.el
+++ b/test/lisp/term-tests.el
@@ -402,13 +402,14 @@ term-to-margin
 (ert-deftest term-decode-partial () ;; Bug#25288.
   "Test multibyte characters sent into multiple chunks."
   ;; Set `locale-coding-system' so test will be deterministic.
-  (let* ((locale-coding-system 'utf-8-unix)
-         (string (make-string 7 ?ш))
-         (bytes (encode-coding-string string locale-coding-system)))
-    (should (equal string
-                   (term-test-screen-from-input
-                    40 1 `(,(substring bytes 0 (/ (length bytes) 2))
-                           ,(substring bytes (/ (length bytes) 2))))))))
+  (let ((locale-coding-system 'utf-8-unix))
+    (should (equal "шшш" (term-test-screen-from-input
+                          40 1 '("\321" "\210\321\210\321\210"))))
+    (should (equal "шшш" (term-test-screen-from-input
+                          40 1 '("\321\210\321" "\210\321\210"))))
+    (should (equal "шшш" (term-test-screen-from-input
+                          40 1 '("\321\210\321\210\321" "\210"))))))
+
 (ert-deftest term-undecodable-input () ;; Bug#29918.
   "Undecodable bytes should be passed through without error."
   (let* ((locale-coding-system 'utf-8-unix) ; As above.
-- 
2.47.0
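
The comparison discussed in the report can be tried in isolation.  The
sketch below is only an illustration, not code from term.el: the helper
name demo-partial-checks is made up, and it assumes that count is the
length of the decoded chunk and partial the number of trailing bytes
left undecoded, as the loop in the patch context suggests.  It relies
only on the fact that `>' with three arguments tests a strictly
decreasing chain.

```elisp
;; Hypothetical helper for illustration only; it is not part of term.el.
;; COUNT stands for the length of the decoded chunk, PARTIAL for the
;; number of trailing undecoded bytes (assumed meanings, see above).
(defun demo-partial-checks (count partial)
  "Return the results of the old, intended and patched checks as a list."
  (list (> count partial 0)                     ; old: also needs count > partial
        (and (>= count partial) (> partial 0))  ; intended check
        (> partial 0)))                         ; patched, valid when partial <= count

;; The whole chunk is one incomplete character: the old check fails.
(demo-partial-checks 1 1)   ; => (nil t t)
;; One decoded character followed by one stray byte: all checks agree.
(demo-partial-checks 2 1)   ; => (t t t)
;; Nothing left undecoded: all checks agree that there is nothing to save.
(demo-partial-checks 3 0)   ; => (nil nil nil)
```

The first call corresponds to the case where count equals partial, which
is the only case where the old and the patched checks differ.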
bug#77410; Package emacs.