GNU logs - #79724, boring messages


Message sent to monnier@HIDDEN, bug-gnu-emacs@HIDDEN:


X-Loop: help-debbugs@HIDDEN
Subject: bug#79724: 31.0.50; No easy way of searching a buffer for raw bytes
Resent-From: Eli Zaretskii <eliz@HIDDEN>
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: monnier@HIDDEN, bug-gnu-emacs@HIDDEN
Resent-Date: Thu, 30 Oct 2025 09:35:01 +0000
Resent-Message-ID: <handler.79724.B.17618168993028 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: report 79724
X-GNU-PR-Package: emacs
X-GNU-PR-Keywords: 
To: 79724 <at> debbugs.gnu.org
Cc: Stefan Monnier <monnier@HIDDEN>
X-Debbugs-Original-To: bug-gnu-emacs@HIDDEN
X-Debbugs-Original-Xcc: Stefan Monnier <monnier@HIDDEN>
Received: via spool by submit <at> debbugs.gnu.org id=B.17618168993028
          (code B ref -1); Thu, 30 Oct 2025 09:35:01 +0000
Received: (at submit) by debbugs.gnu.org; 30 Oct 2025 09:34:59 +0000
Received: from localhost ([127.0.0.1]:34182 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1vEP3a-0000ml-Mu
	for submit <at> debbugs.gnu.org; Thu, 30 Oct 2025 05:34:59 -0400
Received: from lists.gnu.org ([2001:470:142::17]:57270)
 by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.84_2) (envelope-from <eliz@HIDDEN>) id 1vEP3S-0000mO-HV
 for submit <at> debbugs.gnu.org; Thu, 30 Oct 2025 05:34:51 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10])
 by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <eliz@HIDDEN>) id 1vEP3F-0005t1-8e
 for bug-gnu-emacs@HIDDEN; Thu, 30 Oct 2025 05:34:39 -0400
Received: from fencepost.gnu.org ([2001:470:142:3::e])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <eliz@HIDDEN>) id 1vEP3C-0006XD-NS
 for bug-gnu-emacs@HIDDEN; Thu, 30 Oct 2025 05:34:36 -0400
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org;
 s=fencepost-gnu-org; h=MIME-version:Subject:To:From:Date:in-reply-to:
 references; bh=CgOFuZW3oeoHVmJ1rRGxUhM0BFFOhNC61MaiFdyLYq8=; b=UbD/vhANXHrJH0
 tsrMkvOv3jJPhjp/VFHqZXuRLIU2c22cD0jAj4kX5MmOiYlBnFB0U3trZ2165/p64h3Crrv9LoOmB
 5iF1x7dEgBmDnsL8Xaw6ke1LkaZWh98Sypt9bgorUKhilfbW+Jn2PqhXPDHV5HSFD0DghpYRkDlRg
 YxdL7EETlq2IRKIU2uiNOGUDoIZP7hYjWpyRITuYt8OvLJPazZh3XsgALtZAwbqPxa0ATFnJNO6po
 5OPmAOJn8ArrEiP5dBvPyp92PVZhbE76f5YGHerJ5oPE+t+OhE/liQUYZdOIzzsBntWYzAl796hCP
 O6B4rPpXs2r2pQbt//3w==;
Date: Thu, 30 Oct 2025 11:34:18 +0200
Message-Id: <86ikfwn36t.fsf@HIDDEN>
From: Eli Zaretskii <eliz@HIDDEN>
MIME-version: 1.0
Content-type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Spam-Score: -0.0 (/)
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -1.0 (-)

From: eliz@HIDDEN
--text follows this line--
As the subject says, how can a user easily search for raw bytes in a
buffer?  Or how can a Lisp program quickly scan a buffer to find raw
bytes and either remove or replace them?

To reproduce, start "emacs -Q" then insert a raw byte by typing

  C-x 8 RET 3fffe0 RET

Then try to come up with a regexp that finds only the raw byte.

This is important when one has a buffer which could include raw bytes,
and wants to json-serialize it, in which case there's a need to remove
raw bytes or replace them with something that will avoid signaling an
error from the serialization code.

The only way I found is to examine the buffer one character at a time
using charset-after.  But this is tedious and inefficient.
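
A minimal sketch of the character-by-character scan described above,
assuming a multibyte buffer.  The name `my-raw-byte-positions' is
hypothetical, not an existing Emacs function; the sketch relies only
on `charset-after' returning the `eight-bit' charset for raw bytes.

```elisp
;; Sketch only: `my-raw-byte-positions' is a hypothetical helper name.
;; Assumes a multibyte buffer, where raw bytes are stored as characters
;; belonging to the `eight-bit' charset.
(defun my-raw-byte-positions (&optional buffer)
  "Return a list of positions in BUFFER that hold raw bytes.
BUFFER defaults to the current buffer.  The scan visits one character
at a time and tests `charset-after', so it is slow on large buffers."
  (with-current-buffer (or buffer (current-buffer))
    (save-excursion
      (let (positions)
        (goto-char (point-min))
        (while (not (eobp))
          ;; `charset-after' with no argument looks at the char at point.
          (when (eq (charset-after) 'eight-bit)
            (push (point) positions))
          (forward-char 1))
        (nreverse positions)))))
```

Evaluating (my-raw-byte-positions) in a buffer prepared with the recipe
above should return the position of the inserted raw byte.  The loop
calls `charset-after' once per character, which is the inefficiency
this report complains about.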

I seem to be unable to find a way to express this with regexps.  The
naïve way would be "[\u3fff00-\u3fffff]", but that doesn't work (it
finds ASCII letters and nothing else).  Nothing else I tried worked,
including the recipe from the ELisp manual:

       4. If the end points of a range are raw 8-bit bytes (*note Text
          Representations::), or if the range start is ASCII and the end
          is a raw byte (as in ‘[a-\377]’), the range will match only
          ASCII characters and raw 8-bit bytes, but not non-ASCII
          characters.  This feature is intended for searching text in
          unibyte buffers and strings.

In a buffer that includes only ASCII characters and a raw byte, typing
"C-M-s [a-\377]" signals an error "Failing regexp search".

Is there a solution for this job that I'm missing?  If so, we should at
least document it.  If there's no solution currently, I think we
should add something to make it easier.


In GNU Emacs 31.0.50 (build 1458, i686-pc-mingw32) of 2025-10-30 built
 on ELIZ-PC
Repository revision: 06b3f11cb8f040d192a91972b40eab8c85a2cc5b
Repository branch: master
Windowing system distributor 'Microsoft Corp.', version 10.0.26100
System Description: Microsoft Windows 10 Enterprise (v10.0.2009.26100.6899)

Configured using:
 'configure -C --prefix=/d/usr --with-wide-int
 --without-native-compilation --enable-checking=yes,glyphs 'CFLAGS=-O0
 -gdwarf-4 -g3''

Configured features:
ACL GIF GMP GNUTLS HARFBUZZ JPEG LCMS2 LIBXML2 MODULES NOTIFY W32NOTIFY
PDUMPER PNG RSVG SOUND SQLITE3 THREADS TIFF TOOLKIT_SCROLL_BARS
TREE_SITTER WEBP XPM ZLIB

Important settings:
  value of $LANG: ENG
  locale-coding-system: cp1252

Major mode: Lisp Interaction

Minor modes in effect:
  tooltip-mode: t
  global-eldoc-mode: t
  eldoc-mode: t
  show-paren-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  minibuffer-nonselected-mode: t
  minibuffer-regexp-mode: t
  line-number-mode: t
  indent-tabs-mode: t
  transient-mark-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t

Load-path shadows:
None found.

Features:
(shadow sort mail-extr emacsbug lisp-mnt message mailcap yank-media puny
dired dired-loaddefs rfc822 mml mml-sec password-cache epa derived epg
rfc6068 epg-config gnus-util text-property-search time-date subr-x
mm-decode mm-bodies mm-encode mailabbrev gmm-utils mailheader sendmail
mail-parse rfc2231 rfc2047 rfc2045 ietf-drums mm-util mail-prsvr
mail-utils warnings icons cl-loaddefs cl-lib rmc iso-transl tooltip
cconv eldoc paren electric uniquify ediff-hook vc-hooks lisp-float-type
elisp-mode mwheel touch-screen dos-w32 ls-lisp term/w32-nt disp-table
term/w32-win w32-win w32-vars term/common-win tool-bar dnd fontset image
regexp-opt fringe tabulated-list replace newcomment text-mode lisp-mode
prog-mode register page tab-bar menu-bar rfn-eshadow isearch easymenu
timer select scroll-bar mouse jit-lock font-lock syntax font-core
term/tty-colors frame minibuffer nadvice seq simple cl-generic
indonesian philippine cham georgian utf-8-lang misc-lang vietnamese
tibetan thai tai-viet lao korean japanese eucjp-ms cp51932 hebrew greek
romanian slovak czech european ethiopic indian cyrillic chinese
composite emoji-zwj charscript charprop case-table epa-hook
jka-cmpr-hook help abbrev obarray oclosure cl-preloaded button loaddefs
theme-loaddefs faces cus-face macroexp files window text-properties
overlay sha1 md5 base64 format env code-pages mule custom widget keymap
hashtable-print-readable backquote threads w32notify w32 lcms2 multi-tty
move-toolbar make-network-process tty-child-frames emacs)

Memory information:
((conses 16 46888 16793) (symbols 48 6655 0) (strings 16 16703 2197)
 (string-bytes 1 346778) (vectors 16 9844)
 (vector-slots 8 115806 11013) (floats 8 23 6) (intervals 40 310 75)
 (buffers 928 10))




Message sent:


Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
X-Mailer: MIME-tools 5.505 (Entity 5.505)
Content-Type: text/plain; charset=utf-8
X-Loop: help-debbugs@HIDDEN
From: help-debbugs@HIDDEN (GNU bug Tracking System)
To: Eli Zaretskii <eliz@HIDDEN>
Subject: bug#79724: Acknowledgement (31.0.50; No easy way of searching a
 buffer for raw bytes)
Message-ID: <handler.79724.B.17618168993028.ack <at> debbugs.gnu.org>
References: <86ikfwn36t.fsf@HIDDEN>
X-Gnu-PR-Message: ack 79724
X-Gnu-PR-Package: emacs
Reply-To: 79724 <at> debbugs.gnu.org
Date: Thu, 30 Oct 2025 09:35:02 +0000

Thank you for filing a new bug report with debbugs.gnu.org.

This is an automatically generated reply to let you know your message
has been received.

Your message is being forwarded to the package maintainers and other
interested parties for their attention; they will reply in due course.

As you requested using X-Debbugs-CC, your message was also forwarded to
  Stefan Monnier <monnier@HIDDEN>
(after having been given a bug report number, if it did not have one).

Your message has been sent to the package maintainer(s):
 bug-gnu-emacs@HIDDEN

If you wish to submit further information on this problem, please
send it to 79724 <at> debbugs.gnu.org.

Please do not send mail to help-debbugs@HIDDEN unless you wish
to report a problem with the Bug-tracking system.

-- 
79724: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=79724
GNU Bug Tracking System
Contact help-debbugs@HIDDEN with problems


Message sent to bug-gnu-emacs@HIDDEN:


X-Loop: help-debbugs@HIDDEN
Subject: bug#79724: 31.0.50; No easy way of searching a buffer for raw bytes
Resent-From: Eli Zaretskii <eliz@HIDDEN>
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-gnu-emacs@HIDDEN
Resent-Date: Thu, 30 Oct 2025 10:23:02 +0000
Resent-Message-ID: <handler.79724.B79724.176181977413036 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: followup 79724
X-GNU-PR-Package: emacs
X-GNU-PR-Keywords: 
To: 79724 <at> debbugs.gnu.org
Cc: Mattias Engdegård <mattiase@HIDDEN>, monnier@HIDDEN
Received: via spool by 79724-submit <at> debbugs.gnu.org id=B79724.176181977413036
          (code B ref 79724); Thu, 30 Oct 2025 10:23:02 +0000
Received: (at 79724) by debbugs.gnu.org; 30 Oct 2025 10:22:54 +0000
Received: from localhost ([127.0.0.1]:34419 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1vEPnx-0003OB-Dn
	for submit <at> debbugs.gnu.org; Thu, 30 Oct 2025 06:22:53 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10]:41086)
 by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.84_2) (envelope-from <eliz@HIDDEN>) id 1vEPns-0003Ne-O7
 for 79724 <at> debbugs.gnu.org; Thu, 30 Oct 2025 06:22:50 -0400
Received: from fencepost.gnu.org ([2001:470:142:3::e])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <eliz@HIDDEN>)
 id 1vEPnj-0004fP-Jb; Thu, 30 Oct 2025 06:22:39 -0400
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org;
 s=fencepost-gnu-org; h=MIME-version:References:Subject:In-Reply-To:To:From:
 Date; bh=UUAvNwgNoZb/hoEg2gQwcp0GwagS9dogjENASBTukME=; b=V945RDbEDPApDDgh3qVd
 H8E+COPZzaYct8Enjc6Dgf1OlUeur+hZ/oLkIIluj4pMGT+MPptbBig4DWuEbBYR0vQzE05ZnAXPN
 Gx08ftDrHTvaYoP9A6GbSTLsZ0aHpmbwgG2xDu0KjgV/jppOCCoAGLtEztQtWbKuDKq0ErC1YACW+
 4ZELyh3Oux3fg580mzH9s29oR21PqCtKDVNj7J9/W8v+MZ2mCEfUj62bEhIK+8tpsNgz9V4vcEyU3
 Tqo8YCaDVLkNK4LVYZT9hQWOwLz5BGXq7BcazOvEqcFzPv2NbEJYqV/1lIAus2g0xAIhKRv8M5zkE
 wNcZqyG7MhjbDA==;
Date: Thu, 30 Oct 2025 12:22:34 +0200
Message-Id: <86frb0n0yd.fsf@HIDDEN>
From: Eli Zaretskii <eliz@HIDDEN>
In-Reply-To: <86ikfwn36t.fsf@HIDDEN> (message from Eli Zaretskii on Thu, 30
 Oct 2025 11:34:18 +0200)
References: <86ikfwn36t.fsf@HIDDEN>
MIME-version: 1.0
Content-type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Spam-Score: -2.3 (--)
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -3.3 (---)

Let me add Mattias as well to the discussion, since he made some of
the changes in this area.

> Cc: Stefan Monnier <monnier@HIDDEN>
> Date: Thu, 30 Oct 2025 11:34:18 +0200
> From: Eli Zaretskii <eliz@HIDDEN>
> 
> As the subject says, how can a user easily search for raw bytes in a
> buffer?  Or how can a Lisp program quickly scan a buffer to find raw
> bytes and either remove or replace them?
> 
> To reproduce, start "emacs -Q" then insert a raw byte by typing
> 
>   C-x 8 RET 3fffe0 RET
> 
> Then try to come up with a regexp that finds only the raw byte.
> 
> This is important when one has a buffer which could include raw bytes,
> and wants to json-serialize it, in which case there's a need to remove
> raw bytes or replace them with something that will avoid signaling an
> error from the serialization code.
> 
> The only way I found is to examine the buffer one character at a time
> using charset-after.  But this is tedious and inefficient.
> 
> I seem to be unable to find a way to express this with regexps.  The
> naïve way would be "[\u3fff00-\u3fffff]", but that doesn't work (it
> finds ASCII letters and nothing else).  Nothing else I tried worked,
> including the recipe from the ELisp manual:
> 
>        4. If the end points of a range are raw 8-bit bytes (*note Text
>           Representations::), or if the range start is ASCII and the end
>           is a raw byte (as in ‘[a-\377]’), the range will match only
>           ASCII characters and raw 8-bit bytes, but not non-ASCII
>           characters.  This feature is intended for searching text in
>           unibyte buffers and strings.
> 
> In a buffer that includes only ASCII characters and a raw byte, typing
> "C-M-s [a-\377]" signals an error "Failing regexp search".
> 
> Is there a solution for this job that I'm missing?  If so, we should at
> least document it.  If there's no solution currently, I think we
> should add something to make it easier.




Message sent to bug-gnu-emacs@HIDDEN:


X-Loop: help-debbugs@HIDDEN
Subject: bug#79724: 31.0.50; No easy way of searching a buffer for raw bytes
Resent-From: Stephen Berman <stephen.berman@HIDDEN>
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-gnu-emacs@HIDDEN
Resent-Date: Thu, 30 Oct 2025 10:29:02 +0000
Resent-Message-ID: <handler.79724.B79724.176182012614769 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: followup 79724
X-GNU-PR-Package: emacs
X-GNU-PR-Keywords: 
To: Eli Zaretskii <eliz@HIDDEN>
Cc: 79724 <at> debbugs.gnu.org, Stefan Monnier <monnier@HIDDEN>
Received: via spool by 79724-submit <at> debbugs.gnu.org id=B79724.176182012614769
          (code B ref 79724); Thu, 30 Oct 2025 10:29:02 +0000
Received: (at 79724) by debbugs.gnu.org; 30 Oct 2025 10:28:46 +0000
Received: from localhost ([127.0.0.1]:34462 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1vEPte-0003q8-BP
	for submit <at> debbugs.gnu.org; Thu, 30 Oct 2025 06:28:46 -0400
Received: from mout.gmx.net ([212.227.15.19]:53087)
 by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.84_2) (envelope-from <stephen.berman@HIDDEN>)
 id 1vEPtY-0003p5-0k
 for 79724 <at> debbugs.gnu.org; Thu, 30 Oct 2025 06:28:41 -0400
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmx.net;
 s=s31663417; t=1761820112; x=1762424912; i=stephen.berman@HIDDEN;
 bh=BjdiFbW4GjxJ0hkf/LzUUquUmNHdf/GVqVghPPcioOA=;
 h=X-UI-Sender-Class:From:To:Cc:Subject:In-Reply-To:References:Date:
 Message-ID:MIME-Version:Content-Type:Content-Transfer-Encoding:cc:
 content-transfer-encoding:content-type:date:from:message-id:
 mime-version:reply-to:subject:to;
 b=mda/4TMqGOLshsyfyU9bpVEG9dCaBYZ5LrgGQ2D+7ViPJz5GIsIsCL3CjLSbl0wE
 Laz5raWnEUumRK8L4AI+rpZLcAjKC6sxpfQ9Auw427jvi+660FjFvyjMdhLARDbzS
 C771ZbLmaRqq/JTujmOr9O9WoMoUWt4BwG19uOujvS5XKxUCVvgPfqM+w2Gqw04KR
 2J6qqNehfHcWeKH3qmhkTH0RxtAPwCTHrhPNiQQ09741o73MdiblkiNEasCP2OPW3
 l6gTB6E2mMu2M6ww7xjENZj6zJNZIXnL1PCj9J1LhlWiOIUo3S0gT2cYbEw2gM87/
 z/u0aEU+mrbnBmd6Zw==
X-UI-Sender-Class: 724b4f7f-cbec-4199-ad4e-598c01a50d3a
Received: from strobelfssd ([88.130.62.64]) by mail.gmx.net (mrgmx005
 [212.227.17.190]) with ESMTPSA (Nemesis) id 1M2wL0-1vHhgS1vLQ-008O59; Thu, 30
 Oct 2025 11:28:32 +0100
From: Stephen Berman <stephen.berman@HIDDEN>
In-Reply-To: <86ikfwn36t.fsf@HIDDEN>
References: <86ikfwn36t.fsf@HIDDEN>
Date: Thu, 30 Oct 2025 11:28:30 +0100
Message-ID: <875xbwof8x.fsf@HIDDEN>
User-Agent: Gnus/5.13 (Gnus v5.13)
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-Provags-ID: V03:K1:KCycR3oiTJqOLOGGZ+deM/F4+oQxoYnbfArfc0D5ipEgWfHv5BF
 CAvNQNGk9SrKLMbfiSL2PnQCwVEAuXqKOVihGPl05zIPGXRUjzy+bl2uQbw+/EkjaEQteRs
 rNkREucKKqp1i4NFg0WmFJqiiUbRMFwruwZHj6VkhslmdhVzZszfISP3BoufKqbfB9hJgoY
 jmKb1S7LI+gUkj/DuRvPQ==
X-Spam-Flag: NO
X-Spam-Score: -0.7 (/)
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -1.7 (-)

On Thu, 30 Oct 2025 11:34:18 +0200 Eli Zaretskii <eliz@HIDDEN> wrote:

> From: eliz@HIDDEN
> --text follows this line--
> As the subject says, how can a user easily search for raw bytes in a
> buffer?  Or how can a Lisp program quickly scan a buffer to find raw
> bytes and either remove or replace them?
>
> To reproduce, start "emacs -Q" then insert a raw byte by typing
>
>   C-x 8 RET 3fffe0 RET
>
> Then try to come up with a regexp that finds only the raw byte.
>
> This is important when one has a buffer which could include raw bytes,
> and wants to json-serialize it, in which case there's a need to remove
> raw bytes or replace them with something that will avoid signaling an
> error from the serialization code.
>
> The only way I found is to examine the buffer one character at a time
> using charset-after.  But this is tedious and inefficient.
>
> I seem to be unable to find a way to express this with regexps.  The
> naïve way would be "[\u3fff00-\u3fffff]", but that doesn't work (it
> finds ASCII letters and nothing else).  Nothing else I tried worked,
> including the recipe from the ELisp manual:
>
>        4. If the end points of a range are raw 8-bit bytes (*note Text
>           Representations::), or if the range start is ASCII and the end
>           is a raw byte (as in ‘[a-\377]’), the range will match only
>           ASCII characters and raw 8-bit bytes, but not non-ASCII
>           characters.  This feature is intended for searching text in
>           unibyte buffers and strings.
>
> In a buffer that includes only ASCII characters and a raw byte, typing
> "C-M-s [a-\377]" signals an error "Failing regexp search".
>
> Is there a solution for this job that I'm missing?

This seems to work, at least for your examples (also if I add them to
the HELLO buffer):

C-M-s [^[:ascii:][:print:]]

Steve Berman




Message sent to bug-gnu-emacs@HIDDEN:


X-Loop: help-debbugs@HIDDEN
Subject: bug#79724: 31.0.50; No easy way of searching a buffer for raw bytes
Resent-From: Mattias Engdegård <mattias.engdegard@HIDDEN>
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-gnu-emacs@HIDDEN
Resent-Date: Thu, 30 Oct 2025 11:11:01 +0000
Resent-Message-ID: <handler.79724.B79724.176182261624169 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: followup 79724
X-GNU-PR-Package: emacs
X-GNU-PR-Keywords: 
To: Eli Zaretskii <eliz@HIDDEN>
Cc: 79724 <at> debbugs.gnu.org, monnier@HIDDEN
Received: via spool by 79724-submit <at> debbugs.gnu.org id=B79724.176182261624169
          (code B ref 79724); Thu, 30 Oct 2025 11:11:01 +0000
Received: (at 79724) by debbugs.gnu.org; 30 Oct 2025 11:10:16 +0000
Received: from localhost ([127.0.0.1]:34703 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1vEQXn-0006Hk-CJ
	for submit <at> debbugs.gnu.org; Thu, 30 Oct 2025 07:10:15 -0400
Received: from mail-lj1-x234.google.com ([2a00:1450:4864:20::234]:52727)
 by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128)
 (Exim 4.84_2) (envelope-from <mattias.engdegard@HIDDEN>)
 id 1vEQXg-0006Gk-Ry
 for 79724 <at> debbugs.gnu.org; Thu, 30 Oct 2025 07:10:11 -0400
Received: by mail-lj1-x234.google.com with SMTP id
 38308e7fff4ca-37a1267c45dso5672481fa.1
 for <79724 <at> debbugs.gnu.org>; Thu, 30 Oct 2025 04:10:08 -0700 (PDT)
X-Received: by 2002:a2e:bea5:0:b0:378:ebd7:ad0 with SMTP id
 38308e7fff4ca-37a052e66c8mr20876421fa.17.1761822600451; 
 Thu, 30 Oct 2025 04:10:00 -0700 (PDT)
Received: from smtpclient.apple (c188-150-186-155.bredband.tele2.se.
 [188.150.186.155]) by smtp.gmail.com with ESMTPSA id
 2adb3069b0e04-59301f50996sm4474975e87.36.2025.10.30.04.09.59
 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128);
 Thu, 30 Oct 2025 04:10:00 -0700 (PDT)
Content-Type: text/plain;
	charset=utf-8
Mime-Version: 1.0 (Mac OS X Mail 14.0 \(3654.120.0.1.15\))
From: Mattias =?UTF-8?Q?Engdeg=C3=A5rd?= <mattias.engdegard@HIDDEN>
In-Reply-To: <86frb0n0yd.fsf@HIDDEN>
Date: Thu, 30 Oct 2025 12:09:58 +0100
Content-Transfer-Encoding: quoted-printable
Message-Id: <1BFAF201-1A4D-4B6D-867F-5EE337A6C9C4@HIDDEN>
References: <86ikfwn36t.fsf@HIDDEN> <86frb0n0yd.fsf@HIDDEN>
X-Mailer: Apple Mail (2.3654.120.0.1.15)
X-Spam-Score: 0.0 (/)
X-Spam-Score: -1.0 (-)

> As the subject says, how can a user easily search for raw bytes in a
> buffer?  Or how can a Lisp program quickly scan a buffer to find raw
> bytes and either remove or replace them?

  (re-search-forward (rx (in (#x3fff80 . #x3fffff))))

assuming it's a multibyte buffer, which is almost always the case.

If you want to find all non-Unicode values, including the embarrassing
range in #x110000..#x3fff7f that we don't speak about, maybe you'd like

  (re-search-forward (rx (not (in (0 . #x10ffff)))))

or if you prefer skip-chars-forward,

  (skip-chars-forward "\0-\x10ffff")

etc.
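
As a rough illustration of the "remove or replace" use case from the
report, the first regexp above could be wrapped in a small helper along
these lines.  This is only a sketch: the function name
`my-replace-raw-bytes' and its optional argument are hypothetical, not
part of Emacs, and it assumes the current buffer is multibyte.

```elisp
;; Hypothetical helper, not part of Emacs; the name and the default
;; replacement are illustrative assumptions.
(defun my-replace-raw-bytes (&optional replacement)
  "Replace each raw byte in the current multibyte buffer with REPLACEMENT.
REPLACEMENT defaults to the empty string, which deletes the raw bytes.
Return the number of raw bytes replaced."
  (let ((count 0))
    (save-excursion
      (goto-char (point-min))
      ;; Raw bytes occupy the character codes #x3fff80..#x3fffff.
      (while (re-search-forward (rx (in (#x3fff80 . #x3fffff))) nil t)
        ;; FIXEDCASE and LITERAL are both t: insert REPLACEMENT verbatim.
        (replace-match (or replacement "") t t)
        (setq count (1+ count))))
    count))
```

Calling (my-replace-raw-bytes "?") before serializing would turn each
raw byte into a question mark; calling it with no argument deletes them.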

> naïve way would be "[\u3fff00-\u3fffff]", but that doesn't work

Actually you almost nailed it, it's just a char escape matter: it's
either \u with four, \U with eight or \x with any number of hex digits.
(Or just use rx.)
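
To make the escape rule concrete, here is a small sketch of how the Lisp
reader splits the string from the report.  Because \u stops after four
hex digits, "[\u3fff00-\u3fffff]" is read as a set containing U+3FFF,
the digit 0, the range from 0 to U+3FFF, and two letters f.  That range
includes the ASCII letters, which would explain what the search found.

```elisp
;; \u consumes exactly four hex digits, so the trailing "00" are two
;; ordinary characters rather than part of the code point.
(string-to-list "\u3fff00")   ; => (16383 48 48), i.e. U+3FFF, ?0, ?0

;; \x consumes every hex digit that follows, giving a single element.
(length "\x3fff80")           ; => 1
```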





Message sent to bug-gnu-emacs@HIDDEN:


X-Loop: help-debbugs@HIDDEN
Subject: bug#79724: 31.0.50; No easy way of searching a buffer for raw bytes
Resent-From: Eli Zaretskii <eliz@HIDDEN>
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-gnu-emacs@HIDDEN
Resent-Date: Thu, 30 Oct 2025 11:27:02 +0000
Resent-Message-ID: <handler.79724.B79724.176182357028479 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: followup 79724
X-GNU-PR-Package: emacs
X-GNU-PR-Keywords: 
To: Stephen Berman <stephen.berman@HIDDEN>
Cc: 79724 <at> debbugs.gnu.org, monnier@HIDDEN
Received: via spool by 79724-submit <at> debbugs.gnu.org id=B79724.176182357028479
          (code B ref 79724); Thu, 30 Oct 2025 11:27:02 +0000
Received: (at 79724) by debbugs.gnu.org; 30 Oct 2025 11:26:10 +0000
Received: from localhost ([127.0.0.1]:34811 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1vEQn9-0007P8-2I
	for submit <at> debbugs.gnu.org; Thu, 30 Oct 2025 07:26:10 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10]:34986)
 by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.84_2) (envelope-from <eliz@HIDDEN>) id 1vEQn3-0007OK-7y
 for 79724 <at> debbugs.gnu.org; Thu, 30 Oct 2025 07:26:04 -0400
Received: from fencepost.gnu.org ([2001:470:142:3::e])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <eliz@HIDDEN>)
 id 1vEQmw-0006Eb-5C; Thu, 30 Oct 2025 07:25:54 -0400
Date: Thu, 30 Oct 2025 13:25:49 +0200
Message-Id: <86ecqkmy0y.fsf@HIDDEN>
From: Eli Zaretskii <eliz@HIDDEN>
In-Reply-To: <875xbwof8x.fsf@HIDDEN> (message from Stephen Berman on Thu, 30
 Oct 2025 11:28:30 +0100)
References: <86ikfwn36t.fsf@HIDDEN> <875xbwof8x.fsf@HIDDEN>
MIME-version: 1.0
Content-type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Spam-Score: -2.3 (--)
X-Spam-Score: -3.3 (---)

> From: Stephen Berman <stephen.berman@HIDDEN>
> Cc: 79724 <at> debbugs.gnu.org,  Stefan Monnier <monnier@HIDDEN>
> Date: Thu, 30 Oct 2025 11:28:30 +0100
> 
> On Thu, 30 Oct 2025 11:34:18 +0200 Eli Zaretskii <eliz@HIDDEN> wrote:
> 
> > From: eliz@HIDDEN
> > --text follows this line--
> > As the subject says, how can a user easily search for raw bytes in a
> > buffer?  Or how can a Lisp program quickly scan a buffer to find raw
> > bytes and either remove or replace them?
> >
> > To reproduce, start "emacs -Q" then insert a raw byte by typing
> >
> >   C-x 8 RET 3fffe0 RET
> >
> > Then try to come up with a regexp that finds only the raw byte.
> >
> > This is important when one has a buffer which could include raw bytes,
> > and wants to json-serialize it, in which case there's a need to remove
> > raw bytes or replace them with something that will avoid signaling an
> > error from the serialization code.
> >
> > The only way I found is to examine the buffer one character at a time
> > using charset-after.  But this is tedious and inefficient.
> >
> > I seem to be unable to find a way to express this with regexps.  The
> > naïve way would be "[\u3fff00-\u3fffff]", but that doesn't work (it
> > finds ASCII letters and nothing else).  Nothing else I tried worked,
> > including the recipe from the ELisp manual:
> >
> >        4. If the end points of a range are raw 8-bit bytes (*note Text
> >           Representations::), or if the range start is ASCII and the end
> >           is a raw byte (as in ‘[a-\377]’), the range will match only
> >           ASCII characters and raw 8-bit bytes, but not non-ASCII
> >           characters.  This feature is intended for searching text in
> >           unibyte buffers and strings.
> >
> > In a buffer that includes only ASCII characters and a raw byte, typing
> > "C-M-s [a-\377]" signals an error "Failing regexp search".
> >
> > Is there a solution for this job that I'm missing?
> 
> This seems to work, at least for your examples (also if I add them to
> the HELLO buffer):
> 
> C-M-s [^[:ascii:][:print:]]

Thanks, but that will find also other codepoints.  (If it doesn't,
it's a separate bug.)  And anyway, how would one decide this should
work based on the documentation of [:print:]?






Last modified: Thu, 30 Oct 2025 11:30:02 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.