GNU bug report logs - #52769
29.0.50; [FEATURE REQUEST] repunctuate-sentences in region


Package: emacs;

Reported by: Rudolf Adamkovič <salutis <at> me.com>

Date: Fri, 24 Dec 2021 10:14:01 UTC

Severity: wishlist

Fixed in version 29.0.50

Done: Juri Linkov <juri <at> linkov.net>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 52769 in the body.
You can then email your comments to 52769 AT debbugs.gnu.org in the normal way.



Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#52769; Package emacs. (Fri, 24 Dec 2021 10:14:01 GMT) Full text and rfc822 format available.

Acknowledgement sent to Rudolf Adamkovič <salutis <at> me.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Fri, 24 Dec 2021 10:14:01 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Rudolf Adamkovič <salutis <at> me.com>
To: bug-gnu-emacs <at> gnu.org
Subject: 29.0.50; [FEATURE REQUEST] repunctuate-sentences in region
Date: Fri, 24 Dec 2021 11:13:15 +0100
I often copy some text from elsewhere to Emacs, and I would like to
re-punctuate it.  As a user, I would expect it to work as follows:

(1) mark a position to form a region and
(2) call 'repunctuate-sentences'.

'repunctuate-sentences' documentation says:

> Put two spaces at the end of sentences from point to the end of
> buffer.  It works using query-replace-regexp.

… and 'query-replace-regexp' documentation says:

> In Transient Mark mode, if the mark is active, operate on the contents
> of the region.  Otherwise, operate from point to the end of the
> buffer's accessible portion.

Both functions work as documented, but as a user, I often need to
'repunctuate-sentences' in a region.

Could we improve 'repunctuate-sentences' to work such that in Transient
Mark mode and with mark active, it re-punctuates the contents of the
region?

Thank you.

Rudy

Thank you!


In GNU Emacs 29.0.50 (build 5, x86_64-apple-darwin21.2.0, NS appkit-2113.20 Version 12.1 (Build 21C52))
 of 2021-12-23 built on Workstation.local
Repository revision: 2fa7feca336dd16c57ffef072e0f0da6fffe4c5f
Repository branch: master
Windowing system distributor 'Apple', version 10.3.2113
System Description:  macOS 12.1

Configured using:
 'configure --with-json --with-xwidgets --with-native-compilation'

Configured features:
ACL DBUS GIF GLIB GMP GNUTLS JPEG JSON LCMS2 LIBXML2 MODULES NATIVE_COMP
NOTIFY KQUEUE NS PDUMPER PNG RSVG SQLITE3 THREADS TIFF
TOOLKIT_SCROLL_BARS WEBP XIM XWIDGETS ZLIB

Important settings:
  value of $LC_ALL: en_US.UTF-8
  locale-coding-system: utf-8-unix

Major mode: Helpful

Minor modes in effect:
  emms-mode-line-mode: t
  telega-root-auto-fill-mode: t
  telega-active-locations-mode: t
  telega-patrons-mode: t
  telega-mode-line-mode: t
  TeX-PDF-mode: t
  global-git-commit-mode: t
  magit-auto-revert-mode: t
  shell-dirtrack-mode: t
  corfu-global-mode: t
  corfu-mode: t
  vertico-mode: t
  marginalia-mode: t
  global-diff-hl-mode: t
  yas-global-mode: t
  yas-minor-mode: t
  global-hl-todo-mode: t
  global-subword-mode: t
  subword-mode: t
  save-place-mode: t
  global-auto-revert-mode: t
  delete-selection-mode: t
  savehist-mode: t
  tooltip-mode: t
  global-eldoc-mode: t
  show-paren-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  buffer-read-only: t
  size-indication-mode: t
  column-number-mode: t
  line-number-mode: t
  transient-mark-mode: t

Load-path shadows:
/Users/salutis/.emacs.d/elpa/transient-20211208.1819/transient hides /Users/salutis/src/emacs/nextstep/Emacs.app/Contents/Resources/lisp/transient
/Users/salutis/src/emacs/nextstep/Emacs.app/Contents/Resources/lisp/emacs-lisp/eieio-compat hides /Users/salutis/src/emacs/nextstep/Emacs.app/Contents/Resources/lisp/obsolete/eieio-compat

Features:
(shadow bbdb-message mail-extr view helpful trace edebug info-look
help-fns radix-tree elisp-refs tramp-cmds vterm tramp tramp-loaddefs
trampver tramp-integration files-x tramp-compat term ehelp vterm-module
term/xterm xterm eglot array jsonrpc ert debug backtrace xref pcase
mhtml-mode css-mode smie js sgml-mode facemenu org-duration org-pomodoro
alert log4e gntp org-timer hl-line emms-mode-line network-stream nsm
emms-player-mpd emms-url tq emms-player-simple emms-browser sort
emms-playlist-sort emms-last-played emms-volume emms-volume-sndioctl
emms-volume-mixerctl emms-volume-pulse emms-volume-amixer
emms-playlist-mode emms-source-playlist emms-source-file locate
emms-cache emms-info emms-later-do emms emms-compat ox-md ox-odt rng-loc
rng-uri rng-parse rng-match rng-dt rng-util rng-pttrn nxml-parse nxml-ns
nxml-enc xmltok nxml-util ox-latex ox-icalendar org-agenda ox-html table
ox-ascii ox-publish ox citar-org oc-csl citeproc citeproc-itemgetters
citeproc-biblatex citeproc-bibtex citeproc-cite citeproc-subbibs
citeproc-sort citeproc-name citeproc-formatters citeproc-number rst
citeproc-proc citeproc-disamb citeproc-itemdata
citeproc-generic-elements citeproc-macro citeproc-choose citeproc-date
citeproc-context citeproc-prange citeproc-style citeproc-locale
citeproc-term f citeproc-rt citeproc-lib citeproc-s let-alist queue
org-id org-refile citar s parsebib citar-file misearch multi-isearch
telega-obsolete telega telega-tdlib-events telega-webpage
visual-fill-column telega-root telega-info telega-chat telega-modes
telega-company telega-user telega-notifications notifications
telega-voip telega-msg telega-tme telega-sticker telega-i18n
telega-vvnote bindat telega-ffplay telega-media telega-sort
telega-filter telega-ins telega-folders telega-inline telega-tdlib
telega-util rainbow-identifiers dired-aux color telega-server
telega-core telega-customize cus-edit cus-start cus-load emacsbug
sendmail goto-addr bug-reference preview tex-buf font-latex latex
latex-flymake tex-ispell tex-style tex texmathp tex-mode flymake-proc
flymake compile image-file image-converter disp-table magit-extras
char-fold face-remap magit-bookmark magit-submodule magit-obsolete
magit-blame magit-stash magit-reflog magit-bisect magit-push magit-pull
magit-fetch magit-clone magit-remote magit-commit magit-sequence
magit-notes magit-worktree magit-tag magit-merge magit-branch
magit-reset magit-files magit-refs magit-status magit magit-repos
magit-apply magit-wip magit-log which-func imenu magit-diff smerge-mode
diff git-commit log-edit add-log magit-core magit-autorevert
magit-margin magit-transient magit-process with-editor shell server
magit-mode transient magit-git magit-section magit-utils crm dash
orderless cursor-sensor vc-mtn vc-hg vc-bzr vc-src vc-sccs vc-svn vc-cvs
vc-rcs project consult-vertico consult recentf tree-widget paredit
edmacro kmacro bbdb bbdb-site timezone modus-vivendi-theme
modus-operandi-theme modus-themes corfu vertico marginalia pdf-loader
diff-hl log-view pcvs-util vc-dir ewoc vc diminish yasnippet hl-todo
finder-inf fortune display-fill-column-indicator ob-sqlite ob-sql ob-C
cc-mode cc-fonts cc-guess cc-menus cc-cmds cc-styles cc-align cc-engine
cc-vars cc-defs ob-R org-clock cl ls-lisp cap-words superword subword
saveplace autorevert filenotify comp comp-cstr warnings delsel savehist
elfeed-link elfeed-show elfeed-search elfeed-csv elfeed elfeed-curl
elfeed-log xml-query bookmark pp elfeed-db elfeed-lib vc-git diff-mode
vc-dispatcher org-element avl-tree generator ol-eww eww xdg url-queue
thingatpt mm-url ol-rmail ol-mhe ol-irc ol-info ol-gnus nnselect
gnus-search eieio-opt speedbar ezimage dframe gnus-art mm-uu mml2015
mm-view mml-smime smime dig gnus-sum shr pixel-fill kinsoku svg dom
gnus-group gnus-undo gnus-start gnus-dbus dbus xml gnus-cloud nnimap
nnmail mail-source utf7 netrc nnoo parse-time gnus-spec gnus-int
gnus-range message yank-media rmc puny rfc822 mml mml-sec epa derived
epg rfc6068 epg-config mm-decode mm-bodies mm-encode mail-parse rfc2231
rfc2047 rfc2045 ietf-drums mailabbrev gmm-utils mailheader gnus-win gnus
nnheader gnus-util text-property-search mail-utils mm-util mail-prsvr
wid-edit ol-docview doc-view jka-compr image-mode exif dired
dired-loaddefs ol-bibtex ol-bbdb ol-w3m ol-doi org-link-doi cl-extra
help-mode org ob ob-tangle ob-ref ob-lob ob-table ob-exp org-macro
org-footnote org-src ob-comint org-pcomplete pcomplete comint ansi-color
ring org-list org-faces org-entities noutline outline easy-mmode
org-version ob-emacs-lisp ob-core ob-eval org-table oc-basic bibtex
iso8601 time-date ol rx org-keys oc org-compat advice org-macs
org-loaddefs format-spec find-func cal-menu calendar cal-loaddefs
tex-site info package browse-url url url-proxy url-privacy url-expand
url-methods url-history url-cookie url-domsuf url-util mailcap
url-handlers url-parse auth-source cl-seq eieio eieio-core cl-macs
eieio-loaddefs password-cache json map url-vars seq gv subr-x byte-opt
bytecomp byte-compile cconv cl-loaddefs cl-lib iso-transl tooltip eldoc
paren electric uniquify ediff-hook vc-hooks lisp-float-type elisp-mode
mwheel term/ns-win ns-win ucs-normalize mule-util term/common-win
tool-bar dnd fontset image regexp-opt fringe tabulated-list replace
newcomment text-mode lisp-mode prog-mode register page tab-bar menu-bar
rfn-eshadow isearch easymenu timer select scroll-bar mouse jit-lock
font-lock syntax font-core term/tty-colors frame minibuffer cl-generic
cham georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao
korean japanese eucjp-ms cp51932 hebrew greek romanian slovak czech
european ethiopic indian cyrillic chinese composite emoji-zwj charscript
charprop case-table epa-hook jka-cmpr-hook help simple abbrev obarray
cl-preloaded nadvice button loaddefs faces cus-face macroexp files
window text-properties overlay sha1 md5 base64 format env code-pages
mule custom widget keymap hashtable-print-readable backquote threads
xwidget-internal dbusbind kqueue cocoa ns lcms2 multi-tty
make-network-process native-compile emacs)

Memory information:
((conses 16 1273795 147391)
 (symbols 48 61371 3)
 (strings 32 344437 34383)
 (string-bytes 1 11812916)
 (vectors 16 121413)
 (vector-slots 8 2818587 91957)
 (floats 8 11575 522)
 (intervals 56 13645 8634)
 (buffers 992 47))

-- 
"Programming reliably --- must be an activity of an undeniably mathematical nature […] You see, mathematics is about thinking, and doing mathematics is always trying to think as well as possible." -- Edsger W. Dijkstra (1981)

Rudolf Adamkovič <salutis <at> me.com> [he/him]
Studenohorská 25
84103 Bratislava
Slovakia




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#52769; Package emacs. (Sat, 25 Dec 2021 19:08:02 GMT) Full text and rfc822 format available.

Message #8 received at 52769 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Rudolf Adamkovič <salutis <at> me.com>
Cc: 52769 <at> debbugs.gnu.org
Subject: Re: bug#52769: 29.0.50; [FEATURE REQUEST] repunctuate-sentences in region
Date: Sat, 25 Dec 2021 21:04:58 +0200
> Could we improve 'repunctuate-sentences' to work such that in Transient
> Mark mode and with mark active, it re-punctuates the contents of the
> region?

Thanks for the request.  Until now, I used a custom command
'canonically-double-space-region' attached below, activated
by advice when the command 'fill-paragraph' (M-q) is called
on the region.

But its heuristics are too unreliable at detecting the places
where two spaces are needed.  It often misidentifies
an abbreviation as the end of a sentence.  So using
'query-replace' would be a more reliable way to make the decision
for every punctuation mark.

When I tried 'repunctuate-sentences', I was stunned by its inefficiency:
it requires a confirmation even when there are already two spaces
at the end of the sentence!  Why does it do this?

PS:

#+begin_src emacs-lisp
(defun canonically-double-space-region (beg end)
  (interactive "*r")
  (canonically-space-region beg end)
  (unless (markerp end) (setq end (copy-marker end t)))
  (let* ((sentence-end-double-space nil) ; to get right regexp below
         (end-spc-re (rx (>= 5 (not (in ".?!"))) (regexp (sentence-end)))))
    (save-excursion
      (goto-char beg)
      (while (and (< (point) end)
                  (re-search-forward end-spc-re end t))
        (unless (or (>= (point) end)
                    ;; LIMIT is a buffer position, so compute it from point.
                    (looking-back "[[:space:]]\\{2\\}\\|\n" (- (point) 3)))
          (insert " "))))))

(advice-add 'fill-paragraph :before
            (lambda (&rest _args)
              (when (use-region-p)
                (canonically-double-space-region
                 (region-beginning)
                 (region-end))))
            '((name . fill-paragraph-double-space)))
#+end_src




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#52769; Package emacs. (Tue, 28 Dec 2021 19:21:02 GMT) Full text and rfc822 format available.

Message #11 received at 52769 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Rudolf Adamkovič <salutis <at> me.com>
Cc: 52769 <at> debbugs.gnu.org
Subject: Re: bug#52769: 29.0.50; [FEATURE REQUEST] repunctuate-sentences in region
Date: Tue, 28 Dec 2021 21:20:00 +0200
close 52769 29.0.50
thanks

> Could we improve 'repunctuate-sentences' to work such that in Transient
> Mark mode and with mark active, it re-punctuates the contents of the
> region?

Now this is implemented in master.




bug marked as fixed in version 29.0.50, send any further explanations to 52769 <at> debbugs.gnu.org and Rudolf Adamkovič <salutis <at> me.com> Request was from Juri Linkov <juri <at> linkov.net> to control <at> debbugs.gnu.org. (Tue, 28 Dec 2021 19:21:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#52769; Package emacs. (Tue, 28 Dec 2021 19:32:02 GMT) Full text and rfc822 format available.

Message #16 received at 52769 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Rudolf Adamkovič <salutis <at> me.com>
Cc: 52769 <at> debbugs.gnu.org
Subject: Re: bug#52769: 29.0.50; [FEATURE REQUEST] repunctuate-sentences in region
Date: Tue, 28 Dec 2021 21:28:21 +0200
> When I tried 'repunctuate-sentences', I was stunned by its inefficiency:
> it requires a confirmation even when there are already two spaces
> at the end of the sentence!  Why does it do this?

If no one has a better idea for a simpler implementation,
then this patch fixes the problem by skipping the sentences
that already have two spaces at the end:

[repunctuate-sentences-filter.patch (text/x-diff, inline)]
diff --git a/lisp/textmodes/paragraphs.el b/lisp/textmodes/paragraphs.el
index 98362b8579..0b09895339 100644
--- a/lisp/textmodes/paragraphs.el
+++ b/lisp/textmodes/paragraphs.el
@@ -494,7 +494,14 @@ repunctuate-sentences
     (if no-query
         (while (re-search-forward regexp nil t)
           (replace-match to-string))
-      (query-replace-regexp regexp to-string nil start end))))
+      (let ((regexp "\\([]\"')]?\\)\\([.?!]\\)\\([]\"')]?\\)\\( +\\)")
+            (space-filter (lambda (_start _end)
+                            (not (length= (match-string 4) 2)))))
+        (unwind-protect
+            (progn
+              (add-function :after-while isearch-filter-predicate space-filter)
+              (query-replace-regexp regexp to-string nil start end))
+          (remove-function isearch-filter-predicate space-filter))))))
 
 
 (defun backward-sentence (&optional arg)
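
To illustrate the idea behind the patch above outside of `query-replace-regexp', here is a minimal sketch that applies the same regexp to plain strings.  The names `my-sentence-end-regexp' and `my-needs-repunctuation-p' are hypothetical and not part of Emacs; the regexp is copied from the patch, and `(= (length ...) 2)' stands in for `length=' so the sketch also runs on older Emacs versions.

```elisp
;; Sketch only: these names are hypothetical, not part of Emacs.
(defconst my-sentence-end-regexp
  "\\([]\"')]?\\)\\([.?!]\\)\\([]\"')]?\\)\\( +\\)"
  "Regexp copied from the patch; group 4 captures the trailing spaces.")

(defun my-needs-repunctuation-p (string)
  "Return non-nil if the first sentence end in STRING needs fixing.
That is, the punctuation is followed by some number of spaces
other than exactly two."
  (and (string-match my-sentence-end-regexp string)
       (not (= (length (match-string 4 string)) 2))))

(my-needs-repunctuation-p "One. Two")   ; => t   (one space: ask the user)
(my-needs-repunctuation-p "One.  Two")  ; => nil (already two spaces: skip)
```

Capturing the spaces in their own group is what lets the filter skip matches that are already correct instead of prompting for each of them.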

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#52769; Package emacs. (Tue, 28 Dec 2021 20:22:02 GMT) Full text and rfc822 format available.

Message #19 received at 52769 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Rudolf Adamkovič <salutis <at> me.com>
Cc: 52769 <at> debbugs.gnu.org
Subject: Re: bug#52769: 29.0.50; [FEATURE REQUEST] repunctuate-sentences in region
Date: Tue, 28 Dec 2021 22:18:12 +0200
> If no one has a better idea for a simpler implementation,
> then this patch fixes the problem by skipping the sentences
> that already have two spaces at the end:

The filter can also be redefined with your own logic, for example to
skip known abbreviations that don't require two spaces, such as
"i.e." and "e.g.":

  (defun repunctuate-sentences-filter (_start _end)
    (not (or (length= (match-string 4) 2)
             ;; LIMIT is a buffer position, so compute it from point.
             (looking-back (rx (or "i.e." "e.g.") " ") (- (point) 5)))))
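
The abbreviation check can be tried in isolation in a temporary buffer.  This is only a sketch: `my-after-abbreviation-p' is a hypothetical helper, and it assumes exactly one space after the abbreviation.  The bound is computed relative to point because the LIMIT argument of `looking-back' is a buffer position.

```elisp
;; Sketch only: `my-after-abbreviation-p' is a hypothetical helper.
(defun my-after-abbreviation-p ()
  "Return non-nil if point is right after \"i.e. \" or \"e.g. \"."
  (looking-back (rx (or "i.e." "e.g.") " ")
                (max (point-min) (- (point) 5))))

(with-temp-buffer
  (insert "Use a pager, e.g. ")
  (my-after-abbreviation-p))            ; => t

(with-temp-buffer
  (insert "The end. ")
  (my-after-abbreviation-p))            ; => nil
```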

[repunctuate-sentences-filter-2.patch (text/x-diff, inline)]
diff --git a/lisp/textmodes/paragraphs.el b/lisp/textmodes/paragraphs.el
index acb26fd1c1..580f3617d0 100644
--- a/lisp/textmodes/paragraphs.el
+++ b/lisp/textmodes/paragraphs.el
@@ -479,6 +479,9 @@ forward-sentence
       (setq arg (1- arg)))
     (constrain-to-field nil opoint t)))
 
+(defun repunctuate-sentences-filter (_start _end)
+  (not (length= (match-string 4) 2)))
+
 (defun repunctuate-sentences (&optional no-query start end)
   "Put two spaces at the end of sentences from point to the end of buffer.
 It works using `query-replace-regexp'.  In Transient Mark mode,
@@ -489,14 +492,21 @@ repunctuate-sentences
   (interactive (list nil
                      (if (use-region-p) (region-beginning))
                      (if (use-region-p) (region-end))))
-  (let ((regexp "\\([]\"')]?\\)\\([.?!]\\)\\([]\"')]?\\) +")
-        (to-string "\\1\\2\\3  "))
-    (if no-query
-        (progn
-          (when start (goto-char start))
-          (while (re-search-forward regexp end t)
-            (replace-match to-string)))
-      (query-replace-regexp regexp to-string nil start end))))
+  (if no-query
+      (let ((regexp "\\([]\"')]?\\)\\([.?!]\\)\\([]\"')]?\\) +")
+            (to-string "\\1\\2\\3  "))
+        (when start (goto-char start))
+        (while (re-search-forward regexp end t)
+          (replace-match to-string)))
+    (let ((regexp "\\([]\"')]?\\)\\([.?!]\\)\\([]\"')]?\\)\\( +\\)")
+          (to-string "\\1\\2\\3  "))
+      (unwind-protect
+          (progn
+            (add-function :after-while isearch-filter-predicate
+                          #'repunctuate-sentences-filter)
+            (query-replace-regexp regexp to-string nil start end))
+        (remove-function isearch-filter-predicate
+                         #'repunctuate-sentences-filter)))))
 
 
 (defun backward-sentence (&optional arg)

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#52769; Package emacs. (Tue, 28 Dec 2021 21:33:02 GMT) Full text and rfc822 format available.

Message #22 received at 52769 <at> debbugs.gnu.org (full text, mbox):

From: Rudolf Adamkovič <salutis <at> me.com>
To: Juri Linkov <juri <at> linkov.net>
Cc: 52769 <at> debbugs.gnu.org
Subject: Re: bug#52769: 29.0.50; [FEATURE REQUEST] repunctuate-sentences in region
Date: Tue, 28 Dec 2021 22:31:54 +0100
Juri Linkov <juri <at> linkov.net> writes:

> Now this is implemented in master.

I have just recompiled Emacs, and everything works as expected.  This
patch will make my life easier.  Thank you!

Rudy




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Wed, 26 Jan 2022 12:24:07 GMT) Full text and rfc822 format available.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.