GNU bug report logs - #50726
26.3; Let `count-words(-region)' count only words entirely within the region


Package: emacs;

Reported by: Drew Adams <drew.adams <at> oracle.com>

Date: Tue, 21 Sep 2021 22:52:01 UTC

Severity: wishlist

Tags: wontfix

Found in version 26.3

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 50726 in the body.
You can then email your comments to 50726 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#50726; Package emacs. (Tue, 21 Sep 2021 22:52:02 GMT)

Acknowledgement sent to Drew Adams <drew.adams <at> oracle.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Tue, 21 Sep 2021 22:52:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Drew Adams <drew.adams <at> oracle.com>
To: "bug-gnu-emacs <at> gnu.org" <bug-gnu-emacs <at> gnu.org>
Subject: 26.3; Let `count-words(-region)' count only words entirely within the
 region
Date: Tue, 21 Sep 2021 22:50:56 +0000

Enhancement request.

A word that straddles the beginning or end of the region is counted as a
word in the region.  It would be good to be able to have such functions
not count such partial words.
___

Here's an example of a command that counts the words in a rectangular
region.  By default it excludes words that straddle the row boundaries,
but a prefix arg counts such partial words also.

https://emacs.stackexchange.com/a/68611/105
___

Admittedly, this difference is not so important for a non-rectangular
region, as it has only two boundaries, and a user can see interactively
whether the text at the beginning or end forms a real word.  But when
called from Lisp, if you want to exclude such partial words you need to
write some code to adjust the count.

In GNU Emacs 26.3 (build 1, x86_64-w64-mingw32)
 of 2019-08-29
Repository revision: 96dd0196c28bc36779584e47fffcca433c9309cd
Windowing system distributor `Microsoft Corp.', version 10.0.19042
Configured using:
 `configure --without-dbus --host=x86_64-w64-mingw32
 --without-compress-install 'CFLAGS=-O2 -static -g3''





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#50726; Package emacs. (Wed, 29 Sep 2021 11:34:01 GMT)

Message #8 received at 50726 <at> debbugs.gnu.org:

From: Stefan Kangas <stefan <at> marxist.se>
To: Drew Adams <drew.adams <at> oracle.com>
Cc: 50726 <at> debbugs.gnu.org
Subject: Re: bug#50726: 26.3; Let `count-words(-region)' count only words
 entirely within the region
Date: Wed, 29 Sep 2021 04:33:10 -0700

Drew Adams <drew.adams <at> oracle.com> writes:

> Enhancement request.
>
> A word that straddles the beginning or end of the region is counted as a
> word in the region.  It would be good to be able to have such functions
> not count such partial words.
> ___
>
> Here's an example of a command that counts the words in a rectangular
> region.  By default it excludes words that straddle the row boundaries,
> but a prefix arg counts such partial words also.
>
> https://emacs.stackexchange.com/a/68611/105

Copying in the code below.  I have no comment, besides to say that a
more strict `count-words' could perhaps be named `count-words-strict'.

(defun count-words-rectangle (start end &optional allow-partial-p msgp)
  "Count words in the rectangle from START to END.
This is similar to `count-words', but for a rectangular region.

Also:

* By default, a word that straddles the beginning or end of a
  rectangle row is not counted.  That is, this counts only words that
  are entirely within the rectangle.
* A prefix arg means count also such partial words at row boundaries.

If called interactively, START and END are the bounds of the start and
end of the active region.  Print a message reporting the number of
rows (lines), columns (characters per row), words, and characters.

If called from Lisp, return the number of words in the rectangle
between START and END, without printing any message."
  (interactive "r\nP\np")
  (let ((bounds  (extract-rectangle-bounds start end))
        (words   0)
        (chars   0))
    (dolist (beg+end  bounds)
      (setq words  (+ words (count-words (car beg+end) (cdr beg+end)))))
    (let (beg end)
      (dolist (beg+end  bounds)
        (setq beg  (car beg+end)
              end  (cdr beg+end))
        (unless allow-partial-p
          (when (and (char-after (1- beg))  (equal '(2) (syntax-after (1- beg)))
                     (char-after beg)       (equal '(2) (syntax-after beg)))
            (setq words  (1- words)))
          (when (and (char-after (1- end))  (equal '(2) (syntax-after (1- end)))
                     (char-after end)       (equal '(2) (syntax-after     end)))
            (setq words  (1- words))))))
    (when msgp
      (dolist (beg+end  bounds)
        (setq chars  (+ chars (- (cdr beg+end) (car beg+end)))))
      (let ((rows  (count-lines start end))
            (cols  (let ((rpc  (save-excursion
                                 ;; Use START and END (not the live region) so
                                 ;; the column count matches the row count when
                                 ;; called from Lisp.
                                 (rectangle--pos-cols start end))))
                     (abs (- (car rpc) (cdr rpc))))))
        (message "Rectangle has %d row%s, %d column%s, %d word%s, and %d char%s."
                 rows  (if (= rows 1)  "" "s")
                 cols  (if (= cols 1)  "" "s")
                 words (if (= words 1) "" "s")
                 chars (if (= chars 1) "" "s"))))
    words))
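
For the ordinary (non-rectangular) region, the same boundary test can be
sketched as below.  This is only an illustration of the adjustment the
report describes, not code from Emacs or from the linked answer.  The name
`my-count-words-strict' is hypothetical, and the sketch assumes that "word"
means a run of characters with word syntax in the current syntax table.

```elisp
(defun my-count-words-strict (start end)
  "Return the number of words lying entirely between START and END.
Hypothetical helper, not part of Emacs.  It takes the result of
`count-words' and subtracts one for each boundary that falls
between two word-constituent characters."
  (let ((words (count-words start end)))
    (dolist (pos (list start end))
      ;; A boundary splits a word when the characters on both sides
      ;; of it have word syntax in the current syntax table.
      (when (and (> pos (point-min))
                 (< pos (point-max))
                 (eq (char-syntax (char-before pos)) ?w)
                 (eq (char-syntax (char-after pos)) ?w))
        (setq words (1- words))))
    ;; If both boundaries fall inside the same word, the subtraction
    ;; would go negative, so clamp at zero.
    (max words 0)))
```

Unlike the rectangle command above, this checks `char-syntax' rather than
`syntax-after', so it ignores `syntax-table' text properties.  That is a
simplification made for the sketch.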




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#50726; Package emacs. (Fri, 26 Aug 2022 12:35:02 GMT)

Message #11 received at 50726 <at> debbugs.gnu.org:

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Stefan Kangas <stefan <at> marxist.se>
Cc: 50726 <at> debbugs.gnu.org, Drew Adams <drew.adams <at> oracle.com>
Subject: Re: bug#50726: 26.3; Let `count-words(-region)' count only words
 entirely within the region
Date: Fri, 26 Aug 2022 14:34:27 +0200

Stefan Kangas <stefan <at> marxist.se> writes:

> I have no comment, besides to say that a more strict `count-words'
> could perhaps be named `count-words-strict'.

I think adding such a function would be too special-purpose and wouldn't
have enough usage to warrant it.

So I'm closing this bug report.




Added tag(s) wontfix. Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Fri, 26 Aug 2022 12:35:02 GMT)

Bug closed; send any further explanations to 50726 <at> debbugs.gnu.org and Drew Adams <drew.adams <at> oracle.com>. Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Fri, 26 Aug 2022 12:35:03 GMT)

Bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sat, 24 Sep 2022 11:24:06 GMT)
