GNU bug report logs - #63861
[PATCH] pp.el: New "pretty printing" code

Previous Next

Package: emacs;

Reported by: Stefan Monnier <monnier <at> iro.umontreal.ca>

Date: Fri, 2 Jun 2023 22:52:02 UTC

Severity: normal

Tags: patch

Done: Stefan Monnier <monnier <at> iro.umontreal.ca>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 63861 in the body.
You can then email your comments to 63861 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#63861; Package emacs. (Fri, 02 Jun 2023 22:52:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Stefan Monnier <monnier <at> iro.umontreal.ca>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Fri, 02 Jun 2023 22:52:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: bug-gnu-emacs <at> gnu.org
Subject: [PATCH] pp.el: New "pretty printing" code
Date: Fri, 02 Jun 2023 18:50:57 -0400
[Message part 1 (text/plain, inline)]
Tags: patch

I've often been annoyed by the way `ielm` "pretty prints" data,
so I wrote my own pretty printer, which has evolved over the years.
I believe it has finally reached a state which may be acceptable
to more than just myself.

The new code is in a new function `pp-region`.
The old code redirects to the new code if `pp-buffer-use-pp-region` is
non-nil, tho I'm not sure we want to bother users with such
a config var.  Hopefully, the new code should be good enough that users
don't need to choose.  Maybe I should make it a `defvar` and have it
default to t, so new users will complain if it's not good enough?


        Stefan


 In GNU Emacs 30.0.50 (build 2, x86_64-pc-linux-gnu, X toolkit, cairo
 version 1.16.0, Xaw3d scroll bars) of 2023-05-24 built on pastel
Repository revision: 41b6b907d4cf2f8c302647dc63c5d9c94f9f01d6
Repository branch: work
Windowing system distributor 'The X.Org Foundation', version 11.0.12011000
System Description: Debian GNU/Linux 11 (bullseye)

Configured using:
 'configure -C --enable-checking --enable-check-lisp-object-type --with-modules --with-cairo --with-tiff=ifavailable
 'CFLAGS=-Wall -g3 -Og -Wno-pointer-sign'
 PKG_CONFIG_PATH=/home/monnier/lib/pkgconfig'

[0001-pp.el-New-pretty-printing-code.patch (text/patch, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63861; Package emacs. (Sat, 03 Jun 2023 05:53:02 GMT) Full text and rfc822 format available.

Message #8 received at 63861 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 63861 <at> debbugs.gnu.org
Subject: Re: bug#63861: [PATCH] pp.el: New "pretty printing" code
Date: Sat, 03 Jun 2023 08:53:41 +0300
> Date: Fri, 02 Jun 2023 18:50:57 -0400
> From:  Stefan Monnier via "Bug reports for GNU Emacs,
>  the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org>
> 
> I've often been annoyed by the way `ielm` "pretty prints" data,
> so I wrote my own pretty printer, which has evolved over the years.
> I believe it has finally reached a state which may be acceptable
> to more than just myself.
> 
> The new code is in a new function `pp-region`.
> The old code redirects to the new code if `pp-buffer-use-pp-region` is
> non-nil, tho I'm not sure we want to bother users with such
> a config var.  Hopefully, the new code should be good enough that users
> don't need to choose.  Maybe I should make it a `defvar` and have it
> default to t, so new users will complain if it's not good enough?

Thanks.  I almost never use IELM, so I have no significant comments to
this, only minor ones.

> +(defun pp-region (beg end)
> +  "Insert newlines in BEG..END to try and fit within `fill-column'.
> +Presumes the current buffer contains Lisp code and has indentation properly
> +configured for that.
> +Designed under the assumption that the region occupies a single line,
> +tho it should also work if that's not the case."

The first line should say what this command does.

Also, I think this warrants a NEWS entry and should be documented in
the ELisp manual.

> +(defcustom pp-buffer-use-pp-region nil
> +  "If non-nil, `pp-buffer' uses the new `pp-region' code."
> +  :type 'boolean)

Please add :version.  Also, "the new code" should be rephrased to not
use "new" (which doesn't stand the time test).

And the new defcustom should probably be mentioned in the manual,
perhaps where we mention IELM.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63861; Package emacs. (Sat, 03 Jun 2023 18:19:02 GMT) Full text and rfc822 format available.

Message #11 received at 63861 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 63861 <at> debbugs.gnu.org
Subject: Re: bug#63861: [PATCH] pp.el: New "pretty printing" code
Date: Sat, 03 Jun 2023 14:18:36 -0400
>> I've often been annoyed by the way `ielm` "pretty prints" data,
>> so I wrote my own pretty printer, which has evolved over the years.
>> I believe it has finally reached a state which may be acceptable
>> to more than just myself.
>> 
>> The new code is in a new function `pp-region`.
>> The old code redirects to the new code if `pp-buffer-use-pp-region` is
>> non-nil, tho I'm not sure we want to bother users with such
>> a config var.  Hopefully, the new code should be good enough that users
>> don't need to choose.  Maybe I should make it a `defvar` and have it
>> default to t, so new users will complain if it's not good enough?
>
> Thanks.  I almost never use IELM, so I have no significant comments to
> this, only minor ones.

FWIW, the change affects other functionality that uses `pp`, such as
`C-h v`.  While working on (previous versions of) this code, I've had
performance problems show up during the generation of `emoji-labels.el`.

>> +(defun pp-region (beg end)
>> +  "Insert newlines in BEG..END to try and fit within `fill-column'.
>> +Presumes the current buffer contains Lisp code and has indentation properly
>> +configured for that.
>> +Designed under the assumption that the region occupies a single line,
>> +tho it should also work if that's not the case."
>
> The first line should say what this command does.

How 'bout:

    Insert line-breaks in Lisp code so it fits within `fill-column`.

?

> Also, I think this warrants a NEWS entry and should be documented in
> the ELisp manual.

Definitely for NEWS, yes.  For the ELisp manual, currently we don't
document `pp-buffer`, the closest I see is `indent-pp-sexp` (in
`programs.texi`).
I'm not sure what to put in there. nor where to put it.

>> +(defcustom pp-buffer-use-pp-region nil
>> +  "If non-nil, `pp-buffer' uses the new `pp-region' code."
>> +  :type 'boolean)
> Please add :version.

Hmm... so you think it should stay as a `defcustom` and we should thus
plan to keep both kinds of pretty-printing in the long term?
I mostly intended it to be a temporary knob for people to be able to try
the new code and easily compare with the old (or revert to the old when
bumping into a problem with the new).

If so, we should probably think of better names to distinguish the two
pp styles than `pp-buffer` vs `pp-region`.  Maybe `pp-fill` for the new
code since arguably the main difference is that the new code pays
attention to `fill-column`?  I don't have a good idea for a name for the
old code, OTOH (and I think it would make sense to keep `pp-buffer` as
a dispatch between the two options, so it would be good to have
a separate name for the old style).

Another difference might be that the new style is maybe aimed more at
pp'ing code than data, whereas the old style might be a bit more
"agnostic" to the definition.  Yet another difference is that the old
code tends to use more lines (because it doesn't try to fill the line
upto `fill-column`) and occasionally outputs very long lines because it
only breaks lines near parentheses.

Maybe that info can inspire someone to come up with a good name for this
"old style"?

> Also, "the new code" should be rephrased to not use "new" (which
> doesn't stand the time test).

:-)

> And the new defcustom should probably be mentioned in the manual,
> perhaps where we mention IELM.

If it stays as a `defcustom`, agreed.


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63861; Package emacs. (Sat, 03 Jun 2023 18:58:02 GMT) Full text and rfc822 format available.

Message #14 received at 63861 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 63861 <at> debbugs.gnu.org
Subject: Re: bug#63861: [PATCH] pp.el: New "pretty printing" code
Date: Sat, 03 Jun 2023 21:58:03 +0300
> From: Stefan Monnier <monnier <at> iro.umontreal.ca>
> Cc: 63861 <at> debbugs.gnu.org
> Date: Sat, 03 Jun 2023 14:18:36 -0400
> 
> >> +(defun pp-region (beg end)
> >> +  "Insert newlines in BEG..END to try and fit within `fill-column'.
> >> +Presumes the current buffer contains Lisp code and has indentation properly
> >> +configured for that.
> >> +Designed under the assumption that the region occupies a single line,
> >> +tho it should also work if that's not the case."
> >
> > The first line should say what this command does.
> 
> How 'bout:
> 
>     Insert line-breaks in Lisp code so it fits within `fill-column`.
> 
> ?

Yes, but let's also mention BEG and END:

  Break lines in Lisp code between BEG and END so it fits within `fill-column'.

> > Also, I think this warrants a NEWS entry and should be documented in
> > the ELisp manual.
> 
> Definitely for NEWS, yes.  For the ELisp manual, currently we don't
> document `pp-buffer`, the closest I see is `indent-pp-sexp` (in
> `programs.texi`).
> I'm not sure what to put in there. nor where to put it.

We document "pp" in "Output Functions".  Maybe there?

> >> +(defcustom pp-buffer-use-pp-region nil
> >> +  "If non-nil, `pp-buffer' uses the new `pp-region' code."
> >> +  :type 'boolean)
> > Please add :version.
> 
> Hmm... so you think it should stay as a `defcustom` and we should thus
> plan to keep both kinds of pretty-printing in the long term?

No, I just said that _if_ we keep it as a defcustom, _then_ it should
have a :version tag.  I have no idea how many users will want to
customize this.

> I mostly intended it to be a temporary knob for people to be able to try
> the new code and easily compare with the old (or revert to the old when
> bumping into a problem with the new).
> 
> If so, we should probably think of better names to distinguish the two
> pp styles than `pp-buffer` vs `pp-region`.  Maybe `pp-fill` for the new
> code since arguably the main difference is that the new code pays
> attention to `fill-column`?  I don't have a good idea for a name for the
> old code, OTOH (and I think it would make sense to keep `pp-buffer` as
> a dispatch between the two options, so it would be good to have
> a separate name for the old style).
> 
> Another difference might be that the new style is maybe aimed more at
> pp'ing code than data, whereas the old style might be a bit more
> "agnostic" to the definition.  Yet another difference is that the old
> code tends to use more lines (because it doesn't try to fill the line
> upto `fill-column`) and occasionally outputs very long lines because it
> only breaks lines near parentheses.
> 
> Maybe that info can inspire someone to come up with a good name for this
> "old style"?

Maybe we should leave it as a variable for now, and see if there's
enough demand for both flavors.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63861; Package emacs. (Sun, 04 Jun 2023 03:26:01 GMT) Full text and rfc822 format available.

Message #17 received at 63861 <at> debbugs.gnu.org (full text, mbox):

From: Visuwesh <visuweshm <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 63861 <at> debbugs.gnu.org
Subject: Re: bug#63861: [PATCH] pp.el: New "pretty printing" code
Date: Sun, 04 Jun 2023 08:55:29 +0530
[சனி ஜூன் 03, 2023] Stefan Monnier via "Bug reports for GNU Emacs, the Swiss army knife of text editors" wrote:

> Hmm... so you think it should stay as a `defcustom` and we should thus
> plan to keep both kinds of pretty-printing in the long term?
> I mostly intended it to be a temporary knob for people to be able to try
> the new code and easily compare with the old (or revert to the old when
> bumping into a problem with the new).
>
> If so, we should probably think of better names to distinguish the two
> pp styles than `pp-buffer` vs `pp-region`.  Maybe `pp-fill` for the new
> code since arguably the main difference is that the new code pays
> attention to `fill-column`?  I don't have a good idea for a name for the
> old code, OTOH (and I think it would make sense to keep `pp-buffer` as
> a dispatch between the two options, so it would be good to have
> a separate name for the old style).
>
> Another difference might be that the new style is maybe aimed more at
> pp'ing code than data, whereas the old style might be a bit more
> "agnostic" to the definition.  Yet another difference is that the old
> code tends to use more lines (because it doesn't try to fill the line
> upto `fill-column`) and occasionally outputs very long lines because it
> only breaks lines near parentheses.

BTW, how does this compare to the newly added `pp-emacs-lisp-code'?
It was still rough around the edges the last time I set
`pp-use-max-width' non-nil.  It is also quite a lot slower than the old
path.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63861; Package emacs. (Mon, 05 Jun 2023 16:41:01 GMT) Full text and rfc822 format available.

Message #20 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Stefan Monnier via "Bug reports for GNU Emacs, the Swiss army knife of
 text editors" <bug-gnu-emacs <at> gnu.org>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 63861 <at> debbugs.gnu.org,
 Stefan Monnier <monnier <at> iro.umontreal.ca>
Subject: Re: bug#63861: [PATCH] pp.el: New "pretty printing" code
Date: Mon, 05 Jun 2023 19:12:49 +0300
> FWIW, the change affects other functionality that uses `pp`, such as
> `C-h v`.  While working on (previous versions of) this code, I've had
> performance problems show up during the generation of `emoji-labels.el`.

When tried on emoji-labels.el, at the end it failed with

  (scan-error "Containing expression ends prematurely" 255866 255867)

>> Also, I think this warrants a NEWS entry and should be documented in
>> the ELisp manual.
>
> Definitely for NEWS, yes.  For the ELisp manual, currently we don't
> document `pp-buffer`, the closest I see is `indent-pp-sexp` (in
> `programs.texi`).

For indent-pp-sexp with a prefix arg used on defuns, the new version
is much better - it makes code more readable.  It has only one snag:
it inserts an empty line between the defun line and the docstring.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63861; Package emacs. (Mon, 05 Jun 2023 16:41:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63861; Package emacs. (Wed, 07 Jun 2023 14:11:01 GMT) Full text and rfc822 format available.

Message #26 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Thierry Volpiatto <thievol <at> posteo.net>
To: Stefan Monnier via "Bug reports for GNU Emacs, the Swiss army knife of
 text editors" <bug-gnu-emacs <at> gnu.org>
Cc: 63861 <at> debbugs.gnu.org, Stefan Monnier <monnier <at> iro.umontreal.ca>
Subject: Re: bug#63861: [PATCH] pp.el: New "pretty printing" code
Date: Wed, 07 Jun 2023 14:10:22 +0000
Stefan Monnier via "Bug reports for GNU Emacs, the Swiss army knife of
text editors" <bug-gnu-emacs <at> gnu.org> writes:

> Tags: patch
>
> I've often been annoyed by the way `ielm` "pretty prints" data,
> so I wrote my own pretty printer, which has evolved over the years.
> I believe it has finally reached a state which may be acceptable
> to more than just myself.
>
> The new code is in a new function `pp-region`.
> The old code redirects to the new code if `pp-buffer-use-pp-region` is
> non-nil, tho I'm not sure we want to bother users with such
> a config var.  Hopefully, the new code should be good enough that users
> don't need to choose.  Maybe I should make it a `defvar` and have it
> default to t, so new users will complain if it's not good enough?

I tried your code and it looks very slow (but looks nice once printed).
Testing on my bookmark-alist printed in some buffer.
Here with a slightly modified version of pp-buffer (not much faster than
the original one):

(benchmark-run-compiled 1 (pp-buffer))
=> (6.942135047 0 0.0)
And here with your version (using pp-region):
(benchmark-run-compiled 1 (pp-buffer))
=> (46.141411097 0 0.0)

For describe variable I use a modified version of `pp` which is very
fast (nearly instant to pretty print value above) but maybe unsafe with
some vars, didn't have any problems though, see
https://github.com/thierryvolpiatto/emacs-config/blob/main/describe-variable.el.

>
>         Stefan
>
>
>  In GNU Emacs 30.0.50 (build 2, x86_64-pc-linux-gnu, X toolkit, cairo
>  version 1.16.0, Xaw3d scroll bars) of 2023-05-24 built on pastel
> Repository revision: 41b6b907d4cf2f8c302647dc63c5d9c94f9f01d6
> Repository branch: work
> Windowing system distributor 'The X.Org Foundation', version 11.0.12011000
> System Description: Debian GNU/Linux 11 (bullseye)
>
> Configured using:
>  'configure -C --enable-checking --enable-check-lisp-object-type --with-modules --with-cairo --with-tiff=ifavailable
>  'CFLAGS=-Wall -g3 -Og -Wno-pointer-sign'
>  PKG_CONFIG_PATH=/home/monnier/lib/pkgconfig'
>
>

-- 
Thierry




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63861; Package emacs. (Wed, 07 Jun 2023 14:11:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63861; Package emacs. (Wed, 07 Jun 2023 15:22:01 GMT) Full text and rfc822 format available.

Message #32 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Juri Linkov <juri <at> linkov.net>
Cc: "Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of
 text editors" <bug-gnu-emacs <at> gnu.org>, 63861 <at> debbugs.gnu.org,
 Eli Zaretskii <eliz <at> gnu.org>
Subject: Re: bug#63861: [PATCH] pp.el: New "pretty printing" code
Date: Wed, 07 Jun 2023 11:21:23 -0400
>> FWIW, the change affects other functionality that uses `pp`, such as
>> `C-h v`.  While working on (previous versions of) this code, I've had
>> performance problems show up during the generation of `emoji-labels.el`.
>
> When tried on emoji-labels.el, at the end it failed with
>
>   (scan-error "Containing expression ends prematurely" 255866 255867)

Hmm... when I do

    rm lisp/international/emoji-labels.el
    make

the file is rebuilt correctly.
Could you give more details about how you got the above?

>>> Also, I think this warrants a NEWS entry and should be documented in
>>> the ELisp manual.
>> Definitely for NEWS, yes.  For the ELisp manual, currently we don't
>> document `pp-buffer`, the closest I see is `indent-pp-sexp` (in
>> `programs.texi`).
> For indent-pp-sexp with a prefix arg used on defuns, the new version
> is much better - it makes code more readable.  It has only one snag:
> it inserts an empty line between the defun line and the docstring.

Oh, indeed, I'm not careful to check before I insert the `\n` in
this case (I always use it on code that starts as a single line).
I'll fix it in the next round, thanks.


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63861; Package emacs. (Wed, 07 Jun 2023 15:22:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63861; Package emacs. (Wed, 07 Jun 2023 15:28:02 GMT) Full text and rfc822 format available.

Message #38 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Thierry Volpiatto <thievol <at> posteo.net>
Cc: "Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of
 text editors" <bug-gnu-emacs <at> gnu.org>, 63861 <at> debbugs.gnu.org
Subject: Re: bug#63861: [PATCH] pp.el: New "pretty printing" code
Date: Wed, 07 Jun 2023 11:27:35 -0400
> I tried your code and it looks very slow (but looks nice once printed).
> Testing on my bookmark-alist printed in some buffer.
> Here with a slightly modified version of pp-buffer (not much faster than
> the original one):
>
> (benchmark-run-compiled 1 (pp-buffer))
> => (6.942135047 0 0.0)
> And here with your version (using pp-region):
> (benchmark-run-compiled 1 (pp-buffer))
> => (46.141411097 0 0.0)
>
> For describe variable I use a modified version of `pp` which is very
> fast (nearly instant to pretty print value above) but maybe unsafe with
> some vars, didn't have any problems though, see
> https://github.com/thierryvolpiatto/emacs-config/blob/main/describe-variable.el.

Beside the actual way we choose when to insert \n, the main difference
w.r.t performance tends to come from the fact that the new code relies
on `lisp-indent-line` rather than `lisp-indent-region`.

In many cases it doesn't make much difference performancewise, but
indeed there are cases where the difference is significant (more
specifically where it makes the code O(N²) rather than O(N)).
I've been using the patch below for a while and I should probably
include it the `pp-region` patch.

Can you check whether it helps for your case?


        Stefan


diff --git a/lisp/emacs-lisp/lisp-mode.el b/lisp/emacs-lisp/lisp-mode.el
index d44c9d6e23d..9914ededb85 100644
--- a/lisp/emacs-lisp/lisp-mode.el
+++ b/lisp/emacs-lisp/lisp-mode.el
@@ -876,7 +876,7 @@ lisp-ppss
 2 (counting from 0).  This is important for Lisp indentation."
   (unless pos (setq pos (point)))
   (let ((pss (syntax-ppss pos)))
-    (if (nth 9 pss)
+    (if (and (not (nth 2 pss)) (nth 9 pss))
         (let ((sexp-start (car (last (nth 9 pss)))))
           (parse-partial-sexp sexp-start pos nil nil (syntax-ppss sexp-start)))
       pss)))





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63861; Package emacs. (Wed, 07 Jun 2023 15:28:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63861; Package emacs. (Wed, 07 Jun 2023 15:49:01 GMT) Full text and rfc822 format available.

Message #44 received at 63861 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Visuwesh <visuweshm <at> gmail.com>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 63861 <at> debbugs.gnu.org
Subject: Re: bug#63861: [PATCH] pp.el: New "pretty printing" code
Date: Wed, 07 Jun 2023 11:48:32 -0400
> BTW, how does this compare to the newly added `pp-emacs-lisp-code`?

Very good question.  I had completely missed that (and its `pp-use-max-width`).
This points to a host of integration issues between my code and the
existing code.  I'll have to take a deeper look.

> It was still rough around the edges the last time I set
> `pp-use-max-width' non-nil.  It is also quite a lot slower than the
> old path.

My new code is expected to be slower than the "normal" pretty-printer,
but barring performance bugs in `lisp-indent-line` (such as the one
fixed by the patch I just sent to Thierry) it should be approximately
a constant factor slower.

AFAICT the performance of `pp-emacs-lisp-code` can be more problematic.

Beside performance, I guess the behavior of the two should be
somewhat similar, tho I also see that `pp-emacs-lisp-code` pays
attention to the Edebug and `doc-string-elt` info, so it may give
slightly more refined info.

Another difference is that `pp-emacs-lisp-code` starts with an S-exp
object, whereas my code starts with a region (i.e. an sexp that's
already been printed).


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63861; Package emacs. (Wed, 07 Jun 2023 16:25:02 GMT) Full text and rfc822 format available.

Message #47 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Thierry Volpiatto <thievol <at> posteo.net>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: "Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of
 text editors" <bug-gnu-emacs <at> gnu.org>, 63861 <at> debbugs.gnu.org
Subject: Re: bug#63861: [PATCH] pp.el: New "pretty printing" code
Date: Wed, 07 Jun 2023 16:19:38 +0000
[Message part 1 (text/plain, inline)]
Stefan Monnier <monnier <at> iro.umontreal.ca> writes:

>> I tried your code and it looks very slow (but looks nice once printed).
>> Testing on my bookmark-alist printed in some buffer.
>> Here with a slightly modified version of pp-buffer (not much faster than
>> the original one):
>>
>> (benchmark-run-compiled 1 (pp-buffer))
>> => (6.942135047 0 0.0)
>> And here with your version (using pp-region):
>> (benchmark-run-compiled 1 (pp-buffer))
>> => (46.141411097 0 0.0)
>>
>> For describe variable I use a modified version of `pp` which is very
>> fast (nearly instant to pretty print value above) but maybe unsafe with
>> some vars, didn't have any problems though, see
>> https://github.com/thierryvolpiatto/emacs-config/blob/main/describe-variable.el.
>
> Beside the actual way we choose when to insert \n, the main difference
> w.r.t performance tends to come from the fact that the new code relies
> on `lisp-indent-line` rather than `lisp-indent-region`.
>
> In many cases it doesn't make much difference performancewise, but
> indeed there are cases where the difference is significant (more
> specifically where it makes the code O(N²) rather than O(N)).
> I've been using the patch below for a while and I should probably
> include it the `pp-region` patch.
>
> Can you check whether it helps for your case?

No, more or less the same:

(benchmark-run-compiled 1 (pp-buffer))
=> (48.501764747 0 0.0)

I have modified my code so that it can be used outside help, you can
test it with tv/pp-region. Here with always the same test buffer:

(benchmark-run-compiled 1 (tv/pp-region (point-min) (point-max)))
=> (0.259444169 0 0.0)


-- 
Thierry
[signature.asc (application/pgp-signature, inline)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63861; Package emacs. (Wed, 07 Jun 2023 16:25:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63861; Package emacs. (Wed, 07 Jun 2023 18:15:02 GMT) Full text and rfc822 format available.

Message #53 received at 63861 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 63861 <at> debbugs.gnu.org
Subject: Re: bug#63861: [PATCH] pp.el: New "pretty printing" code
Date: Wed, 07 Jun 2023 21:12:00 +0300
>> When tried on emoji-labels.el, at the end it failed with
>>
>>   (scan-error "Containing expression ends prematurely" 255866 255867)
>
> Hmm... when I do
>
>     rm lisp/international/emoji-labels.el
>     make
>
> the file is rebuilt correctly.
> Could you give more details about how you got the above?

With (setq debug-on-error t) when point is at the beginning of this line:

  (defconst emoji--labels '(("Smileys"

typing 'C-u C-M-q' gives the error:

  Debugger entered--Lisp error: (scan-error "Containing expression ends prematurely" 9739 9740)
    pp-region(522 9447)
    pp-region(521 9448)
    pp-buffer()
    indent-pp-sexp((4))
    funcall-interactively(indent-pp-sexp (4))
    command-execute(indent-pp-sexp)

I can send the file, but all its parens are balanced.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63861; Package emacs. (Wed, 07 Jun 2023 19:44:01 GMT) Full text and rfc822 format available.

Message #56 received at 63861 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Juri Linkov <juri <at> linkov.net>
Cc: 63861 <at> debbugs.gnu.org
Subject: Re: bug#63861: [PATCH] pp.el: New "pretty printing" code
Date: Wed, 07 Jun 2023 15:43:15 -0400
> With (setq debug-on-error t) when point is at the beginning of this line:
>
>   (defconst emoji--labels '(("Smileys"
>
> typing 'C-u C-M-q' gives the error:
>
>   Debugger entered--Lisp error: (scan-error "Containing expression ends prematurely" 9739 9740)

Thanks, I can reproduce it.


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63861; Package emacs. (Wed, 07 Jun 2023 21:19:02 GMT) Full text and rfc822 format available.

Message #59 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Thierry Volpiatto <thievol <at> posteo.net>
Cc: "Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of
 text editors" <bug-gnu-emacs <at> gnu.org>, 63861 <at> debbugs.gnu.org
Subject: Re: bug#63861: [PATCH] pp.el: New "pretty printing" code
Date: Wed, 07 Jun 2023 17:18:00 -0400
> No, more or less the same:

:-(

> (benchmark-run-compiled 1 (pp-buffer))
> => (48.501764747 0 0.0)
>
> I have modified my code so that it can be used outside help, you can
> test it with tv/pp-region. Here with always the same test buffer:
>
> (benchmark-run-compiled 1 (tv/pp-region (point-min) (point-max)))
> => (0.259444169 0 0.0)

Hmm... I think this depends on the data you're printing.
Any hope you can share the data you're using for your tests?
[ I understand it's personal, so it can be problematic.  Maybe you can
  anonymize with a search&replace of, say `[[:alpha:]]` to `x`?  ]


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63861; Package emacs. (Wed, 07 Jun 2023 21:19:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63861; Package emacs. (Thu, 08 Jun 2023 16:17:02 GMT) Full text and rfc822 format available.

Message #65 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Thierry Volpiatto <thievol <at> posteo.net>
Cc: "Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of
 text editors" <bug-gnu-emacs <at> gnu.org>, 63861 <at> debbugs.gnu.org
Subject: Re: bug#63861: [PATCH] pp.el: New "pretty printing" code
Date: Thu, 08 Jun 2023 12:15:47 -0400
>> The new code is in a new function `pp-region`.
>> The old code redirects to the new code if `pp-buffer-use-pp-region` is
>> non-nil, tho I'm not sure we want to bother users with such
>> a config var.  Hopefully, the new code should be good enough that users
>> don't need to choose.  Maybe I should make it a `defvar` and have it
>> default to t, so new users will complain if it's not good enough?
>
> I tried your code and it looks very slow (but looks nice once printed).
> Testing on my bookmark-alist printed in some buffer.
> Here with a slightly modified version of pp-buffer (not much faster than
> the original one):
>
> (benchmark-run-compiled 1 (pp-buffer))
> => (6.942135047 0 0.0)
> And here with your version (using pp-region):
> (benchmark-run-compiled 1 (pp-buffer))
> => (46.141411097 0 0.0)

Hmm... that's weird.  With the bookmark-alist.el file you sent me,
I get pretty much the reverse result:

    % for f in foo.el test-load-history.el test-bookmark-alist.el; do \
          for v in nil t; do \
              time src/emacs -Q --batch "$f" -l pp \
                   --eval "(progn (setq pp-buffer-use-pp-region $v) \
                             (message \"%s %s %S\" (file-name-nondirectory buffer-file-name) \
                                                   pp-buffer-use-pp-region \
                                                   (benchmark-run (pp-buffer))))"; \
      done; done
    foo.el nil (0.210123295 1 0.057426385)
    src/emacs -Q --batch "$f" -l pp --eval   0.34s user 0.04s system 99% cpu 0.382 total
    foo.el t (0.07107641199999999 0 0.0)
    src/emacs -Q --batch "$f" -l pp --eval   0.19s user 0.03s system 99% cpu 0.222 total
    test-load-history.el nil (156.07942386099998 17 1.432754161)
    src/emacs -Q --batch "$f" -l pp --eval   156.04s user 0.20s system 99% cpu 2:36.26 total
    test-load-history.el t (96.480110987 24 1.9799413479999999)
    src/emacs -Q --batch "$f" -l pp --eval   96.40s user 0.27s system 99% cpu 1:36.69 total
    test-bookmark-alist.el nil (51.211047973 8 0.6690439610000001)
    src/emacs -Q --batch "$f" -l pp --eval   51.29s user 0.11s system 99% cpu 51.401 total
    test-bookmark-alist.el t (5.110458941 6 0.468187075)
    src/emacs -Q --batch "$f" -l pp --eval   5.21s user 0.09s system 99% cpu 5.302 total
    %

This is comparing "the old pp-buffer" with "the new `pp-region`", using
the patch below.  This is not using your `tv/pp-region` code.

I find these results (mine) quite odd: they suggest that my `pp-region`
is *faster* than the old `pp-buffer` for `load-history` and `bookmark-alist`
data, which I definitely did not expect (and don't know how to explain
either).

These tests were run on a machine whose CPU's speed can vary quite
drastically depending on the load, so take those numbers with a grain of
salt, but the dynamic frequency fluctuations shouldn't cause more than
a factor of 2 difference (and according to my CPU frequency monitor
widget, the frequency was reasonably stable during the test).

> For describe variable I use a modified version of `pp` which is very
> fast (nearly instant to pretty print value above) but maybe unsafe with
> some vars, didn't have any problems though, see
> https://github.com/thierryvolpiatto/emacs-config/blob/main/describe-variable.el.

So, IIUC the numbers you cite above compare my `pp-region` to your
`tv/pp-region`, right?  And do I understand correctly that
`tv/pp-region` does not indent its output?
What was the reason for this choice?


        Stefan


PS: BTW, looking at the output of `pp` on the bookmark-data, it's not
a test where `pp-region` shines: the old pp uses up more lines, but is
more regular and arguably more readable :-(


diff --git a/lisp/emacs-lisp/lisp-mode.el b/lisp/emacs-lisp/lisp-mode.el
index d44c9d6e23d..9914ededb85 100644
--- a/lisp/emacs-lisp/lisp-mode.el
+++ b/lisp/emacs-lisp/lisp-mode.el
@@ -876,7 +876,7 @@ lisp-ppss
 2 (counting from 0).  This is important for Lisp indentation."
   (unless pos (setq pos (point)))
   (let ((pss (syntax-ppss pos)))
-    (if (nth 9 pss)
+    (if (and (not (nth 2 pss)) (nth 9 pss))
         (let ((sexp-start (car (last (nth 9 pss)))))
           (parse-partial-sexp sexp-start pos nil nil (syntax-ppss sexp-start)))
       pss)))
diff --git a/lisp/emacs-lisp/pp.el b/lisp/emacs-lisp/pp.el
index e6e3cd6c6f4..89d7325a491 100644
--- a/lisp/emacs-lisp/pp.el
+++ b/lisp/emacs-lisp/pp.el
@@ -74,31 +74,127 @@ pp-to-string
       (pp-buffer)
       (buffer-string))))
 
+(defun pp--within-fill-column-p ()
+  "Return non-nil if point is within `fill-column'."
+  ;; Try and make it O(fill-column) rather than O(current-column),
+  ;; so as to avoid major slowdowns on long lines.
+  ;; FIXME: This doesn't account for invisible text or `display' properties :-(
+  (and (save-excursion
+         (re-search-backward
+          "^\\|\n" (max (point-min) (- (point) fill-column)) t))
+       (<= (current-column) fill-column)))
+
+(defun pp-region (beg end)
+  "Insert newlines in BEG..END to try and fit within `fill-column'.
+Presumes the current buffer contains Lisp code and has indentation properly
+configured for that.
+Designed under the assumption that the region occupies a single line,
+tho it should also work if that's not the case."
+  (interactive "r")
+  (goto-char beg)
+  (let ((end (copy-marker end t))
+        (newline (lambda ()
+                   (skip-chars-forward ")]}")
+                   (unless (save-excursion (skip-chars-forward " \t") (eolp))
+                     (insert "\n")
+                     (indent-according-to-mode)))))
+    (while (progn (forward-comment (point-max))
+                  (< (point) end))
+      (let ((beg (point))
+            ;; Whether we're in front of an element with paired delimiters.
+            ;; Can be something funky like #'(lambda ..) or ,'#s(...).
+            (paired (when (looking-at "['`,#]*[[:alpha:]]*\\([({[\"]\\)")
+                      (match-beginning 1))))
+        ;; Go to the end of the sexp.
+        (goto-char (or (scan-sexps (or paired (point)) 1) end))
+        (unless
+            (and
+             ;; The sexp is all on a single line.
+             (save-excursion (not (search-backward "\n" beg t)))
+             ;; And its end is within `fill-column'.
+             (or (pp--within-fill-column-p)
+                 ;; If the end of the sexp is beyond `fill-column',
+                 ;; try to move the sexp to its own line.
+                 (and
+                  (save-excursion
+                    (goto-char beg)
+                    (if (save-excursion (skip-chars-backward " \t({[',") (bolp))
+                        ;; The sexp was already on its own line.
+                        nil
+                      (skip-chars-backward " \t")
+                      (setq beg (copy-marker beg t))
+                      (if paired (setq paired (copy-marker paired t)))
+                      ;; We could try to undo this insertion if it
+                      ;; doesn't reduce the indentation depth, but I'm
+                      ;; not sure it's worth the trouble.
+                      (insert "\n") (indent-according-to-mode)
+                      t))
+                  ;; Check again if we moved the whole exp to a new line.
+                  (pp--within-fill-column-p))))
+          ;; The sexp is spread over several lines, and/or its end is
+          ;; (still) beyond `fill-column'.
+          (when (and paired (not (eq ?\" (char-after paired))))
+            ;; The sexp has sub-parts, so let's try and spread
+            ;; them over several lines.
+            (save-excursion
+              (goto-char beg)
+              (when (looking-at "(\\([^][()\" \t\n;']+\\)")
+                ;; Inside an expression of the form (SYM ARG1
+                ;; ARG2 ... ARGn) where SYM has a `lisp-indent-function'
+                ;; property that's a number, insert a newline after
+                ;; the corresponding ARGi, because it tends to lead to
+                ;; more natural and less indented code.
+                (let* ((sym (intern-soft (match-string 1)))
+                       (lif (and sym (get sym 'lisp-indent-function))))
+                  (if (eq lif 'defun) (setq lif 2))
+                  (when (natnump lif)
+                    (goto-char (match-end 0))
+                    (forward-sexp lif)
+                    (funcall newline)))))
+            (save-excursion
+              (pp-region (1+ paired) (1- (point)))))
+          ;; Now the sexp either ends beyond `fill-column' or is
+          ;; spread over several lines (or both).  Either way, the rest of the
+          ;; line should be moved to its own line.
+          (funcall newline))))))
+
+(defcustom pp-buffer-use-pp-region t
+  "If non-nil, `pp-buffer' uses the new `pp-region' code."
+  :type 'boolean)
+
 ;;;###autoload
 (defun pp-buffer ()
   "Prettify the current buffer with printed representation of a Lisp object."
   (interactive)
   (goto-char (point-min))
-  (while (not (eobp))
-    (cond
-     ((ignore-errors (down-list 1) t)
-      (save-excursion
-        (backward-char 1)
-        (skip-chars-backward "'`#^")
-        (when (and (not (bobp)) (memq (char-before) '(?\s ?\t ?\n)))
-          (delete-region
-           (point)
-           (progn (skip-chars-backward " \t\n") (point)))
-          (insert "\n"))))
-     ((ignore-errors (up-list 1) t)
-      (skip-syntax-forward ")")
-      (delete-region
-       (point)
-       (progn (skip-chars-forward " \t\n") (point)))
-      (insert ?\n))
-     (t (goto-char (point-max)))))
-  (goto-char (point-min))
-  (indent-sexp))
+  (if pp-buffer-use-pp-region
+      (with-syntax-table emacs-lisp-mode-syntax-table
+        (let ((fill-column (max fill-column 70))
+              (indent-line-function
+               (if (local-variable-p 'indent-line-function)
+                   indent-line-function
+                 #'lisp-indent-line)))
+          (pp-region (point-min) (point-max))))
+    (while (not (eobp))
+      (cond
+       ((ignore-errors (down-list 1) t)
+        (save-excursion
+          (backward-char 1)
+          (skip-chars-backward "'`#^")
+          (when (and (not (bobp)) (memq (char-before) '(?\s ?\t ?\n)))
+            (delete-region
+             (point)
+             (progn (skip-chars-backward " \t\n") (point)))
+            (insert "\n"))))
+       ((ignore-errors (up-list 1) t)
+        (skip-syntax-forward ")")
+        (delete-region
+         (point)
+         (progn (skip-chars-forward " \t\n") (point)))
+        (insert ?\n))
+       (t (goto-char (point-max)))))
+    (goto-char (point-min))
+    (indent-sexp)))
 
 ;;;###autoload
 (defun pp (object &optional stream)





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63861; Package emacs. (Thu, 08 Jun 2023 16:17:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63861; Package emacs. (Thu, 08 Jun 2023 18:36:02 GMT) Full text and rfc822 format available.

Message #71 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Thierry Volpiatto <thievol <at> posteo.net>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: "Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of
 text editors" <bug-gnu-emacs <at> gnu.org>, 63861 <at> debbugs.gnu.org
Subject: Re: bug#63861: [PATCH] pp.el: New "pretty printing" code
Date: Thu, 08 Jun 2023 18:08:18 +0000
[Message part 1 (text/plain, inline)]
Stefan Monnier <monnier <at> iro.umontreal.ca> writes:

> This is comparing "the old pp-buffer" with "the new `pp-region`", using
> the patch below.  This is not using your `tv/pp-region` code.
>
> I find these results (mine) quite odd: they suggest that my `pp-region`
> is *faster* than the old `pp-buffer` for `load-history` and `bookmark-alist`
> data, which I definitely did not expect (and don't know how to explain
> either).

Hmm, don't know, I ran pp-buffer with M-: from the test-load-history.el or the
test-bookmark-alist.el buffer. May be using emacs --batch makes a
difference? is the data really printed in such case?

More or less the code using pp-region takes between 42 to 48s and the one
with old pp-buffer around 6s.

Also sorry about your last patch I tested it too fast, it is indeed
slightly faster, but not much: 42 vs 46s.

> These tests were run on a machine whose CPU's speed can vary quite
> drastically depending on the load, so take those numbers with a grain of
> salt, but the dynamic frequency fluctuations shouldn't cause more than
> a factor of 2 difference (and according to my CPU frequency monitor
> widget, the frequency was reasonably stable during the test).

Yes, sure, more or less it is the same.

>> For describe variable I use a modified version of `pp` which is very
>> fast (nearly instant to pretty print value above) but maybe unsafe with
>> some vars, didn't have any problems though, see
>> https://github.com/thierryvolpiatto/emacs-config/blob/main/describe-variable.el.
>
> So, IIUC the numbers you cite above compare my `pp-region` to your
> `tv/pp-region`, right?

Not in the first mail, but after yes.

> And do I understand correctly that `tv/pp-region` does not indent its
> output?

No, it does indent, see function tv/pp which use pp-to-string which use pp-buffer
and pp-buffer indent the whole sexp at the end.

> What was the reason for this choice?

Because indentation is done at the end by pp-buffer.

PS (unrelated to pp-region): About the old pp-buffer, there is a difficult to find bug where the
same operation is running twice (newline is added, then removed, then
added again and then the loop continue)

You can see it by edebugging pp-buffer on a simple alist like this:

(defvar bar '((1 . "un") (2 . "deux") (3 . "trois") (4 . "quatre") (5 . "cinq") (6 . "six")))

-- 
Thierry
[signature.asc (application/pgp-signature, inline)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63861; Package emacs. (Thu, 08 Jun 2023 18:36:03 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63861; Package emacs. (Thu, 08 Jun 2023 22:36:01 GMT) Full text and rfc822 format available.

Message #77 received at 63861 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Thierry Volpiatto <thievol <at> posteo.net>
Cc: 63861 <at> debbugs.gnu.org
Subject: Re: bug#63861: [PATCH] pp.el: New "pretty printing" code
Date: Thu, 08 Jun 2023 18:35:05 -0400
>> I find these results (mine) quite odd: they suggest that my `pp-region`
>> is *faster* than the old `pp-buffer` for `load-history` and `bookmark-alist`
>> data, which I definitely did not expect (and don't know how to explain
>> either).

I've just redone my tests a bit differently, added `pp-emacs-lisp-code`,
and also introduced a var to control whether to activate the `lisp-ppss`
patch or not.  I also fixed my `foo.el` file: its content was
accidentally already pretty printed rather than being on a single line,
which totally changes the behavior of `pp-region` and `pp-buffer`).

For reference:

    % (cd ~/tmp; l foo.el test*.el)
    -rw------- 1 monnier monnier 1125154  8 jun 11:20 test-load-history.el
    -rw------- 1 monnier monnier  163258  8 jun 11:20 test-bookmark-alist.el
    -rw-r--r-- 1 monnier monnier   77101  8 jun 17:20 foo.el
    %

Here's the code I used to run the test:

    for f in ~/tmp/foo.el ~/tmp/test-bookmark-alist.el ~/tmp/test-load-history.el; do for ppss in nil t; do for v in '(pp-buffer)' '(pp-region (point-min) (point-max))' '(tv/pp-region (point-min) (point-max))' '(let ((s (read (current-buffer)))) (erase-buffer) (pp-emacs-lisp-code s))'; do src/emacs -Q --batch -l ~/tmp/describe-variable --eval "(with-temp-buffer (emacs-lisp-mode) (insert-file-contents \"$f\") (setq pp-buffer-use-pp-region nil lisp--faster-ppss $ppss) (message \"%S %S %S %S\" (file-name-nondirectory \"$f\") (benchmark-run $v) '$v '$ppss))"&;  done; done; done

So, by file, from fastest to slowest:

    foo.el (0.859482743 0 0.0) (pp-buffer) t
    foo.el (0.890402623 0 0.0) (pp-buffer) nil
    foo.el (4.62344853 4 1.7225397670000002) (tv/pp-region (point-min) (point-max)) t
    foo.el (4.687414465 4 1.7116580980000002) (tv/pp-region (point-min) (point-max)) nil
    foo.el (7.932661181 1 0.3435169600000001) (pp-region (point-min) (point-max)) t
    foo.el (196.183345212 1 0.618591124) (pp-region (point-min) (point-max)) nil
    foo.el (2997.739238575 505 105.82851685700001) (let ((s (read (current-buffer)))) (erase-buffer) (pp-emacs-lisp-code s)) t

Here we see that my `pp-region` code is slower than `pp-buffer` by
a factor ~10x: I'm not very happy about it, but this `foo.el` file was
selected because it was the worst case I had come across (tho that was
before I found the `lisp-ppss` patch).

The last element in each line is whether we activated the `lisp-ppss`
patch.  As we can see here, the `lisp-ppss` patch makes an enormous
difference (~24x) for my code, but not for `pp-buffer` (because it
relies on `lisp-indent-region` rather than `lisp-indent-line`) and not
for `tv/pp-region` either (because it doesn't indent at all).

We also see that `pp-emacs-lisp-code` is *much* slower.  I don't include
other results for this function in this email because they're still
running :-)

    test-bookmark-alist.el (13.237991019999999 6 2.403892035) (tv/pp-region (point-min) (point-max)) nil
    test-bookmark-alist.el (14.853056353 6 2.705511935) (tv/pp-region (point-min) (point-max)) t
    test-bookmark-alist.el (28.059658418 5 2.005039257) (pp-region (point-min) (point-max)) t
    test-bookmark-alist.el (180.390204026 5 2.1182066760000002) (pp-region (point-min) (point-max)) nil
    test-bookmark-alist.el (265.95429676599997 10 4.445954908) (pp-buffer) t
    test-bookmark-alist.el (268.975666886 10 3.6774180120000004) (pp-buffer) nil

Here we see that my `pp-region` code can be faster (even significantly
so) than `pp-buffer`.  I'm not sure why, but I'll take the win :-)
We also see that the faster `lisp-ppss` again makes an important
difference in the performance of `pp-region` (~8x), tho the effect is
not as drastic as in the case of `foo.el`.

    test-load-history.el (235.134082197 8 4.440112806999999) (tv/pp-region (point-min) (point-max)) nil
    test-load-history.el (235.873981253 8 4.416064476) (tv/pp-region (point-min) (point-max)) t
    test-load-history.el (506.770548196 31 9.706665932) (pp-region (point-min) (point-max)) t
    test-load-history.el (701.991875274 43 12.390212449) (pp-buffer) t
    test-load-history.el (710.843618646 43 12.156289354) (pp-buffer) nil
    test-load-history.el (1419.268184021 36 9.260999640000001) (pp-region (point-min) (point-max)) nil

Here again, we see that `pp-region` is competitive with `pp-buffer` and
the `lisp-ppss` patch speeds it up significantly (~3x).

Another thing we see in those tests is that `pp-region` (with the
`lisp-ppss` patch) is ~2x slower than `tv/pp-region`, whereas the
performance differential with `pp-buffer` varies a lot more.  Also if we
compare the time it takes to the size of the file, we see:

      77101B /   7.932661181s = 9719 B/s
     163258B /  28.059658418s = 5818 B/s
    1125154B / 506.770548196s = 2220 B/s

`pp-region`s performance is not quite linear in the size of the file :-(
Interestingly, the same holds for `tv/pp-region`:

      77101B /   4.62344853s  = 16676 B/s
     163258B /  13.237991019s = 12332 B/s
    1125154B / 235.134082197s =  4785 B/s

even though it works in a fundamentally very different way (which, to
my naive eye should result in a linear performance), so maybe the
slowdown here is due to something external (such as the fact that
various operations on buffers get slower as the buffer gets bigger).

> hmm, don't know, I ran pp-buffer with M-: from the test-load-history.el or the
> test-bookmark-alist.el buffer. May be using emacs --batch makes a
> difference?

I don't see any significant performance difference between batch and
non-batch :-(

> is the data really printed in such case?

Yes, definitely.

> More or less the code using pp-region takes between 42 to 48s and the one
> with old pp-buffer around 6s.

I wonder why we see such wildly different performance.  In my tests on
your `test-bookmark-alist.el` I basically see the reverse ratio!

> Also sorry about your last patch I tested it too fast, it is indeed
> slightly faster, but not much: 42 vs 46s.

This is also perplexing.  In my tests, the patch has a very significant
impact on the performance of `pp-region`.
Are you sure the patch is used (`lisp-mode.el` is preloaded, so you need
to re-dump Emacs, or otherwise manually force-reload `lisp-mode.elc`
into your Emacs session)?

FWIW, I'm running my tests on Emacs's `master` branch with the native
ELisp compiler enabled (tho I don't see much difference in performance
on these tests when running my other Emacs build without the native
compiler) on an i386 Debian testing system.

>> And do I understand correctly that `tv/pp-region` does not indent its
>> output?
> No, it does indent, see function tv/pp which use pp-to-string which use pp-buffer
> and pp-buffer indent the whole sexp at the end.

AFAICT `tv/pp` uses `pp-to-string` only on "atomic" values (i.e. not
lists, vectors, or hash-tables), so there's usually not much to indent in there.
What I see in the output after calling `tv/pp-region` are non-indented lists.

>> What was the reason for this choice?
> Because indentation is done at the end by pp-buffer.

When I use `describe-variable` with your code, the printed value is
indeed indented, but that code uses only `pp-buffer` and not
`tv/pp-region` (i.e. `tv/describe-variable` does not call
`tv/pp-region`, neither directly nor indirectly).

> PS (unrelated to pp-region): About the old pp-buffer, there is
> a difficult to find bug where the same operation is running twice
> (newline is added, then removed, then added again and then the loop
> continue)
>
> You can see it by edebugging pp-buffer on a simple alist like this:
>
> (defvar bar '((1 . "un") (2 . "deux") (3 . "trois") (4 . "quatre") (5 . "cinq") (6 . "six")))

That might be part of the poor performance we see on
`test-bookmark-alist.el`?


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63861; Package emacs. (Fri, 09 Jun 2023 00:08:01 GMT) Full text and rfc822 format available.

Message #80 received at 63861 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Thierry Volpiatto <thievol <at> posteo.net>
Cc: 63861 <at> debbugs.gnu.org
Subject: Re: bug#63861: [PATCH] pp.el: New "pretty printing" code
Date: Thu, 08 Jun 2023 20:07:26 -0400
> `pp-region`s performance is not quite linear in the size of the file :-(
> Interestingly, the same holds for `tv/pp-region`:
>
>       77101B /   4.62344853s  = 16676 B/s
>      163258B /  13.237991019s = 12332 B/s
>     1125154B / 235.134082197s =  4785 B/s
>
> even though it works in a fundamentally very different way (which, to
> my naive eye should result in a linear performance), so maybe the
> slowdown here is due to something external (such as the fact that
> various operations on buffers get slower as the buffer gets bigger).

I think I figured it out: the larger files happen to have relatively
more (smaller) "objects".  So it is more or less linear, just not in the
size in bytes but in the size in number of objects.


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63861; Package emacs. (Fri, 09 Jun 2023 03:22:02 GMT) Full text and rfc822 format available.

Message #83 received at 63861 <at> debbugs.gnu.org (full text, mbox):

From: Visuwesh <visuweshm <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca> 
Cc: Eli Zaretskii <eliz <at> gnu.org>, 63861 <at> debbugs.gnu.org
Subject: Re: bug#63861: [PATCH] pp.el: New "pretty printing" code
Date: Fri, 09 Jun 2023 08:51:39 +0530
[புதன் ஜூன் 07, 2023] Stefan Monnier via "Bug reports for GNU Emacs, the Swiss army knife of text editors" wrote:

>> BTW, how does this compare to the newly added `pp-emacs-lisp-code`?
>
> Very good question.  I had completely missed that (and its `pp-use-max-width`).
> This points to a host of integration issues between my code and the
> existing code.  I'll have to take a deeper look.

From what I remember, pp simply switches to use pp-emacs-lisp-code when
the relevant user option is set.  The poor performance of
pp-emacs-lisp-code made me wish pp-use-max-width was only obeyed by user
facing commands like pp-eval-expression & friends.

>> It was still rough around the edges the last time I set
>> `pp-use-max-width' non-nil.  It is also quite a lot slower than the
>> old path.
>
> My new code is expected to be slower than the "normal" pretty-printer,
> but barring performance bugs in `lisp-indent-line` (such as the one
> fixed by the patch I just sent to Thierry) it should be approximately
> a constant factor slower.
>
> AFAICT the performance of `pp-emacs-lisp-code` can be more problematic.

Hopefully, the constant factor is quite small.  pp-emacs-lisp-code took
a lot of time to print my modest bookmark alist (28 entries) and for the
longest time I thought some other code in Emacs has gone awry.

> Beside performance, I guess the behavior of the two should be
> somewhat similar, tho I also see that `pp-emacs-lisp-code` pays
> attention to the Edebug and `doc-string-elt` info, so it may give
> slightly more refined info.

I haven't tested your pretty printer but pp-emacs-lisp-code could give
some really bizarre results for lisp data.  Unfortunately, I don't have
any examples handy to illustrate what I mean by bizarre though. 

> Another difference is that `pp-emacs-lisp-code` starts with an S-exp
> object, whereas my code starts with a region (i.e. an sexp that's
> already been printed).
>
>
>         Stefan




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63861; Package emacs. (Fri, 09 Jun 2023 05:51:01 GMT) Full text and rfc822 format available.

Message #86 received at 63861 <at> debbugs.gnu.org (full text, mbox):

From: Thierry Volpiatto <thievol <at> posteo.net>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 63861 <at> debbugs.gnu.org
Subject: Re: bug#63861: [PATCH] pp.el: New "pretty printing" code
Date: Fri, 09 Jun 2023 05:22:31 +0000
[Message part 1 (text/plain, inline)]
Stefan Monnier <monnier <at> iro.umontreal.ca> writes:

>>> I find these results (mine) quite odd: they suggest that my `pp-region`
>>> is *faster* than the old `pp-buffer` for `load-history` and `bookmark-alist`
>>> data, which I definitely did not expect (and don't know how to explain
>>> either).
>
> I've just redone my tests a bit differently, added `pp-emacs-lisp-code`,
> and also introduced a var to control whether to activate the `lisp-ppss`
> patch or not.  I also fixed my `foo.el` file: its content was
> accidentally already pretty printed rather than being on a single line,
> which totally changes the behavior of `pp-region` and `pp-buffer`).
>
> For reference:
>
>     % (cd ~/tmp; l foo.el test*.el)
>     -rw------- 1 monnier monnier 1125154  8 jun 11:20 test-load-history.el
>     -rw------- 1 monnier monnier  163258  8 jun 11:20 test-bookmark-alist.el
>     -rw-r--r-- 1 monnier monnier   77101  8 jun 17:20 foo.el
>     %
>
> Here's the code I used to run the test:
>
>     for f in ~/tmp/foo.el ~/tmp/test-bookmark-alist.el ~/tmp/test-load-history.el; do for ppss in nil t; do for v in '(pp-buffer)' '(pp-region (point-min) (point-max))' '(tv/pp-region (point-min) (point-max))' '(let ((s (read (current-buffer)))) (erase-buffer) (pp-emacs-lisp-code s))'; do src/emacs -Q --batch -l ~/tmp/describe-variable --eval "(with-temp-buffer (emacs-lisp-mode) (insert-file-contents \"$f\") (setq pp-buffer-use-pp-region nil lisp--faster-ppss $ppss) (message \"%S %S %S %S\" (file-name-nondirectory \"$f\") (benchmark-run $v) '$v '$ppss))"&;  done; done; done
>
> So, by file, from fastest to slowest:
>
>     foo.el (0.859482743 0 0.0) (pp-buffer) t
>     foo.el (0.890402623 0 0.0) (pp-buffer) nil
>     foo.el (4.62344853 4 1.7225397670000002) (tv/pp-region (point-min) (point-max)) t
>     foo.el (4.687414465 4 1.7116580980000002) (tv/pp-region (point-min) (point-max)) nil
>     foo.el (7.932661181 1 0.3435169600000001) (pp-region (point-min) (point-max)) t
>     foo.el (196.183345212 1 0.618591124) (pp-region (point-min) (point-max)) nil
>     foo.el (2997.739238575 505 105.82851685700001) (let ((s (read (current-buffer)))) (erase-buffer) (pp-emacs-lisp-code s)) t
>
> Here we see that my `pp-region` code is slower than `pp-buffer` by
> a factor ~10x: I'm not very happy about it, but this `foo.el` file was
> selected because it was the worst case I had come across (tho that was
> before I found the `lisp-ppss` patch).
>
> The last element in each line is whether we activated the `lisp-ppss`
> patch.  As we can see here, the `lisp-ppss` patch makes an enormous
> difference (~24x) for my code, but not for `pp-buffer` (because it
> relies on `lisp-indent-region` rather than `lisp-indent-line`) and not
> for `tv/pp-region` either (because it doesn't indent at all).
>
> We also see that `pp-emacs-lisp-code` is *much* slower.  I don't include
> other results for this function in this email because they're still
> running :-)
>
>     test-bookmark-alist.el (13.237991019999999 6 2.403892035) (tv/pp-region (point-min) (point-max)) nil
>     test-bookmark-alist.el (14.853056353 6 2.705511935) (tv/pp-region (point-min) (point-max)) t
>     test-bookmark-alist.el (28.059658418 5 2.005039257) (pp-region (point-min) (point-max)) t
>     test-bookmark-alist.el (180.390204026 5 2.1182066760000002) (pp-region (point-min) (point-max)) nil
>     test-bookmark-alist.el (265.95429676599997 10 4.445954908) (pp-buffer) t
>     test-bookmark-alist.el (268.975666886 10 3.6774180120000004) (pp-buffer) nil
>
> Here we see that my `pp-region` code can be faster (even significantly
> so) than `pp-buffer`.  I'm not sure why, but I'll take the win :-)
> We also see that the faster `lisp-ppss` again makes an important
> difference in the performance of `pp-region` (~8x), tho the effect is
> not as drastic as in the case of `foo.el`.
>
>     test-load-history.el (235.134082197 8 4.440112806999999) (tv/pp-region (point-min) (point-max)) nil
>     test-load-history.el (235.873981253 8 4.416064476) (tv/pp-region (point-min) (point-max)) t
>     test-load-history.el (506.770548196 31 9.706665932) (pp-region (point-min) (point-max)) t
>     test-load-history.el (701.991875274 43 12.390212449) (pp-buffer) t
>     test-load-history.el (710.843618646 43 12.156289354) (pp-buffer) nil
>     test-load-history.el (1419.268184021 36 9.260999640000001) (pp-region (point-min) (point-max)) nil
>
> Here again, we see that `pp-region` is competitive with `pp-buffer` and
> the `lisp-ppss` patch speeds it up significantly (~3x).
>
> Another thing we see in those tests is that `pp-region` (with the
> `lisp-ppss` patch) is ~2x slower than `tv/pp-region`, whereas the
> performance differential with `pp-buffer` varies a lot more.  Also if we
> compare the time it takes to the size of the file, we see:
>
>       77101B /   7.932661181s = 9719 B/s
>      163258B /  28.059658418s = 5818 B/s
>     1125154B / 506.770548196s = 2220 B/s
>
> `pp-region`s performance is not quite linear in the size of the file :-(
> Interestingly, the same holds for `tv/pp-region`:
>
>       77101B /   4.62344853s  = 16676 B/s
>      163258B /  13.237991019s = 12332 B/s
>     1125154B / 235.134082197s =  4785 B/s
>
> even though it works in a fundamentally very different way (which, to
> my naive eye should result in a linear performance), so maybe the
> slowdown here is due to something external (such as the fact that
> various operations on buffers get slower as the buffer gets bigger).
>
>> hmm, don't know, I ran pp-buffer with M-: from the test-load-history.el or the
>> test-bookmark-alist.el buffer. May be using emacs --batch makes a
>> difference?
>
> I don't see any significant performance difference between batch and
> non-batch :-(
>
>> is the data really printed in such case?
>
> Yes, definitely.
>
>> More or less the code using pp-region takes between 42 to 48s and the one
>> with old pp-buffer around 6s.
>
> I wonder why we see such wildly different performance.  In my tests on
> your `test-bookmark-alist.el` I basically see the reverse ratio!
>
>> Also sorry about your last patch I tested it too fast, it is indeed
>> slightly faster, but not much: 42 vs 46s.
>
> This is also perplexing.  In my tests, the patch has a very significant
> impact on the performance of `pp-region`.
> Are you sure the patch is used (`lisp-mode.el` is preloaded, so you need
> to re-dump Emacs, or otherwise manually force-reload `lisp-mode.elc`
> into your Emacs session)?

No, I just C-M-x lisp-ppss, I will try today to recompile an emacs with
your patchs applied and see if it makes a difference.
I also use emacs-28, will try with 30.

> FWIW, I'm running my tests on Emacs's `master` branch with the native
> ELisp compiler enabled (tho I don't see much difference in performance
> on these tests when running my other Emacs build without the native
> compiler) on an i386 Debian testing system.

I don't use native compilation.

>>> And do I understand correctly that `tv/pp-region` does not indent its
>>> output?
>> No, it does indent, see function tv/pp which use pp-to-string which use pp-buffer
>> and pp-buffer indent the whole sexp at the end.
>
> AFAICT `tv/pp` uses `pp-to-string` only on "atomic" values (i.e. not
> lists, vectors, or hash-tables), so there's usually not much to indent in there.
> What I see in the output after calling `tv/pp-region` are non-indented lists.

Hmm maybe, seems it was similar to pp-buffer but perhaps I didn't look
carefully.

>>> What was the reason for this choice?
>> Because indentation is done at the end by pp-buffer.
>
> When I use `describe-variable` with your code, the printed value is
> indeed indented, but that code uses only `pp-buffer` and not
> `tv/pp-region` (i.e. `tv/describe-variable` does not call
> `tv/pp-region`, neither directly nor indirectly).

Yes you have to run manually tv/pp-value-in-help when you need to read
the value of a variable (unless it is a small var).

>> PS (unrelated to pp-region): About the old pp-buffer, there is
>> a difficult to find bug where the same operation is running twice
>> (newline is added, then removed, then added again and then the loop
>> continue)
>>
>> You can see it by edebugging pp-buffer on a simple alist like this:
>>
>> (defvar bar '((1 . "un") (2 . "deux") (3 . "trois") (4 . "quatre") (5 . "cinq") (6 . "six")))
>
> That might be part of the poor performance we see on
> `test-bookmark-alist.el`?

Not sure it makes a big difference but for sure it is slower to defeat
the operation done in previous turn and do it again.

>
>         Stefan


-- 
Thierry
[signature.asc (application/pgp-signature, inline)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63861; Package emacs. (Fri, 09 Jun 2023 15:00:01 GMT) Full text and rfc822 format available.

Message #89 received at 63861 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Visuwesh <visuweshm <at> gmail.com>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 63861 <at> debbugs.gnu.org
Subject: Re: bug#63861: [PATCH] pp.el: New "pretty printing" code
Date: Fri, 09 Jun 2023 10:59:18 -0400
>>> BTW, how does this compare to the newly added `pp-emacs-lisp-code`?
>> Very good question.  I had completely missed that (and its `pp-use-max-width`).
>> This points to a host of integration issues between my code and the
>> existing code.  I'll have to take a deeper look.
> From what I remember, pp simply switches to use pp-emacs-lisp-code when
> the relevant user option is set.

Yup, similar to my patch, except my patch hooks into `pp-buffer`,
i.e. a lower level which hence affects more uses.

> The poor performance of pp-emacs-lisp-code made me wish
> pp-use-max-width was only obeyed by user facing commands like
> pp-eval-expression & friends.

My tests yesterday suggest `pp-emacs-lisp-code` is simply too slow to be
used except when we know beforehand that the sexp to be printed
is small.  And I can't think of a single piece of code where that's
the case.  I suspect part of the code can be improved to bring the
computational complexity of the code closer to linear, but until
someone does that, I think `pp-use-max-width` is just unusable.
Instead we could/should provide ways for the user to interactively call
a command to reformat something using `pp-emacs-lisp-code`.  Or maybe
change the code so `pp-emacs-lisp-code` is used only when the total
printed size is below a certain threshold.

>> My new code is expected to be slower than the "normal" pretty-printer,
>> but barring performance bugs in `lisp-indent-line` (such as the one
>> fixed by the patch I just sent to Thierry) it should be approximately
>> a constant factor slower.
>> AFAICT the performance of `pp-emacs-lisp-code` can be more problematic.
> Hopefully, the constant factor is quite small.

In my tests, 10x seems to be the "worst case slowdown" of `pp-region`.
On some of the tests, it's actually faster, sometimes significantly so
(presumably due to some non-linear complexity in some parts of
`pp-buffer`).

> pp-emacs-lisp-code took a lot of time to print my modest bookmark
> alist (28 entries) and for the longest time I thought some other code
> in Emacs has gone awry.

AFAICT it suffers from a pretty bad complexity.

> I haven't tested your pretty printer but pp-emacs-lisp-code could give
> some really bizarre results for lisp data.

I haven't seen "really bizarre results" with `pp-region` yet, but
I wouldn't be surprised if that can happen: I've been playing with
various pretty-printing alternatives over the years and they all seem to
suffer from weird behaviors occasionally, except for those that insert
too many line breaks.


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63861; Package emacs. (Fri, 09 Jun 2023 15:05:03 GMT) Full text and rfc822 format available.

Message #92 received at 63861 <at> debbugs.gnu.org (full text, mbox):

From: Ihor Radchenko <yantar92 <at> posteo.net>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 63861 <at> debbugs.gnu.org,
 Visuwesh <visuweshm <at> gmail.com>
Subject: Re: bug#63861: [PATCH] pp.el: New "pretty printing" code
Date: Fri, 09 Jun 2023 15:09:15 +0000
Stefan Monnier via "Bug reports for GNU Emacs, the Swiss army knife of
text editors" <bug-gnu-emacs <at> gnu.org> writes:

> I haven't seen "really bizarre results" with `pp-region` yet, but
> I wouldn't be surprised if that can happen: I've been playing with
> various pretty-printing alternatives over the years and they all seem to
> suffer from weird behaviors occasionally, except for those that insert
> too many line breaks.

If algorithms that insert "too many" line breaks can be reasonably fast,
can they be executed first followed by merging the excessive lines back?

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63861; Package emacs. (Fri, 09 Jun 2023 15:27:02 GMT) Full text and rfc822 format available.

Message #95 received at 63861 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Ihor Radchenko <yantar92 <at> posteo.net>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 63861 <at> debbugs.gnu.org,
 Visuwesh <visuweshm <at> gmail.com>
Subject: Re: bug#63861: [PATCH] pp.el: New "pretty printing" code
Date: Fri, 09 Jun 2023 11:26:16 -0400
>> I haven't seen "really bizarre results" with `pp-region` yet, but
>> I wouldn't be surprised if that can happen: I've been playing with
>> various pretty-printing alternatives over the years and they all seem to
>> suffer from weird behaviors occasionally, except for those that insert
>> too many line breaks.
> If algorithms that insert "too many" line breaks can be reasonably fast,
> can they be executed first followed by merging the excessive lines back?

Speed is definitely an issue in general, but the "behavior" I was
alluding to above was not the performance but the end result (and
I think the same was true for Visuwesh's comment).


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63861; Package emacs. (Fri, 09 Jun 2023 16:00:02 GMT) Full text and rfc822 format available.

Message #98 received at 63861 <at> debbugs.gnu.org (full text, mbox):

From: Visuwesh <visuweshm <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Ihor Radchenko <yantar92 <at> posteo.net>, 63861 <at> debbugs.gnu.org,
 Eli Zaretskii <eliz <at> gnu.org>
Subject: Re: bug#63861: [PATCH] pp.el: New "pretty printing" code
Date: Fri, 09 Jun 2023 21:29:18 +0530
[வெள்ளி ஜூன் 09, 2023] Stefan Monnier wrote:

> Speed is definitely an issue in general, but the "behavior" I was
> alluding to above was not the performance but the end result (and
> I think the same was true for Visuwesh's comment).

Indeed, I was talking about the end result.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63861; Package emacs. (Fri, 09 Jun 2023 16:05:02 GMT) Full text and rfc822 format available.

Message #101 received at 63861 <at> debbugs.gnu.org (full text, mbox):

From: Visuwesh <visuweshm <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 63861 <at> debbugs.gnu.org
Subject: Re: bug#63861: [PATCH] pp.el: New "pretty printing" code
Date: Fri, 09 Jun 2023 21:34:43 +0530
[வெள்ளி ஜூன் 09, 2023] Stefan Monnier wrote:

>>> My new code is expected to be slower than the "normal" pretty-printer,
>>> but barring performance bugs in `lisp-indent-line` (such as the one
>>> fixed by the patch I just sent to Thierry) it should be approximately
>>> a constant factor slower.
>>> AFAICT the performance of `pp-emacs-lisp-code` can be more problematic.
>> Hopefully, the constant factor is quite small.
>
> In my tests, 10x seems to be the "worst case slowdown" of `pp-region`.
> On some of the tests, it's actually faster, sometimes significantly so
> (presumably due to some non-linear complexity in some parts of
> `pp-buffer`).

I cannot imagine how bad 10x really will be but if you are daily-driving
this, I assume it is so not bad after all.

Thanks for your answers!




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63861; Package emacs. (Mon, 12 Jun 2023 20:23:02 GMT) Full text and rfc822 format available.

Message #104 received at 63861 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 63861 <at> debbugs.gnu.org
Subject: Re: bug#63861: [PATCH] pp.el: New "pretty printing" code
Date: Mon, 12 Jun 2023 16:21:50 -0400
[Message part 1 (text/plain, inline)]
> Yes, but let's also mention BEG and END:
>
>   Break lines in Lisp code between BEG and END so it fits within `fill-column'.

Even better, thanks.

>> > Also, I think this warrants a NEWS entry and should be documented in
>> > the ELisp manual.
>> 
>> Definitely for NEWS, yes.  For the ELisp manual, currently we don't
>> document `pp-buffer`, the closest I see is `indent-pp-sexp` (in
>> `programs.texi`).
>> I'm not sure what to put in there. nor where to put it.
>
> We document "pp" in "Output Functions".  Maybe there?

Haven't looked at that yet: I'm still trying to figure out how the
functionality should be exposed.

>> >> +(defcustom pp-buffer-use-pp-region nil
>> >> +  "If non-nil, `pp-buffer' uses the new `pp-region' code."
>> >> +  :type 'boolean)
>> > Please add :version.
>> Hmm... so you think it should stay as a `defcustom` and we should thus
>> plan to keep both kinds of pretty-printing in the long term?
>
> No, I just said that _if_ we keep it as a defcustom, _then_ it should
> have a :version tag.  I have no idea how many users will want to
> customize this.

Since Emacs-29 already has a similar defcustom to use the
`pp-emacs-lisp-code` algorithm (and since Thierry uses yet another
algorithm), I guess there's enough evidence to convince me that we
should have a defcustom.

But I don't like the `pp-use-max-width` defcustom: its name doesn't say
what it does since the fact that `pp-emacs-lisp-code` obeys `pp-max-width`
is just one part of the difference with the default pp code.
So I suggest "merging" that var with the new custom var that chooses
which algorithm to use (I could make it an obsolete alias, but it seemed
cleaner to use a new var and mark the old one obsolete).

See below my new version of the patch.  I renamed `pp-region` to
`pp-fill`.  The patch introduces a custom var `pp-default-function`
which specifies which algorithm to use among:
- `pp-emacs-lisp-code` (Lars' new-in-29 pretty-but-slow pretty printer).
- `pp-fill` (my new pretty printer).
- `pp-28` (the old pretty printer; suggestions for a better name welcome).
- `pp-29` (dispatches according to `pp-use-max-width`, to either `pp-28`
  or `pp-emacs-lisp-code`, like we do in Emacs-29).
- Thierry could plug his `tv/pp-region` in here.

The patch also changes `pp` so you can call it with BEG..END and it will
pretty-print that region, which makes `pp-buffer` obsolete (I have not
yet updated the callers accordingly).

If there's no objection, I'll adjust the doc, fix the obsolete uses of
`pp-buffer`, and install.


        Stefan
[pp-fill.patch (text/x-diff, inline)]
diff --git a/lisp/emacs-lisp/lisp-mode.el b/lisp/emacs-lisp/lisp-mode.el
index d44c9d6e23d..9914ededb85 100644
--- a/lisp/emacs-lisp/lisp-mode.el
+++ b/lisp/emacs-lisp/lisp-mode.el
@@ -876,7 +876,7 @@ lisp-ppss
 2 (counting from 0).  This is important for Lisp indentation."
   (unless pos (setq pos (point)))
   (let ((pss (syntax-ppss pos)))
-    (if (nth 9 pss)
+    (if (and (not (nth 2 pss)) (nth 9 pss))
         (let ((sexp-start (car (last (nth 9 pss)))))
           (parse-partial-sexp sexp-start pos nil nil (syntax-ppss sexp-start)))
       pss)))
diff --git a/lisp/emacs-lisp/pp.el b/lisp/emacs-lisp/pp.el
index e6e3cd6c6f4..28620fd4bbd 100644
--- a/lisp/emacs-lisp/pp.el
+++ b/lisp/emacs-lisp/pp.el
@@ -52,32 +52,195 @@
 large lists."
   :type 'boolean
   :version "29.1")
+(make-obsolete-variable 'pp-use-max-width 'pp-default-function "30.1")
+
+(defcustom pp-default-function #'pp-fill
+  ;; FIXME: The best pretty printer to use depends on the use-case
+  ;; so maybe we should allow callers to specify what they want (maybe with
+  ;; options like `fast', `compact', `code', `data', ...) and these
+  ;; can then be mapped to actual pretty-printing algorithms.
+  ;; Then again, callers can just directly call the corresponding function.
+  "Function that `pp' should dispatch to for pretty printing.
+That function can be called in one of two ways:
+- with a single argument, which it should insert and pretty-print at point.
+- with two arguments which delimit a region containing Lisp sexps
+  which should be pretty-printed.
+In both cases, the function can presume that the buffer is setup for
+Lisp syntax."
+  :type 'function
+  :options '(pp-28 pp-29 pp-fill pp-emacs-lisp-code)
+  :version "30.1")
 
 (defvar pp--inhibit-function-formatting nil)
 
+;; There are basically two APIs for a pretty-printing function:
+;;
+;; - either the function takes an object (and prints it in addition to
+;;   prettifying it).
+;; - or the function takes a region containing an already printed object
+;;   and prettifies its content.
+;;
+;; `pp--object' and `pp--region' are helper functions to convert one
+;; API to the other.
+
+
+(defun pp--object (object region-function)
+  "Pretty-print OBJECT at point.
+The prettifying is done by REGION-FUNCTION which is
+called with two positions as arguments and should fold lines
+within that region.  Returns the result as a string."
+  (let ((print-escape-newlines pp-escape-newlines)
+        (print-quoted t)
+        (beg (point)))
+    ;; FIXME: In many cases it would be preferable to use `cl-prin1' here.
+    (prin1 object (current-buffer))
+    (funcall region-function beg (point))))
+
+(defun pp--region (beg end object-function)
+  "Pretty-print the object(s) contained within BEG..END.
+OBJECT-FUNCTION is called with a single object as argument
+and should pretty print it at point into the current buffer."
+  (save-excursion
+    (with-restriction beg end
+      (goto-char (point-min))
+      (while
+          (progn
+            ;; We'll throw away all the comments within objects, but let's
+            ;; try at least to preserve the comments between objects.
+            (forward-comment (point-max))
+            (let ((beg (point))
+                  (object (ignore-error end-of-buffer
+                              (list (read (current-buffer))))))
+              (when (consp object)
+                (delete-region beg (point))
+                (funcall object-function (car object))
+                t)))))))
+
+(defun pp-29 (beg-or-sexp &optional end)
+  "Prettify the current region with printed representation of a Lisp object.
+Uses the pretty-printing algorithm that was standard in Emacs-29,
+which, depending on `pp-use-max-width', will either use `pp-28'
+or `pp-emacs-lisp-code'."
+  (if pp-use-max-width
+      (let ((pp--inhibit-function-formatting t)) ;FIXME: Why?
+        (pp-emacs-lisp-code beg-or-sexp end))
+    (pp-28 beg-or-sexp end)))
+
 ;;;###autoload
-(defun pp-to-string (object)
+(defun pp-to-string (object &optional pp-function)
   "Return a string containing the pretty-printed representation of OBJECT.
 OBJECT can be any Lisp object.  Quoting characters are used as needed
-to make output that `read' can handle, whenever this is possible."
-  (if pp-use-max-width
-      (let ((pp--inhibit-function-formatting t))
-        (with-temp-buffer
-          (pp-emacs-lisp-code object)
-          (buffer-string)))
+to make output that `read' can handle, whenever this is possible.
+Optional argument PP-FUNCTION overrides `pp-default-function'."
     (with-temp-buffer
       (lisp-mode-variables nil)
       (set-syntax-table emacs-lisp-mode-syntax-table)
-      (let ((print-escape-newlines pp-escape-newlines)
-            (print-quoted t))
-        (prin1 object (current-buffer)))
-      (pp-buffer)
-      (buffer-string))))
+    (funcall (or pp-function pp-default-function) object)
+    (buffer-string)))
+
+(defun pp--within-fill-column-p ()
+  "Return non-nil if point is within `fill-column'."
+  ;; Try and make it O(fill-column) rather than O(current-column),
+  ;; so as to avoid major slowdowns on long lines.
+  ;; FIXME: This doesn't account for invisible text or `display' properties :-(
+  (and (save-excursion
+         (re-search-backward
+          "^\\|\n" (max (point-min) (- (point) fill-column)) t))
+       (<= (current-column) fill-column)))
+
+(defun pp-fill (beg &optional end)
+  "Break lines in Lisp code between BEG and END so it fits within `fill-column'.
+Presumes the current buffer has syntax and indentation properly
+configured for that.
+Designed under the assumption that the region occupies a single line,
+tho it should also work if that's not the case.
+Can also be called with a single argument, in which case
+it inserts and pretty-prints that arg at point."
+  (interactive "r")
+  (if (null end) (pp--object beg #'pp-fill)
+    (goto-char beg)
+    (let ((end (copy-marker end t))
+          (newline (lambda ()
+                     (skip-chars-forward ")]}")
+                     (unless (save-excursion (skip-chars-forward " \t") (eolp))
+                       (insert "\n")
+                       (indent-according-to-mode)))))
+      (while (progn (forward-comment (point-max))
+                    (< (point) end))
+        (let ((beg (point))
+              ;; Whether we're in front of an element with paired delimiters.
+              ;; Can be something funky like #'(lambda ..) or ,'#s(...).
+              (paired (when (looking-at "['`,#]*[[:alpha:]]*\\([({[\"]\\)")
+                        (match-beginning 1))))
+          ;; Go to the end of the sexp.
+          (goto-char (or (scan-sexps (or paired (point)) 1) end))
+          (unless
+              (and
+               ;; The sexp is all on a single line.
+               (save-excursion (not (search-backward "\n" beg t)))
+               ;; And its end is within `fill-column'.
+               (or (pp--within-fill-column-p)
+                   ;; If the end of the sexp is beyond `fill-column',
+                   ;; try to move the sexp to its own line.
+                   (and
+                    (save-excursion
+                      (goto-char beg)
+                      (if (save-excursion (skip-chars-backward " \t({[',")
+                                          (bolp))
+                          ;; The sexp was already on its own line.
+                          nil
+                        (skip-chars-backward " \t")
+                        (setq beg (copy-marker beg t))
+                        (if paired (setq paired (copy-marker paired t)))
+                        ;; We could try to undo this insertion if it
+                        ;; doesn't reduce the indentation depth, but I'm
+                        ;; not sure it's worth the trouble.
+                        (insert "\n") (indent-according-to-mode)
+                        t))
+                    ;; Check again if we moved the whole exp to a new line.
+                    (pp--within-fill-column-p))))
+            ;; The sexp is spread over several lines, and/or its end is
+            ;; (still) beyond `fill-column'.
+            (when (and paired (not (eq ?\" (char-after paired))))
+              ;; The sexp has sub-parts, so let's try and spread
+              ;; them over several lines.
+              (save-excursion
+                (goto-char beg)
+                (when (looking-at "(\\([^][()\" \t\n;']+\\)")
+                  ;; Inside an expression of the form (SYM ARG1
+                  ;; ARG2 ... ARGn) where SYM has a `lisp-indent-function'
+                  ;; property that's a number, insert a newline after
+                  ;; the corresponding ARGi, because it tends to lead to
+                  ;; more natural and less indented code.
+                  (let* ((sym (intern-soft (match-string 1)))
+                         (lif (and sym (get sym 'lisp-indent-function))))
+                    (if (eq lif 'defun) (setq lif 2))
+                    (when (natnump lif)
+                      (goto-char (match-end 0))
+                      (forward-sexp lif)
+                      (funcall newline)))))
+              (save-excursion
+                (pp-fill (1+ paired) (1- (point)))))
+            ;; Now the sexp either ends beyond `fill-column' or is
+            ;; spread over several lines (or both).  Either way, the
+            ;; rest of the line should be moved to its own line.
+            (funcall newline)))))))
 
 ;;;###autoload
 (defun pp-buffer ()
   "Prettify the current buffer with printed representation of a Lisp object."
+  (declare (obsolete pp "30"))
   (interactive)
+  (pp (point-min) (point-max)))
+
+(defun pp-28 (beg &optional end)
+  "Prettify the current region with printed representation of a Lisp object.
+Uses the pretty-printing algorithm that was standard in Emacs≤29.
+Non-interactively can also be called with a single argument, in which
+case that argument will be inserted pretty-printed at point."
+  (interactive "r")
+  (if (null end) (pp--object beg #'pp-29)
+    (save-restriction beg end
   (goto-char (point-min))
   (while (not (eobp))
     (cond
@@ -98,7 +261,7 @@
       (insert ?\n))
      (t (goto-char (point-max)))))
   (goto-char (point-min))
-  (indent-sexp))
+      (indent-sexp))))
 
 ;;;###autoload
 (defun pp (object &optional stream)
@@ -106,14 +106,22 @@
 Quoting characters are printed as needed to make output that `read'
 can handle, whenever this is possible.
 
-This function does not apply special formatting rules for Emacs
-Lisp code.  See `pp-emacs-lisp-code' instead.
+Uses the pretty-printing code specified in `pp-default-function'.
 
-By default, this function won't limit the line length of lists
-and vectors.  Bind `pp-use-max-width' to a non-nil value to do so.
-
-Output stream is STREAM, or value of `standard-output' (which see)."
-  (princ (pp-to-string object) (or stream standard-output)))
+Output stream is STREAM, or value of `standard-output' (which see).
+An alternative calling convention is to pass it two buffer positions,
+in which case it will prettify that region's content."
+  (cond
+   ((and (integerp object) (integerp stream))
+    (funcall pp-default-function object stream))
+   ((and (eq (or stream standard-output) (current-buffer))
+         ;; Make sure the current buffer is setup sanely.
+         (eq (syntax-table) emacs-lisp-mode-syntax-table)
+         (eq indent-line-function #'lisp-indent-line))
+    ;; Skip the buffer->string->buffer middle man.
+    (funcall pp-default-function object))
+   (t
+    (princ (pp-to-string object) (or stream standard-output)))))
 
 ;;;###autoload
 (defun pp-display-expression (expression out-buffer-name &optional lisp)
@@ -220,11 +220,14 @@
     (pp-macroexpand-expression (pp-last-sexp))))
 
 ;;;###autoload
-(defun pp-emacs-lisp-code (sexp)
+(defun pp-emacs-lisp-code (sexp &optional end)
   "Insert SEXP into the current buffer, formatted as Emacs Lisp code.
 Use the `pp-max-width' variable to control the desired line length.
-Note that this could be slow for large SEXPs."
+Note that this could be slow for large SEXPs.
+Can also be called with two arguments, in which case they're taken to be
+the bounds of a region containing Lisp code to pretty-print."
   (require 'edebug)
+  (if end (pp--region sexp end #'pp-emacs-lisp-code)
   (let ((obuf (current-buffer)))
     (with-temp-buffer
       (emacs-lisp-mode)
@@ -234,7 +237,7 @@
       (indent-sexp)
       (while (re-search-forward " +$" nil t)
         (replace-match ""))
-      (insert-into-buffer obuf))))
+        (insert-into-buffer obuf)))))
 
 (defun pp--insert-lisp (sexp)
   (cl-case (type-of sexp)

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63861; Package emacs. (Tue, 13 Jun 2023 10:56:02 GMT) Full text and rfc822 format available.

Message #107 received at 63861 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 63861 <at> debbugs.gnu.org
Subject: Re: bug#63861: [PATCH] pp.el: New "pretty printing" code
Date: Tue, 13 Jun 2023 13:55:14 +0300
> From: Stefan Monnier <monnier <at> iro.umontreal.ca>
> Cc: 63861 <at> debbugs.gnu.org
> Date: Mon, 12 Jun 2023 16:21:50 -0400
> 
> > We document "pp" in "Output Functions".  Maybe there?
> 
> Haven't looked at that yet: I'm still trying to figure out how the
> functionality should be exposed.

Well, just don't forget that when you eventually install ;-)

> +(defcustom pp-default-function #'pp-fill
> +  ;; FIXME: The best pretty printer to use depends on the use-case
> +  ;; so maybe we should allow callers to specify what they want (maybe with
> +  ;; options like `fast', `compact', `code', `data', ...) and these
> +  ;; can then be mapped to actual pretty-printing algorithms.
> +  ;; Then again, callers can just directly call the corresponding function.
> +  "Function that `pp' should dispatch to for pretty printing.
> +That function can be called in one of two ways:
> +- with a single argument, which it should insert and pretty-print at point.
> +- with two arguments which delimit a region containing Lisp sexps
> +  which should be pretty-printed.
> +In both cases, the function can presume that the buffer is setup for
> +Lisp syntax."

Perhaps say in the doc string something about when each algorithm is
slower/faster?

Otherwise LGTM, thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63861; Package emacs. (Fri, 16 Jun 2023 18:28:01 GMT) Full text and rfc822 format available.

Message #110 received at 63861 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 63861 <at> debbugs.gnu.org
Subject: Re: bug#63861: [PATCH] pp.el: New "pretty printing" code
Date: Fri, 16 Jun 2023 14:26:54 -0400
[Message part 1 (text/plain, inline)]
> Otherwise LGTM, thanks.

OK, I think I have it almost ready see the patches below.
I just hit one snag when trying to fix the tests.

We have for example the following test:

    (ert-deftest pp-print-quote ()
      (should (string= (pp-to-string 'quote) "quote"))
      (should (string= (pp-to-string ''quote) "'quote"))
      (should (string= (pp-to-string '('a 'b)) "('a 'b)\n"))
      (should (string= (pp-to-string '(''quote 'quote)) "(''quote 'quote)\n"))

This is how the old code behaved, i.e. the output sometimes ends with \n
and sometimes not, depending on whether the object printed is a list or not.

Currently, my new code behaves the same when using `pp-28` or `pp-29`
but when using the new default (i.e. `pp-fill`) the output never ends in
\n.  This change was not intentional, but I think it makes sense because
it's more consistent.

I'm not completely sure how we should fix this.  I think the old
behavior of sometimes adding \n and sometimes not is not desirable, so
I think we should change it (a backward incompatible change).
We have two remaining choices:

A) never add \n
B) always add \n

AFAICT, in practice the old behavior resulted in a \n added in most
cases, so (B) should lead to less breakage, but OTOH I think (A) would
be cleaner since it's easier for callers to add a \n when needed than
for them to remove a \n.

WDYT?  A or B?


        Stefan


[0001-lisp-emacs-lisp-lisp-mode.el-lisp-ppss-Fix-performan.patch (text/x-diff, inline)]
From 31bc44c81386f8db2aecfe1529d051fed1367df9 Mon Sep 17 00:00:00 2001
From: Stefan Monnier <monnier <at> iro.umontreal.ca>
Date: Fri, 16 Jun 2023 13:14:27 -0400
Subject: [PATCH 1/5] * lisp/emacs-lisp/lisp-mode.el (lisp-ppss): Fix
 performance bug
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

(nth 2 ppss) can be absent but not incorrect, so don't recompute ppss
for (nth 2 ppss) when (nth 2 ppss) is already provided.
When calling `lisp-indent-line` on all the lines in a region, this
sometimes introduced a gratuitous O(N²) complexity.
---
 lisp/emacs-lisp/lisp-mode.el | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lisp/emacs-lisp/lisp-mode.el b/lisp/emacs-lisp/lisp-mode.el
index d44c9d6e23d..9914ededb85 100644
--- a/lisp/emacs-lisp/lisp-mode.el
+++ b/lisp/emacs-lisp/lisp-mode.el
@@ -876,7 +876,7 @@ lisp-ppss
 2 (counting from 0).  This is important for Lisp indentation."
   (unless pos (setq pos (point)))
   (let ((pss (syntax-ppss pos)))
-    (if (nth 9 pss)
+    (if (and (not (nth 2 pss)) (nth 9 pss))
         (let ((sexp-start (car (last (nth 9 pss)))))
           (parse-partial-sexp sexp-start pos nil nil (syntax-ppss sexp-start)))
       pss)))
-- 
2.39.2

[0002-pp.el-pp-default-function-New-custom-var.patch (text/x-diff, inline)]
From 16c8fa5e209e5d13f86e87a84a678608de0d5341 Mon Sep 17 00:00:00 2001
From: Stefan Monnier <monnier <at> iro.umontreal.ca>
Date: Fri, 16 Jun 2023 13:31:13 -0400
Subject: [PATCH 2/5] pp.el (pp-default-function): New custom var

* lisp/emacs-lisp/pp.el (pp-use-max-width): Make obsolete.
(pp-default-function): New custom var.
(pp--object, pp--region): New helper functions.
(pp-29): New function, extracted from `pp-to-string`.
(pp-to-string): Add `pp-function` arg and obey `pp-default-function`.
(pp-28): New function, extracted from `pp-buffer`.
(pp-buffer): Rewrite, using `pp` so it obeys `pp-default-function`.
(pp): Add new calling convention to apply it to a region,
and obey `pp-default-function`.
(pp-emacs-lisp-code): Add new calling convention to apply it to a region.
---
 lisp/emacs-lisp/pp.el | 198 ++++++++++++++++++++++++++++++------------
 1 file changed, 144 insertions(+), 54 deletions(-)

diff --git a/lisp/emacs-lisp/pp.el b/lisp/emacs-lisp/pp.el
index e6e3cd6c6f4..0798e46f735 100644
--- a/lisp/emacs-lisp/pp.el
+++ b/lisp/emacs-lisp/pp.el
@@ -52,53 +52,132 @@ pp-use-max-width
 large lists."
   :type 'boolean
   :version "29.1")
+(make-obsolete-variable 'pp-use-max-width 'pp-default-function "30.1")
+
+(defcustom pp-default-function #'pp-29
+  ;; FIXME: The best pretty printer to use depends on the use-case
+  ;; so maybe we should allow callers to specify what they want (maybe with
+  ;; options like `fast', `compact', `code', `data', ...) and these
+  ;; can then be mapped to actual pretty-printing algorithms.
+  ;; Then again, callers can just directly call the corresponding function.
+  "Function that `pp' should dispatch to for pretty printing.
+That function can be called in one of two ways:
+- with a single argument, which it should insert and pretty-print at point.
+- with two arguments which delimit a region containing Lisp sexps
+  which should be pretty-printed.
+In both cases, the function can presume that the buffer is setup for
+Lisp syntax."
+  :type '(choice
+          (const :tag "Emacs≤28 algorithm, fast and good enough" pp-28)
+          (const :tag "Work hard for code (slow on large inputs)"
+                 pp-emacs-lisp-code)
+          (const :tag "`pp-emacs-lisp-code' if `pp-use-max-width' else `pp-28'"
+                 pp-29)
+          function)
+  :version "30.1")
 
 (defvar pp--inhibit-function-formatting nil)
 
+;; There are basically two APIs for a pretty-printing function:
+;;
+;; - either the function takes an object (and prints it in addition to
+;;   prettifying it).
+;; - or the function takes a region containing an already printed object
+;;   and prettifies its content.
+;;
+;; `pp--object' and `pp--region' are helper functions to convert one
+;; API to the other.
+
+
+(defun pp--object (object region-function)
+  "Pretty-print OBJECT at point.
+The prettifying is done by REGION-FUNCTION which is
+called with two positions as arguments and should fold lines
+within that region.  Returns the result as a string."
+  (let ((print-escape-newlines pp-escape-newlines)
+        (print-quoted t)
+        (beg (point)))
+    ;; FIXME: In many cases it would be preferable to use `cl-prin1' here.
+    (prin1 object (current-buffer))
+    (funcall region-function beg (point))))
+
+(defun pp--region (beg end object-function)
+  "Pretty-print the object(s) contained within BEG..END.
+OBJECT-FUNCTION is called with a single object as argument
+and should pretty print it at point into the current buffer."
+  (save-excursion
+    (with-restriction beg end
+      (goto-char (point-min))
+      (while
+          (progn
+            ;; We'll throw away all the comments within objects, but let's
+            ;; try at least to preserve the comments between objects.
+            (forward-comment (point-max))
+            (let ((beg (point))
+                  (object (ignore-error end-of-buffer
+                              (list (read (current-buffer))))))
+              (when (consp object)
+                (delete-region beg (point))
+                (funcall object-function (car object))
+                t)))))))
+
+(defun pp-29 (beg-or-sexp &optional end) ;FIXME: Better name?
+  "Prettify the current region with printed representation of a Lisp object.
+Uses the pretty-printing algorithm that was standard in Emacs-29,
+which, depending on `pp-use-max-width', will either use `pp-28'
+or `pp-emacs-lisp-code'."
+  (if pp-use-max-width
+      (let ((pp--inhibit-function-formatting t)) ;FIXME: Why?
+        (pp-emacs-lisp-code beg-or-sexp end))
+    (pp-28 beg-or-sexp end)))
+
 ;;;###autoload
-(defun pp-to-string (object)
+(defun pp-to-string (object &optional pp-function)
   "Return a string containing the pretty-printed representation of OBJECT.
 OBJECT can be any Lisp object.  Quoting characters are used as needed
-to make output that `read' can handle, whenever this is possible."
-  (if pp-use-max-width
-      (let ((pp--inhibit-function-formatting t))
-        (with-temp-buffer
-          (pp-emacs-lisp-code object)
-          (buffer-string)))
-    (with-temp-buffer
-      (lisp-mode-variables nil)
-      (set-syntax-table emacs-lisp-mode-syntax-table)
-      (let ((print-escape-newlines pp-escape-newlines)
-            (print-quoted t))
-        (prin1 object (current-buffer)))
-      (pp-buffer)
-      (buffer-string))))
+to make output that `read' can handle, whenever this is possible.
+Optional argument PP-FUNCTION overrides `pp-default-function'."
+  (with-temp-buffer
+    (lisp-mode-variables nil)
+    (set-syntax-table emacs-lisp-mode-syntax-table)
+    (funcall (or pp-function pp-default-function) object)
+    (buffer-string)))
 
 ;;;###autoload
 (defun pp-buffer ()
   "Prettify the current buffer with printed representation of a Lisp object."
   (interactive)
-  (goto-char (point-min))
-  (while (not (eobp))
-    (cond
-     ((ignore-errors (down-list 1) t)
-      (save-excursion
-        (backward-char 1)
-        (skip-chars-backward "'`#^")
-        (when (and (not (bobp)) (memq (char-before) '(?\s ?\t ?\n)))
+  (pp (point-min) (point-max)))
+
+(defun pp-28 (beg &optional end)        ;FIXME: Better name?
+  "Prettify the current region with printed representation of a Lisp object.
+Uses the pretty-printing algorithm that was standard in Emacs≤29.
+Non-interactively can also be called with a single argument, in which
+case that argument will be inserted pretty-printed at point."
+  (interactive "r")
+  (if (null end) (pp--object beg #'pp-29)
+    (save-restriction beg end
+      (goto-char (point-min))
+      (while (not (eobp))
+        (cond
+         ((ignore-errors (down-list 1) t)
+          (save-excursion
+            (backward-char 1)
+            (skip-chars-backward "'`#^")
+            (when (and (not (bobp)) (memq (char-before) '(?\s ?\t ?\n)))
+              (delete-region
+               (point)
+               (progn (skip-chars-backward " \t\n") (point)))
+              (insert "\n"))))
+         ((ignore-errors (up-list 1) t)
+          (skip-syntax-forward ")")
           (delete-region
            (point)
-           (progn (skip-chars-backward " \t\n") (point)))
-          (insert "\n"))))
-     ((ignore-errors (up-list 1) t)
-      (skip-syntax-forward ")")
-      (delete-region
-       (point)
-       (progn (skip-chars-forward " \t\n") (point)))
-      (insert ?\n))
-     (t (goto-char (point-max)))))
-  (goto-char (point-min))
-  (indent-sexp))
+           (progn (skip-chars-forward " \t\n") (point)))
+          (insert ?\n))
+         (t (goto-char (point-max)))))
+      (goto-char (point-min))
+      (indent-sexp))))
 
 ;;;###autoload
 (defun pp (object &optional stream)
@@ -106,14 +185,22 @@ pp
 Quoting characters are printed as needed to make output that `read'
 can handle, whenever this is possible.
 
-This function does not apply special formatting rules for Emacs
-Lisp code.  See `pp-emacs-lisp-code' instead.
-
-By default, this function won't limit the line length of lists
-and vectors.  Bind `pp-use-max-width' to a non-nil value to do so.
-
-Output stream is STREAM, or value of `standard-output' (which see)."
-  (princ (pp-to-string object) (or stream standard-output)))
+Uses the pretty-printing code specified in `pp-default-function'.
+
+Output stream is STREAM, or value of `standard-output' (which see).
+An alternative calling convention is to pass it two buffer positions,
+in which case it will prettify that region's content."
+  (cond
+   ((and (integerp object) (integerp stream))
+    (funcall pp-default-function object stream))
+   ((and (eq (or stream standard-output) (current-buffer))
+         ;; Make sure the current buffer is setup sanely.
+         (eq (syntax-table) emacs-lisp-mode-syntax-table)
+         (eq indent-line-function #'lisp-indent-line))
+    ;; Skip the buffer->string->buffer middle man.
+    (funcall pp-default-function object))
+   (t
+    (princ (pp-to-string object) (or stream standard-output)))))
 
 ;;;###autoload
 (defun pp-display-expression (expression out-buffer-name &optional lisp)
@@ -220,21 +307,24 @@ pp-macroexpand-last-sexp
     (pp-macroexpand-expression (pp-last-sexp))))
 
 ;;;###autoload
-(defun pp-emacs-lisp-code (sexp)
+(defun pp-emacs-lisp-code (sexp &optional end)
   "Insert SEXP into the current buffer, formatted as Emacs Lisp code.
 Use the `pp-max-width' variable to control the desired line length.
-Note that this could be slow for large SEXPs."
+Note that this could be slow for large SEXPs.
+Can also be called with two arguments, in which case they're taken to be
+the bounds of a region containing Lisp code to pretty-print."
   (require 'edebug)
-  (let ((obuf (current-buffer)))
-    (with-temp-buffer
-      (emacs-lisp-mode)
-      (pp--insert-lisp sexp)
-      (insert "\n")
-      (goto-char (point-min))
-      (indent-sexp)
-      (while (re-search-forward " +$" nil t)
-        (replace-match ""))
-      (insert-into-buffer obuf))))
+  (if end (pp--region sexp end #'pp-emacs-lisp-code)
+    (let ((obuf (current-buffer)))
+      (with-temp-buffer
+        (emacs-lisp-mode)
+        (pp--insert-lisp sexp)
+        (insert "\n")
+        (goto-char (point-min))
+        (indent-sexp)
+        (while (re-search-forward " +$" nil t)
+          (replace-match ""))
+        (insert-into-buffer obuf)))))
 
 (defun pp--insert-lisp (sexp)
   (cl-case (type-of sexp)
-- 
2.39.2

[0003-pp.el-pp-buffer-Mark-as-obsolete.patch (text/x-diff, inline)]
From 2e10c9ef0b697fe55d6a5162312a217bc22133a1 Mon Sep 17 00:00:00 2001
From: Stefan Monnier <monnier <at> iro.umontreal.ca>
Date: Fri, 16 Jun 2023 13:21:15 -0400
Subject: [PATCH 3/5] pp.el (pp-buffer): Mark as obsolete

* lisp/emacs-lisp/pp.el (pp-buffer): Mark as obsolete

* lisp/org/org-table.el (org-table-fedit-lisp-indent):
* lisp/emacs-lisp/lisp-mode.el (indent-pp-sexp):
* lisp/emacs-lisp/backtrace.el (backtrace--multi-line):
* lisp/ielm.el (ielm-eval-input):
* lisp/help-fns.el (describe-variable): Use the new `pp` calling
convention instead of `pp-buffer`.
(help-fns-edit-variable): Use `pp` instead of `prin1` + `pp-buffer`.
 Use the new `pp`
---
 lisp/emacs-lisp/backtrace.el | 2 +-
 lisp/emacs-lisp/lisp-mode.el | 2 +-
 lisp/emacs-lisp/pp.el        | 1 +
 lisp/help-fns.el             | 9 ++++-----
 lisp/ielm.el                 | 2 +-
 lisp/org/org-table.el        | 5 ++++-
 6 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/lisp/emacs-lisp/backtrace.el b/lisp/emacs-lisp/backtrace.el
index 57912c854b0..81cfafa5738 100644
--- a/lisp/emacs-lisp/backtrace.el
+++ b/lisp/emacs-lisp/backtrace.el
@@ -556,7 +556,7 @@ backtrace-multi-line
 (defun backtrace--multi-line ()
   "Pretty print the current buffer, then remove the trailing newline."
   (set-syntax-table emacs-lisp-mode-syntax-table)
-  (pp-buffer)
+  (pp (point-min) (point-max))
   (goto-char (1- (point-max)))
   (delete-char 1))
 
diff --git a/lisp/emacs-lisp/lisp-mode.el b/lisp/emacs-lisp/lisp-mode.el
index 9914ededb85..83b374551fc 100644
--- a/lisp/emacs-lisp/lisp-mode.el
+++ b/lisp/emacs-lisp/lisp-mode.el
@@ -1418,7 +1418,7 @@ indent-pp-sexp
       (save-excursion
         (save-restriction
           (narrow-to-region (point) (progn (forward-sexp 1) (point)))
-          (pp-buffer)
+          (pp (point-min) (point-max))
           (goto-char (point-max))
           (if (eq (char-before) ?\n)
               (delete-char -1)))))
diff --git a/lisp/emacs-lisp/pp.el b/lisp/emacs-lisp/pp.el
index 0798e46f735..65325dea6f1 100644
--- a/lisp/emacs-lisp/pp.el
+++ b/lisp/emacs-lisp/pp.el
@@ -146,6 +146,7 @@ pp-to-string
 ;;;###autoload
 (defun pp-buffer ()
   "Prettify the current buffer with printed representation of a Lisp object."
+  (declare (obsolete pp "30"))
   (interactive)
   (pp (point-min) (point-max)))
 
diff --git a/lisp/help-fns.el b/lisp/help-fns.el
index b9388b45397..79a2b9a495b 100644
--- a/lisp/help-fns.el
+++ b/lisp/help-fns.el
@@ -1341,7 +1341,7 @@ describe-variable
                         (lisp-data-mode)
                         (set-syntax-table emacs-lisp-mode-syntax-table)
                         (insert print-rep)
-                        (pp-buffer)
+                        (pp (point-min) (point-max))
                         (font-lock-ensure)
                         (let ((pp-buffer (current-buffer)))
                           (with-current-buffer buf
@@ -1368,7 +1368,7 @@ describe-variable
 			(cl-prin1 origval))
                       (save-restriction
                         (narrow-to-region from (point))
-                        (save-excursion (pp-buffer)))
+                        (save-excursion (pp (point-min) (point-max))))
                       (help-fns--editable-variable from (point)
                                                    variable origval buffer)
 		      (if (< (point) (+ from 20))
@@ -1399,7 +1399,7 @@ describe-variable
                         (cl-prin1 global-val)
                         (save-restriction
                           (narrow-to-region from (point))
-                          (save-excursion (pp-buffer)))
+                          (save-excursion (pp (point-min) (point-max))))
 			;; See previous comment for this function.
 			;; (help-xref-on-pp from (point))
 			(if (< (point) (+ from 20))
@@ -1479,8 +1479,7 @@ help-fns-edit-variable
     (unless var
       (error "No variable under point"))
     (pop-to-buffer-same-window (format "*edit %s*" (nth 0 var)))
-    (prin1 (nth 1 var) (current-buffer))
-    (pp-buffer)
+    (pp (nth 1 var) (current-buffer))
     (goto-char (point-min))
     (help-fns--edit-value-mode)
     (insert (format ";; Edit the `%s' variable.\n" (nth 0 var))
diff --git a/lisp/ielm.el b/lisp/ielm.el
index 5c370733c05..f7984ea162c 100644
--- a/lisp/ielm.el
+++ b/lisp/ielm.el
@@ -436,7 +436,7 @@ ielm-eval-input
                                  ;; right buffer!
                                  (with-current-buffer ielmbuf
                                    (cl-prin1 result tmpbuf))
-                                 (pp-buffer)
+                                 (pp (point-min) (point-max))
                                  (concat (buffer-string) aux))))))
           (error
            (setq error-type "IELM Error")
diff --git a/lisp/org/org-table.el b/lisp/org/org-table.el
index 42f234790c5..ecd17c76ec2 100644
--- a/lisp/org/org-table.el
+++ b/lisp/org/org-table.el
@@ -3717,7 +3717,10 @@ org-table-fedit-lisp-indent
 	      (setq this-command nil)
 	      (while (re-search-forward "[ \t]*\n[ \t]*" nil t)
 		(replace-match " ")))
-	  (pp-buffer)
+	  (if (fboundp 'pp-buffer) ;Obsolete since Emacs-30
+	      (with-suppressed-warnings ((obsolete pp-buffer))
+	        (pp-buffer))
+	    (pp (point-min) (point-max)))
 	  (untabify (point-min) (point-max))
 	  (goto-char (1+ (point-min)))
 	  (while (re-search-forward "^." nil t)
-- 
2.39.2

[0004-pp.el-pp-fill-New-default-pp-function.patch (text/x-diff, inline)]
From cee9fb91a200afbaa9d3e7e52d8cd1533e150acc Mon Sep 17 00:00:00 2001
From: Stefan Monnier <monnier <at> iro.umontreal.ca>
Date: Fri, 16 Jun 2023 13:35:06 -0400
Subject: [PATCH 4/5] pp.el (pp-fill): New default pp function

* lisp/emacs-lisp/pp.el (pp-default-function): Change default.
(pp--within-fill-column-p): New helper function.
(pp-fill): New function.
---
 lisp/emacs-lisp/pp.el | 91 ++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 90 insertions(+), 1 deletion(-)

diff --git a/lisp/emacs-lisp/pp.el b/lisp/emacs-lisp/pp.el
index 65325dea6f1..2dc8f7cb65d 100644
--- a/lisp/emacs-lisp/pp.el
+++ b/lisp/emacs-lisp/pp.el
@@ -54,7 +54,7 @@ pp-use-max-width
   :version "29.1")
 (make-obsolete-variable 'pp-use-max-width 'pp-default-function "30.1")
 
-(defcustom pp-default-function #'pp-29
+(defcustom pp-default-function #'pp-fill
   ;; FIXME: The best pretty printer to use depends on the use-case
   ;; so maybe we should allow callers to specify what they want (maybe with
   ;; options like `fast', `compact', `code', `data', ...) and these
@@ -68,6 +68,7 @@ pp-default-function
 In both cases, the function can presume that the buffer is setup for
 Lisp syntax."
   :type '(choice
+          (const :tag "Fit within `fill-column'" pp-fill)
           (const :tag "Emacs≤28 algorithm, fast and good enough" pp-28)
           (const :tag "Work hard for code (slow on large inputs)"
                  pp-emacs-lisp-code)
@@ -143,6 +144,94 @@ pp-to-string
     (funcall (or pp-function pp-default-function) object)
     (buffer-string)))
 
+(defun pp--within-fill-column-p ()
+  "Return non-nil if point is within `fill-column'."
+  ;; Try and make it O(fill-column) rather than O(current-column),
+  ;; so as to avoid major slowdowns on long lines.
+  ;; FIXME: This doesn't account for invisible text or `display' properties :-(
+  (and (save-excursion
+         (re-search-backward
+          "^\\|\n" (max (point-min) (- (point) fill-column)) t))
+       (<= (current-column) fill-column)))
+
+(defun pp-fill (beg &optional end)
+  "Break lines in Lisp code between BEG and END so it fits within `fill-column'.
+Presumes the current buffer has syntax and indentation properly
+configured for that.
+Designed under the assumption that the region occupies a single line,
+tho it should also work if that's not the case.
+Can also be called with a single argument, in which case
+it inserts and pretty-prints that arg at point."
+  (interactive "r")
+  (if (null end) (pp--object beg #'pp-fill)
+    (goto-char beg)
+    (let ((end (copy-marker end t))
+          (newline (lambda ()
+                     (skip-chars-forward ")]}")
+                     (unless (save-excursion (skip-chars-forward " \t") (eolp))
+                       (insert "\n")
+                       (indent-according-to-mode)))))
+      (while (progn (forward-comment (point-max))
+                    (< (point) end))
+        (let ((beg (point))
+              ;; Whether we're in front of an element with paired delimiters.
+              ;; Can be something funky like #'(lambda ..) or ,'#s(...).
+              (paired (when (looking-at "['`,#]*[[:alpha:]]*\\([({[\"]\\)")
+                        (match-beginning 1))))
+          ;; Go to the end of the sexp.
+          (goto-char (or (scan-sexps (or paired (point)) 1) end))
+          (unless
+              (and
+               ;; The sexp is all on a single line.
+               (save-excursion (not (search-backward "\n" beg t)))
+               ;; And its end is within `fill-column'.
+               (or (pp--within-fill-column-p)
+                   ;; If the end of the sexp is beyond `fill-column',
+                   ;; try to move the sexp to its own line.
+                   (and
+                    (save-excursion
+                      (goto-char beg)
+                      (if (save-excursion (skip-chars-backward " \t({[',")
+                                          (bolp))
+                          ;; The sexp was already on its own line.
+                          nil
+                        (skip-chars-backward " \t")
+                        (setq beg (copy-marker beg t))
+                        (if paired (setq paired (copy-marker paired t)))
+                        ;; We could try to undo this insertion if it
+                        ;; doesn't reduce the indentation depth, but I'm
+                        ;; not sure it's worth the trouble.
+                        (insert "\n") (indent-according-to-mode)
+                        t))
+                    ;; Check again if we moved the whole exp to a new line.
+                    (pp--within-fill-column-p))))
+            ;; The sexp is spread over several lines, and/or its end is
+            ;; (still) beyond `fill-column'.
+            (when (and paired (not (eq ?\" (char-after paired))))
+              ;; The sexp has sub-parts, so let's try and spread
+              ;; them over several lines.
+              (save-excursion
+                (goto-char beg)
+                (when (looking-at "(\\([^][()\" \t\n;']+\\)")
+                  ;; Inside an expression of the form (SYM ARG1
+                  ;; ARG2 ... ARGn) where SYM has a `lisp-indent-function'
+                  ;; property that's a number, insert a newline after
+                  ;; the corresponding ARGi, because it tends to lead to
+                  ;; more natural and less indented code.
+                  (let* ((sym (intern-soft (match-string 1)))
+                         (lif (and sym (get sym 'lisp-indent-function))))
+                    (if (eq lif 'defun) (setq lif 2))
+                    (when (natnump lif)
+                      (goto-char (match-end 0))
+                      (forward-sexp lif)
+                      (funcall newline)))))
+              (save-excursion
+                (pp-fill (1+ paired) (1- (point)))))
+            ;; Now the sexp either ends beyond `fill-column' or is
+            ;; spread over several lines (or both).  Either way, the
+            ;; rest of the line should be moved to its own line.
+            (funcall newline)))))))
+
 ;;;###autoload
 (defun pp-buffer ()
   "Prettify the current buffer with printed representation of a Lisp object."
-- 
2.39.2

[0005-lispref-streams.texi-Document-new-PP-functionality.patch (text/x-diff, inline)]
From f55500ef4033c5919783f391c079b9e5ec61fc0c Mon Sep 17 00:00:00 2001
From: Stefan Monnier <monnier <at> iro.umontreal.ca>
Date: Fri, 16 Jun 2023 13:35:36 -0400
Subject: [PATCH 5/5] lispref/streams.texi: Document new PP functionality

* doc/lispref/streams.texi (Output Functions): Document new `pp`
calling convention.
(Output Variables): Document `pp-default-function`.
---
 doc/lispref/streams.texi | 24 ++++++++++++++++++++----
 etc/NEWS                 | 10 ++++++++++
 2 files changed, 30 insertions(+), 4 deletions(-)

diff --git a/doc/lispref/streams.texi b/doc/lispref/streams.texi
index 89046a68249..2eb71b83f9c 100644
--- a/doc/lispref/streams.texi
+++ b/doc/lispref/streams.texi
@@ -755,10 +755,17 @@ Output Functions
 @end defmac
 
 @cindex pretty-printer
-@defun pp object &optional stream
-This function outputs @var{object} to @var{stream}, just like
-@code{prin1}, but does it in a prettier way.  That is, it'll
-indent and fill the object to make it more readable for humans.
+@defun pp object-or-beg &optional stream-or-end
+This function indents and fills the printed representation of an
+object (typically representing ELisp code) to make it more readable
+for humans.
+
+It accepts two calling conventions: if called with two integers
+@var{beg} and @var{end}, it indents and fills the corresponding
+region, presumably containing the printed representation of one or
+more objects, otherwise it takes a @var{object} and an optional
+@var{stream}, and prints @var{object} like @code{prin1}, but does it
+in a prettier way.
 @end defun
 
 If you need to use binary I/O in batch mode, e.g., use the functions
@@ -981,6 +988,15 @@ Output Variables
 having their own escape syntax such as newline.
 @end defvar
 
+@defopt pp-default-function
+This user variable specifies the function used by @code{pp} to prettify
+its output.  By default it uses @code{pp-fill} which attempts to
+strike a good balance between speed and generating natural looking output
+that fits within @code{fill-column}.  The previous default was
+@code{pp-28}, which tends to be faster but generate output that looks
+less natural and is less compact.
+@end defopt
+
 @node Output Overrides
 @section Overriding Output Variables
 @cindex overrides, in output functions
diff --git a/etc/NEWS b/etc/NEWS
index 61e6e161665..7aa387b3a5c 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -396,6 +396,16 @@ name as a string.  The new function
 'dictionary-completing-read-dictionary' can be used to prompt with
 completion based on dictionaries that the server supports.
 
+** Pp
+*** New 'pp-default-function' custom variable replaces 'pp-use-max-width'.
+
+*** New default pretty printing function, which tries to obey 'fill-column'.
+
+*** 'pp' can be applied to a region rather than an object.
+As a consequence, 'pp-buffer' is now declared obsolete.
+
+*** 'pp-to-string' takes an additional 'pp-function' argument.
+This arg specifies the prettifying algorithm to use.
 
 * New Modes and Packages in Emacs 30.1
 
-- 
2.39.2


Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63861; Package emacs. (Sat, 17 Jun 2023 05:40:01 GMT) Full text and rfc822 format available.

Message #113 received at 63861 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 63861 <at> debbugs.gnu.org
Subject: Re: bug#63861: [PATCH] pp.el: New "pretty printing" code
Date: Sat, 17 Jun 2023 08:39:07 +0300
> From: Stefan Monnier <monnier <at> iro.umontreal.ca>
> Cc: 63861 <at> debbugs.gnu.org
> Date: Fri, 16 Jun 2023 14:26:54 -0400
> 
> +(defun pp-28 (beg &optional end)        ;FIXME: Better name?
> +  "Prettify the current region with printed representation of a Lisp object.
> +Uses the pretty-printing algorithm that was standard in Emacs≤29.
                                                           ^^^^^^^^
Please avoid non-ASCII characters in doc strings: they could produce
display problems on less-than-capable terminals.

> +@defun pp object-or-beg &optional stream-or-end
> +This function indents and fills the printed representation of an
> +object (typically representing ELisp code) to make it more readable
> +for humans.
> +
> +It accepts two calling conventions: if called with two integers
> +@var{beg} and @var{end}, it indents and fills the corresponding
> +region, presumably containing the printed representation of one or
> +more objects, otherwise it takes a @var{object} and an optional
> +@var{stream}, and prints @var{object} like @code{prin1}, but does it
> +in a prettier way.

This text references arguments like @var{object} that are named
differently in the @defun line.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63861; Package emacs. (Sat, 17 Jun 2023 16:15:01 GMT) Full text and rfc822 format available.

Message #116 received at 63861 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 63861 <at> debbugs.gnu.org
Subject: Re: bug#63861: [PATCH] pp.el: New "pretty printing" code
Date: Sat, 17 Jun 2023 12:13:54 -0400
>> From: Stefan Monnier <monnier <at> iro.umontreal.ca>
>> Cc: 63861 <at> debbugs.gnu.org
>> Date: Fri, 16 Jun 2023 14:26:54 -0400
>> 
>> +(defun pp-28 (beg &optional end)        ;FIXME: Better name?
>> +  "Prettify the current region with printed representation of a Lisp object.
>> +Uses the pretty-printing algorithm that was standard in Emacs≤29.
>                                                            ^^^^^^^^
> Please avoid non-ASCII characters in doc strings: they could produce
> display problems on less-than-capable terminals.

OK

>> +@defun pp object-or-beg &optional stream-or-end
>> +This function indents and fills the printed representation of an
>> +object (typically representing ELisp code) to make it more readable
>> +for humans.
>> +
>> +It accepts two calling conventions: if called with two integers
>> +@var{beg} and @var{end}, it indents and fills the corresponding
>> +region, presumably containing the printed representation of one or
>> +more objects, otherwise it takes a @var{object} and an optional
>> +@var{stream}, and prints @var{object} like @code{prin1}, but does it
>> +in a prettier way.
>
> This text references arguments like @var{object} that are named
> differently in the @defun line.

Indeed.  Assuming you understood what I meant to say, do you have
a recommendation of how to write it?

Or maybe, I should leave `pp` alone and add a new `pp-region`
function instead instead of combining two different calling conventions
on a single function?  The reason I've done that is because it was
difficult to avoid doing it for the "backend functions" (those that can
be put on `pp-default-function`), but admittedly, that doesn't have to
carry over to `pp`.


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63861; Package emacs. (Sat, 17 Jun 2023 16:43:02 GMT) Full text and rfc822 format available.

Message #119 received at 63861 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 63861 <at> debbugs.gnu.org
Subject: Re: bug#63861: [PATCH] pp.el: New "pretty printing" code
Date: Sat, 17 Jun 2023 19:42:44 +0300
> From: Stefan Monnier <monnier <at> iro.umontreal.ca>
> Cc: 63861 <at> debbugs.gnu.org
> Date: Sat, 17 Jun 2023 12:13:54 -0400
> 
> >> +@defun pp object-or-beg &optional stream-or-end
> >> +This function indents and fills the printed representation of an
> >> +object (typically representing ELisp code) to make it more readable
> >> +for humans.
> >> +
> >> +It accepts two calling conventions: if called with two integers
> >> +@var{beg} and @var{end}, it indents and fills the corresponding
> >> +region, presumably containing the printed representation of one or
> >> +more objects, otherwise it takes a @var{object} and an optional
> >> +@var{stream}, and prints @var{object} like @code{prin1}, but does it
> >> +in a prettier way.
> >
> > This text references arguments like @var{object} that are named
> > differently in the @defun line.
> 
> Indeed.  Assuming you understood what I meant to say, do you have
> a recommendation of how to write it?

Something like this:

  The function can be called to pretty-print either a region of text,
  presumably containing the printed representation of one or more
  objects, or to pretty-print an object.  In the first case, the
  function must be called with 2 arguments, @var{object-or-beg} and
  @var{stream-or-end}, which are integer buffer positions that define
  the region; in the second case, @var{object-or-beg} is the object to
  print and @var{stream-or-end} is the stream to which to print it,
  which defaults to @code{standard-output} if nil or omitted.

> Or maybe, I should leave `pp` alone and add a new `pp-region`
> function instead instead of combining two different calling conventions
> on a single function?

That might be better, indeed, both for documentation and for usage.




Reply sent to Stefan Monnier <monnier <at> iro.umontreal.ca>:
You have taken responsibility. (Sat, 17 Jun 2023 22:09:02 GMT) Full text and rfc822 format available.

Notification sent to Stefan Monnier <monnier <at> iro.umontreal.ca>:
bug acknowledged by developer. (Sat, 17 Jun 2023 22:09:02 GMT) Full text and rfc822 format available.

Message #124 received at 63861-done <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 63861-done <at> debbugs.gnu.org
Subject: Re: bug#63861: [PATCH] pp.el: New "pretty printing" code
Date: Sat, 17 Jun 2023 18:08:33 -0400
>> Or maybe, I should leave `pp` alone and add a new `pp-region`
>> function instead instead of combining two different calling conventions
>> on a single function?
>
> That might be better, indeed, both for documentation and for usage.

Done.  That also leads to fewer visible changes, so it's good (I just
kept `pp-buffer` instead of adding `pp-region`).

I pushed the result (after fixing the test problems with option B).
Thanks,


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#63861; Package emacs. (Tue, 20 Jun 2023 20:57:01 GMT) Full text and rfc822 format available.

Message #127 received at 63861 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Thierry Volpiatto <thievol <at> posteo.net>
Cc: 63861 <at> debbugs.gnu.org
Subject: Re: bug#63861: [PATCH] pp.el: New "pretty printing" code
Date: Tue, 20 Jun 2023 16:56:33 -0400
> So, by file, from fastest to slowest:
>
>     foo.el (0.859482743 0 0.0) (pp-buffer) t
>     foo.el (0.890402623 0 0.0) (pp-buffer) nil
>     foo.el (4.62344853 4 1.7225397670000002) (tv/pp-region (point-min) (point-max)) t
>     foo.el (4.687414465 4 1.7116580980000002) (tv/pp-region (point-min) (point-max)) nil
>     foo.el (7.932661181 1 0.3435169600000001) (pp-region (point-min) (point-max)) t
>     foo.el (196.183345212 1 0.618591124) (pp-region (point-min) (point-max)) nil
>     foo.el (2997.739238575 505 105.82851685700001) (let ((s (read (current-buffer)))) (erase-buffer) (pp-emacs-lisp-code s)) t
[...]
> We also see that `pp-emacs-lisp-code` is *much* slower.  I don't include
> other results for this function in this email because they're still
> running :-)

OK, they're done running (well, I had to re-run them because of a power
failure in between).  The tests failed on `test-load-history.el` with:

    Error: wrong-type-argument (number-or-marker-p cl--defmethod-doc-pos)

so it looks like it tickles a bug somewhere in `pp-emacs-lisp-code`.
As for the performance:

    "foo.el" (3207.572643287 505 111.459754959) (... (pp-emacs-lisp-code s)) t
    "foo.el" (121171.97145393 692 103.67438615900001) (... (pp-emacs-lisp-code s)) nil
    "test-bookmark-alist.el" (102462.563603419 5456 921.614736375) (... (pp-emacs-lisp-code s)) t
    "test-bookmark-alist.el" (191188.84323175802 7493 847.82675889) (... (pp-emacs-lisp-code s)) nil

So the `lisp-ppss` patch speeds up `pp-emacs-lisp-code` by a factor 37x
on `foo.el` and a factor a bit less than 2x for `test-bookmark-alist.el`.

We also see that `pp-emacs-lisp-code` (with the `lisp-ppss` patch) is
more than 300x slower than the new `pp-fill` code on `foo.el` and more
than 3000x slower than the new `pp-fill` code on
`test-bookmark-alist.el`.

Admittedly, these are not cases for which that code was designed (these
files hold data rather than code).


        Stefan





bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Wed, 19 Jul 2023 11:24:09 GMT) Full text and rfc822 format available.

This bug report was last modified 282 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.