GNU bug report logs - #67390
28; shorthands-font-lock-shorthands assumes shorthand uses same separator

Previous Next

Package: emacs;

Reported by: Jonas Bernoulli <jonas <at> bernoul.li>

Date: Wed, 22 Nov 2023 22:19:01 UTC

Severity: normal

To reply to this bug, email your comments to 67390 AT debbugs.gnu.org.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#67390; Package emacs. (Wed, 22 Nov 2023 22:19:01 GMT) Full text and rfc822 format available.

Acknowledgement sent to Jonas Bernoulli <jonas <at> bernoul.li>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Wed, 22 Nov 2023 22:19:01 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Jonas Bernoulli <jonas <at> bernoul.li>
To: bug-gnu-emacs <at> gnu.org
Cc: Joseph Turner <joseph <at> ushin.org>, Adam Porter <adam <at> alphapapa.net>,
 João Távora <joaotavora <at> gmail.com>
Subject: 28; shorthands-font-lock-shorthands assumes shorthand uses same
 separator
Date: Wed, 22 Nov 2023 23:18:18 +0100
Hello,

`shorthands-font-lock-shorthands' assumes that the shorthand uses the
same separator as the real symbol name, e.g., "s-" instead of "long-".
This causes an additional character to be highlighted when an
alternative separator is used, e.g., when using "s/" instead of "long-".

`shorthands--mismatch-from-end' returns the length of the common suffix,
of the shorthand and the long name.

  (shorthands--mismatch-from-end "s-tail" "long-tail") => 5

  "s-" is highlighted in this case.

Since `shorthands-font-lock-shorthands' also wants to highlight the
separator, "-" in this case, it corrects this off-by-one, but when
different separators are in use, this causes an additional character
to be highlighted

  (shorthands--mismatch-from-end "s/tail" "long-tail") => 4

  "s/t" is highlighted in this case.

Could we add support for using an alternative separator in shorthands?

IMO it is okay to use another separator for *shorthand* prefixes.  They
won't show up in M-x completion candidates, for example.  Shorthands are
confined to the file where they are being used, and I think authors
should be free to use whatever prefix they fancy.

     Cheers,
     Jonas




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#67390; Package emacs. (Thu, 23 Nov 2023 12:55:02 GMT) Full text and rfc822 format available.

Message #8 received at submit <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Jonas Bernoulli <jonas <at> bernoul.li>
Cc: Joseph Turner <joseph <at> ushin.org>, Adam Porter <adam <at> alphapapa.net>,
 bug-gnu-emacs <at> gnu.org
Subject: Re: 28;
 shorthands-font-lock-shorthands assumes shorthand uses same separator
Date: Thu, 23 Nov 2023 12:57:31 +0000
On Wed, Nov 22, 2023 at 10:18 PM Jonas Bernoulli <jonas <at> bernoul.li> wrote:

> Could we add support for using an alternative separator in shorthands?

I think so, if you can find a patch for it.  This only affects the font-locking
bits of shorthands right?  IOW all other shorthand-aware functionality like
eldoc, M-., etc, already works with different separators, right?

> IMO it is okay to use another separator for *shorthand* prefixes.  They
> won't show up in M-x completion candidates, for example.  Shorthands are
> confined to the file where they are being used, and I think authors
> should be free to use whatever prefix they fancy.

I agree in principle, and exactly for that reason (that shorthands are confined
to a file).

João




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#67390; Package emacs. (Fri, 24 Nov 2023 21:53:02 GMT) Full text and rfc822 format available.

Message #11 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Jonas Bernoulli <jonas <at> bernoul.li>
To: João Távora <joaotavora <at> gmail.com>
Cc: Joseph Turner <joseph <at> ushin.org>, Adam Porter <adam <at> alphapapa.net>,
 bug-gnu-emacs <at> gnu.org
Subject: Re: 28; shorthands-font-lock-shorthands assumes shorthand uses same
 separator
Date: Fri, 24 Nov 2023 22:51:32 +0100
João Távora <joaotavora <at> gmail.com> writes:

> On Wed, Nov 22, 2023 at 10:18 PM Jonas Bernoulli <jonas <at> bernoul.li> wrote:
>
>> Could we add support for using an alternative separator in shorthands?
>
> I think so, if you can find a patch for it.  This only affects the font-locking
> bits of shorthands right?  IOW all other shorthand-aware functionality like
> eldoc, M-., etc, already works with different separators, right?

I haven't found any issues beside this off-by-one font-lock issue.

So far I have used this beauty:

diff --git a/lisp/emacs-lisp/shorthands.el b/lisp/emacs-lisp/shorthands.el
@@ -57,7 +57,7 @@ shorthands--mismatch-from-end
            for i from 1
            for i1 = (- l1 i) for i2 = (- l2 i)
            while (and (>= i1 0) (>= i2 0) (eq (aref str1 i1) (aref str2 i2)))
-           finally (return (1- i))))
+           finally (return (if (eq (aref str2 (1+ i2)) ?-) (1- i) i))))

Is that good enough?  Depending on how you look at it, this changes what
is being returned, but IMO this function is a bit murky to begin with.

The function name is `shorthands--mismatch-from-end', but it returns the
length of the common suffix, minus one, to account for the separator.
This change ensures that the separator is accounted for, even if it
differs between the shorthand and real symbol.

Since this function returns the length of the *matching* suffix after
the prefixes (including separator), I find it weird that its name
contains *MISmatch*.

It might make more sense to return the length of the shorthand prefix.

Also, have you considered throwing in a
  (not (string-equal (match-string 1) sname))

to avoid having to call `shorthands--mismatch-from-end' at all?
Maybe you have, but concluded it is cheaper to do a bit too much work
for non-shorthands, than to effectively repeat some work for shorthands.

     Jonas




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#67390; Package emacs. (Sat, 25 Nov 2023 00:04:02 GMT) Full text and rfc822 format available.

Message #14 received at submit <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Jonas Bernoulli <jonas <at> bernoul.li>
Cc: Joseph Turner <joseph <at> ushin.org>, Adam Porter <adam <at> alphapapa.net>,
 bug-gnu-emacs <at> gnu.org
Subject: Re: 28;
 shorthands-font-lock-shorthands assumes shorthand uses same separator
Date: Sat, 25 Nov 2023 00:03:10 +0000
On Fri, Nov 24, 2023 at 9:51 PM Jonas Bernoulli <jonas <at> bernoul.li> wrote:

> I haven't found any issues beside this off-by-one font-lock issue.

Good.

> So far I have used this beauty:
>
> diff --git a/lisp/emacs-lisp/shorthands.el b/lisp/emacs-lisp/shorthands.el
> @@ -57,7 +57,7 @@ shorthands--mismatch-from-end
>             for i from 1
>             for i1 = (- l1 i) for i2 = (- l2 i)
>             while (and (>= i1 0) (>= i2 0) (eq (aref str1 i1) (aref str2 i2)))
> -           finally (return (1- i))))
> +           finally (return (if (eq (aref str2 (1+ i2)) ?-) (1- i) i))))
>
> Is that good enough?  Depending on how you look at it, this changes what
> is being returned, but IMO this function is a bit murky to begin with.

A bit dodgy, no :-)  Maybe a docstring would shed some light
on what this function is supposed to do, so I'll propose one below.

> The function name is `shorthands--mismatch-from-end', but it returns the
> length of the common suffix, minus one, to account for the separator.
> This change ensures that the separator is accounted for, even if it
> differs between the shorthand and real symbol.
>
> Since this function returns the length of the *matching* suffix after
> the prefixes (including separator), I find it weird that its name
> contains *MISmatch*.

Probably I wanted to emulate what CL's MISMATCH does to some degree.
cl-mismatch exists in Emacs, but it's not available in shorthands.el,
and it seems to be buggy anyway.

You can rename the function shorthands--common-suffix-length if
you want.  I probably meant it to be separator-agnostic function,
and be replaceable by a cl-mismatch some time around 2084,

Now to your problem: I think what you want is to customize element
comparison (CL's MISMATCH allows that, btw).  Here's one way.

diff --git a/lisp/emacs-lisp/shorthands.el b/lisp/emacs-lisp/shorthands.el
index 82200ab88e9..36a862bc037 100644
--- a/lisp/emacs-lisp/shorthands.el
+++ b/lisp/emacs-lisp/shorthands.el
@@ -52,11 +52,18 @@ elisp-shorthand-font-lock-face
   :version "28.1"
   :group 'font-lock-faces)

-(defun shorthands--mismatch-from-end (str1 str2)
-  (cl-loop with l1 = (length str1) with l2 = (length str2)
+(defun shorthands--mismatch-from-end (str1 str2 &optional test)
+  "Compute position of first mismatching element of STR1 and STR2, from end.
+The return value is the reverse index of that element.  If 0, the
+last characters of STR1 and STR2 differ, if 1, the penultimate
+characters differ, and so on.  If optional TEST is passed,
+compare elements using that function, else compare with `eq'."
+  (cl-loop with test = (or test #'eq)
+           with l1 = (length str1) with l2 = (length str2)
            for i from 1
            for i1 = (- l1 i) for i2 = (- l2 i)
-           while (and (>= i1 0) (>= i2 0) (eq (aref str1 i1) (aref str2 i2)))
+           while (and (>= i1 0) (>= i2 0)
+                      (funcall test (aref str1 i1) (aref str2 i2)))
            finally (return (1- i))))

 (defun shorthands-font-lock-shorthands (limit)

And now using the following lambda for TEST yields what you want:

(shorthands--mismatch-from-end "s-tail" "long-tail") ;; => 5

(shorthands--mismatch-from-end "s/tail" "long-tail"
                               (lambda (c1 c2)
                                 (or (and (eq c1 ?/) (eq c2 ?-))
                                     (eq c1 c2)))) ;; => also 5

Of course, you can hardcode this test inside the function, no need
for a parameter, and rename the function to whatever you find more
appropriate.  This allows us to keep control over what things
are considered acceptable separators from a font-locking perspective

> It might make more sense to return the length of the shorthand prefix.

I've been away from this code for a couple of years, so feel free to
propose more extensive changes.  As long as they don't increase the
complexity and are suitably tested, I have nothing against.

> Also, have you considered throwing in a
>   (not (string-equal (match-string 1) sname))
>
> to avoid having to call `shorthands--mismatch-from-end' at all?

Where and how this be thrown in?  How would you know what to highlight
in the shorthand printed form?  There can be many shorthand prefixes
in a given file.  But do show some code.

> Maybe you have, but concluded it is cheaper to do a bit too much work
> for non-shorthands, than to effectively repeat some work for shorthands.

Maybe.  Not sure this is more work (string-equal must still compare
every character right?), but, in summary, I'm not married to this
implementation.  I somewhat appreciate that I could still read it
after not having looked  at it for a couple years, but feel free to
change it.

João




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#67390; Package emacs. (Sat, 25 Nov 2023 03:31:01 GMT) Full text and rfc822 format available.

Message #17 received at 67390 <at> debbugs.gnu.org (full text, mbox):

From: Joseph Turner <joseph <at> ushin.org>
To: João Távora <joaotavora <at> gmail.com>
Cc: 67390 <at> debbugs.gnu.org, Adam Porter <adam <at> alphapapa.net>,
 Jonas Bernoulli <jonas <at> bernoul.li>
Subject: Re: 28; shorthands-font-lock-shorthands assumes shorthand uses same
 separator
Date: Fri, 24 Nov 2023 19:26:54 -0800
[Message part 1 (text/plain, inline)]
João Távora <joaotavora <at> gmail.com> writes:

> On Fri, Nov 24, 2023 at 9:51 PM Jonas Bernoulli <jonas <at> bernoul.li> wrote:
>
>> Also, have you considered throwing in a
>>   (not (string-equal (match-string 1) sname))
>>
>> to avoid having to call `shorthands--mismatch-from-end' at all?
>
> Where and how this be thrown in?  How would you know what to highlight
> in the shorthand printed form?  There can be many shorthand prefixes
> in a given file.  But do show some code.

Please see the attached patch, inspired by Jonas's comment.

>> Maybe you have, but concluded it is cheaper to do a bit too much work
>> for non-shorthands, than to effectively repeat some work for shorthands.
>
> Maybe.  Not sure this is more work (string-equal must still compare
> every character right?), but, in summary, I'm not married to this
> implementation.  I somewhat appreciate that I could still read it
> after not having looked  at it for a couple years, but feel free to
> change it.

I haven't done any benchmarking - I'm curious to learn how to benchmark
font lock though!

Thanks!!

Joseph

[0001-Support-shorthand-prefixes-besides.patch (text/x-diff, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#67390; Package emacs. (Sat, 25 Nov 2023 16:03:02 GMT) Full text and rfc822 format available.

Message #20 received at 67390 <at> debbugs.gnu.org (full text, mbox):

From: Jonas Bernoulli <jonas <at> bernoul.li>
To: Joseph Turner <joseph <at> ushin.org>, João Távora
 <joaotavora <at> gmail.com>
Cc: 67390 <at> debbugs.gnu.org, Adam Porter <adam <at> alphapapa.net>
Subject: Re: 28; shorthands-font-lock-shorthands assumes shorthand uses same
 separator
Date: Sat, 25 Nov 2023 17:01:51 +0100
Joseph Turner <joseph <at> ushin.org> writes:

> +                          (car (shorthands--find-if
> +                                (lambda (short)
> +                                  (string-prefix-p short (match-string 1)))
> +                                read-symbol-shorthands #'car)))))

Or simply:
  (car (assoc (match-string 1)
              read-symbol-shorthands
              #'string-prefix-p))




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#67390; Package emacs. (Sat, 25 Nov 2023 22:44:01 GMT) Full text and rfc822 format available.

Message #23 received at 67390 <at> debbugs.gnu.org (full text, mbox):

From: Joseph Turner <joseph <at> ushin.org>
To: Jonas Bernoulli <jonas <at> bernoul.li>
Cc: 67390 <at> debbugs.gnu.org, Adam Porter <adam <at> alphapapa.net>,
 João Távora <joaotavora <at> gmail.com>
Subject: Re: 28; shorthands-font-lock-shorthands assumes shorthand uses same
 separator
Date: Sat, 25 Nov 2023 14:42:48 -0800
[Message part 1 (text/plain, inline)]
Jonas Bernoulli <jonas <at> bernoul.li> writes:

> Joseph Turner <joseph <at> ushin.org> writes:
>
>> +                          (car (shorthands--find-if
>> +                                (lambda (short)
>> +                                  (string-prefix-p short (match-string 1)))
>> +                                read-symbol-shorthands #'car)))))
>
> Or simply:
>   (car (assoc (match-string 1)
>               read-symbol-shorthands
>               #'string-prefix-p))

Much nicer - see patch.  Thanks, Jonas!

[0001-Support-shorthand-prefixes-besides.patch (text/x-diff, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#67390; Package emacs. (Sun, 26 Nov 2023 13:53:02 GMT) Full text and rfc822 format available.

Message #26 received at 67390 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Joseph Turner <joseph <at> ushin.org>
Cc: 67390 <at> debbugs.gnu.org, Adam Porter <adam <at> alphapapa.net>,
 Jonas Bernoulli <jonas <at> bernoul.li>
Subject: Re: 28;
 shorthands-font-lock-shorthands assumes shorthand uses same separator
Date: Sun, 26 Nov 2023 13:52:30 +0000
On Sat, Nov 25, 2023 at 10:43 PM Joseph Turner <joseph <at> ushin.org> wrote:
>
> Jonas Bernoulli <jonas <at> bernoul.li> writes:
>
> > Joseph Turner <joseph <at> ushin.org> writes:
> >
> >> +                          (car (shorthands--find-if
> >> +                                (lambda (short)
> >> +                                  (string-prefix-p short (match-string 1)))
> >> +                                read-symbol-shorthands #'car)))))
> >
> > Or simply:
> >   (car (assoc (match-string 1)
> >               read-symbol-shorthands
> >               #'string-prefix-p))
>
> Much nicer - see patch.  Thanks, Jonas!

So, I had a look at this patch and I think we should compare it
with the patch after my sig, which keeps 'shorthands--mismatch-from-end'
and also fixes this bug.

The main difference I see is that my patch keeps doing one string
comparison, via the mismatch function (which btw is now perfectly
analogous to CL mismatch and thus correctly named).  In the worst case,
Josheph's patch does 1 + N where N is the number of shorthands.  So
this is a fundamental complexity change.

Normally, that would be the end of the story, but here, it isn't.
For two reasons.

My version keeps a behaviour that can be considered buggy.
If a shorthand prefix has a common suffix with the longhand prefix
then that suffix will not be highlighted. Like:

;; Local Variables:
;; read-symbol-shorthands: (("bcrumb-" . "breadcrumb-")
;; End:

Here only "b" would be highlighted, effectively showing the user
how much typing was saved.  Is this wrong?  Does it makes sense
to use shorthands like this?

The other reason why this isn't the end of the story is that even
if we take that bug for granted, the string comparison functions in
Joshep's patch delegate to built-in C  comparison operators, which are
often much, much faster than Elisp.  At least before the advent of native
compilation, it used to be like this. Of course for a large enough  N
number of shorthands, my version wins, but that is probably not very
common either (or is it?  Not very hard to imagine a file making use
of many libraries and shorthanding each of them?)

So, benchmarking it will have to be, I'm afraid, because AFAIK
font-locking is a very performance sensitive area of Emacs.

In the meantime I will push my patch, but keep the bug open to see
if it is worth pushing Joseph's version.

João


diff --git a/lisp/emacs-lisp/shorthands.el b/lisp/emacs-lisp/shorthands.el
index 82200ab88e9..b0665a55695 100644
--- a/lisp/emacs-lisp/shorthands.el
+++ b/lisp/emacs-lisp/shorthands.el
@@ -53,11 +53,16 @@ elisp-shorthand-font-lock-face
   :group 'font-lock-faces)

 (defun shorthands--mismatch-from-end (str1 str2)
+  "Tell index of first mismatch in STR1 and STR2, from end.
+The index is a valid 0-based index on STR1.  Returns nil if STR1
+equals STR2.  Return 0 if STR1 is a suffix of STR2."
   (cl-loop with l1 = (length str1) with l2 = (length str2)
            for i from 1
            for i1 = (- l1 i) for i2 = (- l2 i)
-           while (and (>= i1 0) (>= i2 0) (eq (aref str1 i1) (aref str2 i2)))
-           finally (return (1- i))))
+           while (eq (aref str1 i1) (aref str2 i2))
+           if (zerop i2) return (if (zerop i1) nil i1)
+           if (zerop i1) return 0
+           finally (return i1)))

 (defun shorthands-font-lock-shorthands (limit)
   (when read-symbol-shorthands
@@ -69,10 +74,16 @@ shorthands-font-lock-shorthands
                                                font-lock-string-face)))
                          (intern-soft (match-string 1))))
              (sname (and probe (symbol-name probe)))
-             (mm (and sname (shorthands--mismatch-from-end
-                             (match-string 1) sname))))
-        (unless (or (null mm) (= mm (length sname)))
-          (add-face-text-property (match-beginning 1) (1+ (- (match-end 1) mm))
+             (mismatch (and sname (shorthands--mismatch-from-end
+                                   (match-string 1) sname)))
+             (guess (and mismatch (1+ mismatch))))
+        (when guess
+          (when (and (< guess (1- (length (match-string 1))))
+                     ;; In bug#67390 we allow other separators
+                     (eq (char-syntax (aref (match-string 1) guess)) ?_))
+            (setq guess (1+ guess)))
+          (add-face-text-property (match-beginning 1)
+                                  (+ (match-beginning 1) guess)
                                   'elisp-shorthand-font-lock-face))))))

 (font-lock-add-keywords 'emacs-lisp-mode
'((shorthands-font-lock-shorthands)) t)




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#67390; Package emacs. (Sun, 26 Nov 2023 20:39:02 GMT) Full text and rfc822 format available.

Message #29 received at 67390 <at> debbugs.gnu.org (full text, mbox):

From: Joseph Turner <joseph <at> ushin.org>
To: João Távora <joaotavora <at> gmail.com>
Cc: 67390 <at> debbugs.gnu.org, Adam Porter <adam <at> alphapapa.net>,
 Jonas Bernoulli <jonas <at> bernoul.li>
Subject: Re: 28; shorthands-font-lock-shorthands assumes shorthand uses same
 separator
Date: Sun, 26 Nov 2023 12:35:32 -0800
João Távora <joaotavora <at> gmail.com> writes:

> On Sat, Nov 25, 2023 at 10:43 PM Joseph Turner <joseph <at> ushin.org> wrote:
>>
>> Jonas Bernoulli <jonas <at> bernoul.li> writes:
>>
>> > Joseph Turner <joseph <at> ushin.org> writes:
>> >
>> >> +                          (car (shorthands--find-if
>> >> +                                (lambda (short)
>> >> +                                  (string-prefix-p short (match-string 1)))
>> >> +                                read-symbol-shorthands #'car)))))
>> >
>> > Or simply:
>> >   (car (assoc (match-string 1)
>> >               read-symbol-shorthands
>> >               #'string-prefix-p))
>>
>> Much nicer - see patch.  Thanks, Jonas!
>
> So, I had a look at this patch and I think we should compare it
> with the patch after my sig, which keeps 'shorthands--mismatch-from-end'
> and also fixes this bug.
>
> The main difference I see is that my patch keeps doing one string
> comparison, via the mismatch function (which btw is now perfectly
> analogous to CL mismatch and thus correctly named).  In the worst case,
> Josheph's patch does 1 + N where N is the number of shorthands.  So
> this is a fundamental complexity change.
>
> Normally, that would be the end of the story, but here, it isn't.
> For two reasons.
>
> My version keeps a behaviour that can be considered buggy.
> If a shorthand prefix has a common suffix with the longhand prefix
> then that suffix will not be highlighted. Like:
>
> ;; Local Variables:
> ;; read-symbol-shorthands: (("bcrumb-" . "breadcrumb-")
> ;; End:
>
> Here only "b" would be highlighted, effectively showing the user
> how much typing was saved.  Is this wrong?  Does it makes sense
> to use shorthands like this?

I would expect the entire the shorthand to be highlit, I don't feeling
strongly about this.

> The other reason why this isn't the end of the story is that even
> if we take that bug for granted, the string comparison functions in
> Joshep's patch delegate to built-in C  comparison operators, which are
> often much, much faster than Elisp.  At least before the advent of native
> compilation, it used to be like this. Of course for a large enough  N
> number of shorthands, my version wins, but that is probably not very
> common either (or is it?  Not very hard to imagine a file making use
> of many libraries and shorthanding each of them?)
>
> So, benchmarking it will have to be, I'm afraid, because AFAIK
> font-locking is a very performance sensitive area of Emacs.

Yes.  I would like to learn how to do this!

> In the meantime I will push my patch, but keep the bug open to see
> if it is worth pushing Joseph's version.

Thank you!!  I'm happy to discuss this further if others are interested.

Joseph




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#67390; Package emacs. (Sun, 26 Nov 2023 22:03:02 GMT) Full text and rfc822 format available.

Message #32 received at 67390 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Joseph Turner <joseph <at> ushin.org>, Eli Zaretskii <eliz <at> gnu.org>
Cc: 67390 <at> debbugs.gnu.org, Adam Porter <adam <at> alphapapa.net>,
 Jonas Bernoulli <jonas <at> bernoul.li>
Subject: Re: 28;
 shorthands-font-lock-shorthands assumes shorthand uses same separator
Date: Sun, 26 Nov 2023 22:02:01 +0000
On Sun, Nov 26, 2023 at 8:38 PM Joseph Turner <joseph <at> ushin.org> wrote:
>
> João Távora <joaotavora <at> gmail.com> writes:
>
> > On Sat, Nov 25, 2023 at 10:43 PM Joseph Turner <joseph <at> ushin.org> wrote:

> > So, benchmarking it will have to be, I'm afraid, because AFAIK
> > font-locking is a very performance sensitive area of Emacs.
>
> Yes.  I would like to learn how to do this!

I'm CCing Eli.

In the past, ISTR, Eli suggested to benchmark such things by visiting a
very large file in its beginning, then scrolling down by holding
the down arrow or PgDn for some fixed time period, like 30 seconds.
The  Emacs that scrolls the farthest is the most performant.  Not
entirely fail-proof (other processes may interfere, etc), but not
bad either.

So here you could create very large fictitious Elisp file with 0, 1, 3 and
10 shorthands each and then run your version vs my version and record
the final line numbers.  Then show the files and the numbers.

> > In the meantime I will push my patch, but keep the bug open to see
> > if it is worth pushing Joseph's version.
>
> Thank you!!

Now done in 36941e9e6a (master).

> I'm happy to discuss this further if others are interested.

I'm keeping this open.

João




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#67390; Package emacs. (Mon, 27 Nov 2023 04:13:01 GMT) Full text and rfc822 format available.

Message #35 received at 67390 <at> debbugs.gnu.org (full text, mbox):

From: Joseph Turner <joseph <at> ushin.org>
To: João Távora <joaotavora <at> gmail.com>
Cc: 67390 <at> debbugs.gnu.org, Adam Porter <adam <at> alphapapa.net>,
 Eli Zaretskii <eliz <at> gnu.org>, Jonas Bernoulli <jonas <at> bernoul.li>
Subject: Re: 28; shorthands-font-lock-shorthands assumes shorthand uses same
 separator
Date: Sun, 26 Nov 2023 19:48:38 -0800
João Távora <joaotavora <at> gmail.com> writes:

> In the past, ISTR, Eli suggested to benchmark such things by visiting a
> very large file in its beginning, then scrolling down by holding
> the down arrow or PgDn for some fixed time period, like 30 seconds.
> The  Emacs that scrolls the farthest is the most performant.  Not
> entirely fail-proof (other processes may interfere, etc), but not
> bad either.
>
> So here you could create very large fictitious Elisp file with 0, 1, 3 and
> 10 shorthands each and then run your version vs my version and record
> the final line numbers.  Then show the files and the numbers.

In a 2.5M Elisp file with 0 shorthand prefixes, after 30s of pressing
C-v (scroll-up-command), point was on line 19238.

I then tried the same file with 1 and 4 shorthands, and I got basically
the same result:

| With 1 shorthand prefix |       |
|-------------------------+-------|
| No patch                | 19447 |
| mismatch                | 19238 |
| compare all prefixes    | 19024 |

| With 4 shorthand prefixes |       |
|---------------------------+-------|
| No patch                  | 19211 |
| mismatch                  | 19521 |
| compare all prefixes      | 19339 |

There is a big margin of error (probably around 500-1000 lines) since my
method wasn't at all exact. I stopped holding C-v when the stopwatch on
another device hit 30s, and I might have held for ±1s.

If this approach to benchmarking is valid, I think it indicates that
shorthands has no significant effect on performance in either case,
though there may be a greater difference with more shorthand prefixes.

> My version keeps a behaviour that can be considered buggy.
> If a shorthand prefix has a common suffix with the longhand prefix
> then that suffix will not be highlighted. Like:

> ;; Local Variables:
> ;; read-symbol-shorthands: (("bcrumb-" . "breadcrumb-")
> ;; End:

> Here only "b" would be highlighted, effectively showing the user
> how much typing was saved.  Is this wrong?  Does it makes sense
> to use shorthands like this?

Another example case:

;; Local Variables:
;; read-symbol-shorthands: (("aw-" . "ace-window-")
;; End:

Here only "a" would be highlighted.

Thanks,

Joseph




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#67390; Package emacs. (Mon, 27 Nov 2023 12:12:01 GMT) Full text and rfc822 format available.

Message #38 received at 67390 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: João Távora <joaotavora <at> gmail.com>
Cc: joseph <at> ushin.org, 67390 <at> debbugs.gnu.org, jonas <at> bernoul.li,
 adam <at> alphapapa.net
Subject: Re: 28;
 shorthands-font-lock-shorthands assumes shorthand uses same separator
Date: Mon, 27 Nov 2023 14:10:53 +0200
> From: João Távora <joaotavora <at> gmail.com>
> Date: Sun, 26 Nov 2023 22:02:01 +0000
> Cc: Jonas Bernoulli <jonas <at> bernoul.li>, 67390 <at> debbugs.gnu.org, 
> 	Adam Porter <adam <at> alphapapa.net>
> 
> On Sun, Nov 26, 2023 at 8:38 PM Joseph Turner <joseph <at> ushin.org> wrote:
> >
> > João Távora <joaotavora <at> gmail.com> writes:
> >
> > > On Sat, Nov 25, 2023 at 10:43 PM Joseph Turner <joseph <at> ushin.org> wrote:
> 
> > > So, benchmarking it will have to be, I'm afraid, because AFAIK
> > > font-locking is a very performance sensitive area of Emacs.
> >
> > Yes.  I would like to learn how to do this!
> 
> I'm CCing Eli.
> 
> In the past, ISTR, Eli suggested to benchmark such things by visiting a
> very large file in its beginning, then scrolling down by holding
> the down arrow or PgDn for some fixed time period, like 30 seconds.
> The  Emacs that scrolls the farthest is the most performant.  Not
> entirely fail-proof (other processes may interfere, etc), but not
> bad either.

I still recommend this method.  Something like the below:

  (defun scroll-up-benchmark ()
    (interactive)
    (let ((oldgc gcs-done)
	  (oldtime (float-time)))
      (condition-case nil (while t (scroll-up) (redisplay))
	(error (message "GCs: %d Elapsed time: %f seconds"
			(- gcs-done oldgc) (- (float-time) oldtime))))))

Evaluate the above, and the invoke it at the beginning of a large
file.  Then compare the timings with different font-lock arrangements.

A variant is to scroll by N lines, not by pages.  Just change the
above to call scroll-up with the argument of N, for example 1 (or any
other number, if you want).




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#67390; Package emacs. (Wed, 29 Nov 2023 08:23:02 GMT) Full text and rfc822 format available.

Message #41 received at 67390 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: Joseph Turner <joseph <at> ushin.org>, 67390 <at> debbugs.gnu.org,
 Jonas Bernoulli <jonas <at> bernoul.li>, Adam Porter <adam <at> alphapapa.net>
Subject: Re: bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand
 uses same separator
Date: Wed, 29 Nov 2023 08:21:59 +0000
[Message part 1 (text/plain, inline)]
On Mon, Nov 27, 2023, 12:12 Eli Zaretskii <eliz <at> gnu.org> wrote:

> > From: João Távora <joaotavora <at> gmail.com>
> > Date: Sun, 26 Nov 2023 22:02:01 +0000
> > Cc: Jonas Bernoulli <jonas <at> bernoul.li>, 67390 <at> debbugs.gnu.org,
> >       Adam Porter <adam <at> alphapapa.net>
> >
> > On Sun, Nov 26, 2023 at 8:38 PM Joseph Turner <joseph <at> ushin.org> wrote:
> > >
> > > João Távora <joaotavora <at> gmail.com> writes:
> > >
> > > > On Sat, Nov 25, 2023 at 10:43 PM Joseph Turner <joseph <at> ushin.org>
> wrote:
> >
> > > > So, benchmarking it will have to be, I'm afraid, because AFAIK
> > > > font-locking is a very performance sensitive area of Emacs.
> > >
> > > Yes.  I would like to learn how to do this!
> >
> > I'm CCing Eli.
> >
> > In the past, ISTR, Eli suggested to benchmark such things by visiting a
> > very large file in its beginning, then scrolling down by holding
> > the down arrow or PgDn for some fixed time period, like 30 seconds.
> > The  Emacs that scrolls the farthest is the most performant.  Not
> > entirely fail-proof (other processes may interfere, etc), but not
> > bad either.
>
> I still recommend this method.  Something like the below:
>
>   (defun scroll-up-benchmark ()
>     (interactive)
>     (let ((oldgc gcs-done)
>           (oldtime (float-time)))
>       (condition-case nil (while t (scroll-up) (redisplay))
>         (error (message "GCs: %d Elapsed time: %f seconds"
>                         (- gcs-done oldgc) (- (float-time) oldtime))))))
>
> Evaluate the above, and the invoke it at the beginning of a large
> file.  Then compare the timings with different font-lock arrangements.
>
> A variant is to scroll by N lines, not by pages.  Just change the
> above to call scroll-up with the argument of N, for example 1 (or any
> other number, if you want).
>

Joseph can you try these variations? They're slightly more exact. Also show
at least one of the large lisp files or tell me how to generate one. If you
still don't find any significant slowdown, I think we can merge your patch.

João

>
[Message part 2 (text/html, inline)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#67390; Package emacs. (Wed, 29 Nov 2023 09:30:02 GMT) Full text and rfc822 format available.

Message #44 received at 67390 <at> debbugs.gnu.org (full text, mbox):

From: Joseph Turner <joseph <at> ushin.org>
To: João Távora <joaotavora <at> gmail.com>
Cc: 67390 <at> debbugs.gnu.org, Adam Porter <adam <at> alphapapa.net>,
 Eli Zaretskii <eliz <at> gnu.org>, Jonas Bernoulli <jonas <at> bernoul.li>
Subject: Re: bug#67390: 28; shorthands-font-lock-shorthands assumes
 shorthand uses same separator
Date: Wed, 29 Nov 2023 01:12:52 -0800
João Távora <joaotavora <at> gmail.com> writes:

> On Mon, Nov 27, 2023, 12:12 Eli Zaretskii <eliz <at> gnu.org> wrote:
>
>  > From: João Távora <joaotavora <at> gmail.com>
>  > Date: Sun, 26 Nov 2023 22:02:01 +0000
>  > Cc: Jonas Bernoulli <jonas <at> bernoul.li>, 67390 <at> debbugs.gnu.org,
>  >       Adam Porter <adam <at> alphapapa.net>
>  >
>  > On Sun, Nov 26, 2023 at 8:38 PM Joseph Turner <joseph <at> ushin.org> wrote:
>  > >
>  > > João Távora <joaotavora <at> gmail.com> writes:
>  > >
>  > > > On Sat, Nov 25, 2023 at 10:43 PM Joseph Turner <joseph <at> ushin.org> wrote:
>  >
>  > > > So, benchmarking it will have to be, I'm afraid, because AFAIK
>  > > > font-locking is a very performance sensitive area of Emacs.
>  > >
>  > > Yes.  I would like to learn how to do this!
>  >
>  > I'm CCing Eli.
>  >
>  > In the past, ISTR, Eli suggested to benchmark such things by visiting a
>  > very large file in its beginning, then scrolling down by holding
>  > the down arrow or PgDn for some fixed time period, like 30 seconds.
>  > The  Emacs that scrolls the farthest is the most performant.  Not
>  > entirely fail-proof (other processes may interfere, etc), but not
>  > bad either.
>
>  I still recommend this method.  Something like the below:
>
>    (defun scroll-up-benchmark ()
>      (interactive)
>      (let ((oldgc gcs-done)
>            (oldtime (float-time)))
>        (condition-case nil (while t (scroll-up) (redisplay))
>          (error (message "GCs: %d Elapsed time: %f seconds"
>                          (- gcs-done oldgc) (- (float-time) oldtime))))))
>
>  Evaluate the above, and the invoke it at the beginning of a large
>  file.  Then compare the timings with different font-lock arrangements.
>
>  A variant is to scroll by N lines, not by pages.  Just change the
>  above to call scroll-up with the argument of N, for example 1 (or any
>  other number, if you want).
>
> Joseph can you try these variations? They're slightly more exact. Also show at least one of the large lisp files or tell me how to generate
> one. If you still don't find any significant slowdown, I think we can merge your patch.

I'm happy to try Eli's variation if you don't beat me to it ;)

My large lisp file consisted of copy-pasting with a kbd macro the body of
https://git.sr.ht/~ushin/hyperdrive.el/tree/master/item/hyperdrive-lib.el
until the file reached ~50K lines -- well over the limit I expected to
reach on my machine.

One potential issue with the patch is that multiple shorthand prefixes
might match, while assoc will return the first matching cons pair
read-symbol-shorthand.

For example in hyperdrive-lib.el, we use the following shorthands in
order to display "//" instead of "/-" as the prefix separator for
private symbols, like "h//format-entry" instead of "h/-format-entry":

;; read-symbol-shorthands: (
;;   ("he//" . "hyperdrive-entry--")
;;   ("he/"  . "hyperdrive-entry-")
;;   ("h//"  . "hyperdrive--")
;;   ("h/"   . "hyperdrive-"))

However, if we rearrange the values like:

;; read-symbol-shorthands: (
;;   ("he/"  . "hyperdrive-entry-")
;;   ("he//" . "hyperdrive-entry--")
;;   ("h/"   . "hyperdrive-")
;;   ("h//"  . "hyperdrive--"))

then shorthands doesn't add fontification for either "h//" or "he//".
(This surprised me - I was expecting to see just the "h/" and "he/"
highlit)

However, I'm starting to wonder whether this is a case of user error.
Should we avoid overlapping shorthand prefixes?

Thank you!!

Joseph





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#67390; Package emacs. (Wed, 29 Nov 2023 13:31:02 GMT) Full text and rfc822 format available.

Message #47 received at 67390 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Jonas Bernoulli <jonas <at> bernoul.li>
Cc: Joseph Turner <joseph <at> ushin.org>, 67390 <at> debbugs.gnu.org,
 Adam Porter <adam <at> alphapapa.net>
Subject: Re: bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand
 uses same separator
Date: Wed, 29 Nov 2023 13:30:22 +0000
On Sat, Nov 25, 2023 at 4:03 PM Jonas Bernoulli via Bug reports for
GNU Emacs, the Swiss army knife of text editors
<bug-gnu-emacs <at> gnu.org> wrote:
>
> Joseph Turner <joseph <at> ushin.org> writes:
>
> > +                          (car (shorthands--find-if
> > +                                (lambda (short)
> > +                                  (string-prefix-p short (match-string 1)))
> > +                                read-symbol-shorthands #'car)))))
>
> Or simply:
>   (car (assoc (match-string 1)
>               read-symbol-shorthands
>               #'string-prefix-p))

I don't think it works, at least in my 'assoc', the order
of string-prefix-p arguments must be switched.  Pity
assoc or string-prefix-p decs didn't coordinate this.

(assoc probe read-symbol-shorthands (lambda (a b) (string-prefix-p b a)))

works




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#67390; Package emacs. (Wed, 29 Nov 2023 13:57:02 GMT) Full text and rfc822 format available.

Message #50 received at 67390 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Joseph Turner <joseph <at> ushin.org>
Cc: 67390 <at> debbugs.gnu.org, Adam Porter <adam <at> alphapapa.net>,
 Eli Zaretskii <eliz <at> gnu.org>, Jonas Bernoulli <jonas <at> bernoul.li>
Subject: Re: bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand
 uses same separator
Date: Wed, 29 Nov 2023 13:56:28 +0000
On Wed, Nov 29, 2023 at 9:29 AM Joseph Turner <joseph <at> ushin.org> wrote:

> > Joseph can you try these variations? They're slightly more exact. Also show at least one of the large lisp files or tell me how to generate
> > one. If you still don't find any significant slowdown, I think we can merge your patch.
>
> I'm happy to try Eli's variation if you don't beat me to it ;)

Great.  No I don't think I will beat you to it :-), you have the
machinery setup already.

> However, if we rearrange the values like:
>
> ;; read-symbol-shorthands: (
> ;;   ("he/"  . "hyperdrive-entry-")
> ;;   ("he//" . "hyperdrive-entry--")
> ;;   ("h/"   . "hyperdrive-")
> ;;   ("h//"  . "hyperdrive--"))
>
> then shorthands doesn't add fontification for either "h//" or "he//".
> (This surprised me - I was expecting to see just the "h/" and "he/"
> highlit)
>
> However, I'm starting to wonder whether this is a case of user error.
> Should we avoid overlapping shorthand prefixes?

No, but you should make sure the longer prefixes appear before.
It's not only a font-locking issue, if the shorter shorthand
appears before it could read to an unintended symbol.  If shorthands
are

  (("h/" . "hello-") ("h//" . "hyperdrive--"))

and you type

  "h//warp"

, I'm fairly sure this will be read to

    "hello-/warp"

, which is probably not what you intended

The documentation doesn't mention this, but it should, so patches
welcome for that.

In theory, lread.c could do this sorting for you.  I don't think
this would affect performance, so patches welcome for that,
too.

Anyway, here's another patch that more or less bridges our two patches
and seems to do the right thing every time, even making you aware of
that ordering pitfall.

It could be more performant (or less, or about the same), so if
you can, please include it in your benchmarks.  It does less
work when presented with a "/"-ending shorthand for a "-"-separated
symbol, so I'm starting to think that this "/" idea is a good
convention for shorthands (useful for distinguishing them in
github code, for example).

I've tested it briefly with those 'he/', 'he//', etc cases.

diff --git a/lisp/emacs-lisp/shorthands.el b/lisp/emacs-lisp/shorthands.el
index b0665a55695..69e9e252aee 100644
--- a/lisp/emacs-lisp/shorthands.el
+++ b/lisp/emacs-lisp/shorthands.el
@@ -76,14 +76,19 @@ shorthands-font-lock-shorthands
              (sname (and probe (symbol-name probe)))
              (mismatch (and sname (shorthands--mismatch-from-end
                                    (match-string 1) sname)))
-             (guess (and mismatch (1+ mismatch))))
-        (when guess
-          (when (and (< guess (1- (length (match-string 1))))
-                     ;; In bug#67390 we allow other separators
-                     (eq (char-syntax (aref (match-string 1) guess)) ?_))
-            (setq guess (1+ guess)))
+             probe2)
+        (when mismatch
+          (unless (eq (char-syntax (aref (match-string 1) mismatch)) ?_)
+            (or probe2
+                (setq probe2
+                      (buffer-substring (match-beginning 1)
+                                        (+ (match-beginning 1) mismatch))))
+            (setq mismatch (1- (length
+                                (car (assoc probe2
+                                            read-symbol-shorthands
+                                            (lambda (a b)
(string-prefix-p b a))))))))
           (add-face-text-property (match-beginning 1)
-                                  (+ (match-beginning 1) guess)
+                                  (+ (match-beginning 1) (1+ mismatch))
                                   'elisp-shorthand-font-lock-face))))))

 (font-lock-add-keywords 'emacs-lisp-mode
'((shorthands-font-lock-shorthands)) t)




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#67390; Package emacs. (Wed, 29 Nov 2023 23:28:01 GMT) Full text and rfc822 format available.

Message #53 received at 67390 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Jonas Bernoulli <jonas <at> bernoul.li>
Cc: Joseph Turner <joseph <at> ushin.org>, 67390 <at> debbugs.gnu.org,
 Adam Porter <adam <at> alphapapa.net>
Subject: Re: bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand
 uses same separator
Date: Wed, 29 Nov 2023 23:27:00 +0000
On Wed, Nov 29, 2023 at 1:30 PM João Távora <joaotavora <at> gmail.com> wrote:
>
> On Sat, Nov 25, 2023 at 4:03 PM Jonas Bernoulli via Bug reports for
> GNU Emacs, the Swiss army knife of text editors
> <bug-gnu-emacs <at> gnu.org> wrote:
> >
> > Joseph Turner <joseph <at> ushin.org> writes:
> >
> > > +                          (car (shorthands--find-if
> > > +                                (lambda (short)
> > > +                                  (string-prefix-p short (match-string 1)))
> > > +                                read-symbol-shorthands #'car)))))
> >
> > Or simply:
> >   (car (assoc (match-string 1)
> >               read-symbol-shorthands
> >               #'string-prefix-p))
>
> I don't think it works, at least in my 'assoc', the order
> of string-prefix-p arguments must be switched.  Pity
> assoc or string-prefix-p decs didn't coordinate this.

nevermind, it does work if what you want is to see if the cars
of the alist are prefixes to the key, which is probably your
intention in this snippet.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#67390; Package emacs. (Thu, 30 Nov 2023 14:18:01 GMT) Full text and rfc822 format available.

Message #56 received at 67390 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Jonas Bernoulli <jonas <at> bernoul.li>, Eli Zaretskii <eliz <at> gnu.org>
Cc: Joseph Turner <joseph <at> ushin.org>, 67390 <at> debbugs.gnu.org,
 Adam Porter <adam <at> alphapapa.net>
Subject: Re: bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand
 uses same separator
Date: Thu, 30 Nov 2023 14:16:51 +0000
Hi all,

I've been working on all these shorthand-related issues over the last
two days and I have reasonably short fixes for all of them.

For this particular issue (bug#67309), I've opted to
use Joseph's patch with very slight adjustments, as it's the
only one that guarantees correct behaviour and doesn't seem
to impact performance.

The other issues are:

bug#63480 (loaddefs-gen.el doesn't know about shorthands)
bug#67325 (prefix discovery i.e. register-definition-prefixes)
bug#67523 (check-declare.el doesn't know about shorthands)

I have all this in 6 commits in the bugfix/shorthand-fixes branch.

Here's the full patch minus whitespace changes.  If there are
no comments I'll push in a few days' time.

João

diff --git a/doc/lispref/symbols.texi b/doc/lispref/symbols.texi
index 1f3b677d7fb..18e80311177 100644
--- a/doc/lispref/symbols.texi
+++ b/doc/lispref/symbols.texi
@@ -761,6 +761,23 @@ Shorthands
 ;; End:
 @end example

+Note that if you have two shorthands in the same file where one is the
+prefix of the other, the longer shorthand will be attempted first.
+This happens regardless of the order you specify shorthands in the
+local variables section of your file.
+
+@example
+'(
+  t//foo ; reads to 'my-tricks--foo', not 'my-tricks-/foo'
+  t/foo  ; reads to 'my-tricks-foo'
+  )
+
+;; Local Variables:
+;; read-symbol-shorthands: (("t/" . "my-tricks-")
+;;                          ("t//" . "my-tricks--")
+;; End:
+@end example
+
 @subsection Exceptions

 There are two exceptions to rules governing Shorthand transformations:
diff --git a/lisp/emacs-lisp/check-declare.el b/lisp/emacs-lisp/check-declare.el
index c887d95210c..b19aedf314d 100644
--- a/lisp/emacs-lisp/check-declare.el
+++ b/lisp/emacs-lisp/check-declare.el
@@ -145,21 +145,26 @@ check-declare-verify
     (if (file-regular-p fnfile)
         (with-temp-buffer
           (insert-file-contents fnfile)
+          (unless cflag
+            ;; If in Elisp, ensure syntax and shorthands available
+            (set-syntax-table emacs-lisp-mode-syntax-table)
+            (let (enable-local-variables) (hack-local-variables)))
           ;; defsubst's don't _have_ to be known at compile time.
-          (setq re (format (if cflag
-                               "^[ \t]*\\(DEFUN\\)[ \t]*([ \t]*\"%s\""
+          (setq re (if cflag
+                       (format "^[ \t]*\\(DEFUN\\)[ \t]*([ \t]*\"%s\""
+                               (regexp-opt (mapcar 'cadr fnlist) t))
                      "^[ \t]*(\\(fset[ \t]+'\\|\
 cl-def\\(?:generic\\|method\\|un\\)\\|\
 def\\(?:un\\|subst\\|foo\\|method\\|class\\|\
 ine-\\(?:derived\\|generic\\|\\(?:global\\(?:ized\\)?-\\)?minor\\)-mode\\|\
 \\(?:ine-obsolete-function-\\)?alias[ \t]+'\\|\
 ine-overloadable-function\\)\\)\
-[ \t]*%s\\([ \t;]+\\|$\\)")
-                           (regexp-opt (mapcar 'cadr fnlist) t)))
+[ \t]*\\(\\(?:\\sw\\|\\s_\\)+\\)\\([ \t;]+\\|$\\)"))
           (while (re-search-forward re nil t)
             (skip-chars-forward " \t\n")
-            (setq fn (match-string 2)
-                  type (match-string 1)
+            (setq fn (symbol-name (car (read-from-string (match-string 2)))))
+            (when (member fn (mapcar 'cadr fnlist))
+              (setq type (match-string 1)
                     ;; (min . max) for a fixed number of arguments, or
                     ;; arglists with optional elements.
                     ;; (min) for arglists with &rest.
@@ -202,7 +207,7 @@ check-declare-verify
                               (t
                                'err))
                     ;; alist of functions and arglist signatures.
-                  siglist (cons (cons fn sig) siglist)))))
+                    siglist (cons (cons fn sig) siglist))))))
     (dolist (e fnlist)
       (setq arglist (nth 2 e)
             type
diff --git a/lisp/emacs-lisp/loaddefs-gen.el b/lisp/emacs-lisp/loaddefs-gen.el
index 04bea4723a2..e8093200bec 100644
--- a/lisp/emacs-lisp/loaddefs-gen.el
+++ b/lisp/emacs-lisp/loaddefs-gen.el
@@ -378,6 +378,7 @@ loaddefs-generate--parse-file
   (let ((defs nil)
         (load-name (loaddefs-generate--file-load-name file main-outfile))
         (compute-prefixes t)
+        read-symbol-shorthands
         local-outfile inhibit-autoloads)
     (with-temp-buffer
       (insert-file-contents file)
@@ -399,7 +400,19 @@ loaddefs-generate--parse-file
             (setq inhibit-autoloads (read (current-buffer)))))
         (save-excursion
           (when (re-search-forward "autoload-compute-prefixes: *" nil t)
-            (setq compute-prefixes (read (current-buffer))))))
+            (setq compute-prefixes (read (current-buffer)))))
+        (save-excursion
+          ;; since we're "open-coding" we have to repeat more
+          ;; complicated logic in `hack-local-variables'.
+          (when (re-search-forward "read-symbol-shorthands: *" nil t)
+            (let* ((commentless (replace-regexp-in-string
+                                 "\n\\s-*;+" ""
+                                 (buffer-substring (point) (point-max))))
+                   (unsorted-shorthands (car (read-from-string commentless))))
+              (setq read-symbol-shorthands
+                    (sort unsorted-shorthands
+                          (lambda (sh1 sh2)
+                            (> (length (car sh1)) (length (car sh2))))))))))

       ;; We always return the package version (even for pre-dumped
       ;; files).
@@ -486,7 +499,11 @@ loaddefs-generate--compute-prefixes
     (while (re-search-forward
             "^(\\(def[^ \t\n]+\\)[ \t\n]+['(]*\\([^' ()\"\n]+\\)[\n \t]" nil t)
       (unless (member (match-string 1) autoload-ignored-definitions)
-        (let ((name (match-string-no-properties 2)))
+        (let* ((name (match-string-no-properties 2))
+               ;; Consider `read-symbol-shorthands'.
+               (probe (let ((obarray (obarray-make)))
+                        (car (read-from-string name)))))
+          (setq name (symbol-name probe))
           (when (save-excursion
                   (goto-char (match-beginning 0))
                   (or (bobp)
diff --git a/lisp/emacs-lisp/shorthands.el b/lisp/emacs-lisp/shorthands.el
index b0665a55695..69b562e3c7e 100644
--- a/lisp/emacs-lisp/shorthands.el
+++ b/lisp/emacs-lisp/shorthands.el
@@ -52,38 +52,26 @@ elisp-shorthand-font-lock-face
   :version "28.1"
   :group 'font-lock-faces)

-(defun shorthands--mismatch-from-end (str1 str2)
-  "Tell index of first mismatch in STR1 and STR2, from end.
-The index is a valid 0-based index on STR1.  Returns nil if STR1
-equals STR2.  Return 0 if STR1 is a suffix of STR2."
-  (cl-loop with l1 = (length str1) with l2 = (length str2)
-           for i from 1
-           for i1 = (- l1 i) for i2 = (- l2 i)
-           while (eq (aref str1 i1) (aref str2 i2))
-           if (zerop i2) return (if (zerop i1) nil i1)
-           if (zerop i1) return 0
-           finally (return i1)))
-
 (defun shorthands-font-lock-shorthands (limit)
+  "Font lock until LIMIT considering `read-symbol-shorthands'."
   (when read-symbol-shorthands
     (while (re-search-forward
             (concat "\\_<\\(" (rx lisp-mode-symbol) "\\)\\_>")
             limit t)
       (let* ((existing (get-text-property (match-beginning 1) 'face))
+             (print-name (match-string 1))
              (probe (and (not (memq existing '(font-lock-comment-face
                                                font-lock-string-face)))
-                         (intern-soft (match-string 1))))
-             (sname (and probe (symbol-name probe)))
-             (mismatch (and sname (shorthands--mismatch-from-end
-                                   (match-string 1) sname)))
-             (guess (and mismatch (1+ mismatch))))
-        (when guess
-          (when (and (< guess (1- (length (match-string 1))))
-                     ;; In bug#67390 we allow other separators
-                     (eq (char-syntax (aref (match-string 1) guess)) ?_))
-            (setq guess (1+ guess)))
+                         (intern-soft print-name)))
+             (symbol-name (and probe (symbol-name probe)))
+             (prefix (and symbol-name
+                          (not (string-equal print-name symbol-name))
+                          (car (assoc print-name
+                                      read-symbol-shorthands
+                                      #'string-prefix-p)))))
+        (when prefix
           (add-face-text-property (match-beginning 1)
-                                  (+ (match-beginning 1) guess)
+                                  (+ (match-beginning 1) (length prefix))
                                   'elisp-shorthand-font-lock-face))))))

 (font-lock-add-keywords 'emacs-lisp-mode
'((shorthands-font-lock-shorthands)) t)
diff --git a/lisp/files.el b/lisp/files.el
index 1cdcec23b11..b266d0727ec 100644
--- a/lisp/files.el
+++ b/lisp/files.el
@@ -3735,7 +3735,8 @@ before-hack-local-variables-hook
 This hook is called only if there is at least one file-local
 variable to set.")

-(defvar permanently-enabled-local-variables '(lexical-binding)
+(defvar permanently-enabled-local-variables
+  '(lexical-binding read-symbol-shorthands)
   "A list of file-local variables that are always enabled.
 This overrides any `enable-local-variables' setting.")

@@ -4171,6 +4172,13 @@ hack-local-variables--find-variables
                                 ;; to use 'thisbuf's name in the
                                 ;; warning message.
                                 (or (buffer-file-name thisbuf) ""))))))
+                          ((eq var 'read-symbol-shorthands)
+                           ;; Sort automatically by shorthand length
+                           ;; descending
+                           (setq val (sort val
+                                           (lambda (sh1 sh2) (>
(length (car sh1))
+
(length (car sh2))))))
+                           (push (cons 'read-symbol-shorthands val) result))
                           ((and (eq var 'mode) handle-mode))
    (t
     (ignore-errors




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#67390; Package emacs. (Thu, 30 Nov 2023 15:24:01 GMT) Full text and rfc822 format available.

Message #59 received at 67390 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: João Távora <joaotavora <at> gmail.com>
Cc: joseph <at> ushin.org, 67390 <at> debbugs.gnu.org, jonas <at> bernoul.li,
 adam <at> alphapapa.net
Subject: Re: bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand
 uses same separator
Date: Thu, 30 Nov 2023 17:23:04 +0200
> From: João Távora <joaotavora <at> gmail.com>
> Date: Thu, 30 Nov 2023 14:16:51 +0000
> Cc: Joseph Turner <joseph <at> ushin.org>, 67390 <at> debbugs.gnu.org, 
> 	Adam Porter <adam <at> alphapapa.net>
> 
> Hi all,
> 
> I've been working on all these shorthand-related issues over the last
> two days and I have reasonably short fixes for all of them.
> 
> For this particular issue (bug#67309), I've opted to
> use Joseph's patch with very slight adjustments, as it's the
> only one that guarantees correct behaviour and doesn't seem
> to impact performance.
> 
> The other issues are:
> 
> bug#63480 (loaddefs-gen.el doesn't know about shorthands)
> bug#67325 (prefix discovery i.e. register-definition-prefixes)
> bug#67523 (check-declare.el doesn't know about shorthands)
> 
> I have all this in 6 commits in the bugfix/shorthand-fixes branch.

Thanks.

> Here's the full patch minus whitespace changes.  If there are
> no comments I'll push in a few days' time.

The plan is to merge these to the master branch, right?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#67390; Package emacs. (Thu, 30 Nov 2023 15:30:02 GMT) Full text and rfc822 format available.

Message #62 received at 67390 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: joseph <at> ushin.org, 67390 <at> debbugs.gnu.org, jonas <at> bernoul.li,
 adam <at> alphapapa.net
Subject: Re: bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand
 uses same separator
Date: Thu, 30 Nov 2023 15:29:23 +0000
On Thu, Nov 30, 2023 at 3:23 PM Eli Zaretskii <eliz <at> gnu.org> wrote:

> > I have all this in 6 commits in the bugfix/shorthand-fixes branch.
>
> Thanks.
>
> > Here's the full patch minus whitespace changes.  If there are
> > no comments I'll push in a few days' time.
>
> The plan is to merge these to the master branch, right?

Yes




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#67390; Package emacs. (Sun, 10 Dec 2023 10:53:02 GMT) Full text and rfc822 format available.

Message #65 received at 67390 <at> debbugs.gnu.org (full text, mbox):

From: Joseph Turner <joseph <at> ushin.org>
To: João Távora <joaotavora <at> gmail.com>
Cc: 67390 <at> debbugs.gnu.org, Adam Porter <adam <at> alphapapa.net>,
 Eli Zaretskii <eliz <at> gnu.org>, Jonas Bernoulli <jonas <at> bernoul.li>
Subject: Re: bug#67390: 28; shorthands-font-lock-shorthands assumes
 shorthand uses same separator
Date: Sat, 09 Dec 2023 10:50:16 -0800
Hi João!  Thanks for your patience - preparing for EmacsConf was a blast,
and now I'm on a plane to go visit my grandmother!

João Távora <joaotavora <at> gmail.com> writes:

> Hi all,
>
> I've been working on all these shorthand-related issues over the last
> two days and I have reasonably short fixes for all of them.
>
> For this particular issue (bug#67309), I've opted to
> use Joseph's patch with very slight adjustments, as it's the
> only one that guarantees correct behaviour and doesn't seem
> to impact performance.
>
> The other issues are:
>
> bug#63480 (loaddefs-gen.el doesn't know about shorthands)
> bug#67325 (prefix discovery i.e. register-definition-prefixes)
> bug#67523 (check-declare.el doesn't know about shorthands)
>
> I have all this in 6 commits in the bugfix/shorthand-fixes branch.
>
> Here's the full patch minus whitespace changes.  If there are
> no comments I'll push in a few days' time.
>
> João
>
> diff --git a/doc/lispref/symbols.texi b/doc/lispref/symbols.texi
> index 1f3b677d7fb..18e80311177 100644
> --- a/doc/lispref/symbols.texi
> +++ b/doc/lispref/symbols.texi
> @@ -761,6 +761,23 @@ Shorthands
>  ;; End:
>  @end example
>
> +Note that if you have two shorthands in the same file where one is the
> +prefix of the other, the longer shorthand will be attempted first.
> +This happens regardless of the order you specify shorthands in the
> +local variables section of your file.
> +
> +@example
> +'(
> +  t//foo ; reads to 'my-tricks--foo', not 'my-tricks-/foo'
> +  t/foo  ; reads to 'my-tricks-foo'
> +  )
> +
> +;; Local Variables:
> +;; read-symbol-shorthands: (("t/" . "my-tricks-")
> +;;                          ("t//" . "my-tricks--")
> +;; End:
> +@end example
> +
>  @subsection Exceptions

Clear and concise.

>  There are two exceptions to rules governing Shorthand transformations:
> diff --git a/lisp/emacs-lisp/check-declare.el b/lisp/emacs-lisp/check-declare.el
> index c887d95210c..b19aedf314d 100644
> --- a/lisp/emacs-lisp/check-declare.el
> +++ b/lisp/emacs-lisp/check-declare.el
> @@ -145,21 +145,26 @@ check-declare-verify
>      (if (file-regular-p fnfile)
>          (with-temp-buffer
>            (insert-file-contents fnfile)
> +          (unless cflag
> +            ;; If in Elisp, ensure syntax and shorthands available
> +            (set-syntax-table emacs-lisp-mode-syntax-table)
> +            (let (enable-local-variables) (hack-local-variables)))
>            ;; defsubst's don't _have_ to be known at compile time.
> -          (setq re (format (if cflag
> -                               "^[ \t]*\\(DEFUN\\)[ \t]*([ \t]*\"%s\""
> +          (setq re (if cflag
> +                       (format "^[ \t]*\\(DEFUN\\)[ \t]*([ \t]*\"%s\""
> +                               (regexp-opt (mapcar 'cadr fnlist) t))
>                       "^[ \t]*(\\(fset[ \t]+'\\|\
>  cl-def\\(?:generic\\|method\\|un\\)\\|\
>  def\\(?:un\\|subst\\|foo\\|method\\|class\\|\
>  ine-\\(?:derived\\|generic\\|\\(?:global\\(?:ized\\)?-\\)?minor\\)-mode\\|\
>  \\(?:ine-obsolete-function-\\)?alias[ \t]+'\\|\
>  ine-overloadable-function\\)\\)\
> -[ \t]*%s\\([ \t;]+\\|$\\)")
> -                           (regexp-opt (mapcar 'cadr fnlist) t)))
> +[ \t]*\\(\\(?:\\sw\\|\\s_\\)+\\)\\([ \t;]+\\|$\\)"))

Would you explain what this regexp is intended to match?

>            (while (re-search-forward re nil t)
>              (skip-chars-forward " \t\n")
> -            (setq fn (match-string 2)
> -                  type (match-string 1)
> +            (setq fn (symbol-name (car (read-from-string (match-string 2)))))
> +            (when (member fn (mapcar 'cadr fnlist))
> +              (setq type (match-string 1)
>                      ;; (min . max) for a fixed number of arguments, or
>                      ;; arglists with optional elements.
>                      ;; (min) for arglists with &rest.
> @@ -202,7 +207,7 @@ check-declare-verify
>                                (t
>                                 'err))
>                      ;; alist of functions and arglist signatures.
> -                  siglist (cons (cons fn sig) siglist)))))
> +                    siglist (cons (cons fn sig) siglist))))))
>      (dolist (e fnlist)
>        (setq arglist (nth 2 e)
>              type

On my machine, this patch removes some of the check-declare "function
not found" errors, but not all.  For example, with hyperdrive-lib.el:

(check-declare-file "~/.local/src/hyperdrive.el/hyperdrive-lib.el")

Before this patch, the "*Check Declarations Warnings*" buffer shows:

--8<---------------cut here---------------start------------->8---
 ■ hyperdrive-lib.el:44:Warning (check-declare): said ‘h/mode’ was defined in
    ../../../.emacs.d/elpa/hyperdrive/hyperdrive.el: function not found
 ■ hyperdrive-lib.el:508:Warning (check-declare): said ‘h/history’ was defined
    in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-history.el: function not
    found
 ■ hyperdrive-lib.el:1283:Warning (check-declare): said ‘h/org--link-goto’ was
    defined in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-org.el: function
    not found
 ■ hyperdrive-lib.el:45:Warning (check-declare): said ‘h/dir-mode’ was defined
    in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-dir.el: function not found
 ■ hyperdrive-lib.el:1069:Warning (check-declare): said
    ‘h/dir--entry-at-point’ was defined in
    ../../../.emacs.d/elpa/hyperdrive/hyperdrive-dir.el: function not found
 ■ hyperdrive-lib.el:1332:Warning (check-declare): said ‘h/dir-handler’ was
    defined in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-dir.el: function
    not found
--8<---------------cut here---------------end--------------->8---

and after your patch:

--8<---------------cut here---------------start------------->8---
 ■ hyperdrive-lib.el:44:Warning (check-declare): said ‘h/mode’ was defined in
    ../../../.emacs.d/elpa/hyperdrive/hyperdrive.el: function not found
 ■ hyperdrive-lib.el:508:Warning (check-declare): said ‘h/history’ was defined
    in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-history.el: function not
    found
 ■ hyperdrive-lib.el:1332:Warning (check-declare): said ‘h/dir-handler’ was
    defined in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-dir.el: function
    not found
--8<---------------cut here---------------end--------------->8---

Are you able to reproduce this on your machine?

> diff --git a/lisp/emacs-lisp/loaddefs-gen.el b/lisp/emacs-lisp/loaddefs-gen.el
> index 04bea4723a2..e8093200bec 100644
> --- a/lisp/emacs-lisp/loaddefs-gen.el
> +++ b/lisp/emacs-lisp/loaddefs-gen.el
> @@ -378,6 +378,7 @@ loaddefs-generate--parse-file
>    (let ((defs nil)
>          (load-name (loaddefs-generate--file-load-name file main-outfile))
>          (compute-prefixes t)
> +        read-symbol-shorthands
>          local-outfile inhibit-autoloads)
>      (with-temp-buffer
>        (insert-file-contents file)
> @@ -399,7 +400,19 @@ loaddefs-generate--parse-file
>              (setq inhibit-autoloads (read (current-buffer)))))
>          (save-excursion
>            (when (re-search-forward "autoload-compute-prefixes: *" nil t)
> -            (setq compute-prefixes (read (current-buffer))))))
> +            (setq compute-prefixes (read (current-buffer)))))
> +        (save-excursion
> +          ;; since we're "open-coding" we have to repeat more
> +          ;; complicated logic in `hack-local-variables'.
> +          (when (re-search-forward "read-symbol-shorthands: *" nil t)
> +            (let* ((commentless (replace-regexp-in-string
> +                                 "\n\\s-*;+" ""
> +                                 (buffer-substring (point) (point-max))))
> +                   (unsorted-shorthands (car (read-from-string commentless))))
> +              (setq read-symbol-shorthands
> +                    (sort unsorted-shorthands
> +                          (lambda (sh1 sh2)
> +                            (> (length (car sh1)) (length (car sh2))))))))))

IIUC, the intention here is to jump to a final "Local Variables"
declaration at the end of the file, then remove ";;", then read in the
uncommented value of `read-symbol-shorthands'.

Since `read-from-string' just reads one expression, the above hunk works
when there are more local variables after read-symbol-shorthands:

;; Local Variables:
;; read-symbol-shorthands: (("bc-" . "breadcrumb-"))
;; autoload-compute-prefixes: nil
;; End:

But if the read-symbol-shorthands declaration comes at the top, as in...

-*- read-symbol-shorthands: (("bc-" . "breadcrumb-")); -*-

...then this form will allocate two strings almost as long as the file.

Here's an alternative hack attempting to uncomment and read the minimum:

diff --git a/lisp/emacs-lisp/loaddefs-gen.el b/lisp/emacs-lisp/loaddefs-gen.el
index e8093200bec..406e4b28f1f 100644
--- a/lisp/emacs-lisp/loaddefs-gen.el
+++ b/lisp/emacs-lisp/loaddefs-gen.el
@@ -404,10 +404,13 @@ don't include."
         (save-excursion
           ;; since we're "open-coding" we have to repeat more
           ;; complicated logic in `hack-local-variables'.
-          (when (re-search-forward "read-symbol-shorthands: *" nil t)
-            (let* ((commentless (replace-regexp-in-string
+          (when-let ((beg
+                      (re-search-forward "read-symbol-shorthands: *" nil t)))
+            ;; `read-symbol-shorthands' alist ends with two parens.
+            (let* ((end (re-search-forward ")[;\n\s]*)"))
+                   (commentless (replace-regexp-in-string
                                  "\n\\s-*;+" ""
-                                 (buffer-substring (point) (point-max))))
+                                 (buffer-substring beg end)))
                    (unsorted-shorthands (car (read-from-string commentless))))
               (setq read-symbol-shorthands
                     (sort unsorted-shorthands

>        ;; We always return the package version (even for pre-dumped
>        ;; files).
> @@ -486,7 +499,11 @@ loaddefs-generate--compute-prefixes
>      (while (re-search-forward
>              "^(\\(def[^ \t\n]+\\)[ \t\n]+['(]*\\([^' ()\"\n]+\\)[\n \t]" nil t)
>        (unless (member (match-string 1) autoload-ignored-definitions)
> -        (let ((name (match-string-no-properties 2)))
> +        (let* ((name (match-string-no-properties 2))
> +               ;; Consider `read-symbol-shorthands'.
> +               (probe (let ((obarray (obarray-make)))
> +                        (car (read-from-string name)))))
> +          (setq name (symbol-name probe))
>            (when (save-excursion
>                    (goto-char (match-beginning 0))
>                    (or (bobp)
> diff --git a/lisp/emacs-lisp/shorthands.el b/lisp/emacs-lisp/shorthands.el
> index b0665a55695..69b562e3c7e 100644
> --- a/lisp/emacs-lisp/shorthands.el
> +++ b/lisp/emacs-lisp/shorthands.el
> @@ -52,38 +52,26 @@ elisp-shorthand-font-lock-face
>    :version "28.1"
>    :group 'font-lock-faces)
>
> -(defun shorthands--mismatch-from-end (str1 str2)
> -  "Tell index of first mismatch in STR1 and STR2, from end.
> -The index is a valid 0-based index on STR1.  Returns nil if STR1
> -equals STR2.  Return 0 if STR1 is a suffix of STR2."
> -  (cl-loop with l1 = (length str1) with l2 = (length str2)
> -           for i from 1
> -           for i1 = (- l1 i) for i2 = (- l2 i)
> -           while (eq (aref str1 i1) (aref str2 i2))
> -           if (zerop i2) return (if (zerop i1) nil i1)
> -           if (zerop i1) return 0
> -           finally (return i1)))
> -
>  (defun shorthands-font-lock-shorthands (limit)
> +  "Font lock until LIMIT considering `read-symbol-shorthands'."
>    (when read-symbol-shorthands
>      (while (re-search-forward
>              (concat "\\_<\\(" (rx lisp-mode-symbol) "\\)\\_>")
>              limit t)
>        (let* ((existing (get-text-property (match-beginning 1) 'face))
> +             (print-name (match-string 1))
>               (probe (and (not (memq existing '(font-lock-comment-face
>                                                 font-lock-string-face)))
> -                         (intern-soft (match-string 1))))
> -             (sname (and probe (symbol-name probe)))
> -             (mismatch (and sname (shorthands--mismatch-from-end
> -                                   (match-string 1) sname)))
> -             (guess (and mismatch (1+ mismatch))))
> -        (when guess
> -          (when (and (< guess (1- (length (match-string 1))))
> -                     ;; In bug#67390 we allow other separators
> -                     (eq (char-syntax (aref (match-string 1) guess)) ?_))
> -            (setq guess (1+ guess)))
> +                         (intern-soft print-name)))
> +             (symbol-name (and probe (symbol-name probe)))
> +             (prefix (and symbol-name
> +                          (not (string-equal print-name symbol-name))
> +                          (car (assoc print-name
> +                                      read-symbol-shorthands
> +                                      #'string-prefix-p)))))
> +        (when prefix
>            (add-face-text-property (match-beginning 1)
> -                                  (+ (match-beginning 1) guess)
> +                                  (+ (match-beginning 1) (length prefix))
>                                    'elisp-shorthand-font-lock-face))))))

Works well.  let-binding `symbol-name' and `print-name' is good improvement.

>  (font-lock-add-keywords 'emacs-lisp-mode
> '((shorthands-font-lock-shorthands)) t)
> diff --git a/lisp/files.el b/lisp/files.el
> index 1cdcec23b11..b266d0727ec 100644
> --- a/lisp/files.el
> +++ b/lisp/files.el
> @@ -3735,7 +3735,8 @@ before-hack-local-variables-hook
>  This hook is called only if there is at least one file-local
>  variable to set.")
>
> -(defvar permanently-enabled-local-variables '(lexical-binding)
> +(defvar permanently-enabled-local-variables
> +  '(lexical-binding read-symbol-shorthands)
>    "A list of file-local variables that are always enabled.
>  This overrides any `enable-local-variables' setting.")
>
> @@ -4171,6 +4172,13 @@ hack-local-variables--find-variables
>                                  ;; to use 'thisbuf's name in the
>                                  ;; warning message.
>                                  (or (buffer-file-name thisbuf) ""))))))
> +                          ((eq var 'read-symbol-shorthands)
> +                           ;; Sort automatically by shorthand length
> +                           ;; descending
> +                           (setq val (sort val
> +                                           (lambda (sh1 sh2) (>
> (length (car sh1))
> +
> (length (car sh2))))))
> +                           (push (cons 'read-symbol-shorthands val) result))
>                            ((and (eq var 'mode) handle-mode))
>     (t
>      (ignore-errors

Good catch.  I agree that longer shorthands should be applied first.

-----

A couple typo nits on the commit message of "Improve
shorthands-font-lock-shorthands (bug#67390)":

-  h//thingy     ; hilits "//"       reads to 'hyperdrive--thingy'
+  h//thingy     ; hilits "h//"       reads to 'hyperdrive--thingy'

-  Co-authored-by: João Távora <joaotavora <at> gmail.com>
+  Co-authored-by: Joseph Turner <joseph <at> breatheoutbreathe.in>


Thank you!

Joseph




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#67390; Package emacs. (Sat, 03 Feb 2024 07:11:01 GMT) Full text and rfc822 format available.

Message #68 received at 67390 <at> debbugs.gnu.org (full text, mbox):

From: Joseph Turner <joseph <at> ushin.org>
To: João Távora <joaotavora <at> gmail.com>, Jonas Bernoulli
 <jonas <at> bernoul.li>,
 Eli Zaretskii <eliz <at> gnu.org>, 67390 <at> debbugs.gnu.org, Adam Porter
 <adam <at> alphapapa.net>
Subject: Re: bug#67390: 28; shorthands-font-lock-shorthands assumes
 shorthand uses same separator
Date: Fri, 02 Feb 2024 23:10:00 -0800
Joseph Turner <joseph <at> ushin.org> writes:

> Hi João!  Thanks for your patience - preparing for EmacsConf was a blast,
> and now I'm on a plane to go visit my grandmother!
>
> João Távora <joaotavora <at> gmail.com> writes:
>
>> Hi all,
>>
>> I've been working on all these shorthand-related issues over the last
>> two days and I have reasonably short fixes for all of them.
>>
>> For this particular issue (bug#67309), I've opted to
>> use Joseph's patch with very slight adjustments, as it's the
>> only one that guarantees correct behaviour and doesn't seem
>> to impact performance.
>>
>> The other issues are:
>>
>> bug#63480 (loaddefs-gen.el doesn't know about shorthands)
>> bug#67325 (prefix discovery i.e. register-definition-prefixes)
>> bug#67523 (check-declare.el doesn't know about shorthands)
>>
>> I have all this in 6 commits in the bugfix/shorthand-fixes branch.
>>
>> Here's the full patch minus whitespace changes.  If there are
>> no comments I'll push in a few days' time.
>>
>> João
>>
>> diff --git a/doc/lispref/symbols.texi b/doc/lispref/symbols.texi
>> index 1f3b677d7fb..18e80311177 100644
>> --- a/doc/lispref/symbols.texi
>> +++ b/doc/lispref/symbols.texi
>> @@ -761,6 +761,23 @@ Shorthands
>>  ;; End:
>>  @end example
>>
>> +Note that if you have two shorthands in the same file where one is the
>> +prefix of the other, the longer shorthand will be attempted first.
>> +This happens regardless of the order you specify shorthands in the
>> +local variables section of your file.
>> +
>> +@example
>> +'(
>> +  t//foo ; reads to 'my-tricks--foo', not 'my-tricks-/foo'
>> +  t/foo  ; reads to 'my-tricks-foo'
>> +  )
>> +
>> +;; Local Variables:
>> +;; read-symbol-shorthands: (("t/" . "my-tricks-")
>> +;;                          ("t//" . "my-tricks--")
>> +;; End:
>> +@end example
>> +
>>  @subsection Exceptions
>
> Clear and concise.
>
>>  There are two exceptions to rules governing Shorthand transformations:
>> diff --git a/lisp/emacs-lisp/check-declare.el b/lisp/emacs-lisp/check-declare.el
>> index c887d95210c..b19aedf314d 100644
>> --- a/lisp/emacs-lisp/check-declare.el
>> +++ b/lisp/emacs-lisp/check-declare.el
>> @@ -145,21 +145,26 @@ check-declare-verify
>>      (if (file-regular-p fnfile)
>>          (with-temp-buffer
>>            (insert-file-contents fnfile)
>> +          (unless cflag
>> +            ;; If in Elisp, ensure syntax and shorthands available
>> +            (set-syntax-table emacs-lisp-mode-syntax-table)
>> +            (let (enable-local-variables) (hack-local-variables)))
>>            ;; defsubst's don't _have_ to be known at compile time.
>> -          (setq re (format (if cflag
>> -                               "^[ \t]*\\(DEFUN\\)[ \t]*([ \t]*\"%s\""
>> +          (setq re (if cflag
>> +                       (format "^[ \t]*\\(DEFUN\\)[ \t]*([ \t]*\"%s\""
>> +                               (regexp-opt (mapcar 'cadr fnlist) t))
>>                       "^[ \t]*(\\(fset[ \t]+'\\|\
>>  cl-def\\(?:generic\\|method\\|un\\)\\|\
>>  def\\(?:un\\|subst\\|foo\\|method\\|class\\|\
>>  ine-\\(?:derived\\|generic\\|\\(?:global\\(?:ized\\)?-\\)?minor\\)-mode\\|\
>>  \\(?:ine-obsolete-function-\\)?alias[ \t]+'\\|\
>>  ine-overloadable-function\\)\\)\
>> -[ \t]*%s\\([ \t;]+\\|$\\)")
>> -                           (regexp-opt (mapcar 'cadr fnlist) t)))
>> +[ \t]*\\(\\(?:\\sw\\|\\s_\\)+\\)\\([ \t;]+\\|$\\)"))
>
> Would you explain what this regexp is intended to match?
>
>>            (while (re-search-forward re nil t)
>>              (skip-chars-forward " \t\n")
>> -            (setq fn (match-string 2)
>> -                  type (match-string 1)
>> +            (setq fn (symbol-name (car (read-from-string (match-string 2)))))
>> +            (when (member fn (mapcar 'cadr fnlist))
>> +              (setq type (match-string 1)
>>                      ;; (min . max) for a fixed number of arguments, or
>>                      ;; arglists with optional elements.
>>                      ;; (min) for arglists with &rest.
>> @@ -202,7 +207,7 @@ check-declare-verify
>>                                (t
>>                                 'err))
>>                      ;; alist of functions and arglist signatures.
>> -                  siglist (cons (cons fn sig) siglist)))))
>> +                    siglist (cons (cons fn sig) siglist))))))
>>      (dolist (e fnlist)
>>        (setq arglist (nth 2 e)
>>              type
>
> On my machine, this patch removes some of the check-declare "function
> not found" errors, but not all.  For example, with hyperdrive-lib.el:
>
> (check-declare-file "~/.local/src/hyperdrive.el/hyperdrive-lib.el")
>
> Before this patch, the "*Check Declarations Warnings*" buffer shows:
>
> --8<---------------cut here---------------start------------->8---
>  ■ hyperdrive-lib.el:44:Warning (check-declare): said ‘h/mode’ was defined in
>     ../../../.emacs.d/elpa/hyperdrive/hyperdrive.el: function not found
>  ■ hyperdrive-lib.el:508:Warning (check-declare): said ‘h/history’ was defined
>     in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-history.el: function not
>     found
>  ■ hyperdrive-lib.el:1283:Warning (check-declare): said ‘h/org--link-goto’ was
>     defined in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-org.el: function
>     not found
>  ■ hyperdrive-lib.el:45:Warning (check-declare): said ‘h/dir-mode’ was defined
>     in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-dir.el: function not found
>  ■ hyperdrive-lib.el:1069:Warning (check-declare): said
>     ‘h/dir--entry-at-point’ was defined in
>     ../../../.emacs.d/elpa/hyperdrive/hyperdrive-dir.el: function not found
>  ■ hyperdrive-lib.el:1332:Warning (check-declare): said ‘h/dir-handler’ was
>     defined in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-dir.el: function
>     not found
> --8<---------------cut here---------------end--------------->8---
>
>
> and after your patch:
>
> --8<---------------cut here---------------start------------->8---
>  ■ hyperdrive-lib.el:44:Warning (check-declare): said ‘h/mode’ was defined in
>     ../../../.emacs.d/elpa/hyperdrive/hyperdrive.el: function not found
>  ■ hyperdrive-lib.el:508:Warning (check-declare): said ‘h/history’ was defined
>     in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-history.el: function not
>     found
>  ■ hyperdrive-lib.el:1332:Warning (check-declare): said ‘h/dir-handler’ was
>     defined in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-dir.el: function
>     not found
> --8<---------------cut here---------------end--------------->8---
>
> Are you able to reproduce this on your machine?
>
>> diff --git a/lisp/emacs-lisp/loaddefs-gen.el b/lisp/emacs-lisp/loaddefs-gen.el
>> index 04bea4723a2..e8093200bec 100644
>> --- a/lisp/emacs-lisp/loaddefs-gen.el
>> +++ b/lisp/emacs-lisp/loaddefs-gen.el
>> @@ -378,6 +378,7 @@ loaddefs-generate--parse-file
>>    (let ((defs nil)
>>          (load-name (loaddefs-generate--file-load-name file main-outfile))
>>          (compute-prefixes t)
>> +        read-symbol-shorthands
>>          local-outfile inhibit-autoloads)
>>      (with-temp-buffer
>>        (insert-file-contents file)
>> @@ -399,7 +400,19 @@ loaddefs-generate--parse-file
>>              (setq inhibit-autoloads (read (current-buffer)))))
>>          (save-excursion
>>            (when (re-search-forward "autoload-compute-prefixes: *" nil t)
>> -            (setq compute-prefixes (read (current-buffer))))))
>> +            (setq compute-prefixes (read (current-buffer)))))
>> +        (save-excursion
>> +          ;; since we're "open-coding" we have to repeat more
>> +          ;; complicated logic in `hack-local-variables'.
>> +          (when (re-search-forward "read-symbol-shorthands: *" nil t)
>> +            (let* ((commentless (replace-regexp-in-string
>> +                                 "\n\\s-*;+" ""
>> +                                 (buffer-substring (point) (point-max))))
>> +                   (unsorted-shorthands (car (read-from-string commentless))))
>> +              (setq read-symbol-shorthands
>> +                    (sort unsorted-shorthands
>> +                          (lambda (sh1 sh2)
>> +                            (> (length (car sh1)) (length (car sh2))))))))))
>
> IIUC, the intention here is to jump to a final "Local Variables"
> declaration at the end of the file, then remove ";;", then read in the
> uncommented value of `read-symbol-shorthands'.
>
> Since `read-from-string' just reads one expression, the above hunk works
> when there are more local variables after read-symbol-shorthands:
>
> ;; Local Variables:
> ;; read-symbol-shorthands: (("bc-" . "breadcrumb-"))
> ;; autoload-compute-prefixes: nil
> ;; End:
>
> But if the read-symbol-shorthands declaration comes at the top, as in...
>
> -*- read-symbol-shorthands: (("bc-" . "breadcrumb-")); -*-
>
> ...then this form will allocate two strings almost as long as the file.
>
> Here's an alternative hack attempting to uncomment and read the minimum:
>
> diff --git a/lisp/emacs-lisp/loaddefs-gen.el b/lisp/emacs-lisp/loaddefs-gen.el
> index e8093200bec..406e4b28f1f 100644
> --- a/lisp/emacs-lisp/loaddefs-gen.el
> +++ b/lisp/emacs-lisp/loaddefs-gen.el
> @@ -404,10 +404,13 @@ don't include."
>          (save-excursion
>            ;; since we're "open-coding" we have to repeat more
>            ;; complicated logic in `hack-local-variables'.
> -          (when (re-search-forward "read-symbol-shorthands: *" nil t)
> -            (let* ((commentless (replace-regexp-in-string
> +          (when-let ((beg
> +                      (re-search-forward "read-symbol-shorthands: *" nil t)))
> +            ;; `read-symbol-shorthands' alist ends with two parens.
> +            (let* ((end (re-search-forward ")[;\n\s]*)"))
> +                   (commentless (replace-regexp-in-string
>                                   "\n\\s-*;+" ""
> -                                 (buffer-substring (point) (point-max))))
> +                                 (buffer-substring beg end)))
>                     (unsorted-shorthands (car (read-from-string commentless))))
>                (setq read-symbol-shorthands
>                      (sort unsorted-shorthands
>
>>        ;; We always return the package version (even for pre-dumped
>>        ;; files).
>> @@ -486,7 +499,11 @@ loaddefs-generate--compute-prefixes
>>      (while (re-search-forward
>>              "^(\\(def[^ \t\n]+\\)[ \t\n]+['(]*\\([^' ()\"\n]+\\)[\n \t]" nil t)
>>        (unless (member (match-string 1) autoload-ignored-definitions)
>> -        (let ((name (match-string-no-properties 2)))
>> +        (let* ((name (match-string-no-properties 2))
>> +               ;; Consider `read-symbol-shorthands'.
>> +               (probe (let ((obarray (obarray-make)))
>> +                        (car (read-from-string name)))))
>> +          (setq name (symbol-name probe))
>>            (when (save-excursion
>>                    (goto-char (match-beginning 0))
>>                    (or (bobp)
>> diff --git a/lisp/emacs-lisp/shorthands.el b/lisp/emacs-lisp/shorthands.el
>> index b0665a55695..69b562e3c7e 100644
>> --- a/lisp/emacs-lisp/shorthands.el
>> +++ b/lisp/emacs-lisp/shorthands.el
>> @@ -52,38 +52,26 @@ elisp-shorthand-font-lock-face
>>    :version "28.1"
>>    :group 'font-lock-faces)
>>
>> -(defun shorthands--mismatch-from-end (str1 str2)
>> -  "Tell index of first mismatch in STR1 and STR2, from end.
>> -The index is a valid 0-based index on STR1.  Returns nil if STR1
>> -equals STR2.  Return 0 if STR1 is a suffix of STR2."
>> -  (cl-loop with l1 = (length str1) with l2 = (length str2)
>> -           for i from 1
>> -           for i1 = (- l1 i) for i2 = (- l2 i)
>> -           while (eq (aref str1 i1) (aref str2 i2))
>> -           if (zerop i2) return (if (zerop i1) nil i1)
>> -           if (zerop i1) return 0
>> -           finally (return i1)))
>> -
>>  (defun shorthands-font-lock-shorthands (limit)
>> +  "Font lock until LIMIT considering `read-symbol-shorthands'."
>>    (when read-symbol-shorthands
>>      (while (re-search-forward
>>              (concat "\\_<\\(" (rx lisp-mode-symbol) "\\)\\_>")
>>              limit t)
>>        (let* ((existing (get-text-property (match-beginning 1) 'face))
>> +             (print-name (match-string 1))
>>               (probe (and (not (memq existing '(font-lock-comment-face
>>                                                 font-lock-string-face)))
>> -                         (intern-soft (match-string 1))))
>> -             (sname (and probe (symbol-name probe)))
>> -             (mismatch (and sname (shorthands--mismatch-from-end
>> -                                   (match-string 1) sname)))
>> -             (guess (and mismatch (1+ mismatch))))
>> -        (when guess
>> -          (when (and (< guess (1- (length (match-string 1))))
>> -                     ;; In bug#67390 we allow other separators
>> -                     (eq (char-syntax (aref (match-string 1) guess)) ?_))
>> -            (setq guess (1+ guess)))
>> +                         (intern-soft print-name)))
>> +             (symbol-name (and probe (symbol-name probe)))
>> +             (prefix (and symbol-name
>> +                          (not (string-equal print-name symbol-name))
>> +                          (car (assoc print-name
>> +                                      read-symbol-shorthands
>> +                                      #'string-prefix-p)))))
>> +        (when prefix
>>            (add-face-text-property (match-beginning 1)
>> -                                  (+ (match-beginning 1) guess)
>> +                                  (+ (match-beginning 1) (length prefix))
>>                                    'elisp-shorthand-font-lock-face))))))
>
> Works well.  let-binding `symbol-name' and `print-name' is good improvement.
>
>>  (font-lock-add-keywords 'emacs-lisp-mode
>> '((shorthands-font-lock-shorthands)) t)
>> diff --git a/lisp/files.el b/lisp/files.el
>> index 1cdcec23b11..b266d0727ec 100644
>> --- a/lisp/files.el
>> +++ b/lisp/files.el
>> @@ -3735,7 +3735,8 @@ before-hack-local-variables-hook
>>  This hook is called only if there is at least one file-local
>>  variable to set.")
>>
>> -(defvar permanently-enabled-local-variables '(lexical-binding)
>> +(defvar permanently-enabled-local-variables
>> +  '(lexical-binding read-symbol-shorthands)
>>    "A list of file-local variables that are always enabled.
>>  This overrides any `enable-local-variables' setting.")
>>
>> @@ -4171,6 +4172,13 @@ hack-local-variables--find-variables
>>                                  ;; to use 'thisbuf's name in the
>>                                  ;; warning message.
>>                                  (or (buffer-file-name thisbuf) ""))))))
>> +                          ((eq var 'read-symbol-shorthands)
>> +                           ;; Sort automatically by shorthand length
>> +                           ;; descending
>> +                           (setq val (sort val
>> +                                           (lambda (sh1 sh2) (>
>> (length (car sh1))
>> +
>> (length (car sh2))))))
>> +                           (push (cons 'read-symbol-shorthands val) result))
>>                            ((and (eq var 'mode) handle-mode))
>>     (t
>>      (ignore-errors
>
> Good catch.  I agree that longer shorthands should be applied first.
>
> -----
>
> A couple typo nits on the commit message of "Improve
> shorthands-font-lock-shorthands (bug#67390)":
>
> -  h//thingy     ; hilits "//"       reads to 'hyperdrive--thingy'
> +  h//thingy     ; hilits "h//"       reads to 'hyperdrive--thingy'
>
> -  Co-authored-by: João Távora <joaotavora <at> gmail.com>
> +  Co-authored-by: Joseph Turner <joseph <at> breatheoutbreathe.in>
>
>
> Thank you!
>
> Joseph

Ping!




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#67390; Package emacs. (Sat, 03 Feb 2024 14:51:02 GMT) Full text and rfc822 format available.

Message #71 received at 67390 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Joseph Turner <joseph <at> ushin.org>
Cc: 67390 <at> debbugs.gnu.org, Adam Porter <adam <at> alphapapa.net>,
 Eli Zaretskii <eliz <at> gnu.org>, Jonas Bernoulli <jonas <at> bernoul.li>
Subject: Re: bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand
 uses same separator
Date: Sat, 3 Feb 2024 14:50:01 +0000
Sorry, this flew under the radar.  I thought I had already pushed to master
but didn't.  So I went through the commits again, addressed your concerns, and
applied your suggestions.  Pushed to master now.



On Sat, Feb 3, 2024 at 7:10 AM Joseph Turner <joseph <at> ushin.org> wrote:


>> -[ \t]*%s\\([ \t;]+\\|$\\)")
>> -                           (regexp-opt (mapcar 'cadr fnlist) t)))
>> +[ \t]*\\(\\(?:\\sw\\|\\s_\\)+\\)\\([ \t;]+\\|$\\)"))
>
> Would you explain what this regexp is intended to match?

A very complicated one, right?  Well ask the author, but I think it's intended
to find many definition-like forms.  No idea why this is  done with regexps and
not with 'read' as it is a classical parsing pitfall in the long run.
Maybe there
was a reason.

Anyway, I just added a bit of logic so that it considers
read-symbol-shorthands if
there are any.

> Are you able to reproduce this on your machine?

Yes, and I fixed it.

> ...then this form will allocate two strings almost as long as the file.
>
> Here's an alternative hack attempting to uncomment and read the minimum:

Thanks, I think that's a good idea and I added a commit in your name.

> A couple typo nits on the commit message of "Improve
> shorthands-font-lock-shorthands (bug#67390)":
>
> -  h//thingy     ; hilits "//"       reads to 'hyperdrive--thingy'
> +  h//thingy     ; hilits "h//"       reads to 'hyperdrive--thingy'
>
> -  Co-authored-by: João Távora <joaotavora <at> gmail.com>
> +  Co-authored-by: Joseph Turner <joseph <at> breatheoutbreathe.in>

I fixed these, too.  If you succesfully test this, I think we can close this bug
(and the other ones, too).

João




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#67390; Package emacs. (Sat, 03 Feb 2024 20:03:01 GMT) Full text and rfc822 format available.

Message #74 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Joseph Turner <joseph <at> breatheoutbreathe.in>
To: João Távora <joaotavora <at> gmail.com>
Cc: 67390 <at> debbugs.gnu.org, Adam Porter <adam <at> alphapapa.net>,
 Eli Zaretskii <eliz <at> gnu.org>, Jonas Bernoulli <jonas <at> bernoul.li>,
 bug-gnu-emacs <at> gnu.org
Subject: Re: bug#67390: 28; shorthands-font-lock-shorthands assumes
 shorthand uses same separator
Date: Sat, 03 Feb 2024 11:43:03 -0800
[Message part 1 (text/plain, inline)]
João Távora <joaotavora <at> gmail.com> writes:

> Sorry, this flew under the radar.  I thought I had already pushed to master
> but didn't.  So I went through the commits again, addressed your concerns, and
> applied your suggestions.  Pushed to master now.

Thank you!

> On Sat, Feb 3, 2024 at 7:10 AM Joseph Turner <joseph <at> ushin.org> wrote:
>
>
>>> -[ \t]*%s\\([ \t;]+\\|$\\)")
>>> -                           (regexp-opt (mapcar 'cadr fnlist) t)))
>>> +[ \t]*\\(\\(?:\\sw\\|\\s_\\)+\\)\\([ \t;]+\\|$\\)"))
>>
>> Would you explain what this regexp is intended to match?
>
> A very complicated one, right?  Well ask the author, but I think it's intended
> to find many definition-like forms.  No idea why this is  done with regexps and
> not with 'read' as it is a classical parsing pitfall in the long run.
> Maybe there
> was a reason.
>
> Anyway, I just added a bit of logic so that it considers
> read-symbol-shorthands if
> there are any.

That makes sense.

>> Are you able to reproduce this on your machine?
>
> Yes, and I fixed it.
>
>> ...then this form will allocate two strings almost as long as the file.
>>
>> Here's an alternative hack attempting to uncomment and read the minimum:
>
> Thanks, I think that's a good idea and I added a commit in your name.

Thanks!

>> A couple typo nits on the commit message of "Improve
>> shorthands-font-lock-shorthands (bug#67390)":
>>
>> -  h//thingy     ; hilits "//"       reads to 'hyperdrive--thingy'
>> +  h//thingy     ; hilits "h//"       reads to 'hyperdrive--thingy'
>>
>> -  Co-authored-by: João Távora <joaotavora <at> gmail.com>
>> +  Co-authored-by: Joseph Turner <joseph <at> breatheoutbreathe.in>
>
> I fixed these, too.  If you succesfully test this, I think we can close this bug
> (and the other ones, too).

I'm still reproducing the check-declare bug on my machine. It appears
that binding `enable-local-variables' to nil around the call to
`hack-local-variables' means that `read-symbol-shorthands' is not set.
Can we bind `enable-local-variables' to `:safe' instead?

(let (
      ;; (enable-local-variables t)   ; works
      ;; (enable-local-variables)     ; doesn't work
      (enable-local-variables :safe)  ; works
      )
  (with-temp-buffer
    (insert-file-contents "~/.local/src/hyperdrive.el/hyperdrive-lib.el")
    (hack-local-variables)
    read-symbol-shorthands))

See attached patch.

There's no way to hack just a single file- or dir-local variable, right?

Thank you!

Joseph

[0001-Bind-enable-local-variables-to-safe-for-shorthands-b.patch (text/x-diff, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#67390; Package emacs. (Sat, 03 Feb 2024 20:03:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#67390; Package emacs. (Sat, 03 Feb 2024 22:26:02 GMT) Full text and rfc822 format available.

Message #80 received at 67390 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Joseph Turner <joseph <at> breatheoutbreathe.in>
Cc: 67390 <at> debbugs.gnu.org, Adam Porter <adam <at> alphapapa.net>,
 Eli Zaretskii <eliz <at> gnu.org>, Jonas Bernoulli <jonas <at> bernoul.li>,
 bug-gnu-emacs <at> gnu.org
Subject: Re: bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand
 uses same separator
Date: Sat, 3 Feb 2024 22:25:26 +0000
On Sat, Feb 3, 2024 at 8:01 PM Joseph Turner
<joseph <at> breatheoutbreathe.in> wrote:

> I'm still reproducing the check-declare bug on my machine. It appears
> that binding `enable-local-variables' to nil around the call to
> `hack-local-variables' means that `read-symbol-shorthands' is not set.
> Can we bind `enable-local-variables' to `:safe' instead?

It could be some bootstrapping issue, since the safe spec of that particular
variable itself needs to be autoloaded.  I vaguely remember something like
this and I _think_ it was fixed.

Anyway, I can't reproduce this with this test:

   src/emacs -Q --batch --eval '(check-declare-file
"~/tmp/hyperdrive.el/hyperdrive-lib.el")'

where ~/tmp/hyperdrive.el is a checkout of your hyperdrive library.

This doesn't output anything, which I think is the expected result.

How are you testing?

João




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#67390; Package emacs. (Sat, 03 Feb 2024 22:26:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#67390; Package emacs. (Sat, 03 Feb 2024 23:54:01 GMT) Full text and rfc822 format available.

Message #86 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Joseph Turner <joseph <at> breatheoutbreathe.in>
To: João Távora <joaotavora <at> gmail.com>
Cc: 67390 <at> debbugs.gnu.org, Adam Porter <adam <at> alphapapa.net>,
 Eli Zaretskii <eliz <at> gnu.org>, Jonas Bernoulli <jonas <at> bernoul.li>,
 bug-gnu-emacs <at> gnu.org
Subject: Re: bug#67390: 28; shorthands-font-lock-shorthands assumes
 shorthand uses same separator
Date: Sat, 03 Feb 2024 15:48:18 -0800
João Távora <joaotavora <at> gmail.com> writes:

> On Sat, Feb 3, 2024 at 8:01 PM Joseph Turner
> <joseph <at> breatheoutbreathe.in> wrote:
>
>> I'm still reproducing the check-declare bug on my machine. It appears
>> that binding `enable-local-variables' to nil around the call to
>> `hack-local-variables' means that `read-symbol-shorthands' is not set.
>> Can we bind `enable-local-variables' to `:safe' instead?
>
> It could be some bootstrapping issue, since the safe spec of that particular
> variable itself needs to be autoloaded.  I vaguely remember something like
> this and I _think_ it was fixed.
>
> Anyway, I can't reproduce this with this test:
>
>    src/emacs -Q --batch --eval '(check-declare-file
> "~/tmp/hyperdrive.el/hyperdrive-lib.el")'
>
> where ~/tmp/hyperdrive.el is a checkout of your hyperdrive library.
>
> This doesn't output anything, which I think is the expected result.
>
> How are you testing?

Hmm... I just compiled from master with

./configure --with-x-toolkit=no --with-xpm=ifavailable --with-jpeg=ifavailable --with-gif=ifavailable --with-tiff=ifavailable --with-gnutls=ifavailable && make

then I ran

src/emacs -Q --batch --eval '(check-declare-file "~/.local/src/hyperdrive.el/hyperdrive-lib.el")'

which produced

uncompressing textsec-check.el.gz...
uncompressing textsec-check.el.gz...done
../hyperdrive.el/hyperdrive-lib.el:44:Warning (check-declare): said ‘h/mode’ was defined in ../hyperdrive.el/hyperdrive.el: function not found
../hyperdrive.el/hyperdrive-lib.el:508:Warning (check-declare): said ‘h/history’ was defined in ../hyperdrive.el/hyperdrive-history.el: function not found
../hyperdrive.el/hyperdrive-lib.el:1332:Warning (check-declare): said ‘h/dir-handler’ was defined in ../hyperdrive.el/hyperdrive-dir.el: function not found

Would someone else kindly attempt to reproduce the issue?

Thanks!

Joseph




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#67390; Package emacs. (Sat, 03 Feb 2024 23:54:02 GMT) Full text and rfc822 format available.

This bug report was last modified 90 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.