GNU bug report logs - #68246
30.0.50; Add non-TS mode as extra parent of TS modes

Package: emacs;

Reported by: Stefan Monnier <monnier <at> iro.umontreal.ca>

Date: Thu, 4 Jan 2024 22:12:01 UTC

Severity: wishlist

Found in version 30.0.50

Done: Stefan Monnier <monnier <at> iro.umontreal.ca>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 68246 in the body.
You can then email your comments to 68246 AT debbugs.gnu.org in the normal way.


Report forwarded to monnier <at> iro.umontreal.ca, bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Thu, 04 Jan 2024 22:12:01 GMT) Full text and rfc822 format available.

Acknowledgement sent to Stefan Monnier <monnier <at> iro.umontreal.ca>:
New bug report received and forwarded. Copy sent to monnier <at> iro.umontreal.ca, bug-gnu-emacs <at> gnu.org. (Thu, 04 Jan 2024 22:12:01 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: bug-gnu-emacs <at> gnu.org
Subject: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Thu, 04 Jan 2024 17:11:14 -0500
[Message part 1 (text/plain, inline)]
Package: Emacs
Version: 30.0.50


Many packages use the `major-mode` as a proxy for the type of content
in the buffer.  When using the new TS modes, these packages tend to
behave poorly because they do not understand that a buffer in `js-ts-mode`
contains Javascript.

I suggest we add the non-TS mode as an extra parent, so
`derived-mode-all-parents` includes `js-mode` in `js-ts-mode`.


        Stefan
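
The effect of the proposal can be sketched with two throwaway modes. This is only an illustration, assuming Emacs 30 or later (where `derived-mode-add-parents` and `derived-mode-all-parents` exist); `demo-mode`, `demo-ts-mode` and `demo-buffer-holds-demo-code-p` are hypothetical names that appear nowhere in the patch.

```elisp
;; Sketch only: `demo-mode' and `demo-ts-mode' are hypothetical modes.
;; `derived-mode-add-parents' and `derived-mode-all-parents' need Emacs 30+.
(define-derived-mode demo-mode prog-mode "Demo"
  "Hypothetical traditional major mode.")

(define-derived-mode demo-ts-mode prog-mode "Demo[ts]"
  "Hypothetical tree-sitter counterpart; it does not derive from `demo-mode'.")

;; Declare `demo-mode' as an extra parent, as the patch does for each TS mode.
(derived-mode-add-parents 'demo-ts-mode '(demo-mode))

(defun demo-buffer-holds-demo-code-p (mode)
  "Return t if MODE is `demo-mode' or lists it among its parents."
  (and (memq 'demo-mode (derived-mode-all-parents mode)) t))
```

A package that uses the major mode as a proxy for the buffer's content could then ask `(demo-buffer-holds-demo-code-p major-mode)` and get the same answer in both modes.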
[ts-parents.patch (text/x-diff, inline)]
diff --git a/lisp/progmodes/c-ts-mode.el b/lisp/progmodes/c-ts-mode.el
index e5835bdb62d..461218cbb7d 100644
--- a/lisp/progmodes/c-ts-mode.el
+++ b/lisp/progmodes/c-ts-mode.el
@@ -1314,6 +1314,8 @@ c-ts-mode
                   (lambda (_pos) 'c))
       (treesit-font-lock-recompute-features '(emacs-devel)))))
 
+(derived-mode-add-parents 'c-ts-mode '(c-mode))
+
 ;;;###autoload
 (define-derived-mode c++-ts-mode c-ts-base-mode "C++"
   "Major mode for editing C++, powered by tree-sitter.
@@ -1357,6 +1359,8 @@ c++-ts-mode
       (setq-local add-log-current-defun-function
                   #'c-ts-mode--emacs-current-defun-name))))
 
+(derived-mode-add-parents 'c++-ts-mode '(c++-mode))
+
 (easy-menu-define c-ts-mode-menu (list c-ts-mode-map c++-ts-mode-map)
   "Menu for `c-ts-mode' and `c++-ts-mode'."
   '("C/C++"
diff --git a/lisp/progmodes/cmake-ts-mode.el b/lisp/progmodes/cmake-ts-mode.el
index d933e4ebb81..2a185fb0aa2 100644
--- a/lisp/progmodes/cmake-ts-mode.el
+++ b/lisp/progmodes/cmake-ts-mode.el
@@ -261,6 +261,8 @@ cmake-ts-mode
 
     (treesit-major-mode-setup)))
 
+(derived-mode-add-parents 'cmake-ts-mode '(cmake-mode))
+
 (if (treesit-ready-p 'cmake)
     (add-to-list 'auto-mode-alist
                  '("\\(?:CMakeLists\\.txt\\|\\.cmake\\)\\'" . cmake-ts-mode)))
diff --git a/lisp/progmodes/csharp-mode.el b/lisp/progmodes/csharp-mode.el
index 7bf57bcbe21..18114d08528 100644
--- a/lisp/progmodes/csharp-mode.el
+++ b/lisp/progmodes/csharp-mode.el
@@ -998,6 +998,8 @@ csharp-ts-mode
 
   (add-to-list 'auto-mode-alist '("\\.cs\\'" . csharp-ts-mode)))
 
+(derived-mode-add-parents 'csharp-ts-mode '(csharp-mode))
+
 (provide 'csharp-mode)
 
 ;;; csharp-mode.el ends here
diff --git a/lisp/progmodes/dockerfile-ts-mode.el b/lisp/progmodes/dockerfile-ts-mode.el
index 334f3064d98..618082cfe7a 100644
--- a/lisp/progmodes/dockerfile-ts-mode.el
+++ b/lisp/progmodes/dockerfile-ts-mode.el
@@ -190,6 +190,8 @@ dockerfile-ts-mode
 
     (treesit-major-mode-setup)))
 
+(derived-mode-add-parents 'dockerfile-ts-mode '(dockerfile-mode))
+
 (if (treesit-ready-p 'dockerfile)
     (add-to-list 'auto-mode-alist
                  ;; NOTE: We can't use `rx' here, as it breaks bootstrap.
diff --git a/lisp/progmodes/elixir-ts-mode.el b/lisp/progmodes/elixir-ts-mode.el
index b493195eedd..9a819f5df0c 100644
--- a/lisp/progmodes/elixir-ts-mode.el
+++ b/lisp/progmodes/elixir-ts-mode.el
@@ -745,6 +745,8 @@ elixir-ts-mode
     (treesit-major-mode-setup)
     (setq-local syntax-propertize-function #'elixir-ts--syntax-propertize)))
 
+(derived-mode-add-parents 'elixir-ts-mode '(elixir-mode))
+
 (if (treesit-ready-p 'elixir)
     (progn
       (add-to-list 'auto-mode-alist '("\\.elixir\\'" . elixir-ts-mode))
diff --git a/lisp/progmodes/go-ts-mode.el b/lisp/progmodes/go-ts-mode.el
index 65adc1c55ea..e16459cd975 100644
--- a/lisp/progmodes/go-ts-mode.el
+++ b/lisp/progmodes/go-ts-mode.el
@@ -261,6 +261,8 @@ go-ts-mode
 
     (treesit-major-mode-setup)))
 
+(derived-mode-add-parents 'go-ts-mode '(go-mode))
+
 (if (treesit-ready-p 'go)
     (add-to-list 'auto-mode-alist '("\\.go\\'" . go-ts-mode)))
 
@@ -437,6 +439,9 @@ go-mod-ts-mode
 
     (treesit-major-mode-setup)))
 
+(derived-mode-add-parents 'go-mod-ts-mode '(go-mod-mode))
+
+
 (if (treesit-ready-p 'gomod)
     (add-to-list 'auto-mode-alist '("/go\\.mod\\'" . go-mod-ts-mode)))
 
diff --git a/lisp/progmodes/heex-ts-mode.el b/lisp/progmodes/heex-ts-mode.el
index 7b53a44deb2..702610bc1eb 100644
--- a/lisp/progmodes/heex-ts-mode.el
+++ b/lisp/progmodes/heex-ts-mode.el
@@ -177,6 +177,8 @@ heex-ts-mode
 
     (treesit-major-mode-setup)))
 
+(derived-mode-add-parents 'heex-ts-mode '(heex-mode))
+
 (if (treesit-ready-p 'heex)
     ;; Both .heex and the deprecated .leex files should work
     ;; with the tree-sitter-heex grammar.
diff --git a/lisp/progmodes/java-ts-mode.el b/lisp/progmodes/java-ts-mode.el
index 0b1ac49b99f..51e0eeef79a 100644
--- a/lisp/progmodes/java-ts-mode.el
+++ b/lisp/progmodes/java-ts-mode.el
@@ -401,6 +401,8 @@ java-ts-mode
                 ("Method" "\\`method_declaration\\'" nil nil)))
   (treesit-major-mode-setup))
 
+(derived-mode-add-parents 'java-ts-mode '(java-mode))
+
 (if (treesit-ready-p 'java)
     (add-to-list 'auto-mode-alist '("\\.java\\'" . java-ts-mode)))
 
diff --git a/lisp/progmodes/js.el b/lisp/progmodes/js.el
index 0115feb0e97..2420bdde50a 100644
--- a/lisp/progmodes/js.el
+++ b/lisp/progmodes/js.el
@@ -3898,6 +3898,8 @@ js-ts-mode
     (add-to-list 'auto-mode-alist
                  '("\\(\\.js[mx]\\|\\.har\\)\\'" . js-ts-mode))))
 
+(derived-mode-add-parents 'js-ts-mode '(js-mode))
+
 (defvar js-ts--s-p-query
   (when (treesit-available-p)
     (treesit-query-compile 'javascript
diff --git a/lisp/progmodes/json-ts-mode.el b/lisp/progmodes/json-ts-mode.el
index 32bc10bbda9..1fb96555010 100644
--- a/lisp/progmodes/json-ts-mode.el
+++ b/lisp/progmodes/json-ts-mode.el
@@ -164,6 +164,8 @@ json-ts-mode
 
   (treesit-major-mode-setup))
 
+(derived-mode-add-parents 'json-ts-mode '(json-mode))
+
 (if (treesit-ready-p 'json)
     (add-to-list 'auto-mode-alist
                  '("\\.json\\'" . json-ts-mode)))
diff --git a/lisp/progmodes/lua-ts-mode.el b/lisp/progmodes/lua-ts-mode.el
index 3b600f59521..e81f05ff3cb 100644
--- a/lisp/progmodes/lua-ts-mode.el
+++ b/lisp/progmodes/lua-ts-mode.el
@@ -757,6 +757,8 @@ lua-ts-mode
 
   (add-hook 'flymake-diagnostic-functions #'lua-ts-flymake-luacheck nil 'local))
 
+(derived-mode-add-parents 'lua-ts-mode '(lua-mode))
+
 (when (treesit-ready-p 'lua)
   (add-to-list 'auto-mode-alist '("\\.lua\\'" . lua-ts-mode)))
 
diff --git a/lisp/progmodes/python.el b/lisp/progmodes/python.el
index 1148da11a06..94a133b0688 100644
--- a/lisp/progmodes/python.el
+++ b/lisp/progmodes/python.el
@@ -6995,6 +6995,8 @@ python-ts-mode
     (add-to-list 'auto-mode-alist '("\\.py[iw]?\\'" . python-ts-mode))
     (add-to-list 'interpreter-mode-alist '("python[0-9.]*" . python-ts-mode))))
 
+(derived-mode-add-parents 'python-ts-mode '(python-mode))
+
 ;;; Completion predicates for M-x
 ;; Commands that only make sense when editing Python code.
 (dolist (sym '(python-add-import
diff --git a/lisp/progmodes/ruby-ts-mode.el b/lisp/progmodes/ruby-ts-mode.el
index 598eaa461ff..7282d43e091 100644
--- a/lisp/progmodes/ruby-ts-mode.el
+++ b/lisp/progmodes/ruby-ts-mode.el
@@ -1196,6 +1196,8 @@ ruby-ts-mode
 
   (setq-local syntax-propertize-function #'ruby-ts--syntax-propertize))
 
+(derived-mode-add-parents 'ruby-ts-mode '(ruby-mode))
+
 (if (treesit-ready-p 'ruby)
     ;; Copied from ruby-mode.el.
     (add-to-list 'auto-mode-alist
diff --git a/lisp/progmodes/rust-ts-mode.el b/lisp/progmodes/rust-ts-mode.el
index c5fc57cc374..c67ac43e4d0 100644
--- a/lisp/progmodes/rust-ts-mode.el
+++ b/lisp/progmodes/rust-ts-mode.el
@@ -474,6 +474,8 @@ rust-ts-mode
 
     (treesit-major-mode-setup)))
 
+(derived-mode-add-parents 'rust-ts-mode '(rust-mode))
+
 (if (treesit-ready-p 'rust)
     (add-to-list 'auto-mode-alist '("\\.rs\\'" . rust-ts-mode)))
 
diff --git a/lisp/progmodes/sh-script.el b/lisp/progmodes/sh-script.el
index 0562415b4e5..e7e08fba1c9 100644
--- a/lisp/progmodes/sh-script.el
+++ b/lisp/progmodes/sh-script.el
@@ -1638,6 +1638,8 @@ bash-ts-mode
     (setq-local treesit-defun-type-regexp "function_definition")
     (treesit-major-mode-setup)))
 
+(derived-mode-add-parents 'bash-ts-mode '(sh-mode))
+
 (advice-add 'bash-ts-mode :around #'sh--redirect-bash-ts-mode
             ;; Give it lower precedence than normal advice, so other
             ;; advices take precedence over it.
diff --git a/lisp/progmodes/typescript-ts-mode.el b/lisp/progmodes/typescript-ts-mode.el
index e9c6afff440..83a3baaf5ef 100644
--- a/lisp/progmodes/typescript-ts-mode.el
+++ b/lisp/progmodes/typescript-ts-mode.el
@@ -491,6 +491,8 @@ typescript-ts-mode
 
     (treesit-major-mode-setup)))
 
+(derived-mode-add-parents 'typescript-ts-mode '(typescript-mode))
+
 (if (treesit-ready-p 'typescript)
     (add-to-list 'auto-mode-alist '("\\.ts\\'" . typescript-ts-mode)))
 
@@ -548,6 +550,8 @@ tsx-ts-mode
 
     (treesit-major-mode-setup)))
 
+(derived-mode-add-parents 'tsx-ts-mode '(tsx-mode))
+
 (defvar typescript-ts--s-p-query
   (when (treesit-available-p)
     (treesit-query-compile 'typescript
diff --git a/lisp/textmodes/css-mode.el b/lisp/textmodes/css-mode.el
index 425f3ec8a30..f5a20e0ca0e 100644
--- a/lisp/textmodes/css-mode.el
+++ b/lisp/textmodes/css-mode.el
@@ -1830,6 +1830,8 @@ css-ts-mode
 
     (add-to-list 'auto-mode-alist '("\\.css\\'" . css-ts-mode))))
 
+(derived-mode-add-parents 'css-ts-mode '(css-mode))
+
 ;;;###autoload
 (define-derived-mode css-mode css-base-mode "CSS"
   "Major mode to edit Cascading Style Sheets (CSS).
diff --git a/lisp/textmodes/html-ts-mode.el b/lisp/textmodes/html-ts-mode.el
index 301f3e8791c..bf6c1307e96 100644
--- a/lisp/textmodes/html-ts-mode.el
+++ b/lisp/textmodes/html-ts-mode.el
@@ -123,6 +123,8 @@ html-ts-mode
               '(("Element" "\\`tag_name\\'" nil nil)))
   (treesit-major-mode-setup))
 
+(derived-mode-add-parents 'html-ts-mode '(html-mode))
+
 (if (treesit-ready-p 'html)
     (add-to-list 'auto-mode-alist '("\\.html\\'" . html-ts-mode)))
 
diff --git a/lisp/textmodes/toml-ts-mode.el b/lisp/textmodes/toml-ts-mode.el
index 1ba410045f5..1b621032f8a 100644
--- a/lisp/textmodes/toml-ts-mode.el
+++ b/lisp/textmodes/toml-ts-mode.el
@@ -153,6 +153,8 @@ toml-ts-mode
 
     (treesit-major-mode-setup)))
 
+(derived-mode-add-parents 'toml-ts-mode '(toml-mode))
+
 (if (treesit-ready-p 'toml)
     (add-to-list 'auto-mode-alist '("\\.toml\\'" . toml-ts-mode)))
 
diff --git a/lisp/textmodes/yaml-ts-mode.el b/lisp/textmodes/yaml-ts-mode.el
index 2b57b384300..854ce3ad456 100644
--- a/lisp/textmodes/yaml-ts-mode.el
+++ b/lisp/textmodes/yaml-ts-mode.el
@@ -143,6 +143,8 @@ yaml-ts-mode
 
     (treesit-major-mode-setup)))
 
+(derived-mode-add-parents 'yaml-ts-mode '(yaml-mode))
+
 (if (treesit-ready-p 'yaml)
     (add-to-list 'auto-mode-alist '("\\.ya?ml\\'" . yaml-ts-mode)))
 

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Thu, 04 Jan 2024 23:03:01 GMT) Full text and rfc822 format available.

Message #8 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 68246 <at> debbugs.gnu.org
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Thu, 4 Jan 2024 23:02:19 +0000
On Thu, Jan 4, 2024 at 10:12 PM Stefan Monnier via Bug reports for GNU
Emacs, the Swiss army knife of text editors <bug-gnu-emacs <at> gnu.org>
wrote:
>
> Package: Emacs
> Version: 30.0.50
>
>
> Many packages use the `major-mode` as a proxy for the type of content
> in the buffer.  When using the new TS modes, these packages tend to
> behave poorly because they do not understand that a buffer in `js-ts-mode`
> contains Javascript.
>
> I suggest we add the non-TS mode as an extra parent, so
> `derived-mode-all-parents` includes `js-mode` in `js-ts-mode`.


Hmmm, this would either mean stupendous and welcome
simplification in eglot-server-programs or horrible breakage
there.  Or maybe something in between :-)  Seems risky, so I
hope we give it adequate testing.  In a fair number of servers.
Can some Eglot user help me out here?

João




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Thu, 04 Jan 2024 23:06:02 GMT) Full text and rfc822 format available.

Message #11 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: João Távora <joaotavora <at> gmail.com>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 68246 <at> debbugs.gnu.org
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Fri, 5 Jan 2024 01:05:16 +0200
On 05/01/2024 01:02, João Távora wrote:
> Hmmm, this would either mean stupendous and welcome
> simplification in eglot-server-programs

Probably not, if you want to keep compatibility with ts modes in Emacs 29.1.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Thu, 04 Jan 2024 23:19:02 GMT) Full text and rfc822 format available.

Message #14 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: João Távora <joaotavora <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Thu, 04 Jan 2024 18:18:30 -0500
> Hmmm, this would either mean stupendous and welcome
> simplification in eglot-server-programs

In the short term, most affected packages (like YASnippet as well) won't
benefit very much because they still need to accommodate Emacs<30.

> or horrible breakage there.  Or maybe something in between :-)  Seems
> risky, so I hope we give it adequate testing.

My preliminary tests were encouraging, but it's just a small dot in the
vast space of possibilities, so yes, we need people to try it out.

[ Maybe we'll need to offer two kinds of `derived-mode-p` and
  `derived-mode-all-parents`: one which pays attention only to the "true
  inherited parents" and another that's more permissive and includes the 
  extra parents.  I'm crossing my fingers, hoping that it won't come to
  that.  ]


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Thu, 04 Jan 2024 23:42:02 GMT) Full text and rfc822 format available.

Message #17 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: 68246 <at> debbugs.gnu.org, Stefan Monnier <monnier <at> iro.umontreal.ca>
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Thu, 4 Jan 2024 23:41:21 +0000
On Thu, Jan 4, 2024 at 11:05 PM Dmitry Gutov <dmitry <at> gutov.dev> wrote:
>
> On 05/01/2024 01:02, João Távora wrote:
> > Hmmm, this would either mean stupendous and welcome
> > simplification in eglot-server-programs
>
> Probably not, if you want to keep compatibility with ts modes in Emacs 29.1.

Not a problem, could have my own derived-mode-all-parents thing
in the meantime.  Or could just copy the idea directly.  Or, yes, could
just wait.  The question remains tho: does this work every time?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Thu, 04 Jan 2024 23:50:01 GMT) Full text and rfc822 format available.

Message #20 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 68246 <at> debbugs.gnu.org
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Thu, 4 Jan 2024 23:48:48 +0000
On Thu, Jan 4, 2024 at 11:18 PM Stefan Monnier <monnier <at> iro.umontreal.ca> wrote:
>
> > Hmmm, this would either mean stupendous and welcome
> > simplification in eglot-server-programs
>
> In the short term, most affected packages (like YASnippet as well) won't
> benefit very much because they still need to accommodate Emacs<30.

But like I told Dmitry: if the idea is good, I guess the logic isn't
hard to implement as a package-specific hack, which is then removed
in the future.

I have to say that, practical advantages aside, I don't much fancy
this implicit derivation based on a name of a specific convention.
More than the implicit bit, it's that it only affects types or at
least would seem so.  Why can't we go to the ts modes we control
ourselves and  write in this derivation?  It's because of hookage
right? We _don't_ want  x-mode-hook to run when we activate
x-ts-mode.  Or do we?  Maybe we  do?  How exactly is inheritance
defined for major modes?  What properties are inherited?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Fri, 05 Jan 2024 00:00:02 GMT) Full text and rfc822 format available.

Message #23 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: João Távora <joaotavora <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Thu, 04 Jan 2024 18:59:16 -0500
> I have to say that, practical advantages aside, I don't much fancy
> this implicit derivation based on a name of a specific convention.

Not sure what you're referring to.

> Why can't we go to the ts modes we control ourselves and  write in
> this derivation?

Isn't this what my patch does?


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Fri, 05 Jan 2024 00:36:02 GMT) Full text and rfc822 format available.

Message #26 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 68246 <at> debbugs.gnu.org
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Fri, 5 Jan 2024 00:35:28 +0000
On Thu, Jan 4, 2024 at 11:59 PM Stefan Monnier <monnier <at> iro.umontreal.ca> wrote:
>
> > I have to say that, practical advantages aside, I don't much fancy
> > this implicit derivation based on a name of a specific convention.
>
> Not sure what you're referring to.

Doh.  Shame on me for not reading the patch.  For some
silly reason I just assumed it was some trick on symbol-name
inside derived-mode-all-parents.

> > Why can't we go to the ts modes we control ourselves and  write in
> > this derivation?
>
> Isn't this what my patch does?

Sorry.  Well in that case, that takes care of one misgiving :-)
I suppose one way to proceed would be to try it out for some
time and be ready to revert it, or parts of it?  Maybe announce
on emacs-devel?

João




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Fri, 05 Jan 2024 00:44:02 GMT) Full text and rfc822 format available.

Message #29 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Yuan Fu <casouri <at> gmail.com>
To: João Távora <joaotavora <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, Stefan Monnier <monnier <at> iro.umontreal.ca>
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Thu, 4 Jan 2024 16:43:37 -0800

> 
>>> Why can't we go to the ts modes we control ourselves and  write in
>>> this derivation?
>> 
>> Isn't this what my patch does?
> 
> Sorry.  Well in that case, that takes care of one misgiving :-)
> I suppose one way to proceed would be to try it out for some
> time and be ready to revert it, or parts of it?  Maybe announce
> on emacs-devel?

I’m all for it :-)

Yuan



Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Fri, 05 Jan 2024 07:41:02 GMT) Full text and rfc822 format available.

Message #32 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 68246 <at> debbugs.gnu.org
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Fri, 05 Jan 2024 09:40:17 +0200
> Cc: monnier <at> iro.umontreal.ca
> Date: Thu, 04 Jan 2024 17:11:14 -0500
> From:  Stefan Monnier via "Bug reports for GNU Emacs,
>  the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org>
> 
> Many packages use the `major-mode` as a proxy for the type of content
> in the buffer.  When using the new TS modes, these packages tend to
> behave poorly because they do not understand that a buffer in `js-ts-mode`
> contains Javascript.
> 
> I suggest we add the non-TS mode as an extra parent, so
> `derived-mode-all-parents` includes `js-mode` in `js-ts-mode`.

If it works well, it's a good simplification, IMO.

But this patch is IMO incomplete:

  . it should modify our .dir-locals.el and Eglot's database to remove
    special entries for TS modes
  . it should add the recommendation to consider using this to the
    "Major Mode Convention" node of the ELisp manual




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Fri, 05 Jan 2024 07:53:01 GMT) Full text and rfc822 format available.

Message #35 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: João Távora <joaotavora <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Fri, 05 Jan 2024 09:51:44 +0200
> Cc: 68246 <at> debbugs.gnu.org
> From: João Távora <joaotavora <at> gmail.com>
> Date: Thu, 4 Jan 2024 23:48:48 +0000
> 
> We _don't_ want  x-mode-hook to run when we activate
> x-ts-mode.  Or do we?

We don't, because FOO-mode-hook usually assumes all kinds of things
that are generally not true for FOO-ts-mode.
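
The hook question can be checked with a small sketch. It assumes Emacs 30 or later, and `foo-mode`, `foo-ts-mode`, `foo-hook-log` and `foo-hooks-run-by-ts-mode` are hypothetical names made up for the illustration. Since the body of `foo-ts-mode` never calls `foo-mode`, only its own hook is expected to run, even though `derived-mode-p` reports the extra parent.

```elisp
;; Sketch only: all `foo-' names below are hypothetical.  Needs Emacs 30+.
(defvar foo-hook-log nil
  "Mode hooks that ran, most recent first.")

(define-derived-mode foo-mode prog-mode "Foo"
  "Hypothetical traditional major mode.")

(define-derived-mode foo-ts-mode prog-mode "Foo[ts]"
  "Hypothetical tree-sitter counterpart of `foo-mode'.")

;; Extra parent: affects `derived-mode-p', not the hooks that are run.
(derived-mode-add-parents 'foo-ts-mode '(foo-mode))

(add-hook 'foo-mode-hook
          (lambda () (push 'foo-mode-hook foo-hook-log)))
(add-hook 'foo-ts-mode-hook
          (lambda () (push 'foo-ts-mode-hook foo-hook-log)))

(defun foo-hooks-run-by-ts-mode ()
  "Enable `foo-ts-mode' in a scratch buffer and report what happened."
  (setq foo-hook-log nil)
  (with-temp-buffer
    (foo-ts-mode)
    (list :derived (and (derived-mode-p '(foo-mode)) t)
          :hooks (reverse foo-hook-log))))
```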




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Fri, 05 Jan 2024 11:29:01 GMT) Full text and rfc822 format available.

Message #38 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 68246 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Fri, 5 Jan 2024 11:27:53 +0000
On Fri, Jan 5, 2024 at 7:51 AM Eli Zaretskii <eliz <at> gnu.org> wrote:
>
> > Cc: 68246 <at> debbugs.gnu.org
> > From: João Távora <joaotavora <at> gmail.com>
> > Date: Thu, 4 Jan 2024 23:48:48 +0000
> >
> > We _don't_ want  x-mode-hook to run when we activate
> > x-ts-mode.  Or do we?
>
> We don't, because FOO-mode-hook usually assumes all kinds of things
> that are generally not true for FOO-ts-mode.

Indeed.

Sorry for the long mail, here's a TL;DR: let's experiment,
but maybe tighten up the docs to state 'define-derived-mode' is
parenting and 'derived-mode-add-parents' is more like adoption.
Possibly add  'remove-derived-mode-parent' for safety and fixing
existing bugs.

Now I've read the patch and the misgivings are back.  It uses
`derived-mode-add-parents`, whereas by "adding the inheritance
ourselves", I was suggesting going to each `define-derived-mode`
of each 'foo-ts-mode' and really putting in 'foo-mode' as a parent.

So this is the multi-file macro-expanded moral equivalent of
the symbol-name hack I was talking about.

'd-m-add-parents' is a very new util in our tree, and I suppose
it has been discussed.  It's not really full inheritance as
given by normal parenting, it's more like "adult adoption" :-)

I guess we can try it, and even like it, but can we be sure
of all the semantic impacts?

If you ask me, authors should prioritize getting their hands dirty
and extract commonality for the foo-ts-mode and foo-mode one
by one.  Name such modes foo-base-mode or base-foo-mode maybe.

This is what was done for lisp-data-mode which now parents most
(all?) Lisp modes, so much that the other day I could write a
functional 2 line Clojure-mode based on lisp-data-mode.  The
situation in that camp is much cleaner now, and it wasn't
a very difficult change.

base-foo-mode is a natural place for setup that is common
to both foo-mode and foo-ts-mode to exist.  There is a
good number of things that are independent of the particular
implementation of parsing (lisp-based vs ts-based).

Is it too late for the ts-modes to be looked at like that?
It seems our approach to TS modes often/always looks like
'foo-rewrite-completely-using-ts-while-at-it-mode'.

Maybe for some modes this makes sense IMO, like C and C++ modes.

Inadequate parenting is a real problem.  There was the lisp-mode
example 2-3 years ago, but also recently the parenting in the
js-json-mode <-> js-mode relation has caused a so-far unsolved
problem in Eglot described in bug#67463.

That bug can also really only be solved by "getting hands
dirty" or by introducing some remove-derived-mode-parent
counterpart to the new derived-mode-add-parents.

If that's how we want to view "derived-mode-p" from now on.
Maybe it is.  But it should be well explained in the docs
of both define-derived-mode and derived-mode-p that you don't
need the former to get the latter and that d-d-mode bakes in
a much more powerful relation that derived-mode-add-parents
doesn't fully emulate.  And that remove-derived-mode-parent
can sever that part of the relation (and only that part).

> It should modify our .dir-locals.el and Eglot's database to
> remove special entries for TS modes

As Dmitry mentioned: if that is done just like that it will
break Eglot's support of ts modes in any Emacs which doesn't
have Stefan's patch.

But we could easily add some compatibility code to Eglot that
does the same thing as the patch in ad-hoc fashion, and then
remove that code (much later) on.

Also, I know this mail is long enough, but apropos Eglot's
database, it's getting quite large as you may notice.  Another,
much more natural way to simplify it would be if major modes
started setting a buffer-local eglot-server-program (singular)
variable which takes precedence over eglot-server-programs.

João
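
The base-mode idea above can be sketched as follows. All `bar-` names are hypothetical and the shared setup (a `comment-start` value) is only a placeholder for whatever is independent of the parsing backend. Because both modes are real children of `bar-base-mode`, its hook runs for each of them.

```elisp
;; Sketch only: every `bar-' name is hypothetical.
(defvar bar-setup-log nil
  "Major modes for which `bar-base-mode-hook' ran, most recent first.")

(define-derived-mode bar-base-mode prog-mode "Bar"
  "Hypothetical shared parent holding parser-independent setup."
  (setq-local comment-start "# "))

(define-derived-mode bar-mode bar-base-mode "Bar"
  "Hypothetical traditional mode (regexps, syntax tables).")

(define-derived-mode bar-ts-mode bar-base-mode "Bar[ts]"
  "Hypothetical tree-sitter mode.")

;; A single place for user preferences that apply to both variants.
(add-hook 'bar-base-mode-hook
          (lambda () (push major-mode bar-setup-log)))

(defun bar-modes-sharing-base-hook ()
  "Enable both modes in scratch buffers; return the modes the hook saw."
  (setq bar-setup-log nil)
  (dolist (mode '(bar-mode bar-ts-mode))
    (with-temp-buffer (funcall mode)))
  (reverse bar-setup-log))
```

Unlike an extra parent added after the fact, this is ordinary `define-derived-mode` inheritance, so hooks, keymaps and local variables set by the base mode are all shared.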




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Fri, 05 Jan 2024 13:27:02 GMT) Full text and rfc822 format available.

Message #41 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: João Távora <joaotavora <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Fri, 05 Jan 2024 15:26:00 +0200
> From: João Távora <joaotavora <at> gmail.com>
> Date: Fri, 5 Jan 2024 11:27:53 +0000
> Cc: monnier <at> iro.umontreal.ca, 68246 <at> debbugs.gnu.org
> 
> base-foo-mode is a natural place for setup that is common
> to both foo-mode and foo-ts-mode to exist.  There is a
> good number of things that are independent of the particular
> implementation of parsing (lisp-based vs ts-based).
> 
> Is it too late for the ts-modes to be looked at like that?

It doesn't always make sense.  Where it does make sense, I think we
did it (see python-ts-mode, for example).

> It seems our approach to TS modes often/always looks like
> 'foo-rewrite-completely-using-ts-while-at-it-mode'.

That's right -- but it's justified.  The commonality is usually either
very thin or almost non-existent.  If you think about it, you will
understand: where the traditional modes use regexps and syntax-ppss,
the TS modes use parser-related primitives.  How can you find common
grounds between these so different bases for implementation?

> > It should modify our .dir-locals.el and Eglot's database to
> > remove special entries for TS modes
> 
> As Dmitry mentioned: if that is done just like that it will
> break Eglot's support of ts modes in any Emacs which doesn't
> have Stefan's patch.

Then Eglot (or maybe compat.el) will have to provide compatibility
shims.  But for .dir-locals.el, I still think we should update it.

> Also, I know this mail is long enough, but apropos Eglot's
> database, it's getting quite large as you may notice.  Another,
> much more natural way to simplify it would be, if major-modes
> started setting eglot-server-program (singular) buffer-locally
> variable which takes precedence over eglot-server-programs.

Maybe.  But that's an unrelated issue.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Fri, 05 Jan 2024 15:17:01 GMT) Full text and rfc822 format available.

Message #44 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 68246 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Fri, 5 Jan 2024 15:16:30 +0000
On Fri, Jan 5, 2024 at 1:26 PM Eli Zaretskii <eliz <at> gnu.org> wrote:

> That's right -- but it's justified.  The commonality is usually either
> very thin or almost non-existent.  If you think about it, you will
> understand: where the traditional modes use regexps and syntax-ppss,
> the TS modes use parser-related primitives.

I _have_ thought about it.  And I started from evidence that
not everything a major mode dedicated to a language supplies
is directly related to the parser implementation. Many things
are, but not all.  So reimplementing a full major mode just
for changing the parser backend might not make sense.

> How can you find common
> grounds between these so different bases for implementation?

Very easily, I think.  Stefan's patch is one such example.
What is it fixing?  Tools that want this common ground and haven't
found it, of course!

But there's also my own hookage that I had to move from c++-mode
to c++-ts-mode.  Stefan's patch doesn't fix that.

Take inserting comments via comment-dwim.  Or invoking LSP in
any mode for another example.  Or consulting documentation.  Or
anything we've built (including muscle memory) that lives on
top of syntactic abstractions like forward-sexp.  Basically any
preference that the major-mode expresses regarding an orthogonal
facility (minor mode or not) should, in principle, be shared.

At the very least, it seems a common hook would be useful, and that's
what an empty foo-base-mode() would give.  It _also_ gives you the
possibility to fix this more elegantly (well, at the expense of
yet another naming convention).

> > > It should modify our .dir-locals.el and Eglot's database to
> > > remove special entries for TS modes
> >
> > As Dmitry mentioned: if that is done just like that it will
> > break Eglot's support of ts modes in any Emacs which doesn't
> > have Stefan's patch.
>
> Then Eglot (or maybe compat.el) will have to provide compatibility
> shims.

Yes, if anyone wants to update Eglot to use compat.el, I think
it would be useful in this and more situations.

João
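
A package-side compatibility shim of the kind discussed above might look like the sketch below. It is not Eglot's or compat.el's code: `my-pkg-extra-parents`, `my-pkg-mode-derived-p`, `shim-mode` and `shim-ts-mode` are hypothetical names, and the alist stands in for whatever table a package would maintain until it can rely on the extra parents being declared by Emacs itself.

```elisp
;; Sketch only: all names defined here are hypothetical.
(define-derived-mode shim-mode prog-mode "Shim"
  "Hypothetical traditional major mode.")

(define-derived-mode shim-ts-mode prog-mode "Shim[ts]"
  "Hypothetical tree-sitter counterpart, with no extra parent declared.")

(defvar my-pkg-extra-parents
  '((shim-ts-mode . shim-mode))
  "Alist mapping a TS mode to the traditional mode it stands in for.")

(defun my-pkg-mode-derived-p (mode &rest candidates)
  "Return non-nil if MODE counts as derived from one of CANDIDATES.
Besides real derivation, consult `my-pkg-extra-parents', so the
result does not depend on the extra parents being known to Emacs."
  (or (apply #'provided-mode-derived-p mode candidates)
      (let ((counterpart (alist-get mode my-pkg-extra-parents)))
        (and counterpart
             (apply #'provided-mode-derived-p counterpart candidates)))))
```

Once the oldest supported Emacs declares the extra parents itself, the alist and the fallback branch can be dropped and callers can go back to plain `derived-mode-p`.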




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Fri, 05 Jan 2024 15:35:02 GMT) Full text and rfc822 format available.

Message #47 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: João Távora <joaotavora <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Fri, 05 Jan 2024 17:34:34 +0200
> From: João Távora <joaotavora <at> gmail.com>
> Date: Fri, 5 Jan 2024 15:16:30 +0000
> Cc: monnier <at> iro.umontreal.ca, 68246 <at> debbugs.gnu.org
> 
> On Fri, Jan 5, 2024 at 1:26 PM Eli Zaretskii <eliz <at> gnu.org> wrote:
> 
> > That's right -- but it's justified.  The commonality is usually either
> > very thin or almost non-existent.  If you think about it, you will
> > understand: where the traditional modes use regexps and syntax-ppss,
> > the TS modes use parser-related primitives.
> 
> I _have_ thought about it.  And I started from evidence that
> not everything a major mode dedicated to a language supplies
> is directly related to the parser implementation. Many things
> are, but not all.  So reimplementing a full major mode just
> for changing the parser backend might not make sense.

Experience shows that eventually most if not all of that goes back to
how the mode parses or analyzes the text (a.k.a. "source code").

> > How can you find common
> > grounds between these so different bases for implementation?
> 
> Very easily, I think.  Stefan's patch is one such example.
> What is it fixing?  Tools that want this common ground and haven't
> found it, of course!

What Stefan's patch fixes is the features that depend only on the
mode's symbol, but don't call any mode-specific functions or examine
its data structures.

> Take inserting comments via comment-dwim.

Even comment-dwim already shows a problem: it must determine whether
point is inside a comment, and TS and non-TS modes do that radically
differently.

The commonality could of course be increased by refactoring the
existing stuff so that it uses more abstract interfaces, which could
then be implemented separately for TS and non-TS modes.  But that
requires some motivation, and for now I see no such motivation where
different people maintain the traditional and TS modes.  Such
refactoring is not an easy business, so without the motivation I see
no way for that to happen any time soon.

> Or invoking LSP in any mode for another example.

LSP only cares for the language, so of course it can benefit from
Stefan's patch, because all that matters is the mode's symbol.

> Or consulting documentation.

Again, only the mode's symbol is important.

> Or anything we've built (including muscle memory) that lives on top
> of syntactic abstractions like forward-sexp.

Here you already bump into a problem, because most languages have no
notion of "sexp", so making a TS mode do the same as a traditional
mode is not easy at all.

> Basically any preference that the major-mode expresses regarding an
> orthogonal facility (minor mode or not) should, in principle, be
> shared.

I invite you to compare CC mode with c-ts-mode, and see for yourself
how the common grounds are very small.  It seems surprising at first
sight, but once you look at the code, it is very clear.

> At the very least, it seems a common hook would be useful, and that's
> what an empty foo-base-mode() would give.

Where a base mode makes sense, sure.  But even that causes problems,
since the base mode leaves some stuff not set up, and thus various
things that you'd want to do in a mode hook are impossible in the
base-mode hook.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Fri, 05 Jan 2024 18:03:02 GMT) Full text and rfc822 format available.

Message #50 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>, Yuan Fu <casouri <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Fri, 5 Jan 2024 18:02:29 +0000
On Fri, Jan 5, 2024 at 3:34 PM Eli Zaretskii <eliz <at> gnu.org> wrote:

> > I _have_ thought about it.  And I started from evidence that
> > not everything a major mode dedicated to a language supplies
> > is directly related to the parser implementation. Many things
> > are, but not all.  So reimplementing a full major mode just
> > for changing the parser backend might not make sense.
>
> Experience shows that eventually most if not all of that goes back to
> how the mode parses or analyzes the text (a.k.a. "source code").

That might be.  That's why I specifically wrote "parser implementation".
It _needn't_ (and quite fortunately often _doesn't_) go back to that.

> > > How can you find common
> > > grounds between these so different bases for implementation?
> >
> > Very easily, I think.  Stefan's patch is one such example.
> > What is it fixing?  Tools that want this common ground and haven't
> > found it, of course!
>
> What Stefan's patch fixes is the features that depend only on the
> mode's symbol, but don't call any mode-specific functions or examine
> its data structures.

Yes.  But it doesn't fix a minor mode that relies on a buffer-local
setting of one of that minor mode's variables, when this setting
happens in the mode body or in a hook.  flymake-diagnostic-functions
is one such variable, but there are many.  Same for a minor mode
relying on syntax-ppss.

> > Take inserting comments via comment-dwim.
>
> Even comment-dwim already shows a problem: it must determine whether
> point is inside a comment, and TS and non-TS modes do that radically
> differently.

Again, that's true implementation-wise.  But it needn't be true
of the interface.

> LSP only cares for the language, so of course it can benefit from
> Stefan's patch, because all that matters is the mode's symbol.

Stefan's patch becomes irrelevant (for LSP) if we switch to
eglot-server-program (singular).  Either we will have two identical
settings of eglot-server-program in foo-mode and foo-ts-mode, or
we will use a shared one.

Yasnippet (another package I re-wrote mostly) also relies on this
"database" approach. It shouldn't, but I didn't know better at
the time.  Stefan's patch is only needed for it because refactoring
it is work that no one wants to put in (at least I don't).

> > Or consulting documentation.
>
> Again, only the mode's symbol is important.

No.  Say some consult-documentation minor-mode relies on
some setting of 'documentation-function'?

> > Or anything we've built (including muscle memory) that lives on top
> > of syntactic abstractions like forward-sexp.
>
> Here you already bump into a problem, because most languages have no
> notion of "sexp", so making a TS mode do the same as a traditional
> mode is not easy at all.

Of course they do!!  How else would electric-pair-mode have worked
for virtually every language for more than 10 years, or C-M-u,
C-M-SPC, etc etc?   e-p-m doesn't have knowledge of past and
future modes, yet it works.  How?

Well the reason why e-p-m and these things work today for most ts
modes is because they are also _using_ the Lisp/C parser based on
syntax tables and syntax-propertize-function.

So, in essence, TS modes currently use two parsers wasting CPU doing
a fair amount of duplicate work.  I suppose this waste would only
stop once syntax-tables are nullified for those modes.  But we can't
leave the many syntax-ppss clients (e-p-m, symbol-at-point) hanging.

IMO there's an elegant way out.  When one puts a suitable
syntax-propertize-function that consults ts nodes, like someone did
for c++-ts-mode (still very minimal though) we take full
advantage of TS and even enable things that are very complicated to
do without TS.

For example, in non-TS c++-mode it's hard to have e-p-m pair '<' with
'>' but only in C++ template contexts.  But in c++-ts-mode it's within
reach.  It does need both an addition to c++-ts-mode's s-p-function
and a bugfix to elec-pair.el: I'm looking into that.

> > Basically any preference that the major-mode expresses regarding an
> > orthogonal facility (minor mode or not) should, in principle, be
> > shared.
>
> I invite you to compare CC mode with c-ts-mode, and see for yourself
> how the common grounds are very small.  It seems surprising at first
> sight, but once you look at the code, it is very clear.

And this is mainly because CC mode is, well, rather corpulent software,
let's put it like that.  This is why I wrote it makes sense to start
from scratch for this one.

But would some kind of c++-base-mode hurt in some way? Presuming Alan
allows it, of course.

> > At the very least, it seems a common hook would be useful, and that's
> > what an empty foo-base-mode() would give.
>
> Where a base mode makes sense, sure.  But even that causes problems,
> since the base mode leaves some stuff not set up.

I don't follow.  Can you give an example of a problem?  In fact
I'm happy to see exactly the strategy I suggested is already done in
ruby-mode.el and ruby-ts-mode.el.  What problems are caused by it?

It makes sense for the base mode to be abstract, of course: meaning
we should flag an error if 'foo-base-mode' detects it is called
outside of the context of 'foo-concrete-mode'.

>  and thus various
> things that you'd want to do in a mode hook are impossible in the
> base-mode hook.

I don't follow this part either.  Can you give an example using, say,
the existing ruby-base-mode?

In summary, my position is that regardless of Stefan's patch, which
I'm not opposed to, we should:

1. Use add-derived-mode-parents sparingly and consider foo-base-mode when
possible.

2. have a remove-derived-mode-parent (for the other bug)

3. perhaps tighten up what we mean by derived-mode-p in the docs

João




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Fri, 05 Jan 2024 18:44:02 GMT) Full text and rfc822 format available.

Message #53 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Kangas <stefankangas <at> gmail.com>
To: 68246 <at> debbugs.gnu.org
Cc: monnier <at> iro.umontreal.ca
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Fri, 5 Jan 2024 10:43:11 -0800
Stefan Monnier via "Bug reports for GNU Emacs, the Swiss army knife of
text editors" <bug-gnu-emacs <at> gnu.org> writes:

> Many packages use the `major-mode` as a proxy for the type of content
> in the buffer.  When using the new TS modes, these packages tend to
> behave poorly because they do not understand that a buffer in `js-ts-mode`
> contains Javascript.
>
> I suggest we add the non-TS mode as an extra parent, so
> `derived-mode-all-parents` includes `js-mode` in `js-ts-mode`.

We should probably also change *Help* to display the extra parents, or
this will be pretty confusing.  It should probably both list it as a
parent mode, and say that it will run the corresponding hook.

Here's an example:

  (progn
    (define-derived-mode python-super-mode prog-mode "SuperPython" "Doc")
    (derived-mode-add-parents 'python-super-mode '(python-mode))
    (describe-function 'python-super-mode))




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Fri, 05 Jan 2024 18:58:02 GMT) Full text and rfc822 format available.

Message #56 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: João Távora <joaotavora <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, casouri <at> gmail.com, monnier <at> iro.umontreal.ca
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Fri, 05 Jan 2024 20:56:27 +0200
> From: João Távora <joaotavora <at> gmail.com>
> Date: Fri, 5 Jan 2024 18:02:29 +0000
> Cc: monnier <at> iro.umontreal.ca, 68246 <at> debbugs.gnu.org
> 
> > > Or consulting documentation.
> >
> > Again, only the mode's symbol is important.
> 
> No.  Say some consult-documentation minor-mode relies on
> some setting of 'documentation-function'?

I had info-look in mind.

> > > Or anything we've built (including muscle memory) that lives on top
> > > of syntactic abstractions like forward-sexp.
> >
> > Here you already bump into a problem, because most languages have no
> > notion of "sexp", so making a TS mode do the same as a traditional
> > mode is not easy at all.
> 
> Of course they do!!  How else would electric-pair-mode have worked
> for virtually every language for more than 10 years

forward-sexp moves forward even when there are no parentheses or
braces anywhere in sight.

> Well the reason why e-p-m and these things work today for most ts
> modes is because they are also _using_ the Lisp/C parser based on
> syntax tables and syntax-propertize-function.

That's because a language parser will not have any notion of a sexp,
so it cannot help.

> > I invite you to compare CC mode with c-ts-mode, and see for yourself
> > how the common grounds are very small.  It seems surprising at first
> > sight, but once you look at the code, it is very clear.
> 
> And this is mainly because CC mode is, well, rather corpulent software,
> let's put it like that.  This is why I wrote it makes sense to start
> from scratch for this one.

A discussion where you brush aside any argument that doesn't fit your
theory is not a useful one.  In Emacs we have to solve problems that
happen in the messy real world, not problems in an ideal world where
everything is according to some elegant theory.

> But would some kind of c++-base-mode hurt in some way? Presuming Alan
> allows it, of course.

Feel free to suggest such a base mode.  If it works and is helpful, we
will install it.  Frankly, I doubt you could come up with a useful
base mode like that: the differences are too large.

> > > At the very least, it seems a common hook would be useful, and that's
> > > what an empty foo-base-mode() would give.
> >
> > Where a base mode makes sense, sure.  But even that causes problems,
> > since the base mode leaves some stuff not set up.
> 
> I don't follow.  Can you give an example of a problem?

Yes, look at python.el and sh-script.el.  The base mode can only go so
far, it must stop before it gets into the stuff that is really
different between the TS and non-TS modes.  This means that the
base-mode hook will not see a mode that is ready for work, only its
beginning.

> In fact I'm happy to see exactly the strategy I suggested is already
> done in ruby-mode.el and ruby-ts-mode.el.  What problems are caused
> by it?

Some modes succeed in that, others don't.  I guess it depends on the
language grammar.

> >  and thus various
> > things that you'd want to do in a mode hook are impossible in the
> > base-mode hook.
> 
> I don't follow this part either.  Can you give an example using, say
> the existing ruby-base-mode.

Again, look at the two examples I mentioned above.

> In summary, my position is that regardless of Stefan's patch, which
> I'm not opposed to, we should:
> 
> 1. Use add-derived-mode-parents sparingly and consider foo-base-mode when
> possible.
> 
> 2. have a remove-derived-mode-parent (for the other bug)
> 
> 3. perhaps tighten up what we mean by derived-mode-p in the docs

I have no opinion on that.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Fri, 05 Jan 2024 19:05:01 GMT) Full text and rfc822 format available.

Message #59 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Kangas <stefankangas <at> gmail.com>
To: João Távora <joaotavora <at> gmail.com>, 
 Eli Zaretskii <eliz <at> gnu.org>, Yuan Fu <casouri <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Fri, 5 Jan 2024 11:03:46 -0800
João Távora <joaotavora <at> gmail.com> writes:

> In summary, my position is that regardless of Stefan's patch, which
> I'm not opposed to, we should:
>
> 1. Use add-derived-mode-parents sparingly and consider foo-base-mode when
> possible.

I agree that inheriting from a `foo-base-mode' is a good way to reuse
code between different modes.  It's easy to think of examples of where
this will be useful: looking up some documentation, running a REPL,
interacting with a debugger, and so on and so forth.

But even if we added all the base modes today (as empty stubs), AFAIU it
wouldn't solve the exact problem that Monnier's patch is addressing,
namely to make packages and customizations work in both foo-mode and
foo-ts-mode even if they only say:

    (derived-mode-p 'foo-mode)

So I'm not sure doing one excludes the other, or that one should be
considered preferred over the other.  IOW, I'm not sure about the
recommendation to use `add-derived-mode-parents' sparingly, as that
would seem to defeat the point.

Am I missing something?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Fri, 05 Jan 2024 19:13:01 GMT) Full text and rfc822 format available.

Message #62 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Kangas <stefankangas <at> gmail.com>
To: 68246 <at> debbugs.gnu.org
Cc: monnier <at> iro.umontreal.ca
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Fri, 5 Jan 2024 11:11:50 -0800
Stefan Kangas <stefankangas <at> gmail.com> writes:

> We should probably also change *Help* to display the extra parents, or
> this will be pretty confusing.  It should probably both list it as a
> parent mode, and say that it will run the corresponding hook.

Correcting myself here: the hook shouldn't be called out here as it
won't get run.  So please disregard that part.

But it should say something about the extra inherited mode.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Fri, 05 Jan 2024 23:21:01 GMT) Full text and rfc822 format available.

Message #65 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 68246 <at> debbugs.gnu.org, casouri <at> gmail.com, monnier <at> iro.umontreal.ca
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Fri, 5 Jan 2024 23:20:26 +0000
On Fri, Jan 5, 2024 at 6:57 PM Eli Zaretskii <eliz <at> gnu.org> wrote:

> > Of course they do!!  How else would electric-pair-mode have worked
> > for virtually every language for more than 10 years
>
> forward-sexp moves forward even when there are no parentheses or
> braces anywhere in sight.

electric-pair-mode uses scan-sexps.  Scan-sexps works perfectly to
navigate nested mixed delimiter structures of modes that are not Lisp,
otherwise e-p-m couldn't do its auto-balancing job.
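
The point about scan-sexps not being Lisp-specific can be checked with
a small sketch.  It assumes nothing beyond a syntax table that declares
three bracket pairs; the function name `my-demo-balanced-region' and
the sample text are made up for illustration.

```elisp
(defun my-demo-balanced-region (text start)
  "Return the balanced expression of TEXT that begins at position START.
Relies only on a syntax table that declares (), [] and {} as pairs."
  (with-temp-buffer
    (let ((table (make-syntax-table)))
      (modify-syntax-entry ?\( "()" table)
      (modify-syntax-entry ?\) ")(" table)
      (modify-syntax-entry ?\[ "(]" table)
      (modify-syntax-entry ?\] ")[" table)
      (modify-syntax-entry ?\{ "(}" table)
      (modify-syntax-entry ?\} "){" table)
      (set-syntax-table table))
    (insert text)
    ;; `scan-sexps' returns the position just after the matching closer.
    (buffer-substring start (scan-sexps start 1))))

;; Position 2 is the "(" right after "f" in this C-like snippet.
(my-demo-balanced-region "f(a[1], {b: 2}) + g()" 2)
;; => "(a[1], {b: 2})"
```

Nothing in the sample text is Lisp; the nested mixed delimiters are
handled purely through the syntax table.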

> > Well the reason why e-p-m and these things work today for most ts
> > modes is because they are also _using_ the Lisp/C parser based on
> > syntax tables and syntax-propertize-function.
>
> That's because a language parser will not have any notion of a sexp,
> so it cannot help.

As I am trying to explain, it doesn't have to be a "Lisp sexp".
It just has to be something that scan-sexps can navigate, and
scan-sexps works in all modes.  I'd think that at least
with a good enough grammar it's perfectly possible to do
e.g. show-paren-mode with TreeSitter alone.  And the way this
could work in Emacs is for TreeSitter to feed into scan-sexps.

> > > I invite you to compare CC mode with c-ts-mode, and see for yourself
> > > how the common grounds are very small.  It seems surprising at first
> > > sight, but once you look at the code, it is very clear.
> >
> > And this is mainly because CC mode is, well, rather corpulent software,
> > let's put it like that.  This is why I wrote it makes sense to start
> > from scratch for this one.
>
> A discussion where you brush aside any argument that doesn't fit your
> theory is not a useful one.

? You write this precisely at the point where I _agree_ with you.
That's really the opposite of brushing aside.

> > But would some kind of c++-base-mode hurt in some way? Presuming Alan
> > allows it, of course.
>
> Feel free to suggest such a base mode.  If it works and is helpful, we
> will install it.  Frankly, I doubt you could come up with a useful
> base mode like that: the differences are too large.

As I am trying to explain, even a one-line empty base mode is useful.

> > > > At the very least, it seems a common hook would be useful, and that's
> > > > what an empty foo-base-mode() would give.
> > >
> > > Where a base mode makes sense, sure.  But even that causes problems,
> > > since the base mode leaves some stuff not set up.
> >
> > I don't follow.  Can you give an example of a problem?
>
> Yes, look at python.el and sh-script.el.  The base mode can only go so
> far, it must stop before it gets into the stuff that is really
> different between the TS and non-TS modes.

Very well, we are violently agreeing.

> This means that the
> base-mode hook will not see a mode that is ready for work, only its
> beginning.

Correct.  But a major-mode doesn't have to be "ready for work" (I presume
you mean ready for editing) for the hook to be useful.  That hook would
be perfectly suitable for setting variables used by minor modes and other
things (eglot-server-program, flymake-diagnostic-functions, company-backends,
mode-line-format, etc.), for turning on minor modes (eglot-ensure,
company-mode, yasnippet-minor-mode), and for binding commands.

And even without the hook the mere fact that foo-mode and foo-ts-mode
are derived from foo-base-mode according to derived-mode-p makes it
useful.
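
Whether a base-mode hook sees a half-initialized mode can be checked
directly.  The sketch below uses hypothetical `my-order-*' modes that
only log when their bodies and hooks run; it assumes nothing beyond
`define-derived-mode' delaying hooks as its documentation describes.

```elisp
(defvar my-order-log nil
  "Events recorded by the demo modes, most recent first.")

(define-derived-mode my-order-base-mode prog-mode "OrderBase"
  "Hypothetical base mode."
  (push 'base-body my-order-log))

(define-derived-mode my-order-ts-mode my-order-base-mode "OrderTS"
  "Hypothetical concrete mode derived from the base mode."
  (push 'ts-body my-order-log))

(add-hook 'my-order-base-mode-hook
          (lambda () (push 'base-hook my-order-log)))
(add-hook 'my-order-ts-mode-hook
          (lambda () (push 'ts-hook my-order-log)))

(defun my-order-run ()
  "Enable the concrete mode in a scratch buffer; return events in order."
  (setq my-order-log nil)
  (with-temp-buffer
    (my-order-ts-mode))
  (reverse my-order-log))

(my-order-run)
```

Because derived modes run their parent inside `delay-mode-hooks', the
returned list should show both mode bodies before either hook, i.e.
the base-mode hook runs only after the concrete mode's body is done.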

> > In fact I'm happy to see exactly the strategy I suggested is already
> > done in ruby-mode.el and ruby-ts-mode.el.  What problems are caused
> > by it?
>
> Some modes succeed in that, others don't.  I guess it depends on the
> language grammar.

I don't see the problem, really.  Now I see that many "base modes" already
exist.  That's great!  That's at least four simplifications to eglot.el's
eglot-server-programs (ruby, python, js and bash/sh).  I'd be happy to
know of more if someone has a fuller list.

And all the base mode definitions could well have settings for the
upcoming eglot-server-program.

> > >  and thus various
> > > things that you'd want to do in a mode hook are impossible in the
> > > base-mode hook.
> >
> > I don't follow this part either.  Can you give an example using, say
> > the existing ruby-base-mode.
>
> Again, look at the two examples I mentioned above.

I couldn't see the problem in either python.el or sh-script.el.  What
do you wish you could do in those base mode bodies or have the user
do in the base mode hooks which is impossible?

João




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Fri, 05 Jan 2024 23:39:02 GMT) Full text and rfc822 format available.

Message #68 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Stefan Kangas <stefankangas <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>,
 Yuan Fu <casouri <at> gmail.com>, monnier <at> iro.umontreal.ca
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Fri, 5 Jan 2024 23:37:46 +0000
On Fri, Jan 5, 2024 at 7:03 PM Stefan Kangas <stefankangas <at> gmail.com> wrote:
>
> João Távora <joaotavora <at> gmail.com> writes:
>
> > In summary, my position is that regardless of Stefan's patch, which
> > I'm not opposed to, we should:
> >
> > 1. Use add-derived-mode-parents sparingly and consider foo-base-mode when
> > possible.
>
> I agree that inheriting from a `foo-base-mode' is a good way to reuse
> code between different modes.  It's easy to think of examples of where
> this will be useful: looking up some documentation, running a REPL,
> interacting with a debugger, and so on and so forth.

Exactly.  It's much cleaner and I am happy to see exactly what
I had idealized is already in the tree.

> But even if we added all the base modes today (as empty stubs), AFAIU it
> wouldn't solve the exact problem that Monnier's patch is addressing,
> namely to make packages and customizations work in both foo-mode and
> foo-ts-mode even if they only say:
>
>     (derived-mode-p 'foo-mode)

That's why these packages should be changed to say

  (derived-mode-p 'foo-base-mode)

Or maybe

   (cl-some #'derived-mode-p '(foo-mode foo-base-mode))

I'm not sure there is a problem in doing that, is there?  I can confirm
there isn't in Eglot and Yasnippet, two packages that use major-mode
the way Stefan Monnier described it.  How many more are there,
and is it really so hard to change them?

Or is it that we're deliberately trying to establish that
'foo-mode' is the canonical symbol designating a family
of potentially many major modes -- including, somewhat
confusingly, 'foo-mode' itself -- that are used for editing foo
source code?

If so, fine, but this should be very well documented.  And yes,
definitely called out in *Help*.  The concept of "major-mode family"
could emerge, perhaps.

> So I'm not sure doing one excludes the other, or that one should be
> considered preferred over the other.  IOW, I'm not sure about the
> recommendation to use `add-derived-mode-parents' sparingly, as that
> would seem to defeat the point.

Indeed one may not exclude the other, depending on how feasible
it is for extensions to start using the foo-base-mode.

Maybe it depends on which packages use "major-mode" in such ways,
and on how many of them there are.

João




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Fri, 05 Jan 2024 23:52:01 GMT) Full text and rfc822 format available.

Message #71 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: João Távora <joaotavora <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, casouri <at> gmail.com
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Fri, 05 Jan 2024 18:51:41 -0500
[ IMO, it would indeed be a good idea to try and write some abstraction
  layer so we can share more code between modes parsing with
  syntax-tables/tree-sitter/wisi/SMIE/younameit.  It will also be very
  useful when tree-sitter goes the way of the dodo.

  But that's a difficult job, and with limited immediately-visible
  benefits to the end users.  In the mean time, we're stuck with major
  modes that don't share much code.  ]

Whether it's worthwhile to have a `FOO-base-mode` or not depends on
the specifics, but it's largely an implementation detail.  More
importantly it's not directly relevant to this here bug, because
I want to say "FOO-ts-mode is a kind of mode for FOO, so it's
a kind of FOO-mode".  There are very few YASnippets for FOO-base-mode,
instead they're all for FOO-mode.  Similarly, Eglot doesn't have rules
for FOO-base-mode, only for FOO-mode.

That's why in my patch I add `python-mode` as an extra parent of
`python-ts-mode` even though they both share `python-base-mode` as
their parent.
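
As a rough sketch of that arrangement, using hypothetical `my-py-*'
modes rather than the real python.el definitions: the two concrete
modes share a base mode, and the TS mode additionally declares the
traditional mode as an extra parent.  `derived-mode-add-parents' only
exists in Emacs 30 and later, hence the `fboundp' guard.

```elisp
;; Hypothetical stand-ins, not the actual python.el modes.
(define-derived-mode my-py-base-mode prog-mode "PyBase"
  "Shared base mode.")

(define-derived-mode my-py-mode my-py-base-mode "Py"
  "Stand-in for the traditional mode.")

(define-derived-mode my-py-ts-mode my-py-base-mode "Py[TS]"
  "Stand-in for the tree-sitter mode.")

;; Declare the traditional mode as an extra parent of the TS mode.
;; Guarded because the function is new in Emacs 30.
(when (fboundp 'derived-mode-add-parents)
  (derived-mode-add-parents 'my-py-ts-mode '(my-py-mode)))

(defun my-py-ts-counts-as-py-p ()
  "Return non-nil if a `my-py-ts-mode' buffer is \"a kind of\" `my-py-mode'."
  (with-temp-buffer
    (my-py-ts-mode)
    (derived-mode-p 'my-py-mode)))
```

On an Emacs with `derived-mode-add-parents', code that only tests
(derived-mode-p 'my-py-mode) then also matches TS buffers; without
the extra parent, only the shared base mode would match.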

IOW, in my patch, I'm using `FOO-mode` not really as the name of a major
mode, but rather as the name of a *file type*.
I already mentioned this distinction in the bug-report where
I introduced `major-mode-remap-alist`: Emacs usually conflates
file-type and major-mode, which works great where there's only one major
mode for a given file type, but less great where there are
several alternatives.


        Stefan


João Távora [2024-01-05 23:20:26] wrote:

> On Fri, Jan 5, 2024 at 6:57 PM Eli Zaretskii <eliz <at> gnu.org> wrote:
>
>> > Of course they do!!  How else would electric-pair-mode have worked
>> > for virtually every language for more than 10 years
>>
>> forward-sexp moves forward even when there are no parentheses or
>> braces anywhere in sight.
>
> electric-pair-mode uses scan-sexps.  Scan-sexps works perfectly to
> navigate nested mixed delimiter structures of modes that are not Lisp,
> otherwise e-p-m couldn't do its auto-balancing job.
>
>> > Well the reason why e-p-m and these things work today for most ts
>> > modes is because they are also _using_ the Lisp/C parser based on
>> > syntax tables and syntax-propertize-function.
>>
>> That's because a language parser will not have any notion of a sexp,
>> so it cannot help.
>
> As I am trying to explain, it doesn't have to be a "Lisp sexp".
> It just has to be something that scan-sexps can navigate, and
> scan-sexps works in all modes.  I'd think that at least
> with a good enough grammar it's perfectly possible to do
> e.g. show-paren-mode with TreeSitter alone.  And the way this
> could work in Emacs is for TreeSitter to feed into scan-sexps.
>
>> > > I invite you to compare CC mode with c-ts-mode, and see for yourself
>> > > how the common grounds are very small.  It seems surprising at first
>> > > sight, but once you look at the code, it is very clear.
>> >
>> > And this is mainly because CC mode is, well, rather corpulent software,
>> > let's put it like that.  This is why I wrote it makes sense to start
>> > from scratch for this one.
>>
>> A discussion where you brush aside any argument that doesn't fit your
>> theory is not a useful one.
>
> ? You write this precisely at the point where I _agree_ with you.
> That's really the opposite of brushing aside.
>
>> > But would some kind of c++-base-mode hurt in some way? Presuming Alan
>> > allows it, of course.
>>
>> Feel free to suggest such a base mode.  If it works and is helpful, we
>> will install it.  Frankly, I doubt you could come up with a useful
>> base mode like that: the differences are too large.
>
> As I am trying to explain, even a one-line empty base mode is useful.
>
>> > > > At the very least, it seems a common hook would be useful, and that's
>> > > > what an empty foo-base-mode() would give.
>> > >
>> > > Where a base mode makes sense, sure.  But even that causes problems,
>> > > since the base mode leaves some stuff not set up.
>> >
>> > I don't follow.  Can you give an example of a problem?
>>
>> Yes, look at python.el and sh-script.el.  The base mode can only go so
>> far, it must stop before it gets into the stuff that is really
>> different between the TS and non-TS modes.
>
> Very well, we are violently agreeing.
>
>> This means that the
>> base-mode hook will not see a mode that is ready for work, only its
>> beginning.
>
> Correct.  But a major-mode doesn't have to be "ready for work" (I presume
> you mean ready for editing) for the hook to be useful.  That hook would
> be perfectly suitable for setting variables used by minor modes and other
> things. (eglot-server-program, flymake-diagnostic-functions,  company-backends,
> mode-line-format, etc etc)
> For turning on minor modes (eglot-ensure, company-mode, yasnippet-minor-mode,)
> For binding commands.
>
> And even without the hook the mere fact that foo-mode and foo-ts-mode
> are derived from foo-base-mode according to derived-mode-p makes it
> useful.
>
>> > In fact I'm happy to see exactly the strategy I suggested is already
>> > done in ruby-mode.el and ruby-ts-mode.el.  What problems are caused
>> > by it?
>>
>> Some modes succeed in that, others don't.  I guess it depends on the
>> language grammar.
>
> I don't see the problem, really.  Now I see that many "base modes" already
> exist.  That's great!  That's at least four simplifications to eglot.el's
> eglot-server-programs (ruby, python, js and bash/sh).  I'd be happy to
> know of more if someone has a fuller list.
>
> And all the base mode definitions could well have settings for the
> upcoming eglot-server-program.
>
>> > >  and thus various
>> > > things that you'd want to do in a mode hook are impossible in the
>> > > base-mode hook.
>> >
>> > I don't follow this part either.  Can you give an example using, say
>> > the existing ruby-base-mode.
>>
>> Again, look at the two examples I mentioned above.
>
> I couldn't see the problem in either python.el or sh-script.el.  What
> do you wish you could do in those base mode bodies or have the user
> do in the base mode hooks which is impossible?
>
> João





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Sat, 06 Jan 2024 00:17:01 GMT) Full text and rfc822 format available.

Message #74 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, casouri <at> gmail.com
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Sat, 6 Jan 2024 00:16:02 +0000
On Fri, Jan 5, 2024 at 11:51 PM Stefan Monnier <monnier <at> iro.umontreal.ca> wrote:
>
> [ IMO, it would indeed be a good idea to try and write some abstraction
>   layer so we can share more code between modes parsing with
>   syntax-tables/tree-sitter/wisi/SMIE/younameit.  It will also be very
>   useful when tree-sitter goes the way of the dodo.

Very relevant.

>   But that's a difficult job, and with limited immediately-visible
>   benefits to the end users.  In the mean time, we're stuck with major
>   modes that don't share much code.  ]
>
> Whether it's worthwhile to have a `FOO-base-mode` or not depends on
> the specifics, but it's largely an implementation detail.  More
> importantly it's not directly relevant to this here bug, because
> I want to say "FOO-ts-mode is a kind of mode for FOO, so it's
> a kind of FOO-mode".

OK, so "kind".  So you seem to want FOO-mode to designate a family
of modes for FOO.  That's reasonable, but it has the confusing property
that the name of the family coincides with the name of one of the
members of the family.  Using inheritance for that just seems a bit off,
like a hack.  Really, what we want is a new variable called 'mode-family',
and to test against that.
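
A minimal sketch of what such a "mode family" lookup could look like.
Everything here is hypothetical: neither the `my-mode-family` symbol
property nor the two helper functions exist in Emacs, and the mode
names are placeholders.

```elisp
;; Hypothetical sketch: none of these names exist in Emacs.

(defun my-mode-family (mode)
  "Return the family recorded for MODE, or MODE itself if none is recorded."
  (or (get mode 'my-mode-family) mode))

;; Each implementation declares which family it belongs to.
(put 'my-foo-mode    'my-mode-family 'my-foo)
(put 'my-foo-ts-mode 'my-mode-family 'my-foo)

(defun my-same-family-p (mode-a mode-b)
  "Return non-nil if MODE-A and MODE-B belong to the same family."
  (eq (my-mode-family mode-a) (my-mode-family mode-b)))
```

A package would then test `(my-same-family-p major-mode 'my-foo-mode)`
instead of relying on derived-mode-p.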

> There are very few YASnippets for FOO-base-mode,
> instead they're all for FOO-mode.  Similarly, Eglot doesn't have rules
> for FOO-base-mode, only for FOO-mode.

Right.  But we can change Eglot and Yasnippet, can't we?  I know what
to do in Eglot, which is something like this:

 (defvar eglot-server-programs `(((rust-ts-mode rust-mode) . ("rust-analyzer"))
                                 ((cmake-mode cmake-ts-mode) . ("cmake-language-server"))
                                 (vimrc-mode . ("vim-language-server" "--stdio"))
-                                ((python-mode python-ts-mode)
+                                (python-base-mode
                                  . ,(eglot-alternatives

In Yasnippet, if I remember correctly (it was a long time ago), the
snippet directory could either be renamed foo-base-mode or something
in a .yas-parents inside that directory can be added containing
"foo-base-mode".

As to the problem that this doesn't work in Emacs < 30, we must
either:

* also keep the old definitions, or better

* use compat.el like Eli suggested to bring in those new base modes
(whereby compat.el could use derived-mode-add-parents itself, but for
adding foo-base-mode as a parent of foo-mode and foo-ts-mode
instead).

> That's why in my patch I add `python-mode` as an extra parent of
> `python-ts-mode` even though they both share `python-base-mode` as
> their parent.

Right, that's what sounds hacky to me.  They're siblings, but now
one is also the parent of the other.  Maybe it works, but it is
definitely odd.

That said, if it works, I'm not really opposed to it.  What
other packages do you know of, like Eglot and Yasnippet, which use
major-mode in this way?

> IOW, in my patch, I'm using `FOO-mode` not really as the name of a major
> mode, but rather as the name of a *file type*.
> I already mentioned this distinction in the bug-report where
> I introduced `major-mode-remap-alist`: Emacs usually conflates
> file-type and major-mode, which works great where there's only one major
> mode for a given file type, but less great where there are
> several alternatives.

Agree, so your "file-type" seems to match my idea of "mode family".
So a new variable called file-type would be the correct abstraction.

And I don't see how "foo-base-mode" is much worse, modulo slightly
more awkward naming and backward compatibility problems that you would
have anyway.

João




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Sat, 06 Jan 2024 03:21:01 GMT) Full text and rfc822 format available.

Message #77 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Yuan Fu <casouri <at> gmail.com>
To: João Távora <joaotavora <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>,
 monnier <at> iro.umontreal.ca
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Fri, 5 Jan 2024 19:19:46 -0800
I certainly welcome base-mode, I’m the one that added them in the first place. But I also want to point out that they are only a partial solution. For one, adding the base mode needs cooperation from all the major mode authors. For built-in modes, that’s not a (big) problem; for countless third-party modes out there, I don’t have high hopes for it.

The good thing about derived-mode-add-parents is that it doesn’t need major mode author’s cooperation. Even a normal user can do it themselves.
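
For illustration, a minimal sketch of what a user could put in their init file. It assumes Emacs 30 (where `derived-mode-add-parents` is available); `my-foo-mode` and `my-foo-ts-mode` are made-up stand-ins for an unrelated pair of third-party modes.

```elisp
;; Requires Emacs 30 for `derived-mode-add-parents'.
;; `my-foo-mode' and `my-foo-ts-mode' are hypothetical stand-ins for
;; two third-party modes that share no ancestor except `prog-mode'.
(define-derived-mode my-foo-mode prog-mode "Foo")
(define-derived-mode my-foo-ts-mode prog-mode "Foo-TS")

;; In a user's init file: declare that `my-foo-ts-mode' also counts
;; as a `my-foo-mode', without touching either mode's source.
(derived-mode-add-parents 'my-foo-ts-mode '(my-foo-mode))

;; Both of these now return non-nil:
(provided-mode-derived-p 'my-foo-ts-mode 'my-foo-mode)
(memq 'my-foo-mode (derived-mode-all-parents 'my-foo-ts-mode))
```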

Then there is the problem Eli pointed out: base-mode hooks run before the child major mode's body does. It’s probably fine for most things, but if you want to change some buffer-local variable that the major mode sets, the base-mode hook can’t help. (Arguably a niche use case, but my point is that base-mode hooks have their limits.)

Obviously derived-mode-add-parents can’t help with hooks. But adding the config to two hooks doesn’t seem too bad. Plus, I haven’t come up with a good solution, so I’m not too eager to solve that inconvenience.


As for adding xxx-mode as a parent of xxx-ts-mode, I think it’s fine? Like others in the thread, I couldn’t think of a scenario where this would be problematic. I thought about adding xxx-lang as the parent of both xxx-mode and xxx-ts-mode, but that’s probably not very helpful, since the goal is to make things work for ts-mode without needing to change the existing code, and using xxx-lang still requires modifying existing code.

Yuan



Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Sat, 06 Jan 2024 03:37:02 GMT) Full text and rfc822 format available.

Message #80 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Yuan Fu <casouri <at> gmail.com>, João Távora
 <joaotavora <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>,
 monnier <at> iro.umontreal.ca
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Sat, 6 Jan 2024 05:36:09 +0200
On 06/01/2024 05:19, Yuan Fu wrote:

> The good thing about derived-mode-add-parents is that it doesn’t need major mode author’s cooperation. Even a normal user can do it themselves.

True.

> Then there is the problem Eli pointed out: base-mode hooks run before the child major mode's body does. It’s probably fine for most things, but if you want to change some buffer-local variable that the major mode sets, the base-mode hook can’t help. (Arguably a niche use case, but my point is that base-mode hooks have their limits.)

This is the same for all other cases of mode inheritance (e.g. js2-mode
inheriting from js-mode, or python-ts-mode inheriting from python-mode).

It would be odd if some inheritors had this inconvenience, and
others not.

> Obviously derived-mode-add-parents can’t help with hooks. But adding the config to two hooks doesn’t seem too bad. Plus, I haven’t come up with a good solution, so I’m not too eager to solve that inconvenience.

I was thinking some "proper" language registry is the way to go. Like a 
custom structure tracking the correspondence between file names and 
languages, and a separate association list for lang->major-mode.

But it would require more changes indeed.

> As for adding xxx-mode as a parent of xxx-ts-mode, I think it’s fine? Like others in the thread, I couldn’t think of a scenario where this would be problematic. I thought about adding xxx-lang as the parent of both xxx-mode and xxx-ts-mode, but that’s probably not very helpful, since the goal is to make things work for ts-mode without needing to change the existing code, and using xxx-lang still requires modifying existing code.

OTOH, the required changes could be made fairly minimal (aside from those in
the user's config): add a new keyword to define-derived-mode which would
add (run-hooks 'xyz-lang-hook) at the end.
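
A rough approximation of that idea, not the proposed keyword itself: the
sketch below uses the existing `:after-hook` keyword of
define-derived-mode to run a shared, hypothetical `my-baz-lang-hook`
from both modes.  All `my-baz-` names are made up for the example.

```elisp
;; Hypothetical shared hook for the "baz" language; not part of Emacs.
(defvar my-baz-lang-hook nil
  "Hook run after either `my-baz-mode' or `my-baz-ts-mode' is set up.")

;; `:after-hook' takes a single form that is evaluated after the
;; mode's own hooks have run.
(define-derived-mode my-baz-mode prog-mode "Baz"
  :after-hook (run-hooks 'my-baz-lang-hook))

(define-derived-mode my-baz-ts-mode prog-mode "Baz-TS"
  :after-hook (run-hooks 'my-baz-lang-hook))

;; User config: one `add-hook' covers both implementations.
(defvar my-baz-count 0
  "Number of times the language hook has run (for demonstration).")
(add-hook 'my-baz-lang-hook
          (lambda () (setq my-baz-count (1+ my-baz-count))))
```

The user-side change is then a single add-hook on the language hook
rather than one per implementation.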




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Sat, 06 Jan 2024 04:09:01 GMT) Full text and rfc822 format available.

Message #83 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: João Távora <joaotavora <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, casouri <at> gmail.com
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Fri, 05 Jan 2024 23:08:40 -0500
> OK, so "kind".  So you seem to want FOO-mode to designate a family
> of modes for FOO.  That's reasonable, but it has the confusing property
> that the name of the family coincides with the name of one of the
> members of the family.

It's actually very common.  Take "Lisp" as an example.
Similarly, `tex-mode` is both the parent of several derived modes and
the entry point that dispatches to the appropriate derived mode.

> Using inheritance for that just seems a bit off, like a hack.

`derived-mode-add-parents` is not inheritance: there's no reuse of code
involved (tho it doesn't preclude it, of course).
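
A small sketch of that distinction, assuming Emacs 30 and made-up
`my-bar-` modes: after adding the extra parent, `derived-mode-p` answers
yes, but neither the body nor the hook of the extra parent runs.

```elisp
;; Requires Emacs 30 for `derived-mode-add-parents'.
;; All `my-bar-' names are hypothetical.
(defvar my-bar-log nil
  "Records which mode bodies and hooks ran.")

(define-derived-mode my-bar-mode prog-mode "Bar"
  (push 'bar-body my-bar-log))

(define-derived-mode my-bar-ts-mode prog-mode "Bar-TS"
  (push 'bar-ts-body my-bar-log))

(add-hook 'my-bar-mode-hook (lambda () (push 'bar-hook my-bar-log)))

;; Declare the relation only; no code is shared.
(derived-mode-add-parents 'my-bar-ts-mode '(my-bar-mode))

(defun my-bar-demo ()
  "Enable `my-bar-ts-mode' in a temp buffer; return (DERIVED-P . LOG)."
  (setq my-bar-log nil)
  (with-temp-buffer
    (my-bar-ts-mode)
    (cons (and (derived-mode-p 'my-bar-mode) t)
          (reverse my-bar-log))))

;; (my-bar-demo) => (t bar-ts-body)
```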

> Really, what we want is a new variable called 'mode-family',
> and to test against that.

I don't think that's necessary.

This is a common programming design decision: should we have different
"types" for the various stages of a pipeline, or is it preferable to
keep the same type across various stages?  When the stages of the
pipeline often do nothing at all, it can be a good choice to keep the
type unchanged.

E.g. in Lisp, macroexpansion returns something of the same "type" as its
input: it's OK to pass the output of `macroexpand(-all)` back to
`macroexpand(-all)`.  Scheme made a different design decision on this
one (for hygiene reasons).

Other example: keys as they go through `keyboard-coding-system`,
`input-decode-map`, `function-key-map`, `key-translation-map`.  Here we
decided to keep the type the same.

> In Yasnippet, if I remember correctly (it was a long time ago), the
> snippet directory could either be renamed foo-base-mode or something
> in a .yas-parents inside that directory can be added containing
> "foo-base-mode".

AFAIK adding a `.yas-parents` containing "FOO-base-mode" just means
"oh, and also include the snippets defined for FOO-base-mode", which is
redundant with the fact that `FOO-ts-mode` already derived from
`FOO-base-mode`.

So that won't do it.  We'd need (like I recently mentioned in
https://github.com/joaotavora/yasnippet/issues/1169) something like
a new `.yas-children` (or `.yas-siblings`) which tells YASnippet
to use the current directory also for those additional modes.

>> That's why in my patch I add `python-mode` as an extra parent of
>> `python-ts-mode` even though they both share `python-base-mode` as
>> their parent.
>
> Right, that's what sounds hacky to me.  They're siblings, but now
> one is also the parent of the other.  Maybe it works, but it is
> definitely odd.

`python-mode` from `python.el`, `python-mode` from `python-mode.el`, and
`python-ts-mode` are three implementations of the `python-mode`
functionality, yes.

The fact that we use the same namespace for actual major modes and for
conceptual functionalities saves us from having to do:

    (derived-mode-add-parents 'python-mode '(python-mode))

:-)

The same happens with Debian package names and Debian package features:
package `emacs` implicitly provides the feature `emacs`.

> That said, if it works, I'm not really opposed to it.  What
> other packages do you know of, like Eglot and Yasnippet, which use
> major-mode in this way?

I'm not sure which modes might be affected (beside Eglot, YASnippet,
and CEDET).  I presume many others outside of Emacs are, since
`derived-mode-p` is used very often out there.


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Sat, 06 Jan 2024 04:17:02 GMT) Full text and rfc822 format available.

Message #86 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Yuan Fu <casouri <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>,
 João Távora <joaotavora <at> gmail.com>
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Fri, 05 Jan 2024 23:16:06 -0500
> Then there is the problem Eli pointed out: base-mode hooks run before the
> child major mode's body does.

[ Side note: This is not relevant to the present suggested patch.  ]

No, all the mode hooks are run at the end of the major mode body,
i.e. you get the following order:

   fundamental-mode body
   prog-mode body
   FOO-base-mode body
   FOO-ts-mode body
   fundamental-mode-hook
   prog-mode-hook
   FOO-base-mode-hook
   FOO-ts-mode-hook
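
The order above can be checked with a small sketch; the `my-demo-` modes
are made up, and only the base and TS levels are logged.

```elisp
;; All `my-demo-' names are hypothetical.
(defvar my-demo-log nil
  "Records the order in which mode bodies and hooks run.")

(define-derived-mode my-demo-base-mode prog-mode "Demo-Base"
  (push 'base-body my-demo-log))

(define-derived-mode my-demo-ts-mode my-demo-base-mode "Demo-TS"
  (push 'ts-body my-demo-log))

(add-hook 'my-demo-base-mode-hook (lambda () (push 'base-hook my-demo-log)))
(add-hook 'my-demo-ts-mode-hook   (lambda () (push 'ts-hook my-demo-log)))

(defun my-demo-order ()
  "Enable `my-demo-ts-mode' in a temp buffer and return the logged order."
  (setq my-demo-log nil)
  (with-temp-buffer
    (my-demo-ts-mode)
    (reverse my-demo-log)))

;; (my-demo-order) => (base-body ts-body base-hook ts-hook)
```

So a function on the base-mode hook does see the buffer-local settings
made by the child mode's body.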
   
> (Arguably a niche use case, but my point is that base-mode hooks have
> their limits.)

It was sufficiently "not niche" that I fixed that problem in Emacs-22
(according to `C-h v delay-mode-hooks`)  :-)


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Sat, 06 Jan 2024 08:08:01 GMT) Full text and rfc822 format available.

Message #89 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: João Távora <joaotavora <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, casouri <at> gmail.com, monnier <at> iro.umontreal.ca
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Sat, 06 Jan 2024 10:07:09 +0200
> From: João Távora <joaotavora <at> gmail.com>
> Date: Fri, 5 Jan 2024 23:20:26 +0000
> Cc: casouri <at> gmail.com, monnier <at> iro.umontreal.ca, 68246 <at> debbugs.gnu.org
> 
> On Fri, Jan 5, 2024 at 6:57 PM Eli Zaretskii <eliz <at> gnu.org> wrote:
> 
> > > Of course they do!!  How else would electric-pair-mode have worked
> > > for virtually every language for more than 10 years
> >
> > forward-sexp moves forward even when there are no parentheses or
> > braces anywhere in sight.
> 
> electric-pair-mode uses scan-sexps.  Scan-sexps works perfectly to
> navigate nested mixed delimiter structures of modes that are not Lisp,
> otherwise e-p-m couldn't do its auto-balancing job.

You are thinking about forward-sexp and scan-sexps in the situations
where you start from a parenthesis.  But scan-sexps, like
forward-sexp, handles situations where the "sexp" is not a balanced
parenthesized expression.  It uses syntax tables that in the case
where you start not from a paren yield ad-hoc results whose relation
to "sexps" is arbitrary.  Therefore, doing something similar with
results of parsing the source code will always produce arbitrary
results that are different, and make as little sense as what
forward-sexp does.
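
To make the non-paren case concrete, here is a small sketch using the
standard syntax table of a temporary buffer; `my-scan-one-sexp` is a
made-up helper.

```elisp
(defun my-scan-one-sexp (text)
  "Return the position `scan-sexps' reaches from the start of TEXT.
TEXT is inserted into a temporary buffer, which uses the standard
syntax table."
  (with-temp-buffer
    (insert text)
    (scan-sexps (point-min) 1)))

;; Starting on a paren, the whole balanced group is skipped:
(my-scan-one-sexp "(a b) c")   ; => 6, just past the closing paren

;; Starting on a word, only the word is skipped; the "call" as a
;; whole is not treated as one unit:
(my-scan-one-sexp "foo(bar)")  ; => 4, just past "foo"
```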

There was a discussion not long ago how to define a "sexp" in TS-based
modes so that it would make sense.  You can see there that the
conclusions are not self-evident.

But we digress.  My point in talking about sexps was that basing it on
syntax tables (like we do in traditional modes) will produce different
results than if we base them on TS parsers.  If you still disagree,
let's agree to disagree, because I already explained this twice at
least.

> > > Well the reason why e-p-m and these things work today for most ts
> > > modes is because they are also _using_ the Lisp/C parser based on
> > > syntax tables and syntax-propertize-function.
> >
> > That's because a language parser will not have any notion of a sexp,
> > so it cannot help.
> 
> As I am trying to explain, it doesn't have to be a "Lisp sexp".
> It just has to be something that scan-sexps can navigate, and
> scan-sexps works in all modes.

scan-sexps is not using the results of parsing by TS, so it doesn't
really understand the structure of the source code, and in particular
cannot provide reasonable movements by portions of expressions in at
least some languages.

Again, if you disagree, let's agree to disagree.

> I'd think that at least with a good enough grammar it's perfectly
> possible to do e.g. show-paren-mode with TreeSitter alone.

Why do you think so?  Does a TS parser produce any information about
matching parens/braces, let alone characters like <> etc?

> And the way this could work in Emacs is for TreeSitter to feed into
> scan-sexps.

I'm not sure I understand how TS could feed scan-sexps.  Did you look
at the implementation of scan-sexps?  AFAICT, there's no way to base
the code there on TS, except by providing a completely different
implementation (assuming TS parsers even provide the required
information).

> > > > I invite you to compare CC mode with c-ts-mode, and see for yourself
> > > > how the common grounds are very small.  It seems surprising at first
> > > > sight, but once you look at the code, it is very clear.
> > >
> > > And this is mainly because CC mode is, well, rather corpulent software,
> > > let's put it like that.  This is why I wrote it makes sense to start
> > > from scratch for this one.
> >
> > A discussion where you brush aside any argument that doesn't fit your
> > theory is not a useful one.
> 
> ? You write this precisely in the point where I _agree_ with you.

You "agree" for the wrong reasons.  You are, in fact, claiming that
the CC mode cannot be an example of a problem because of unrelated
reasons.  I'm saying that the reasons are related.

> > This means that the
> > base-mode hook will not see a mode that is ready for work, only its
> > beginning.
> 
> Correct.  But a major-mode doesn't have to be "ready for work" (I presume
> you mean ready for editing) for the hook to be useful.  That hook would
> be perfectly suitable for setting variables used by minor modes and other
> things. (eglot-server-program, flymake-diagnostic-functions,  company-backends,
> mode-line-format, etc etc)
> For turning on minor modes (eglot-ensure, company-mode, yasnippet-minor-mode,)
> For binding commands.

Stefan's patch solves these cases in a simpler manner.

> And even without the hook the mere fact that foo-mode and foo-ts-mode
> are derived from foo-base-mode according to derived-mode-p makes it
> useful.

Stefan's patch solves this in a simpler manner.

> > > >  and this various
> > > > things that you'd want to do in a mode hook are impossible in the
> > > > base-mode hook.
> > >
> > > I don't follow this part either.  Can you give an example using, say
> > > the existing ruby-base-mode.
> >
> > Again, look at the two examples I mentioned above.
> 
> I couldn't see the problem in either python.el or sh-script.el.

Search for bugs in those two files, and you will see the issues that I
had in mind.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Sat, 06 Jan 2024 08:11:02 GMT) Full text and rfc822 format available.

Message #92 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: João Távora <joaotavora <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, casouri <at> gmail.com, stefankangas <at> gmail.com,
 monnier <at> iro.umontreal.ca
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Sat, 06 Jan 2024 10:09:32 +0200
> From: João Távora <joaotavora <at> gmail.com>
> Date: Fri, 5 Jan 2024 23:37:46 +0000
> Cc: Eli Zaretskii <eliz <at> gnu.org>, Yuan Fu <casouri <at> gmail.com>, 68246 <at> debbugs.gnu.org, 
> 	monnier <at> iro.umontreal.ca
> 
> > But even if we added all the base modes today (as empty stubs), AFAIU it
> > wouldn't solve the exact problem that Monnier's patch is addressing,
> > namely to make packages and customizations work in both foo-mode and
> > foo-ts-mode even if they only say:
> >
> >     (derived-mode-p 'foo-mode)
> 
> That's why these packages should be changed to say
> 
>   (derived-mode-p 'foo-base-mode)
> 
> Or maybe
> 
>    (cl-some #'derived-mode-p '(foo-mode foo-base-mode))

Stefan's patch solves this in a simpler way, which is also friendlier
to 3rd-party packages out there.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Sat, 06 Jan 2024 08:14:01 GMT) Full text and rfc822 format available.

Message #95 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: João Távora <joaotavora <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, casouri <at> gmail.com, monnier <at> iro.umontreal.ca
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Sat, 06 Jan 2024 10:12:54 +0200
> From: João Távora <joaotavora <at> gmail.com>
> Date: Sat, 6 Jan 2024 00:16:02 +0000
> Cc: Eli Zaretskii <eliz <at> gnu.org>, casouri <at> gmail.com, 68246 <at> debbugs.gnu.org
> 
> On Fri, Jan 5, 2024 at 11:51 PM Stefan Monnier <monnier <at> iro.umontreal.ca> wrote:
> > That's why in my patch I add `python-mode` as an extra parent of
> > `python-ts-mode` even though they both share `python-base-mode` as
> > their parent.
> 
> Right, that's what sounds hacky to me.  They're siblings, but now
> one also is the parent of the other.

Not "parents", "extra parents".  The documentation explains the
purpose of this.

> And I don't see how "foo-base-mode" is much worse, modulo slightly
> more awkward naming and backward compatibility problems that you would
> have anyway.

It requires more changes everywhere, for starters.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Sat, 06 Jan 2024 13:54:01 GMT) Full text and rfc822 format available.

Message #98 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 68246 <at> debbugs.gnu.org, casouri <at> gmail.com, monnier <at> iro.umontreal.ca
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Sat, 6 Jan 2024 13:52:48 +0000
On Sat, Jan 6, 2024 at 8:07 AM Eli Zaretskii <eliz <at> gnu.org> wrote:

> let's agree to disagree, because I already explained this twice at
> least.

I was just commenting on your earlier words

"That's because a language parser will not have any notion of a sexp,
so it cannot help."

That's manifestly wrong.  Even with a limited understanding of what
a sexp is, it can definitely contribute to sexp navigation as used
by scan-sexps clients.

> Again, if you disagree, let's agree to disagree.

Yes, let's.  But what you said doesn't contradict my assertion that TS
can feed into scan-sexps in ways that are useful.

> > And the way this could work in Emacs is for TreeSitter to feed into
> > scan-sexps.
>
> I'm not sure I understand how TS could feed scan-sexps.  Did you look
> at the implementation of scan-sexps?  AFAICT, there's no way to base
> the code there on TS, except by providing a completely different
> implementation (assuming TS parsers even provide the required
> information).

One reasonably effective way to do it is to have
syntax-propertize-function propertize text based on TS nodes, which
is what c-ts-mode does today to make show-paren-mode do the right
thing with angle brackets.  Pretty useful already.  You're still running
through text properties and that's maybe not very pretty, but it's an
implementation detail all the same.  There could be some more
efficient connection between sexps and TS (or whatever parser)
later on.

> > > > > I invite you to compare CC mode with c-ts-mode, and see for yourself
> > > > > how the common grounds are very small.  It seems surprising at first
> > > > > sight, but once you look at the code, it is very clear.
> > > >
> > > > And this is mainly because CC mode is, well, rather corpulent software,
> > > > let's put it like that.  This is why I wrote it makes sense to start
> > > > from scratch for this one.
> > >
> > > A discussion where you brush aside any argument that doesn't fit your
> > > theory is not a useful one.
> >
> > ? You write this precisely in the point where I _agree_ with you.
>
> You "agree" for the wrong reasons.  You are, in fact, claiming that
> the CC mode cannot be an example of a problem because of unrelated
> reasons.  I'm saying that the reasons are related.

Actually it _could_ be an example.  A c-uber-base-mode would have
_some_ benefit, like it would be a good place to stash snippets.
Just not as much benefit as others which have code.

> > > This means that the
> > > base-mode hook will not see a mode that is ready for work, only its
> > > beginning.
> >
> > Correct.  But a major-mode doesn't have to be "ready for work" (I presume
> > you mean ready for editing) for the hook to be useful.  That hook would
> > be perfectly suitable for setting variables used by minor modes and other
> > things. (eglot-server-program, flymake-diagnostic-functions,  company-backends,
> > mode-line-format, etc etc)
> > For turning on minor modes (eglot-ensure, company-mode, yasnippet-minor-mode,)
> > For binding commands.
>
> Stefan's patch solves these cases in a simpler manner.

I don't oppose Stefan's patch strongly, but I don't think it's
simpler.  For one, it's innovative: a new way to do things
that we could already do with the simpler-to-reason-about inheritance
that we've had for much longer.

As far as motivation goes, the problems in Eglot and Yasnippet
are easily solvable without it.  Not sure about CEDET.

As Stefan K points out this needs calling out in *Help*.

> > I couldn't see the problem in either python.el or sh-script.el.
>
> Search for bugs in those two files, and you will see the issues that I
> had in mind.

Yes, I will most definitely start my "search for bugs" in two
files totalling 10000 lines and let you know in 200 years or so.

João




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Sat, 06 Jan 2024 14:37:02 GMT) Full text and rfc822 format available.

Message #101 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, casouri <at> gmail.com
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Sat, 6 Jan 2024 14:36:17 +0000
On Sat, Jan 6, 2024 at 4:08 AM Stefan Monnier <monnier <at> iro.umontreal.ca> wrote:

> Other example: keys as they go through `keyboard-coding-system`,
> `input-decode-map`, `function-key-map`, `key-translation-map`.  Here we
> decided to keep the type the same.

As much as I want to believe these arguments for the conceptual
solidity of this "extra-parents" idea, I still think this is
burdensome.

As a data point, I've had a fair number of Eglot users confused about
simple single inheritance as it stands.

> The same happens with Debian package names and Debian package features:
> package `emacs` implicitly provides the feature `emacs`.

[I don't find this model simple either, not as a casual Debian user.
But at least there they have clear separate concepts of "package" and
"feature", which seems to hint at my "mode family" or "file type" idea]

> I'm not sure which modes might be affected (beside Eglot, YASnippet,
> and CEDET).  I presume many others outside of Emacs are, since
> `derived-mode-p` is used very often out there.

Personally, I think if Eglot, YASnippet and CEDET are all we
actually know about, I think it's very simple to fix 2 out of three (even
_without_ the "base mode").  And the 3 out of 3 I _think_ I can
fix once someone points me to what exactly it should do.

João




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Sat, 06 Jan 2024 14:56:01 GMT) Full text and rfc822 format available.

Message #104 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Yuan Fu <casouri <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>,
 monnier <at> iro.umontreal.ca
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Sat, 6 Jan 2024 14:54:49 +0000
On Sat, Jan 6, 2024 at 3:19 AM Yuan Fu <casouri <at> gmail.com> wrote:
>
> I certainly welcome base-mode, I’m the one that added them in the first place. But I also want to point out that they are only a partial solution. For one, adding the base mode needs cooperation from all the major mode authors. For built-in modes, that’s not a (big) problem; for countless third-party modes out there, I don’t have high hopes for it.

I don't think this is such a serious problem.  Ultimately, the
affected parties aren't the major modes themselves, but the minor
modes that base decisions on the 'major-mode' variable and
derived-mode-p.  So far the examples are Eglot and Yasnippet (and, in
some unknown way, CEDET).  Expressing the "foo-ts-mode" to
"foo-mode" relation directly in these minor modes isn't really
hard; in fact Eglot already does it for something completely
different that Stefan's patch wouldn't be able to fix anyway:

  (or language-id
      (or (get sym 'eglot-language-id)
          (replace-regexp-in-string
              "\\(?:-ts\\)?-mode$" ""
              (symbol-name sym))))
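
As a standalone sketch of what that fragment computes, with the same
regexp but a made-up function name and a made-up `my-language-id`
property standing in for Eglot's:

```elisp
(defun my-guess-language-id (mode)
  "Guess a language name for the major-mode symbol MODE.
An explicit `my-language-id' property (hypothetical) wins; otherwise
strip a trailing \"-mode\" or \"-ts-mode\" from the symbol name."
  (or (get mode 'my-language-id)
      (replace-regexp-in-string "\\(?:-ts\\)?-mode$" ""
                                (symbol-name mode))))

(my-guess-language-id 'python-mode)     ; => "python"
(my-guess-language-id 'python-ts-mode)  ; => "python"
```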

This is hacky but it works fine.  To do better, we would need
an abstraction like

   (get-mode-family major-mode)

So, personally I think Stefan's patch is a rather heavy hammer
to fix something that has many alternatives.

> The good thing about derived-mode-add-parents is that it doesn’t need major
> mode author’s cooperation. Even a normal user can do it themselves.

Yes, precisely that's good.  I actually think
derived-mode-remove-parents is more useful, as it would fix a
real bug (bug#67463) that doesn't seem to have _any_ other solution.

> Then there is the problem Eli pointed out: base-mode hooks run before the child major mode's body does. It’s probably fine for most things, but if you want to change some buffer-local variable that the major mode sets, the base-mode hook can’t help. (Arguably a niche use case, but my point is that base-mode hooks have their limits.)

As Dmitry pointed out, this is so for normal inheritance.  It's
not really a problem of the mechanism itself.

> As for adding xxx-mode as a parent of xxx-ts-mode, I think it’s fine? Like others in the thread, I couldn’t think of a scenario where this would be problematic. I thought about adding xxx-lang as the parent of both xxx-mode and xxx-ts-mode, but that’s probably not very helpful, since the goal is to make things work for ts-mode without needing to change the existing code, and using xxx-lang still requires modifying existing code.

Dunno if problematic, time will tell.  I have run into
"endless cycle" problems with bad derived-mode-p in the past.
Granted, these were all bugs in my code, since fixed.
Above all I think it's a bit awkward for users to grasp
this whereas a cleaner "file type registry/language family"
or some other cleaner concept seems more solid.

But maybe if I squint very hard and try to think of
a (single?) "extra parent" as the "language family", maybe
it will work flawlessly.

Regardless I still think the concept of "language id", "file type"
or "family" or whatever should be described somewhere and
linked to our particular implementation of it, currently
based on "extra parenting".




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Sat, 06 Jan 2024 15:52:02 GMT) Full text and rfc822 format available.

Message #107 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: João Távora <joaotavora <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, casouri <at> gmail.com
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Sat, 06 Jan 2024 10:50:53 -0500
> Personally, I think if Eglot, YASnippet and CEDET are all we
> actually know about, I think it's very simple to fix 2 out of three (even
> _without_ the "base mode").  And the 3 out of 3 I _think_ I can
> fix once someone points me to what exactly it should do.

I don't think embedding (repeatedly) in YASnippet, CEDET, Eglot, ffap,
lsp-mode, (and all the others that I don't happen to know offhand) the
knowledge that `FOO-ts-mode` is used for the same files as `FOO-mode`
qualifies as "fixing".

Instead, I'd call it "working around the lack of info which should be
provided by the mode".  The way my patch does it may not be The Right
Way (tho the jury is still out on this), but at least it provides the
info at the right place.


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Sat, 06 Jan 2024 22:24:01 GMT) Full text and rfc822 format available.

Message #110 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, casouri <at> gmail.com
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Sat, 6 Jan 2024 22:22:49 +0000
On Sat, Jan 6, 2024 at 3:50 PM Stefan Monnier <monnier <at> iro.umontreal.ca> wrote:
>
> > Personally, I think if Eglot, YASnippet and CEDET are all we
> > actually know about, I think it's very simple to fix 2 out of three (even
> > _without_ the "base mode").  And the 3 out of 3 I _think_ I can
> > fix once someone points me to what exactly it should do.
>
> I don't think embedding (repeatedly) in YASnippet, CEDET, Eglot, ffap,
> lsp-mode, (and all the others that I don't happen to know offhand) the
> knowledge that `FOO-ts-mode` is used for the same files as `FOO-mode`
> qualifies as "fixing".

Not ideal, yes.  Depends on what needs to be done for each package.  It
would be good if we knew that.  I'll only speak for Eglot, YASnippet and
possibly for lsp-mode.  These 3 packages need to know the language of
the buffer they are operating on.  That's ultimately and definitely what
they want.

They do that with major-modes, because that has very often been a
good correspondence, often a 1-to-1, but not always. The N-to-1 problem
is not new with ts modes, it has long existed (Yasnippet is turning
15!).

> Instead, I'd call it "working around the lack of info which should be
> provided by the mode".  The way my patch does it may not be The Right
> Way (tho the jury is still out on this), but at least it provides the
> info at the right place.

Yes.  If by "at the right place" you mean "nearby the major modes
definitions themselves", yes I agree.

If you want to solve this problem cleanly, or more cleanly than it is
solved now, I'm all for it.  I just think "extra parents" is a
clumsy abstraction, though it can be used as an implementation aid.

For simplifying the existing ways these 3 packages already
solve the problem, I propose that instead of calling 'derived-mode-add-parents'
directly near the mode definitions, we instead do something like:

   (set-language-for-mode 'foo-ts-mode 'foo)

As a first implementation, this might just as well expand to

   (derived-mode-add-parents 'foo-ts-mode '(foo-mode))

I.e. it will be 100% your solution.  `derived-mode-p` will be
affected as you want and hopefully everything will keep working
as usual (not necessarily "start working" except for packages
not aware of TS modes at all).

But of course that's not all, because with very little work it also
provides for:

   (get-language-for-mode 'foo-ts-mode) ; => 'foo

And _that's_ when Eglot, Yasnippet, very likely lsp-mode and possibly
many others will see actual simplifications.

If we like this idea, I also think supporting a :language keyword to
'define-derived-mode' is natural.  Obviously, it would just expand to
that 'set-language-for-mode' and default to whatever comes before the
"-mode" in the symbol.

So in summary this is what I think -- from the perspective of these 2/3
minor modes.  Yes this introduces the concept of "language", but I think
-- in fact I'm sure given the interactions I've had with new Eglot users
-- that "language" is a much simpler concept to grasp than "extra parents".

Finally, how to give this to packages outside the tree?  The usual
route: do it in core and package it either in compat.el or in a :core
GNU Elpa package, like we did for external-completion.el.  Keep the
derived-mode-add-parents (and add a "remove" counterpart), but suggest
them as last resort options/hammers.

João
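
[ Illustration, not part of the original message: a minimal sketch of the
  getter/setter pair proposed above.  The `my-` prefixed names and the
  `my-mode-language` symbol property are hypothetical; nothing like them
  exists in Emacs.  `derived-mode-add-parents` is only available in Emacs
  30 and later, so the call is guarded, and the sketch assumes the classic
  mode is named LANGUAGE-mode. ]

```elisp
;; Hypothetical helpers sketching the proposal; not an Emacs API.
(defun my-set-language-for-mode (mode language)
  "Record LANGUAGE (a symbol) as the language handled by MODE.
As a first implementation, also make LANGUAGE-mode an extra parent
of MODE when `derived-mode-add-parents' is available (Emacs 30+)."
  (put mode 'my-mode-language language)
  (let ((classic (intern (format "%s-mode" language))))
    (when (and (fboundp 'derived-mode-add-parents)
               (not (eq classic mode)))
      (derived-mode-add-parents mode (list classic))))
  language)

(defun my-get-language-for-mode (mode)
  "Return the language recorded for MODE.
Default to whatever comes before \"-mode\" in MODE's name, or nil
if the name does not end in \"-mode\"."
  (or (get mode 'my-mode-language)
      (let ((name (symbol-name mode)))
        (and (string-suffix-p "-mode" name)
             (intern (substring name 0 -5))))))

(my-set-language-for-mode 'my-foo-ts-mode 'my-foo)
(my-get-language-for-mode 'my-foo-ts-mode) ; => my-foo (registered)
(my-get-language-for-mode 'my-foo-mode)    ; => my-foo (name default)
```

[ With such a registry, a package could ask for the language directly
  instead of parsing mode names or walking the list of parents. ]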




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Sun, 07 Jan 2024 06:56:01 GMT) Full text and rfc822 format available.

Message #113 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: João Távora <joaotavora <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, casouri <at> gmail.com, monnier <at> iro.umontreal.ca
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Sun, 07 Jan 2024 08:55:34 +0200
> From: João Távora <joaotavora <at> gmail.com>
> Date: Sat, 6 Jan 2024 22:22:49 +0000
> Cc: Eli Zaretskii <eliz <at> gnu.org>, casouri <at> gmail.com, 68246 <at> debbugs.gnu.org
> 
> On Sat, Jan 6, 2024 at 3:50 PM Stefan Monnier <monnier <at> iro.umontreal.ca> wrote:
> For simplifying the existing ways these 3 packages already
> solve the problem, I propose that instead of calling 'derived-mode-add-parents'
> directly near the mode definitions, we instead do something like:
> 
>    (set-language-for-mode 'foo-ts-mode 'foo)

This assumes that the only important aspect of a major mode that
transcends the "normal" ancestry is the language that the mode
supports.  But that is not necessarily true in all cases.  Also, some
major modes don't have a "language" attribute, in the usual sense of
that word.

IOW, this is IMO an even more leaky abstraction than what we get with
derived-mode-add-parents.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Sun, 07 Jan 2024 07:01:01 GMT) Full text and rfc822 format available.

Message #116 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Yuan Fu <casouri <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>,
 João Távora <joaotavora <at> gmail.com>
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Sat, 6 Jan 2024 22:59:57 -0800

> On Jan 5, 2024, at 8:16 PM, Stefan Monnier <monnier <at> iro.umontreal.ca> wrote:
> 
>> Then there is the problem Eli pointed out, base-mode hooks run before child
>> major mode body does.
> 
> [ Side note: This is not relevant to the present suggested patch.  ]
> 
> No, all the mode hooks are run at the end of the major mode body,
> i.e. you get the following order:
> 
>   fundamental-mode body
>   prog-mode body
>   FOO-base-mode body
>   FOO-ts-mode body
>   fundamental-mode-hook
>   prog-mode-hook
>   FOO-base-mode-hook
>   FOO-ts-mode-hook

Ah, yes, of course! We talked about this before and I completely forgot.

> 
>> (Arguably a niche use-case, but my point is base-mode hooks have
>> their limits.)
> 
> It was sufficiently "not niche" that I fixed that problem in Emacs-22
> (according to `C-h v delay-mode-hooks`)  :-)

And I thank you for that :-)

Yuan
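
[ Illustration, not part of the original message: a small sketch that
  records the order quoted above, with every mode body running before any
  mode hook.  The `my-demo-` names are hypothetical; the ordering relies
  on `define-derived-mode` wrapping the parent call in `delay-mode-hooks`. ]

```elisp
(defvar my-demo-log nil
  "Events pushed by the demo modes, most recent first.")

(define-derived-mode my-demo-base-mode prog-mode "DemoBase"
  "Hypothetical base mode shared by two implementations."
  (push 'base-body my-demo-log))

(define-derived-mode my-demo-ts-mode my-demo-base-mode "DemoTS"
  "Hypothetical child mode."
  (push 'ts-body my-demo-log))

(add-hook 'my-demo-base-mode-hook
          (lambda () (push 'base-hook my-demo-log)))
(add-hook 'my-demo-ts-mode-hook
          (lambda () (push 'ts-hook my-demo-log)))

(defun my-demo-mode-order ()
  "Enable `my-demo-ts-mode' in a scratch buffer and return the event order."
  (with-temp-buffer
    (setq my-demo-log nil)
    (my-demo-ts-mode)
    (reverse my-demo-log)))

(my-demo-mode-order) ; => (base-body ts-body base-hook ts-hook)
```

[ A function on `my-demo-base-mode-hook` therefore already sees the
  buffer-local variables set by the body of `my-demo-ts-mode`. ]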



Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Sun, 07 Jan 2024 07:01:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Mon, 08 Jan 2024 00:14:02 GMT) Full text and rfc822 format available.

Message #122 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 68246 <at> debbugs.gnu.org, casouri <at> gmail.com, monnier <at> iro.umontreal.ca
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Mon, 8 Jan 2024 00:12:50 +0000
On Sun, Jan 7, 2024 at 6:55 AM Eli Zaretskii <eliz <at> gnu.org> wrote:

> But that is not necessarily true in all cases.

I specifically said I was speaking for 2 packages I created,
Eglot and Yasnippet, and possibly for lsp-mode, which is facing the same
problem, which is answering the question:

  what, if any, is the language/file type for a given major mode?

I can't speak to those other cases unless someone brings them forth.

> Also, some major modes don't have a "language" attribute, in
> the usual sense of that word.

Then I guess "nil" would be a fine default for anything not inheriting from
"prog-mode"?  Or s/language/filetype if you prefer.  Dmitry said a language
database is missing.  Stefan mentioned the problem of conflation of file types
and major modes in Emacs.  I agree with both, so I thought of a simple
solution composed of a getter and a setter.

> IOW, this is IMO an even more leaky abstraction than what we get with
> derived-mode-add-parents.

We don't seem to share the same concept of what a "leaky abstraction" is.
In my world, it's an abstraction that exposes details of the thing
it's supposed to abstract away.  Unless we're trying to abstract away
lisp symbols, I don't see how set/get-language-for-mode is leaky.

But if Stefan's patch is supposed to also abstract away the
language-mode correspondence, it's definitely exposing details of
how it does it, which is via "extra parenting".

João




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Mon, 08 Jan 2024 03:35:02 GMT) Full text and rfc822 format available.

Message #125 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: João Távora <joaotavora <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, casouri <at> gmail.com, monnier <at> iro.umontreal.ca
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Mon, 08 Jan 2024 05:34:10 +0200
> From: João Távora <joaotavora <at> gmail.com>
> Date: Mon, 8 Jan 2024 00:12:50 +0000
> Cc: monnier <at> iro.umontreal.ca, casouri <at> gmail.com, 68246 <at> debbugs.gnu.org
> 
> On Sun, Jan 7, 2024 at 6:55 AM Eli Zaretskii <eliz <at> gnu.org> wrote:
> 
> > But that is not necessarily true in all cases.
> 
> I specifically said I was speaking for 2 packages I created,
> Eglot and Yasnippet, and possibly for lsp-mode, which is facing the same
> problem, which is answering the question:
> 
>   what, if any, is the language/file type for a given major mode?
> 
> I can't speak to those other cases unless someone brings them forth.

A generalization should be based on as many use cases as possible.

> > Also, some major modes don't have a "language" attribute, in
> > the usual sense of that word.
> 
> Then I guess "nil" would be a fine default for anything not inheriting from
> "prog-mode"?

That's not useful, since, for example, TS and non-TS modes for those
"no-language" modes will still want to be treated the same in some
situations, like .dir-locals.el.

> > IOW, this is IMO an even more leaky abstraction than what we get with
> > derived-mode-add-parents.
> 
> We don't seem to share the same concept of what a "leaky abstraction" is.
> In my world, it's an abstraction that exposes details of the thing
> it's supposed to abstract away.

That's the sense in which I'm using it.

> Unless we're trying to abstract away lisp symbols, I don't see how
> set/get-language-for-mode is leaky.

It attempts to abstract a trait that isn't abstract, by going in the
opposite direction of that used for abstractions.

> But if Stefan's patch is supposed to also abstract away the
> language-mode correspondence

It isn't, AFAIU.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Mon, 08 Jan 2024 04:12:01 GMT) Full text and rfc822 format available.

Message #128 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: João Távora <joaotavora <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, casouri <at> gmail.com
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Sun, 07 Jan 2024 23:11:00 -0500
>    (set-language-for-mode 'foo-ts-mode 'foo)

Maybe we want to introduce this concept, indeed.

Maybe we want to take that notion of "language" from elsewhere, such as
the one used in LSP?
Or maybe we want to take it from MIME types?
I'm sure there are other options out there.

Problem is: they come with their own complexities and corner cases.
After all, this is inevitable when you create a taxonomy.
IOW, while we *may* want to add support for an explicit notion of "file
type", it's a whole problem in itself and it will not solve all
our problems either.

In the mean time, I think `derived-mode-add-parents` is worth a try.
As mentioned in some message up-thread, I'm not 100% confident that it
won't introduce serious breakage.  But I think we do need more
experience and installing my patch is a good way to do that.


        Stefan
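
[ Illustration, not part of the original message: a minimal sketch of
  what the patch's `derived-mode-add-parents` calls are meant to achieve.
  The `my-foo-` modes are hypothetical stand-ins for a FOO-mode and
  FOO-ts-mode pair with no normal inheritance between them.  The function
  exists only in Emacs 30 and later, so the call is guarded. ]

```elisp
(define-derived-mode my-foo-mode prog-mode "Foo"
  "Hypothetical classic major mode.")

(define-derived-mode my-foo-ts-mode prog-mode "FooTS"
  "Hypothetical tree-sitter sibling; not derived from `my-foo-mode'.")

;; Declare the classic mode as an extra parent of the TS mode.
(when (fboundp 'derived-mode-add-parents)
  (derived-mode-add-parents 'my-foo-ts-mode '(my-foo-mode)))

(defun my-foo-ts-counts-as-foo-p ()
  "Return non-nil if `my-foo-ts-mode' is treated as derived from `my-foo-mode'."
  (provided-mode-derived-p 'my-foo-ts-mode 'my-foo-mode))

;; Non-nil on Emacs 30+, nil on older versions where the guard is skipped.
(my-foo-ts-counts-as-foo-p)
```

[ Code that tests `(derived-mode-p 'my-foo-mode)` would then also match
  buffers in `my-foo-ts-mode` without being changed. ]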





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Mon, 08 Jan 2024 10:52:01 GMT) Full text and rfc822 format available.

Message #131 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 68246 <at> debbugs.gnu.org, casouri <at> gmail.com, monnier <at> iro.umontreal.ca
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Mon, 8 Jan 2024 10:50:57 +0000
On Mon, Jan 8, 2024 at 3:34 AM Eli Zaretskii <eliz <at> gnu.org> wrote:

> That's not useful, since, for example, TS and non-TS modes for those
> "no-language" modes will still want to be treated the same in some
> situations, like .dir-locals.el.

Yup, so pass the same :language to them and call `get-language-for-mode`
somewhere in the .dir-locals.el machinery?  Else suggest to use the base mode,
if it exists.  If it doesn't exist, create it?

> It attempts to abstract a trait that isn't abstract, by going in the
> opposite direction of that used for abstractions.

It's interesting how you state a simple get/set is a "leaky abstraction",
but then also not an abstraction at all.

Let's put it like this:  Eglot should probably fix this actual code:

   (replace-regexp-in-string "\\(?:-ts\\)?-mode$" "" (symbol-name sym))

to find the language to report to the server.  Users should also be relieved
of writing or reading :languageId in complicated fashion in eglot-server-programs.

Stefan's patch doesn't really address this.  Fine.  But it _will_ affect
eglot-server-programs.  In fact someone (TM) should come up with a docstring
change to e-s-p that at least hints at this concept of "extra parents",
since it will start taking effect immediately for some modes, for some
Emacs versions.

What if an Eglot user wants some server just for the non-TS mode?  Or a
Yasnippet user some snippets for such a mode?  Or even just a regular
user some directory-local variable value?

The fact is that Eglot users, who are sometimes surprised by mode
inheritance as it is, will see a more complicated concept come
into force (in some Emacs versions, not all).  This concept will apply
to the python-mode/python-ts-mode pair but not the
clojure-mode/clojure-ts-mode one.

The following becomes harder to decipher:

   (add-to-list 'eglot-server-programs '(foo-mode "Fooey" "stdio"))

It may or may not go together with the typical:

   (add-hook 'foo-mode-hook 'my-foo-eglot-config)

depending on whether the user has Stefan's patch, whether it patches
"foo-ts-mode" and depending on the mode the user is running.

The more I think about this, the more I think this is problematic.

But if you guys are so confident, sure, let's try it.  If nothing bad
happens great, else I guess I'll refer users to this thread.

João
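
[ Illustration, not part of the original message: a sketch of the
  name-based heuristic quoted above, wrapped in a hypothetical function
  `my-guess-language-id`.  It shows where stripping the "-mode" suffix
  gives a usable identifier and where it does not; the LSP specification
  lists "cpp" and "javascript" as the identifiers for C++ and
  JavaScript. ]

```elisp
(defun my-guess-language-id (mode)
  "Guess a language id for MODE by stripping \"-mode\" or \"-ts-mode\"."
  (replace-regexp-in-string "\\(?:-ts\\)?-mode$" ""
                            (symbol-name mode)))

(my-guess-language-id 'python-mode)    ; => "python"
(my-guess-language-id 'python-ts-mode) ; => "python"
(my-guess-language-id 'c++-ts-mode)    ; => "c++", LSP expects "cpp"
(my-guess-language-id 'js-mode)        ; => "js", LSP expects "javascript"
```

[ The last two results are why a package relying on this guess still
  needs a per-mode override of some kind. ]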




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Mon, 08 Jan 2024 11:12:02 GMT) Full text and rfc822 format available.

Message #134 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, casouri <at> gmail.com
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Mon, 8 Jan 2024 11:11:31 +0000
On Mon, Jan 8, 2024 at 4:11 AM Stefan Monnier <monnier <at> iro.umontreal.ca> wrote:
>
> >    (set-language-for-mode 'foo-ts-mode 'foo)
>
> Maybe we want to introduce this concept, indeed.
>
> Maybe we want to take that notion of "language" from elsewhere, such as
> the one used in LSP?
> Or maybe we want to take it from MIME types?
> I'm sure there are other options out there.

I've seen the LSP list gain traction lately, granted in the M$ orbit.
I don't think MIME would be a bad choice either, we should
definitely use that if we s/language/file-type.

> Problem is: they come with their own complexities and corner cases.
> After all, this is inevitable when you create a taxonomy.

Yes, sure.  But they have the fundamental property of
simpler graphs (trivial in the case of the LSP list).  No "extra parents"
there :-)  IOW they solve the fundamental conflation problem.

> IOW, while we *may* want to add support for an explicit notion of "file
> type", it's a whole problem in itself and it will not solve all
> our problems either.

I think it would solve the ones I found.   Say if set/get-mode-file-type is
added, it may just return a string designating a MIME type.  We don't
necessarily have to give it special interpretation anywhere but where
we feel it's useful.

> In the mean time, I think `derived-mode-add-parents` is worth a try.
> As mentioned in some message up-thread, I'm not 100% confident that it
> won't introduce serious breakage.  But I think we do need more
> experience and installing my patch is a good way to do that.

I presume you've understood my misgivings, so go ahead.
I'll leave it up to you guys to decide if docstring changes
are needed in e-s-p or *Help* as someone suggested.

João




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Mon, 08 Jan 2024 12:47:02 GMT) Full text and rfc822 format available.

Message #137 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 68246 <at> debbugs.gnu.org, casouri <at> gmail.com, joaotavora <at> gmail.com
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Mon, 08 Jan 2024 14:45:39 +0200
> From: Stefan Monnier <monnier <at> iro.umontreal.ca>
> Cc: Eli Zaretskii <eliz <at> gnu.org>,  casouri <at> gmail.com,  68246 <at> debbugs.gnu.org
> Date: Sun, 07 Jan 2024 23:11:00 -0500
> 
> >    (set-language-for-mode 'foo-ts-mode 'foo)
> 
> Maybe we want to introduce this concept, indeed.
> 
> Maybe we want to take that notion of "language" from elsewhere, such as
> the one used in LSP?

Please don't call it "language".  That'd be confusing.  LSP is about
programming languages, so "language" is natural there.  But in Emacs,
a major mode is more general than that.  For example, it is not
unthinkable to consider mail-mode to be the extra-parent of
message-mode (or vice versa) -- but what is the "language" in that
case?

> Or maybe we want to take it from MIME types?
> I'm sure there are other options out there.
> 
> Problem is: they come with their own complexities and corner cases.
> After all, this is inevitable when you create a taxonomy.
> IOW, while we *may* want to add support for an explicit notion of "file
> type", it's a whole problem in itself and it will not solve all
> our problems either.

File types also have problems: Emacs modes are sometimes defined for
buffers that don't visit files.

> In the mean time, I think `derived-mode-add-parents` is worth a try.
> As mentioned in some message up-thread, I'm not 100% confident that it
> won't introduce serious breakage.  But I think we do need more
> experience and installing my patch is a good way to do that.

Yes, I think we should try it and see what we learn.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Mon, 08 Jan 2024 13:15:01 GMT) Full text and rfc822 format available.

Message #140 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: João Távora <joaotavora <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, casouri <at> gmail.com, monnier <at> iro.umontreal.ca
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Mon, 08 Jan 2024 15:13:53 +0200
> From: João Távora <joaotavora <at> gmail.com>
> Date: Mon, 8 Jan 2024 10:50:57 +0000
> Cc: monnier <at> iro.umontreal.ca, casouri <at> gmail.com, 68246 <at> debbugs.gnu.org
> 
> On Mon, Jan 8, 2024 at 3:34 AM Eli Zaretskii <eliz <at> gnu.org> wrote:
> 
> > That's not useful, since, for example, TS and non-TS modes for those
> > "no-language" modes will still want to be treated the same in some
> > situations, like .dir-locals.el.
> 
> Yup, so pass the same :language to them and call `get-language-for-mode`
> somewhere in the .dir-locals.el machinery?  Else suggest to use the base mode,
> if it exists.  If it doesn't exist, create it?

There's no "language" here.  Calling things by names they aren't is
not useful.  All it does is create confusion.

> > It attempts to abstract a trait that isn't abstract, by going in the
> > opposite direction of that used for abstractions.
> 
> It's interesting how you state a simple get/set is a "leaky abstraction",
> but then also not an abstraction at all.

It's "leaky" because it "leaks" the idea that it should be a
"language".

> Let's put it like this:  Eglot should probably fix this actual code:
> 
>    (replace-regexp-in-string "\\(?:-ts\\)?-mode$" "" (symbol-name sym))
> 
> to find the language to report to the server.

Eglot should not rely on the assumption that the "language", whatever
it is, is included verbatim in the mode's symbol name.  Neither should
Eglot assume anything else about the name of the mode.

> What if an Eglot user wants some server just for the non-TS mode?  Or a
> Yasnippet user some snippets for such a mode?  Or even just a regular
> user some directory-local variable value?

Eglot provides hooks to do that.

> The more I think about this, the more I think this is problematic.
> 
> But if you guys are so confident, sure, let's try it.  If nothing bad
> happens great, else I guess I'll refer users to this thread.

Yes.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Mon, 08 Jan 2024 14:46:01 GMT) Full text and rfc822 format available.

Message #143 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 68246 <at> debbugs.gnu.org, casouri <at> gmail.com, monnier <at> iro.umontreal.ca
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Mon, 8 Jan 2024 14:45:31 +0000
On Mon, Jan 8, 2024 at 1:14 PM Eli Zaretskii <eliz <at> gnu.org> wrote:

> > > It attempts to abstract a trait that isn't abstract, by going in the
> > > opposite direction of that used for abstractions.
> >
> > It's interesting how you state a simple get/set is a "leaky abstraction",
> > but then also not an abstraction at all.
>
> It's "leaky" because it "leaks" the idea that it should be a
> "language".

Oh right, of course.  Who came up with this "leaky" idea that there
are programming languages at all?

> > Let's put it like this:  Eglot should probably fix this actual code:
> >
> >    (replace-regexp-in-string "\\(?:-ts\\)?-mode$" "" (symbol-name sym))
> >
> > to find the language to report to the server.
>
> Eglot should not rely on the assumption that the "language", whatever
> it is, is included verbatim in the mode's symbol name.  Neither should
> Eglot assume anything else about the name of the mode.

You could have capped it off with "neither should it work" for
maximum consistency.

Else, it has to have this database itself.  And so does Yasnippet and
likely others.  This is what Stefan had to say about it:

> > I don't think embedding (repeatedly) in YASnippet, CEDET, Eglot, ffap,
> > lsp-mode, (and all the others that I don't happen to know offhand) the
> > knowledge that `FOO-ts-mode` is used for the same files as `FOO-mode`
> > qualifies as "fixing".

So that's why I formalized what others had already proposed, but
you bash it with whatever random CS adjectives you find at hand.

> > What if an Eglot user wants some server just for the non-TS mode?  Or a
> > Yasnippet user some snippets for such a mode?  Or even just a regular
> > user some directory-local variable value?
>
> Eglot provides hooks to do that.

???  I think it's better to not answer when you don't have an answer.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Mon, 08 Jan 2024 17:17:01 GMT) Full text and rfc822 format available.

Message #146 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: João Távora <joaotavora <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, casouri <at> gmail.com, monnier <at> iro.umontreal.ca
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Mon, 08 Jan 2024 19:15:46 +0200
> From: João Távora <joaotavora <at> gmail.com>
> Date: Mon, 8 Jan 2024 14:45:31 +0000
> Cc: monnier <at> iro.umontreal.ca, casouri <at> gmail.com, 68246 <at> debbugs.gnu.org
> 
> On Mon, Jan 8, 2024 at 1:14 PM Eli Zaretskii <eliz <at> gnu.org> wrote:
> 
> > > > It attempts to abstract a trait that isn't abstract, by going in the
> > > > opposite direction of that used for abstractions.
> > >
> > > It's interesting how you state a simple get/set is a "leaky abstraction",
> > > but then also not an abstraction at all.
> >
> > It's "leaky" because it "leaks" the idea that it should be a
> > "language".
> 
> Oh right, of course.  Who came up with this "leaky" idea that there
> are programming languages at all?

For some reason, once any discussion with you gets past some number of
messages, it always deteriorates into a stream of ad-hominem and
sarcastic nonsense.  I'm outta here.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Mon, 08 Jan 2024 18:18:02 GMT) Full text and rfc822 format available.

Message #149 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: 68246 <at> debbugs.gnu.org
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Mon, 08 Jan 2024 13:16:48 -0500
[Message part 1 (text/plain, inline)]
Here's an updated version of my patch, which tries to simplify some of
the code in the generic packages.  I also added `perl-mode` as
parent to `cperl-mode`, following the same reasoning, which helped me
find other generic packages affected.

Still no doc updates included.


        Stefan
[ts-parents.patch (text/x-diff, inline)]
diff --git a/.dir-locals.el b/.dir-locals.el
index ce7febca851..4edcad458a4 100644
--- a/.dir-locals.el
+++ b/.dir-locals.el
@@ -27,9 +27,7 @@
                (electric-quote-comment . nil)
                (electric-quote-string . nil)
 	       (mode . bug-reference-prog)))
- (c-ts-mode . ((c-ts-mode-indent-style . gnu)
-               (indent-tabs-mode . t)
-               (mode . bug-reference-prog)))
+ (c-ts-mode . ((c-ts-mode-indent-style . gnu))) ;Inherits `c-mode' settings.
  (log-edit-mode . ((log-edit-font-lock-gnu-style . t)
                    (log-edit-setup-add-author . t)
 		   (vc-git-log-edit-summary-target-len . 50)))
diff --git a/lisp/align.el b/lisp/align.el
index fa95f24fa02..81ccc4b5e2d 100644
--- a/lisp/align.el
+++ b/lisp/align.el
@@ -181,13 +181,12 @@ align-large-region
   :type '(choice (const :tag "Align a large region silently" nil) integer)
   :group 'align)
 
-(defcustom align-c++-modes '( c++-mode c-mode java-mode
-                              c-ts-mode c++-ts-mode)
+(defcustom align-c++-modes '( c++-mode c-mode java-mode)
   "A list of modes whose syntax resembles C/C++."
   :type '(repeat symbol)
   :group 'align)
 
-(defcustom align-perl-modes '(perl-mode cperl-mode)
+(defcustom align-perl-modes '(perl-mode)
   "A list of modes where Perl syntax is to be seen."
   :type '(repeat symbol)
   :group 'align)
@@ -576,13 +575,13 @@ align-rules-list
                     "="
                     (group (zero-or-more (syntax whitespace)))))
      (group . (1 2))
-     (modes . '(conf-toml-mode toml-ts-mode lua-mode lua-ts-mode)))
+     (modes . '(conf-toml-mode lua-mode)))
 
     (double-dash-comment
      (regexp . ,(rx (group (zero-or-more (syntax whitespace)))
                     "--"
                     (zero-or-more nonl)))
-     (modes  . '(lua-mode lua-ts-mode))
+     (modes  . '(lua-mode))
      (column . comment-column)
      (valid  . ,(lambda ()
                   (save-excursion
diff --git a/lisp/cedet/semantic/symref/grep.el b/lisp/cedet/semantic/symref/grep.el
index 83e3bc36073..cc4d1546c85 100644
--- a/lisp/cedet/semantic/symref/grep.el
+++ b/lisp/cedet/semantic/symref/grep.el
@@ -44,9 +44,7 @@ semantic-symref-tool-grep
 
 (defvar semantic-symref-filepattern-alist
   '((c-mode "*.[ch]")
-    (c-ts-mode "*.[ch]")
     (c++-mode "*.[chCH]" "*.[ch]pp" "*.cc" "*.hh")
-    (c++-ts-mode "*.[chCH]" "*.[ch]pp" "*.cc" "*.hh")
     (html-mode "*.html" "*.shtml" "*.php")
     (mhtml-mode "*.html" "*.shtml" "*.php") ; FIXME: remove
                                             ; duplication of
@@ -55,12 +53,8 @@ semantic-symref-filepattern-alist
                                             ; major mode definition?
     (ruby-mode "*.r[bu]" "*.rake" "*.gemspec" "*.erb" "*.haml"
                "Rakefile" "Thorfile" "Capfile" "Guardfile" "Vagrantfile")
-    (ruby-ts-mode "*.r[bu]" "*.rake" "*.gemspec" "*.erb" "*.haml"
-                  "Rakefile" "Thorfile" "Capfile" "Guardfile" "Vagrantfile")
     (python-mode "*.py" "*.pyi" "*.pyw")
-    (python-ts-mode "*.py" "*.pyi" "*.pyw")
     (perl-mode "*.pl" "*.PL")
-    (cperl-mode "*.pl" "*.PL")
     (lisp-interaction-mode "*.el" "*.ede" ".emacs" "_emacs")
     )
   "List of major modes and file extension pattern.
diff --git a/lisp/emulation/viper.el b/lisp/emulation/viper.el
index 83fcdf89375..287292a24dc 100644
--- a/lisp/emulation/viper.el
+++ b/lisp/emulation/viper.el
@@ -388,7 +388,6 @@ viper-vi-state-mode-list
     idl-mode
 
     perl-mode
-    cperl-mode
     javascript-mode
     tcl-mode
     python-mode
diff --git a/lisp/files.el b/lisp/files.el
index 8b4e4394e5a..e546e9473a3 100644
--- a/lisp/files.el
+++ b/lisp/files.el
@@ -4414,6 +4414,12 @@ dir-locals-collect-variables
                   (funcall predicate key)
                 (or (not key)
                     (derived-mode-p key)))
+              ;; If KEY is an extra parent it may remain not loaded
+              ;; (hence with some of its mode-specific vars missing their
+              ;; `safe-local-variable' property), leading to spurious
+              ;; prompts about unsafe vars (bug#68246).
+              (if (and (symbolp key) (autoloadp (indirect-function key)))
+                  (ignore-errors (autoload-do-load (indirect-function key))))
               (let* ((alist (cdr entry))
                      (subdirs (assq 'subdirs alist)))
                 (if (or (not subdirs)
diff --git a/lisp/htmlfontify.el b/lisp/htmlfontify.el
index 6b9c623f31f..89c2bee2204 100644
--- a/lisp/htmlfontify.el
+++ b/lisp/htmlfontify.el
@@ -586,6 +586,7 @@ 'hfy-colour-vals
 (defvar hfy-cperl-mode-kludged-p nil)
 
 (defun hfy-kludge-cperl-mode ()
+  ;; FIXME: Still?
   "CPerl mode does its damnedest not to do some of its fontification when not
 in a windowing system - try to trick it..."
   (declare (obsolete nil "28.1"))
diff --git a/lisp/info-look.el b/lisp/info-look.el
index da7beafe500..cd59fdf17d7 100644
--- a/lisp/info-look.el
+++ b/lisp/info-look.el
@@ -985,9 +985,8 @@ info-complete
                                  finally return "(python)Index")))))
 
 (info-lookup-maybe-add-help
- :mode 'cperl-mode
- :regexp "[$@%][^a-zA-Z]\\|\\$\\^[A-Z]\\|[$@%]?[a-zA-Z][_a-zA-Z0-9]*"
- :other-modes '(perl-mode))
+ :mode 'perl-mode
+ :regexp "[$@%][^a-zA-Z]\\|\\$\\^[A-Z]\\|[$@%]?[a-zA-Z][_a-zA-Z0-9]*")
 
 (info-lookup-maybe-add-help
  :mode 'latex-mode
diff --git a/lisp/progmodes/c-ts-mode.el b/lisp/progmodes/c-ts-mode.el
index e5835bdb62d..461218cbb7d 100644
--- a/lisp/progmodes/c-ts-mode.el
+++ b/lisp/progmodes/c-ts-mode.el
@@ -1314,6 +1314,8 @@ c-ts-mode
                   (lambda (_pos) 'c))
       (treesit-font-lock-recompute-features '(emacs-devel)))))
 
+(derived-mode-add-parents 'c-ts-mode '(c-mode))
+
 ;;;###autoload
 (define-derived-mode c++-ts-mode c-ts-base-mode "C++"
   "Major mode for editing C++, powered by tree-sitter.
@@ -1357,6 +1359,8 @@ c++-ts-mode
       (setq-local add-log-current-defun-function
                   #'c-ts-mode--emacs-current-defun-name))))
 
+(derived-mode-add-parents 'c++-ts-mode '(c++-mode))
+
 (easy-menu-define c-ts-mode-menu (list c-ts-mode-map c++-ts-mode-map)
   "Menu for `c-ts-mode' and `c++-ts-mode'."
   '("C/C++"
diff --git a/lisp/progmodes/cmake-ts-mode.el b/lisp/progmodes/cmake-ts-mode.el
index d933e4ebb81..2a185fb0aa2 100644
--- a/lisp/progmodes/cmake-ts-mode.el
+++ b/lisp/progmodes/cmake-ts-mode.el
@@ -261,6 +261,8 @@ cmake-ts-mode
 
     (treesit-major-mode-setup)))
 
+(derived-mode-add-parents 'cmake-ts-mode '(cmake-mode))
+
 (if (treesit-ready-p 'cmake)
     (add-to-list 'auto-mode-alist
                  '("\\(?:CMakeLists\\.txt\\|\\.cmake\\)\\'" . cmake-ts-mode)))
diff --git a/lisp/progmodes/cperl-mode.el b/lisp/progmodes/cperl-mode.el
index 9f7f29b8182..2496100c21c 100644
--- a/lisp/progmodes/cperl-mode.el
+++ b/lisp/progmodes/cperl-mode.el
@@ -1922,6 +1922,8 @@ cperl-mode
   ;; Setup Flymake
   (add-hook 'flymake-diagnostic-functions #'perl-flymake nil t))
 
+(derived-mode-add-parents 'cperl-mode '(perl-mode))
+
 (defun cperl--set-file-style ()
   (when cperl-file-style
     (cperl-set-style cperl-file-style)))
diff --git a/lisp/progmodes/csharp-mode.el b/lisp/progmodes/csharp-mode.el
index 7bf57bcbe21..18114d08528 100644
--- a/lisp/progmodes/csharp-mode.el
+++ b/lisp/progmodes/csharp-mode.el
@@ -998,6 +998,8 @@ csharp-ts-mode
 
   (add-to-list 'auto-mode-alist '("\\.cs\\'" . csharp-ts-mode)))
 
+(derived-mode-add-parents 'csharp-ts-mode '(csharp-mode))
+
 (provide 'csharp-mode)
 
 ;;; csharp-mode.el ends here
diff --git a/lisp/progmodes/dockerfile-ts-mode.el b/lisp/progmodes/dockerfile-ts-mode.el
index 334f3064d98..618082cfe7a 100644
--- a/lisp/progmodes/dockerfile-ts-mode.el
+++ b/lisp/progmodes/dockerfile-ts-mode.el
@@ -190,6 +190,8 @@ dockerfile-ts-mode
 
     (treesit-major-mode-setup)))
 
+(derived-mode-add-parents 'dockerfile-ts-mode '(dockerfile-mode))
+
 (if (treesit-ready-p 'dockerfile)
     (add-to-list 'auto-mode-alist
                  ;; NOTE: We can't use `rx' here, as it breaks bootstrap.
diff --git a/lisp/progmodes/eglot.el b/lisp/progmodes/eglot.el
index ba2cc72a6b4..7007e71713b 100644
--- a/lisp/progmodes/eglot.el
+++ b/lisp/progmodes/eglot.el
@@ -226,90 +226,101 @@ eglot-alternatives
                       when probe return (cons probe args)
                       finally (funcall err)))))))
 
-(defvar eglot-server-programs `(((rust-ts-mode rust-mode) . ("rust-analyzer"))
-                                ((cmake-mode cmake-ts-mode) . ("cmake-language-server"))
-                                (vimrc-mode . ("vim-language-server" "--stdio"))
-                                ((python-mode python-ts-mode)
-                                 . ,(eglot-alternatives
-                                     '("pylsp" "pyls" ("pyright-langserver" "--stdio") "jedi-language-server" "ruff-lsp")))
-                                ((js-json-mode json-mode json-ts-mode)
-                                 . ,(eglot-alternatives '(("vscode-json-language-server" "--stdio")
-                                                          ("vscode-json-languageserver" "--stdio")
-                                                          ("json-languageserver" "--stdio"))))
-                                (((js-mode :language-id "javascript")
-                                  (js-ts-mode :language-id "javascript")
-                                  (tsx-ts-mode :language-id "typescriptreact")
-                                  (typescript-ts-mode :language-id "typescript")
-                                  (typescript-mode :language-id "typescript"))
-                                 . ("typescript-language-server" "--stdio"))
-                                ((bash-ts-mode sh-mode) . ("bash-language-server" "start"))
-                                ((php-mode phps-mode)
-                                 . ,(eglot-alternatives
-                                     '(("phpactor" "language-server")
-                                       ("php" "vendor/felixfbecker/language-server/bin/php-language-server.php"))))
-                                ((c-mode c-ts-mode c++-mode c++-ts-mode objc-mode)
-                                 . ,(eglot-alternatives
-                                     '("clangd" "ccls")))
-                                (((caml-mode :language-id "ocaml")
-                                  (tuareg-mode :language-id "ocaml") reason-mode)
-                                 . ("ocamllsp"))
-                                ((ruby-mode ruby-ts-mode)
-                                 . ("solargraph" "socket" "--port" :autoport))
-                                (haskell-mode
-                                 . ("haskell-language-server-wrapper" "--lsp"))
-                                (elm-mode . ("elm-language-server"))
-                                (mint-mode . ("mint" "ls"))
-                                (kotlin-mode . ("kotlin-language-server"))
-                                ((go-mode go-dot-mod-mode go-dot-work-mode go-ts-mode go-mod-ts-mode)
-                                 . ("gopls"))
-                                ((R-mode ess-r-mode) . ("R" "--slave" "-e"
-                                                        "languageserver::run()"))
-                                ((java-mode java-ts-mode) . ("jdtls"))
-                                ((dart-mode dart-ts-mode)
-                                 . ("dart" "language-server"
-                                    "--client-id" "emacs.eglot-dart"))
-                                ((elixir-mode elixir-ts-mode heex-ts-mode)
-                                 . ,(if (and (fboundp 'w32-shell-dos-semantics)
-                                             (w32-shell-dos-semantics))
-                                        '("language_server.bat")
-                                      (eglot-alternatives
-                                       '("language_server.sh" "start_lexical.sh"))))
-                                (ada-mode . ("ada_language_server"))
-                                (scala-mode . ,(eglot-alternatives
-                                                '("metals" "metals-emacs")))
-                                (racket-mode . ("racket" "-l" "racket-langserver"))
-                                ((tex-mode context-mode texinfo-mode bibtex-mode)
-                                 . ,(eglot-alternatives '("digestif" "texlab")))
-                                (erlang-mode . ("erlang_ls" "--transport" "stdio"))
-                                ((yaml-ts-mode yaml-mode) . ("yaml-language-server" "--stdio"))
-                                (nix-mode . ,(eglot-alternatives '("nil" "rnix-lsp" "nixd")))
-                                (nickel-mode . ("nls"))
-                                (gdscript-mode . ("localhost" 6008))
-                                ((fortran-mode f90-mode) . ("fortls"))
-                                (futhark-mode . ("futhark" "lsp"))
-                                ((lua-mode lua-ts-mode) . ,(eglot-alternatives
-                                                            '("lua-language-server" "lua-lsp")))
-                                (zig-mode . ("zls"))
-                                ((css-mode css-ts-mode)
-                                 . ,(eglot-alternatives '(("vscode-css-language-server" "--stdio")
-                                                          ("css-languageserver" "--stdio"))))
-                                (html-mode . ,(eglot-alternatives '(("vscode-html-language-server" "--stdio") ("html-languageserver" "--stdio"))))
-                                ((dockerfile-mode dockerfile-ts-mode) . ("docker-langserver" "--stdio"))
-                                ((clojure-mode clojurescript-mode clojurec-mode clojure-ts-mode)
-                                 . ("clojure-lsp"))
-                                ((csharp-mode csharp-ts-mode)
-                                 . ,(eglot-alternatives
-                                     '(("omnisharp" "-lsp")
-                                       ("csharp-ls"))))
-                                (purescript-mode . ("purescript-language-server" "--stdio"))
-                                ((perl-mode cperl-mode) . ("perl" "-MPerl::LanguageServer" "-e" "Perl::LanguageServer::run"))
-                                (markdown-mode
-                                 . ,(eglot-alternatives
-                                     '(("marksman" "server")
-                                       ("vscode-markdown-language-server" "--stdio"))))
-                                (graphviz-dot-mode . ("dot-language-server" "--stdio"))
-                                (terraform-mode . ("terraform-ls" "serve"))
-                                ((uiua-ts-mode uiua-mode) . ("uiua" "lsp")))
+(defvar eglot-server-programs
+  ;; FIXME: Maybe this info should be distributed into the major modes
+  ;; themselves where they could set a buffer-local `eglot-server-program'
+  ;; instead of keeping this database centralized.
+  ;; FIXME: With `derived-mode-add-parents' in Emacs≥30, some of
+  ;; those entries can be simplified, but we keep them for when
+  ;; `eglot.el' is installed via GNU ELPA in an older Emacs.
+  `(((rust-ts-mode rust-mode) . ("rust-analyzer"))
+    ((cmake-mode cmake-ts-mode) . ("cmake-language-server"))
+    (vimrc-mode . ("vim-language-server" "--stdio"))
+    ((python-mode python-ts-mode)
+     . ,(eglot-alternatives
+         '("pylsp" "pyls" ("pyright-langserver" "--stdio")
+           "jedi-language-server" "ruff-lsp")))
+    ((js-json-mode json-mode json-ts-mode)
+     . ,(eglot-alternatives '(("vscode-json-language-server" "--stdio")
+                              ("vscode-json-languageserver" "--stdio")
+                              ("json-languageserver" "--stdio"))))
+    (((js-mode :language-id "javascript")
+      (js-ts-mode :language-id "javascript")
+      (tsx-ts-mode :language-id "typescriptreact")
+      (typescript-ts-mode :language-id "typescript")
+      (typescript-mode :language-id "typescript"))
+     . ("typescript-language-server" "--stdio"))
+    ((bash-ts-mode sh-mode) . ("bash-language-server" "start"))
+    ((php-mode phps-mode)
+     . ,(eglot-alternatives
+         '(("phpactor" "language-server")
+           ("php" "vendor/felixfbecker/language-server/bin/php-language-server.php"))))
+    ((c-mode c-ts-mode c++-mode c++-ts-mode objc-mode)
+     . ,(eglot-alternatives
+         '("clangd" "ccls")))
+    (((caml-mode :language-id "ocaml")
+      (tuareg-mode :language-id "ocaml") reason-mode)
+     . ("ocamllsp"))
+    ((ruby-mode ruby-ts-mode)
+     . ("solargraph" "socket" "--port" :autoport))
+    (haskell-mode
+     . ("haskell-language-server-wrapper" "--lsp"))
+    (elm-mode . ("elm-language-server"))
+    (mint-mode . ("mint" "ls"))
+    (kotlin-mode . ("kotlin-language-server"))
+    ((go-mode go-dot-mod-mode go-dot-work-mode go-ts-mode go-mod-ts-mode)
+     . ("gopls"))
+    ((R-mode ess-r-mode) . ("R" "--slave" "-e"
+                            "languageserver::run()"))
+    ((java-mode java-ts-mode) . ("jdtls"))
+    ((dart-mode dart-ts-mode)
+     . ("dart" "language-server"
+        "--client-id" "emacs.eglot-dart"))
+    ((elixir-mode elixir-ts-mode heex-ts-mode)
+     . ,(if (and (fboundp 'w32-shell-dos-semantics)
+                 (w32-shell-dos-semantics))
+            '("language_server.bat")
+          (eglot-alternatives
+           '("language_server.sh" "start_lexical.sh"))))
+    (ada-mode . ("ada_language_server"))
+    (scala-mode . ,(eglot-alternatives
+                    '("metals" "metals-emacs")))
+    (racket-mode . ("racket" "-l" "racket-langserver"))
+    ((tex-mode context-mode texinfo-mode bibtex-mode)
+     . ,(eglot-alternatives '("digestif" "texlab")))
+    (erlang-mode . ("erlang_ls" "--transport" "stdio"))
+    ((yaml-ts-mode yaml-mode) . ("yaml-language-server" "--stdio"))
+    (nix-mode . ,(eglot-alternatives '("nil" "rnix-lsp" "nixd")))
+    (nickel-mode . ("nls"))
+    (gdscript-mode . ("localhost" 6008))
+    ((fortran-mode f90-mode) . ("fortls"))
+    (futhark-mode . ("futhark" "lsp"))
+    ((lua-mode lua-ts-mode) . ,(eglot-alternatives
+                                '("lua-language-server" "lua-lsp")))
+    (zig-mode . ("zls"))
+    ((css-mode css-ts-mode)
+     . ,(eglot-alternatives '(("vscode-css-language-server" "--stdio")
+                              ("css-languageserver" "--stdio"))))
+    (html-mode . ,(eglot-alternatives
+                   '(("vscode-html-language-server" "--stdio")
+                     ("html-languageserver" "--stdio"))))
+    ((dockerfile-mode dockerfile-ts-mode) . ("docker-langserver" "--stdio"))
+    ((clojure-mode clojurescript-mode clojurec-mode clojure-ts-mode)
+     . ("clojure-lsp"))
+    ((csharp-mode csharp-ts-mode)
+     . ,(eglot-alternatives
+         '(("omnisharp" "-lsp")
+           ("csharp-ls"))))
+    (purescript-mode . ("purescript-language-server" "--stdio"))
+    ((perl-mode cperl-mode)
+     . ("perl" "-MPerl::LanguageServer" "-e" "Perl::LanguageServer::run"))
+    (markdown-mode
+     . ,(eglot-alternatives
+         '(("marksman" "server")
+           ("vscode-markdown-language-server" "--stdio"))))
+    (graphviz-dot-mode . ("dot-language-server" "--stdio"))
+    (terraform-mode . ("terraform-ls" "serve"))
+    ((uiua-ts-mode uiua-mode) . ("uiua" "lsp")))
   "How the command `eglot' guesses the server to start.
 An association list of (MAJOR-MODE . CONTACT) pairs.  MAJOR-MODE
 identifies the buffers that are to be managed by a specific
diff --git a/lisp/progmodes/elixir-ts-mode.el b/lisp/progmodes/elixir-ts-mode.el
index b493195eedd..9a819f5df0c 100644
--- a/lisp/progmodes/elixir-ts-mode.el
+++ b/lisp/progmodes/elixir-ts-mode.el
@@ -745,6 +745,8 @@ elixir-ts-mode
     (treesit-major-mode-setup)
     (setq-local syntax-propertize-function #'elixir-ts--syntax-propertize)))
 
+(derived-mode-add-parents 'elixir-ts-mode '(elixir-mode))
+
 (if (treesit-ready-p 'elixir)
     (progn
       (add-to-list 'auto-mode-alist '("\\.elixir\\'" . elixir-ts-mode))
diff --git a/lisp/progmodes/go-ts-mode.el b/lisp/progmodes/go-ts-mode.el
index 65adc1c55ea..e16459cd975 100644
--- a/lisp/progmodes/go-ts-mode.el
+++ b/lisp/progmodes/go-ts-mode.el
@@ -261,6 +261,8 @@ go-ts-mode
 
     (treesit-major-mode-setup)))
 
+(derived-mode-add-parents 'go-ts-mode '(go-mode))
+
 (if (treesit-ready-p 'go)
     (add-to-list 'auto-mode-alist '("\\.go\\'" . go-ts-mode)))
 
@@ -437,6 +439,9 @@ go-mod-ts-mode
 
     (treesit-major-mode-setup)))
 
+(derived-mode-add-parents 'go-mod-ts-mode '(go-mod-mode))
+
+
 (if (treesit-ready-p 'gomod)
     (add-to-list 'auto-mode-alist '("/go\\.mod\\'" . go-mod-ts-mode)))
 
diff --git a/lisp/progmodes/gud.el b/lisp/progmodes/gud.el
index be6357f4139..8b911918e86 100644
--- a/lisp/progmodes/gud.el
+++ b/lisp/progmodes/gud.el
@@ -3673,6 +3673,9 @@ gud-tooltip-mode
 (defcustom gud-tooltip-modes '( gud-mode c-mode c++-mode fortran-mode
 				python-mode c-ts-mode c++-ts-mode
                                 python-ts-mode)
+  ;; FIXME: Currently the check is made via
+  ;; `(memq major-mode gud-tooltip-modes)' so it doesn't pay attention
+  ;; to the mode hierarchy.
   "List of modes for which to enable GUD tooltips."
   :type '(repeat (symbol :tag "Major mode"))
   :group 'tooltip)
diff --git a/lisp/progmodes/heex-ts-mode.el b/lisp/progmodes/heex-ts-mode.el
index 7b53a44deb2..702610bc1eb 100644
--- a/lisp/progmodes/heex-ts-mode.el
+++ b/lisp/progmodes/heex-ts-mode.el
@@ -177,6 +177,8 @@ heex-ts-mode
 
     (treesit-major-mode-setup)))
 
+(derived-mode-add-parents 'heex-ts-mode '(heex-mode))
+
 (if (treesit-ready-p 'heex)
     ;; Both .heex and the deprecated .leex files should work
     ;; with the tree-sitter-heex grammar.
diff --git a/lisp/progmodes/hideshow.el b/lisp/progmodes/hideshow.el
index b181b21118f..b47a505f64f 100644
--- a/lisp/progmodes/hideshow.el
+++ b/lisp/progmodes/hideshow.el
@@ -254,6 +254,9 @@ hs-isearch-open
 
 ;;;###autoload
 (defvar hs-special-modes-alist
+  ;; FIXME: Currently the check is made via
+  ;; `(assoc major-mode hs-special-modes-alist)' so it doesn't pay attention
+  ;; to the mode hierarchy.
   (mapcar #'purecopy
   '((c-mode "{" "}" "/[*/]" nil nil)
     (c-ts-mode "{" "}" "/[*/]" nil nil)
diff --git a/lisp/progmodes/java-ts-mode.el b/lisp/progmodes/java-ts-mode.el
index 0b1ac49b99f..51e0eeef79a 100644
--- a/lisp/progmodes/java-ts-mode.el
+++ b/lisp/progmodes/java-ts-mode.el
@@ -401,6 +401,8 @@ java-ts-mode
                 ("Method" "\\`method_declaration\\'" nil nil)))
   (treesit-major-mode-setup))
 
+(derived-mode-add-parents 'java-ts-mode '(java-mode))
+
 (if (treesit-ready-p 'java)
     (add-to-list 'auto-mode-alist '("\\.java\\'" . java-ts-mode)))
 
diff --git a/lisp/progmodes/js.el b/lisp/progmodes/js.el
index 0115feb0e97..2420bdde50a 100644
--- a/lisp/progmodes/js.el
+++ b/lisp/progmodes/js.el
@@ -3898,6 +3898,8 @@ js-ts-mode
     (add-to-list 'auto-mode-alist
                  '("\\(\\.js[mx]\\|\\.har\\)\\'" . js-ts-mode))))
 
+(derived-mode-add-parents 'js-ts-mode '(js-mode))
+
 (defvar js-ts--s-p-query
   (when (treesit-available-p)
     (treesit-query-compile 'javascript
diff --git a/lisp/progmodes/json-ts-mode.el b/lisp/progmodes/json-ts-mode.el
index 32bc10bbda9..1fb96555010 100644
--- a/lisp/progmodes/json-ts-mode.el
+++ b/lisp/progmodes/json-ts-mode.el
@@ -164,6 +164,8 @@ json-ts-mode
 
   (treesit-major-mode-setup))
 
+(derived-mode-add-parents 'json-ts-mode '(json-mode))
+
 (if (treesit-ready-p 'json)
     (add-to-list 'auto-mode-alist
                  '("\\.json\\'" . json-ts-mode)))
diff --git a/lisp/progmodes/lua-ts-mode.el b/lisp/progmodes/lua-ts-mode.el
index 3b600f59521..e81f05ff3cb 100644
--- a/lisp/progmodes/lua-ts-mode.el
+++ b/lisp/progmodes/lua-ts-mode.el
@@ -757,6 +757,8 @@ lua-ts-mode
 
   (add-hook 'flymake-diagnostic-functions #'lua-ts-flymake-luacheck nil 'local))
 
+(derived-mode-add-parents 'lua-ts-mode '(lua-mode))
+
 (when (treesit-ready-p 'lua)
   (add-to-list 'auto-mode-alist '("\\.lua\\'" . lua-ts-mode)))
 
diff --git a/lisp/progmodes/python.el b/lisp/progmodes/python.el
index 1148da11a06..94a133b0688 100644
--- a/lisp/progmodes/python.el
+++ b/lisp/progmodes/python.el
@@ -6995,6 +6995,8 @@ python-ts-mode
     (add-to-list 'auto-mode-alist '("\\.py[iw]?\\'" . python-ts-mode))
     (add-to-list 'interpreter-mode-alist '("python[0-9.]*" . python-ts-mode))))
 
+(derived-mode-add-parents 'python-ts-mode '(python-mode))
+
 ;;; Completion predicates for M-x
 ;; Commands that only make sense when editing Python code.
 (dolist (sym '(python-add-import
diff --git a/lisp/progmodes/ruby-ts-mode.el b/lisp/progmodes/ruby-ts-mode.el
index 598eaa461ff..7282d43e091 100644
--- a/lisp/progmodes/ruby-ts-mode.el
+++ b/lisp/progmodes/ruby-ts-mode.el
@@ -1196,6 +1196,8 @@ ruby-ts-mode
 
   (setq-local syntax-propertize-function #'ruby-ts--syntax-propertize))
 
+(derived-mode-add-parents 'ruby-ts-mode '(ruby-mode))
+
 (if (treesit-ready-p 'ruby)
     ;; Copied from ruby-mode.el.
     (add-to-list 'auto-mode-alist
diff --git a/lisp/progmodes/rust-ts-mode.el b/lisp/progmodes/rust-ts-mode.el
index c5fc57cc374..c67ac43e4d0 100644
--- a/lisp/progmodes/rust-ts-mode.el
+++ b/lisp/progmodes/rust-ts-mode.el
@@ -474,6 +474,8 @@ rust-ts-mode
 
     (treesit-major-mode-setup)))
 
+(derived-mode-add-parents 'rust-ts-mode '(rust-mode))
+
 (if (treesit-ready-p 'rust)
     (add-to-list 'auto-mode-alist '("\\.rs\\'" . rust-ts-mode)))
 
diff --git a/lisp/progmodes/sh-script.el b/lisp/progmodes/sh-script.el
index 0562415b4e5..e7e08fba1c9 100644
--- a/lisp/progmodes/sh-script.el
+++ b/lisp/progmodes/sh-script.el
@@ -1638,6 +1638,8 @@ bash-ts-mode
     (setq-local treesit-defun-type-regexp "function_definition")
     (treesit-major-mode-setup)))
 
+(derived-mode-add-parents 'bash-ts-mode '(sh-mode))
+
 (advice-add 'bash-ts-mode :around #'sh--redirect-bash-ts-mode
             ;; Give it lower precedence than normal advice, so other
             ;; advices take precedence over it.
diff --git a/lisp/progmodes/typescript-ts-mode.el b/lisp/progmodes/typescript-ts-mode.el
index e9c6afff440..83a3baaf5ef 100644
--- a/lisp/progmodes/typescript-ts-mode.el
+++ b/lisp/progmodes/typescript-ts-mode.el
@@ -491,6 +491,8 @@ typescript-ts-mode
 
     (treesit-major-mode-setup)))
 
+(derived-mode-add-parents 'typescript-ts-mode '(typescript-mode))
+
 (if (treesit-ready-p 'typescript)
     (add-to-list 'auto-mode-alist '("\\.ts\\'" . typescript-ts-mode)))
 
@@ -548,6 +550,8 @@ tsx-ts-mode
 
     (treesit-major-mode-setup)))
 
+(derived-mode-add-parents 'tsx-ts-mode '(tsx-mode))
+
 (defvar typescript-ts--s-p-query
   (when (treesit-available-p)
     (treesit-query-compile 'typescript
diff --git a/lisp/textmodes/css-mode.el b/lisp/textmodes/css-mode.el
index 425f3ec8a30..f5a20e0ca0e 100644
--- a/lisp/textmodes/css-mode.el
+++ b/lisp/textmodes/css-mode.el
@@ -1830,6 +1830,8 @@ css-ts-mode
 
     (add-to-list 'auto-mode-alist '("\\.css\\'" . css-ts-mode))))
 
+(derived-mode-add-parents 'css-ts-mode '(css-mode))
+
 ;;;###autoload
 (define-derived-mode css-mode css-base-mode "CSS"
   "Major mode to edit Cascading Style Sheets (CSS).
diff --git a/lisp/textmodes/html-ts-mode.el b/lisp/textmodes/html-ts-mode.el
index 301f3e8791c..bf6c1307e96 100644
--- a/lisp/textmodes/html-ts-mode.el
+++ b/lisp/textmodes/html-ts-mode.el
@@ -123,6 +123,8 @@ html-ts-mode
               '(("Element" "\\`tag_name\\'" nil nil)))
   (treesit-major-mode-setup))
 
+(derived-mode-add-parents 'html-ts-mode '(html-mode))
+
 (if (treesit-ready-p 'html)
     (add-to-list 'auto-mode-alist '("\\.html\\'" . html-ts-mode)))
 
diff --git a/lisp/textmodes/toml-ts-mode.el b/lisp/textmodes/toml-ts-mode.el
index 1ba410045f5..1b621032f8a 100644
--- a/lisp/textmodes/toml-ts-mode.el
+++ b/lisp/textmodes/toml-ts-mode.el
@@ -153,6 +153,8 @@ toml-ts-mode
 
     (treesit-major-mode-setup)))
 
+(derived-mode-add-parents 'toml-ts-mode '(toml-mode))
+
 (if (treesit-ready-p 'toml)
     (add-to-list 'auto-mode-alist '("\\.toml\\'" . toml-ts-mode)))
 
diff --git a/lisp/textmodes/yaml-ts-mode.el b/lisp/textmodes/yaml-ts-mode.el
index 08fe4c49733..dc702dff790 100644
--- a/lisp/textmodes/yaml-ts-mode.el
+++ b/lisp/textmodes/yaml-ts-mode.el
@@ -165,6 +165,8 @@ yaml-ts-mode
 
     (treesit-major-mode-setup)))
 
+(derived-mode-add-parents 'yaml-ts-mode '(yaml-mode))
+
 (if (treesit-ready-p 'yaml)
     (add-to-list 'auto-mode-alist '("\\.ya?ml\\'" . yaml-ts-mode)))
 

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Mon, 08 Jan 2024 18:58:01 GMT) Full text and rfc822 format available.

Message #152 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Eli Zaretskii <eliz <at> gnu.org>, Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 68246 <at> debbugs.gnu.org, casouri <at> gmail.com, joaotavora <at> gmail.com
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Mon, 8 Jan 2024 20:57:13 +0200
On 08/01/2024 14:45, Eli Zaretskii wrote:
>> From: Stefan Monnier<monnier <at> iro.umontreal.ca>
>> Cc: Eli Zaretskii<eliz <at> gnu.org>,casouri <at> gmail.com,68246 <at> debbugs.gnu.org
>> Date: Sun, 07 Jan 2024 23:11:00 -0500
>>
>>>     (set-language-for-mode 'foo-ts-mode 'foo)
>> Maybe we want to introduce this concept, indeed.
>>
>> Maybe we want to take that notion of "language" from elsewhere, such as
>> the one used in LSP?
> Please don't call it "language".  That'd be confusing.  LSP is about
> programming languages, so "language" is natural there.  But in Emacs,
> a major mode is more general than that.  For example, it is not
> unthinkable to consider mail-mode to be the extra-parent of
> message-mode (or vice versa) -- but what is the "language" in that
> case?
> 
>> Or maybe we want to take it from MIME types?
>> I'm sure there are other options out there.
>>
>> Problem is: they come with their own complexities and corner cases.
>> After all, this is inevitable when you create a taxonomy.
>> IOW, while we *may* want to add support for an explicit notion of "file
>> type", it's a whole problem in itself and it will not solve all
>> our problems either.
> File types also have problems: Emacs modes are sometimes defined for
> buffers that don't visit files.

Even if we call non-file-visiting buffers' contents "languages", I don't 
think anyone will have a heart attack or something.

But the "languages" thingy will mostly be useful when there can be
several different major modes that can work on a given buffer's contents.
In most/all such cases, we would probably come up with a language name.
For example, we have message-mode, but if we wanted to support
alternatives, we could call the base "email-message". Or for different
major modes to edit VC commit messages, we could call the language
"vc-log-message".




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Mon, 08 Jan 2024 19:05:02 GMT) Full text and rfc822 format available.

Message #155 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>,
 João Távora <joaotavora <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, casouri <at> gmail.com
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Mon, 8 Jan 2024 21:04:30 +0200
On 08/01/2024 06:11, Stefan Monnier via Bug reports for GNU Emacs, the 
Swiss army knife of text editors wrote:
>>     (set-language-for-mode 'foo-ts-mode 'foo)
> 
> Maybe we want to introduce this concept, indeed.
> 
> Maybe we want to take that notion of "language" from elsewhere, such as
> the one used in LSP?
> Or maybe we want to take it from MIME types?
> I'm sure there are other options out there.

I think the precise source of the mapping is not that important. We
might as well continue maintaining auto-mode-alist,
interpreter-mode-alist and magic-mode-alist by hand. Or indeed learn to
populate them from the MIME database later.

What we have, though, is different major modes duplicating
auto-mode-alist entries even inside the core Emacs. Such as c-ts-mode
having modified copies of forms that originate from CC Mode.

Or ruby-mode and ruby-ts-mode using two copies of the same regexp. Etc.

Instead, we could have a mapping of files to "languages" and a separate 
one from languages to major modes. And one could fetch the "language" 
for the current buffer using 'rassoc'.
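
A minimal sketch of such a two-level mapping, assuming hypothetical
names throughout (none of the `my-` variables or functions below exist
in Emacs; only `assoc-default', `rassq' and `alist-get' are real):

```elisp
;; Hypothetical illustration only: the `my-' names are assumptions,
;; not existing Emacs variables or functions.

(defvar my-file-language-alist
  '(("\\.r[bu]\\'" . ruby)
    ("\\.py[iw]?\\'" . python))
  "Hypothetical alist mapping file-name regexps to language symbols.")

(defvar my-language-mode-alist
  '((ruby . ruby-mode)
    (ruby . ruby-ts-mode)
    (python . python-mode)
    (python . python-ts-mode))
  "Hypothetical alist mapping language symbols to major modes.
The first entry for a language is treated as the preferred mode.")

(defun my-language-for-file (filename)
  "Return the language symbol whose regexp matches FILENAME, or nil."
  ;; `assoc-default' calls the test as (TEST REGEXP FILENAME).
  (assoc-default filename my-file-language-alist #'string-match-p))

(defun my-preferred-mode-for-language (language)
  "Return the first major mode registered for LANGUAGE, or nil."
  (alist-get language my-language-mode-alist))

(defun my-language-for-mode (mode)
  "Return the language symbol registered for major mode MODE, or nil."
  ;; Reverse lookup: find the entry whose cdr is MODE.
  (car (rassq mode my-language-mode-alist)))
```

With this layout the file-name regexp is written once per language, and
a package could call (my-language-for-mode major-mode) instead of
comparing `major-mode' against every variant of a mode.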

> Problem is: they come with their own complexities and corner cases.
> After all, this is inevitable when you create a taxonomy.
> IOW, while we *may* want to add support for an explicit notion of "file
> type", it's a whole problem in itself and it will not solve all
> our problems either.
> 
> In the mean time, I think `derived-mode-add-parents` is worth a try.
> As mentioned in some message up-thread, I'm not 100% confident that it
> won't introduce serious breakage.  But I think we do need more
> experience and installing my patch is a good way to do that.

This would've worked better inside the Emacs 29.1 release (which 
contains a few other "expedient" solutions).

I'm guessing it won't get into 29.2 either. So the users of such 
versions would have to deal with the existing taxonomy anyway, and 
half-measures might also serve to make people more confused about what 
works in which version and why.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Mon, 08 Jan 2024 19:19:02 GMT) Full text and rfc822 format available.

Message #158 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Kangas <stefankangas <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>, Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 68246 <at> debbugs.gnu.org, casouri <at> gmail.com, joaotavora <at> gmail.com
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Mon, 8 Jan 2024 11:18:41 -0800
Eli Zaretskii <eliz <at> gnu.org> writes:

> Please don't call it "language".  That'd be confusing.  LSP is about
> programming languages, so "language" is natural there.  But in Emacs,
> a major mode is more general than that.  For example, it is not
> unthinkable to consider mail-mode to be the extra-parent of
> message-mode (or vice versa) -- but what is the "language" in that
> case?

Isn't the language for such modes in this paradigm just the empty set?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Mon, 08 Jan 2024 19:57:02 GMT) Full text and rfc822 format available.

Message #161 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: 68246 <at> debbugs.gnu.org, casouri <at> gmail.com, monnier <at> iro.umontreal.ca,
 joaotavora <at> gmail.com
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Mon, 08 Jan 2024 21:55:15 +0200
> Date: Mon, 8 Jan 2024 20:57:13 +0200
> Cc: 68246 <at> debbugs.gnu.org, casouri <at> gmail.com, joaotavora <at> gmail.com
> From: Dmitry Gutov <dmitry <at> gutov.dev>
> 
> Even if we call non-file-visiting buffers' contents "languages", I don't 
> think anyone will have a heart attack or something.

"No heart attack" is a poor criterion for good parameterization and
consistent terminology.  Confusing terms will spread confusion and
bugs.  There's no reason for us to settle for sub-optimal terminology.

> For example, we have message-mode, but if we wanted to support
> alternatives, we could call the base "email-message". Or for different 
> major modes to edit VC commit messages, we could call the language 
> "vc-log-message".

Those are not "languages", so let's not call them that.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Mon, 08 Jan 2024 19:58:01 GMT) Full text and rfc822 format available.

Message #164 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stefan Kangas <stefankangas <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, casouri <at> gmail.com, monnier <at> iro.umontreal.ca,
 joaotavora <at> gmail.com
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Mon, 08 Jan 2024 21:57:10 +0200
> From: Stefan Kangas <stefankangas <at> gmail.com>
> Date: Mon, 8 Jan 2024 11:18:41 -0800
> Cc: 68246 <at> debbugs.gnu.org, casouri <at> gmail.com, joaotavora <at> gmail.com
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> > Please don't call it "language".  That'd be confusing.  LSP is about
> > programming languages, so "language" is natural there.  But in Emacs,
> > a major mode is more general than that.  For example, it is not
> > unthinkable to consider mail-mode to be the extra-parent of
> > message-mode (or vice versa) -- but what is the "language" in that
> > case?
> 
> Isn't the language for such modes in this paradigm just the empty set?

No.  The "language" there is "text", except that it's silly to call
that "language".




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Mon, 08 Jan 2024 20:07:02 GMT) Full text and rfc822 format available.

Message #167 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Eli Zaretskii <eliz <at> gnu.org>, Stefan Kangas <stefankangas <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, casouri <at> gmail.com, monnier <at> iro.umontreal.ca,
 joaotavora <at> gmail.com
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Mon, 8 Jan 2024 22:05:43 +0200
On 08/01/2024 21:57, Eli Zaretskii wrote:
>> From: Stefan Kangas<stefankangas <at> gmail.com>
>> Date: Mon, 8 Jan 2024 11:18:41 -0800
>> Cc:68246 <at> debbugs.gnu.org,casouri <at> gmail.com,joaotavora <at> gmail.com
>>
>> Eli Zaretskii<eliz <at> gnu.org>  writes:
>>
>>> Please don't call it "language".  That'd be confusing.  LSP is about
>>> programming languages, so "language" is natural there.  But in Emacs,
>>> a major mode is more general than that.  For example, it is not
>>> unthinkable to consider mail-mode to be the extra-parent of
>>> message-mode (or vice versa) -- but what is the "language" in that
>>> case?
>> Isn't the language for such modes in this paradigm just the empty set?
> No.  The "language" there is "text", except that it's silly to call
> that "language".

If it was just "text", we wouldn't need different highlighting rules in 
message-mode or log-edit-mode, would we?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Mon, 08 Jan 2024 20:08:02 GMT) Full text and rfc822 format available.

Message #170 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 68246 <at> debbugs.gnu.org, casouri <at> gmail.com, monnier <at> iro.umontreal.ca,
 joaotavora <at> gmail.com
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Mon, 8 Jan 2024 22:06:56 +0200
On 08/01/2024 21:55, Eli Zaretskii wrote:
>> Date: Mon, 8 Jan 2024 20:57:13 +0200
>> Cc:68246 <at> debbugs.gnu.org,casouri <at> gmail.com,joaotavora <at> gmail.com
>> From: Dmitry Gutov<dmitry <at> gutov.dev>
>>
>> Even if we call non-file-visiting buffers' contents "languages", I don't
>> think anyone will have a heart attack or something.
> "No heart attack" is a poor criterion for good parameterization and
> consistent terminology.  Confusing terms will spread confusion and
> bugs.  There's no reason for us to settle for sub-optimal terminology.
> 
>> E.g., for example, we have message-mode, but if we wanted to support
>> alternatives, we could call the base "email-message". Or for different
>> major modes to edit VC commit messages, we could call the language
>> "vc-log-message".
> Those are not "languages", so let's not call them that.

I'm not married to the term (have there been alternatives suggested?), 
but I do believe that having a notion distinct from "major modes" would 
bring more clarity in this area.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Mon, 08 Jan 2024 22:13:02 GMT) Full text and rfc822 format available.

Message #173 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>,
 Yuan Fu <casouri <at> gmail.com>, Stefan Monnier <monnier <at> iro.umontreal.ca>
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Mon, 8 Jan 2024 22:12:03 +0000
[Message part 1 (text/plain, inline)]
On Mon, Jan 8, 2024, 20:08 Dmitry Gutov <dmitry <at> gutov.dev> wrote:
>
>
> >> E.g., for example, we have message-mode, but if we wanted to support
> >> alternatives, we could call the base "email-message". Or for different
> >> major modes to edit VC commit messages, we could call the language
> >> "vc-log-message".
> > Those are not "languages", so let's not call them that.
>
> I'm not married to the term (have there been alternatives suggested?),

Neither am I btw. Naming is hard, but it shouldn't be
_this_ hard.

I think editors where you can't write emails, list processes
or chat in IRC do use "language" or "file format".  In Emacs,
it'd make sense to me to give this to at least those modes
derived from prog-mode, and maybe some more (org, markdown, etc).
Other modes would return nil to mean "nope, not a language per
se".

But if "language" or "file format" is still contentious, browsers'
use of "content type" seems adequate.  Browsers don't always
serve files after all.  Then probably the new getter would
return non-nil in even more modes, and coverage would keep
growing to theoretically 100%.

João
[Message part 2 (text/html, inline)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Tue, 09 Jan 2024 00:12:02 GMT) Full text and rfc822 format available.

Message #176 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, casouri <at> gmail.com,
 João Távora <joaotavora <at> gmail.com>
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Mon, 08 Jan 2024 19:10:46 -0500
> Instead, we could have a mapping of files to "languages" and a separate one
> from languages to major modes.

Indeed.  I called this one `major-mode-remap-alist`.  🙂


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Tue, 09 Jan 2024 00:37:02 GMT) Full text and rfc822 format available.

Message #179 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Dmitry Gutov <dmitry <at> gutov.dev>, Eli Zaretskii <eliz <at> gnu.org>,
 casouri <at> gmail.com, 68246 <at> debbugs.gnu.org
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Tue, 9 Jan 2024 00:39:40 +0000
On Tue, Jan 9, 2024 at 12:10 AM Stefan Monnier <monnier <at> iro.umontreal.ca> wrote:
>
> > Instead, we could have a mapping of files to "languages" and a separate one
> > from languages to major modes.
>
> Indeed.  I called this one `major-mode-remap-alist`.  🙂

"Called"?  Couldn't find this in your latest patch.  Can you elaborate
on this alist's structure?  Does it let one write

   {eglot|yasnippet}--get-language-for-mode

like Dmitry's "languages to major modes" suggests?

João

BTW, in your patch, can you keep the whitespace of eglot-server-programs
or, alternatively, change that whitespace in a separate commit that changes
nothing else?  I understand the current indentation is awful but I rely on
Git history/vc-region-history a lot in the whole of eglot.el




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Tue, 09 Jan 2024 00:53:02 GMT) Full text and rfc822 format available.

Message #182 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: João Távora <joaotavora <at> gmail.com>
Cc: Dmitry Gutov <dmitry <at> gutov.dev>, Eli Zaretskii <eliz <at> gnu.org>,
 casouri <at> gmail.com, 68246 <at> debbugs.gnu.org
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Mon, 08 Jan 2024 19:52:01 -0500
> BTW, in your patch, can you keep the whitespace of eglot-server-programs
> or, alternatively, change that whitespace in a separate commit that changes
> nothing else? I understand the current indentation is awful

Exactly, it's awful and goes against our convention to stay within 80
columns.  I'll make it a separate commit.

> but I rely on Git history/vc-region-history a lot in the whole of
> eglot.el

Not sure how making it a separate commit will help.
But following the conventions from the get-go would have avoided this
specific instance of the problem :-)


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Tue, 09 Jan 2024 01:03:02 GMT) Full text and rfc822 format available.

Message #185 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Dmitry Gutov <dmitry <at> gutov.dev>, Eli Zaretskii <eliz <at> gnu.org>,
 casouri <at> gmail.com, 68246 <at> debbugs.gnu.org
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Tue, 9 Jan 2024 01:05:10 +0000
On Tue, Jan 9, 2024 at 12:52 AM Stefan Monnier <monnier <at> iro.umontreal.ca> wrote:

> But following the conventions from the get-go would have avoided this
> specific instance of the problem :-)

They were, just look at e36892ef513 :-)

But then at a certain point I got many pull requests to
add stuff there, no time to bother everyone with indentation (which
likely would have been off anyway), the commit message looking decent
is a Eeglot.  Eglot-alternatives made things worse, with people
adding more servers on the same line, and I think this isn't
always being enforced by people installing these patches (I don't).

> Not sure how making it a separate commit will help.

It will say "Stefan Monnier: reformat things", and I will trust that the
commit changes nothing functionally.  Also, git log -L is pretty smart.

But you didn't comment on the topic at hand.  What does your
major-mode-remap-alist look like?  Can you write one or two entries
of this alist?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Tue, 09 Jan 2024 01:05:02 GMT) Full text and rfc822 format available.

Message #188 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: João Távora <joaotavora <at> gmail.com>
Cc: Dmitry Gutov <dmitry <at> gutov.dev>, Eli Zaretskii <eliz <at> gnu.org>,
 casouri <at> gmail.com, 68246 <at> debbugs.gnu.org
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Mon, 08 Jan 2024 20:04:02 -0500
> But you didn't comment on the topic at hand.  What does your
> major-mode-remap-alist look like?  Can you write one or two entries
> of this alist?

See commit 59f8c56d9e71a1b61ca8cc0794a6de4aa2f240e4


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Tue, 09 Jan 2024 01:09:01 GMT) Full text and rfc822 format available.

Message #191 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Dmitry Gutov <dmitry <at> gutov.dev>, Eli Zaretskii <eliz <at> gnu.org>,
 casouri <at> gmail.com, 68246 <at> debbugs.gnu.org
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Tue, 9 Jan 2024 01:11:08 +0000
On Tue, Jan 9, 2024 at 1:04 AM Stefan Monnier <monnier <at> iro.umontreal.ca> wrote:
>
> > But you didn't comment on the topic at hand.  What does your
> > major-mode-remap-alist look like?  Can you write one or two entries
> > of this alist?
>
> See commit 59f8c56d9e71a1b61ca8cc0794a6de4aa2f240e4

So, it's nothing like Dmitry's idea:

> > Instead, we could have a mapping of files to "languages" and a separate one
> > from languages to major modes.

Or were you joking?  Or am I missing something obvious?

João




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Tue, 09 Jan 2024 01:10:02 GMT) Full text and rfc822 format available.

Message #194 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, casouri <at> gmail.com,
 João Távora <joaotavora <at> gmail.com>
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Tue, 9 Jan 2024 03:09:08 +0200
On 09/01/2024 02:10, Stefan Monnier wrote:
>> Instead, we could have a mapping of files to "languages" and a separate one
>> from languages to major modes.
> Indeed.  I called this one `major-mode-remap-alist`.  🙂

Good point. I think it's unfortunate that it isn't used more.

The good side is that even if "languages" are introduced, this var won't 
necessarily need renaming.

Perhaps we should have entries like ("\\.js\\'" . 'js-lang) in 
auto-mode-alist and then map the symbol to the specific major mode in 
major-mode-remap-alist. But for this to be useful to determine the 
language of a major mode via reverse lookup, all/most programming 
language modes will need to be featured there, rather than this being 
optional and used only for custom overrides.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Tue, 09 Jan 2024 01:29:02 GMT) Full text and rfc822 format available.

Message #197 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, casouri <at> gmail.com,
 Stefan Monnier <monnier <at> iro.umontreal.ca>
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Tue, 9 Jan 2024 01:31:25 +0000
On Tue, Jan 9, 2024 at 1:10 AM Dmitry Gutov <dmitry <at> gutov.dev> wrote:

> Perhaps we should have entries like ("\\.js\\'" . 'js-lang) in
> auto-mode-alist and then map the symbol to the specific major mode in
> major-mode-remap-alist. But for this to be useful to determine the
> language of a major mode via reverse lookup, all/most programming
> language modes will need to be featured there, rather than this being
> optional and used only for custom overrides.

That clarifies how it would be used, thanks.  But in addition to the
problem you note, there's the fact we would have many new foo-lang
functions and the cdr of that m-m-r-alist is specified to be a function
object, while major-mode is supposed to be a symbol, so a bit brittle
for reverse lookup.

The simplest way to do that reverse mapping is still just adding an
optional entry to a mode symbol's plist.
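
A minimal sketch of the plist idea, assuming a hypothetical property name `my/content-type` and a hypothetical helper `my/content-type-for-mode` (neither exists in Emacs). It relies only on the `derived-mode-parent` property that `define-derived-mode` sets on a mode symbol:

```elisp
;; Hypothetical: record a content type on a mode symbol's plist.
(put 'js-mode 'my/content-type "javascript")

(defun my/content-type-for-mode (mode)
  "Return the `my/content-type' of MODE or of its nearest ancestor, else nil."
  (let (type)
    ;; Walk up the `derived-mode-parent' chain until a type is found.
    (while (and mode (not (setq type (get mode 'my/content-type))))
      (setq mode (get mode 'derived-mode-parent)))
    type))
```

A mode that does not inherit from `js-mode` would need its own entry for this lookup to find anything.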




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Tue, 09 Jan 2024 03:29:01 GMT) Full text and rfc822 format available.

Message #200 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: joaotavora <at> gmail.com, 68246 <at> debbugs.gnu.org, casouri <at> gmail.com,
 stefankangas <at> gmail.com, monnier <at> iro.umontreal.ca
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Tue, 09 Jan 2024 05:27:40 +0200
> Date: Mon, 8 Jan 2024 22:05:43 +0200
> Cc: 68246 <at> debbugs.gnu.org, casouri <at> gmail.com, monnier <at> iro.umontreal.ca,
>  joaotavora <at> gmail.com
> From: Dmitry Gutov <dmitry <at> gutov.dev>
> 
> On 08/01/2024 21:57, Eli Zaretskii wrote:
> >> From: Stefan Kangas<stefankangas <at> gmail.com>
> >> Date: Mon, 8 Jan 2024 11:18:41 -0800
> >> Cc:68246 <at> debbugs.gnu.org,casouri <at> gmail.com,joaotavora <at> gmail.com
> >>
> >> Eli Zaretskii<eliz <at> gnu.org>  writes:
> >>
> >>> Please don't call it "language".  That'd be confusing.  LSP is about
> >>> programming languages, so "language" is natural there.  But in Emacs,
> >>> a major mode is more general than that.  For example, it is not
> >>> unthinkable to consider mail-mode to be the extra-parent of
> >>> message-mode (or vice versa) -- but what is the "language" in that
> >>> case?
> >> Isn't the language for such modes in this paradigm just the empty set?
> > No.  The "language" there is "text", except that it's silly to call
> > that "language".
> 
> If it was just "text", we wouldn't need different highlighting rules in 
> message-mode or log-edit-mode, would we?

Why don't you ask the same about perl-mode and cperl-mode?  Or about
c-mode and c-ts-mode?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Tue, 09 Jan 2024 03:30:02 GMT) Full text and rfc822 format available.

Message #203 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: 68246 <at> debbugs.gnu.org, casouri <at> gmail.com, monnier <at> iro.umontreal.ca,
 joaotavora <at> gmail.com
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Tue, 09 Jan 2024 05:28:32 +0200
> Date: Mon, 8 Jan 2024 22:06:56 +0200
> Cc: monnier <at> iro.umontreal.ca, 68246 <at> debbugs.gnu.org, casouri <at> gmail.com,
>  joaotavora <at> gmail.com
> From: Dmitry Gutov <dmitry <at> gutov.dev>
> 
> On 08/01/2024 21:55, Eli Zaretskii wrote:
> >> Date: Mon, 8 Jan 2024 20:57:13 +0200
> >> Cc:68246 <at> debbugs.gnu.org,casouri <at> gmail.com,joaotavora <at> gmail.com
> >> From: Dmitry Gutov<dmitry <at> gutov.dev>
> >>
> >> Even if we call non-file-visiting buffers' contents "languages", I don't
> >> think anyone will have a heart attack or something.
> > "No heart attack" is a poor criterion for good parameterization and
> > consistent terminology.  Confusing terms will spread confusion and
> > bugs.  There's no reason for us to settle for sub-optimal terminology.
> > 
> >> E.g., for example, we have message-mode, but if we wanted to support
> >> alternatives, we could call the base "email-message". Or for different
> >> major modes to edit VC commit messages, we could call the language
> >> "vc-log-message".
> > Those are not "languages", so let's not call them that.
> 
> I'm not married to the term (have there been alternatives suggested?), 
> but I do believe that having a notion distinct from "major modes" would 
> bring more clarity in this area.

If we can come up with some sane terminology, let's do that.  But
adopting an incorrect one just because we need some name is not TRT.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Tue, 09 Jan 2024 03:50:02 GMT) Full text and rfc822 format available.

Message #206 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: João Távora <joaotavora <at> gmail.com>
Cc: Dmitry Gutov <dmitry <at> gutov.dev>, Eli Zaretskii <eliz <at> gnu.org>,
 casouri <at> gmail.com, 68246 <at> debbugs.gnu.org
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Mon, 08 Jan 2024 22:49:06 -0500
>> See commit 59f8c56d9e71a1b61ca8cc0794a6de4aa2f240e4
> So, it's nothing like Dmitry's idea:

Of course it is.

>> > Instead, we could have a mapping of files to "languages" and a separate one
>> > from languages to major modes.

`auto-mode-alist` maps from file names to languages/filetypes (where
"major-mode like" symbols are typically used to represent
languages/filetypes), and then `major-mode-remap-alist` maps from those
languages/filetypes to actual major modes.

Of course, if you want to use other symbols for the content types, that
works as well, e.g.:

    emacs --eval '(progn (add-to-list `auto-mode-alist `("\\.myf$" . text/html)) (add-to-list `major-mode-remap-alist `(text/html . html-mode)))' ~/tmp/foo.myf
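
For readability, the same two steps can be sketched as they might appear in an init file. This is only an illustration: `text/html` is an arbitrary symbol used as a content-type key, and the regexp uses `\\'` (end of string) rather than `$`, which is an assumption about what is wanted here:

```elisp
;; Step 1: map a file-name pattern to a content-type symbol.
;; `text/html' is just a symbol here, not a function or a mode.
(add-to-list 'auto-mode-alist '("\\.myf\\'" . text/html))

;; Step 2: map that content-type symbol to the major mode to call.
(add-to-list 'major-mode-remap-alist '(text/html . html-mode))
```

Visiting a file such as `foo.myf` should then select `text/html` from `auto-mode-alist` and call `html-mode` through `major-mode-remap-alist`.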


-- Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Tue, 09 Jan 2024 03:56:01 GMT) Full text and rfc822 format available.

Message #209 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, casouri <at> gmail.com,
 João Távora <joaotavora <at> gmail.com>
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Mon, 08 Jan 2024 22:55:27 -0500
> But for this to be useful to determine the language
> of a major mode via reverse lookup,

Define "the language".

The mapping from "language" to major mode can't be always reversible, so
`major-mode-remap-alist` works to map "language" to "major-mode" but not
the other way.

The current bug-report *is* about "finding the language" but the code
that needs that info luckily doesn't need "the language" it just needs
to know "does the current buffer contain language FOO", which is an
easier problem, which I propose to solve with `derived-mode-p`, since
that's what we've been using all these years.
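
A minimal sketch of that kind of check, assuming a hypothetical helper name `my/buffer-in-mode-family-p`; it is a thin wrapper around `derived-mode-p` and nothing more:

```elisp
(defun my/buffer-in-mode-family-p (mode &optional buffer)
  "Return non-nil if BUFFER's major mode is MODE or derives from it.
BUFFER defaults to the current buffer."
  (with-current-buffer (or buffer (current-buffer))
    (derived-mode-p mode)))

;; Example predicate: is this buffer treated as JavaScript?
;; (my/buffer-in-mode-family-p 'js-mode)
```

The answer depends entirely on the mode's ancestry, which is why the proposal adds the non-TS mode as an extra parent of the TS mode.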


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Tue, 09 Jan 2024 04:51:02 GMT) Full text and rfc822 format available.

Message #212 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Kangas <stefankangas <at> gmail.com>
To: João Távora <joaotavora <at> gmail.com>, 
 Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Dmitry Gutov <dmitry <at> gutov.dev>, Eli Zaretskii <eliz <at> gnu.org>,
 casouri <at> gmail.com, 68246 <at> debbugs.gnu.org
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Mon, 8 Jan 2024 20:49:57 -0800
João Távora <joaotavora <at> gmail.com> writes:

>> Not sure how making it a separate commit will help.
>
> It will say "Stefan Monnier: reformat things", and i will trust that
> commit changes nothing functionally. Also,  git log -L is pretty smart.

Making cleanup-type changes in separate commits is useful for git spelunking, but more
importantly it simplifies reviewing.  You do not have to manually
separate the functional changes from the non-functional ones every time
you look at some commit in the future.

I'd certainly encourage that practice.  Git commits are cheap.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Tue, 09 Jan 2024 07:26:01 GMT) Full text and rfc822 format available.

Message #215 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Kévin Le Gouguec <kevin.legouguec <at> gmail.com>
To: João Távora <joaotavora <at> gmail.com>
Cc: casouri <at> gmail.com, Dmitry Gutov <dmitry <at> gutov.dev>,
 Stefan Kangas <stefankangas <at> gmail.com>, 68246 <at> debbugs.gnu.org,
 Eli Zaretskii <eliz <at> gnu.org>, Stefan Monnier <monnier <at> iro.umontreal.ca>
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Tue, 09 Jan 2024 08:24:58 +0100
João Távora <joaotavora <at> gmail.com> writes:

>> Not sure how making it a separate commit will help.
>
> It will say "Stefan Monnier: reformat things", and i will trust that
> commit changes nothing functionally. Also,  git log -L is pretty smart.

(Also², that singular cleanup commit can then be passed to git-blame via
--ignore-rev or blame.ignoreRevsFile for automatic skipping)




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Tue, 09 Jan 2024 10:53:01 GMT) Full text and rfc822 format available.

Message #218 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Dmitry Gutov <dmitry <at> gutov.dev>, Eli Zaretskii <eliz <at> gnu.org>,
 casouri <at> gmail.com, 68246 <at> debbugs.gnu.org
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Tue, 9 Jan 2024 10:52:21 +0000
On Tue, Jan 9, 2024 at 3:49 AM Stefan Monnier <monnier <at> iro.umontreal.ca> wrote:

> Of course, if you want to use other symbols for the content types, that
> works as well, e.g.:
>
>     emacs --eval '(progn (add-to-list `auto-mode-alist `("\\.myf$" . text/html)) (add-to-list `major-mode-remap-alist `(text/html . html-mode)))' ~/tmp/foo.myf

OK, I get it.  This form above is very elegant.

I had already understood the gist from Dmitry's email and if
more symbols are allowed there (notably symbols with names
matching the MIME type hierarchy), then I think it's mostly fine.

"Mostly" because of the fact that cdrs of entries in m-m-r-alist
are supposed to be functions and major-mode is a symbol, which
makes for brittle reverse lookups.

And then there's still the work of filling it in and changing
auto-mode-alist.  This is mostly mechanical work though.  I can
volunteer if this approach is validated.

Also, writing m-m-r-alist is not incompatible with passing a
:content-type keyword arg to define-derived-mode, which
I still think is the most natural way to maintain this
database.

João




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Tue, 09 Jan 2024 11:07:02 GMT) Full text and rfc822 format available.

Message #221 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Dmitry Gutov <dmitry <at> gutov.dev>, Eli Zaretskii <eliz <at> gnu.org>,
 casouri <at> gmail.com, 68246 <at> debbugs.gnu.org
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Tue, 9 Jan 2024 11:05:55 +0000
On Tue, Jan 9, 2024 at 3:55 AM Stefan Monnier <monnier <at> iro.umontreal.ca> wrote:

> The mapping from "language" to major mode can't be always reversible, so
> `major-mode-remap-alist` works to map "language" to "major-mode" but not
> the other way.
>
> The current bug-report *is* about "finding the language" but the code
> that needs that info luckily doesn't need "the language"

I must be out of luck, because Eglot does need "the language" to send
as the LSP "languageID" to the server.

> which I propose to solve with `derived-mode-p`, since
> that's what we've been using all these years.

Even before your patch and before TS modes, derived-mode-p leads
to exposing Eglot users looking to customize eglot-server-programs to
much more complicated concepts.

If I could reliably write `get-language-for-mode`, this is much closer to
what they are really after:

    (ocaml "ocamllsp")
    (reason "ocamllsp")

Instead of

   (((caml-mode :language-id "ocaml")
     (tuareg-mode :language-id "ocaml") reason-mode) . ("ocamllsp"))

which is what is currently found in the Eglot database.  In fact
even if LSP languageID wasn't a thing, I still think it's easier to
customize on those terms.

It's also a fair bit simpler.  And it'd be much simpler for
Yasnippet too.

João




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Wed, 10 Jan 2024 01:16:02 GMT) Full text and rfc822 format available.

Message #224 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: João Távora <joaotavora <at> gmail.com>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, casouri <at> gmail.com
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Wed, 10 Jan 2024 03:15:36 +0200
On 09/01/2024 13:05, João Távora wrote:
>> which I propose to solve with `derived-mode-p`, since
>> that's what we've been using all these years.
> Even before your patch and before TS modes, derived-mode-p leads
> to exposing Eglot users looking to customize eglot-server-programs to
> much more complicated concepts.
> 
> If I could reliably write `get-language-for-mode`, this is much closer to
> what they are really after:
> 
>      (ocaml "ocamllsp")
>      (reason "ocamllsp")

In the end, this could be a reliable solution without fully committing 
to keeping our language identifiers synced with VS Code/LSP.

Overrides would still be needed, but it could help shorten the list.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Wed, 10 Jan 2024 01:20:01 GMT) Full text and rfc822 format available.

Message #227 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>,
 João Távora <joaotavora <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, casouri <at> gmail.com
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Wed, 10 Jan 2024 03:18:47 +0200
On 09/01/2024 05:49, Stefan Monnier wrote:
>>>> Instead, we could have a mapping of files to "languages" and a separate one
>>>> from languages to major modes.
> `auto-mode-alist` maps from file names to languages/filetypes (where
> "major-mode like" symbols are typically used to represent
> languages/filetypes), and then `major-mode-remap-alist` maps from those
> languages/filetypes to actual major modes.
> 
> Of course, if you want to use other symbols for the content types, that
> works as well, e.g.:
> 
>      emacs --eval '(progn (add-to-list `auto-mode-alist `("\\.myf$" . text/html)) (add-to-list `major-mode-remap-alist `(text/html . html-mode)))' ~/tmp/foo.myf

That's very nice and concise, but it still leaves the issue of users 
being able to use a common hook for a family of major modes (for the 
same language). So I guess some inheritance-based solution is needed?

Or another integration for define-derived-mode which would run a hook 
with name derived from the name of the language.






Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Wed, 10 Jan 2024 01:42:01 GMT) Full text and rfc822 format available.

Message #230 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, casouri <at> gmail.com,
 João Távora <joaotavora <at> gmail.com>
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Wed, 10 Jan 2024 03:41:33 +0200
On 09/01/2024 05:55, Stefan Monnier wrote:
>> But for this to be useful to determine the language
>> of a major mode via reverse lookup,
> Define "the language".
> 
> The mapping from "language" to major mode can't be always reversible, so
> `major-mode-remap-alist` works to map "language" to "major-mode" but not
> the other way.

But that's the point: to be able to find 'javascript' from both 
'js-mode' and 'js2-mode'. Or 'ruby' from 'ruby-mode' and 'ruby-ts-mode'.

major-mode-remap-alist could have several entries for the same language: 
'assoc' will pick the highest priority (first) one, but 'rassoc' would 
be able to take advantage of every entry.
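
The forward and reverse lookups can be sketched on a standalone table. Everything here is hypothetical: the variable `my/language-mode-alist`, the language symbols, and both helper names are illustrative, and the table is not the real `major-mode-remap-alist`:

```elisp
;; Hypothetical table; the language symbols are illustrative only.
(defvar my/language-mode-alist
  '((javascript . js-ts-mode)
    (javascript . js-mode)
    (ruby . ruby-ts-mode)
    (ruby . ruby-mode))
  "Alist of (LANGUAGE . MAJOR-MODE), highest-priority mode first.")

(defun my/preferred-mode-for-language (language)
  "Return the first (highest-priority) mode listed for LANGUAGE."
  (cdr (assq language my/language-mode-alist)))

(defun my/language-for-mode (mode)
  "Return the language of the first entry whose cdr is MODE, else nil."
  (car (rassq mode my/language-mode-alist)))
```

With this table, both `js-mode` and `js-ts-mode` map back to `javascript`, while the forward lookup returns only the first entry for a language.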

> The current bug-report*is*  about "finding the language" but the code
> that needs that info luckily doesn't need "the language" it just needs
> to know "does the current buffer contain language FOO", which is an
> easier problem, which I propose to solve with `derived-mode-p`, since
> that's what we've been using all these years.

TBF, I don't quite like the "subtleness" of this solution. The 
inheritance hierarchy of the modes is an implicit thing, and the fact 
that js-mode-hook would run in js-ts-mode in Emacs 30 but not in Emacs 
29 would likely trip up a lot of people. Especially those who read 
recipes on the Internet.

Also, "what language is this" does happen to be a meaningful question. 
Eglot's example aside, we can have other tools, databases, etc.

And what about the idea of TS modes becoming the "main" modes sometime, 
far in the future, if tree-sitter stays around long enough? At least for 
some languages, I mean. If the name of the "original" major mode stays 
synonymous with the file type, how do we migrate away from them? Create 
obsolete alias? Rename js-ts-mode to js-mode someday?

Finally, if we did have "languages" as an entity, we could have some UI 
for the user to choose the mode for a language - something like Debian's 
'update-alternatives'. And it would also serve to list the supported 
languages, I guess.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Wed, 10 Jan 2024 02:00:01 GMT) Full text and rfc822 format available.

Message #233 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, casouri <at> gmail.com,
 Stefan Monnier <monnier <at> iro.umontreal.ca>
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Wed, 10 Jan 2024 01:59:12 +0000
On Wed, Jan 10, 2024 at 1:16 AM Dmitry Gutov <dmitry <at> gutov.dev> wrote:

> > If I could reliably write `get-language-for-mode`, this is much closer to
> > what they are really after:
> >
> >      (ocaml "ocamllsp")
> >      (reason "ocamllsp")
>
> In the end, this could be a reliable solution without fully committing
> that our language identifiers are fully synced to VS Code/LSP.

Yes, we needn't commit, though no reason to needlessly deviate from
that list either.  Note that the full list in the LSP spec isn't very long
or complete.  It's just a "recommendation".

As I understand it, it's rather the servers themselves (particularly the
ones that support more than one language, like typescript-language-server,
ocamllsp, and others) that make this decision.  So I guess the full list
is the union of all supported identifiers, and there might be one or
two conflicts there, but I would suspect not a lot.

> Overrides would still be needed, but it could help shorten the list.

Yes, I think I can realistically support this:

  (("ocaml" "reason") "ocamllsp")

Note the strings in the car, because symbols already mean major modes.

João




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Wed, 10 Jan 2024 06:26:02 GMT) Full text and rfc822 format available.

Message #236 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Kangas <stefankangas <at> gmail.com>
To: Dmitry Gutov <dmitry <at> gutov.dev>, Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, casouri <at> gmail.com,
 João Távora <joaotavora <at> gmail.com>
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Tue, 9 Jan 2024 22:24:56 -0800
Dmitry Gutov <dmitry <at> gutov.dev> writes:

>> The current bug-report *is* about "finding the language" but the code
>> that needs that info luckily doesn't need "the language" it just needs
>> to know "does the current buffer contain language FOO", which is an
>> easier problem, which I propose to solve with `derived-mode-p`, since
>> that's what we've been using all these years.

I can see the logic in that, however recently the situation has changed
such that many major modes will exist both in standard and treesitter
variants.  If there was ever a good time to take a step back and
consider doing something differently, this would be it, I think.

> TBF, I don't quite like the "subtleness" of this solution. The
> inheritance hierarchy of the modes is an implicit thing, and the fact
> that js-mode-hook would run in js-ts-mode in Emacs 30 but not in Emacs
> 29 would likely trip over a lot of people. Especially those who read
> recipes on the Internet.

I also misunderstood this at first: the major mode hooks would not be
run in Stefan M's proposal.  Perhaps the fact that two of us have had
the same misunderstanding tells us something about the complexity of
that solution.

> Also, "what language is this" does happen to be a meaningful question.
> Eglot's example aside, we can have other tools, databases, etc.

I can't speak for Stefan M but AFAIU he agrees that "which language
corresponds to this mode" is something we want to answer.  He just
proposes using the taxonomy we already have for this, instead of adding
a new one.

I.e. the difference is:

    (derived-mode-p 'foo-mode)   vs    (language-for-mode-p 'foo)
    Monnier                            Távora

Either of those would answer the question "does the current buffer
contain language FOO".  The former reuses the old taxonomy, the latter
introduces a new one.

> Finally, if we did have "languages" as an entity, we could have some UI
> for the user to choose the mode for a language - something like Debian's
> 'update-alternatives'. And it would also serve to list the supported
> languages, I guess.

This is a good point.  Also to install extensions for those languages.
Or we could use it to implement the VSCode-like prompt "hey this seems
to be language <foo>, do you want to install support for it?".

At the same time, I don't think anything technically stops us from using
`foo-mode' for this either.  We would just have to be more careful in
how we present it to users, to avoid any confusion.

FWIW, I tend to slightly prefer the solution proposed by João, since it
makes things simpler and less confusing in some ways:

    (derived-mode-p 'foo-mode)

does what you expect, and in every mode where that sexp evaluates to t,
`foo-mode-hook' is also being run.  It also has less potential for
breakage, since only new stuff will use it.

At the same time, I don't want to just discard the argument that simply
sticking with what we have is even simpler.  It obviously is in some
ways.  But I'm not convinced that this level of simplicity doesn't have
a cost that rears its head in the form of complexity elsewhere.

Basically, the biggest weakness of Stefan M's solution is the biggest
strength of João's and vice versa: "backwards-compatibility" (if we can
call it that) vs "clean taxonomy".




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Wed, 10 Jan 2024 15:52:01 GMT) Full text and rfc822 format available.

Message #239 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Stefan Kangas <stefankangas <at> gmail.com>
Cc: Dmitry Gutov <dmitry <at> gutov.dev>, Eli Zaretskii <eliz <at> gnu.org>,
 casouri <at> gmail.com, Stefan Monnier <monnier <at> iro.umontreal.ca>,
 68246 <at> debbugs.gnu.org
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Wed, 10 Jan 2024 15:51:38 +0000
On Wed, Jan 10, 2024 at 6:24 AM Stefan Kangas <stefankangas <at> gmail.com> wrote:

> I.e. the difference is:
>
>     (derived-mode-p 'foo-mode)   vs    (language-for-mode-p 'foo)
>     Monnier                            Távora
>
> Either of those would answer the question "does the current buffer
> contain language FOO".  The former reuses the old taxonomy, the right
> introduces a new one.

Yes, but the one on the right looks more like

   (cl-defun get-language-for-mode (&optional (mode major-mode)) ...)

i.e. it's not a boolean predicate.  You can make one with it, of course.
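
A minimal sketch of what I mean, assuming the language is recorded as a
symbol property on the mode.  The property name and both functions are
hypothetical; nothing like this exists in Emacs:

```elisp
(require 'cl-lib)

;; Hypothetical: record the language as a string on the mode symbol.
(put 'foo-mode 'my-language "foo")
(put 'foo-ts-mode 'my-language "foo")

(cl-defun my-get-language-for-mode (&optional (mode major-mode))
  "Return the language string recorded for MODE, or nil if none."
  (get mode 'my-language))

;; The boolean predicate is then a thin wrapper around the getter.
(defun my-language-for-mode-p (language &optional mode)
  "Return non-nil if MODE (default: the current major mode) is for LANGUAGE."
  (equal language (my-get-language-for-mode (or mode major-mode))))

(my-get-language-for-mode 'foo-ts-mode)       ; => "foo"
(my-language-for-mode-p "foo" 'foo-mode)      ; => t
(my-language-for-mode-p "foo" 'prog-mode)     ; => nil
```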

> Basically, the biggest weakness of Stefan M's solution is the biggest
> strength of João's and vice versa: "backwards-compatibility" (if we can
> call it that) vs "clean taxonomy".

Indeed I think it's important to clearly state what "backwards
compatibility" we're trying to solve here.  What exactly was "broken"
after the introduction of TS modes?  We could answer

   "nothing, the TS modes were new things!"

Or we could answer

   "some things, because in some situations the user is led more or
    less easily to use the new mode and her configs for the old mode
    don't apply"

I believe Stefan and Eli are using the second interpretation.
Fine, that's perfectly fine.  They are actively trying to fix this
breakage.  Also fine.

I also think it's important to quantify how many "things" were
broken.  In the beginning of this discussion, I saw references to
Eglot and Yasnippet.  Then CEDET, then lsp-mode, then ffap.
I know very well what's going on with the first two, but not the
others.  Does anyone?  It's important to have an overview of what
is broken where, and if it's in the Emacs tree, in the ELPAs, or
elsewhere entirely.

We also know the problem is already mostly fixed in places like Eglot
and lsp-mode.  Elegantly?  Manifestly not, but it's "no worse" than
what was in place pre-TS-modes.

Can we do better for Emacs 30 (or Emacs 29 + compat.el)?  Probably
yes.

1. We could have more "base" modes like we already have and keep
   the relative simplicity of simple inheritance.

2. We could have a new concept of "language" that is a non-nil
   property of _some_ major modes.  With this new concept a number
   of new useful features are being speculated (for example
   language-specific hooks, a friendly dialog for beginners looking
   to use a specific language, etc).  But these new features are
   not essential to "fixing the breakage".

   The concept of "content type" works just as well here, IMO.

   Also, it has been pointed out that the existing major-mode-remap-alist
   could be used for this.  I agree, but it should come with
   accessors and be localized to define-derived-mode anyway.

3. We could have this special concept of extra parents so
   that any existing derived-mode-p call does what we think is
   right from here on.

All valid alternatives, but I'm surprised option 3 is such a strong
candidate, since it requires exposing the user to a non-trivial
concept.  The symbol "<foo>-mode" would be promoted to designate
something like a "meta-mode" (I also called it a "family" earlier) where
the current major mode might be 'derived-mode-p' from it but the
concrete mode's hooks and body are not run.  Interesting as
this is (Stefan M made a defense of it based on practices in other
packages), I think it's just too strong of a hammer to use here,
and at least a minor headache in terms of docstrings.

João




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Wed, 10 Jan 2024 16:06:02 GMT) Full text and rfc822 format available.

Message #242 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: João Távora <joaotavora <at> gmail.com>
Cc: Dmitry Gutov <dmitry <at> gutov.dev>, Eli Zaretskii <eliz <at> gnu.org>,
 casouri <at> gmail.com, 68246 <at> debbugs.gnu.org
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Wed, 10 Jan 2024 11:04:00 -0500
> I must be out of luck, because Eglot does need "the language" to send
> as the LSP "languageID" to the server.

Not quite "the language": it needs "the language as defined by LSP (or by
its LSP server)".

FWIW, I view centralized mode-indexed databases like
`eglot-server-programs` generally as a "youth disease": as a package
matures this gets replaced by buffer-local vars set by the respective
major modes.

Reminds me that maybe we should better label our major modes, so as to
distinguish a few different categories:
- `prog-mode` and `fundamental-mode`: "virtual modes" used for the sole
  purpose of sharing code between their children.
  They don't correspond to any particular content type.
- `tuareg-mode`, `cperl-mode`, `foo-ts-mode`: "alternate modes".

This way, it might be easier to heuristically compute "the language" by
removing "-mode" from the non-"alternate", non-"virtual" mode(s) in
`derived-mode-all-parents`.
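
Something along these lines, as a rough sketch only: the
`my-mode-category` property and the function are hypothetical, and the
labels would have to be set by the modes themselves:

```elisp
(require 'seq)

;; Hypothetical labels; Emacs has no such property.
(put 'prog-mode 'my-mode-category 'virtual)
(put 'fundamental-mode 'my-mode-category 'virtual)
(put 'foo-ts-mode 'my-mode-category 'alternate)

(defun my-guess-language (parents)
  "Guess a language name from PARENTS, a list of mode symbols.
PARENTS is ordered like the result of `derived-mode-all-parents',
i.e. the mode itself comes first.  Return nil if every mode in
PARENTS is labeled \"virtual\" or \"alternate\"."
  (let ((mode (seq-find (lambda (m) (not (get m 'my-mode-category)))
                        parents)))
    (when mode
      ;; Strip the trailing "-mode" from the first unlabeled mode.
      (replace-regexp-in-string "-mode\\'" "" (symbol-name mode)))))

(my-guess-language '(foo-ts-mode foo-mode prog-mode))   ; => "foo"
(my-guess-language '(prog-mode))                        ; => nil
```

In a buffer one would presumably call it as
(my-guess-language (derived-mode-all-parents major-mode)).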


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Wed, 10 Jan 2024 16:13:02 GMT) Full text and rfc822 format available.

Message #245 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, casouri <at> gmail.com,
 João Távora <joaotavora <at> gmail.com>
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Wed, 10 Jan 2024 11:11:58 -0500
> That's very nice and concise, but it still leaves the issue of users being
> able to use a common hook for a family of major modes (for the same
> language).  So I guess some inheritance-based solution is needed?

We have `define-derived-mode` for that.
And even those major modes which don't want to inherit via
`define-derived-mode` can `run-mode-hooks` any additional hook they like.
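
For instance, a hand-written mode could do something like the following
(a sketch only; the mode and the shared hook are made-up names):

```elisp
;; Hypothetical hook shared by all modes for the language "foo".
(defvar my-foo-language-hook nil
  "Hook run by every major mode for the hypothetical language foo.")

(defvar my-foo-alt-mode-hook nil
  "Hook run when entering `my-foo-alt-mode'.")

;; A major mode that does not use `define-derived-mode'.
(defun my-foo-alt-mode ()
  "Alternate major mode for the hypothetical language foo."
  (interactive)
  (kill-all-local-variables)
  (setq major-mode 'my-foo-alt-mode
        mode-name "Foo/alt")
  ;; Run the shared hook in addition to the mode's own hook.
  (run-mode-hooks 'my-foo-language-hook 'my-foo-alt-mode-hook))
```

A function added to `my-foo-language-hook` then runs in this mode without
the mode being derived from anything.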

Doing it when a mode is defined is easy and "safe".
Changing it after the fact risks introducing breakage (just like my
`derived-mode-add-parents` does) where the code run via the hook depends
on specific details of the original mode which aren't available in the
"pseudo-derived" mode.

For this reason my patch only proposes the use of
`derived-mode-add-parents` since that's the part where a clear need has
been found.


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Wed, 10 Jan 2024 17:03:02 GMT) Full text and rfc822 format available.

Message #248 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>,
 João Távora <joaotavora <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, casouri <at> gmail.com
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Wed, 10 Jan 2024 19:02:18 +0200
On 10/01/2024 18:04, Stefan Monnier wrote:
> FWIW, I view centralized mode-indexed databases like
> `eglot-server-programs` generally as a "youth disease": as a package
> matures this gets replaced by buffer-local vars set by the respective
> major modes.

One of the proposed approaches was indeed for major modes to label 
themselves in the definition as associated with a particular language.

(Or maybe languages, plural? Do we want that? I suppose that might happen.)




Severity set to 'wishlist' from 'normal' Request was from Stefan Kangas <stefankangas <at> gmail.com> to control <at> debbugs.gnu.org. (Wed, 10 Jan 2024 17:31:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Wed, 10 Jan 2024 17:33:02 GMT) Full text and rfc822 format available.

Message #253 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Dmitry Gutov <dmitry <at> gutov.dev>, Eli Zaretskii <eliz <at> gnu.org>,
 casouri <at> gmail.com, 68246 <at> debbugs.gnu.org
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Wed, 10 Jan 2024 17:31:50 +0000
On Wed, Jan 10, 2024 at 4:04 PM Stefan Monnier <monnier <at> iro.umontreal.ca> wrote:
>
> > I must be out of luck, because Eglot does need "the language" to send
> > as the LSP "languageID" to the server.
>
> Not quite "the language": it needs "the language as defined by LSP (or by
> its LSP server)".

As there is almost always a 100% match, I'm happy to have
eglot-emacs-language-to-lsp-language with very few exceptions.

> FWIW, I view centralized mode-indexed databases like
> `eglot-server-programs` generally as a "youth disease": as a package
> matures this gets replaced by buffer-local vars

Me too.  But it's orthogonal to the "needs to know the language"
problem.

> set by the respective major modes.

...or directory-locals, or whatever hook/interface the user prefers.  So
I'd phrase that as "suggested by the major-mode".  And this major mode
doesn't have to be concrete either.

The foo-base-mode, parent of old-style foo-mode and new-style
foo-ts-mode, is an excellent place to suggest that the LSP server
for "foo" is "fools".


João




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Thu, 11 Jan 2024 03:43:02 GMT) Full text and rfc822 format available.

Message #256 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, casouri <at> gmail.com,
 João Távora <joaotavora <at> gmail.com>
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Thu, 11 Jan 2024 05:41:49 +0200
On 10/01/2024 18:11, Stefan Monnier wrote:
>> That's very nice and concise, but it still leaves the issue of users being
>> able to use a common hook for a family of major modes (for the same
>> language).  So I guess some inheritance-based solution is needed?
> We have `define-derived-mode` for that.
> And even those major modes which don't want to inherit via
> `define-derived-mode` can `run-mode-hooks` any additional hook they like.

Hmm, I guess I figured that having a common hook is the more pressing 
issue, since the users would want to try the different available modes, 
and they don't always know where to put their settings - on 
js-mode-hook, or js-ts-mode-hook, or that there is the base mode that 
can be used (which is the case not for every ts mode).

Whereas the packages that use derived-mode-p are most likely less 
numerous than our total set of users who employ customizations in hooks, 
and thus can more easily bear the inconvenience of having to mention 
both js-ts-mode and js-mode, for example.

> Doing it when a mode is defined is easy and "safe".

Indeed, 'run-mode-hooks' is a workable approach, but if we decided on a 
common hook name to use (e.g. if it used the format like 
xyz-language-mode-hook) that might relieve the situation somewhat.

> Changing it after the fact risks introducing breakage (just like my
> `derived-mode-add-parents` does) where the code run via the hook depends
> on specific details of the original mode which aren't available in the
> "pseudo-derived" mode.

Sure. A new hook shouldn't have such a problem, though.

> For this reason my patch only proposes the use of
> `derived-mode-add-parents` since that's the part where a clear need has
> been found.

That makes sense.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Thu, 11 Jan 2024 03:51:01 GMT) Full text and rfc822 format available.

Message #259 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Stefan Kangas <stefankangas <at> gmail.com>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, casouri <at> gmail.com,
 João Távora <joaotavora <at> gmail.com>
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Thu, 11 Jan 2024 05:49:56 +0200
On 10/01/2024 08:24, Stefan Kangas wrote:
>> Also, "what language is this" does happen to be a meaningful question.
>> Eglot's example aside, we can have other tools, databases, etc.
> I can't speak for Stefan M but AFAIU he agrees that "which language
> corresponds to this mode" is something we want to answer.  He just
> proposes using the taxonomy we already have for this, instead of adding
> a new one.
> 
> I.e. the difference is:
> 
>      (derived-mode-p 'foo-mode)   vs    (language-for-mode-p 'foo)
>      Monnier                            Távora
> 
> Either of those would answer the question "does the current buffer
> contain language FOO".  The former reuses the old taxonomy, the right
> introduces a new one.

The predicate is available, but implementing the function that decides 
on the current buffer's language is harder.

Because js-mode derives from prog-mode as well. You can't really take 
the current major mode, or an ancestor, strip away "[-ts]-mode", and say 
"this is the name of the language", because it's hard to decide which of 
them to use.

>> Finally, if we did have "languages" as an entity, we could have some UI
>> for the user to choose the mode for a language - something like Debian's
>> 'update-alternatives'. And it would also serve to list the supported
>> languages, I guess.
> This is a good point.  Also to install extensions for those languages.
> Or we could use it to implement the VSCode-like prompt "hey this seems
> to be language <foo>, do you want to install support for it?".

Yup.

Or - as long as the language name is decidable, allow the user to just 
enable, say, fundamental-mode, call 'M-x eglot' and have it provide both 
syntax highlighting and indentation support through LSP. This might not 
always be practical (and syntax highlighting is not implemented in Eglot 
yet), but it seems like an interesting possibility. Especially in those 
rare and purely hypothetical cases when an LSP server for a language 
exists, but there is no Emacs major mode yet.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Sun, 14 Jan 2024 02:20:02 GMT) Full text and rfc822 format available.

Message #262 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Yuan Fu <casouri <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 68246 <at> debbugs.gnu.org,
 João Távora <joaotavora <at> gmail.com>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Sat, 13 Jan 2024 18:19:26 -0800

> On Jan 8, 2024, at 9:15 AM, Eli Zaretskii <eliz <at> gnu.org> wrote:
> 
>> From: João Távora <joaotavora <at> gmail.com>
>> Date: Mon, 8 Jan 2024 14:45:31 +0000
>> Cc: monnier <at> iro.umontreal.ca, casouri <at> gmail.com, 68246 <at> debbugs.gnu.org
>> 
>> On Mon, Jan 8, 2024 at 1:14 PM Eli Zaretskii <eliz <at> gnu.org> wrote:
>> 
>>>>> It attempts to abstract a trait that isn't abstract, by going in the
>>>>> opposite direction of that used for abstractions.
>>>> 
>>>> It's interesting how you state a simple get/set is a "leaky abstraction",
>>>> but then also not an abstraction at all.
>>> 
>>> It's "leaky" because it "leaks" the idea that it should be a
>>> "language".
>> 
>> Oh right, of course.  Who came up with this "leaky" idea that there
>> are programming languages at all?
> 
> For some reason, once any discussion with you got past some number of
> messages, it always deteriorates into a stream of ad-hominem and
> sarcastic nonsense.  I'm outta here.

I've never doubted the good intentions of the participants in this thread. You’ve all sacrificed much personal time to make Emacs better. And I believe we could have efficient and friendly discussions if everyone in this thread had enough time to go over each other’s messages very carefully and compose detailed replies that are hard to misinterpret.

Alas, no one in this thread has the luxury of “enough time” (especially Eli, I’m already amazed that he can keep up with all these discussions everywhere everyday). And we sometimes end up with unfortunate situations like this.

Now, coming back to the subject: what we can do now, it seems, is apply this on master and observe what breaks. Then Joao can either say “I told you” or happily find out that this patch works OK. And maybe we could also mention this change on emacs-devel, given its potential impact.

Yuan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Sun, 14 Jan 2024 03:11:02 GMT) Full text and rfc822 format available.

Message #265 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Yuan Fu <casouri <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Sun, 14 Jan 2024 03:10:02 +0000
Yuan Fu <casouri <at> gmail.com> writes:

> observe what breaks? Then Joao can either say “I told you” or happily
> find out that this patch works OK.

Those are not the only two outcomes.  It might work OK and not break
anything [*].

However it is not easy to quantify confused users looking to understand
the new meaning of things in dir-locals.el.  Or users wondering why they
need to set Eglot variables in both 'c++-mode-hook' and
'c++-ts-mode-hook' when all they see is 'c++-mode' in
'eglot-server-programs'.

So it might not break, and even have some reasonably solid theoretical
backend beautifully enshrined in documentation.  But is it the right
thing?  I think not.  Unless I'm mistaken it's already proven to be
confusing even to two seasoned Emacs hackers in this thread (and I'm not
including myself).

And it doesn't do much for two main problems that were presented at the
base: Eglot and Yasnippet.  I.e. Eglot still inescapably needs to report
the language to the server and Yasnippet would be better and much
simpler if it could organize snippets by languages instead of major
modes.

There are better alternatives to this patch:

1. The base modes, which are substantially _already_ in place.  They
   follow the naming convention <lang>-base-mode.  After giving more
   thought to your earlier objection about derived modes overriding
   variables, it doesn't make sense (I can elaborate if you want :-) ).

2. Explicitly associating some major modes with languages or file types.
   This doesn't seem hard and other further uses like suggesting modes
   or packages to a new user based on languages have been proposed.

Nevertheless, I suspect that you want a solution to some real problem
happening today.  Can you say in your own words what that problem is?  As
I explained, I don't have a good idea of the cases besides Eglot,
Yasnippet and possibly/likely Lsp-mode.

João

[*]: It's possible though.  One way would be for a user to have added
entries to 'eglot-server-programs' for non-TS 'foo-mode' specifically
with 'add-to-list', a very common practice.  Her later 'foo-ts-mode'
entries would be shadowed.  Unlikely, perhaps?  But what about variables
set in '.dir-locals.el' for 'foo-mode' and 'foo-ts-mode'?  Surprising
"magic" aside, it seems these settings will be merged (right?) in
'foo-ts-mode' buffers.  But does this make sense every time?  Even in
our own dir-locals.el there are some settings for 'c-mode' (unrelated to
cc-mode.el) that are not in the 'c-ts-mode' section, but after Stefan's
patch it will be as if they were.  I think it's unavoidable we'll catch
some users off-guard and break things.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Sun, 14 Jan 2024 04:01:01 GMT) Full text and rfc822 format available.

Message #268 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Yuan Fu <casouri <at> gmail.com>
To: João Távora <joaotavora <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Sat, 13 Jan 2024 20:00:13 -0800

> On Jan 13, 2024, at 7:10 PM, João Távora <joaotavora <at> gmail.com> wrote:
> 
> Yuan Fu <casouri <at> gmail.com> writes:
> 
>> observe what breaks? Then Joao can either say “I told you” or happily
>> find out that this patch works OK.
> 
> Those are not the only two outcomes.  It might work OK and not break
> anything [*].
> 
> However it is not easy to quantify confused users looking to understand
> the new meaning of things in dir-locals.el.  Or users wondering why they
> need to set Eglot variables in both 'c++-mode-hook' and
> 'c++-ts-mode-hook' when all they see is 'c++-mode' in
> 'eglot-server-programs'.

I agree. “Not confusing” is very valuable for Emacs, or any system.

> So it might not break, and even have some reasonably solid theoretical
> backend beautifully enshrined in documentation.  But is it the right
> thing?  I think not.  Unless I'm mistaken it's already proven to be
> confusing even to two seasoned Emacs hackers in this thread (and I'm not
> including myself).
> 
> And it doesn't do much for two main problems that were presented at the
> base: Eglot and Yasnippet.  I.e. Eglot still inescapably needs to report
> the language to the server and Yasnippet would be better and much
> simpler if it could organize snippets by languages instead of major
> modes.
> 
> There are better alternatives to this patch:
> 
> 1. The base modes, which are substantially _already_ in place.  They
>   follow the naming convention <lang>-base-mode.  After giving more
>   thought to your earlier objection about derived modes overriding
>   variables, it doesn't make sense (I can elaborate if you want :-) ).

Yeah, I made a mistake there, as Stefan corrected. Still, the other part of the argument holds: creating a base mode needs cooperation from the authors of every major mode involved. We can’t unilaterally create base modes and make third-party major modes derive from them. I’m not saying it wouldn’t work, it would, but we can’t apply it everywhere.

> 
> 2. Explicitly associating some major modes with languages or file types.
>   This doesn't seem hard and other further uses like suggesting modes
>   or packages to a new user based on languages have been proposed.
> 
> Nevertheless, I suspect that you want a solution to some real problem
> happening today.  Can you say in your own words what that problem is?  As
> I explained, I don't have a good idea of the cases besides Eglot,
> Yasnippet and possibly/likely Lsp-mode.

I think I want the same thing as you do: right now many packages have a central database that maps major modes to some mode-specific configuration. IIUC, for Eglot, that’s the server’s arguments; for Yasnippet, that’s the snippets. I can think of other examples like hs-special-modes-alist in hideshow.el. I’m sure there are countless third-party packages that use a central database rather than a buffer-local variable for their mode-based configuration.

For those databases, I want lang-ts-mode to use the same configuration as lang-mode.

More importantly, I hope the countless databases in all these third-party packages continue to work. Adding language tags for major modes is nice and all, but a) third-party packages have to change their databases to make use of them, and b) third-party major modes need to add language tags.

That’s why I like major mode names a bit better. Both major mode names and language tags are leaky abstractions to some degree, so we might as well use the one that already exists.

> [*]: It's possible though.  One way would be for a user to have added
> entries to 'eglot-server-programs' for non-TS 'foo-mode' specifically
> with 'add-to-list', a very common practice.  Her later 'foo-ts-mode'
> entries would be shadowed.  Unlikely, perhaps?  But what about variables
> set in '.dir-locals.el' for 'foo-mode' and 'foo-ts-mode'?  Surprising
> "magic" aside, it seems these settings will be merged (right?) in
> 'foo-ts-mode' buffers.  But does this make sense every time?  Even in
> our own dir-locals.el there are some settings for 'c-mode' (unrelated to
> cc-mode.el) that are not in the 'c-ts-mode' section, but after Stefan's
> patch it will be as if they were.  I think it's unavoidable we'll catch
> some users off-guard and break things.

Right… Sometimes we want lang-mode and lang-ts-mode to share some config, and sometimes we don’t. I don’t have good ideas right now :-)

Yuan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Sun, 14 Jan 2024 06:34:01 GMT) Full text and rfc822 format available.

Message #271 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Yuan Fu <casouri <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, joaotavora <at> gmail.com, monnier <at> iro.umontreal.ca
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Sun, 14 Jan 2024 08:33:19 +0200
> From: Yuan Fu <casouri <at> gmail.com>
> Date: Sat, 13 Jan 2024 18:19:26 -0800
> Cc: João Távora <joaotavora <at> gmail.com>,
>  Stefan Monnier <monnier <at> iro.umontreal.ca>,
>  68246 <at> debbugs.gnu.org
> 
> Now come back to the subject. What we can do now, it seems, is to
> apply this on master and observe what breaks?

I agree, and Stefan evidently also agrees, since he already installed
this on master, with suitable changes to NEWS and the ELisp reference
manual.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Sun, 14 Jan 2024 07:03:02 GMT) Full text and rfc822 format available.

Message #274 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: João Távora <joaotavora <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, casouri <at> gmail.com, monnier <at> iro.umontreal.ca
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Sun, 14 Jan 2024 09:02:25 +0200
> From: João Távora <joaotavora <at> gmail.com>
> Cc: Eli Zaretskii <eliz <at> gnu.org>,  Stefan Monnier
>  <monnier <at> iro.umontreal.ca>,  68246 <at> debbugs.gnu.org
> Date: Sun, 14 Jan 2024 03:10:02 +0000
> 
> Yuan Fu <casouri <at> gmail.com> writes:
> 
> > observe what breaks? Then Joao can either say “I told you” or happily
> > found out that this patch works ok.
> 
> Those are not the only two outcomes.  It might work OK and not break
> anything [*].
> 
> However it is not easy to quantify confused users looking to understand
> the new meaning of things in dir-locals.el.  Or users wondering why they
> need to set Eglot variables in both 'c++-mode-hook' and
> 'c++-ts-mode-hook' when all they see is 'c++-mode' in
> 'eglot-server-programs'.

Those users will hopefully submit bug reports or otherwise complain on
the Emacs mailing lists, and then we will know.

> There are better alternatives to this patch:
> 
> 1. The base modes, which are substantially _already_ in place.  They
>    follow the naming convention <lang>-base-mode.  After giving more
>    thought to your earlier objection about derived modes overriding
>    variables, it doesn't make sense (I can elaborate if you want :-) ).

The recommendation is to use base modes where it makes sense, and the
installed changes around derived-mode-add-parents don't in any way
preclude having a base mode and don't make it harder.  But I don't
think we should force everyone in this situation to invent a base mode
as the sole means for solving this.

> 2. Explicitly associating some major modes with languages or file types.
>    This doesn't seem hard and other further uses like suggesting modes
>    or packages to a new user based on languages have been proposed.

This is IMO a heavier and more thorough change, especially since Emacs
doesn't have the notion of "language".  This discussion shows that its
advantages are not evident, and moreover we don't even have a clear
shared view what will that entail.

So I think Yuan is right: let's see what happens with what we have on
master, and take it from there.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Sun, 14 Jan 2024 23:19:01 GMT) Full text and rfc822 format available.

Message #277 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 68246 <at> debbugs.gnu.org, Yuan Fu <casouri <at> gmail.com>,
 monnier <at> iro.umontreal.ca
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Sun, 14 Jan 2024 23:18:11 +0000
Eli Zaretskii <eliz <at> gnu.org> writes:

>> Now come back to the subject. What we can do now, it seems, is to
>> apply this on master and observe what breaks?
>
> I agree, and Stefan evidently also agrees, since he already installed
> this on master, with suitable changes to NEWS and the ELisp reference
> manual.

I don't see the Stefan M's patch to this bug number in either master or
Emacs 29.  If possible I'd like to sign off on the Eglot parts of the
final version of the patch to be installed.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Sun, 14 Jan 2024 23:41:01 GMT) Full text and rfc822 format available.

Message #280 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 68246 <at> debbugs.gnu.org, casouri <at> gmail.com, monnier <at> iro.umontreal.ca
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Sun, 14 Jan 2024 23:40:17 +0000
Eli Zaretskii <eliz <at> gnu.org> writes:

>> However it is not easy to quantify confused users looking to understand
>> the new meaning of things in dir-locals.el.  Or users wondering why they
>> need to set Eglot variables in both 'c++-mode-hook' and
>> 'c++-ts-mode-hook' when all they see is 'c++-mode' in
>> 'eglot-server-programs'.
>
> Those users will hopefully submit bug reports or otherwise complain on
> the Emacs mailing lists, and then we will know.

You also know this doesn't always happen.  A confused user has a certain
probability of using those channels, and that is most definitely not
100%.  But I will make sure to send any Eglot confusion this way yes.

>> There are better alternatives to this patch:
>> 
>> 1. The base modes, which are substantially _already_ in place.  They
>>    follow the naming convention <lang>-base-mode.  After giving more
>>    thought to your earlier objection about derived modes overriding
>>    variables, it doesn't make sense (I can elaborate if you want :-) ).
>
> The recommendation is to use base modes where it makes sense, and the
> installed changes around derived-mode-add-parents don't in any way
> preclude having a base mode and don't make it harder.  But I don't
> think we should force everyone in this situation to invent a base mode
> as the sole means for solving this.

We can invent for them.  There's very little to invent.  Earlier you
seemed to view a base mode as a receptacle for common code.  It _can_ be
that (and _should_ where applicable).  But it doesn't _have_ to be.  An
empty base mode is useful just for its hook and its behaviour in
dir-locals, for example.

So, find two modes for the same language foo, make an empty
foo-base-mode:

   (define-derived-mode foo-base-mode prog-mode "Foo"
     "Base mode for Foo.")

Then ask the authors of 'foo-mode', 'foo-ts-mode', 'foo-nongnu-mode',
etc to put foo-base-mode in their mode definitions.  If they refuse or
are unresponsive, consider insinuating 'foo-base-mode' in them (after
asking why, of course).  If this insinuation is acceptable for complex
"extra parents", why shouldn't it be acceptable for normal parents?

This is not technically hard to do, a simple add-function works (and
it's also self-documenting).
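
A minimal sketch of the empty-base-mode idea, assuming an imaginary language Foo. All mode names and the function 'my-foo-common-setup' are hypothetical; the point is only that one hook on the base mode reaches every implementation derived from it:

```elisp
;; Hypothetical modes for an imaginary language Foo.
(define-derived-mode foo-base-mode prog-mode "Foo"
  "Empty base mode shared by all Foo major modes.")

(define-derived-mode foo-mode foo-base-mode "Foo"
  "Hypothetical non-tree-sitter mode for Foo.")

(define-derived-mode foo-ts-mode foo-base-mode "Foo[ts]"
  "Hypothetical tree-sitter mode for Foo.")

(defun my-foo-common-setup ()
  "User setup that should apply to every Foo mode."
  (setq-local fill-column 72))

;; One hook covers `foo-mode', `foo-ts-mode' and any other child,
;; because `define-derived-mode' runs the parent's hook as well.
(add-hook 'foo-base-mode-hook #'my-foo-common-setup)
```

The base mode body is empty on purpose: it carries no shared code, only the shared hook and the shared ancestor for 'derived-mode-p' checks.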

>> 2. Explicitly associating some major modes with languages or file types.
>>    This doesn't seem hard and other further uses like suggesting modes
>>    or packages to a new user based on languages have been proposed.
>
> This is IMO a heavier and more thorough change, especially since Emacs
> doesn't have the notion of "language".  This discussion shows that its
> advantages are not evident, and moreover we don't even have a clear
> shared view what will that entail.

It's not an extremely heavy change, at least not when compared to extra
parents.  But yes, we should be careful how to implement it.

The advantages are evident though.  Eglot and Yasnippet would be much
simpler to configure.  Even simpler with a language-specific hook.

But the "base mode" approach, which has a significant deployment already, is
basically the "languages approach" in slightly more verbose naming.

> So I think Yuan is right: let's see what happens with what we have on
> master, and take it from there.

OK, experimenting in master is what master is good for, after all.
Could be effective (even too much, as the recent register imbroglio
clearly showed).  But I think the practical subtleties (like dir-locals
merging, potential Eglot fallout) need to be highlighted somewhere or at
least announced in emacs-devel.

Also I for one would like to understand how consistently these
extra-parents are going to be used in the future: the symbol named
'foo-mode' is about to be promoted to something above just _a_
major-mode for Foo, and I'd like to see that cleanly described
somewhere.

João




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Mon, 15 Jan 2024 02:11:01 GMT) Full text and rfc822 format available.

Message #283 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Eli Zaretskii <eliz <at> gnu.org>, João Távora
 <joaotavora <at> gmail.com>, Stefan Kangas <stefankangas <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, casouri <at> gmail.com, monnier <at> iro.umontreal.ca
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Mon, 15 Jan 2024 04:10:16 +0200
[Message part 1 (text/plain, inline)]
On 14/01/2024 09:02, Eli Zaretskii wrote:
>> From: João Távora <joaotavora <at> gmail.com>
>> Cc: Eli Zaretskii <eliz <at> gnu.org>,  Stefan Monnier
>>   <monnier <at> iro.umontreal.ca>,  68246 <at> debbugs.gnu.org
>> Date: Sun, 14 Jan 2024 03:10:02 +0000
>>
>> Yuan Fu <casouri <at> gmail.com> writes:
>>
>>> observe what breaks? Then Joao can either say “I told you” or happily
>>> found out that this patch works ok.
>>
>> Those are not the only two outcomes.  It might work OK and not break
>> anything [*].
>>
>> However it is not easy to quantify confused users looking to understand
>> the new meaning of things in dir-locals.el.  Or users wondering why they
>> need to set Eglot variables in both 'c++-mode-hook' and
>> 'c++-ts-mode-hook' when all they see is 'c++-mode' in
>> 'eglot-server-programs'.
> 
> Those users will hopefully submit bug reports or otherwise complain on
> the Emacs mailing lists, and then we will know.

I rather wouldn't rely on that.

>> There are better alternatives to this patch:
>>
>> 1. The base modes, which are substantially _already_ in place.  They
>>     follow the naming convention <lang>-base-mode.  After giving more
>>     thought to your earlier objection about derived modes overriding
>>     variables, it doesn't make sense (I can elaborate if you want :-) ).
> 
> The recommendation is to use base modes where it makes sense, and the
> installed changes around derived-mode-add-parents don't in any way
> preclude having a base mode and don't make it harder.  But I don't
> think we should force everyone in this situation to invent a base mode
> as the sole means for solving this.

It's not like we don't have an existing solution for this: if there are 
two different modes to configure, change the settings for both modes, or 
alter two hooks. Less magical and more verbose, but being explicit can 
be good.

>> 2. Explicitly associating some major modes with languages or file types.
>>     This doesn't seem hard and other further uses like suggesting modes
>>     or packages to a new user based on languages have been proposed.
> 
> This is IMO a heavier and more thorough change, especially since Emacs
> doesn't have the notion of "language".  This discussion shows that its
> advantages are not evident, and moreover we don't even have a clear
> shared view what will that entail.

Here's a draft patch of how a "language" could work. It doesn't alter 
every entry, but it is backward compatible.

It adds an entity called "language" above major modes which are denoted 
with keywords (you could also call it content-types, but that's a longer 
name). The patch implements an implicit language detection as well 
(based on the mode name), but ideally modes that belong to specific 
languages would not have direct entries in auto-mode-alist, etc, at all.

Benefits:
- Avoiding duplicating a bunch of regexps, for ts modes in particular.
- Uncovered an existing bug: ruby-ts-mode didn't add an entry for 
interpreter-mode-alist, only for auto-mode-alist.
- The user could avoid thinking in terms of major modes, and when they 
wanted to enable a fitting mode, they can type 'M-x set-buffer-language' 
and choose one of the known languages with completion.
- Features like Eglot could now call (buffer-language) and dispatch 
based on that (that can make the value of eglot-server-programs more 
compact in the long run).
- Hook <lang>-language-hook is run inside set-auto-mode-0, for the user 
to add customizations to major modes that share the language but no 
common ancestor.

Further possible additions (mentioned previously):
- Potential UI for customizing major-mode-remap-alist to decide which 
major mode to use for a given language, becomes easier/doable.
- Even when there is no installed major mode for a given language, but 
an auto-mode-alist entry for it exists, we could now do something with 
it in fundamental-mode. Like still use Eglot's features, or suggest the 
user installs one of the language support packages for that language 
from ELPA (knowing the language name, we can suggest a specific package).
[buffer-language.diff (text/x-patch, attachment)]
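
The implicit detection "based on the mode name" can be pictured with a small sketch. This is not the code from the attached patch; 'my-mode-language' is a hypothetical helper that simply strips the "-mode" or "-ts-mode" suffix and returns a keyword:

```elisp
(defun my-mode-language (mode)
  "Guess a language keyword from the name of major MODE.
Hypothetical helper: both `js-mode' and `js-ts-mode' map to `:js'.
Return nil if MODE's name does not end in \"-mode\"."
  (let ((name (symbol-name mode)))
    (when (string-match "\\`\\(.+?\\)\\(?:-ts\\)?-mode\\'" name)
      ;; Interning a name that starts with a colon yields a keyword.
      (intern (concat ":" (match-string 1 name))))))
```

A name-based guess like this gets modes such as 'tuareg-mode' wrong, which is why an explicit association would still be needed for them.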

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Mon, 15 Jan 2024 12:37:02 GMT) Full text and rfc822 format available.

Message #286 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: João Távora <joaotavora <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, casouri <at> gmail.com, monnier <at> iro.umontreal.ca
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Mon, 15 Jan 2024 14:35:58 +0200
> From: João Távora <joaotavora <at> gmail.com>
> Cc: Yuan Fu <casouri <at> gmail.com>,  monnier <at> iro.umontreal.ca,
>   68246 <at> debbugs.gnu.org
> Date: Sun, 14 Jan 2024 23:18:11 +0000
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> >> Now come back to the subject. What we can do now, it seems, is to
> >> apply this on master and observe what breaks?
> >
> > I agree, and Stefan evidently also agrees, since he already installed
> > this on master, with suitable changes to NEWS and the ELisp reference
> > manual.
> 
> I don't see the Stefan M's patch to this bug number in either master or
> Emacs 29.

Start with commit 4194f9bd870, I think.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Mon, 15 Jan 2024 12:40:02 GMT) Full text and rfc822 format available.

Message #289 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: João Távora <joaotavora <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, casouri <at> gmail.com, monnier <at> iro.umontreal.ca
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Mon, 15 Jan 2024 14:38:44 +0200
> From: João Távora <joaotavora <at> gmail.com>
> Cc: casouri <at> gmail.com,  monnier <at> iro.umontreal.ca,  68246 <at> debbugs.gnu.org
> Date: Sun, 14 Jan 2024 23:40:17 +0000
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> >> However it is not easy to quantify confused users looking to understand
> >> the new meaning of things in dir-locals.el.  Or users wondering why they
> >> need to set Eglot variables in both 'c++-mode-hook' and
> >> 'c++-ts-mode-hook' when all they see is 'c++-mode' in
> >> 'eglot-server-programs'.
> >
> > Those users will hopefully submit bug reports or otherwise complain on
> > the Emacs mailing lists, and then we will know.
> 
> You also know this doesn't always happen.

It's our only reliable instrument of getting feedback for our
decisions.

> > The recommendation is to use base modes where it makes sense, and the
> > installed changes around derived-mode-add-parents don't in any way
> > preclude having a base mode and don't make it harder.  But I don't
> > think we should force everyone in this situation to invent a base mode
> > as the sole means for solving this.
> 
> We can invent for them.

Yes, but only where it makes sense.  For example, an empty base mode
doesn't.

> An empty base mode is useful just for its hook and its behaviour in
> dir-locals, for example.

No, it is completely useless, and we shouldn't introduce such modes.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Mon, 15 Jan 2024 12:47:02 GMT) Full text and rfc822 format available.

Message #292 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: 68246 <at> debbugs.gnu.org, casouri <at> gmail.com, joaotavora <at> gmail.com,
 monnier <at> iro.umontreal.ca, stefankangas <at> gmail.com
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Mon, 15 Jan 2024 14:46:34 +0200
> Date: Mon, 15 Jan 2024 04:10:16 +0200
> Cc: 68246 <at> debbugs.gnu.org, casouri <at> gmail.com, monnier <at> iro.umontreal.ca
> From: Dmitry Gutov <dmitry <at> gutov.dev>
> 
> >> However it is not easy to quantify confused users looking to understand
> >> the new meaning of things in dir-locals.el.  Or users wondering why they
> >> need to set Eglot variables in both 'c++-mode-hook' and
> >> 'c++-ts-mode-hook' when all they see is 'c++-mode' in
> >> 'eglot-server-programs'.
> > 
> > Those users will hopefully submit bug reports or otherwise complain on
> > the Emacs mailing lists, and then we will know.
> 
> I rather wouldn't rely on that.

Why not?  The decisions we made are not arbitrary.  Given the best
consensus (or lack thereof) we could arrive upon after careful
consideration of the issues, it is perfectly fine to expect feedback
to set us straight if we made a mistake.

> > The recommendation is to use base modes where it makes sense, and the
> > installed changes around derived-mode-add-parents don't in any way
> > preclude having a base mode and don't make it harder.  But I don't
> > think we should force everyone in this situation to invent a base mode
> > as the sole means for solving this.
> 
> It's not like we don't have an existing solution for this: if there are 
> two different modes to configure, change the settings for both modes, or 
> alter two hooks.

This doesn't solve the problem at hand, since the differences between
the modes are not limited to these simple aspects.

> Less magical and more verbose, but being explicit can
> be good.
> 
> >> 2. Explicitly associating some major modes with languages or file types.
> >>     This doesn't seem hard and other further uses like suggesting modes
> >>     or packages to a new user based on languages have been proposed.
> > 
> > This is IMO a heavier and more thorough change, especially since Emacs
> > doesn't have the notion of "language".  This discussion shows that its
> > advantages are not evident, and moreover we don't even have a clear
> > shared view what will that entail.
> 
> Here's a draft patch of how a "language" could work. It doesn't alter 
> every entry, but it is backward compatible.

Like I said: it is heavier, so we should only do that if the simpler
method doesn't work well enough.  So thanks, but let's try the existing
simpler solution first and see if we need something better.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Mon, 15 Jan 2024 14:46:02 GMT) Full text and rfc822 format available.

Message #295 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 68246 <at> debbugs.gnu.org, casouri <at> gmail.com, monnier <at> iro.umontreal.ca
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Mon, 15 Jan 2024 14:45:41 +0000
On Mon, Jan 15, 2024 at 12:38 PM Eli Zaretskii <eliz <at> gnu.org> wrote:

> It's our only reliable instrument of getting feedback for our
> decisions.

It's an instrument among others.  It's not particularly reliable.

> > An empty base mode is useful just for its hook and its behaviour in
> > dir-locals, for example.
>
> No, it is completely useless, and we shouldn't introduce such modes.

One more time.  The user hook for 'foo-base-mode', which is the
normal parent of 'foo-mode', 'foo-ts-mode' and 'foo-whatever-impl-mode'
can be used to:

* set up a library of snippets for the Foo language
* define a suitable Flymake backend for said language
* appear in dir-locals to set up fill-column for this language
* define a simpler, more robust major-mode database, such as

((foo-base-mode . thingy-42)
 (js-base-mode . thingy-43)
 (ruby-base-mode . thingy-45))

* many more things

These are exactly the things being discussed here.
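
A lookup against such a database could be sketched as follows. The alist contents mirror the example above; 'my-thingy-alist' and 'my-thingy-for-mode' are hypothetical names, and the only real API assumed is 'provided-mode-derived-p':

```elisp
(defvar my-thingy-alist
  '((foo-base-mode . thingy-42)
    (js-base-mode . thingy-43)
    (ruby-base-mode . thingy-45))
  "Hypothetical database keyed by base modes.")

(defun my-thingy-for-mode (mode)
  "Return the entry of `my-thingy-alist' that applies to MODE.
An entry applies if MODE is derived from the entry's key."
  (let (result)
    (dolist (entry my-thingy-alist)
      (when (and (null result)
                 (provided-mode-derived-p mode (car entry)))
        (setq result (cdr entry))))
    result))
```

With one key per language, any implementation that derives from the base mode is covered without listing it separately.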

There is this crystal clear evidence of usefulness being laid
in front of you and yet you claim adamantly it is "completely
useless".  With no justification for the statement.  Because
of course, there is no such thing.

João




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Mon, 15 Jan 2024 14:50:01 GMT) Full text and rfc822 format available.

Message #298 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 68246 <at> debbugs.gnu.org, casouri <at> gmail.com, monnier <at> iro.umontreal.ca
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Mon, 15 Jan 2024 14:49:15 +0000
On Mon, Jan 15, 2024 at 12:36 PM Eli Zaretskii <eliz <at> gnu.org> wrote:
>
> > From: João Távora <joaotavora <at> gmail.com>
> > Cc: Yuan Fu <casouri <at> gmail.com>,  monnier <at> iro.umontreal.ca,
> >   68246 <at> debbugs.gnu.org
> > Date: Sun, 14 Jan 2024 23:18:11 +0000
> >
> > Eli Zaretskii <eliz <at> gnu.org> writes:
> >
> > >> Now come back to the subject. What we can do now, it seems, is to
> > >> apply this on master and observe what breaks?
> > >
> > > I agree, and Stefan evidently also agrees, since he already installed
> > > this on master, with suitable changes to NEWS and the ELisp reference
> > > manual.
> >
> > I don't see the Stefan M's patch to this bug number in either master or
> > Emacs 29.
>
> Start with commit 4194f9bd870, I think.

That was added back in November and adds in the derived-mode-add-parents
machinery.  Uses said machinery in one place, not related to this
bug report.  So it's not in master yet, and we should continue
evaluating alternatives.

João




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Mon, 15 Jan 2024 15:01:01 GMT) Full text and rfc822 format available.

Message #301 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: João Távora <joaotavora <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, casouri <at> gmail.com, monnier <at> iro.umontreal.ca
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Mon, 15 Jan 2024 17:00:34 +0200
> From: João Távora <joaotavora <at> gmail.com>
> Date: Mon, 15 Jan 2024 14:45:41 +0000
> Cc: casouri <at> gmail.com, monnier <at> iro.umontreal.ca, 68246 <at> debbugs.gnu.org
> 
> On Mon, Jan 15, 2024 at 12:38 PM Eli Zaretskii <eliz <at> gnu.org> wrote:
> 
> > It's our only reliable instrument of getting feedback for our
> > decisions.
> 
> It's an instrument among others.  It's not particularly reliable.

You are entitled to your opinions, but mine is different in this
matter.

> > > An empty base mode is useful just for its hook and its behaviour in
> > > dir-locals, for example.
> >
> > No, it is completely useless, and we shouldn't introduce such modes.
> 
> One more time.

Thanks, I understood the first time.  I just don't agree with your
conclusions.  And I already explained why, so at this point let's
agree to disagree, again. 

> There is this crystal clear evidence of usefulness being laid
> in front of you and yet you claim adamantly it is "completely
> useless".  With no justification for the statement.  Because
> of course, there is no such thing.

Right.  The stream of ad-hominem and sarcastic nonsense again.  Sorry,
I forgot that I already bowed out of the attempts to discuss this with
you.  Mea culpa.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Mon, 15 Jan 2024 15:11:01 GMT) Full text and rfc822 format available.

Message #304 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 68246 <at> debbugs.gnu.org, casouri <at> gmail.com, monnier <at> iro.umontreal.ca
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Mon, 15 Jan 2024 15:09:42 +0000
On Mon, Jan 15, 2024 at 3:00 PM Eli Zaretskii <eliz <at> gnu.org> wrote:

> Thanks, I understood the first time.  I just don't agree with your
> conclusions.  And I already explained why, so at this point let's
> agree to disagree, again.

There's not much open to interpretation of what "useful" means here.
But by all means go ahead and justify why you think these hooks,
dir-locals use cases, and other features are "completely useless".
You haven't done that.

> > There is this crystal clear evidence of usefulness being laid
> > in front of you and yet you claim adamantly it is "completely
> > useless".  With no justification for the statement.  Because
> > of course, there is no such thing.

> Right.  The stream of ad-hominem and sarcastic nonsense again.  Sorry,
> I forgot that I already bowed out of the attempts to discuss this with
> you.  Mea culpa.

That was most obviously not sarcastic.  And calling my posts where
I give actual evidence of things "nonsense" (which you have
done twice now) is, I'm fairly sure, the true ad-hominem here.

Unsarcastically yours,
João




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Mon, 15 Jan 2024 15:28:01 GMT) Full text and rfc822 format available.

Message #307 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, casouri <at> gmail.com,
 Stefan Kangas <stefankangas <at> gmail.com>, monnier <at> iro.umontreal.ca
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Mon, 15 Jan 2024 15:27:05 +0000
On Mon, Jan 15, 2024 at 2:10 AM Dmitry Gutov <dmitry <at> gutov.dev> wrote:
>

> It's not like we don't have an existing solution for this: if there are
> two different modes to configure, change the settings for both modes, or
> alter two hooks. Less magical and more verbose, but being explicit can
> be good.

I don't think there's anything magical about a base mode.  But I like your
solution too.

> Here's a draft patch of how a "language" could work. It doesn't alter
> every entry, but it is backward compatible.

I think something like this can work, yes.

-      (funcall (alist-get mode major-mode-remap-alist mode))
+      ;; XXX: When there's no mapping for `:<language>', we could also
+      ;; look for a function called `<language>-mode'.
+      (funcall (alist-get mode major-mode-remap-alist (if (keywordp mode)
+                                                          #'fundamental-mode
+                                                        mode)))
+      (when (keywordp mode)             ;Perhaps do that unconditionally.
+        (run-hooks (intern (format "%s-language-hook" (buffer-language)))))
       (unless (eq mode major-mode)

Regarding the "XXX", this is basically the same question as in the
two headings, I think, which is whether to consider the <foo> in an existing
<foo>-mode as the language.  I think we can do it, yes.  Eglot and other
packages [*] have been doing it for quite some time.  It will fail very
rarely, only for major modes outside Emacs (like "tuareg-mode" for
Ocaml) and we can probably fix that in-tree.

The only thing that leaves me some doubts is the 'set-buffer-language'
entry point.  It's a new thing not strictly required.  Normally the
databases are edited (via whatever means) and then the buffer is
reverted for a mode change.  So I don't think we need to introduce
this user convenience just yet (though, like the other user conveniences
you have imagined, I'm not necessarily opposed to it).

Also 'buffer-language' could be 'get-mode-language',  so you don't
have to have an actual buffer handy to get this association.  The
implementation would just be a reverse search in major-mode-remap-alist

Other than that, I think the solution is workable.

The other package [*] that does exactly the same thing as Eglot, and
invented it independently is markdown-mode:

(defun markdown-get-lang-mode (lang)
  "Return major mode that should be used for LANG.
LANG is a string, and the returned major mode is a symbol."
  (cl-find-if
   #'markdown--lang-mode-predicate
   (nconc (list (cdr (assoc lang markdown-code-lang-modes))
                (cdr (assoc (downcase lang) markdown-code-lang-modes)))
          (and (fboundp 'treesit-language-available-p)
               (list (and (treesit-language-available-p (intern lang))
                          (intern (concat lang "-ts-mode")))
                     (and (treesit-language-available-p
                           (intern (downcase lang)))
                          (intern (concat (downcase lang) "-ts-mode")))))
          (list
           (intern (concat lang "-mode"))
           (intern (concat (downcase lang) "-mode"))))))

It uses this to know what major-mode to use to fontify GitHub style
markdown code blocks (which have a little language cookie after the
three backticks).  Like Eglot's similar code, I think it could be trivially
rewritten if something like your patch were in place.

Bug#68217 is also relevant here.  Eglot calls into markdown-mode.el
to fontify LSP documentation snippets and sometimes the mode picked
by markdown-mode.el to do the fontification is not the same one the user
is using for the buffer.  It most clearly should be.  So
get-language-for-mode and get-(preferred)-mode-for-language are two
evidently needed helpers.
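
A rough sketch of what the two helpers might look like, assuming a mapping from language keywords to preferred major modes. The variable 'my-language-mode-alist' and both function names are hypothetical; the draft patch may well store this association differently:

```elisp
(defvar my-language-mode-alist
  '((:js . js-ts-mode)
    (:ruby . ruby-mode))
  "Hypothetical alist mapping a language keyword to the preferred mode.")

(defun my-preferred-mode-for-language (lang)
  "Return the major mode the user prefers for language keyword LANG."
  (alist-get lang my-language-mode-alist))

(defun my-language-for-mode (mode)
  "Return the language keyword whose preferred mode is MODE, or nil.
This is the reverse search over the same alist."
  (car (rassq mode my-language-mode-alist)))
```

A caller such as a Markdown fontifier could then ask for the preferred mode of ':js' instead of guessing between 'js-mode' and 'js-ts-mode'.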

João




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Mon, 15 Jan 2024 18:33:02 GMT) Full text and rfc822 format available.

Message #310 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 68246 <at> debbugs.gnu.org, casouri <at> gmail.com, joaotavora <at> gmail.com,
 monnier <at> iro.umontreal.ca, stefankangas <at> gmail.com
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Mon, 15 Jan 2024 20:32:27 +0200
On 15/01/2024 14:46, Eli Zaretskii wrote:
>> Date: Mon, 15 Jan 2024 04:10:16 +0200
>> Cc: 68246 <at> debbugs.gnu.org, casouri <at> gmail.com, monnier <at> iro.umontreal.ca
>> From: Dmitry Gutov <dmitry <at> gutov.dev>
>>
>>>> However it is not easy to quantify confused users looking to understand
>>>> the new meaning of things in dir-locals.el.  Or users wondering why they
>>>> need to set Eglot variables in both 'c++-mode-hook' and
>>>> 'c++-ts-mode-hook' when all they see is 'c++-mode' in
>>>> 'eglot-server-programs'.
>>>
>>> Those users will hopefully submit bug reports or otherwise complain on
>>> the Emacs mailing lists, and then we will know.
>>
>> I rather wouldn't rely on that.
> 
> Why not?  The decisions we made are not arbitrary.  Given the best
> consensus (or lack thereof) we could arrive upon after careful
> consideration of the issues, it is perfectly fine to expect feedback
> to set us straight if we made a mistake.

Because the corresponding downsides are already known. They are not 
catastrophic ones, however, and as such they won't be critical in 
day-to-day usage (prompting fewer users to bother with bug reports).

And if one builds a chair with 5 legs, it will likely be not as 
convenient to use, but not many people will go and tell the author that 
the chair has an extra leg - it's obviously intentional.

Nor are we quick to change our minds based on such feedback, as bug#61177
and bug#61177 demonstrate.

>>> The recommendation is to use base modes where it makes sense, and the
>>> installed changes around derived-mode-add-parents don't in any way
>>> preclude having a base mode and don't make it harder.  But I don't
>>> think we should force everyone in this situation to invent a base mode
>>> as the sole means for solving this.
>>
>> It's not like we don't have an existing solution for this: if there are
>> two different modes to configure, change the settings for both modes, or
>> alter two hooks.
> 
> This doesn't solve the problem at hand, since the differences between
> the modes are not limited to these simple aspects.

I don't understand your response.

The original description says:

  packages tend to behave poorly because they do not understand that a
  buffer in `js-ts-mode` contains Javascript

Presumably, a call like (derived-mode-p 'js-mode) fails. The packages 
can change it to (derived-mode-p '(js-mode js-ts-mode)), and it will 
succeed. Yes, it's a bit more work, but they will have to do it anyway 
to support Emacs 29.1 for a number of years.
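
A minimal sketch of such a check, assuming a hypothetical helper name
`my-js-buffer-p`.  It passes the modes as separate arguments, which Emacs 29
accepts as well; the single-list form appears to require Emacs 30 or later,
where the multi-argument form still works but is reported as deprecated.

```elisp
;; `my-js-buffer-p' is a hypothetical name used only for illustration.
(defun my-js-buffer-p (&optional buffer)
  "Return non-nil if BUFFER (default: current buffer) holds JavaScript.
Both the classic and the tree-sitter mode are named explicitly, so no
inheritance between them is assumed."
  (with-current-buffer (or buffer (current-buffer))
    ;; Separate arguments rather than one list, for Emacs 29 compatibility.
    (derived-mode-p 'js-mode 'js-ts-mode)))
```

A package would call `(my-js-buffer-p)` wherever it previously called
`(derived-mode-p 'js-mode)`.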

>> Less magical and more verbose, but being explicit can
>> be good.
>>
>>>> 2. Explicitly associating some major modes with languages or file types.
>>>>      This doesn't seem hard and other further uses like suggesting modes
>>>>      or packages to a new user based on languages have been proposed.
>>>
>>> This is IMO a heavier and more thorough change, especially since Emacs
>>> doesn't have the notion of "language".  This discussion shows that its
>>> advantages are not evident, and moreover we don't even have a clear
>>> shared view what will that entail.
>>
>> Here's a draft patch of how a "language" could work. It doesn't alter
>> every entry, but it is backward compatible.
> 
> Like I said: it is heavier, so we should only do that if the simpler
> method doesn't work well enough.  So thanks, but let's try the existing
> simpler solution first and see if we need something better.

Indeed it's heavier because it's not just a fix, but a whole new 
feature. I suggest people try it out and see how they like it.

If not, well...




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Mon, 15 Jan 2024 18:53:02 GMT) Full text and rfc822 format available.

Message #313 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: 68246 <at> debbugs.gnu.org, casouri <at> gmail.com, joaotavora <at> gmail.com,
 monnier <at> iro.umontreal.ca, stefankangas <at> gmail.com
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Mon, 15 Jan 2024 20:52:30 +0200
> Date: Mon, 15 Jan 2024 20:32:27 +0200
> Cc: joaotavora <at> gmail.com, stefankangas <at> gmail.com, 68246 <at> debbugs.gnu.org,
>  casouri <at> gmail.com, monnier <at> iro.umontreal.ca
> From: Dmitry Gutov <dmitry <at> gutov.dev>
> 
> >>> Those users will hopefully submit bug reports or otherwise complain on
> >>> the Emacs mailing lists, and then we will know.
> >>
> >> I rather wouldn't rely on that.
> > 
> > Why not?  The decisions we made are not arbitrary.  Given the best
> > consensus (or lack thereof) we could arrive upon after careful
> > consideration of the issues, it is perfectly fine to expect feedback
> > to set us straight if we made a mistake.
> 
> Because the corresponding downsides are already known.

And so are the advantages.  And from where I stand, the advantages
outweigh the known downsides.

> They are not catastrophic ones, however, and as such they won't be
> critical in day-to-day usage (prompting fewer users to bother with
> bug reports).

Once again: we should try the simpler, light-weight solution before we
go to more complex ones.

> And if one builds a chair with 5 legs

We don't build a chair with 5 legs, so the analogy misses the point.

> Nor are we quick to change our mind based on such feedback, as bug#61177
> and bug#61177 demonstrate.

We are as quick as possible.  You are welcome to step up as an Emacs
maintainer and improve these aspects if you can.

> >>> The recommendation is to use base modes where it makes sense, and the
> >>> installed changes around derived-mode-add-parents don't in any way
> >>> preclude having a base mode and don't make it harder.  But I don't
> >>> think we should force everyone in this situation to invent a base mode
> >>> as the sole means for solving this.
> >>
> >> It's not like we don't have an existing solution for this: if there are
> >> two different modes to configure, change the settings for both modes, or
> >> alter two hooks.
> > 
> > This doesn't solve the problem at hand, since the differences between
> > the modes are not limited to these simple aspects.
> 
> I don't understand your response.

Then you will have to re-read all the discussions which led to
separate modes, and re-discover the problems we discovered almost a
year ago.  The current arrangement was not an accident and did not
happen by default.  It is the best arrangement we could come up with
given the differences between the TS and non-TS modes.  Where the
differences are relatively small, we either have a significant base
mode or something similar.  But in several important cases that is
impractical, so we don't.

> The original description says:
> 
>    packages tend to behave poorly because they do not understand that a
>    buffer in `js-ts-mode` contains Javascript

A major mode doesn't need to "understand" anything, it needs to
support users as the users expect.

> >> Here's a draft patch of how a "language" could work. It doesn't alter
> >> every entry, but it is backward compatible.
> > 
> > Like I said: it is heavier, so we should only do that if the simpler
> > method doesn't work well enough.  So thanks, but let's try the existing
> > simpler solution first and see if we need something better.
> 
> Indeed it's heavier because it's not just a fix, but a whole new 
> feature. I suggest people try it out and see how they like it.

I think such a feature is unjustified by what we know about the
problem.  If we don't know enough, we will learn soon enough, and will
then be in a position to make informed decisions, unlike now that we
are arguing about issues we don't yet have enough experience about to
be able to discuss usefully and effectively.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Mon, 15 Jan 2024 20:19:01 GMT) Full text and rfc822 format available.

Message #316 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 68246 <at> debbugs.gnu.org, casouri <at> gmail.com, joaotavora <at> gmail.com,
 monnier <at> iro.umontreal.ca, stefankangas <at> gmail.com
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Mon, 15 Jan 2024 22:17:56 +0200
On 15/01/2024 20:52, Eli Zaretskii wrote:

>> They are not catastrophic ones, however, and as such they won't be
>> critical in day-to-day usage (prompting fewer users to bother with
>> bug reports).
> 
> Once again: we should try the simpler, light-weight solution before we
> go to more complex ones.

A change in inheritance chain is a pretty heavy/complex solution in my book.

>> And if one builds a chair with 5 legs
> 
> We don't build a chair with 5 legs, so the analogy misses the point.

When a certain behavior is obviously intended, you don't always report
it as a bug. Sometimes you sigh deeply and either continue using the tool,

>> Nor are we quick to change our mind based on such feedback, as bug#61177
>> and bug#61177 demonstrate.
> 
> We are as quick as possible.  You are welcome to step up as an Emacs
> maintainer and improve these aspects if you can.

I didn't mean that the speed was the issue here. The bugs above are 
closed as "wontfix", rather than remaining in-progress.

>>>>> The recommendation is to use base modes where it makes sense, and the
>>>>> installed changes around derived-mode-add-parents don't in any way
>>>>> preclude having a base mode and don't make it harder.  But I don't
>>>>> think we should force everyone in this situation to invent a base mode
>>>>> as the sole means for solving this.
>>>>
>>>> It's not like we don't have an existing solution for this: if there are
>>>> two different modes to configure, change the settings for both modes, or
>>>> alter two hooks.
>>>
>>> This doesn't solve the problem at hand, since the differences between
>>> the modes are not limited to these simple aspects.
>>
>> I don't understand your response.
> 
> Then you will have to re-read all the discussions which led to
> separate modes, and re-discover the problems we discovered almost a
> year ago.  The current arrangement was not an accident and did not
> happen by default.  It is the best arrangement we could come up with
> given the differences between the TS and non-TS modes.  Where the
> differences are relatively small, we either have a significant base
> mode or something similar.  But in several important cases that is
> impractical, so we don't.

Perhaps I should clarify that my preferred alternative solution is not 
base modes, but "keep things as is". The same current arrangement you 
mention.

>> The original description says:
>>
>>     packages tend to behave poorly because they do not understand that a
>>     buffer in `js-ts-mode` contains Javascript
> 
> A major mode doesn't need to "understand" anything, it needs to
> support users as the users expect.

It would be best to address the technical approach I mentioned rather
than pick at the phrasing in a sentence that doesn't belong to me.

>>>> Here's a draft patch of how a "language" could work. It doesn't alter
>>>> every entry, but it is backward compatible.
>>>
>>> Like I said: it is heavier, so we should only do that if the simpler
>>> method doesn't work well enough.  So thanks, but let's try the existing
>>> simpler solution first and see if we need something better.
>>
>> Indeed it's heavier because it's not just a fix, but a whole new
>> feature. I suggest people try it out and see how they like it.
> 
> I think such a feature is unjustified by what we know about the
> problem.  If we don't know enough, we will learn soon enough, and will
> then be in a position to make informed decisions, unlike now that we
> are arguing about issues we don't yet have enough experience about to
> be able to discuss usefully and effectively.

It doesn't seem to me like that experiment is easy to reverse.

And I'm not sure what is unclear about the problem, to require 
additional data.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Mon, 15 Jan 2024 20:29:02 GMT) Full text and rfc822 format available.

Message #319 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: 68246 <at> debbugs.gnu.org, casouri <at> gmail.com, joaotavora <at> gmail.com,
 monnier <at> iro.umontreal.ca, stefankangas <at> gmail.com
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Mon, 15 Jan 2024 22:27:44 +0200
> Date: Mon, 15 Jan 2024 22:17:56 +0200
> Cc: joaotavora <at> gmail.com, stefankangas <at> gmail.com, 68246 <at> debbugs.gnu.org,
>  casouri <at> gmail.com, monnier <at> iro.umontreal.ca
> From: Dmitry Gutov <dmitry <at> gutov.dev>
> 
> On 15/01/2024 20:52, Eli Zaretskii wrote:
> 
> >> They are not catastrophic ones, however, and as such they won't be
> >> critical in day-to-day usage (prompting fewer users to bother with
> >> bug reports).
> > 
> > Once again: we should try the simpler, light-weight solution before we
> > go to more complex ones.
> 
> A change in inheritance chain is a pretty heavy/complex solution in my book.

Stefan's addition doesn't change the inheritance chain.

> >> Nor are we quick to change our mind based on such feedback, as bug#61177
> >> and bug#61177 demonstrate.
> > 
> > We are as quick as possible.  You are welcome to step up as an Emacs
> > maintainer and improve these aspects if you can.
> 
> I didn't mean that the speed was the issue here. The bugs above are 
> closed as "wontfix", rather than remaining in-progress.

Same answer.

> > I think such a feature is unjustified by what we know about the
> > problem.  If we don't know enough, we will learn soon enough, and will
> > then be in a position to make informed decisions, unlike now that we
> > are arguing about issues we don't yet have enough experience about to
> > be able to discuss usefully and effectively.
> 
> It doesn't seem to me like that experiment is easy to reverse.

If there are good reasons, we will.

> And I'm not sure what is unclear about the problem, to require 
> additional data.

Additional downsides, if any.  If there are none, then we already have
a solution that I consider satisfactory.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Mon, 15 Jan 2024 20:52:02 GMT) Full text and rfc822 format available.

Message #322 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: João Távora <joaotavora <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, casouri <at> gmail.com,
 Stefan Kangas <stefankangas <at> gmail.com>, monnier <at> iro.umontreal.ca
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Mon, 15 Jan 2024 22:51:32 +0200
On 15/01/2024 17:27, João Távora wrote:
> On Mon, Jan 15, 2024 at 2:10 AM Dmitry Gutov <dmitry <at> gutov.dev> wrote:
>>
> 
>> It's not like we don't have an existing solution for this: if there are
>> two different modes to configure, change the settings for both modes, or
>> alter two hooks. Less magical and more verbose, but being explicit can
>> be good.
> 
> I don't think there's anything magical about a base mode.  But I like your
> solution too.

As "magical", I meant the original patch for this report. I wouldn't 
mind the "base mode" approach, but I guess it still suffers from not
being suitable for using with earlier Emacs versions.

And every programming mode will have to come with -base-mode defined, 
otherwise we'll have to revisit this every time a new third-party 
-ts-mode appears.

>> Here's a draft patch of how a "language" could work. It doesn't alter
>> every entry, but it is backward compatible.
> 
> I think something like this can work, yes.
> 
> -      (funcall (alist-get mode major-mode-remap-alist mode))
> +      ;; XXX: When there's no mapping for `:<language>', we could also
> +      ;; look for a function called `<language>-mode'.
> +      (funcall (alist-get mode major-mode-remap-alist (if (keywordp mode)
> +                                                          #'fundamental-mode
> +                                                        mode)))
> +      (when (keywordp mode)             ;Perhaps do that unconditionally.
> +        (run-hooks (intern (format "%s-language-hook" (buffer-language)))))
>         (unless (eq mode major-mode)
> 
> Regarding the "XXX", this is basically the same questions in the
> two headings, I think, which is whether to consider the <foo> in existing
> <foo>-mode as language.  I think we can do it yes.  Eglot and other
> packages [*] have been doing it for quite some time.  It will fail very
> rarely, only for major modes outside Emacs (like "tuareg-mode" for
> Ocaml) and we can probably fix that in-tree.

It's a choice between embedding the implicit logic here, or returning
nil and allowing the callers to do their own fallbacks. I'm not sure,
personally, which is better. One might be more convenient, but the other
is stricter, possibly leading to fewer defects.

This choice is coupled with the corresponding logic in 'buffer-language' 
(whether to keep the replace-regexp-in-string branch).

> The only thing that leaves me some doubts is the 'set-buffer-language'
> entry point.  It's a new thing not strictly required.  Normally the
> databases are edited (via whatever means) and then the buffer is
> reverted for a mode change.  So I don't think we need to introduce
> this user convenience just yet (though, like the other user conveniences
> you have imagined, I'm not necessarily opposed to it).

I was thinking of what would be required to make "language" a 
first-class entity, so that users could interact with them instead of 
major modes. To prove the validity of the feature.

Because it's something that people do (I, at least): invoke the major 
mode to choose a different language/content-type for a buffer (one not 
visiting a file yet, perhaps). And if we have an abstraction over 
mmodes, it would make sense to use it.

The interactive behavior of set-buffer-language seems to justify it, I 
think: when you try to enable a major mode, you get all the session's 
functions as completions. But 'M-x set-buffer-language TAB' gives you a 
neat and tidy list of known languages, it's a tangible improvement.

> Also 'buffer-language' could be 'get-mode-language',  so you don't
> have to have an actual buffer handy to get this association.  The
> implementation would just be a reverse search in major-mode-remap-alist

This makes it dependent on the major mode already being applied. Which 
is a valid strategy, but then it can't be used in the last two features 
from my list ("Further possible additions") - they're about the case 
when there is no major mode defined.

> Other than that, I think the solution is workable.
> 
> The other package [*] that does exactly the same thing as Eglot, and
> invented it independently is markdown-mode:
> 
> (defun markdown-get-lang-mode (lang)
>    "Return major mode that should be used for LANG.
> LANG is a string, and the returned major mode is a symbol."
>    (cl-find-if
>     #'markdown--lang-mode-predicate
>     (nconc (list (cdr (assoc lang markdown-code-lang-modes))
>                  (cdr (assoc (downcase lang) markdown-code-lang-modes)))
>            (and (fboundp 'treesit-language-available-p)
>                 (list (and (treesit-language-available-p (intern lang))
>                            (intern (concat lang "-ts-mode")))
>                       (and (treesit-language-available-p (intern (downcase lang)))
>                            (intern (concat (downcase lang) "-ts-mode")))))
>            (list
>             (intern (concat lang "-mode"))
>             (intern (concat (downcase lang) "-mode"))))))
> 
> It uses this to know what major-mode to use to fontify GitHub style
> markdown code blocks (which have a little language cookie after the
> three backticks).  Like Eglot's similar code, I think it could be trivially
> rewritten if something like your patch were in place.

Yup, it could use the entries in major-mode-remap-alist.

> Bug#68217 is also relevant here.  Eglot calls into markdown-mode.el
> to fontify LSP documentation snippets and sometimes the mode picked
> by markdown-mode.el to do the fontification is not the same the user
> is using for the buffer.  It most clearly should be.  So
> get-language-for-mode and get-(preferred)-mode-for-language are two
> evidently needed helpers.

Are there specific uses for get-mode-for-language when there is no 
existing buffer? Hmm, I suppose it could be the better option when 
either the current buffer is not visiting a file (but you want to have 
code completion in it anyway), or the user switched to a different major 
mode explicitly, one that does not correspond to buffer-file-name in the 
current configuration.

We could have both functions: buffer-language and get-language-for-mode 
('get-mode-language'?). Or define one initially and add the other as needed.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Mon, 15 Jan 2024 23:13:02 GMT) Full text and rfc822 format available.

Message #325 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, casouri <at> gmail.com,
 Stefan Kangas <stefankangas <at> gmail.com>, monnier <at> iro.umontreal.ca
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Mon, 15 Jan 2024 23:11:54 +0000
On Mon, Jan 15, 2024 at 8:51 PM Dmitry Gutov <dmitry <at> gutov.dev> wrote:

> > I don't think there's anything magical about a base mode.  But I like your
> > solution too.
>
> As "magical", I meant the original patch for this report. I wouldn't
> mind the "base mode" approach, but I guess it still suffers from not
> being suitable for using with earlier Emacs versions.
>
> And every programming mode will have to come with -base-mode defined,

We could define them all in batch in a macro, that's not too bad.  And
then let the existing fleshed out ones overwrite those definitions by
making sure to load them later.

The main advantages of the foo-base-mode approach are that:

* it is easily grokkable by everybody, as it is very simply based on
  simple inheritance, which everybody knows already.

* there's already a fair number of such modes in the tree.
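
The inheritance arrangement described in the two points above can be sketched
as follows.  Every `demo-` name is hypothetical and stands in for a real
language; the counter exists only to make the shared hook observable.

```elisp
;; All `demo-' names are hypothetical and exist only for illustration.
(defvar demo-setup-count 0
  "Number of times the shared hook function has run.")

(define-derived-mode demo-base-mode prog-mode "Demo-Base"
  "Abstract parent holding what both Demo modes have in common.")

(define-derived-mode demo-mode demo-base-mode "Demo"
  "Classic major mode for the Demo language.")

(define-derived-mode demo-ts-mode demo-base-mode "Demo[TS]"
  "Tree-sitter major mode for the Demo language.")

;; A single hook on the base mode runs for both variants, and a single
;; (derived-mode-p 'demo-base-mode) test recognizes both.
(add-hook 'demo-base-mode-hook
          (lambda () (setq demo-setup-count (1+ demo-setup-count))))
```

With this in place, user settings go in `demo-base-mode-hook` once instead of
being duplicated in `demo-mode-hook` and `demo-ts-mode-hook`.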

But I do like your patch better, it seems pretty useful to introduce the
language concept, as it solves this and more problems more cleanly.  So
let's see where that goes.

> This choice is coupled with the corresponding logic in 'buffer-language'
> (whether to keep the replace-regexp-in-string branch).

Yes.  I think we should err on the side of convenience.  What exactly are
the defects we can get?  I can't see anything else but the tuareg-mode, and we
can plug that on our side.  Maybe you can see more.

> Are there specific uses for get-mode-for-language when there is no
> existing buffer?

Yes, I'd say this markdown-mode use is exactly that.  Markdown inserts
some text into a buffer and all it knows is the language it's supposed
to fontify it with.  The major mode has that logic, so it must invoke
the correct (and preferred) major-mode function.

Another use is allowing the user to choose major modes for languages,
say from a tutorial or wizard or at Emacs startup.  Say, I prefer
ruby-ts-mode for Ruby, but c++-mode for C++.  It'd be helpful to summarize
those preferences.

> We could have both functions: buffer-language and get-language-for-mode
> ('get-mode-language'?). Or define one initially and add the other as needed.

Yes.  buffer-language isn't bad, it's a useful helper.  But buffer-language
should be just

   (with-current-buffer buffer (get-language-for-mode major-mode))

Right?  Modulo some caching if it turns out to be very inefficient
(which I really doubt)
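
A minimal sketch of that layering, under the assumption that the
mode-to-language association lives in a plain alist.  The variable and both
function names are hypothetical and carry a `demo-` prefix so they are not
mistaken for the names used in the draft patch.

```elisp
;; Hypothetical names throughout; the draft patch may be organized differently.
(defvar demo-mode-languages
  '((js-mode . :javascript) (js-ts-mode . :javascript)
    (c++-mode . :c++) (c++-ts-mode . :c++)
    (tuareg-mode . :ocaml))
  "Alist mapping major-mode symbols to language keywords.")

(defun demo-get-language-for-mode (mode)
  "Return the language keyword recorded for MODE, or nil if there is none."
  (alist-get mode demo-mode-languages))

(defun demo-buffer-language (&optional buffer)
  "Return the language of BUFFER (default: current buffer), or nil.
This is only a thin wrapper around `demo-get-language-for-mode'."
  (with-current-buffer (or buffer (current-buffer))
    (demo-get-language-for-mode major-mode)))
```

The mode-level function needs no buffer at all, while the buffer-level one
depends on a major mode already being active in the buffer.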

João




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Tue, 16 Jan 2024 02:10:02 GMT) Full text and rfc822 format available.

Message #328 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: João Távora <joaotavora <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, casouri <at> gmail.com,
 Stefan Kangas <stefankangas <at> gmail.com>, monnier <at> iro.umontreal.ca
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Tue, 16 Jan 2024 04:09:07 +0200
On 16/01/2024 01:11, João Távora wrote:
> On Mon, Jan 15, 2024 at 8:51 PM Dmitry Gutov <dmitry <at> gutov.dev> wrote:
> 
>>> I don't think there's anything magical about a base mode.  But I like your
>>> solution too.
>>
>> As "magical", I meant the original patch for this report. I wouldn't
>> mind the "base mode" approach, but I guess it still suffers from not
>> being suitable for using with earlier Emacs versions.
>>
>> And every programming mode will have to come with -base-mode defined,
> 
> We could define them all in batch in a macro, that's not too bad.  And
> then let the existing fleshed out ones overwrite those definitions by
> making sure to load them later.

A keyword for define-derived-mode like (:base t)? That would work.

> The main advantages of the foo-base-mode approach are that:
> 
> * it is easily grokkable by everybody, as it is very simply based on
>    simple inheritance, which everybody knows already.
> 
> * there's already a fair number of such modes in the tree.

Agree.

I guess the part I don't quite like is adding a lot more new -base- 
modes (we'll have to add one for every prog mode, at least), most of 
which would stay unused, but unlike hook variables, clutter the function 
namespace.

> But I do like your patch better, it seems pretty useful to introduce the
> language concept, as it solves this and more problems more cleanly.  So
> let's see where that goes.

Great.

>> This choice is coupled with the corresponding logic in 'buffer-language'
>> (whether to keep the replace-regexp-in-string branch).
> 
> Yes.  I think we should err on the side of convenience.  What exactly are
> the defects we can get?  I can't see anything else but the tuareg-mode, and we
> can plug that on our side.  Maybe you can see more.

For example, it would sometimes return ugly non-existent languages like 
:help-fns--edit-value, :org-lint--report or :xref--xref-buffer.

In most cases that would be harmless, but OTOH the callers would miss 
out on the opportunity to see that the language is nil and apply their 
own fallbacks right away. I don't have a real problem scenario in mind, 
though.

Perhaps some commands that would act on the language of the current 
buffer might want to say "no language is associated", but could not with 
the "convenience" approach.
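
The trade-off can be made concrete with a small sketch.  Both functions and
the GUESS flag are hypothetical; the name-stripping rule is only an assumption
about what the "convenience" fallback would do.

```elisp
;; Hypothetical helpers; the fallback rule is an assumption for illustration.
(defun demo-guess-language (mode)
  "Guess a language keyword from the name of MODE, or return nil.
A trailing \"-mode\" or \"-ts-mode\" is stripped from the mode name."
  (let ((name (symbol-name mode)))
    (when (string-match "\\(?:-ts\\)?-mode\\'" name)
      (intern (concat ":" (substring name 0 (match-beginning 0)))))))

(defun demo-language-for-mode (mode known &optional guess)
  "Return the language of MODE according to the alist KNOWN.
With GUESS non-nil, fall back to `demo-guess-language'.  Otherwise
return nil, so that callers can apply their own fallback."
  (or (alist-get mode known)
      (and guess (demo-guess-language mode))))
```

With GUESS set, `(demo-language-for-mode 'xref--xref-buffer-mode nil t)`
yields `:xref--xref-buffer`; without it the result is nil and a command could
report that no language is associated.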

>> Are there specific uses for get-mode-for-language when there is no
>> existing buffer?
> 
> Yes, I'd say this markdown-mode use is exactly that.  Markdown inserts
> some text into a buffer and all it knows is the language it's supposed
> to fontify it with.  The major mode has that logic, so it must invoke
> the correct (and preferred) major-mode function.

Sorry, I meant get-language-for-mode (which is the one implemented as 
buffer-language currently).

> Another use is allowing the user to choose major modes for languages,
> say from a tutorial or wizard or at Emacs startup.  Say, I prefer
> ruby-ts-mode for Ruby, but c++-mode for C++.  It'd be helpful to summarize
> those preferences.

This would require capabilities like "get all modes for a language" (not 
one of the set of functions mentioned so far, and it'll need a full scan 
of major-mode-remap-alist) and "get current mode for a language" (this 
one matches markdown-mode's function you posted).
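
As a rough illustration of the "get all modes for a language" capability, the
following sketch scans a table of (MODE . LANGUAGE) pairs.  The function name
and the table shape are hypothetical; how the entries would actually be stored
in major-mode-remap-alist is left open here.

```elisp
;; Hypothetical helper; TABLE is an assumed alist of (MODE . LANGUAGE) pairs.
(defun demo-modes-for-language (language table)
  "Return every mode in TABLE associated with LANGUAGE, in table order.
The whole of TABLE is scanned, since several modes may share a language."
  (let (modes)
    (dolist (entry table (nreverse modes))
      (when (eq (cdr entry) language)
        (push (car entry) modes)))))
```

For a table containing both `ruby-mode` and `ruby-ts-mode` under `:ruby`, the
result lists both, which is what a preferences summary would need.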

BTW, get-current-mode-for-language could be implemented in terms of 
set-buffer-language.

>> We could have both functions: buffer-language and get-language-for-mode
>> ('get-mode-language'?). Or define one initially and add the other as needed.
> 
> Yes.  buffer-language isn't bad, it's a useful helper.  But buffer-language
> should be just
> 
>     (with-current-buffer buffer (get-language-for-mode major-mode))
> 
> Right?  Modulo some caching if it turns out to be very inefficient
> (which I really doubt)

Again: this won't work for files where no suitable major mode is 
available (e.g. not installed yet).




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Tue, 16 Jan 2024 02:33:02 GMT) Full text and rfc822 format available.

Message #331 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Stefan Kangas <stefankangas <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, casouri <at> gmail.com,
 joaotavora <at> gmail.com
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Mon, 15 Jan 2024 21:32:22 -0500
>> Please don't call it "language".  That'd be confusing.  LSP is about
>> programming languages, so "language" is natural there.  But in Emacs,
>> a major mode is more general than that.  For example, it is not
>> unthinkable to consider mail-mode to be the extra-parent of
>> message-mode (or vice versa) -- but what is the "language" in that
>> case?
> Isn't the language for such modes in this paradigm just the empty set?

I'm not too worried about those cases, indeed.
I'm more worried about the taxonomy of languages.
We currently have the taxonomy of major modes, with which we're pretty
familiar, and we've had many years to learn about its downsides,
complexity, as well as how to deal with them, but for languages we're
only familiar with the easy cases, which makes us judge the idea in
a way that may prove naive.

IME, deciding what is the type of the content of a buffer is usually
trivial but with some notable caveats, such as XPM or Postscript files,
or "container formats" (like `.deb` or `.odt`, as well as things like
DocBook which can be considered either as their own format or as XML),
or "sublanguages" such as C being a subset of C++, or Javascript being
a subset of Typescript.  And I suspect the info we need will not always
be quite the same.

So while there might be a good case to be made to add some API functions
to query the language/type(s) of a given buffer (I'm not sure we'd need
the language of a given major mode, OTOH), or to find the preferred
mode(s) for a given language/type, I think it's worthwhile to try and
tweak our major mode taxonomy because it is information we must have
and information we know we will always have, so we should strive to make
it as good as we can.

It shouldn't make it any harder to add language/type API functionality.
On the contrary it should make it easier.

[ As suggested elsewhere in this thread, we could even try and merge
  those taxonomies, e.g. using extra parents of the form `LANG-lang`.  ]

As I said at the very beginning of this long thread, I'm not completely
sure how well my proposal will play out: the upsides are in plain sight,
but it may bump into real problems.  [ I'm actually surprised by Eli's
optimism about it 🙂 ]
But we won't know until we try it.


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Tue, 16 Jan 2024 02:36:02 GMT) Full text and rfc822 format available.

Message #334 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Stefan Kangas <stefankangas <at> gmail.com>
Cc: João Távora <joaotavora <at> gmail.com>,
 Dmitry Gutov <dmitry <at> gutov.dev>, Eli Zaretskii <eliz <at> gnu.org>,
 casouri <at> gmail.com, 68246 <at> debbugs.gnu.org
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Mon, 15 Jan 2024 21:35:28 -0500
> Basically, the biggest weakness of Stefan M's solution is the biggest
> strength of João's and vice versa: "backwards-compatibility" (if we can
> call it that) vs "clean taxonomy".

While a fresh new taxonomy could definitely be cleaner, seeing how it
can be designed with 40 years of hindsight, I believe "clean taxonomy"
is an oxymoron.


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Tue, 16 Jan 2024 10:35:01 GMT) Full text and rfc822 format available.

Message #337 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Dmitry Gutov <dmitry <at> gutov.dev>, Eli Zaretskii <eliz <at> gnu.org>,
 casouri <at> gmail.com, Stefan Kangas <stefankangas <at> gmail.com>,
 68246 <at> debbugs.gnu.org
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Tue, 16 Jan 2024 10:34:16 +0000
On Tue, Jan 16, 2024 at 2:35 AM Stefan Monnier <monnier <at> iro.umontreal.ca> wrote:
>
> > Basically, the biggest weakness of Stefan M's solution is the biggest
> > strength of João's and vice versa: "backwards-compatibility" (if we can
> > call it that) vs "clean taxonomy".
>
> While a fresh new taxonomy could definitely be cleaner, seeing how it
> can be designed with 40 years of hindsight, I believe "clean taxonomy"
> is an oxymoron.

"clean taxonomy" is most definitely an oxymoron :-)  But is this really a
"taxonomy"?  I see no real categorization or classification.  I just see
a many-to-one mapping of major modes to languages.  I don't see any
conceptual wrinkles, can you point to one?  Can you point to some
concrete major modes inheriting from prog-mode where it's not easy
to give a reasonable answer to the question: what language is this
major mode for?

I don't think it's a problem if we have to choose between "JS" and
"JavaScript". Just pick one, and adapt to external systems if needed
at the boundaries.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Tue, 16 Jan 2024 11:07:01 GMT) Full text and rfc822 format available.

Message #340 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, casouri <at> gmail.com,
 Stefan Kangas <stefankangas <at> gmail.com>, monnier <at> iro.umontreal.ca
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Tue, 16 Jan 2024 11:06:25 +0000
On Tue, Jan 16, 2024 at 2:09 AM Dmitry Gutov <dmitry <at> gutov.dev> wrote:

> > We could define them all in batch in a macro, that's not too bad.  And
> > then let the existing fleshed out ones overwrite those definitions by
> > making sure to load them later.
>
> A keyword for define-derived-mode like (:base t)? That would work.

I think that would just clash with the existing PARENT one (when would
one of those bases _not_ be the parent?).  For base modes, I think
the current approach is mostly fine.  There's something we can do
for the existing ones (and future ones) though, which is to explicitly
mark them abstract somehow.

> > The main advantages of the foo-base-mode approach is that:
> >
> > * it is easily grokkable by everybody, as it is very simply based on
> >    simple inheritance, which everybody knows already.
> >
> > * there's already a fair number of such modes in the tree.
>
> Agree.
>
> I guess the part I don't quite like is adding a lot more new -base-
> modes (we'll have to add one for every prog mode, at least), most of
> which would stay unused, but unlike hook variables, clutter the function
> namespace.

Agree.  And this is why I'm not crazy about the solution either.  But as
to cluttering the function namespace we could say that (:abstract t) modes
do _not_ generate a function (or do not generate them in the public
namespace -- as I think the function still has to exist for any
concrete submodes down the line to call it).

> > But I do like your patch better, it seems pretty useful to introduce the
> > language concept, as it solves this and more problems more cleanly.  So
> > let's see where that goes.
>
> Great.
>
> >> This choice is coupled with the corresponding logic in 'buffer-language'
> >> (whether to keep the replace-regexp-in-string branch).
> >
> > Yes.  I think we should err on the side of convenience.  What exactly are
> > the defects we can get?  I can't see anything else but the tuareg-mode, and we
> > can plug that on our side.  Maybe you can see more.
>
> For example, it would sometimes return ugly non-existent languages like
> :help-fns--edit-value, :org-lint--report or :xref--xref-buffer.

What if we filter by prog-mode?  It would leave the ':ruby-base' and
':python-base' as false positives, I guess.  But then we could reasonably
say that anything ending with '-base' is abstract (or use the
aforementioned  explicit abstract prop).

It would also make ':lisp-data' a language.  But that's not bad.
lisp-data-mode is actually a useful concrete prog-mode derivative,
so I think it's OK to have ':lisp-data' as a language.

We can then have exceptions for some notable cases.  'lisp-mode' is,
as we know, for Common Lisp only.
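
As a rough illustration of the "filter by prog-mode" heuristic discussed
here, the sketch below derives a keyword from a mode name, rejects
anything not derived from prog-mode, and treats names ending in '-base'
as abstract.  The function name and the demo modes are hypothetical;
nothing like this exists in Emacs, and the exceptions mentioned above
(tuareg-mode, lisp-mode) are not handled.

```elisp
(defun my-guess-language-for-mode (mode)
  "Guess a language keyword for major MODE, or return nil.
Hypothetical helper: only modes derived from `prog-mode' qualify,
and names ending in \"-base\" are treated as abstract."
  (let ((name (symbol-name mode)))
    (and (provided-mode-derived-p mode 'prog-mode)
         ;; Strip a trailing "-mode" or "-ts-mode" from the symbol name.
         (string-match "\\`\\(.+?\\)\\(?:-ts\\)?-mode\\'" name)
         (let ((base (match-string 1 name)))
           (unless (string-suffix-p "-base" base)
             (intern (concat ":" base)))))))

;; Made-up modes, defined only to exercise the helper.
(define-derived-mode my-foo-base-mode prog-mode "Foo-Base")
(define-derived-mode my-foo-mode my-foo-base-mode "Foo")
(define-derived-mode my-foo-ts-mode my-foo-base-mode "Foo[TS]")
```

With these definitions, both 'my-foo-mode' and 'my-foo-ts-mode' map to
':my-foo', while 'my-foo-base-mode' and 'text-mode' yield nil, so a
caller can still apply its own fallback.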

> In most cases that would be harmless, but OTOH the callers would miss
> out on the opportunity to see that the language is nil and apply their
> own fallbacks right away. I don't have a real problem scenario in mind,
> though.

Neither do I, but I agree we should be as accurate as possible.

> Perhaps some commands that would act on the language of the current
> buffer might want to say "no language is associated", but could not with
> the "convenience" approach.

For sure.

> >> Are there specific uses for get-mode-for-language when there is no
> >> existing buffer?
> >
> > Yes, I'd say this markdown-mode use is exactly that.  Markdown inserts
> > some text into a buffer and all it knows is the language it's supposed
> > to fontify it with.  The major mode has that logic, so it must invoke
> > the correct (and preferred) major-mode function.
>
> Sorry, I meant get-language-for-mode (which is the one implemented as
> buffer-language currently).
>
> > Another use is allowing the user to choose major modes for languages,
> > say from a tutorial or wizard or at Emacs startup.  Say, I prefer
> > ruby-ts-mode for Ruby, but c++-mode for C++.  It'd be helpful to summarize
> > those preferences.
>
> This would require capabilities like "get all modes for a language" (not
> one of the set of functions mentioned so far, and it'll need a full scan
> of major-mode-remap-alist) and "get current mode for a language" (this
> one matches markdown-mode's function you posted).

Yes.  I don't see the full scan of m-m-remap-alist as problematic
from an efficiency perspective.  If we decide it's the database, it's
the database.   It's unfortunate that the "alist" implementation is
hardcoded in the name (which is why I would prefer a (:language "Foo")
kwarg to define-derived-mode) but we can abstract the alist aspect
away with accessors and do the usual "Do not change this variable
directly, use these accessors instead".

> BTW, get-current-mode-for-language could be implemented in terms of
> set-buffer-language.

What does get-current-mode-for-language do exactly?

> >> We could have both functions: buffer-language and get-language-for-mode
> >> ('get-mode-language'?). Or define one initially and add the other as needed.
> >
> > Yes.  buffer-language isn't bad, it's a useful helper.  But buffer-language
> > should be just
> >
> >     (with-current-buffer buffer (get-language-for-mode major-mode))
> >
> > Right?  Modulo some caching if it turns out to be very inefficient
> > (which I really doubt)
>
> Again: this won't work for files where no suitable major mode is
> available (e.g. not installed yet).

Right. So maybe

(or (with-current-buffer buffer (get-language-for-mode major-mode))
    (let (kw)
       (and buffer-file-name
            (keywordp (setq kw (alist-get buffer-file-name auto-mode-alist)))
            kw))
    (consult-oracles)
    (error "Don't know what language this is, sorry"))


?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Tue, 16 Jan 2024 17:46:02 GMT) Full text and rfc822 format available.

Message #343 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: João Távora <joaotavora <at> gmail.com>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, casouri <at> gmail.com,
 Stefan Kangas <stefankangas <at> gmail.com>
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Tue, 16 Jan 2024 19:45:17 +0200
On 16/01/2024 12:34, João Távora wrote:
> I see no real categorization or classification.  I just see
> a many-to-one mapping of major modes to languages.

It might even be many-to-many, at least in some cases.

E.g. js-ts-mode being good for both :js and :jsx.
Whereas typescript-ts-mode can work for :js and :typescript but not :jsx
(or :tsx).
tsx-ts-mode is probably okay for :tsx, :jsx and :js but not
:typescript (in general, because of certain clashing syntax).

Not sure how useful this -to-many relation is going to be in the above 
cases, but it's probably a good illustration of the possibility.
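
To make the many-to-many relation concrete, here is a small sketch that
records the mode/language pairs listed above in a plain alist and looks
them up in both directions.  The variable and function names are made up
for illustration and are not part of any patch in this thread.

```elisp
(defvar my-mode-languages
  '((js-ts-mode         . (:js :jsx))
    (typescript-ts-mode . (:js :typescript))
    (tsx-ts-mode        . (:tsx :jsx :js)))
  "Hypothetical table mapping major modes to the languages they handle.")

(defun my-languages-for-mode (mode)
  "Return the list of language keywords that MODE can handle."
  (alist-get mode my-mode-languages))

(defun my-modes-for-language (lang)
  "Return every mode in `my-mode-languages' that can handle LANG."
  (let (modes)
    (dolist (entry my-mode-languages (nreverse modes))
      (when (memq lang (cdr entry))
        (push (car entry) modes)))))
```

Here (my-modes-for-language :jsx) returns both js-ts-mode and
tsx-ts-mode, which is the "-to-many" direction in question.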




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Tue, 16 Jan 2024 22:01:01 GMT) Full text and rfc822 format available.

Message #346 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>,
 Yuan Fu <casouri <at> gmail.com>, Stefan Monnier <monnier <at> iro.umontreal.ca>,
 Stefan Kangas <stefankangas <at> gmail.com>
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Tue, 16 Jan 2024 22:00:35 +0000
On Tue, Jan 16, 2024, 17:45 Dmitry Gutov <dmitry <at> gutov.dev> wrote:
>
> On 16/01/2024 12:34, João Távora wrote:
> > I see no real categorization or classification.  I just see
> > a many-to-one mapping of major modes to languages.
>
> It might even be many-to-many, at least in some cases.
>
> E.g. js-ts-mode being good for both :js and :jsx.

>
> Not sure how useful this -to-many relation is going to be in the above
> cases, but it's probably a good illustration of the possibility.

According to https://react.dev, jsx is a "JavaScript syntax
extension".  So it would seem JSX is a superset of JS.  If
js-ts-mode parses it perfectly, it could be called
jsx-ts-mode instead.

But does it? I see Emacs modes specific for jsx out there, I
suppose people use them for a reason.  There's also tsx-ts-mode
and typescript-ts-mode.

At the end of the day, a language is not so easy to define,
but it's not that problematic either, especially in the editor
(in the compiler, it's much more important).

The best source is a standard, when one exists, but each iteration, and
sometimes each compiler, is also its own language.  There's "GNU C",
"ANSI C", C17, C23.  All handled by the C modes we have and the best
way we have to designate this is just "C".  c++-mode also handles
this code by the way, probably flawlessly, and yet we don't say
c++-mode is for C and C++.

But if you want, I don't think there's any big problem
in making get-language-for-mode return a list, with
the most important likely language at the top.

I predict it'll be pretty rare, but I guess you could
have this (excuse the ugly CamelCase for demo purposes)

(setq auto-mode-alist '(("\\.js$" . :JavaScript)
                        ("\\.jsx$" . :JavaScriptReact)))

(setq m-m-remap-alist '((:JavaScript . js-ts-mode)
                        (:JavaScriptReact . js-ts-mode)))

And 'buffer-language' becomes more like:

(or buffer-overriding-language-keyword
    (with-current-buffer buffer (get-language-for-mode major-mode))
    (let (kw)
       (and buffer-file-name
            (keywordp
              (setq kw
                    ;; yes I know this needs regexps
                    (alist-get buffer-file-name auto-mode-alist)))
            kw))
    (consult-oracles)
    (error "Don't know what language this is, sorry"))
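
The file-name branch above needs regexp matching rather than a plain
alist-get, since the keys of auto-mode-alist are regexps.  A minimal
sketch of that one step, assuming the proposal where alist values may be
language keywords (they are not today); the function name is made up:

```elisp
(defun my-language-from-file-name (file-name alist)
  "Return the language keyword that ALIST associates with FILE-NAME.
ALIST has the shape of `auto-mode-alist': (REGEXP . VALUE).  Only a
keyword VALUE counts as a language; anything else yields nil."
  (let ((value (assoc-default file-name alist #'string-match)))
    (and (keywordp value) value)))

;; Example table in the style of the demo above; hypothetical values.
(defvar my-demo-auto-mode-alist
  '(("\\.js\\'"  . :JavaScript)
    ("\\.jsx\\'" . :JavaScriptReact)
    ("\\.el\\'"  . emacs-lisp-mode)))
```

An entry whose value is an ordinary mode function, like the ".el" one,
simply produces nil here, so the next branch of the `or' gets its turn.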

João




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Tue, 16 Jan 2024 23:31:01 GMT) Full text and rfc822 format available.

Message #349 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, casouri <at> gmail.com,
 Stefan Kangas <stefankangas <at> gmail.com>
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Tue, 16 Jan 2024 23:29:42 +0000
On Tue, Jan 16, 2024 at 2:32 AM Stefan Monnier <monnier <at> iro.umontreal.ca> wrote:

> As I said at the very beginning of this long thread, I'm not completely
> sure how well my proposal will play out: the upsides are in plain sight,
> but it may bump into real problems.  [ I'm actually surprised by Eli's
> optimism about it 🙂 ]
> But we won't know until we try it.

It solves some things (that are already solved anyway).  But I think the
downsides are also in plain sight.  It doesn't solve common problems
in Eglot and Markdown-mode.  It's awkward to explain the hook and
dir-locals situations.  Some assertive docs on what the new foo-mode extra
parent means could make it better though.

I think Dmitry's 5-legged chair analogy is reasonably accurate.  We can
build it easily, yes.  People will sit on it, sure.  But it'll never
go with the furniture or be ergonomic, even if they never bump their
pinky toe in the extra leg.

We have boring old 4-legged chairs readily available (base modes) and
we can think of more elegant chairs.  What do you think of Dmitry's patch?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Wed, 17 Jan 2024 00:04:01 GMT) Full text and rfc822 format available.

Message #352 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: João Távora <joaotavora <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, casouri <at> gmail.com,
 Stefan Kangas <stefankangas <at> gmail.com>
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Tue, 16 Jan 2024 19:02:50 -0500
> It solves some things (that are already solved anyway).  But I think the
> downsides are also in plain sight.  It doesn't solve common problems
> in Eglot and Markdown-mode.

It's not designed to solve all problems.

[ The needs of Markdown-mode are different from those targeted by the
  current bug.  They are of the form "find mode for type", as
  addressed by things like `major-mode-remap-alist`, whereas the current
  bug is about classifying modes.  ]

> We have boring old 4-legged chairs readily available (base modes) and
> we can think of more elegant chairs.  What do you think of Dmitry's patch?

I answered in the email to which your above email replied.


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Wed, 17 Jan 2024 00:50:02 GMT) Full text and rfc822 format available.

Message #355 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, casouri <at> gmail.com,
 Stefan Kangas <stefankangas <at> gmail.com>
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Wed, 17 Jan 2024 00:49:16 +0000
On Wed, Jan 17, 2024 at 12:03 AM Stefan Monnier
<monnier <at> iro.umontreal.ca> wrote:
>
> > It solves some things (that are already solved anyway).  But I think the
> > downsides are also in plain sight.  It doesn't solve common problems
> > in Eglot and Markdown-mode.
>
> It's not designed to solve all problems.

Sure, and fair enough.   But those other problems are real.
Two groups exist, give or take. If we take this solution for just
one, we make solving the other more difficult.

> [ The needs of Markdown-mode are different from those targeted by the
>   current bug.  They are of the form "find mode for type", as
>   addressed by things like `major-mode-remap-alist`, whereas the current
>   bug is about classifying modes.  ]

We should get a holistic solution where we introduce the concept of
language, either explicitly -- Dmitry's patch -- or implicitly --
abstract base modes derived from "prog-mode".  I prefer Dmitry's
patch, but the base mode approach also covers all cases AFAICS.

I don't see the urgency of fixing the problems this patch addresses.
Can we quantify these problems?  What external package is currently
misbehaving so much that it has to be fixed like this and can't
wait for a better solution?  In contrast, bug#68217 points to a real
unsolved problem where discrepant modes may be chosen by the user
and the Markdown package, and there's no easy way to coordinate.

João




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Wed, 17 Jan 2024 02:07:02 GMT) Full text and rfc822 format available.

Message #358 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: João Távora <joaotavora <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>,
 Yuan Fu <casouri <at> gmail.com>, Stefan Monnier <monnier <at> iro.umontreal.ca>,
 Stefan Kangas <stefankangas <at> gmail.com>
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Wed, 17 Jan 2024 04:05:58 +0200
On 17/01/2024 00:00, João Távora wrote:
> On Tue, Jan 16, 2024, 17:45 Dmitry Gutov <dmitry <at> gutov.dev> wrote:
>>
>> On 16/01/2024 12:34, João Távora wrote:
>>> I see no real categorization or classification.  I just see
>>> a many-to-one mapping of major modes to languages.
>>
>> It might even be many-to-many, at least in some cases.
>>
>> E.g. js-ts-mode being good for both :js and :jsx.
> 
>>
>> Not sure how useful this -to-many relation is going to be in the above
>> cases, but it's probably a good illustration of the possibility.
> 
> According to https://react.dev, jsx is a "JavaScript syntax
> extension".  So it would seem JSX is a superset of JS.  If
> js-ts-mode parses it perfectly, it could be called
> jsx-ts-mode instead.

The grammar tree-sitter-js supports JSX. So the mode is called js-ts-mode.

> But does it? I see Emacs modes specific for jsx out there, I
> suppose people use them for a reason.

They are older.

> There's also tsx-ts-mode
> and typescript-ts-mode.

Like I said:

  tsx-ts-mode is probably okay for :tsx, :jsx and :js but not
  :typescript (in general, because of certain clashing syntax).

> At the end of the day, a language is not so easy to define,
> but it's not that problematic either, especially in the editor
> (in the compiler, it's much more important).
> 
> The best source is a standard, when one exists, but each iteration, and
> sometimes each compiler, is also its own language.  There's "GNU C",
> "ANSI C", C17, C23.  All handled by the C modes we have and the best
> way we have to designate this is just "C".  c++-mode also handles
> this code by the way, probably flawlessly, and yet we don't say
> c++-mode is for C and C++.
> 
> But if you want, I don't think there's any big problem
> in making get-language-for-mode return a list, with
> the most important likely language at the top.

It would be a problem if we decide that the major mode function runs the
language-specific hook, and not set-auto-mode-0 like in my patch
(because the major mode function would likely run the hooks for all
supported languages, rather than just the current one).

> I predict it'll be pretty rare, but I guess you could
> have this (excuse the ugly CamelCase for demo purposes)

It might indeed be rare enough for this not to be a problem, and we might
even decide to prohibit such usage, keeping the relation many-to-one.

> (setq auto-mode-alist '(("\\.js$" . :JavaScript)
>                          ("\\.jsx$" . :JavaScriptReact)))
> 
> (setq m-m-remap-alist '((:JavaScript . js-ts-mode)
>                          (:JavaScriptReact . js-ts-mode)))

It indeed should work okay with my proposal, but might be harder to do 
if the languages are inserted as part of the existing modes hierarchy 
(e.g. as "abstract" symbols). That is assuming we do want the language 
hook to run - which seems like an important goal from my POV.

> And 'buffer-language' becomes more like:
> 
> (or buffer-overriding-language-keyword
>      (with-current-buffer buffer (get-language-for-mode major-mode))
>      (let (kw)
>         (and buffer-file-name
>              (keywordp
>                (setq kw
>                      ;; yes I know this needs regexps
>                      (alist-get buffer-file-name auto-mode-alist)))

There is a bunch of variables to look up: auto-mode-alist, 
magic-mode-alist, interpreter-mode-alist, magic-fallback-mode-alist. I 
didn't want to duplicate the logic from set-auto-mode, but this of 
course could be done.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Wed, 17 Jan 2024 02:42:01 GMT) Full text and rfc822 format available.

Message #361 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: João Távora <joaotavora <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, casouri <at> gmail.com,
 Stefan Kangas <stefankangas <at> gmail.com>, monnier <at> iro.umontreal.ca
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Wed, 17 Jan 2024 04:41:00 +0200
On 16/01/2024 13:06, João Távora wrote:

> Agree.  And this is why I'm not crazy about the solution either.  But as
> to cluttering the function namespace we could say that (:abstract t) modes
> do _not_ generate a function (or do not generate them in the public
> namespace -- as I think the function still has to exist for any
> concrete submodes down the line to call it).

An "abstract" mode is supposedly one that doesn't do anything. So it 
doesn't have to be callable.

Anyway, that's +1 feature required for the implementation.

>>>> This choice is coupled with the corresponding logic in 'buffer-language'
>>>> (whether to keep the replace-regexp-in-string branch).
>>>
>>> Yes.  I think we should err on the side of convenience.  What exactly are
>>> the defects we can get?  I can't see anything else but the tuareg-mode, and we
>>> can plug that on our side.  Maybe you can see more.
>>
>> For example, it would sometimes return ugly non-existent languages like
>> :help-fns--edit-value, :org-lint--report or :xref--xref-buffer.
> 
> What if we filter by prog-mode?  It would leave the ':ruby-base' and
> ':python-base' as false positives, I guess.  But then we could reasonably
> say that anything ending with '-base' is abstract (or use the
> aforementioned  explicit abstract prop).

We would also filter out :css, for example. TeX modes also do not derive 
from prog-mode. TeX does have an LSP server, however.

> It would also make ':lisp-data' a language.  But that's not bad.
> lisp-data-mode is actually a useful concrete prog-mode derivative,
> so I think it's OK to have ':lisp-data' as a language.
> 
> We can then have exceptions for some notable cases.  'lisp-mode' is,
> as we know, for Common Lisp only.

>>>> Are there specific uses for get-mode-for-language when there is no
>>>> existing buffer?
>>>
>>> Yes, I'd say this markdown-mode use is exactly that.  Markdown inserts
>>> some text into a buffer and all it knows is the language it's supposed
>>> to fontify it with.  The major mode has that logic, so it must invoke
>>> the correct (and preferred) major-mode function.
>>
>> Sorry, I meant get-language-for-mode (which is the one implemented as
>> buffer-language currently).
>>
>>> Another use is allowing the user to choose major modes for languages,
>>> say from a tutorial or wizard or at Emacs startup.  Say, I prefer
>>> ruby-ts-mode for Ruby, but c++-mode for C++.  It'd be helpful to summarize
>>> those preferences.
>>
>> This would require capabilities like "get all modes for a language" (not
>> one of the set of functions mentioned so far, and it'll need a full scan
>> of major-mode-remap-alist) and "get current mode for a language" (this
>> one matches markdown-mode's function you posted).
> 
> Yes.  I don't see the full scan of m-m-remap-alist as problematic
> from an efficiency perspective.  If we decide it's the database, it's
> the database.   It's unfortunate that the "alist" implementation is
> hardcoded in the name (which is why I would prefer a (:language "Foo")
> kwarg to define-derived-mode) but we can abstract the alist aspect
> away with accessors and do the usual "Do not change this variable
> directly, use these accessors instead".

I'm not saying in advance that it will be slow. Just that it's a 
different function.

>> BTW, get-current-mode-for-language could be implemented in terms of
>> set-buffer-language.
> 
> What does get-current-mode-for-language do exactly?

The major-mode currently configured to be used for the language (through 
m-m-remap-alist, in the current proposal). set-auto-mode will choose it.

>>>> We could have both functions: buffer-language and get-language-for-mode
>>>> ('get-mode-language'?). Or define one initially and add the other as needed.
>>>
>>> Yes.  buffer-language isn't bad, it's a useful helper.  But buffer-language
>>> should be just
>>>
>>>      (with-current-buffer buffer (get-language-for-mode major-mode))
>>>
>>> Right?  Modulo some caching if it turns out to be very inefficient
>>> (which I really doubt)
>>
>> Again: this won't work for files where no suitable major mode is
>> available (e.g. not installed yet).
> 
> Right. So maybe
> 
> (or (with-current-buffer buffer (get-language-for-mode major-mode))
>      (let (kw)
>         (and buffer-file-name
>              (keywordp (setq kw (alist-get buffer-file-name auto-mode-alist)))
>              kw))
>      (consult-oracles)
>      (error "Don't know what language this is, sorry"))

Replied to this one in another email: referring to the results of the 
computation of set-auto-mode is easier. But that's a technical detail.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Wed, 17 Jan 2024 03:46:01 GMT) Full text and rfc822 format available.

Message #364 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>,
 Stefan Kangas <stefankangas <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, casouri <at> gmail.com,
 joaotavora <at> gmail.com
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Wed, 17 Jan 2024 05:45:34 +0200
On 16/01/2024 04:32, Stefan Monnier via Bug reports for GNU Emacs, the 
Swiss army knife of text editors wrote:
>>> Please don't call it "language".  That'd be confusing.  LSP is about
>>> programming languages, so "language" is natural there.  But in Emacs,
>>> a major mode is more general than that.  For example, it is not
>>> unthinkable to consider mail-mode to be the extra-parent of
>>> message-mode (or vice versa) -- but what is the "language" in that
>>> case?
>> Isn't the language for such modes in this paradigm just the empty set?
> 
> I'm not too worried about those cases, indeed.
> I'm more worried about the taxonomy of languages.
> We currently have the taxonomy of major modes, with which we're pretty
> familiar, and we've had many years to learn about its downsides,
> complexity, as well as how to deal with them, but for languages we're
> only familiar with the easy cases, which makes us judge the idea in
> a way that may prove naive.

Some of us are perhaps familiar with more cases than others.

> IME, deciding what is the type of the content of a buffer is usually
> trivial but with some notable caveats, such as XPM or Postscript files,
> or "container formats" (like `.deb` or `.odt`, as well as things like
> DocBook which can be considered either as their own format or as XML),

Sounds like DocBook could be viewed using different major modes. I'm not 
sure whether they should be classified as different languages in general 
in such cases, but here it sounds like :doc_book vs :xml.

> or "sublanguages" such as C being a subset of C++, or Javascript being
> a subset of Typescript.  And I suspect the info we need will not always
> be quite the same.

So far my understanding is that "languages" would not have a hierarchy -
no "subset" relation or the like, because different applications
will likely need different relations between such languages. Or none at
all, most of the time.

When a major mode is suitable for a number of languages, it can be 
expressed externally, e.g. using several entries in major-mode-remap-alist.

> So while there might be a good case to be made to add some API functions
> to query the language/type(s) of a given buffer (I'm not sure we'd need
> the language of a given major mode, OTOH), or to find the preferred
> mode(s) for a given language/type, I think it's worthwhile to try and
> tweak our major mode taxonomy because it is information we must have
> and information we know we will always have, so we should strive to make
> it as good as we can.
> 
> It shouldn't make it any harder to add language/type API functionality.
> On the contrary it should make it easier.
> 
> [ As suggested elsewhere in this thread, we could even try and merge
>    those taxonomies, e.g. using extra parents of the form `LANG-lang`.  ]

Inserting an extra parent called LANG-lang could work to contain the 
(mode->lang) mapping, but only if we decide that a mode can correspond 
to only one language, or if we are not going to run the language hook in 
the mode function. But if we don't, the extra complexity seems not worth 
it: we'll need the lang->mode mapping somewhere else anyway. And looking 
there (in major-mode-remap-alist) we could fetch the reverse relation 
just as well.

This also wouldn't bring any of the other features I enumerated together 
with my patch.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Wed, 17 Jan 2024 10:21:02 GMT) Full text and rfc822 format available.

Message #367 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, casouri <at> gmail.com,
 Stefan Kangas <stefankangas <at> gmail.com>, monnier <at> iro.umontreal.ca
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Wed, 17 Jan 2024 10:20:34 +0000
On Wed, Jan 17, 2024 at 2:41 AM Dmitry Gutov <dmitry <at> gutov.dev> wrote:

> An "abstract" mode is supposedly one that doesn't do anything. So it
> doesn't have to be callable.

No.  Not "abstract" as in "Java interface", abstract as in "Java
abstract class".

> Anyway, that's +1 feature required for the implementation.

Almost trivial feature.  See patch after sig.  It'll make this work:

(define-derived-mode foo-mode prog-mode "Fooey" :abstract t
   (message "Hey from foo-mode"))
(define-derived-mode bar-mode foo-mode "Barey"
   (message "Hey from bar-mode"))

'foo-mode' can't be called, but 'bar-mode' can, of course.  And it
will call its parent.
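
For comparison, the :abstract keyword above is only a proposal.  A rough
approximation with define-derived-mode as it exists today, assuming an
Emacs recent enough to support the :interactive keyword, is a base mode
that is not offered as a command but still runs when a concrete child is
enabled.  The mode names are made up, and unlike :abstract this does not
stop Lisp code from calling the base mode function directly.

```elisp
;; Hypothetical base mode: holds shared setup, not offered as a command.
(define-derived-mode my-bar-base-mode prog-mode "Bar"
  "Shared setup for Bar modes; not meant to be enabled directly."
  :interactive nil
  (setq-local comment-start "# "))

;; Concrete child: enabling it runs the base mode's body first.
(define-derived-mode my-bar-mode my-bar-base-mode "Bar"
  "Concrete major mode for the (made-up) Bar language.")
```

After (my-bar-mode) in a buffer, comment-start is set by the base mode
and (derived-mode-p 'my-bar-base-mode) is non-nil, which is what code
keying on the base mode relies on.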

> > What if we filter by prog-mode?  It would leave the ':ruby-base' and
> > ':python-base' as false positives, I guess.  But then we could reasonably
> > say that anything ending with '-base' is abstract (or use the
> > aforementioned  explicit abstract prop).
>
> We would also filter out :css, for example.

Sure?  I see a super-normal css-base-mode inheriting from
prog-mode.

> TeX modes also do not derive
> from prog-mode. TeX does have an LSP server, however.

At the end of the day we have to come to a conclusion of what
we want to do.  I want to find major modes that correspond
to languages, right?  So:

* these outliers start inheriting from prog-mode
* we inject new lang-mode between prog-mode and fundamental-mode and make
  outliers derive from that.
* we say these outliers aren't languages
* we code exceptions for these outliers

It could even be that

(derived-mode-add-parents 'tex-mode '(prog-mode))

is exactly what's needed here, showcasing how I think this
particular heavy hammer should be used for the exception, not
the rule.

> > Yes.  I don't see the full scan of m-m-remap-alist as problematic
> > from an efficiency perspective.  If we decide it's the database, it's
> > the database.   It's unfortunate that the "alist" implementation is
> > hardcoded in the name (which is why I would prefer a (:language "Foo")
> > kwarg to define-derived-mode) but we can abstract the alist aspect
> > away with accessors and do the usual "Do not change this variable
> > directly, use these accessors instead".
>
> I'm not saying in advance that it will be slow. Just that it's a
> different function.

Ah.  Right.  And I think it's a good one.  Eglot needs it, and so does
Yasnippet, probably.

> >> BTW, get-current-mode-for-language could be implemented in terms of
> >> set-buffer-language.
> >
> > What does get-current-mode-for-language do exactly?
>
> The major-mode currently configured to be used for the language (through
> m-m-a-alist, in the current proposal). set-auto-mode will choose it.

Perfect.  But what about the "set-buffer-language" comment?  Does a buffer object
have to exist for that job to be done?

> > Right. So maybe
> >
> > (or (with-current-buffer buffer (get-language-for-mode major-mode))
> >      (let (kw)
> >         (and buffer-file-name
> >              (keywordp (setq kw (alist-get buffer-file-name auto-mode-alist)))
> >              kw))
> >      (consult-oracles)
> >      (error "Don't know what language this is, sorry"))
>
> Replied to this one in another email: referring to the results of the
> computation of set-auto-mode is easier. But that's a technical detail.

For sure, reuse as much code as possible.  I was just
illustrating the intended fallback logic.

João

diff --git a/lisp/emacs-lisp/derived.el b/lisp/emacs-lisp/derived.el
index dec5883767d..a9b67965416 100644
--- a/lisp/emacs-lisp/derived.el
+++ b/lisp/emacs-lisp/derived.el
@@ -143,6 +143,9 @@ define-derived-mode
            :interactive BOOLEAN
                    Whether the derived mode should be `interactive' or not.
                    The default is t.
+           :abstract BOOLEAN
+                   You'll never be able to use the CHILD mode directly
+                   in a buffer, just use as a PARENT for other modes.

 BODY:      forms to execute just before running the
            hooks for the new mode.  Do not use `interactive' here.
@@ -192,7 +195,9 @@ define-derived-mode
  (hook (derived-mode-hook-name child))
  (group nil)
         (interactive t)
-        (after-hook nil))
+        (abstract nil)
+        (after-hook nil)
+        (function-name child))

     ;; Process the keyword args.
     (while (keywordp (car body))
@@ -201,9 +206,13 @@ define-derived-mode
  (:abbrev-table (setq abbrev (pop body)) (setq declare-abbrev nil))
  (:syntax-table (setq syntax (pop body)) (setq declare-syntax nil))
         (:after-hook (setq after-hook (pop body)))
+        (:abstract (setq abstract (pop body)))
         (:interactive (setq interactive (pop body)))
  (_ (pop body))))

+    (when abstract
+      (setq function-name (gensym "--internal-"))
+      (put child 'abstract-mode function-name))
     (setq docstring (derived-mode-make-docstring
       parent child docstring syntax abbrev))

@@ -245,13 +254,12 @@ define-derived-mode
          (put ',child 'derived-mode-parent ',parent))
        ,(if group `(put ',child 'custom-mode-group ,group))

-       (defun ,child ()
+       (defun ,function-name ()
  ,docstring
  ,(and interactive '(interactive))
  ; Run the parent.
  (delay-mode-hooks
-
-   (,(or parent 'kill-all-local-variables))
+   (,(or (get parent 'abstract-mode) parent 'kill-all-local-variables))
  ; Identify the child mode.
    (setq major-mode (quote ,child))
    (setq mode-name ,name)




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Wed, 17 Jan 2024 10:32:02 GMT)

Message #370 received at 68246 <at> debbugs.gnu.org:

From: João Távora <joaotavora <at> gmail.com>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>,
 Yuan Fu <casouri <at> gmail.com>, Stefan Monnier <monnier <at> iro.umontreal.ca>,
 Stefan Kangas <stefankangas <at> gmail.com>
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Wed, 17 Jan 2024 10:31:39 +0000
On Wed, Jan 17, 2024 at 2:06 AM Dmitry Gutov <dmitry <at> gutov.dev> wrote:
>
> On 17/01/2024 00:00, João Távora wrote:
> > On Tue, Jan 16, 2024, 17:45 Dmitry Gutov <dmitry <at> gutov.dev> wrote:
> >>
> >> On 16/01/2024 12:34, João Távora wrote:
> >>> I see no real categorization or classification.  I just see
> >>> a many-to-one mapping of major modes to languages.
> >>
> >> It might even be many-to-many, at least in some cases.
> >>
> >> E.g. js-ts-mode being good for both :js and :jsx.
> >
> >>
> >> Not sure how useful this -to-many relation is going to be in the above
> >> cases, but it's probably a good illustration of the possibility.
> >
> > According to https://react.dev, jsx is a "JavaScript syntax
> > extension".  So it would seem JSX is a superset of JS.  If
> > js-ts-mode parses it perfectly, it could be called
> > jsx-ts-mode instead.
>
> The grammar tree-sitter-js supports JSX. So the mode is called js-ts-mode.

OK.  In this case, I would just say js-ts-mode is for JavaScript.

A future trivial js-tsx-mode would be for JavaScriptReact.
Eglot doesn't support the "JavaScriptReact" LSP language-id
because of this, and no one has complained (they have about ts and
tsx tho).

>    tsx-ts-mode is probably okay for both :tsx, :jsx and :js but not
>    :typescript (in general, because of certain clashing syntax).

Right.  Which makes sense.  Like my-c++-mode is "probably okay"
for C, but that doesn't mean my-c++-mode isn't the mode for the C++
language (or family of languages commonly denominated as "C++")

Ergo, as I see it, tsx-ts-mode is the mode for TypeScriptReact,
a language which happens to be a superset of some other languages.

> > At the end of the day, a language is not so easy to define,
> > but it's not that problematic either, especially in the editor
> > (in the compiler, it's much more important).
> >
> > The best sources are a standard, when it exists, but each iteration,
> > sometimes  each compiler is also its own language.  There's "GNU C",
> > "ANSI C", C17, C23.  All handled by the C modes we have and the best
> > way we have to designate this is just "C".  c++-mode also handles
> > this code by the way, probably flawlessly, and yet we don't say
> > c++-mode is for C and C++.
> >
> > But if you want, I don't think there's any big problem
> > in making get-language-for-mode return a list, with
> > the most important likely language at the top.
>
> It would be a problem if we decide that the major mode function runs the
> language-specific hook, and not set-auto-mode-0 like in my patch
> (because the mmode function would likely run the hooks for all supported
> languages, rather than just the current one).

I see.  Right.

> > I predict it'll be pretty rare, but I guess you could
> > have this (excuse the ugly CamelCase for demo purposes)
>
> It might indeed be rare enough for this not to be a problem, and we might
> even decide to prohibit such usage, keeping the relation many-to-one.

I'd think that is fine.

> > (setq auto-mode-alist '(("\\.js$" . :JavaScript)
> >                          ("\\.jsx$" . :JavaScriptReact)))
> >
> > (setq m-m-remap-alist '((:JavaScript . js-ts-mode)
> >                          (:JavaScriptReact . js-ts-mode)))
>
> It indeed should work okay with my proposal, but might be harder to do
> if the languages are inserted as part of the existing modes hierarchy
> (e.g. as "abstract" symbols).

I don't follow.  Are you referencing the other mail?  That
(:abstract t) idea was mostly crafted for the base-mode approach.
It's not strictly needed with your patch (though I don't see how
it hurts either).

> That is assuming we do want the language
> hook to run - which seems like an important goal from my POV.

Not absolutely essential to fix current problems, but yes I agree
it's natural  enough that it should be in the proposal.

> > And 'buffer-language' becomes more like:
> >
> > (or buffer-overriding-language-keyword
> >      (with-current-buffer buffer (get-language-for-mode major-mode))
> >      (let (kw)
> >         (and buffer-file-name
> >              (keywordp
> >                (setq kw
> >                      ;; yes I know this needs regexps
> >                      (alist-get buffer-file-name auto-mode-alist)))
>
> There is a bunch of variables to look up: auto-mode-alist,
> magic-mode-alist, interpreter-mode-alist, magic-fallback-mode-alist. I
> didn't want to duplicate the logic from set-auto-mode, but this of
> course could be done.

I was just illustrating the fallback logic.

João




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Wed, 17 Jan 2024 17:10:02 GMT)

Message #373 received at 68246 <at> debbugs.gnu.org:

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, casouri <at> gmail.com,
 João Távora <joaotavora <at> gmail.com>,
 Stefan Kangas <stefankangas <at> gmail.com>
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Wed, 17 Jan 2024 12:08:02 -0500
> @@ -3150,6 +3150,9 @@ auto-mode-alist
>  Visiting a file whose name matches REGEXP specifies FUNCTION as the
>  mode function to use.  FUNCTION will be called, unless it is nil.
>  
> +FUNCTION can also be a keyword denoting a language, to be looked
> +up in `major-mode-remap-alist'.

Side note: the intention is OK but `major-mode-remap-alist' is
a defcustom and should remain nil by default.  It's there for the users
to express which major modes they prefer.  So if we want a mapping
between some new language/type concept and major modes, it should be
stored elsewhere (could be a plain alist that's handled as a kind of
"implicit tail of `major-mode-remap-alist`").

> @@ -3206,10 +3209,10 @@ interpreter-mode-alist
>       ("emacs" . emacs-lisp-mode)))
>    "Alist mapping interpreter names to major modes.
>  This is used for files whose first lines match `auto-mode-interpreter-regexp'.
> -Each element looks like (REGEXP . MODE).
> +Each element looks like (REGEXP . MODE-OR-LANGUAGE).
>  If REGEXP matches the entire name (minus any directory part) of
>  the interpreter specified in the first line of a script, enable
> -major mode MODE.
> +MODE-OR-LANGUAGE.

There's a similar need for "content type" rather than "language".  If we
want to mention "language" we should also take the opportunity to
mention other related categorizations like "content type".

> -      (funcall (alist-get mode major-mode-remap-alist mode))
> +      ;; XXX: When there's no mapping for `:<language>', we could also
> +      ;; look for a function called `<language>-mode'.
> +      (funcall (alist-get mode major-mode-remap-alist (if (keywordp mode)
> +                                                          #'fundamental-mode
> +                                                        mode)))
> +      (when (keywordp mode)             ;Perhaps do that unconditionally.
> +        (run-hooks (intern (format "%s-language-hook" (buffer-language)))))

That seems wrong:
- Why should this hook run when `auto-mode-alist` says `:js` but not
  when doing `M-x javascript-mode` (or other ways to enable this mode)?
- Why run this hook *after* the mode's `:after-hook` and after
  things like `after-change-major-mode-hook`?

I think it should remain the major mode's responsibility to decide which
hooks it runs.

> +(defun buffer-language ()
> +  "Return the language of the current buffer.
> +A language is a lowercase keyword with the name of the language."
> +  ;; Alternatively, we could go through all the matchers in
> +  ;; auto-mode-alist, interpreter-mode-alist,
> +  ;; magic-fallback-mode-alist here, possibly using a cache keyed on
> +  ;; buffer-file-name.  But that's probably an overkill: if the user
> +  ;; changes the settings, they can call `M-x revert-buffer' at the end.
> +  (if (keywordp (car set-auto-mode--last))
> +      (car set-auto-mode--last)
> +    ;; Backward compatibility.
> +    (intern (format ":%s" (replace-regexp-in-string "\\(?:-ts\\)?-mode\\'" ""
> +                                                    (symbol-name major-mode))))))

I'm not comfortable enshrining the "-ts-mode" convention here.

Also I think if we want a `buffer-language` function, it should not rely
on how the mode was installed (e.g. `set-auto-mode--last`) but only on
the major mode itself, i.e. something like

    (defun buffer-language ()
      (or buffer-language
          (some heuristic based on major-mode and/or derived-modes)))

[ Of course, I already mentioned that I also suspect that there can/will
  be sometimes several languages (or none).  ]
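
A minimal sketch of such a function, under stated assumptions: the
names `demo-buffer-language', `demo-mode-language-alist' and the two
demo modes are hypothetical, and the "heuristic" is just a table lookup
along the chain of `derived-mode-parent' properties.  It does not
consult `set-auto-mode--last' or any other record of how the mode was
installed:

```elisp
(require 'seq)

(defvar-local demo-buffer-language nil
  "Hypothetical explicit language override for the current buffer.")

(defvar demo-mode-language-alist
  '((demo-js-mode . :js))
  "Hypothetical table mapping major modes to language keywords.")

(defun demo--mode-ancestors (mode)
  "Return MODE followed by its chain of `derived-mode-parent's."
  (let (chain)
    (while mode
      (push mode chain)
      (setq mode (get mode 'derived-mode-parent)))
    (nreverse chain)))

(defun demo-buffer-language ()
  "Return the language of the current buffer, or nil if unknown.
An explicit buffer-local override wins; otherwise the first
ancestor of `major-mode' listed in `demo-mode-language-alist'
decides."
  (or demo-buffer-language
      (seq-some (lambda (mode)
                  (alist-get mode demo-mode-language-alist))
                (demo--mode-ancestors major-mode))))

;; Two hypothetical modes: the second inherits the first's language.
(define-derived-mode demo-js-mode prog-mode "DemoJS")
(define-derived-mode demo-js-variant-mode demo-js-mode "DemoJSVariant")
```

Returning nil for an unknown mode leaves room for the "none" case
mentioned above; returning a list instead of a single keyword would be
one way to accommodate several languages.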

> +(defun set-buffer-language (language)
> +  "Set the language of the current buffer.
> +And switch the major mode appropriately."
> +  (interactive
> +   (list (let* ((ct (mapcan
> +                     (lambda (pair) (and (keywordp (car pair))
> +                                    (list (symbol-name (car pair)))))
> +                     major-mode-remap-alist))
> +                (lang (completing-read "Language: " ct)))
> +           (and lang (intern lang)))))
> +  (set-auto-mode-0 language))

I see several issues with this function (name and implementation), but
I wonder when we'd ever need such a thing.

>  ;;;###autoload
>  (dolist (name (list "node" "nodejs" "gjs" "rhino"))
> -  (add-to-list 'interpreter-mode-alist (cons (purecopy name) 'js-mode)))
> +  (add-to-list 'interpreter-mode-alist (cons (purecopy name) :js)))

BTW, my suggested patch basically proposes to use `<LANG>-mode` instead
of `:<LANG>` which saves us from those changes since that matches our
historical conventions.

Another issue I see if we don't use something like
`derived-mode-add-parents` is that all the various places where we use
mode-indexing, such as `.dir-locals.el`, `ffap`, YASnippet, etc... will
need to be extended with a way to use "languages" as well, and then we
also need to define a sane precedence between settings that apply to
a given mode and settings that apply to a given language (setting for
`js-ts-mode` should presumably take precedence over settings for
`:js` which should take precedence over settings for `prog-mode`).
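
To illustrate how the precedence falls out of the mode hierarchy when
`derived-mode-add-parents' is used, here is a small sketch assuming
Emacs 30 or later.  The modes and `demo-setting-for-mode' are
hypothetical; the only real APIs used are `derived-mode-add-parents'
and `derived-mode-all-parents':

```elisp
;; A hypothetical non-TS mode and its TS counterpart.
(define-derived-mode demo-lang-mode prog-mode "DemoLang")
(define-derived-mode demo-lang-ts-mode prog-mode "DemoLangTS")

;; Make the non-TS mode an extra parent of the TS mode.
(derived-mode-add-parents 'demo-lang-ts-mode '(demo-lang-mode))

(defun demo-setting-for-mode (mode settings)
  "Return the value in SETTINGS for MODE or its nearest listed parent.
SETTINGS is an alist of (MODE . VALUE).  Parents are tried in the
order returned by `derived-mode-all-parents', which starts with
MODE itself."
  (let ((parents (derived-mode-all-parents mode))
        found)
    (while (and parents (not found))
      (setq found (assq (car parents) settings)
            parents (cdr parents)))
    (cdr found)))

;; A setting for the TS mode itself wins over one for its extra parent.
(demo-setting-for-mode 'demo-lang-ts-mode
                       '((demo-lang-mode . 2) (demo-lang-ts-mode . 4)))

;; Without one, the extra parent's setting is picked up.
(demo-setting-for-mode 'demo-lang-ts-mode '((demo-lang-mode . 2)))
```

A separate "language" namespace would need an equivalent of this
ordering to be defined explicitly for every mode-indexed facility.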


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Wed, 17 Jan 2024 23:38:02 GMT)

Message #376 received at 68246 <at> debbugs.gnu.org:

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: João Távora <joaotavora <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>,
 Yuan Fu <casouri <at> gmail.com>, Stefan Monnier <monnier <at> iro.umontreal.ca>,
 Stefan Kangas <stefankangas <at> gmail.com>
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Thu, 18 Jan 2024 01:37:16 +0200
On 17/01/2024 12:31, João Távora wrote:
> On Wed, Jan 17, 2024 at 2:06 AM Dmitry Gutov <dmitry <at> gutov.dev> wrote:
>>
>> On 17/01/2024 00:00, João Távora wrote:
>>> On Tue, Jan 16, 2024, 17:45 Dmitry Gutov <dmitry <at> gutov.dev> wrote:
>>>>
>>>> On 16/01/2024 12:34, João Távora wrote:
>>>>> I see no real categorization or classification.  I just see
>>>>> a many-to-one mapping of major modes to languages.
>>>>
>>>> It might even be many-to-many, at least in some cases.
>>>>
>>>> E.g. js-ts-mode being good for both :js and :jsx.
>>>
>>>>
>>>> Not sure how useful this -to-many relation is going to be in the above
>>>> cases, but it's probably a good illustration of the possibility.
>>>
>>> According to https://react.dev, jsx is a "JavaScript syntax
>>> extension".  So it would seem JSX is a superset of JS.  If
>>> js-ts-mode parses it perfectly, it could be called
>>> jsx-ts-mode instead.
>>
>> The grammar tree-sitter-js supports JSX. So the mode is called js-ts-mode.
> 
> OK.  In this case, I would just say js-ts-mode is for JavaScript.
> 
> A future trivial js-tsx-mode would be for JavaScriptReact.
> Eglot doesn't support the "JavaScriptReact" LSP language-id
> because of this, and no one has complained (they have about ts and
> tsx tho).

This is indeed an approach that would work for all such cases, at the 
cost of extra typing and additional file organization (e.g. each major 
mode needs to be sorted into some file - it's no problem for 
js-jsx-ts-mode, but somewhat different for a potential new obscure 
language without close relatives).

>>     tsx-ts-mode is probably okay for both :tsx, :jsx and :js but not
>>     :typescript (in general, because of certain clashing syntax).
> 
> Right.  Which makes sense.  Like my-c++-mode is "probably okay"
> for C, but that doesn't mean my-c++-mode isn't the mode for the C++
> language (or family of languages commonly denominated as "C++")
> 
> Ergo, as I see it, tsx-ts-mode is the mode for TypeScriptReact,
> a language which happens to be a superset of some other languages.

What we couldn't do in this model, is create one small major mode called 
c-like-mode (which sets up a minimal syntax table), and use it for a 
bunch of languages like C, C++, JS, etc, delegating the major features 
such as syntax highlighting, indentation, imenu and completion to the 
LSP protocol (e.g. via Eglot). With no extra files required for many 
additional languages, just new entries in eglot-server-programs.

Of course it's not critical that we'd be able to do this, but seems 
interesting as a concept.

>>> (setq auto-mode-alist '(("\\.js$" . :JavaScript)
>>>                           ("\\.jsx$" . :JavaScriptReact)))
>>>
>>> (setq m-m-remap-alist '((:JavaScript . js-ts-mode)
>>>                           (:JavaScriptReact . js-ts-mode)))
>>
>> It indeed should work okay with my proposal, but might be harder to do
>> if the languages are inserted as part of the existing modes hierarchy
>> (e.g. as "abstract" symbols).
> 
> I don't follow.  Are you referencing the other mail?  That
> (:abstract t) idea was mostly crafted for the base-mode approach.
> It's not strictly needed with your patch (though I don't see how
> it hurts either).

Yep.

If we're referencing my patch, then many-to-many should work with it, of 
course.

>> That is assuming we do want the language
>> hook to run - which seems like an important goal from my POV.
> 
> Not absolutely essential to fix current problems, but yes I agree
> it's natural  enough that it should be in the proposal.
> 
>>> And 'buffer-language' becomes more like:
>>>
>>> (or buffer-overriding-language-keyword
>>>       (with-current-buffer buffer (get-language-for-mode major-mode))
>>>       (let (kw)
>>>          (and buffer-file-name
>>>               (keywordp
>>>                 (setq kw
>>>                       ;; yes I know this needs regexps
>>>                       (alist-get buffer-file-name auto-mode-alist)))
>>
>> There is a bunch of variables to look up: auto-mode-alist,
>> magic-mode-alist, interpreter-mode-alist, magic-fallback-mode-alist. I
>> didn't want to duplicate the logic from set-auto-mode, but this of
>> course could be done.
> 
> I was just illustrating the fallback logic.

Not 100% clear where 'buffer-overriding-language-keyword' would come 
from. If set-buffer-language was the main entry point for overriding a 
buffer's language, however, its approach of overriding the cached info 
(currently set by set-auto-mode-0) seems the easiest.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Thu, 18 Jan 2024 00:48:02 GMT)

Message #379 received at 68246 <at> debbugs.gnu.org:

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: João Távora <joaotavora <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, casouri <at> gmail.com,
 Stefan Kangas <stefankangas <at> gmail.com>, monnier <at> iro.umontreal.ca
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Thu, 18 Jan 2024 02:47:16 +0200
On 17/01/2024 12:20, João Távora wrote:
> On Wed, Jan 17, 2024 at 2:41 AM Dmitry Gutov <dmitry <at> gutov.dev> wrote:
> 
>> An "abstract" mode is supposedly one that doesn't do anything. So it
>> doesn't have to be callable.
> 
> No.  Not "abstract" as in "Java interface", abstract as in "Java
> abstract class".
> 
>> Anyway, that's +1 feature required for the implementation.
> 
> Almost trivial feature.  See patch after sig.  It'll make this work:
> 
> (define-derived-mode foo-mode prog-mode "Fooey" :abstract t
>     (message "Hey from foo-mode"))
> (define-derived-mode bar-mode foo-mode "Barey"
>     (message "Hey from bar-mode"))
> 
> 'foo-mode' can't be called, but 'bar-mode' can, of course.  And it
> will call its parent.

What would such an "abstract" parent do anyway? Still set up keymaps etc?

>>> What if we filter by prog-mode?  It would leave the ':ruby-base' and
>>> ':python-base' as false positives, I guess.  But then we could reasonably
>>> say that anything ending with '-base' is abstract (or use the
>>> aforementioned  explicit abstract prop).
>>
>> We would also filter out :css, for example.
> 
> Sure?  I see a super-normal css-base-mode inheriting from
> prog-mode.
> 
>> TeX modes also do not derive
>> from prog-mode. TeX does have an LSP server, however.
> 
> At the end of the day we have to come to a conclusion of what
> we want to do.  I want to find major modes that correspond
> to languages, right?  So:

Right. But I suppose it's more or less the set of modes that correspond 
to file types (files on disk). Even text-mode, often used as a fallback, 
could have a language ("plain text") - you can see this file type (or 
"language mode") in the dropdown list of choices in editors that support 
switching between them with a mouse (e.g. VS Code).

> * these outliers start inheriting from prog-mode
> * we inject new lang-mode between prog-mode and fundamental-mode and make
>    outliers derive from that.
> * we say these outliers aren't languages
> * we code exceptions for these outliers
> 
> It could even be that
> 
> (derived-mode-add-parents 'tex-mode '(prog-mode))
> 
> is exactly what's needed here, showcasing how I think this
> particular heavy hammer should be used for the exception, not
> the rule.

Also, arguably, Makefile-mode should not be a prog-mode derivative 
because its indent-line-function is not meaningful. But we want to 
support it with LSP anyway (I think), and with other features that 
dispatch based on the current language.

>>>> BTW, get-current-mode-for-language could be implemented in terms of
>>>> set-buffer-language.
>>>
>>> What does get-current-mode-for-language do exactly?
>>
>> The major-mode currently configured to be used for the language (through
>> m-m-a-alist, in the current proposal). set-auto-mode will choose it.
> 
> Perfect.  But what about the "set-buffer-language" comment?  Does a buffer object
> have to exist for that job to be done?

No, you could just do something like

  (defun get-current-mode-for-language (lang)
    (with-temp-buffer
      (set-buffer-language lang)
      major-mode))
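
For comparison, a sketch of the same lookup that needs no buffer at
all.  It assumes the language-to-mode defaults live in a plain alist
(the hypothetical `demo-language-mode-alist'; the thread has not
settled where such a mapping would be stored) and applies the user's
`major-mode-remap-alist' on top:

```elisp
(defvar demo-language-mode-alist
  '((:js . js-mode)
    (:css . css-mode))
  "Hypothetical default mapping from language keywords to major modes.")

(defun demo-mode-for-language (lang)
  "Return the major mode currently configured for LANG, or nil.
The default comes from `demo-language-mode-alist'; a user entry
in `major-mode-remap-alist' for that default takes precedence."
  (let ((mode (alist-get lang demo-language-mode-alist)))
    (and mode
         (alist-get mode major-mode-remap-alist mode))))

;; With a user remapping in effect, the remapped mode is returned.
(let ((major-mode-remap-alist '((js-mode . js-ts-mode))))
  (demo-mode-for-language :js))
```

The temp-buffer version has the advantage of reusing whatever logic
`set-buffer-language' ends up with; this variant only shows that the
lookup itself is a pure function of the two tables.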




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Thu, 18 Jan 2024 05:07:02 GMT)

Message #382 received at 68246 <at> debbugs.gnu.org:

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, casouri <at> gmail.com,
 João Távora <joaotavora <at> gmail.com>,
 Stefan Kangas <stefankangas <at> gmail.com>
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Thu, 18 Jan 2024 07:05:55 +0200
On 17/01/2024 19:08, Stefan Monnier wrote:
>> @@ -3150,6 +3150,9 @@ auto-mode-alist
>>   Visiting a file whose name matches REGEXP specifies FUNCTION as the
>>   mode function to use.  FUNCTION will be called, unless it is nil.
>>   
>> +FUNCTION can also be a keyword denoting a language, to be looked
>> +up in `major-mode-remap-alist'.
> 
> Side note: the intention is OK but `major-mode-remap-alist' is
> a defcustom and should remain nil by default.  It's there for the users
> to express which major modes they prefer.  So if we want a mapping
> between some new language/type concept and major modes, it should be
> stored elsewhere (could be a plain alist that's handled as a kind of
> "implicit tail of `major-mode-remap-alist`").

Good point. The user can customize it and lose the non-default modes 
configured for a language.

The way I introduced languages as keywords was an experiment, really. 
Mostly to save on typing - because the first plan was to have a parallel 
set of alists (since we can't deprecate the file -> mmode mappings 
right away). The language-specific version of 
major-mode-remap-alist looks necessary after all.

>> @@ -3206,10 +3209,10 @@ interpreter-mode-alist
>>        ("emacs" . emacs-lisp-mode)))
>>     "Alist mapping interpreter names to major modes.
>>   This is used for files whose first lines match `auto-mode-interpreter-regexp'.
>> -Each element looks like (REGEXP . MODE).
>> +Each element looks like (REGEXP . MODE-OR-LANGUAGE).
>>   If REGEXP matches the entire name (minus any directory part) of
>>   the interpreter specified in the first line of a script, enable
>> -major mode MODE.
>> +MODE-OR-LANGUAGE.
> 
> There's a similar need for "content type" rather than "language".  If we
> want to mention "language" we should also take the opportunity to
> mention other related categorizations like "content type".

Are "content type" and "language" going to be different things? They 
seem the same to me.

>> -      (funcall (alist-get mode major-mode-remap-alist mode))
>> +      ;; XXX: When there's no mapping for `:<language>', we could also
>> +      ;; look for a function called `<language>-mode'.
>> +      (funcall (alist-get mode major-mode-remap-alist (if (keywordp mode)
>> +                                                          #'fundamental-mode
>> +                                                        mode)))
>> +      (when (keywordp mode)             ;Perhaps do that unconditionally.
>> +        (run-hooks (intern (format "%s-language-hook" (buffer-language)))))
> 
> That seems wrong:
> - Why should this hook run when `auto-mode-alist` says `:js` but not
>    when doing `M-x javascript-mode` (or other ways to enable this mode)?
> - Why run this hook *after* the mode's `:after-hook` and after
>    things like `after-change-major-mode-hook`?
> 
> I think it should remain the major mode's responsibility to decide which
> hooks it runs.

On one hand, this is an artefact of not implementing the 
language-classification inside define-derived-mode.

OTOH, the major mode can only run the language hook, I think, if any 
major mode can correspond only to one language. Though I suppose if 
set-auto-mode-0 saves the currently "detected" language somewhere, the 
major mode definitions could pick it up and call the corresponding hook.

Hmm, perhaps in that case the major modes won't need any special 
attribute in their definitions (to specify their language): any major 
mode would run <lang>-language-hook where <lang> is the language 
detected for the buffer or assigned explicitly.

>> +(defun buffer-language ()
>> +  "Return the language of the current buffer.
>> +A language is a lowercase keyword with the name of the language."
>> +  ;; Alternatively, we could go through all the matchers in
>> +  ;; auto-mode-alist, interpreter-mode-alist,
>> +  ;; magic-fallback-mode-alist here, possibly using a cache keyed on
>> +  ;; buffer-file-name.  But that's probably an overkill: if the user
>> +  ;; changes the settings, they can call `M-x revert-buffer' at the end.
>> +  (if (keywordp (car set-auto-mode--last))
>> +      (car set-auto-mode--last)
>> +    ;; Backward compatibility.
>> +    (intern (format ":%s" (replace-regexp-in-string "\\(?:-ts\\)?-mode\\'" ""
>> +                                                    (symbol-name major-mode))))))
> 
> I'm not comfortable enshrining the "-ts-mode" convention here.

We can still go the "strict" approach, where when no language is 
assigned, we don't try to guess it.

> Also I think if we want a `buffer-language` function, it should not rely
> on how the mode was installed (e.g. `set-auto-mode--last`) but only on
> the major mode itself, i.e. something like
> 
>      (defun buffer-language ()
>        (or buffer-language

Where would the buffer-language variable be set, if not inside 
set-auto-mode-*?

>            (some heuristic based on major-mode and/or derived-modes)))

If we're sure we don't want several languages to be able to refer to the 
same major mode...

> [ Of course, I already mentioned that I also suspect that there can/will
>    be sometimes several languages (or none).  ]

I'm not clear on this. You mentioned complex cases - like an xml inside 
an archive? But depending on the usage, only one of the languages might 
be "active" at a given time. Or a combination of languages would simply 
be another language, basically.

A more specific scenario might clarify this better.

>> +(defun set-buffer-language (language)
>> +  "Set the language of the current buffer.
>> +And switch the major mode appropriately."
>> +  (interactive
>> +   (list (let* ((ct (mapcan
>> +                     (lambda (pair) (and (keywordp (car pair))
>> +                                    (list (symbol-name (car pair)))))
>> +                     major-mode-remap-alist))
>> +                (lang (completing-read "Language: " ct)))
>> +           (and lang (intern lang)))))
>> +  (set-auto-mode-0 language))
> 
> I see several issues with this function (name and implementation), but
> I wonder when we'd ever need such a thing.

It seemed like a missed opportunity not to provide a more high-level 
command to switch to a specific language for the buffer. E.g. how we 
sometimes use 'M-x foo-major-mode' when a file type's been misdetected, 
or the buffer is non-file-visiting (perhaps very temporary).

A command which does this with exhaustive completion across the 
configured languages seems handy. At least that's my impression from 
briefly testing it out.

Also, get-current-mode-for-language can be implemented in terms of 
set-buffer-language (see my earlier email to Joao).

>>   ;;;###autoload
>>   (dolist (name (list "node" "nodejs" "gjs" "rhino"))
>> -  (add-to-list 'interpreter-mode-alist (cons (purecopy name) 'js-mode)))
>> +  (add-to-list 'interpreter-mode-alist (cons (purecopy name) :js)))
> 
> BTW, my suggested patch basically proposes to use `<LANG>-mode` instead
> of `:<LANG>` which saves us from those changes since that matches our
> historical conventions.

<LANG>-mode is lexically indistinguishable from <NONLANG>-mode. If we 
used the names like <LANG>-lang, at least one could tell whether one of 
the parents of a given <foo>-mode is a language.

> Another issue I see if we don't use something like
> `derived-mode-add-parents` is that all the various places where we use
> mode-indexing, such as `.dir-locals.el`, `ffap`, YASnippet, etc... will
> need to be extended with a way to use "languages" as well, and then we
> also need to define a sane precedence between settings that apply to
> a given mode and settings that apply to a given language (setting for
> `js-ts-mode` should presumably take precedence over settings for
> `:js` which should take precedence over settings for `prog-mode`).

That's a good point: if "languages" as a separate notion gets added, it 
would make sense to use them in more places (not 100% necessary, but 
good for consistency). With the associated complexity that you mention.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Thu, 18 Jan 2024 14:18:02 GMT)

Message #385 received at 68246 <at> debbugs.gnu.org:

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, casouri <at> gmail.com,
 João Távora <joaotavora <at> gmail.com>,
 Stefan Kangas <stefankangas <at> gmail.com>
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Thu, 18 Jan 2024 09:17:15 -0500
> away).  The language-specific version of major-mode-remap-alist looks
> necessary after all.

It doesn't have to be specifically about languages.
It can just be a "default" set of "major mode" remappings.

>>> @@ -3206,10 +3209,10 @@ interpreter-mode-alist
>>>        ("emacs" . emacs-lisp-mode)))
>>>     "Alist mapping interpreter names to major modes.
>>>   This is used for files whose first lines match `auto-mode-interpreter-regexp'.
>>> -Each element looks like (REGEXP . MODE).
>>> +Each element looks like (REGEXP . MODE-OR-LANGUAGE).
>>>   If REGEXP matches the entire name (minus any directory part) of
>>>   the interpreter specified in the first line of a script, enable
>>> -major mode MODE.
>>> +MODE-OR-LANGUAGE.
>> There's a similar need for "content type" rather than "language".  If we
>> want to mention "language" we should also take the opportunity to
>> mention other related categorizations like "content type".
> Are "content type" and "language" going to be different things?
> They seem the same to me.

I think there's the same kind of difference between "language" and
"content type" as between "language" and "major mode" :-)

> OTOH, the major mode can only run the language hook, I think, if any major
> mode can correspond only to one language.

Not so.  A major mode can easily do

    (run-mode-hooks (compute-the-hook))

> Though I suppose if set-auto-mode-0 saves the currently "detected"
> language somewhere, the major mode definitions could pick it up and
> call the corresponding hook.

Major modes are not activated solely via `set-auto-mode-0`, so relying
on that is a crutch/hack, not something on which to base a design.

>> I'm not comfortable enshrining the "-ts-mode" convention here.
> We can still go the "strict" approach, where when no language is assigned,
> we don't try to guess it.

I think the `<LANG>-mode` heuristic is acceptable, because it's been
*the* convention used in Emacs.

>> Also I think if we want a `buffer-language` function, it should not rely
>> on how the mode was installed (e.g. `set-auto-mode--last`) but only on
>> the major mode itself, i.e. something like
>>      (defun buffer-language ()
>>        (or buffer-language
> Where would the buffer-language variable be set, if not inside
> set-auto-mode-*?

In the major mode?

>>            (some heuristic based on major-mode and/or derived-modes)))
> If we're sure we don't want several languages to be able to refer to the
> same major mode...

A major mode can

    (setq major-mode ...)

If/when such "generic" major modes become a thing, and the `(setq major-mode ...)`
hack becomes too inconvenient, we can devise a better solution
(e.g. extending/tweaking the way `derived-mode-*` work).

>> [ Of course, I already mentioned that I also suspect that there can/will
>>    be sometimes several languages (or none).  ]
> I'm not clear on this. You mentioned complex cases - like an xml inside an
> archive? But depending on the usage, only one of the languages might be
> "active" at a given time.

But depending on what "the language/type/mode" is used for, we may not
care really about which language/type/mode is "active" but about which
languages/types/modes are applicable (e.g. for `.dir-locals.el`).

>>> +(defun set-buffer-language (language)
>>> +  "Set the language of the current buffer.
>>> +And switch the major mode appropriately."
>>> +  (interactive
>>> +   (list (let* ((ct (mapcan
>>> +                     (lambda (pair) (and (keywordp (car pair))
>>> +                                    (list (symbol-name (car pair)))))
>>> +                     major-mode-remap-alist))
>>> +                (lang (completing-read "Language: " ct)))
>>> +           (and lang (intern lang)))))
>>> +  (set-auto-mode-0 language))
>> I see several issues with this function (name and implementation), but
>> I wonder when we'd ever need such a thing.
>
> It seemed like a missed opportunity not to provide a more high-level command
> to switch to a specific language for the buffer. E.g. how we sometimes use
> 'M-x foo-major-mode' when a file type's been misdetected, or the buffer is
> non-file-visiting (perhaps very temporary).
>
> A command which does this with exhaustive completion across the configured
> languages seems handy. At least that's my impression from briefly testing
> it out.

We can do the same with major modes, of course (just `mapatoms` and filter out
the non-major modes), so feel free to add such a command, but it doesn't
seem like offering it for "languages" is particularly more useful than
offering it for "major modes".
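
As a rough sketch of such a command for major modes: the helper below
walks the obarray with `mapatoms` and keeps commands that carry
a `derived-mode-parent` property.  The names `demo-derived-major-modes`
and `demo-switch-major-mode` are made up for illustration, and the filter
is only a heuristic: it misses modes defined without a parent, and
probably also modes whose defining file has not been loaded yet.

```elisp
(defun demo-derived-major-modes ()
  "Return a sorted list of names of loaded derived major modes.
Heuristic: a symbol counts if it is a command and has a
`derived-mode-parent' property, as set by `define-derived-mode'."
  (let (modes)
    (mapatoms
     (lambda (sym)
       (when (and (commandp sym)
                  (get sym 'derived-mode-parent))
         (push (symbol-name sym) modes))))
    (sort modes #'string<)))

(defun demo-switch-major-mode (mode)
  "Switch the current buffer to major mode MODE, read with completion."
  (interactive
   (list (intern (completing-read "Major mode: "
                                  (demo-derived-major-modes) nil t))))
  (funcall mode))
```

With this loaded, `M-x demo-switch-major-mode` offers completion over the
modes found by the heuristic.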

> Also, get-current-mode-for-language can be implemented in terms of
> set-buffer-language (see my earlier email to Joao).

That seems to be a roundabout way to go about it.
`get-current-mode-for-language/type/mode` should be used by
`set-auto-mode` rather than other way around, no?

> <LANG>-mode is lexically indistinguishable from <NONLANG>-mode. If we used
> the names like <LANG>-lang, at least one could tell whether one of the
> parents of a given <foo>-mode is a language.

Other than for Eglot, where does this distinction matter?

>> Another issue I see if we don't use something like
>> `derived-mode-add-parents` is that all the various places where we use
>> mode-indexing, such as `.dir-locals.el`, `ffap`, YASnippet, etc... will
>> need to be extended with a way to use "languages" as well, and then we
>> also need to define a sane precedence between settings that apply to
>> a given mode and settings that apply to a given language (setting for
>> `js-ts-mode` should presumably take precedence over settings for
>> `:js` which should take precedence over settings for `prog-mode`).
> That's a good point: if "languages" as a separate notion gets added, it
> would make sense to use them in more places (not 100% necessary, but good
> for consistency). With the associated complexity that you mention.

And if it's not merged into the same hierarchy as major modes, how do
you get `:js` (i.e. "language") to be sometimes higher-precedence and
sometimes lower precedence than a mode?


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Thu, 18 Jan 2024 19:56:01 GMT) Full text and rfc822 format available.

Message #388 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, casouri <at> gmail.com,
 João Távora <joaotavora <at> gmail.com>,
 Stefan Kangas <stefankangas <at> gmail.com>
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Thu, 18 Jan 2024 21:55:34 +0200
On 18/01/2024 16:17, Stefan Monnier wrote:
>> away).  The language-specific version of major-mode-remap-alist looks
>> necessary after all.
> 
> It doesn't have to be specifically about languages.
> It can just be a "default" set of "major mode" remappings.

Could be. And it's something we should probably add irrespective of the 
outcome of this discussion.

>>>> @@ -3206,10 +3209,10 @@ interpreter-mode-alist
>>>>         ("emacs" . emacs-lisp-mode)))
>>>>      "Alist mapping interpreter names to major modes.
>>>>    This is used for files whose first lines match `auto-mode-interpreter-regexp'.
>>>> -Each element looks like (REGEXP . MODE).
>>>> +Each element looks like (REGEXP . MODE-OR-LANGUAGE).
>>>>    If REGEXP matches the entire name (minus any directory part) of
>>>>    the interpreter specified in the first line of a script, enable
>>>> -major mode MODE.
>>>> +MODE-OR-LANGUAGE.
>>> There's a similar need for "content type" rather than "language".  If we
>>> want to mention "language" we should also take the opportunity to
>>> mention other related categorizations like "content type".
>> Are "content type" and "language" going to be different things?
>> They seem the same to me.
> 
> I think there's the same kind of difference between "language" and
> "content type" as between "language" and "major mode" :-)

A "content type" could be serviced by multiple "languages"? Still not 
sure how that would work. I mean, we could have content-type like 
text/html or application/json, but neither splits into two languages, 
really.

>> OTOH, the major mode can only run the language hook, I think, if any major
>> mode can correspond only to one language.
> 
> Not so.  A major mode can easily do
> 
>      (run-mode-hooks (compute-the-hook))

I guess that would mean that the language hook is not run automatically, 
that each major mode would need explicit code to compute it and run.

>> Though I suppose if set-auto-mode-0 saves the currently "detected"
>> language somewhere, the major mode definitions could pick it up and
>> call the corresponding hook.
> 
> Major modes are not activated solely via `set-auto-mode-0`, so relying
> on that is a crutch/hack, not something on which to base a design.

The major mode could compute which language it is for. But the algorithm 
could be undecidable if the buffer is not visiting a file yet, doesn't 
have an interpreter comment, etc. That's where the command 
set-buffer-language was supposed to come in handy.

>>> I'm not comfortable enshrining the "-ts-mode" convention here.
>> We can still go the "strict" approach, where when no language is assigned,
>> we don't try to guess it.
> 
> I think the `<LANG>-mode` heuristic is acceptable, because it's been
> *the* convention used in Emacs.

We are now getting a whole set of new modes for which this heuristic 
isn't going to work (the tree-sitter based ones), and that list will 
grow. Perhaps it would be more consistent to drop the heuristic if we 
don't manage to make it work, somehow, for both kinds of modes.

>>> Also I think if we want a `buffer-language` function, it should not rely
>>> on how the mode was installed (e.g. `set-auto-mode--last`) but only on
>>> the major mode itself, i.e. something like
>>>       (defun buffer-language ()
>>>         (or buffer-language
>> Where would the buffer-language variable be set, if not inside
>> set-auto-mode-*?
> 
> In the major mode?

Then perhaps we won't need the fallbacks (the part that comes after 
'or') - the major mode's setting of the language could perform those 
"heuristic based" computations as well.

>>>             (some heuristic based on major-mode and/or derived-modes)))
>> If we're sure we don't want several languages to be able to refer to the
>> same major mode...
> 
> A major mode can
> 
>      (setq major-mode ...)
> 
> If/when such "generic" major modes become a thing, and the `(setq major-mode ...)`
> hack becomes too inconvenient, we can devise a better solution
> (e.g. extending/tweaking the way `derived-mode-*` work).

The major-mode could be fundamental-mode. If the language were to be 
specifiable through settings external to major modes, we could still do 
useful things while in fundamental-mode (e.g. do some useful editing 
with Eglot, provided it supports indentation and completion), or suggest 
which major modes to install from ELPA.

>>> [ Of course, I already mentioned that I also suspect that there can/will
>>>     be sometimes several languages (or none).  ]
>> I'm not clear on this. You mentioned complex cases - like an xml inside an
>> archive? But depending on the usage, only one of the languages might be
>> "active" at a given time.
> 
> But depending on what "the language/type/mode" is used for, we may not
> care really about which language/type/mode is "active" but about which
> languages/types/modes are applicable (e.g. for `.dir-locals.el`).

Would we really care that an xml file inside an archive is applied both 
archive-subfile-mode and xml-mode dir-locals settings? Offhand, I would 
really expect the xml-mode settings only. Though the former could be a 
nice bonus in rare cases.

Perhaps dir-locals.el could get a syntax for specifying variables when 
specific minor modes are enabled as well.

>>>> +(defun set-buffer-language (language)
>>>> +  "Set the language of the current buffer.
>>>> +And switch the major mode appropriately."
>>>> +  (interactive
>>>> +   (list (let* ((ct (mapcan
>>>> +                     (lambda (pair) (and (keywordp (car pair))
>>>> +                                    (list (symbol-name (car pair)))))
>>>> +                     major-mode-remap-alist))
>>>> +                (lang (completing-read "Language: " ct)))
>>>> +           (and lang (intern lang)))))
>>>> +  (set-auto-mode-0 language))
>>> I see several issues with this function (name and implementation), but
>>> I wonder when we'd ever need such a thing.
>>
>> It seemed like a missed opportunity not to provide a more high-level command
>> to switch to a specific language for the buffer. E.g. how we sometimes use
>> 'M-x foo-major-mode' when a file type's been misdetected, or the buffer is
>> non-file-visiting (perhaps very temporary).
>>
>> A command which does this with exhaustive completion across the configured
>> languages seems handy. At least that's my impression from briefly testing
>> it out.
> 
> We can do the same with major modes, of course (just `mapatoms` and filter out
> the non-major modes), so feel free to add such a command, but it doesn't
> seem like offering it for "languages" is particularly more useful than
> offering it for "major modes".

If modes are annotated with their languages, the result could be almost 
as handy indeed, so maybe I will add such command, later.

>> Also, get-current-mode-for-language can be implemented in terms of
>> set-buffer-language (see my earlier email to Joao).
> 
> That seems to be a roundabout way to go about it.
> `get-current-mode-for-language/type/mode` should be used by
> `set-auto-mode` rather than other way around, no?

If the major modes decide the language, and if we don't mind that this 
won't work without an installed/available major mode, yes.

>> <LANG>-mode is lexically indistinguishable from <NONLANG>-mode. If we used
>> the names like <LANG>-lang, at least one could tell whether one of the
>> parents of a given <foo>-mode is a language.
> 
> Other than for Eglot, where does this distinction matter?

I suppose it comes down to the ease of implementing interaction with any 
external tools that need to be passed a language name.

If a function get-language-for-mode is possible to implement, then you 
only need to store the mapping 
language->language-name-spelled-in-specific-way for a number of 
exceptions, whereas if instead of get-language-for-mode you only have 
the full hierarchy of modes, then the mode->correct-spelling will likely 
need to be exhaustive in all cases, in order not to match any parent 
modes (e.g. prog-mode) that don't denote a language. And when such 
mappings have to be exhaustive, support for any new language would also 
need to be done explicitly in all cases.

Eglot would need explicit mappings either way because the name of the 
language server program is always different (though they would be 
simplified), but something like 'rg -t js Foo' only needs the language name.
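
To make the point about exceptions concrete, here is a minimal sketch.
It assumes that some way to obtain a language symbol (such as `js`)
already exists, which nothing in this thread provides yet.  The names
`demo-rg-type-exceptions` and `demo-rg-type` are hypothetical.

```elisp
(defvar demo-rg-type-exceptions '((c++ . "cpp"))
  "Hypothetical alist of languages whose ripgrep type differs from their name.")

(defun demo-rg-type (language)
  "Return the ripgrep type name for LANGUAGE, a symbol such as `js'.
Only the exceptions need an entry; other languages use their own name."
  (or (alist-get language demo-rg-type-exceptions)
      (symbol-name language)))
```

A caller could then build the command line with something like
(format "rg -t %s Foo" (demo-rg-type 'js)), and a new language would
need no entry unless its spelling differs.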

>>> Another issue I see if we don't use something like
>>> `derived-mode-add-parents` is that all the various places where we use
>>> mode-indexing, such as `.dir-locals.el`, `ffap`, YASnippet, etc... will
>>> need to be extended with a way to use "languages" as well, and then we
>>> also need to define a sane precedence between settings that apply to
>>> a given mode and settings that apply to a given language (setting for
>>> `js-ts-mode` should presumably take precedence over settings for
>>> `:js` which should take precedence over settings for `prog-mode`).
>> That's a good point: if "languages" as a separate notion gets added, it
>> would make sense to use them in more places (not 100% necessary, but good
>> for consistency). With the associated complexity that you mention.
> 
> And if it's not merged into the same hierarchy as major modes, how do
> you get `:js` (i.e. "language") to be sometimes higher-precedence and
> sometimes lower precedence than a mode?

I'm not sure it's a requirement. If we decide that languages are "above" 
major modes, it would make just as much sense to first apply language 
settings, and then those for the major mode and its parents. Even though 
a language is often more specific than prog-mode. It can be different 
for other hierarchies (e.g. js-base-mode would be "below" language). We 
wouldn't want to specify the parent for each language as well, right?

E.g. I suppose if js-language was a major mode which also inherits from 
prog-mode, a priority resolution algorithm could then decide that it's 
also lesser priority when applying local variables for any modes which 
add js-language as its extra parent. But that seems like more work (both 
for the writers to implement and for the users to understand) for 
relatively minor gain.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Thu, 18 Jan 2024 21:26:01 GMT) Full text and rfc822 format available.

Message #391 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, casouri <at> gmail.com,
 João Távora <joaotavora <at> gmail.com>,
 Stefan Kangas <stefankangas <at> gmail.com>
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Thu, 18 Jan 2024 16:24:46 -0500
> Still not sure how that would work.  I mean, we could have content-type
> like text/html or application/json, but neither splits into two
> languages, really.

Not sure what you mean by "split", but just as with major modes and
"languages", MIME content-types have inclusion properties, such as

    application/atom+xml ⊂ application/xml ⊂ text/plain

>>> OTOH, the major mode can only run the language hook, I think, if any major
>>> mode can correspond only to one language.
>> Not so.  A major mode can easily do
>>      (run-mode-hooks (compute-the-hook))
> I guess that would mean that the language hook is not run automatically,
> that each major mode would need explicit code to compute it and run.

Not necessarily, e.g. you could specify the language to
`define-derived-mode` with something like

    :language (compute-the-language)

and then have `define-derived-mode` compute the hook name from that.
This said, I suspect that generic major modes supporting many languages
will not be very numerous (after all, that's the point of being generic,
no?), so it should be OK if they have to do some things manually.
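
A minimal sketch of the manual route, under stated assumptions: there is
no `:language` keyword in `define-derived-mode` today, so the mode body
computes the hook itself.  `demo-language-function`,
`demo-language-hook-symbol`, the `demo-LANG-language-hook` naming scheme
and `demo-generic-mode` are all hypothetical.

```elisp
(defvar demo-language-function (lambda () 'js)
  "Hypothetical function returning the language symbol of the current buffer.")

(defun demo-language-hook-symbol (language)
  "Return the hook symbol for LANGUAGE, e.g. `demo-js-language-hook'."
  (intern (format "demo-%s-language-hook" language)))

(define-derived-mode demo-generic-mode prog-mode "Demo"
  "Hypothetical language-generic major mode."
  ;; Inside the body the mode hooks are delayed, so this hook actually
  ;; runs together with the other mode hooks at the end of activation.
  (run-mode-hooks
   (demo-language-hook-symbol (funcall demo-language-function))))
```

A user would then customize the language rather than the mode, e.g. with
(add-hook 'demo-js-language-hook #'my-js-setup), where `my-js-setup` is
whatever function they want to run.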

>>> Though I suppose if set-auto-mode-0 saves the currently "detected"
>>> language somewhere, the major mode definitions could pick it up and
>>> call the corresponding hook.
>> Major modes are not activated solely via `set-auto-mode-0`, so relying
>> on that is a crutch/hack, not something on which to base a design.
> The major mode could compute which language it is for. But the algorithm
> could be undecidable if the buffer is not visiting a file yet, doesn't have
> an interpreter comment, etc. That's where the command set-buffer-language
> was supposed to come in handy.

That still doesn't justify the major mode relying on `set-auto-mode-0`.

AFAICT you seem to want to standardize how the user controls the language of
language-generic major modes.  I'm not sure we need such a standard.
Do we even have such a generic major mode yet?

>>>> I'm not comfortable enshrining the "-ts-mode" convention here.
>>> We can still go the "strict" approach, where when no language is assigned,
>>> we don't try to guess it.
>> I think the `<LANG>-mode` heuristic is acceptable, because it's been
>> *the* convention used in Emacs.
> We are now getting a whole set of new modes for which this heuristic isn't
> going to work

Cue the patch I submitted when I opened this bug report 🙂
Now `<LANG>-mode` is again included in `derived-mode-all-parents` for
those new modes.

Admittedly, it doesn't fully give a solution to the problem of computing
"the" language of a buffer.  But that gets us back to one of my recent
questions: beside Eglot, which other package needs that?  Is "the"
language always unique and always the same for all those packages?
Is it really the only thing those packages need?

In the case of Eglot, at least it doesn't seem to be the case: we don't
just need the language, but also the name of the language server to use.
And for some buffers there can be several applicable language servers,
and they don't necessarily all accept the same language.

So we need either the major mode to provide the name of the server, or
a central database that maps from language/type/mode to server name.
In both cases, adding the language info to the server name is
a non-issue.  And in neither case is it necessary to know "the" language
in order to find the server.  My patch makes the central database
work better.

> The major-mode could be fundamental-mode. If the language were to be
> specifiable through settings external to major modes, we could still do
> useful things while in fundamental-mode (e.g. do some useful editing with
> Eglot, provided it supports indentation and completion), or suggest which
> major modes to install from ELPA.

I don't see the interest of using specifically `fundamental-mode` for
that.  In any case, this seems too hypothetical at this stage to have
a good idea of what we'd need in such circumstances.

> Would we really care that an xml file inside an archive is applied both
> archive-subfile-mode and xml-mode dir-locals settings?

No, I wasn't thinking of XML files inside archives, but about files
which are both archives and something else (e.g. ODT).  The same applies
for most other "generic" data containers, like XML and JSON.

> Perhaps dir-locals.el could get a syntax for specifying variables when
> specific minor modes are enabled as well.

Or we could do something like

    (defun derived-mode-p (modes)
      (provided-mode-derived-p (or (funcall major-mode-function) major-mode)
                                modes))

or

    (defun derived-mode-current-all-parents ()
      (or (funcall major-mode-all-parents-function)
          (derived-mode-all-parents major-mode)))

So your XML major mode can indicate that the current buffer is actually
in `xml+atom-mode` (which is a child of `xml-mode`).

Anyway, as I mentioned elsewhere, I think this discussion of "languages"
is only tangentially related to my proposed patch.  There is some
overlap, but they serve different purposes, and they're not
mutually exclusive.


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Fri, 19 Jan 2024 01:29:02 GMT) Full text and rfc822 format available.

Message #394 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, casouri <at> gmail.com,
 João Távora <joaotavora <at> gmail.com>,
 Stefan Kangas <stefankangas <at> gmail.com>
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Fri, 19 Jan 2024 03:28:16 +0200
On 18/01/2024 23:24, Stefan Monnier wrote:
>> Still not sure how that would work.  I mean, we could have content-type
>> like text/html or application/json, but neither splits into two
>> languages, really.
> 
> Not sure what you mean by "split", but just as with major modes and
> "languages", MIME content-types have inclusion properties, such as
> 
>      application/atom+xml ⊂ application/xml ⊂ text/plain

I meant to try to clarify your meaning when you said that content-types 
are to languages the same as what languages are to major modes.

That might mean that a content-type corresponds to a number of 
languages, just like a language corresponds to a number (open set) of 
major modes. But I don't see how. Please enlighten.

Speaking of the relation of inclusion, as I said before we might not 
want to make languages hierarchical (even though it might help for 
certain cases), because the relation might not be universal across uses.

And if you had the idea of expressing it through major mode inheritance 
as well, it's likely to have adverse effects, something like:

  (derived-mode-add-parents 'c++-ts-mode 'c++-lang)
  +
  (derived-mode-add-parents 'c++-lang 'c-lang)
  =
  (provided-mode-derived-p 'c++-ts-mode 'c-lang)

Which can lead some callers to decide that c++-ts-mode's language is C.
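
The same chain can be tried with placeholder symbols.  This sketch
assumes an Emacs where `derived-mode-add-parents` is available and takes
a list of extra parents (which I believe is the form it has in Emacs 30);
the `demo-` names are hypothetical and need not be defined as modes.

```elisp
(when (fboundp 'derived-mode-add-parents)
  ;; The tree-sitter mode gets a "language" as an extra parent...
  (derived-mode-add-parents 'demo-c++-ts-mode '(demo-c++-lang))
  ;; ...and that language is declared to derive from another language.
  (derived-mode-add-parents 'demo-c++-lang '(demo-c-lang))
  ;; Non-nil: the C "language" is now among the mode's ancestors.
  (provided-mode-derived-p 'demo-c++-ts-mode 'demo-c-lang))
```

A caller that treats any `-lang` ancestor as "the" language would have to
pick between two candidates here.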

>>>> OTOH, the major mode can only run the language hook, I think, if any major
>>>> mode can correspond only to one language.
>>> Not so.  A major mode can easily do
>>>       (run-mode-hooks (compute-the-hook))
>> I guess that would mean that the language hook is not run automatically,
>> that each major mode would need explicit code to compute it and run.
> 
> Not necessarily, e.g. you could specify the language to
> `define-derived-mode` with something like
> 
>      :language (compute-the-language)
> 
> and then have `define-derived-mode` compute the hook name from that.
> This said, I suspect that generic major modes supporting many languages
> will not be very numerous (after all, that's the point of being generic,
> no?), so it should be OK if they have to do some things manually.

Okay.

The side-effect of this approach is that we basically declare a mode's 
language twice: once in the attribute above, and once in the 
major-mode-remap-alist which is put into autoloads. But it's probably 
minor enough.

And if languages are distinct from major modes in naming, the :language 
attribute in define-derived-mode could make it run the corresponding 
hook at the end. Which seems good.

>>>> Though I suppose if set-auto-mode-0 saves the currently "detected"
>>>> language somewhere, the major mode definitions could pick it up and
>>>> call the corresponding hook.
>>> Major modes are not activated solely via `set-auto-mode-0`, so relying
>>> on that is a crutch/hack, not something on which to base a design.
>> The major mode could compute which language it is for. But the algorithm
>> could be undecidable if the buffer is not visiting a file yet, doesn't have
>> an interpreter comment, etc. That's where the command set-buffer-language
>> was supposed to come in handy.
> 
> That still doesn't justify the major mode relying on `set-auto-mode-0`.
> 
> AFAICT you seem to want to standardize how the user controls the language of
> language-generic major modes.  I'm not sure we need such a standard.
> Do we even have such a generic major mode yet?

In my picture that was just the natural conclusion. What I was trying to 
do, is put a level of control above the major modes - the mapping from 
languages to modes, and make it more easy to control and configurable.

It didn't seem that the presence of a major mode was required to detect 
the expected language, hence the addition of the new values. At that 
point it seemed natural to both allow the absence of a configured major 
mode (why not), and to run the language hook anyway, for reliability.

If we really don't need any of that, then the auto-mode-alist and the 
companion vars don't even have to change, and the only place where the 
language name could feature (aside from the code looking it up), is the 
mode definitions.

>>>>> I'm not comfortable enshrining the "-ts-mode" convention here.
>>>> We can still go the "strict" approach, where when no language is assigned,
>>>> we don't try to guess it.
>>> I think the `<LANG>-mode` heuristic is acceptable, because it's been
>>> *the* convention used in Emacs.
>> We are now getting a whole set of new modes for which this heuristic isn't
>> going to work
> 
> Cue the patch I submitted when I opened this bug report 🙂
> Now `<LANG>-mode` is again included in `derived-mode-all-parents` for
> those new modes.

If the language is called <LANG>-lang instead (or without a suffix), then 
the major mode could also run the language-specific hook, which in your 
patch it cannot do.

> Admittedly, it doesn't fully give a solution to the problem of computing
> "the" language of a buffer.  But that gets us back to one of my recent
> questions: beside Eglot, which other package needs that?  Is "the"
> language always unique and always the same for all those packages?
> Is it really the only thing those packages need?
> 
> In the case of Eglot, at least it doesn't seem to be the case: we don't
> just need the language, but also the name of the language server to use.
> And for some buffers there can be several applicable language servers,
> and they don't necessarily all accept the same language.
> 
> So we need either the major mode to provide the name of the server, or
> a central database that maps from language/type/mode to server name.
> In both cases, adding the language info to the server name is
> a non-issue.  And in neither case is it necessary to know "the" language
> in order to find the server.  My patch makes the central database
> work better.

I think I've included some thoughts on this subject in my previous 
email. They don't seem to be quoted/commented on here.

>> The major-mode could be fundamental-mode. If the language were to be
>> specifiable through settings external to major modes, we could still do
>> useful things while in fundamental-mode (e.g. do some useful editing with
>> Eglot, provided it supports indentation and completion), or suggest which
>> major modes to install from ELPA.
> 
> I don't see the interest of using specifically `fundamental-mode` for
> that.  In any case, this seems too hypothetical at this stage to have
> a good idea of what we'd need in such circumstances.

The latter feature (suggest which major modes to install) has come up 
recently. It's not that difficult to implement (with a whitelist of 
packages), and fundamental-mode is most likely *the* major mode which 
would be used until the suitable major mode is installed.

>> Would we really care that an xml file inside an archive is applied both
>> archive-subfile-mode and xml-mode dir-locals settings?
> 
> No, I wasn't thinking of XML files inside archives, but about files
> which are both archives and something else (e.g. ODT).  The same applies
> for most other "generic" data containers, like XML and JSON.

Okay, ODT. Which we can view with either doc-view-mode or xml-mode. 
Languages :doc or :xml. We configure one of these languages to be used 
by default, and switch to another at will.

Not sure it's useful to consider both modes somehow active at the same time.

Although this example does underscore the problem of major modes needing 
to be able to specify/change the current language themselves. At least 
if we don't want such modes as doc-view to have to be rewritten.

On the third hand, external tools (lsp servers, ripgrep, etc) will view 
such files as a certain type only - just ODT. Which might do us a 
disservice if the current detected language changes as we change the 
major mode. Hmm. And since xml-mode itself doesn't know ODT, it won't be 
able to "compute" that language value either (same would likely be true 
for other "container" modes).

> Anyway, as I mentioned elsewhere, I think this discussion of "languages"
> is only tangentially related to my proposed patch.  There is some
> overlap, but they serve different purposes, and they're not
> mutually exclusive.

I think the "languages" feature seems to cover the same functionality as 
your patch, and then some. Although at the expense of the downstream 
callers having to use the new feature, rather than having things work 
"automagically" (as soon as they stop supporting Emacs 29.1, that is).




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Fri, 19 Jan 2024 05:14:02 GMT) Full text and rfc822 format available.

Message #397 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Yuan Fu <casouri <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>,
 Stefan Kangas <stefankangas <at> gmail.com>, joaotavora <at> gmail.com
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Thu, 18 Jan 2024 21:12:43 -0800

> On Jan 15, 2024, at 6:32 PM, Stefan Monnier <monnier <at> iro.umontreal.ca> wrote:
> 
>>> Please don't call it "language".  That'd be confusing.  LSP is about
>>> programming languages, so "language" is natural there.  But in Emacs,
>>> a major mode is more general than that.  For example, it is not
>>> unthinkable to consider mail-mode to be the extra-parent of
>>> message-mode (or vice versa) -- but what is the "language" in that
>>> case?
>> Isn't the language for such modes in this paradigm just the empty set?
> 
> I'm not too worried about those cases, indeed.
> I'm more worried about the taxonomy of languages.
> We currently have the taxonomy of major modes, with which we're pretty
> familiar, and we've had many years to learn about its downsides,
> complexity, as well as how to deal with them, but for languages we're
> only familiar with the easy cases, which makes us judge the idea in
> a way that may prove naive.

I don’t have anything insightful to contribute, but want to point out that in Emacs, “language” doesn’t always mean programming language. “Language” can also mean Chinese, English, etc, and Emacs is quite often used for editing natural language text. So it warrants some caution when using “language” to mean programming language specifically.

> 
> IME, deciding what is the type of the content of a buffer is usually
> trivial but with some notable caveats, such as XPM or Postscript files,
> or "container formats" (like `.deb` or `.odt`, as well as things like
> DocBook which can be considered either as their own format or as XML),
> or "sublanguages" such as C being a subset of C++, or Javascript being
> a subset of Typescript.  And I suspect the info we need will not always
> be quite the same.
> 
> So while there might be a good case to be made to add some API functions
> to query the language/type(s) of a given buffer (I'm not sure we'd need
> the language of a given major mode, OTOH), or to find the preferred
> mode(s) for a given language/type, I think it's worthwhile to try and
> tweak our major mode taxonomy because it is information we must have
> and information we know we will always have, so we should strive to make
> it as good as we can.
> 
> It shouldn't make it any harder to add language/type API functionality.
> On the contrary it should make it easier.
> 
> [ As suggested elsewhere in this thread, we could even try and merge
>  those taxonomies, e.g. using extra parents of the form `LANG-lang`.  ]
> 
> As I said at the very beginning of this long thread, I'm not completely
> sure how well my proposal will play out: the upsides are in plain sight,
> but it may bump into real problems.  [ I'm actually surprised by Eli's
> optimism about it 🙂 ]
> But we won't know until we try it.
> 
> 
>        Stefan
> 





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Fri, 19 Jan 2024 12:45:01 GMT) Full text and rfc822 format available.

Message #400 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, casouri <at> gmail.com,
 João Távora <joaotavora <at> gmail.com>,
 Stefan Kangas <stefankangas <at> gmail.com>
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Fri, 19 Jan 2024 07:43:48 -0500
> That might mean that a content-type corresponds to a number of languages,
> just like a language corresponds to a number (open set) of major modes. But
> I don't see how. Please enlighten.

All three are taxonomies that are related to the content of the buffer.
They are almost identical in general, but differ in details because
taxonomies are not an exact science and those three have each been
defined separately, so the arbitrary decisions that are involved in
making a taxonomy have not been the same.

> The side-effect of this approach is that we basically declare a mode's
> language twice: once in the attribute above, and once in the
> major-mode-remap-alist which is put into autoloads. But it's probably
> minor enough.

Again: not necessarily.  You're making assumptions about what the source
code will look like, but we get to decide what the source code looks
like by defining functions/macros.  Even if the information is stored in
a redundant way, that doesn't mean the source of that information can't
be the same.

So if/when such a duplication proves to be a problem, I can't see why it
would be difficult to fix it.

>> Cue the patch I submitted when I opened this bug report 🙂
>> Now `<LANG>-mode` is again included in `derived-mode-all-parents` for
>> those new modes.
>
> If the language is called <LANG>-lang instead (of without suffix), then the
> major mode could also run the language-specific hook, which in your patch it
> cannot do.

I don't follow: why would the name of the mode (and hence hook) make it
harder/easier/possible to run the hook?

> I think I've included some thoughts on this subject in my previous
> email. They don't seem to be quoted/commented on here.

I didn't have anything to comment on them :-)

>>> The major-mode could be fundamental-mode. If the language were to be
>>> specifiable through settings external to major modes, we could still do
>>> useful things while in fundamental-mode (e.g. do some useful editing with
>>> Eglot, provided it supports indentation and completion), or suggest which
>>> major modes to install from ELPA.
>> I don't see the interest of using specifically `fundamental-mode` for
>> that.  In any case, this seems too hypothetical at this stage to have
>> a good idea of what we'd need in such circumstances.
> The latter feature (suggest which major modes to install) has come up
> recently. It's not that difficult to implement (with a whitelist of
> packages),

I'm with you so far (my `gnu-elpa` package intended to provide
a possible solution for that).

> and fundamental-mode is most likely *the* major mode which would
> be used until the suitable major mode is installed.

Here I don't see it.

>> Anyway, as I mentioned elsewhere, I think this discussion of "languages"
>> is only tangentially related to my proposed patch.  There is some
>> overlap, but they serve different purposes, and they're not
>> mutually exclusive.
> I think the "languages" feature seems to cover the same functionality as
> your patch,

In the longer term, there might be a fair bit of overlap, yes, tho it
all depends on how your proposal works out in the end.
My patch is a short term solution with no new API.


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Fri, 19 Jan 2024 12:54:02 GMT) Full text and rfc822 format available.

Message #403 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Dmitry Gutov <dmitry <at> gutov.dev>, Eli Zaretskii <eliz <at> gnu.org>,
 casouri <at> gmail.com, Stefan Kangas <stefankangas <at> gmail.com>,
 68246 <at> debbugs.gnu.org
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Fri, 19 Jan 2024 12:53:22 +0000
On Fri, Jan 19, 2024 at 12:43 PM Stefan Monnier
<monnier <at> iro.umontreal.ca> wrote:

> In the longer term, there might be a fair bit of overlap, yes, tho it
> all depends on how your proposal works out in the end.
> My patch is a short term solution with no new API.

Problem is, as you know, there's nothing more permanent than
a short-term solution.

Now, if your patch is expressed carefully in terms of
some concepts we could save our skins, so to speak:

instead of directly:

  (derived-mode-add-parents 'foo-ts-mode '(foo-mode))

Why don't we

  (set-super-special-stefan-parent 'foo-ts-mode 'foo)

?

Then

   (provided-mode-derived-p 'foo-ts-mode '(foo-mode))

should always be true because high-minded conceptual reasons
impeccably explained somewhere :-)

Then, we can later edit the implementation of
set-super-special-stefan-parent to accommodate for 'foo'
being a language, the "preferred" or "main" language of
whatever mode is given as the first argument.

João
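
The wrapper idea above can be sketched as follows. This is only an illustration under stated assumptions, not code from any patch in this thread: every `demo-` name is hypothetical, the wrapper simply maps a language symbol LANG to the mode symbol `LANG-mode`, and it needs Emacs 30 or later for `derived-mode-add-parents`.

```elisp
;; Requires Emacs 30+ for `derived-mode-add-parents'.
;; Every `demo-' name below is hypothetical, for illustration only.
(define-derived-mode demo-foo-mode prog-mode "Foo"
  "Hypothetical non-tree-sitter mode for the Foo language.")

(define-derived-mode demo-foo-ts-mode prog-mode "Foo[ts]"
  "Hypothetical tree-sitter mode for the Foo language.")

(defun demo-set-content-parent (mode lang)
  "Declare the mode named LANG-mode as an extra parent of MODE.
Callers only name the language; how that is recorded stays an
implementation detail that could change later."
  (derived-mode-add-parents
   mode (list (intern (format "%s-mode" lang)))))

(demo-set-content-parent 'demo-foo-ts-mode 'demo-foo)

;; Non-nil once the extra parent has been declared:
(provided-mode-derived-p 'demo-foo-ts-mode '(demo-foo-mode))
```

Callers would go through the wrapper rather than calling `derived-mode-add-parents` directly, so a later change of representation would not touch them.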




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Fri, 19 Jan 2024 13:20:01 GMT) Full text and rfc822 format available.

Message #406 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: João Távora <joaotavora <at> gmail.com>
Cc: Dmitry Gutov <dmitry <at> gutov.dev>, Eli Zaretskii <eliz <at> gnu.org>,
 casouri <at> gmail.com, Stefan Kangas <stefankangas <at> gmail.com>,
 68246 <at> debbugs.gnu.org
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Fri, 19 Jan 2024 08:19:33 -0500
>> In the longer term, there might be a fair bit of overlap, yes, tho it
>> all depends on how your proposal works out in the end.
>> My patch is a short term solution with no new API.
> Problem is, as you know, there's nothing more permanent than
> a short-term solution.

It's not meant to be temporary, tho, so it's not a problem.
We may need to undo it if it encounters problems, of course.


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Fri, 19 Jan 2024 14:03:02 GMT) Full text and rfc822 format available.

Message #409 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Dmitry Gutov <dmitry <at> gutov.dev>, Eli Zaretskii <eliz <at> gnu.org>,
 casouri <at> gmail.com, Stefan Kangas <stefankangas <at> gmail.com>,
 68246 <at> debbugs.gnu.org
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Fri, 19 Jan 2024 14:01:39 +0000
On Fri, Jan 19, 2024 at 1:19 PM Stefan Monnier <monnier <at> iro.umontreal.ca> wrote:
>
> >> In the longer term, there might be a fair bit of overlap, yes, tho it
> >> all depends on how your proposal works out in the end.
> >> My patch is a short term solution with no new API.
> > Problem is, as you know, there's nothing more permanent than
> > a short-term solution.
>
> It's not meant to be temporary, tho, so it's not a problem.

All the problems it's solving already have solutions,
just as short-term and effective as the patch would be.

Problem is that patch is, or seems to be, incompatible with
all the other things  we're discussing.  I.e. can you clearly
paint a way forward from it that solves the "what is this
language in this buffer", "what is the language for this
major-mode" and the "what mode should I use for that language"
problems?

João




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Fri, 19 Jan 2024 18:06:02 GMT) Full text and rfc822 format available.

Message #412 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: João Távora <joaotavora <at> gmail.com>
Cc: Dmitry Gutov <dmitry <at> gutov.dev>, Eli Zaretskii <eliz <at> gnu.org>,
 casouri <at> gmail.com, Stefan Kangas <stefankangas <at> gmail.com>,
 68246 <at> debbugs.gnu.org
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Fri, 19 Jan 2024 13:05:08 -0500
> Problem is that patch is, or seems to be, incompatible with
> all the other things  we're discussing.

Do you have some example of incompatibility it would introduce with the
"other things we're discussing"?

> I.e. can you clearly paint a way forward from it that solves the "what
> is this language in this buffer", "what is the language for this
> major-mode" and the "what mode should I use for that
> language" problems?

As already explained ad-nauseam, my patch has no intention to solve
those problems, but I can't see any way in which it gets in the way of
solving them, if you care to solve them.


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Fri, 19 Jan 2024 22:48:02 GMT) Full text and rfc822 format available.

Message #415 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Dmitry Gutov <dmitry <at> gutov.dev>, Eli Zaretskii <eliz <at> gnu.org>,
 casouri <at> gmail.com, Stefan Kangas <stefankangas <at> gmail.com>,
 68246 <at> debbugs.gnu.org
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Fri, 19 Jan 2024 22:47:07 +0000
On Fri, Jan 19, 2024 at 6:05 PM Stefan Monnier <monnier <at> iro.umontreal.ca> wrote:

> > I.e. can you clearly paint a way forward from it that solves the "what
> > is this language in this buffer", "what is the language for this
> > major-mode" and the "what mode should I use for that
> > language" problems?
>
> As already explained ad-nauseam, my patch has no intention to solve
> those problems, but I can't see any way in which it gets in the way of
> solving them, if you care to solve them.

The patch is intended to solve problems in Eglot, Yasnippet,
ffap and so on. This problem always includes "what is this
language in this buffer", i.e. number 1 in the preceding list.
That's what the problem is, no less. The fact that these packages
have always resorted to using `derived-mode-p` to solve that
problem is an unfortunate consequence of the longstanding
conflation between modes and languages that you yourself
identified.

But it's not the right solution, never was. Your patch contributes
to the perpetuation of this "wrong" solution, and I think
we should face that frontally.

After it lands, derived-mode-p will cease to mean "A derived
from B via define-derived-mode, so you can trust hook for B
runs in hook for A and a lot of other things".  It will mean
something else.  Once that lands, it can never really safely
be rolled back.  So my position is:

* if it lands, we should document very well what that new meaning
  of "<lang>-mode" is.  Also make some "provided-mode-walk-parents"
  so that at least problem 2 can be solved, by string-matching
  the symbol name of what will now be an even more enshrined
  convention.  As to problem 3,   maybe, it can be written off to
  "major-mode-remap-alist" (which I doubt will ever see much
  adoption)

* if it doesn't land, we should look at some solution that solves
  1 2 and 3 cleanly.   I think Dmitry's patch is a decent start.

* in the meantime, we should continue using base modes as we
  already are.  In fact, the <foo>-base-mode convention is a
  much better convention to enshrine, it doesn't require any
  special caveats regarding hooks and dir-locals and changes
  to the *Help* of a major mode function description.
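
One possible reading of the string-matching idea in the first bullet, as a rough sketch only: `demo-mode-language` and the `demo-` modes are hypothetical names, the `FOO-ts-mode` / `FOO-mode` naming pattern is an assumption, and Emacs 30 or later is needed for `derived-mode-all-parents`.

```elisp
;; Requires Emacs 30+.  Every `demo-' name is hypothetical.
(define-derived-mode demo-baz-mode prog-mode "Baz")
(define-derived-mode demo-baz-ts-mode prog-mode "Baz[ts]")
(define-derived-mode demo-qux-ts-mode prog-mode "Qux[ts]")
(derived-mode-add-parents 'demo-baz-ts-mode '(demo-baz-mode))

(defun demo-mode-language (mode)
  "Guess the language of MODE from the LANG-ts-mode naming convention.
Return the symbol LANG if MODE is named LANG-ts-mode and LANG-mode
is among its parents (regular or extra); otherwise return nil."
  (let ((name (symbol-name mode)))
    (when (string-match "\\`\\(.+\\)-ts-mode\\'" name)
      (let ((lang (match-string 1 name)))
        (when (memq (intern-soft (concat lang "-mode"))
                    (derived-mode-all-parents mode))
          (intern lang))))))

(demo-mode-language 'demo-baz-ts-mode)  ; the symbol `demo-baz'
(demo-mode-language 'demo-qux-ts-mode)  ; nil: no extra parent declared
```

The guess rests entirely on the naming convention, which is the fragility this message points out.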




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Sat, 20 Jan 2024 05:44:02 GMT) Full text and rfc822 format available.

Message #418 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, casouri <at> gmail.com,
 João Távora <joaotavora <at> gmail.com>,
 Stefan Kangas <stefankangas <at> gmail.com>
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Sat, 20 Jan 2024 07:43:38 +0200
On 19/01/2024 14:43, Stefan Monnier wrote:
>> That might mean that a content-type corresponds to a number of languages,
>> just like a language corresponds to a number (open set) of major modes. But
>> I don't see how. Please enlighten.
> 
> All three are taxonomies that are related to the content of the buffer.
> They are almost identical in general, but differ in details because
> taxonomies are not an exact science and those three have each been
> defined separately, so the arbitrary decisions that are involved in
> making a taxonomy have not been the same.

All right. But we'll probably only add one, if any. Adding two more (in 
addition to major modes) seems like too much.

>> The side-effect of this approach is that we basically declare a mode's
>> language twice: once in the attribute above, and once in the
>> major-mode-remap-alist which is put into autoloads. But it's probably
>> minor enough.
> 
> Again: not necessarily.  You're making assumptions about what the source
> code will look like, but we get to decide what the source code looks
> like by defining functions/macros.  Even if the information is stored in
> a redundant way, that doesn't mean the source of that information can't
> be the same.
> 
> So if/when such a duplication proves to be a problem, I can't see why it
> would be difficult to fix it.

Okay.

>>> Cue the patch I submitted when I opened this bug report 🙂
>>> Now `<LANG>-mode` is again included in `derived-mode-all-parents` for
>>> those new modes.
>>
>> If the language is called <LANG>-lang instead (of without suffix), then the
>> major mode could also run the language-specific hook, which in your patch it
>> cannot do.
> 
> I don't follow: why would the name of the mode (and hence hook) make it
> harder/easier/possible to run the hook?

So that it can be set up that for every such major mode the language 
hook does run, and the user could depend on that fact without looking up 
the mode's definition or its docstring.

You said that the choice not to do that (leaving that up to individual 
modes, IIUC) is because the new parents are existing modes, and so (I 
imagine) those hooks can have existing configurations that might not 
work with the new "child". A new name would change that.

IIUC your original design decision for `derived-mode-add-parents' is for 
the MODE not to run any of EXTRA-PARENTS hook, but I think the invariant 
that when a mode is "derived", it runs the hooks, was pretty sensible.
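
A small sketch of the behaviour in question, assuming Emacs 30 or later; all `demo-` names are hypothetical. It records which of the two mode hooks run when the child mode is enabled:

```elisp
;; Requires Emacs 30+.  Every `demo-' name is hypothetical.
(define-derived-mode demo-bar-mode prog-mode "Bar")
(define-derived-mode demo-bar-ts-mode prog-mode "Bar[ts]")
(derived-mode-add-parents 'demo-bar-ts-mode '(demo-bar-mode))

(defvar demo-bar-hooks-run nil
  "Hook names recorded while a demo mode is being enabled.")

(add-hook 'demo-bar-mode-hook
          (lambda () (push 'demo-bar-mode-hook demo-bar-hooks-run)))
(add-hook 'demo-bar-ts-mode-hook
          (lambda () (push 'demo-bar-ts-mode-hook demo-bar-hooks-run)))

(defun demo-bar-hooks-run-by (mode)
  "Enable MODE in a temporary buffer; return the demo hooks that ran."
  (setq demo-bar-hooks-run nil)
  (with-temp-buffer
    (funcall mode)
    (reverse demo-bar-hooks-run)))

;; Only the child's own hook is recorded; the extra parent's is not.
(demo-bar-hooks-run-by 'demo-bar-ts-mode)

;; Yet the derivation test succeeds in a buffer using the child mode:
(with-temp-buffer
  (demo-bar-ts-mode)
  (derived-mode-p '(demo-bar-mode)))
```

So `derived-mode-p` can return non-nil for a parent whose hook never ran, which is the lost invariant described above.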

>>>> The major-mode could be fundamental-mode. If the language were to be
>>>> specifiable through settings external to major modes, we could still do
>>>> useful things while in fundamental-mode (e.g. do some useful editing with
>>>> Eglot, provided it supports indentation and completion), or suggest which
>>>> major modes to install from ELPA.
>>> I don't see the interest of using specifically `fundamental-mode` for
>>> that.  In any case, this seems too hypothetical at this stage to have
>>> a good idea of what we'd need in such circumstances.
>> The latter feature (suggest which major modes to install) has come up
>> recently. It's not that difficult to implement (with a whitelist of
>> packages),
> 
> I'm with you so far (my `gnu-elpa` package intended to provide
> a possible solution for that).

Hmm, that implementation is more clever than I was thinking of. Without 
getting into the details of its UI, your point is very well made that 
the major mode symbol can serve as the point of indirection as well.

To sum up, I've been looking for some sort of middle ground between your 
patch and my more radical proposal. It might look like this:

- auto-mode-alist and major-mode-remap-alist stay the same (no 
"language" entries), although it would make sense to use 
major-mode-remap-alist more prominently rather than copy the regexps, 
irrespective of this change.
- The added extra parents have new names, not of existing modes, but 
something with "-lang" or just the name of the language, but without 
"-mode".
- The child mode runs the hooks of the extra parents as well.

I think Joao has been thinking in the same direction, except his choice 
was to extend the <lang>-base-mode scheme to all languages. Which mostly 
satisfies the same constraints, if we agree that the "-base-mode" naming 
should only extend to modes such as these (with the language in the 
name), so that (get-mode-language mode) could search for that name.

*OR*, alternatively, we don't add new parents. But add the :language 
keyword (or :content-type, or something similar) to define-derived-mode 
which would basically set a property. But when such property is set, 
(get-mode-language mode) would evaluate that property's value. And the 
mode would run the hook automatically when a language is set.

I suppose the second approach is technically compatible with your patch 
as well, and since it by itself doesn't help with the generalization of 
major modes in dir-locals.el, it might make sense to do both. But I'd be 
sorry to lose the invariant mentioned above, and specifying both the 
extra parent *and* the language, for ts modes, would feel like 
unfortunate duplication.
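
The property-based alternative could look roughly like this. It is a sketch under assumptions: `define-derived-mode` has no :language keyword today, so a plain symbol property stands in for it, and `demo-mode-language`, `demo-get-mode-language` and the `demo-ruby-` modes are hypothetical names.

```elisp
;; Every `demo-' name is hypothetical.  A symbol property stands in
;; for the proposed :language keyword, which does not exist today.
(define-derived-mode demo-ruby-ts-mode prog-mode "Ruby[ts]")
(define-derived-mode demo-ruby-ts-child-mode demo-ruby-ts-mode "Ruby[ts+]")

(put 'demo-ruby-ts-mode 'demo-mode-language 'ruby)

(defun demo-get-mode-language (mode)
  "Return the language declared for MODE or its nearest ancestor.
Follow the `derived-mode-parent' chain until some mode carries a
`demo-mode-language' property; return nil if none does."
  (let ((lang nil))
    (while (and mode
                (not (setq lang (get mode 'demo-mode-language))))
      (setq mode (get mode 'derived-mode-parent)))
    lang))

(demo-get-mode-language 'demo-ruby-ts-child-mode) ; inherited from parent
(demo-get-mode-language 'fundamental-mode)        ; nil
```

The lookup adds no parents, so `derived-mode-p` and dir-locals handling are unaffected. Running a language hook automatically is not covered by this sketch.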




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Sat, 20 Jan 2024 05:48:02 GMT) Full text and rfc822 format available.

Message #421 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Yuan Fu <casouri <at> gmail.com>, Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>,
 Stefan Kangas <stefankangas <at> gmail.com>, joaotavora <at> gmail.com
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Sat, 20 Jan 2024 07:47:27 +0200
On 19/01/2024 07:12, Yuan Fu wrote:
> 
>> On Jan 15, 2024, at 6:32 PM, Stefan Monnier<monnier <at> iro.umontreal.ca>  wrote:
>>
>>>> Please don't call it "language".  That'd be confusing.  LSP is about
>>>> programming languages, so "language" is natural there.  But in Emacs,
>>>> a major mode is more general than that.  For example, it is not
>>>> unthinkable to consider mail-mode to be the extra-parent of
>>>> message-mode (or vice versa) -- but what is the "language" in that
>>>> case?
>>> Isn't the language for such modes in this paradigm just the empty set?
>> I'm not too worried about those cases, indeed.
>> I'm more worried about the taxonomy of languages.
>> We currently have the taxonomy of major modes, with which we're pretty
>> familiar, and we've had many years to learn about its downsides,
>> complexity, as well as how to deal with them, but for languages we're
>> only familiar with the easy cases, which makes us judge the idea in
>> a way that may prove naive.
> I don’t have anything insightful to contribute, but want to point out that in Emacs, “language” doesn’t always mean programming language. “Language” can also mean Chinese, English, etc, and Emacs is quite often used for editing natural language text. So it warrants some caution when using “language” to mean programming language specifically.

That's a good point.

But hopefully when the suffix -lang or -language is used in the symbol 
name, the preceding word(s) will make it unambiguous. But the mentions 
of "language" in the documentation would have to be more careful indeed 
(perhaps we'd call them "content type" after all, and :ruby-lang would 
be one of the content types).




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Sat, 20 Jan 2024 07:05:01 GMT) Full text and rfc822 format available.

Message #424 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: João Távora <joaotavora <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, dmitry <at> gutov.dev, casouri <at> gmail.com,
 monnier <at> iro.umontreal.ca, stefankangas <at> gmail.com
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Sat, 20 Jan 2024 09:03:43 +0200
> From: João Távora <joaotavora <at> gmail.com>
> Date: Fri, 19 Jan 2024 22:47:07 +0000
> Cc: Dmitry Gutov <dmitry <at> gutov.dev>, Eli Zaretskii <eliz <at> gnu.org>, 
> 	Stefan Kangas <stefankangas <at> gmail.com>, 68246 <at> debbugs.gnu.org, casouri <at> gmail.com
> 
> On Fri, Jan 19, 2024 at 6:05 PM Stefan Monnier <monnier <at> iro.umontreal.ca> wrote:
> 
> > > I.e. can you clearly paint a way forward from it that solves the "what
> > > is this language in this buffer", "what is the language for this
> > > major-mode" and the "what mode should I use for that
> > > language" problems?
> >
> > As already explained ad-nauseam, my patch has no intention to solve
> > those problems, but I can't see any way in which it gets in the way of
> > solving them, if you care to solve them.
> 
> The patch is intended to solve problems in Eglot, Yasnippet,
> ffap and so on. This problem always includes "what is this
> language in this buffer", i.e. number 1 in the preceding list.
> That's what the problem is, no less. The fact that these packages
> have always resorted to using `derived-mode-p` to solve that
> problem is an unfortunate consequence of the longstanding
> conflation between modes and languages that you yourself
> identified.
> 
> But it's not the right solution, never was. Your patch contributes
> to the perpetuation of this "wrong" solution, and I think
> we should face that frontally.

Your opinions on this are well-taken and have been noted many messages
ago.  You made them abundantly clear.  There's no need to re-iterate
them time and again.

> After it lands, derived-mode-p will cease to mean "A derived
> from B via define-derived-mode, so you can trust hook for B
> runs in hook for A and a lot of other things".  It will mean
> something else.

Indeed, and it was not meant to mean what you suggest it should mean.

> * if it lands, we should document very well what that new meaning
>   of "<lang>-mode" is.  Also make some "provided-mode-walk-parents"
>   so that at least problem 2 can be solved, by string-matching
>   the symbol name of what will now be an even more enshrined
>   convention.  As to problem 3,   maybe, it can be written off to
>   "major-mode-remap-alist" (which I doubt will ever see much
>   adoption)

Feel free to suggest improvements and clarifications to the
documentation in these matters.

> * if it doesn't land, we should look at some solution that solves
>   1 2 and 3 cleanly.   I think Dmitry's patch is a decent start.

Since it will land, there's no need yet to look for alternatives.  We
will consider alternatives or other ways to fix this when we have data
(as opposed to just theoretical discussions) to support the need for
such changes.  (This, too, has been mentioned several times already.)

> * in the meantime, we should continue using base modes as we
>   already are.  In fact, the <foo>-base-mode convention is a
>   much better convention to enshrine, it doesn't require any
>   special caveats regarding hooks and dir-locals and changes
>   to the *Help* of a major mode function description.

We already use base modes where it makes sense.  It sounds like your
opinion is that we should use it much more radically, with which I
disagree and will object to introduction of base modes that serve no
useful purpose by themselves.  I believe I've made that clear as well.

I hope this will allow us finally to put this longish discussion to
rest, at least until we have actual data to discuss what and how needs
to be changed.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Sat, 20 Jan 2024 07:47:02 GMT) Full text and rfc822 format available.

Message #427 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: joaotavora <at> gmail.com, 68246 <at> debbugs.gnu.org, casouri <at> gmail.com,
 monnier <at> iro.umontreal.ca, stefankangas <at> gmail.com
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Sat, 20 Jan 2024 09:46:00 +0200
> Date: Sat, 20 Jan 2024 07:47:27 +0200
> Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>,
>  Stefan Kangas <stefankangas <at> gmail.com>, joaotavora <at> gmail.com
> From: Dmitry Gutov <dmitry <at> gutov.dev>
> 
> On 19/01/2024 07:12, Yuan Fu wrote:
> > 
> > I don’t have anything insightful to contribute, but want to point out that in Emacs, “language” doesn’t always mean programming language. “Language” can also mean Chinese, English, etc, and Emacs is quite often used for editing natural language text. So it warrants some caution when using “language” to mean programming language specifically.
> 
> That's a good point.
> 
> But hopefully when the suffix -lang or -language is used in the symbol 
> name, the preceding word(s) will make it unambiguous.

Unfortunately, it doesn't.  Witness the parallel discussion of
translating the manual into other languages.

Which is one (but not the only) reason why I asked repeatedly in this
thread not to use the notion of "language" in this context: it is
confusing for more than one reason.  I think Stefan suggested "content
type" or something to that effect, which is better terminology, IMO.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Sat, 20 Jan 2024 10:18:02 GMT) Full text and rfc822 format available.

Message #430 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 68246 <at> debbugs.gnu.org, Dmitry Gutov <dmitry <at> gutov.dev>,
 Yuan Fu <casouri <at> gmail.com>, Stefan Monnier <monnier <at> iro.umontreal.ca>,
 Stefan Kangas <stefankangas <at> gmail.com>
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Sat, 20 Jan 2024 10:16:45 +0000
On Sat, Jan 20, 2024, 07:04 Eli Zaretskii <eliz <at> gnu.org> wrote:

> > After it lands, derived-mode-p will cease to mean "A derived
> > from B via define-derived-mode, so you can trust hook for B
> > runs in hook for A and a lot of other things".  It will mean
> > something else.
>
> Indeed, and it was not meant to mean what you suggest it should mean.

Then pray tell.  What will it mean exactly?  This patch has no doc
yet (last I saw).

> > * if it lands, we should document very well what that new meaning
> >   of "<lang>-mode" is.  Also make some "provided-mode-walk-parents"
> >   so that at least problem 2 can be solved, by string-matching
> >   the symbol name of what will now be an even more enshrined
> >   convention.  As to problem 3,   maybe, it can be written off to
> >   "major-mode-remap-alist" (which I doubt will ever see much
> >   adoption)
>
> Feel free to suggest improvements and clarifications to the
> documentation in these matters.

I don't understand the vision behind this patch.  It has no doc
yet.  Despite your attempts to wrap this up and shut me up
I'm trying to at least converse with the author to expound it.
Often it's when trying to explain something in plain English
that one sees how suitable it is.

> > * if it doesn't land, we should look at some solution that solves
> >   1 2 and 3 cleanly.   I think Dmitry's patch is a decent start.
>
> Since it will land, there's no need yet to look for alternatives.

If you've already decided that, just install it and save
us all some time.

> We will consider alternatives or other ways to fix this when
> we have data

I've given you data: at least Eglot and markdown mode have brittle
hacks this patch does nothing for.  You have chosen to ignore it.
Also I've explained how potentially dangerous this patch is to
Eglot customizations.

> We already use base modes where it makes sense.  It sounds like your
> opinion is that we should use it much more radically, with which I
> disagree and will object to introduction of base modes that serve no
> useful purpose by themselves.

So solving the common language-detection problem, deduplicating
hooks and dir-locals is not serving a purpose "by oneself"?
Indeed you make up an undefined high bar of "by oneselfness"
you get to choose what clears it and doesn't.  But it doesn't
make your argument an argument.

João




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Sun, 21 Jan 2024 00:33:02 GMT) Full text and rfc822 format available.

Message #433 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Yuan Fu <casouri <at> gmail.com>
To: João Távora <joaotavora <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, Dmitry Gutov <dmitry <at> gutov.dev>,
 Eli Zaretskii <eliz <at> gnu.org>, Stefan Monnier <monnier <at> iro.umontreal.ca>,
 Stefan Kangas <stefankangas <at> gmail.com>
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Sat, 20 Jan 2024 16:32:02 -0800

> On Jan 20, 2024, at 2:16 AM, João Távora <joaotavora <at> gmail.com> wrote:
> 
> On Sat, Jan 20, 2024, 07:04 Eli Zaretskii <eliz <at> gnu.org> wrote:
> 
>>> After it lands, derived-mode-p will cease to mean "A derived
>>> from B via define-derived-mode, so you can trust hook for B
>>> runs in hook for A and a lot of other things".  It will mean
>>> something else.
>> 
>> Indeed, and it was not meant to mean what you suggest it should mean.
> 
> Then pray tell.  What will it mean exactly?  This patch has no doc
> yet (last I saw).
> 
>>> * if it lands, we should document very well what that new meaning
>>>  of "<lang>-mode" is.  Also make some "provided-mode-walk-parents"
>>>  so that at least problem 2 can be solved, by string-matching
>>>  the symbol name of what will now be an even more enshrined
>>>  convention.  As to problem 3,   maybe, it can be written off to
>>>  "major-mode-remap-alist" (which I doubt will ever see much
>>>  adoption)
>> 
>> Feel free to suggest improvements and clarifications to the
>> documentation in these matters.
> 
> I don't understand the vision behind this patch.  It has no doc
> yet.  Despite your attempts to wrap this up and shut me up,
> I'm trying to at least converse with the author to expound it.
> Often it's when trying to explain something in plain English
> that one sees how suitable it is.
> 
>>> * if it doesn't land, we should look at some solution that solves
>>>  1 2 and 3 cleanly.   I think Dmitry's patch is a decent start.
>> 
>> Since it will land, there's no need yet to look for alternatives.
> 
> If you've already decided that, just install it and save
> us all some time.
> 
>> We will consider alternatives or other ways to fix this when
>> we have data
> 
> I've given you data: at least Eglot and markdown mode have brittle
> hacks this patch does nothing for.  You have chosen to ignore it.
> Also I've explained how potentially dangerous this patch is to
> Eglot customizations.
> 
>> We already use base modes where it makes sense.  It sounds like your
>> opinion is that we should use it much more radically, with which I
>> disagree and will object to introduction of base modes that serve no
>> useful purpose by themselves.
> 
> So solving the common language-detection problem, deduplicating
> hooks and dir-locals is not serving a purpose "by oneself"?
> Indeed you make up an undefined high bar of "by oneselfness" and
> you get to choose what clears it and what doesn't.  But it doesn't
> make your argument an argument.


[I’ve been loosely following the thread so this might have been brought up and I missed it]

IIUC Stefan’s patch is trying to use xxx-mode to represent “mode for xxx in general”, sort of like the keys in major-mode-remap-alist. And IIUC Joao and Dmitry are not very comfortable with it because (mode-A R mode-B) where R is derived-mode-p implicitly means mode-B runs mode-A’s hooks and major mode body, and this patch would break that, which would bring a lot of confusion.

Instead of using xxx-mode, can we make common-xxx-mode the parent of both xxx-mode and xxx-ts-mode? Or maybe abstract-xxx-mode, or just xxx, the name doesn’t matter. The point is this is just a symbol and doesn’t have hooks and other implicit things a major mode has. It’s still a bit confusing, but it should be less confusing. We can also add a variable common-mode-list or abstract-mode-list so these symbols don’t seem to come out of nowhere.

Yuan



Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Sun, 21 Jan 2024 00:34:01 GMT) Full text and rfc822 format available.

Message #436 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: joaotavora <at> gmail.com, 68246 <at> debbugs.gnu.org, casouri <at> gmail.com,
 monnier <at> iro.umontreal.ca, stefankangas <at> gmail.com
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Sun, 21 Jan 2024 02:32:50 +0200

On 20/01/2024 09:46, Eli Zaretskii wrote:
>> Date: Sat, 20 Jan 2024 07:47:27 +0200
>> Cc: 68246 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>,
>>   Stefan Kangas <stefankangas <at> gmail.com>, joaotavora <at> gmail.com
>> From: Dmitry Gutov <dmitry <at> gutov.dev>
>>
>> On 19/01/2024 07:12, Yuan Fu wrote:
>>>
>>> I don’t have anything insightful to contribute, but want to point out that in Emacs, “language” doesn’t always mean programming language. “Language” can also mean Chinese, English, etc, and Emacs is quite often used for editing natural language text. So it warrants some caution when using “language” to mean programming language specifically.
>>
>> That's a good point.
>>
>> But hopefully when the suffix -lang or -language is used in the symbol
>> name, the preceding word(s) will make it unambiguous.
> 
> Unfortunately, it doesn't.  Witness the parallel discussion of
> translating the manual into other languages.
> 
> Which is one (but not the only) reason why I asked repeatedly in this
> thread not to use the notion of "language" in this context: it is
> confusing for more than one reason.  I think Stefan suggested "content
> type" or something to that effect, which is better terminology, IMO.

People are welcome to rewrite the docs in terms of "content type", I 
have no problem with that and referred to this alternative multiple 
times in the emails.

But the term "language" is closer to my understanding of the issue, so 
it's easier for me to use when explaining. And I'm apparently not alone 
in that: if one looks at VS Code's UI, in the bottom right corner it 
offers the user the choice of the "language mode" for the current file. 
Among the choices of language modes, there are programming languages, of 
course (C, JavaScript, Ruby, ...), but also values like "Plain Text", 
"Ini", "Properties", "TeX", "Code Snippets", "Git Commit Message" and 
"Binary". To be clear, my proposal was not inspired by it--today is the 
first time I've examined that list this closely.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Sun, 21 Jan 2024 09:55:02 GMT) Full text and rfc822 format available.

Message #439 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Yuan Fu <casouri <at> gmail.com>
Cc: 68246 <at> debbugs.gnu.org, dmitry <at> gutov.dev, stefankangas <at> gmail.com,
 joaotavora <at> gmail.com, monnier <at> iro.umontreal.ca
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Sun, 21 Jan 2024 11:54:10 +0200

> From: Yuan Fu <casouri <at> gmail.com>
> Date: Sat, 20 Jan 2024 16:32:02 -0800
> Cc: Eli Zaretskii <eliz <at> gnu.org>,
>  Stefan Monnier <monnier <at> iro.umontreal.ca>,
>  Dmitry Gutov <dmitry <at> gutov.dev>,
>  Stefan Kangas <stefankangas <at> gmail.com>,
>  68246 <at> debbugs.gnu.org
> 
> IIUC Stefan’s patch is trying to use xxx-mode to represent “mode for xxx in general”, sort of like the keys in major-mode-remap-alist. And IIUC Joao and Dmitry are not very comfortable with it because (mode-A R mode-B) where R is derived-mode-p implicitly means mode-B runs mode-A’s hooks and major mode body, and this patch would break that, which would bring a lot of confusion.

There should be no confusion.  derived-mode-add-parents is documented
regarding the effects and meaning (and if the current documentation is
not clear enough, we can clarify it further).

Moreover, I see no reason to assume FOO-mode runs any mode hook except
FOO-mode-hook.

> Instead of using xxx-mode, can we make common-xxx-mode the parent of both xxx-mode and xxx-ts-mode? Or maybe abstract-xxx-mode, or just xxx, the name doesn’t matter. The point is this is just a symbol and doesn’t have hooks and other implicit things a major mode has. It’s still a bit confusing, but it should be less confusing. We can also add a variable common-mode-list or abstract-mode-list so these symbols don’t seem to come out of nowhere.

I'm firmly against introducing modes that are not real modes.  We do
use base-modes where it makes sense, but if such a mode makes no
sense, introducing it as a means to some end is just going to make
things more confusing.

Once again: let's have real practical issues on our hands before we
look for solutions.  Right now, no such issues are known, since the
changes barely landed on master.  There's no reason for looking for
hasty solutions for problems we don't have a good handle on.
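
To make the point above about derived-mode-add-parents and mode hooks
concrete, here is a minimal sketch.  It assumes Emacs 30 or later (where
derived-mode-add-parents exists); demo-lang-mode, demo-lang-ts-mode,
demo-hook-log and demo-check are hypothetical names invented for the
illustration and are not part of the patch.

```elisp
;; Sketch only: all `demo-' names are hypothetical.
;; Assumes Emacs 30+, which provides `derived-mode-add-parents'.

(defvar demo-hook-log nil
  "Symbols naming the demo mode hooks that ran, most recent first.")

(define-derived-mode demo-lang-mode prog-mode "Demo"
  "Hypothetical classic major mode.")

(define-derived-mode demo-lang-ts-mode prog-mode "DemoTS"
  "Hypothetical TS-style mode; NOT defined as a child of `demo-lang-mode'.")

;; Declare the classic mode as an extra parent of the TS-style mode.
(derived-mode-add-parents 'demo-lang-ts-mode '(demo-lang-mode))

(add-hook 'demo-lang-mode-hook
          (lambda () (push 'demo-lang-mode-hook demo-hook-log)))
(add-hook 'demo-lang-ts-mode-hook
          (lambda () (push 'demo-lang-ts-mode-hook demo-hook-log)))

(defun demo-check ()
  "Return (DERIVED-P HOOKS-RUN) after enabling `demo-lang-ts-mode'."
  (let ((demo-hook-log nil))
    (with-temp-buffer
      (demo-lang-ts-mode)
      (list (and (derived-mode-p 'demo-lang-mode) t)
            demo-hook-log))))
```

Evaluating (demo-check) should report that the buffer counts as derived
from demo-lang-mode, while only demo-lang-ts-mode-hook appears in the
log: the extra parent affects derived-mode-p, not which hooks run.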




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68246; Package emacs. (Wed, 24 Jan 2024 06:22:01 GMT) Full text and rfc822 format available.

Message #442 received at 68246 <at> debbugs.gnu.org (full text, mbox):

From: Yuan Fu <casouri <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 68246 <at> debbugs.gnu.org, dmitry <at> gutov.dev, stefankangas <at> gmail.com,
 João Távora <joaotavora <at> gmail.com>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Tue, 23 Jan 2024 22:20:49 -0800

> On Jan 21, 2024, at 1:54 AM, Eli Zaretskii <eliz <at> gnu.org> wrote:
> 
>> From: Yuan Fu <casouri <at> gmail.com>
>> Date: Sat, 20 Jan 2024 16:32:02 -0800
>> Cc: Eli Zaretskii <eliz <at> gnu.org>,
>> Stefan Monnier <monnier <at> iro.umontreal.ca>,
>> Dmitry Gutov <dmitry <at> gutov.dev>,
>> Stefan Kangas <stefankangas <at> gmail.com>,
>> 68246 <at> debbugs.gnu.org
>> 
>> IIUC Stefan’s patch is trying to use xxx-mode to represent “mode for xxx in general”, sort of like the keys in major-mode-remap-alist. And IIUC Joao and Dmitry are not very comfortable with it because (mode-A R mode-B) where R is derived-mode-p implicitly means mode-B runs mode-A’s hooks and major mode body, and this patch would break that, which would bring a lot of confusion.
> 
> There should be no confusion.  derived-mode-add-parents is documented
> regarding the effects and meaning (and if the current documentation is
> not clear enough, we can clarify it further).
> 
> Moreover, I see no reason to assume FOO-mode runs any mode hook except
> FOO-mode-hook.
> 
>> Instead of using xxx-mode, can we make common-xxx-mode the parent of both xxx-mode and xxx-ts-mode? Or maybe abstract-xxx-mode, or just xxx, the name doesn’t matter. The point is this is just a symbol and doesn’t have hooks and other implicit things a major mode has. It’s still a bit confusing, but it should be less confusing. We can also add a variable common-mode-list or abstract-mode-list so these symbols don’t seem to come out of nowhere.
> 
> I'm firmly against introducing modes that are not real modes.  We do
> use base-modes where it makes sense, but if such a mode makes no
> sense, introducing it as a means to some end is just going to make
> things more confusing.
> 
> Once again: let's have real practical issues on our hands before we
> look for solutions.  Right now, no such issues are known, since the
> changes barely landed on master.  There's no reason for looking for
> hasty solutions for problems we don't have a good handle on.

Ok, I can see that. I don’t have anything else to add.

Yuan



Reply sent to Stefan Monnier <monnier <at> iro.umontreal.ca>:
You have taken responsibility. (Sat, 09 Mar 2024 15:41:02 GMT) Full text and rfc822 format available.

Notification sent to Stefan Monnier <monnier <at> iro.umontreal.ca>:
bug acknowledged by developer. (Sat, 09 Mar 2024 15:41:02 GMT) Full text and rfc822 format available.

Message #447 received at 68246-done <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: 68246-done <at> debbugs.gnu.org
Subject: Re: bug#68246: 30.0.50; Add non-TS mode as extra parent of TS modes
Date: Sat, 09 Mar 2024 10:39:31 -0500

Pushed, closing,


        Stefan





bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 07 Apr 2024 11:24:25 GMT) Full text and rfc822 format available.

This bug report was last modified 228 days ago.

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.