GNU bug report logs - #30513
Unicode Character Name is misspelled (MATHEMATICAL ITALIC SMALL LAMDA)


Package: emacs;

Reported by: Michael Grünewald <michipili <at> gmail.com>

Date: Sun, 18 Feb 2018 15:00:02 UTC

Severity: wishlist

Tags: fixed

Fixed in version 28.1

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 30513 in the body.
You can then email your comments to 30513 AT debbugs.gnu.org in the normal way.



Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#30513; Package emacs. (Sun, 18 Feb 2018 15:00:02 GMT)

Acknowledgement sent to Michael Grünewald <michipili <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Sun, 18 Feb 2018 15:00:03 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Michael Grünewald <michipili <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: Unicode Character Name is misspelled (MATHEMATICAL ITALIC SMALL LAMDA)
Date: Sun, 18 Feb 2018 15:58:59 +0100
The Unicode character 𝜆 has its name misspelled, it should be

  MATHEMATICAL ITALIC SMALL LAMBDA

instead of 

  MATHEMATICAL ITALIC SMALL LAMDA

(notice a B is missing in the second variant).


When using C-u C-x = on that character, this information is displayed:

             position: 6184 of 10702 (58%), column: 11
            character: 𝜆 (displayed as 𝜆) (codepoint 120582, #o353406, #x1d706)
    preferred charset: unicode (Unicode (ISO10646))
code point in charset: 0x1D706
               script: mathematical
               syntax: w 	which means: word
             category: .:Base, L:Left-to-right (strong)
             to input: type "C-x 8 RET 1d706" or "C-x 8 RET MATHEMATICAL ITALIC SMALL LAMDA"
          buffer code: #xF0 #x9D #x9C #x86
            file code: #xF0 #x9D #x9C #x86 (encoded by coding system utf-8-unix)
              display: by this font (glyph code)
    mac-ct:-*-STIXGeneral-normal-normal-normal-*-16-*-*-*-p-0-iso10646-1 (#xBEF)

Character code properties: customize what to show
  name: MATHEMATICAL ITALIC SMALL LAMDA
  general-category: Ll (Letter, Lowercase)
  decomposition: (font 955) (font 'λ')

There are text properties here:
  fontified            t

[back]


In GNU Emacs 25.3.1 (x86_64-apple-darwin17.2.0, NS appkit-1561.10 Version 10.13.1 (Build 17B48))
 of 2017-11-01 built on MacBook-Pro.localdomain
Windowing system distributor 'Apple', version 10.3.1561
Configured using:
 'configure --prefix=/opt/local --without-ns --without-dbus
 --without-gconf --without-libotf --without-m17n-flt --without-gpm
 --without-gnutls --with-xml2 --with-modules --infodir
 /opt/local/share/info/emacs --with-ns CC=/usr/bin/clang 'CFLAGS=-pipe
 -Os -arch x86_64' 'LDFLAGS=-L/opt/local/lib
 -Wl,-headerpad_max_install_names -arch x86_64'
 CPPFLAGS=-I/opt/local/include'

Configured features:
NOTIFY ACL LIBXML2 ZLIB TOOLKIT_SCROLL_BARS NS MODULES

Important settings:
  value of $LC_CTYPE: UTF-8
  value of $LANG: en_US.UTF-8
  locale-coding-system: utf-8-unix

Major mode: Lisp

Minor modes in effect:
  diff-auto-refine-mode: t
  slime-trace-dialog-minor-mode: t
  slime-autodoc-mode: t
  slime-mode: t
  shell-dirtrack-mode: t
  global-whitespace-mode: t
  tooltip-mode: t
  global-eldoc-mode: t
  eldoc-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  column-number-mode: t
  line-number-mode: t
  auto-fill-function: do-auto-fill
  transient-mark-mode: t

Recent messages:
Saving file /Users/michael/AbleBaker/waermondt/src/bgm.lisp...
Wrote /Users/michael/AbleBaker/waermondt/src/bgm.lisp
Quit
Char: 𝜏 (120591, #o353417, #x1d70f, file ...) point=5964 of 10701 (56%) column=11
Type C-x 1 to delete the help window.
Char: 𝜏 (120591, #o353417, #x1d70f, file ...) point=5964 of 10701 (56%) column=11
Saving file /Users/michael/AbleBaker/waermondt/src/bgm.lisp...
Wrote /Users/michael/AbleBaker/waermondt/src/bgm.lisp
Quit
Auto-saving...
Quit [2 times]

Load-path shadows:
None found.

Features:
(shadow sort mail-extr emacsbug sendmail wid-edit descr-text tmm
log-edit message idna format-spec rfc822 mml mml-sec epg mm-decode
mm-bodies mm-encode mailabbrev mail-utils gmm-utils mailheader add-log
log-view pcvs-util vc vc-dispatcher rect two-column sgml-mode dired-aux
iso-transl cus-start cus-load quail latexenc vc-git diff-mode misearch
multi-isearch network-stream nsm starttls dired finder-inf package
epg-config seq merlin-cap merlin crm ocp-index ocp-indent php-mode
cc-langs cc-mode cc-fonts cc-guess cc-menus cc-cmds cc-styles cc-align
cc-engine cc-vars cc-defs slime-indentation slime-cl-indent cl-indent
slime-hyperdoc url-http tls gnutls url url-proxy url-privacy url-expand
url-methods url-history mailcap url-auth mail-parse rfc2231 rfc2047
rfc2045 ietf-drums url-cookie url-domsuf url-util url-parse auth-source
gnus-util mm-util help-fns mail-prsvr password-cache url-gw url-vars
slime-sprof slime-asdf grep slime-fancy slime-trace-dialog
slime-fontifying-fu slime-package-fu slime-references
slime-compiler-notes-tree slime-scratch slime-presentations bridge
slime-macrostep macrostep slime-mdot-fu slime-enclosing-context
slime-fuzzy derived slime-fancy-trace slime-fancy-inspector slime-c-p-c
slime-editing-commands slime-autodoc slime-repl edmacro kmacro elp cl
slime-parse slime etags xref cl-seq project eieio byte-opt bytecomp
byte-compile cl-extra help-mode cconv eieio-core cl-macs gv arc-mode
archive-mode noutline outline easy-mmode pp hyperspec thingatpt
browse-url cl-loaddefs pcase cl-lib slime-autoloads tex-mode shell
pcomplete css-mode smie caml tuareg speedbar sb-image ezimage dframe
advice compile comint ansi-color ring easymenu windmove whitespace
time-date mule-util tooltip eldoc electric uniquify ediff-hook vc-hooks
lisp-float-type mwheel ns-win ucs-normalize term/common-win tool-bar dnd
fontset image regexp-opt fringe tabulated-list newcomment elisp-mode
lisp-mode prog-mode register page menu-bar rfn-eshadow timer select
scroll-bar mouse jit-lock font-lock syntax facemenu font-core frame
cl-generic cham georgian utf-8-lang misc-lang vietnamese tibetan thai
tai-viet lao korean japanese eucjp-ms cp51932 hebrew greek romanian
slovak czech european ethiopic indian cyrillic chinese charscript
case-table epa-hook jka-cmpr-hook help simple abbrev minibuffer
cl-preloaded nadvice loaddefs button faces cus-face macroexp files
text-properties overlay sha1 md5 base64 format env code-pages mule
custom widget hashtable-print-readable backquote kqueue cocoa ns
multi-tty make-network-process emacs)

Memory information:
((conses 16 527508 118142)
 (symbols 48 46383 0)
 (miscs 40 1618 2299)
 (strings 32 123349 17918)
 (string-bytes 1 3265119)
 (vectors 16 59363)
 (vector-slots 8 1787488 231565)
 (floats 8 354 782)
 (intervals 56 8771 2523)
 (buffers 976 56))





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30513; Package emacs. (Sun, 18 Feb 2018 15:05:02 GMT)

Message #8 received at 30513 <at> debbugs.gnu.org:

From: Noam Postavsky <npostavs <at> gmail.com>
To: Michael Grünewald <michipili <at> gmail.com>
Cc: 30513 <at> debbugs.gnu.org
Subject: Re: bug#30513: Unicode Character Name is misspelled (MATHEMATICAL
 ITALIC SMALL LAMDA)
Date: Sun, 18 Feb 2018 10:04:24 -0500
Michael Grünewald <michipili <at> gmail.com> writes:

> The Unicode character 𝜆 has its name misspelled, it should be
>
>   MATHEMATICAL ITALIC SMALL LAMBDA
>
> instead of 
>
>   MATHEMATICAL ITALIC SMALL LAMDA
>
> (notice a B is missing in the second variant).

https://en.wikipedia.org/wiki/Lambda says:

    Unicode uses the spelling "lamda" in character names, instead of
    "lambda", due to "preferences expressed by the Greek National Body"[14]

[14]: http://www.unicode.org/mail-arch/unicode-ml/y2010-m06/0063.html




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30513; Package emacs. (Sun, 18 Feb 2018 15:24:01 GMT)

Message #11 received at 30513 <at> debbugs.gnu.org:

From: Michael Grünewald <michipili <at> gmail.com>
To: Noam Postavsky <npostavs <at> gmail.com>
Cc: 30513 <at> debbugs.gnu.org
Subject: Re: bug#30513: Unicode Character Name is misspelled (MATHEMATICAL
 ITALIC SMALL LAMDA)
Date: Sun, 18 Feb 2018 16:23:03 +0100
> On 18. Feb 2018, at 16:04, Noam Postavsky <npostavs <at> gmail.com> wrote:
> 
> Michael Grünewald <michipili <at> gmail.com> writes:
> 
>> The Unicode character 𝜆 has its name misspelled, it should be
>> 
>>  MATHEMATICAL ITALIC SMALL LAMBDA
>> 
>> instead of 
>> 
>>  MATHEMATICAL ITALIC SMALL LAMDA
>> 
>> (notice a B is missing in the second variant).
> 
> https://en.wikipedia.org/wiki/Lambda says:
> 
>    Unicode uses the spelling "lamda" in character names, instead of
>    "lambda", due to "preferences expressed by the Greek National Body"[14]
> 
> [14]: http://www.unicode.org/mail-arch/unicode-ml/y2010-m06/0063.html

I see, thank you for the very quick reply!


I sometimes use C-x 8 RET to enter characters using their Unicode names.  Since
these names are usually very long I rely heavily on auto-completion to reduce typing.
So if I need to enter a

  MATHEMATICAL ITALIC SMALL TAU

I start with just the word TAU and hit TAB to display possible alternatives and
quickly reach the desired name.

What would be the preferred way to enter math symbols without fumbling over such small
oddities? Is there any possibility of adjusting the auto-completion method so that it
is not so picky about the difference between LAMDA and LAMBDA?  Are there other better
approaches?

(I think I am well aware of the LAMBDA vs. LAMDA detail now but what about the huge crowd
of Emacs users entering MATHEMATICAL ITALIC SMALL letters using C-x 8 RET? ;) )

I would like to avoid using the TeX input method, because it does not interact so
nicely with programming (because of the special meaning that quotes and the underscore take on).

Best,
Michael





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30513; Package emacs. (Sun, 18 Feb 2018 15:55:02 GMT)

Message #14 received at 30513 <at> debbugs.gnu.org:

From: Noam Postavsky <npostavs <at> gmail.com>
To: Michael Grünewald <michipili <at> gmail.com>
Cc: 30513 <at> debbugs.gnu.org
Subject: Re: bug#30513: Unicode Character Name is misspelled (MATHEMATICAL
 ITALIC SMALL LAMDA)
Date: Sun, 18 Feb 2018 10:54:21 -0500
[Message part 1 (text/plain, inline)]
tags 30513 + patch
quit

Michael Grünewald <michipili <at> gmail.com> writes:

> What would be the preferred way to enter math symbols without fumbling over such small
> oddities? Is there any possibility of adjusting the auto-completion method so that it
> is not so picky about the difference between LAMDA and LAMBDA?  Are there other better
> approaches?
>
> (I think I am well aware of the LAMBDA vs. LAMDA detail now but what about the huge crowd
> of Emacs users entering MATHEMATICAL ITALIC SMALL letters using C-x 8 RET? ;) )

Yeah, this is fresh in my memory since I was recently playing with ucs-insert
and was a bit surprised to discover "LAMBDA" under the *old* name:

  name: GREEK SMALL LETTER LAMDA
  old-name: GREEK SMALL LETTER LAMBDA

Anyway, I think the following patch should smooth things over:

[v1-0001-Allow-lambda-spelling-for-ucs-insert-Bug-30513.patch (text/x-diff, inline)]
From a7b40afd7fad41d55e7a43168c7febc9722ee3ac Mon Sep 17 00:00:00 2001
From: Noam Postavsky <npostavs <at> gmail.com>
Date: Sun, 18 Feb 2018 10:43:42 -0500
Subject: [PATCH v1] Allow "lambda" spelling for ucs-insert (Bug#30513)

* lisp/international/mule-cmds.el (ucs-names): Add a "LAMBDA"
completion variant for every "LAMDA" name.
---
 lisp/international/mule-cmds.el | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/lisp/international/mule-cmds.el b/lisp/international/mule-cmds.el
index 3468166263..2a995121dd 100644
--- a/lisp/international/mule-cmds.el
+++ b/lisp/international/mule-cmds.el
@@ -2949,6 +2949,14 @@ ucs-names
 	        ;; higher code, so it gets pushed later!
 	        (if new-name (puthash new-name c names))
 	        (if old-name (puthash old-name c names))
+                ;; Unicode uses the spelling "lamda" in character
+                ;; names, instead of "lambda", due to "preferences
+                ;; expressed by the Greek National Body" (Bug#30513).
+                ;; Some characters have an old-name with the "lambda"
+                ;; spelling, but others don't.  Add the traditional
+                ;; spelling for more convenient completion.
+                (if (and (not old-name) new-name (string-match "LAMDA" new-name))
+                    (puthash (replace-match "LAMBDA" t t new-name) c names))
 	        (setq c (1+ c))))))
         ;; Special case for "BELL" which is apparently the only char which
         ;; doesn't have a new name and whose old-name is shadowed by a newer
-- 
2.11.0


Added tag(s) patch. Request was from Noam Postavsky <npostavs <at> gmail.com> to control <at> debbugs.gnu.org. (Sun, 18 Feb 2018 15:55:04 GMT)

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30513; Package emacs. (Sun, 18 Feb 2018 16:32:02 GMT)

Message #19 received at 30513 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Noam Postavsky <npostavs <at> gmail.com>
Cc: michipili <at> gmail.com, 30513 <at> debbugs.gnu.org
Subject: Re: bug#30513: Unicode Character Name is misspelled (MATHEMATICAL
 ITALIC SMALL LAMDA)
Date: Sun, 18 Feb 2018 18:31:00 +0200
> From: Noam Postavsky <npostavs <at> gmail.com>
> Date: Sun, 18 Feb 2018 10:04:24 -0500
> Cc: 30513 <at> debbugs.gnu.org
> 
> Michael Grünewald <michipili <at> gmail.com> writes:
> 
> > The Unicode character 𝜆 has its name misspelled, it should be
> >
> >   MATHEMATICAL ITALIC SMALL LAMBDA
> >
> > instead of 
> >
> >   MATHEMATICAL ITALIC SMALL LAMDA
> >
> > (notice a B is missing in the second variant).
> 
> https://en.wikipedia.org/wiki/Lambda says:
> 
>     Unicode uses the spelling "lamda" in character names, instead of
>     "lambda", due to "preferences expressed by the Greek National Body"[14]
> 
> [14]: http://www.unicode.org/mail-arch/unicode-ml/y2010-m06/0063.html

Indeed.  And Emacs takes the names from the Unicode character database
anyway, so we cannot misspell the names.

Note that some LAMDA letters have the "old name" property that uses
LAMBDA, but this specific character doesn't (probably because it was
added after the above-mentioned preference was adopted by the Unicode
consortium).
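
The two properties mentioned above can be inspected directly. The following is a minimal sketch, assuming the standard `get-char-code-property` function; the helper name `my/char-names` is made up for illustration and is not part of Emacs:

```elisp
;; Hypothetical helper: return the Unicode name and old-name of CHAR.
(defun my/char-names (char)
  "Return a cons (NAME . OLD-NAME) for CHAR; either part may be nil."
  (cons (get-char-code-property char 'name)
        (get-char-code-property char 'old-name)))

;; U+03BB is the Greek letter whose name and old-name were quoted
;; earlier in this thread; U+1D706 is the character from the report.
(my/char-names #x3bb)
(my/char-names #x1d706)
```

Evaluating the two calls side by side shows whether a given character carries an old-name at all.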




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30513; Package emacs. (Sun, 18 Feb 2018 16:32:02 GMT)

Message #22 received at 30513 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Michael Grünewald <michipili <at> gmail.com>
Cc: npostavs <at> gmail.com, 30513 <at> debbugs.gnu.org
Subject: Re: bug#30513: Unicode Character Name is misspelled (MATHEMATICAL
 ITALIC SMALL LAMDA)
Date: Sun, 18 Feb 2018 18:31:40 +0200
> From: Michael Grünewald <michipili <at> gmail.com>
> Date: Sun, 18 Feb 2018 16:23:03 +0100
> Cc: 30513 <at> debbugs.gnu.org
> 
> (I think I am well aware of the LAMBDA vs. LAMDA detail now but what about the huge crowd
> of Emacs users entering MATHEMATICAL ITALIC SMALL letters using C-x 8 RET? ;) )

Patches to support such loose matching are welcome, I think.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30513; Package emacs. (Sun, 18 Feb 2018 17:17:02 GMT)

Message #25 received at 30513 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Noam Postavsky <npostavs <at> gmail.com>
Cc: michipili <at> gmail.com, 30513 <at> debbugs.gnu.org
Subject: Re: bug#30513: Unicode Character Name is misspelled (MATHEMATICAL
 ITALIC SMALL LAMDA)
Date: Sun, 18 Feb 2018 19:16:51 +0200
> From: Noam Postavsky <npostavs <at> gmail.com>
> Date: Sun, 18 Feb 2018 10:54:21 -0500
> Cc: 30513 <at> debbugs.gnu.org
> 
> +                (if (and (not old-name) new-name (string-match "LAMDA" new-name))
> +                    (puthash (replace-match "LAMBDA" t t new-name) c names))

Won't this make ucs-names even larger and more redundant?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30513; Package emacs. (Sun, 18 Feb 2018 18:30:01 GMT)

Message #28 received at 30513 <at> debbugs.gnu.org:

From: Noam Postavsky <npostavs <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: michipili <at> gmail.com, 30513 <at> debbugs.gnu.org
Subject: Re: bug#30513: Unicode Character Name is misspelled (MATHEMATICAL
 ITALIC SMALL LAMDA)
Date: Sun, 18 Feb 2018 13:29:28 -0500
Eli Zaretskii <eliz <at> gnu.org> writes:

>> +                (if (and (not old-name) new-name (string-match "LAMDA" new-name))
>> +                    (puthash (replace-match "LAMBDA" t t new-name) c names))
>
> Won't this make ucs-names even larger and more redundant?

It will make ucs-names slightly larger and more redundant.  I think the
trade-off is worth it.  To give precise numbers, it adds 12 entries for
a total of 42857, which is 0.029%.

    (length
     ;; Entries satisfying (and (not old-name) new-name (string-match "LAMDA" new-name))
     '("MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL LAMDA"
       "MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL LAMDA"
       "MATHEMATICAL SANS-SERIF BOLD SMALL LAMDA"
       "MATHEMATICAL SANS-SERIF BOLD CAPITAL LAMDA"
       "MATHEMATICAL BOLD ITALIC SMALL LAMDA"
       "MATHEMATICAL BOLD ITALIC CAPITAL LAMDA"
       "MATHEMATICAL ITALIC SMALL LAMDA"
       "MATHEMATICAL ITALIC CAPITAL LAMDA"
       "MATHEMATICAL BOLD SMALL LAMDA"
       "MATHEMATICAL BOLD CAPITAL LAMDA"
       "UGARITIC LETTER LAMDA"
       "GREEK LETTER SMALL CAPITAL LAMDA")) ;=> 12

    (hash-table-count ucs-names) ;=> 42857
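
A list like the one above could also be regenerated instead of typed out. This is only a sketch: it assumes Emacs 26 or later, where calling `(ucs-names)` returns a hash table mapping names to characters, and the function name `my/lamda-names-without-old-name` is hypothetical. The result depends on the Unicode data bundled with the Emacs in use, so no particular count is claimed here.

```elisp
;; Hypothetical helper: collect official names that contain "LAMDA"
;; and belong to characters with no old-name property.
(defun my/lamda-names-without-old-name ()
  "Return a sorted list of \"LAMDA\" names whose character has no old-name."
  (let ((case-fold-search nil)
        result)
    (maphash (lambda (name char)
               (when (and (string-match-p "LAMDA" name)
                          ;; Skip keys that are old-names or aliases.
                          (equal name (get-char-code-property char 'name))
                          (not (get-char-code-property char 'old-name)))
                 (push name result)))
             (ucs-names))
    (sort result #'string<)))

(length (my/lamda-names-without-old-name))
```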




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30513; Package emacs. (Sun, 18 Feb 2018 19:50:02 GMT)

Message #31 received at 30513 <at> debbugs.gnu.org:

From: Drew Adams <drew.adams <at> oracle.com>
To: Noam Postavsky <npostavs <at> gmail.com>, Eli Zaretskii <eliz <at> gnu.org>
Cc: michipili <at> gmail.com, 30513 <at> debbugs.gnu.org
Subject: RE: bug#30513: Unicode Character Name is misspelled (MATHEMATICAL
 ITALIC SMALL LAMDA)
Date: Sun, 18 Feb 2018 11:49:02 -0800 (PST)
> >> +                (if (and (not old-name) new-name (string-match
> "LAMDA" new-name))
> >> +                    (puthash (replace-match "LAMBDA" t t new-name) c
> names))
> >
> > Won't this make ucs-names even larger and more redundant?
> 
> It will make ucs-names slightly larger and more redundant.  I think the
> trade-off is worth it.  To give precise numbers, it adds 12 entries for
> a total of 42857, which is 0.029%.

I would not make the point that this adds too many chars
for `ucs-names' or for `C-x 8 RET'.

I would make the point that we should not be inventing
character names and then associating such inventions with
what has heretofore been a pretty faithful reflection of
the Unicode standard.

There are many different possible uses of `ucs-names'.
It should not be assumed that the only use is to complete
`C-x 8 RET' or that every use of that command or
`ucs-names' will be improved by "loose" matching that
allows names that are not defined by Unicode.

If someone wants a command (or a hash table or other
mapping similar to `ucs-names') that provides and uses
handy non-Unicode names, s?he can easily define it.

And if Emacs itself wants to provide such a command or
such a map-producing function it can do it.  But please
do not do this to `ucs-names' or the default behavior
for character insertion (i.e., `C-x 8 RET').  Every such
possible "improvement" of character names for one person
is sure to be a detriment to someone else and other use
cases.

Unicode itself does not define additional LAMDA versions
of character names.  `ucs-names' and `C-x 8 RET' should
respect that and faithfully reflect the Unicode standard.
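
The alternative described above, a separate mapping and command that leave `ucs-names` untouched, might look roughly like this. It is a sketch under stated assumptions, not anything proposed in the thread: it assumes Emacs 26 or later, where `(ucs-names)` returns a hash table, and both `my/ucs-names-with-lambda` and `my/insert-char-loose` are hypothetical names.

```elisp
;; Hypothetical names throughout; `ucs-names' itself is not modified.
(defun my/ucs-names-with-lambda ()
  "Return a copy of the `ucs-names' table with \"LAMBDA\" aliases added."
  (let ((table (copy-hash-table (ucs-names)))
        (case-fold-search nil))
    (maphash (lambda (name char)
               (when (string-match "LAMDA" name)
                 (let ((alias (replace-match "LAMBDA" t t name)))
                   ;; Keep any existing entry, e.g. one from an old-name.
                   (unless (gethash alias table)
                     (puthash alias char table)))))
             (ucs-names))
    table))

(defun my/insert-char-loose ()
  "Read a Unicode name, accepting \"LAMBDA\" spellings, and insert the char."
  (interactive)
  (let* ((table (my/ucs-names-with-lambda))
         (completion-ignore-case t)
         (name (completing-read "Unicode name (LAMBDA allowed): "
                                table nil t))
         (char (gethash (upcase name) table)))
    (if char
        (insert-char char)
      (user-error "No character named %s" name))))
```

Because the aliases live in a copy, other code that reads `ucs-names` would keep seeing only whatever that table already contains.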




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30513; Package emacs. (Fri, 23 Feb 2018 19:51:01 GMT)

Message #34 received at 30513 <at> debbugs.gnu.org:

From: Marcin Borkowski <mbork <at> mbork.pl>
To: Drew Adams <drew.adams <at> oracle.com>
Cc: michipili <at> gmail.com, Eli Zaretskii <eliz <at> gnu.org>,
 Noam Postavsky <npostavs <at> gmail.com>, 30513 <at> debbugs.gnu.org
Subject: Re: bug#30513: Unicode Character Name is misspelled (MATHEMATICAL
 ITALIC SMALL LAMDA)
Date: Fri, 23 Feb 2018 20:50:20 +0100
On 2018-02-18, at 20:49, Drew Adams <drew.adams <at> oracle.com> wrote:

>> >> +                (if (and (not old-name) new-name (string-match
>> "LAMDA" new-name))
>> >> +                    (puthash (replace-match "LAMBDA" t t new-name) c
>> names))
>> >
>> > Won't this make ucs-names even larger and more redundant?
>>
>> It will make ucs-names slightly larger and more redundant.  I think the
>> trade-off is worth it.  To give precise numbers, it adds 12 entries for
>> a total of 42857, which is 0.029%.
>
> I would not make the point that this adds too many chars
> for `ucs-names' or for `C-x 8 RET'.
>
> I would make the point that we should not be inventing
> character names and then associating such inventions with
> what has heretofore been a pretty faithful reflection of
> the Unicode standard.

How about this one?

︘
PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRAKCET
                                                         ^^

And what about this one?  Press C-x 8 RET MATHEMATICAL ITALIC SMALL TAB
and try to answer the puzzle: where has the "MATHEMATICAL ITALIC SMALL H"
gone?

(The answer, rot'd-13 so that I don't spoil it for Unicode wannabe
detectives;-): CYNAPX PBAFGNAG, U+210E.)

IOW, I would argue that _some_ kind of system to help the user overcome
the inherent Unicode problems might be a good idea.

Best,

--
Marcin Borkowski




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30513; Package emacs. (Fri, 23 Feb 2018 20:16:01 GMT)

Message #37 received at 30513 <at> debbugs.gnu.org:

From: Drew Adams <drew.adams <at> oracle.com>
To: Marcin Borkowski <mbork <at> mbork.pl>
Cc: michipili <at> gmail.com, Eli Zaretskii <eliz <at> gnu.org>,
 Noam Postavsky <npostavs <at> gmail.com>, 30513 <at> debbugs.gnu.org
Subject: RE: bug#30513: Unicode Character Name is misspelled (MATHEMATICAL
 ITALIC SMALL LAMDA)
Date: Fri, 23 Feb 2018 12:15:28 -0800 (PST)
> > I would not make the point that this adds too many chars
> > for `ucs-names' or for `C-x 8 RET'.
> >
> > I would make the point that we should not be inventing
> > character names and then associating such inventions with
> > what has heretofore been a pretty faithful reflection of
> > the Unicode standard.
> 
> How about this one?
> 
> ︘
> PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRAKCET
>                                                          ^^
> And what about this one?  Press C-x 8 RET MATHEMATICAL ITALIC SMALL TAB
> and try to answer the puzzle: where has the "MATHEMATICAL ITALIC SMALL H"
> gone?
> 
> (The answer, rot'd-13 so that I don't spoil it for Unicode wannabe
> detectives;-): CYNAPX PBAFGNAG, U+210E.)
> 
> IOW, I would argue that _some_ kind of system to help the user
> overcome the inherent Unicode problems might be a good idea.

Agreed: some help would help. ;-)  But not at the cost of
changing `ucs-names'.

You snipped most of my post, including the part that said
that although we should leave the set of Unicode names as
Unicode defines them, so that `ucs-names' remains faithful
to the standard, we can certainly add Emacs constructs (e.g.
commands, completion functions, whatever), to help users
use alternate names of our own invention, including spelling
corrections.

The fault is not with `ucs-names'.  The fault, if there
be any, is with the ways we currently _make use of it_
for users.

We could offer additional or alternative ways for users to
make use of it.  We could, for example, change `insert-char'
to respect a user option that expresses just how much such
help to provide, e.g., the degree of spelling help,
correction, abbreviation, or whatever.

If we do that then we should at least allow one of the
option values to mean that no such help is to be offered,
in which case `insert-char' would offer only the official
names.

There are other uses of `ucs-names', beyond `insert-char',
at least in 3rd-party libraries.  We should definitely not
assume that all uses of `ucs-names' should benefit or be
troubled by any Emacs-specific "improvements" we might
want to offer for the available char names.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30513; Package emacs. (Sat, 24 Feb 2018 20:42:02 GMT)

Message #40 received at 30513 <at> debbugs.gnu.org:

From: Marcin Borkowski <mbork <at> mbork.pl>
To: Drew Adams <drew.adams <at> oracle.com>
Cc: michipili <at> gmail.com, Noam Postavsky <npostavs <at> gmail.com>,
 30513 <at> debbugs.gnu.org
Subject: Re: bug#30513: Unicode Character Name is misspelled (MATHEMATICAL
 ITALIC SMALL LAMDA)
Date: Sat, 24 Feb 2018 21:41:31 +0100
On 2018-02-23, at 21:15, Drew Adams <drew.adams <at> oracle.com> wrote:

>> > I would not make the point that this adds too many chars
>> > for `ucs-names' or for `C-x 8 RET'.
>> >
>> > I would make the point that we should not be inventing
>> > character names and then associating such inventions with
>> > what has heretofore been a pretty faithful reflection of
>> > the Unicode standard.
>> 
>> How about this one?
>> 
>> ︘
>> PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRAKCET
>>                                                          ^^
>> And what about this one?  Press C-x 8 RET MATHEMATICAL ITALIC SMALL TAB
>> and try to answer the puzzle: where has the "MATHEMATICAL ITALIC SMALL H"
>> gone?
>> 
>> (The answer, rot'd-13 so that I don't spoil it for Unicode wannabe
>> detectives;-): CYNAPX PBAFGNAG, U+210E.)
>> 
>> IOW, I would argue that _some_ kind of system to help the user
>> overcome the inherent Unicode problems might be a good idea.
>
> Agreed: some help would help. ;-)  But not at the cost of
> changing `ucs-names'.
>
> You snipped most of my post, including the part that said
> that although we should leave the set of Unicode names as
> Unicode defines them, so that `ucs-names' remains faithful
> to the standard, we can certainly add Emacs constructs (e.g.
> commands, completion functions, whatever), to help users
> use alternate names of our own invention, including spelling
> corrections.
>
> The fault is not with `ucs-names'.  The fault, if there
> be any, is with the ways we currently _make use of it_
> for users.

I agree.

> We could offer additional or alternative ways for users to
> make use of it.  We could, for example, change `insert-char'
> to respect a user option that expresses just how much such
> help to provide, e.g., the degree of spelling help,
> correction, abbreviation, or whatever.
>
> If we do that then we should at least allow one of the
> option values to mean that no such help is to be offered,
> in which case `insert-char' would offer only the official
> names.
>
> There are other uses of `ucs-names', beyond `insert-char',
> at least in 3rd-party libraries.  We should definitely not
> assume that all uses of `ucs-names' should benefit or be
> troubled by any Emacs-specific "improvements" we might
> want to offer for the available char names.

+1.  For one interesting use of ucs-names, see my blog post here:
http://mbork.pl/2017-10-02_Converting_TeX_sequences_to_Unicode_characters

Best,

--
Marcin Borkowski




Removed tag(s) patch. Request was from Noam Postavsky <npostavs <at> gmail.com> to control <at> debbugs.gnu.org. (Wed, 07 Mar 2018 12:55:01 GMT)

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#30513; Package emacs. (Fri, 04 Sep 2020 04:46:01 GMT)

Message #45 received at 30513 <at> debbugs.gnu.org:

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Noam Postavsky <npostavs <at> gmail.com>
Cc: Michael Grünewald <michipili <at> gmail.com>,
 30513 <at> debbugs.gnu.org
Subject: Re: bug#30513: Unicode Character Name is misspelled (MATHEMATICAL
 ITALIC SMALL LAMDA)
Date: Fri, 04 Sep 2020 06:45:46 +0200
Noam Postavsky <npostavs <at> gmail.com> writes:

> Yeah, this is fresh in my memory since I was recently playing with ucs-insert
> and was a bit surprised to discover "LAMBDA" under the *old* name:
>
>   name: GREEK SMALL LETTER LAMDA
>   old-name: GREEK SMALL LETTER LAMBDA
>
> Anyway, I think the following patch should smooth things over:

The discussion then turned to whether this would add a lot of redundant
entries, but it's just 12, so I think that's fine.  It was also
suggested that there should be a more general mechanism to provide
alternative/fixed spellings for any kind of oddly-spelled Unicode
character, and that's true.

But in isolation, and because Emacs is a Lisp system, I think Noam's
patch makes sense, and I've applied it (with some cosmetic changes) to
Emacs 28.

If anybody disagrees, feel free to revert.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no
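
Whether a particular Emacs build accepts the "LAMBDA" spelling can be checked from Lisp. This is a small sketch, assuming `char-from-name` (available since Emacs 26) looks names up in the same table that `ucs-names` builds; the result of the first form depends on the Emacs version in use.

```elisp
;; Returns the code point when the alias is known, nil otherwise.
(char-from-name "MATHEMATICAL ITALIC SMALL LAMBDA")

;; The official Unicode spelling should resolve either way.
(char-from-name "MATHEMATICAL ITALIC SMALL LAMDA")
```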




Added tag(s) fixed. Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Fri, 04 Sep 2020 04:47:01 GMT)

bug marked as fixed in version 28.1, send any further explanations to 30513 <at> debbugs.gnu.org and Michael Grünewald <michipili <at> gmail.com> Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Fri, 04 Sep 2020 04:47:01 GMT)

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Fri, 02 Oct 2020 11:24:04 GMT)

This bug report was last modified 3 years and 207 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.