GNU bug report logs - #19994
25.0.50; Unicode keyboard input on Windows

Package: emacs;

Reported by: Ilya Zakharevich <nospam-abuse <at> ilyaz.org>

Date: Tue, 3 Mar 2015 23:11:02 UTC

Severity: normal

Found in version 25.0.50

Done: Stefan Kangas <stefan <at> marxist.se>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 19994 in the body.
You can then email your comments to 19994 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#19994; Package emacs. (Tue, 03 Mar 2015 23:11:02 GMT)

Acknowledgement sent to Ilya Zakharevich <nospam-abuse <at> ilyaz.org>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Tue, 03 Mar 2015 23:11:03 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Ilya Zakharevich <nospam-abuse <at> ilyaz.org>
To: bug-gnu-emacs <at> gnu.org
Subject: 25.0.50; Unicode keyboard input on Windows
Date: Tue, 3 Mar 2015 15:09:49 -0800
I’m working on a patch to make Unicode keyboard input work properly on
Windows (in graphic mode).  The problems with the current implementation 
stem from the facts that

  • on Windows, it IS possible to implement a bullet-proof system of Unicode
    input (at least, for GUI applications);

  • However, how to do it is completely undocumented.

      [See
        http://search.cpan.org/~ilyaz/UI-KeyboardLayout/lib/UI/KeyboardLayout.pm#Keyboard_input_on_Windows:_interaction_of_applications_and_the_kernel
      ]

So, essentially, all developers of applications try to design their own 
set of heuristic approaches which

  • cover several keyboard layouts they can put their hands on;

  • more or less follow the design goals of their applications.

The approach taken by Emacs is to break the keyboard keys (VK’s) into 
several groups, and treat different groups differently.  Only the keys on
the main island of the keyboard may input characters.  Moreover, only 
the most common combinations of modifiers are allowed to be used for
the character input.  (In addition, there are plain bugs — like treating
UTF-16 as if it were UTF-32.)

  [I gave a very terse description on
     https://groups.google.com/forum/?hl=en#!search/emacs$20keyboard$20windows$20ilya/gnu.emacs.help/ZHpZK2YfFuo/aAyZFUxrFeEJ
  ]

The “correct” approach should proceed in exactly the opposite direction:
if a keypress produces a character, it should be treated as a 
character — no matter where on the physical keyboard the key is residing,
and which modifiers were pressed.

The patch below

  • Implements this “primacy of characters” doctrine;
  
  • As far as I could see, is compatible with the current work of Emacs
    on “simple keyboard layouts”;
  
  • Worked at some moment (before I started a massive addition of 
    comments ;-] — and maybe it is still working, I did not touch it for a
    month);
  
  • (Currently) ignores the indent coding rules;
  
  • Passes all the tests thrown at it by my super-puper-all-bells-and-whistles
    layouts; see e.g.
       http://k.ilyaz.org/windows/izKeys-visual-maps.html#examples
  
  • Is not bullet-proof: 
      ∘ I use one heuristic to detect which modifiers are “consumed” by the
        character input, and which are “on top” of character input;

      ∘ It does not (same as the current Emacs) support 
          Unicode-entered-by-Alt-numbers.
  
  • Does not fix a bug with UTF-16 of stand-alone (pumped to us) WM_CHAR’s.
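
The UTF-16 handling mentioned in the points above can be sketched outside of
the message pump.  This is a minimal sketch, assuming the code units are
already collected in an array; decode_utf16_units is a hypothetical name, not
a function of the patch.  It follows the same policy as get_wm_chars in the
patch below: unpaired surrogates are passed through as if they were
characters, and a high surrogate at the very end is dropped.  One small
deviation: a high surrogate that follows an unpaired one is still given a
chance to pair up.

```c
#include <assert.h>
#include <stddef.h>

/* Decode N UTF-16 code units from IN into code points in OUT.
   OUT needs room for at most N entries.  Returns the number of
   code points written.  (Hypothetical helper, for illustration.)  */
static size_t
decode_utf16_units (const unsigned short *in, size_t n, int *out)
{
  size_t count = 0;
  int pending = 0;	/* high surrogate waiting for its partner, or 0 */

  assert (in != NULL && out != NULL);
  for (size_t k = 0; k < n; k++)
    {
      int u = in[k];

      if (pending)
        {
          if (u >= 0xDC00 && u <= 0xDFFF)
            {
              /* Proper pair: combine into one code point.  */
              out[count++] = 0x10000 + ((pending - 0xD800) << 10)
                             + (u - 0xDC00);
              pending = 0;
              continue;
            }
          /* Unpaired high surrogate: pass it through as is.  */
          out[count++] = pending;
          pending = 0;
        }
      if (u >= 0xD800 && u <= 0xDBFF)
        pending = u;	/* wait for the next unit */
      else
        out[count++] = u;	/* BMP char or unpaired low surrogate */
    }
  return count;		/* a trailing high surrogate is dropped */
}
```

Treating each 16-bit unit as a full character, as the text above says the
current code does, would instead report two bogus characters for every
character outside the BMP.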

If I ever find more time to work on it, I plan to:

  1) Add yet more documentation;

  2) Change a little bit the logic of detection of consumed/extra 
     modifiers.  This change may be cosmetic only — or maybe, with some 
     extremely devilish layouts, it may be beneficial.
     
     (I have not seen layouts where this change would matter, though!
      And I looked through the source code of hundred(s).)

  3) Bring it in sync with the Emacs coding style.

Meanwhile, I would greatly appreciate all input related to the current 
state of the patch.  (I *HOPE* that I did not break (many!) special cases
in the current implementation, but such things are hard to be sure of!)

Thanks for the parts of Emacs which ARE working great,
Ilya

=======================================================

--- w32fns.c-ini	2015-01-30 15:33:23.505201400 -0800
+++ w32fns.c	2015-02-15 02:46:12.070091800 -0800
@@ -2832,6 +2832,126 @@ post_character_message (HWND hwnd, UINT
   my_post_msg (&wmsg, hwnd, msg, wParam, lParam);
 }
 
+static int
+get_wm_chars (HWND aWnd, int *buf, int buflen, int ignore_ctrl, int ctrl, int *ctrl_cnt, int *is_dead, int vk, int exp)
+{
+  MSG msg;
+  int i = buflen, doubled = 0, code_unit;	/* If doubled is at the end, ignore it */
+  if (ctrl_cnt)
+    *ctrl_cnt = 0;
+  if (is_dead)
+    *is_dead = -1;
+  while (buflen &&				/* Should be called only when w32_unicode_gui */
+         PeekMessageW(&msg, aWnd, WM_KEYFIRST, WM_KEYLAST, PM_NOREMOVE | PM_NOYIELD) &&
+         (msg.message == WM_CHAR || msg.message == WM_SYSCHAR || 
+          msg.message == WM_DEADCHAR || msg.message == WM_SYSDEADCHAR || msg.message == WM_UNICHAR)) {	/* Not contiguous */
+    int dead;
+
+    GetMessageW(&msg, aWnd, msg.message, msg.message);
+    dead = (msg.message == WM_DEADCHAR || msg.message == WM_SYSDEADCHAR);
+    if (is_dead)
+      *is_dead = (dead ? msg.wParam : -1);
+    if (dead)
+      continue;
+    code_unit = msg.wParam;
+    if (doubled) {				/* had surrogate */
+      if (msg.message == WM_UNICHAR || code_unit < 0xDC00 || code_unit > 0xDFFF) {
+        /* Mismatched first surrogate.  Pass both code units as if they were two characters. */
+        *buf++ = doubled;
+        if (!--buflen)	// Drop the second char if at the end of the buffer
+          return i;
+      } else {
+        code_unit = (doubled << 10) + code_unit - 0x35FDC00;
+      }
+      doubled = 0;
+    } else if (code_unit >= 0xD800 && code_unit <= 0xDBFF) {
+      doubled = code_unit;
+      continue;
+    }    /* We handle mismatched second surrogate the same as a normal character. */
+    /* The only "fake" characters delivered by ToUnicode() or TranslateMessage() are: 
+       0x01 .. 0x1a for Control-chars, 
+       0x00 and 0x1b .. 0x1f for Control- []\@^_ 
+       0x7f for Control-BackSpace
+       0x20 for Control-Space */
+    if (ignore_ctrl && (code_unit < 0x20 || code_unit == 0x7f || (code_unit == 0x20 && ctrl))) {
+      /* Non-character payload in a WM_CHAR (Ctrl-something pressed).  Ignore. */
+      if (ctrl_cnt)
+        (*ctrl_cnt)++;
+      continue;
+    }
+    if (code_unit < 0x7f && 
+        ((vk >= VK_NUMPAD0 && vk <= VK_DIVIDE) ||
+         (exp && ((vk >= VK_PRIOR && vk <= VK_DOWN) || 
+                   vk == VK_INSERT || vk == VK_DELETE || vk == VK_CLEAR))) &&
+         strchr("0123456789/*-+.,", code_unit))	/* Traditionally, Emacs translates these to characters later, in `self-insert-character' */
+	continue;
+    *buf++ = code_unit;
+    buflen--;
+  }
+  return i - buflen;
+}
+
+int
+deliver_wm_chars (int do_translate, HWND hwnd, UINT msg, UINT wParam, UINT lParam)
+{
+  /* An "old style" keyboard description may assign up to 125 UTF-16 code points to a keypress. 
+     (However, the "old style" TranslateMessage() would deliver at most 16 of them.)  Be on a
+     safe side, and prepare to treat many more. */
+  int ctrl_cnt, buf[1024], count, is_dead;
+
+  if (do_translate) {
+      MSG windows_msg = { hwnd, msg, wParam, lParam, 0, {0,0} };
+
+      windows_msg.time = GetMessageTime ();
+      TranslateMessage (&windows_msg);
+  }
+  count = get_wm_chars (hwnd, buf, sizeof(buf)/sizeof(*buf), 1,
+                        /* The message may have been synthesized by who knows what; be conservative. */
+                        modifier_set (VK_LCONTROL) || modifier_set (VK_RCONTROL) || modifier_set (VK_CONTROL), 
+                        &ctrl_cnt, &is_dead, wParam, (lParam & 0x1000000L) != 0);
+  if (count) {
+    W32Msg wmsg;
+    int *b = buf, strip_Alt = 1;
+
+    /* wParam is checked when converting CapsLock to Shift */
+    wmsg.dwModifiers = do_translate ? w32_get_key_modifiers (wParam, lParam) : 0;
+
+    /* What follows is just heuristics; the correct treatment requires non-destructive ToUnicode(). */
+    if (wmsg.dwModifiers & ctrl_modifier)	/* If ctrl-something delivers chars, ctrl and the rest should be hidden */
+      wmsg.dwModifiers = wmsg.dwModifiers & shift_modifier;
+    /* In many keyboard layouts, (left) Alt is not changing the character.  Unless we are in this situation, strip Alt/Meta. */
+    if (wmsg.dwModifiers & (alt_modifier | meta_modifier) &&	/* If alt-something delivers non-ASCIIchars, alt should be hidden */
+        count == 1 && *b < 0x10000) {
+      SHORT r = VkKeyScanW( *b );
+
+      fprintf(stderr, "VkKeyScanW %#06x %#04x\n", (int)r, wParam);
+      if ((r & 0xFF) == wParam && !(r & ~0x1FF)) {	/* Char available without Alt modifier, so Alt is "on top" */
+         if (*b > 0x7f && ('A' <= wParam && wParam <= 'Z'))
+           return 0;					/* Another branch below would convert it to Alt-Latin char via wParam */	
+         strip_Alt = 0;
+      }
+    }
+    if (strip_Alt)
+      wmsg.dwModifiers = wmsg.dwModifiers & ~(alt_modifier | meta_modifier);
+    
+    signal_user_input ();
+    while (count--)
+      {
+        fprintf(stderr, "unichar %#06x\n", *b);
+        my_post_msg (&wmsg, hwnd, WM_UNICHAR, *b++, lParam);
+      }
+    if (!ctrl_cnt)	/* Process ALSO as ctrl */
+      return 1;
+    else
+        fprintf(stderr, "extra ctrl char\n");
+    return -1;
+  } else if (is_dead >= 0) {
+      fprintf(stderr, "dead %#06x\n", is_dead);
+      return 1;
+  }
+  return 0;
+}
+
 /* Main window procedure */
 
 static LRESULT CALLBACK
@@ -3007,7 +3127,6 @@ w32_wnd_proc (HWND hwnd, UINT msg, WPARA
       /* Synchronize modifiers with current keystroke.  */
       sync_modifiers ();
       record_keydown (wParam, lParam);
-      wParam = map_keypad_keys (wParam, (lParam & 0x1000000L) != 0);
 
       windows_translate = 0;
 
@@ -3117,6 +3236,45 @@ w32_wnd_proc (HWND hwnd, UINT msg, WPARA
 	    wParam = VK_NUMLOCK;
 	  break;
 	default:
+	  if (w32_unicode_gui) {	
+	    /* If this event generates characters or deadkeys, do not interpret 
+	       it as a "raw combination of modifiers and keysym".  Hide  
+	       deadkeys, and use the generated character(s) instead of the  
+	       keysym.   (Backward compatibility: exceptions for numpad keys 
+	       generating 0-9 . , / * - +, and for extra-Alt combined with a 
+	       non-Latin char.) 
+	       
+	       Try to not report modifiers which have effect on which 
+	       character or deadkey is generated.
+	       
+	       Example (contrived): if rightAlt-? generates f (on a Cyrillic 
+	       keyboard layout), and Ctrl, leftAlt do not affect the generated
+	       character, one wants to report Ctrl-leftAlt-f if the user 
+	       presses Ctrl-leftAlt-rightAlt-?. */
+	    int res; 
+#if 0
+	    /* Some of WM_CHAR may be fed to us directly, some are results of 
+	       TranslateMessage().  Using 0 as the first argument (in a 
+	       separate call) might help us distinguish these two cases.
+
+	       However, the keypress feeders would most probably expect the
+	       "standard" message pump, when TranslateMessage() is called on 
+	       EVERY KeyDown/Keyup event.  So they may feed us Down-Ctrl
+	       Down-FAKE Char-o and expect us to recognize it as Ctrl-o.
+	       Using 0 as the first argument would interfere with this.  */
+	    deliver_wm_chars (0, hwnd, msg, wParam, lParam);
+#endif
+	    /* Processing the generated WM_CHAR messages *WHILE* we handle 
+	       KEYDOWN/UP event is the best choice, since without any fuss, 
+	       we know all 3 of: scancode, virtual keycode, and expansion. 
+	       (Additionally, one knows boundaries of expansion of different
+	       keypresses.) */
+	    res = deliver_wm_chars (1, hwnd, msg, wParam, lParam);
+	    windows_translate = -( res != 0 );
+	    if (res > 0)		/* Bound to character(s) or a deadkey */
+	      break;
+	  }				/* Some branches after this one may be not needed */
+          wParam = map_keypad_keys (wParam, (lParam & 0x1000000L) != 0);
 	  /* If not defined as a function key, change it to a WM_CHAR message. */
 	  if (wParam > 255 || !lispy_function_keys[wParam])
 	    {
@@ -3184,6 +3342,8 @@ w32_wnd_proc (HWND hwnd, UINT msg, WPARA
 	    }
 	}
 
+    if (windows_translate == -1)
+      break;
     translate:
       if (windows_translate)
 	{


=======================================================
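
The control-character test inside get_wm_chars above can be isolated for
reading or testing.  A minimal sketch, assuming the code unit and the state
of the Ctrl keys are passed in directly; is_control_payload is a hypothetical
name that does not appear in the patch.  The ranges are the ones listed in
the patch's comment on "fake" characters.

```c
#include <assert.h>

/* Return nonzero if CODE_UNIT, taken from a WM_CHAR message, is one
   of the non-character payloads listed in the patch:
     0x00 .. 0x1f  control characters,
     0x7f          Control-BackSpace,
     0x20          Control-Space (only when CTRL_DOWN is nonzero).
   (Hypothetical helper, for illustration only.)  */
static int
is_control_payload (int code_unit, int ctrl_down)
{
  assert (code_unit >= 0);
  return code_unit < 0x20
         || code_unit == 0x7f
         || (code_unit == 0x20 && ctrl_down);
}
```

A plain space is an ordinary character; only with Ctrl held is 0x20 counted
as a control payload, which is why the state of Ctrl has to be passed in.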



In GNU Emacs 25.0.50.20 (i686-pc-mingw32)
 of 2015-02-08 on BUCEFAL
Repository revision: d5e3922e08587e7eb9e5aec2e9f84cbda405f857
Windowing system distributor `Microsoft Corp.', version 6.1.7601
Configured using:
 `configure --prefix=/k/test'

Configured features:
SOUND NOTIFY ACL

Important settings:
  value of $LANG: ENU
  locale-coding-system: cp1252

Major mode: Fundamental

Minor modes in effect:
  tooltip-mode: t
  global-eldoc-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  buffer-read-only: t
  line-number-mode: t

Recent messages:
For information about GNU Emacs and the GNU system, type C-h C-a.

Load-path shadows:
None found.

Features:
(shadow sort gnus-util mail-extr emacsbug message dired format-spec
rfc822 mml easymenu mml-sec mm-decode mm-bodies mm-encode mail-parse
rfc2231 mailabbrev gmm-utils mailheader sendmail rfc2047 rfc2045
ietf-drums mm-util help-fns mail-prsvr mail-utils time-date tooltip
eldoc electric uniquify ediff-hook vc-hooks lisp-float-type mwheel
dos-w32 ls-lisp disp-table w32-win w32-vars tool-bar dnd fontset image
regexp-opt fringe tabulated-list newcomment elisp-mode lisp-mode
prog-mode register page menu-bar rfn-eshadow timer select scroll-bar
mouse jit-lock font-lock syntax facemenu font-core frame cham georgian
utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao korean
japanese hebrew greek romanian slovak czech european ethiopic indian
cyrillic chinese case-table epa-hook jka-cmpr-hook help simple abbrev
minibuffer cl-preloaded nadvice loaddefs button faces cus-face macroexp
files text-properties overlay sha1 md5 base64 format env code-pages mule
custom widget hashtable-print-readable backquote make-network-process
w32notify w32 multi-tty emacs)

Memory information:
((conses 8 80324 9864)
 (symbols 32 17968 0)
 (miscs 32 85 128)
 (strings 16 12688 4007)
 (string-bytes 1 324435)
 (vectors 8 9470)
 (vector-slots 4 390690 6074)
 (floats 8 65 62)
 (intervals 28 243 45)
 (buffers 516 13))




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#19994; Package emacs. (Wed, 04 Mar 2015 18:02:01 GMT)

Message #8 received at 19994 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Ilya Zakharevich <nospam-abuse <at> ilyaz.org>
Cc: 19994 <at> debbugs.gnu.org
Subject: Re: bug#19994: 25.0.50; Unicode keyboard input on Windows
Date: Wed, 04 Mar 2015 20:01:01 +0200
> Date: Tue, 3 Mar 2015 15:09:49 -0800
> From: Ilya Zakharevich <nospam-abuse <at> ilyaz.org>
> 
> I’m working on a patch to make Unicode keyboard input work properly on
> Windows (in graphic mode).

Thanks!

> The patch below
> 
>   • Implements this “primacy of characters” doctrine;
>  
>   • As far as I could see, is compatible with the current work of Emacs
>     on “simple keyboard layouts”;
>  
>   • Worked at some moment (before I started a massive addition of 
>     comments ;-] — and maybe it is still working, I did not touch it for a
>     month);
>  
>   • (Currently) ignores the indent coding rules;
>  
>   • Passes all the tests thrown at it by my super-puper-all-bells-and-whistles
>     layouts; see e.g.
>        http://k.ilyaz.org/windows/izKeys-visual-maps.html#examples

Any chance of coming up with a few tests for this code, and adding
them to the test/ directory?

> If I ever find more time to work on it, I plan to:
>
>   1) Add yet more documentation;
> 
>   2) Change a little bit the logic of detection of consumed/extra 
>      modifiers.  This change may be cosmetic only — or maybe, with some 
>      extremely devilish layouts, it may be beneficial.
>     
>      (I have not seen layouts where this change would matter, though!
>       And I looked through the source code of hundred(s).)
> 
>   3) Bring it in sync with the Emacs coding style.

I suggest, indeed, to clean up the code so we could commit it to the
master branch.  That way, it will get wider testing, and we can fix
whatever problems it might cause.  Any deficiencies that don't cause
regressions wrt the current code can be fixed later, or even not at
all (if we decide they are not important enough).

Question: did you try this code with IME input methods?

> Meanwhile, I would greatly appreciate all input related to the current 
> state of the patch.

Some of that (but not much) below.

> +static int
> +get_wm_chars (HWND aWnd, int *buf, int buflen, int ignore_ctrl, int ctrl, int
                            ^^^^^^^^
Why 'int' and not 'wchar_t'?

> +  while (buflen &&                             /* Should be called only when  w32_unicode_gui */
> +         PeekMessageW(&msg, aWnd, WM_KEYFIRST, WM_KEYLAST, PM_NOREMOVE | PM_NOYIELD) &&

Indeed, any "wide" APIs should only be called when w32_unicode_gui is
on, and there should be alternative code for when w32_unicode_gui is
off.  We still try to support Windows 9X.

> +      if (msg.message == WM_UNICHAR || code_unit < 0xDC00 || code_unit > 0xDFFF) {
> +        /* Mismatched first surrogate.  Pass both code units as if they were two characters. */
> +        *buf++ = doubled;
> +        if (!--buflen) // Drop the second char if at the end of the buffer
> +          return i;
> +      } else {
> +        code_unit = (doubled << 10) + code_unit - 0x35FDC00;
> +      }
> +      doubled = 0;
> +    } else if (code_unit >= 0xD800 && code_unit <= 0xDBFF) {

Either explain the "magic" constants in comments, or, better, use
macros with descriptive names.
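
For illustration, the macro variant could look roughly like the sketch below.
The macro and function names are hypothetical; only the hex values are taken
from the patch.  The second function keeps the patch's one-line form, so the
two can be compared: 0x35FDC00 is (0xD800 << 10) + 0xDC00 - 0x10000.

```c
#include <assert.h>

/* Hypothetical names; the patch itself uses bare hex constants.  */
#define HIGH_SURROGATE_FIRST 0xD800
#define HIGH_SURROGATE_LAST  0xDBFF
#define LOW_SURROGATE_FIRST  0xDC00
#define LOW_SURROGATE_LAST   0xDFFF
#define SUPPLEMENTARY_BASE   0x10000	/* first code point above the BMP */

static int
is_high_surrogate (int u)
{
  return u >= HIGH_SURROGATE_FIRST && u <= HIGH_SURROGATE_LAST;
}

static int
is_low_surrogate (int u)
{
  return u >= LOW_SURROGATE_FIRST && u <= LOW_SURROGATE_LAST;
}

/* Combine a valid surrogate pair into one code point.  */
static int
combine_surrogates (int hi, int lo)
{
  assert (is_high_surrogate (hi) && is_low_surrogate (lo));
  return SUPPLEMENTARY_BASE
         + ((hi - HIGH_SURROGATE_FIRST) << 10)
         + (lo - LOW_SURROGATE_FIRST);
}

/* The one-line form used in the patch, kept for comparison.  */
static int
combine_surrogates_patch_form (int hi, int lo)
{
  return (hi << 10) + lo - 0x35FDC00;
}
```

Both forms give the same result for every valid pair; the named version only
makes the origin of the constant visible.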

> +  int ctrl_cnt, buf[1024], count, is_dead;

I think buf[] should be an array of wchar_t.  Also, will this code
work for the non-w32_unicode_gui mode?

> +  if (count) {
> +    W32Msg wmsg;
> +    int *b = buf, strip_Alt = 1;

Likewise with 'b'.

> +      SHORT r = VkKeyScanW( *b );

VkKeyScanW should be called only if w32_unicode_gui is on.  (Or maybe
the caller is only called when w32_unicode_gui is on, in which case
maybe we should have an eassert there.)
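
To make the test applied to the VkKeyScanW result easier to follow, here is a
minimal sketch that works on an already obtained return value, so it compiles
without the Windows headers.  It relies on the documented layout of that
value: the low-order byte is the virtual-key code, the high-order byte holds
the shift state (1 = Shift, 2 = Ctrl, 4 = Alt), and -1 means that no key
produces the character.  The macro and function names are hypothetical.

```c
#include <assert.h>

/* Shift-state flags as they appear in the full 16-bit return value.  */
#define SCAN_SHIFT 0x100
#define SCAN_CTRL  0x200
#define SCAN_ALT   0x400

/* SCAN is a value as returned by VkKeyScanW for the delivered
   character; VK is the wParam of the keydown message.  Return nonzero
   if the same key produces the character with at most Shift held.
   The patch then treats Alt as being "on top" of the character.
   (Hypothetical helper, for illustration only.)  */
static int
char_available_without_alt (short scan, unsigned vk)
{
  assert (vk <= 0xFF);
  return (unsigned) (scan & 0xFF) == vk
         && !(scan & ~(0xFF | SCAN_SHIFT));
}
```

The mask ~(0xFF | SCAN_SHIFT) is the ~0x1FF of the patch: any flag other than
Shift, as well as the -1 failure value, makes the test fail.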

> +      fprintf(stderr, "VkKeyScanW %#06x %#04x\n", (int)r, wParam);
> +      if ((r & 0xFF) == wParam && !(r & ~0x1FF)) {     /* Char available without Alt modifier, so Alt is "on top" */
> +         if (*b > 0x7f && ('A' <= wParam && wParam <= 'Z'))
> +           return 0;                                   /* Another branch below would convert it to Alt-Latin char via wParam */
> +         strip_Alt = 0;
> +      }
> +    }
> +    if (strip_Alt)
> +      wmsg.dwModifiers = wmsg.dwModifiers & ~(alt_modifier | meta_modifier);
> +    
> +    signal_user_input ();
> +    while (count--)
> +      {
> +        fprintf(stderr, "unichar %#06x\n", *b);
> +        my_post_msg (&wmsg, hwnd, WM_UNICHAR, *b++, lParam);
> +      }
> +    if (!ctrl_cnt)     /* Process ALSO as ctrl */
> +      return 1;
> +    else
> +        fprintf(stderr, "extra ctrl char\n");
> +    return -1;
> +  } else if (is_dead >= 0) {
> +      fprintf(stderr, "dead %#06x\n", is_dead);
> +      return 1;
> +  }

Lots of debugging output here that should be removed.

Thanks again for working on this.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#19994; Package emacs. (Fri, 06 Mar 2015 00:44:02 GMT)

Message #11 received at 19994 <at> debbugs.gnu.org:

From: Ilya Zakharevich <ilya <at> math.berkeley.edu>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 19994 <at> debbugs.gnu.org
Subject: Re: bug#19994: 25.0.50; Unicode keyboard input on Windows
Date: Thu, 5 Mar 2015 16:43:32 -0800
On Wed, Mar 04, 2015 at 08:01:01PM +0200, Eli Zaretskii wrote:
> > +static int
> > +get_wm_chars (HWND aWnd, int *buf, int buflen, int ignore_ctrl, int ctrl, int
>                             ^^^^^^^^
> Why 'int' and not 'wchar_t'?

This is for Unicode chars.  They won’t fit into a (Windows-style) 16-bit wchar_t.
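
A small sketch of the size argument, assuming a 16-bit code unit as with
wchar_t on Windows; uint16_t stands in for it so that the code behaves the
same on any platform.  The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Number of UTF-16 code units needed to store code point CP.  */
static int
utf16_units_needed (int32_t cp)
{
  assert (cp >= 0 && cp <= 0x10FFFF);
  return cp > 0xFFFF ? 2 : 1;
}

/* Nonzero if CP survives a round trip through one 16-bit unit.  */
static int
fits_in_16_bits (int32_t cp)
{
  return (int32_t) (uint16_t) cp == cp;
}
```

For example, U+20AC fits into one unit, while U+1F600 is truncated to 0xF600
when stored in 16 bits and needs two units in UTF-16, hence a buffer of int
for complete characters.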

> > +  while (buflen &&                             /* Should be called only when  w32_unicode_gui */
> > +         PeekMessageW(&msg, aWnd, WM_KEYFIRST, WM_KEYLAST, PM_NOREMOVE | PM_NOYIELD) &&
> 
> Indeed, any "wide" APIs should only be called when w32_unicode_gui is
> on, and there should be alternative code for when w32_unicode_gui is
> off.  We still try to support Windows 9X.

The caller ensures this.  Yes, assert() would be beneficial here.

> > +  int ctrl_cnt, buf[1024], count, is_dead;
> 
> I think buf[] should be an array of wchar_t.  Also, will this code
> work for the non-w32_unicode_gui mode?

This code is pure-GUI.  For non-GUI “bindable” input on Windows the
major hurdle is that 

  (A) I know no way to distinguish a “prefix key” (deadkey) keypress
      from a keypress which should trigger user bindings;

  (B) with “non-destructive ToUnicode()”, one WOULD be able to
      distinguish these two cases, — but I have no clue how to find
      out the current keyboard layout of a console session.

      (There are a lot of examples of code which returns the keyboard
       layout of a window; — but these examples do not work for
       console sessions.  I suppose that the reason is that the window
       is actually owned by a system process, and one does not have
       permissions to access its properties.)

Thanks,
Ilya




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#19994; Package emacs. (Fri, 06 Mar 2015 10:53:01 GMT)

Message #14 received at 19994 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Ilya Zakharevich <ilya <at> math.berkeley.edu>
Cc: 19994 <at> debbugs.gnu.org
Subject: Re: bug#19994: 25.0.50; Unicode keyboard input on Windows
Date: Fri, 06 Mar 2015 12:52:08 +0200
> Date: Thu, 5 Mar 2015 16:43:32 -0800
> From: Ilya Zakharevich <ilya <at> math.berkeley.edu>
> Cc: 19994 <at> debbugs.gnu.org
> 
> On Wed, Mar 04, 2015 at 08:01:01PM +0200, Eli Zaretskii wrote:
> > > +static int
> > > +get_wm_chars (HWND aWnd, int *buf, int buflen, int ignore_ctrl, int ctrl, int
> >                             ^^^^^^^^
> > Why 'int' and not 'wchar_t'?
> 
> This is for Unicode chars.  They won’t fit into a (Windows-style) 16-bit wchar_t.

Right.

> > Also, will this code work for the non-w32_unicode_gui mode?
> 
> This code is pure-GUI.  For non-GUI “bindable” input on Windows the
> major hurdle is that 

No, that's not what I meant.  I meant GUI sessions in which
w32_unicode_gui is zero, i.e. Windows 9X systems.

Console input is a different matter (and is handled separately, see
w32inevt.c).

Thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#19994; Package emacs. (Fri, 06 Mar 2015 11:41:01 GMT)

Message #17 received at 19994 <at> debbugs.gnu.org:

From: Ilya Zakharevich <ilya <at> math.berkeley.edu>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 19994 <at> debbugs.gnu.org
Subject: Re: bug#19994: 25.0.50; Unicode keyboard input on Windows
Date: Fri, 6 Mar 2015 03:40:03 -0800
On Fri, Mar 06, 2015 at 12:52:08PM +0200, Eli Zaretskii wrote:
> > > Also, will this code work for the non-w32_unicode_gui mode?
> > 
> > This code is pure-GUI.  For non-GUI “bindable” input on Windows the
> > major hurdle is that 
> 
> No, that's not what I meant.  I meant GUI sessions in which
> w32_unicode_gui is zero, i.e. Windows 9X systems.

Unless w32_unicode_gui is set, the changes made by this patch are a NOP.

Ilya




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#19994; Package emacs. (Fri, 06 Mar 2015 14:02:02 GMT)

Message #20 received at 19994 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Ilya Zakharevich <ilya <at> math.berkeley.edu>
Cc: 19994 <at> debbugs.gnu.org
Subject: Re: bug#19994: 25.0.50; Unicode keyboard input on Windows
Date: Fri, 06 Mar 2015 16:00:51 +0200
> Date: Fri, 6 Mar 2015 03:40:03 -0800
> From: Ilya Zakharevich <ilya <at> math.berkeley.edu>
> Cc: 19994 <at> debbugs.gnu.org
> 
> On Fri, Mar 06, 2015 at 12:52:08PM +0200, Eli Zaretskii wrote:
> > > > Also, will this code work for the non-w32_unicode_gui mode?
> > > 
> > > This code is pure-GUI.  For non-GUI “bindable” input on Windows the
> > > major hurdle is that 
> > 
> > No, that's not what I meant.  I meant GUI sessions in which
> > w32_unicode_gui is zero, i.e. Windows 9X systems.
> 
> Unless w32_unicode_gui is set, the changes made by this patch are a NOP.

That's fine, thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#19994; Package emacs. (Wed, 01 Jul 2015 10:08:02 GMT)

Message #23 received at 19994 <at> debbugs.gnu.org:

From: Ilya Zakharevich <ilya <at> math.berkeley.edu>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 19994 <at> debbugs.gnu.org
Subject: Re: bug#19994: 25.0.50; Unicode keyboard input on Windows
Date: Wed, 1 Jul 2015 03:07:12 -0700
On Wed, Mar 04, 2015 at 08:01:01PM +0200, Eli Zaretskii wrote:
> > Date: Tue, 3 Mar 2015 15:09:49 -0800
> > From: Ilya Zakharevich <nospam-abuse <at> ilyaz.org>
> > 
> > I’m working on a patch to make Unicode keyboard input work properly on
> > Windows (in graphic mode).

> I suggest, indeed, to clean up the code so we could commit it to the
> master branch.  That way, it will get wider testing, and we can fix
> whatever problems it might cause.  Any deficiencies that don't cause
> regressions wrt the current code can be fixed later, or even not at
> all (if we decide they are not important enough).

I had no time to work on the code itself, but
  • I fixed the formatting,
  • I pumped up the docs,
  • I put in the suggested eassert().

----------------

As it was before, the patch
  • defines two new static functions,
  • delays modification of wParam as late as needed (moves 1 LoC in
    w32_wnd_proc()), and
  • adds 8 LoC to w32_wnd_proc().
The call to these static functions is conditional on w32_unicode_gui.

Enjoy,
Ilya

--- w32fns.c-ini	2015-01-30 15:33:23.505201400 -0800
+++ w32fns.c	2015-07-01 02:56:30.787672000 -0700
@@ -2832,6 +2832,233 @@ post_character_message (HWND hwnd, UINT
   my_post_msg (&wmsg, hwnd, msg, wParam, lParam);
 }
 
+static int
+get_wm_chars (HWND aWnd, int *buf, int buflen, int ignore_ctrl, int ctrl, 
+              int *ctrl_cnt, int *is_dead, int vk, int exp)
+{
+  MSG msg;
+  /* If doubled is at the end, ignore it */
+  int i = buflen, doubled = 0, code_unit;
+
+  if (ctrl_cnt)
+    *ctrl_cnt = 0;
+  if (is_dead)
+    *is_dead = -1;
+  eassert(w32_unicode_gui);
+  while (buflen
+  	 /* Should be called only when w32_unicode_gui: */
+         && PeekMessageW(&msg, aWnd, WM_KEYFIRST, WM_KEYLAST, 
+         	      PM_NOREMOVE | PM_NOYIELD)
+         && (msg.message == WM_CHAR || msg.message == WM_SYSCHAR 
+             || msg.message == WM_DEADCHAR || msg.message == WM_SYSDEADCHAR 
+             || msg.message == WM_UNICHAR)) 
+    { 
+      /* We extract character payload, but in this call we handle only the 
+         characters which come BEFORE the next keyup/keydown message. */
+      int dead;
+
+      GetMessageW(&msg, aWnd, msg.message, msg.message);
+      dead = (msg.message == WM_DEADCHAR || msg.message == WM_SYSDEADCHAR);
+      if (is_dead)
+        *is_dead = (dead ? msg.wParam : -1);
+      if (dead)
+        continue;
+      code_unit = msg.wParam;
+      if (doubled) 
+        { 
+          /* had surrogate */
+          if (msg.message == WM_UNICHAR 
+              || code_unit < 0xDC00 || code_unit > 0xDFFF) 
+            { /* Mismatched first surrogate.  
+                 Pass both code units as if they were two characters. */
+              *buf++ = doubled;
+              if (!--buflen)
+                return i; /* Drop the 2nd char if at the end of the buffer. */
+            } 
+          else /* see https://en.wikipedia.org/wiki/UTF-16 */
+            {
+              code_unit = (doubled << 10) + code_unit - 0x35FDC00;
+            }
+          doubled = 0;
+        } 
+      else if (code_unit >= 0xD800 && code_unit <= 0xDBFF) 
+        {    
+          /* Handle mismatched 2nd surrogate the same as a normal character. */
+          doubled = code_unit;
+          continue;
+        }
+
+      /* The only "fake" characters delivered by ToUnicode() or 
+         TranslateMessage() are: 
+         0x01 .. 0x1a for Ctrl-letter, Enter, Tab, Ctrl-Break, Esc, Backspace
+         0x00 and 0x1b .. 0x1f for Control- []\@^_ 
+         0x7f for Control-BackSpace
+         0x20 for Control-Space */
+      if (ignore_ctrl 
+          && (code_unit < 0x20 || code_unit == 0x7f 
+              || (code_unit == 0x20 && ctrl))) 
+        { 
+          /* Non-character payload in a WM_CHAR
+             (Ctrl-something pressed, see above).  Ignore, and report. */
+          if (ctrl_cnt)
+            (*ctrl_cnt)++;
+          continue;
+        }
+      /* Traditionally, Emacs would ignore the character payload of VK_NUMPAD* 
+         keys, and would treat them later via `function-key-map'.  In addition
+         to usual 102-key NUMPAD keys, this map also treats `kp-'-variants of
+         space, tab, enter, separator, equal.  TAB  and EQUAL, apparently, 
+         cannot be generated on Win-GUI branch.  ENTER is already handled 
+         by the code above.  According to `lispy_function_keys', kp_space is 
+         generated by not-extended VK_CLEAR.  (kp-tab !=  VK_OEM_NEC_EQUAL!). 
+       
+         We do similarly for backward-compatibility, but ignore only the
+         characters restorable later by `function-key-map'. */
+      if (code_unit < 0x7f 
+          && ((vk >= VK_NUMPAD0 && vk <= VK_DIVIDE) 
+              || (exp && ((vk >= VK_PRIOR && vk <= VK_DOWN) || 
+                     vk == VK_INSERT || vk == VK_DELETE || vk == VK_CLEAR))) 
+          && strchr("0123456789/*-+.,", code_unit))
+        continue;
+      *buf++ = code_unit;
+      buflen--;
+    }
+  return i - buflen;
+}
+
+#ifdef DBG_WM_CHARS
+#  define FPRINTF_WM_CHARS(ARG)	fprintf ARG
+#else
+#  define FPRINTF_WM_CHARS(ARG)	0
+#endif
+
+int
+deliver_wm_chars (int do_translate, HWND hwnd, UINT msg, UINT wParam, 
+                  UINT lParam, int legacy_alt_meta)
+{
+  /* An "old style" keyboard description may assign up to 125 UTF-16 code 
+     points to a keypress. 
+     (However, the "old style" TranslateMessage() would deliver at most 16 of 
+     them.)  Be on a safe side, and prepare to treat many more. */
+  int ctrl_cnt, buf[1024], count, is_dead;
+
+  /* Since the keypress processing logic of Windows has a lot of state, it 
+     is important to call TranslateMessage() for every keyup/keydown, AND
+     do it exactly once.  (The actual change of state is done by
+     ToUnicode[Ex](), which is called by TranslateMessage().  So one can
+     call ToUnicode[Ex]() instead.)
+     
+     The "usual" message pump calls TranslateMessage() for EVERY event.
+     Emacs calls TranslateMessage() very selectively (is it needed for doing 
+     some tricky stuff with Win95???  With newer Windows, selectiveness is,
+     most probably, not needed - and harms a lot). 
+     
+     So, with the usual message pump, the following call to TranslateMessage() 
+     is not needed (and is going to be VERY harmful).  With Emacs' message 
+     pump, the call is needed.  */
+  if (do_translate) {
+      MSG windows_msg = { hwnd, msg, wParam, lParam, 0, {0,0} };
+
+      windows_msg.time = GetMessageTime ();
+      TranslateMessage (&windows_msg);
+  }
+  count = get_wm_chars (hwnd, buf, sizeof(buf)/sizeof(*buf), 1,
+                        /* The message may have been synthesized by 
+                           who knows what; be conservative. */
+                        modifier_set (VK_LCONTROL) 
+                          || modifier_set (VK_RCONTROL) 
+                          || modifier_set (VK_CONTROL), 
+                        &ctrl_cnt, &is_dead, wParam, 
+                        (lParam & 0x1000000L) != 0);
+  if (count) {
+    W32Msg wmsg;
+    int *b = buf, strip_Alt = 1;
+
+    /* wParam is checked when converting CapsLock to Shift */
+    wmsg.dwModifiers = do_translate 
+	? w32_get_key_modifiers (wParam, lParam) : 0;
+
+    /* What follows is just heuristics; the correct treatment requires
+       non-destructive ToUnicode():
+         http://search.cpan.org/~ilyaz/UI-KeyboardLayout/lib/UI/KeyboardLayout.pm#Can_an_application_on_Windows_accept_keyboard_events?_Part_IV:_application-specific_modifiers
+
+       What one needs to find is: 
+         * which of the present modifiers AFFECT the resulting char(s) 
+           (so should be stripped, since their EFFECT is "already
+            taken into account" in the string in buf), and 
+         * which modifiers are not affecting buf, so should be reported to
+           the application for further treatment.
+       
+       Example: assume that we know:
+         (A) lCtrl+rCtrl+rAlt modifiers with VK_A key produce a Latin "f"
+             ("may be logical" with a JCUKEN-flavored Russian keyboard layout);
+         (B) removing any one of lCtrl, rCtrl, rAlt changes the produced char;
+         (C) Win-modifier is not affecting the produced character 
+             (this is the common case: happens with all "standard" layouts).
+
+       Suppose the user presses Win+lCtrl+rCtrl+rAlt modifiers with VK_A.
+       What is the intent of the user?  We need to guess the intent to decide  
+       which event to deliver to the application.
+       
+       This looks like reasonable logic: since the Win- modifier does not affect
+       the output string, the user was pressing Win for SOME OTHER purpose.
+       So the user wanted to generate Win-SOMETHING event.  Now, what is
+       something?  If one takes the mantra that "character payload is more 
+       important than the combination of keypresses which resulted in this 
+       payload", then one should ignore lCtrl+rCtrl+rAlt, ignore VK_A, and
+       assume that the user wanted to generate Win-f.
+       
+       Unfortunately, without non-destructive ToUnicode(), checking (B) and (C)
+       is out of the question.  So we use heuristics (hopefully, covering
+       99.9999% of cases).
+     */
+    
+    /* If ctrl-something delivers chars, ctrl and the rest should be hidden; 
+       so the consumer of key-event won't interpret it as an accelerator. */
+    if (wmsg.dwModifiers & ctrl_modifier)
+      wmsg.dwModifiers = wmsg.dwModifiers & shift_modifier;
+    /* In many keyboard layouts, (left) Alt is not changing the character.  
+       Unless we are in this situation, strip Alt/Meta. */
+    if (wmsg.dwModifiers & (alt_modifier | meta_modifier) 
+        /* If alt-something delivers non-ASCII chars, alt should be hidden */
+        && count == 1 && *b < 0x10000) 
+      {
+        SHORT r = VkKeyScanW( *b );
+
+        FPRINTF_WM_CHARS((stderr, "VkKeyScanW %#06x %#04x\n", (int)r, wParam));
+        if ((r & 0xFF) == wParam && !(r & ~0x1FF)) 
+          {	
+            /* Char available without Alt modifier, so Alt is "on top" */
+            if (legacy_alt_meta 
+                && *b > 0x7f && ('A' <= wParam && wParam <= 'Z'))
+	      /* For backward-compatibility with older Emacsen, let
+	         this be processed by another branch below (which would convert 
+	         it to Alt-Latin char via wParam). */
+              return 0;
+            strip_Alt = 0;
+          }
+      }
+    if (strip_Alt)
+      wmsg.dwModifiers = wmsg.dwModifiers & ~(alt_modifier | meta_modifier);
+    
+    signal_user_input ();
+    while (count--)
+      {
+        FPRINTF_WM_CHARS((stderr, "unichar %#06x\n", *b));
+        my_post_msg (&wmsg, hwnd, WM_UNICHAR, *b++, lParam);
+      }
+    if (!ctrl_cnt) /* Process ALSO as ctrl */
+      return 1;
+    else
+        FPRINTF_WM_CHARS((stderr, "extra ctrl char\n"));
+    return -1;
+  } else if (is_dead >= 0) {
+      FPRINTF_WM_CHARS((stderr, "dead %#06x\n", is_dead));
+      return 1;
+  }
+  return 0;
+}
+
 /* Main window procedure */
 
 static LRESULT CALLBACK
@@ -3007,7 +3234,6 @@ w32_wnd_proc (HWND hwnd, UINT msg, WPARA
       /* Synchronize modifiers with current keystroke.  */
       sync_modifiers ();
       record_keydown (wParam, lParam);
-      wParam = map_keypad_keys (wParam, (lParam & 0x1000000L) != 0);
 
       windows_translate = 0;
 
@@ -3117,6 +3343,46 @@ w32_wnd_proc (HWND hwnd, UINT msg, WPARA
 	    wParam = VK_NUMLOCK;
 	  break;
 	default:
+	  if (w32_unicode_gui) {	
+	    /* If this event generates characters or deadkeys, do not interpret 
+	       it as a "raw combination of modifiers and keysym".  Hide  
+	       deadkeys, and use the generated character(s) instead of the  
+	       keysym.   (Backward compatibility: exceptions for numpad keys 
+	       generating 0-9 . , / * - +, and for extra-Alt combined with a 
+	       non-Latin char.) 
+	       
+	       Try to not report modifiers which have effect on which 
+	       character or deadkey is generated.
+	       
+	       Example (contrived): if rightAlt-? generates f (on a Cyrillic 
+	       keyboard layout), and Ctrl, leftAlt do not affect the generated
+	       character, one wants to report Ctrl-leftAlt-f if the user 
+	       presses Ctrl-leftAlt-rightAlt-?. */
+	    int res; 
+#if 0
+	    /* Some of WM_CHAR may be fed to us directly, some are results of 
+	       TranslateMessage().  Using 0 as the first argument (in a 
+	       separate call) might help us distinguish these two cases.
+
+	       However, the keypress feeders would most probably expect the
+	       "standard" message pump, when TranslateMessage() is called on 
+	       EVERY KeyDown/Keyup event.  So they may feed us Down-Ctrl
+	       Down-FAKE Char-o and expect us to recognize it as Ctrl-o.
+	       Using 0 as the first argument would interfere with this.  */
+	    deliver_wm_chars (0, hwnd, msg, wParam, lParam, 1);
+#endif
+	    /* Processing the generated WM_CHAR messages *WHILE* we handle 
+	       KEYDOWN/UP event is the best choice, since without any fuss,
+	       we know all 3 of: scancode, virtual keycode, and expansion. 
+	       (Additionally, one knows boundaries of expansion of different
+	       keypresses.) */
+	    res = deliver_wm_chars (1, hwnd, msg, wParam, lParam, 1);
+	    windows_translate = -( res != 0 );
+	    if (res > 0) /* Bound to character(s) or a deadkey */
+	      break;
+	    /* deliver_wm_chars() may make some branches after this vestigial */
+	  }
+          wParam = map_keypad_keys (wParam, (lParam & 0x1000000L) != 0);
 	  /* If not defined as a function key, change it to a WM_CHAR message. */
 	  if (wParam > 255 || !lispy_function_keys[wParam])
 	    {
@@ -3184,6 +3450,8 @@ w32_wnd_proc (HWND hwnd, UINT msg, WPARA
 	    }
 	}
 
+    if (windows_translate == -1)
+      break;
     translate:
       if (windows_translate)
 	{
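
The keypad exception in the character-collecting loop that precedes
deliver_wm_chars() can be restated as a small predicate.  What follows is
only a sketch in portable C, not code from the patch: the helper name and
the `from_keypad_key' flag (standing in for the virtual-key tests in the
patch) are made up for illustration.  One detail it makes explicit is that
strchr() also matches the terminating NUL of its string argument, so the
sketch rules out a zero code unit before calling it; the thread does not
say whether a zero can reach this point in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper, not part of the patch.  Return true when a
   character produced by a keypad key should be dropped from the
   output buffer, so that the key can be reported later as a `kp-'
   function key and restored via `function-key-map'.  */
static bool
sk_leave_to_function_key_map (unsigned code_unit, bool from_keypad_key)
{
  /* The argument is meant to be a single UTF-16 code unit.  */
  assert (code_unit <= 0xFFFF);

  /* strchr() matches the terminating NUL too, so exclude 0 first.  */
  if (code_unit == 0 || code_unit >= 0x7f || !from_keypad_key)
    return false;

  return strchr ("0123456789/*-+.,", (int) code_unit) != NULL;
}
```

With this predicate, a `5' coming from a keypad key is skipped, while the
same `5' from the main block, or a Cyrillic letter from any key, is kept
in the buffer.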




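The modifier-stripping heuristic in deliver_wm_chars() is easier to follow
when separated from the Win32 calls.  The sketch below is an illustration
under assumptions, not the patch itself: the SK_* modifier bits and both
function names are hypothetical, and the `scan' argument is expected to be
a value in the format returned by VkKeyScanW() (virtual-key code in the
low byte; shift state in the high byte, where 1 is Shift, 2 is Ctrl and 4
is Alt; -1 if no key produces the character).  The legacy_alt_meta
early return and the `*b < 0x10000' range check are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical modifier bits for this sketch; Emacs uses its own values.  */
enum
{
  SK_SHIFT = 1 << 0,
  SK_CTRL = 1 << 1,
  SK_ALT = 1 << 2,
  SK_META = 1 << 3
};

/* Return true if SCAN says the character is produced by key VK with
   at most Shift held, i.e. without Ctrl or Alt.  */
static bool
sk_char_is_on_key (short scan, unsigned vk)
{
  return (unsigned) (scan & 0xFF) == vk && !(scan & ~0x1FF);
}

/* Return the modifiers to report together with CHAR_COUNT characters
   that the keyboard layout has already produced for key VK.  */
static unsigned
sk_reported_modifiers (unsigned mods, int char_count, short scan, unsigned vk)
{
  bool keep_alt;

  assert (char_count > 0);

  /* Ctrl took part in producing the characters: hide everything
     except Shift, so the event is not taken for an accelerator.  */
  if (mods & SK_CTRL)
    mods &= SK_SHIFT;

  /* Keep Alt/Meta only if a single character was produced and the
     same character is available on this key without Alt.  */
  keep_alt = char_count == 1 && sk_char_is_on_key (scan, vk);
  if (!keep_alt)
    mods &= ~(unsigned) (SK_ALT | SK_META);

  return mods;
}
```

For example, with Alt held and a scan value of 0x0041 for key 0x41, Alt is
kept and the event can be reported as Alt plus the character; with a scan
value of 0x0641 (the character needs Ctrl+Alt on that key), Alt is
stripped because it was part of producing the character.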
Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#19994; Package emacs. (Thu, 09 Jul 2015 00:04:01 GMT) Full text and rfc822 format available.

Message #26 received at 19994 <at> debbugs.gnu.org (full text, mbox):

From: Ilya Zakharevich <ilya <at> math.berkeley.edu>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 19994 <at> debbugs.gnu.org
Subject: Re: bug#19994: 25.0.50; Unicode keyboard input on Windows
Date: Wed, 8 Jul 2015 17:02:59 -0700
On Wed, Jul 01, 2015 at 03:07:12AM -0700, Ilya Zakharevich wrote:
> On Wed, Mar 04, 2015 at 08:01:01PM +0200, Eli Zaretskii wrote:

> > I suggest, indeed, to clean up the code so we could commit it to the
> > master branch.  That way, it will get wider testing, and we can fix

> I had no time to work on the code itself, but
>   • I fixed the formatting,
>   • I pumped up the docs,
>   • I put in the suggested eassert().

The variant I sent was too primitive: it did not cover a (common?)
use case in which (with AltGr layouts) leftCtrl+rightCtrl behaves
differently from pressing AltGr:
   • leftCtrl+rightCtrl would trigger C-M-key;
   • altGr would enter the character payload.

This update

  (0) fixes two formatting-style omissions;

  (A) adds A LOAD of new comments;
  (B) treats such important cases (as above) separately;

  (z) marks a piece of old code which does not make any sense.
        (see the last chunk in the relative patch)

Notes:

  • In (B), there are some decisions to make.  I encapsulate these
    decisions into two strings.  For best result, these strings should
    be user-customizable.  However, currently they are just put into
    C #defines.

    When I sit on this more, and if these customizations turn out to
    be useful, one can make them into Lisp variables.

  • There is a bug in the (old) Emacs code which prevents some cases
    treated in (B) from being really useful.  I did not fix it yet.

    To see the bug:
      ∘ switch to layout with AltGr;
      ∘ assume that AltGr-s produces ß (as with US International);
      ∘ pressing AltGr-rightControl-s produces Meta-ß;
      ∘ pressing rightControl-AltGr-s produces C-M-s.
    (I do not think this effect is intentional.)

  • And, BTW, is it documented anywhere that
    leftControl-rightControl-key produces C-M-key?

I include two patches:
  □   absolute (ignore the previous patches)
  □   relative (with whitespace ignored) — for reading.

Enjoy,
Ilya
[w32fns.c-diff-v2 (text/plain, attachment)]
[w32fns.c-diff-v2-relative (text/plain, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#19994; Package emacs. (Fri, 31 Jul 2015 09:24:02 GMT) Full text and rfc822 format available.

Message #29 received at 19994 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Ilya Zakharevich <ilya <at> math.berkeley.edu>
Cc: 19994 <at> debbugs.gnu.org
Subject: Re: bug#19994: 25.0.50; Unicode keyboard input on Windows
Date: Fri, 31 Jul 2015 12:23:00 +0300
> Date: Wed, 8 Jul 2015 17:02:59 -0700
> From: Ilya Zakharevich <ilya <at> math.berkeley.edu>
> Cc: 19994 <at> debbugs.gnu.org
> 
> On Wed, Jul 01, 2015 at 03:07:12AM -0700, Ilya Zakharevich wrote:
> > On Wed, Mar 04, 2015 at 08:01:01PM +0200, Eli Zaretskii wrote:
> 
> > > I suggest, indeed, to clean up the code so we could commit it to the
> > > master branch.  That way, it will get wider testing, and we can fix
> 
> > I had no time to work on the code itself, but
> >   • I fixed the formatting,
> >   • I pumped up the docs,
> >   • I put in the suggested eassert().
> 
> The variant I sent was too primitive: it did not cover a (common?)
> use case in which (with AltGr layouts) leftCtrl+rightCtrl behaves
> differently from pressing AltGr:
>    • leftCtrl+rightCtrl would trigger C-M-key;
>    • altGr would enter the character payload.
> 
> This update
> 
>   (0) fixes two formatting-style omissions;
> 
>   (A) adds A LOAD of new comments;
>   (B) treats such important cases (as above) separately;
> 
>   (z) marks a piece of old code which does not make any sense.
>         (see the last chunk in the relative patch)
> 
> Notes:
> 
>   • In (B), there are some decisions to make.  I encapsulate these
>     decisions into two strings.  For best result, these strings should
>     be user-customizable.  However, currently they are just put into
>     C #defines.
> 
>     When I sit on this more, and if these customizations turn out to
>     be useful, one can make them into Lisp variables.
> 
>   • There is a bug in the (old) Emacs code which prevents some cases
>     treated in (B) from being really useful.  I did not fix it yet.
> 
>     To see the bug:
>       ∘ switch to layout with AltGr;
>       ∘ assume that AltGr-s produces ß (as with US International);
>       ∘ pressing AltGr-rightControl-s produces Meta-ß;
>       ∘ pressing rightControl-AltGr-s produces C-M-s.
>     (I do not think this effect is intentional.)
> 
>   • And, BTW, is it documented anywhere that
>     leftControl-rightControl-key produces C-M-key?
> 
> I include two patches:
>   □   absolute (ignore the previous patches)
>   □   relative (with whitespace ignored) — for reading.

Thanks.  I committed this in your name, with a few minor stylistic
changes, and also fixed a few typos in the comments.  Sorry for the long
delay in doing that.

I also added a new variable, w32-use-fallback-wm-chars-method, which,
when non-nil, makes Emacs use the old code from before your changes.
This is meant to be a handy debugging aid, in case we discover some
issues with the new code.

Do you think there are any user-visible effects of your changes that
are worthy of mentioning in NEWS?  If so, please propose the text for
NEWS.

I leave it up to you to decide whether this bug should be closed, or
if there's something else to be done about it.

Thanks again for working on this.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#19994; Package emacs. (Sat, 01 Aug 2015 07:41:02 GMT) Full text and rfc822 format available.

Message #32 received at 19994 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: ilya <at> math.berkeley.edu
Cc: 19994 <at> debbugs.gnu.org
Subject: Re: bug#19994: 25.0.50; Unicode keyboard input on Windows
Date: Sat, 01 Aug 2015 10:40:05 +0300
> Date: Fri, 31 Jul 2015 12:23:00 +0300
> From: Eli Zaretskii <eliz <at> gnu.org>
> Cc: 19994 <at> debbugs.gnu.org
> 
> Thanks.  I committed this in your name, with a few minor stylistic
> changes, and also fixed a few typos in the comments.  Sorry for the long
> delay in doing that.
> 
> I also added a new variable, w32-use-fallback-wm-chars-method, which,
> when non-nil, makes Emacs use the old code from before your changes.
> This is meant to be a handy debugging aid, in case we discover some
> issues with the new code.
> 
> Do you think there are any user-visible effects of your changes that
> are worthy of mentioning in NEWS?  If so, please propose the text for
> NEWS.
> 
> I leave it up to you to decide whether this bug should be closed, or
> if there's something else to be done about it.

Here's one problem evidently caused by the new code: invoke "emacs -Q"
and type "M-x" after it starts => you will see "x" being inserted into
*scratch*.  This doesn't happen if w32-use-fallback-wm-chars-method is
non-nil.

This is a one-time problem: all the subsequent "M-x" are handled
correctly.  It sounds like some initialization somewhere is missing?

Could you please look into that ASAP?  TIA.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#19994; Package emacs. (Sun, 02 Aug 2015 14:43:02 GMT) Full text and rfc822 format available.

Message #35 received at 19994 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: ilya <at> math.berkeley.edu
Cc: 19994 <at> debbugs.gnu.org
Subject: Re: bug#19994: 25.0.50; Unicode keyboard input on Windows
Date: Sun, 02 Aug 2015 17:42:30 +0300
> Date: Sat, 01 Aug 2015 10:40:05 +0300
> From: Eli Zaretskii <eliz <at> gnu.org>
> Cc: 19994 <at> debbugs.gnu.org
> 
> Here's one problem evidently caused by the new code: invoke "emacs -Q"
> and type "M-x" after it starts => you will see "x" being inserted into
> *scratch*.  This doesn't happen if w32-use-fallback-wm-chars-method is
> non-nil.
> 
> This is a one-time problem: all the subsequent "M-x" are handled
> correctly.  It sounds like some initialization somewhere is missing?

I've found that the simple change below fixes this problem.  I
committed it; if you feel it's not the right fix, please propose an
alternative.

Thanks.

commit 0afb8fab99951262e81d6095302de4c84d7e8847
Author: Eli Zaretskii <eliz <at> gnu.org>
Date:   Sun Aug 2 17:40:19 2015 +0300

    Fix handling of 1st keystroke on MS-Windows
    
    * src/w32fns.c (globals_of_w32fns): Initialize after_deadkey to -1.
    This is needed to correctly handle the session's first keystroke,
    if it has any modifiers.  (Bug#19994)

diff --git a/src/w32fns.c b/src/w32fns.c
index 1c72974..31d23c4 100644
--- a/src/w32fns.c
+++ b/src/w32fns.c
@@ -9442,6 +9442,8 @@ typedef USHORT (WINAPI * CaptureStackBackTrace_proc) (ULONG, ULONG, PVOID *,
   else
     w32_unicode_gui = 0;
 
+  after_deadkey = -1;
+
   /* MessageBox does not work without this when linked to comctl32.dll 6.0.  */
   InitCommonControls ();
 
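
The commit message is brief about why the initial value matters.  A
minimal sketch of the general pattern follows, assuming (the thread does
not show this) that the variable holds the code of a pending dead key and
that a negative value means "none pending", which would match the
`is_dead >= 0' test in deliver_wm_chars().  The struct and function names
are hypothetical and this is not the Emacs code.

```c
#include <assert.h>
#include <stdbool.h>

/* Sentinel meaning "no dead key is pending".  */
#define SK_NO_DEADKEY (-1)

/* Hypothetical stand-in for the keyboard state kept between keypresses.  */
struct sk_key_state
{
  int pending_dead;   /* Code of the pending dead key, or SK_NO_DEADKEY.  */
};

/* Explicit initialization: without it, a zero-filled state would
   look as if a dead key with code 0 had just been pressed.  */
static void
sk_key_state_init (struct sk_key_state *ks)
{
  ks->pending_dead = SK_NO_DEADKEY;
}

/* Remember that dead key CODE was pressed.  */
static void
sk_note_deadkey (struct sk_key_state *ks, int code)
{
  assert (code >= 0);
  ks->pending_dead = code;
}

/* Return true if the previous keypress was a dead key.  */
static bool
sk_after_deadkey (const struct sk_key_state *ks)
{
  return ks->pending_dead >= 0;
}
```

Under these assumptions, a state that is merely zero-initialized reports
"after a dead key" for the very first keystroke, which is the kind of
one-time misbehavior described above; calling sk_key_state_init() first
avoids it.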




Reply sent to Stefan Kangas <stefan <at> marxist.se>:
You have taken responsibility. (Wed, 12 Aug 2020 16:33:03 GMT) Full text and rfc822 format available.

Notification sent to Ilya Zakharevich <nospam-abuse <at> ilyaz.org>:
bug acknowledged by developer. (Wed, 12 Aug 2020 16:33:04 GMT) Full text and rfc822 format available.

Message #40 received at 19994-done <at> debbugs.gnu.org (full text, mbox):

From: Stefan Kangas <stefan <at> marxist.se>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 19994-done <at> debbugs.gnu.org, ilya <at> math.berkeley.edu
Subject: Re: bug#19994: 25.0.50; Unicode keyboard input on Windows
Date: Wed, 12 Aug 2020 09:32:26 -0700
Eli Zaretskii <eliz <at> gnu.org> writes:

>> Date: Sat, 01 Aug 2015 10:40:05 +0300
>> From: Eli Zaretskii <eliz <at> gnu.org>
>> Cc: 19994 <at> debbugs.gnu.org
>>
>> Here's one problem evidently caused by the new code: invoke "emacs -Q"
>> and type "M-x" after it starts => you will see "x" being inserted into
>> *scratch*.  This doesn't happen if w32-use-fallback-wm-chars-method is
>> non-nil.
>>
>> This is a one-time problem: all the subsequent "M-x" are handled
>> correctly.  It sounds like some initialization somewhere is missing?
>
> I've found that the simple change below fixes this problem.  I
> committed it; if you feel it's not the right fix, please propose an
> alternative.

It seems like the patch here was installed, an additional fix was
committed, and there has been no further progress within 5 years.

I'm therefore closing this bug report.

Best regards,
Stefan Kangas




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 10 Sep 2020 11:24:15 GMT) Full text and rfc822 format available.
