GNU bug report logs - #18420
24.3; interaction with external process hangs emacs

Previous Next

Package: emacs;

Reported by: stephen_leake <at> stephe-leake.org

Date: Sun, 7 Sep 2014 08:06:01 UTC

Severity: normal

Found in version 24.3

Done: Eli Zaretskii <eliz <at> gnu.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 18420 in the body.
You can then email your comments to 18420 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#18420; Package emacs. (Sun, 07 Sep 2014 08:06:01 GMT) Full text and rfc822 format available.

Acknowledgement sent to stephen_leake <at> stephe-leake.org:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Sun, 07 Sep 2014 08:06:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: stephen_leake <at> stephe-leake.org
To: bug-gnu-emacs <at> gnu.org
Subject: 24.3; interaction with external process hangs emacs
Date: Sun, 07 Sep 2014 03:04:51 -0500
While developing an external parser for Emacs Ada Mode 5.x, I ran into a
situation where Emacs is hung while sending to and receiving from the
external parser; C-g does not recover. Separately killing the external
process does recover.

The bug with the full Emacs Ada mode and full external parser is quite
repeatable; it happens every time the
`wisi-ada-parse-session-send-parse' command is invoked on a sufficiently
large buffer, but only with the full development sources. I tried to
produce a simplified elisp and external code, but those do not reproduce
the bug.

The elisp that hangs is:

(defun wisi-ada-parse-session-send-parse ()
  "Send a parse command to ada_mode_wisi_parse session, followed by the contents of the current buffer."
  (wisi-ada-parse-require-session)

  ;; ada_mode_wisi_parse can't handle non-ASCII, so we don't need string-bytes here.
  (let* ((buf-string (buffer-substring-no-properties (point-min) (point-max)))
	 (byte-count-img (format "%d" (1- (length buf-string))))
	 (cmd (concat "parse " byte-count-img))
	 (msg (format "%02d%s" (length cmd) cmd))
	 (process (wisi-ada-parse--session-process wisi-ada-parse-session)))
    (when (> wisi-ada-parse-debug 0)
      (wisi-ada-parse-show-buffer)
      (message msg))
    (with-current-buffer (wisi-ada-parse--session-buffer wisi-ada-parse-session)
      (erase-buffer))
    (process-send-string process msg)

    ;; WORKAROUND: process-send-string is supposed to allow accepting
    ;; process output while writing to process input, but apparently
    ;; it doesn't, at least on windows via pipes. So break up
    ;; buf-string, and call accept-process-output explicitly
    (let ((bytes-per-cycle 2047)
	  (bytes-to-send (length buf-string))
	  (first 0)
	  (last 0))
      (while (< last bytes-to-send)
	(setq last (min bytes-to-send (+ first bytes-per-cycle)))
	(message "%d" last)
	(process-send-string process (substring buf-string first last))
	(setq first (1+ last))
	(accept-process-output process 0.1)))
    (wisi-ada-parse-session-wait)
    ))

It hangs on one of the process-send-string calls in the WORKAROUND loop.

The loop is replacing an earlier version that used a single
process-send-string to send the entire buffer string. I was hoping that
the explicit accept-process-out calls would allow C-g to work, but
apparently not.

The external process implements an LALR parser for Ada source; it
outputs the parse results back to Emacs. There is a lot of output, so
it can easily fill up the IO queue. Attaching to that process when
Emacs is hung shows it is blocked writing to stdout, in a normal part
of the code flow.

Attaching the debugger I have to the Emacs process shows that it is
also blocked in WriteFile, called from emacs_write, but I can't see
any farther up the stack; the debugger says the stack is corrupt. I
don't have debugging symbols, so I can't tell where in Emacs write it
is.

I'm working on building Emacs from the pre-release source, and
installing a compatible debugger.


In GNU Emacs 24.3.1 (i386-mingw-nt6.1.7601)
 of 2013-03-17 on MARVIN
Windowing system distributor `Microsoft Corp.', version 6.1.7601
Configured using:
 `configure --with-gcc (4.7) --cflags
 -ID:/devel/emacs/libs/libXpm-3.5.8/include
 -ID:/devel/emacs/libs/libXpm-3.5.8/src
 -ID:/devel/emacs/libs/libpng-dev_1.4.3-1/include
 -ID:/devel/emacs/libs/zlib-dev_1.2.5-2/include
 -ID:/devel/emacs/libs/giflib-4.1.4-1/include
 -ID:/devel/emacs/libs/jpeg-6b-4/include
 -ID:/devel/emacs/libs/tiff-3.8.2-1/include
 -ID:/devel/emacs/libs/gnutls-3.0.9/include
 -ID:/devel/emacs/libs/libiconv-1.13.1-1-dev/include
 -ID:/devel/emacs/libs/libxml2-2.7.8/include/libxml2'

Important settings:
  value of $LANG: en_US.UTF-8
  locale-coding-system: cp1252
  default enable-multibyte-characters: t

Major mode: Ada

Minor modes in effect:
  shell-dirtrack-mode: t
  diff-auto-refine-mode: t
  display-time-mode: t
  delete-selection-mode: t
  tooltip-mode: t
  mouse-wheel-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  column-number-mode: t
  line-number-mode: t
  transient-mark-mode: t

Recent input:
<up> <up> <up> <up> <up> <up> <up> <up> <up> <up> <up>
<up> <up> <up> <up> <up> <up> <up> <up> <up> <up> <up>
<up> <up> <up> <up> <up> <up> <up> <up> <up> <up> <up>
<up> <up> <up> <up> <up> <up> <up> <up> <up> <up> <up>
<up> <up> <up> <up> <up> <up> <up> <up> <up> <up> <up>
<C-M-up> <f11> . a d s <backspace> d b <backspace>
<backspace> b <return> <up> <up> <M-right> <M-right>
<M-right> <S-left> 2 <S-left> 3 <f5> <C-M-down> <f5>
<up> <return> <C-M-up> <M-home> <down> <down> <down>
<down> <down> <down> <down> <down> <down> <down> <down>
<down> <down> <down> <down> <down> <down> <down> <down>
<down> <down> <down> <down> <down> <down> <down> <down>
<down> <M-right> <M-right> <M-right> <M-right> <M-right>
<M-right> <S-left> 3 <M-down> <M-down> <M-down> <M-down>
<M-down> <M-down> <down> <down> <down> <down> <down>
<down> <down> <down> <down> <down> <return> <home>
C-SPC <down> <down> <down> <down> <down> <M-delete>
<insert> <up> <up> <up> <up> <up> <up> <up> <up> <up>
<M-right> <M-right> <M-right> <S-left> 1 <end> <return>
- - SPC SPC <up> <M-right> <M-right> <S-left> - <backspace>
0 <down> E m a c s SPC d o e s n ' t SPC h a v e SPC
a SPC c h a n g <backspace> c e SPC t o SPC r e a d
SPC t h e SPC t e x t SPC w e SPC j u s t SPC o u t
p u t SPC b e f o r e SPC e <backspace> w e SPC d i
e . <f4> M-w <f5> y e s <return> <lwindow> <switch-frame>
<switch-frame> <f6> <down> <down> <down> <down> <down>
<down> <down> <down> <down> <down> <down> <M-right>
<right> <S-right> 3 <f6> <f6> <down> <down> <down>
<down> <M-right> <right> <right> SPC | SPC 3 <f5> M-x
r e p o r <tab> <return>

Recent messages:
(No files need saving)
Mark set [3 times]
Saving file c:/home/stephe/Emacs_Work/background_proc_hangs_emacs/ada_mode_wisi_parse.adb...
Wrote c:/home/stephe/Emacs_Work/background_proc_hangs_emacs/ada_mode_wisi_parse.adb
(No files need saving)
Compilation killed
Compilation exited abnormally with code 2
Saving file c:/home/stephe/Emacs_Work/background_proc_hangs_emacs/ada_mode_wisi_parse.adb...
Wrote c:/home/stephe/Emacs_Work/background_proc_hangs_emacs/ada_mode_wisi_parse.adb
Compilation finished

Load-path shadows:
/Projects/org.emacs.ada-mode.stephe-1/ada-mode hides c:/Apps/emacs-24.3/lisp/progmodes/ada-mode

Features:
(shadow mail-extr emacsbug message format-spec rfc822 mml mml-sec
mm-decode mm-bodies mm-encode mail-parse rfc2231 mailabbrev gmm-utils
mailheader sendmail rfc2047 rfc2045 ietf-drums mail-utils
dvc-persistence dvc-config dvc-propagate dvc-state debug edebug bzr-dvc
bzr dvc-annotate dvc-revlist etags else-mode cc-langs cc-mode cc-fonts
cc-guess cc-menus cc-cmds cc-styles cc-align cc-engine cc-vars cc-defs
find-func help-mode hippie-exp gud sort rect wisi-parse copyright tabify
misearch multi-isearch shell pcomplete gpr-query make-mode xmtn-dvc
dvc-status dvc-diff dvc-fileinfo diff-mode xmtn-conflicts xmtn-ids
xmtn-match xmtn-automate xmtn-compat parse-time xmtn-run xmtn-basic-io
xmtn-base ada-skel ada-skeletons skeleton ada-compiler ada-gnat-compile
ada-xref-tool ada-gnat-xref gnat-core wisi-ada-parse ada-indent-engine
ada-wisi ada-indent-user-options ada_grammar-wy wisi-compile
semantic/wisent/comp semantic/wisent semantic/wisent/wisent
semantic/util-modes semantic/util semantic semantic/tag cedet
semantic/lex semantic/fw mode-local wisi wisi-compat-24.2
wisi-parse-common ada-fix-error ada-build ada-mode find-file align
xgit-core bzr-core xmtn-minimal dvc-log log-edit pcvs-util vc
vc-dispatcher ediff-merg ediff-diff ediff-wind ediff-help ediff-util
ediff-mult ediff-init ediff dvc-unified dvc-tips ffap url-parse
auth-source eieio byte-opt bytecomp byte-compile cconv gnus-util mm-util
mail-prsvr password-cache url-vars add-log dvc-autoloads dvc-core
dvc-lisp dvc-buffers dvc-ui dvc-register dvc-utils dvc-emacs ewoc
dvc-defs dvc-site uniquify advice help-fns advice-preload time delsel
cus-start cus-load color-theme cl-macs gv edmacro kmacro wid-edit cl
cl-lib noutline outline easy-mmode iswitchb whitespace dired-x dired-aux
dired compile comint ansi-color ring bbdb-autoloads info easymenu
package time-date tooltip ediff-hook vc-hooks lisp-float-type mwheel
dos-w32 ls-lisp w32-common-fns disp-table w32-win w32-vars tool-bar dnd
fontset image regexp-opt fringe tabulated-list newcomment lisp-mode
register page menu-bar rfn-eshadow timer select scroll-bar mouse
jit-lock font-lock syntax facemenu font-core frame cham georgian
utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao korean
japanese hebrew greek romanian slovak czech european ethiopic indian
cyrillic chinese case-table epa-hook jka-cmpr-hook help simple abbrev
minibuffer loaddefs button faces cus-face macroexp files text-properties
overlay sha1 md5 base64 format env code-pages mule custom widget
hashtable-print-readable backquote make-network-process w32 multi-tty
emacs)




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#18420; Package emacs. (Sun, 07 Sep 2014 15:39:02 GMT) Full text and rfc822 format available.

Message #8 received at 18420 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: stephen_leake <at> stephe-leake.org
Cc: 18420 <at> debbugs.gnu.org
Subject: Re: bug#18420: 24.3; interaction with external process hangs emacs
Date: Sun, 07 Sep 2014 18:38:49 +0300
> Date: Sun, 07 Sep 2014 03:04:51 -0500
> From: stephen_leake <at> stephe-leake.org
> 
>     ;; WORKAROUND: process-send-string is supposed to allow accepting
>     ;; process output while writing to process input, but apparently
>     ;; it doesn't, at least on windows via pipes. So break up
>     ;; buf-string, and call accept-process-output explicitly
>     (let ((bytes-per-cycle 2047)
> 	  (bytes-to-send (length buf-string))
> 	  (first 0)
> 	  (last 0))
>       (while (< last bytes-to-send)
> 	(setq last (min bytes-to-send (+ first bytes-per-cycle)))
> 	(message "%d" last)
> 	(process-send-string process (substring buf-string first last))
> 	(setq first (1+ last))
> 	(accept-process-output process 0.1)))
>     (wisi-ada-parse-session-wait)
>     ))
> 
> It hangs on one of the process-send-string calls in the WORKAROUND loop.

Did you try replacing accept-process-output with sit-for?

How about enlarging the 2nd argument to accept-process-output -- did
you try that, and if so, did it have any effect?

> The loop is replacing an earlier version that used a single
> process-send-string to send the entire buffer string. I was hoping that
> the explicit accept-process-out calls would allow C-g to work, but
> apparently not.

I cannot easily interpret this, too much info is missing.  For
starters, if the parser gets only part of its expected input (whose
size you are apparently passing via command-line arguments), would it
start parsing, or will it wait for the rest?  If the latter, then the
way you've broken the buffer string into chunks will not change
anything, right?

> The external process implements an LALR parser for Ada source; it
> outputs the parse results back to Emacs. There is a lot of output, so
> it can easily fill up the IO queue. Attaching to that process when
> Emacs is hung shows it is blocked writing to stdout, in a normal part
> of the code flow.

Can you tell how much did it write to the pipe at that point?

> Attaching the debugger I have to the Emacs process shows that it is
> also blocked in WriteFile, called from emacs_write, but I can't see
> any farther up the stack; the debugger says the stack is corrupt. I
> don't have debugging symbols, so I can't tell where in Emacs write it
> is.

If you didn't have debugging symbols, GDB wouldn't be able to show you
symbols like emacs_write and WriteFile.

In any case, it sounds like your binary is optimized, in which case
backtraces are not useful at best, and downright misleading at worst.

> I'm working on building Emacs from the pre-release source, and
> installing a compatible debugger.

Be sure to build with CFLAGS='-O0 -g3'.

What is your versions of GCC, Binutils, MinGW runtime, and w32 API
headers, btw?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#18420; Package emacs. (Mon, 08 Sep 2014 17:02:02 GMT) Full text and rfc822 format available.

Message #11 received at 18420 <at> debbugs.gnu.org (full text, mbox):

From: Stephen Leake <stephen_leake <at> stephe-leake.org>
To: 18420 <at> debbugs.gnu.org
Subject: Re: bug#18420: 24.3; interaction with external process hangs emacs
Date: Mon, 08 Sep 2014 12:00:42 -0500
A few more data points:

- The hang occurs with emacs -Q

- The hang does not occur with Cygwin Emacs 23.4 (so I can debug the
  external process, at least :)

- I've built Emacs 23.3.93 pretest from source, with debugging
  information. The hang occurs there, and an attached debugger shows a
  full stack trace when it hangs:

(gdb) thread 1
[Switching to thread 1 (Thread 5552.0x864)]
#0  0x7709f93d in ntdll!ZwWriteFile () from C:\Windows\SysWOW64\ntdll.dll
(gdb) where
#0  0x7709f93d in ntdll!ZwWriteFile () from C:\Windows\SysWOW64\ntdll.dll
#1  0x755912cc in WriteFile () from C:\Windows\syswow64\kernel32.dll
#2  0x74b94035 in msvcrt!fprintf () from C:\Windows\syswow64\msvcrt.dll
#3  0x74b940eb in write () from C:\Windows\syswow64\msvcrt.dll
#4  0x01184f8f in sys_write (fd=<optimized out>, fd <at> entry=8, 
    buffer=<optimized out>, buffer <at> entry=0xff1b00c, count=8969928, 
    count <at> entry=7273) at w32.c:8337
#5  0x010b12a8 in emacs_full_write (fildes=8, 
    buf=buf <at> entry=0xff1b00c "--  Abstract :\n--\n--  Emacs background process for Ada mode; parse buffer text, return wisi Actions.\n--\n--  Copyright (C) 2014  All Rights Reserved.\n--\n--  This program is free software; you can redis"..., nbyte=nbyte <at> entry=7273, process_signals=true) at sysdep.c:2297
#6  0x010b23f9 in emacs_write_sig (fildes=<optimized out>, 
    buf=buf <at> entry=0xff1b00c, nbyte=nbyte <at> entry=7273) at sysdep.c:2334
#7  0x01141f75 in send_process (proc=267455925, buf=<optimized out>, 
    len=<optimized out>, object=<optimized out>, object <at> entry=266841793)
    at process.c:5528
#8  0x01142325 in Fprocess_send_string (process=267455925, string=266841793)
    at process.c:5645
#9  0x01105430 in Ffuncall (nargs=nargs <at> entry=3, args=args <at> entry=0x88e8d4)
    at eval.c:2818
#10 0x0113890f in exec_byte_code (bytestr=<optimized out>, vector=260525805, 
    maxdepth=16, args_template=81815586, nargs=nargs <at> entry=0, 
    args=<optimized out>, args <at> entry=0x0) at bytecode.c:916
#11 0x01104ed5 in funcall_lambda (fun=260521925, nargs=nargs <at> entry=0, 
    arg_vector=arg_vector <at> entry=0x88ea38) at eval.c:3049
#12 0x0110521b in Ffuncall (nargs=nargs <at> entry=1, args=args <at> entry=0x88ea34)
    at eval.c:2876
#13 0x0113890f in exec_byte_code (bytestr=<optimized out>, vector=260529781, 
    maxdepth=20, args_template=81815586, nargs=nargs <at> entry=0, 
    args=<optimized out>, args <at> entry=0x0) at bytecode.c:916
#14 0x01104ed5 in funcall_lambda (fun=260526045, nargs=nargs <at> entry=0, 
    arg_vector=arg_vector <at> entry=0x88eb98) at eval.c:3049
#15 0x0110521b in Ffuncall (nargs=nargs <at> entry=1, args=args <at> entry=0x88eb94)
    at eval.c:2876
#16 0x0113890f in exec_byte_code (bytestr=<optimized out>, vector=255630965, 
    maxdepth=16, args_template=81815586, nargs=nargs <at> entry=0, 
    args=<optimized out>, args <at> entry=0x0) at bytecode.c:916
#17 0x01104ed5 in funcall_lambda (fun=255565773, nargs=nargs <at> entry=0, 
    arg_vector=arg_vector <at> entry=0x88ecf8) at eval.c:3049
#18 0x0110521b in Ffuncall (nargs=nargs <at> entry=1, args=args <at> entry=0x88ecf4)
    at eval.c:2876
#19 0x0113890f in exec_byte_code (bytestr=<optimized out>, 
    bytestr <at> entry=266875825, vector=vector <at> entry=255565797, 
    maxdepth=maxdepth <at> entry=8, args_template=81815586, nargs=nargs <at> entry=0, 
    args=<optimized out>, args <at> entry=0x0) at bytecode.c:916
#20 0x0113b418 in Fbyte_code (bytestr=266875825, vector=255565797, maxdepth=8)
    at bytecode.c:482
#21 0x011048a7 in eval_sub (form=264366286) at eval.c:2191
#22 0x01107b66 in internal_lisp_condition_case (var=83517026, 
    bodyform=<optimized out>, handlers=<optimized out>) at eval.c:1323
#23 0x01139983 in exec_byte_code (bytestr=<optimized out>, vector=255631045, 
    maxdepth=24, args_template=81815586, nargs=nargs <at> entry=0, 
    args=<optimized out>, args <at> entry=0x0) at bytecode.c:1162
#24 0x01104ed5 in funcall_lambda (fun=255631149, nargs=nargs <at> entry=1, 
    arg_vector=arg_vector <at> entry=0x88f028) at eval.c:3049
#25 0x0110521b in Ffuncall (nargs=nargs <at> entry=2, args=args <at> entry=0x88f024)
    at eval.c:2876
#26 0x0113890f in exec_byte_code (bytestr=<optimized out>, vector=255917973, 
    maxdepth=8, args_template=81815586, nargs=nargs <at> entry=0, 
    args=<optimized out>, args <at> entry=0x0) at bytecode.c:916
#27 0x01104ed5 in funcall_lambda (fun=255917989, nargs=nargs <at> entry=0, 
    arg_vector=arg_vector <at> entry=0x88f194) at eval.c:3049
#28 0x0110521b in Ffuncall (nargs=nargs <at> entry=1, args=args <at> entry=0x88f190)
    at eval.c:2876
#29 0x01106837 in apply1 (fn=fn <at> entry=263136994, arg=arg <at> entry=81815586)
    at eval.c:2581
#30 0x0110140f in Fcall_interactively (function=263136994, 
    record_flag=84547650, keys=81836917) at callint.c:378
#31 0x01105419 in Ffuncall (nargs=nargs <at> entry=4, args=args <at> entry=0x88f2ac)
    at eval.c:2822
#32 0x0113890f in exec_byte_code (bytestr=<optimized out>, vector=19070677, 
    maxdepth=52, args_template=args_template <at> entry=4100, nargs=nargs <at> entry=2, 
    args=<optimized out>, args <at> entry=0x88f428) at bytecode.c:916
#33 0x01104f62 in funcall_lambda (fun=19070637, nargs=nargs <at> entry=2, 
    arg_vector=arg_vector <at> entry=0x88f428) at eval.c:2983
#34 0x0110521b in Ffuncall (nargs=nargs <at> entry=3, args=args <at> entry=0x88f424)
    at eval.c:2876
#35 0x0113890f in exec_byte_code (bytestr=<optimized out>, vector=19070349, 
    maxdepth=60, args_template=args_template <at> entry=2052, nargs=nargs <at> entry=2, 
    args=<optimized out>, args <at> entry=0x88f594) at bytecode.c:916
#36 0x01104f62 in funcall_lambda (fun=19070301, nargs=nargs <at> entry=2, 
    arg_vector=arg_vector <at> entry=0x88f594) at eval.c:2983
#37 0x0110521b in Ffuncall (nargs=3, args=args <at> entry=0x88f590) at eval.c:2876
#38 0x01106613 in Fapply (nargs=nargs <at> entry=2, args=args <at> entry=0x88f5f8)
    at eval.c:2354
#39 0x0110681f in apply1 (fn=fn <at> entry=82018402, arg=arg <at> entry=263985686)
    at eval.c:2588
#40 0x0110140f in Fcall_interactively (function=82018402, 
    record_flag=81815586, keys=81836917) at callint.c:378
#41 0x01105419 in Ffuncall (nargs=nargs <at> entry=4, args=args <at> entry=0x88f72c)
    at eval.c:2822
#42 0x0113890f in exec_byte_code (bytestr=<optimized out>, vector=19070677, 
    maxdepth=52, args_template=args_template <at> entry=4100, nargs=nargs <at> entry=1, 
    args=<optimized out>, args <at> entry=0x88f89c) at bytecode.c:916
#43 0x01104f62 in funcall_lambda (fun=19070637, nargs=nargs <at> entry=1, 
    arg_vector=arg_vector <at> entry=0x88f89c) at eval.c:2983
#44 0x0110521b in Ffuncall (nargs=nargs <at> entry=2, args=args <at> entry=0x88f898)
    at eval.c:2876
#45 0x01105587 in call1 (fn=81861730, arg1=82018402) at eval.c:2614
#46 0x010a73e7 in command_loop_1 () at keyboard.c:1559
#47 0x01103963 in internal_condition_case (
    bfun=bfun <at> entry=0x10a70c0 <command_loop_1>, handlers=81867042, 
    hfun=hfun <at> entry=0x109ed50 <cmd_error>) at eval.c:1354
#48 0x0109a610 in command_loop_2 (ignore=81815586) at keyboard.c:1177
#49 0x0110386d in internal_catch (tag=81862402, 
    func=func <at> entry=0x109a5f0 <command_loop_2>, arg=81815586) at eval.c:1118
#50 0x0109e9ba in command_loop () at keyboard.c:1156
#51 recursive_edit_1 () at keyboard.c:777
#52 0x0109ec8e in Frecursive_edit () at keyboard.c:848
#53 0x011b7dd8 in main (argc=<optimized out>, argv=0xd31b40) at emacs.c:1646

Is there any way to get lisp names from the Ffuncall entries? That would
be nice.

-- 
-- Stephe




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#18420; Package emacs. (Tue, 09 Sep 2014 03:11:01 GMT) Full text and rfc822 format available.

Message #14 received at 18420 <at> debbugs.gnu.org (full text, mbox):

From: Stephen Leake <stephen_leake <at> stephe-leake.org>
To: 18420 <at> debbugs.gnu.org
Subject: Re: bug#18420: 24.3; interaction with external process hangs emacs
Date: Mon, 08 Sep 2014 22:10:26 -0500
More info from the debugger on the hung emacs. There are 6 threads active:

thread 6 : gdb interface
thread 5 : child process reader_thread
    used by select emulation to check for child process output
    can't see contents of 'cp' object; optimized out
        the only child process of this emacs instance is the external
        parser; ada_mode_wisi_parse

    hung at "Wait until our input is acknowledged before reading again"
        waiting for 'select()' to ack?

thread 4 : windows message handler
thread 3 : itimer thread
thread 2 : waiting on some startup?
thread 1 : main, runing lisp code, process_send_string, emacs_full_write, sys_write _write
    sys_write can only be interrupted by interrupts/signals
    emacs_full_write checks for that, handles them
    should allow for read write, unless it there's a race condition

So there is a deadlock:

thread 5 is waiting for thread 1 to ack
thread 1 is waiting for child process to read
child process is waiting for thread 5 to read
                
We need to understand how we get here.

-- 
-- Stephe




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#18420; Package emacs. (Tue, 09 Sep 2014 04:20:02 GMT) Full text and rfc822 format available.

Message #17 received at 18420 <at> debbugs.gnu.org (full text, mbox):

From: Stephen Leake <stephen_leake <at> stephe-leake.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 18420 <at> debbugs.gnu.org
Subject: Re: bug#18420: 24.3; interaction with external process hangs emacs
Date: Mon, 08 Sep 2014 23:18:39 -0500
Eli Zaretskii <eliz <at> gnu.org> writes:

> Did you try replacing accept-process-output with sit-for?

The current code has:

    (accept-process-output process 0.1)

This waits up to 0.1 seconds for output from the child process
ada_mode_wisi_parse (or any other process, but there are none in this
setup). In practice in this scenario, it returns much sooner than 0.1
seconds; there is some output very quickly after sending any part of
the buffer contents.

Replaced with:

    (sit-for 0.1)

This performs redisplay, then waits for SECONDS seconds or until "input"
is available. The wait is done with 'read-event', which reads from "the
input stream". So apparently this means user input, not child process
input.

This fixed the hang. There is no user input during this process; it's
too fast. The test buffer is 7274 bytes long, so there are three cycles
in the send-parse loop. Apparently what happens is each cycle
waits for 0.1 seconds, giving the child process enough time to process
all of that input. Then Emacs wakes up, processes all of the child
process output, resumes send-parse, and sends the next chunk.

This works, but is essentially a race condition between the sit-for
timer and the child process. It will be too slow for general use; I have
clients with 250,000 byte buffers, which would take 12.2 seconds (slower
than the current elisp parser). Obviously I can reduce the 0.1, but that
runs into the race condition and the deadlock. 0.01 would be fast
enough for the largest buffer, so I tried:

    (sit-for 0.01)

This hangs, in the same deadlock described above. So we need a fix for
the deadlock.

I can use (sit-for 0.1) as a workaround while I continue debugging the
rest of the problems I have, on reasonably sized buffers.

There is another workaround I can use; queue all output in
ada_mode_wisi_parse until all input is read and parsing is done, then
send it all. That would lose parallel processing time, which is
significant with today's processors. But I'll use it if we can't fix
this soon.

> How about enlarging the 2nd argument to accept-process-output -- did
> you try that, and if so, did it have any effect?

tried:

    (accept-process-output 1.0)

This hangs. Since (sit-for 0.1) allows enough time for the child process
to process all of one chunk, this is expected.

>> The loop is replacing an earlier version that used a single
>> process-send-string to send the entire buffer string. I was hoping that
>> the explicit accept-process-out calls would allow C-g to work, but
>> apparently not.
>
> I cannot easily interpret this, too much info is missing.  For
> starters, if the parser gets only part of its expected input (whose
> size you are apparently passing via command-line arguments), would it
> start parsing, or will it wait for the rest?

It starts parsing immediately, and generates some output quickly. The
byte count indicates when to stop parsing.

> If the latter, then the
> way you've broken the buffer string into chunks will not change
> anything, right?

It doesn't change the child process behavior. I was hoping it would
change the emacs side, allowing it to at least check for C-g every 0.1
seconds. Hmm, I'm not clear that accept-process-input checks for input
events. Perhaps we need a combined sit-for/accept-process-input?

>> The external process implements an LALR parser for Ada source; it
>> outputs the parse results back to Emacs. There is a lot of output, so
>> it can easily fill up the IO queue. Attaching to that process when
>> Emacs is hung shows it is blocked writing to stdout, in a normal part
>> of the code flow.
>
> Can you tell how much did it write to the pipe at that point?

No, that info is not tracked in the current code. I can add it.

The parser has read 2935 bytes.

However, there is some variation in precisely when the hang occurs; the
wisi-ada-parse-send-parse loop outputs the sent bytes each cycle, and
that number changes with each run (among the 3 possible choices for this
size buffer).

> What is your versions of GCC, Binutils, MinGW runtime, and w32 API
> headers, btw?

gcc for Emacs: gcc.exe (Rev3, Built by MSYS2 project) 4.9.1

binutils: 2.24.78559.d43808f-1 (msys2 package)

MinGW: not sure what this means; the runtime is just a bunch of
packages. For one: libgc 7.2.d-1

w32 API headers:

/msys64/mingw32/i686-w64-mingw32/include/windows.h has no version number

mingw-w64-i686-headers-git 4.0.0.4210.0177153-1

Hope that helps.

--
-- Stephe




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#18420; Package emacs. (Tue, 09 Sep 2014 14:18:01 GMT) Full text and rfc822 format available.

Message #20 received at 18420 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stephen Leake <stephen_leake <at> stephe-leake.org>
Cc: 18420 <at> debbugs.gnu.org
Subject: Re: bug#18420: 24.3; interaction with external process hangs emacs
Date: Tue, 09 Sep 2014 17:17:12 +0300
> From: Stephen Leake <stephen_leake <at> stephe-leake.org>
> Date: Mon, 08 Sep 2014 22:10:26 -0500
> 
> thread 6 : gdb interface
> thread 5 : child process reader_thread
>     used by select emulation to check for child process output
>     can't see contents of 'cp' object; optimized out
>         the only child process of this emacs instance is the external
>         parser; ada_mode_wisi_parse
> 
>     hung at "Wait until our input is acknowledged before reading again"
>         waiting for 'select()' to ack?
> 
> thread 4 : windows message handler
> thread 3 : itimer thread
> thread 2 : waiting on some startup?
> thread 1 : main, runing lisp code, process_send_string, emacs_full_write, sys_write _write
>     sys_write can only be interrupted by interrupts/signals
>     emacs_full_write checks for that, handles them
>     should allow for read write, unless it there's a race condition
> 
> So there is a deadlock:
> 
> thread 5 is waiting for thread 1 to ack
> thread 1 is waiting for child process to read
> child process is waiting for thread 5 to read

Almost: the thread that reads is not thread 5 (the "reader thread"),
it's thread 1, the main (a.k.a. "Lisp") thread.  The reader thread
reads only a single byte from the pipe, and when it succeeds to do so,
it signals the main thread that input is available.  That signal is
supposed to be noticed by sys_select (the Windows implementation of
pselect), and then the main thread should read that stuff as usual.

But since the main thread is blocked inside a 'write' call, it never
gets to sys_select, never notices that input is available, and doesn't
read it.

> We need to understand how we get here.

It turns out to be very simple: the current machinery of handling the
pipe I/O vis-a-vis subprocesses is not designed well enough to
robustly handle the case of massive exchange of material in both
directions.  Recall that the subprocesses with which Emacs exchanges
data, such as the speller and GDB, only send relatively small amounts
of data each way.  And subprocesses that send large quantities of
data, such as Grep, only use the pipe in one direction.  So we never
see these deadlocks in "normal" operations.

Add to this the fact that the default buffer size of the pipe on
Windows is 4K, and you will understand how this deadlock happens.

I can easily get Emacs to deadlock on Windows with the following
simple Lisp:

(defun pipe-torture (file)
  ""
  (let ((proc (start-process "pipe-reader" "pipe-read" "cat.exe"))
	pt)
    (with-temp-buffer
      (insert-file-contents file)
      (setq pt (goto-char (point-min)))
      (while (< pt (point-max))
	(process-send-region proc pt (min (+ pt 8000) (point-max)))
	(message "wrote %d" (1- (min (+ pt 8000) (point-max))))
	;(sit-for 1)
	(setq pt (min (+ pt 8000) (point-max)))))))

;; Invoke with (pipe-torture "src/xdisp.c")

Uncommenting sit-for lets this run without locking up, even if I
reduce the amount of time to wait to 0.02 sec, but it does hang if I
set that to 0.01 sec.

Of course, 'cat' is very light on processing its input before it sends
it back, so the 0.02 sec might not be good enough for subprocesses
with heavier processing.

Anyway, it turns out we can make our pipes non-blocking on write,
which then allows us to use the EWOULDBLOCK/EAGAIN handling in
send_process.  Please try the patch below, and if it gives good
results, I will install it shortly.  Using it, I can pipe the whole
xdisp.c to 'cat' even if I increase the chunk size to 400KB, and leave
sit-for commented out.

--- src/w32.c~0	2014-08-26 15:59:22 +0300
+++ src/w32.c	2014-09-09 16:42:47 +0300
@@ -7720,15 +7720,15 @@ fcntl (int s, int cmd, int options)
   if (cmd == F_DUPFD_CLOEXEC)
     return sys_dup (s);
 
-  if (winsock_lib == NULL)
-    {
-      errno = ENETDOWN;
-      return -1;
-    }
-
   check_errno ();
   if (fd_info[s].flags & FILE_SOCKET)
     {
+      if (winsock_lib == NULL)
+	{
+	  errno = ENETDOWN;
+	  return -1;
+	}
+
       if (cmd == F_SETFL && options == O_NONBLOCK)
 	{
 	  unsigned long nblock = 1;
@@ -7745,6 +7745,29 @@ fcntl (int s, int cmd, int options)
 	  return SOCKET_ERROR;
 	}
     }
+  else if ((fd_info[s].flags & (FILE_PIPE | FILE_WRITE))
+	   == (FILE_PIPE | FILE_WRITE))
+    {
+      /* Force our writes to pipes be non-blocking.  */
+      if (cmd == F_SETFL && options == O_NONBLOCK)
+	{
+	  HANDLE h = (HANDLE)_get_osfhandle (s);
+	  DWORD pipe_mode = PIPE_NOWAIT;
+
+	  if (!SetNamedPipeHandleState (h, &pipe_mode, NULL, NULL))
+	    {
+	      DebPrint (("SetNamedPipeHandleState: %lu\n", GetLastError ()));
+	      return SOCKET_ERROR;
+	    }
+	  fd_info[s].flags |= FILE_NDELAY;
+	  return 0;
+	}
+      else
+	{
+	  errno = EINVAL;
+	  return SOCKET_ERROR;
+	}
+    }
   errno = ENOTSOCK;
   return SOCKET_ERROR;
 }
@@ -8371,6 +8394,21 @@ sys_write (int fd, const void * buffer,
 	  nchars += n;
 	  if (n < 0)
 	    {
+	      /* When there's no buffer space in a pipe that is in
+		 non-blocking mode, _write returns ENOSPC.  We return
+		 EAGAIN instead, which should trigger the logic in
+		 send_process that enters waiting loop and calls
+		 wait_reading_process_output to allow process input to
+		 be accepted during the wait.  Those calls to
+		 wait_reading_process_output allow sys_select to
+		 notice when process input becomes available, thus
+		 avoiding deadlock whereby each side of the pipe is
+		 blocked on write.  */
+	      if (errno == ENOSPC
+		  && fd < MAXDESC
+		  && ((fd_info[fd].flags & (FILE_PIPE | FILE_NDELAY))
+		      == (FILE_PIPE | FILE_NDELAY)))
+		errno = EAGAIN;
 	      nchars = n;
 	      break;
 	    }




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#18420; Package emacs. (Tue, 09 Sep 2014 14:27:01 GMT) Full text and rfc822 format available.

Message #23 received at 18420 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stephen Leake <stephen_leake <at> stephe-leake.org>
Cc: 18420 <at> debbugs.gnu.org
Subject: Re: bug#18420: 24.3; interaction with external process hangs emacs
Date: Tue, 09 Sep 2014 17:26:45 +0300
> From: Stephen Leake <stephen_leake <at> stephe-leake.org>
> Cc: 18420 <at> debbugs.gnu.org
> Date: Mon, 08 Sep 2014 23:18:39 -0500
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> > Did you try replacing accept-process-output with sit-for?
> 
> The current code has:
> 
>     (accept-process-output process 0.1)
> 
> This waits up to 0.1 seconds for output from the child process
> ada_mode_wisi_parse (or any other process, but there are none in this
> setup). In practice in this scenario, it returns much sooner than 0.1
> seconds; there is some output very quickly after sending any part of
> the buffer contents.
> 
> Replaced with:
> 
>     (sit-for 0.1)
> 
> This performs redisplay, then waits for SECONDS seconds or until "input"
> is available. The wait is done with 'read-event', which reads from "the
> input stream". So apparently this means user input, not child process
> input.

No, sit-for also reads from subprocesses, because it calls 'pselect',
which listens to subprocesses as well.  What's more important, it
acknowledges the "read ready" signal sent by the reader thread, and
releases it from its wait.

> > How about enlarging the 2nd argument to accept-process-output -- did
> > you try that, and if so, did it have any effect?
> 
> tried:
> 
>     (accept-process-output 1.0)
> 
> This hangs.

It didn't hang for me in the simple program I just posted.  Not sure
why the difference, but it isn't worth our time to figure that out,
since the solution is at hand.

> It doesn't change the child process behavior. I was hoping it would
> change the emacs side, allowing it to at least check for C-g every 0.1
> seconds. Hmm, I'm not clear that accept-process-input checks for input
> events.

What do you need by "input events"?  Whenever 'pselect' is called, on
Windows it monitors the C-g keypresses as well (grep for
"interrupt_handle" in the sources).

> > Can you tell how much did it write to the pipe at that point?
> 
> No, that info is not tracked in the current code. I can add it.
> 
> The parser has read 2935 bytes.

If reading that much could cause it to write more than 4KB of output,
then it will block, and won't read more until Emacs reads its end of
the pipe.

> MinGW: not sure what this means; the runtime is just a bunch of
> packages.

The version is in the header _mingw.h.

> mingw-w64-i686-headers-git 4.0.0.4210.0177153-1

The version of the w32api is in w32api.h.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#18420; Package emacs. (Tue, 09 Sep 2014 14:30:03 GMT) Full text and rfc822 format available.

Message #26 received at 18420 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stephen Leake <stephen_leake <at> stephe-leake.org>
Cc: 18420 <at> debbugs.gnu.org
Subject: Re: bug#18420: 24.3; interaction with external process hangs emacs
Date: Tue, 09 Sep 2014 17:30:00 +0300
> From: Stephen Leake <stephen_leake <at> stephe-leake.org>
> Date: Mon, 08 Sep 2014 12:00:42 -0500
> 
> Is there any way to get lisp names from the Ffuncall entries? That would
> be nice.

"xbacktrace" 9defined in src/.gdbinit) should display the Lisp-level
backtrace.  But beware: it calls functions inside Emacs, so it could
hang if Emacs is stuck.

The (tedious) fallback is to do this:

  #12 0x0110521b in Ffuncall (nargs=nargs <at> entry=1, args=args <at> entry=0x88ea34)
      at eval.c:2876

  (gdb) p args[0]
  (gdb) xtype # will usually say "Lisp symbol"
  (gdb) xsymbol




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#18420; Package emacs. (Tue, 09 Sep 2014 21:49:01 GMT) Full text and rfc822 format available.

Message #29 received at 18420 <at> debbugs.gnu.org (full text, mbox):

From: Stephen Leake <stephen_leake <at> stephe-leake.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 18420 <at> debbugs.gnu.org
Subject: Re: bug#18420: 24.3; interaction with external process hangs emacs
Date: Tue, 09 Sep 2014 16:48:26 -0500
Eli Zaretskii <eliz <at> gnu.org> writes:

> Anyway, it turns out we can make our pipes non-blocking on write,
> which then allows us to use the EWOULDBLOCK/EAGAIN handling in
> send_process.  Please try the patch below, and if it gives good
> results, I will install it shortly.  

Yes, it fixes the problem.

For this test, I reverted wisi-ada-parse-session-send-parse to use a
single (process-send-string process buf-string).

-- 
-- Stephe




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#18420; Package emacs. (Tue, 09 Sep 2014 21:55:02 GMT) Full text and rfc822 format available.

Message #32 received at 18420 <at> debbugs.gnu.org (full text, mbox):

From: Stephen Leake <stephen_leake <at> stephe-leake.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 18420 <at> debbugs.gnu.org
Subject: Re: bug#18420: 24.3; interaction with external process hangs emacs
Date: Tue, 09 Sep 2014 16:54:20 -0500
Eli Zaretskii <eliz <at> gnu.org> writes:

> No, sit-for also reads from subprocesses, because it calls 'pselect',
> which listens to subprocesses as well.  What's more important, it
> acknowledges the "read ready" signal sent by the reader thread, and
> releases it from its wait.

Ok; the doc string only uses the word "input", while
accept-process-input doc string explicitly talks about "subprocess
input", so I had the impression they were handled separately. I'll
submit a separate documentation bug on this.

-- 
-- Stephe




Reply sent to Eli Zaretskii <eliz <at> gnu.org>:
You have taken responsibility. (Sun, 14 Sep 2014 15:21:01 GMT) Full text and rfc822 format available.

Notification sent to stephen_leake <at> stephe-leake.org:
bug acknowledged by developer. (Sun, 14 Sep 2014 15:21:02 GMT) Full text and rfc822 format available.

Message #37 received at 18420-done <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stephen Leake <stephen_leake <at> stephe-leake.org>
Cc: 18420-done <at> debbugs.gnu.org
Subject: Re: bug#18420: 24.3; interaction with external process hangs emacs
Date: Sun, 14 Sep 2014 18:20:02 +0300
> From: Stephen Leake <stephen_leake <at> stephe-leake.org>
> Cc: 18420 <at> debbugs.gnu.org
> Date: Tue, 09 Sep 2014 16:48:26 -0500
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> > Anyway, it turns out we can make our pipes non-blocking on write,
> > which then allows us to use the EWOULDBLOCK/EAGAIN handling in
> > send_process.  Please try the patch below, and if it gives good
> > results, I will install it shortly.  
> 
> Yes, it fixes the problem.

Thanks, I installed this on the emacs-24 branch.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Mon, 13 Oct 2014 11:24:03 GMT) Full text and rfc822 format available.

This bug report was last modified 9 years and 209 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.