GNU bug report logs - #39413
26.2; Emacs gets hung

Previous Next

Package: emacs;

Reported by: chiaki-ishikawa-thunderbird-account <chiaki.ishikawa <at> ubin.jp>

Date: Tue, 4 Feb 2020 12:39:01 UTC

Severity: normal

Tags: moreinfo, notabug, unreproducible

Found in version 26.2

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 39413 in the body.
You can then email your comments to 39413 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#39413; Package emacs. (Tue, 04 Feb 2020 12:39:01 GMT) Full text and rfc822 format available.

Acknowledgement sent to chiaki-ishikawa-thunderbird-account <chiaki.ishikawa <at> ubin.jp>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Tue, 04 Feb 2020 12:39:01 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: chiaki-ishikawa-thunderbird-account <chiaki.ishikawa <at> ubin.jp>
To: bug-gnu-emacs <at> gnu.org
Subject: 26.2; Emacs gets hung
Date: Tue, 4 Feb 2020 18:32:30 +0900
Hi,

I have been using emacs-26.2 for a while and noticed
a few lockups for the last few months.

I had not seen emacs lockups for quite a while (I have used GNU Emacs
since 1[678] days.)

The recent lockup occurred just yesterday while I was editing C++
source files. I was about to enter (or have entered) the second '/' to
start a comment and then '/' is not echoed back and emacs locked up.

I had experienced a few lockups in the last few months and had to kill
the running emacs process.

This time instead of simply killing emacs, I tried to monitor what was
going on.

Interestingly, there was no system call being issued. I checked the
output of strace -p PID-OF-EMACS-PROCESS.
So I suspect a pure looping inside Emacs.

Then I quite strace and attached gdb to the running process.

I attach the log from the gdb at the end.

Short analysis from the log.:
I printed stack backtrace and Emacs seemed to be in memory allocator.
Then I let it continue.
Still the emacs screen was locked up.
So I interrupted the execution, and printed stack backtrace.
Again Emacs was in memory allocator.
Eventually I gave up and killed emacs.

It seems that garbage collection or
routine in alloc.c was looping.

Has there been a change in alloc.c in the last few minor revisions?
(It now seems to that a routine or two in it take FOREVER, I mean
unreasonably LOOOONG time to return. Maybe the algorithm used was
changed to an inefficient one. In the past few months, noticed a few
cases where I thought Emacs locked up but then it began responsive
after a VERY LONG pause.  So yesterday I waited for a while to see if
Emacs would become responsive. No, it did not after almost a minute
passed.  It could be that the modification is inefficient in the sense
that it is NOT very virtual memory friendly and causes extreme
paging/thrashing during garbage collection, etc.
I noticed a very slow response from Emacs after I stayed away from it
doing other things. Just my wild guess.)

Anyway, I am upgrading to 26.3 now to see if it is fixed.

This is not a repeatable bug. I have no idea how to reproduce this. It
occurs every now and then.

However, as I mentioned, I have never seen Emacs locking up like this 
before.


TIA

Chiaki


Log:

gdb -p 237890 ~/repos/emacs-26.2/src/emacs
GNU gdb (Debian 8.3.1-1) 8.3.1
Copyright (C) 2019 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
<http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /home/ishikawa/repos/emacs-26.2/src/emacs...
Attaching to program: /home/ishikawa/repos/emacs-26.2/src/emacs, process 
237890
[New LWP 237893]
[New LWP 237894]
[New LWP 237895]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00000000005490be in lisp_align_free (block=<optimized out>) at 
alloc.c:1379
1379          if (*tem >= (struct ablock *) abase && *tem < atop)
(gdb) where
#0  0x00000000005490be in lisp_align_free (block=<optimized out>) at 
alloc.c:1379
#1  0x0000000000549255 in sweep_conses () at alloc.c:6873
#2  0x000000000054cc66 in gc_sweep () at alloc.c:7157
#3  0x000000000054cc66 in garbage_collect_1 (end=<optimized out>) at 
alloc.c:6027
#4  0x000000000054cc66 in Fgarbage_collect () at alloc.c:6159
#5  0x00000000005650c9 in maybe_gc () at lisp.h:4744
#6  0x00000000005650c9 in Ffuncall (nargs=nargs <at> entry=2, 
args=args <at> entry=0x7ffcbd1a00d0) at eval.c:2758
#7  0x0000000000564652 in internal_condition_case_n (bfun=
    0x565030 <Ffuncall>, nargs=nargs <at> entry=2, 
args=args <at> entry=0x7ffcbd1a00d0, handlers=handlers <at> entry=49344, 
hfun=hfun <at> entry=0x43d4b0 <safe_eval_handler>) at eval.c:1416
#8  0x000000000042e7c9 in safe__call
    (inhibit_quit=inhibit_quit <at> entry=true, nargs=nargs <at> entry=2, 
func=func <at> entry=21408, ap=ap <at> entry=0x7ffcbd1a0148)
    at lisp.h:855
#9  0x000000000042e913 in safe__call1 
(inhibit_quit=inhibit_quit <at> entry=true, fn=fn <at> entry=21408) at xdisp.c:2683
#10 0x000000000044a7f4 in safe__eval (sexpr=<optimized out>, 
inhibit_quit=true) at lisp.h:855
#11 0x000000000044a7f4 in display_mode_element 
(it=it <at> entry=0x7ffcbd1a0400, depth=3,
    depth <at> entry=1, field_width=0, precision=precision <at> entry=0, 
elt=14558051, props=props <at> entry=0, risky=false)
    at xdisp.c:23845
#12 0x000000000044a963 in display_mode_element 
(it=it <at> entry=0x7ffcbd1a0400, depth=1,
    depth <at> entry=0, field_width=field_width <at> entry=0, 
precision=precision <at> entry=0, elt=<optimized out>,
    elt <at> entry=14644067, props=props <at> entry=0, risky=<optimized out>) at 
lisp.h:1221
#13 0x000000000044b9f6 in display_mode_line (w=w <at> entry=0x7a044200, 
face_id=MODE_LINE_FACE_ID, format=14644067)
    at lisp.h:855
--Type <RET> for more, q to quit, c to continue without paging--
#14 0x000000000044bc29 in display_mode_lines (w=w <at> entry=0x7a044200) at 
lisp.h:634
#15 0x000000000046335a in redisplay_window (window=2047099397, 
just_this_one_p=just_this_one_p <at> entry=false)
    at xdisp.c:17547
#16 0x000000000046779b in redisplay_window_0 
(window=window <at> entry=2047099397) at xdisp.c:14926
#17 0x0000000000564516 in internal_condition_case_1
    (bfun=bfun <at> entry=0x467770 <redisplay_window_0>, arg=2047099397, 
handlers=<optimized out>, hfun=hfun <at> entry=0x42d130 
<redisplay_window_error>) at eval.c:1360
#18 0x0000000000431cd2 in redisplay_windows (window=2047099397) at 
xdisp.c:14906
#19 0x0000000000454bfd in redisplay_internal () at xdisp.c:14395
#20 0x00000000004569f5 in redisplay () at xdisp.c:13613
#21 0x00000000004fa7cf in read_char
    (commandflag=commandflag <at> entry=1, map=map <at> entry=1955211187, 
prev_event=0, used_mouse_menu=used_mouse_menu <at> entry=0x7ffcbd1a59fb, 
end_time=end_time <at> entry=0x0) at keyboard.c:2482
#22 0x00000000004fe100 in read_key_sequence
    (keybuf=keybuf <at> entry=0x7ffcbd1a5af0, prompt=prompt <at> entry=0, 
dont_downcase_last=dont_downcase_last <at> entry=false, 
can_return_switch_frame=can_return_switch_frame <at> entry=true, 
fix_current_buffer=fix_current_buffer <at> entry=true, 
prevent_redisplay=prevent_redisplay <at> entry=false, bufsize=30) at 
keyboard.c:9157
#23 0x00000000004ff764 in command_loop_1 () at lisp.h:855
#24 0x000000000056447e in internal_condition_case
    (bfun=bfun <at> entry=0x4ff560 <command_loop_1>, 
handlers=handlers <at> entry=21120, hfun=hfun <at> entry=0x4f62b0 <cmd_error>)
    at eval.c:1336
#25 0x00000000004f0fb4 in command_loop_2 (ignore=ignore <at> entry=0) at 
lisp.h:855
#26 0x00000000005643ed in internal_catch
--Type <RET> for more, q to quit, c to continue without paging--
    (tag=tag <at> entry=50976, func=func <at> entry=0x4f0f90 <command_loop_2>, 
arg=arg <at> entry=0) at eval.c:1101
#27 0x00000000004f0f4b in command_loop () at lisp.h:855
#28 0x00000000004f5e93 in recursive_edit_1 () at keyboard.c:695
#29 0x00000000004f61d8 in Frecursive_edit () at keyboard.c:766
#30 0x00000000004182e1 in main (argc=<optimized out>, 
argv=0x7ffcbd1a5e78) at emacs.c:1722
(gdb) cont
Continuing.
^C
Thread 1 "emacs" received signal SIGINT, Interrupt.
0x00000000005490be in lisp_align_free (block=<optimized out>) at 
alloc.c:1379
1379          if (*tem >= (struct ablock *) abase && *tem < atop)
(gdb) print abase
$1 = (struct ablocks *) 0x3cf3c400
(gdb) print tem
$2 = (struct ablock **) 0x66fe1400
(gdb) print atop
$3 = (struct ablock *) 0x3cf40400
(gdb) print temp
No symbol "temp" in current context.
(gdb) print *tem
$4 = (struct ablock *) 0x66fe7000
(gdb) cont
Continuing.
^C
Thread 1 "emacs" received signal SIGINT, Interrupt.
0x00000000005490be in lisp_align_free (block=<optimized out>) at 
alloc.c:1379
1379          if (*tem >= (struct ablock *) abase && *tem < atop)
(gdb) quit
A debugging session is active.

    Inferior 1 [process 237890] will be detached.

Quit anyway? (y or n) y
Detaching from program: /home/ishikawa/repos/emacs-26.2/src/emacs, 
process 237890
[Inferior 1 (process 237890) detached]
ishikawa <at> ip030:/NREF-COMM-CENTRAL/mozilla/comm$ ps axg | grep emacs
 237890 pts/5    Rl    53:02 /home/ishikawa/repos/emacs-26.2/src/emacs 
--font fontset-24
 774365 pts/5    S+     0:00 grep emacs
ishikawa <at> ip030:/NREF-COMM-CENTRAL/mozilla/comm$ kill -HUP 237890

[end of my memo]

[What follows is created by `report-emacs-bug'.]

In GNU Emacs 26.2 (build 1, x86_64-pc-linux-gnu, GTK+ Version 3.24.5)
 of 2019-07-26 built on ip030
Windowing system distributor 'The X.Org Foundation', version 11.0.12006000
System Description:    Debian GNU/Linux bullseye/sid

Recent messages:
Wrote /tmp/ttt.txt
Auto-saving...done
Saving file /tmp/ttt.txt...
Wrote /tmp/ttt.txt
Saving file /tmp/ttt.txt...
Wrote /tmp/ttt.txt
Saving file /tmp/ttt.txt...
Wrote /tmp/ttt.txt
Saving file /tmp/ttt.txt...
Wrote /tmp/ttt.txt

Configured using:
 'configure REL_ALLOC=no --with-x-toolkit=gtk --without-xim'

Configured features:
XPM JPEG TIFF GIF PNG RSVG SOUND GPM DBUS GSETTINGS GLIB NOTIFY
LIBSELINUX GNUTLS LIBXML2 FREETYPE M17N_FLT LIBOTF XFT ZLIB
TOOLKIT_SCROLL_BARS GTK3 X11 XDBE XIM THREADS LIBSYSTEMD LCMS2

Important settings:
  value of $LANG: ja_JP.UTF-8
  value of $XMODIFIERS: @im=kinput2
  locale-coding-system: utf-8-unix

Major mode: Text

Minor modes in effect:
  diff-auto-refine-mode: t
  shell-dirtrack-mode: t
  global-whitespace-mode: t
  tooltip-mode: t
  global-eldoc-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  line-number-mode: t
  auto-fill-function: do-auto-fill
  transient-mark-mode: t

Load-path shadows:
/home/ishikawa/bin/org/org-pcomplete hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/org-pcomplete
/home/ishikawa/bin/org/ob-keys hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/ob-keys
/home/ishikawa/bin/org/org-timer hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/org-timer
/home/ishikawa/bin/org/org-mhe hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/org-mhe
/home/ishikawa/bin/org/org-colview hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/org-colview
/home/ishikawa/bin/org/org-clock hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/org-clock
/home/ishikawa/bin/org/org hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/org
/home/ishikawa/bin/org/org-element hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/org-element
/home/ishikawa/bin/org/ob-lilypond hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/ob-lilypond
/home/ishikawa/bin/org/ob-ref hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/ob-ref
/home/ishikawa/bin/org/ob-css hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/ob-css
/home/ishikawa/bin/org/ob-matlab hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/ob-matlab
/home/ishikawa/bin/org/org-inlinetask hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/org-inlinetask
/home/ishikawa/bin/org/org-mobile hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/org-mobile
/home/ishikawa/bin/org/ob-gnuplot hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/ob-gnuplot
/home/ishikawa/bin/org/ob-maxima hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/ob-maxima
/home/ishikawa/bin/org/org-bibtex hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/org-bibtex
/home/ishikawa/bin/org/ob-python hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/ob-python
/home/ishikawa/bin/org/org-archive hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/org-archive
/home/ishikawa/bin/org/org-indent hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/org-indent
/home/ishikawa/bin/org/org-w3m hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/org-w3m
/home/ishikawa/bin/org/ob-sqlite hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/ob-sqlite
/home/ishikawa/bin/org/org-ctags hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/org-ctags
/home/ishikawa/bin/org/org-macs hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/org-macs
/home/ishikawa/bin/org/ob-fortran hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/ob-fortran
/home/ishikawa/bin/org/org-table hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/org-table
/home/ishikawa/bin/org/ob-io hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/ob-io
/home/ishikawa/bin/org/ob-octave hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/ob-octave
/home/ishikawa/bin/org/ob-picolisp hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/ob-picolisp
/home/ishikawa/bin/org/ob-eval hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/ob-eval
/home/ishikawa/bin/org/ob-calc hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/ob-calc
/home/ishikawa/bin/org/org-docview hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/org-docview
/home/ishikawa/bin/org/org-mouse hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/org-mouse
/home/ishikawa/bin/org/org-compat hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/org-compat
/home/ishikawa/bin/org/ob-C hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/ob-C
/home/ishikawa/bin/org/ob-haskell hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/ob-haskell
/home/ishikawa/bin/org/org-info hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/org-info
/home/ishikawa/bin/org/ob-dot hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/ob-dot
/home/ishikawa/bin/org/org-src hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/org-src
/home/ishikawa/bin/org/org-datetree hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/org-datetree
/home/ishikawa/bin/org/ob-sass hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/ob-sass
/home/ishikawa/bin/org/ob-tangle hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/ob-tangle
/home/ishikawa/bin/org/ob-table hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/ob-table
/home/ishikawa/bin/org/ob-org hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/ob-org
/home/ishikawa/bin/org/org-irc hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/org-irc
/home/ishikawa/bin/org/org-bbdb hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/org-bbdb
/home/ishikawa/bin/org/org-plot hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/org-plot
/home/ishikawa/bin/org/org-loaddefs hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/org-loaddefs
/home/ishikawa/bin/org/org-capture hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/org-capture
/home/ishikawa/bin/org/ob-ditaa hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/ob-ditaa
/home/ishikawa/bin/org/ob-ocaml hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/ob-ocaml
/home/ishikawa/bin/org/ob-plantuml hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/ob-plantuml
/home/ishikawa/bin/org/org-feed hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/org-feed
/home/ishikawa/bin/org/ob-awk hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/ob-awk
/home/ishikawa/bin/org/ob-latex hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/ob-latex
/home/ishikawa/bin/org/org-list hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/org-list
/home/ishikawa/bin/org/org-habit hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/org-habit
/home/ishikawa/bin/org/org-faces hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/org-faces
/home/ishikawa/bin/org/ob-scheme hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/ob-scheme
/home/ishikawa/bin/org/org-attach hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/org-attach
/home/ishikawa/bin/org/org-install hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/org-install
/home/ishikawa/bin/org/org-agenda hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/org-agenda
/home/ishikawa/bin/org/ob-clojure hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/ob-clojure
/home/ishikawa/bin/org/ob-mscgen hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/ob-mscgen
/home/ishikawa/bin/org/ob-screen hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/ob-screen
/home/ishikawa/bin/org/org-eshell hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/org-eshell
/home/ishikawa/bin/org/org-gnus hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/org-gnus
/home/ishikawa/bin/org/ob hides /home/ishikawa/repos/emacs-26.2/lisp/org/ob
/home/ishikawa/bin/org/org-footnote hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/org-footnote
/home/ishikawa/bin/org/ob-R hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/ob-R
/home/ishikawa/bin/org/ob-exp hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/ob-exp
/home/ishikawa/bin/org/org-id hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/org-id
/home/ishikawa/bin/org/ob-emacs-lisp hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/ob-emacs-lisp
/home/ishikawa/bin/org/ob-java hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/ob-java
/home/ishikawa/bin/org/org-crypt hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/org-crypt
/home/ishikawa/bin/org/org-protocol hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/org-protocol
/home/ishikawa/bin/org/ob-lisp hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/ob-lisp
/home/ishikawa/bin/org/ob-asymptote hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/ob-asymptote
/home/ishikawa/bin/org/ob-lob hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/ob-lob
/home/ishikawa/bin/org/ob-sql hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/ob-sql
/home/ishikawa/bin/org/ob-ledger hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/ob-ledger
/home/ishikawa/bin/org/ob-shen hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/ob-shen
/home/ishikawa/bin/org/ob-ruby hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/ob-ruby
/home/ishikawa/bin/org/ob-js hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/ob-js
/home/ishikawa/bin/org/org-entities hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/org-entities
/home/ishikawa/bin/org/ob-perl hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/ob-perl
/home/ishikawa/bin/org/ob-comint hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/ob-comint
/home/ishikawa/bin/org/org-rmail hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/org-rmail
/home/ishikawa/bin/org/org-version hides 
/home/ishikawa/repos/emacs-26.2/lisp/org/org-version
/home/ishikawa/repos/emacs-26.2/lisp/textmodes/ispell hides 
/home/ishikawa/bin/ispell
/home/ishikawa/.emacs.d/elpa/japanese-holidays-20190317.1220/japanese-holidays 
hides /home/ishikawa/bin/japanese-holidays
/home/ishikawa/repos/emacs-26.2/lisp/tempo hides /home/ishikawa/bin/tempo
/home/ishikawa/repos/emacs-26.2/lisp/whitespace hides 
/home/ishikawa/bin/whitespace
/home/ishikawa/.emacs.d/elpa/igrep-20130824.507/igrep hides 
/home/ishikawa/bin/igrep

Features:
(shadow sort mail-extr warnings help-mode apropos info igrep grep
compile diff-mode easy-mmode misearch multi-isearch shell pcomplete
comint ring ebuff-menu vc-dispatcher vc-hg cc-mode cc-fonts cc-guess
cc-menus cc-cmds cc-styles cc-align cc-engine cc-vars cc-defs map
elec-pair google-c-style ansi-color disp-table whitespace finder-inf
package url-handlers ffap thingatpt url-parse auth-source cl-seq eieio
eieio-core cl-macs eieio-loaddefs url-vars its/hira egg egg-com egg-cnv
egg-mlh menudiag its its-keydef egg-edep server emacsbug message rmc
puny seq byte-opt gv bytecomp byte-compile cconv cl-loaddefs cl-lib
dired dired-loaddefs format-spec rfc822 mml easymenu mml-sec
password-cache epa derived epg epg-config gnus-util rmail rmail-loaddefs
mm-decode mm-bodies mm-encode mail-parse rfc2231 mailabbrev gmm-utils
mailheader sendmail rfc2047 rfc2045 ietf-drums mm-util mail-prsvr
mail-utils lpr autoinsert time-date mule-util japan-util tooltip eldoc
electric uniquify ediff-hook vc-hooks lisp-float-type mwheel term/x-win
x-win term/common-win x-dnd tool-bar dnd fontset image regexp-opt fringe
tabulated-list replace newcomment text-mode elisp-mode lisp-mode
prog-mode register page menu-bar rfn-eshadow isearch timer select
scroll-bar mouse jit-lock font-lock syntax facemenu font-core
term/tty-colors frame cl-generic cham georgian utf-8-lang misc-lang
vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms cp51932
hebrew greek romanian slovak czech european ethiopic indian cyrillic
chinese composite charscript charprop case-table epa-hook jka-cmpr-hook
help simple abbrev obarray minibuffer cl-preloaded nadvice loaddefs
button faces cus-face macroexp files text-properties overlay sha1 md5
base64 format env code-pages mule custom widget hashtable-print-readable
backquote threads dbusbind inotify lcms2 dynamic-setting
system-font-setting font-render-setting move-toolbar gtk x-toolkit x
multi-tty make-network-process emacs)

Memory information:
((conses 16 12971824 322608)
 (symbols 48 33597 2)
 (miscs 40 2138 2822)
 (strings 32 98018 15566)
 (string-bytes 1 2744864)
 (vectors 16 42363)
 (vector-slots 8 1008055 81600)
 (floats 8 61 723)
 (intervals 56 1156261 2635)
 (buffers 992 40))





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39413; Package emacs. (Tue, 04 Feb 2020 15:34:01 GMT) Full text and rfc822 format available.

Message #8 received at 39413 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: chiaki-ishikawa-thunderbird-account <chiaki.ishikawa <at> ubin.jp>
Cc: 39413 <at> debbugs.gnu.org
Subject: Re: bug#39413: 26.2; Emacs gets hung
Date: Tue, 04 Feb 2020 17:33:40 +0200
> From: chiaki-ishikawa-thunderbird-account <chiaki.ishikawa <at> ubin.jp>
> Date: Tue, 4 Feb 2020 18:32:30 +0900
> 
> I have been using emacs-26.2 for a while and noticed
> a few lockups for the last few months.
> 
> I had not seen emacs lockups for quite a while (I have used GNU Emacs
> since 1[678] days.)
> 
> The recent lockup occurred just yesterday while I was editing C++
> source files. I was about to enter (or have entered) the second '/' to
> start a comment and then '/' is not echoed back and emacs locked up.

Please try the latest emacs-27 branch of the Emacs Git repository, it
is quite possible that this problem was fixed there.  If not, we will
appreciate a minimal example of a file and sequence of commands to
type in order to reproduce the problem.

Thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39413; Package emacs. (Wed, 05 Feb 2020 02:16:01 GMT) Full text and rfc822 format available.

Message #11 received at 39413 <at> debbugs.gnu.org (full text, mbox):

From: chiaki-ishikawa-thunderbird-account <chiaki.ishikawa <at> ubin.jp>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: chiaki-ishikawa-thunderbird-account <chiaki.ishikawa <at> ubin.jp>,
 39413 <at> debbugs.gnu.org
Subject: Re: bug#39413: 26.2; Emacs gets hung
Date: Wed, 5 Feb 2020 10:11:56 +0900
On 2020/02/05 0:33, Eli Zaretskii wrote:
>> From: chiaki-ishikawa-thunderbird-account <chiaki.ishikawa <at> ubin.jp>
>> Date: Tue, 4 Feb 2020 18:32:30 +0900
>>
>> I have been using emacs-26.2 for a while and noticed
>> a few lockups for the last few months.
>>
>> I had not seen emacs lockups for quite a while (I have used GNU Emacs
>> since 1[678] days.)
>>
>> The recent lockup occurred just yesterday while I was editing C++
>> source files. I was about to enter (or have entered) the second '/' to
>> start a comment and then '/' is not echoed back and emacs locked up.
> Please try the latest emacs-27 branch of the Emacs Git repository, it
> is quite possible that this problem was fixed there.  If not, we will
> appreciate a minimal example of a file and sequence of commands to
> type in order to reproduce the problem.
>
> Thanks.
>
Thank you for your reply.

As I mentioned it seemed to be related to garbage collection and is hard 
to reproduce.

I cannot say when the bug hits exactly. So far it seemed random and not 
everyday matter.

I will try emacs-27 instead.

Thank you again.

Happy Hacking (in the original good sense of the phrase).

Chiaki Ishikawa






Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39413; Package emacs. (Fri, 21 Feb 2020 17:28:02 GMT) Full text and rfc822 format available.

Message #14 received at 39413 <at> debbugs.gnu.org (full text, mbox):

From: Noam Postavsky <npostavs <at> gmail.com>
To: chiaki-ishikawa-thunderbird-account <chiaki.ishikawa <at> ubin.jp>
Cc: 39413 <at> debbugs.gnu.org
Subject: Re: bug#39413: 26.2; Emacs gets hung
Date: Fri, 21 Feb 2020 12:27:26 -0500
tags 39413 + unreproducible
quit

chiaki-ishikawa-thunderbird-account <chiaki.ishikawa <at> ubin.jp> writes:

> I printed stack backtrace and Emacs seemed to be in memory allocator.
> Then I let it continue.
> Still the emacs screen was locked up.
> So I interrupted the execution, and printed stack backtrace.
> Again Emacs was in memory allocator.
> Eventually I gave up and killed emacs.
>
> It seems that garbage collection or
> routine in alloc.c was looping.

That's not enough evidence to show that the garbage collection was
looping.  There could be some higher level loop that does a lot of
allocation, so that if you stop at some random point you would have a
high probability of stopping in the alloc.c.  If you catch this in gdb
again, try running the 'finish' a few times to see if it can leave the
alloc.c code.

> This is not a repeatable bug. I have no idea how to reproduce this. It
> occurs every now and then.




Added tag(s) unreproducible. Request was from Noam Postavsky <npostavs <at> gmail.com> to control <at> debbugs.gnu.org. (Fri, 21 Feb 2020 17:28:03 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39413; Package emacs. (Sat, 22 Feb 2020 17:54:01 GMT) Full text and rfc822 format available.

Message #19 received at 39413 <at> debbugs.gnu.org (full text, mbox):

From: chiaki-ishikawa-thunderbird-account <chiaki.ishikawa <at> ubin.jp>
To: Noam Postavsky <npostavs <at> gmail.com>
Cc: chiaki-ishikawa-thunderbird-account <chiaki.ishikawa <at> ubin.jp>,
 39413 <at> debbugs.gnu.org
Subject: Re: bug#39413: 26.2; Emacs gets hung
Date: Sun, 23 Feb 2020 02:53:06 +0900
Thank you. I will taken note of "finish" and see if I can observe the 
return from alloc.c code
if I get to see the next obvious hung.

Chiaki

On 2020/02/22 2:27, Noam Postavsky wrote:
> tags 39413 + unreproducible
> quit
>
> chiaki-ishikawa-thunderbird-account <chiaki.ishikawa <at> ubin.jp> writes:
>
>> I printed stack backtrace and Emacs seemed to be in memory allocator.
>> Then I let it continue.
>> Still the emacs screen was locked up.
>> So I interrupted the execution, and printed stack backtrace.
>> Again Emacs was in memory allocator.
>> Eventually I gave up and killed emacs.
>>
>> It seems that garbage collection or
>> routine in alloc.c was looping.
> That's not enough evidence to show that the garbage collection was
> looping.  There could be some higher level loop that does a lot of
> allocation, so that if you stop at some random point you would have a
> high probability of stopping in the alloc.c.  If you catch this in gdb
> again, try running the 'finish' a few times to see if it can leave the
> alloc.c code.
>
>> This is not a repeatable bug. I have no idea how to reproduce this. It
>> occurs every now and then.






Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39413; Package emacs. (Tue, 28 Apr 2020 21:37:01 GMT) Full text and rfc822 format available.

Message #22 received at 39413 <at> debbugs.gnu.org (full text, mbox):

From: "ISHIKAWA,chiaki" <ishikawa <at> yk.rim.or.jp>
To: 39413 <at> debbugs.gnu.org
Cc: chiaki-ishikawa-thunderbird-account <chiaki.ishikawa <at> ubin.jp>,
 Noam Postavsky <npostavs <at> gmail.com>
Subject: Re: bug#39413: 26.2; Emacs gets hung
Date: Wed, 29 Apr 2020 06:36:46 +0900
Hi,

I am back to this strange guru meditation of emacs.
Emacs would eventually begin working, but the silent period where no key
or mouse event is acknowledged is rather longish.

Several weeks after I installed GNU Emacs 28.0.50 (that is what
|emacs-version| variable says. I installed from git repo.) , I began
noticing the occasional long pause (no response) again.

That is noticeable when I was running a mozilla thunderbird build
script in *shell* buffer. (M-X shell RETURN)
Obviously, the large log may cause some resource issues, but, say
the year before, I might have gotten the warning about memory running
low (I can't recall the exact warning. It asked something about "Am I
sure to do the operation?", like for example, when I invoked
|erase-buffer| to the very large *shell* buffer that contains many log
lines. Emacs responded something about dire situation and if I was
really invoking the function, and not by mistake, etc.
However, it never got seemingly hung for like 10-15 seconds.

Yesterday, this long pause happened again, and so I attached gdb to
the running emacs process and the following is what I found.

Basically, sweep_conses() takes unusually long time.
(It is possible that by the time I noticed the long hung and attached 
gdb, the major
portion of the problematic code had already been executed.)

This is the first stac ktrace when I attach gdb to emacs.
So it looks that it was trying to free cons cells.

#0  lisp_align_free (block=<optimized out>) at alloc.c:1266
#1  0x000055bc512500f2 in sweep_conses () at alloc.c:6907
#2  0x000055bc51255b27 in gc_sweep () at alloc.c:7134
#3  garbage_collect () at alloc.c:6105
#4  0x000055bc51255ed1 in maybe_garbage_collect () at alloc.c:5968
#5  0x000055bc51271785 in maybe_gc () at lisp.h:5068
#6  Ffuncall (nargs=4, args=args <at> entry=0x7fff92901a70) at eval.c:2778
#7  0x000055bc512a5838 in exec_byte_code
    (bytestr=<optimized out>, vector=<optimized out>, 
maxdepth=<optimized out>, args_template=<optimized out>, 
nargs=<optimized out>, args=<optimized out>) at bytecode.c:633
#8  0x000055bc51271707 in Ffuncall (nargs=4, 
args=args <at> entry=0x7fff92901f38) at eval.c:2808

    ... omitted ...

I invoked "finish" command to gdb. (A very handy command, indeed.)

finish
Run till exit from #0  lisp_align_free (block=<optimized out>) at 
alloc.c:1266
0x000055bc512500f2 in sweep_conses () at alloc.c:6907
6907              lisp_align_free (cblk);

 where
#0  0x000055bc512500f2 in sweep_conses () at alloc.c:6907
#1  0x000055bc51255b27 in gc_sweep () at alloc.c:7134
#2  garbage_collect () at alloc.c:6105
#3  0x000055bc51255ed1 in maybe_garbage_collect () at alloc.c:5968
    ... omitted ...

Another finish

Run till exit from #0  0x000055bc512500f2 in sweep_conses () at alloc.c:6907
gc_sweep () at alloc.c:7135
7135      sweep_floats ();
(gdb)


sweep_conses() took rather a long time. Maybe five seconds or so. (Could
be longer almost 10 seconds...)

Another finish.

 sweep_floats ();
(gdb) finish
Run till exit from #0  gc_sweep () at alloc.c:7135
garbage_collect () at alloc.c:7141
7141      check_string_bytes (!noninteractive);
(gdb)

sweep_float() ends quickly.

Another finish.

finish
Run till exit from #0  garbage_collect () at alloc.c:7141
0x000055bc51271785 in maybe_gc () at lisp.h:5068
5068        maybe_garbage_collect ();
(gdb)

check_string_bytes finished quickly.

finish
Run till exit from #0  0x000055bc51271785 in maybe_gc () at lisp.h:5068
deliver_input_available_signal (sig=29) at keyboard.c:7198
7198      deliver_process_signal (sig, handle_input_available_signal);
(gdb) Quit
(gdb) where

The above one took a fraction of a second. It was noticeable.
finish
Run till exit from #0  deliver_input_available_signal (sig=29) at 
keyboard.c:7198
<signal handler called>
(gdb) finish
Run till exit from #0  <signal handler called>
0x000055bc51271785 in maybe_gc () at lisp.h:5068
5068        maybe_garbage_collect ();
(gdb)

immediate return

Well, actually after a few more finish (which completed immediately),
emacs was responsive again.

I had a few similar experiences yesterday.


My observation:

So my take on this is sweep_conses() takes a rather long time and not
letting emacs to take any input. (And there could be other functions
before it.)

Question.: Has there been any change in emacs's alloc.c code in the
last year or two?  (Even an seemingly innocuous change might have
introduced a very bad CPU cache behavior that makes sweep_conses() to
perform badly. Or maybe it is now allowed to run without
time-constraint?)

I suspect that since the apparent long hung occurred during the time
when |*shell*| buffer was accumulating many lines from a largish
running build process, maybe the string allocation was not catching
up and causing excessive GC. In that case, the ordering of reclaiming
functions could be arranged to find string first?
(Addition to the buffer of lines read from a child process is
basically an insertion at the end of the buffer of strings, correct?)

OR maybe I am hitting the age old adagio of "GC considered
harmful". [1] Simply my virtual space is too large to be handled
efficiently.  I remember reading the paper the first time it appeared,
but I never thought it would hit me on my home PC. (The PC at the time
had 64KB-128KB of memory).

I am running emacs under Debian GNU/linux inside VirtualBox which is
assigned 16GB memory.

My suggestion:

I would think sweep_conses() may want to quit early, if possible, after
obtaining certain amount of space (or for a casual GC invoked
periodically, may want to quit after one second or something like
that. That is, we may want to set a elapsed time limit.)
In a really tight situation, we can invoke the reclaiming functions
with a flag telling them to use as much CPU/Elapsed time to reclaim as 
many octets as possible,
but for a periodic GC, it might make sense to return half-heartedly.

Oh, wait, have I set a variable to control periodic GC and forgot
about it?

I checked. I don't think I tweaked the following variables. C-H k output.

gc-cons-threshold is a variable defined in ‘C source code’.
Its value is 800000

gc-cons-percentage is a variable defined in ‘C source code’.
Its value is 0.1

As of now in a different incarnation of emacs process (this is a
differnt process from the one which I attached gdb.),

gc-elapsed is a variable defined in ‘C source code’.
Its value is 68.3795433

Hope this helps us shedding some light on the guru meditation of emacs
probably due to an unusually longish GC.

TIA

Chiaki

PS: Given that this may have something to do with the paging activity
of the emacs process, the linux version change (from 4 to 5) may have
affected the paging behavior of the program, but it is hard to say.  I
now see this linux instance is using 293M SWAP.  Compiling mozilla
thunderbird source runs |make -j6| and g++-9 compiles CPP source files
which are very large and so they exert heavy memory pressure. A bad
memory access pattern of emacs, not friendly to locality of access,
would add to page faults. Except, |xosview| program does not show any
paging activity. I am not sure if |xosview| is correct or not.


[1] "Address/memory management for a gigantic LISP environment or,
     GC considered harmful",  Jon L. White,
 LFP '80: Proceedings of the 1980 ACM conference on LISP and functional 
programming,
August 1980 Pages 119–127
https://doi.org/10.1145/800087.802797
https://dl.acm.org/doi/10.1145/1317203.1317206


On 2020/02/23 2:53, chiaki-ishikawa-thunderbird-account wrote:
> Thank you. I will taken note of "finish" and see if I can observe the 
> return from alloc.c code
> if I get to see the next obvious hung.
>
> Chiaki
>
> On 2020/02/22 2:27, Noam Postavsky wrote:
>> tags 39413 + unreproducible
>> quit
>>
>> chiaki-ishikawa-thunderbird-account <chiaki.ishikawa <at> ubin.jp> writes:
>>
>>> I printed stack backtrace and Emacs seemed to be in memory allocator.
>>> Then I let it continue.
>>> Still the emacs screen was locked up.
>>> So I interrupted the execution, and printed stack backtrace.
>>> Again Emacs was in memory allocator.
>>> Eventually I gave up and killed emacs.
>>>
>>> It seems that garbage collection or
>>> routine in alloc.c was looping.
>> That's not enough evidence to show that the garbage collection was
>> looping.  There could be some higher level loop that does a lot of
>> allocation, so that if you stop at some random point you would have a
>> high probability of stopping in the alloc.c.  If you catch this in gdb
>> again, try running the 'finish' a few times to see if it can leave the
>> alloc.c code.
>>
>>> This is not a repeatable bug. I have no idea how to reproduce this. It
>>> occurs every now and then.
>
>
>





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39413; Package emacs. (Wed, 29 Apr 2020 06:57:02 GMT) Full text and rfc822 format available.

Message #25 received at 39413 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: "ISHIKAWA,chiaki" <ishikawa <at> yk.rim.or.jp>
Cc: chiaki.ishikawa <at> ubin.jp, 39413 <at> debbugs.gnu.org, npostavs <at> gmail.com
Subject: Re: bug#39413: 26.2; Emacs gets hung
Date: Wed, 29 Apr 2020 09:55:57 +0300
> From: "ISHIKAWA,chiaki" <ishikawa <at> yk.rim.or.jp>
> Date: Wed, 29 Apr 2020 06:36:46 +0900
> Cc: chiaki-ishikawa-thunderbird-account <chiaki.ishikawa <at> ubin.jp>,
>  Noam Postavsky <npostavs <at> gmail.com>
> 
> I am back to this strange guru meditation of emacs.
> Emacs would eventually begin working, but the silent period where no key
> or mouse event is acknowledged is rather longish.
> 
> Several weeks after I installed GNU Emacs 28.0.50 (that is what
> |emacs-version| variable says. I installed from git repo.) , I began
> noticing the occasional long pause (no response) again.

I suggest to begin by establishing whether these pauses are caused by
GC.  To that end, set garbage-collection-messages to a non-nil value,
and see if Emacs displays a message about GC in progress when these
pauses occur.  Also, try to tell how frequent are these pauses and
much time each pause takes.

If you see that every time you experience a pause there's GC running,
we can then proceed to investigating whether the frequency of GC and
the time it takes on your system is something abnormal, and why.

> So my take on this is sweep_conses() takes a rather long time and not
> letting emacs to take any input. (And there could be other functions
> before it.)

The question is how long is "long time".  We need some quantitative
measure of this before we can tell whether it's normal or not.

> Question.: Has there been any change in emacs's alloc.c code in the
> last year or two?

Yes, a lot.  But I don't see any significant change in frequency or
time duration of GC on my system due to those changes, FWIW.

> I am running emacs under Debian GNU/linux inside VirtualBox which is
> assigned 16GB memory.

It is possible that the virtual machine's memory management is
misconfigured, so it slows down Emacs when it needs to access a large
portion of its address space (which happens during GC).

> gc-elapsed is a variable defined in ‘C source code’.
> Its value is 68.3795433

The question is how many GC cycles took 68 seconds.  What is the value
of gcs-done?  Also, is this in a session to which you attached GDB? if
so, then some of the elapsed time was due to Emacs being stopped under
the debugger.  We need these values in a session that is never stopped
by a debugger.

> PS: Given that this may have something to do with the paging activity
> of the emacs process, the linux version change (from 4 to 5) may have
> affected the paging behavior of the program, but it is hard to say.  I
> now see this linux instance is using 293M SWAP.  Compiling mozilla
> thunderbird source runs |make -j6| and g++-9 compiles CPP source files
> which are very large and so they exert heavy memory pressure. A bad
> memory access pattern of emacs, not friendly to locality of access,
> would add to page faults. Except, |xosview| program does not show any
> paging activity. I am not sure if |xosview| is correct or not.

Building a large program with -j6 also uses CPU, and that could make
CPU-intensive operations, such as GC, slow.  Was that build running in
the same VirtualBox as Emacs?  If so, how many physical CPU execution
units do you have allocated to VirtualBox?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39413; Package emacs. (Thu, 30 Apr 2020 02:46:01 GMT) Full text and rfc822 format available.

Message #28 received at 39413 <at> debbugs.gnu.org (full text, mbox):

From: chiaki-ishikawa-thunderbird-account <chiaki.ishikawa <at> ubin.jp>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: chiaki.ishikawa <at> ubin.jp, 39413 <at> debbugs.gnu.org, npostavs <at> gmail.com
Subject: Re: bug#39413: 26.2; Emacs gets hung
Date: Thu, 30 Apr 2020 11:45:25 +0900
Hi,

Thank you for your tips.

My comment inline.:

On 2020/04/29 15:55, Eli Zaretskii wrote:
>> From: "ISHIKAWA,chiaki" <ishikawa <at> yk.rim.or.jp>
>> Date: Wed, 29 Apr 2020 06:36:46 +0900
>> Cc: chiaki-ishikawa-thunderbird-account <chiaki.ishikawa <at> ubin.jp>,
>>   Noam Postavsky <npostavs <at> gmail.com>
>>
>> I am back to this strange guru meditation of emacs.
>> Emacs would eventually begin working, but the silent period where no key
>> or mouse event is acknowledged is rather longish.
>>
>> Several weeks after I installed GNU Emacs 28.0.50 (that is what
>> |emacs-version| variable says. I installed from git repo.) , I began
>> noticing the occasional long pause (no response) again.
> I suggest to begin by establishing whether these pauses are caused by
> GC.  To that end, set garbage-collection-messages to a non-nil value,
> and see if Emacs displays a message about GC in progress when these
> pauses occur.  Also, try to tell how frequent are these pauses and
> much time each pause takes.
>
> If you see that every time you experience a pause there's GC running,
> we can then proceed to investigating whether the frequency of GC and
> the time it takes on your system is something abnormal, and why.
I have set garbage-collection-messages to t.

Let me see what happens when the next apparent long guru meditation happens.


>> So my take on this is sweep_conses() takes a rather long time and not
>> letting emacs to take any input. (And there could be other functions
>> before it.)
> The question is how long is "long time".  We need some quantitative
> measure of this before we can tell whether it's normal or not.
I agree. It  is all relative.
>> Question.: Has there been any change in emacs's alloc.c code in the
>> last year or two?
> Yes, a lot.  But I don't see any significant change in frequency or
> time duration of GC on my system due to those changes, FWIW.

Maybe I should take a look when my time permits.


>> I am running emacs under Debian GNU/linux inside VirtualBox which is
>> assigned 16GB memory.
> It is possible that the virtual machine's memory management is
> misconfigured, so it slows down Emacs when it needs to access a large
> portion of its address space (which happens during GC).

I have not touched any thing (famous last words) for my linux 
configuration for the last few years aside from simply upgrading the 
packages from time to time.


>> gc-elapsed is a variable defined in ‘C source code’.
>> Its value is 68.3795433
> The question is how many GC cycles took 68 seconds.  What is the value
> of gcs-done?  Also, is this in a session to which you attached GDB? if
> so, then some of the elapsed time was due to Emacs being stopped under
> the debugger.  We need these values in a session that is never stopped
> by a debugger.
>

I will monitor gcs-done when the next guru meditation happens.

Is there a hook to garbage-collection so that I can print gcs-done and 
gc-elapsed maybe after the garbage-collection is run?


>> PS: Given that this may have something to do with the paging activity
>> of the emacs process, the linux version change (from 4 to 5) may have
>> affected the paging behavior of the program, but it is hard to say.  I
>> now see this linux instance is using 293M SWAP.  Compiling mozilla
>> thunderbird source runs |make -j6| and g++-9 compiles CPP source files
>> which are very large and so they exert heavy memory pressure. A bad
>> memory access pattern of emacs, not friendly to locality of access,
>> would add to page faults. Except, |xosview| program does not show any
>> paging activity. I am not sure if |xosview| is correct or not.
> Building a large program with -j6 also uses CPU, and that could make
> CPU-intensive operations, such as GC, slow.  Was that build running in
> the same VirtualBox as Emacs?  If so, how many physical CPU execution
> units do you have allocated to VirtualBox?
>
Yes, the build was running in the same VirtualBox as Emacs.
I have assigned 7 cores to VirtualBox.
(Oh wait!!! My current VirtualBox says it is assigned only 6 cores!?
I thought I did assign 7 cores to this VirtualBox image.
My PC runs Windows 10 natively on AMD Ryzen 7 Eight-Core processor that 
has additional 8 hyperthreads.
So there are 16 virtual cores.
I have no idea why only 6 core is assigned to VM image today.
The number has been tweaked when an earlier VBox had an unresolved issue 
when AMD Ryzen CPU came out.
[I used to run this image on XEON CPU.]
Maybe this could explain the problem on my HOME PC.However, the same 
guru meditation has been observed on office PC with a different Intel 
CPU, too.
Again, Emacs is run inside virtualBox on that PC that runs Windows 10 
professional natively.))

However, the important point is this. I have not noticed this strange 
guru meditation until about 12 months ago or so.
That is why I suspect that there could be an issue with the recent 
alloc.c or something, but I think I need to gather more data
to focus in it yet.

Stay tuned.

Thank you again for your help. I will try to gather more circumstantial 
evidence so that we can make an educated guess why Emacs seems to enter 
guru meditation on my PC.

Chiaki





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39413; Package emacs. (Wed, 20 May 2020 04:32:02 GMT) Full text and rfc822 format available.

Message #31 received at 39413 <at> debbugs.gnu.org (full text, mbox):

From: chiaki-ishikawa-thunderbird-account <chiaki.ishikawa <at> ubin.jp>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: "chiaki.ishikawa <at> ubin.jp" <chiaki.ishikawa <at> ubin.jp>, 39413 <at> debbugs.gnu.org,
 Noam Postavsky <npostavs <at> gmail.com>
Subject: Re: bug#39413: 26.2; Emacs gets hung
Date: Wed, 20 May 2020 13:31:29 +0900
[Message part 1 (text/plain, inline)]
Hi,

I am back.

I have observed another long hung.

This time, I did verify that it is during GC.

The hung lasted close to 8 minutes give or take.

My comment inline.

On 2020/04/30 11:45, chiaki-ishikawa-thunderbird-account wrote:
> Hi,
>
> Thank you for your tips.
>
> My comment inline.:
>
> On 2020/04/29 15:55, Eli Zaretskii wrote:
>>> From: "ISHIKAWA,chiaki" <ishikawa <at> yk.rim.or.jp>
>>> Date: Wed, 29 Apr 2020 06:36:46 +0900
>>> Cc: chiaki-ishikawa-thunderbird-account <chiaki.ishikawa <at> ubin.jp>,
>>>   Noam Postavsky <npostavs <at> gmail.com>
>>>
>>> I am back to this strange guru meditation of emacs.
>>> Emacs would eventually begin working, but the silent period where no 
>>> key
>>> or mouse event is acknowledged is rather longish.
>>>
>>> Several weeks after I installed GNU Emacs 28.0.50 (that is what
>>> |emacs-version| variable says. I installed from git repo.) , I began
>>> noticing the occasional long pause (no response) again.
>> I suggest to begin by establishing whether these pauses are caused by
>> GC.  To that end, set garbage-collection-messages to a non-nil value,
>> and see if Emacs displays a message about GC in progress when these
>> pauses occur.  Also, try to tell how frequent are these pauses and
>> much time each pause takes.
>>
>> If you see that every time you experience a pause there's GC running,
>> we can then proceed to investigating whether the frequency of GC and
>> the time it takes on your system is something abnormal, and why.
> I have set garbage-collection-messages to t.
>
> Let me see what happens when the next apparent long guru meditation 
> happens.
>
Since the last message, I did not get a very long hung  until this 
morning. Maybe 5-10 second long GC only. Those I can tolerate.
I feel "what?" after maybe a five second hung and then realize something 
is holding up emacs, and then the hung is over. I tell myself GC was 
rather long, but that is it.

But this morning I get a rather long hung, or GC, twice.

Attaching gdb to the emacs process then after a 10 second or so of the 
first hung, and noticed that sweep_cons() took three seconds to finish 
after I entered "finish" command to gdb. That certainly felt long during 
gdb session.

So that was an initial clue that something is wrong.

(Now I suspect some kind of memory area corruption. Before this happened 
I loaded a several large log files each about 20MB, and when I tried to
erase a rather large *shell* buffer that contained the log from a large 
compilation job, this garbage collection kicked in.)

Anyway, I checked and found gc-elapsed showed a number in 400-500 
seconds range. I really wished I wrote down the exact number.

>
>
>>> So my take on this is sweep_conses() takes a rather long time and not
>>> letting emacs to take any input. (And there could be other functions
>>> before it.)
>> The question is how long is "long time".  We need some quantitative
>> measure of this before we can tell whether it's normal or not.
> I agree. It  is all relative.
>
I agree "long" is a relative term.

But after the somewhat long pause of maybe 20-30 seconds, I get another 
really long pause which prompted me to write this mail.

The second long hung this morning was maybe longer than 8 minutes give 
or take a minute or so.

I stopped emacs  a few times after the initial five minute or so of the 
long hung to check where it spends time occasionally using gdb. But it 
definitely is close to 8 minutes. I would have killed emacs and start 
over if I had not learned about gc-elapsed and other internal variables.

After GC finally finished and control returned to keyboard, I checked 
gc-elapsed.
It now shows:
gc-elapsed is a variable defined in ‘C source code’.
Its value is 1067.998007492

It matches my observation that gc took about 8 minutes give or take. (Of 
course, there were a few times I interrupted the operation from 
Control-C from GDB and even captured a screen shot or two. But that was 
less than two minutes in total.)


>>> Question.: Has there been any change in emacs's alloc.c code in the
>>> last year or two?
>> Yes, a lot.  But I don't see any significant change in frequency or
>> time duration of GC on my system due to those changes, FWIW.
>
> Maybe I should take a look when my time permits.
>
>
>>> I am running emacs under Debian GNU/linux inside VirtualBox which is
>>> assigned 16GB memory.
>> It is possible that the virtual machine's memory management is
>> misconfigured, so it slows down Emacs when it needs to access a large
>> portion of its address space (which happens during GC).
>
> I have not touched any thing (famous last words) for my linux 
> configuration for the last few years aside from simply upgrading the 
> packages from time to time.
>
>
>>> gc-elapsed is a variable defined in ‘C source code’.
>>> Its value is 68.3795433
>> The question is how many GC cycles took 68 seconds.  What is the value
>> of gcs-done?  Also, is this in a session to which you attached GDB? if
>> so, then some of the elapsed time was due to Emacs being stopped under
>> the debugger.  We need these values in a session that is never stopped
>> by a debugger.
>>
>
> I will monitor gcs-done when the next guru meditation happens.
>
> Is there a hook to garbage-collection so that I can print gcs-done and 
> gc-elapsed maybe after the garbage-collection is run?

I failed to check gcs-done. It shows 1032 after the fact.

But given that I monitored gc using gdb and it took almost 8 minutes 
before successive "finish" commands to gdb (after sweep_floats(), etc.)
to come up to the interactive functions, I am quite sure that one 
GC-related entry point used approximately 8 min to finish.
And I think that entry is maybe_gc() or maybe_garbage_collect().

(That is why I suspected some type of memory corruption software-wise. 
My PC has ECC enabled and so it is rather unlikely memory got corrupted 
spontaneously.)


>
>
>>> PS: Given that this may have something to do with the paging activity
>>> of the emacs process, the linux version change (from 4 to 5) may have
>>> affected the paging behavior of the program, but it is hard to say.  I
>>> now see this linux instance is using 293M SWAP.  Compiling mozilla
>>> thunderbird source runs |make -j6| and g++-9 compiles CPP source files
>>> which are very large and so they exert heavy memory pressure. A bad
>>> memory access pattern of emacs, not friendly to locality of access,
>>> would add to page faults. Except, |xosview| program does not show any
>>> paging activity. I am not sure if |xosview| is correct or not.
>> Building a large program with -j6 also uses CPU, and that could make
>> CPU-intensive operations, such as GC, slow.  Was that build running in
>> the same VirtualBox as Emacs?  If so, how many physical CPU execution
>> units do you have allocated to VirtualBox?
>>
> Yes, the build was running in the same VirtualBox as Emacs.
> I have assigned 7 cores to VirtualBox.
> (Oh wait!!! My current VirtualBox says it is assigned only 6 cores!?
> I thought I did assign 7 cores to this VirtualBox image.
> My PC runs Windows 10 natively on AMD Ryzen 7 Eight-Core processor 
> that has additional 8 hyperthreads.
> So there are 16 virtual cores.
> I have no idea why only 6 core is assigned to VM image today.
> The number has been tweaked when an earlier VBox had an unresolved 
> issue when AMD Ryzen CPU came out.
> [I used to run this image on XEON CPU.]
> Maybe this could explain the problem on my HOME PC.However, the same 
> guru meditation has been observed on office PC with a different Intel 
> CPU, too.
> Again, Emacs is run inside virtualBox on that PC that runs Windows 10 
> professional natively.))
>
> However, the important point is this. I have not noticed this strange 
> guru meditation until about 12 months ago or so.
> That is why I suspect that there could be an issue with the recent 
> alloc.c or something, but I think I need to gather more data
> to focus in it yet.
>
> Stay tuned.
>
> Thank you again for your help. I will try to gather more 
> circumstantial evidence so that we can make an educated guess why 
> Emacs seems to enter guru meditation on my PC.
>
> Chiaki
>
>
Since last message, I have re-assigned 7 CPUs to my virtual box image.

At the time of long hung this morning, I have no largish build or 
anything. I was only editing a message file or something.
I wanted to clear *shell* buffer with "erase-buffer" when this series of 
long hungs hit emacs.

I am attaching two screen dumps.
sweep-conses shows that emacs is spending time in sweep-conses after I 
typed "finish" command to gdb.
Emacs process in the background did not respond to X repaint event and 
thus only a portion of its screen, most notably echo area is partially 
shown.
The window on the right is xosview system resource monitor program.
That shows CPP 2, 3, and 4 are used 100% successively (and that is emacs).
The durration of time covered by the horizontal length of the montor 
window is about 15 seconds.
Emacs spent time almost like 8 minutes (I need to subtract a minute or 
two for gdb interaction)
like this. Just churning CPU and nothing else.

memory-consolidate-1.PNG is interesting in that this was the first time 
I saw something two level below sweep_conses().
A routine malloc_consolidate() is visible.
Maybe the memory consolidation may take time.

A repeated question, which I think has No answer is as follows.:
Is there a hook that gets executed just before GC happens?

Now I think the answer is "No" because GC is invoked when the available 
conses are few and we don't want to run arbitrary lisp program whch may 
lead to another GC invocation.

Short of such feature, is there a post hook that gets executed AFTER GC 
finishes (well in the stacktrace I see maybe_gc() called from call2(), 
and so
the hook may have to be placedon maybe_gc() routine if possible.

I am asking the questions about the hook(s) since I want to record some 
statistics about memory usage, etc. automatically
before/after GC so that we can have a scientific/measurable data to 
discuss what is going on.
If we can have a set of such pre/post data for the last few dozen GC 
invocations, we may be able to home into the cause of the program.
Short of lisp function, maybe I can add message print out in alloc.c to 
print out such information so that it gets recorded in *Message* buffer.
It is done only when gabage-collection-message is t and there should be 
some throttling mechanism to avoid spamming the user.


I now feel both alloc.c and malloc.c may be to blame, most probably in 
terms of locality of references that may interfere badly with CPU cache 
and linux memory system. However, do not there does not seem to be much 
paging activity AND more than 4 GB real memory is more or less 
available  still. That is visible in xosview's screen in 
sweep-conses.png. That lead to me wonder if there is algorithmic slow 
down due to non-linear execution time of memory size? But again, this is 
hard to believe in this age of home users having GB of memory. Such 
algorithmic lapse can't remain undetected for so long.

Hmm, that makes me to throw the net on linux AND virtualbox which I use 
to run linux on my home Windows PC.
I have 32 GB of memory on my home PC and assign 16 GB of memory to 
VirtualBox image.
Given that I still have 2 GB of free memory on Window side right now (I 
just checked with system performance monitor),
all the memory references during GC ought to go into the real memory 
without paging on the Window side even, .i.e., the virtual memory is 
mapped straight to the real memory on the PC.

OR, do I feel lucky and compile emacs with GCC-9's -O3 flag and see if 
it speeds up emacs?
To be honest, I always prefer correctness rather than speed, but 6 
minutes GC is not something I can put up with in an interactive program.
Come to think of it if Emacs is going multi-threaded (I mean Emacs-Lisp 
interpreter) then low-level GC can become incremental opportunistic 
non-blocking GC?

Any pointer re the hook(s) short of rewriting alloc.c is appreciated. 
Well, if worst comes to worst, I don't mind rewriting alloc.c to create 
an array of fixed size to store the last dozen or so GC-related 
information that is updated before and after invocation of a certain GC 
entry point.
I can print that out after a long GC to learn the memory situation.
But that sounds a bit drastic. OTOH, it may be the only way to go.
But the last time I worked on alloc.c to debug it to run on non-UNIX 
platform was close to 30 years ago. :-)

Thank you in advance for your attention on this matter.

Chiaki


[malloc-consolidate-1.PNG (image/png, attachment)]
[sweep-conses.png (image/png, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39413; Package emacs. (Mon, 25 May 2020 12:16:01 GMT) Full text and rfc822 format available.

Message #34 received at 39413 <at> debbugs.gnu.org (full text, mbox):

From: Noam Postavsky <npostavs <at> gmail.com>
To: chiaki-ishikawa-thunderbird-account <chiaki.ishikawa <at> ubin.jp>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 39413 <at> debbugs.gnu.org
Subject: Re: bug#39413: 26.2; Emacs gets hung
Date: Mon, 25 May 2020 08:15:29 -0400
chiaki-ishikawa-thunderbird-account <chiaki.ishikawa <at> ubin.jp> writes:

> Short of lisp function, maybe I can add message print out in alloc.c
> to print out such information so that it gets recorded in *Message*
> buffer.

> It is done only when gabage-collection-message is t and there should
> be some throttling mechanism to avoid spamming the user.

> Any pointer re the hook(s) short of rewriting alloc.c is
> appreciated. Well, if worst comes to worst, I don't mind rewriting
> alloc.c to create an array of fixed size to store the last dozen or so
> GC-related information that is updated before and after invocation of
> a certain GC entry point.
> I can print that out after a long GC to learn the memory situation.
> But that sounds a bit drastic. OTOH, it may be the only way to go.

That should probably work.  I see the current
garbage_collection_messages code uses message_nolog, I guess because
growing the *Messages* buffer could also trigger GC.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39413; Package emacs. (Mon, 25 May 2020 15:51:02 GMT) Full text and rfc822 format available.

Message #37 received at 39413 <at> debbugs.gnu.org (full text, mbox):

From: chiaki-ishikawa-thunderbird-account <chiaki.ishikawa <at> ubin.jp>
To: Noam Postavsky <npostavs <at> gmail.com>
Cc: chiaki-ishikawa-thunderbird-account <chiaki.ishikawa <at> ubin.jp>,
 Eli Zaretskii <eliz <at> gnu.org>, 39413 <at> debbugs.gnu.org
Subject: Re: bug#39413: 26.2; Emacs gets hung
Date: Tue, 26 May 2020 00:50:15 +0900
On 2020/05/25 21:15, Noam Postavsky wrote:
> chiaki-ishikawa-thunderbird-account <chiaki.ishikawa <at> ubin.jp> writes:
>
>> Short of lisp function, maybe I can add message print out in alloc.c
>> to print out such information so that it gets recorded in *Message*
>> buffer.
>> It is done only when gabage-collection-message is t and there should
>> be some throttling mechanism to avoid spamming the user.
>> Any pointer re the hook(s) short of rewriting alloc.c is
>> appreciated. Well, if worst comes to worst, I don't mind rewriting
>> alloc.c to create an array of fixed size to store the last dozen or so
>> GC-related information that is updated before and after invocation of
>> a certain GC entry point.
>> I can print that out after a long GC to learn the memory situation.
>> But that sounds a bit drastic. OTOH, it may be the only way to go.
> That should probably work.  I see the current
> garbage_collection_messages code uses message_nolog, I guess because
> growing the *Messages* buffer could also trigger GC.
>
Thank you, I will probably do that.

Back in 18.x days, printing the message from within GC caused another GC 
invocation, thus emacs blew up due to stack flow after so many recursive 
calls.

I think having a static array of 10-20 cells to store previous GC info 
should work.

Stay tuned. I am really interested in WHERE/WHY this long pause occurs.: 
maybe Windows 10 memory allocation/ Virtual Box emulation/Linux kernel 
memory allocation, and AMD Ryzen CPU's cache replacement policy all add up.

Chiaki






Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39413; Package emacs. (Mon, 08 Jun 2020 08:18:02 GMT) Full text and rfc822 format available.

Message #40 received at 39413 <at> debbugs.gnu.org (full text, mbox):

From: chiaki-ishikawa-thunderbird-account <chiaki.ishikawa <at> ubin.jp>
To: Noam Postavsky <npostavs <at> gmail.com>
Cc: chiaki-ishikawa-thunderbird-account <chiaki.ishikawa <at> ubin.jp>,
 Eli Zaretskii <eliz <at> gnu.org>, 39413 <at> debbugs.gnu.org
Subject: Re: bug#39413: 26.2; Emacs gets hung
Date: Mon, 8 Jun 2020 17:17:00 +0900
On 2020/05/26 0:50, chiaki-ishikawa-thunderbird-account wrote:
> On 2020/05/25 21:15, Noam Postavsky wrote:
>> chiaki-ishikawa-thunderbird-account <chiaki.ishikawa <at> ubin.jp> writes:
>>
>>> Short of lisp function, maybe I can add message print out in alloc.c
>>> to print out such information so that it gets recorded in *Message*
>>> buffer.
>>> It is done only when gabage-collection-message is t and there should
>>> be some throttling mechanism to avoid spamming the user.
>>> Any pointer re the hook(s) short of rewriting alloc.c is
>>> appreciated. Well, if worst comes to worst, I don't mind rewriting
>>> alloc.c to create an array of fixed size to store the last dozen or so
>>> GC-related information that is updated before and after invocation of
>>> a certain GC entry point.
>>> I can print that out after a long GC to learn the memory situation.
>>> But that sounds a bit drastic. OTOH, it may be the only way to go.
>> That should probably work.  I see the current
>> garbage_collection_messages code uses message_nolog, I guess because
>> growing the *Messages* buffer could also trigger GC.
>>
> Thank you, I will probably do that.
>
> Back in 18.x days, printing the message from within GC caused another 
> GC invocation, thus emacs blew up due to stack flow after so many 
> recursive calls.
>
> I think having a static array of 10-20 cells to store previous GC info 
> should work.
>
> Stay tuned. I am really interested in WHERE/WHY this long pause 
> occurs.: maybe Windows 10 memory allocation/ Virtual Box 
> emulation/Linux kernel memory allocation, and AMD Ryzen CPU's cache 
> replacement policy all add up.
>
> Chiaki
>
>
Hi,

After looking at alloc.c, I realized that there *IS* a post-hook that 
gets called after GC.

So instead of hacking and modifying alloc.c, I took the easy route and 
did the following in my .emacs
(the code is in postscript)
and now waiting for the next LOOONG gc behavior.

Basically I record some GC-related statistics for the last 10 
invocations, and make sure it is recorded in *Messages* buffer.
Maybe I can learn something from it. (My most suspicious part is string 
GC. When I run "erase-buffer" on the "*shell*" buffer which has 
accumulated a large build output a few times over, GC tends to be very 
longish.)

If you know any other easily obtainable GC-related information that 
ought to be in the statistics
on top of
(append (memory-use-counts) (list gc-elapsed gcs-done))
please let me know.

Directly evaluating (garbage-collect) prints a neat result, but I am not 
sure if that can be easily simulated from ordinary eval.
(And maybe it is only syntactic sugar and doesn't add much to the 
necessary statistical data.)

TIA

Chiaki Ishikawa


PS: The addition in my .emacs
;;;
;;; GC behavior monitor
;;;
;;; Record some key information for GC in the GC post-hook so that I
;;; can learn something about the behavior and performance of GC with
;;; concrete data.

(setq my-gc-statistics (make-vector 10 nil))

;;; The element is
;;; (append (memory-use-counts) (list gc-elapsed gcs-done))
;;; Each time the following function is called, the
;;; elements in the array is shifted toward the end.
;;; Use (message "%S" my-gc-statistics) to force the
;;; recording of my-gc-statistics value in *Messages* buffer for later 
analysis.


(defun update-my-gc-statistics ()
  (let ((i 8))
    (progn
     ;;; very unlike Lisp
     (while (<= 0 i)
       (progn (aset my-gc-statistics (+ 1 i) (aref my-gc-statistics i))
             (setq i (- i 1) )))
     (aset my-gc-statistics 0
           (append (memory-use-counts) (list gc-elapsed gcs-done)))
     (message "%S" my-gc-statistics))))

(setq post-gc-hook 'update-my-gc-statistics)








Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39413; Package emacs. (Tue, 10 Aug 2021 16:10:03 GMT) Full text and rfc822 format available.

Message #43 received at 39413 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: chiaki-ishikawa-thunderbird-account <chiaki.ishikawa <at> ubin.jp>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 39413 <at> debbugs.gnu.org,
 Noam Postavsky <npostavs <at> gmail.com>
Subject: Re: bug#39413: 26.2; Emacs gets hung
Date: Tue, 10 Aug 2021 18:09:08 +0200
chiaki-ishikawa-thunderbird-account <chiaki.ishikawa <at> ubin.jp> writes:

> Stay tuned. I am really interested in WHERE/WHY this long pause
> occurs.: maybe Windows 10 memory allocation/ Virtual Box
> emulation/Linux kernel memory allocation, and AMD Ryzen CPU's cache
> replacement policy all add up.

This was a year ago -- are you still seeing these hangs, and if so, have
you confirmed that they don't happen because the machine (client or
host) is swapping?

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Added tag(s) moreinfo. Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Tue, 10 Aug 2021 16:10:03 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39413; Package emacs. (Wed, 11 Aug 2021 01:10:01 GMT) Full text and rfc822 format available.

Message #48 received at 39413 <at> debbugs.gnu.org (full text, mbox):

From: "ISHIKAWA,chiaki" <ishikawa <at> yk.rim.or.jp>
To: Lars Ingebrigtsen <larsi <at> gnus.org>,
 chiaki-ishikawa-thunderbird-account <chiaki.ishikawa <at> ubin.jp>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 39413 <at> debbugs.gnu.org,
 Noam Postavsky <npostavs <at> gmail.com>
Subject: Re: bug#39413: 26.2; Emacs gets hung
Date: Wed, 11 Aug 2021 10:08:51 +0900
On 2021/08/11 1:09, Lars Ingebrigtsen wrote:
> chiaki-ishikawa-thunderbird-account <chiaki.ishikawa <at> ubin.jp> writes:
>
>> Stay tuned. I am really interested in WHERE/WHY this long pause
>> occurs.: maybe Windows 10 memory allocation/ Virtual Box
>> emulation/Linux kernel memory allocation, and AMD Ryzen CPU's cache
>> replacement policy all add up.
> This was a year ago -- are you still seeing these hangs, and if so, have
> you confirmed that they don't happen because the machine (client or
> host) is swapping?

Hi,

Thank you for your follow-up.

I still think there is a problem with memory allocator in the later Emacs versions. I can't figure out exactly when this happens. I have been using Emacs in this setup for quite sometime and yet I had NOT experienced this strange long pause before.

Now, it is true that I do use much memory while running mozilla thunderbird compilation
WITHIN shell window which I open M-x shell.
But again, I used to do this for the last 8-9 years.

One thing I notice is that running compilation script of mozilla thunderbird
outputs a pattern of longish lines and then a few short lines very frequently.
Yes, they amount to a lot of memory, but I suspect that
they may cause a somewhat inefficient handling by string allocator.

I used the following post-GC-hook to record the behavior of emacs GC.

After the code snippet, I attach the output from such a call.

;;;

;;; GC behavior monitor
;;;
;;; Record some key information for GC in the GC post-hook so that I
;;; can learn something about the behavior and performance of GC with
;;; concrete data.

(setq my-gc-statistics (make-vector 30 nil))

;;; The element is
;;; (append (memory-use-counts) (list gc-elapsed gcs-done))
;;; Each time the following function is called, the
;;; elements in the array is shifted toward the end.
;;; Use (message "%S" my-gc-statistics) to force the
;;; recording of my-gc-statistics value in *Messages* buffer for later 
analysis.


(defun update-my-gc-statistics ()
  (let ((i 28))
    (progn
     ;;; very unlike Lisp
     (while (<= 0 i)
       (progn (aset my-gc-statistics (+ 1 i) (aref my-gc-statistics i))
             (setq i (- i 1) )))
     (aset my-gc-statistics 0
           (append (memory-use-counts) (list gc-elapsed gcs-done)))
     ;;; print the latest one last so that I can see the glimpse in the 
narrow
     ;;; output window.
     (message "%S\n%S" (current-time-string) (pp (reverse 
my-gc-statistics))))))

(setq post-gc-hook 'update-my-gc-statistics)

---
Normal behavior. Not much pause.

[(19051645 7445 43993805 36764 42521186 286669 1176042 13.581105925 320)
 (19117025 7446 44152747 36764 42547677 287685 1177518 13.624215366 321)
 (19174133 7469 44332450 36765 42585061 288281 1178955 13.667294932 322)
 (19238776 7470 44504363 36765 42620253 289199 1180384 13.714802148 323)
 (19303795 7484 44666173 36767 42649518 289786 1181923 13.756323136 324)
 (19365595 7485 44837589 36767 42676862 290659 1183423 13.807570787 325)
 (19425828 7486 45007513 36767 42712688 291410 1184879 13.847687952 326)
 (19482572 7488 45186533 36769 42753576 291969 1186503 13.905988177 327)
 (19539282 7489 45365763 36769 42782042 292534 1188002 13.947620008 328)
 (19610699 7526 45530194 36769 42832733 293078 1189542 13.98705147 329)
 (19677333 7536 45705015 36769 42861041 293694 1190941 14.041105235 330)
 (19736527 7537 45875148 36769 42896534 294340 1192316 14.082710146 331)
 (19799864 7538 46047037 36769 42922173 295201 1193738 14.126554401 332)
 (19865128 7542 46209994 36769 42949055 295876 1195133 14.167501197 333)
 (19924904 7564 46377456 36769 43008388 296531 1196605 14.207478738 334)
 (19986717 7674 46546221 36770 43035046 296928 1197955 14.249106813 335)
 (20046726 7675 46716423 36770 43060757 297558 1199338 14.302346074 336)
 (20107508 7678 46888051 36772 43097195 298290 1200743 14.342108851 337)
 (20172065 7680 47061015 36774 43126993 298770 1202285 14.383165327 338)
 (20235780 7681 47221594 36774 43153041 299635 1203754 14.423942963 339)
 (20294623 7682 47405064 36774 43202163 300434 1205260 14.473857509 340)
 (20353428 7692 47574958 36778 43250345 300773 1207056 14.528225012 341)
 (20427274 7778 47679351 36778 43337331 303690 1212437 14.570319341 342)
 (20501106 7941 47776024 36778 43443248 307419 1218097 14.610816267 343)
 (20566988 7996 47933001 36778 43490727 309071 1221003 14.675236804 344)
 (20637587 8078 48085242 36778 43532391 309493 1222405 14.716701768 345)
 (20697606 8127 48240788 36778 43601564 310907 1225370 14.759790056 346)
 (20762697 8186 48384470 36778 43646718 312881 1228282 14.799301233 347)
 (20816911 8199 48566522 36778 43672496 313533 1229651 14.863568335 348)
 (20885113 8221 48719120 36778 43710277 314237 1231072 14.904231536 349)]


Well, due to a build issue (corrupt repository caused by MERCURIAL bug.
https://bugzilla.mozilla.org/show_bug.cgi?id=1720302
I have not been able to do much build and debug work on mozilla TB for 
the last few weeks.

But when I do run the compilation, etc. I still see the strange 
pause/hang from time to time.
Only Emacs is affected. Other programs within virtualbox responds,.
Repainting of emacs does not happen.
Also, Windows 10 host is running smoothly. So I suspect Emacs is running 
CPU wildly when this happens. Yes, I have seen CPU usage go up using 
xosview.

There may be some unexpected inefficiency ( a coding error that may 
cause an extra loop over a gigabyte of memoyr?)

When I try to analyze the issue, it is always (well, it seems to me) 
related to string allocation.

I have no idea where to look for the issue. As I said, the particular 
pattern of allocation by shell-mode and a lot of output from compilation 
script (with short and long lines mixed)  may cause some strange 
inefficiency in memory access behavior.

Chiaki






Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39413; Package emacs. (Wed, 11 Aug 2021 06:08:01 GMT) Full text and rfc822 format available.

Message #51 received at 39413 <at> debbugs.gnu.org (full text, mbox):

From: "ISHIKAWA,chiaki" <ishikawa <at> yk.rim.or.jp>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: chiaki-ishikawa-thunderbird-account <chiaki.ishikawa <at> ubin.jp>,
 Eli Zaretskii <eliz <at> gnu.org>, 39413 <at> debbugs.gnu.org,
 Noam Postavsky <npostavs <at> gmail.com>
Subject: Re: bug#39413: 26.2; Emacs gets hung
Date: Wed, 11 Aug 2021 15:07:26 +0900
[Message part 1 (text/plain, inline)]
(Sorry I am working from home and I have used my office e-mail address 
and personal e-mail address interchangeably.)

Hi,

Good news. I could capture at least something that shows what happens 
when the Emacs seems to spend strangely large amount of time in GC and 
for all practical purposes it gets hung for a couple of minutes.

With the post-gc-hook, usually "Garbage collecting ..." is overwritten 
with the dump of array data that holds GC-related inforamtion. But if GC 
is spending long time, I see "Garbage collecting" and hung emacs window.

Just immediately after I wrote my previous e-mail, I noticed that the 
Mozilla repository corruption problem was finally solved. Official fix 
was announced. So I applied it and luckily I could compile and build 
mozilla thunderbird mail client for debugging again.

And yes, the strange emacs hung or rather "GC is spending too much CPU 
without visible change" happens again a few hours later, a few times.

I am attaching two screen dump.
- One is the screen capture of Linux desktop.
   In it, you can see xosview window.
   (I marked it with red pen.)
   You can see memory is not used  up. Only about 11 GB of memory is 
used out of 16 GB.
   No swapping should occur.
   Emacs is not responding at all since it is running wild in GC routine.
  Thus its window cannot be repainted. (Blue pen mark.)
   I had to move xosview window, etc. for better view and when the 
windows got repainted,
   Emacs window was not repainted. Other windows and programs such as 
terminals are responding without any issue at all.

   I could not measure the exact time Emacs was not responding. I think 
it was like More thant1 min 30 seconds more likely over 2 minutes.

   xosview has a CPU usage window which shows one CPU, CPU4  was 
occupied hundred percent.
   That was the CPU emacs uses.
   The full horizontal span of data window covers about 30 seconds. So 
you can see that GC (I think it is GC) runs more than 30 seconds by 
then. Actually, after more than 30 seconds, I drifted off to capture the 
screen, mark the captured image, etc. This more than 1 minute long GC  
is really puzzling.

- The other attachment  is Windows task manager memory information.
   Sorry the legends are in Japanese since I used Japanese Windows locale.
   You can see I still have like 4.5 G of free memory out of 32 GB real 
memory.
   Thus the host windows 10 is not feeling any problem and I can run 
browser, e-mail client, etc. without any issue at all while Emacs is 
running GC earnestly.

I am attaching the GC dump data collected by my post-gc-hook function.
But it does not reveal much.

I probably should include the start of GC time and end of GC time in the 
array of information I collect for each GC run. Then we will know for 
sure how much time GC ran.
But there does NOT seem to be a pre-gc-hook as opposed to post-gc-hook.
Hmm... I wonder how I record the start time of GC call.

Any suggestion on the last topic of how to record GC time at fine 
granularity will be appreciated.

Chiaki

On 2021/08/11 10:08, ISHIKAWA,chiaki wrote:
> On 2021/08/11 1:09, Lars Ingebrigtsen wrote:
>> chiaki-ishikawa-thunderbird-account <chiaki.ishikawa <at> ubin.jp> writes:
>>
>>> Stay tuned. I am really interested in WHERE/WHY this long pause
>>> occurs.: maybe Windows 10 memory allocation/ Virtual Box
>>> emulation/Linux kernel memory allocation, and AMD Ryzen CPU's cache
>>> replacement policy all add up.
>> This was a year ago -- are you still seeing these hangs, and if so, have
>> you confirmed that they don't happen because the machine (client or
>> host) is swapping?
>
> Hi,
>
> Thank you for your follow-up.
>
> I still think there is a problem with memory allocator in the later 
> Emacs versions. I can't figure out exactly when this happens. I have 
> been using Emacs in this setup for quite sometime and yet I had NOT 
> experienced this strange long pause before.
>
> Now, it is true that I do use much memory while running mozilla 
> thunderbird compilation
> WITHIN shell window which I open M-x shell.
> But again, I used to do this for the last 8-9 years.
>
> One thing I notice is that running compilation script of mozilla 
> thunderbird
> outputs a pattern of longish lines and then a few short lines very 
> frequently.
> Yes, they amount to a lot of memory, but I suspect that
> they may cause a somewhat inefficient handling by string allocator.
>
> I used the following post-GC-hook to record the behavior of emacs GC.
>
> After the code snippet, I attach the output from such a call.
>
> ;;;
>
> ;;; GC behavior monitor
> ;;;
> ;;; Record some key information for GC in the GC post-hook so that I
> ;;; can learn something about the behavior and performance of GC with
> ;;; concrete data.
>
> (setq my-gc-statistics (make-vector 30 nil))
>
> ;;; The element is
> ;;; (append (memory-use-counts) (list gc-elapsed gcs-done))
> ;;; Each time the following function is called, the
> ;;; elements in the array is shifted toward the end.
> ;;; Use (message "%S" my-gc-statistics) to force the
> ;;; recording of my-gc-statistics value in *Messages* buffer for later 
> analysis.
>
>
> (defun update-my-gc-statistics ()
>   (let ((i 28))
>     (progn
>      ;;; very unlike Lisp
>      (while (<= 0 i)
>        (progn (aset my-gc-statistics (+ 1 i) (aref my-gc-statistics i))
>              (setq i (- i 1) )))
>      (aset my-gc-statistics 0
>            (append (memory-use-counts) (list gc-elapsed gcs-done)))
>      ;;; print the latest one last so that I can see the glimpse in 
> the narrow
>      ;;; output window.
>      (message "%S\n%S" (current-time-string) (pp (reverse 
> my-gc-statistics))))))
>
> (setq post-gc-hook 'update-my-gc-statistics)
>
> ---
> Normal behavior. Not much pause.
>
> [(19051645 7445 43993805 36764 42521186 286669 1176042 13.581105925 320)
>  (19117025 7446 44152747 36764 42547677 287685 1177518 13.624215366 321)
>  (19174133 7469 44332450 36765 42585061 288281 1178955 13.667294932 322)
>  (19238776 7470 44504363 36765 42620253 289199 1180384 13.714802148 323)
>  (19303795 7484 44666173 36767 42649518 289786 1181923 13.756323136 324)
>  (19365595 7485 44837589 36767 42676862 290659 1183423 13.807570787 325)
>  (19425828 7486 45007513 36767 42712688 291410 1184879 13.847687952 326)
>  (19482572 7488 45186533 36769 42753576 291969 1186503 13.905988177 327)
>  (19539282 7489 45365763 36769 42782042 292534 1188002 13.947620008 328)
>  (19610699 7526 45530194 36769 42832733 293078 1189542 13.98705147 329)
>  (19677333 7536 45705015 36769 42861041 293694 1190941 14.041105235 330)
>  (19736527 7537 45875148 36769 42896534 294340 1192316 14.082710146 331)
>  (19799864 7538 46047037 36769 42922173 295201 1193738 14.126554401 332)
>  (19865128 7542 46209994 36769 42949055 295876 1195133 14.167501197 333)
>  (19924904 7564 46377456 36769 43008388 296531 1196605 14.207478738 334)
>  (19986717 7674 46546221 36770 43035046 296928 1197955 14.249106813 335)
>  (20046726 7675 46716423 36770 43060757 297558 1199338 14.302346074 336)
>  (20107508 7678 46888051 36772 43097195 298290 1200743 14.342108851 337)
>  (20172065 7680 47061015 36774 43126993 298770 1202285 14.383165327 338)
>  (20235780 7681 47221594 36774 43153041 299635 1203754 14.423942963 339)
>  (20294623 7682 47405064 36774 43202163 300434 1205260 14.473857509 340)
>  (20353428 7692 47574958 36778 43250345 300773 1207056 14.528225012 341)
>  (20427274 7778 47679351 36778 43337331 303690 1212437 14.570319341 342)
>  (20501106 7941 47776024 36778 43443248 307419 1218097 14.610816267 343)
>  (20566988 7996 47933001 36778 43490727 309071 1221003 14.675236804 344)
>  (20637587 8078 48085242 36778 43532391 309493 1222405 14.716701768 345)
>  (20697606 8127 48240788 36778 43601564 310907 1225370 14.759790056 346)
>  (20762697 8186 48384470 36778 43646718 312881 1228282 14.799301233 347)
>  (20816911 8199 48566522 36778 43672496 313533 1229651 14.863568335 348)
>  (20885113 8221 48719120 36778 43710277 314237 1231072 14.904231536 349)]
>
>
> Well, due to a build issue (corrupt repository caused by MERCURIAL bug.
> https://bugzilla.mozilla.org/show_bug.cgi?id=1720302
> I have not been able to do much build and debug work on mozilla TB for 
> the last few weeks.
>
> But when I do run the compilation, etc. I still see the strange 
> pause/hang from time to time.
> Only Emacs is affected. Other programs within virtualbox responds,.
> Repainting of emacs does not happen.
> Also, Windows 10 host is running smoothly. So I suspect Emacs is 
> running CPU wildly when this happens. Yes, I have seen CPU usage go up 
> using xosview.
>
> There may be some unexpected inefficiency ( a coding error that may 
> cause an extra loop over a gigabyte of memoyr?)
>
> When I try to analyze the issue, it is always (well, it seems to me) 
> related to string allocation.
>
> I have no idea where to look for the issue. As I said, the particular 
> pattern of allocation by shell-mode and a lot of output from 
> compilation script (with short and long lines mixed)  may cause some 
> strange inefficiency in memory access behavior.
>
> Chiaki
>
>
>

[emacs-hung-2.PNG (image/png, attachment)]
[memory-use.PNG (image/png, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39413; Package emacs. (Wed, 11 Aug 2021 07:23:02 GMT) Full text and rfc822 format available.

Message #54 received at 39413 <at> debbugs.gnu.org (full text, mbox):

From: "ISHIKAWA,chiaki" <ishikawa <at> yk.rim.or.jp>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: chiaki-ishikawa-thunderbird-account <chiaki.ishikawa <at> ubin.jp>,
 Eli Zaretskii <eliz <at> gnu.org>, 39413 <at> debbugs.gnu.org,
 Noam Postavsky <npostavs <at> gmail.com>
Subject: Re: bug#39413: 26.2; Emacs gets hung
Date: Wed, 11 Aug 2021 16:22:21 +0900
Hi,

Oops sorry, I forgot to attach the GC information dump when 2-min GC 
guru meditation time occurred.
Here it is. A copy from *Messages* buffer.
I am afraid that I had to back track to older messages and I believe the 
ones I copied here are related to the timing of the long GC.
But I could not exactly figure out which one.

Chiaki

<<< Type SPC or RET to bury the buffer list >>>
Mark set
[(388481492 12663 315852263 37859 252680715 12068535 8530128 
217.751094438 931)
 (390762053 12664 320601077 37859 254857509 12248048 8666125 
219.633925136 932)
 (393068244 12665 325413926 37859 257099914 12428854 8802846 
221.552147241 933)
 (402532085 12666 325423645 37859 257116197 12428863 8803027 
223.487605937 934)
 (411994601 12667 325425009 37859 257132482 12428865 8803207 
225.406804595 935)
 (414448248 12668 330538753 37859 259505987 12621465 8949175 
227.343266966 936)
 (424074389 12669 330548382 37859 259522285 12621478 8949356 
229.348980796 937)
 (433699244 12670 330549685 37859 259538585 12621481 8949536 
231.351762205 938)
 (436052588 12695 335564019 37860 262003952 12804062 9088033 
233.29986785 939)
 (437421791 12754 338433175 37860 263807816 12913522 9184089 
241.951760306 940)
 (438845632 12799 341601300 37860 265238759 13012817 9274435 
243.55520882 941)
 (440178606 12842 344865373 37860 266433618 13099621 9353887 
245.138839078 942)
 (441570556 12884 348132343 37860 267683757 13189878 9438753 
246.769836844 943)
 (442913777 12925 351504882 37860 268870526 13279030 9517477 
248.360581202 944)
 (444288336 12970 354793728 37860 270325980 13371631 9601867 
249.986431307 945)
 (445752850 13021 357867712 37860 271951417 13480694 9698259 
251.625818749 946)
 (447151863 13067 361600674 37860 273497660 13567563 9778153 
253.259859431 947)
 (448657655 13112 365067032 37860 275261240 13670086 9871747 
254.994917854 948)
 (450158914 13128 368163204 37860 276859198 13782657 9970119 
256.633972618 949)
 (451707980 13129 371342199 37860 278396622 13901166 10072262 
258.313592942 950)
 (453306224 13130 374620228 37860 279986655 14023233 10177572 
260.026878125 951)
 (464657341 13133 374630765 37860 280003882 14023235 10177761 
261.742461495 952)
 (476005819 13134 374631080 37860 280018058 14023235 10177809 
263.587127108 953)
 (477506723 13182 378534545 37862 281816769 14095140 10272953 
265.478689001 954)
 (478847002 13223 383581993 37862 283960864 14103107 10329518 
267.316126049 955)
 (479979175 13239 389369881 37862 285520586 14104493 10350952 
269.874493833 956)
 (481386726 13240 394330012 37862 287576346 14129844 10386458 
272.129021279 957)
 (483293142 13241 396849752 37862 290130666 14268679 10526833 
274.668402532 958)
 (485150904 13267 399531895 37862 292954354 14397053 10681026 
276.994701437 959)
 (520316838 13502 402388436 37862 368141808 16928818 10788359 
279.346871091 960)]

"Wed Aug 11 14:35:01 2021"
"[(388481492 12663 315852263 37859 252680715 12068535 8530128 
217.751094438 931)
 (390762053 12664 320601077 37859 254857509 12248048 8666125 
219.633925136 932)
 (393068244 12665 325413926 37859 257099914 12428854 8802846 
221.552147241 933)
 (402532085 12666 325423645 37859 257116197 12428863 8803027 
223.487605937 934)
 (411994601 12667 325425009 37859 257132482 12428865 8803207 
225.406804595 935)
 (414448248 12668 330538753 37859 259505987 12621465 8949175 
227.343266966 936)
 (424074389 12669 330548382 37859 259522285 12621478 8949356 
229.348980796 937)
 (433699244 12670 330549685 37859 259538585 12621481 8949536 
231.351762205 938)
 (436052588 12695 335564019 37860 262003952 12804062 9088033 
233.29986785 939)
 (437421791 12754 338433175 37860 263807816 12913522 9184089 
241.951760306 940)
 (438845632 12799 341601300 37860 265238759 13012817 9274435 
243.55520882 941)
 (440178606 12842 344865373 37860 266433618 13099621 9353887 
245.138839078 942)
 (441570556 12884 348132343 37860 267683757 13189878 9438753 
246.769836844 943)
 (442913777 12925 351504882 37860 268870526 13279030 9517477 
248.360581202 944)
 (444288336 12970 354793728 37860 270325980 13371631 9601867 
249.986431307 945)
 (445752850 13021 357867712 37860 271951417 13480694 9698259 
251.625818749 946)
 (447151863 13067 361600674 37860 273497660 13567563 9778153 
253.259859431 947)
 (448657655 13112 365067032 37860 275261240 13670086 9871747 
254.994917854 948)
 (450158914 13128 368163204 37860 276859198 13782657 9970119 
256.633972618 949)
 (451707980 13129 371342199 37860 278396622 13901166 10072262 
258.313592942 950)
 (453306224 13130 374620228 37860 279986655 14023233 10177572 
260.026878125 951)
 (464657341 13133 374630765 37860 280003882 14023235 10177761 
261.742461495 952)
 (476005819 13134 374631080 37860 280018058 14023235 10177809 
263.587127108 953)
 (477506723 13182 378534545 37862 281816769 14095140 10272953 
265.478689001 954)
 (478847002 13223 383581993 37862 283960864 14103107 10329518 
267.316126049 955)
 (479979175 13239 389369881 37862 285520586 14104493 10350952 
269.874493833 956)
 (481386726 13240 394330012 37862 287576346 14129844 10386458 
272.129021279 957)
 (483293142 13241 396849752 37862 290130666 14268679 10526833 
274.668402532 958)
 (485150904 13267 399531895 37862 292954354 14397053 10681026 
276.994701437 959)
 (520316838 13502 402388436 37862 368141808 16928818 10788359 
279.346871091 960)]
"
Mark set [2 times]
End of buffer
Beginning of buffer [2 times]
End of buffer
Beginning of buffer
End of buffer
<<< Type SPC or RET to bury the buffer list >>> [2 times]
Mark set
<<< Type SPC or RET to bury the buffer list >>>
Mark set [2 times]
ESC C-M-p is undefined
Mark set
Saved text from "
"Wed Aug 11 14:25:39 2021"
"[(377509160"
(New file)
Mark set [2 times]
Saving file /tmp/t.memory.txt...
Wrote /tmp/t.memory.txt
Making completion list...
Quit
[(390762053 12664 320601077 37859 254857509 12248048 8666125 
219.633925136 932)
 (393068244 12665 325413926 37859 257099914 12428854 8802846 
221.552147241 933)
 (402532085 12666 325423645 37859 257116197 12428863 8803027 
223.487605937 934)
 (411994601 12667 325425009 37859 257132482 12428865 8803207 
225.406804595 935)
 (414448248 12668 330538753 37859 259505987 12621465 8949175 
227.343266966 936)
 (424074389 12669 330548382 37859 259522285 12621478 8949356 
229.348980796 937)
 (433699244 12670 330549685 37859 259538585 12621481 8949536 
231.351762205 938)
 (436052588 12695 335564019 37860 262003952 12804062 9088033 
233.29986785 939)
 (437421791 12754 338433175 37860 263807816 12913522 9184089 
241.951760306 940)
 (438845632 12799 341601300 37860 265238759 13012817 9274435 
243.55520882 941)
 (440178606 12842 344865373 37860 266433618 13099621 9353887 
245.138839078 942)
 (441570556 12884 348132343 37860 267683757 13189878 9438753 
246.769836844 943)
 (442913777 12925 351504882 37860 268870526 13279030 9517477 
248.360581202 944)
 (444288336 12970 354793728 37860 270325980 13371631 9601867 
249.986431307 945)
 (445752850 13021 357867712 37860 271951417 13480694 9698259 
251.625818749 946)
 (447151863 13067 361600674 37860 273497660 13567563 9778153 
253.259859431 947)
 (448657655 13112 365067032 37860 275261240 13670086 9871747 
254.994917854 948)
 (450158914 13128 368163204 37860 276859198 13782657 9970119 
256.633972618 949)
 (451707980 13129 371342199 37860 278396622 13901166 10072262 
258.313592942 950)
 (453306224 13130 374620228 37860 279986655 14023233 10177572 
260.026878125 951)
 (464657341 13133 374630765 37860 280003882 14023235 10177761 
261.742461495 952)
 (476005819 13134 374631080 37860 280018058 14023235 10177809 
263.587127108 953)
 (477506723 13182 378534545 37862 281816769 14095140 10272953 
265.478689001 954)
 (478847002 13223 383581993 37862 283960864 14103107 10329518 
267.316126049 955)
 (479979175 13239 389369881 37862 285520586 14104493 10350952 
269.874493833 956)
 (481386726 13240 394330012 37862 287576346 14129844 10386458 
272.129021279 957)
 (483293142 13241 396849752 37862 290130666 14268679 10526833 
274.668402532 958)
 (485150904 13267 399531895 37862 292954354 14397053 10681026 
276.994701437 959)
 (520316838 13502 402388436 37862 368141808 16928818 10788359 
279.346871091 960)
 (520636475 13796 402726636 37868 370978420 16933589 10826355 
422.114721386 961)]

"Wed Aug 11 14:43:38 2021"
"[(390762053 12664 320601077 37859 254857509 12248048 8666125 
219.633925136 932)
 (393068244 12665 325413926 37859 257099914 12428854 8802846 
221.552147241 933)
 (402532085 12666 325423645 37859 257116197 12428863 8803027 
223.487605937 934)
 (411994601 12667 325425009 37859 257132482 12428865 8803207 
225.406804595 935)
 (414448248 12668 330538753 37859 259505987 12621465 8949175 
227.343266966 936)
 (424074389 12669 330548382 37859 259522285 12621478 8949356 
229.348980796 937)
 (433699244 12670 330549685 37859 259538585 12621481 8949536 
231.351762205 938)
 (436052588 12695 335564019 37860 262003952 12804062 9088033 
233.29986785 939)
 (437421791 12754 338433175 37860 263807816 12913522 9184089 
241.951760306 940)
 (438845632 12799 341601300 37860 265238759 13012817 9274435 
243.55520882 941)
 (440178606 12842 344865373 37860 266433618 13099621 9353887 
245.138839078 942)
 (441570556 12884 348132343 37860 267683757 13189878 9438753 
246.769836844 943)
 (442913777 12925 351504882 37860 268870526 13279030 9517477 
248.360581202 944)
 (444288336 12970 354793728 37860 270325980 13371631 9601867 
249.986431307 945)
 (445752850 13021 357867712 37860 271951417 13480694 9698259 
251.625818749 946)
 (447151863 13067 361600674 37860 273497660 13567563 9778153 
253.259859431 947)
 (448657655 13112 365067032 37860 275261240 13670086 9871747 
254.994917854 948)
 (450158914 13128 368163204 37860 276859198 13782657 9970119 
256.633972618 949)
 (451707980 13129 371342199 37860 278396622 13901166 10072262 
258.313592942 950)
 (453306224 13130 374620228 37860 279986655 14023233 10177572 
260.026878125 951)
 (464657341 13133 374630765 37860 280003882 14023235 10177761 
261.742461495 952)
 (476005819 13134 374631080 37860 280018058 14023235 10177809 
263.587127108 953)
 (477506723 13182 378534545 37862 281816769 14095140 10272953 
265.478689001 954)
 (478847002 13223 383581993 37862 283960864 14103107 10329518 
267.316126049 955)
 (479979175 13239 389369881 37862 285520586 14104493 10350952 
269.874493833 956)
 (481386726 13240 394330012 37862 287576346 14129844 10386458 
272.129021279 957)
 (483293142 13241 396849752 37862 290130666 14268679 10526833 
274.668402532 958)
 (485150904 13267 399531895 37862 292954354 14397053 10681026 
276.994701437 959)
 (520316838 13502 402388436 37862 368141808 16928818 10788359 
279.346871091 960)
 (520636475 13796 402726636 37868 370978420 16933589 10826355 
422.114721386 961)]
"
C-M-g is undefined
Quit
[(393068244 12665 325413926 37859 257099914 12428854 8802846 
221.552147241 933)
 (402532085 12666 325423645 37859 257116197 12428863 8803027 
223.487605937 934)
 (411994601 12667 325425009 37859 257132482 12428865 8803207 
225.406804595 935)
 (414448248 12668 330538753 37859 259505987 12621465 8949175 
227.343266966 936)
 (424074389 12669 330548382 37859 259522285 12621478 8949356 
229.348980796 937)
 (433699244 12670 330549685 37859 259538585 12621481 8949536 
231.351762205 938)
 (436052588 12695 335564019 37860 262003952 12804062 9088033 
233.29986785 939)
 (437421791 12754 338433175 37860 263807816 12913522 9184089 
241.951760306 940)
 (438845632 12799 341601300 37860 265238759 13012817 9274435 
243.55520882 941)
 (440178606 12842 344865373 37860 266433618 13099621 9353887 
245.138839078 942)
 (441570556 12884 348132343 37860 267683757 13189878 9438753 
246.769836844 943)
 (442913777 12925 351504882 37860 268870526 13279030 9517477 
248.360581202 944)
 (444288336 12970 354793728 37860 270325980 13371631 9601867 
249.986431307 945)
 (445752850 13021 357867712 37860 271951417 13480694 9698259 
251.625818749 946)
 (447151863 13067 361600674 37860 273497660 13567563 9778153 
253.259859431 947)
 (448657655 13112 365067032 37860 275261240 13670086 9871747 
254.994917854 948)
 (450158914 13128 368163204 37860 276859198 13782657 9970119 
256.633972618 949)
 (451707980 13129 371342199 37860 278396622 13901166 10072262 
258.313592942 950)
 (453306224 13130 374620228 37860 279986655 14023233 10177572 
260.026878125 951)
 (464657341 13133 374630765 37860 280003882 14023235 10177761 
261.742461495 952)
 (476005819 13134 374631080 37860 280018058 14023235 10177809 
263.587127108 953)
 (477506723 13182 378534545 37862 281816769 14095140 10272953 
265.478689001 954)
 (478847002 13223 383581993 37862 283960864 14103107 10329518 
267.316126049 955)
 (479979175 13239 389369881 37862 285520586 14104493 10350952 
269.874493833 956)
 (481386726 13240 394330012 37862 287576346 14129844 10386458 
272.129021279 957)
 (483293142 13241 396849752 37862 290130666 14268679 10526833 
274.668402532 958)
 (485150904 13267 399531895 37862 292954354 14397053 10681026 
276.994701437 959)
 (520316838 13502 402388436 37862 368141808 16928818 10788359 
279.346871091 960)
 (520636475 13796 402726636 37868 370978420 16933589 10826355 
422.114721386 961)
 (520702163 13807 402879841 38129 371108294 16933644 10830730 
620.747356527 962)]

"Wed Aug 11 14:58:00 2021"
"[(393068244 12665 325413926 37859 257099914 12428854 8802846 
221.552147241 933)
 (402532085 12666 325423645 37859 257116197 12428863 8803027 
223.487605937 934)
 (411994601 12667 325425009 37859 257132482 12428865 8803207 
225.406804595 935)
 (414448248 12668 330538753 37859 259505987 12621465 8949175 
227.343266966 936)
 (424074389 12669 330548382 37859 259522285 12621478 8949356 
229.348980796 937)
 (433699244 12670 330549685 37859 259538585 12621481 8949536 
231.351762205 938)
 (436052588 12695 335564019 37860 262003952 12804062 9088033 
233.29986785 939)
 (437421791 12754 338433175 37860 263807816 12913522 9184089 
241.951760306 940)
 (438845632 12799 341601300 37860 265238759 13012817 9274435 
243.55520882 941)
 (440178606 12842 344865373 37860 266433618 13099621 9353887 
245.138839078 942)
 (441570556 12884 348132343 37860 267683757 13189878 9438753 
246.769836844 943)
 (442913777 12925 351504882 37860 268870526 13279030 9517477 
248.360581202 944)
 (444288336 12970 354793728 37860 270325980 13371631 9601867 
249.986431307 945)
 (445752850 13021 357867712 37860 271951417 13480694 9698259 
251.625818749 946)

On 2021/08/11 15:07, ISHIKAWA,chiaki wrote:
> (Sorry I am working from home and I have used my office e-mail address 
> and personal e-mail address interchangeably.)
>
> Hi,
>
> Good news. I could capture at least something that shows what happens 
> when the Emacs seems to spend strangely large amount of time in GC and 
> for all practical purposes it gets hung for a couple of minutes.
>
> With the post-gc-hook, usually "Garbage collecting ..." is overwritten 
> with the dump of array data that holds GC-related inforamtion. But if 
> GC is spending long time, I see "Garbage collecting" and hung emacs 
> window.
>
> Just immediately after I wrote my previous e-mail, I noticed that the 
> Mozilla repository corruption problem was finally solved. Official fix 
> was announced. So I applied it and luckily I could compile and build 
> mozilla thunderbird mail client for debugging again.
>
> And yes, the strange emacs hung or rather "GC is spending too much CPU 
> without visible change" happens again a few hours later, a few times.
>
> I am attaching two screen dump.
> - One is the screen capture of Linux desktop.
>    In it, you can see xosview window.
>    (I marked it with red pen.)
>    You can see memory is not used  up. Only about 11 GB of memory is 
> used out of 16 GB.
>    No swapping should occur.
>    Emacs is not responding at all since it is running wild in GC routine.
>   Thus its window cannot be repainted. (Blue pen mark.)
>    I had to move xosview window, etc. for better view and when the 
> windows got repainted,
>    Emacs window was not repainted. Other windows and programs such as 
> terminals are responding without any issue at all.
>
>    I could not measure the exact time Emacs was not responding. I 
> think it was like More thant1 min 30 seconds more likely over 2 minutes.
>
>    xosview has a CPU usage window which shows one CPU, CPU4  was 
> occupied hundred percent.
>    That was the CPU emacs uses.
>    The full horizontal span of data window covers about 30 seconds. So 
> you can see that GC (I think it is GC) runs more than 30 seconds by 
> then. Actually, after more than 30 seconds, I drifted off to capture 
> the screen, mark the captured image, etc. This more than 1 minute long 
> GC  is really puzzling.
>
> - The other attachment  is Windows task manager memory information.
>    Sorry the legends are in Japanese since I used Japanese Windows 
> locale.
>    You can see I still have like 4.5 G of free memory out of 32 GB 
> real memory.
>    Thus the host windows 10 is not feeling any problem and I can run 
> browser, e-mail client, etc. without any issue at all while Emacs is 
> running GC earnestly.
>
> I am attaching the GC dump data collected by my post-gc-hook function.
> But it does not reveal much.
>
> I probably should include the start of GC time and end of GC time in 
> the array of information I collect for each GC run. Then we will know 
> for sure how much time GC ran.
> But there does NOT seem to be a pre-gc-hook as opposed to post-gc-hook.
> Hmm... I wonder how I record the start time of GC call.
>
> Any suggestion on the last topic of how to record GC time at fine 
> granularity will be appreciated.
>
> Chiaki
>
> On 2021/08/11 10:08, ISHIKAWA,chiaki wrote:
>> On 2021/08/11 1:09, Lars Ingebrigtsen wrote:
>>> chiaki-ishikawa-thunderbird-account <chiaki.ishikawa <at> ubin.jp> writes:
>>>
>>>> Stay tuned. I am really interested in WHERE/WHY this long pause
>>>> occurs.: maybe Windows 10 memory allocation/ Virtual Box
>>>> emulation/Linux kernel memory allocation, and AMD Ryzen CPU's cache
>>>> replacement policy all add up.
>>> This was a year ago -- are you still seeing these hangs, and if so, 
>>> have
>>> you confirmed that they don't happen because the machine (client or
>>> host) is swapping?
>>
>> Hi,
>>
>> Thank you for your follow-up.
>>
>> I still think there is a problem with memory allocator in the later 
>> Emacs versions. I can't figure out exactly when this happens. I have 
>> been using Emacs in this setup for quite sometime and yet I had NOT 
>> experienced this strange long pause before.
>>
>> Now, it is true that I do use much memory while running mozilla 
>> thunderbird compilation
>> WITHIN shell window which I open M-x shell.
>> But again, I used to do this for the last 8-9 years.
>>
>> One thing I notice is that running compilation script of mozilla 
>> thunderbird
>> outputs a pattern of longish lines and then a few short lines very 
>> frequently.
>> Yes, they amount to a lot of memory, but I suspect that
>> they may cause a somewhat inefficient handling by string allocator.
>>
>> I used the following post-GC-hook to record the behavior of emacs GC.
>>
>> After the code snippet, I attach the output from such a call.
>>
>> ;;;
>>
>> ;;; GC behavior monitor
>> ;;;
>> ;;; Record some key information for GC in the GC post-hook so that I
>> ;;; can learn something about the behavior and performance of GC with
>> ;;; concrete data.
>>
>> (setq my-gc-statistics (make-vector 30 nil))
>>
>> ;;; The element is
>> ;;; (append (memory-use-counts) (list gc-elapsed gcs-done))
>> ;;; Each time the following function is called, the
>> ;;; elements in the array is shifted toward the end.
>> ;;; Use (message "%S" my-gc-statistics) to force the
>> ;;; recording of my-gc-statistics value in *Messages* buffer for 
>> later analysis.
>>
>>
>> (defun update-my-gc-statistics ()
>>   (let ((i 28))
>>     (progn
>>      ;;; very unlike Lisp
>>      (while (<= 0 i)
>>        (progn (aset my-gc-statistics (+ 1 i) (aref my-gc-statistics i))
>>              (setq i (- i 1) )))
>>      (aset my-gc-statistics 0
>>            (append (memory-use-counts) (list gc-elapsed gcs-done)))
>>      ;;; print the latest one last so that I can see the glimpse in 
>> the narrow
>>      ;;; output window.
>>      (message "%S\n%S" (current-time-string) (pp (reverse 
>> my-gc-statistics))))))
>>
>> (setq post-gc-hook 'update-my-gc-statistics)
>>
>> ---
>> Normal behavior. Not much pause.
>>
>> [(19051645 7445 43993805 36764 42521186 286669 1176042 13.581105925 320)
>>  (19117025 7446 44152747 36764 42547677 287685 1177518 13.624215366 321)
>>  (19174133 7469 44332450 36765 42585061 288281 1178955 13.667294932 322)
>>  (19238776 7470 44504363 36765 42620253 289199 1180384 13.714802148 323)
>>  (19303795 7484 44666173 36767 42649518 289786 1181923 13.756323136 324)
>>  (19365595 7485 44837589 36767 42676862 290659 1183423 13.807570787 325)
>>  (19425828 7486 45007513 36767 42712688 291410 1184879 13.847687952 326)
>>  (19482572 7488 45186533 36769 42753576 291969 1186503 13.905988177 327)
>>  (19539282 7489 45365763 36769 42782042 292534 1188002 13.947620008 328)
>>  (19610699 7526 45530194 36769 42832733 293078 1189542 13.98705147 329)
>>  (19677333 7536 45705015 36769 42861041 293694 1190941 14.041105235 330)
>>  (19736527 7537 45875148 36769 42896534 294340 1192316 14.082710146 331)
>>  (19799864 7538 46047037 36769 42922173 295201 1193738 14.126554401 332)
>>  (19865128 7542 46209994 36769 42949055 295876 1195133 14.167501197 333)
>>  (19924904 7564 46377456 36769 43008388 296531 1196605 14.207478738 334)
>>  (19986717 7674 46546221 36770 43035046 296928 1197955 14.249106813 335)
>>  (20046726 7675 46716423 36770 43060757 297558 1199338 14.302346074 336)
>>  (20107508 7678 46888051 36772 43097195 298290 1200743 14.342108851 337)
>>  (20172065 7680 47061015 36774 43126993 298770 1202285 14.383165327 338)
>>  (20235780 7681 47221594 36774 43153041 299635 1203754 14.423942963 339)
>>  (20294623 7682 47405064 36774 43202163 300434 1205260 14.473857509 340)
>>  (20353428 7692 47574958 36778 43250345 300773 1207056 14.528225012 341)
>>  (20427274 7778 47679351 36778 43337331 303690 1212437 14.570319341 342)
>>  (20501106 7941 47776024 36778 43443248 307419 1218097 14.610816267 343)
>>  (20566988 7996 47933001 36778 43490727 309071 1221003 14.675236804 344)
>>  (20637587 8078 48085242 36778 43532391 309493 1222405 14.716701768 345)
>>  (20697606 8127 48240788 36778 43601564 310907 1225370 14.759790056 346)
>>  (20762697 8186 48384470 36778 43646718 312881 1228282 14.799301233 347)
>>  (20816911 8199 48566522 36778 43672496 313533 1229651 14.863568335 348)
>>  (20885113 8221 48719120 36778 43710277 314237 1231072 14.904231536 
>> 349)]
>>
>>
>> Well, due to a build issue (corrupt repository caused by MERCURIAL bug.
>> https://bugzilla.mozilla.org/show_bug.cgi?id=1720302
>> I have not been able to do much build and debug work on mozilla TB 
>> for the last few weeks.
>>
>> But when I do run the compilation, etc. I still see the strange 
>> pause/hang from time to time.
>> Only Emacs is affected. Other programs within virtualbox responds,.
>> Repainting of emacs does not happen.
>> Also, Windows 10 host is running smoothly. So I suspect Emacs is 
>> running CPU wildly when this happens. Yes, I have seen CPU usage go 
>> up using xosview.
>>
>> There may be some unexpected inefficiency ( a coding error that may 
>> cause an extra loop over a gigabyte of memoyr?)
>>
>> When I try to analyze the issue, it is always (well, it seems to me) 
>> related to string allocation.
>>
>> I have no idea where to look for the issue. As I said, the particular 
>> pattern of allocation by shell-mode and a lot of output from 
>> compilation script (with short and long lines mixed)  may cause some 
>> strange inefficiency in memory access behavior.
>>
>> Chiaki
>>
>>
>>
>





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39413; Package emacs. (Wed, 11 Aug 2021 11:15:02 GMT) Full text and rfc822 format available.

Message #57 received at 39413 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: "ISHIKAWA,chiaki" <ishikawa <at> yk.rim.or.jp>
Cc: chiaki-ishikawa-thunderbird-account <chiaki.ishikawa <at> ubin.jp>,
 Eli Zaretskii <eliz <at> gnu.org>, 39413 <at> debbugs.gnu.org,
 Noam Postavsky <npostavs <at> gmail.com>
Subject: Re: bug#39413: 26.2; Emacs gets hung
Date: Wed, 11 Aug 2021 13:14:25 +0200
"ISHIKAWA,chiaki" <ishikawa <at> yk.rim.or.jp> writes:

>    You can see memory is not used  up. Only about 11 GB of memory is
> used out of 16 GB.
>    No swapping should occur.

The OS may well decide to swap in this situation, so try checking that
instead.  In Linux, "vmstat 1" should tell you whether swapping is
occurring, but I have no idea how to determine that in Windows.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39413; Package emacs. (Wed, 11 Aug 2021 11:48:01 GMT) Full text and rfc822 format available.

Message #60 received at 39413 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: "ISHIKAWA,chiaki" <ishikawa <at> yk.rim.or.jp>
Cc: chiaki.ishikawa <at> ubin.jp, larsi <at> gnus.org, 39413 <at> debbugs.gnu.org,
 npostavs <at> gmail.com
Subject: Re: bug#39413: 26.2; Emacs gets hung
Date: Wed, 11 Aug 2021 14:47:38 +0300
> Cc: Noam Postavsky <npostavs <at> gmail.com>, Eli Zaretskii <eliz <at> gnu.org>,
>  39413 <at> debbugs.gnu.org,
>  chiaki-ishikawa-thunderbird-account <chiaki.ishikawa <at> ubin.jp>
> From: "ISHIKAWA,chiaki" <ishikawa <at> yk.rim.or.jp>
> Date: Wed, 11 Aug 2021 15:07:26 +0900
> 
> Good news. I could capture at least something that shows what happens 
> when the Emacs seems to spend strangely large amount of time in GC and 
> for all practical purposes it gets hung for a couple of minutes.

I don't see this in the data you sent:

> [(388481492 12663 315852263 37859 252680715 12068535 8530128 
> 217.751094438 931)
>   (390762053 12664 320601077 37859 254857509 12248048 8666125 
> 219.633925136 932)
>   (393068244 12665 325413926 37859 257099914 12428854 8802846 
> 221.552147241 933)
>   (402532085 12666 325423645 37859 257116197 12428863 8803027 
> 223.487605937 934)
>   (411994601 12667 325425009 37859 257132482 12428865 8803207 
> 225.406804595 935)
>   (414448248 12668 330538753 37859 259505987 12621465 8949175 
> 227.343266966 936)
>   (424074389 12669 330548382 37859 259522285 12621478 8949356 
> 229.348980796 937)
>   (433699244 12670 330549685 37859 259538585 12621481 8949536 
> 231.351762205 938)
>   (436052588 12695 335564019 37860 262003952 12804062 9088033 
> 233.29986785 939)
>   (437421791 12754 338433175 37860 263807816 12913522 9184089 
> 241.951760306 940)
>   (438845632 12799 341601300 37860 265238759 13012817 9274435 
> 243.55520882 941)
>   (440178606 12842 344865373 37860 266433618 13099621 9353887 
> 245.138839078 942)
>   (441570556 12884 348132343 37860 267683757 13189878 9438753 
> 246.769836844 943)
>   (442913777 12925 351504882 37860 268870526 13279030 9517477 
> 248.360581202 944)
>   (444288336 12970 354793728 37860 270325980 13371631 9601867 
> 249.986431307 945)
>   (445752850 13021 357867712 37860 271951417 13480694 9698259 
> 251.625818749 946)
>   (447151863 13067 361600674 37860 273497660 13567563 9778153 
> 253.259859431 947)
>   (448657655 13112 365067032 37860 275261240 13670086 9871747 
> 254.994917854 948)
>   (450158914 13128 368163204 37860 276859198 13782657 9970119 
> 256.633972618 949)
>   (451707980 13129 371342199 37860 278396622 13901166 10072262 
> 258.313592942 950)
>   (453306224 13130 374620228 37860 279986655 14023233 10177572 
> 260.026878125 951)
>   (464657341 13133 374630765 37860 280003882 14023235 10177761 
> 261.742461495 952)
>   (476005819 13134 374631080 37860 280018058 14023235 10177809 
> 263.587127108 953)
>   (477506723 13182 378534545 37862 281816769 14095140 10272953 
> 265.478689001 954)
>   (478847002 13223 383581993 37862 283960864 14103107 10329518 
> 267.316126049 955)
>   (479979175 13239 389369881 37862 285520586 14104493 10350952 
> 269.874493833 956)
>   (481386726 13240 394330012 37862 287576346 14129844 10386458 
> 272.129021279 957)
>   (483293142 13241 396849752 37862 290130666 14268679 10526833 
> 274.668402532 958)
>   (485150904 13267 399531895 37862 292954354 14397053 10681026 
> 276.994701437 959)
>   (520316838 13502 402388436 37862 368141808 16928818 10788359 
> 279.346871091 960)]

Look at the 2 last numbers in each sample: the count of GC cycles goes
up by one, and the elapsed accumulated time spent in GC goes up by 2.5
sec for each GC cycle.  That's nowhere near the couple of minutes
value you report.  Could it be that the Emacs process doesn't get CPU
resources?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39413; Package emacs. (Wed, 11 Aug 2021 13:43:02 GMT) Full text and rfc822 format available.

Message #63 received at 39413 <at> debbugs.gnu.org (full text, mbox):

From: "ISHIKAWA,chiaki" <ishikawa <at> yk.rim.or.jp>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: chiaki.ishikawa <at> ubin.jp, larsi <at> gnus.org, 39413 <at> debbugs.gnu.org,
 npostavs <at> gmail.com
Subject: Re: bug#39413: 26.2; Emacs gets hung
Date: Wed, 11 Aug 2021 22:41:56 +0900
Hi,

Sorry, I might have picked up too early dumps in my previous e-mail.

What about the dump below, which was in the *Messages* buffer following 
dump I sent.
I am afraid I picked up the wrong dump in my previous e-mail.

Now, is "gc-elapsed" the elapsed time spent in the GC routine?
If so, the two samples at the end of the dump here certainly look 
suspicious.
Actually, I saw a very long GC and after that,
I manually invoked GC by M-X garbage-collect and to my surprise THAT 
also turned out to take a long time to finish. I was hoping the first GC 
somehow aligned memory structure in a neat order, but that did not seem 
to be the case.

But as far as the amount of memory consumed does not differ much from 
other short-lived GC.
Or for whatever the reason, a small amount of memory increase may have
an unexpected effect on this CPU, AMD Ryzen 7 1700 Eight-Core Processor 
3.00 GHz?
Again, this is very unlikely. I have not seen this slow GC until about a 
year ago which forced me to file a bug report. Also, the attached dump 
shows that there are two tandem GC which took a long time, but other GCs 
that follow are short-lived.

I am still suspecting a certain  form of inefficiency due to allocation 
pattern of strings [I am assuming that strings are allocated and then 
added to the shell buffer, maybe I am wrong. It may be a long repetitoon 
of realloc?]  to store  longish output lines and then short lines 
produced by the particular compile/build process into the Shell buffer 
(M-X shell) in which I ran largish compilation/link process for mozilla 
thunderbird.

This could be purely due to excessive cache misses caused by particular 
memory access patterns during GC to handle my particular memory 
allocation patterns. But I have no idea.
I may have to find out the cache miss #.

Chiaki

Here is the suspect:  the last two lines look very suspicious.

C-M-g is undefined
Quit
[(393068244 12665 325413926 37859 257099914 12428854 8802846 
221.552147241 933)
 (402532085 12666 325423645 37859 257116197 12428863 8803027 
223.487605937 934)
 (411994601 12667 325425009 37859 257132482 12428865 8803207 
225.406804595 935)
 (414448248 12668 330538753 37859 259505987 12621465 8949175 
227.343266966 936)
 (424074389 12669 330548382 37859 259522285 12621478 8949356 
229.348980796 937)
 (433699244 12670 330549685 37859 259538585 12621481 8949536 
231.351762205 938)
 (436052588 12695 335564019 37860 262003952 12804062 9088033 
233.29986785 939)
 (437421791 12754 338433175 37860 263807816 12913522 9184089 
241.951760306 940)
 (438845632 12799 341601300 37860 265238759 13012817 9274435 
243.55520882 941)
 (440178606 12842 344865373 37860 266433618 13099621 9353887 
245.138839078 942)
 (441570556 12884 348132343 37860 267683757 13189878 9438753 
246.769836844 943)
 (442913777 12925 351504882 37860 268870526 13279030 9517477 
248.360581202 944)
 (444288336 12970 354793728 37860 270325980 13371631 9601867 
249.986431307 945)
 (445752850 13021 357867712 37860 271951417 13480694 9698259 
251.625818749 946)
 (447151863 13067 361600674 37860 273497660 13567563 9778153 
253.259859431 947)
 (448657655 13112 365067032 37860 275261240 13670086 9871747 
254.994917854 948)
 (450158914 13128 368163204 37860 276859198 13782657 9970119 
256.633972618 949)
 (451707980 13129 371342199 37860 278396622 13901166 10072262 
258.313592942 950)
 (453306224 13130 374620228 37860 279986655 14023233 10177572 
260.026878125 951)
 (464657341 13133 374630765 37860 280003882 14023235 10177761 
261.742461495 952)
 (476005819 13134 374631080 37860 280018058 14023235 10177809 
263.587127108 953)
 (477506723 13182 378534545 37862 281816769 14095140 10272953 
265.478689001 954)
 (478847002 13223 383581993 37862 283960864 14103107 10329518 
267.316126049 955)
 (479979175 13239 389369881 37862 285520586 14104493 10350952 
269.874493833 956)
 (481386726 13240 394330012 37862 287576346 14129844 10386458 
272.129021279 957)
 (483293142 13241 396849752 37862 290130666 14268679 10526833 
274.668402532 958)
 (485150904 13267 399531895 37862 292954354 14397053 10681026 
276.994701437 959)
 (520316838 13502 402388436 37862 368141808 16928818 10788359 
279.346871091 960)
 (520636475 13796 402726636 37868 370978420 16933589 10826355 
422.114721386 961)  <=
 (520702163 13807 402879841 38129 371108294 16933644 10830730 
620.747356527 962)] <=

and a couple of hours later.: I did not do a lot during this time.

Mark set
Auto-saving...done
Saving file /home/ishikawa/Dropbox/todo-memo-home.txt...
Wrote /home/ishikawa/Dropbox/todo-memo-home.txt
Saving file /home/ishikawa/Dropbox/todo-memo-home.txt...
Wrote /home/ishikawa/Dropbox/todo-memo-home.txt
Saving file /home/ishikawa/Dropbox/todo-memo-home.txt...
[(436052588 12695 335564019 37860 262003952 12804062 9088033 
233.29986785 939)
 (437421791 12754 338433175 37860 263807816 12913522 9184089 
241.951760306 940)
 (438845632 12799 341601300 37860 265238759 13012817 9274435 
243.55520882 941)
 (440178606 12842 344865373 37860 266433618 13099621 9353887 
245.138839078 942)
 (441570556 12884 348132343 37860 267683757 13189878 9438753 
246.769836844 943)
 (442913777 12925 351504882 37860 268870526 13279030 9517477 
248.360581202 944)
 (444288336 12970 354793728 37860 270325980 13371631 9601867 
249.986431307 945)
 (445752850 13021 357867712 37860 271951417 13480694 9698259 
251.625818749 946)
 (447151863 13067 361600674 37860 273497660 13567563 9778153 
253.259859431 947)
 (448657655 13112 365067032 37860 275261240 13670086 9871747 
254.994917854 948)
 (450158914 13128 368163204 37860 276859198 13782657 9970119 
256.633972618 949)
 (451707980 13129 371342199 37860 278396622 13901166 10072262 
258.313592942 950)
 (453306224 13130 374620228 37860 279986655 14023233 10177572 
260.026878125 951)
 (464657341 13133 374630765 37860 280003882 14023235 10177761 
261.742461495 952)
 (476005819 13134 374631080 37860 280018058 14023235 10177809 
263.587127108 953)
 (477506723 13182 378534545 37862 281816769 14095140 10272953 
265.478689001 954)
 (478847002 13223 383581993 37862 283960864 14103107 10329518 
267.316126049 955)
 (479979175 13239 389369881 37862 285520586 14104493 10350952 
269.874493833 956)
 (481386726 13240 394330012 37862 287576346 14129844 10386458 
272.129021279 957)
 (483293142 13241 396849752 37862 290130666 14268679 10526833 
274.668402532 958)
 (485150904 13267 399531895 37862 292954354 14397053 10681026 
276.994701437 959)
 (520316838 13502 402388436 37862 368141808 16928818 10788359 
279.346871091 960)
 (520636475 13796 402726636 37868 370978420 16933589 10826355 
422.114721386 961) <=
 (520702163 13807 402879841 38129 371108294 16933644 10830730 
620.747356527 962) <=
 (520801350 13855 402954201 38197 371207595 16935253 10837272 
620.862288325 963)
 (520860834 13895 403094379 38257 371361488 16936922 10844603 
620.954004521 964)
 (520960277 14286 403145781 38257 371718646 16938526 10849771 
621.0046367 965)
 (521023418 14444 403221395 38259 372414726 16940131 10857263 
621.049176975 966)
 (521071758 14508 403310401 38260 373187220 16940139 10866947 
621.097972876 967)
 (521097606 14509 403398321 38260 374994158 16940147 10867991 
621.156022718 968)]

On 2021/08/11 20:47, Eli Zaretskii wrote:
>> Cc: Noam Postavsky <npostavs <at> gmail.com>, Eli Zaretskii <eliz <at> gnu.org>,
>>   39413 <at> debbugs.gnu.org,
>>   chiaki-ishikawa-thunderbird-account <chiaki.ishikawa <at> ubin.jp>
>> From: "ISHIKAWA,chiaki" <ishikawa <at> yk.rim.or.jp>
>> Date: Wed, 11 Aug 2021 15:07:26 +0900
>>
>> Good news. I could capture at least something that shows what happens
>> when the Emacs seems to spend strangely large amount of time in GC and
>> for all practical purposes it gets hung for a couple of minutes.
> I don't see this in the data you sent:
>
>> [(388481492 12663 315852263 37859 252680715 12068535 8530128
>> 217.751094438 931)
>>    (390762053 12664 320601077 37859 254857509 12248048 8666125
>> 219.633925136 932)
>>    (393068244 12665 325413926 37859 257099914 12428854 8802846
>> 221.552147241 933)
>>    (402532085 12666 325423645 37859 257116197 12428863 8803027
>> 223.487605937 934)
>>    (411994601 12667 325425009 37859 257132482 12428865 8803207
>> 225.406804595 935)
>>    (414448248 12668 330538753 37859 259505987 12621465 8949175
>> 227.343266966 936)
>>    (424074389 12669 330548382 37859 259522285 12621478 8949356
>> 229.348980796 937)
>>    (433699244 12670 330549685 37859 259538585 12621481 8949536
>> 231.351762205 938)
>>    (436052588 12695 335564019 37860 262003952 12804062 9088033
>> 233.29986785 939)
>>    (437421791 12754 338433175 37860 263807816 12913522 9184089
>> 241.951760306 940)
>>    (438845632 12799 341601300 37860 265238759 13012817 9274435
>> 243.55520882 941)
>>    (440178606 12842 344865373 37860 266433618 13099621 9353887
>> 245.138839078 942)
>>    (441570556 12884 348132343 37860 267683757 13189878 9438753
>> 246.769836844 943)
>>    (442913777 12925 351504882 37860 268870526 13279030 9517477
>> 248.360581202 944)
>>    (444288336 12970 354793728 37860 270325980 13371631 9601867
>> 249.986431307 945)
>>    (445752850 13021 357867712 37860 271951417 13480694 9698259
>> 251.625818749 946)
>>    (447151863 13067 361600674 37860 273497660 13567563 9778153
>> 253.259859431 947)
>>    (448657655 13112 365067032 37860 275261240 13670086 9871747
>> 254.994917854 948)
>>    (450158914 13128 368163204 37860 276859198 13782657 9970119
>> 256.633972618 949)
>>    (451707980 13129 371342199 37860 278396622 13901166 10072262
>> 258.313592942 950)
>>    (453306224 13130 374620228 37860 279986655 14023233 10177572
>> 260.026878125 951)
>>    (464657341 13133 374630765 37860 280003882 14023235 10177761
>> 261.742461495 952)
>>    (476005819 13134 374631080 37860 280018058 14023235 10177809
>> 263.587127108 953)
>>    (477506723 13182 378534545 37862 281816769 14095140 10272953
>> 265.478689001 954)
>>    (478847002 13223 383581993 37862 283960864 14103107 10329518
>> 267.316126049 955)
>>    (479979175 13239 389369881 37862 285520586 14104493 10350952
>> 269.874493833 956)
>>    (481386726 13240 394330012 37862 287576346 14129844 10386458
>> 272.129021279 957)
>>    (483293142 13241 396849752 37862 290130666 14268679 10526833
>> 274.668402532 958)
>>    (485150904 13267 399531895 37862 292954354 14397053 10681026
>> 276.994701437 959)
>>    (520316838 13502 402388436 37862 368141808 16928818 10788359
>> 279.346871091 960)]
> Look at the 2 last numbers in each sample: the count of GC cycles goes
> up by one, and the elapsed accumulated time spent in GC goes up by 2.5
> sec for each GC cycle.  That's nowhere near the couple of minutes
> value you report.  Could it be that the Emacs process doesn't get CPU
> resources?
>





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39413; Package emacs. (Wed, 11 Aug 2021 13:47:01 GMT) Full text and rfc822 format available.

Message #66 received at 39413 <at> debbugs.gnu.org (full text, mbox):

From: "ISHIKAWA,chiaki" <ishikawa <at> yk.rim.or.jp>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: chiaki-ishikawa-thunderbird-account <chiaki.ishikawa <at> ubin.jp>,
 Eli Zaretskii <eliz <at> gnu.org>, 39413 <at> debbugs.gnu.org,
 Noam Postavsky <npostavs <at> gmail.com>
Subject: Re: bug#39413: 26.2; Emacs gets hung
Date: Wed, 11 Aug 2021 22:46:01 +0900
On 2021/08/11 20:14, Lars Ingebrigtsen wrote:
> "ISHIKAWA,chiaki" <ishikawa <at> yk.rim.or.jp> writes:
>
>>     You can see memory is not used  up. Only about 11 GB of memory is
>> used out of 16 GB.
>>     No swapping should occur.
> The OS may well decide to swap in this situation, so try checking that
> instead.  In Linux, "vmstat 1" should tell you whether swapping is
> occurring, but I have no idea how to determine that in Windows.
>
Hi,

Thank you for your tips.

OK, I would run vmstat 1 next time this strange GC hung occurs.

As for windows, task monitor has  a peformance monitor subsystem and in it,
I find "hard faults" in memory monitor function.
This may not be purely swap, but at least it may give me an indication 
if excessive faults are happening, etc.

Thank you again for your helpful hints.

Chiaki.






Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39413; Package emacs. (Thu, 12 Aug 2021 07:11:02 GMT) Full text and rfc822 format available.

Message #69 received at 39413 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: "ISHIKAWA,chiaki" <ishikawa <at> yk.rim.or.jp>
Cc: chiaki.ishikawa <at> ubin.jp, larsi <at> gnus.org, 39413 <at> debbugs.gnu.org,
 npostavs <at> gmail.com
Subject: Re: bug#39413: 26.2; Emacs gets hung
Date: Thu, 12 Aug 2021 10:10:25 +0300
> Cc: larsi <at> gnus.org, npostavs <at> gmail.com, 39413 <at> debbugs.gnu.org,
>  chiaki.ishikawa <at> ubin.jp
> From: "ISHIKAWA,chiaki" <ishikawa <at> yk.rim.or.jp>
> Date: Wed, 11 Aug 2021 22:41:56 +0900
> 
> What about the dump below, which was in the *Messages* buffer following 
> dump I sent.
> I am afraid I picked up the wrong dump in my previous e-mail.
> 
> Now, is "gc-elapsed" the elapsed time spent in the GC routine?
> If so, the two samples at the end of the dump here certainly look 
> suspicious.

Yes, they show a couple of GCs that took about 150 to 200 sec.  That
is indeed excessive, but gc-elapsed reports the wall-clock time, not
the CPU time, so if Emacs wasn't getting processing resources to run,
which could happen on a stressed system, that could explain the long
GC times.  Note that other GC cycles, both before and after the 2
problematic ones, took much less time.  And the statistics returned by
memory-use-counts doesn't show any excessive number of Lisp objects;
in fact, my current Emacs session, which runs for 9.5 days, has an
order of magnitude more objects than what you show, and yet GC cycles
here are barely perceptible.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39413; Package emacs. (Sun, 15 Aug 2021 23:28:01 GMT) Full text and rfc822 format available.

Message #72 received at 39413 <at> debbugs.gnu.org (full text, mbox):

From: "ISHIKAWA,chiaki" <ishikawa <at> yk.rim.or.jp>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: chiaki.ishikawa <at> ubin.jp, larsi <at> gnus.org, 39413 <at> debbugs.gnu.org,
 npostavs <at> gmail.com
Subject: Re: bug#39413: 26.2; Emacs gets hung
Date: Mon, 16 Aug 2021 08:27:01 +0900
[Message part 1 (text/plain, inline)]
On 2021/08/12 16:10, Eli Zaretskii wrote:
>> Cc: larsi <at> gnus.org, npostavs <at> gmail.com, 39413 <at> debbugs.gnu.org,
>>   chiaki.ishikawa <at> ubin.jp
>> From: "ISHIKAWA,chiaki" <ishikawa <at> yk.rim.or.jp>
>> Date: Wed, 11 Aug 2021 22:41:56 +0900
>>
>> What about the dump below, which was in the *Messages* buffer following
>> dump I sent.
>> I am afraid I picked up the wrong dump in my previous e-mail.
>>
>> Now, is "gc-elapsed" the elapsed time spent in the GC routine?
>> If so, the two samples at the end of the dump here certainly look
>> suspicious.
> Yes, they show a couple of GCs that took about 150 to 200 sec.  That
> is indeed excessive, but gc-elapsed reports the wall-clock time, not
> the CPU time, so if Emacs wasn't getting processing resources to run,
> which could happen on a stressed system, that could explain the long
> GC times.  Note that other GC cycles, both before and after the 2
> problematic ones, took much less time.  And the statistics returned by
> memory-use-counts doesn't show any excessive number of Lisp objects;
> in fact, my current Emacs session, which runs for 9.5 days, has an
> order of magnitude more objects than what you show, and yet GC cycles
> here are barely perceptible.
>
Hi,

Thank you for your reply.

So gc-elapsed is the elapsed wall-clock time.
Is there a function to obtain CPU time in Emacs Lisp?

On BSD, many years ago, I wrote a short routine that uses
resource-related system call. I am using Emacs under linux now, and I 
assume similar calls exist under linux. I checked the manual.
I will use getrusage(2), which seems to be BSD-compatible, to obtain the 
resource usage statistics and include the numbers in my post-gc-hook result.

OTOH, I have always assumed that Emacs DID eats up CPU while longish GC. 
You may recall the graphics that showed the usage of a CPU core going 
all the way to 1000 in my previous e-mail (xosview). I am attaching it 
again.
I used the graphics to show the memory usage (hand-drawn red circle) in 
my previous e-mai.

But if you look slightly above, you will notice CPU core #4 is used 100% 
(!).
That is emacs process. No other process is running that earnestly at 
that moment.
Thanks to many cores (actually I allocate 7 cores), the linux desktop 
was responding swiftly and xosview was recording system resource usage 
as the graphics shows. Only Emacs process could not respond to window 
repaint request from X when I re-arranged the windows layout, and that 
is why the Emacs window is garbled (hand-drawn blue circle was placed at 
the upper-left of the un-repainted area that remained black).

I believe Emacs indeed was eating up the CPU time in a busy loop or 
deeply nested calls. (maybe a loop that could be handled in a for loop 
with a relatively shallow stack was handled by a recursive call that can 
be nested as deep as the list length?)

Numbers from getrusage() will give us additional insight on this matter 
hopefully.

Thank you again for all your tips.


Chiaki


PS: I found profiler-cpu-* functions, but I don't think it is wise to 
run them during GC since they seem to allocate vector tables. However, 
taking a snapshot of strack trace every now and then during GC seems 
attractive for my investigation to figure out WHERE in GC, the excessive 
time is spent.

[emacs-hung-2.PNG (image/png, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39413; Package emacs. (Mon, 16 Aug 2021 00:21:01 GMT) Full text and rfc822 format available.

Message #75 received at 39413 <at> debbugs.gnu.org (full text, mbox):

From: "ISHIKAWA,chiaki" <ishikawa <at> yk.rim.or.jp>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: chiaki.ishikawa <at> ubin.jp, larsi <at> gnus.org, 39413 <at> debbugs.gnu.org,
 npostavs <at> gmail.com
Subject: Re: bug#39413: 26.2; Emacs gets hung
Date: Mon, 16 Aug 2021 09:20:40 +0900
On 2021/08/16 8:27, ISHIKAWA,chiaki wrote:
> On 2021/08/12 16:10, Eli Zaretskii wrote:
>>> Cc: larsi <at> gnus.org, npostavs <at> gmail.com, 39413 <at> debbugs.gnu.org,
>>>   chiaki.ishikawa <at> ubin.jp
>>> From: "ISHIKAWA,chiaki" <ishikawa <at> yk.rim.or.jp>
>>> Date: Wed, 11 Aug 2021 22:41:56 +0900
>>>
>>> What about the dump below, which was in the *Messages* buffer following
>>> dump I sent.
>>> I am afraid I picked up the wrong dump in my previous e-mail.
>>>
>>> Now, is "gc-elapsed" the elapsed time spent in the GC routine?
>>> If so, the two samples at the end of the dump here certainly look
>>> suspicious.
>> Yes, they show a couple of GCs that took about 150 to 200 sec. That
>> is indeed excessive, but gc-elapsed reports the wall-clock time, not
>> the CPU time, so if Emacs wasn't getting processing resources to run,
>> which could happen on a stressed system, that could explain the long
>> GC times.  Note that other GC cycles, both before and after the 2
>> problematic ones, took much less time.  And the statistics returned by
>> memory-use-counts doesn't show any excessive number of Lisp objects;
>> in fact, my current Emacs session, which runs for 9.5 days, has an
>> order of magnitude more objects than what you show, and yet GC cycles
>> here are barely perceptible.
>>
> Hi,
>
> Thank you for your reply.
>
> So gc-elapsed is the elapsed wall-clock time.
> Is there a function to obtain CPU time in Emacs Lisp?
>
> On BSD, many years ago, I wrote a short routine that uses
> resource-related system call. I am using Emacs under linux now, and I 
> assume similar calls exist under linux. I checked the manual.
> I will use getrusage(2), which seems to be BSD-compatible, to obtain 
> the resource usage statistics and include the numbers in my 
> post-gc-hook result.
>
> OTOH, I have always assumed that Emacs DID eats up CPU while longish 
> GC. You may recall the graphics that showed the usage of a CPU core 
> going all the way to 1000 in my previous e-mail (xosview). I am 
> attaching it again.
> I used the graphics to show the memory usage (hand-drawn red circle) 
> in my previous e-mai.
>
> But if you look slightly above, you will notice CPU core #4 is used 
> 100% (!).
> That is emacs process. No other process is running that earnestly at 
> that moment.
> Thanks to many cores (actually I allocate 7 cores), the linux desktop 
> was responding swiftly and xosview was recording system resource usage 
> as the graphics shows. Only Emacs process could not respond to window 
> repaint request from X when I re-arranged the windows layout, and that 
> is why the Emacs window is garbled (hand-drawn blue circle was placed 
> at the upper-left of the un-repainted area that remained black).
>
> I believe Emacs indeed was eating up the CPU time in a busy loop or 
> deeply nested calls. (maybe a loop that could be handled in a for loop 
> with a relatively shallow stack was handled by a recursive call that 
> can be nested as deep as the list length?)
>
> Numbers from getrusage() will give us additional insight on this 
> matter hopefully.
>
> Thank you again for all your tips.
>
>
> Chiaki
>
>
> PS: I found profiler-cpu-* functions, but I don't think it is wise to 
> run them during GC since they seem to allocate vector tables. However, 
> taking a snapshot of strack trace every now and then during GC seems 
> attractive for my investigation to figure out WHERE in GC, the 
> excessive time is spent.
>
Aha, I realized that Emacs configure has a snipet to check for the 
presence of getrusage system call.
Now I know there is a function in src/sysdep.c called 
get-internal-run-time which would be useful for my purpose except it 
calls make_lisp_time(make_timespec(secs, usecs * 1000)) at the end.
I want to obtain precise time spent by GC by doing the following.
   geturasge(RUSAGE_SELF, &usage_before_GC);

   GC

   geturasge(RUSAGE_SELF, &usage_after_GC);

obtain the CPU time usage difference between two rusage structure
to know the CPU time spent by GC.

Well, get-interla-run-time is useful for obtain the AFTER-GC time, but I 
am not sure if we can call get-internal-run-time within the 
garbage-collect, at the very first line of its call due to the use of 
make_lisp_time.

Well, I guess I have to create a function to record the CPU usage at the 
start of GC and then before exiting GC, calculate the difference and put 
it into a variable (sys time and usr time added for simplicity).

I think I will basically follow how Vgc_elapsed is updated at the latter 
part of garbage_collection().
I think I will return a vector of numbers from getrusage().
If you have any suggestions about how the return value/variable should 
look like, please let me know.

If I can find a reasonably portable way to obtain cache miss and cache 
hit numbers of a single core process like Emacs, I may want to add it 
just in case.

TIA.

Chiaki







Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39413; Package emacs. (Mon, 16 Aug 2021 11:36:02 GMT) Full text and rfc822 format available.

Message #78 received at 39413 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: "ISHIKAWA,chiaki" <ishikawa <at> yk.rim.or.jp>
Cc: chiaki.ishikawa <at> ubin.jp, larsi <at> gnus.org, 39413 <at> debbugs.gnu.org,
 npostavs <at> gmail.com
Subject: Re: bug#39413: 26.2; Emacs gets hung
Date: Mon, 16 Aug 2021 14:35:29 +0300
> Cc: larsi <at> gnus.org, npostavs <at> gmail.com, 39413 <at> debbugs.gnu.org,
>  chiaki.ishikawa <at> ubin.jp
> From: "ISHIKAWA,chiaki" <ishikawa <at> yk.rim.or.jp>
> Date: Mon, 16 Aug 2021 08:27:01 +0900
> 
> But if you look slightly above, you will notice CPU core #4 is used 100% 
> (!).
> That is emacs process. No other process is running that earnestly at 
> that moment.

If we believe everything these tools show us, maybe.  But those tools
suffer from the same problem Emacs does on a busy system: the tool
itself might not get CPU to update its data, so you actually see a
snapshot of what it saw last time it did get CPU.

> PS: I found profiler-cpu-* functions, but I don't think it is wise to 
> run them during GC since they seem to allocate vector tables. However, 
> taking a snapshot of strack trace every now and then during GC seems 
> attractive for my investigation to figure out WHERE in GC, the excessive 
> time is spent.

Emacs built-in profiling will not help us here, because it cannot
profile below Lisp primitives.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39413; Package emacs. (Fri, 20 Aug 2021 01:57:01 GMT) Full text and rfc822 format available.

Message #81 received at 39413 <at> debbugs.gnu.org (full text, mbox):

From: "ISHIKAWA,chiaki" <ishikawa <at> yk.rim.or.jp>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: chiaki.ishikawa <at> ubin.jp, larsi <at> gnus.org, 39413 <at> debbugs.gnu.org,
 npostavs <at> gmail.com
Subject: Re: bug#39413: 26.2; Emacs gets hung
Date: Fri, 20 Aug 2021 10:56:35 +0900
On 2021/08/16 20:35, Eli Zaretskii wrote:
>> Cc: larsi <at> gnus.org, npostavs <at> gmail.com, 39413 <at> debbugs.gnu.org,
>>   chiaki.ishikawa <at> ubin.jp
>> From: "ISHIKAWA,chiaki" <ishikawa <at> yk.rim.or.jp>
>> Date: Mon, 16 Aug 2021 08:27:01 +0900
>>
>> But if you look slightly above, you will notice CPU core #4 is used 100%
>> (!).
>> That is emacs process. No other process is running that earnestly at
>> that moment.
> If we believe everything these tools show us, maybe.  But those tools
> suffer from the same problem Emacs does on a busy system: the tool
> itself might not get CPU to update its data, so you actually see a
> snapshot of what it saw last time it did get CPU.
>
>> PS: I found profiler-cpu-* functions, but I don't think it is wise to
>> run them during GC since they seem to allocate vector tables. However,
>> taking a snapshot of strack trace every now and then during GC seems
>> attractive for my investigation to figure out WHERE in GC, the excessive
>> time is spent.
> Emacs built-in profiling will not help us here, because it cannot
> profile below Lisp primitives.
>

Very tough.
I will see what I can do and report back if anything interesting shows up.

(The last time I tried to see where the time was spent, GC was busy 
reclaiming |cons| cells.
May not mean much.

I can only recall the old paper and in an 16 GB environment (including 
OS),  we may see some bad behavior.
"


 Address/memory management for a gigantic LISP environment or, GC
 considered harmful

"
by Jon L. White

https://dl.acm.org/doi/abs/10.1145/800087.802797





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39413; Package emacs. (Fri, 17 Sep 2021 14:50:01 GMT) Full text and rfc822 format available.

Message #84 received at 39413 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: "ISHIKAWA,chiaki" <ishikawa <at> yk.rim.or.jp>
Cc: chiaki.ishikawa <at> ubin.jp, Eli Zaretskii <eliz <at> gnu.org>,
 39413 <at> debbugs.gnu.org, npostavs <at> gmail.com
Subject: Re: bug#39413: 26.2; Emacs gets hung
Date: Fri, 17 Sep 2021 16:49:22 +0200
"ISHIKAWA,chiaki" <ishikawa <at> yk.rim.or.jp> writes:

> I will see what I can do and report back if anything interesting shows up.

This was a month ago -- have you made any further progress?  I think it
really sounds like the OS (the client or the host) isn't giving any
resources to Emacs (i.e., the OS is swapping or something like that), so
that is isn't an Emacs issue at all.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39413; Package emacs. (Mon, 18 Oct 2021 09:01:02 GMT) Full text and rfc822 format available.

Message #87 received at 39413 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: "ISHIKAWA,chiaki" <ishikawa <at> yk.rim.or.jp>
Cc: chiaki.ishikawa <at> ubin.jp, Eli Zaretskii <eliz <at> gnu.org>,
 39413 <at> debbugs.gnu.org, npostavs <at> gmail.com
Subject: Re: bug#39413: 26.2; Emacs gets hung
Date: Mon, 18 Oct 2021 11:00:17 +0200
Lars Ingebrigtsen <larsi <at> gnus.org> writes:

> This was a month ago -- have you made any further progress?  I think it
> really sounds like the OS (the client or the host) isn't giving any
> resources to Emacs (i.e., the OS is swapping or something like that), so
> that is isn't an Emacs issue at all.

This was a month ago and there was no response, so I'm assuming that
this analysis is correct, and I'm closing this bug report.  If there's
anything more to do here, please respond to the debbugs address and
we'll reopen.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Added tag(s) notabug. Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Mon, 18 Oct 2021 09:01:02 GMT) Full text and rfc822 format available.

bug closed, send any further explanations to 39413 <at> debbugs.gnu.org and chiaki-ishikawa-thunderbird-account <chiaki.ishikawa <at> ubin.jp> Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Mon, 18 Oct 2021 09:01:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39413; Package emacs. (Tue, 19 Oct 2021 07:05:02 GMT) Full text and rfc822 format available.

Message #94 received at 39413 <at> debbugs.gnu.org (full text, mbox):

From: "ISHIKAWA,chiaki" <ishikawa <at> yk.rim.or.jp>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: chiaki.ishikawa <at> ubin.jp, Eli Zaretskii <eliz <at> gnu.org>,
 39413 <at> debbugs.gnu.org, npostavs <at> gmail.com
Subject: Re: bug#39413: 26.2; Emacs gets hung
Date: Tue, 19 Oct 2021 16:04:40 +0900
On 2021/10/18 18:00, Lars Ingebrigtsen wrote:
> Lars Ingebrigtsen <larsi <at> gnus.org> writes:
>
>> This was a month ago -- have you made any further progress?  I think it
>> really sounds like the OS (the client or the host) isn't giving any
>> resources to Emacs (i.e., the OS is swapping or something like that), so
>> that is isn't an Emacs issue at all.
> This was a month ago and there was no response, so I'm assuming that
> this analysis is correct, and I'm closing this bug report.  If there's
> anything more to do here, please respond to the debbugs address and
> we'll reopen.
>
Thanks.

I have observed a few hungs ( a minute or two) and still could not 
figure out where the problem is.
I still have a feeling that the memory allocator in the last few years 
may not be great for really large amount of short string allocations (to 
accommodate the output from thunderbird compilation/build process within 
eshell buffer, and then releasing them altogether at once (via 
erase-buffer to clear the shell interaction in the buffer, or deleting 
the buffer itself.). This began happening a few years ago in one version 
of Emacs.

But for now, I can't make reasonable progress. (All I see while I insert 
a debug break point is call to gc sweep mark process and that may not 
really reflect the deep/detailed cause. Once I find a smoking gun, I 
will reopen it.
(It could be even a virtualbox issue on AMD CPU if I stretch my 
imagination although we would have heard something about the large 
memory process on AMD CPU by now.)

Thank you again for your attention on this matter.

Chiaki






bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 16 Nov 2021 12:24:05 GMT) Full text and rfc822 format available.

This bug report was last modified 2 years and 133 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.