GNU logs - #27234, boring messages


Message sent to bug-guile@HIDDEN:


X-Loop: help-debbugs@HIDDEN
Subject: bug#27234: Hang in GC, inf loop while walking frame pointers
Resent-From: Linas Vepstas <linasvepstas@HIDDEN>
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-guile@HIDDEN
Resent-Date: Sun, 04 Jun 2017 14:59:01 +0000
Resent-Message-ID: <handler.27234.B.149658828914303 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: report 27234
X-GNU-PR-Package: guile
X-GNU-PR-Keywords: 
To: 27234 <at> debbugs.gnu.org
X-Debbugs-Original-To: bug-guile@HIDDEN
Reply-To: linasvepstas@HIDDEN
Received: via spool by submit <at> debbugs.gnu.org id=B.149658828914303
          (code B ref -1); Sun, 04 Jun 2017 14:59:01 +0000
Received: (at submit) by debbugs.gnu.org; 4 Jun 2017 14:58:09 +0000
Received: from localhost ([127.0.0.1]:55940 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1dHWyl-0003iV-44
	for submit <at> debbugs.gnu.org; Sun, 04 Jun 2017 10:58:09 -0400
Received: from eggs.gnu.org ([208.118.235.92]:34576)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <linasvepstas@HIDDEN>) id 1dHWyi-0003i0-HI
 for submit <at> debbugs.gnu.org; Sun, 04 Jun 2017 10:58:06 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from <linasvepstas@HIDDEN>) id 1dHWyX-0006Qk-Qj
 for submit <at> debbugs.gnu.org; Sun, 04 Jun 2017 10:57:59 -0400
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org
X-Spam-Level: 
X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50,FREEMAIL_FROM,
 HTML_MESSAGE,T_DKIM_INVALID autolearn=disabled version=3.3.2
Received: from lists.gnu.org ([2001:4830:134:3::11]:49184)
 by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32)
 (Exim 4.71) (envelope-from <linasvepstas@HIDDEN>)
 id 1dHWyX-0006QX-Mn
 for submit <at> debbugs.gnu.org; Sun, 04 Jun 2017 10:57:53 -0400
Received: from eggs.gnu.org ([2001:4830:134:3::10]:53752)
 by lists.gnu.org with esmtp (Exim 4.71)
 (envelope-from <linasvepstas@HIDDEN>) id 1dHWyS-0001Ec-0z
 for bug-guile@HIDDEN; Sun, 04 Jun 2017 10:57:53 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from <linasvepstas@HIDDEN>) id 1dHWyM-0006IZ-Gc
 for bug-guile@HIDDEN; Sun, 04 Jun 2017 10:57:48 -0400
Received: from mail-lf0-x22e.google.com ([2a00:1450:4010:c07::22e]:34158)
 by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16)
 (Exim 4.71) (envelope-from <linasvepstas@HIDDEN>)
 id 1dHWyL-0006Fm-SF
 for bug-guile@HIDDEN; Sun, 04 Jun 2017 10:57:42 -0400
Received: by mail-lf0-x22e.google.com with SMTP id v20so28785472lfa.1
 for <bug-guile@HIDDEN>; Sun, 04 Jun 2017 07:57:39 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=mime-version:reply-to:from:date:message-id:subject:to;
 bh=fDowtPjzhOCoDpRB3nbwureBQp9WCeOO4DHNio7/poM=;
 b=PjF4HNEygq4jxAHQ3acXy2rdo/AJySclymxcP9U+GU8xK8VsWiNgxPeKVG2OAsO8j7
 CFjFJ8E8C0YLJ7cpbwvB8e4fKqh3XtlGKt0ZxkBhdULX16j3SUbjOUa2EMUlCTlMZOoW
 HKOMjYh6C9n97H+Xctq9sO2y0VQ+nxIuxcpFDZ0433tADDIWBQos6g61c2oM6CtLKJb9
 ctLjtVlfZ8vPQqRcyIRNBEwECB9/L3dnutCqek2rIhYsBV06PeMtaKJX3v/D9UWYWBfy
 lSZKpBqFkW2jnFioCL3KzeKCpF+JtQmXwemBVGPpBEbiOacJa9vUbMfUyK2Whf8CIdoQ
 vQOg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:mime-version:reply-to:from:date:message-id
 :subject:to;
 bh=fDowtPjzhOCoDpRB3nbwureBQp9WCeOO4DHNio7/poM=;
 b=ja9UeYKg8wV/+4VnywTBHaWkyUko/sUHly9gTpAUYq+aZC0tJ3i6PSJNBddWQ5GqNs
 YHIgt+P1TIMVpbF0Ubul7NBdwDKkt73Z6qQX+TsYfK+DFrzsF28qz2Trbm89tEHNknkG
 xWid2SlkXTIBh4Wt+2njhk1MZOjddBWSBqSwkHgMh2h/GGBu+3PoRGm98LNuKgz7HTNe
 ao5X2OGeAGwvJo5Rfiug6YKG2dIlE/+NasZy1C/1Rp7dT8nLcG74Fhm044uT9xSVX6zA
 HJr0bIQLCKKNJFSeE1Wk5CYhQ1Amcsq2wB+2AYdXGMMGu4DpaBxVv0brtmSu30BczO63
 yI3w==
X-Gm-Message-State: AODbwcCNiBkiPqcp26u1/krl06spKbtEXMtLhR8K77aPC/S634IjmfP5
 iyrekAUpCY5/DKtRCOXOtTn/fEpNODDL
X-Received: by 10.46.14.10 with SMTP id 10mr5021674ljo.25.1496588257697; Sun,
 04 Jun 2017 07:57:37 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.25.157.70 with HTTP; Sun, 4 Jun 2017 07:57:17 -0700 (PDT)
From: Linas Vepstas <linasvepstas@HIDDEN>
Date: Sun, 4 Jun 2017 09:57:17 -0500
Message-ID: <CAHrUA361VtW9ZOgEbxTYsDi3K7WY4YVaGA=YAK3VQKSVs8Bc_g@HIDDEN>
Content-Type: multipart/alternative; boundary="f403045ea6d679a79b0551239b33"
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not
 recognized.
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x
X-Received-From: 2001:4830:134:3::11
X-Spam-Score: -4.0 (----)
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -4.0 (----)

--f403045ea6d679a79b0551239b33
Content-Type: text/plain; charset="UTF-8"

what: guile-2.2-stable, from git.

I've got a large, complex, heavily multi-threaded guile program that
hangs during garbage collection; usually after running for half a day.
It hangs in a tight loop in scm_i_vm_mark_stack, spinning at 100% of CPU.

This is due to the for-loop line fp = SCM_FRAME_DYNAMIC_LINK (fp))
at libguile/vm.c line 975 failing to advance the frame pointer.
There's no "obvious" corruption in the stack; it simply looks like
the frame was incompletely set up, and so incrementing to the next
fp does not go anywhere.

I have recompiled guile with VM_ENABLE_ASSERTIONS and am trying to
reproduce the bug now.  The rest of this email is a record of a long
debug session isolating the problem, and showing that, overall, the
thread and stack data look more-or-less correct and uncorrupoted,
except for the inability to walk forward in the frame.

-- linas

(gdb) bt
#0  scm_i_vm_mark_stack (vp=0x755c1bd0, mark_stack_ptr=0x7f3e9b783f40,
    mark_stack_limit=0x7f3e9b793eb0) at ../../libguile/vm.c:1011
#1  0x00007f3e9db8835e in GC_mark_from (mark_stack_top=0x7f3e9b783ee0,
    mark_stack_top@entry=0x7f3e9b783f00,
    mark_stack=mark_stack@entry=0x7f3e9b783eb0,
    mark_stack_limit=mark_stack_limit@entry=0x7f3e9b793eb0) at ../mark.c:772
#2  0x00007f3e9db8897e in GC_do_local_mark (local_mark_stack=0x7f3e9b783eb0,
    local_top=0x7f3e9b783f00) at ../mark.c:1037
#3  0x00007f3e9db88b98 in GC_mark_local (
    local_mark_stack=local_mark_stack@entry=0x7f3e9b783eb0, id=id@entry=4)
    at ../mark.c:1170
#4  0x00007f3e9db88eaa in GC_help_marker (my_mark_no=my_mark_no@entry=80003)
    at ../mark.c:1238
#5  0x00007f3e9db92e3c in GC_mark_thread (id=<optimized out>)
    at ../pthread_support.c:380
#6  0x00007f3e9e2ae6ba in start_thread (arg=0x7f3e9b794700)
    at pthread_create.c:333
#7  0x00007f3e9dfdd82d in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

(gdb) step

#0  find_slot_map (cache=0x7f3e9b783ab0, ip=0x1) at
../../libguile/vm.c:935
935       if (cache->entries[slot].ip == ip)
(gdb) print slot
$1 = 0
(gdb) print cache->entries[slot].ip
$2 = (scm_t_uint32 *) 0x1
(gdb) print cache->entries[slot].map
$4 = (const scm_t_uint8 *) 0x0
(gdb) step
scm_i_vm_mark_stack (vp=0x755c1bd0, mark_stack_ptr=0x7f3e9b783f40,
    mark_stack_limit=0x7f3e9b793eb0) at ../../libguile/vm.c:1011
1011        slot_map = find_slot_map (SCM_FRAME_RETURN_ADDRESS (fp),
&cache);
(gdb) print fp
$5 = (union scm_vm_stack_element *) 0x7f3e94cdee38

#define SCM_FRAME_RETURN_ADDRESS(fp)    ((fp)[0].as_ip)
(gdb) print (fp)[0].as_ip
$6 = (scm_t_uint32 *) 0x1

OK that looks weird ... is this corrupted ?? but whatever,
because the returned slot_map is never used ... because ...

(gdb) step
scm_i_vm_mark_stack (vp=0x755c1bd0, mark_stack_ptr=0x7f3e9b783f40,
    mark_stack_limit=0x7f3e9b793eb0) at ../../libguile/vm.c:975
975            fp = SCM_FRAME_DYNAMIC_LINK (fp))
(gdb) print fp
$7 = (union scm_vm_stack_element *) 0x7f3e94cdee38

frames.h:#define SCM_FRAME_DYNAMIC_LINK(fp)      ((fp) + (fp)[1].as_uint)
(gdb) print (fp)[1].as_uint
$8 = 0

OK, that seems bad, because now fp never advances, it just repeats
over and over with this same value.

(gdb)
979           for (slot = nlocals - 1; sp < fp; sp++, slot--)
(gdb) print nlocals
$10 = -2
(gdb) print sp
$11 = (union scm_vm_stack_element *) 0x7f3e94cdee48
#define SCM_FRAME_NUM_LOCALS(fp, sp)    ((fp) - (sp))
(gdb) print fp
$12 = (union scm_vm_stack_element *) 0x7f3e94cdee38
(gdb) print  ((fp) - (sp))
$13 = -2
Ohh .. its not -16 because its -2 * sizeof (union scm_vm_stack_element *)
so that's OK.
So for loops is skipped, it should go to.
      sp = SCM_FRAME_PREVIOUS_SP (fp);
frames.h:#define SCM_FRAME_PREVIOUS_SP(fp)    ((fp) + 2)

and so now it loops around and repeats.
(gdb) print cache
$19 = {entries = {{ip = 0x1, map = 0x0}, {ip = 0x0,
      map = 0x0} <repeats 23 times>, {ip = 0x1a1b5e0, map = 0x0}, {ip =
0x0,
      map = 0x0}, {ip = 0x0, map = 0x0}, {ip = 0x0, map = 0x0}, {ip = 0x0,
      map = 0x0}, {ip = 0x0, map = 0x0}, {ip = 0x0, map = 0x0}, {ip = 0x0,
      map = 0x0}}}
(gdb) print &cache
$20 = (struct slot_map_cache *) 0x7f3e9b783ab0

and so the loop repeates forever, because
fp = SCM_FRAME_DYNAMIC_LINK (fp) never advances fp, because
(fp)[1].as_uint  is zero.

So where is fp pointing to?  recall fp == 0x7f3e94cdee38

(gdb) x/20x 0x7f3e94cdee00
0x7f3e94cdee00: 0x0000000000000904      0x000000000000200c
0x7f3e94cdee10: 0x0000000001a1b7e0      0x0000000001a1b5e0
0x7f3e94cdee20: 0x0000000000000004      0x000000000000200c
0x7f3e94cdee30: 0x0000000000169bd6      0x0000000000000001
0x7f3e94cdee40: 0x0000000000000000      0x000000000192acd0
0x7f3e94cdee50: 0x0000000001fa4bc0      0x0000000001a1c6d0
0x7f3e94cdee60: 0x0000000044507950      0x0000000000000002
0x7f3e94cdee70: 0x000000000005a6f5      0x00000000018febd0
0x7f3e94cdee80: 0x000000006d490d00      0x00007f3e9e97241c
0x7f3e94cdee90: 0x0000000000000002      0x00007f3e9e97241c


(gdb) x/s 0x7f3e94cdee60
0x7f3e94cdee60: "PyPD"      <<< ?? is this a meaningul string?
(gdb) x/s 0x7f3e94cdee81
0x7f3e94cdee81: "\rIm"      << how about this ???

So this does not look healthy.  How did we get here?

(gdb) print vp
$21 = (struct scm_vm *) 0x755c1bd0
(gdb) print mark_stack_ptr
$22 = (struct GC_ms_entry *) 0x7f3e9b783f40
(gdb) print mark_stack_limit
$23 = (struct GC_ms_entry *) 0x7f3e9b793eb0
(gdb) print upper
$24 = (void *) 0x7ffdd986
(gdb) print GC_greatest_plausible_heap_addr
$25 = (void *) 0x7ffdd986
(gdb) print lower
$26 = (void *) 0x17a1ff8

(gdb) print vp->fp
$27 = (union scm_vm_stack_element *) 0x7f3e94cdee18
(gdb) print vp->sp
$28 = (union scm_vm_stack_element *) 0x7f3e94cdee00
(gdb) print  vp->stack_top
$29 = (union scm_vm_stack_element *) 0x7f3e94cdf000

So lets walk manually.
frames.h:#define SCM_FRAME_DYNAMIC_LINK(fp)      ((fp) + (fp)[1].as_uint)

(gdb) print (vp->fp) + (vp->fp)[1].as_uint
$31 = (union scm_vm_stack_element *) 0x7f3e94cdee38
So we are instanly in trouble! once through the loop, and we've got the
broken fp

So how did we get here?

(gdb) print vp
$32 = (struct scm_vm *) 0x755c1bd0
(gdb) print *vp
$33 = {ip = 0x7f3e9e5a6378 <subr_stub_code+56>, sp = 0x7f3e94cdee00,
  fp = 0x7f3e94cdee18, stack_limit = 0x7f3e94cde000, trace_level = 0,
  sp_min_since_gc = 0x7f3e94cdee00, stack_size = 512,
  stack_bottom = 0x7f3e94cde000, stack_top = 0x7f3e94cdf000,
  overflow_handler_stack = 0x304, hooks = {0x4, 0x4, 0x4, 0x4, 0x4},
  resumable_prompt_cookie = 0x7f3d397f9440, engine = 1}

Looks plausible, not corrupted. Why is engine=1 ??
#define SCM_VM_REGULAR_ENGINE 0
#define SCM_VM_DEBUG_ENGINE 1
#define SCM_VM_NUM_ENGINES 2
#  define SCM_I_CURRENT_THREAD (scm_i_current_thread)

(gdb) print scm_i_current_thread
$34 = (scm_i_thread *) 0x0

OK, that's not excellent.
(gdb) print thread_count
$35 = 161

(gdb) bt
#0  scm_i_vm_mark_stack (vp=0x755c1bd0, mark_stack_ptr=0x7f3e9b783f40,
    mark_stack_limit=0x7f3e9b793eb0) at ../../libguile/vm.c:1011
#1  0x00007f3e9db8835e in GC_mark_from (mark_stack_top=0x7f3e9b783ee0,
    mark_stack_top@entry=0x7f3e9b783f00,
    mark_stack=mark_stack@entry=0x7f3e9b783eb0,
    mark_stack_limit=mark_stack_limit@entry=0x7f3e9b793eb0) at ../mark.c:772
#2  0x00007f3e9db8897e in GC_do_local_mark
(local_mark_stack=0x7f3e9b783eb0,
    local_top=0x7f3e9b783f00) at ../mark.c:1037
#3  0x00007f3e9db88b98 in GC_mark_local (
    local_mark_stack=local_mark_stack@entry=0x7f3e9b783eb0, id=id@entry=4)
    at ../mark.c:1170
#4  0x00007f3e9db88eaa in GC_help_marker (my_mark_no=my_mark_no@entry=80003)
    at ../mark.c:1238
#5  0x00007f3e9db92e3c in GC_mark_thread (id=<optimized out>)
    at ../pthread_support.c:380
#6  0x00007f3e9e2ae6ba in start_thread (arg=0x7f3e9b794700)
    at pthread_create.c:333
#7  0x00007f3e9dfdd82d in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

so bdwgc is version gc7_6_0

So ../mark.c:772 is GET_HDR and GET_HDR is in
include/private/gc_hdrs.h

define GET_HDR(p, hhdr)  ... ((hhdr) = HDR_INNER(p)
# define HDR_INNER(p) HDR_FROM_BI(BI(p),p)
#define HDR_FROM_BI(bi, p) \
                ((bi)->index[((word)(p) >> LOG_HBLKSIZE) & (BOTTOM_SZ - 1)])

This seems all wrong.. wtf ....

is it called by thread_mark ??
in libguile/thread.c:
  thread_gc_kind =
    GC_new_kind (GC_new_free_list (),
       GC_MAKE_PROC (GC_new_proc (thread_mark), 0),
       0, 1)
in bdwgc/misc.c  GC_new_proc
    GC_mark_procs[result] = proc
#define GC_mark_procs GC_arrays._mark_procs
#define PROC(descr) \
      (GC_mark_procs[((descr) >> GC_DS_TAG_BITS) & (GC_MAX_MARK_PROCS-1)])

      GC_obj_kinds[result].ok_descriptor = descr;  <<< this is where
MAKE_PROC goes.

anyway ... in mark.c line 733
        case GC_DS_PROC:
          mark_stack_top--;
          credit -= GC_PROC_BYTES;
          mark_stack_top = (*PROC(descr))((word *)current_p, mark_stack_top,
                                          mark_stack_limit, ENV(descr));
          continue;
But that's the wrong line number.  line 772 is in
        case GC_DS_PER_OBJECT:
          if ((signed_word)descr >= 0) {
            /* Descriptor is in the object.     */
            descr = *(word *)(current_p + descr - GC_DS_PER_OBJECT);

Oh, it goes to retry, which loops around and calls PROC which is thread_mark
which is called with args
          mark_stack_top = (*PROC(descr))((word *)current_p, mark_stack_top,
                                          mark_stack_limit, ENV(descr));

(gdb) print current_p
$39 = (ptr_t) 0x7225d380 "@\325%r"

so back in libguile/threads.c line 84:
  const struct scm_i_thread *t = (struct scm_i_thread *) addr;

(gdb) print *(struct scm_i_thread *) current_p
$40 = {next_thread = 0x7225d540, handle = 0x5763f3b0, pthread =
139900934399744,
  result = 0x4, exited = 0, guile_mode = 1, needs_unregister = 1, wake =
0x0,
  sleep_cond = {__data = {__lock = 0, __futex = 0, __total_seq = 0,
      __wakeup_seq = 0, __woken_seq = 0, __mutex = 0x0, __nwaiters = 0,
      __broadcast_seq = 0}, __size = '\000' <repeats 47 times>, __align =
0},
  sleep_pipe = {151, 197}, freelists = 0x71cfe800,
  pointerless_freelists = 0x71cfe780, dynamic_state = 0x71de7900, dynstack
= {
    base = 0x698c3800, top = 0x698c3940, limit = 0x698c3a00},
  pending_asyncs = 0x304, block_asyncs = 0, continuation_root = 0x5763f340,
  continuation_base = 0x7f3d397f9c90, vp = 0x755c1bd0, base =
0x7f3d397f9da0,
  regs = {{__jmpbuf = {139905204221696, 139906923931021, 1,
139894281647232,
        139906913276512, 0, 139906918327551, 8388608},
      __mask_was_saved = 964664752, __saved_mask = {__val = {0,
139900934399744,
          139900934399744, 139900934397344, 139906923929400,
139893936840720,
          1804170656, 139906913276512, 139900934399744, 139900934397344,
          139900934397504, 139893936840720, 139906910859134,
139906913276484,
          139906910857112, 27401856}}}}}

(gdb) print ((struct scm_i_thread *) current_p)->vp
$41 = (struct scm_vm *) 0x755c1bd0

which is exactly the vp we had before, so this is the correct thread.

So  ... all the values in *(struct scm_i_thread *) current_p look more or
less OK;
its not corrupted. And all the values in *vp look more or less OK, its not
corrrupted
except that engine=1 looks wrong.  So what's up with the weird stack
looping?
Again:
From before, we had:

(gdb) print *vp
$42 = {ip = 0x7f3e9e5a6378 <subr_stub_code+56>, sp = 0x7f3e94cdee00,
  fp = 0x7f3e94cdee18, stack_limit = 0x7f3e94cde000, trace_level = 0,
  sp_min_since_gc = 0x7f3e94cdee00, stack_size = 512,
  stack_bottom = 0x7f3e94cde000, stack_top = 0x7f3e94cdf000,
  overflow_handler_stack = 0x304, hooks = {0x4, 0x4, 0x4, 0x4, 0x4},
  resumable_prompt_cookie = 0x7f3d397f9440, engine = 1}

so fp, sp, stack_bottom, stack_top all are self-consistent with each other.
The problem is that
    fp = SCM_FRAME_DYNAMIC_LINK (fp)
goes nowhere, after the first iteration.

Is the frame corrupted? How? what does the rest of the stack look like?

(gdb) x/32x 0x7f3e94cdee00
0x7f3e94cdee00: 0x0000000000000904      0x000000000000200c
0x7f3e94cdee10: 0x0000000001a1b7e0      0x0000000001a1b5e0
0x7f3e94cdee20: 0x0000000000000004      0x000000000000200c  <<< theres the 4
0x7f3e94cdee30: 0x0000000000169bd6      0x0000000000000001 <<< there's the
new fp
0x7f3e94cdee40: 0x0000000000000000      0x000000000192acd0 <<< there's the
zero.
0x7f3e94cdee50: 0x0000000001fa4bc0      0x0000000001a1c6d0
0x7f3e94cdee60: 0x0000000044507950      0x0000000000000002
0x7f3e94cdee70: 0x000000000005a6f5      0x00000000018febd0
0x7f3e94cdee80: 0x000000006d490d00      0x00007f3e9e97241c
0x7f3e94cdee90: 0x0000000000000002      0x00007f3e9e97241c
0x7f3e94cdeea0: 0x0000000000000002      0x00007f3e8d797630
0x7f3e94cdeeb0: 0x0000000000000002      0x00007f3e86effacc
0x7f3e94cdeec0: 0x0000000000000003      0x000000006d490d00
0x7f3e94cdeed0: 0x00007f3e86eb8698      0x0000000000000002
0x7f3e94cdeee0: 0x00007f3e8e51f450      0x0000000000000008
0x7f3e94cdeef0: 0x0000000001a37c10      0x000000000192acd0


So... who sets up thread->vp ?
/* #define VM_ENABLE_ASSERTIONS */
However, defining this breaks the build.

diff --git a/libguile/vm-engine.c b/libguile/vm-engine.c
index 6c88ebf..f0f8164 100644
--- a/libguile/vm-engine.c
+++ b/libguile/vm-engine.c
@@ -2050,11 +2050,11 @@ VM_NAME (scm_i_thread *thread, struct scm_vm *vp,

           if (!scm_module_system_booted_p)
             {
-              ASSERT (scm_is_true
+              ASSERT (scm_is_true(
                       scm_equal_p (modname,
                                    scm_list_2
                                    (SCM_BOOL_T,
-                                    scm_from_utf8_symbol ("guile"))));
+                                    scm_from_utf8_symbol ("guile")))));
               var = scm_lookup (sym);
             }
           else if (scm_is_true (SCM_CAR (modname)))
diff --git a/libguile/vm.c b/libguile/vm.c
index 18f2192..c26066e 100644
--- a/libguile/vm.c
+++ b/libguile/vm.c
@@ -63,7 +63,7 @@ static size_t page_size;

 /* The VM has a number of internal assertions that shouldn't normally be
    necessary, but might be if you think you found a bug in the VM. */
-/* #define VM_ENABLE_ASSERTIONS */
+#define VM_ENABLE_ASSERTIONS

 static void vm_expand_stack (struct scm_vm *vp,
                              union scm_vm_stack_element *new_sp)
SCM_NOINLINE;


 GEN      guile-procedures.texi
/bin/bash: line 1: 24078 Broken pipe

 ./meta/guile
Aborted
./libguile/guile
Aborted
gdb libguile/.libs/lt-guile

#0  0x00007ffff74f7428 in __GI_raise (sig=sig@entry=6)
    at ../sysdeps/unix/sysv/linux/raise.c:54
#1  0x00007ffff74f902a in __GI_abort () at abort.c:89
#2  0x00007ffff7b707c4 in vm_regular_engine (thread=0x7282, vp=0x806f30,
    registers=0x6, resume=-145787864) at ../../libguile/vm-engine.c:2053
#3  0x00007ffff7b711d2 in scm_call_n (proc=0x832b30, argv=argv@entry=0x0,
    nargs=nargs@entry=0) at ../../libguile/vm.c:1257
#4  0x00007ffff7af14e9 in scm_call_0 (proc=<optimized out>)
    at ../../libguile/eval.c:481
#5  0x00007ffff7b10568 in scm_primitive_load_path (args=<optimized out>)
    at ../../libguile/load.c:1248
#6  0x00007ffff7b6e976 in vm_regular_engine (thread=0x7282, vp=0x806f30,
    registers=0x6, resume=-145787864) at ../../libguile/vm-engine.c:784

Oh. wtf.

(define last-gc (gc-stats))
(define start-time (get-internal-real-time))
(define run-time (get-internal-run-time))
(define (avg-time-taken)
    (define now (get-internal-real-time))
    (define run (get-internal-run-time))
    (define cur (gc-stats))
    (define gc-time-taken (* 1.0e-9 (- (cdar cur) (cdar last-gc))))
    (define elapsed-time (* 1.0e-9 (- now start-time)))
    (define cpu-time (* 1.0e-9 (- run run-time)))
    (format #t "Elapsed time: ~5f secs. GC-time-taken: ~5f s or ~5f%
cpu-usage: ~5f%\n"
        elapsed-time gc-time-taken  (* 100 (/ gc-time-taken elapsed-time))
        (* 100 (/ cpu-time elapsed-time))
    )
    (set! last-gc cur)
    (set! start-time now)
    (set! run-time run)
)


linas@fanny: ~/src/novamente/src/opencog-git/opencog/nlp/learn $ vi x
linas@fanny: ~/src/novamente/src/opencog-git/opencog/nlp/learn $ cat x

#0  find_slot_map (cache=0x7f3e9b783ab0, ip=0x1) at
../../libguile/vm.c:935
935       if (cache->entries[slot].ip == ip)
(gdb) print slot
$1 = 0
(gdb) print cache->entries[slot].ip
$2 = (scm_t_uint32 *) 0x1
(gdb) print cache->entries[slot].map
$4 = (const scm_t_uint8 *) 0x0
(gdb) step
scm_i_vm_mark_stack (vp=0x755c1bd0, mark_stack_ptr=0x7f3e9b783f40,
    mark_stack_limit=0x7f3e9b793eb0) at ../../libguile/vm.c:1011
1011        slot_map = find_slot_map (SCM_FRAME_RETURN_ADDRESS (fp),
&cache);
(gdb) print fp
$5 = (union scm_vm_stack_element *) 0x7f3e94cdee38

#define SCM_FRAME_RETURN_ADDRESS(fp)    ((fp)[0].as_ip)
(gdb) print (fp)[0].as_ip
$6 = (scm_t_uint32 *) 0x1

OK that looks weird ... is this corrupted ?? but whatever,
because the returned slot_map is never used ... because ...

(gdb) step
scm_i_vm_mark_stack (vp=0x755c1bd0, mark_stack_ptr=0x7f3e9b783f40,
    mark_stack_limit=0x7f3e9b793eb0) at ../../libguile/vm.c:975
975            fp = SCM_FRAME_DYNAMIC_LINK (fp))
(gdb) print fp
$7 = (union scm_vm_stack_element *) 0x7f3e94cdee38

frames.h:#define SCM_FRAME_DYNAMIC_LINK(fp)      ((fp) + (fp)[1].as_uint)
(gdb) print (fp)[1].as_uint
$8 = 0

OK, that seems bad, because now fp never advances, it just repeats
over and over with this same value.

(gdb)
979           for (slot = nlocals - 1; sp < fp; sp++, slot--)
(gdb) print nlocals
$10 = -2
(gdb) print sp
$11 = (union scm_vm_stack_element *) 0x7f3e94cdee48
#define SCM_FRAME_NUM_LOCALS(fp, sp)    ((fp) - (sp))
(gdb) print fp
$12 = (union scm_vm_stack_element *) 0x7f3e94cdee38
(gdb) print  ((fp) - (sp))
$13 = -2
Ohh .. its not -16 because its -2 * sizeof (union scm_vm_stack_element *)
so that's OK.
So for loops is skipped, it should go to.
      sp = SCM_FRAME_PREVIOUS_SP (fp);
frames.h:#define SCM_FRAME_PREVIOUS_SP(fp)    ((fp) + 2)

and so now it loops around and repeats.
(gdb) print cache
$19 = {entries = {{ip = 0x1, map = 0x0}, {ip = 0x0,
      map = 0x0} <repeats 23 times>, {ip = 0x1a1b5e0, map = 0x0}, {ip =
0x0,
      map = 0x0}, {ip = 0x0, map = 0x0}, {ip = 0x0, map = 0x0}, {ip = 0x0,
      map = 0x0}, {ip = 0x0, map = 0x0}, {ip = 0x0, map = 0x0}, {ip = 0x0,
      map = 0x0}}}
(gdb) print &cache
$20 = (struct slot_map_cache *) 0x7f3e9b783ab0

and so the loop repeates forever, because
fp = SCM_FRAME_DYNAMIC_LINK (fp) never advances fp, because
(fp)[1].as_uint  is zero.

So where is fp pointing to?  recall fp == 0x7f3e94cdee38

(gdb) x/20x 0x7f3e94cdee00
0x7f3e94cdee00: 0x0000000000000904      0x000000000000200c
0x7f3e94cdee10: 0x0000000001a1b7e0      0x0000000001a1b5e0
0x7f3e94cdee20: 0x0000000000000004      0x000000000000200c
0x7f3e94cdee30: 0x0000000000169bd6      0x0000000000000001
0x7f3e94cdee40: 0x0000000000000000      0x000000000192acd0
0x7f3e94cdee50: 0x0000000001fa4bc0      0x0000000001a1c6d0
0x7f3e94cdee60: 0x0000000044507950      0x0000000000000002
0x7f3e94cdee70: 0x000000000005a6f5      0x00000000018febd0
0x7f3e94cdee80: 0x000000006d490d00      0x00007f3e9e97241c
0x7f3e94cdee90: 0x0000000000000002      0x00007f3e9e97241c


(gdb) x/s 0x7f3e94cdee60
0x7f3e94cdee60: "PyPD"      <<< ?? is this a meaningul string?
(gdb) x/s 0x7f3e94cdee81
0x7f3e94cdee81: "\rIm"      << how about this ???

So this does not look healthy.  How did we get here?

(gdb) print vp
$21 = (struct scm_vm *) 0x755c1bd0
(gdb) print mark_stack_ptr
$22 = (struct GC_ms_entry *) 0x7f3e9b783f40
(gdb) print mark_stack_limit
$23 = (struct GC_ms_entry *) 0x7f3e9b793eb0
(gdb) print upper
$24 = (void *) 0x7ffdd986
(gdb) print GC_greatest_plausible_heap_addr
$25 = (void *) 0x7ffdd986
(gdb) print lower
$26 = (void *) 0x17a1ff8

(gdb) print vp->fp
$27 = (union scm_vm_stack_element *) 0x7f3e94cdee18
(gdb) print vp->sp
$28 = (union scm_vm_stack_element *) 0x7f3e94cdee00
(gdb) print  vp->stack_top
$29 = (union scm_vm_stack_element *) 0x7f3e94cdf000

So lets walk manually.
frames.h:#define SCM_FRAME_DYNAMIC_LINK(fp)      ((fp) + (fp)[1].as_uint)

(gdb) print (vp->fp) + (vp->fp)[1].as_uint
$31 = (union scm_vm_stack_element *) 0x7f3e94cdee38
So we are instanly in trouble! once through the loop, and we've got the
broken fp

So how did we get here?

(gdb) print vp
$32 = (struct scm_vm *) 0x755c1bd0
(gdb) print *vp
$33 = {ip = 0x7f3e9e5a6378 <subr_stub_code+56>, sp = 0x7f3e94cdee00,
  fp = 0x7f3e94cdee18, stack_limit = 0x7f3e94cde000, trace_level = 0,
  sp_min_since_gc = 0x7f3e94cdee00, stack_size = 512,
  stack_bottom = 0x7f3e94cde000, stack_top = 0x7f3e94cdf000,
  overflow_handler_stack = 0x304, hooks = {0x4, 0x4, 0x4, 0x4, 0x4},
  resumable_prompt_cookie = 0x7f3d397f9440, engine = 1}

Looks plausible, not corrupted. Why is engine=1 ??
#define SCM_VM_REGULAR_ENGINE 0
#define SCM_VM_DEBUG_ENGINE 1
#define SCM_VM_NUM_ENGINES 2
#  define SCM_I_CURRENT_THREAD (scm_i_current_thread)

(gdb) print scm_i_current_thread
$34 = (scm_i_thread *) 0x0

OK, that's not excellent.
(gdb) print thread_count
$35 = 161

(gdb) bt
#0  scm_i_vm_mark_stack (vp=0x755c1bd0, mark_stack_ptr=0x7f3e9b783f40,
    mark_stack_limit=0x7f3e9b793eb0) at ../../libguile/vm.c:1011
#1  0x00007f3e9db8835e in GC_mark_from (mark_stack_top=0x7f3e9b783ee0,
    mark_stack_top@entry=0x7f3e9b783f00,
    mark_stack=mark_stack@entry=0x7f3e9b783eb0,
    mark_stack_limit=mark_stack_limit@entry=0x7f3e9b793eb0) at ../mark.c:772
#2  0x00007f3e9db8897e in GC_do_local_mark
(local_mark_stack=0x7f3e9b783eb0,
    local_top=0x7f3e9b783f00) at ../mark.c:1037
#3  0x00007f3e9db88b98 in GC_mark_local (
    local_mark_stack=local_mark_stack@entry=0x7f3e9b783eb0, id=id@entry=4)
    at ../mark.c:1170
#4  0x00007f3e9db88eaa in GC_help_marker (my_mark_no=my_mark_no@entry=80003)
    at ../mark.c:1238
#5  0x00007f3e9db92e3c in GC_mark_thread (id=<optimized out>)
    at ../pthread_support.c:380
#6  0x00007f3e9e2ae6ba in start_thread (arg=0x7f3e9b794700)
    at pthread_create.c:333
#7  0x00007f3e9dfdd82d in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

so bdwgc is version gc7_6_0

So ../mark.c:772 is GET_HDR and GET_HDR is in
include/private/gc_hdrs.h

define GET_HDR(p, hhdr)  ... ((hhdr) = HDR_INNER(p)
# define HDR_INNER(p) HDR_FROM_BI(BI(p),p)
#define HDR_FROM_BI(bi, p) \
                ((bi)->index[((word)(p) >> LOG_HBLKSIZE) & (BOTTOM_SZ - 1)])

This seems all wrong.. wtf ....

is it called by thread_mark ??
in libguile/thread.c:
  thread_gc_kind =
    GC_new_kind (GC_new_free_list (),
       GC_MAKE_PROC (GC_new_proc (thread_mark), 0),
       0, 1)
in bdwgc/misc.c  GC_new_proc
    GC_mark_procs[result] = proc
#define GC_mark_procs GC_arrays._mark_procs
#define PROC(descr) \
      (GC_mark_procs[((descr) >> GC_DS_TAG_BITS) & (GC_MAX_MARK_PROCS-1)])

      GC_obj_kinds[result].ok_descriptor = descr;  <<< this is where
MAKE_PROC goes.

anyway ... in mark.c line 733
        case GC_DS_PROC:
          mark_stack_top--;
          credit -= GC_PROC_BYTES;
          mark_stack_top = (*PROC(descr))((word *)current_p, mark_stack_top,
                                          mark_stack_limit, ENV(descr));
          continue;
But that's the wrong line number.  line 772 is in
        case GC_DS_PER_OBJECT:
          if ((signed_word)descr >= 0) {
            /* Descriptor is in the object.     */
            descr = *(word *)(current_p + descr - GC_DS_PER_OBJECT);

Oh, it goes to retry, which loops around and calls PROC which is thread_mark
which is called with args
          mark_stack_top = (*PROC(descr))((word *)current_p, mark_stack_top,
                                          mark_stack_limit, ENV(descr));

(gdb) print current_p
$39 = (ptr_t) 0x7225d380 "@\325%r"

so back in libguile/threads.c line 84:
  const struct scm_i_thread *t = (struct scm_i_thread *) addr;

(gdb) print *(struct scm_i_thread *) current_p
$40 = {next_thread = 0x7225d540, handle = 0x5763f3b0, pthread =
139900934399744,
  result = 0x4, exited = 0, guile_mode = 1, needs_unregister = 1, wake =
0x0,
  sleep_cond = {__data = {__lock = 0, __futex = 0, __total_seq = 0,
      __wakeup_seq = 0, __woken_seq = 0, __mutex = 0x0, __nwaiters = 0,
      __broadcast_seq = 0}, __size = '\000' <repeats 47 times>, __align =
0},
  sleep_pipe = {151, 197}, freelists = 0x71cfe800,
  pointerless_freelists = 0x71cfe780, dynamic_state = 0x71de7900, dynstack
= {
    base = 0x698c3800, top = 0x698c3940, limit = 0x698c3a00},
  pending_asyncs = 0x304, block_asyncs = 0, continuation_root = 0x5763f340,
  continuation_base = 0x7f3d397f9c90, vp = 0x755c1bd0, base =
0x7f3d397f9da0,
  regs = {{__jmpbuf = {139905204221696, 139906923931021, 1,
139894281647232,
        139906913276512, 0, 139906918327551, 8388608},
      __mask_was_saved = 964664752, __saved_mask = {__val = {0,
139900934399744,
          139900934399744, 139900934397344, 139906923929400,
139893936840720,
          1804170656, 139906913276512, 139900934399744, 139900934397344,
          139900934397504, 139893936840720, 139906910859134,
139906913276484,
          139906910857112, 27401856}}}}}

(gdb) print ((struct scm_i_thread *) current_p)->vp
$41 = (struct scm_vm *) 0x755c1bd0

which is exactly the vp we had before, so this is the correct thread.

So  ... all the values in *(struct scm_i_thread *) current_p look more or
less OK;
its not corrupted. And all the values in *vp look more or less OK, its not
corrrupted
except that engine=1 looks wrong.  So what's up with the weird stack
looping?
Again:
From before, we had:

(gdb) print *vp
$42 = {ip = 0x7f3e9e5a6378 <subr_stub_code+56>, sp = 0x7f3e94cdee00,
  fp = 0x7f3e94cdee18, stack_limit = 0x7f3e94cde000, trace_level = 0,
  sp_min_since_gc = 0x7f3e94cdee00, stack_size = 512,
  stack_bottom = 0x7f3e94cde000, stack_top = 0x7f3e94cdf000,
  overflow_handler_stack = 0x304, hooks = {0x4, 0x4, 0x4, 0x4, 0x4},
  resumable_prompt_cookie = 0x7f3d397f9440, engine = 1}

so fp, sp, stack_bottom, stack_top all are self-consistent with each other.
The problem is that
    fp = SCM_FRAME_DYNAMIC_LINK (fp)
goes nowhere, after the first iteration.

Is the frame corrupted? How? what does the rest of the stack look like?

(gdb) x/32x 0x7f3e94cdee00
0x7f3e94cdee00: 0x0000000000000904      0x000000000000200c
0x7f3e94cdee10: 0x0000000001a1b7e0      0x0000000001a1b5e0
0x7f3e94cdee20: 0x0000000000000004      0x000000000000200c  <<< theres the 4
0x7f3e94cdee30: 0x0000000000169bd6      0x0000000000000001 <<< there's the
new fp
0x7f3e94cdee40: 0x0000000000000000      0x000000000192acd0 <<< there's the
zero.
0x7f3e94cdee50: 0x0000000001fa4bc0      0x0000000001a1c6d0
0x7f3e94cdee60: 0x0000000044507950      0x0000000000000002
0x7f3e94cdee70: 0x000000000005a6f5      0x00000000018febd0
0x7f3e94cdee80: 0x000000006d490d00      0x00007f3e9e97241c
0x7f3e94cdee90: 0x0000000000000002      0x00007f3e9e97241c
0x7f3e94cdeea0: 0x0000000000000002      0x00007f3e8d797630
0x7f3e94cdeeb0: 0x0000000000000002      0x00007f3e86effacc
0x7f3e94cdeec0: 0x0000000000000003      0x000000006d490d00
0x7f3e94cdeed0: 0x00007f3e86eb8698      0x0000000000000002
0x7f3e94cdeee0: 0x00007f3e8e51f450      0x0000000000000008
0x7f3e94cdeef0: 0x0000000001a37c10      0x000000000192acd0


So... who sets up thread->vp ?
/* #define VM_ENABLE_ASSERTIONS */
However, defining this breaks the build.

diff --git a/libguile/vm-engine.c b/libguile/vm-engine.c
index 6c88ebf..f0f8164 100644
--- a/libguile/vm-engine.c
+++ b/libguile/vm-engine.c
@@ -2050,11 +2050,11 @@ VM_NAME (scm_i_thread *thread, struct scm_vm *vp,

           if (!scm_module_system_booted_p)
             {
-              ASSERT (scm_is_true
+              ASSERT (scm_is_true(
                       scm_equal_p (modname,
                                    scm_list_2
                                    (SCM_BOOL_T,
-                                    scm_from_utf8_symbol ("guile"))));
+                                    scm_from_utf8_symbol ("guile")))));
               var = scm_lookup (sym);
             }
           else if (scm_is_true (SCM_CAR (modname)))
diff --git a/libguile/vm.c b/libguile/vm.c
index 18f2192..c26066e 100644
--- a/libguile/vm.c
+++ b/libguile/vm.c
@@ -63,7 +63,7 @@ static size_t page_size;

 /* The VM has a number of internal assertions that shouldn't normally be
    necessary, but might be if you think you found a bug in the VM. */
-/* #define VM_ENABLE_ASSERTIONS */
+#define VM_ENABLE_ASSERTIONS

 static void vm_expand_stack (struct scm_vm *vp,
                              union scm_vm_stack_element *new_sp)
SCM_NOINLINE;


 GEN      guile-procedures.texi
/bin/bash: line 1: 24078 Broken pipe

 ./meta/guile
Aborted
./libguile/guile
Aborted
gdb libguile/.libs/lt-guile

#0  0x00007ffff74f7428 in __GI_raise (sig=sig@entry=6)
    at ../sysdeps/unix/sysv/linux/raise.c:54
#1  0x00007ffff74f902a in __GI_abort () at abort.c:89
#2  0x00007ffff7b707c4 in vm_regular_engine (thread=0x7282, vp=0x806f30,
    registers=0x6, resume=-145787864) at ../../libguile/vm-engine.c:2053
#3  0x00007ffff7b711d2 in scm_call_n (proc=0x832b30, argv=argv@entry=0x0,
    nargs=nargs@entry=0) at ../../libguile/vm.c:1257
#4  0x00007ffff7af14e9 in scm_call_0 (proc=<optimized out>)
    at ../../libguile/eval.c:481
#5  0x00007ffff7b10568 in scm_primitive_load_path (args=<optimized out>)
    at ../../libguile/load.c:1248
#6  0x00007ffff7b6e976 in vm_regular_engine (thread=0x7282, vp=0x806f30,
    registers=0x6, resume=-145787864) at ../../libguile/vm-engine.c:784

--f403045ea6d679a79b0551239b33
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

<div dir=3D"ltr"><div>what: guile-2.2-stable, from git.<br><br>I&#39;ve got=
 a large, complex, heavily multi-threaded guile program that<br>hangs durin=
g garbage collection; usually after running for half a day.<br>It hangs in =
a tight loop in scm_i_vm_mark_stack, spinning at 100% of CPU.<br><br>This i=
s due to the for-loop line fp =3D SCM_FRAME_DYNAMIC_LINK (fp))<br>at libgui=
le/vm.c line 975 failing to advance the frame pointer.<br>There&#39;s no &q=
uot;obvious&quot; corruption in the stack; it simply looks like<br>the fram=
e was incompletely set up, and so incrementing to the next<br>fp does not g=
o anywhere.<br><br>I have recompiled guile with VM_ENABLE_ASSERTIONS and am=
 trying to<br>reproduce the bug now.=C2=A0 The rest of this email is a reco=
rd of a long<br>debug session isolating the problem, and showing that, over=
all, the<br>thread and stack data look more-or-less correct and uncorrupote=
d,<br>except for the inability to walk forward in the frame.<br><br>-- lina=
s<br><br>(gdb) bt<br>#0=C2=A0 scm_i_vm_mark_stack (vp=3D0x755c1bd0, mark_st=
ack_ptr=3D0x7f3e9b783f40,<br>=C2=A0=C2=A0=C2=A0 mark_stack_limit=3D0x7f3e9b=
793eb0) at ../../libguile/vm.c:1011<br>#1=C2=A0 0x00007f3e9db8835e in GC_ma=
rk_from (mark_stack_top=3D0x7f3e9b783ee0,<br>=C2=A0=C2=A0=C2=A0 mark_stack_=
top@entry=3D0x7f3e9b783f00,<br>=C2=A0=C2=A0=C2=A0 mark_stack=3Dmark_stack@e=
ntry=3D0x7f3e9b783eb0,<br>=C2=A0=C2=A0=C2=A0 mark_stack_limit=3Dmark_stack_=
limit@entry=3D0x7f3e9b793eb0) at ../mark.c:772<br>#2=C2=A0 0x00007f3e9db889=
7e in GC_do_local_mark (local_mark_stack=3D0x7f3e9b783eb0,<br>=C2=A0=C2=A0=
=C2=A0 local_top=3D0x7f3e9b783f00) at ../mark.c:1037<br>#3=C2=A0 0x00007f3e=
9db88b98 in GC_mark_local (<br>=C2=A0=C2=A0=C2=A0 local_mark_stack=3Dlocal_=
mark_stack@entry=3D0x7f3e9b783eb0, id=3Did@entry=3D4)<br>=C2=A0=C2=A0=C2=A0=
 at ../mark.c:1170<br>#4=C2=A0 0x00007f3e9db88eaa in GC_help_marker (my_mar=
k_no=3Dmy_mark_no@entry=3D80003)<br>=C2=A0=C2=A0=C2=A0 at ../mark.c:1238<br=
>#5=C2=A0 0x00007f3e9db92e3c in GC_mark_thread (id=3D&lt;optimized out&gt;)=
<br>=C2=A0=C2=A0=C2=A0 at ../pthread_support.c:380<br>#6=C2=A0 0x00007f3e9e=
2ae6ba in start_thread (arg=3D0x7f3e9b794700)<br>=C2=A0=C2=A0=C2=A0 at pthr=
ead_create.c:333<br>#7=C2=A0 0x00007f3e9dfdd82d in clone ()<br>=C2=A0=C2=A0=
=C2=A0 at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109<br><br></div>(gdb) =
step<br><div><br>#0=C2=A0 find_slot_map (cache=3D0x7f3e9b783ab0, ip=3D0x1) =
at<br>../../libguile/vm.c:935<br>935=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 if=
 (cache-&gt;entries[slot].ip =3D=3D ip)<br>(gdb) print slot<br>$1 =3D 0<br>=
(gdb) print cache-&gt;entries[slot].ip<br>$2 =3D (scm_t_uint32 *) 0x1<br>(g=
db) print cache-&gt;entries[slot].map<br>$4 =3D (const scm_t_uint8 *) 0x0<b=
r>(gdb) step<br>scm_i_vm_mark_stack (vp=3D0x755c1bd0, mark_stack_ptr=3D0x7f=
3e9b783f40, <br>=C2=A0=C2=A0=C2=A0 mark_stack_limit=3D0x7f3e9b793eb0) at ..=
/../libguile/vm.c:1011<br>1011=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 sl=
ot_map =3D find_slot_map (SCM_FRAME_RETURN_ADDRESS (fp), &amp;cache);<br>(g=
db) print fp<br>$5 =3D (union scm_vm_stack_element *) 0x7f3e94cdee38<br><br=
>#define SCM_FRAME_RETURN_ADDRESS(fp)=C2=A0=C2=A0=C2=A0 ((fp)[0].as_ip)<br>=
(gdb) print (fp)[0].as_ip<br>$6 =3D (scm_t_uint32 *) 0x1<br><br>OK that loo=
ks weird ... is this corrupted ?? but whatever,<br>because the returned slo=
t_map is never used ... because ...<br><br>(gdb) step<br>scm_i_vm_mark_stac=
k (vp=3D0x755c1bd0, mark_stack_ptr=3D0x7f3e9b783f40, <br>=C2=A0=C2=A0=C2=A0=
 mark_stack_limit=3D0x7f3e9b793eb0) at ../../libguile/vm.c:975<br>975=C2=A0=
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 fp =3D SCM_FRA=
ME_DYNAMIC_LINK (fp))<br>(gdb) print fp<br>$7 =3D (union scm_vm_stack_eleme=
nt *) 0x7f3e94cdee38<br><br>frames.h:#define SCM_FRAME_DYNAMIC_LINK(fp)=C2=
=A0=C2=A0=C2=A0=C2=A0=C2=A0 ((fp) + (fp)[1].as_uint)<br>(gdb) print (fp)[1]=
.as_uint<br>$8 =3D 0<br><br>OK, that seems bad, because now fp never advanc=
es, it just repeats<br>over and over with this same value.<br><br>(gdb) <br=
>979=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 for (slot =
=3D nlocals - 1; sp &lt; fp; sp++, slot--)<br>(gdb) print nlocals<br>$10 =
=3D -2<br>(gdb) print sp<br>$11 =3D (union scm_vm_stack_element *) 0x7f3e94=
cdee48<br>#define SCM_FRAME_NUM_LOCALS(fp, sp)=C2=A0=C2=A0=C2=A0 ((fp) - (s=
p))<br>(gdb) print fp<br>$12 =3D (union scm_vm_stack_element *) 0x7f3e94cde=
e38<br>(gdb) print=C2=A0 ((fp) - (sp))<br>$13 =3D -2<br>Ohh .. its not -16 =
because its -2 * sizeof (union scm_vm_stack_element *)<br>so that&#39;s OK.=
<br>So for loops is skipped, it should go to.<br>=C2=A0=C2=A0=C2=A0=C2=A0=
=C2=A0 sp =3D SCM_FRAME_PREVIOUS_SP (fp);<br>frames.h:#define SCM_FRAME_PRE=
VIOUS_SP(fp)=C2=A0=C2=A0=C2=A0 ((fp) + 2)<br><br>and so now it loops around=
 and repeats.<br>(gdb) print cache<br>$19 =3D {entries =3D {{ip =3D 0x1, ma=
p =3D 0x0}, {ip =3D 0x0, <br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 map =3D 0x0} &l=
t;repeats 23 times&gt;, {ip =3D 0x1a1b5e0, map =3D 0x0}, {ip =3D 0x0, <br>=
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 map =3D 0x0}, {ip =3D 0x0, map =3D 0x0}, {ip=
 =3D 0x0, map =3D 0x0}, {ip =3D 0x0, <br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 map=
 =3D 0x0}, {ip =3D 0x0, map =3D 0x0}, {ip =3D 0x0, map =3D 0x0}, {ip =3D 0x=
0, <br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 map =3D 0x0}}}<br>(gdb) print &amp;ca=
che<br>$20 =3D (struct slot_map_cache *) 0x7f3e9b783ab0<br><br>and so the l=
oop repeates forever, because <br>fp =3D SCM_FRAME_DYNAMIC_LINK (fp) never =
advances fp, because<br>(fp)[1].as_uint=C2=A0 is zero.<br><br>So where is f=
p pointing to?=C2=A0 recall fp =3D=3D 0x7f3e94cdee38<br><br>(gdb) x/20x 0x7=
f3e94cdee00<br>0x7f3e94cdee00: 0x0000000000000904=C2=A0=C2=A0=C2=A0=C2=A0=
=C2=A0 0x000000000000200c<br>0x7f3e94cdee10: 0x0000000001a1b7e0=C2=A0=C2=A0=
=C2=A0=C2=A0=C2=A0 0x0000000001a1b5e0<br>0x7f3e94cdee20: 0x0000000000000004=
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 0x000000000000200c<br>0x7f3e94cdee30: 0x0000=
000000169bd6=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 0x0000000000000001<br>0x7f3e94cd=
ee40: 0x0000000000000000=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 0x000000000192acd0<b=
r>0x7f3e94cdee50: 0x0000000001fa4bc0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 0x000000=
0001a1c6d0<br>0x7f3e94cdee60: 0x0000000044507950=C2=A0=C2=A0=C2=A0=C2=A0=C2=
=A0 0x0000000000000002<br>0x7f3e94cdee70: 0x000000000005a6f5=C2=A0=C2=A0=C2=
=A0=C2=A0=C2=A0 0x00000000018febd0<br>0x7f3e94cdee80: 0x000000006d490d00=C2=
=A0=C2=A0=C2=A0=C2=A0=C2=A0 0x00007f3e9e97241c<br>0x7f3e94cdee90: 0x0000000=
000000002=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 0x00007f3e9e97241c<br><br><br>(gdb)=
 x/s 0x7f3e94cdee60<br>0x7f3e94cdee60: &quot;PyPD&quot;=C2=A0=C2=A0=C2=A0=
=C2=A0=C2=A0 &lt;&lt;&lt; ?? is this a meaningul string?<br>(gdb) x/s 0x7f3=
e94cdee81<br>0x7f3e94cdee81: &quot;\rIm&quot;=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=
 &lt;&lt; how about this ???<br><br>So this does not look healthy.=C2=A0 Ho=
w did we get here?<br><br>(gdb) print vp<br>$21 =3D (struct scm_vm *) 0x755=
c1bd0<br>(gdb) print mark_stack_ptr<br>$22 =3D (struct GC_ms_entry *) 0x7f3=
e9b783f40<br>(gdb) print mark_stack_limit<br>$23 =3D (struct GC_ms_entry *)=
 0x7f3e9b793eb0<br>(gdb) print upper<br>$24 =3D (void *) 0x7ffdd986<br>(gdb=
) print GC_greatest_plausible_heap_addr<br>$25 =3D (void *) 0x7ffdd986<br>(=
gdb) print lower<br>$26 =3D (void *) 0x17a1ff8<br><br>(gdb) print vp-&gt;fp=
<br>$27 =3D (union scm_vm_stack_element *) 0x7f3e94cdee18<br>(gdb) print vp=
-&gt;sp<br>$28 =3D (union scm_vm_stack_element *) 0x7f3e94cdee00<br>(gdb) p=
rint=C2=A0 vp-&gt;stack_top<br>$29 =3D (union scm_vm_stack_element *) 0x7f3=
e94cdf000<br><br>So lets walk manually. <br>frames.h:#define SCM_FRAME_DYNA=
MIC_LINK(fp)=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 ((fp) + (fp)[1].as_uint)<br><br>=
(gdb) print (vp-&gt;fp) + (vp-&gt;fp)[1].as_uint<br>$31 =3D (union scm_vm_s=
tack_element *) 0x7f3e94cdee38<br>So we are instanly in trouble! once throu=
gh the loop, and we&#39;ve got the broken fp<br><br>So how did we get here?=
<br><br>(gdb) print vp<br>$32 =3D (struct scm_vm *) 0x755c1bd0<br>(gdb) pri=
nt *vp<br>$33 =3D {ip =3D 0x7f3e9e5a6378 &lt;subr_stub_code+56&gt;, sp =3D =
0x7f3e94cdee00, <br>=C2=A0 fp =3D 0x7f3e94cdee18, stack_limit =3D 0x7f3e94c=
de000, trace_level =3D 0, <br>=C2=A0 sp_min_since_gc =3D 0x7f3e94cdee00, st=
ack_size =3D 512, <br>=C2=A0 stack_bottom =3D 0x7f3e94cde000, stack_top =3D=
 0x7f3e94cdf000, <br>=C2=A0 overflow_handler_stack =3D 0x304, hooks =3D {0x=
4, 0x4, 0x4, 0x4, 0x4}, <br>=C2=A0 resumable_prompt_cookie =3D 0x7f3d397f94=
40, engine =3D 1}<br><br>Looks plausible, not corrupted. Why is engine=3D1 =
?? <br>#define SCM_VM_REGULAR_ENGINE 0<br>#define SCM_VM_DEBUG_ENGINE 1<br>=
#define SCM_VM_NUM_ENGINES 2<br>#=C2=A0 define SCM_I_CURRENT_THREAD (scm_i_=
current_thread)<br><br>(gdb) print scm_i_current_thread<br>$34 =3D (scm_i_t=
hread *) 0x0<br><br>OK, that&#39;s not excellent.<br>(gdb) print thread_cou=
nt<br>$35 =3D 161<br><br>(gdb) bt<br>#0=C2=A0 scm_i_vm_mark_stack (vp=3D0x7=
55c1bd0, mark_stack_ptr=3D0x7f3e9b783f40, <br>=C2=A0=C2=A0=C2=A0 mark_stack=
_limit=3D0x7f3e9b793eb0) at ../../libguile/vm.c:1011<br>#1=C2=A0 0x00007f3e=
9db8835e in GC_mark_from (mark_stack_top=3D0x7f3e9b783ee0, <br>=C2=A0=C2=A0=
=C2=A0 mark_stack_top@entry=3D0x7f3e9b783f00, <br>=C2=A0=C2=A0=C2=A0 mark_s=
tack=3Dmark_stack@entry=3D0x7f3e9b783eb0, <br>=C2=A0=C2=A0=C2=A0 mark_stack=
_limit=3Dmark_stack_limit@entry=3D0x7f3e9b793eb0) at ../mark.c:772<br>#2=C2=
=A0 0x00007f3e9db8897e in GC_do_local_mark (local_mark_stack=3D0x7f3e9b783e=
b0, <br>=C2=A0=C2=A0=C2=A0 local_top=3D0x7f3e9b783f00) at ../mark.c:1037<br=
>#3=C2=A0 0x00007f3e9db88b98 in GC_mark_local (<br>=C2=A0=C2=A0=C2=A0 local=
_mark_stack=3Dlocal_mark_stack@entry=3D0x7f3e9b783eb0, id=3Did@entry=3D4)<b=
r>=C2=A0=C2=A0=C2=A0 at ../mark.c:1170<br>#4=C2=A0 0x00007f3e9db88eaa in GC=
_help_marker (my_mark_no=3Dmy_mark_no@entry=3D80003)<br>=C2=A0=C2=A0=C2=A0 =
at ../mark.c:1238<br>#5=C2=A0 0x00007f3e9db92e3c in GC_mark_thread (id=3D&l=
t;optimized out&gt;)<br>=C2=A0=C2=A0=C2=A0 at ../pthread_support.c:380<br>#=
6=C2=A0 0x00007f3e9e2ae6ba in start_thread (arg=3D0x7f3e9b794700)<br>=C2=A0=
=C2=A0=C2=A0 at pthread_create.c:333<br>#7=C2=A0 0x00007f3e9dfdd82d in clon=
e ()<br>=C2=A0=C2=A0=C2=A0 at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109=
<br><br>so bdwgc is version gc7_6_0<br><br>So ../mark.c:772 is GET_HDR and =
GET_HDR is in<br>include/private/gc_hdrs.h<br><br>define GET_HDR(p, hhdr)=
=C2=A0 ... ((hhdr) =3D HDR_INNER(p) <br># define HDR_INNER(p) HDR_FROM_BI(B=
I(p),p)<br>#define HDR_FROM_BI(bi, p) \<br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 ((bi)-&gt;inde=
x[((word)(p) &gt;&gt; LOG_HBLKSIZE) &amp; (BOTTOM_SZ - 1)])<br><br>This see=
ms all wrong.. wtf ....<br><br>is it called by thread_mark ??<br>in libguil=
e/thread.c:<br>=C2=A0 thread_gc_kind =3D<br>=C2=A0=C2=A0=C2=A0 GC_new_kind =
(GC_new_free_list (),<br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 GC_MAKE_PROC =
(GC_new_proc (thread_mark), 0),<br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 0, =
1)<br>in bdwgc/misc.c=C2=A0 GC_new_proc<br>=C2=A0=C2=A0=C2=A0 GC_mark_procs=
[result] =3D proc<br>#define GC_mark_procs GC_arrays._mark_procs<br>#define=
 PROC(descr) \<br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 (GC_mark_procs[((descr) &g=
t;&gt; GC_DS_TAG_BITS) &amp; (GC_MAX_MARK_PROCS-1)])<br><br>=C2=A0=C2=A0=C2=
=A0=C2=A0=C2=A0 GC_obj_kinds[result].ok_descriptor =3D descr;=C2=A0 &lt;&lt=
;&lt; this is where MAKE_PROC goes.<br><br>anyway ... in mark.c line 733<br=
>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 case GC_DS_PROC:<br>=C2=A0=C2=
=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 mark_stack_top--;<br>=C2=A0=
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 credit -=3D GC_PROC_BYTES;=
<br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 mark_stack_top =
=3D (*PROC(descr))((word *)current_p, mark_stack_top,<br>=C2=A0=C2=A0=C2=A0=
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=
=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=
=A0 mark_stack_limit, ENV(descr));<br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=
=C2=A0=C2=A0=C2=A0 continue;<br>But that&#39;s the wrong line number.=C2=A0=
 line 772 is in <br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 case GC_DS_P=
ER_OBJECT:<br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 if ((s=
igned_word)descr &gt;=3D 0) {<br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=
=C2=A0=C2=A0=C2=A0=C2=A0 /* Descriptor is in the object.=C2=A0=C2=A0=C2=A0=
=C2=A0 */<br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=
=C2=A0 descr =3D *(word *)(current_p + descr - GC_DS_PER_OBJECT);<br><br>Oh=
, it goes to retry, which loops around and calls PROC which is thread_mark<=
br>which is called with args <br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=
=C2=A0=C2=A0 mark_stack_top =3D (*PROC(descr))((word *)current_p, mark_stac=
k_top,<br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=
=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=
=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 mark_stack_limit, ENV(descr));<br><br>(gd=
b) print current_p<br>$39 =3D (ptr_t) 0x7225d380 &quot;@\325%r&quot;<br><br=
>so back in libguile/threads.c line 84:<br>=C2=A0 const struct scm_i_thread=
 *t =3D (struct scm_i_thread *) addr;<br><br>(gdb) print *(struct scm_i_thr=
ead *) current_p<br>$40 =3D {next_thread =3D 0x7225d540, handle =3D 0x5763f=
3b0, pthread =3D 139900934399744, <br>=C2=A0 result =3D 0x4, exited =3D 0, =
guile_mode =3D 1, needs_unregister =3D 1, wake =3D 0x0, <br>=C2=A0 sleep_co=
nd =3D {__data =3D {__lock =3D 0, __futex =3D 0, __total_seq =3D 0, <br>=C2=
=A0=C2=A0=C2=A0=C2=A0=C2=A0 __wakeup_seq =3D 0, __woken_seq =3D 0, __mutex =
=3D 0x0, __nwaiters =3D 0, <br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 __broadcast_s=
eq =3D 0}, __size =3D &#39;\000&#39; &lt;repeats 47 times&gt;, __align =3D =
0}, <br>=C2=A0 sleep_pipe =3D {151, 197}, freelists =3D 0x71cfe800, <br>=C2=
=A0 pointerless_freelists =3D 0x71cfe780, dynamic_state =3D 0x71de7900, dyn=
stack =3D {<br>=C2=A0=C2=A0=C2=A0 base =3D 0x698c3800, top =3D 0x698c3940, =
limit =3D 0x698c3a00}, <br>=C2=A0 pending_asyncs =3D 0x304, block_asyncs =
=3D 0, continuation_root =3D 0x5763f340, <br>=C2=A0 continuation_base =3D 0=
x7f3d397f9c90, vp =3D 0x755c1bd0, base =3D 0x7f3d397f9da0, <br>=C2=A0 regs =
=3D {{__jmpbuf =3D {139905204221696, 139906923931021, 1, 139894281647232, <=
br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 139906913276512, 0, 139906918=
327551, 8388608}, <br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 __mask_was_saved =3D 9=
64664752, __saved_mask =3D {__val =3D {0, 139900934399744, <br>=C2=A0=C2=A0=
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 139900934399744, 139900934397344=
, 139906923929400, 139893936840720, <br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=
=A0=C2=A0=C2=A0=C2=A0 1804170656, 139906913276512, 139900934399744, 1399009=
34397344, <br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 139900=
934397504, 139893936840720, 139906910859134, 139906913276484, <br>=C2=A0=C2=
=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 139906910857112, 27401856}}}}=
}<br><br>(gdb) print ((struct scm_i_thread *) current_p)-&gt;vp<br>$41 =3D =
(struct scm_vm *) 0x755c1bd0<br><br>which is exactly the vp we had before, =
so this is the correct thread.<br><br>So=C2=A0 ... all the values in *(stru=
ct scm_i_thread *) current_p look more or less OK;<br>its not corrupted. An=
d all the values in *vp look more or less OK, its not corrrupted<br>except =
that engine=3D1 looks wrong.=C2=A0 So what&#39;s up with the weird stack lo=
oping?<br>Again: <br>From before, we had:<br><br>(gdb) print *vp<br>$42 =3D=
 {ip =3D 0x7f3e9e5a6378 &lt;subr_stub_code+56&gt;, sp =3D 0x7f3e94cdee00, <=
br>=C2=A0 fp =3D 0x7f3e94cdee18, stack_limit =3D 0x7f3e94cde000, trace_leve=
l =3D 0, <br>=C2=A0 sp_min_since_gc =3D 0x7f3e94cdee00, stack_size =3D 512,=
 <br>=C2=A0 stack_bottom =3D 0x7f3e94cde000, stack_top =3D 0x7f3e94cdf000, =
<br>=C2=A0 overflow_handler_stack =3D 0x304, hooks =3D {0x4, 0x4, 0x4, 0x4,=
 0x4}, <br>=C2=A0 resumable_prompt_cookie =3D 0x7f3d397f9440, engine =3D 1}=
<br><br>so fp, sp, stack_bottom, stack_top all are self-consistent with eac=
h other.<br>The problem is that<br>=C2=A0=C2=A0=C2=A0 fp =3D SCM_FRAME_DYNA=
MIC_LINK (fp)<br>goes nowhere, after the first iteration.<br><br>Is the fra=
me corrupted? How? what does the rest of the stack look like?<br><br>(gdb) =
x/32x 0x7f3e94cdee00<br>0x7f3e94cdee00: 0x0000000000000904=C2=A0=C2=A0=C2=
=A0=C2=A0=C2=A0 0x000000000000200c<br>0x7f3e94cdee10: 0x0000000001a1b7e0=C2=
=A0=C2=A0=C2=A0=C2=A0=C2=A0 0x0000000001a1b5e0<br>0x7f3e94cdee20: 0x0000000=
000000004=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 0x000000000000200c=C2=A0 &lt;&lt;&l=
t; theres the 4<br>0x7f3e94cdee30: 0x0000000000169bd6=C2=A0=C2=A0=C2=A0=C2=
=A0=C2=A0 0x0000000000000001 &lt;&lt;&lt; there&#39;s the new fp<br>0x7f3e9=
4cdee40: 0x0000000000000000=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 0x000000000192acd=
0 &lt;&lt;&lt; there&#39;s the zero.<br>0x7f3e94cdee50: 0x0000000001fa4bc0=
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 0x0000000001a1c6d0<br>0x7f3e94cdee60: 0x0000=
000044507950=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 0x0000000000000002<br>0x7f3e94cd=
ee70: 0x000000000005a6f5=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 0x00000000018febd0<b=
r>0x7f3e94cdee80: 0x000000006d490d00=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 0x00007f=
3e9e97241c<br>0x7f3e94cdee90: 0x0000000000000002=C2=A0=C2=A0=C2=A0=C2=A0=C2=
=A0 0x00007f3e9e97241c<br>0x7f3e94cdeea0: 0x0000000000000002=C2=A0=C2=A0=C2=
=A0=C2=A0=C2=A0 0x00007f3e8d797630<br>0x7f3e94cdeeb0: 0x0000000000000002=C2=
=A0=C2=A0=C2=A0=C2=A0=C2=A0 0x00007f3e86effacc<br>0x7f3e94cdeec0: 0x0000000=
000000003=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 0x000000006d490d00<br>0x7f3e94cdeed=
0: 0x00007f3e86eb8698=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 0x0000000000000002<br>0=
x7f3e94cdeee0: 0x00007f3e8e51f450=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 0x000000000=
0000008<br>0x7f3e94cdeef0: 0x0000000001a37c10=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=
 0x000000000192acd0<br><br><br>So... who sets up thread-&gt;vp ? <br>/* #de=
fine VM_ENABLE_ASSERTIONS */<br>However, defining this breaks the build.<br=
><br>diff --git a/libguile/vm-engine.c b/libguile/vm-engine.c<br>index 6c88=
ebf..f0f8164 100644<br>--- a/libguile/vm-engine.c<br>+++ b/libguile/vm-engi=
ne.c<br>@@ -2050,11 +2050,11 @@ VM_NAME (scm_i_thread *thread, struct scm_v=
m *vp,<br>=C2=A0<br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=
=C2=A0 if (!scm_module_system_booted_p)<br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 {<br>-=C2=A0=C2=A0=C2=A0=C2=A0=
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 ASSERT (scm_is_true<=
br>+=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=
=A0=C2=A0 ASSERT (scm_is_true(<br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=
=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=
=C2=A0=C2=A0=C2=A0 scm_equal_p (modname,<br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=
=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 scm_list_2<br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=
=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 (SCM_BOOL_T,<br>-=C2=A0=C2=A0=C2=A0=C2=A0=C2=
=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=
=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 scm_from_utf8_symbol (&quot;guile&quot;))=
));<br>+=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=
=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 scm_f=
rom_utf8_symbol (&quot;guile&quot;)))));<br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 var =3D scm_lookup (=
sym);<br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=
=C2=A0 }<br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 el=
se if (scm_is_true (SCM_CAR (modname)))<br>diff --git a/libguile/vm.c b/lib=
guile/vm.c<br>index 18f2192..c26066e 100644<br>--- a/libguile/vm.c<br>+++ b=
/libguile/vm.c<br>@@ -63,7 +63,7 @@ static size_t page_size;<br>=C2=A0<br>=
=C2=A0/* The VM has a number of internal assertions that shouldn&#39;t norm=
ally be<br>=C2=A0=C2=A0=C2=A0 necessary, but might be if you think you foun=
d a bug in the VM. */<br>-/* #define VM_ENABLE_ASSERTIONS */<br>+#define VM=
_ENABLE_ASSERTIONS<br>=C2=A0<br>=C2=A0static void vm_expand_stack (struct s=
cm_vm *vp,<br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=
=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 union scm_vm_stack_element *new_sp)=
 SCM_NOINLINE;<br><br><br>=C2=A0GEN=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 guile-pro=
cedures.texi<br>/bin/bash: line 1: 24078 Broken pipe <br><br>=C2=A0./meta/g=
uile<br>Aborted<br>./libguile/guile<br>Aborted<br>gdb libguile/.libs/lt-gui=
le<br><br>#0=C2=A0 0x00007ffff74f7428 in __GI_raise (sig=3Dsig@entry=3D6)<b=
r>=C2=A0=C2=A0=C2=A0 at ../sysdeps/unix/sysv/linux/raise.c:54<br>#1=C2=A0 0=
x00007ffff74f902a in __GI_abort () at abort.c:89<br>#2=C2=A0 0x00007ffff7b7=
07c4 in vm_regular_engine (thread=3D0x7282, vp=3D0x806f30, <br>=C2=A0=C2=A0=
=C2=A0 registers=3D0x6, resume=3D-145787864) at ../../libguile/vm-engine.c:=
2053<br>#3=C2=A0 0x00007ffff7b711d2 in scm_call_n (proc=3D0x832b30, argv=3D=
argv@entry=3D0x0, <br>=C2=A0=C2=A0=C2=A0 nargs=3Dnargs@entry=3D0) at ../../=
libguile/vm.c:1257<br>#4=C2=A0 0x00007ffff7af14e9 in scm_call_0 (proc=3D&lt=
;optimized out&gt;)<br>=C2=A0=C2=A0=C2=A0 at ../../libguile/eval.c:481<br>#=
5=C2=A0 0x00007ffff7b10568 in scm_primitive_load_path (args=3D&lt;optimized=
 out&gt;)<br>=C2=A0=C2=A0=C2=A0 at ../../libguile/load.c:1248<br>#6=C2=A0 0=
x00007ffff7b6e976 in vm_regular_engine (thread=3D0x7282, vp=3D0x806f30, <br=
>=C2=A0=C2=A0=C2=A0 registers=3D0x6, resume=3D-145787864) at ../../libguile=
/vm-engine.c:784<br><br>Oh. wtf. <br><br>(define last-gc (gc-stats))<br>(de=
fine start-time (get-internal-real-time))<br>(define run-time (get-internal=
-run-time))<br>(define (avg-time-taken) <br>=C2=A0=C2=A0=C2=A0 (define now =
(get-internal-real-time))<br>=C2=A0=C2=A0=C2=A0 (define run (get-internal-r=
un-time))<br>=C2=A0=C2=A0=C2=A0 (define cur (gc-stats))<br>=C2=A0=C2=A0=C2=
=A0 (define gc-time-taken (* 1.0e-9 (- (cdar cur) (cdar last-gc))))<br>=C2=
=A0=C2=A0=C2=A0 (define elapsed-time (* 1.0e-9 (- now start-time)))<br>=C2=
=A0=C2=A0=C2=A0 (define cpu-time (* 1.0e-9 (- run run-time)))<br>=C2=A0=C2=
=A0=C2=A0 (format #t &quot;Elapsed time: ~5f secs. GC-time-taken: ~5f s or =
~5f%=C2=A0=C2=A0 cpu-usage: ~5f%\n&quot;<br>=C2=A0=C2=A0=C2=A0 =C2=A0=C2=A0=
=C2=A0 elapsed-time gc-time-taken=C2=A0 (* 100 (/ gc-time-taken elapsed-tim=
e))<br>=C2=A0=C2=A0=C2=A0 =C2=A0=C2=A0=C2=A0 (* 100 (/ cpu-time elapsed-tim=
e))<br>=C2=A0=C2=A0=C2=A0 )<br>=C2=A0=C2=A0=C2=A0 (set! last-gc cur)<br>=C2=
=A0=C2=A0=C2=A0 (set! start-time now)<br>=C2=A0=C2=A0=C2=A0 (set! run-time =
run)<br>)<br><br><br>linas@fanny: ~/src/novamente/src/opencog-git/opencog/n=
lp/learn $ vi x<br>linas@fanny: ~/src/novamente/src/opencog-git/opencog/nlp=
/learn $ cat x<br><br>#0=C2=A0 find_slot_map (cache=3D0x7f3e9b783ab0, ip=3D=
0x1) at<br>../../libguile/vm.c:935<br>935=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=
=A0 if (cache-&gt;entries[slot].ip =3D=3D ip)<br>(gdb) print slot<br>$1 =3D=
 0<br>(gdb) print cache-&gt;entries[slot].ip<br>$2 =3D (scm_t_uint32 *) 0x1=
<br>(gdb) print cache-&gt;entries[slot].map<br>$4 =3D (const scm_t_uint8 *)=
 0x0<br>(gdb) step<br>scm_i_vm_mark_stack (vp=3D0x755c1bd0, mark_stack_ptr=
=3D0x7f3e9b783f40, <br>=C2=A0=C2=A0=C2=A0 mark_stack_limit=3D0x7f3e9b793eb0=
) at ../../libguile/vm.c:1011<br>1011=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=
=C2=A0 slot_map =3D find_slot_map (SCM_FRAME_RETURN_ADDRESS (fp), &amp;cach=
e);<br>(gdb) print fp<br>$5 =3D (union scm_vm_stack_element *) 0x7f3e94cdee=
38<br><br>#define SCM_FRAME_RETURN_ADDRESS(fp)=C2=A0=C2=A0=C2=A0 ((fp)[0].a=
s_ip)<br>(gdb) print (fp)[0].as_ip<br>$6 =3D (scm_t_uint32 *) 0x1<br><br>OK=
 that looks weird ... is this corrupted ?? but whatever,<br>because the ret=
urned slot_map is never used ... because ...<br><br>(gdb) step<br>scm_i_vm_=
mark_stack (vp=3D0x755c1bd0, mark_stack_ptr=3D0x7f3e9b783f40, <br>=C2=A0=C2=
=A0=C2=A0 mark_stack_limit=3D0x7f3e9b793eb0) at ../../libguile/vm.c:975<br>=
975=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 fp =
=3D SCM_FRAME_DYNAMIC_LINK (fp))<br>(gdb) print fp<br>$7 =3D (union scm_vm_=
stack_element *) 0x7f3e94cdee38<br><br>frames.h:#define SCM_FRAME_DYNAMIC_L=
INK(fp)=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 ((fp) + (fp)[1].as_uint)<br>(gdb) pri=
nt (fp)[1].as_uint<br>$8 =3D 0<br><br>OK, that seems bad, because now fp ne=
ver advances, it just repeats<br>over and over with this same value.<br><br=
>(gdb) <br>979=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 =
for (slot =3D nlocals - 1; sp &lt; fp; sp++, slot--)<br>(gdb) print nlocals=
<br>$10 =3D -2<br>(gdb) print sp<br>$11 =3D (union scm_vm_stack_element *) =
0x7f3e94cdee48<br>#define SCM_FRAME_NUM_LOCALS(fp, sp)=C2=A0=C2=A0=C2=A0 ((=
fp) - (sp))<br>(gdb) print fp<br>$12 =3D (union scm_vm_stack_element *) 0x7=
f3e94cdee38<br>(gdb) print=C2=A0 ((fp) - (sp))<br>$13 =3D -2<br>Ohh .. its =
not -16 because its -2 * sizeof (union scm_vm_stack_element *)<br>so that&#=
39;s OK.<br>So for loops is skipped, it should go to.<br>=C2=A0=C2=A0=C2=A0=
=C2=A0=C2=A0 sp =3D SCM_FRAME_PREVIOUS_SP (fp);<br>frames.h:#define SCM_FRA=
ME_PREVIOUS_SP(fp)=C2=A0=C2=A0=C2=A0 ((fp) + 2)<br><br>and so now it loops =
around and repeats.<br>(gdb) print cache<br>$19 =3D {entries =3D {{ip =3D 0=
x1, map =3D 0x0}, {ip =3D 0x0, <br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 map =3D 0=
x0} &lt;repeats 23 times&gt;, {ip =3D 0x1a1b5e0, map =3D 0x0}, {ip =3D 0x0,=
 <br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 map =3D 0x0}, {ip =3D 0x0, map =3D 0x0}=
, {ip =3D 0x0, map =3D 0x0}, {ip =3D 0x0, <br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=
=A0 map =3D 0x0}, {ip =3D 0x0, map =3D 0x0}, {ip =3D 0x0, map =3D 0x0}, {ip=
 =3D 0x0, <br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 map =3D 0x0}}}<br>(gdb) print =
&amp;cache<br>$20 =3D (struct slot_map_cache *) 0x7f3e9b783ab0<br><br>and s=
o the loop repeates forever, because <br>fp =3D SCM_FRAME_DYNAMIC_LINK (fp)=
 never advances fp, because<br>(fp)[1].as_uint=C2=A0 is zero.<br><br>So whe=
re is fp pointing to?=C2=A0 recall fp =3D=3D 0x7f3e94cdee38<br><br>(gdb) x/=
20x 0x7f3e94cdee00<br>0x7f3e94cdee00: 0x0000000000000904=C2=A0=C2=A0=C2=A0=
=C2=A0=C2=A0 0x000000000000200c<br>0x7f3e94cdee10: 0x0000000001a1b7e0=C2=A0=
=C2=A0=C2=A0=C2=A0=C2=A0 0x0000000001a1b5e0<br>0x7f3e94cdee20: 0x0000000000=
000004=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 0x000000000000200c<br>0x7f3e94cdee30: =
0x0000000000169bd6=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 0x0000000000000001<br>0x7f=
3e94cdee40: 0x0000000000000000=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 0x000000000192=
acd0<br>0x7f3e94cdee50: 0x0000000001fa4bc0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 0x=
0000000001a1c6d0<br>0x7f3e94cdee60: 0x0000000044507950=C2=A0=C2=A0=C2=A0=C2=
=A0=C2=A0 0x0000000000000002<br>0x7f3e94cdee70: 0x000000000005a6f5=C2=A0=C2=
=A0=C2=A0=C2=A0=C2=A0 0x00000000018febd0<br>0x7f3e94cdee80: 0x000000006d490=
d00=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 0x00007f3e9e97241c<br>0x7f3e94cdee90: 0x0=
000000000000002=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 0x00007f3e9e97241c<br><br><br=
>(gdb) x/s 0x7f3e94cdee60<br>0x7f3e94cdee60: &quot;PyPD&quot;=C2=A0=C2=A0=
=C2=A0=C2=A0=C2=A0 &lt;&lt;&lt; ?? is this a meaningul string?<br>(gdb) x/s=
 0x7f3e94cdee81<br>0x7f3e94cdee81: &quot;\rIm&quot;=C2=A0=C2=A0=C2=A0=C2=A0=
=C2=A0 &lt;&lt; how about this ???<br><br>So this does not look healthy.=C2=
=A0 How did we get here?<br><br>(gdb) print vp<br>$21 =3D (struct scm_vm *)=
 0x755c1bd0<br>(gdb) print mark_stack_ptr<br>$22 =3D (struct GC_ms_entry *)=
 0x7f3e9b783f40<br>(gdb) print mark_stack_limit<br>$23 =3D (struct GC_ms_en=
try *) 0x7f3e9b793eb0<br>(gdb) print upper<br>$24 =3D (void *) 0x7ffdd986<b=
r>(gdb) print GC_greatest_plausible_heap_addr<br>$25 =3D (void *) 0x7ffdd98=
6<br>(gdb) print lower<br>$26 =3D (void *) 0x17a1ff8<br><br>(gdb) print vp-=
&gt;fp<br>$27 =3D (union scm_vm_stack_element *) 0x7f3e94cdee18<br>(gdb) pr=
int vp-&gt;sp<br>$28 =3D (union scm_vm_stack_element *) 0x7f3e94cdee00<br>(=
gdb) print=C2=A0 vp-&gt;stack_top<br>$29 =3D (union scm_vm_stack_element *)=
 0x7f3e94cdf000<br><br>So lets walk manually. <br>frames.h:#define SCM_FRAM=
E_DYNAMIC_LINK(fp)=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 ((fp) + (fp)[1].as_uint)<b=
r><br>(gdb) print (vp-&gt;fp) + (vp-&gt;fp)[1].as_uint<br>$31 =3D (union sc=
m_vm_stack_element *) 0x7f3e94cdee38<br>So we are instanly in trouble! once=
 through the loop, and we&#39;ve got the broken fp<br><br>So how did we get=
 here?<br><br>(gdb) print vp<br>$32 =3D (struct scm_vm *) 0x755c1bd0<br>(gd=
b) print *vp<br>$33 =3D {ip =3D 0x7f3e9e5a6378 &lt;subr_stub_code+56&gt;, s=
p =3D 0x7f3e94cdee00, <br>=C2=A0 fp =3D 0x7f3e94cdee18, stack_limit =3D 0x7=
f3e94cde000, trace_level =3D 0, <br>=C2=A0 sp_min_since_gc =3D 0x7f3e94cdee=
00, stack_size =3D 512, <br>=C2=A0 stack_bottom =3D 0x7f3e94cde000, stack_t=
op =3D 0x7f3e94cdf000, <br>=C2=A0 overflow_handler_stack =3D 0x304, hooks =
=3D {0x4, 0x4, 0x4, 0x4, 0x4}, <br>=C2=A0 resumable_prompt_cookie =3D 0x7f3=
d397f9440, engine =3D 1}<br><br>Looks plausible, not corrupted. Why is engi=
ne=3D1 ?? <br>#define SCM_VM_REGULAR_ENGINE 0<br>#define SCM_VM_DEBUG_ENGIN=
E 1<br>#define SCM_VM_NUM_ENGINES 2<br>#=C2=A0 define SCM_I_CURRENT_THREAD =
(scm_i_current_thread)<br><br>(gdb) print scm_i_current_thread<br>$34 =3D (=
scm_i_thread *) 0x0<br><br>OK, that&#39;s not excellent.<br>(gdb) print thr=
ead_count<br>$35 =3D 161<br><br>(gdb) bt<br>#0=C2=A0 scm_i_vm_mark_stack (v=
p=3D0x755c1bd0, mark_stack_ptr=3D0x7f3e9b783f40, <br>=C2=A0=C2=A0=C2=A0 mar=
k_stack_limit=3D0x7f3e9b793eb0) at ../../libguile/vm.c:1011<br>#1=C2=A0 0x0=
0007f3e9db8835e in GC_mark_from (mark_stack_top=3D0x7f3e9b783ee0, <br>=C2=
=A0=C2=A0=C2=A0 mark_stack_top@entry=3D0x7f3e9b783f00, <br>=C2=A0=C2=A0=C2=
=A0 mark_stack=3Dmark_stack@entry=3D0x7f3e9b783eb0, <br>=C2=A0=C2=A0=C2=A0 =
mark_stack_limit=3Dmark_stack_limit@entry=3D0x7f3e9b793eb0) at ../mark.c:77=
2<br>#2=C2=A0 0x00007f3e9db8897e in GC_do_local_mark (local_mark_stack=3D0x=
7f3e9b783eb0, <br>=C2=A0=C2=A0=C2=A0 local_top=3D0x7f3e9b783f00) at ../mark=
.c:1037<br>#3=C2=A0 0x00007f3e9db88b98 in GC_mark_local (<br>=C2=A0=C2=A0=
=C2=A0 local_mark_stack=3Dlocal_mark_stack@entry=3D0x7f3e9b783eb0, id=3Did@=
entry=3D4)<br>=C2=A0=C2=A0=C2=A0 at ../mark.c:1170<br>#4=C2=A0 0x00007f3e9d=
b88eaa in GC_help_marker (my_mark_no=3Dmy_mark_no@entry=3D80003)<br>=C2=A0=
=C2=A0=C2=A0 at ../mark.c:1238<br>#5=C2=A0 0x00007f3e9db92e3c in GC_mark_th=
read (id=3D&lt;optimized out&gt;)<br>=C2=A0=C2=A0=C2=A0 at ../pthread_suppo=
rt.c:380<br>#6=C2=A0 0x00007f3e9e2ae6ba in start_thread (arg=3D0x7f3e9b7947=
00)<br>=C2=A0=C2=A0=C2=A0 at pthread_create.c:333<br>#7=C2=A0 0x00007f3e9df=
dd82d in clone ()<br>=C2=A0=C2=A0=C2=A0 at ../sysdeps/unix/sysv/linux/x86_6=
4/clone.S:109<br><br>so bdwgc is version gc7_6_0<br><br>So ../mark.c:772 is=
 GET_HDR and GET_HDR is in<br>include/private/gc_hdrs.h<br><br>define GET_H=
DR(p, hhdr)=C2=A0 ... ((hhdr) =3D HDR_INNER(p) <br># define HDR_INNER(p) HD=
R_FROM_BI(BI(p),p)<br>#define HDR_FROM_BI(bi, p) \<br>=C2=A0=C2=A0=C2=A0=C2=
=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 ((bi)=
-&gt;index[((word)(p) &gt;&gt; LOG_HBLKSIZE) &amp; (BOTTOM_SZ - 1)])<br><br=
>This seems all wrong.. wtf ....<br><br>is it called by thread_mark ??<br>i=
n libguile/thread.c:<br>=C2=A0 thread_gc_kind =3D<br>=C2=A0=C2=A0=C2=A0 GC_=
new_kind (GC_new_free_list (),<br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 GC_M=
AKE_PROC (GC_new_proc (thread_mark), 0),<br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=
=C2=A0 0, 1)<br>in bdwgc/misc.c=C2=A0 GC_new_proc<br>=C2=A0=C2=A0=C2=A0 GC_=
mark_procs[result] =3D proc<br>#define GC_mark_procs GC_arrays._mark_procs<=
br>#define PROC(descr) \<br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 (GC_mark_procs[(=
(descr) &gt;&gt; GC_DS_TAG_BITS) &amp; (GC_MAX_MARK_PROCS-1)])<br><br>=C2=
=A0=C2=A0=C2=A0=C2=A0=C2=A0 GC_obj_kinds[result].ok_descriptor =3D descr;=
=C2=A0 &lt;&lt;&lt; this is where MAKE_PROC goes.<br><br>anyway ... in mark=
.c line 733<br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 case GC_DS_PROC:<=
br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 mark_stack_top--;=
<br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 credit -=3D GC_P=
ROC_BYTES;<br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 mark_s=
tack_top =3D (*PROC(descr))((word *)current_p, mark_stack_top,<br>=C2=A0=C2=
=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=
=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=
=C2=A0=C2=A0 mark_stack_limit, ENV(descr));<br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=
=A0=C2=A0=C2=A0=C2=A0=C2=A0 continue;<br>But that&#39;s the wrong line numb=
er.=C2=A0 line 772 is in <br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 cas=
e GC_DS_PER_OBJECT:<br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=
=A0 if ((signed_word)descr &gt;=3D 0) {<br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 /* Descriptor is in the object.=C2=A0=
=C2=A0=C2=A0=C2=A0 */<br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=
=C2=A0=C2=A0=C2=A0 descr =3D *(word *)(current_p + descr - GC_DS_PER_OBJECT=
);<br><br>Oh, it goes to retry, which loops around and calls PROC which is =
thread_mark<br>which is called with args <br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=
=C2=A0=C2=A0=C2=A0=C2=A0 mark_stack_top =3D (*PROC(descr))((word *)current_=
p, mark_stack_top,<br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=
=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=
=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 mark_stack_limit, ENV(descr))=
;<br><br>(gdb) print current_p<br>$39 =3D (ptr_t) 0x7225d380 &quot;@\325%r&=
quot;<br><br>so back in libguile/threads.c line 84:<br>=C2=A0 const struct =
scm_i_thread *t =3D (struct scm_i_thread *) addr;<br><br>(gdb) print *(stru=
ct scm_i_thread *) current_p<br>$40 =3D {next_thread =3D 0x7225d540, handle=
 =3D 0x5763f3b0, pthread =3D 139900934399744, <br>=C2=A0 result =3D 0x4, ex=
ited =3D 0, guile_mode =3D 1, needs_unregister =3D 1, wake =3D 0x0, <br>=C2=
=A0 sleep_cond =3D {__data =3D {__lock =3D 0, __futex =3D 0, __total_seq =
=3D 0, <br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 __wakeup_seq =3D 0, __woken_seq =
=3D 0, __mutex =3D 0x0, __nwaiters =3D 0, <br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=
=A0 __broadcast_seq =3D 0}, __size =3D &#39;\000&#39; &lt;repeats 47 times&=
gt;, __align =3D 0}, <br>=C2=A0 sleep_pipe =3D {151, 197}, freelists =3D 0x=
71cfe800, <br>=C2=A0 pointerless_freelists =3D 0x71cfe780, dynamic_state =
=3D 0x71de7900, dynstack =3D {<br>=C2=A0=C2=A0=C2=A0 base =3D 0x698c3800, t=
op =3D 0x698c3940, limit =3D 0x698c3a00}, <br>=C2=A0 pending_asyncs =3D 0x3=
04, block_asyncs =3D 0, continuation_root =3D 0x5763f340, <br>=C2=A0 contin=
uation_base =3D 0x7f3d397f9c90, vp =3D 0x755c1bd0, base =3D 0x7f3d397f9da0,=
 <br>=C2=A0 regs =3D {{__jmpbuf =3D {139905204221696, 139906923931021, 1, 1=
39894281647232, <br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 139906913276=
512, 0, 139906918327551, 8388608}, <br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 __mas=
k_was_saved =3D 964664752, __saved_mask =3D {__val =3D {0, 139900934399744,=
 <br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 139900934399744=
, 139900934397344, 139906923929400, 139893936840720, <br>=C2=A0=C2=A0=C2=A0=
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 1804170656, 139906913276512, 139900934=
399744, 139900934397344, <br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=
=A0=C2=A0 139900934397504, 139893936840720, 139906910859134, 13990691327648=
4, <br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 1399069108571=
12, 27401856}}}}}<br><br>(gdb) print ((struct scm_i_thread *) current_p)-&g=
t;vp<br>$41 =3D (struct scm_vm *) 0x755c1bd0<br><br>which is exactly the vp=
 we had before, so this is the correct thread.<br><br>So=C2=A0 ... all the =
values in *(struct scm_i_thread *) current_p look more or less OK;<br>its n=
ot corrupted. And all the values in *vp look more or less OK, its not corrr=
upted<br>except that engine=3D1 looks wrong.=C2=A0 So what&#39;s up with th=
e weird stack looping?<br>Again: <br>From before, we had:<br><br>(gdb) prin=
t *vp<br>$42 =3D {ip =3D 0x7f3e9e5a6378 &lt;subr_stub_code+56&gt;, sp =3D 0=
x7f3e94cdee00, <br>=C2=A0 fp =3D 0x7f3e94cdee18, stack_limit =3D 0x7f3e94cd=
e000, trace_level =3D 0, <br>=C2=A0 sp_min_since_gc =3D 0x7f3e94cdee00, sta=
ck_size =3D 512, <br>=C2=A0 stack_bottom =3D 0x7f3e94cde000, stack_top =3D =
0x7f3e94cdf000, <br>=C2=A0 overflow_handler_stack =3D 0x304, hooks =3D {0x4=
, 0x4, 0x4, 0x4, 0x4}, <br>=C2=A0 resumable_prompt_cookie =3D 0x7f3d397f944=
0, engine =3D 1}<br><br>so fp, sp, stack_bottom, stack_top all are self-con=
sistent with each other.<br>The problem is that<br>=C2=A0=C2=A0=C2=A0 fp =
=3D SCM_FRAME_DYNAMIC_LINK (fp)<br>goes nowhere, after the first iteration.=
<br><br>Is the frame corrupted? How? what does the rest of the stack look l=
ike?<br><br>(gdb) x/32x 0x7f3e94cdee00<br>0x7f3e94cdee00: 0x000000000000090=
4=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 0x000000000000200c<br>0x7f3e94cdee10: 0x000=
0000001a1b7e0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 0x0000000001a1b5e0<br>0x7f3e94c=
dee20: 0x0000000000000004=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 0x000000000000200c=
=C2=A0 &lt;&lt;&lt; theres the 4<br>0x7f3e94cdee30: 0x0000000000169bd6=C2=
=A0=C2=A0=C2=A0=C2=A0=C2=A0 0x0000000000000001 &lt;&lt;&lt; there&#39;s the=
 new fp<br>0x7f3e94cdee40: 0x0000000000000000=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=
 0x000000000192acd0 &lt;&lt;&lt; there&#39;s the zero.<br>0x7f3e94cdee50: 0=
x0000000001fa4bc0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 0x0000000001a1c6d0<br>0x7f3=
e94cdee60: 0x0000000044507950=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 0x0000000000000=
002<br>0x7f3e94cdee70: 0x000000000005a6f5=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 0x0=
0000000018febd0<br>0x7f3e94cdee80: 0x000000006d490d00=C2=A0=C2=A0=C2=A0=C2=
=A0=C2=A0 0x00007f3e9e97241c<br>0x7f3e94cdee90: 0x0000000000000002=C2=A0=C2=
=A0=C2=A0=C2=A0=C2=A0 0x00007f3e9e97241c<br>0x7f3e94cdeea0: 0x0000000000000=
002=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 0x00007f3e8d797630<br>0x7f3e94cdeeb0: 0x0=
000000000000002=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 0x00007f3e86effacc<br>0x7f3e9=
4cdeec0: 0x0000000000000003=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 0x000000006d490d0=
0<br>0x7f3e94cdeed0: 0x00007f3e86eb8698=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 0x000=
0000000000002<br>0x7f3e94cdeee0: 0x00007f3e8e51f450=C2=A0=C2=A0=C2=A0=C2=A0=
=C2=A0 0x0000000000000008<br>0x7f3e94cdeef0: 0x0000000001a37c10=C2=A0=C2=A0=
=C2=A0=C2=A0=C2=A0 0x000000000192acd0<br><br><br>So... who sets up thread-&=
gt;vp ? <br>/* #define VM_ENABLE_ASSERTIONS */<br>However, defining this br=
eaks the build.<br><br>diff --git a/libguile/vm-engine.c b/libguile/vm-engi=
ne.c<br>index 6c88ebf..f0f8164 100644<br>--- a/libguile/vm-engine.c<br>+++ =
b/libguile/vm-engine.c<br>@@ -2050,11 +2050,11 @@ VM_NAME (scm_i_thread *th=
read, struct scm_vm *vp,<br>=C2=A0<br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=
=C2=A0=C2=A0=C2=A0=C2=A0 if (!scm_module_system_booted_p)<br>=C2=A0=C2=A0=
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 {<br>-=C2=A0=
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 AS=
SERT (scm_is_true<br>+=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=
=A0=C2=A0=C2=A0=C2=A0=C2=A0 ASSERT (scm_is_true(<br>=C2=A0=C2=A0=C2=A0=C2=
=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 scm_equal_p (modname,<br>=C2=A0=C2=A0=
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=
=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 scm_list_2<br>=C2=A0=C2=A0=
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=
=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 (SCM_BOOL_T,<br>-=C2=A0=C2=
=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=
=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 scm_from_utf8_symbol (&=
quot;guile&quot;))));<br>+=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=
=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=
=C2=A0=C2=A0 scm_from_utf8_symbol (&quot;guile&quot;)))));<br>=C2=A0=C2=A0=
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 va=
r =3D scm_lookup (sym);<br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=
=C2=A0=C2=A0=C2=A0=C2=A0 }<br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=
=A0=C2=A0=C2=A0 else if (scm_is_true (SCM_CAR (modname)))<br>diff --git a/l=
ibguile/vm.c b/libguile/vm.c<br>index 18f2192..c26066e 100644<br>--- a/libg=
uile/vm.c<br>+++ b/libguile/vm.c<br>@@ -63,7 +63,7 @@ static size_t page_si=
ze;<br>=C2=A0<br>=C2=A0/* The VM has a number of internal assertions that s=
houldn&#39;t normally be<br>=C2=A0=C2=A0=C2=A0 necessary, but might be if y=
ou think you found a bug in the VM. */<br>-/* #define VM_ENABLE_ASSERTIONS =
*/<br>+#define VM_ENABLE_ASSERTIONS<br>=C2=A0<br>=C2=A0static void vm_expan=
d_stack (struct scm_vm *vp,<br>=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=
=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 union scm_vm_stac=
k_element *new_sp) SCM_NOINLINE;<br><br><br>=C2=A0GEN=C2=A0=C2=A0=C2=A0=C2=
=A0=C2=A0 guile-procedures.texi<br>/bin/bash: line 1: 24078 Broken pipe <br=
><br>=C2=A0./meta/guile<br>Aborted<br>./libguile/guile<br>Aborted<br>gdb li=
bguile/.libs/lt-guile<br><br>#0=C2=A0 0x00007ffff74f7428 in __GI_raise (sig=
=3Dsig@entry=3D6)<br>=C2=A0=C2=A0=C2=A0 at ../sysdeps/unix/sysv/linux/raise=
.c:54<br>#1=C2=A0 0x00007ffff74f902a in __GI_abort () at abort.c:89<br>#2=
=C2=A0 0x00007ffff7b707c4 in vm_regular_engine (thread=3D0x7282, vp=3D0x806=
f30, <br>=C2=A0=C2=A0=C2=A0 registers=3D0x6, resume=3D-145787864) at ../../=
libguile/vm-engine.c:2053<br>#3=C2=A0 0x00007ffff7b711d2 in scm_call_n (pro=
c=3D0x832b30, argv=3Dargv@entry=3D0x0, <br>=C2=A0=C2=A0=C2=A0 nargs=3Dnargs=
@entry=3D0) at ../../libguile/vm.c:1257<br>#4=C2=A0 0x00007ffff7af14e9 in s=
cm_call_0 (proc=3D&lt;optimized out&gt;)<br>=C2=A0=C2=A0=C2=A0 at ../../lib=
guile/eval.c:481<br>#5=C2=A0 0x00007ffff7b10568 in scm_primitive_load_path =
(args=3D&lt;optimized out&gt;)<br>=C2=A0=C2=A0=C2=A0 at ../../libguile/load=
.c:1248<br>#6=C2=A0 0x00007ffff7b6e976 in vm_regular_engine (thread=3D0x728=
2, vp=3D0x806f30, <br>=C2=A0=C2=A0=C2=A0 registers=3D0x6, resume=3D-1457878=
64) at ../../libguile/vm-engine.c:784<br><br><br></div></div>

--f403045ea6d679a79b0551239b33--




Message sent:


Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
X-Mailer: MIME-tools 5.505 (Entity 5.505)
Content-Type: text/plain; charset=utf-8
X-Loop: help-debbugs@HIDDEN
From: help-debbugs@HIDDEN (GNU bug Tracking System)
To: linasvepstas@HIDDEN
Subject: bug#27234: Acknowledgement (Hang in GC, inf loop while walking
 frame pointers)
Message-ID: <handler.27234.B.149658828914303.ack <at> debbugs.gnu.org>
References: <CAHrUA361VtW9ZOgEbxTYsDi3K7WY4YVaGA=YAK3VQKSVs8Bc_g@HIDDEN>
X-Gnu-PR-Message: ack 27234
X-Gnu-PR-Package: guile
Reply-To: 27234 <at> debbugs.gnu.org
Date: Sun, 04 Jun 2017 14:59:02 +0000

Thank you for filing a new bug report with debbugs.gnu.org.

This is an automatically generated reply to let you know your message
has been received.

Your message is being forwarded to the package maintainers and other
interested parties for their attention; they will reply in due course.

Your message has been sent to the package maintainer(s):
 bug-guile@HIDDEN

If you wish to submit further information on this problem, please
send it to 27234 <at> debbugs.gnu.org.

Please do not send mail to help-debbugs@HIDDEN unless you wish
to report a problem with the Bug-tracking system.

--=20
27234: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=3D27234
GNU Bug Tracking System
Contact help-debbugs@HIDDEN with problems


Message sent to bug-guile@HIDDEN:


X-Loop: help-debbugs@HIDDEN
Subject: bug#27234: Followup.
References: <CAHrUA361VtW9ZOgEbxTYsDi3K7WY4YVaGA=YAK3VQKSVs8Bc_g@HIDDEN>
In-Reply-To: <CAHrUA361VtW9ZOgEbxTYsDi3K7WY4YVaGA=YAK3VQKSVs8Bc_g@HIDDEN>
Resent-From: Linas Vepstas <linasvepstas@HIDDEN>
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-guile@HIDDEN
Resent-Date: Sat, 09 Sep 2017 20:45:02 +0000
Resent-Message-ID: <handler.27234.B27234.150498987927783 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: followup 27234
X-GNU-PR-Package: guile
X-GNU-PR-Keywords: 
To: 27234 <at> debbugs.gnu.org
Reply-To: linasvepstas@HIDDEN
Received: via spool by 27234-submit <at> debbugs.gnu.org id=B27234.150498987927783
          (code B ref 27234); Sat, 09 Sep 2017 20:45:02 +0000
Received: (at 27234) by debbugs.gnu.org; 9 Sep 2017 20:44:39 +0000
Received: from localhost ([127.0.0.1]:57959 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1dqmcJ-0007E3-IR
	for submit <at> debbugs.gnu.org; Sat, 09 Sep 2017 16:44:39 -0400
Received: from mail-lf0-f50.google.com ([209.85.215.50]:34334)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <linasvepstas@HIDDEN>) id 1dqmcI-0007Do-1z
 for 27234 <at> debbugs.gnu.org; Sat, 09 Sep 2017 16:44:38 -0400
Received: by mail-lf0-f50.google.com with SMTP id l196so11421459lfl.1
 for <27234 <at> debbugs.gnu.org>; Sat, 09 Sep 2017 13:44:37 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=mime-version:reply-to:from:date:message-id:subject:to;
 bh=yZ86tWlta7Y5z6FPek4nVjpJC2CgQHZruB5di6MzktU=;
 b=IZgHi1fMyjx0k4gS9a64BduJ0pSwK2+5xiQYXiK8d5Lu1lLIoGGBC6l3fFeyNwMmJk
 atnEFPWuuB3NiJlX/XlBSlJ5B6ImHEOKn/l0AgmydoVYMmnkVYB8yFAMIjMYlEHKkzg0
 lR8gCuAG1j4ptq4XEMw1040kD9Rn58eQ+8zmteH6QSey+k0c6Zxl7vwkcTHD7HUEEfkb
 r6UpnFXRdN4rUsFJ7XBiIESppTPQqALwhaQbGvqmWaJ2ZGFj5McNXjBfFh728Ctm1gbU
 nPOwlxCQVDjl52caBZvDhCyyCp5TZ7PZ8caHXiF+8Wmf0X5PUP6xiv92+Dz7FVbX7qX6
 /g1w==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:mime-version:reply-to:from:date:message-id
 :subject:to;
 bh=yZ86tWlta7Y5z6FPek4nVjpJC2CgQHZruB5di6MzktU=;
 b=H92KYGGYEo7jP2aTLz70F9Oe2Xb3Oz7lpVTuBv0ik3dPEd75RfWbOX7At0GIu8V8L1
 ToRHVUdr5gGDQmnIx8I3HW+msNYVIAAqHV5jCJ6pmipnZtmorQTgl0yNAJ2lMKkyqj8/
 Ng25/Ju2uv0fsaiVKMldvFl3uPsS846l3XuxaO93x7iO16rPPfDOCLlctB7OQpqdJZr2
 llHgdA7pEBvzU3yDcj4v20kWzETHYRVrbJhMlZQNijFs0zgSL6nm595jaclAhj4eX7t3
 OONEB8o4JHmv1K6FLyqGTKZK3APF/+lSKg7z979zgjzTPqldeeKqaKb83r5PnRKrQeTF
 WaxA==
X-Gm-Message-State: AHPjjUi1BVSWaFPuJ8TuRiy/haQ1on6KnNgd54jq4CqlFB9I1q2LQ/tc
 A6OVggz1r1bn3dOQ41zwkEDS/Yz7DQ==
X-Google-Smtp-Source: AOwi7QBUAF4sbtaURAw6Zj1bKmUhi+kUf/9JHRMFy2raexKQqCYSEdHG3p+cTio8BPtZccCQjz4ja7BEGC80lPV+4y0=
X-Received: by 10.25.202.23 with SMTP id a23mr1717732lfg.260.1504989871998;
 Sat, 09 Sep 2017 13:44:31 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.25.44.200 with HTTP; Sat, 9 Sep 2017 13:44:11 -0700 (PDT)
From: Linas Vepstas <linasvepstas@HIDDEN>
Date: Sat, 9 Sep 2017 15:44:11 -0500
Message-ID: <CAHrUA37zT6FU606vJhDD250nByzpkmj9_DBVvNgjXE3PAdf1sg@HIDDEN>
Content-Type: multipart/alternative; boundary="f403045ea792b60b930558c7c250"
X-Spam-Score: 0.5 (/)
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: 0.5 (/)

--f403045ea792b60b930558c7c250
Content-Type: text/plain; charset="UTF-8"

The original bug report failed to explain how to trigger the bug.  I think
I know of a simple way to trigger the bug, but have not yet created the
simple test-case.  I believe that the bug can be triggered like so:

Create some medium-sized "random" (arbitrary) snippet of guile code. Send
it to the guile repl server.  Do it again and again, as fast as possible.
The bug will trigger.

By "arbitary/random" I mean some code that is simple but non-repeating:
say, concatenate some random strings, count the number of letters in them,
take the square root -- just do assorted, non-repeating computations that
force the REPL to evaluate some new blob of scheme code every time.

Now, once out of every 5 or 50 times, introduce a bug in the scheme code,
causing the REPL to throw an exception.  Using ut8 strings might also be a
required ingredient.

I believe that this will trigger the bug. I believe that the bug is some
unprotected section in the guile interpreter, and a race condition clobbers
the guile stack(s) and it all goes downhill from here.

--f403045ea792b60b930558c7c250
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

<div dir=3D"ltr"><br clear=3D"all"><div>The original bug report failed to e=
xplain how to trigger the bug.=C2=A0 I think I know of a simple way to trig=
ger the bug, but have not yet created the simple test-case.=C2=A0 I believe=
 that the bug can be triggered like so:</div><div><br></div><div>Create som=
e medium-sized &quot;random&quot; (arbitrary) snippet of guile code. Send i=
t to the guile repl server.=C2=A0 Do it again and again, as fast as possibl=
e.=C2=A0 The bug will trigger.</div><div><br></div><div>By &quot;arbitary/r=
andom&quot; I mean some code that is simple but non-repeating: say, concate=
nate some random strings, count the number of letters in them, take the squ=
are root -- just do assorted, non-repeating computations that force the REP=
L to evaluate some new blob of scheme code every time.</div><div><br></div>=
<div>Now, once out of every 5 or 50 times, introduce a bug in the scheme co=
de, causing the REPL to throw an exception.=C2=A0 Using ut8 strings might a=
lso be a required ingredient.</div><div><br></div><div>I believe that this =
will trigger the bug. I believe that the bug is some unprotected section in=
 the guile interpreter, and a race condition clobbers the guile stack(s) an=
d it all goes downhill from here.</div><div><br></div><br>
</div>

--f403045ea792b60b930558c7c250--





Last modified: Mon, 25 Nov 2019 12:00:02 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.