GNU bug report logs -
#42943
28.0.50; Emacsclient crashes in ftcrfont_glyph_extents
Previous Next
Reported by: "Basil L. Contovounesios" <contovob <at> tcd.ie>
Date: Thu, 20 Aug 2020 00:48:01 UTC
Severity: normal
Tags: fixed
Found in version 28.0.50
Fixed in version 28.1
Done: "Basil L. Contovounesios" <contovob <at> tcd.ie>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 42943 in the body.
You can then email your comments to 42943 AT debbugs.gnu.org in the normal way.
Toggle the display of automated, internal messages from the tracker.
Report forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#42943
; Package
emacs
.
(Thu, 20 Aug 2020 00:48:01 GMT)
Full text and
rfc822 format available.
Acknowledgement sent
to
"Basil L. Contovounesios" <contovob <at> tcd.ie>
:
New bug report received and forwarded. Copy sent to
bug-gnu-emacs <at> gnu.org
.
(Thu, 20 Aug 2020 00:48:01 GMT)
Full text and
rfc822 format available.
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
From the project-root:
0. cd src
1. gdb ./emacs
2. set logging on
3. r -Q --fg-daemon
From the project-root in another shell:
4. ./lib-src/emacsclient -c
5. C-\ arabic RET
6. s ~ A
7. C-x 5 0
8. ./lib-src/emacsclient -c
Back to GDB:
9. bt full
I attach the resulting backtrace.
I then recompiled ftcrfont.c with the following printf sprinkles:
[printf-debug.diff (text/x-diff, inline)]
diff --git a/src/ftcrfont.c b/src/ftcrfont.c
index 29813c8d7d..7832d4f5ce 100644
--- b/src/ftcrfont.c
+++ a/src/ftcrfont.c
@@ -52,11 +52,19 @@
row = glyph / METRICS_NCOLS_PER_ROW;
col = glyph % METRICS_NCOLS_PER_ROW;
+ fprintf (stderr, "\nglyph %u row %d col %d nrows %d",
+ glyph, row, col, ftcrfont_info->metrics_nrows);
if (row >= ftcrfont_info->metrics_nrows)
{
+ fprintf (stderr, " realloc %p %zu", ftcrfont_info->metrics,
+ sizeof (struct font_metrics *) * (row + 1));
ftcrfont_info->metrics =
xrealloc (ftcrfont_info->metrics,
sizeof (struct font_metrics *) * (row + 1));
+ fprintf (stderr, " memset %p %zu",
+ ftcrfont_info->metrics + ftcrfont_info->metrics_nrows,
+ (sizeof (struct font_metrics *)
+ * (row + 1 - ftcrfont_info->metrics_nrows)));
memset (ftcrfont_info->metrics + ftcrfont_info->metrics_nrows, 0,
(sizeof (struct font_metrics *)
* (row + 1 - ftcrfont_info->metrics_nrows)));
@@ -68,10 +76,15 @@
int i;
new = xmalloc (sizeof (struct font_metrics) * METRICS_NCOLS_PER_ROW);
+ fprintf (stderr, " malloc %p %zu", new,
+ sizeof (struct font_metrics) * METRICS_NCOLS_PER_ROW);
for (i = 0; i < METRICS_NCOLS_PER_ROW; i++)
METRICS_SET_STATUS (new + i, METRICS_INVALID);
ftcrfont_info->metrics[row] = new;
}
+ fprintf (stderr, " m %p r %p c %p\n", ftcrfont_info->metrics,
+ ftcrfont_info->metrics[row],
+ ftcrfont_info->metrics[row] + col);
cache = ftcrfont_info->metrics[row] + col;
if (METRICS_STATUS (cache) == METRICS_INVALID)
@@ -503,8 +516,10 @@
}
glyphs = alloca (sizeof (cairo_glyph_t) * len);
+ fprintf (stderr, "sz %zu len %d", sizeof (cairo_glyph_t), len);
for (i = 0; i < len; i++)
{
+ fprintf (stderr, " from %d i %d", from, i);
glyphs[i].index = s->char2b[from + i];
glyphs[i].x = x;
glyphs[i].y = y;
[Message part 3 (text/plain, inline)]
and repeated the recipe. IIRC the segfault didn't happen in step 8 the
first time; I had to repeat steps 7-8 a few times before it did, and
this generated a lot of output. Here are the last few lines:
--8<---------------cut here---------------start------------->8---
glyph 7 row 0 col 7 nrows 1 m 0x5555561a7660 r 0x555556495210 c 0x555556495256
glyph 7 row 0 col 7 nrows 1 m 0x5555561a7660 r 0x555556495210 c 0x555556495256
glyph 63 row 0 col 63 nrows 1 m 0x5555561a7660 r 0x555556495210 c 0x555556495486
sz 24 len 1 from 0 i 0
glyph 3 row 0 col 3 nrows 1 m 0x5555561a7660 r 0x555556495210 c 0x55555649522e
sz 24 len 1 from 0 i 0
glyph 1135 row 8 col 111 nrows 10 m 0x5555562304d0 r 0x2073756c50656d55 c 0x2073756c506571ab
--8<---------------cut here---------------end--------------->8---
IIRC step 8 segfaulted every time after I removed some of the printfs.
Either way, the pattern is always the same: 'glyph' goes from being a
small number to 1135 or 1153, and the address of ftcrfont_info->metrics
changes.
Any ideas?
Thanks,
--
Basil
In GNU Emacs 28.0.50 (build 9, x86_64-pc-linux-gnu, X toolkit, cairo version 1.16.0, Xaw3d scroll bars)
of 2020-08-20 built on tabos
Repository revision: f8d3d18168a742691d095a3f0c83512f19621725
Repository branch: master
Windowing system distributor 'The X.Org Foundation', version 11.0.12008000
System Description: Debian GNU/Linux bullseye/sid
Configured using:
'configure 'CC=ccache gcc' 'CFLAGS=-O0 -g3 -ggdb -gdwarf-4'
--config-cache --prefix=/home/blc/.local --program-suffix=-dbg
--enable-checking=yes,glyphs --enable-check-lisp-object-type
--with-x-toolkit=lucid --with-file-notification=yes --with-x'
Configured features:
XAW3D XPM JPEG TIFF GIF PNG RSVG CAIRO SOUND GPM DBUS GSETTINGS GLIB
NOTIFY INOTIFY ACL LIBSELINUX GNUTLS LIBXML2 FREETYPE HARFBUZZ M17N_FLT
LIBOTF ZLIB TOOLKIT_SCROLL_BARS LUCID X11 XDBE XIM MODULES THREADS
LIBSYSTEMD JSON PDUMPER LCMS2
Important settings:
value of $LANG: en_IE.UTF-8
locale-coding-system: utf-8-unix
[gdb.txt (text/plain, attachment)]
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#42943
; Package
emacs
.
(Thu, 20 Aug 2020 14:30:02 GMT)
Full text and
rfc822 format available.
Message #8 received at 42943 <at> debbugs.gnu.org (full text, mbox):
> From: "Basil L. Contovounesios" <contovob <at> tcd.ie>
> Date: Thu, 20 Aug 2020 01:47:32 +0100
>
> --8<---------------cut here---------------start------------->8---
> glyph 7 row 0 col 7 nrows 1 m 0x5555561a7660 r 0x555556495210 c 0x555556495256
>
> glyph 7 row 0 col 7 nrows 1 m 0x5555561a7660 r 0x555556495210 c 0x555556495256
>
> glyph 63 row 0 col 63 nrows 1 m 0x5555561a7660 r 0x555556495210 c 0x555556495486
> sz 24 len 1 from 0 i 0
> glyph 3 row 0 col 3 nrows 1 m 0x5555561a7660 r 0x555556495210 c 0x55555649522e
> sz 24 len 1 from 0 i 0
> glyph 1135 row 8 col 111 nrows 10 m 0x5555562304d0 r 0x2073756c50656d55 c 0x2073756c506571ab
> --8<---------------cut here---------------end--------------->8---
Is 0x2073756c506571ab a bogus pointer, i.e. it cannot be dereferenced?
If so, you need to figure out where did it come from. It could have
come from the xmalloc call, but then why the trace around that call
are not shown in your printfs?
And what is sizeof(struct font_metrics) there? I see no printfs which
show that, so I can only guess.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#42943
; Package
emacs
.
(Fri, 21 Aug 2020 14:06:01 GMT)
Full text and
rfc822 format available.
Message #11 received at 42943 <at> debbugs.gnu.org (full text, mbox):
I wonder if this is somehow related to bug#42897. Probably not, but two
ftcrfont-related crashes are reported within a few days, so perhaps
there's a connection...
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#42943
; Package
emacs
.
(Thu, 22 Oct 2020 12:42:02 GMT)
Full text and
rfc822 format available.
Message #14 received at 42943 <at> debbugs.gnu.org (full text, mbox):
Lars Ingebrigtsen <larsi <at> gnus.org> writes:
> I wonder if this is somehow related to bug#42897. Probably not, but two
> ftcrfont-related crashes are reported within a few days, so perhaps
> there's a connection...
Are you still seeing this problem? Pip's fix for bug#41627 may or may
not have fixed this problem.
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#42943
; Package
emacs
.
(Thu, 22 Oct 2020 22:12:01 GMT)
Full text and
rfc822 format available.
Message #17 received at 42943 <at> debbugs.gnu.org (full text, mbox):
Lars Ingebrigtsen <larsi <at> gnus.org> writes:
> Lars Ingebrigtsen <larsi <at> gnus.org> writes:
>
>> I wonder if this is somehow related to bug#42897. Probably not, but two
>> ftcrfont-related crashes are reported within a few days, so perhaps
>> there's a connection...
>
> Are you still seeing this problem? Pip's fix for bug#41627 may or may
> not have fixed this problem.
Sadly, it doesn't seem to have - I still get a segfault, at least in my
optimised build of master. Hopefully I'll find some time over the
weekend to look into it.
Thanks,
--
Basil
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#42943
; Package
emacs
.
(Sat, 24 Oct 2020 11:25:02 GMT)
Full text and
rfc822 format available.
Message #20 received at 42943 <at> debbugs.gnu.org (full text, mbox):
>>>>> On Thu, 22 Oct 2020 23:11:30 +0100, "Basil L. Contovounesios" <contovob <at> tcd.ie> said:
Basil> Sadly, it doesn't seem to have - I still get a segfault, at least in my
Basil> optimised build of master. Hopefully I'll find some time over the
Basil> weekend to look into it.
I can reproduce this with a GTK3 + cairo build. Your recipe is very
helpful, in fact just doing 'C-\ arabic RET' is enough to cause the
second emacsclient invocation to crash.
Running emacs under valgrind shows what's going on, but I donʼt know
how to fix it:
==9766== Invalid read of size 2
==9766== at 0x3F8FED: ftcrfont_glyph_extents (ftcrfont.c:81)
==9766== by 0x3F93A0: ftcrfont_draw (ftcrfont.c:522)
==9766== by 0x25C67F: x_draw_composite_glyph_string_foreground (xterm.c:1969)
==9766== by 0x25F9A4: x_draw_glyph_string (xterm.c:3780)
==9766== by 0x1BFC4D: draw_glyphs (xdisp.c:28915)
==9766== by 0x1C65EA: gui_write_glyphs (xdisp.c:30933)
==9766== by 0x15227D: update_text_area (dispnew.c:3849)
==9766== by 0x152C38: update_window_line (dispnew.c:4092)
==9766== by 0x151858: update_window (dispnew.c:3573)
==9766== by 0x151015: update_window_tree (dispnew.c:3344)
==9766== by 0x150C48: update_frame (dispnew.c:3226)
==9766== by 0x19766E: redisplay_internal (xdisp.c:16023)
==9766== Address 0xf36a92e is 126 bytes inside a block of size 1,280 free'd
==9766== at 0x48369AB: free (vg_replace_malloc.c:530)
==9766== by 0x327438: xfree (alloc.c:820)
==9766== by 0x3F9EF4: ftcrfont_close (ftcrfont.c:307)
==9766== by 0x3F9EF4: ftcrfont_close (ftcrfont.c:282)
==9766== by 0x37B285: font_clear_cache (font.c:2648)
==9766== by 0x37B096: font_finish_cache (font.c:2593)
==9766== by 0x37E258: font_update_drivers (font.c:3588)
==9766== by 0x15F56E: delete_frame (frame.c:2093)
==9766== by 0x15FD21: Fdelete_frame (frame.c:2325)
==9766== by 0x35F1FD: funcall_subr (eval.c:2884)
==9766== by 0x35EDE9: Ffuncall (eval.c:2809)
==9766== by 0x356002: Ffuncall_interactively (callint.c:253)
==9766== by 0x35F0FC: funcall_subr (eval.c:2862)
==9766== Block was alloc'd at
==9766== at 0x483577F: malloc (vg_replace_malloc.c:299)
==9766== by 0x327BEB: lmalloc (alloc.c:1359)
==9766== by 0x3272FA: xmalloc (alloc.c:761)
==9766== by 0x3F9093: ftcrfont_glyph_extents (ftcrfont.c:73)
==9766== by 0x3F94B2: ftcrfont_text_extents (ftcrfont.c:371)
==9766== by 0x3803DA: font_fill_lglyph_metrics (font.c:4430)
==9766== by 0x3E1736: fill_gstring_body (composite.c:843)
==9766== by 0x3E42C4: Fcomposition_get_gstring (composite.c:1792)
==9766== by 0x3E1BC3: autocmp_chars (composite.c:912)
==9766== by 0x3E2BCE: composition_reseat_it (composite.c:1269)
==9766== by 0x1858B9: next_element_from_string (xdisp.c:8578)
==9766== by 0x1854C6: next_element_from_string (xdisp.c:8504)
The call to ftcrfont_glyph_extents is from here:
static int
ftcrfont_draw (struct glyph_string *s,
int from, int to, int x, int y, bool with_background)
{
struct frame *f = s->f;
struct face *face = s->face;
struct font_info *ftcrfont_info = (struct font_info *) s->font;
So this means that the struct glyph_string here still refers to the
font from the previous frame, which has been closed. Iʼm not sure how
to get it to refer to the right font on the new frame.
Robert
--
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#42943
; Package
emacs
.
(Sat, 24 Oct 2020 11:47:01 GMT)
Full text and
rfc822 format available.
Message #23 received at 42943 <at> debbugs.gnu.org (full text, mbox):
> From: Robert Pluim <rpluim <at> gmail.com>
> Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 42943 <at> debbugs.gnu.org, eliz <at> gnu.org
> Date: Sat, 24 Oct 2020 13:24:42 +0200
>
> The call to ftcrfont_glyph_extents is from here:
>
> static int
> ftcrfont_draw (struct glyph_string *s,
> int from, int to, int x, int y, bool with_background)
> {
> struct frame *f = s->f;
> struct face *face = s->face;
> struct font_info *ftcrfont_info = (struct font_info *) s->font;
>
> So this means that the struct glyph_string here still refers to the
> font from the previous frame, which has been closed. Iʼm not sure how
> to get it to refer to the right font on the new frame.
I'm guessing that we close the font, but there's still a face that
references that font, and we try using that face for display. Can you
see if that is the case? The 'face' member of 'struct glyph_string'
should point to the face, and face->font should point to the font.
We call font_clear_cache when a frame is deleted, so it's unclear to
me how come the font is still remembered somewhere.
And I lack some background here: what is/are the character(s) we try
displaying here, and how that display is triggered by creating a new
frame due to the second emacsclient invocation?
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#42943
; Package
emacs
.
(Sat, 24 Oct 2020 12:16:02 GMT)
Full text and
rfc822 format available.
Message #26 received at 42943 <at> debbugs.gnu.org (full text, mbox):
>>>>> On Sat, 24 Oct 2020 14:46:29 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>> From: Robert Pluim <rpluim <at> gmail.com>
>> Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 42943 <at> debbugs.gnu.org, eliz <at> gnu.org
>> Date: Sat, 24 Oct 2020 13:24:42 +0200
>>
>> The call to ftcrfont_glyph_extents is from here:
>>
>> static int
>> ftcrfont_draw (struct glyph_string *s,
>> int from, int to, int x, int y, bool with_background)
>> {
>> struct frame *f = s->f;
>> struct face *face = s->face;
>> struct font_info *ftcrfont_info = (struct font_info *) s->font;
>>
>> So this means that the struct glyph_string here still refers to the
>> font from the previous frame, which has been closed. Iʼm not sure how
>> to get it to refer to the right font on the new frame.
Eli> I'm guessing that we close the font, but there's still a face that
Eli> references that font, and we try using that face for display. Can you
Eli> see if that is the case? The 'face' member of 'struct glyph_string'
Eli> should point to the face, and face->font should point to the font.
Yes, weʼre using the face thatʼs cached in the glyph_string:
Thread 1 "emacs" received signal SIGSEGV, Segmentation fault.
ftcrfont_glyph_extents (font=0x555556930478, glyph=1036,
metrics=metrics <at> entry=0x0) at ftcrfont.c:81
81 if (METRICS_STATUS (cache) == METRICS_INVALID)
(gdb) up
#1 0x00005555558453a1 in ftcrfont_draw (s=0x7fffffffb440,
from=<optimized out>, to=<optimized out>, x=17, y=<optimized out>,
with_background=<optimized out>) at ftcrfont.c:520
520 x += (s->padding_p ? 1 : ftcrfont_glyph_extents (s->font,
(gdb) l 500
495 struct face *face = s->face;
496 struct font_info *ftcrfont_info = (struct font_info *) s->font;
497 cairo_t *cr;
498 cairo_glyph_t *glyphs;
499 int len = to - from;
500 int i;
501
502 block_input ();
503
504 cr = x_begin_cr_clip (f, s->gc);
(gdb) p s->face
$1 = (struct face *) 0x555556113290
(gdb) p s->face->font
$2 = (struct font *) 0x555556930478
(gdb) p s->font
$3 = (struct font *) 0x555556930478
(this is from a separate run from the valgrind one)
Eli> We call font_clear_cache when a frame is deleted, so it's unclear to
Eli> me how come the font is still remembered somewhere.
font_clear_cache closes all the fonts and sets the frame's font cache
to Qnil, I donʼt see it doing anything with faces.
Eli> And I lack some background here: what is/are the character(s) we try
Eli> displaying here, and how that display is triggered by creating a new
Eli> frame due to the second emacsclient invocation?
Itʼs just emacsclient redisplaying *scratch*, I think. Are you saying redisplay
should not be called here?
Robert
--
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#42943
; Package
emacs
.
(Sat, 24 Oct 2020 13:11:02 GMT)
Full text and
rfc822 format available.
Message #29 received at 42943 <at> debbugs.gnu.org (full text, mbox):
> From: Robert Pluim <rpluim <at> gmail.com>
> Cc: contovob <at> tcd.ie, larsi <at> gnus.org, 42943 <at> debbugs.gnu.org
> Date: Sat, 24 Oct 2020 14:14:53 +0200
>
> Eli> I'm guessing that we close the font, but there's still a face that
> Eli> references that font, and we try using that face for display. Can you
> Eli> see if that is the case? The 'face' member of 'struct glyph_string'
> Eli> should point to the face, and face->font should point to the font.
>
> Yes, weʼre using the face thatʼs cached in the glyph_string:
But glyph_strings are not kept between redisplay cycles, AFAIR, they
are recreated anew each time we need to redisplay something. This
happens in the write_glyphs method that is called from update_window
and update_frame, which eventually calls gui_write_glyphs, which calls
draw_glyphs, which creates the glyph_strings in BUILD_GLYPH_STRINGS.
So it's unclear to me how can a face be cached in a glyph_string.
> Thread 1 "emacs" received signal SIGSEGV, Segmentation fault.
> ftcrfont_glyph_extents (font=0x555556930478, glyph=1036,
> metrics=metrics <at> entry=0x0) at ftcrfont.c:81
> 81 if (METRICS_STATUS (cache) == METRICS_INVALID)
> (gdb) up
> #1 0x00005555558453a1 in ftcrfont_draw (s=0x7fffffffb440,
> from=<optimized out>, to=<optimized out>, x=17, y=<optimized out>,
> with_background=<optimized out>) at ftcrfont.c:520
> 520 x += (s->padding_p ? 1 : ftcrfont_glyph_extents (s->font,
> (gdb) l 500
> 495 struct face *face = s->face;
> 496 struct font_info *ftcrfont_info = (struct font_info *) s->font;
> 497 cairo_t *cr;
> 498 cairo_glyph_t *glyphs;
> 499 int len = to - from;
> 500 int i;
> 501
> 502 block_input ();
> 503
> 504 cr = x_begin_cr_clip (f, s->gc);
> (gdb) p s->face
> $1 = (struct face *) 0x555556113290
> (gdb) p s->face->font
> $2 = (struct font *) 0x555556930478
> (gdb) p s->font
> $3 = (struct font *) 0x555556930478
And how do you see from the above that it's a pointer to the same
'struct font' that was used by the now-deleted first client frame?
> Eli> We call font_clear_cache when a frame is deleted, so it's unclear to
> Eli> me how come the font is still remembered somewhere.
>
> font_clear_cache closes all the fonts and sets the frame's font cache
> to Qnil, I donʼt see it doing anything with faces.
Faces are cached in the frame's face cache, see xfaces.c. When a
frame is deleted, its face cache is freed, so its faces should also go
away. A new frame starts with an empty face cache, and then faces are
added to that cache as they are "realized", starting from the "basic
faces" that are always needed, see realize_basic_faces. For a
non-ASCII character, we produce a new face based on the default face,
the first time we need to display a character that needs a font we
don't already have; then that face is also added to the frame's face
cache.
> Eli> And I lack some background here: what is/are the character(s) we try
> Eli> displaying here, and how that display is triggered by creating a new
> Eli> frame due to the second emacsclient invocation?
>
> Itʼs just emacsclient redisplaying *scratch*, I think.
And *scratch* has an Arabic character? How did that happen? I thought
the recipe was only to turn on the Arabic input method. Is the
offending character the IM indicator on the mode-line, per chance?
> Are you saying redisplay should not be called here?
No, of course not. When a new frame is created, it is immediately
displayed.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#42943
; Package
emacs
.
(Sat, 24 Oct 2020 13:28:01 GMT)
Full text and
rfc822 format available.
Message #32 received at 42943 <at> debbugs.gnu.org (full text, mbox):
>>>>> On Sat, 24 Oct 2020 14:14:53 +0200, Robert Pluim <rpluim <at> gmail.com> said:
Eli> I'm guessing that we close the font, but there's still a face that
Eli> references that font, and we try using that face for display. Can you
Eli> see if that is the case? The 'face' member of 'struct glyph_string'
Eli> should point to the face, and face->font should point to the font.
Robert> Yes, weʼre using the face thatʼs cached in the glyph_string:
Robert> Thread 1 "emacs" received signal SIGSEGV, Segmentation fault.
Robert> ftcrfont_glyph_extents (font=0x555556930478, glyph=1036,
Robert> metrics=metrics <at> entry=0x0) at ftcrfont.c:81
Robert> 81 if (METRICS_STATUS (cache) == METRICS_INVALID)
Robert> (gdb) up
Robert> #1 0x00005555558453a1 in ftcrfont_draw (s=0x7fffffffb440,
Robert> from=<optimized out>, to=<optimized out>, x=17, y=<optimized out>,
Robert> with_background=<optimized out>) at ftcrfont.c:520
Robert> 520 x += (s->padding_p ? 1 : ftcrfont_glyph_extents (s->font,
Robert> (gdb) l 500
Robert> 495 struct face *face = s->face;
Robert> 496 struct font_info *ftcrfont_info = (struct font_info *) s->font;
Robert> 497 cairo_t *cr;
Robert> 498 cairo_glyph_t *glyphs;
Robert> 499 int len = to - from;
Robert> 500 int i;
Robert> 501
Robert> 502 block_input ();
Robert> 503
Robert> 504 cr = x_begin_cr_clip (f, s->gc);
Robert> (gdb) p s->face
Robert> $1 = (struct face *) 0x555556113290
Robert> (gdb) p s->face->font
Robert> $2 = (struct font *) 0x555556930478
Robert> (gdb) p s->font
Robert> $3 = (struct font *) 0x555556930478
And that font comes from here:
static int
fill_gstring_glyph_string (struct glyph_string *s, int face_id,
int start, int end, int overlaps)
{
struct glyph *glyph, *last;
Lisp_Object lgstring;
int i;
bool glyph_not_available_p;
s->for_overlaps = overlaps;
glyph = s->row->glyphs[s->area] + start;
last = s->row->glyphs[s->area] + end;
glyph_not_available_p = glyph->glyph_not_available_p;
s->cmp_id = glyph->u.cmp.id;
s->cmp_from = glyph->slice.cmp.from;
s->cmp_to = glyph->slice.cmp.to + 1;
s->face = FACE_FROM_ID (s->f, face_id);
lgstring = composition_gstring_from_id (s->cmp_id);
s->font = XFONT_OBJECT (LGSTRING_FONT (lgstring)); <----
so itʼs the caching in the Lisp_Object for the composition thatʼs
causing the problem.
I can also get it to crash by entering 'a' followed by U+306 (COMBINING
BREVE), but the initial code path is different, as then the
initial caching of the font is here:
Lisp_Object
hbfont_shape (Lisp_Object lgstring, Lisp_Object direction)
{
struct font *font = CHECK_FONT_GET_OBJECT (LGSTRING_FONT (lgstring));
(but the final crash is at the same place).
Robert
--
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#42943
; Package
emacs
.
(Sat, 24 Oct 2020 13:36:01 GMT)
Full text and
rfc822 format available.
Message #35 received at 42943 <at> debbugs.gnu.org (full text, mbox):
>>>>> On Sat, 24 Oct 2020 16:10:31 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>> From: Robert Pluim <rpluim <at> gmail.com>
>> Cc: contovob <at> tcd.ie, larsi <at> gnus.org, 42943 <at> debbugs.gnu.org
>> Date: Sat, 24 Oct 2020 14:14:53 +0200
>>
Eli> I'm guessing that we close the font, but there's still a face that
Eli> references that font, and we try using that face for display. Can you
Eli> see if that is the case? The 'face' member of 'struct glyph_string'
Eli> should point to the face, and face->font should point to the font.
>>
>> Yes, weʼre using the face thatʼs cached in the glyph_string:
Eli> But glyph_strings are not kept between redisplay cycles, AFAIR, they
Eli> are recreated anew each time we need to redisplay something. This
Eli> happens in the write_glyphs method that is called from update_window
Eli> and update_frame, which eventually calls gui_write_glyphs, which calls
Eli> draw_glyphs, which creates the glyph_strings in BUILD_GLYPH_STRINGS.
Eli> So it's unclear to me how can a face be cached in a glyph_string.
Youʼre right, itʼs not cached in the glyph_string, itʼs cached in the
composition_gstring thatʼs used to create the glyph_string (see my
other message).
Eli> And how do you see from the above that it's a pointer to the same
Eli> 'struct font' that was used by the now-deleted first client frame?
thatʼs what the valgrind trace earlier said.
Eli> We call font_clear_cache when a frame is deleted, so it's unclear to
Eli> me how come the font is still remembered somewhere.
>>
>> font_clear_cache closes all the fonts and sets the frame's font cache
>> to Qnil, I donʼt see it doing anything with faces.
Eli> Faces are cached in the frame's face cache, see xfaces.c. When a
Eli> frame is deleted, its face cache is freed, so its faces should also go
Eli> away. A new frame starts with an empty face cache, and then faces are
Eli> added to that cache as they are "realized", starting from the "basic
Eli> faces" that are always needed, see realize_basic_faces. For a
Eli> non-ASCII character, we produce a new face based on the default face,
Eli> the first time we need to display a character that needs a font we
Eli> don't already have; then that face is also added to the frame's face
Eli> cache.
Eli> And I lack some background here: what is/are the character(s) we try
Eli> displaying here, and how that display is triggered by creating a new
Eli> frame due to the second emacsclient invocation?
>>
>> Itʼs just emacsclient redisplaying *scratch*, I think.
Eli> And *scratch* has an Arabic character? How did that happen? I thought
Eli> the recipe was only to turn on the Arabic input method. Is the
Eli> offending character the IM indicator on the mode-line, per chance?
Yes, I suspect so, since there are no Arabic characters in *scratch*
Robert
--
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#42943
; Package
emacs
.
(Sat, 24 Oct 2020 14:13:02 GMT)
Full text and
rfc822 format available.
Message #38 received at 42943 <at> debbugs.gnu.org (full text, mbox):
> From: Robert Pluim <rpluim <at> gmail.com>
> Cc: contovob <at> tcd.ie, larsi <at> gnus.org, 42943 <at> debbugs.gnu.org
> Date: Sat, 24 Oct 2020 15:27:39 +0200
>
> static int
> fill_gstring_glyph_string (struct glyph_string *s, int face_id,
> int start, int end, int overlaps)
> {
> struct glyph *glyph, *last;
> Lisp_Object lgstring;
> int i;
> bool glyph_not_available_p;
>
> s->for_overlaps = overlaps;
> glyph = s->row->glyphs[s->area] + start;
> last = s->row->glyphs[s->area] + end;
> glyph_not_available_p = glyph->glyph_not_available_p;
> s->cmp_id = glyph->u.cmp.id;
> s->cmp_from = glyph->slice.cmp.from;
> s->cmp_to = glyph->slice.cmp.to + 1;
> s->face = FACE_FROM_ID (s->f, face_id);
> lgstring = composition_gstring_from_id (s->cmp_id);
> s->font = XFONT_OBJECT (LGSTRING_FONT (lgstring)); <----
>
> so itʼs the caching in the Lisp_Object for the composition thatʼs
> causing the problem.
OK, so when we are about to release a font, we need to go over all the
LGSTRING objects in gstring_hash_table, and remove from that cache
every LGSTRING whose LGSTRING_FONT object holds the font we are about
to release.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#42943
; Package
emacs
.
(Sat, 24 Oct 2020 14:49:01 GMT)
Full text and
rfc822 format available.
Message #41 received at 42943 <at> debbugs.gnu.org (full text, mbox):
> Date: Sat, 24 Oct 2020 17:12:04 +0300
> From: Eli Zaretskii <eliz <at> gnu.org>
> Cc: contovob <at> tcd.ie, larsi <at> gnus.org, 42943 <at> debbugs.gnu.org
>
> > s->font = XFONT_OBJECT (LGSTRING_FONT (lgstring)); <----
> >
> > so itʼs the caching in the Lisp_Object for the composition thatʼs
> > causing the problem.
>
> OK, so when we are about to release a font, we need to go over all the
> LGSTRING objects in gstring_hash_table, and remove from that cache
> every LGSTRING whose LGSTRING_FONT object holds the font we are about
> to release.
Here, does the below give good results?
diff --git a/src/composite.c b/src/composite.c
index 984e0d9..f1b4b97 100644
--- a/src/composite.c
+++ b/src/composite.c
@@ -679,6 +679,27 @@ composition_gstring_from_id (ptrdiff_t id)
return HASH_VALUE (h, id);
}
+/* Remove from the composition hash table every lgstring that
+ references the given FONT_OBJECT. */
+void
+composition_gstring_cache_clear_font (Lisp_Object font_object)
+{
+ struct Lisp_Hash_Table *h = XHASH_TABLE (gstring_hash_table);
+
+ for (ptrdiff_t i = 0; i < HASH_TABLE_SIZE (h); ++i)
+ {
+ Lisp_Object k = HASH_KEY (h, i);
+
+ if (!EQ (k, Qunbound))
+ {
+ Lisp_Object gstring = HASH_VALUE (h, i);
+
+ if (EQ (LGSTRING_FONT (gstring), font_object))
+ hash_remove_from_table (h, k);
+ }
+ }
+}
+
DEFUN ("clear-composition-cache", Fclear_composition_cache,
Sclear_composition_cache, 0, 0, 0,
doc: /* Internal use only.
diff --git a/src/composite.h b/src/composite.h
index 239f1e5..0d7d1c7 100644
--- a/src/composite.h
+++ b/src/composite.h
@@ -331,6 +331,8 @@ #define LGLYPH_WADJUST(g) (VECTORP (LGLYPH_ADJUSTMENT (g)) \
extern ptrdiff_t composition_adjust_point (ptrdiff_t, ptrdiff_t);
+extern void composition_gstring_cache_clear_font (Lisp_Object);
+
INLINE_HEADER_END
#endif /* not EMACS_COMPOSITE_H */
diff --git a/src/font.c b/src/font.c
index fe257f4..d74938e 100644
--- a/src/font.c
+++ b/src/font.c
@@ -2645,6 +2645,11 @@ font_clear_cache (struct frame *f, Lisp_Object cache,
if (! NILP (AREF (val, FONT_TYPE_INDEX)))
{
eassert (font && driver == font->driver);
+ /* We are going to close the font, so make
+ sure we don't have any lgstrings lying
+ around in lgstring cache that reference
+ the font. */
+ composition_gstring_cache_clear_font (val);
driver->close_font (font);
}
}
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#42943
; Package
emacs
.
(Sat, 24 Oct 2020 15:35:02 GMT)
Full text and
rfc822 format available.
Message #44 received at 42943 <at> debbugs.gnu.org (full text, mbox):
>>>>> On Sat, 24 Oct 2020 17:12:04 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>> From: Robert Pluim <rpluim <at> gmail.com>
>> Cc: contovob <at> tcd.ie, larsi <at> gnus.org, 42943 <at> debbugs.gnu.org
>> Date: Sat, 24 Oct 2020 15:27:39 +0200
>>
>> static int
>> fill_gstring_glyph_string (struct glyph_string *s, int face_id,
>> int start, int end, int overlaps)
>> {
>> struct glyph *glyph, *last;
>> Lisp_Object lgstring;
>> int i;
>> bool glyph_not_available_p;
>>
s-> for_overlaps = overlaps;
>> glyph = s->row->glyphs[s->area] + start;
>> last = s->row->glyphs[s->area] + end;
>> glyph_not_available_p = glyph->glyph_not_available_p;
s-> cmp_id = glyph->u.cmp.id;
s-> cmp_from = glyph->slice.cmp.from;
s-> cmp_to = glyph->slice.cmp.to + 1;
s-> face = FACE_FROM_ID (s->f, face_id);
>> lgstring = composition_gstring_from_id (s->cmp_id);
s-> font = XFONT_OBJECT (LGSTRING_FONT (lgstring)); <----
>>
>> so itʼs the caching in the Lisp_Object for the composition thatʼs
>> causing the problem.
Eli> OK, so when we are about to release a font, we need to go over all the
Eli> LGSTRING objects in gstring_hash_table, and remove from that cache
Eli> every LGSTRING whose LGSTRING_FONT object holds the font we are about
Eli> to release.
I stuck an Fclrhash (gstring_hash_table) in ftcrfont_close, and it no
longer crashes (and valgrind is happy). The following also doesnʼt
crash, but I worry about the quadratic behaviour when removing the
entries from the hash_table. Do we need a
hash_remove_from_table_by_index? Am I being overly paranoid?
Iʼve not checked with other font backends, so the same treatment might
be needed there.
diff --git i/src/composite.c w/src/composite.c
index 984e0d9cda..93a700897f 100644
--- i/src/composite.c
+++ w/src/composite.c
@@ -639,6 +639,18 @@ compose_text (ptrdiff_t start, ptrdiff_t end, Lisp_Object components,
static Lisp_Object gstring_lookup_cache (Lisp_Object);
+void
+gstring_remove_from_cache_by_font (struct font * font)
+{
+ struct Lisp_Hash_Table *h = XHASH_TABLE (gstring_hash_table);
+ for (ptrdiff_t i = 0; i < HASH_TABLE_SIZE (h); ++i)
+ {
+ Lisp_Object k = HASH_KEY (h, i);
+ if (!EQ (k, Qunbound) && font == XFONT_OBJECT (LGSTRING_FONT (HASH_VALUE (h, i))))
+ hash_remove_from_table (h, k);
+ }
+}
+
static Lisp_Object
gstring_lookup_cache (Lisp_Object header)
{
diff --git i/src/composite.h w/src/composite.h
index 239f1e531e..2c30ee1933 100644
--- i/src/composite.h
+++ w/src/composite.h
@@ -331,6 +331,8 @@ #define LGLYPH_WADJUST(g) (VECTORP (LGLYPH_ADJUSTMENT (g)) \
extern ptrdiff_t composition_adjust_point (ptrdiff_t, ptrdiff_t);
+extern void gstring_remove_from_cache_by_font (struct font *);
+
INLINE_HEADER_END
#endif /* not EMACS_COMPOSITE_H */
diff --git i/src/ftcrfont.c w/src/ftcrfont.c
index 4892a34a3a..61a2b1290f 100644
--- i/src/ftcrfont.c
+++ w/src/ftcrfont.c
@@ -283,6 +283,7 @@ ftcrfont_close (struct font *font)
struct font_info *ftcrfont_info = (struct font_info *) font;
block_input ();
+ gstring_remove_from_cache_by_font (font);
#ifdef HAVE_LIBOTF
if (ftcrfont_info->otf)
{
Robert
--
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#42943
; Package
emacs
.
(Sat, 24 Oct 2020 15:42:01 GMT)
Full text and
rfc822 format available.
Message #47 received at 42943 <at> debbugs.gnu.org (full text, mbox):
>>>>> On Sat, 24 Oct 2020 17:48:26 +0300, Eli Zaretskii <eliz <at> gnu.org> said:
>> Date: Sat, 24 Oct 2020 17:12:04 +0300
>> From: Eli Zaretskii <eliz <at> gnu.org>
>> Cc: contovob <at> tcd.ie, larsi <at> gnus.org, 42943 <at> debbugs.gnu.org
>>
>> > s->font = XFONT_OBJECT (LGSTRING_FONT (lgstring)); <----
>> >
>> > so itʼs the caching in the Lisp_Object for the composition thatʼs
>> > causing the problem.
>>
>> OK, so when we are about to release a font, we need to go over all the
>> LGSTRING objects in gstring_hash_table, and remove from that cache
>> every LGSTRING whose LGSTRING_FONT object holds the font we are about
>> to release.
Eli> Here, does the below give good results?
I should really hit 'g' in Gnus before writing patches :-)
That fixes it as well.
Robert
--
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#42943
; Package
emacs
.
(Sat, 24 Oct 2020 15:54:02 GMT)
Full text and
rfc822 format available.
Message #50 received at 42943 <at> debbugs.gnu.org (full text, mbox):
> From: Robert Pluim <rpluim <at> gmail.com>
> Cc: contovob <at> tcd.ie, larsi <at> gnus.org, 42943 <at> debbugs.gnu.org
> Date: Sat, 24 Oct 2020 17:41:36 +0200
>
> >> OK, so when we are about to release a font, we need to go over all the
> >> LGSTRING objects in gstring_hash_table, and remove from that cache
> >> every LGSTRING whose LGSTRING_FONT object holds the font we are about
> >> to release.
>
> Eli> Here, does the below give good results?
>
> I should really hit 'g' in Gnus before writing patches :-)
>
> That fixes it as well.
OK, thanks. (I think clearing the entire cache of gstrings is too
radical: we don't necessarily remove all the fonts from all the
frames.)
Now the important question: should we install this on master or on the
release branch? I believe the problem is not limited to 28.0.50, is
it? OTOH, is this safe enough to install on emacs-27?
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#42943
; Package
emacs
.
(Mon, 26 Oct 2020 12:34:02 GMT)
Full text and
rfc822 format available.
Message #53 received at 42943 <at> debbugs.gnu.org (full text, mbox):
Eli Zaretskii <eliz <at> gnu.org> writes:
>> From: Robert Pluim <rpluim <at> gmail.com>
>> Cc: contovob <at> tcd.ie, larsi <at> gnus.org, 42943 <at> debbugs.gnu.org
>> Date: Sat, 24 Oct 2020 17:41:36 +0200
>>
>> >> OK, so when we are about to release a font, we need to go over all the
>> >> LGSTRING objects in gstring_hash_table, and remove from that cache
>> >> every LGSTRING whose LGSTRING_FONT object holds the font we are about
>> >> to release.
>>
>> Eli> Here, does the below give good results?
>>
>> I should really hit 'g' in Gnus before writing patches :-)
>>
>> That fixes it as well.
>
> OK, thanks. (I think clearing the entire cache of gstrings is too
> radical: we don't necessarily remove all the fonts from all the
> frames.)
>
Right
> Now the important question: should we install this on master or on the
> release branch? I believe the problem is not limited to 28.0.50, is
> it? OTOH, is this safe enough to install on emacs-27?
I can't reproduce this crash on emacs-27 built with Cairo + Harfbuzz
at all.
Robert
--
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#42943
; Package
emacs
.
(Mon, 26 Oct 2020 16:18:02 GMT)
Full text and
rfc822 format available.
Message #56 received at 42943 <at> debbugs.gnu.org (full text, mbox):
> From: Robert Pluim <rpluim <at> gmail.com>
> Cc: contovob <at> tcd.ie, larsi <at> gnus.org, 42943 <at> debbugs.gnu.org
> Date: Mon, 26 Oct 2020 13:33:49 +0100
>
> > Now the important question: should we install this on master or on the
> > release branch? I believe the problem is not limited to 28.0.50, is
> > it? OTOH, is this safe enough to install on emacs-27?
>
> I can't reproduce this crash on emacs-27 built with Cairo + Harfbuzz
> at all.
I'm a bit surprised. But okay, let's install on master and wait for
reports about similar problems on emacs-27.
Pushed.
Added tag(s) fixed.
Request was from
"Basil L. Contovounesios" <contovob <at> tcd.ie>
to
control <at> debbugs.gnu.org
.
(Mon, 26 Oct 2020 20:14:02 GMT)
Full text and
rfc822 format available.
bug marked as fixed in version 28.1, send any further explanations to
42943 <at> debbugs.gnu.org and "Basil L. Contovounesios" <contovob <at> tcd.ie>
Request was from
"Basil L. Contovounesios" <contovob <at> tcd.ie>
to
control <at> debbugs.gnu.org
.
(Mon, 26 Oct 2020 20:14:02 GMT)
Full text and
rfc822 format available.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#42943
; Package
emacs
.
(Mon, 26 Oct 2020 20:14:02 GMT)
Full text and
rfc822 format available.
Message #63 received at 42943-done <at> debbugs.gnu.org (full text, mbox):
tags 42943 fixed
close 42943 28.1
quit
Eli Zaretskii <eliz <at> gnu.org> writes:
>> From: Robert Pluim <rpluim <at> gmail.com>
>> Cc: contovob <at> tcd.ie, larsi <at> gnus.org, 42943 <at> debbugs.gnu.org
>> Date: Mon, 26 Oct 2020 13:33:49 +0100
>>
>> > Now the important question: should we install this on master or on the
>> > release branch? I believe the problem is not limited to 28.0.50, is
>> > it? OTOH, is this safe enough to install on emacs-27?
>>
>> I can't reproduce this crash on emacs-27 built with Cairo + Harfbuzz
>> at all.
Same here, at least with the recipe in the OP.
> I'm a bit surprised.
> But okay, let's install on master and wait for
> reports about similar problems on emacs-27.
>
> Pushed.
Thanks to both of you for debugging this. I can no longer reproduce the
reported crash on master, so I'm closing this report.
--
Basil
bug archived.
Request was from
Debbugs Internal Request <help-debbugs <at> gnu.org>
to
internal_control <at> debbugs.gnu.org
.
(Tue, 24 Nov 2020 12:24:10 GMT)
Full text and
rfc822 format available.
This bug report was last modified 3 years and 125 days ago.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.