GNU bug report logs - #42603
EWW shows chars > #xFF with font set by "set-fontset-font"


Package: emacs

Reported by: Sebastian Urban <mrsebastianurban <at> gmail.com>

Date: Wed, 29 Jul 2020 16:27:02 UTC

Severity: normal

Done: Sebastian Urban <mrsebastianurban <at> gmail.com>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 42603 in the body.
You can then email your comments to 42603 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#42603; Package emacs. (Wed, 29 Jul 2020 16:27:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Sebastian Urban <mrsebastianurban <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Wed, 29 Jul 2020 16:27:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Sebastian Urban <mrsebastianurban <at> gmail.com>
To: Bug GNU Emacs <bug-gnu-emacs <at> gnu.org>
Subject: EWW shows chars > #xFF with font set by "set-fontset-font"
Date: Wed, 29 Jul 2020 18:26:07 +0200
Hello,

a quick recipe:
1. Open website in EWW with chars above #xFF, e.g.:
   M-x eww RET https://sjp.pl/slownik/ort/ RET
2. M-: (set-fontset-font t 'unicode "Times New Roman")
   ... or any font, other than in variable-pitch face
3. Watch as some chars change font.

The thing is, both unchanged and changed chars have face of
variable-pitch ("C-u C-x =" on char), so nothing should change,
I think.

Something similar happens in:
   M-x list-charset-chars RET unicode-bmp RET
#x00 -> #xFF stays the same, next segments change.

Tested on: GNU Emacs 28.0.50 (build 1, x86_64-w64-mingw32) of
2020-07-05.


S. U.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#42603; Package emacs. (Wed, 29 Jul 2020 18:47:01 GMT) Full text and rfc822 format available.

Message #8 received at 42603 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Sebastian Urban <mrsebastianurban <at> gmail.com>
Cc: 42603 <at> debbugs.gnu.org
Subject: Re: bug#42603: EWW shows chars > #xFF with font set by
 "set-fontset-font"
Date: Wed, 29 Jul 2020 21:46:30 +0300
> From: Sebastian Urban <mrsebastianurban <at> gmail.com>
> Date: Wed, 29 Jul 2020 18:26:07 +0200
> 
> 2. M-: (set-fontset-font t 'unicode "Times New Roman")

This setting makes no sense: no single font can cover all of Unicode,
so you should never do that.

Why did you think you needed to do it in your case?

> 3. Watch as some chars change font.
> 
> The thing is, both unchanged and changed chars have face of
> variable-pitch ("C-u C-x =" on char), so nothing should change,
> I think.

I don't think I agree.  Times New Roman doesn't support all of the
characters.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#42603; Package emacs. (Wed, 29 Jul 2020 19:04:01 GMT) Full text and rfc822 format available.

Message #11 received at 42603 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: mrsebastianurban <at> gmail.com
Cc: 42603 <at> debbugs.gnu.org
Subject: Re: bug#42603: EWW shows chars > #xFF with font set by
 "set-fontset-font"
Date: Wed, 29 Jul 2020 22:02:54 +0300
> Date: Wed, 29 Jul 2020 21:46:30 +0300
> From: Eli Zaretskii <eliz <at> gnu.org>
> Cc: 42603 <at> debbugs.gnu.org
> 
> > The thing is, both unchanged and changed chars have face of
> > variable-pitch ("C-u C-x =" on char), so nothing should change,
> > I think.
> 
> I don't think I agree.  Times New Roman doesn't support all of the
> characters.

And in addition, you seem to assume that set-fontset-font overrides
the frame's default font for the first 256 characters, which isn't
true, AFAIK.

In short, you are doing something that Emacs doesn't support, and I
wonder why you needed anything like that.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#42603; Package emacs. (Fri, 31 Jul 2020 12:02:01 GMT) Full text and rfc822 format available.

Message #14 received at 42603 <at> debbugs.gnu.org (full text, mbox):

From: Sebastian Urban <mrsebastianurban <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 42603 <at> debbugs.gnu.org
Subject: Re: bug#42603: EWW shows chars > #xFF with font set by
 "set-fontset-font"
Date: Fri, 31 Jul 2020 14:01:49 +0200
>> 2. M-: (set-fontset-font t 'unicode "Times New Roman")
>
> This setting makes no sense: no single font can cover all of Unicode,
> so you should never do that.
>
> Why did you think you needed to do it in your case?

My use case is not related to EWW, but to fonts overall.  I use it to
"prevent" Emacs from searching for fonts, and to display codes of
characters instead of glyphs, to speed up loading text in situations
like in case of view-hello-file.  Then I have a file with Noto Fonts
set according to the script, like this:
   (add-to-list 'default-frame-alist '(font . "Consolas-13"))
   (set-fontset-font "fontset-default" 'unicode "Consolas")
   (set-fontset-font "fontset-default" 'unicode "Symbola" nil 'append)
   (load "noto-fonts.elc")

In this case, I simply spotted this strange behaviour of showing some
chars in my default font (Consolas) instead of variable-pitch (in my
case it's Arial), like "ł", "ą" and "ę" in sentence (see link in my
first message):
Słownik SJP.PL do programów sprawdzających pisownię (...):

When I type C-u C-x = on any of above letters, it says:
   There are text properties here:
     face                 variable-pitch
and the variable-pitch says Arial, but it's not Arial.

>> 3. Watch as some chars change font.
>>
>> The thing is, both unchanged and changed chars have face of
>> variable-pitch ("C-u C-x =" on char), so nothing should change,
>> I think.
>
> I don't think I agree.  Times New Roman doesn't support all of the
> characters.
>
> And in addition, you seem to assume that set-fontset-font overrides
> the frame's default font for the first 256 characters, which isn't
> true, AFAIK.

I don't think I'm assuming that, as I noted:
"1. Open website in EWW with chars above #xFF, e.g.:".
                                   ^^^^^^^^^^

Also, if I understood correctly your interpretation of my message:
I DON'T want to change whole text to TNR, or any other font, with this
command, quite the opposite, I don't want IT to change chars above
256, in EWW buffer that uses variable-pitch font, which is Arial.


S. U.
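
The mismatch described above (the face property says variable-pitch, but
the glyph comes from another font) can be checked directly, because
"C-u C-x =" reports the face while the font actually used is chosen
separately. Below is a minimal sketch of such a check. The helper name
my/describe-font-at-point is hypothetical and not part of Emacs; it relies
only on the built-in `font-at', `font-get' and `char-script-table'.

```elisp
(defun my/describe-font-at-point ()
  "Report the face, script and actual font family of the character at point.
Hypothetical helper for illustration; the font family is reported as
nil on non-graphical displays."
  (interactive)
  (if (eobp)
      ;; `font-at' signals an error when there is no character at point.
      (message "No character at point")
    (let* ((char   (char-after))
           (face   (get-char-property (point) 'face))
           ;; Script symbol that Emacs associates with this character.
           (script (aref char-script-table char))
           ;; Font object really used for display, if any.
           (font   (and (display-graphic-p) (font-at (point))))
           (family (and font (font-get font :family))))
      (message "char: %c  face: %S  script: %S  font family: %S"
               char face script family))))
```

Running M-x my/describe-font-at-point on one of the affected letters in the
EWW buffer should show whether the font family differs from the one the
face asks for.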




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#42603; Package emacs. (Fri, 31 Jul 2020 12:45:02 GMT) Full text and rfc822 format available.

Message #17 received at 42603 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Sebastian Urban <mrsebastianurban <at> gmail.com>
Cc: 42603 <at> debbugs.gnu.org
Subject: Re: bug#42603: EWW shows chars > #xFF with font set by
 "set-fontset-font"
Date: Fri, 31 Jul 2020 15:43:49 +0300
> From: Sebastian Urban <mrsebastianurban <at> gmail.com>
> Cc: 42603 <at> debbugs.gnu.org
> Date: Fri, 31 Jul 2020 14:01:49 +0200
> 
> >> 2. M-: (set-fontset-font t 'unicode "Times New Roman")
> >
> > This setting makes no sense: no single font can cover all of Unicode,
> > so you should never do that.
> >
> > Why did you think you needed to do it in your case?
> 
> My use case is not related to EWW, but to fonts overall.  I use it to
> "prevent" Emacs from searching for fonts, and to display codes of
> characters instead of glyphs, to speed up loading text in situations
> like in case of view-hello-file.

In that case, you should indeed use set-fontset-font, but instead of
telling Emacs that each of the fonts covers all of the Unicode, you
should tell Emacs which ranges of characters, or which scripts, should
be rendered by what fonts.

>     (set-fontset-font "fontset-default" 'unicode "Consolas")
>     (set-fontset-font "fontset-default" 'unicode "Symbola" nil 'append)

Instead of using 'unicode' in the above 2 lines, use either symbols of
scripts you want to render with each font, or explicit ranges of
character codepoints.  The node "Modifying Fontsets" in the Emacs user
manual and the node "Fontsets" in the ELisp manual have examples of
how to do that.

> When I type C-u C-x = on any of above letters, it says:
>     There are text properties here:
>       face                 variable-pitch
> and the variable-pitch says Arial, but it's not Arial.

You countermanded that with your over-optimistic set-fontset-font
setting, I think.

> I DON'T want to change whole text to TNR, or any other font, with this
> command, quite the opposite, I don't want IT to change chars above
> 256, in EWW buffer that uses variable-pitch font, which is Arial.

Then why did you use this:

  (set-fontset-font t 'unicode "Times New Roman")

?  It tells Emacs the opposite: to use Times New Roman for _any_
character (because the 'unicode' script spans all the characters you
can possibly have).
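
As an illustration of the advice above, here is a minimal sketch of
assigning fonts per script and per codepoint range instead of for all of
'unicode'. The font families and the chosen range are assumptions made
for the example; substitute fonts that are actually installed.

```elisp
;; Sketch only: the font families below are assumptions.
(when (fboundp 'set-fontset-font) ; not defined without window-system support
  ;; By script symbol: add Symbola as a fallback for the `symbol' script.
  (set-fontset-font "fontset-default" 'symbol
                    (font-spec :family "Symbola") nil 'append)
  ;; By script symbol: use one family for Greek text.
  (set-fontset-font "fontset-default" 'greek
                    (font-spec :family "Noto Sans"))
  ;; By explicit codepoint range: Greek Extended (U+1F00..U+1FFF).
  (set-fontset-font "fontset-default" '(#x1F00 . #x1FFF)
                    (font-spec :family "Noto Sans")))
```

Characters outside the listed scripts and ranges are not touched by these
calls.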




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#42603; Package emacs. (Fri, 31 Jul 2020 16:11:01 GMT) Full text and rfc822 format available.

Message #20 received at 42603 <at> debbugs.gnu.org (full text, mbox):

From: Sebastian Urban <mrsebastianurban <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 42603 <at> debbugs.gnu.org
Subject: Re: bug#42603: EWW shows chars > #xFF with font set by
 "set-fontset-font"
Date: Fri, 31 Jul 2020 18:10:06 +0200
>> My use case is not related to EWW, but to fonts overall.  I use it to
>> "prevent" Emacs from searching for fonts, and to display codes of
>> characters instead of glyphs, to speed up loading text in situations
>> like in case of view-hello-file.
> 
> In that case, you should indeed use set-fontset-font, but instead of
> telling Emacs that each of the fonts covers all of the Unicode, you
> should tell Emacs which ranges of characters, or which scripts, should
> be rendered by what fonts.
> (...)
>>     (set-fontset-font "fontset-default" 'unicode "Consolas")
>>     (set-fontset-font "fontset-default" 'unicode "Symbola" nil 'append)
> 
> Instead of using 'unicode' in the above 2 lines, use either symbols of
> scripts you want to render with each font, or explicit ranges of
> character codepoints.

And if I want some characters to be rendered, and the rest not to be
rendered - to which font should I assign "the rest"?

My reasoning was:
- use Consolas as default and for as much Unicode as it covers,
- then additionally use Symbola,
- then use fonts from noto-fonts.elc according to the script,
- everything else - don't show.

>> When I type C-u C-x = on any of above letters, it says:
>>     There are text properties here:
>>       face                 variable-pitch
>> and the variable-pitch says Arial, but it's not Arial.
> 
> You countermanded that with your over-optimistic set-fontset-font
> setting, I think.

I guess I underestimated the power of set-fontset-font, it was good as
long as it was used in buffer with default font and had the same value
as default font.

Anyway, I think I found a better way to use set-fontset-font in my case:
-(set-fontset-font "fontset-default" 'unicode "Consolas")
+(set-fontset-font "fontset-default" 'unicode "nil")
I don't think I have "nil" font, and it seems to work in both HELLO file
(it loads faster, codes for some chars instead of glyphs) and in EWW buffer.

>> I DON'T want to change whole text to TNR, or any other font, with this
>> command, quite the opposite, I don't want IT to change chars above
>> 256, in EWW buffer that uses variable-pitch font, which is Arial.
> 
> Then why did you use this:
> 
>   (set-fontset-font t 'unicode "Times New Roman")
> 
> ?  It tells Emacs the opposite: to use Times New Roman for _any_
> character (because the 'unicode' script spans all the characters you
> can possibly have).

With background info about how I set fonts, I think it is easier to see
"why".  I should have described this "bug" from the perspective of my init.el
instead of setting TNR out of context.


S. U.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#42603; Package emacs. (Fri, 31 Jul 2020 17:50:02 GMT) Full text and rfc822 format available.

Message #23 received at 42603 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Sebastian Urban <mrsebastianurban <at> gmail.com>
Cc: 42603 <at> debbugs.gnu.org
Subject: Re: bug#42603: EWW shows chars > #xFF with font set by
 "set-fontset-font"
Date: Fri, 31 Jul 2020 20:49:24 +0300
> Cc: 42603 <at> debbugs.gnu.org
> From: Sebastian Urban <mrsebastianurban <at> gmail.com>
> Date: Fri, 31 Jul 2020 18:10:06 +0200
> 
> And if I want some characters to be rendered, and the rest not to be
> rendered - to which font should I assign "the rest"?

Leave them unassigned: Emacs will find the proper font itself.
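
One way to follow this suggestion is to keep an explicit table of the
scripts one cares about and to assign nothing else. The sketch below
assumes a hypothetical alist my/script-fonts and a hypothetical function
my/apply-script-fonts; the script/family pairs are examples only and do
not reflect the contents of noto-fonts.elc, which is not shown in this
report.

```elisp
(defvar my/script-fonts
  '((greek    . "Noto Sans")
    (cyrillic . "Noto Sans")
    (hebrew   . "Noto Sans Hebrew")
    (arabic   . "Noto Sans Arabic"))
  "Hypothetical alist of (SCRIPT . FAMILY) pairs, for illustration.")

(defun my/apply-script-fonts ()
  "Assign each installed family in `my/script-fonts' to its script.
Scripts whose family is not installed, and all scripts not listed,
are left unassigned so that Emacs picks a font for them itself."
  (when (and (fboundp 'set-fontset-font) (display-graphic-p))
    (let ((installed (font-family-list)))
      (dolist (entry my/script-fonts)
        ;; Skip families that are not available on this system.
        (when (member (cdr entry) installed)
          (set-fontset-font "fontset-default" (car entry)
                            (font-spec :family (cdr entry))
                            nil 'prepend))))))

;; Usage, e.g. from init.el in a graphical session:
(my/apply-script-fonts)
```

The 'prepend argument puts each family ahead of any fonts already
registered for that script, without replacing them.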




Reply sent to Sebastian Urban <mrsebastianurban <at> gmail.com>:
You have taken responsibility. (Mon, 03 Aug 2020 18:40:01 GMT) Full text and rfc822 format available.

Notification sent to Sebastian Urban <mrsebastianurban <at> gmail.com>:
bug acknowledged by developer. (Mon, 03 Aug 2020 18:40:02 GMT) Full text and rfc822 format available.

Message #28 received at 42603-done <at> debbugs.gnu.org (full text, mbox):

From: Sebastian Urban <mrsebastianurban <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 42603-done <at> debbugs.gnu.org
Subject: Re: bug#42603: EWW shows chars > #xFF with font set by
 "set-fontset-font"
Date: Mon, 3 Aug 2020 20:39:47 +0200
>> And if I want some characters to be rendered, and the rest not to be
>> rendered - to which font should I assign "the rest"?
> 
> Leave them unassigned: Emacs will find the proper font itself.

Alright, I'll stick to:
   (set-fontset-font "fontset-default" 'unicode "nil")
for now, but if something goes wrong, I'll remove this line and try to
use per script/range settings for as many chars as I need/want, and let
Emacs do the work for other characters.


S. U.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 01 Sep 2020 11:24:04 GMT) Full text and rfc822 format available.

This bug report was last modified 3 years and 210 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.