GNU bug report logs - #44236
[PATCH] xdisp: Apply nobreak-char-display also to NARROW NO-BREAK SPACE U+202F
Reported by: Neil Roberts <bpeeluk <at> yahoo.co.uk>
Date: Mon, 26 Oct 2020 11:15:02 UTC
Severity: normal
Tags: fixed, patch
Fixed in version 28.1
Done: Lars Ingebrigtsen <larsi <at> gnus.org>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 44236 in the body.
You can then email your comments to 44236 AT debbugs.gnu.org in the normal way.
Report forwarded to bug-gnu-emacs <at> gnu.org: bug#44236; Package emacs. (Mon, 26 Oct 2020 11:15:02 GMT) Full text and rfc822 format available.
Acknowledgement sent to Neil Roberts <bpeeluk <at> yahoo.co.uk>: New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Mon, 26 Oct 2020 11:15:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
U+202F is like the normal non-breaking space character except that it
is slightly narrower. In the French language, this character is
supposed to be used before most punctuation marks such as question
marks and quote characters. For people using the BÉPO keyboard layout,
this character is typed with just shift+space, so it’s quite easy to
accidentally type it. For that reason it would be nice if it was
displayed differently like the regular non-breaking space. This patch
makes that change.
* src/character.h: Add an enum for the U+202F character.
* src/xdisp.c (get_next_display_element): Use nobreak_space face also
for U+202F.
---
src/character.h | 1 +
src/xdisp.c | 2 +-
2 files changed, 2 insertions(+), 1 deletion(-)
diff --git src/character.h src/character.h
index af5023f77c..90708c8d38 100644
--- src/character.h
+++ src/character.h
@@ -69,6 +69,7 @@ #define EMACS_CHARACTER_H
enum
{
NO_BREAK_SPACE = 0x00A0,
+ NARROW_NO_BREAK_SPACE = 0x202F,
SOFT_HYPHEN = 0x00AD,
ZERO_WIDTH_NON_JOINER = 0x200C,
ZERO_WIDTH_JOINER = 0x200D,
diff --git src/xdisp.c src/xdisp.c
index 5a62cd6eb5..0772066f8a 100644
--- src/xdisp.c
+++ src/xdisp.c
@@ -7555,7 +7555,7 @@ get_next_display_element (struct it *it)
non-ASCII spaces and hyphens specially. */
if (! ASCII_CHAR_P (c) && ! NILP (Vnobreak_char_display))
{
- if (c == NO_BREAK_SPACE)
+ if (c == NO_BREAK_SPACE || c == NARROW_NO_BREAK_SPACE)
nonascii_space_p = true;
else if (c == SOFT_HYPHEN || c == HYPHEN
|| c == NON_BREAKING_HYPHEN)
--
2.25.4
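The effect of this first patch can be sketched outside Emacs as a small standalone C function. This is only an illustration, not the code in xdisp.c: `classify_nobreak_char`, `enum nobreak_kind` and the `nobreak_display_enabled` parameter are hypothetical names, and the parameter stands in for the `! NILP (Vnobreak_char_display)` test. The code point values are the ones given in the patch and in the variable's documentation.

```c
#include <stdbool.h>

/* Code points named in src/character.h and in the doc string.  */
enum
{
  NO_BREAK_SPACE = 0x00A0,
  NARROW_NO_BREAK_SPACE = 0x202F,
  SOFT_HYPHEN = 0x00AD,
  HYPHEN = 0x2010,
  NON_BREAKING_HYPHEN = 0x2011
};

/* Hypothetical result type: which special face, if any, applies.  */
enum nobreak_kind { NOBREAK_NONE, NOBREAK_SPACE, NOBREAK_HYPHEN };

enum nobreak_kind
classify_nobreak_char (int c, bool nobreak_display_enabled)
{
  /* ASCII characters are never treated specially, and nothing is
     highlighted when the option is off.  */
  if (c < 0x80 || !nobreak_display_enabled)
    return NOBREAK_NONE;

  /* The patch adds the second comparison on this line.  */
  if (c == NO_BREAK_SPACE || c == NARROW_NO_BREAK_SPACE)
    return NOBREAK_SPACE;
  if (c == SOFT_HYPHEN || c == HYPHEN || c == NON_BREAKING_HYPHEN)
    return NOBREAK_HYPHEN;
  return NOBREAK_NONE;
}
```

With this version, other narrow spaces such as U+2009 THIN SPACE are still left alone, which is the gap the follow-up discussion addresses.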
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#44236; Package emacs. (Mon, 26 Oct 2020 16:30:02 GMT)
Message #8 received at 44236 <at> debbugs.gnu.org (full text, mbox):
> Date: Mon, 26 Oct 2020 12:13:48 +0100
> From: Neil Roberts via "Bug reports for GNU Emacs,
> the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org>
>
> U+202F is like the normal non-breaking space character except that it
> is slightly narrower. In the French language, this character is
> supposed to be used before most punctuation marks such as question
> marks and quote characters. For people using the BÉPO keyboard layout,
> this character is typed with just shift+space, so it’s quite easy to
> accidentally type it. For that reason it would be nice if it was
> displayed differently like the regular non-breaking space. This patch
> makes that change.
Thanks.
But what is the purpose of showing this character like we do with
NBSP? We do that with NBSP because otherwise it will be easy to
interpret NBSP as a SPC: they have the same width and appearance on
display. By contrast, U+202F NARROW NO-BREAK SPACE is much thinner,
and cannot be mistaken to be SPC.
OTOH, if we make U+202F stand out, then why not others, for example
U+2007? or U+2060? or U+2002? or U+2003? or U+2009 etc.
IOW, we need to decide on the rationale for displaying these
specially, and then we can decide which ones should have this applied.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#44236; Package emacs. (Mon, 26 Oct 2020 16:57:02 GMT)
Message #11 received at 44236 <at> debbugs.gnu.org (full text, mbox):
> But what is the purpose of showing this character like we do with
> NBSP? We do that with NBSP because otherwise it will be easy to
> interpret NBSP as a SPC: they have the same width and appearance on
> display. By contrast, U+202F NARROW NO-BREAK SPACE is much thinner,
> and cannot be mistaken to be SPC.
>
> OTOH, if we make U+202F stand out, then why not others, for example
> U+2007? or U+2060? or U+2002? or U+2003? or U+2009 etc.
>
> IOW, we need to decide on the rationale for displaying these
> specially, and then we can decide which ones should have this applied.
I agree, both (1) that the main purpose of the current highlighting is to make no-break (aka hard) space stand out from ordinary space, and (2) that any additional highlighting needs a rationale.
___
FWIW -
My library highlight-chars.el lets you highlight particular chars in different ways, au choix. And in particular, you can highlight just hard spaces or just hard hyphens (they need not be treated the same way).
https://www.emacswiki.org/emacs/download/highlight-chars.el
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#44236; Package emacs. (Tue, 27 Oct 2020 09:18:02 GMT)
Message #14 received at 44236 <at> debbugs.gnu.org (full text, mbox):
Eli Zaretskii <eliz <at> gnu.org> writes:
> But what is the purpose of showing this character like we do with
> NBSP? We do that with NBSP because otherwise it will be easy to
> interpret NBSP as a SPC: they have the same width and appearance on
> display. By contrast, U+202F NARROW NO-BREAK SPACE is much thinner,
> and cannot be mistaken to be SPC.
Most people use Emacs with a monospace font, as is the default if you
don’t change it, so in practice U+202F looks identical to NBSP and the
regular space. I would assume that most people using these characters
would be editing the source code for a document that would be displayed
in something else, such as editing an HTML document. In that case you
want to make sure that you got the right spaces in the source code and
without the visual indication it is really hard to do.
I guess ideally in my case it would be even better if U+202F had a
different face than NBSP so that I could also make sure I picked the
right non-breaking space when typing a document in French.
The other use case, which is probably more common for me, is that I am
editing some source code and I don’t want any non-breaking spaces at
all. With the bépo keyboard layout it’s kind of easy to accidentally
type them, so I just want to be able to recognise either of them. In
that case having the same face for both characters is still helpful.
> OTOH, if we make U+202F stand out, then why not others, for example
> U+2007? or U+2060? or U+2002? or U+2003? or U+2009 etc.
I think it would make sense to highlight all of the spaces that look
exactly the same as a regular space. That would exclude U+2060 because
that is zero-width. Maybe we could use all of the characters from the
“space separator” Unicode class except U+0020.
https://www.compart.com/en/unicode/category/Zs
- Neil
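The set proposed here, every character in the Unicode "space separator" (Zs) category except U+0020, is small enough to write out by hand. A minimal sketch in standalone C, assuming the Zs membership of current Unicode versions; `is_nonascii_space_separator` is a hypothetical name, not an Emacs function:

```c
#include <stdbool.h>

/* Return true if C is in Unicode general category Zs, excluding the
   ASCII space U+0020.  */
bool
is_nonascii_space_separator (int c)
{
  return (c == 0x00A0                      /* NO-BREAK SPACE */
          || c == 0x1680                   /* OGHAM SPACE MARK */
          || (c >= 0x2000 && c <= 0x200A)  /* EN QUAD .. HAIR SPACE */
          || c == 0x202F                   /* NARROW NO-BREAK SPACE */
          || c == 0x205F                   /* MEDIUM MATHEMATICAL SPACE */
          || c == 0x3000);                 /* IDEOGRAPHIC SPACE */
}
```

U+2060 WORD JOINER belongs to category Cf rather than Zs, so a category-based test leaves it out without needing a separate zero-width check.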
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#44236; Package emacs. (Tue, 27 Oct 2020 15:25:02 GMT)
Message #17 received at 44236 <at> debbugs.gnu.org (full text, mbox):
> From: Neil Roberts <bpeeluk <at> yahoo.co.uk>
> Cc: 44236 <at> debbugs.gnu.org
> Date: Tue, 27 Oct 2020 10:17:35 +0100
>
> > OTOH, if we make U+202F stand out, then why not others, for example
> > U+2007? or U+2060? or U+2002? or U+2003? or U+2009 etc.
>
> I think it would make sense to highlight all of the spaces that look
> exactly the same as a regular space. That would exclude U+2060 because
> that is zero-width. Maybe we could use all of the characters from the
> “space separator” Unicode class except U+0020.
I'm okay with displaying all "space" characters that way. We already
have a function which tests this category: blankp. Would you like to
submit a patch which implements the above? Please also include a NEWS
entry which calls out this new behavior.
Thanks.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#44236; Package emacs. (Wed, 28 Oct 2020 11:38:01 GMT)
Message #20 received at 44236 <at> debbugs.gnu.org (full text, mbox):
nobreak-char-display is documented as making Emacs display all
non-ASCII chars that have the same appearance as an ASCII space using
a special face. In practice however, this was limited to nbsp and the
hyphen characters. When using a monospace font, there are many other
characters that resemble an ASCII space, such as U+202F NARROW
NO-BREAK SPACE. That is like the normal non-breaking space character
except that it is slightly narrower. In the French language, this
character is supposed to be used before most punctuation marks such as
question marks and quote characters, so it is quite prevalent. For
that reason it would be nice if it was displayed differently like the
regular non-breaking space.
This patch makes it show all non-ASCII characters from the Unicode
horizontal space class using the special face.
* src/xdisp.c (get_next_display_element): Use blankp to test whether
to use the nobreak_space face.
---
doc/emacs/display.texi | 3 ++-
etc/NEWS | 8 ++++++++
src/xdisp.c | 5 +++--
3 files changed, 13 insertions(+), 3 deletions(-)
diff --git doc/emacs/display.texi doc/emacs/display.texi
index 6f1bc802b8..ccc945c3af 100644
--- doc/emacs/display.texi
+++ doc/emacs/display.texi
@@ -1605,7 +1605,8 @@ Text Display
realization, e.g., by yanking; for instance, source code compilers
typically do not treat non-@acronym{ASCII} spaces as whitespace
characters. To deal with this problem, Emacs displays such characters
-specially: it displays @code{U+00A0} (no-break space) with the
+specially: it displays @code{U+00A0} (no-break space) and other
+characters from the Unicode horizontal space class with the
@code{nobreak-space} face, and it displays @code{U+00AD} (soft
hyphen), @code{U+2010} (hyphen), and @code{U+2011} (non-breaking
hyphen) with the @code{nobreak-hyphen} face. To disable this, change
diff --git etc/NEWS etc/NEWS
index 7dbd3d51fa..dcf9a75723 100644
--- etc/NEWS
+++ etc/NEWS
@@ -163,6 +163,14 @@ your init file:
(setq frame-title-format '(multiple-frames "%b"
("" invocation-name "@" system-name)))
++++
+** 'nobreak-char-display' now also affects all non-ASCII Unicode horizontal space characters.
+The documented intention of this variable is to cause Emacs to display
+characters that could be confused with a space character using a
+different face. Previously this was limited only to NBSP and hyphen
+characters. Now it covers all of the Unicode space characters,
+including narrow NBSP, which has the same appearance.
+
* Editing Changes in Emacs 28.1
diff --git src/xdisp.c src/xdisp.c
index 5a62cd6eb5..cf30ba9479 100644
--- src/xdisp.c
+++ src/xdisp.c
@@ -7555,7 +7555,7 @@ get_next_display_element (struct it *it)
non-ASCII spaces and hyphens specially. */
if (! ASCII_CHAR_P (c) && ! NILP (Vnobreak_char_display))
{
- if (c == NO_BREAK_SPACE)
+ if (blankp (c))
nonascii_space_p = true;
else if (c == SOFT_HYPHEN || c == HYPHEN
|| c == NON_BREAKING_HYPHEN)
@@ -34740,7 +34740,8 @@ syms_of_xdisp (void)
same appearance as an ASCII space or hyphen, using the `nobreak-space'
or `nobreak-hyphen' face respectively.
-U+00A0 (no-break space), U+00AD (soft hyphen), U+2010 (hyphen), and
+All of the non-ASCII characters in the Unicode horizontal whitespace
+character class, as well as U+00AD (soft hyphen), U+2010 (hyphen), and
U+2011 (non-breaking hyphen) are affected.
Any other non-nil value means to display these characters as an escape
--
2.25.4
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#44236; Package emacs. (Fri, 30 Oct 2020 12:15:01 GMT)
Message #23 received at 44236 <at> debbugs.gnu.org (full text, mbox):
Neil Roberts <bpeeluk <at> yahoo.co.uk> writes:
> This patch makes it show all non-ASCII characters from the Unicode
> horizontal space class using the special face.
Thanks; applied to Emacs 28.
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no
Added tag(s) fixed. Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Fri, 30 Oct 2020 12:15:02 GMT)
bug marked as fixed in version 28.1, send any further explanations to 44236 <at> debbugs.gnu.org and Neil Roberts <bpeeluk <at> yahoo.co.uk>. Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Fri, 30 Oct 2020 12:15:02 GMT)
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#44236; Package emacs. (Sun, 01 Nov 2020 08:23:01 GMT)
Message #30 received at 44236 <at> debbugs.gnu.org (full text, mbox):
> IOW, we need to decide on the rationale for displaying these
> specially, and then we can decide which ones should have this applied.
For a long time my customization contained
(setq dired-listing-switches "-Alv --block-size='1")
that in Dired buffers displays file sizes using nice space
as the thousands separator between groups of 3 digits.
But now this clean space between numbers is polluted by visual garbage
of unrequested highlighted underlines.
Using 'C-u C-x =' on the character shows that it's NARROW NO-BREAK SPACE
with the nobreak-space face on it.
The intention of nobreak-space is to warn the user about confusable
characters in writable buffers. But why highlight such characters
in read-only Dired buffers?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#44236; Package emacs. (Sun, 01 Nov 2020 08:41:01 GMT)
Message #33 received at 44236 <at> debbugs.gnu.org (full text, mbox):
> For a long time my customization contained
>
> (setq dired-listing-switches "-Alv --block-size='1")
>
> that in Dired buffers displays file sizes using nice space
> as the thousands separator between groups of 3 digits.
>
> But now this clean space between numbers is polluted by visual garbage
> of unrequested highlighted underlines.
For example, gnus-article-mode disables this highlighting
in read-only buffers with:
;; Prevent Emacs from displaying non-break space with
;; `nobreak-space' face.
(set (make-local-variable 'nobreak-char-display) nil)
But still in Dired buffers this highlighting is useful to see
bad characters in file names. Whereas such highlighting makes no sense
in file sizes.
> Using 'C-u C-x =' on the character shows that it's NARROW NO-BREAK SPACE
> with the nobreak-space face on it.
It displays this information with this patch:
diff --git a/lisp/descr-text.el b/lisp/descr-text.el
index ec9a968013..075cb21c21 100644
--- a/lisp/descr-text.el
+++ b/lisp/descr-text.el
@@ -687,7 +687,8 @@ describe-char
(save-excursion (goto-char pos)
(looking-at-p "[ \t]+$")))
'trailing-whitespace)
- ((and nobreak-char-display char (eq char '#xa0))
+ ((and nobreak-char-display char
+ (eq (get-char-code-property char 'general-category) 'Zs))
'nobreak-space)
((and nobreak-char-display char
(memq char '(#xad #x2010 #x2011)))
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#44236; Package emacs. (Sun, 01 Nov 2020 13:14:02 GMT)
Message #36 received at 44236 <at> debbugs.gnu.org (full text, mbox):
Juri Linkov <juri <at> linkov.net> writes:
> The intention of nobreak-space is to warn the user about confusable
> characters in writable buffers. But why highlight such characters
> in read-only Dired buffers?
Perhaps `special-mode' should switch this highlighting off?
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#44236; Package emacs. (Sun, 01 Nov 2020 15:17:02 GMT)
Message #39 received at 44236 <at> debbugs.gnu.org (full text, mbox):
> From: Lars Ingebrigtsen <larsi <at> gnus.org>
> Cc: Eli Zaretskii <eliz <at> gnu.org>, Neil Roberts <bpeeluk <at> yahoo.co.uk>,
> 44236 <at> debbugs.gnu.org
> Date: Sun, 01 Nov 2020 14:12:49 +0100
>
> Juri Linkov <juri <at> linkov.net> writes:
>
> > The intention of nobreak-space is to warn the user about confusable
> > characters in writable buffers. But why highlight such characters
> > in read-only Dired buffers?
>
> Perhaps `special-mode' should switch this highlighting off?
That sounds too drastic to me. But perhaps we should highlight this
character and other "thin" spaces only on TTY frames, where they
really look like a SPC? Because on GUI frames it is quite easy to
understand that they are not a SPC character.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#44236; Package emacs. (Sun, 01 Nov 2020 18:56:02 GMT)
Message #42 received at 44236 <at> debbugs.gnu.org (full text, mbox):
>> > The intention of nobreak-space is to warn the user about confusable
>> > characters in writable buffers. But why highlight such characters
>> > in read-only Dired buffers?
>>
>> Perhaps `special-mode' should switch this highlighting off?
>
> That sounds too drastic to me.
I agree.
> But perhaps we should only highlight this character and other "thin"
> spaces only on TTY frames, where they really look like a SPC?
> Because on GUI frames it is quite easy to understand that they are not
> a SPC character.
Even on GUI frames with monospaced fonts I see that NARROW NO-BREAK SPACE
still has the same width as all other space characters. So there is
no visual difference between them on GUI frames.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#44236; Package emacs. (Sun, 01 Nov 2020 18:56:02 GMT)
Message #45 received at 44236 <at> debbugs.gnu.org (full text, mbox):
>> The intention of nobreak-space is to warn the user about confusable
>> characters in writable buffers. But why highlight such characters
>> in read-only Dired buffers?
>
> Perhaps `special-mode' should switch this highlighting off?
In Dired it's still useful to be able to spot unusual characters
in file names, but such highlighting is useless in file sizes.
Maybe highlighting should check for some text properties,
and not to highlight nobreak-chars in text with these properties?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#44236; Package emacs. (Sun, 01 Nov 2020 19:31:02 GMT)
Message #48 received at 44236 <at> debbugs.gnu.org (full text, mbox):
> From: Juri Linkov <juri <at> linkov.net>
> Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, bpeeluk <at> yahoo.co.uk,
> 44236 <at> debbugs.gnu.org
> Date: Sun, 01 Nov 2020 20:51:33 +0200
>
> > But perhaps we should only highlight this character and other "thin"
> > spaces only on TTY frames, where they really look like a SPC?
> > Because on GUI frames it is quite easy to understand that they are not
> > a SPC character.
>
> Even on GUI frames with monospaced fonts I see that NARROW NO-BREAK SPACE
> still has the same width as all other space characters. So there is
> no visual difference between them on GUI frames.
In which case the special face is entirely appropriate, and I don't
think I understand your complaint.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#44236; Package emacs. (Sun, 01 Nov 2020 19:32:01 GMT)
Message #51 received at 44236 <at> debbugs.gnu.org (full text, mbox):
> From: Juri Linkov <juri <at> linkov.net>
> Cc: Eli Zaretskii <eliz <at> gnu.org>, Neil Roberts <bpeeluk <at> yahoo.co.uk>,
> 44236 <at> debbugs.gnu.org
> Date: Sun, 01 Nov 2020 20:53:59 +0200
>
> Maybe highlighting should check for some text properties,
> and not to highlight nobreak-chars in text with these properties?
That would mean an entirely different implementation from what we have
now.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#44236; Package emacs. (Sun, 01 Nov 2020 19:43:02 GMT)
Message #54 received at 44236 <at> debbugs.gnu.org (full text, mbox):
Juri Linkov <juri <at> linkov.net> writes:
> In Dired it's still useful to be able to spot unusual characters
> in file names, but such highlighting is useless in file sizes.
If the idea is to spot malicious filenames that have confusable
characters then I think the problem is much larger than just confusing
space characters. For example “.аlias”, with a letter from the Cyrillic
alphabet and a zero-width space. I think that particular problem is out
of scope for the nobreak-char-display feature.
Regards,
- Neil
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#44236; Package emacs. (Sun, 01 Nov 2020 19:47:01 GMT)
Message #57 received at 44236 <at> debbugs.gnu.org (full text, mbox):
>> Even on GUI frames with monospaced fonts I see that NARROW NO-BREAK SPACE
>> still has the same width as all other space characters. So there is
>> no visual difference between them on GUI frames.
>
> In which case the special face is entirely appropriate, and I don't
> think I understand your complaint.
Seeing hundreds of red underlines in Dired buffers is a horrible experience.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#44236; Package emacs. (Sun, 01 Nov 2020 19:48:02 GMT)
Message #60 received at 44236 <at> debbugs.gnu.org (full text, mbox):
>> Maybe highlighting should check for some text properties,
>> and not to highlight nobreak-chars in text with these properties?
>
> That would mean an entirely different implementation from what we have
> now.
get_next_display_element has no access to text properties?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#44236; Package emacs. (Sun, 01 Nov 2020 19:54:01 GMT)
Message #63 received at 44236 <at> debbugs.gnu.org (full text, mbox):
> From: Juri Linkov <juri <at> linkov.net>
> Cc: larsi <at> gnus.org, bpeeluk <at> yahoo.co.uk, 44236 <at> debbugs.gnu.org
> Date: Sun, 01 Nov 2020 21:40:20 +0200
>
> >> Even on GUI frames with monospaced fonts I see that NARROW NO-BREAK SPACE
> >> still has the same width as all other space characters. So there is
> >> no visual difference between them on GUI frames.
> >
> > In which case the special face is entirely appropriate, and I don't
> > think I understand your complaint.
>
> Seeing hundreds of red underlines in Dired buffers is a horrible experience.
You can turn the feature off locally in your Dired buffers, no?
I mean, you've created this situation by customizing the Dired
display, so customizing it a bit more should not be a grave problem,
IMO.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#44236; Package emacs. (Sun, 01 Nov 2020 20:00:02 GMT)
Message #66 received at 44236 <at> debbugs.gnu.org (full text, mbox):
> From: Juri Linkov <juri <at> linkov.net>
> Cc: larsi <at> gnus.org, bpeeluk <at> yahoo.co.uk, 44236 <at> debbugs.gnu.org
> Date: Sun, 01 Nov 2020 21:41:10 +0200
>
> >> Maybe highlighting should check for some text properties,
> >> and not to highlight nobreak-chars in text with these properties?
> >
> > That would mean an entirely different implementation from what we have
> > now.
>
> get_next_display_element has no access to text properties?
Text properties are handled by the display code on a level above
get_next_display_element.
But that's not what I meant. I meant that if we want to base this on
text properties, we should do this via hi-lock or similar, not in the
display engine which treats all characters the same.
Alternatively, if this new feature is so annoying, and people are
unwilling to customize their Emacs to get the old behavior back, maybe
we should make nobreak-char-display more than just a simple boolean,
so that people could control which characters are and aren't
emphasized?
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#44236; Package emacs. (Sun, 01 Nov 2020 20:04:02 GMT)
Message #69 received at 44236 <at> debbugs.gnu.org (full text, mbox):
>> In Dired it's still useful to be able to spot unusual characters
>> in file names, but such highlighting is useless in file sizes.
>
> If the idea is to spot malicious filenames that have confusable
> characters then I think the problem is much larger than just confusing
> space characters. For example “.аlias”, with a letter from the Cyrillic
> alphabet and a zero-width space. I think that particular problem is out
> of scope for the nobreak-char-display feature.
You tried to sneak in confusable characters, but I see them clearly,
heh heh :-)
[confusables.png (image/png, inline)]
These characters are revealed thanks to the 'markchars' package from GNU ELPA,
and glyphless-char face customized to :background "red".
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#44236; Package emacs. (Sun, 01 Nov 2020 20:14:01 GMT)
Message #72 received at 44236 <at> debbugs.gnu.org (full text, mbox):
>> Seeing hundreds of red underlines in Dired buffers is a horrible experience.
>
> You can turn the feature off locally in your Dired buffers, no?
>
> I mean, you've created this situation by customizing the Dired
> display, so customizing it a bit more should not be a grave problem,
> IMO.
Indeed, this should not be hard to do if a more general solution
can't be found.
> Text properties are handled by the display code on a level above
> get_next_display_element.
>
> But that's not what I meant. I meant that if we want to base this on
> text properties, we should do this via hi-lock or similar, not in the
> display engine which treats all characters the same.
Or markchars.el, or uni-confusables.el. As with these packages, maybe it
would be better to create another package, e.g. nobreak.el, based on
font-lock-mode?
> Alternatively, if this new feature is so annoying, and people are
> unwilling to customize their Emacs to get the old behavior back, maybe
> we should make nobreak-char-display more than just a simple boolean,
> so that people could control which characters are and aren't
> emphasized?
This would complicate the core functions.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#44236; Package emacs. (Sun, 01 Nov 2020 22:46:01 GMT)
Message #75 received at 44236 <at> debbugs.gnu.org (full text, mbox):
> Alternatively, if this new feature is so annoying, and people are
> unwilling to customize their Emacs to get the old behavior back, maybe
> we should make nobreak-char-display more than just a simple boolean,
> so that people could control which characters are and aren't
> emphasized?
(Not speaking to how annoying anything might be, here.)
`nobreak-char-display' is what it is. It can't be
expected to do more than it does, IMO. Its aim is
to highlight non-ASCII chars that look similar to
ASCII space and hyphen.
That's already too much, IMO. I've said before that
it's a weakness that users can't separate those two
(highlighting look-alikes for SPC and hyphen). They
shouldn't be hard-coupled together (IMO).
And I mentioned my library `highlight-chars.el',
which lets you highlight different sets of chars.
And code can control that. It sounds like that's
maybe what's being looked for here: highlight certain
chars in certain contexts (not just everywhere).
___
Description:
https://www.emacswiki.org/emacs/ShowWhiteSpace#HighlightChars
Code:
https://www.emacswiki.org/emacs/download/highlight-chars.el
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#44236; Package emacs. (Tue, 03 Nov 2020 19:11:02 GMT)
Message #78 received at 44236 <at> debbugs.gnu.org (full text, mbox):
>> But that's not what I meant. I meant that if we want to base this on
>> text properties, we should do this via hi-lock or similar, not in the
>> display engine which treats all characters the same.
>
> Or markchars.el, or uni-confusables.el. As with these packages, maybe it
> would be better to create another package, e.g. nobreak.el, based on
> font-lock-mode?
I have now extended markchars.el to highlight exactly the same characters
as nobreak-char-display does, and additionally to highlight them
only in file names in Dired. This can be configured with a hook such as:
(add-hook 'dired-mode-hook
(lambda ()
(setq-local nobreak-char-display nil)
(setq-local markchars-what '(markchars-nobreak-space
markchars-nobreak-hyphen))
(markchars-mode 1)))
[markchars-nobreak.patch (text/x-diff, inline)]
diff --git a/packages/markchars/markchars.el b/packages/markchars/markchars.el
index 7d7fe2982..bd902f7c7 100644
--- a/packages/markchars/markchars.el
+++ b/packages/markchars/markchars.el
@@ -31,6 +31,12 @@
;; `markchars-face-confusable' or `markchars-face-pattern'
;; respectively.
;;
+;; You can set `nobreak-char-display' to nil, and use
+;; `markchars-nobreak-space' and `markchars-nobreak-hyphen'
+;; in Dired buffers to highlight `nobreak-space' and `nobreak-hyphen'
+;; only in file names, not `nobreak-space' used by thousands separators
+;; in file sizes (bug#44236).
+;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;
;;; Change log:
@@ -79,6 +85,16 @@ markchars-white
"White face for `markchars-mode' char marking."
:group 'markchars)
+(defface markchars-nobreak-space
+ '((t (:inherit nobreak-space)))
+ "Face for displaying nobreak space."
+ :group 'markchars)
+
+(defface markchars-nobreak-hyphen
+ '((t (:inherit nobreak-hyphen)))
+ "Face for displaying nobreak hyphens."
+ :group 'markchars)
+
(defcustom markchars-face-pattern 'markchars-heavy
"Pointer to face used for marking matched patterns."
:type 'face
@@ -101,12 +117,40 @@ markchars-simple-pattern
:type 'regexp
:group 'markchars)
+(defvar markchars-nobreak-space-pattern
+ (rx (any ;; ?\N{SPACE}
+ ?\N{NO-BREAK SPACE}
+ ?\N{OGHAM SPACE MARK}
+ ?\N{EN QUAD}
+ ?\N{EM QUAD}
+ ?\N{EN SPACE}
+ ?\N{EM SPACE}
+ ?\N{THREE-PER-EM SPACE}
+ ?\N{FOUR-PER-EM SPACE}
+ ?\N{SIX-PER-EM SPACE}
+ ?\N{FIGURE SPACE}
+ ?\N{PUNCTUATION SPACE}
+ ?\N{THIN SPACE}
+ ?\N{HAIR SPACE}
+ ?\N{NARROW NO-BREAK SPACE}
+ ?\N{MEDIUM MATHEMATICAL SPACE}
+ ?\N{IDEOGRAPHIC SPACE}))
+ "A list of characters with general-category `Zs' (Separator, Space).")
+
+(defvar markchars-nobreak-hyphen-pattern
+ (rx (any ?\N{SOFT HYPHEN} ?\N{HYPHEN} ?\N{NON-BREAKING HYPHEN}))
+ "A list of hyphen characters.")
+
(defcustom markchars-what
`(markchars-simple-pattern
markchars-confusables
,@(when (fboundp 'idn-is-recommended) '(markchars-nonidn-fun)))
"Things to mark, a list of regular expressions or symbols."
:type `(repeat (choice :tag "Marking choices"
+ (const :tag "Non-ASCII space chars"
+ markchars-nobreak-space)
+ (const :tag "Non-ASCII hyphen chars"
+ markchars-nobreak-hyphen)
(const
:tag "Non IDN chars (Unicode.org tr39 suggestions)"
markchars-nonidn-fun)
@@ -129,6 +173,18 @@ markchars-set-keywords
(when (eq what 'markchars-simple-pattern)
(setq what markchars-simple-pattern))
(cond
+ ((eq what 'markchars-nobreak-space)
+ (list
+ markchars-nobreak-space-pattern
+ (list 0 '(markchars--render-nobreak-space
+ (match-beginning 0)
+ (match-end 0)))))
+ ((eq what 'markchars-nobreak-hyphen)
+ (list
+ markchars-nobreak-hyphen-pattern
+ (list 0 '(markchars--render-nobreak-hyphen
+ (match-beginning 0)
+ (match-end 0)))))
((eq what 'markchars-nonidn-fun)
(list
"\\<\\w+\\>"
@@ -184,6 +240,22 @@ markchars--render-nonidn
(put-text-property (point) (1+ (point)) 'face markchars-face-nonidn)))
(forward-char))))
+(defun markchars--render-nobreak-space (beg end)
+ "Assign markchars pattern properties between BEG and END.
+In Dired/WDired buffers, highlight nobreak-space characters
+only in file names, not anywhere else, so it doesn't highlight
+nobreak-space characters used by thousands separators in file sizes."
+ (when (or (not (derived-mode-p 'dired-mode 'wdired-mode))
+ (or (get-text-property beg 'dired-filename)
+ (get-text-property end 'dired-filename)))
+ (put-text-property beg end 'face 'markchars-nobreak-space)
+ (put-text-property beg end 'markchars 'nobreak-space)))
+
+(defun markchars--render-nobreak-hyphen (beg end)
+ "Assign markchars pattern properties between BEG and END."
+ (put-text-property beg end 'face 'markchars-nobreak-hyphen)
+ (put-text-property beg end 'markchars 'nobreak-hyphen))
+
;;;###autoload
(define-minor-mode markchars-mode
"Mark special characters.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#44236; Package emacs. (Tue, 03 Nov 2020 21:08:02 GMT)
Message #81 received at 44236 <at> debbugs.gnu.org (full text, mbox):
Juri Linkov <juri <at> linkov.net> writes:
> +(defface markchars-nobreak-space
> + '((t (:inherit nobreak-space)))
> +(defface markchars-nobreak-hyphen
> + '((t (:inherit nobreak-hyphen)))
AKA '((t :inherit ...)).
--
Basil
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#44236; Package emacs. (Wed, 04 Nov 2020 19:57:01 GMT)
Message #84 received at 44236 <at> debbugs.gnu.org (full text, mbox):
>> +(defface markchars-nobreak-space
>> + '((t (:inherit nobreak-space)))
>
>> +(defface markchars-nobreak-hyphen
>> + '((t (:inherit nobreak-hyphen)))
>
> AKA '((t :inherit ...)).
Thanks, I also fixed the existing deffaces where I copied this from,
and pushed to GNU ELPA.
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 03 Dec 2020 12:24:05 GMT)
This bug report was last modified 4 years and 148 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.