GNU bug report logs - #50202
bibtex-mode: unescaped dollar sign in file field leads to wrong highlighting


Package: emacs;

Reported by: Yuu Yin <yuuyin <at> protonmail.com>

Date: Wed, 25 Aug 2021 19:16:02 UTC

Severity: normal

Done: Roland Winkler <winkler <at> gnu.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 50202 in the body.
You can then email your comments to 50202 AT debbugs.gnu.org in the normal way.



Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#50202; Package emacs. (Wed, 25 Aug 2021 19:16:02 GMT)

Acknowledgement sent to Yuu Yin <yuuyin <at> protonmail.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Wed, 25 Aug 2021 19:16:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Yuu Yin <yuuyin <at> protonmail.com>
To: "bug-gnu-emacs <at> gnu.org" <bug-gnu-emacs <at> gnu.org>
Subject: bibtex-mode: unescaped dollar sign in file field leads to wrong
 highlighting
Date: Wed, 25 Aug 2021 17:48:39 +0000
** Expected behavior
Given a buffer with

- bibtex-mode enabled,
- and a BibTeX entry whose file field has as its value a path containing a dollar sign, e.g. ~/path/to/file $ name.ext~,

bibtex-mode should highlight this entry and the following BibTeX entries in the buffer correctly, given that the file field is a verbatim field (https://github.com/retorquere/zotero-better-bibtex/issues/1895#issuecomment-905572487).

** Actual behavior
Given a buffer with

- bibtex-mode enabled,
- and a BibTeX entry whose file field has as its value a path containing a dollar sign, e.g. ~/path/to/file $ name.ext~,

bibtex-mode does not recognize that the dollar sign is verbatim in the file field, leading to wrong highlighting.

[emacs-bibtex-mode-dollar-sign.png]

** Reproduce
1. Open a buffer with bibtex-mode enabled
2. Add the following content to the buffer

#+begin_src bibtex
@book{test-2021-test,
title = {Test},
author = {{Test}},
year = {2021},
file = {/tmp/test \$ test.epub}
}

@book{test-2021-test,
title = {Test},
author = {{Test}},
year = {2021},
file = {/tmp/test $ test.epub}
}

@book{test-2021-test,
title = {Test},
author = {{Test}},
year = {2021},
file = {/tmp/test $ test.epub}
}
#+end_src

Note that in the first entry the escaped dollar sign is highlighted correctly, but in the second entry the unescaped dollar sign leads to wrong highlighting.
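
A quick way to spot field values like the one in the second entry is to count unescaped dollar signs: an odd count means that one "$" has no partner. The following is only a minimal sketch in Python and is not part of bibtex-mode. The helper name has_unpaired_dollar and the third sample string are made up for illustration, and the check deliberately ignores corner cases such as a dollar sign preceded by an escaped backslash.

```python
import re

# A dollar sign that is not immediately preceded by a backslash.
# Simplification: in "\\$" (escaped backslash, then "$") the dollar
# sign is still treated as escaped.
UNESCAPED_DOLLAR = re.compile(r"(?<!\\)\$")


def has_unpaired_dollar(value):
    """Return True if value contains an odd number of unescaped "$"."""
    return len(UNESCAPED_DOLLAR.findall(value)) % 2 == 1


samples = [
    r"/tmp/test \$ test.epub",  # first entry in the report: escaped
    "/tmp/test $ test.epub",    # second entry: unescaped and unpaired
    "Scaling with $r^2$",       # hypothetical title with paired math
]
for value in samples:
    print(repr(value), "unpaired:", has_unpaired_dollar(value))
```

Only the second sample is flagged, which matches the entry that shows the wrong highlighting.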

In GNU Emacs 28.0.50 (build 1, x86_64-pc-linux-gnu, GTK+ Version 3.24.30, cairo version 1.16.0)
Repository revision: 7640f1da0be206a7598c96acdfdaaf390a2b546c
Repository branch: master
Windowing system distributor 'The X.Org Foundation', version 11.0.12013000
System Description: NixOS 21.11 (Porcupine)

Configured using:
'configure
--prefix=/nix/store/7gspx8jnx0s4wy753rcnfncvwhn5bnfr-emacs-gcc-20210731.0
--disable-build-details --with-modules --with-x-toolkit=gtk3 --with-xft
--with-cairo --with-native-compilation'

Configured features:
CAIRO DBUS FREETYPE GIF GLIB GMP GNUTLS GSETTINGS HARFBUZZ JPEG JSON
LIBOTF LIBSELINUX LIBSYSTEMD LIBXML2 M17N_FLT MODULES NATIVE_COMP NOTIFY
INOTIFY PDUMPER PNG RSVG SECCOMP SOUND THREADS TIFF TOOLKIT_SCROLL_BARS
X11 XDBE XIM XPM GTK3 ZLIB

Important settings:
value of $LANG: en_US.UTF-8
locale-coding-system: utf-8-unix

Major mode: BibTeX

Minor modes in effect:
text-scale-mode: t
tooltip-mode: t
global-eldoc-mode: t
electric-indent-mode: t
mouse-wheel-mode: t
tool-bar-mode: t
menu-bar-mode: t
file-name-shadow-mode: t
global-font-lock-mode: t
font-lock-mode: t
blink-cursor-mode: t
auto-composition-mode: t
auto-encryption-mode: t
auto-compression-mode: t
line-number-mode: t
indent-tabs-mode: t
transient-mark-mode: t

Load-path shadows:
/run/current-system/sw/share/emacs/site-lisp/site-start hides /nix/store/7gspx8jnx0s4wy753rcnfncvwhn5bnfr-emacs-gcc-20210731.0/share/emacs/site-lisp/site-start

Features:
(shadow sort mail-extr mule-util ispell ffap org-element avl-tree
generator ol-eww eww xdg url-queue thingatpt mm-url ol-rmail ol-mhe
ol-irc ol-info ol-gnus nnselect gnus-search eieio-opt speedbar ezimage
dframe gnus-art mm-uu mml2015 mm-view mml-smime smime dig gnus-sum shr
kinsoku svg dom browse-url url url-proxy url-privacy url-expand
url-methods url-history url-cookie url-domsuf url-util url-parse
url-vars mailcap gnus-group gnus-undo gnus-start gnus-dbus dbus xml
gnus-cloud nnimap nnmail mail-source utf7 netrc nnoo parse-time
gnus-spec gnus-int gnus-range gnus-win gnus nnheader wid-edit ol-docview
doc-view jka-compr image-mode exif ol-bibtex ol-bbdb ol-w3m org ob
ob-tangle ob-ref ob-lob ob-table ob-exp org-macro org-footnote org-src
ob-comint org-pcomplete pcomplete comint ansi-color ring org-list
org-faces org-entities noutline outline easy-mmode org-version
ob-emacs-lisp ob-core ob-eval org-table ol org-keys org-compat advice
org-macs org-loaddefs format-spec find-func cal-menu calendar
cal-loaddefs emacsbug message rmc puny dired dired-loaddefs rfc822 mml
mml-sec epa derived epg epg-config gnus-util rmail rmail-loaddefs
auth-source eieio eieio-core eieio-loaddefs password-cache json map
text-property-search mm-decode mm-bodies mm-encode mail-parse rfc2231
mailabbrev gmm-utils mailheader sendmail rfc2047 rfc2045 ietf-drums
mm-util mail-prsvr mail-utils face-remap bibtex iso8601 time-date server
comp comp-cstr warnings subr-x rx cl-seq cl-macs cl-extra help-mode seq
byte-opt gv cl-loaddefs cl-lib bytecomp byte-compile cconv iso-transl
tooltip eldoc electric uniquify ediff-hook vc-hooks lisp-float-type
mwheel term/x-win x-win term/common-win x-dnd tool-bar dnd fontset image
regexp-opt fringe tabulated-list replace newcomment text-mode elisp-mode
lisp-mode prog-mode register page tab-bar menu-bar rfn-eshadow isearch
easymenu timer select scroll-bar mouse jit-lock font-lock syntax
font-core term/tty-colors frame minibuffer cl-generic cham georgian
utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao korean
japanese eucjp-ms cp51932 hebrew greek romanian slovak czech european
ethiopic indian cyrillic chinese composite charscript charprop
case-table epa-hook jka-cmpr-hook help simple abbrev obarray
cl-preloaded nadvice button loaddefs faces cus-face macroexp files
window text-properties overlay sha1 md5 base64 format env code-pages
mule custom widget hashtable-print-readable backquote threads dbusbind
inotify dynamic-setting system-font-setting font-render-setting cairo
move-toolbar gtk x-toolkit x multi-tty make-network-process
native-compile emacs)

Memory information:
((conses 16 262122 31074)
(symbols 48 20673 0)
(strings 32 73673 3641)
(string-bytes 1 2487919)
(vectors 16 38674)
(vector-slots 8 687588 32704)
(floats 8 325 267)
(intervals 56 1287 404)
(buffers 992 16))

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#50202; Package emacs. (Mon, 22 Aug 2022 14:34:01 GMT)

Message #8 received at 50202 <at> debbugs.gnu.org:

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Yuu Yin <yuuyin <at> protonmail.com>
Cc: 50202 <at> debbugs.gnu.org, Roland Winkler <winkler <at> gnu.org>
Subject: Re: bug#50202: bibtex-mode: unescaped dollar sign in file field
 leads to wrong highlighting
Date: Mon, 22 Aug 2022 16:33:00 +0200
Yuu Yin <yuuyin <at> protonmail.com> writes:

> - bibtex-mode enabled,
> - and a BibTeX entry which has as value for file field a path that has dollar sign ~
> /path/to/file $ name.ext~
>
> bibtex-mode doesn't recognizes that the dollar sign is verbatim for the file field,
> leading to wrong highlighting.

This behaviour is still present in Emacs 29.  Perhaps Roland has some
comments; added to the CCs.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#50202; Package emacs. (Thu, 25 Aug 2022 02:59:02 GMT)

Message #11 received at 50202 <at> debbugs.gnu.org:

From: Roland Winkler <winkler <at> gnu.org>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: 50202 <at> debbugs.gnu.org, Stefan Monnier <monnier <at> iro.umontreal.ca>,
 Yuu Yin <yuuyin <at> protonmail.com>
Subject: Re: bug#50202: bibtex-mode: unescaped dollar sign in file field
 leads to wrong highlighting
Date: Wed, 24 Aug 2022 21:58:25 -0500
On Mon, Aug 22 2022, Lars Ingebrigtsen wrote:
> Yuu Yin <yuuyin <at> protonmail.com> writes:
>
>> - bibtex-mode enabled,
>> - and a BibTeX entry which has as value for file field a path that
>> has dollar sign ~
>> /path/to/file $ name.ext~
>>
>> bibtex-mode doesn't recognizes that the dollar sign is verbatim for
>> the file field, leading to wrong highlighting.
>
> This behaviour is still present in Emacs 29.  Perhaps Roland has some
> comments; added to the CCs.

A field "file" is, I believe, not part of standard BibTeX.  So the above
is somewhat pushing the limits of BibTeX mode.

From a more practical perspective, I need to say that the above problem
reaches the limits of my knowledge of how font-lock works in general and
how it deals with the (La)TeX delimiter "$" in particular.  Occasionally,
an unpaired "$" gives me strange results in LaTeX documents, though I have
no recipe to illustrate this.

I believe the above problem would require that BibTeX mode first parse
the BibTeX entries and then use different syntax tables for normal
fields and the file field.  I do not know how feasible this is, in
particular with larger BibTeX files.

I added Stefan to the CCs.  A long time ago, he helped me with font-lock
for BibTeX mode.  Maybe he has some comments.

Personally, I use a completely different strategy for associating file
names with BibTeX entries: the BibTeX autokey machinery generates the
nondirectory part of the filename that I associate with an entry.  And
find-dired locates the file, wherever it resides under a certain
directory.  So there is no file field at all that would require
maintenance.

Roland




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#50202; Package emacs. (Thu, 25 Aug 2022 12:54:02 GMT)

Message #14 received at 50202 <at> debbugs.gnu.org:

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Roland Winkler <winkler <at> gnu.org>
Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 50202 <at> debbugs.gnu.org,
 Yuu Yin <yuuyin <at> protonmail.com>
Subject: Re: bug#50202: bibtex-mode: unescaped dollar sign in file field
 leads to wrong highlighting
Date: Thu, 25 Aug 2022 08:52:55 -0400
>>> - bibtex-mode enabled,
>>> - and a BibTeX entry which has as value for file field a path that
>>> has dollar sign ~
>>> /path/to/file $ name.ext~
>>>
>>> bibtex-mode doesn't recognizes that the dollar sign is verbatim for
>>> the file field, leading to wrong highlighting.
>>
>> This behaviour is still present in Emacs 29.  Perhaps Roland has some
>> comments; added to the CCs.
>
> A field "file" is, I believe, not part of standard BibTeX.  So the above
> is somewhat pushing the limits of BibTeX mode.

I think the report/problem would be the same if there was a $ in a URL, tho.

> From a more practical perspective, I need to say that the above problem
> reaches the limits of my knowledge of how font-lock works in general and
> how it deals with the (La)TeX delimiter "$" in particular.

It doesn't deal with $ very well, indeed.

But I wonder how important it is for `bibtex-mode` to try and recognize
the (La)TeX meaning of the $ character.
Maybe we should just give $ the punctuation syntax in the syntax-table.


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#50202; Package emacs. (Thu, 25 Aug 2022 16:37:03 GMT)

Message #17 received at 50202 <at> debbugs.gnu.org:

From: Roland Winkler <winkler <at> gnu.org>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 50202 <at> debbugs.gnu.org,
 Yuu Yin <yuuyin <at> protonmail.com>
Subject: Re: bug#50202: bibtex-mode: unescaped dollar sign in file field
 leads to wrong highlighting
Date: Thu, 25 Aug 2022 11:36:25 -0500
On Thu, Aug 25 2022, Stefan Monnier wrote:
>> A field "file" is, I believe, not part of standard BibTeX.  So the above
>> is somewhat pushing the limits of BibTeX mode.
>
> I think the report/problem would be the same if there was a $ in a
> URL, tho.

Good point.  (A url field is not part of standard BibTeX either, but for
sure it belongs to biblatex.)

> But I wonder how important it is for `bibtex-mode` to try and recognize
> the (La)TeX meaning of the $ character.
> Maybe we should just give $ the punctuation syntax in the syntax-table.

The (La)TeX meaning of the $ character is relevant for titles that may
contain formulas.  The question is what is in the end more acceptable:
formulas in titles that do not receive special treatment by font-lock,
or $ being mistreated in url and file fields (with effects that reach
beyond these fields).  Maybe it is best to let the user decide whether $
should be treated as math delimiter (the default) or as punctuation.

Roland




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#50202; Package emacs. (Thu, 25 Aug 2022 17:42:01 GMT)

Message #20 received at 50202 <at> debbugs.gnu.org:

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Roland Winkler <winkler <at> gnu.org>
Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 50202 <at> debbugs.gnu.org,
 Yuu Yin <yuuyin <at> protonmail.com>
Subject: Re: bug#50202: bibtex-mode: unescaped dollar sign in file field
 leads to wrong highlighting
Date: Thu, 25 Aug 2022 13:41:00 -0400
>> But I wonder how important it is for `bibtex-mode` to try and recognize
>> the (La)TeX meaning of the $ character.
>> Maybe we should just give $ the punctuation syntax in the syntax-table.
>
> The (La)TeX meaning of the $ character is relevant for titles that my
> contain formulas.  The question is what is in the end more acceptable:
> formulas in titles that do not receive special treatment by font-lock.
> Or $ being mis-treated in url and file fields (which goes beyond these
> fields).  Maybe it is best to let the user decide whether $ should be
> treated as math delimiter (the default) or as punctation.

The downside of treating $ as punctuation is that $^2$ won't be
highlighted specially, but the upside is that "foo$bar" won't be
mishighlighted in a url.

To me the highlighting of math formulas is sufficiently secondary that
I'd err on the safe side (i.e. avoid mishighlighting).  The other option
is to try and make this choice on a field-by-field basis, which seems
like a lot of work for fairly little benefit.


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#50202; Package emacs. (Fri, 26 Aug 2022 15:54:01 GMT)

Message #23 received at 50202 <at> debbugs.gnu.org:

From: Roland Winkler <winkler <at> gnu.org>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 50202 <at> debbugs.gnu.org,
 Yuu Yin <yuuyin <at> protonmail.com>
Subject: Re: bug#50202: bibtex-mode: unescaped dollar sign in file field
 leads to wrong highlighting
Date: Fri, 26 Aug 2022 10:52:55 -0500
On Thu, Aug 25 2022, Stefan Monnier wrote:
> The downside of treating $ as punctuation is that $^2$ won't be
> highlighted specially, but the upside is that "foo$bar" won't be
> mishighlighted in a url.
>
> To me the highlighting of math formulas is sufficiently secondary that
> I'd err on the safe side (i.e. avoid mishighlighting).

For me, the situation is opposite: in my BibTeX files (with quite many
entries that have accumulated over the years), many entries contain
LaTeX constructs like $^2$ so that I appreciate proper highlighting.
But I never encountered the opposite problem when a field should contain
a single $.

> The other option is to try and make this choice on a field-by-field
> basis, which seems like a lot of work for fairly little benefit.

I agree it doesn't make sense to (try to) highlight different fields
differently.  That's why I suggest that bibtex-mode continues to use
only one syntax table.  But make it customizable how this syntax table
treats $: You and the OP can make it a punctuation character (also fine
with me as default, it doesn't cause harm), while it remains possible to
keep the current behavior.

Roland




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#50202; Package emacs. (Fri, 26 Aug 2022 19:01:02 GMT)

Message #26 received at 50202 <at> debbugs.gnu.org:

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Roland Winkler <winkler <at> gnu.org>
Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 50202 <at> debbugs.gnu.org,
 Yuu Yin <yuuyin <at> protonmail.com>
Subject: Re: bug#50202: bibtex-mode: unescaped dollar sign in file field
 leads to wrong highlighting
Date: Fri, 26 Aug 2022 15:00:31 -0400
> For me, the situation is opposite: in my BibTeX files (with quite many
> entries that have accumulated over the years), many entries contain
> LaTeX constructs like $^2$ so that I appreciate proper highlighting.
> But I never encountered the opposite problem when a field should contain
> a single $.

My BibTeX file is in the same boat as yours, but I don't find the
highlighting of those thingies important at all.  The $ signs themselves
are more than enough to clarify visually what is what, since the text
between them is invariably short.

>> The other option is to try and make this choice on a field-by-field
>> basis, which seems like a lot of work for fairly little benefit.
>
> I agree it doesn't make sense to (try to) highlight different fields
> differently.  That's why I suggest that bibtex-mode continues to use
> only one syntax table.  But make it customizable how this syntax table
> treats $: You and the OP can make it a punctuation character (also fine
> with me as default, it doesn't cause harm), while it remains possible to
> keep the current behavior.

Another option is to mark $ as punctuation and then rely instead on
`font-lock-keywords` to highlight something like "\\$[^$\n]+\\$".
So $^2$ would still be highlighted in URLs but "foo$bar" wouldn't.
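
The effect of this pattern can be tried outside Emacs. Below is a minimal sketch in Python, assuming that the Emacs regexp "\\$[^$\n]+\\$" corresponds to the Python pattern \$[^$\n]+\$. It only shows what the regexp matches, not how bibtex-mode would install it in `font-lock-keywords`. The function name math_spans and all sample strings other than "foo$bar" are invented for illustration.

```python
import re

# Python spelling of the regexp quoted above: a "$", then one or more
# characters that are neither "$" nor a newline, then a closing "$".
MATH = re.compile(r"\$[^$\n]+\$")


def math_spans(text):
    """Return the substrings of text that the pattern would highlight."""
    return MATH.findall(text)


print(math_spans("Area in km$^2$"))              # paired: one match
print(math_spans("http://example.org/foo$bar"))  # lone "$": no match
print(math_spans("a$b and c$d"))                 # two lone "$" on one line
```

Two lone dollar signs on the same line, as in the last sample, still form a match, so the pattern limits mishighlighting to a single line rather than ruling it out entirely.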


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#50202; Package emacs. (Mon, 29 Aug 2022 15:20:02 GMT)

Message #29 received at 50202 <at> debbugs.gnu.org:

From: Roland Winkler <winkler <at> gnu.org>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 50202 <at> debbugs.gnu.org,
 Yuu Yin <yuuyin <at> protonmail.com>
Subject: Re: bug#50202: bibtex-mode: unescaped dollar sign in file field
 leads to wrong highlighting
Date: Mon, 29 Aug 2022 10:19:26 -0500
On Fri, Aug 26 2022, Stefan Monnier wrote:
> Another option is to mark $ as punctuation and then rely instead on
> `font-lock-keywords` to highlight something like "\\$[^$\n]+\\$".
> So $^2$ would still be highlighted in URLs but "foo$bar" wouldn't.

Thanks, that's a good compromise that should cover most use cases
without getting too complicated.




Reply sent to Roland Winkler <winkler <at> gnu.org>:
You have taken responsibility. (Fri, 30 Dec 2022 05:50:02 GMT)

Notification sent to Yuu Yin <yuuyin <at> protonmail.com>:
bug acknowledged by developer. (Fri, 30 Dec 2022 05:50:02 GMT)

Message #34 received at 50202-done <at> debbugs.gnu.org:

From: Roland Winkler <winkler <at> gnu.org>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 50202-done <at> debbugs.gnu.org,
 Yuu Yin <yuuyin <at> protonmail.com>
Subject: Re: bug#50202: bibtex-mode: unescaped dollar sign in file field
 leads to wrong highlighting
Date: Thu, 29 Dec 2022 23:49:43 -0600
On Mon, Aug 29 2022, Roland Winkler wrote:
> On Fri, Aug 26 2022, Stefan Monnier wrote:
>> Another option is to mark $ as punctuation and then rely instead on
>> `font-lock-keywords` to highlight something like "\\$[^$\n]+\\$".
>> So $^2$ would still be highlighted in URLs but "foo$bar" wouldn't.
>
> Thanks, that's a good compromise that should cover most use cases
> without getting too complicated.

commit ab38abfdf75e091b9970dd3ba977aaa1b6067cc3




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Fri, 27 Jan 2023 12:24:04 GMT)



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.