GNU bug report logs - #76993
Init files and UTF-8

Package: emacs

Reported by: Stefan Kangas <stefankangas <at> gmail.com>

Date: Thu, 13 Mar 2025 06:02:02 UTC

Severity: wishlist

To reply to this bug, email your comments to 76993 AT debbugs.gnu.org.

Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#76993; Package emacs. (Thu, 13 Mar 2025 06:02:02 GMT)

Acknowledgement sent to Stefan Kangas <stefankangas <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Thu, 13 Mar 2025 06:02:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Stefan Kangas <stefankangas <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: Init files and UTF-8
Date: Wed, 12 Mar 2025 23:00:44 -0700

Severity: wishlist

In (info "(emacs) Init Non-ASCII"), we read that:

       If you want to use non-ASCII characters in your init file, you
    should put a ‘-*-coding: CODING-SYSTEM-*-’ tag on the first line of
    the init file, and specify a coding system that supports the
    character(s) in question.  *Note Recognize Coding::.  This is because
    the defaults for decoding non-ASCII text might not yet be set up by
    the time Emacs reads those parts of your init file which use such
    strings, possibly leading Emacs to decode those strings incorrectly.
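
For illustration, an init file that follows this advice might begin as in the sketch below.  This is only a minimal example under assumptions: the choice of `utf-8`, the key `ö`, and the command `other-window` are arbitrary and not taken from the manual text quoted above.

```elisp
;;; init.el -*- coding: utf-8 -*-
;; The coding tag has to appear on the first line of the file so that
;; Emacs sees it before decoding anything further down.

;; Bind a non-ASCII key.  The key (ö) and the command are arbitrary
;; choices for this sketch; the vector form [?ö] is used for the key.
(global-set-key [?ö] #'other-window)
```

With the tag in place, the file should be decoded the same way regardless of the defaults in effect when Emacs reads it.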

Is this correct?  When I open ~/.emacs.d/init.el on this machine,
`buffer-file-coding-system` is `prefer-utf-8-unix`, and I can't
recall ever having had a problem with non-ASCII key bindings.

Is the above only true on some platforms?  Should that be noted, or
should it be moved to some platform specific documentation?

Should the default be changed somehow, such that we always use UTF-8
when reading the init file?

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#76993; Package emacs. (Thu, 13 Mar 2025 06:55:02 GMT)

Message #8 received at 76993 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stefan Kangas <stefankangas <at> gmail.com>
Cc: 76993 <at> debbugs.gnu.org
Subject: Re: bug#76993: Init files and UTF-8
Date: Thu, 13 Mar 2025 08:54:12 +0200

> From: Stefan Kangas <stefankangas <at> gmail.com>
> Date: Wed, 12 Mar 2025 23:00:44 -0700
> 
> Severity: wishlist
> 
> In (info "(emacs) Init Non-ASCII"), we read that:
> 
>        If you want to use non-ASCII characters in your init file, you
>     should put a ‘-*-coding: CODING-SYSTEM-*-’ tag on the first line of
>     the init file, and specify a coding system that supports the
>     character(s) in question.  *Note Recognize Coding::.  This is because
>     the defaults for decoding non-ASCII text might not yet be set up by
>     the time Emacs reads those parts of your init file which use such
>     strings, possibly leading Emacs to decode those strings incorrectly.
> 
> Is this correct?  When I open ~/.emacs.d/init.el on this machine,
> `buffer-file-coding-system` is `prefer-utf-8-unix`, and I can't
> recall ever having had a problem with non-ASCII key bindings.

I'm guessing that that's because your system's encoding is UTF-8 to
begin with.

> Is the above only true on some platforms?

What matters is the locale's codeset, not the platform.  Though it is
true that most users of most platforms except Windows use UTF-8 these
days, I know of at least some users of GNU/Linux who still set up
their systems to use non-UTF-8 encoding.
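
As a rough illustration of why the codeset matters, the forms below can be evaluated one at a time in the `*scratch*` buffer.  This is a sketch, not a description of what Emacs does internally at startup: it simply simulates UTF-8 bytes being decoded with a Latin-1 default, which is an assumed example of a non-UTF-8 codeset.

```elisp
;; The coding system Emacs derived from the locale's codeset.
;; Its value depends on how the system is set up.
locale-coding-system

;; Encode "ö" as UTF-8 bytes, then decode those bytes as Latin-1,
;; as could happen when a UTF-8 file is read with a Latin-1 default.
(decode-coding-string (encode-coding-string "ö" 'utf-8) 'latin-1)
;; ⇒ "Ã¶"
```

The two-character result is the kind of incorrect decoding that a coding cookie on the first line of the init file avoids.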

> Should that be noted, or should it be moved to some platform
> specific documentation?

We could do that, but is that worth the hassle?

 . having a coding cookie can do no harm
 . having a coding cookie makes the init file portable and usable from
   several different systems with no subtle problems
 . explaining when this could matter and when it couldn't is not
   simple and could confuse users who do not know enough about locales
   and encodings

> Should the default be changed somehow, such that we always use UTF-8
> when reading the init file?

Why would we want to make such a breaking change, when all we suggest
is to have a coding cookie, in a small minority of cases where init
files bind non-ASCII keys?

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#76993; Package emacs. (Thu, 13 Mar 2025 07:07:02 GMT)

Message #11 received at 76993 <at> debbugs.gnu.org:

From: Stefan Kangas <stefankangas <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 76993 <at> debbugs.gnu.org
Subject: Re: bug#76993: Init files and UTF-8
Date: Thu, 13 Mar 2025 00:06:45 -0700

Eli Zaretskii <eliz <at> gnu.org> writes:

>> Is the above only true on some platforms?
>
> What matters is the locale's codeset, not the platform.  Though it is
> true that most users of most platforms except Windows use UTF-8 these
> days, I know of at least some users of GNU/Linux who still set up
> their systems to use non-UTF-8 encoding.

Sure, I can see that some people will do that.

They will run into all kinds of fun, I'm sure, and not just in Emacs.

>> Should that be noted, or should it be moved to some platform
>> specific documentation?
>
> We could do that, but is that worth the hassle?
>
>  . having a coding cookie can do no harm
>  . having a coding cookie makes the init file portable and usable from
>    several different systems with no subtle problems

The main hassle is not the coding cookie, but the complication of having
an entire section in the documentation.  I was thinking that perhaps we
could spare our users that.

The portability argument is fair enough.  Maybe this suggests that this
might no longer warrant a section in the manual and could be moved to
(for example) the MS-Windows FAQ, or something?

>  . explaining when this could matter and when it couldn't is not
>    simple and could confuse users who do not know enough about locales
>    and encodings
>
>> Should the default be changed somehow, such that we always use UTF-8
>> when reading the init file?
>
> Why would we want to make such a breaking change, when all we suggest
> is to have a coding cookie, in a small minority of cases where init
> files bind non-ASCII keys?

Maybe this "small minority of cases" part could be clarified without
getting into the details.  The section reads to me as if it's always a
problem, which seems misleading.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#76993; Package emacs. (Thu, 13 Mar 2025 07:53:01 GMT)

Message #14 received at 76993 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stefan Kangas <stefankangas <at> gmail.com>
Cc: 76993 <at> debbugs.gnu.org
Subject: Re: bug#76993: Init files and UTF-8
Date: Thu, 13 Mar 2025 09:52:18 +0200

> From: Stefan Kangas <stefankangas <at> gmail.com>
> Date: Thu, 13 Mar 2025 00:06:45 -0700
> Cc: 76993 <at> debbugs.gnu.org
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> >> Is the above only true on some platforms?
> >
> > What matters is the locale's codeset, not the platform.  Though it is
> > true that most users of most platforms except Windows use UTF-8 these
> > days, I know of at least some users of GNU/Linux who still set up
> > their systems to use non-UTF-8 encoding.
> 
> Sure, I can see that some people will do that.
> 
> They will run into all kinds of fun, I'm sure, and not just in Emacs.

Emacs should fully support such a setup.  The fact that it works on
MS-Windows is the best evidence to that effect.

> >> Should that be noted, or should it be moved to some platform
> >> specific documentation?
> >
> > We could do that, but is that worth the hassle?
> >
> >  . having a coding cookie can do no harm
> >  . having a coding cookie makes the init file portable and usable from
> >    several different systems with no subtle problems
> 
> The main hassle is not the coding cookie, but the complication of having
> an entire section in the documentation.  I was thinking that perhaps we
> could spare our users that.
> 
> The portability argument is fair enough.  Maybe this suggests that this
> might no longer warrant a section in the manual and could be moved to
> (for example) the MS-Windows FAQ, or something?

The section about key bindings in init files will be incomplete
without that information.  This was written in response to real
problems people had in real use cases.

We could move it into a sub-subsection of "Init Examples", which will
move it out of the way to some extent.

> Maybe this "small minority of cases" part could be clarified without
> getting into the details.  The section reads to me as if it's always a
> problem, which seems misleading.

Feel free to suggest changes in wording to make it less scary and
misleading.

This bug report was last modified 123 days ago.
