GNU bug report logs - #58507
Emacs does not preserve the coding system

Package: emacs;

Reported by: Juhana Sadeharju <johanrainhill <at> gmail.com>

Date: Fri, 14 Oct 2022 09:45:01 UTC

Severity: normal

Tags: moreinfo, notabug

Done: Stefan Kangas <stefankangas <at> gmail.com>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 58507 in the body.
You can then email your comments to 58507 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#58507; Package emacs. (Fri, 14 Oct 2022 09:45:01 GMT) Full text and rfc822 format available.

Acknowledgement sent to Juhana Sadeharju <johanrainhill <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Fri, 14 Oct 2022 09:45:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Juhana Sadeharju <johanrainhill <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: Emacs does not preserve the coding system
Date: Fri, 14 Oct 2022 09:02:25 +0300
Hello. I set the coding system to utf-8, but after quitting and restarting
Emacs the file reverts to iso-latin-dos and characters such as ä and ö become
unreadable. This is on Windows 11. Both the latest Emacs and the older
version 25.3 have the same problem.

(Windows Notepad works fine and has utf-8 set as its default, unlike Emacs.)

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#58507; Package emacs. (Fri, 14 Oct 2022 10:43:03 GMT) Full text and rfc822 format available.

Message #8 received at 58507 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Juhana Sadeharju <johanrainhill <at> gmail.com>
Cc: 58507 <at> debbugs.gnu.org
Subject: Re: bug#58507: Emacs does not preserve the coding system
Date: Fri, 14 Oct 2022 13:42:20 +0300
> From: Juhana Sadeharju <johanrainhill <at> gmail.com>
> Date: Fri, 14 Oct 2022 09:02:25 +0300
> 
> Hello. I set the coding system to utf-8, but after quitting and restarting Emacs the file reverts to
> iso-latin-dos and characters such as ä and ö become unreadable. This is on Windows 11. Both the latest
> Emacs and the older version 25.3 have the same problem.

Please give us the details: how did you set the coding system to utf-8,
and how did you see that the file reverts to iso-latin-dos?  We need
these details to investigate the problem.

Thanks.




Added tag(s) moreinfo. Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Fri, 14 Oct 2022 11:22:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#58507; Package emacs. (Mon, 24 Oct 2022 02:16:03 GMT) Full text and rfc822 format available.

Message #13 received at 58507 <at> debbugs.gnu.org (full text, mbox):

From: Juhana Sadeharju <johanrainhill <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 58507 <at> debbugs.gnu.org
Subject: Re: bug#58507: Emacs does not preserve the coding system
Date: Mon, 24 Oct 2022 03:22:44 +0300
Hello. I have now tested this with versions 24.3, 25.3 and the latest, 28.2.
All have this problem.

First I create a new file with C-x C-f. By default, it has the coding system
iso-latin-9-dos.

I change the coding system to utf-8 via the mode line at the bottom of the
buffer's window. The character "0" changes to "U".

Then I type "äöäöäö" and quit Emacs.

When I run Emacs again and open the file, the "öä" looks wrong and the coding
system is back to iso-latin-9-dos.

I use Total Commander's View to verify that the file is OK. The file goes
wrong only if I now save the buffer.

What helps is C-x RET r, which asks for the coding system; I type utf-8.

It also helps to add ";;; -*- coding: utf-8-dos; -*-" to the top of the file.
I will use this method as a workaround, but it is too advanced for regular
users. Please check what the problem is.


Windows 11 Home, ver 22H2, installed 11.10.2022, HP Pavilion Gaming Desktop
TG01-2xxx



Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#58507; Package emacs. (Mon, 24 Oct 2022 12:46:02 GMT) Full text and rfc822 format available.

Message #16 received at 58507 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Juhana Sadeharju <johanrainhill <at> gmail.com>
Cc: 58507 <at> debbugs.gnu.org
Subject: Re: bug#58507: Emacs does not preserve the coding system
Date: Mon, 24 Oct 2022 15:45:07 +0300
> From: Juhana Sadeharju <johanrainhill <at> gmail.com>
> Date: Mon, 24 Oct 2022 03:22:44 +0300
> Cc: 58507 <at> debbugs.gnu.org
> 
> First I create a new file with C-x C-f. By default, it has the coding system iso-latin-9-dos.
> 
> I change the coding system to utf-8 via the mode line at the bottom of the buffer's window. The character "0"
> changes to "U".
> 
> Then I type "äöäöäö" and quit Emacs.
> 
> When I run Emacs again and open the file, the "öä" looks wrong and the coding system is back to iso-latin-9-dos.
> 
> I use Total Commander's View to verify that the file is OK. The file goes wrong only if I now save the buffer.

This is expected: the short file that you created can be interpreted
both as UTF-8 and as ISO-8859-15 (Latin-9).  When the detection of the
encoding is ambiguous, Emacs prefers the locale-dependent defaults,
which in your case are ISO-8859-15.
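
The ambiguity can be reproduced outside Emacs. The following is a minimal Python sketch, not Emacs's actual detection logic; it assumes the file holds only the six characters from the report and uses Python's `iso8859_15` codec, which corresponds to the iso-latin-9 coding system named in the report. It shows that the same bytes decode without error under both encodings, so the bytes alone cannot settle which one was meant.

```python
# Bytes written for "äöäöäö" when the text is saved as UTF-8.
raw = "äöäöäö".encode("utf-8")

# Both decoders accept these bytes, so the content alone does not
# reveal which encoding the author intended.
as_utf8 = raw.decode("utf-8")
as_latin9 = raw.decode("iso8859_15")  # Python's name for Latin-9

print(raw.hex(" "))  # c3 a4 c3 b6 c3 a4 c3 b6 c3 a4 c3 b6
print(as_utf8)       # äöäöäö
print(as_latin9)     # Ã€Ã¶Ã€Ã¶Ã€Ã¶
```

A single-byte encoding such as Latin-9 assigns a character to every byte value, so it can never be ruled out by inspecting the bytes; that is why a default has to break the tie.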

> What helps is C-x RET r, which asks for the coding system; I type utf-8.
> 
> It also helps to add ";;; -*- coding: utf-8-dos; -*-" to the top of the file. I will use this method as a workaround,
> but it is too advanced for regular users. Please check what the problem is.

These are indeed two ways of telling Emacs to visit the file as
encoded in UTF-8.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#58507; Package emacs. (Thu, 09 Feb 2023 08:57:01 GMT) Full text and rfc822 format available.

Message #19 received at 58507 <at> debbugs.gnu.org (full text, mbox):

From: Juhana Sadeharju <johanrainhill <at> gmail.com>
To: 58507 <at> debbugs.gnu.org
Subject: Re: bug#58507: Acknowledgement (Emacs does not preserve the coding
 system)
Date: Thu, 9 Feb 2023 10:56:22 +0200
Hello. Has this bug or feature been fixed? The problem is that Emacs
doesn't keep the coding system I have set (utf-8). The file is opened with
a different coding system and all ä and ö characters are a mess. Even if I
set the coding system to utf-8 again, the mess remains. Fixing the mess does
not help, because the next time the coding system is wrong again.

Why doesn't Emacs let the user decide what the coding system for the file is?

I'm actually scared to use Emacs anymore, because Emacs has converted
thousands of lines of text to a mess because of this bug.

There was a trick to fix the coding system by inserting commands at the
start of the file, so I suggest adding a command like "fix the coding system
to file" which adds the trick to the file. I keep forgetting the trick.


Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#58507; Package emacs. (Thu, 09 Feb 2023 09:53:02 GMT) Full text and rfc822 format available.

Message #22 received at 58507 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Juhana Sadeharju <johanrainhill <at> gmail.com>
Cc: 58507 <at> debbugs.gnu.org
Subject: Re: bug#58507: Acknowledgement (Emacs does not preserve the coding
 system)
Date: Thu, 09 Feb 2023 11:52:22 +0200
> From: Juhana Sadeharju <johanrainhill <at> gmail.com>
> Date: Thu, 9 Feb 2023 10:56:22 +0200
> 
> Hello. Has this bug or feature been fixed?

We made no change to Emacs due to this report, since I don't believe
there's a bug here.  This is how Emacs behaves, and this behavior is
well documented and intended.

In previous discussion of this issue, I pointed out how to deal with
such situations; I repeat some of that below.

> The problem is that Emacs doesn't keep the coding system I have
> set (utf-8).

The information about the file's encoding, if you want to keep it,
should be in the file, using the 'coding:' cookie, by adding

  ";;; -*- coding: utf-8-dos; -*-"

on the first line of the file.  (You can also do this in the file's
Local Variables section near the end of the file; see the "Specifying
File Variables" node of the Emacs user manual for details.)

Alternatively, you can force Emacs to use UTF-8 when you visit the
file:

  C-x RET c utf-8 RET C-x C-f <file name> RET

The "C-x RET c utf-8 RET" prefix forces the following command to use
UTF-8 for decoding and encoding text.
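
The order of precedence described so far (an explicit choice, then a 'coding:' cookie, then a locale-dependent default) can be modelled in a few lines of Python. This is a simplified sketch under stated assumptions, not how Emacs implements it: `cookie_encoding` and `read_text` are hypothetical helper names, only the first line is inspected (the Local Variables section is ignored), and the fallback is Python's own locale default rather than Emacs's detection machinery.

```python
import locale
import re
from pathlib import Path

# Matches "coding: NAME" inside a first-line "-*- ... -*-" cookie.
COOKIE = re.compile(r"-\*-.*?\bcoding:\s*([A-Za-z0-9_.-]+).*?-\*-")


def cookie_encoding(first_line: bytes):
    """Return the encoding named by a coding cookie, or None if absent."""
    match = COOKIE.search(first_line.decode("ascii", errors="replace"))
    if match is None:
        return None
    # Emacs names may end in -dos/-unix/-mac (the end-of-line convention);
    # drop that suffix to get a name Python's codecs understand.
    return re.sub(r"-(dos|unix|mac)$", "", match.group(1))


def read_text(path, forced=None):
    """Decode a file: explicit choice first, then cookie, then locale default."""
    raw = Path(path).read_bytes()
    first_line = raw.split(b"\n", 1)[0]
    encoding = (
        forced
        or cookie_encoding(first_line)
        or locale.getpreferredencoding(False)
    )
    return raw.decode(encoding)
```

Calling `read_text(path, forced="utf-8")` plays the role of the "C-x RET c utf-8 RET" prefix, while a file that starts with the cookie line is decoded as UTF-8 without any argument.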

> The file is opened with a different coding system and all ä and ö characters are a mess. Even if I set the coding
> system to utf-8 again, the mess remains. Fixing the mess does not help, because the next time the coding
> system is wrong again.
> 
> Why doesn't Emacs let the user decide what the coding system for the file is?

It does, see above.

> I'm actually scared to use Emacs anymore, because Emacs has converted thousands of lines of text to a
> mess because of this bug.

As long as you only visit the file and don't make any changes to it,
the "mess" on the screen is just a display problem; the file's
contents are not changed.
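
The difference between a display problem and real damage can be shown with a short Python sketch. It is a model of the byte-level effect only, under the assumption that the damage reported earlier comes from text that was decoded as Latin-9 and later written out as UTF-8; it makes no claim about the exact sequence of Emacs commands involved.

```python
original = "äö".encode("utf-8")  # bytes on disk: c3 a4 c3 b6

# Decoding with the wrong guess changes only what is shown.
shown = original.decode("iso8859_15")  # mojibake on screen
unchanged = shown.encode("iso8859_15") == original  # same coding round-trips

# Assumed failure mode: the mis-decoded text is saved as UTF-8,
# so each original byte is encoded a second time.
damaged = shown.encode("utf-8")

# Such double encoding can be undone by reversing the two steps.
repaired = damaged.decode("utf-8").encode("iso8859_15").decode("utf-8")

print(unchanged)                    # True
print(len(original), len(damaged))  # 4 9
print(repaired)                     # äö
```

The repair works here because Latin-9 maps every byte to a character and back, so no information is lost along the way.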

> There was a trick to fix the coding system by inserting commands at the start of the file, so I suggest adding a
> command like "fix the coding system to file" which adds the trick to the file. I keep forgetting the trick.

The command is "C-x RET r".  This re-reads the file after prompting
you for the coding-system to decode the file's contents.  It is yet
another way to "fix" the problem after you visit the file and notice
that its coding-system was guessed incorrectly.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#58507; Package emacs. (Sun, 03 Sep 2023 09:31:02 GMT) Full text and rfc822 format available.

Message #25 received at 58507 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Kangas <stefankangas <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 58507 <at> debbugs.gnu.org, Juhana Sadeharju <johanrainhill <at> gmail.com>
Subject: Re: bug#58507: Emacs does not preserve the coding system
Date: Sun, 3 Sep 2023 02:30:18 -0700
tags 58507 notabug
close 58507
thanks

Eli Zaretskii <eliz <at> gnu.org> writes:

>> From: Juhana Sadeharju <johanrainhill <at> gmail.com>
>> Date: Thu, 9 Feb 2023 10:56:22 +0200
>>
>> Hello. Has this bug or feature been fixed?
>
> We made no change to Emacs due to this report, since I don't believe
> there's a bug here.  This is how Emacs behaves, and this behavior is
> well documented and intended.

Thanks.  I'm therefore closing this bug report.




Added tag(s) notabug. Request was from Stefan Kangas <stefankangas <at> gmail.com> to control <at> debbugs.gnu.org. (Sun, 03 Sep 2023 09:31:02 GMT) Full text and rfc822 format available.

bug closed, send any further explanations to 58507 <at> debbugs.gnu.org and Juhana Sadeharju <johanrainhill <at> gmail.com> Request was from Stefan Kangas <stefankangas <at> gmail.com> to control <at> debbugs.gnu.org. (Sun, 03 Sep 2023 09:31:02 GMT) Full text and rfc822 format available.

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 01 Oct 2023 11:24:06 GMT) Full text and rfc822 format available.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.