GNU bug report logs - #20258
24.5; format-time-string miscounting of multibyte characters

Package: emacs;

Reported by: Gunnar Horrigmo <gunnar.horrigmo <at> usit.uio.no>

Date: Sat, 4 Apr 2015 15:36:01 UTC

Severity: minor

Tags: fixed, patch

Found in version 24.5

Fixed in version 27.1

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 20258 in the body.
You can then email your comments to 20258 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#20258; Package emacs. (Sat, 04 Apr 2015 15:36:01 GMT)

Acknowledgement sent to Gunnar Horrigmo <gunnar.horrigmo <at> usit.uio.no>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Sat, 04 Apr 2015 15:36:01 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Gunnar Horrigmo <gunnar.horrigmo <at> usit.uio.no>
To: bug-gnu-emacs <at> gnu.org
Subject: 24.5; format-time-string miscounting of multibyte characters
Date: Sat, 04 Apr 2015 16:33:50 +0200
As the subject says, format-time-string miscounts multibyte characters.
Simple example with nb_NO.utf8 locale, where ø is two bytes:

(format-time-string "%6a" (date-to-time "Sat Apr  4 16:14:40 2015"))
"  lø."

(length (format-time-string "%6a" (date-to-time "Sat Apr  4 16:14:40 2015")))
5

Let me know if you need more info.

--Gunnar




In GNU Emacs 24.5.1 (x86_64-unknown-linux-gnu, GTK+ Version 3.14.10)
 of 2015-04-01 on lumpy-gravy.uio.no
Repository revision: 1b70aa634c9ce117fed418894b54b1f2647bda1c
Windowing system distributor `StarNet Communications Corp.', version 11.0.14000
System Description:	Fedora release 21 (Twenty One)

Important settings:
  value of $LC_MONETARY: nb_NO.utf8
  value of $LC_NUMERIC: nb_NO.utf8
  value of $LC_TIME: nb_NO.utf8
  value of $LANG: en_US.UTF-8
  locale-coding-system: utf-8-unix

Major mode: Help

Minor modes in effect:
  tooltip-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  buffer-read-only: t
  line-number-mode: t
  transient-mark-mode: t

Recent messages:
nnimap read 417k from secure.runbox.no
nnimap read 466k from secure.runbox.no
nnimap read 581k from secure.runbox.no
nnimap read 655k from secure.runbox.no
nnimap read 729k from secure.runbox.no
Mark set
Mark saved where search started [2 times]
Making completion list...
Quit [4 times]
Type C-x 1 to delete the help window, C-M-v to scroll help.
Quit [2 times]

Load-path shadows:
/uio/kant/usit-gdw-u1/horrigmo/emacs/locate hides /uio/kant/usit-gdw-u1/horrigmo/emacs/src/emacs-24/emacs/lisp/locate

Features:
(shadow nnir emacsbug reposition sort smiley gnus-cite mm-archive
mail-extr gnus-bcklg eieio-opt speedbar sb-image ezimage dframe
find-func gnus-async qp gnus-ml disp-table pp gnus-eform debug jka-compr
misearch multi-isearch help-mode gnus-topic nndraft nnmh utf-7 gnutls
nnimap utf7 parse-time netrc network-stream starttls tls gnus-agent
gnus-srvr gnus-score score-mode nnvirtual gnus-msg gnus-art mm-uu
mml2015 nntp gnus-cache gnus-sum gnus-group gnus-undo nnfolder nnoo
nnmail mail-source avoid mm-view mml-smime smime dig mailcap gnus-start
gnus-spec gnus-int gnus-range gnus-win gnus gnus-ems wid-edit nnheader
rt-liberation edmacro kmacro browse-url markstack epa-file epa derived
epg etags info smtpmail auth-source eieio byte-opt bytecomp byte-compile
cl-extra cconv eieio-core gnus-util password-cache sendmail message
cl-macs format-spec rfc822 mml easymenu mml-sec mm-decode mm-bodies
mm-encode mail-parse rfc2231 rfc2047 rfc2045 ietf-drums mm-util help-fns
mail-prsvr mailabbrev mail-utils gmm-utils mailheader ange-ftp comint
ansi-color ring cl gv cl-loaddefs cl-lib package epg-config time-date
tooltip electric uniquify ediff-hook vc-hooks lisp-float-type mwheel
x-win x-dnd tool-bar dnd fontset image regexp-opt fringe tabulated-list
newcomment lisp-mode prog-mode register page menu-bar rfn-eshadow timer
select scroll-bar mouse jit-lock font-lock syntax facemenu font-core
frame cham georgian utf-8-lang misc-lang vietnamese tibetan thai
tai-viet lao korean japanese hebrew greek romanian slovak czech european
ethiopic indian cyrillic chinese case-table epa-hook jka-cmpr-hook help
simple abbrev minibuffer nadvice loaddefs button faces cus-face macroexp
files text-properties overlay sha1 md5 base64 format env code-pages mule
custom widget hashtable-print-readable backquote make-network-process
dbusbind gfilenotify dynamic-setting system-font-setting
font-render-setting move-toolbar gtk x-toolkit x multi-tty emacs)

Memory information:
((conses 16 278530 45717)
 (symbols 48 32240 0)
 (miscs 40 185 563)
 (strings 32 55977 9185)
 (string-bytes 1 1975855)
 (vectors 16 32030)
 (vector-slots 8 1345064 168271)
 (floats 8 255 726)
 (intervals 56 14581 139)
 (buffers 960 37)
 (heap 1024 54088 6705))


-- 
Gunnar




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20258; Package emacs. (Sat, 04 Apr 2015 15:43:01 GMT)

Message #8 received at 20258 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Gunnar Horrigmo <gunnar.horrigmo <at> usit.uio.no>
Cc: 20258 <at> debbugs.gnu.org
Subject: Re: bug#20258: 24.5;
 format-time-string miscounting of multibyte characters
Date: Sat, 04 Apr 2015 18:42:13 +0300
> From: Gunnar Horrigmo <gunnar.horrigmo <at> usit.uio.no>
> Date: Sat, 04 Apr 2015 16:33:50 +0200
> 
> 
> As the subject says, format-time-string miscounts multibyte characters.
> Simple example with nb_NO.utf8 locale, where ø is two bytes:
> 
> (format-time-string "%6a" (date-to-time "Sat Apr  4 16:14:40 2015"))
> "  lø."
> 
> (length (format-time-string "%6a" (date-to-time "Sat Apr  4 16:14:40 2015")))
> 5

'length' counts characters, not bytes.  If you need to count bytes,
use 'string-bytes' instead:

  (string-bytes "  lø.") => 6
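
To make the distinction concrete, here is a small Emacs Lisp sketch that reports the three usual measures of a string side by side. The helper name `my-string-measures` is made up for this illustration and is not part of Emacs; `length`, `string-bytes`, and `string-width` are the standard functions. The column count assumes a non-CJK language environment, where ø occupies one column.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-string-measures (string)
  "Return a plist with the character, byte, and column counts of STRING."
  (list :chars (length string)                ; number of characters
        :internal-bytes (string-bytes string) ; bytes in Emacs's internal representation
        :columns (string-width string)))      ; display columns

(my-string-measures "  lø.")
;; => (:chars 5 :internal-bytes 6 :columns 5)
```

The six internal bytes match the requested field width of 6, which suggests the padding was computed per byte rather than per character.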






Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20258; Package emacs. (Sat, 04 Apr 2015 16:04:02 GMT)

Message #11 received at 20258 <at> debbugs.gnu.org:

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 20258 <at> debbugs.gnu.org, Gunnar Horrigmo <gunnar.horrigmo <at> usit.uio.no>
Subject: Re: bug#20258: 24.5;
 format-time-string miscounting of multibyte characters
Date: Sat, 04 Apr 2015 12:03:47 -0400
> 'length' counts characters, not bytes.  If you need to count bytes,
> use 'string-bytes' instead:

>   (string-bytes "  lø.") => 6

And in 99% of the cases, using string-bytes doesn't do what you think
(it doesn't count the number of bytes that it would take in your
favorite coding-system, but the number of bytes it takes within Emacs's
internal encoding).
If you want to know how many bytes it would take in your locale's
encoding, then you need:

   (length (encode-coding-string <thestring> locale-coding-system))
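
As a rough sketch of that point, the snippet below passes the coding system explicitly instead of relying on `locale-coding-system`, so the result does not depend on the current locale. The helper name `my-encoded-byte-count` is hypothetical; `encode-coding-string` returns a unibyte string, so `length` on its result counts bytes.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-encoded-byte-count (string coding-system)
  "Return the number of bytes STRING occupies when encoded in CODING-SYSTEM."
  (length (encode-coding-string string coding-system)))

(my-encoded-byte-count "  lø." 'utf-8)       ; => 6
(my-encoded-byte-count "  lø." 'iso-8859-1)  ; => 5
(string-bytes "  lø.")                       ; => 6, internal representation
```

Here `string-bytes` happens to agree with the UTF-8 count but not with the Latin-1 count, which is why it is not a reliable stand-in for the size in an arbitrary external encoding.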


        Stefan




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20258; Package emacs. (Sat, 04 Apr 2015 16:43:02 GMT)

Message #14 received at 20258 <at> debbugs.gnu.org:

From: Andreas Schwab <schwab <at> linux-m68k.org>
To: Gunnar Horrigmo <gunnar.horrigmo <at> usit.uio.no>
Cc: 20258 <at> debbugs.gnu.org
Subject: Re: bug#20258: 24.5;
 format-time-string miscounting of multibyte characters
Date: Sat, 04 Apr 2015 18:42:29 +0200
Gunnar Horrigmo <gunnar.horrigmo <at> usit.uio.no> writes:

> As the subject says, format-time-string miscounts multibyte characters.
> Simple example with nb_NO.utf8 locale, where ø is two bytes:
>
> (format-time-string "%6a" (date-to-time "Sat Apr  4 16:14:40 2015"))
> "  lø."

This is a limitation of the underlying strftime, which operates on
bytes, not characters.  This could be fixed by using wcsftime instead.

Andreas.

-- 
Andreas Schwab, schwab <at> linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."




Reply sent to Stefan Kangas <stefan <at> marxist.se>:
You have taken responsibility. (Mon, 30 Sep 2019 00:36:01 GMT)

Notification sent to Gunnar Horrigmo <gunnar.horrigmo <at> usit.uio.no>:
bug acknowledged by developer. (Mon, 30 Sep 2019 00:36:01 GMT)

Message #19 received at 20258-done <at> debbugs.gnu.org:

From: Stefan Kangas <stefan <at> marxist.se>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 20258-done <at> debbugs.gnu.org, Gunnar Horrigmo <gunnar.horrigmo <at> usit.uio.no>
Subject: Re: bug#20258: 24.5;
 format-time-string miscounting of multibyte characters
Date: Mon, 30 Sep 2019 02:35:08 +0200
Eli Zaretskii <eliz <at> gnu.org> writes:

>> From: Gunnar Horrigmo <gunnar.horrigmo <at> usit.uio.no>
>> Date: Sat, 04 Apr 2015 16:33:50 +0200
>>
>>
>> As the subject says, format-time-string miscounts multibyte characters.
>> Simple example with nb_NO.utf8 locale, where ø is two bytes:
>>
>> (format-time-string "%6a" (date-to-time "Sat Apr  4 16:14:40 2015"))
>> "  lø."
>>
>> (length (format-time-string "%6a" (date-to-time "Sat Apr  4 16:14:40 2015")))
>> 5
>
> 'length' counts characters, not bytes.  If you need to count bytes,
> use 'string-bytes' instead:
>
>   (string-bytes "  lø.") => 6

I can see no bug here, only a misunderstanding about the length
function.  I'm therefore closing this bug.  If that's incorrect, please
reopen this bug report.

Best regards,
Stefan Kangas




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20258; Package emacs. (Mon, 30 Sep 2019 03:10:02 GMT)

Message #22 received at 20258 <at> debbugs.gnu.org:

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: 20258 <at> debbugs.gnu.org
Cc: stefan <at> marxist.se, gunnar.horrigmo <at> usit.uio.no
Subject: Re: bug#20258: 24.5; format-time-string miscounting of multibyte
 characters
Date: Mon, 30 Sep 2019 05:09:08 +0200
Stefan Kangas <stefan <at> marxist.se> writes:

>>> As the subject says, format-time-string miscounts multibyte characters.
>>> Simple example with nb_NO.utf8 locale, where ø is two bytes:
>>>
>>> (format-time-string "%6a" (date-to-time "Sat Apr  4 16:14:40 2015"))
>>> "  lø."
>>>
>>> (length (format-time-string "%6a" (date-to-time "Sat Apr  4 16:14:40 2015")))
>>> 5
>>
>> 'length' counts characters, not bytes.  If you need to count bytes,
>> use 'string-bytes' instead:
>>
>>   (string-bytes "  lø.") => 6
>
> I can see no bug here, only a misunderstanding about the length
> function.  I'm therefore closing this bug.  If that's incorrect, please
> reopen this bug report.

But the issue here is that "%6a" should give you a string that's six
characters long, I think?  Admittedly the doc string is vague here:

---
A field width N is an unsigned decimal integer with a leading digit nonzero.
%NX is like %X, but takes up at least N positions.
---

But the natural interpretation of "positions" isn't bytes, I think, and
if it is, then the doc string should say so.

(let ((system-time-locale "nb_NO.UTF-8"))
  (format-time-string "%6a" (date-to-time "Sat Apr  4 16:14:40 2015")))
=> "  lø."

(if you have that locale in /etc/locale.gen.)

But I seem to remember from previous discussions that this quirk is in
the C strftime function?  And Emacs just calls it?  I haven't checked.
But this means that you can't use format-time-string to line stuff up,
but have to use `format':

(let ((system-time-locale "nb_NO.UTF-8"))
  (format "%6s" (format-time-string "%a" (date-to-time "Sat Apr  4 16:14:40 2015"))))
=> "   lø."

So I think what WIDTH means should be said explicitly in the doc string.
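
The `format' workaround above can be wrapped in a small helper. The following is only a sketch under stated assumptions: the names `my-pad-left` and `my-format-time-padded` are invented for this example and do not exist in Emacs, and the padding is measured in display columns via `string-width` rather than in bytes.

```elisp
;; Hypothetical helpers, for illustration only.
(defun my-pad-left (string width)
  "Return STRING left-padded with spaces to at least WIDTH display columns."
  (let ((missing (- width (string-width string))))
    (if (> missing 0)
        (concat (make-string missing ?\s) string)
      string)))

(defun my-format-time-padded (width format-string &optional time zone)
  "Like `format-time-string', but pad the whole result to WIDTH columns.
FORMAT-STRING, TIME, and ZONE are passed to `format-time-string' unchanged."
  (my-pad-left (format-time-string format-string time zone) width))

(my-pad-left "lø." 6)
;; => "   lø."
```

With this approach the format string itself carries no field width (for example "%a" rather than "%6a"), so the byte-based padding of the underlying strftime never comes into play.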

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20258; Package emacs. (Mon, 30 Sep 2019 07:02:04 GMT)

Message #25 received at 20258 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: 20258 <at> debbugs.gnu.org, stefan <at> marxist.se, gunnar.horrigmo <at> usit.uio.no
Subject: Re: bug#20258: 24.5;
 format-time-string miscounting of multibyte characters
Date: Mon, 30 Sep 2019 10:01:17 +0300
> From: Lars Ingebrigtsen <larsi <at> gnus.org>
> Date: Mon, 30 Sep 2019 05:09:08 +0200
> Cc: stefan <at> marxist.se, gunnar.horrigmo <at> usit.uio.no
> 
> A field width N is an unsigned decimal integer with a leading digit nonzero.
> %NX is like %X, but takes up at least N positions.
> ---
> 
> But the natural interpretation of "positions" isn't bytes, I think, and
> if it is, then the doc string should say so.
> 
> (let ((system-time-locale "nb_NO.UTF-8"))
>   (format-time-string "%6a" (date-to-time "Sat Apr  4 16:14:40 2015")))
> => "  lø."
> 
> (if you have that locale in /etc/locale.gen.)
> 
> But I seem to remember from previous discussions that this quirk is in
> the C strftime function?  And Emacs just calls it?

Yes, that's true.

> So I think what WIDTH means should be said explicitly in the doc string.

It can only warn that WIDTH _might_ be measured in bytes, since the
underlying implementation of strftime just might DTRT.  Or not.

I think this should be raised as a bug to glibc developers, as their
documentation says "characters", according to my reading.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20258; Package emacs. (Mon, 30 Sep 2019 08:42:02 GMT)

Message #28 received at 20258 <at> debbugs.gnu.org:

From: Andreas Schwab <schwab <at> suse.de>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 20258 <at> debbugs.gnu.org, Lars Ingebrigtsen <larsi <at> gnus.org>,
 stefan <at> marxist.se, gunnar.horrigmo <at> usit.uio.no
Subject: Re: bug#20258: 24.5;
 format-time-string miscounting of multibyte characters
Date: Mon, 30 Sep 2019 10:41:23 +0200
On Sep 30 2019, Eli Zaretskii <eliz <at> gnu.org> wrote:

> I think this should be raised as a bug to glibc developers, as their
> documentation says "characters", according to my reading.

The POSIX description says bytes.

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab <at> suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20258; Package emacs. (Mon, 30 Sep 2019 09:15:02 GMT)

Message #31 received at 20258 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Andreas Schwab <schwab <at> suse.de>
Cc: 20258 <at> debbugs.gnu.org, larsi <at> gnus.org, stefan <at> marxist.se,
 gunnar.horrigmo <at> usit.uio.no
Subject: Re: bug#20258: 24.5;
 format-time-string miscounting of multibyte characters
Date: Mon, 30 Sep 2019 12:13:59 +0300
> From: Andreas Schwab <schwab <at> suse.de>
> Cc: Lars Ingebrigtsen <larsi <at> gnus.org>,  20258 <at> debbugs.gnu.org,  stefan <at> marxist.se,  gunnar.horrigmo <at> usit.uio.no
> Date: Mon, 30 Sep 2019 10:41:23 +0200
> 
> On Sep 30 2019, Eli Zaretskii <eliz <at> gnu.org> wrote:
> 
> > I think this should be raised as a bug to glibc developers, as their
> > documentation says "characters", according to my reading.
> 
> The POSIX description says bytes.

Right.  So it might be a glibc documentation bug (or maybe the glibc
manual I have here is outdated).

And there is the issue with non-glibc implementations.

Thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20258; Package emacs. (Mon, 30 Sep 2019 13:40:02 GMT)

Message #34 received at 20258 <at> debbugs.gnu.org:

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 20258 <at> debbugs.gnu.org, Andreas Schwab <schwab <at> suse.de>, stefan <at> marxist.se,
 gunnar.horrigmo <at> usit.uio.no
Subject: Re: bug#20258: 24.5; format-time-string miscounting of multibyte
 characters
Date: Mon, 30 Sep 2019 15:39:15 +0200
Eli Zaretskii <eliz <at> gnu.org> writes:

>> The POSIX description says bytes.
>
> Right.  So it might be a glibc documentation bug (or maybe the glibc
> manual I have here is outdated).
>
> And there is the issue with non-glibc implementations.

What about something appropriately vague like the following patch to
draw attention to the issue:

diff --git a/src/timefns.c b/src/timefns.c
index 330d5623f0..20f7ccb7d7 100644
--- a/src/timefns.c
+++ b/src/timefns.c
@@ -1437,8 +1437,11 @@ DEFUN ("format-time-string", Fformat_time_string, Sformat_time_string, 1, 3, 0,
 `^' Use upper case characters if possible.
 `#' Use opposite case characters if possible.
 
-A field width N is an unsigned decimal integer with a leading digit nonzero.
-%NX is like %X, but takes up at least N positions.
+A field width N is an unsigned decimal integer with a leading digit
+nonzero.  %NX is like %X, but takes up at least N positions.  The
+field width is (on most systems) in bytes, not characters, so it
+depends on the locale what the width (in characters) %NX will end up
+being.
 
 The modifiers are:
 


-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20258; Package emacs. (Mon, 30 Sep 2019 14:00:02 GMT)

Message #37 received at 20258 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: 20258 <at> debbugs.gnu.org, schwab <at> suse.de, stefan <at> marxist.se,
 gunnar.horrigmo <at> usit.uio.no
Subject: Re: bug#20258: 24.5; format-time-string miscounting of multibyte
 characters
Date: Mon, 30 Sep 2019 16:58:49 +0300
> From: Lars Ingebrigtsen <larsi <at> gnus.org>
> Cc: Andreas Schwab <schwab <at> suse.de>,  20258 <at> debbugs.gnu.org,
>   stefan <at> marxist.se,  gunnar.horrigmo <at> usit.uio.no
> Date: Mon, 30 Sep 2019 15:39:15 +0200
> 
> -A field width N is an unsigned decimal integer with a leading digit nonzero.
> -%NX is like %X, but takes up at least N positions.
> +A field width N is an unsigned decimal integer with a leading digit
> +nonzero.  %NX is like %X, but takes up at least N positions.  The
> +field width is (on most systems) in bytes, not characters, so it

"is measured in bytes".  Also, I'd say "on GNU/Linux and some other
systems", which is marginally more accurate.

> +depends on the locale what the width (in characters) %NX will end up
> +being.

I would mention "non-ASCII characters" here in some way, not just the
locale, to make this more explicit.

Thanks.




Added tag(s) patch. Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Mon, 30 Sep 2019 14:05:03 GMT)

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20258; Package emacs. (Mon, 30 Sep 2019 14:13:02 GMT)

Message #42 received at 20258 <at> debbugs.gnu.org:

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 20258 <at> debbugs.gnu.org, schwab <at> suse.de, stefan <at> marxist.se,
 gunnar.horrigmo <at> usit.uio.no
Subject: Re: bug#20258: 24.5; format-time-string miscounting of multibyte
 characters
Date: Mon, 30 Sep 2019 16:12:38 +0200
Eli Zaretskii <eliz <at> gnu.org> writes:

> "is measured in bytes".  Also, I'd say "on GNU/Linux and some other
> systems", which is marginally more accurate.

OK.

>> +depends on the locale what the width (in characters) %NX will end up
>> +being.
>
> I would mention "non-ASCII characters" here in some way, not just the
> locale, to make this more explicit.

I was pondering whether any users had a locale of *.UTF-16.  Then even
the ASCII characters will be subject to the byte/character difference,
so I thought it was best to leave even that vague.

But perhaps saying something like "especially with non-ASCII characters"
wouldn't be too misleading?
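
The UTF-16 concern can be illustrated without involving strftime at all, using only Emacs's own coding systems. This is a sketch with an invented helper name, `my-bytes-per-encoding`; it shows how many bytes a plain ASCII string takes in each encoding and makes no claim about which locale codesets a given C library actually accepts.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-bytes-per-encoding (string coding-systems)
  "Return an alist of (CODING-SYSTEM . BYTES) for STRING.
BYTES is the length of STRING after encoding it in CODING-SYSTEM."
  (mapcar (lambda (cs)
            (cons cs (length (encode-coding-string string cs))))
          coding-systems))

(my-bytes-per-encoding "Sat" '(utf-8 utf-16le))
;; => ((utf-8 . 3) (utf-16le . 6))
```

In UTF-8 the ASCII string has one byte per character, so byte-based padding is only visibly wrong for non-ASCII text; in UTF-16 every character takes at least two bytes.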

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20258; Package emacs. (Mon, 30 Sep 2019 14:32:02 GMT)

Message #45 received at 20258 <at> debbugs.gnu.org:

From: Gunnar Horrigmo <gunnar.horrigmo <at> usit.uio.no>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: 20258 <at> debbugs.gnu.org, schwab <at> suse.de, Eli Zaretskii <eliz <at> gnu.org>,
 stefan <at> marxist.se, gunnar.horrigmo <at> usit.uio.no
Subject: Re: bug#20258: 24.5; format-time-string miscounting of multibyte
 characters
Date: Mon, 30 Sep 2019 16:30:55 +0200
Lars Ingebrigtsen <larsi <at> gnus.org> writes:

> I was pondering whether any users had a locale of *.UTF-16.

Windows users might, if that's at all relevant to the discussion.

--Gunnar




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20258; Package emacs. (Mon, 30 Sep 2019 14:42:02 GMT)

Message #48 received at 20258 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: 20258 <at> debbugs.gnu.org, schwab <at> suse.de, stefan <at> marxist.se,
 gunnar.horrigmo <at> usit.uio.no
Subject: Re: bug#20258: 24.5; format-time-string miscounting of multibyte
 characters
Date: Mon, 30 Sep 2019 17:41:24 +0300
> From: Lars Ingebrigtsen <larsi <at> gnus.org>
> Cc: schwab <at> suse.de,  20258 <at> debbugs.gnu.org,  stefan <at> marxist.se,
>   gunnar.horrigmo <at> usit.uio.no
> Date: Mon, 30 Sep 2019 16:12:38 +0200
> 
> > I would mention "non-ASCII characters" here in some way, not just the
> > locale, to make this more explicit.
> 
> I was pondering whether any users had a locale of *.UTF-16.

Unlikely.

> But perhaps saying something like "especially with non-ASCII characters"
> wouldn't be too misleading?

Yes, that's what I had in mind.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20258; Package emacs. (Mon, 30 Sep 2019 14:45:02 GMT)

Message #51 received at 20258 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Gunnar Horrigmo <gunnar.horrigmo <at> usit.uio.no>
Cc: 20258 <at> debbugs.gnu.org, schwab <at> suse.de, larsi <at> gnus.org, stefan <at> marxist.se,
 gunnar.horrigmo <at> usit.uio.no
Subject: Re: bug#20258: 24.5; format-time-string miscounting of multibyte
 characters
Date: Mon, 30 Sep 2019 17:44:24 +0300
> From: Gunnar Horrigmo <gunnar.horrigmo <at> usit.uio.no>
> Cc: Eli Zaretskii <eliz <at> gnu.org>,  <schwab <at> suse.de>,
>   <20258 <at> debbugs.gnu.org>,  <stefan <at> marxist.se>,
>   <gunnar.horrigmo <at> usit.uio.no>
> Date: Mon, 30 Sep 2019 16:30:55 +0200
> 
> Lars Ingebrigtsen <larsi <at> gnus.org> writes:
> 
> > I was pondering whether any users had a locale of *.UTF-16.
> 
> Windows users might

I don't think so.  AFAIK, UTF-16 is not a valid codeset of any Windows
locale.  Windows uses UTF-16 internally, and exposes it in the Windows
APIs, but APIs that came from Posix (and locale is one of them) only
support single-byte and DBCS encodings as their codeset.





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#20258; Package emacs. (Mon, 30 Sep 2019 14:49:02 GMT)

Message #54 received at 20258 <at> debbugs.gnu.org:

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 20258 <at> debbugs.gnu.org, schwab <at> suse.de, stefan <at> marxist.se,
 gunnar.horrigmo <at> usit.uio.no
Subject: Re: bug#20258: 24.5; format-time-string miscounting of multibyte
 characters
Date: Mon, 30 Sep 2019 16:48:18 +0200
Eli Zaretskii <eliz <at> gnu.org> writes:

>> But perhaps saying something like "especially with non-ASCII characters"
>> wouldn't be too misleading?
>
> Yes, that's what I had in mind.

OK; doc string updated accordingly, and I'm closing this bug report.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Added tag(s) fixed. Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Mon, 30 Sep 2019 14:49:02 GMT)

bug marked as fixed in version 27.1, send any further explanations to 20258 <at> debbugs.gnu.org and Gunnar Horrigmo <gunnar.horrigmo <at> usit.uio.no>. Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Mon, 30 Sep 2019 14:49:02 GMT)

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 29 Oct 2019 11:24:06 GMT)

This bug report was last modified 4 years and 178 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.