GNU bug report logs - #47103
numfmt: invalid suffix 'k'

Package: coreutils

Reported by: Daniel Callejas Sevilla <daniel.callejas.sevilla <at> gmail.com>

Date: Fri, 12 Mar 2021 16:18:02 UTC

Severity: normal

Done: Pádraig Brady <P <at> draigBrady.com>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 47103 in the body.
You can then email your comments to 47103 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-coreutils <at> gnu.org:
bug#47103; Package coreutils. (Fri, 12 Mar 2021 16:18:02 GMT)

Acknowledgement sent to Daniel Callejas Sevilla <daniel.callejas.sevilla <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Fri, 12 Mar 2021 16:18:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Daniel Callejas Sevilla <daniel.callejas.sevilla <at> gmail.com>
To: bug-coreutils <at> gnu.org
Subject: numfmt: invalid suffix 'k'
Date: Fri, 12 Mar 2021 17:12:13 +0100
Hello,

the SI prefix for 'a thousand' is a lowercase k and not an uppercase K [1].

The default behavior of numfmt with '--from=si' option is therefore
contrary to expectation:

$ numfmt --from=si
 500k    # Should be accepted as valid SI
 numfmt: invalid suffix in input: ‘500k’

$ numfmt --from=si
 500K    # Should result in error, 'K' stands for kelvin unit.
 500000

$ numfmt --version
 numfmt (GNU coreutils) 8.26
 Packaged by Cygwin (8.26-2)
 Copyright (C) 2016 Free Software Foundation, Inc.
 License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
 This is free software: you are free to change and redistribute it.
 There is NO WARRANTY, to the extent permitted by law.

 Written by Assaf Gordon.


Best regards,
Daniel.

[1] Page 143 of https://www.bipm.org/en/publications/si-brochure/
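
The behavior requested above can be illustrated with a minimal sketch. It is not numfmt's actual C implementation; the function name `parse_si` and the `accept_lowercase_k` flag are hypothetical, introduced only for illustration. It shows how a parser that scales by powers of 1000 might accept both 'k' and 'K' for kilo.

```python
# Hypothetical sketch of SI suffix parsing; not taken from the numfmt source.
# Each suffix maps to the power of 1000 it represents.
SI_POWERS = {"K": 1, "M": 2, "G": 3, "T": 4, "P": 5, "E": 6}


def parse_si(text, accept_lowercase_k=True):
    """Convert a string such as '500k' to an integer, scaling by powers of 1000."""
    text = text.strip()
    if not text:
        raise ValueError("empty input")
    suffix = text[-1]
    if suffix.isdigit():
        # No suffix at all: a plain integer.
        return int(text)
    if suffix == "k" and accept_lowercase_k:
        # Treat the SI-correct lowercase 'k' the same as the legacy 'K'.
        suffix = "K"
    if suffix not in SI_POWERS:
        raise ValueError(f"invalid suffix in input: {text!r}")
    return int(text[:-1]) * 1000 ** SI_POWERS[suffix]


if __name__ == "__main__":
    print(parse_si("500k"))
    print(parse_si("500K"))
```

With `accept_lowercase_k=False` the sketch rejects '500k', which mirrors the error shown in the report. Rejecting uppercase 'K' instead would break existing input that uses it.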




Information forwarded to bug-coreutils <at> gnu.org:
bug#47103; Package coreutils. (Sun, 26 Nov 2023 07:35:02 GMT)

Message #8 received at 47103 <at> debbugs.gnu.org:

From: Sven Köhler <sven.koehler <at> gmail.com>
To: 47103 <at> debbugs.gnu.org
Subject: numfmt: invalid suffix 'k'
Date: Sat, 25 Nov 2023 22:27:15 +0100
Not only --from=si is broken. Also --to=si is broken:

$ numfmt --to=si 3000
3,0K

In order to not break backwards compatibility, you probably have to 
introduce a switch --lowercase-kilo such that --to=si produces proper SI 
compliant output. Then have --from=si accept both uppercase and lowercase k.

I have to say that uppercase K is quite common, but it is not correct 
as far as SI prefixes are concerned.

Also note that Ki in iec-i mode is quite correct. I'm torn about iec 
mode. I believe that people silently switch 1000 for 1024 and use the 
lower case k as well as uppercase K. Maybe numfmt should have an option 
to accept/produce both here as well?

Is there really a standard/specification that allows k/K for 1024? 
Wikipedia only lists Ki as IEC prefixes.
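
The output side of this proposal can be sketched as follows. This is an illustration only, not numfmt's code: the function `to_si` and its `lowercase_kilo` parameter are hypothetical (the parameter mirrors the suggested --lowercase-kilo switch), and rounding is simplified compared with what numfmt does.

```python
# Hypothetical sketch of SI output formatting; not taken from the numfmt source.
def to_si(value, lowercase_kilo=True):
    """Format an integer with an SI prefix, choosing 'k' or 'K' for kilo."""
    prefixes = ["", "k" if lowercase_kilo else "K", "M", "G", "T", "P", "E"]
    scaled = float(value)
    index = 0
    # Divide by 1000 until the number fits under the next prefix.
    while abs(scaled) >= 1000 and index < len(prefixes) - 1:
        scaled /= 1000
        index += 1
    if index == 0:
        # Small values are printed unchanged, without a prefix.
        return str(value)
    # One decimal place below 10 (as in "3.0k"), none otherwise.
    # Simplification: Python's float formatting decides the rounding here.
    digits = 1 if abs(scaled) < 10 else 0
    return f"{scaled:.{digits}f}{prefixes[index]}"


if __name__ == "__main__":
    print(to_si(3000))
    print(to_si(3000, lowercase_kilo=False))
```

The sketch always prints a decimal point, whereas the session above shows a decimal comma because of the reporter's locale.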

Information forwarded to bug-coreutils <at> gnu.org:
bug#47103; Package coreutils. (Sun, 26 Nov 2023 14:59:01 GMT)

Message #11 received at 47103 <at> debbugs.gnu.org:

From: Pádraig Brady <P <at> draigBrady.com>
To: Sven Köhler <sven.koehler <at> gmail.com>,
 47103 <at> debbugs.gnu.org
Subject: Re: bug#47103: numfmt: invalid suffix 'k'
Date: Sun, 26 Nov 2023 14:57:44 +0000
On 25/11/2023 21:27, Sven Köhler wrote:
> Not only --from=si is broken. Also --to=si is broken:
> 
> $ numfmt --to=si 3000
> 3,0K
> 
> In order to not break backwards compatibility, you probably have to
> introduce a switch --lowercase-kilo such that --to=si produces proper SI
> compliant output. Then have --from=si accept both uppercase and lowercase k.
> 
> I have to say that uppercase K is quite common, but it is not correct
> as far as SI prefixes are concerned.
> 
> Also note that Ki in iec-i mode is quite correct. I'm torn about iec
> mode. I believe that people silently switch 1000 for 1024 and use the
> lower case k as well as uppercase K. Maybe numfmt should have an option
> to accept/produce both here as well?
> 
> Is there really a standard/specification that allows k/K for 1024?
> Wikipedia only lists Ki as IEC prefixes.

I had thought we only supported uppercase K for compatibility with output from existing coreutils.
But in fact it's quite the opposite: other coreutils output lowercase k
when operating in SI mode. For example, this gives an error:

  $ ls -lh --si /bin/ | numfmt --from=si --field=5
  numfmt: invalid suffix in input: ‘54k’

So we should at least accept lowercase k.

As for outputting lowercase k for the SI case, the coreutils texinfo has the following
in relation to these Kilo prefixes:

  ‘kB’
       kilobyte: 10^3 = 1000.
  ‘k’
  ‘K’
  ‘KiB’
       kibibyte: 2^{10} = 1024.  ‘K’ is special: the SI prefix is ‘k’ and
       the ISO/IEC 80000-13 prefix is ‘Ki’, but tradition and POSIX use
       ‘k’ to mean ‘KiB’.

So one might be conservative here and keep outputting uppercase K in SI mode.
However the above is really in relation to specifying block sizes, to df or dd etc.,
so we should probably output lower case k for consistency with other coreutils at least.
We could be conservative here and have a new --to=Si option (note the casing)
to explicitly select/allow variable cased SI prefixes, but I'm not sure that's needed.

For IEC mode, we could just allow uppercase K,
but it's simpler and more flexible to also accept lowercase k here, without much ambiguity.

As for not allowing uppercase K in SI mode, that's probably overkill,
and would cause more problems than it would solve. One edge case
it would solve is when working with a Kelvin suffix, to avoid
the ambiguity in the first case of the following.
That's too much of an edge case to worry about I think:

  $ numfmt --suffix=K --from=si 500K
  500K
  $ numfmt --suffix=K --from=si 500M
  500000000K
  $ numfmt --suffix=K --from=si 500KK
  500000K


The attached patch makes the adjustment to allow 'k' always,
and to output 'k' in SI mode. Tests will need adjusting,
but there's no need to clutter the discussion patch with that.

cheers,
Pádraig.
[numfmt-k.patch (text/x-patch, attachment)]
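
The Kelvin edge case in the message above can be reproduced with a small sketch. The order of operations used here (strip one trailing unit suffix first, then apply any SI prefix) is an assumption inferred from the three outputs quoted in the message, not from the numfmt source. The function name `convert_with_unit` is hypothetical.

```python
# Hypothetical sketch of --suffix handling; inferred from the quoted outputs only.
SI_POWERS = {"k": 1, "K": 1, "M": 2, "G": 3, "T": 4, "P": 5, "E": 6}


def convert_with_unit(text, unit="K"):
    """Expand an SI-prefixed number and re-append the unit suffix."""
    # Assumption: a trailing unit suffix is removed before prefix parsing,
    # so in '500K' the 'K' is consumed as the unit, not as kilo.
    body = text[:-len(unit)] if unit and text.endswith(unit) else text
    if body and body[-1] in SI_POWERS:
        number = int(body[:-1]) * 1000 ** SI_POWERS[body[-1]]
    else:
        number = int(body)
    return f"{number}{unit}"


if __name__ == "__main__":
    for sample in ("500K", "500M", "500KK"):
        print(sample, "->", convert_with_unit(sample))
```

Under that assumption, '500K' is read as 500 kelvin rather than 500 kilo-kelvin, which is the ambiguity described above, while '500KK' is needed to express kilo-kelvin.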

Information forwarded to bug-coreutils <at> gnu.org:
bug#47103; Package coreutils. (Sun, 26 Nov 2023 16:10:02 GMT)

Message #14 received at 47103 <at> debbugs.gnu.org:

From: Sven Köhler <sven.koehler <at> gmail.com>
To: Pádraig Brady <P <at> draigBrady.com>, 47103 <at> debbugs.gnu.org
Subject: Re: bug#47103: numfmt: invalid suffix 'k'
Date: Sun, 26 Nov 2023 17:09:23 +0100
So Pádraig's patch does allow for parsing lowercase k, but it does not 
change numfmt to use lowercase k in its output in si mode.

As Pádraig has shown, ls uses lowercase k in --si mode. So it uses 
lowercase k for 1000. I think that numfmt should behave the same for 
consistency reasons.

Also, the texinfo then seems incomplete. It lists kB but not k alone, as 
used by ls in --si mode.





Reply sent to Pádraig Brady <P <at> draigBrady.com>:
You have taken responsibility. (Sun, 26 Nov 2023 17:00:03 GMT)

Notification sent to Daniel Callejas Sevilla <daniel.callejas.sevilla <at> gmail.com>:
bug acknowledged by developer. (Sun, 26 Nov 2023 17:00:03 GMT)

Message #19 received at 47103-done <at> debbugs.gnu.org:

From: Pádraig Brady <P <at> draigBrady.com>
To: Sven Köhler <sven.koehler <at> gmail.com>,
 47103-done <at> debbugs.gnu.org
Subject: Re: bug#47103: numfmt: invalid suffix 'k'
Date: Sun, 26 Nov 2023 16:59:36 +0000
On 26/11/2023 16:09, Sven Köhler wrote:
> So Pádraig's patch does allow for parsing lowercase k, but it does not
> change numfmt to use lowercase k in its output in si mode.
> 
> As Pádraig has shown, ls uses lowercase k in --si mode. So it uses
> lowercase k for 1000. I think that numfmt should behave the same for
> consistency reasons.

It does output lowercase 'k' in SI mode.

Attached is the full patch.

Marking this as done.

Will push this tomorrow.

thanks,
Pádraig
[numfmt-k.patch (text/x-patch, attachment)]

Information forwarded to bug-coreutils <at> gnu.org:
bug#47103; Package coreutils. (Sun, 26 Nov 2023 17:13:02 GMT)

Message #22 received at 47103 <at> debbugs.gnu.org:

From: Glenn Golden <gdg <at> zplane.com>
To: Sven Köhler <sven.koehler <at> gmail.com>
Cc: 47103 <at> debbugs.gnu.org, Pádraig Brady <P <at> draigbrady.com>
Subject: Re: bug#47103: numfmt: invalid suffix 'k'
Date: Sun, 26 Nov 2023 10:12:13 -0700
Sven Köhler <sven.koehler <at> gmail.com> [2023-11-26 17:09:23 +0100]:
> So Pádraig's patch does allow for parsing lowercase k, but it does not
> change numfmt to use lowercase k in its output in si mode.
> 
> As Pádraig has shown, ls uses lowercase k in --si mode. So it uses lowercase
> k for 1000. I think that numfmt should behave the same for consistency
> reasons.
> 
> Also, the texinfo then seems incomplete. It lists kB but not k alone, as
> used by ls in --si mode.
> 

Danger Will Robinson...  The situation w.r.t. B vs. B-less is more complicated
than that; the texinfo doc is (and has been for years) self-contradictory
regarding the semantics of B vs. B-less suffixes. It just depends which
section you happen to refer to.

This thread from a few years back lays out the B-vs.-no-B confusion in
gory detail:

    https://lists.gnu.org/archive/html/coreutils/2020-09/msg00001.html

NOTE: The above was posted in 2020 when coreutils 8.32 was current; the
texinfo sections referred to in the initial message of that thread have
changed. In particular, the section referred to in the above thread as
Section 2.3 "Block size" is now Section 2.2 of the present (coreutils 9.4)
texinfo.

Glenn




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Mon, 25 Dec 2023 12:24:05 GMT)

This bug report was last modified 94 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.