GNU bug report logs - #26621
hint for translators is missing from POT file, but is opaque anyhow

Package: coreutils

Reported by: Benno Schulenberg <bensberg <at> justemail.net>

Date: Sun, 23 Apr 2017 15:18:02 UTC

Severity: normal

Tags: fixed

Done: Assaf Gordon <assafgordon <at> gmail.com>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 26621 in the body.
You can then email your comments to 26621 AT debbugs.gnu.org in the normal way.



Report forwarded to bug-coreutils <at> gnu.org:
bug#26621; Package coreutils. (Sun, 23 Apr 2017 15:18:02 GMT)

Acknowledgement sent to Benno Schulenberg <bensberg <at> justemail.net>:
New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Sun, 23 Apr 2017 15:18:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Benno Schulenberg <bensberg <at> justemail.net>
To: Coreutils <bug-coreutils <at> gnu.org>
Subject: hint for translators is missing from POT file, but is opaque anyhow
Date: Sun, 23 Apr 2017 17:17:06 +0200
Commit 2dab6cd3c2e18eb574b24e54fba86a33c80b6a27 changed
the progress messages for dd, but in doing so separated
the instruction/hint for translators from the call to
ngettext().  For xgettext to pick up such comments, the
comment must end on the line directly before the call.
So the current POT file for coreutils does not contain
this comment/hint/instruction.
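
A minimal sketch of the placement rule described above. It assumes that xgettext is invoked with --add-comments=TRANSLATORS (an assumption about the build setup), and the function name format_bytes_copied is hypothetical rather than dd's actual code:

```c
#include <libintl.h>
#include <stdio.h>

/* Hypothetical helper, not taken from dd: format a byte count into BUF
   and return the length of the formatted text.  */
int
format_bytes_copied (char *buf, size_t size, unsigned long int n)
{
  /* TRANSLATORS: This comment ends on the line directly before the
     statement containing the ngettext call, so xgettext can attach it
     to the msgid.  Unrelated code between the comment and the call
     would cause the comment to be dropped.  */
  const char *fmt = ngettext ("%lu byte copied", "%lu bytes copied", n);
  return snprintf (buf, size, fmt, n);
}
```

When no message catalog is bound, ngettext falls back to the English strings, so the helper can be exercised without any translation installed.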

Second, the comment seems to consist of two parts that
appear to be unrelated.  So it would be better to split
the text into two separate paragraphs.

Third, the second part of that comment reads like this:

     If one of
     these formats A looks shorter on the screen than another format
     B, then A's string length should be less than B's, and appending
     strlen (B) - strlen (A) spaces to A should make it appear to be
     at least as long as B.

I don't understand what it is trying to say.  Does it say
that if, of those four strings, untranslated string A is
shorter than untranslated string B, that then also the
translation of string A must be shorter than the translation
of string B?  If yes, then: 1) please reword, 2) why?, and
3) does the program blow up if not?  Or is this part of the
comment not meant for translators at all?

Fourth, the first part of the comment begins with this:

     The instances of "s" in the following formats are
     the SI symbol "s" (meaning second), and should not be translated.

Why should they not be translated?  In order to avoid problems
with grammatical congruence in languages like Polish?  But for
a language like Dutch I would accept the mild incongruence
when the elapsed time is exactly x.1 seconds, which will be
a rare occasion.  For all other numbers it will be much clearer
to say "seconden" instead of just "s".  So I would suggest to
change this part of the comment to:

     The instances of "s" in the next four strings are the SI
     symbol "s" (meaning seconds).  It may be preferable
     to leave them untranslated, to avoid problems with
     grammatical congruence.

Fifth (and this is the reason I arrived here), when using
status=progress, the elapsed time that is printed is shown
with four or five decimals.  1) Is the time measurement
really this accurate?  2) Sometimes the last one or two
or three decimals happen to be zero, and then they get
truncated, making the progress message a bit shorter for
one second.  It would be nicer to use a fixed number of
decimals so that the message doesn't unnecessarily "jump".

Sixth, the format string uses %g, which means that the
number of seconds will be displayed in exponential form
when the number becomes very large.  Is that intentional?
Wouldn't it be better to use %f?  I've played a bit with it,
and I think %.1f is best, because also the other numbers
in the progress message, when they are in decimal form,
use a single decimal of precision.
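
The width "jump" can be reproduced outside dd. The following is only a sketch: the function names are hypothetical, and the " s" suffix merely imitates the shape of the progress message. It shows that %g drops trailing zeros while %.1f keeps a constant number of decimals:

```c
#include <stdio.h>

/* Format SECONDS with "%g" (up to six significant digits, trailing
   zeros removed).  Returns the length of the formatted text.  */
int
format_elapsed_g (char *buf, size_t size, double seconds)
{
  return snprintf (buf, size, "%g s", seconds);
}

/* Format SECONDS with "%.1f" (always exactly one decimal).
   Returns the length of the formatted text.  */
int
format_elapsed_fixed (char *buf, size_t size, double seconds)
{
  return snprintf (buf, size, "%.1f s", seconds);
}
```

With these helpers, 6.00001 formats as "6.00001 s" under %g but 7.0 formats as "7 s", so the message shrinks by six characters from one report to the next; under %.1f both values produce five characters.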

Benno






Information forwarded to bug-coreutils <at> gnu.org:
bug#26621; Package coreutils. (Mon, 24 Apr 2017 04:12:01 GMT)

Message #8 received at 26621 <at> debbugs.gnu.org:

From: Pádraig Brady <P <at> draigBrady.com>
To: Benno Schulenberg <bensberg <at> justemail.net>, 26621 <at> debbugs.gnu.org
Subject: Re: bug#26621: hint for translators is missing from POT file, but is
 opaque anyhow
Date: Sun, 23 Apr 2017 21:11:38 -0700
On 23/04/17 08:17, Benno Schulenberg wrote:
> 
> Commit 2dab6cd3c2e18eb574b24e54fba86a33c80b6a27 changed
> the progress messages for dd, but in doing so separated
> the instruction/hint for translators from the call to
> ngettext().  For xgettext to pick up such comments, the
> comment must end on the line directly before the call.
> So the current POT file for coreutils does not contain
> this comment/hint/instruction.
> 
> Second, the comment seems to consist of two parts that
> appear to be unrelated.  So it would be better to split
> the text into two separate paragraphs.
> 
> Third, the second part of that comment reads like this:
> 
>      If one of
>      these formats A looks shorter on the screen than another format
>      B, then A's string length should be less than B's, and appending
>      strlen (B) - strlen (A) spaces to A should make it appear to be
>      at least as long as B.
> 
> I don't understand what it is trying to say.  Does it say
> that if, of those four strings, untranslated string A is
> shorter than untranslated string B, that then also the
> translation of string A must be shorter than the translation
> of string B?  If yes, then: 1) please reword, 2) why?, and
> 3) does the program blow up if not?  Or is this part of the
> comment not meant for translators at all?
> 
> Fourth, the first part of the comment begins with this:
> 
>      The instances of "s" in the following formats are
>      the SI symbol "s" (meaning second), and should not be translated.
> 
> Why should they not be translated?  In order to avoid problems
> with grammatical congruence in languages like Polish?  But for
> a language like Dutch I would accept the mild incongruence
> when the elapsed time is exactly x.1 seconds, which will be
> a rare occasion.  For all other numbers it will be much clearer
> to say "seconden" instead of just "s".  So I would suggest to
> change this part of the comment to:
> 
>      The instances of "s" in the next four strings are the SI
>      symbol "s" (meaning seconds).  It may be preferable
>      to leave them untranslated, to avoid problems with
>      grammatical congruence.
> 
> Fifth (and this is the reason I arrived here), when using
> status=progress, the elapsed time that is printed is shown
> with four or five decimals.  1) Is the time measurement
> really this accurate?  2) Sometimes the last one or two
> or three decimals happen to be zero, and then they get
> truncated, making the progress message a bit shorter for
> one second.  It would be nicer to use a fixed number of
> decimals so that the message doesn't unnecessarily "jump".
> 
> Sixth, the format string uses %g, which means that the
> number of seconds will be displayed in exponential form
> when the number becomes very large.  Is that intentional?
> Wouldn't it be better to use %f?  I've played a bit with it,
> and I think %.1f is best, because also the other numbers
> in the progress message, when they are in decimal form,
> use a single decimal of precision.

Yes, the varying-width numbers are not great.
This jumps around on my system:
  dd status=progress if=/dev/zero of=/dev/null bs=2M

Yes, http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=v8.24-123-g2dab6cd
changed from %.6f to %g.

You can get some sense of the I/O overhead by looking
at the less significant decimal digits,
so I find the precision useful.

I think I'll change this back to %.6f to avoid the jumping,
and fix up the TRANSLATORS notes.

cheers,
Pádraig




Information forwarded to bug-coreutils <at> gnu.org:
bug#26621; Package coreutils. (Mon, 24 Apr 2017 07:39:01 GMT)

Message #11 received at 26621 <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Benno Schulenberg <bensberg <at> justemail.net>, 26621 <at> debbugs.gnu.org
Subject: Re: bug#26621: hint for translators is missing from POT file, but is
 opaque anyhow
Date: Mon, 24 Apr 2017 00:38:07 -0700
Benno Schulenberg wrote:
> Fourth, the first part of the comment begins with this:
>
>      The instances of "s" in the following formats are
>      the SI symbol "s" (meaning second), and should not be translated.
>
> Why should they not be translated?  In order to avoid problems
> with grammatical congruence in languages like Polish?

Yes. This is part of the SI standard. SI abbreviations are supposed to be 
identical in all languages, regardless of grammar issues.

> It would be nicer to use a fixed number of
> decimals so that the message doesn't unnecessarily "jump".

Yes, and since the messages you're talking about are supposed to come out once a 
second, dd should just omit everything after the decimal point.

> Sixth, the format string uses %g, which means that the
> number of seconds will be displayed in exponential form
> when the number becomes very large.  Is that intentional?

Yes, it's been that way since that code was introduced in 2004 (before 
status=progress was added). The idea I had back then was that we want more than 
1 digit when transfer times are short, and that we needn't bother with lots of 
digits when transfer times are long. I've never heard of a real-world situation 
where the exponential notation actually gets used (more than 11 days for the 
transfer, if I calculate aright) so the issue is to some extent academic.

Thanks for your other comments. I installed the attached patch, which I hope 
addresses them.
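
The "more than 11 days" figure can be checked with a small sketch (the helper names below are hypothetical). With its default precision of six significant digits, %g switches to exponential notation once the value reaches 10^6, and 10^6 s / 86400 s per day is roughly 11.57 days:

```c
#include <stdio.h>

/* Return 1 if "%g" renders SECONDS in exponential notation, else 0.  */
int
uses_exponent (double seconds)
{
  char buf[32];
  snprintf (buf, sizeof buf, "%g", seconds);
  for (const char *p = buf; *p; p++)
    if (*p == 'e')
      return 1;
  return 0;
}

/* Convert a duration in seconds to days (86400 seconds per day).  */
double
seconds_to_days (double seconds)
{
  return seconds / 86400.0;
}
```

Here 999999 still prints in plain decimal form, while 1000000 prints as "1e+06".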
[0001-dd-status-progress-outputs-6-s-not-6.00001-s.patch (text/x-diff, attachment)]

Information forwarded to bug-coreutils <at> gnu.org:
bug#26621; Package coreutils. (Mon, 24 Apr 2017 15:24:02 GMT)

Message #14 received at 26621 <at> debbugs.gnu.org:

From: Bernhard Voelker <mail <at> bernhard-voelker.de>
To: Paul Eggert <eggert <at> cs.ucla.edu>,
 Benno Schulenberg <bensberg <at> justemail.net>, 26621 <at> debbugs.gnu.org
Subject: Re: bug#26621: hint for translators is missing from POT file, but is
 opaque anyhow
Date: Mon, 24 Apr 2017 17:23:10 +0200
On 04/24/2017 09:38 AM, Paul Eggert wrote:
> Yes, and since the messages you're talking about are supposed to come out
> once a second, dd should just omit everything after the decimal point.

Not quite - you can still get them with 'kill -USR1 $pid'.

Have a nice day,
Berny




Information forwarded to bug-coreutils <at> gnu.org:
bug#26621; Package coreutils. (Mon, 24 Apr 2017 17:16:02 GMT)

Message #17 received at 26621 <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Bernhard Voelker <mail <at> bernhard-voelker.de>,
 Benno Schulenberg <bensberg <at> justemail.net>, 26621 <at> debbugs.gnu.org
Subject: Re: bug#26621: hint for translators is missing from POT file, but is
 opaque anyhow
Date: Mon, 24 Apr 2017 10:15:23 -0700
On 04/24/2017 08:23 AM, Bernhard Voelker wrote:
>> Yes, and since the messages you're talking about are supposed to come out
>> once a second, dd should just omit everything after the decimal point.
> Not quite - you can still get them with 'kill -USR1 $pid'.

In current master "kill -USR1" messages use %g, as the signal might 
arrive at any time. %.0f is used only by ordinary status=progress 
messages, which are as close to the 1-second boundaries as can be arranged.
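
The split described here can be sketched as follows. This is an illustration only: the function name and the periodic flag are hypothetical and do not reflect how dd's source is organized.

```c
#include <stdio.h>

/* Format the elapsed time for a report.  PERIODIC is nonzero for the
   once-per-second status=progress reports, which land near whole
   seconds, and zero for reports triggered by a signal, which can
   arrive at any moment.  Returns the length of the formatted text.  */
int
format_report_time (char *buf, size_t size, double seconds, int periodic)
{
  return snprintf (buf, size, periodic ? "%.0f s" : "%g s", seconds);
}
```

For an elapsed time of 6.00001 seconds, the periodic form yields "6 s" and the signal-triggered form yields "6.00001 s".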





Information forwarded to bug-coreutils <at> gnu.org:
bug#26621; Package coreutils. (Mon, 24 Apr 2017 19:58:02 GMT)

Message #20 received at 26621 <at> debbugs.gnu.org:

From: Benno Schulenberg <bensberg <at> justemail.net>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 26621 <at> debbugs.gnu.org
Subject: Re: bug#26621: hint for translators is missing from POT file,
 but is opaque anyhow
Date: Mon, 24 Apr 2017 21:57:08 +0200
On Mon, Apr 24, 2017, at 09:38, Paul Eggert wrote:
> Benno Schulenberg wrote:
> > It would be nicer to use a fixed number of
> > decimals so that the message doesn't unnecessarily "jump".
> 
> Yes, and since the messages you're talking about are supposed to come out once a 
> second, dd should just omit everything after the decimal point.

Well, in the past week I have been writing several ISO images
to a USB stick, and dd's progress messages first come out at
about 1.1 second intervals, then slow down to about every four
or five seconds, and after about a minute start to arrive again
at close to every 1.0 seconds.

(Also, when dd is done copying records, the USB stick isn't
ready yet: it continues blinking for nearly a minute, but dd
sits there as if nothing is happening.  I don't know if this is
feasible, but it would be nice if dd would continue to count up
the seconds, and slowly reduce the transfer rate.)

> Thanks for your other comments. I installed the attached patch, which I hope 
> addresses them.

Okay.  If I understand the new comment well, I would have written:

/* TRANSLATORS: The translations of the next three msgids should
be of ascending length.  That is: each subsequent msgstr should be
longer than the preceding one. */

Benno






Information forwarded to bug-coreutils <at> gnu.org:
bug#26621; Package coreutils. (Tue, 25 Apr 2017 01:21:01 GMT)

Message #23 received at 26621 <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Benno Schulenberg <bensberg <at> justemail.net>
Cc: 26621 <at> debbugs.gnu.org
Subject: Re: bug#26621: hint for translators is missing from POT file, but is
 opaque anyhow
Date: Mon, 24 Apr 2017 18:20:18 -0700
On 04/24/2017 12:57 PM, Benno Schulenberg wrote:
> Well, in the past week I have been writing several ISO images
> to a USB stick, and dd's progress messages first come out at
> about 1.1 second intervals, then slow down to about every four
> or five seconds, and after about a minute start to arrive again
> at close to every 1.0 seconds.

Although that's annoying, I don't offhand see an easy way to fix that (a 
single write that takes a very long time, e.g., more than a second), and 
if the timestamps are that coarse then 1 s resolution is not so bad anyway.

> (Also, when dd is done copying records, the USB stick isn't
> ready yet:

You might need to use 'eject' on the stick before unplugging it. Perhaps 
dd should have an option to eject the output when done?

> /* TRANSLATORS: The translations of the next three msgids should
> be of ascending length.  That is: each subsequent msgstr should be
> longer than the preceding one. */

That's not technically correct, as the msgstr length itself is not 
directly relevant; what matters are the numbers of bytes and columns in 
the formatted output. One could have a short msgstr that formats to a 
long output. It's the number of bytes and columns in the formatted 
output that matters, not the number of bytes and columns in the msgstr.

To some extent this translator note is pedantic, as translations are 
quite likely to have the desired property even if translators don't know 
about the issue.
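
The gap between bytes and columns can be illustrated with a small sketch. It assumes UTF-8 text and approximates columns by counting code points, which is only accurate for characters that occupy exactly one column (no combining or double-width characters); the function names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Count UTF-8 code points in S by skipping continuation bytes
   (those of the form 10xxxxxx).  This is a rough stand-in for the
   number of screen columns.  */
size_t
utf8_code_points (const char *s)
{
  size_t n = 0;
  for (; *s; s++)
    if (((unsigned char) *s & 0xC0) != 0x80)
      n++;
  return n;
}

/* Return how many more bytes than (approximate) columns S occupies.  */
size_t
bytes_minus_columns (const char *s)
{
  return strlen (s) - utf8_code_points (s);
}
```

For example, "copi\xc3\xa9" (the word "copié" encoded in UTF-8) is six bytes long but five columns wide, so padding computed from strlen would come out one space short of what the screen needs.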




Information forwarded to bug-coreutils <at> gnu.org:
bug#26621; Package coreutils. (Thu, 27 Apr 2017 17:42:01 GMT)

Message #26 received at 26621 <at> debbugs.gnu.org:

From: Benno Schulenberg <bensberg <at> justemail.net>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 26621 <at> debbugs.gnu.org
Subject: Re: bug#26621: hint for translators is missing from POT file,
 but is opaque anyhow
Date: Thu, 27 Apr 2017 19:41:12 +0200
On Tue, Apr 25, 2017, at 03:20, Paul Eggert wrote:
> On 04/24/2017 12:57 PM, Benno Schulenberg wrote:
> > (Also, when dd is done copying records, the USB stick isn't
> > ready yet:
> 
> You might need to use 'eject' on the stick before unplugging it. Perhaps 
> dd should have an option to eject the output when done?

That might be useful, but in practice I don't need it: as soon
as dd actually exits, my system detects that something new has
appeared on its USB port and automatically mounts it and opens
a file browser there.  I check that the stick now contains what
I expected to see, and then click unmount in the file browser.


> > /* TRANSLATORS: The translations of the next three msgids should
> > be of ascending length.  That is: each subsequent msgstr should be
> > longer than the preceding one. */
> 
> That's not technically correct, as the msgstr length itself is not 
> directly relevant; what matters are the numbers of bytes and columns in 
> the formatted output.

But how is a translator supposed to know exactly which numbers
can occur with what msgstr?  The translator is not going to change
the PRIuMAX, and all the other things are %s.  What length can they
have?  The translator is not going to analyze the whole program.
And he shouldn't need to.  It should be the program's responsibility
to present its output in a way that it can't get messed up.

> To some extent this translator note is pedantic, as translations are 
> quite likely to have the desired property

If the note is pedantic, then please leave it out.  Such notes are
meant to clarify things, so that the translator doesn't need to look
at the code in order to understand what a message means or in what
context it occurs.  If all the note says is to look at a comment in
the code, this is annoying.  And as that comment is incomprehensible,
it is useless.

But, since now the normal progress message contains the time with
a whole-second resolution, I suggest to split up the message again
and to properly pluralize it.

Depending on the prefix:

    ngettext ("%"PRIuMAX" byte copied",
              "%"PRIuMAX" bytes copied",
              select_plural (w_bytes), ...);
OR
    ngettext ("%"PRIuMAX" byte (%s) copied",
              "%"PRIuMAX" bytes (%s) copied",
              select_plural (w_bytes), ...);
OR
    ngettext ("%"PRIuMAX" byte (%s = %s) copied",
              "%"PRIuMAX" bytes (%s = %s) copied",
              select_plural (w_bytes), ...);

followed by

    ngettext (" in %.0f second -- %s/s",
              " in %.0f seconds -- %s/s",
              select_plural (time_in_second_resolution), ...);

If the entirety of that output can have a varying width
that sometimes gets shorter than it was before, then dd
should first print that output to a string, record its
longest length, and then always fill up the string with
spaces to that length.  dd should do the work, not the
translator.
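
The padding idea in the paragraph above could look roughly like this. It is a sketch only: pad_to_longest is a hypothetical name, it measures length in bytes with strlen, and it is not a description of how dd itself handles this.

```c
#include <stddef.h>
#include <string.h>

/* Pad the NUL-terminated string in MSG (capacity SIZE bytes) with
   spaces up to the longest length seen so far, tracked in *LONGEST.
   Returns the resulting length in bytes.  */
size_t
pad_to_longest (char *msg, size_t size, size_t *longest)
{
  size_t len = strlen (msg);
  if (len > *longest)
    *longest = len;
  /* Append spaces while leaving room for the terminating NUL.  */
  while (len < *longest && len + 1 < size)
    msg[len++] = ' ';
  msg[len] = '\0';
  return len;
}
```

A caller would keep one longest counter for the whole run and pass every progress line through the helper before printing it, so a shorter line always overwrites the remains of a longer one.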

Benno






Information forwarded to bug-coreutils <at> gnu.org:
bug#26621; Package coreutils. (Thu, 27 Apr 2017 21:58:01 GMT)

Message #29 received at 26621 <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Benno Schulenberg <bensberg <at> justemail.net>
Cc: 26621 <at> debbugs.gnu.org
Subject: Re: bug#26621: hint for translators is missing from POT file, but is
 opaque anyhow
Date: Thu, 27 Apr 2017 14:57:28 -0700
On 04/27/2017 10:41 AM, Benno Schulenberg wrote:
> If the note is pedantic, then please leave it out.

OK, done in the attached.


> I suggest to split up the message again and to properly pluralize it.

That would make it harder to translate, no? The idea was to simplify the 
translators' job by using SI units.

Instead, how about if we remove the units from the string to be 
translated? That way, translators won't need to worry about whether to 
translate the units. Plus, there won't be any change to the output 
format in the C locale. I did that in the attached.

> If the entirety of that output can have a varying width
> that sometimes gets shorter than it was before, then dd
> should first print that output to a string, record its
> longest length, and then always fill up the string with
> spaces to that length.

dd already does that. This approach does not work in general, though, 
due to the difference between strlen and column counts. But as its 
problems seem to not occur in practice, the attached patch removes the 
confusing translator note about this.
[0001-dd-simplify-translator-s-jobs.txt (text/plain, attachment)]

Information forwarded to bug-coreutils <at> gnu.org:
bug#26621; Package coreutils. (Mon, 29 Oct 2018 03:08:02 GMT)

Message #32 received at 26621 <at> debbugs.gnu.org:

From: Assaf Gordon <assafgordon <at> gmail.com>
To: 26621 <at> debbugs.gnu.org
Subject: Re: bug#26621: hint for translators is missing from POT file, but is
 opaque anyhow
Date: Sun, 28 Oct 2018 21:07:29 -0600
tags 26621 fixed
close 26621
stop

(triaging old bugs)
Pushed here:
https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=7ea15a57c7a8b876daa3d4d01f1192af3f58f3c7
https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=d1f5616b2d9b851298f377ae7888b2d718802140

So closing as "fixed".

-assaf




Added tag(s) fixed. Request was from Assaf Gordon <assafgordon <at> gmail.com> to control <at> debbugs.gnu.org. (Mon, 29 Oct 2018 03:08:02 GMT)

Bug closed; send any further explanations to 26621 <at> debbugs.gnu.org and Benno Schulenberg <bensberg <at> justemail.net>. Request was from Assaf Gordon <assafgordon <at> gmail.com> to control <at> debbugs.gnu.org. (Mon, 29 Oct 2018 03:08:02 GMT)

Bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Mon, 26 Nov 2018 12:24:05 GMT)
