GNU bug report logs - #45648
`dd` seek/skip which way is up?

Package: coreutils

Reported by: Bela Lubkin <bela.lubkin <at> gmail.com>

Date: Mon, 4 Jan 2021 06:32:02 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.


Report forwarded to bug-coreutils <at> gnu.org:
bug#45648; Package coreutils. (Mon, 04 Jan 2021 06:32:02 GMT)

Acknowledgement sent to Bela Lubkin <bela.lubkin <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Mon, 04 Jan 2021 06:32:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Bela Lubkin <bela.lubkin <at> gmail.com>
To: bug-coreutils <at> gnu.org
Subject: `dd` seek/skip which way is up?
Date: Sun, 3 Jan 2021 19:03:17 -0800
Hello --

I constantly confuse 'seek=N' and 'skip=N'.  The two words have no natural
affinity to one I/O direction or the other.

I previously encountered a `dd` implementation which also accepted
'oseek=N' and 'iseek=N', which I found far more natural and easy to
remember.

Here is a small patch implementing the same for coreutils `dd`.  Patch is
against just-gotten git tree; `dd --version` reports 'dd (coreutils)
8.32.101-ebf2c-dirty'.  (I probably got the .texi formatting wrong; please
repair as needed.)

While in the area, I slightly improved some of the help (and therefore man
page).

>Bela<

========================================================================

diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index e9dd21c4e..417857c5e 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -9100,6 +9100,15 @@ Skip @var{n} @samp{obs}-byte blocks in the output file before copying.
 if @samp{oflag=seek_bytes} is specified, @var{n} is interpreted
 as a byte count rather than a block count.

+@item oseek
+@item iseek
+@opindex oseek
+@opindex iseek
+As the distinction between @samp{seek} and @samp{skip}
+is easily confused, @samp{oseek} is accepted as an alias
+for @samp{seek}; @samp{iseek} for @samp{skip}.
+Do not use these in scripts, as this reduces compatibility.
+
 @item count=@var{n}
 @opindex count
 Copy @var{n} @samp{ibs}-byte blocks from the input file, instead
@@ -9457,6 +9466,15 @@ rather than a block count, which allows specifying
 an offset that is not a multiple of the I/O block size.
 This flag can be used only with @code{oflag}.

+@item oseek_bytes
+@item iseek_bytes
+@opindex oseek_bytes
+@opindex iseek_bytes
+As the distinction between @samp{seek_bytes} and @samp{skip_bytes}
+is easily confused, @samp{oseek_bytes} is accepted as an alias
+for @samp{seek_bytes}; @samp{iseek_bytes} for @samp{skip_bytes}.
+Do not use these in scripts, as this reduces compatibility.
+
 @end table

 These flags are not supported on all systems, and @samp{dd} rejects
diff --git a/src/dd.c b/src/dd.c
index 9152a2550..a187522c2 100644
--- a/src/dd.c
+++ b/src/dd.c
@@ -381,7 +381,9 @@ static struct symbol_value const flags[] =
   {"fullblock",   O_FULLBLOCK}, /* Accumulate full blocks from input.  */
   {"count_bytes", O_COUNT_BYTES},
   {"skip_bytes",  O_SKIP_BYTES},
+  {"iseek_bytes", O_SKIP_BYTES},
   {"seek_bytes",  O_SEEK_BYTES},
+  {"oseek_bytes", O_SEEK_BYTES},
   {"", 0}
 };

@@ -571,7 +573,7 @@ Copy a file, converting and formatting according to the operands.\n\
                   overrides ibs and obs\n\
   cbs=BYTES       convert BYTES bytes at a time\n\
   conv=CONVS      convert the file as per the comma separated symbol list\n\
-  count=N         copy only N input blocks\n\
+  count=N         copy only N input blocks (bytes if iflag=count_bytes)\n\
   ibs=BYTES       read up to BYTES bytes at a time (default: 512)\n\
 "), stdout);
       fputs (_("\
@@ -580,8 +582,8 @@ Copy a file, converting and formatting according to the operands.\n\
   obs=BYTES       write BYTES bytes at a time (default: 512)\n\
   of=FILE         write to FILE instead of stdout\n\
   oflag=FLAGS     write as per the comma separated symbol list\n\
-  seek=N          skip N obs-sized blocks at start of output\n\
-  skip=N          skip N ibs-sized blocks at start of input\n\
+  seek=N (or oseek=N)  skip N obs-sized blocks at start of output (bytes if oflag=seek_bytes)\n\
+  skip=N (or iseek=N)  skip N ibs-sized blocks at start of input (bytes if iflag=skip_bytes)\n\
   status=LEVEL    The LEVEL of information to print to stderr;\n\
                   'none' suppresses everything but error messages,\n\
                   'noxfer' suppresses the final transfer statistics,\n\
@@ -660,10 +662,10 @@ Each FLAG symbol may be:\n\
         fputs (_("  count_bytes  treat 'count=N' as a byte count (iflag only)\n\
 "), stdout);
       if (O_SKIP_BYTES)
-        fputs (_("  skip_bytes  treat 'skip=N' as a byte count (iflag only)\n\
+        fputs (_("  skip_bytes (or iseek_bytes)  treat 'skip=N' as a byte count (iflag only)\n\
 "), stdout);
       if (O_SEEK_BYTES)
-        fputs (_("  seek_bytes  treat 'seek=N' as a byte count (oflag only)\n\
+        fputs (_("  seek_bytes (or oseek_bytes)  treat 'seek=N' as a byte count (oflag only)\n\
 "), stdout);

       {
@@ -1554,9 +1556,11 @@ scanargs (int argc, char *const *argv)
               n_max = SIZE_MAX;
               conversion_blocksize = n;
             }
-          else if (operand_is (name, "skip"))
+          else if (operand_is (name, "skip") ||
+                    operand_is (name, "iseek"))
             skip = n;
-          else if (operand_is (name, "seek"))
+          else if (operand_is (name, "seek") ||
+                   operand_is (name, "oseek"))
             seek = n;
           else if (operand_is (name, "count"))
             count = n;
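
The scanargs hunk above amounts to accepting two spellings for each offset operand. A minimal, self-contained sketch of that idea follows. The names `operand_has_name`, `classify_operand`, and `offset_target` are hypothetical and chosen only for illustration; they are not identifiers from dd.c, and the real operand parsing in dd does more than this.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical identifiers for the two offsets an operand can set.  */
enum offset_target { TARGET_NONE, TARGET_SKIP, TARGET_SEEK };

/* Return true if ARG has the form NAME=VALUE for exactly this NAME.  */
static bool
operand_has_name (char const *arg, char const *name)
{
  size_t len = strlen (name);
  return strncmp (arg, name, len) == 0 && arg[len] == '=';
}

/* Map an operand such as "iseek=4" to the offset it sets.
   Both spellings of each operand land on the same target.  */
static enum offset_target
classify_operand (char const *arg)
{
  if (operand_has_name (arg, "skip") || operand_has_name (arg, "iseek"))
    return TARGET_SKIP;   /* input side */
  if (operand_has_name (arg, "seek") || operand_has_name (arg, "oseek"))
    return TARGET_SEEK;   /* output side */
  return TARGET_NONE;     /* some other operand */
}
```

With this sketch, `classify_operand ("iseek=4")` and `classify_operand ("skip=4")` both yield `TARGET_SKIP`, which is the pure-alias behavior the patch proposes.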

Information forwarded to bug-coreutils <at> gnu.org:
bug#45648; Package coreutils. (Mon, 04 Jan 2021 08:34:01 GMT)

Message #8 received at 45648 <at> debbugs.gnu.org:

From: Andreas Schwab <schwab <at> linux-m68k.org>
To: Bela Lubkin <bela.lubkin <at> gmail.com>
Cc: 45648 <at> debbugs.gnu.org
Subject: Re: bug#45648: `dd` seek/skip which way is up?
Date: Mon, 04 Jan 2021 09:33:13 +0100
On Jan 03 2021, Bela Lubkin wrote:

> diff --git a/doc/coreutils.texi b/doc/coreutils.texi
> index e9dd21c4e..417857c5e 100644
> --- a/doc/coreutils.texi
> +++ b/doc/coreutils.texi
> @@ -9100,6 +9100,15 @@ Skip @var{n} @samp{obs}-byte blocks in the output
> file before copying.
>  if @samp{oflag=seek_bytes} is specified, @var{n} is interpreted
>  as a byte count rather than a block count.
>
> +@item oseek
> +@item iseek

The second @item needs to be @itemx.

Andreas.

-- 
Andreas Schwab, schwab <at> linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."




Information forwarded to bug-coreutils <at> gnu.org:
bug#45648; Package coreutils. (Mon, 04 Jan 2021 23:08:01 GMT)

Message #11 received at 45648 <at> debbugs.gnu.org:

From: Bernhard Voelker <mail <at> bernhard-voelker.de>
To: Bela Lubkin <bela.lubkin <at> gmail.com>, 45648 <at> debbugs.gnu.org
Subject: Re: bug#45648: `dd` seek/skip which way is up?
Date: Tue, 5 Jan 2021 00:07:39 +0100
On 1/4/21 4:03 AM, Bela Lubkin wrote:
> I constantly confuse 'seek=N' and 'skip=N'.  The two words have no natural
> affinity to one I/O direction or the other.

While the words 'seek' and 'skip' may not be strong enough for everyone
to be clear about whether they apply to input or output - e.g. for non-native
English speakers like myself - they are well documented in usage() and in other places:

  $ dd --help | grep -E ' (skip|seek)=N '
    seek=N          skip N obs-sized blocks at start of output
    skip=N          skip N ibs-sized blocks at start of input
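
For illustration, the block-versus-byte interpretation described in this help text, together with the documented `seek_bytes`/`skip_bytes` flags, could be computed roughly as below. The helper `offset_in_bytes` is a hypothetical name, not a coreutils function, and the overflow check is an assumption about sensible behavior rather than a description of what dd actually does.

```c
#include <stdbool.h>
#include <stdint.h>

/* Convert an operand value N into a byte offset.  N counts blocks of
   BLOCK_SIZE bytes (ibs for skip, obs for seek) unless AS_BYTES is set,
   in which case N is already a byte count.
   Returns false if the multiplication would overflow.  */
static bool
offset_in_bytes (uintmax_t n, uintmax_t block_size, bool as_bytes,
                 uintmax_t *result)
{
  if (as_bytes)
    {
      *result = n;
      return true;
    }
  if (block_size != 0 && n > UINTMAX_MAX / block_size)
    return false;
  *result = n * block_size;
  return true;
}
```

Under these assumptions, `skip=3` with `ibs=512` corresponds to 1536 bytes of input, while the same value with the byte-count flag set corresponds to 3 bytes.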

FWIW these terms are required by POSIX:

  https://pubs.opengroup.org/onlinepubs/9699919799/utilities/dd.html

> I previously encountered a `dd` implementation which also accepted
> 'oseek=N' and 'iseek=N', which I found far more natural and easy to
> remember.

What 'dd' implementation was this specifically?

> Here is a small patch implementing the same for coreutils `dd`.

In my opinion: if the word chosen for an option is not clear enough
to distinguish from another one, then adding yet another alias would
just increase confusion.

Options added to coreutils programs have to be chosen carefully.
The only reason I'd see to add such an alias would be existing
behavior in one of the other major implementations.

Have a nice day,
Berny




Information forwarded to bug-coreutils <at> gnu.org:
bug#45648; Package coreutils. (Tue, 05 Jan 2021 02:07:02 GMT)

Message #14 received at 45648 <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Bernhard Voelker <mail <at> bernhard-voelker.de>,
 Bela Lubkin <bela.lubkin <at> gmail.com>, 45648 <at> debbugs.gnu.org
Subject: Re: bug#45648: `dd` seek/skip which way is up?
Date: Mon, 4 Jan 2021 18:06:37 -0800
On 1/4/21 3:07 PM, Bernhard Voelker wrote:
>> I previously encountered a `dd` implementation which also accepted
>> 'oseek=N' and 'iseek=N', which I found far more natural and easy to
>> remember.
> What 'dd' implementation was this specifically?

Solaris dd has iseek and oseek. However, they are not aliases for skip 
and seek. If coreutils dd were to add these features I expect we should 
do them the Solaris way, instead of making them aliases for skip and 
seek. This would take more work than the proposed patches.

https://docs.oracle.com/cd/E36784_01/html/E36871/dd-1m.html




Information forwarded to bug-coreutils <at> gnu.org:
bug#45648; Package coreutils. (Tue, 05 Jan 2021 03:29:01 GMT)

Message #17 received at 45648 <at> debbugs.gnu.org:

From: Bernhard Voelker <mail <at> bernhard-voelker.de>
To: Paul Eggert <eggert <at> cs.ucla.edu>, Bela Lubkin <bela.lubkin <at> gmail.com>,
 45648 <at> debbugs.gnu.org
Subject: Re: bug#45648: `dd` seek/skip which way is up?
Date: Tue, 5 Jan 2021 04:28:15 +0100
On 1/5/21 3:06 AM, Paul Eggert wrote:
> On 1/4/21 3:07 PM, Bernhard Voelker wrote:
>> What 'dd' implementation was this specifically?
> 
> Solaris dd has iseek and oseek. However, they are not aliases for skip 
> and seek. If coreutils dd were to add these features I expect we should 
> do them the Solaris way, instead of making them aliases for skip and 
> seek. This would take more work than the proposed patches.
> 
> https://docs.oracle.com/cd/E36784_01/html/E36871/dd-1m.html

That would make the situation even more confusing for the user
... and more complex because such an implementation would interfere
with GNU dd's seek/skip and iflag=skip_bytes and oflag=seek_bytes
functionality.  Doesn't sound like a good idea.

Have a nice day,
Berny




Information forwarded to bug-coreutils <at> gnu.org:
bug#45648; Package coreutils. (Tue, 05 Jan 2021 03:45:02 GMT)

Message #20 received at 45648 <at> debbugs.gnu.org:

From: Bela Lubkin <bela.lubkin <at> gmail.com>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: Bernhard Voelker <mail <at> bernhard-voelker.de>, 45648 <at> debbugs.gnu.org
Subject: Re: bug#45648: `dd` seek/skip which way is up?
Date: Mon, 4 Jan 2021 19:44:11 -0800
TLDR: *huge* existing presence of 'iseek' and 'oseek'; most OSes document
them as pure synonyms for 'skip' and 'seek'.

====

The implementation where I encountered it was SCO OpenServer.  Like
Solaris, there was a distinction between 'iseek' and 'skip' ('skip' reads,
'iseek' seeks); no distinction between 'oseek' and 'seek'.

I consulted with freebsd.org/cgi/man.cgi?query=dd -- this shows that *many*
OSes support these keywords.  The current default display is FreeBSD 12.2,
which says:

'iseek=n  Seek on the input file n blocks. This is synonymous with skip=n.'
'oseek=n  Seek on the output file n blocks. This is synonymous with seek=n.'

Identical text exists since FreeBSD 4.0 (2000-03); Darwin 5.0.1; HP-UX
11.1; NetBSD 6.0; DEC OSF/1 4.0.  These are *ancient* OSes.

IRIX 6.5.30 actually documents 'seek' as 'Identical to oseek, retained for
backward compatibility.', i.e. 'oseek' is the real flag in this man page's
mind.

The man pages from Plan 9 & Inferno 4th edition (AT&T research OSes)
document 'skip', 'iseek', 'oseek', but not 'seek' at all!

Regarding the actual implementation, being able to manually control seeking
vs. actually doing useless I/O does not seem useful to me in 2021.  The
distinction exist(ed) for the benefit of things like tape drives, which of
course do still exist.  But back then, information about what was or was
not seekable was poorly plumbed up from drivers to userland.  Today, it
should be clear whether a file (whatever its fundamental implementation is)
is, or is not, seekable; `dd` should always attempt to seek if possible,
slog through the corresponding I/O only if the underlying file cannot seek.

In fact, the pointed-to Open Group specification precisely supports that
position:

'skip' says, 'Skip n input blocks ... On seekable files, ... read the
blocks or seek past them; on non-seekable files, ... read and ...
[discard]';

'seek' says, 'Skip n [output] blocks ... On non-seekable files, [read]
existing blocks ...; on seekable files, ... seek ... or read ...'

i.e. 'do I/O if not seekable; implementer's choice if seekable'.
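
The "seek if possible, read if not" policy argued for here can be sketched in C as follows. This is only an illustration under stated assumptions: `skip_input` is a hypothetical helper, not code from coreutils, and it ignores cases a real dd must handle (for example, seeking past the end of a regular file succeeds silently).

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* Advance FD by NBYTES: seek when the descriptor supports it, otherwise
   read and discard.  Returns 0 on success, -1 on error or early EOF.
   If SEEKED is non-null, *SEEKED reports which strategy was used.  */
static int
skip_input (int fd, off_t nbytes, bool *seeked)
{
  if (lseek (fd, nbytes, SEEK_CUR) != (off_t) -1)
    {
      if (seeked)
        *seeked = true;
      return 0;
    }
  if (errno != ESPIPE)
    return -1;            /* a real error, not merely "cannot seek" */

  /* Not seekable (pipe, FIFO, socket): slog through the data instead.  */
  if (seeked)
    *seeked = false;
  char buf[4096];
  while (nbytes > 0)
    {
      size_t want = nbytes < (off_t) sizeof buf ? (size_t) nbytes : sizeof buf;
      ssize_t got = read (fd, buf, want);
      if (got < 0)
        {
          if (errno == EINTR)
            continue;     /* interrupted by a signal; retry */
          return -1;
        }
      if (got == 0)
        return -1;        /* input ended before NBYTES were skipped */
      nbytes -= got;
    }
  return 0;
}
```

On a pipe, `lseek` fails with `ESPIPE` and the helper falls back to reading; on a regular file it seeks and performs no I/O. That matches the reading of the Open Group text quoted above: I/O is required only when the file cannot seek.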

The Solaris page is the only one where there is a possible implication that
'oseek' is different from 'seek', but only because the 'oseek' description
is vestigial.  (Exact same text persists from Solaris 2.5.1 through the
11.2 pointed to above.)

Should coreutils `dd` insist that if one uses 'oseek' and the file isn't
seekable, it should fail?  This violates least surprise.  'iseek' and
'oseek' should seek if possible, read if not.  Whereas 'skip' and 'seek'
*may* seek if possible, read if not.  This distinction is uninteresting
since the implementation *should* take advantage of the *may*.

Both the Solaris and Open Group man pages describe 'seek' as 'Skip[s] n
blocks', again showing that the words are not at all bound to a particular
direction.

>Bela<

On Mon, Jan 4, 2021 at 6:06 PM Paul Eggert <eggert <at> cs.ucla.edu> wrote:

> On 1/4/21 3:07 PM, Bernhard Voelker wrote:
> >> I previously encountered a `dd` implementation which also accepted
> >> 'oseek=N' and 'iseek=N', which I found far more natural and easy to
> >> remember.
> > What 'dd' implementation was this specifically?
>
> Solaris dd has iseek and oseek. However, they are not aliases for skip
> and seek. If coreutils dd were to add these features I expect we should
> do them the Solaris way, instead of making them aliases for skip and
> seek. This would take more work than the proposed patches.
>
> https://docs.oracle.com/cd/E36784_01/html/E36871/dd-1m.html
>

Information forwarded to bug-coreutils <at> gnu.org:
bug#45648; Package coreutils. (Tue, 05 Jan 2021 04:09:02 GMT)

Message #23 received at 45648 <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Bela Lubkin <bela.lubkin <at> gmail.com>
Cc: Bernhard Voelker <mail <at> bernhard-voelker.de>, 45648 <at> debbugs.gnu.org
Subject: Re: bug#45648: `dd` seek/skip which way is up?
Date: Mon, 4 Jan 2021 20:08:36 -0800
On 1/4/21 7:44 PM, Bela Lubkin wrote:
> TLDR: *huge* existing presence of 'iseek' and 'oseek'; most OSes document
> them as pure synonyms for 'skip' and 'seek'.

Thanks for doing all that research. It's compelling, and I think your 
patch (or something like it) should go in. I'll wait for a bit to hear 
other opinions.




Reply sent to Paul Eggert <eggert <at> cs.ucla.edu>:
You have taken responsibility. (Tue, 22 Feb 2022 17:13:01 GMT)

Notification sent to Bela Lubkin <bela.lubkin <at> gmail.com>:
bug acknowledged by developer. (Tue, 22 Feb 2022 17:13:01 GMT)

Message #28 received at 45648-done <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Bela Lubkin <bela.lubkin <at> gmail.com>
Cc: 45648-done <at> debbugs.gnu.org
Subject: Re: bug#45648: `dd` seek/skip which way is up?
Date: Tue, 22 Feb 2022 09:12:42 -0800
On 1/4/21 20:08, Paul Eggert wrote:
> On 1/4/21 7:44 PM, Bela Lubkin wrote:
>> TLDR: *huge* existing presence of 'iseek' and 'oseek'; most OSes document
>> them as pure synonyms for 'skip' and 'seek'.
> 
> Thanks for doing all that research. It's compelling, and I think your 
> patch (or something like it) should go in. I'll wait for a bit to hear 
> other opinions.

After thinking about the patch a bit more, let's omit the part about 
adding new conversions iseek_bytes etc., as I think there's a better way 
to address that issue. I proposed something in <https://bugs.gnu.org/54112>.

So instead of your patch, I installed the attached patches. The first 
one adds the iseek and oseek operands that you suggested; the second one 
clarifies dd documentation, as I found several things were confusing 
when rereading it carefully. Something like these patches should appear 
in the next coreutils release.
[0001-dd-support-iseek-and-oseek.patch (text/x-patch, attachment)]
[0002-dd-improve-doc-relative-to-POSIX.patch (text/x-patch, attachment)]

Information forwarded to bug-coreutils <at> gnu.org:
bug#45648; Package coreutils. (Thu, 24 Feb 2022 13:19:01 GMT)

Message #31 received at 45648 <at> debbugs.gnu.org:

From: Pádraig Brady <P <at> draigBrady.com>
To: 45648 <at> debbugs.gnu.org, eggert <at> cs.ucla.edu, bela.lubkin <at> gmail.com
Subject: Re: bug#45648: `dd` seek/skip which way is up?
Date: Thu, 24 Feb 2022 13:18:11 +0000
On 22/02/2022 17:12, Paul Eggert wrote:
> On 1/4/21 20:08, Paul Eggert wrote:
>> On 1/4/21 7:44 PM, Bela Lubkin wrote:
>>> TLDR: *huge* existing presence of 'iseek' and 'oseek'; most OSes document
>>> them as pure synonyms for 'skip' and 'seek'.
>>
>> Thanks for doing all that research. It's compelling, and I think your
>> patch (or something like it) should go in. I'll wait for a bit to hear
>> other opinions.
> 
> After thinking about the patch a bit more, let's omit the part about
> adding new conversions iseek_bytes etc., as I think there's a better way
> to address that issue. I proposed something in <https://bugs.gnu.org/54112>.
> 
> So instead of your patch, I installed the attached patches. The first
> one adds the iseek and oseek operands that you suggested; the second one
> clarifies dd documentation, as I found several things were confusing
> when rereading it carefully. Something like these patches should appear
> in the next coreutils release.

+1

The aliases are useful.
I always remembered it like skIp for Input,
but that is awkward.

As for the overlap in Solaris with disabling reading,
I think that would be better as a flag, like "seek_only",
if deemed useful.

thanks,
Pádraig




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Fri, 25 Mar 2022 11:24:07 GMT)

This bug report was last modified 2 years and 32 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.