GNU bug report logs - #8419
cp -au : New hard links in source becomes new files at destination when using cp -au

Previous Next

Package: coreutils;

Reported by: Odd.Harry.Mannsverk <at> benteler-alu.com

Date: Mon, 4 Apr 2011 10:49:01 UTC

Severity: normal

Done: Jim Meyering <jim <at> meyering.net>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 8419 in the body.
You can then email your comments to 8419 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#8419; Package coreutils. (Mon, 04 Apr 2011 10:49:01 GMT) Full text and rfc822 format available.

Acknowledgement sent to Odd.Harry.Mannsverk <at> benteler-alu.com:
New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Mon, 04 Apr 2011 10:49:01 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Odd.Harry.Mannsverk <at> benteler-alu.com
To: bug-coreutils <at> gnu.org
Cc: Hans.Kristian.Nordengen <at> ergo.no
Subject: cp -au : New hard links in source becomes new files at destination
	when using cp -au
Date: Mon, 4 Apr 2011 11:29:00 +0200
Hi

I have tried to use the command cp combining the -a and the -u options.
I had to stop the copying process midways and restarted it again, and to my
suprice the diskusage at the destination was 10 -20 % larger than the
diskusage at the source and my disks ran full even though the destination
disks was the same size as the source disks.

My disks contained rsync snapshot backup data so every inod had typically
15 hard links to it.
I found out that when I ran cp -au to update the target, cp -au creates new
files in the destination when there is a new hard link in the source to a
file already existing in the destination directory.
This is not the behaviour I expected.

I got the same behavior on:
SUSE Linux Enterprise Server 11 (x86_64)
and
Ubuntu 10.04

I have included a series of commands to:
    create a source directory with hard linked files,
    make folder target, and copy the directory to this folder
    add a new hard link in the source folder
    update the target folder with the command cp -au
    listing the directory contents of source and target to show that the
source contains 3 hard links to the same file, in the target directory
hardlink2 has become a separate identical file.

If you disagree with me that this is not the desired behaviour of cp -au,
please let me know why.
If you agree, I hope the behaviour of cp -au will change some time in the
future.


regards
Odd Harry Mannsverk



> mkdir source
> touch source/file1
> ln  source/file1 source/hardlink1
> mkdir target
> cp -a source target
> ln  source/file1 source/hardlink2
> cp -au source target
> ls -1iR source
source:
229418 file1
229418 hardlink1
229418 hardlink2
> ls -1iR target
target:
229414 source

target/source:
229415 file1
229415 hardlink1
229416 hardlink2
>





Reply sent to Jim Meyering <jim <at> meyering.net>:
You have taken responsibility. (Mon, 25 Jul 2011 11:43:01 GMT) Full text and rfc822 format available.

Notification sent to Odd.Harry.Mannsverk <at> benteler-alu.com:
bug acknowledged by developer. (Mon, 25 Jul 2011 11:43:02 GMT) Full text and rfc822 format available.

Message #10 received at 8419-done <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Odd.Harry.Mannsverk <at> benteler-alu.com
Cc: 8419-done <at> debbugs.gnu.org, Hans.Kristian.Nordengen <at> ergo.no
Subject: Re: bug#8419: cp -au : New hard links in source becomes new files at
	destination when using cp -au
Date: Mon, 25 Jul 2011 13:42:24 +0200
Odd.Harry.Mannsverk <at> benteler-alu.com wrote:
> I have tried to use the command cp combining the -a and the -u options.
> I had to stop the copying process midways and restarted it again, and to my
> suprice the diskusage at the destination was 10 -20 % larger than the
> diskusage at the source and my disks ran full even though the destination
> disks was the same size as the source disks.
...

Thank you for a fine bug report.
That is indeed a bug, and it affects the latest release, coreutils-8.12.
I confirmed that it afflicts fileutils-3.16 too, so this bug has probably
been present since the initial implementation.

Here's the fix I expect to push:

From 3095daab7f6d7980b77e01d97d75e702ce4a2e63 Mon Sep 17 00:00:00 2001
From: Jim Meyering <meyering <at> redhat.com>
Date: Mon, 25 Jul 2011 11:31:01 +0200
Subject: [PATCH] cp -up: preserve all hard links

* src/copy.c (copy_internal): With --update (-u), this function would
return early once it found that the destination is not older than the
source, *without* recording the source-dev/ino--to--dest_name mapping.
That mapping is required in order to preserve src hard links in the
destination tree, so when using cp with --update and --preserve=links
(perhaps via -p or -a), cp could fail to preserve one hard link
per inode when at least one of the hard-linked names already exists
in the destination tree.
Reported by Odd Harry Mannsverk in http://debbugs.gnu.org/8419.
* tests/cp/preserve-link: New file.  Exercise the flaw/fix.
* tests/Makefile.am (TESTS): Add it.
* NEWS (Bug fixes): Mention it.
---
 NEWS                   |    6 ++++++
 src/copy.c             |   12 ++++++++++++
 tests/Makefile.am      |    1 +
 tests/cp/preserve-link |   40 ++++++++++++++++++++++++++++++++++++++++
 4 files changed, 59 insertions(+), 0 deletions(-)
 create mode 100755 tests/cp/preserve-link

diff --git a/NEWS b/NEWS
index 0720719..416060f 100644
--- a/NEWS
+++ b/NEWS
@@ -8,6 +8,12 @@ GNU coreutils NEWS                                    -*- outline -*-
   I.E. for skipped files, the original ownership is output, not the new one.
   [bug introduced in sh-utils-2.0g]

+  cp -u -p would fail to preserve one hard link for each up-to-date copy
+  of a src-hard-linked name in the destination tree.  I.e., if s/a and s/b
+  are hard-linked and dst/s/a is up to date, "cp -up s dst" would copy s/b
+  to dst/s/b rather than simply linking dst/s/b to dst/s/a.
+  [This bug appears to have been present in "the beginning".]
+
   printf '%d' '"' no longer accesses out-of-bounds memory in the diagnostic.
   [bug introduced in sh-utils-1.16]

diff --git a/src/copy.c b/src/copy.c
index c17b942..df8b1db 100644
--- a/src/copy.c
+++ b/src/copy.c
@@ -1628,6 +1628,17 @@ copy_internal (char const *src_name, char const *dst_name,
                      end up removing the source file.  */
                   if (rename_succeeded)
                     *rename_succeeded = true;
+
+                  /* However, we still must record that we've processed
+                     this src/dest pair, in case this source file is
+                     hard-linked to another one.  In that case, we'll use
+                     the mapping information to link the corresponding
+                     destination names.  */
+                  earlier_file = remember_copied (dst_name, src_sb.st_ino,
+                                                  src_sb.st_dev);
+                  if (earlier_file)
+                    goto create_hard_link;
+
                   return true;
                 }
             }
@@ -1948,6 +1959,7 @@ copy_internal (char const *src_name, char const *dst_name,
         }
       else
         {
+        create_hard_link:;
           /* We want to guarantee that symlinks are not followed.  */
           bool link_failed = (linkat (AT_FDCWD, earlier_file, AT_FDCWD,
                                       dst_name, 0) != 0);
diff --git a/tests/Makefile.am b/tests/Makefile.am
index ebd1b11..0a83dae 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -341,6 +341,7 @@ TESTS =						\
   cp/parent-perm-race				\
   cp/perm					\
   cp/preserve-2					\
+  cp/preserve-link				\
   cp/preserve-slink-time			\
   cp/proc-short-read				\
   cp/proc-zero-len				\
diff --git a/tests/cp/preserve-link b/tests/cp/preserve-link
new file mode 100755
index 0000000..d0da873
--- /dev/null
+++ b/tests/cp/preserve-link
@@ -0,0 +1,40 @@
+#!/bin/sh
+# Exercise the fix for http://debbugs.gnu.org/8419
+
+# Copyright (C) 2011 Free Software Foundation, Inc.
+
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+. "${srcdir=.}/init.sh"; path_prepend_ ../src
+print_ver_ cp
+
+same_inode()
+{
+  local u v
+  u=$(stat --format %i "$1") &&
+    v=$(stat --format %i "$2") && test "$u" = "$v"
+}
+
+mkdir -p s t/s || framework_failure_
+touch s/f t/s/f || framework_failure_
+ln s/f s/link || framework_failure_
+
+# This must create a hard link, t/s/link, to the existing file, t/s/f.
+# With cp from coreutils-8.12 and prior, it would mistakenly copy
+# the file rather than creating the link.
+cp -au s t || fail=1
+
+same_inode t/s/f t/s/link || fail=1
+
+Exit $fail
--
1.7.6.609.gbf6a9




Message #11 received at 8419-done <at> debbugs.gnu.org (full text, mbox):

From: "Voelker, Bernhard" <bernhard.voelker <at> siemens-enterprise.com>
To: Jim Meyering <jim <at> meyering.net>, "Odd.Harry.Mannsverk <at> benteler-alu.com"
	<Odd.Harry.Mannsverk <at> benteler-alu.com>
Cc: "8419-done <at> debbugs.gnu.org" <8419-done <at> debbugs.gnu.org>,
	"Hans.Kristian.Nordengen <at> ergo.no" <Hans.Kristian.Nordengen <at> ergo.no>
Subject: RE: bug#8419: cp -au : New hard links in source becomes new files
	at	destination when using cp -au
Date: Mon, 25 Jul 2011 15:18:13 +0200
Jim Meyering wrote:
+        create_hard_link:;

Is there a specific reason for the additional empty statement?

Also seen in:
  find . -name '*.c' | xargs grep ':;'
  ./src/head.c:    free_mem:;
  ./src/kill.c: no_more_options:;
  ./src/od.c:cleanup:;
  ./src/paste.c: done:;
  ./src/printf.c:        no_more_flag_characters:;

Berny



Message #12 received at 8419-done <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: "Voelker\, Bernhard" <bernhard.voelker <at> siemens-enterprise.com>
Cc: "8419-done <at> debbugs.gnu.org" <8419-done <at> debbugs.gnu.org>,
	"Hans.Kristian.Nordengen <at> ergo.no" <Hans.Kristian.Nordengen <at> ergo.no>,
	"Odd.Harry.Mannsverk <at> benteler-alu.com"
	<Odd.Harry.Mannsverk <at> benteler-alu.com>
Subject: Re: bug#8419: cp -au : New hard links in source becomes new files
	at	destination when using cp -au
Date: Mon, 25 Jul 2011 15:20:45 +0200
Voelker, Bernhard wrote:
> Jim Meyering wrote:
> +        create_hard_link:;
>
> Is there a specific reason for the additional empty statement?
>
> Also seen in:
>   find . -name '*.c' | xargs grep ':;'
>   ./src/head.c:    free_mem:;
>   ./src/kill.c: no_more_options:;
>   ./src/od.c:cleanup:;
>   ./src/paste.c: done:;
>   ./src/printf.c:        no_more_flag_characters:;

Yes.
It won't compile without that, since the thing right after
it is a declaration, and not a statement.




Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#8419; Package coreutils. (Mon, 25 Jul 2011 13:39:02 GMT) Full text and rfc822 format available.

Message #15 received at 8419 <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: 8419 <at> debbugs.gnu.org, jim <at> meyering.net
Subject: Re: bug#8419: cp -au : New hard links in source becomes new files
	at	destination when using cp -au
Date: Mon, 25 Jul 2011 14:35:39 +0100
[Message part 1 (text/plain, inline)]
On 25/07/11 12:42, Jim Meyering wrote:
> Odd.Harry.Mannsverk <at> benteler-alu.com wrote:
>> I have tried to use the command cp combining the -a and the -u options.
>> I had to stop the copying process midways and restarted it again, and to my
>> suprice the diskusage at the destination was 10 -20 % larger than the
>> diskusage at the source and my disks ran full even though the destination
>> disks was the same size as the source disks.
> ...
> 
> Thank you for a fine bug report.
> That is indeed a bug, and it affects the latest release, coreutils-8.12.
> I confirmed that it afflicts fileutils-3.16 too, so this bug has probably
> been present since the initial implementation.
> 
> Here's the fix I expect to push:

That looks good.
Should we fold the attached patch in too?
It refactors out a create_hard_link() function,
so we wouldn't be adding any new labels.

cheers,
Pádraig.
[create_hl.diff (text/x-patch, attachment)]

Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#8419; Package coreutils. (Mon, 25 Jul 2011 14:01:01 GMT) Full text and rfc822 format available.

Message #18 received at 8419 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Pádraig Brady <P <at> draigBrady.com>
Cc: 8419 <at> debbugs.gnu.org
Subject: Re: bug#8419: cp -au : New hard links in source becomes new files
	at	destination when using cp -au
Date: Mon, 25 Jul 2011 16:00:37 +0200
Pádraig Brady wrote:
> On 25/07/11 12:42, Jim Meyering wrote:
>> Odd.Harry.Mannsverk <at> benteler-alu.com wrote:
>>> I have tried to use the command cp combining the -a and the -u options.
>>> I had to stop the copying process midways and restarted it again, and to my
>>> suprice the diskusage at the destination was 10 -20 % larger than the
>>> diskusage at the source and my disks ran full even though the destination
>>> disks was the same size as the source disks.
>> ...
>>
>> Thank you for a fine bug report.
>> That is indeed a bug, and it affects the latest release, coreutils-8.12.
>> I confirmed that it afflicts fileutils-3.16 too, so this bug has probably
>> been present since the initial implementation.
>>
>> Here's the fix I expect to push:
>
> That looks good.
> Should we fold the attached patch in too?
> It refactors out a create_hard_link() function,
> so we wouldn't be adding any new labels.

Thanks for the review.
Factoring out that function is definitely worth doing.
However, I'd prefer to keep that clean-up change separate
from the bug-fixing one.  Would you please add a comment for
the new function?  Maybe something as simple as this:

/* Create a hard link DST_NAME to SRC_NAME, honoring the REPLACE and
   VERBOSE settings.  Return true upon success.  Otherwise, diagnose
   the failure and return false.  */

> diff --git a/src/copy.c b/src/copy.c
> index c17b942..95d05f0 100644
> --- a/src/copy.c
> +++ b/src/copy.c
> @@ -1488,6 +1488,37 @@ restore_default_fscreatecon_or_die (void)
>             _("failed to restore the default file creation context"));
>  }
>
> +static bool
> +create_hard_link (char const *src_name, char const *dst_name,
> +                  bool replace, bool verbose)




Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#8419; Package coreutils. (Mon, 25 Jul 2011 14:04:01 GMT) Full text and rfc822 format available.

Message #21 received at 8419 <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: Jim Meyering <jim <at> meyering.net>
Cc: 8419 <at> debbugs.gnu.org
Subject: Re: bug#8419: cp -au : New hard links in source becomes new files
	at	destination when using cp -au
Date: Mon, 25 Jul 2011 15:00:29 +0100
On 25/07/11 15:00, Jim Meyering wrote:
> Thanks for the review.
> Factoring out that function is definitely worth doing.
> However, I'd prefer to keep that clean-up change separate
> from the bug-fixing one.  Would you please add a comment for
> the new function?  Maybe something as simple as this:
> 
> /* Create a hard link DST_NAME to SRC_NAME, honoring the REPLACE and
>    VERBOSE settings.  Return true upon success.  Otherwise, diagnose
>    the failure and return false.  */

OK I'll cleanup and apply the refactoring after yours.

cheers,
Pádraig.




Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#8419; Package coreutils. (Mon, 25 Jul 2011 16:30:04 GMT) Full text and rfc822 format available.

Message #24 received at 8419 <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: Jim Meyering <jim <at> meyering.net>
Cc: 8419 <at> debbugs.gnu.org
Subject: Re: bug#8419: cp -au : New hard links in source becomes new files
	at	destination when using cp -au
Date: Mon, 25 Jul 2011 17:26:53 +0100
Actually I'm wondering now whether the new code
should be unconditionally replacing the dest?
What if the dest is a separate newer file?
I.E. I think the following amended test should pass?
Also what about backups if the separate file is older?

diff --git a/tests/cp/preserve-link b/tests/cp/preserve-link
index d0da873..cee5ec0 100755
--- a/tests/cp/preserve-link
+++ b/tests/cp/preserve-link
@@ -28,13 +28,26 @@ same_inode()

 mkdir -p s t/s || framework_failure_
 touch s/f t/s/f || framework_failure_
+
+# a non existing link must be linked in the dest tree
 ln s/f s/link || framework_failure_

+# an existing link must be updated
+ln s/f s/linke || framework_failure_
+ln t/s/f t/s/linke || framework_failure_
+
+# a updated link must not be overwritten
+ln s/f s/linku || framework_failure_
+touch -d "+2 hour" t/s/linku || framework_failure_
+
 # This must create a hard link, t/s/link, to the existing file, t/s/f.
 # With cp from coreutils-8.12 and prior, it would mistakenly copy
 # the file rather than creating the link.
+touch -d "+1 hour" s/f || framework_failure_
 cp -au s t || fail=1

 same_inode t/s/f t/s/link || fail=1
+same_inode t/s/f t/s/linke || fail=1
+same_inode t/s/f t/s/linku && fail=1

 Exit $fail




Message #25 received at 8419-done <at> debbugs.gnu.org (full text, mbox):

From: "Voelker, Bernhard" <bernhard.voelker <at> siemens-enterprise.com>
To: Jim Meyering <jim <at> meyering.net>
Cc: "8419-done <at> debbugs.gnu.org" <8419-done <at> debbugs.gnu.org>,
	"Hans.Kristian.Nordengen <at> ergo.no" <Hans.Kristian.Nordengen <at> ergo.no>,
	"Odd.Harry.Mannsverk <at> benteler-alu.com"
	<Odd.Harry.Mannsverk <at> benteler-alu.com>
Subject: RE: bug#8419: cp -au : New hard links in source becomes new files
	at	destination when using cp -au
Date: Mon, 25 Jul 2011 20:01:29 +0200
Jim Meyering wrote:

>Voelker, Bernhard wrote:
>> Jim Meyering wrote:
>> +        create_hard_link:;
>>
>> Is there a specific reason for the additional empty statement?
>>
>> Also seen in:
>>   find . -name '*.c' | xargs grep ':;'
>>   ./src/head.c:    free_mem:;
>>   ./src/kill.c: no_more_options:;
>>   ./src/od.c:cleanup:;
>>   ./src/paste.c: done:;
>>   ./src/printf.c:        no_more_flag_characters:;
>
> Yes.
> It won't compile without that, since the thing right after
> it is a declaration, and not a statement.

ah, yes, there's a bool following - it's not my best day.

But it's not necessary in the other cases.
I've sent a patch for this in a separate mail to coreutils <at> gnu.org.


Have a nice day,
Berny





Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#8419; Package coreutils. (Tue, 26 Jul 2011 00:37:02 GMT) Full text and rfc822 format available.

Message #28 received at 8419 <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: Jim Meyering <jim <at> meyering.net>
Cc: 8419 <at> debbugs.gnu.org
Subject: Re: bug#8419: cp -au : New hard links in source becomes new files
	at	destination when using cp -au
Date: Tue, 26 Jul 2011 01:33:32 +0100
[Message part 1 (text/plain, inline)]
On 25/07/11 17:26, Pádraig Brady wrote:
> Actually I'm wondering now whether the new code
> should be unconditionally replacing the dest?
> What if the dest is a separate newer file?
> I.E. I think the following amended test should pass?
> Also what about backups if the separate file is older?

Well backups take a different path, as do
older or non existing destination paths.
So how about this change to just remove
the new create_hard_link: call and beef up the test?

cheers,
Pádraig.
[cp-update-link-new.diff (text/x-patch, attachment)]

Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#8419; Package coreutils. (Tue, 26 Jul 2011 09:37:02 GMT) Full text and rfc822 format available.

Message #31 received at 8419 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Pádraig Brady <P <at> draigBrady.com>
Cc: 8419 <at> debbugs.gnu.org
Subject: Re: bug#8419: cp -au : New hard links in source becomes new files
	at	destination when using cp -au
Date: Tue, 26 Jul 2011 11:36:48 +0200
Pádraig Brady wrote:
> On 25/07/11 17:26, Pádraig Brady wrote:
>> Actually I'm wondering now whether the new code
>> should be unconditionally replacing the dest?
>> What if the dest is a separate newer file?
>> I.E. I think the following amended test should pass?
>> Also what about backups if the separate file is older?
>
> Well backups take a different path, as do
> older or non existing destination paths.
> So how about this change to just remove
> the new create_hard_link: call and beef up the test?

Hi Pádraig,

The link-creation code there cannot be removed.
Consider this scenario:

  src/{a,b}  # a and b are linked
  dst/src/a

If cp -au encounters src/a first, then the new "remember_copied"
call records the required info so when it encounters src/b it can
make the desired link from the preexisting dst/src/a to dst/src/b.
In that case, the link-creation code is indeed unnecessary.

However, what if it encounters src/b first?
In that case, it must copy src/b to dst/src/b (new inode!)
Then, when it encounters src/a, it must *remove* the preexisting dst/src/a
and replace it with a hard link to dst/src/b.
As an alternative, cp could conceivably remove the just-copied dst/src/b,
replacing it with a hard-link to dst/src/a.

The trouble I have with all of this is that we can see two different
outcomes, depending on the order in which "a" and "b" (in the src
directory) are processed by cp.  In one case, the preexisting and
up-to-date dst/src/a is not modified.  In the other, it is removed,
and replaced by a hard-link to potentially-differing-content "dst/src/b".

Test comments below.

> diff --git a/src/copy.c b/src/copy.c
> index df8b1db..d6a0d1a 100644
> --- a/src/copy.c
> +++ b/src/copy.c
> @@ -1633,11 +1633,11 @@ copy_internal (char const *src_name, char const *dst_name,
>                       this src/dest pair, in case this source file is
>                       hard-linked to another one.  In that case, we'll use
>                       the mapping information to link the corresponding
> -                     destination names.  */
> -                  earlier_file = remember_copied (dst_name, src_sb.st_ino,
> -                                                  src_sb.st_dev);
> -                  if (earlier_file)
> -                    goto create_hard_link;
> +                     destination names.  Note we don't hard link DST_NAME
> +                     here, because it may be a separate file with newer
> +                     or same timestamp. If it's older than SRC_NAME,
> +                     then this path is not taken.  */
> +                  remember_copied (dst_name, src_sb.st_ino, src_sb.st_dev);
...

> diff --git a/tests/cp/preserve-link b/tests/cp/preserve-link
> index d0da873..6beae02 100755
> --- a/tests/cp/preserve-link
> +++ b/tests/cp/preserve-link
> @@ -28,13 +28,31 @@ same_inode()
>
>  mkdir -p s t/s || framework_failure_
>  touch s/f t/s/f || framework_failure_
> +
> +# a non existing link must be linked in the dest tree
>  ln s/f s/link || framework_failure_
>
> +# an existing link must be updated
> +ln s/f s/linke || framework_failure_
> +ln t/s/f t/s/linke || framework_failure_
> +
> +# an updated older file must be overwritten
> +ln s/f s/fileo || framework_failure_
> +touch -d "-1 hour" t/s/fileo || framework_failure_

Since this "fileo" is hard-linked to f (and f to "linke"),
changing its mtime with touch changes those of the other files, too.

> +# an updated non linked file must not be overwritten
> +ln s/f s/fileu || framework_failure_
> +touch -d "+1 hour" t/s/fileu || framework_failure_

Doesn't this undo the effects of the preceding touch?

You might want to use a separate "cp" command,
with separate inputs -- or even a separate test script,
moving "same_inode" into a init.cfg.

>  # This must create a hard link, t/s/link, to the existing file, t/s/f.
>  # With cp from coreutils-8.12 and prior, it would mistakenly copy
>  # the file rather than creating the link.
> +touch -d "+1 hour" t/s/f || framework_failure_
>  cp -au s t || fail=1
>
>  same_inode t/s/f t/s/link || fail=1
> +same_inode t/s/f t/s/linke || fail=1
> +same_inode t/s/f t/s/fileo || fail=1
> +same_inode t/s/f t/s/fileu && fail=1
>
>  Exit $fail




Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#8419; Package coreutils. (Tue, 26 Jul 2011 09:51:01 GMT) Full text and rfc822 format available.

Message #34 received at 8419 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Pádraig Brady <P <at> draigBrady.com>
Cc: 8419 <at> debbugs.gnu.org
Subject: Re: bug#8419: cp -au : New hard links in source becomes new files
	at	destination when using cp -au
Date: Tue, 26 Jul 2011 11:50:27 +0200
Pádraig Brady wrote:
> On 25/07/11 17:26, Pádraig Brady wrote:
>> Actually I'm wondering now whether the new code
>> should be unconditionally replacing the dest?
>> What if the dest is a separate newer file?
>> I.E. I think the following amended test should pass?
>> Also what about backups if the separate file is older?
>
> Well backups take a different path, as do
> older or non existing destination paths.
> So how about this change to just remove
> the new create_hard_link: call and beef up the test?
>
> cheers,
> Pádraig.
>
> diff --git a/src/copy.c b/src/copy.c
> index df8b1db..d6a0d1a 100644
> --- a/src/copy.c
> +++ b/src/copy.c
> @@ -1633,11 +1633,11 @@ copy_internal (char const *src_name, char const *dst_name,
>                       this src/dest pair, in case this source file is
>                       hard-linked to another one.  In that case, we'll use
>                       the mapping information to link the corresponding
> -                     destination names.  */
> -                  earlier_file = remember_copied (dst_name, src_sb.st_ino,
> -                                                  src_sb.st_dev);
> -                  if (earlier_file)
> -                    goto create_hard_link;
> +                     destination names.  Note we don't hard link DST_NAME
> +                     here, because it may be a separate file with newer
> +                     or same timestamp. If it's older than SRC_NAME,
> +                     then this path is not taken.  */
> +                  remember_copied (dst_name, src_sb.st_ino, src_sb.st_dev);

BTW, I made exactly that mistake for the first iteration of the
patch I committed yesterday.  Obviously (to me, now), I should
have written more in that commit to justify the link-creating code.
But it's only this morning that I realized the non-determinism
and understood well enough to write coherently about it.

For the record, if I apply that change and run the existing test,
it fails like this (using ext4 and Fedora 15):

    $ make check -C tests TESTS=cp/preserve-link VERBOSE=yes
    ...
    + mkdir -p s t/s
    + touch s/f t/s/f
    + ln s/f s/link
    + cp -au s t
    + same_inode t/s/f t/s/link
    + local u v
    ++ stat --format %i t/s/f
    + u=1573873
    ++ stat --format %i t/s/link
    + v=1573874
    + test 1573873 = 1573874
    + fail=1
    + Exit 1
    ...
    ======================================
    1 of 1 test failed




Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#8419; Package coreutils. (Tue, 26 Jul 2011 10:08:02 GMT) Full text and rfc822 format available.

Message #37 received at 8419 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Pádraig Brady <P <at> draigBrady.com>
Cc: 8419 <at> debbugs.gnu.org
Subject: Re: bug#8419: cp -au : New hard links in source becomes new files
	at	destination when using cp -au
Date: Tue, 26 Jul 2011 12:07:10 +0200
Jim Meyering wrote:
> Pádraig Brady wrote:
>> On 25/07/11 17:26, Pádraig Brady wrote:
>>> Actually I'm wondering now whether the new code
>>> should be unconditionally replacing the dest?
>>> What if the dest is a separate newer file?
>>> I.E. I think the following amended test should pass?
>>> Also what about backups if the separate file is older?
>>
>> Well backups take a different path, as do
>> older or non existing destination paths.
>> So how about this change to just remove
>> the new create_hard_link: call and beef up the test?
>
> Hi Pádraig,
>
> The link-creation code there cannot be removed.
> Consider this scenario:
>
>   src/{a,b}  # a and b are linked
>   dst/src/a
>
> If cp -au encounters src/a first, then the new "remember_copied"
> call records the required info so when it encounters src/b it can
> make the desired link from the preexisting dst/src/a to dst/src/b.
> In that case, the link-creation code is indeed unnecessary.
>
> However, what if it encounters src/b first?
> In that case, it must copy src/b to dst/src/b (new inode!)
> Then, when it encounters src/a, it must *remove* the preexisting dst/src/a

This "must" is my interpretation that --preserve=link (implied
by -a and -p) has a higher priority than the --update (-u) option.

> and replace it with a hard link to dst/src/b.
> As an alternative, cp could conceivably remove the just-copied dst/src/b,
> replacing it with a hard-link to dst/src/a.
>
> The trouble I have with all of this is that we can see two different
> outcomes, depending on the order in which "a" and "b" (in the src
> directory) are processed by cp.  In one case, the preexisting and
> up-to-date dst/src/a is not modified.  In the other, it is removed,
> and replaced by a hard-link to potentially-differing-content "dst/src/b".
...




Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#8419; Package coreutils. (Tue, 26 Jul 2011 11:43:01 GMT) Full text and rfc822 format available.

Message #40 received at 8419 <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: Jim Meyering <jim <at> meyering.net>
Cc: 8419 <at> debbugs.gnu.org
Subject: Re: bug#8419: cp -au : New hard links in source becomes new files
	at	destination when using cp -au
Date: Tue, 26 Jul 2011 12:39:38 +0100
On 26/07/11 11:07, Jim Meyering wrote:
> Jim Meyering wrote:
>> Pádraig Brady wrote:
>>> On 25/07/11 17:26, Pádraig Brady wrote:
>>>> Actually I'm wondering now whether the new code
>>>> should be unconditionally replacing the dest?
>>>> What if the dest is a separate newer file?
>>>> I.E. I think the following amended test should pass?
>>>> Also what about backups if the separate file is older?
>>>
>>> Well backups take a different path, as do
>>> older or non existing destination paths.
>>> So how about this change to just remove
>>> the new create_hard_link: call and beef up the test?
>>
>> Hi Pádraig,
>>
>> The link-creation code there cannot be removed.
>> Consider this scenario:
>>
>>   src/{a,b}  # a and b are linked
>>   dst/src/a
>>
>> If cp -au encounters src/a first, then the new "remember_copied"
>> call records the required info so when it encounters src/b it can
>> make the desired link from the preexisting dst/src/a to dst/src/b.
>> In that case, the link-creation code is indeed unnecessary.
>>
>> However, what if it encounters src/b first?
>> In that case, it must copy src/b to dst/src/b (new inode!)
>> Then, when it encounters src/a, it must *remove* the preexisting dst/src/a
> 
> This "must" is my interpretation that --preserve=link (implied
> by -a and -p) has a higher priority than the --update (-u) option.

That's a bit surprising, thought understandable I think.
If the separate file in the dest is newer it will be replaced
by an older link from the source.  That could legitimately happen
I suppose if the original copy was with -rp, then the timestamps
would be newer than those in dest.
I'll update the test to enforce the replacement of newer files in dest

>> and replace it with a hard link to dst/src/b.
>> As an alternative, cp could conceivably remove the just-copied dst/src/b,
>> replacing it with a hard-link to dst/src/a.
>>
>> The trouble I have with all of this is that we can see two different
>> outcomes, depending on the order in which "a" and "b" (in the src
>> directory) are processed by cp.  In one case, the preexisting and
>> up-to-date dst/src/a is not modified.  In the other, it is removed,
>> and replaced by a hard-link to potentially-differing-content "dst/src/b".

TBH I don't understand all the implications.
Note the `touch` commands in the test operate on the separate inodes in dest.
Note also your original test didn't fail for me on ext4 on F15.

cheers,
Pádraig.




Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#8419; Package coreutils. (Tue, 26 Jul 2011 13:24:02 GMT) Full text and rfc822 format available.

Message #43 received at 8419 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Pádraig Brady <P <at> draigBrady.com>
Cc: 8419 <at> debbugs.gnu.org
Subject: Re: bug#8419: cp -au : New hard links in source becomes new files
	at	destination when using cp -au
Date: Tue, 26 Jul 2011 15:23:36 +0200
Pádraig Brady wrote:
...
>>> The link-creation code there cannot be removed.
>>> Consider this scenario:
>>>
>>>   src/{a,b}  # a and b are linked
>>>   dst/src/a
>>>
>>> If cp -au encounters src/a first, then the new "remember_copied"
>>> call records the required info so when it encounters src/b it can
>>> make the desired link from the preexisting dst/src/a to dst/src/b.
>>> In that case, the link-creation code is indeed unnecessary.
>>>
>>> However, what if it encounters src/b first?
>>> In that case, it must copy src/b to dst/src/b (new inode!)
>>> Then, when it encounters src/a, it must *remove* the preexisting dst/src/a
>>
>> This "must" is my interpretation that --preserve=link (implied
>> by -a and -p) has a higher priority than the --update (-u) option.
>
> That's a bit surprising, thought understandable I think.
> If the separate file in the dest is newer it will be replaced
> by an older link from the source.  That could legitimately happen
> I suppose if the original copy was with -rp, then the timestamps
> would be newer than those in dest.
> I'll update the test to enforce the replacement of newer files in dest
>
>>> and replace it with a hard link to dst/src/b.
>>> As an alternative, cp could conceivably remove the just-copied dst/src/b,
>>> replacing it with a hard-link to dst/src/a.
>>>
>>> The trouble I have with all of this is that we can see two different
>>> outcomes, depending on the order in which "a" and "b" (in the src
>>> directory) are processed by cp.  In one case, the preexisting and
>>> up-to-date dst/src/a is not modified.  In the other, it is removed,
>>> and replaced by a hard-link to potentially-differing-content "dst/src/b".
>
> TBH I don't understand all the implications.
> Note the `touch` commands in the test operate on the separate inodes in dest.

Oh!  Sorry I missed that.

> Note also your original test didn't fail for me on ext4 on F15.

When I apply the change below, recompile "cp" and run
  make check -C tests TESTS=cp/preserve-link VERBOSE=yes
with the test tweaked to run cp via strace,
  strace -e stat,lstat -o /tmp/cp-strace cp -au s t || fail=1
I get this, which shows it's processing s/link before s/f.

  stat("t", {st_mode=S_IFDIR|0700, st_size=4096, ...}) = 0
  lstat("s", {st_mode=S_IFDIR|0700, st_size=4096, ...}) = 0
  lstat("t/s", {st_mode=S_IFDIR|0700, st_size=4096, ...}) = 0
  lstat("s/link", {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
  stat("t/s/link", 0x7fffa85ca0a0)        = -1 ENOENT (No such file or directory)
  lstat("s/f", {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
  stat("t/s/f", {st_mode=S_IFREG|0600, st_size=0, ...}) = 0

I suspect you'll see that it's processing those two files in the reverse
order on your system.  In case it's kernel-related, I'm using this:
  2.6.38.8-32.fc15.x86_64
and the disk is an SSD.

diff --git a/src/copy.c b/src/copy.c
index aaf7e79..f4b6587 100644
--- a/src/copy.c
+++ b/src/copy.c
@@ -1636,10 +1636,7 @@ copy_internal (char const *src_name, char const *dst_name,
                      hard-linked to another one.  In that case, we'll use
                      the mapping information to link the corresponding
                      destination names.  */
-                  earlier_file = remember_copied (dst_name, src_sb.st_ino,
-                                                  src_sb.st_dev);
-                  if (earlier_file)
-                    goto create_hard_link;
+                  remember_copied (dst_name, src_sb.st_ino, src_sb.st_dev);

                   return true;
                 }
@@ -1961,7 +1958,6 @@ copy_internal (char const *src_name, char const *dst_name,
         }
       else
         {
-        create_hard_link:;
           /* We want to guarantee that symlinks are not followed.  */
           bool link_failed = (linkat (AT_FDCWD, earlier_file, AT_FDCWD,
                                       dst_name, 0) != 0);




Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#8419; Package coreutils. (Tue, 26 Jul 2011 13:47:02 GMT) Full text and rfc822 format available.

Message #46 received at 8419 <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: Jim Meyering <jim <at> meyering.net>
Cc: 8419 <at> debbugs.gnu.org
Subject: Re: bug#8419: cp -au : New hard links in source becomes new files
	at	destination when using cp -au
Date: Tue, 26 Jul 2011 14:43:30 +0100
On 26/07/11 14:23, Jim Meyering wrote:
> Pádraig Brady wrote:
> ...
>>>> The link-creation code there cannot be removed.
>>>> Consider this scenario:
>>>>
>>>>   src/{a,b}  # a and b are linked
>>>>   dst/src/a
>>>>
>>>> If cp -au encounters src/a first, then the new "remember_copied"
>>>> call records the required info so when it encounters src/b it can
>>>> make the desired link from the preexisting dst/src/a to dst/src/b.
>>>> In that case, the link-creation code is indeed unnecessary.
>>>>
>>>> However, what if it encounters src/b first?
>>>> In that case, it must copy src/b to dst/src/b (new inode!)
>>>> Then, when it encounters src/a, it must *remove* the preexisting dst/src/a
>>>
>>> This "must" is my interpretation that --preserve=link (implied
>>> by -a and -p) has a higher priority than the --update (-u) option.
>>
>> That's a bit surprising, thought understandable I think.
>> If the separate file in the dest is newer it will be replaced
>> by an older link from the source.  That could legitimately happen
>> I suppose if the original copy was with -rp, then the timestamps
>> would be newer than those in dest.
>> I'll update the test to enforce the replacement of newer files in dest
>>
>>>> and replace it with a hard link to dst/src/b.
>>>> As an alternative, cp could conceivably remove the just-copied dst/src/b,
>>>> replacing it with a hard-link to dst/src/a.
>>>>
>>>> The trouble I have with all of this is that we can see two different
>>>> outcomes, depending on the order in which "a" and "b" (in the src
>>>> directory) are processed by cp.  In one case, the preexisting and
>>>> up-to-date dst/src/a is not modified.  In the other, it is removed,
>>>> and replaced by a hard-link to potentially-differing-content "dst/src/b".
>>
>> TBH I don't understand all the implications.
>> Note the `touch` commands in the test operate on the separate inodes in dest.
> 
> Oh!  Sorry I missed that.
> 
>> Note also your original test didn't fail for me on ext4 on F15.
> 
> When I apply the change below, recompile "cp" and run
>   make check -C tests TESTS=cp/preserve-link VERBOSE=yes
> with the test tweaked to run cp via strace,
>   strace -e stat,lstat -o /tmp/cp-strace cp -au s t || fail=1
> I get this, which shows it's processing s/link before s/f.
> 
>   stat("t", {st_mode=S_IFDIR|0700, st_size=4096, ...}) = 0
>   lstat("s", {st_mode=S_IFDIR|0700, st_size=4096, ...}) = 0
>   lstat("t/s", {st_mode=S_IFDIR|0700, st_size=4096, ...}) = 0
>   lstat("s/link", {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
>   stat("t/s/link", 0x7fffa85ca0a0)        = -1 ENOENT (No such file or directory)
>   lstat("s/f", {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
>   stat("t/s/f", {st_mode=S_IFREG|0600, st_size=0, ...}) = 0

confirmed.

stat("t", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
lstat("s", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
lstat("t/s", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
lstat("s/f", {st_mode=S_IFREG|0664, st_size=0, ...}) = 0
stat("t/s/f", {st_mode=S_IFREG|0664, st_size=0, ...}) = 0
lstat("s/link", {st_mode=S_IFREG|0664, st_size=0, ...}) = 0
stat("t/s/link", 0x7fff731e22c0)        = -1 ENOENT (No such file or directory)
stat("s", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
stat("s", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
stat("t/s", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0

cheers,
Pádraig.




Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#8419; Package coreutils. (Wed, 27 Jul 2011 09:39:01 GMT) Full text and rfc822 format available.

Message #49 received at 8419 <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: Jim Meyering <jim <at> meyering.net>
Cc: 8419 <at> debbugs.gnu.org
Subject: Re: bug#8419: cp -au : New hard links in source becomes new files
	at	destination when using cp -au
Date: Wed, 27 Jul 2011 10:35:14 +0100
[Message part 1 (text/plain, inline)]
On 26/07/11 14:23, Jim Meyering wrote:
> Pádraig Brady wrote:
> ...
> 
>> Note also your original test didn't fail for me on ext4 on F15.
> 

> I suspect you'll see that it's processing those two files in the reverse
> order on your system.  In case it's kernel-related, I'm using this:
>   2.6.38.8-32.fc15.x86_64
> and the disk is an SSD.

My guess here is there is inode sorting going on,
which is unstable and returning matching inodes in
a non deterministic order?

Anyway I've updated the test (attached) to try both ways.

cheers,
Pádraig.
[cp-preserve-link.diff (text/x-patch, attachment)]

Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#8419; Package coreutils. (Wed, 27 Jul 2011 12:11:01 GMT) Full text and rfc822 format available.

Message #52 received at 8419 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Pádraig Brady <P <at> draigBrady.com>
Cc: 8419 <at> debbugs.gnu.org
Subject: Re: bug#8419: cp -au : New hard links in source becomes new files
	at	destination when using cp -au
Date: Wed, 27 Jul 2011 14:10:31 +0200
Pádraig Brady wrote:
> On 26/07/11 14:23, Jim Meyering wrote:
>> Pádraig Brady wrote:
>> ...
>>
>>> Note also your original test didn't fail for me on ext4 on F15.
>>
>
>> I suspect you'll see that it's processing those two files in the reverse
>> order on your system.  In case it's kernel-related, I'm using this:
>>   2.6.38.8-32.fc15.x86_64
>> and the disk is an SSD.
>
> My guess here is there is inode sorting going on,
> which is unstable and returning matching inodes in
> a non deterministic order?
>
> Anyway I've updated the test (attached) to try both ways.
...
> Subject: [PATCH] tests: cp/preserve-link: test all relevant paths
>
> * tests/cp/preserve-link: Add test cases for when a missing
> link in the destination tree is encountered first and second.
> Also add cases for old and new separate files in the destination
> tree, both to make the clobbering behavior explicit, and to
> test any changes in this area in future.

Thanks!




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 25 Aug 2011 11:24:03 GMT) Full text and rfc822 format available.

This bug report was last modified 12 years and 219 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.