GNU bug report logs - #8061
Introduce SEEK_DATA/SEEK_HOLE to extent_scan module


Package: coreutils; Reported by: Jeff liu <jeff.liu@HIDDEN>; dated Thu, 17 Feb 2011 13:50:03 UTC; Maintainer for coreutils is bug-coreutils@HIDDEN.

Message received at submit <at> debbugs.gnu.org:


Received: (at submit) by debbugs.gnu.org; 26 Aug 2011 09:46:39 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Fri Aug 26 05:46:39 2011
Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.69)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1Qwszm-0001D4-0v
	for submit <at> debbugs.gnu.org; Fri, 26 Aug 2011 05:46:39 -0400
Received: from eggs.gnu.org ([140.186.70.92])
	by debbugs.gnu.org with esmtp (Exim 4.69)
	(envelope-from <jeff.liu@HIDDEN>) id 1Qwszh-0001Cv-Lx
	for submit <at> debbugs.gnu.org; Fri, 26 Aug 2011 05:46:36 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <jeff.liu@HIDDEN>) id 1Qwsx1-0005mZ-I9
	for submit <at> debbugs.gnu.org; Fri, 26 Aug 2011 05:43:49 -0400
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on eggs.gnu.org
X-Spam-Level: 
X-Spam-Status: No, score=-2.4 required=5.0 tests=BAYES_00,RP_MATCHES_RCVD
	autolearn=unavailable version=3.3.1
Received: from lists.gnu.org ([140.186.70.17]:47007)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <jeff.liu@HIDDEN>) id 1Qwsx1-0005mV-Fo
	for submit <at> debbugs.gnu.org; Fri, 26 Aug 2011 05:43:47 -0400
Received: from eggs.gnu.org ([140.186.70.92]:37994)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <jeff.liu@HIDDEN>) id 1Qwswz-0005Cg-0o
	for bug-coreutils@HIDDEN; Fri, 26 Aug 2011 05:43:47 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <jeff.liu@HIDDEN>) id 1Qwsww-0005le-GX
	for bug-coreutils@HIDDEN; Fri, 26 Aug 2011 05:43:44 -0400
Received: from acsinet15.oracle.com ([141.146.126.227]:53320)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <jeff.liu@HIDDEN>) id 1Qwsww-0005lG-3i
	for bug-coreutils@HIDDEN; Fri, 26 Aug 2011 05:43:42 -0400
Received: from rtcsinet21.oracle.com (rtcsinet21.oracle.com [66.248.204.29])
	by acsinet15.oracle.com (Switch-3.4.4/Switch-3.4.4) with ESMTP id
	p7Q9haUr007904
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK);
	Fri, 26 Aug 2011 09:43:38 GMT
Received: from acsmt357.oracle.com (acsmt357.oracle.com [141.146.40.157])
	by rtcsinet21.oracle.com (8.14.4+Sun/8.14.4) with ESMTP id
	p7Q9hZvP012634
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Fri, 26 Aug 2011 09:43:35 GMT
Received: from abhmt108.oracle.com (abhmt108.oracle.com [141.146.116.60])
	by acsmt357.oracle.com (8.12.11.20060308/8.12.11) with ESMTP id
	p7Q9hTgu010063; Fri, 26 Aug 2011 04:43:29 -0500
Received: from [192.168.1.102] (/123.119.98.207)
	by default (Oracle Beehive Gateway v4.0)
	with ESMTP ; Fri, 26 Aug 2011 02:43:28 -0700
Message-ID: <4E576ABC.4080705@HIDDEN>
Date: Fri, 26 Aug 2011 17:43:24 +0800
From: Jeff Liu <jeff.liu@HIDDEN>
Organization: Oracle
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US;
	rv:1.9.2.18) Gecko/20110617 Thunderbird/3.1.11
MIME-Version: 1.0
To: bug-coreutils@HIDDEN
Subject: Re: bug#8061: Introduce SEEK_DATA/SEEK_HOLE to extent_scan module
References: <2DB776C1-EF34-423D-8BE5-71C2F49DFF01@HIDDEN>	<BE690E2C-C275-4B28-8ACB-9616D367EF96@HIDDEN>
	<3E3FBE56-4D89-44A4-94ED-13F3D1F693A3@HIDDEN>
In-Reply-To: <3E3FBE56-4D89-44A4-94ED-13F3D1F693A3@HIDDEN>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Source-IP: rtcsinet21.oracle.com [66.248.204.29]
X-CT-RefId: str=0001.0A090207.4E576ACA.020F,ss=1,re=-2.300,fgs=0
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 1)
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 3)
X-Received-From: 140.186.70.17
X-Spam-Score: -6.4 (------)
X-Debbugs-Envelope-To: submit
Cc: zfs-discuss@HIDDEN, chris.mason@HIDDEN,
	linux-btrfs@HIDDEN
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.11
Precedence: list
Reply-To: jeff.liu@HIDDEN
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <http://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>,
	<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <http://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <http://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>,
	<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Sender: debbugs-submit-bounces <at> debbugs.gnu.org
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
X-Spam-Score: -6.4 (------)

Dear All,

Now that SEEK_HOLE/SEEK_DATA has been implemented on Btrfs in 3.1.0+ and in
Glibc, I have worked out a new version for your review.

Changes:
======
extent_scan.[c|h]:
1.  add a function pointer to "struct extent_scan":
/* Scan method.  */
bool (*extent_scan) (struct extent_scan *scan);

2. add a structure member to indicate that the seek back to position 0 failed:
/* Failed to seek back to position 0 or not.  */
bool seek_back_failed;
If the file system supports SEEK_HOLE, the check in support_seek_hole()
leaves the file offset somewhere > 0, so we need to seek back to the
beginning before the subsequent extent scan.

3. rename extent_scan to fiemap_extent_scan.

4. add a new seek_extent_scan method.

5. add a new function to check whether SEEK_HOLE is supported.
If the underlying file system supports SEEK_HOLE, seek_extent_scan() is
assigned to scan->extent_scan; otherwise fiemap_extent_scan() is
assigned to it.
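
For illustration, the support check and the seek back described in items 2
and 5 could be sketched roughly as follows.  This is a minimal standalone
sketch, not the code in the patch: probe_seek_hole() is a hypothetical name,
the fallback values for SEEK_DATA and SEEK_HOLE are the Linux ones, and
unlike the patch it treats any lseek(2) failure as "not supported".

```c
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* Fallback values, assumed to match Linux; used only when the
   system headers do not provide the macros.  */
#ifndef SEEK_DATA
# define SEEK_DATA 3
#endif
#ifndef SEEK_HOLE
# define SEEK_HOLE 4
#endif

/* Hypothetical helper, not part of the patch.  Return true if
   lseek (FD, 0, SEEK_HOLE) works and the file offset is back at 0.
   On a false return, *SEEK_BACK_FAILED tells whether the probe itself
   worked but the offset could not be restored.  */
static bool
probe_seek_hole (int fd, bool *seek_back_failed)
{
  *seek_back_failed = false;

  off_t hole_pos = lseek (fd, (off_t) 0, SEEK_HOLE);
  if (hole_pos < 0)
    return false;               /* Any failure: assume no support.  */

  /* The probe moved the offset to the first hole; undo that.  */
  if (hole_pos > 0 && lseek (fd, (off_t) 0, SEEK_SET) != 0)
    {
      *seek_back_failed = true;
      return false;
    }

  return true;
}
```

A caller would pick the lseek-based scan when this returns true, give up
when *seek_back_failed is set, and otherwise fall back to FIEMAP.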

copy.c:
1. pass src_total_size to extent_scan_init ().
2. for the first round of the extent scan, we also need to seek back to
position 0 if the first data extent starts at the beginning of the source file.

Tested:
======
1. make syntax-check.
2. verify a copied sparse file with 4697 extents on btrfs
jeff@pibroch:~/gnu/coreutils$ python -c "f=open('/btrfs/sparse_test', 'w'); [(f.seek(x) or f.write(str(x))) for x in range(1, 1000000000, 99999)]; f.close()"
jeff@pibroch:~/gnu/coreutils$ ./src/cp --sparse=always /btrfs/sparse_test /btrfs/sp.seek
jeff@pibroch:~/gnu/coreutils$ cmp /btrfs/sparse_test /btrfs/sp.seek
jeff@pibroch:~/gnu/coreutils$ echo $?
0
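
As an additional sanity check, the data extents that lseek(2) reports for a
file can be counted by alternating SEEK_DATA and SEEK_HOLE, which is the
same kind of iteration seek_extent_scan() performs.  The following is only a
sketch under assumptions: count_data_extents() is a hypothetical helper that
is not in the patch, and the fallback macro values are the Linux ones.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Fallback values, assumed to match Linux; used only when the
   system headers do not provide the macros.  */
#ifndef SEEK_DATA
# define SEEK_DATA 3
#endif
#ifndef SEEK_HOLE
# define SEEK_HOLE 4
#endif

/* Hypothetical helper, not part of the patch.  Count the data extents
   of FD as reported by lseek(2).  Return -1 on any error other than
   ENXIO from SEEK_DATA, which means "no more data past this offset".
   The file offset of FD is changed as a side effect.  */
static long
count_data_extents (int fd)
{
  long n = 0;
  off_t pos = 0;

  for (;;)
    {
      off_t data_pos = lseek (fd, pos, SEEK_DATA);
      if (data_pos < 0)
        return errno == ENXIO ? n : -1;

      /* There is always an implicit hole at end of file, so this
         finds the end of the current data extent.  */
      off_t hole_pos = lseek (fd, data_pos, SEEK_HOLE);
      if (hole_pos <= data_pos)
        return -1;              /* Error, or no forward progress.  */

      n++;
      pos = hole_pos;
    }
}
```

Comparing the counts for /btrfs/sparse_test and /btrfs/sp.seek would give a
rough idea of whether the holes survived the copy; the two counts need not
be identical, since the file system is free to merge or split extents.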

Also, the previous patch was developed on Solaris ZFS, but I have since
lost that test environment. :(  If anyone can help test it on ZFS, that
would be appreciated!


 From 5892744f977a06b5557042682c39fd007eec8030 Mon Sep 17 00:00:00 2001
From: Jie Liu <jeff.liu@HIDDEN>
Date: Fri, 26 Aug 2011 17:11:33 +0800
Subject: [PATCH 1/1] copy: add SEEK_DATA/SEEK_HOLE support to extent_scan module

* src/extent-scan.h: introduce src_total_size to struct extent_scan; we
   need it for lseek(2) iteration.  Add seek_back_failed to indicate whether
   the seek back to position 0 failed during the seek capability check; it
   can also be used for further debugging.
   Add bool (*extent_scan) (struct extent_scan *scan) to switch the scan method.
* src/extent-scan.c: implement a new seek_extent_scan() through SEEK_DATA
   and SEEK_HOLE.
* src/copy.c: a few code changes according to the new extent call interface.

Signed-off-by: Jie Liu <jeff.liu@HIDDEN>
---
  src/copy.c        |   26 +++++++++-
  src/extent-scan.c |  149 ++++++++++++++++++++++++++++++++++++++++++++++++++--
  src/extent-scan.h |   16 +++++-
  3 files changed, 183 insertions(+), 8 deletions(-)

diff --git a/src/copy.c b/src/copy.c
index bc4d7bd..c5e8714 100644
--- a/src/copy.c
+++ b/src/copy.c
@@ -309,7 +309,18 @@ extent_copy (int src_fd, int dest_fd, char *buf, size_t buf_size,
       We may need this at the end, for a final ftruncate.  */
    off_t dest_pos = 0;

-  extent_scan_init (src_fd, &scan);
+  bool init_ok = extent_scan_init (src_fd, src_total_size, &scan);
+  /* If the underlying file system supports SEEK_HOLE but we failed
+     to seek back to position 0 after the initial seek check,
+     let the extent copy fail in this case.  */
+  if (! init_ok)
+    {
+      if (scan.seek_back_failed)
+        error (0, errno,
+          _("%s: extent_scan_init () failed, cannot seek back to position 0"),
+               quote (src_name));
+      return false;
+    }

    *require_normal_copy = false;
    bool wrote_hole_at_eof = true;
@@ -356,6 +367,19 @@ extent_copy (int src_fd, int dest_fd, char *buf, size_t buf_size,

            wrote_hole_at_eof = false;

+          /* For the first round of the scan, if the data extent starts at
+             the beginning and the current file offset is not at position
+             0, set it back first; otherwise, we'll read from the wrong
+             file offset.  */
+          if (ext_start == 0 && lseek (src_fd, 0, SEEK_CUR) != 0)
+            {
+              if (lseek (src_fd, 0, SEEK_SET) < 0)
+                {
+                  error (0, errno, _("cannot lseek %s"), quote (src_name));
+                  return false;
+                }
+            }
+
            if (hole_size)
              {
                if (lseek (src_fd, ext_start, SEEK_SET) < 0)
diff --git a/src/extent-scan.c b/src/extent-scan.c
index 37445b8..c835b63 100644
--- a/src/extent-scan.c
+++ b/src/extent-scan.c
@@ -27,6 +27,12 @@
  #include "fiemap.h"
  #include "xstrtol.h"

+#ifndef SEEK_DATA
+# define SEEK_DATA 3  /* Seek to next data.  */
+#endif
+#ifndef SEEK_HOLE
+# define SEEK_HOLE 4  /* Seek to next hole.  */
+#endif

  /* Work around Linux kernel issues on BTRFS and EXT4 before 2.6.39.
     FIXME: remove in 2013, or whenever we're pretty confident
@@ -65,10 +71,48 @@ extent_need_sync (void)
  #endif
  }

+static bool
+support_seek_hole (struct extent_scan *scan)
+{
+  off_t hole_pos;
+
+# ifdef _PC_MIN_HOLE_SIZE
+  /* Determine whether the underlying file system supports
+     SEEK_HOLE; if not, fall back to the fiemap extent scan or
+     the standard copy.  */
+  if (fpathconf (scan->fd, _PC_MIN_HOLE_SIZE) < 0)
+      return false;
+# endif
+
+  /* Inspired by star: if we have been compiled on an OS that
+     supports SEEK_HOLE but run on an OS that does not support
+     SEEK_HOLE, we get EINVAL.  If the underlying file system
+     does not support the SEEK_HOLE call, we get ENOTSUP.  Fall
+     back to the fiemap scan or standard copy in either case.  */
+  hole_pos = lseek (scan->fd, (off_t) 0, SEEK_HOLE);
+  if (hole_pos < 0)
+    {
+      if (errno == EINVAL || errno == ENOTSUP)
+          return false;
+    }
+
+  /* Seek back to position 0 first if we detected a real hole.  */
+  if (hole_pos > 0)
+    {
+      if (lseek (scan->fd, (off_t) 0, SEEK_SET) != 0)
+        {
+          scan->seek_back_failed = true;
+          return false;
+        }
+    }
+
+  return true;
+}
+
  /* Allocate space for struct extent_scan, initialize the entries if
     necessary and return it as the input argument of extent_scan_read().  */
-extern void
-extent_scan_init (int src_fd, struct extent_scan *scan)
+extern bool
+extent_scan_init (int src_fd, size_t src_total_size, struct extent_scan *scan)
  {
    scan->fd = src_fd;
    scan->ei_count = 0;
@@ -76,17 +120,110 @@ extent_scan_init (int src_fd, struct extent_scan *scan)
    scan->scan_start = 0;
    scan->initial_scan_failed = false;
    scan->hit_final_extent = false;
-  scan->fm_flags = extent_need_sync () ? FIEMAP_FLAG_SYNC : 0;
+  scan->seek_back_failed = false;
+
+  if (support_seek_hole (scan))
+    {
+      scan->src_total_size = src_total_size;
+      scan->extent_scan = seek_extent_scan;
+    }
+  else
+    {
+      /* The underlying file system supports SEEK_HOLE, but we failed
+         to seek back to position 0 after the seek check.  Oops!  */
+      if (scan->seek_back_failed)
+        return false;
+
+      scan->extent_scan = fiemap_extent_scan;
+      scan->fm_flags = extent_need_sync () ? FIEMAP_FLAG_SYNC : 0;
+    }
+
+  return true;
+}
+
+extern inline bool
+extent_scan_read (struct extent_scan *scan)
+{
+  return scan->extent_scan (scan);
+}
+
+extern bool
+seek_extent_scan (struct extent_scan *scan)
+{
+  off_t data_pos, hole_pos;
+  union { struct extent_info ei; char c[4096]; } extent_buf;
+  struct extent_info *ext_info = &extent_buf.ei;
+  enum { count = (sizeof extent_buf / sizeof *ext_info) };
+  verify (count != 0);
+
+  memset (&extent_buf, 0, sizeof extent_buf);
+
+  unsigned int i = 0;
+  /* If lseek(2) failed and the errno is set to ENXIO, for
+     SEEK_DATA there are no more data regions past the supplied
+     offset.  For SEEK_HOLE, there are no more holes past the
+     supplied offset.  Set scan->hit_final_extent to true for
+     either case.  */
+  do {
+    data_pos = lseek (scan->fd, scan->scan_start, SEEK_DATA);
+    if (data_pos < 0)
+      {
+        if (errno == ENXIO)
+          {
+            scan->hit_final_extent = true;
+            return true;
+          }
+        return false;
+      }
+
+    /* We hit the final extent if the data offset is equal to
+       the source file size.  */
+    if (data_pos == scan->src_total_size)
+      {
+        scan->hit_final_extent = true;
+        break;
+      }
+
+    hole_pos = lseek (scan->fd, data_pos, SEEK_HOLE);
+    if (hole_pos < 0)
+      {
+        if (errno != ENXIO)
+          return false;
+        else
+          {
+            scan->hit_final_extent = true;
+            return true;
+          }
+      }
+
+    ext_info[i].ext_logical = data_pos;
+    ext_info[i].ext_length = hole_pos - data_pos;
+    scan->scan_start = hole_pos;
+    ++i;
+  } while (scan->scan_start < scan->src_total_size && i < count);
+
+  scan->ei_count = i;
+  scan->ext_info = xnmalloc (scan->ei_count, sizeof (struct extent_info));
+
+  for (i = 0; i < scan->ei_count; i++)
+    {
+      assert (ext_info[i].ext_logical <= OFF_T_MAX);
+
+      scan->ext_info[i].ext_logical = ext_info[i].ext_logical;
+      scan->ext_info[i].ext_length = ext_info[i].ext_length;
+    }
+
+  return true;
  }

-#ifdef __linux__
+#if defined __linux__
  # ifndef FS_IOC_FIEMAP
  #  define FS_IOC_FIEMAP _IOWR ('f', 11, struct fiemap)
  # endif
  /* Call ioctl(2) with FS_IOC_FIEMAP (available in linux 2.6.27) to
     obtain a map of file extents excluding holes.  */
  extern bool
-extent_scan_read (struct extent_scan *scan)
+fiemap_extent_scan (struct extent_scan *scan)
  {
    unsigned int si = 0;
    struct extent_info *last_ei IF_LINT ( = scan->ext_info);
@@ -212,7 +349,7 @@ extent_scan_read (struct extent_scan *scan)
  }
  #else
  extern bool
-extent_scan_read (struct extent_scan *scan ATTRIBUTE_UNUSED)
+fiemap_extent_scan (struct extent_scan *scan ATTRIBUTE_UNUSED)
  {
    scan->initial_scan_failed = true;
    errno = ENOTSUP;
diff --git a/src/extent-scan.h b/src/extent-scan.h
index 5b4ded5..e751810 100644
--- a/src/extent-scan.h
+++ b/src/extent-scan.h
@@ -38,6 +38,9 @@ struct extent_scan
    /* File descriptor of extent scan run against.  */
    int fd;

+  /* Source file size, i.e, (struct stat) &statbuf.st_size.  */
+  size_t src_total_size;
+
    /* Next scan start offset.  */
    off_t scan_start;

@@ -47,6 +50,9 @@ struct extent_scan
    /* How many extent info returned for a scan.  */
    uint32_t ei_count;

+  /* Failed to seek back to position 0 or not.  */
+  bool seek_back_failed;
+
    /* If true, fall back to a normal copy, either set by the
       failure of ioctl(2) for FIEMAP or lseek(2) with SEEK_DATA.  */
    bool initial_scan_failed;
@@ -54,14 +60,22 @@ struct extent_scan
    /* If true, the total extent scan per file has been finished.  */
    bool hit_final_extent;

+  /* Scan method.  */
+  bool (*extent_scan) (struct extent_scan *scan);
+
    /* Extent information: a malloc'd array of ei_count structs.  */
    struct extent_info *ext_info;
  };

-void extent_scan_init (int src_fd, struct extent_scan *scan);
+bool extent_scan_init (int src_fd, size_t src_total_size,
+                       struct extent_scan *scan);

  bool extent_scan_read (struct extent_scan *scan);

+bool fiemap_extent_scan (struct extent_scan *scan);
+
+bool seek_extent_scan (struct extent_scan *scan);
+
  static inline void
  extent_scan_free (struct extent_scan *scan)
  {
-- 
1.7.4.1



Thanks,
-Jeff

On 04/19/2011 04:51 PM, Jeff liu wrote:
>> Hi All,
>>
>> Please ignore the current patch, I will submit another patch with a few fixes soon.
> Here is the new patch set.
>
> In the previous post, I tried to change the extent_scan_init() interface by adding a new argument for the source file size.
> That avoids the overhead of calling fstat(2) in extent_scan_read(), since the file size is definitely needed for the SEEK_* approach; however, the file size is redundant for FIEMAP.
> So I changed my mind and kept extent_scan_init() as before; instead, the file size is retrieved in extent_scan_read() when launching the first scan.  One benefit is that nothing needs to
> be modified in extent_copy() for this patch.
>
> Tests:
> ====
> A new test, sparse-lseek, is introduced in this post.  It makes use of the sparse file generation function in Perl and runs `cmp` against the copied target file.
> I also took a look at the `sdb` utility shipped with ZFS, but did not find anything interesting that could be used for this test.
>
> Test run passed on my environment as below,
>
> bash-3.00# make check TESTS=cp/sparse-lseek VERBOSE=yes
> make  check-TESTS
> make[1]: Entering directory `/coreutils/tests'
> make[2]: Entering directory `/coreutils/tests'
> PASS: cp/sparse-lseek
> =============
> 1 test passed
> =============
> make[2]: Leaving directory `/coreutils/tests'
> make[1]: Leaving directory `/coreutils/tests'
>    GEN    vc_exe_in_TESTS
> No differences encountered
>
> Manual tests:
> ===========
> 1. Ensure trailing blanks; test a 0-size sparse file, a non-sparse file, and a sparse file with a hole at the start and a hole at the end.
> 2. make syntax-check failed; I have no idea about this issue at the moment.  I also tried to run make distcheck; the package build, install and uninstall procedures all appeared to pass,
> but it also failed at the final stage.  Am I missing something here?
>
> The logs are shown below:
> bash-3.00# make syntax-check
> GFDL_version
> awk: syntax error near line 1
> awk: bailing out near line 1
> make: *** [sc_GFDL_version.z] Error 2
>
> make distcheck:
> ==============
> ......
> make[1]: Entering directory `/coreutils'
>    GEN    check-ls-dircolors
> make my-distcheck
> make[2]: Entering directory `/coreutils'
> make syntax-check
> make[3]: Entering directory `/coreutils'
> GFDL_version
> awk: syntax error near line 1
> awk: bailing out near line 1
> make[3]: *** [sc_GFDL_version.z] Error 2
> make[3]: Leaving directory `/coreutils'
> make[2]: *** [my-distcheck] Error 2
> make[2]: Leaving directory `/coreutils'
> make[1]: *** [distcheck-hook] Error 2
> make[1]: Leaving directory `/coreutils'
> make: *** [distcheck] Error 1
>
>
>
> Below is the revised patch,
>
>  From 4f966c1fe6226f3f711faae120cd8bea78e722b8 Mon Sep 17 00:00:00 2001
> From: Jie Liu <jeff.liu@HIDDEN>
> Date: Tue, 19 Apr 2011 15:24:50 -0700
> Subject: [PATCH 1/1] copy: add SEEK_DATA/SEEK_HOLE support to extent_scan module
>
> * src/extent-scan.h: introduce src_total_size to struct extent_scan; we
>    need it for lseek(2) iteration.
> * src/extent-scan.c: implement a new extent_scan_read() through SEEK_DATA
>    and SEEK_HOLE if they are supported.
> * tests/cp/sparse-lseek: add a new test for lseek(2) extent copy.
>
> Signed-off-by: Jie Liu <jeff.liu@HIDDEN>
> ---
>   src/extent-scan.c     |  119 +++++++++++++++++++++++++++++++++++++++++++++++++
>   src/extent-scan.h     |    5 ++
>   tests/Makefile.am     |    1 +
>   tests/cp/sparse-lseek |   56 +++++++++++++++++++++++
>   4 files changed, 181 insertions(+), 0 deletions(-)
>   create mode 100755 tests/cp/sparse-lseek
>
> diff --git a/src/extent-scan.c b/src/extent-scan.c
> index da7eb9d..a54eca0 100644
> --- a/src/extent-scan.c
> +++ b/src/extent-scan.c
> @@ -17,7 +17,9 @@
>      Written by Jie Liu (jeff.liu@HIDDEN).  */
>
>   #include <config.h>
> +#include <fcntl.h>
>   #include <sys/types.h>
> +#include <sys/stat.h>
>   #include <sys/ioctl.h>
>   #include <sys/utsname.h>
>   #include <assert.h>
> @@ -71,6 +73,9 @@ extent_scan_init (int src_fd, struct extent_scan *scan)
>     scan->initial_scan_failed = false;
>     scan->hit_final_extent = false;
>     scan->fm_flags = extent_need_sync () ? FIEMAP_FLAG_SYNC : 0;
> +#if defined (SEEK_DATA) && defined (SEEK_HOLE)
> +  scan->src_total_size = 0;
> +#endif
>   }
>
>   #ifdef __linux__
> @@ -204,6 +209,120 @@ extent_scan_read (struct extent_scan *scan)
>
>     return true;
>   }
> +#elif defined (SEEK_HOLE) && defined (SEEK_DATA)
> +extern bool
> +extent_scan_read (struct extent_scan *scan)
> +{
> +  off_t data_pos, hole_pos;
> +  union { struct extent_info ei; char c[4096]; } extent_buf;
> +  struct extent_info *ext_info = &extent_buf.ei;
> +  enum { count = (sizeof extent_buf / sizeof *ext_info) };
> +  verify (count != 0);
> +
> +  memset (&extent_buf, 0, sizeof extent_buf);
> +
> +  if (scan->scan_start == 0)
> +    {
> +# ifdef _PC_MIN_HOLE_SIZE
> +      /* Determine whether the underlying file system supports
> +         SEEK_HOLE.  If not, fall back to the standard copy.  */
> +      if (fpathconf (scan->fd, _PC_MIN_HOLE_SIZE) < 0)
> +        {
> +          scan->initial_scan_failed = true;
> +          return false;
> +        }
> +# endif
> +
> +      /* If we have been compiled on an OS that supports SEEK_HOLE
> +         but run on an OS that does not support SEEK_HOLE, we get
> +         EINVAL.  If the underlying file system does not support the
> +         SEEK_HOLE call, we get ENOTSUP, setting initial_scan_failed
> +         to true to fall back to the standard copy in either case.  */
> +      hole_pos = lseek (scan->fd, (off_t) 0, SEEK_HOLE);
> +      if (hole_pos < 0)
> +        {
> +          if (errno == EINVAL || errno == ENOTSUP)
> +            scan->initial_scan_failed = true;
> +          return false;
> +        }
> +
> +      /* Seek back to position 0 first.  */
> +      if (hole_pos > 0)
> +        {
> +          if (lseek (scan->fd, (off_t) 0, SEEK_SET) < 0)
> +            return false;
> +        }
> +
> +      struct stat sb;
> +      if (fstat (scan->fd, &sb) < 0)
> +        return false;
> +
> +      /* This is definitely not a sparse file, we treat it as a big extent.  */
> +      if (hole_pos >= sb.st_size)
> +        {
> +          scan->ei_count = 1;
> +          scan->ext_info = xnmalloc (scan->ei_count, sizeof (struct extent_info));
> +          scan->ext_info[0].ext_logical = 0;
> +          scan->ext_info[0].ext_length = sb.st_size;
> +          scan->hit_final_extent = true;
> +          return true;
> +        }
> +      scan->src_total_size = sb.st_size;
> +    }
> +
> +  unsigned int i = 0;
> +  /* If lseek(2) failed and the errno is set to ENXIO, for
> +     SEEK_DATA there are no more data regions past the supplied
> +     offset.  For SEEK_HOLE, there are no more holes past the
> +     supplied offset.  Set scan->hit_final_extent to true in
> +     either case.  */
> +  while (scan->scan_start < scan->src_total_size && i < count)
> +    {
> +      data_pos = lseek (scan->fd, scan->scan_start, SEEK_DATA);
> +      if (data_pos < 0)
> +        {
> +          if (errno == ENXIO)
> +            {
> +              scan->hit_final_extent = true;
> +              break;
> +            }
> +          return false;
> +        }
> +
> +      hole_pos = lseek (scan->fd, data_pos, SEEK_HOLE);
> +      if (hole_pos < 0)
> +        {
> +          if (errno == ENXIO)
> +            {
> +              scan->hit_final_extent = true;
> +              hole_pos = scan->src_total_size;
> +              if (data_pos < hole_pos)
> +                goto preserve_ext_info;
> +              break;
> +            }
> +          return false;
> +        }
> +
> +preserve_ext_info:
> +      ext_info[i].ext_logical = data_pos;
> +      ext_info[i].ext_length = hole_pos - data_pos;
> +      scan->scan_start = hole_pos;
> +      ++i;
> +    }
> +
> +  scan->ei_count = i;
> +  scan->ext_info = xnmalloc (scan->ei_count, sizeof (struct extent_info));
> +
> +  for (i = 0; i < scan->ei_count; i++)
> +    {
> +      assert (ext_info[i].ext_logical <= OFF_T_MAX);
> +
> +      scan->ext_info[i].ext_logical = ext_info[i].ext_logical;
> +      scan->ext_info[i].ext_length = ext_info[i].ext_length;
> +    }
> +
> +  return (lseek (scan->fd, (off_t) 0, SEEK_SET) < 0) ? false : true;
> +}
>   #else
>   extern bool
>   extent_scan_read (struct extent_scan *scan ATTRIBUTE_UNUSED)
> diff --git a/src/extent-scan.h b/src/extent-scan.h
> index 5b4ded5..4fc05c6 100644
> --- a/src/extent-scan.h
> +++ b/src/extent-scan.h
> @@ -38,6 +38,11 @@ struct extent_scan
>     /* File descriptor of extent scan run against.  */
>     int fd;
>
> +# if defined (SEEK_DATA) && defined (SEEK_HOLE)
> +  /* Source file size, i.e, (struct stat) &statbuf.st_size.  */
> +  size_t src_total_size;
> +#endif
> +
>     /* Next scan start offset.  */
>     off_t scan_start;
>
> diff --git a/tests/Makefile.am b/tests/Makefile.am
> index 685eb52..6c596b9 100644
> --- a/tests/Makefile.am
> +++ b/tests/Makefile.am
> @@ -28,6 +28,7 @@ root_tests =					\
>     cp/cp-mv-enotsup-xattr			\
>     cp/capability					\
>     cp/sparse-fiemap				\
> +  cp/sparse-lseek                               \
>     dd/skip-seek-past-dev				\
>     install/install-C-root			\
>     ls/capability					\
> diff --git a/tests/cp/sparse-lseek b/tests/cp/sparse-lseek
> new file mode 100755
> index 0000000..5b8f2c1
> --- /dev/null
> +++ b/tests/cp/sparse-lseek
> @@ -0,0 +1,56 @@
> +#!/bin/sh
> +# Test cp --sparse=always through lseek(SEEK_DATA/SEEK_HOLE) copy
> +
> +# Copyright (C) 2010-2011 Free Software Foundation, Inc.
> +
> +# This program is free software: you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation, either version 3 of the License, or
> +# (at your option) any later version.
> +
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> +
> +. "${srcdir=.}/init.sh"; path_prepend_ ../src
> +print_ver_ cp
> +$PERL -e 1 || skip_test_ 'you lack perl'
> +
> +zfsdisk=diskX
> +zfspool=seektest
> +
> +require_root_
> +
> +cwd=$PWD
> +cleanup_() { zpool destroy $zfspool; }
> +
> +skip=0
> +mkfile 128m "$cwd/$zfsdisk" || skip=1
> +
> +# Check whether the seektest pool already exists
> +zpool list $zfspool 2>/dev/null &&
> +  skip_test_ "$zfspool already exists"
> +
> +# Create pool and verify if it is mounted automatically
> +zpool create $zfspool "$cwd/$zfsdisk" || skip=1
> +zpool list $zfspool >/dev/null || skip=1
> +
> +test $skip = 1 && skip_test_ "insufficient ZFS support"
> +
> +for i in $(seq 1 2 21); do
> +  for j in 1 2 31 100; do
> +    $PERL -e 'BEGIN { $n = '$i' * 1024; *F = *STDOUT }' \
> +          -e 'for (1..'$j') { sysseek (*F, $n, 1)' \
> +          -e '&& syswrite (*F, chr($_)x$n) or die "$!"}' > /$zfspool/j1 || fail=1
> +
> +    cp --sparse=always /$zfspool/j1 /$zfspool/j2 || fail=1
> +    cmp /$zfspool/j1 /$zfspool/j2 || fail=1
> +    test $fail = 1 && break 2
> +  done
> +done
> +
> +Exit $fail





Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils@HIDDEN:
bug#8061; Package coreutils. Full text available.

Message received at submit <at> debbugs.gnu.org:


Received: (at submit) by debbugs.gnu.org; 19 Apr 2011 09:04:03 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Tue Apr 19 05:04:03 2011
Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.69)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1QC6qo-0004eM-RU
	for submit <at> debbugs.gnu.org; Tue, 19 Apr 2011 05:04:02 -0400
Received: from eggs.gnu.org ([140.186.70.92])
	by debbugs.gnu.org with esmtp (Exim 4.69)
	(envelope-from <jim@HIDDEN>) id 1QC6qn-0004du-0l
	for submit <at> debbugs.gnu.org; Tue, 19 Apr 2011 05:04:01 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <jim@HIDDEN>) id 1QC6qg-0006t0-Qk
	for submit <at> debbugs.gnu.org; Tue, 19 Apr 2011 05:03:55 -0400
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on eggs.gnu.org
X-Spam-Level: 
X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_MED,
	T_RP_MATCHES_RCVD autolearn=unavailable version=3.3.1
Received: from lists.gnu.org ([140.186.70.17]:41876)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <jim@HIDDEN>) id 1QC6qg-0006sw-P5
	for submit <at> debbugs.gnu.org; Tue, 19 Apr 2011 05:03:54 -0400
Received: from eggs.gnu.org ([140.186.70.92]:35769)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <jim@HIDDEN>) id 1QC6qc-0000cA-CL
	for bug-coreutils@HIDDEN; Tue, 19 Apr 2011 05:03:54 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <jim@HIDDEN>) id 1QC6qX-0006rM-Mt
	for bug-coreutils@HIDDEN; Tue, 19 Apr 2011 05:03:50 -0400
Received: from mx.meyering.net ([82.230.74.64]:40748)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <jim@HIDDEN>) id 1QC6qX-0006rA-Cm
	for bug-coreutils@HIDDEN; Tue, 19 Apr 2011 05:03:45 -0400
Received: by rho.meyering.net (Acme Bit-Twister, from userid 1000)
	id B3F9C60113; Tue, 19 Apr 2011 11:03:44 +0200 (CEST)
From: Jim Meyering <jim@HIDDEN>
To: Jeff liu <jeff.liu@HIDDEN>
Subject: Re: Introduce SEEK_DATA/SEEK_HOLE to extent_scan module
In-Reply-To: <3E3FBE56-4D89-44A4-94ED-13F3D1F693A3@HIDDEN> (Jeff liu's
	message of "Tue, 19 Apr 2011 16:51:38 +0800")
References: <2DB776C1-EF34-423D-8BE5-71C2F49DFF01@HIDDEN>
	<BE690E2C-C275-4B28-8ACB-9616D367EF96@HIDDEN>
	<3E3FBE56-4D89-44A4-94ED-13F3D1F693A3@HIDDEN>
Date: Tue, 19 Apr 2011 11:03:44 +0200
Message-ID: <877haq1ukv.fsf@HIDDEN>
Lines: 11
MIME-Version: 1.0
Content-Type: text/plain
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not
	recognized.
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 3)
X-Received-From: 140.186.70.17
X-Spam-Score: -5.9 (-----)
X-Debbugs-Envelope-To: submit
Cc: =?iso-8859-1?Q?P=E1draig?= Brady <P@HIDDEN>, bug-coreutils@HIDDEN
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.11
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <http://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>,
	<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <http://debbugs.gnu.org/pipermail/debbugs-submit>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <http://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>,
	<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Sender: debbugs-submit-bounces <at> debbugs.gnu.org
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
X-Spam-Score: -5.9 (-----)

Jeff liu wrote:
...
> Below is the revised patch,
>
> From 4f966c1fe6226f3f711faae120cd8bea78e722b8 Mon Sep 17 00:00:00 2001
> From: Jie Liu <jeff.liu@HIDDEN>
> Date: Tue, 19 Apr 2011 15:24:50 -0700
> Subject: [PATCH 1/1] copy: add SEEK_DATA/SEEK_HOLE support to extent_scan module

Thank you for the update.
I will look at it in a week or so if no one else does.




Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils@HIDDEN:
bug#8061; Package coreutils. Full text available.

Message received at submit <at> debbugs.gnu.org:


Received: (at submit) by debbugs.gnu.org; 19 Apr 2011 08:53:21 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Tue Apr 19 04:53:21 2011
Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.69)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1QC6gS-0004Pk-67
	for submit <at> debbugs.gnu.org; Tue, 19 Apr 2011 04:53:21 -0400
Received: from eggs.gnu.org ([140.186.70.92])
	by debbugs.gnu.org with esmtp (Exim 4.69)
	(envelope-from <jeff.liu@HIDDEN>) id 1QC6gN-0004PW-Sl
	for submit <at> debbugs.gnu.org; Tue, 19 Apr 2011 04:53:18 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <jeff.liu@HIDDEN>) id 1QC6gF-0004e3-F9
	for submit <at> debbugs.gnu.org; Tue, 19 Apr 2011 04:53:10 -0400
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on eggs.gnu.org
X-Spam-Level: 
X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00,HTML_MESSAGE,
	RCVD_IN_DNSWL_MED, T_RP_MATCHES_RCVD autolearn=unavailable version=3.3.1
Received: from lists.gnu.org ([140.186.70.17]:42601)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <jeff.liu@HIDDEN>) id 1QC6gF-0004dz-9g
	for submit <at> debbugs.gnu.org; Tue, 19 Apr 2011 04:53:07 -0400
Received: from eggs.gnu.org ([140.186.70.92]:60467)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <jeff.liu@HIDDEN>) id 1QC6gC-0007KG-25
	for bug-coreutils@HIDDEN; Tue, 19 Apr 2011 04:53:07 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <jeff.liu@HIDDEN>) id 1QC6g8-0004cn-CG
	for bug-coreutils@HIDDEN; Tue, 19 Apr 2011 04:53:04 -0400
Received: from rcsinet10.oracle.com ([148.87.113.121]:49013)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <jeff.liu@HIDDEN>) id 1QC6g7-0004ce-VD
	for bug-coreutils@HIDDEN; Tue, 19 Apr 2011 04:53:00 -0400
Received: from rcsinet15.oracle.com (rcsinet15.oracle.com [148.87.113.117])
	by rcsinet10.oracle.com (Switch-3.4.2/Switch-3.4.2) with ESMTP id
	p3J8qixt002378
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK);
	Tue, 19 Apr 2011 08:52:46 GMT
Received: from acsmt358.oracle.com (acsmt358.oracle.com [141.146.40.158])
	by rcsinet15.oracle.com (Switch-3.4.2/Switch-3.4.1) with ESMTP id
	p3J8qhGA022441
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Tue, 19 Apr 2011 08:52:43 GMT
Received: from abhmt013.oracle.com (abhmt013.oracle.com [141.146.116.22])
	by acsmt358.oracle.com (8.12.11.20060308/8.12.11) with ESMTP id
	p3J8pgPb002153; Tue, 19 Apr 2011 03:51:42 -0500
Received: from [10.189.10.146] (/203.190.176.53)
	by default (Oracle Beehive Gateway v4.0)
	with ESMTP ; Tue, 19 Apr 2011 01:51:41 -0700
Subject: Re: Introduce SEEK_DATA/SEEK_HOLE to extent_scan module
Mime-Version: 1.0 (Apple Message framework v1082)
Content-Type: multipart/alternative; boundary=Apple-Mail-1-625867617
From: Jeff liu <jeff.liu@HIDDEN>
In-Reply-To: <BE690E2C-C275-4B28-8ACB-9616D367EF96@HIDDEN>
Date: Tue, 19 Apr 2011 16:51:38 +0800
Message-Id: <3E3FBE56-4D89-44A4-94ED-13F3D1F693A3@HIDDEN>
References: <2DB776C1-EF34-423D-8BE5-71C2F49DFF01@HIDDEN>
	<BE690E2C-C275-4B28-8ACB-9616D367EF96@HIDDEN>
To: Jeff liu <jeff.liu@HIDDEN>
X-Mailer: Apple Mail (2.1082)
X-Source-IP: acsmt358.oracle.com [141.146.40.158]
X-Auth-Type: Internal IP
X-CT-RefId: str=0001.0A090201.4DAD4D5C.00E3:SCFSTAT5015188, ss=1, pt=DBB_65838,
	fgs=0
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 3)
X-Received-From: 140.186.70.17
X-Spam-Score: -6.4 (------)
X-Debbugs-Envelope-To: submit
Cc: Pádraig Brady <P@HIDDEN>, bug-coreutils@HIDDEN,
	Jim Meyering <jim@HIDDEN>


--Apple-Mail-1-625867617
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain;
	charset=us-ascii


> Hi All,
>
> Please ignore the current patch; I will submit another patch with a
> few fixes soon.

Here is the new patch.

In the previous post, I tried to change the extent_scan_init()
interface by adding a new argument for the source file size.  That
would avoid the overhead of calling fstat(2) in extent_scan_read(),
since the file size is definitely needed for the SEEK_DATA/SEEK_HOLE
approach.  However, the file size is redundant for FIEMAP, so I
changed my mind: extent_scan_init() stays as before, and the file size
is retrieved in extent_scan_read() when launching the first scan.  One
benefit is that nothing needs to be modified in extent_copy() for this
patch.
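
To illustrate the idea outside of coreutils, here is a minimal
standalone sketch of fetching the file size lazily on the first scan.
It is not the patch code: struct scan_state, scan_state_init() and
scan_state_prepare() are hypothetical names, and the size field is an
off_t here (the type of st_size), whereas the patch declares it as
size_t.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical, simplified stand-in for struct extent_scan.  */
struct scan_state
{
  int fd;               /* descriptor being scanned */
  off_t scan_start;     /* next scan offset; 0 means "first scan" */
  off_t src_total_size; /* filled in lazily on the first scan */
};

/* The initializer keeps its old shape: no file size argument.  */
static void
scan_state_init (struct scan_state *s, int fd)
{
  assert (s != NULL);
  s->fd = fd;
  s->scan_start = 0;
  s->src_total_size = 0;
}

/* Call at the top of every scan.  Only the first scan pays for
   fstat(2); later scans reuse the cached size.  Return false if
   fstat fails.  */
static bool
scan_state_prepare (struct scan_state *s)
{
  assert (s != NULL);
  if (s->scan_start == 0)
    {
      struct stat sb;
      if (fstat (s->fd, &sb) < 0)
        {
          perror ("fstat");
          return false;
        }
      s->src_total_size = sb.st_size;
    }
  return true;
}
```

The caller's interface does not change, which is the point made above
about extent_copy().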

Tests:
====
A new test, sparse-lseek, is introduced in this post.  It uses Perl to
generate sparse files and runs `cmp` against the copied target file.
I also took a look at the `zdb` utility shipped with ZFS, but did not
find anything there that could be used for this test.
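
For readers less familiar with Perl: the one-liner in the test builds
each input by repeatedly seeking forward $n bytes and then writing $n
bytes, which leaves a hole in front of every data block.  Roughly the
same pattern in C might look like the sketch below.  The names
write_all() and write_sparse_pattern() are made up for this
illustration and are not part of coreutils.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Write all LEN bytes of BUF to FD, retrying on short writes.  */
static bool
write_all (int fd, const char *buf, size_t len)
{
  while (len > 0)
    {
      ssize_t w = write (fd, buf, len);
      if (w < 0)
        {
          if (errno == EINTR)
            continue;
          return false;
        }
      buf += w;
      len -= (size_t) w;
    }
  return true;
}

/* Starting at the current offset of FD, emit COUNT blocks of N bytes,
   each preceded by an N-byte gap that is skipped with lseek instead
   of being written.  Block K (1-based) is filled with the byte K,
   like chr($_) x $n in the Perl version.  */
static bool
write_sparse_pattern (int fd, size_t n, int count)
{
  assert (n > 0 && count > 0 && count < 256);

  char *buf = malloc (n);
  if (buf == NULL)
    return false;

  bool ok = true;
  for (int k = 1; k <= count && ok; k++)
    {
      memset (buf, k, n);
      if (lseek (fd, (off_t) n, SEEK_CUR) < 0)
        ok = false;
      else
        ok = write_all (fd, buf, n);
    }

  free (buf);
  return ok;
}
```

Whether the skipped gaps really become holes on disk depends on the
file system; the resulting file contents are the same either way.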

The test run passed in my environment, as shown below:

bash-3.00# make check TESTS=cp/sparse-lseek VERBOSE=yes
make  check-TESTS
make[1]: Entering directory `/coreutils/tests'
make[2]: Entering directory `/coreutils/tests'
PASS: cp/sparse-lseek
=============
1 test passed
=============
make[2]: Leaving directory `/coreutils/tests'
make[1]: Leaving directory `/coreutils/tests'
  GEN    vc_exe_in_TESTS
No differences encountered

Manual tests:
===========
1. Checked for trailing blanks; tested a zero-size sparse file, a
   non-sparse file, and sparse files that start with a hole and end
   with a hole.
2. `make syntax-check` failed, and I have no idea why at the moment.  I
   also tried `make distcheck`: the package build, install and
   uninstall steps all appear to pass, but it also fails at the final
   stage.  Am I missing something here?

The logs are as follows:
bash-3.00# make syntax-check
GFDL_version
awk: syntax error near line 1
awk: bailing out near line 1
make: *** [sc_GFDL_version.z] Error 2

make distcheck:
==============
......
make[1]: Entering directory `/coreutils'
  GEN    check-ls-dircolors
make my-distcheck
make[2]: Entering directory `/coreutils'
make syntax-check
make[3]: Entering directory `/coreutils'
GFDL_version
awk: syntax error near line 1
awk: bailing out near line 1
make[3]: *** [sc_GFDL_version.z] Error 2
make[3]: Leaving directory `/coreutils'
make[2]: *** [my-distcheck] Error 2
make[2]: Leaving directory `/coreutils'
make[1]: *** [distcheck-hook] Error 2
make[1]: Leaving directory `/coreutils'
make: *** [distcheck] Error 1



Below is the revised patch:

From 4f966c1fe6226f3f711faae120cd8bea78e722b8 Mon Sep 17 00:00:00 2001
From: Jie Liu <jeff.liu@HIDDEN>
Date: Tue, 19 Apr 2011 15:24:50 -0700
Subject: [PATCH 1/1] copy: add SEEK_DATA/SEEK_HOLE support to extent_scan module

* src/extent-scan.h: Add src_total_size to struct extent_scan; we
  need it for the lseek(2) iteration.
* src/extent-scan.c: Implement a new extent_scan_read() using SEEK_DATA
  and SEEK_HOLE where they are supported.
* tests/cp/sparse-lseek: Add a new test for lseek(2) extent copy.

Signed-off-by: Jie Liu <jeff.liu@HIDDEN>
---
 src/extent-scan.c     |  119 +++++++++++++++++++++++++++++++++++++++++++++++++
 src/extent-scan.h     |    5 ++
 tests/Makefile.am     |    1 +
 tests/cp/sparse-lseek |   56 +++++++++++++++++++++++
 4 files changed, 181 insertions(+), 0 deletions(-)
 create mode 100755 tests/cp/sparse-lseek

diff --git a/src/extent-scan.c b/src/extent-scan.c
index da7eb9d..a54eca0 100644
--- a/src/extent-scan.c
+++ b/src/extent-scan.c
@@ -17,7 +17,9 @@
    Written by Jie Liu (jeff.liu@HIDDEN).  */
 
 #include <config.h>
+#include <fcntl.h>
 #include <sys/types.h>
+#include <sys/stat.h>
 #include <sys/ioctl.h>
 #include <sys/utsname.h>
 #include <assert.h>
@@ -71,6 +73,9 @@ extent_scan_init (int src_fd, struct extent_scan *scan)
   scan->initial_scan_failed = false;
   scan->hit_final_extent = false;
   scan->fm_flags = extent_need_sync () ? FIEMAP_FLAG_SYNC : 0;
+#if defined (SEEK_DATA) && defined (SEEK_HOLE)
+  scan->src_total_size = 0;
+#endif
 }
 
 #ifdef __linux__
@@ -204,6 +209,120 @@ extent_scan_read (struct extent_scan *scan)
 
   return true;
 }
+#elif defined (SEEK_HOLE) && defined (SEEK_DATA)
+extern bool
+extent_scan_read (struct extent_scan *scan)
+{
+  off_t data_pos, hole_pos;
+  union { struct extent_info ei; char c[4096]; } extent_buf;
+  struct extent_info *ext_info = &extent_buf.ei;
+  enum { count = (sizeof extent_buf / sizeof *ext_info) };
+  verify (count != 0);
+
+  memset (&extent_buf, 0, sizeof extent_buf);
+
+  if (scan->scan_start == 0)
+    {
+# ifdef _PC_MIN_HOLE_SIZE
+      /* Determine whether the underlying file system supports
+         SEEK_HOLE.  If not, fall back to the standard copy.  */
+      if (fpathconf (scan->fd, _PC_MIN_HOLE_SIZE) < 0)
+        {
+          scan->initial_scan_failed = true;
+          return false;
+        }
+# endif
+
+      /* If we have been compiled on an OS that supports SEEK_HOLE
+         but run on an OS that does not support SEEK_HOLE, we get
+         EINVAL.  If the underlying file system does not support the
+         SEEK_HOLE call, we get ENOTSUP, setting initial_scan_failed
+         to true to fall back to the standard copy in either case.  */
+      hole_pos = lseek (scan->fd, (off_t) 0, SEEK_HOLE);
+      if (hole_pos < 0)
+        {
+          if (errno == EINVAL || errno == ENOTSUP)
+            scan->initial_scan_failed = true;
+          return false;
+        }
+
+      /* Seek back to position 0 first.  */
+      if (hole_pos > 0)
+        {
+          if (lseek (scan->fd, (off_t) 0, SEEK_SET) < 0)
+            return false;
+        }
+
+      struct stat sb;
+      if (fstat (scan->fd, &sb) < 0)
+        return false;
+
+      /* This is definitely not a sparse file, we treat it as a big extent.  */
+      if (hole_pos >= sb.st_size)
+        {
+          scan->ei_count = 1;
+          scan->ext_info = xnmalloc (scan->ei_count, sizeof (struct extent_info));
+          scan->ext_info[0].ext_logical = 0;
+          scan->ext_info[0].ext_length = sb.st_size;
+          scan->hit_final_extent = true;
+          return true;
+        }
+      scan->src_total_size = sb.st_size;
+    }
+
+  unsigned int i = 0;
+  /* If lseek(2) failed and the errno is set to ENXIO, for
+     SEEK_DATA there are no more data regions past the supplied
+     offset.  For SEEK_HOLE, there are no more holes past the
+     supplied offset.  Set scan->hit_final_extent to true in
+     either case.  */
+  while (scan->scan_start < scan->src_total_size && i < count)
+    {
+      data_pos = lseek (scan->fd, scan->scan_start, SEEK_DATA);
+      if (data_pos < 0)
+        {
+          if (errno == ENXIO)
+            {
+              scan->hit_final_extent = true;
+              break;
+            }
+          return false;
+        }
+
+      hole_pos = lseek (scan->fd, data_pos, SEEK_HOLE);
+      if (hole_pos < 0)
+        {
+          if (errno == ENXIO)
+            {
+              scan->hit_final_extent = true;
+              hole_pos = scan->src_total_size;
+              if (data_pos < hole_pos)
+                goto preserve_ext_info;
+              break;
+            }
+          return false;
+        }
+
+preserve_ext_info:
+      ext_info[i].ext_logical = data_pos;
+      ext_info[i].ext_length = hole_pos - data_pos;
+      scan->scan_start = hole_pos;
+      ++i;
+    }
+
+  scan->ei_count = i;
+  scan->ext_info = xnmalloc (scan->ei_count, sizeof (struct extent_info));
+
+  for (i = 0; i < scan->ei_count; i++)
+    {
+      assert (ext_info[i].ext_logical <= OFF_T_MAX);
+
+      scan->ext_info[i].ext_logical = ext_info[i].ext_logical;
+      scan->ext_info[i].ext_length = ext_info[i].ext_length;
+    }
+
+  return (lseek (scan->fd, (off_t) 0, SEEK_SET) < 0) ? false : true;
+}
 #else
 extern bool
 extent_scan_read (struct extent_scan *scan ATTRIBUTE_UNUSED)
diff --git a/src/extent-scan.h b/src/extent-scan.h
index 5b4ded5..4fc05c6 100644
--- a/src/extent-scan.h
+++ b/src/extent-scan.h
@@ -38,6 +38,11 @@ struct extent_scan
   /* File descriptor of extent scan run against.  */
   int fd;
 
+# if defined (SEEK_DATA) && defined (SEEK_HOLE)
+  /* Source file size, i.e., st_size as reported by fstat.  */
+  size_t src_total_size;
+# endif
+
   /* Next scan start offset.  */
   off_t scan_start;
 
diff --git a/tests/Makefile.am b/tests/Makefile.am
index 685eb52..6c596b9 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -28,6 +28,7 @@ root_tests =					\
   cp/cp-mv-enotsup-xattr			\
   cp/capability					\
   cp/sparse-fiemap				\
+  cp/sparse-lseek                               \
   dd/skip-seek-past-dev				\
   install/install-C-root			\
   ls/capability					\
diff --git a/tests/cp/sparse-lseek b/tests/cp/sparse-lseek
new file mode 100755
index 0000000..5b8f2c1
--- /dev/null
+++ b/tests/cp/sparse-lseek
@@ -0,0 +1,56 @@
+#!/bin/sh
+# Test cp --sparse=always through lseek(SEEK_DATA/SEEK_HOLE) copy
+
+# Copyright (C) 2010-2011 Free Software Foundation, Inc.
+
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+. "${srcdir=.}/init.sh"; path_prepend_ ../src
+print_ver_ cp
+$PERL -e 1 || skip_test_ 'you lack perl'
+
+zfsdisk=diskX
+zfspool=seektest
+
+require_root_
+
+cwd=$PWD
+cleanup_() { zpool destroy $zfspool; }
+
+skip=0
+mkfile 128m "$cwd/$zfsdisk" || skip=1
+
+# Check whether the seektest pool already exists
+zpool list $zfspool 2>/dev/null &&
+  skip_test_ "$zfspool already exists"
+
+# Create pool and verify if it is mounted automatically
+zpool create $zfspool "$cwd/$zfsdisk" || skip=1
+zpool list $zfspool >/dev/null || skip=1
+
+test $skip = 1 && skip_test_ "insufficient ZFS support"
+
+for i in $(seq 1 2 21); do
+  for j in 1 2 31 100; do
+    $PERL -e 'BEGIN { $n = '$i' * 1024; *F = *STDOUT }' \
+          -e 'for (1..'$j') { sysseek (*F, $n, 1)' \
+          -e '&& syswrite (*F, chr($_)x$n) or die "$!"}' > /$zfspool/j1 || fail=1
+
+    cp --sparse=always /$zfspool/j1 /$zfspool/j2 || fail=1
+    cmp /$zfspool/j1 /$zfspool/j2 || fail=1
+    test $fail = 1 && break 2
+  done
+done
+
+Exit $fail
-- 
1.7.4


Any comments are appreciated!

Thanks,
-Jeff


>
>
> Thanks,
> -Jeff
>
>
>> Hello All,
>>
>> This is a first attempt to add SEEK_DATA/SEEK_HOLE support to the
>> extent_scan module for efficient sparse file copy on ZFS.  I have
>> delayed it for a long time, sorry for that!
>>
>> Below is the list of code changes:
>> src/extent-scan.h: add a new member 'src_total_size' to "struct
>> extent_scan", since I need this value to determine whether a file is
>> sparse during the initial scan.  If lseek(fd, 0, SEEK_HOLE) returns a
>> value equal to or larger than src_total_size, the source file is
>> either definitely not sparse, or it is sparse but there is no point
>> in proceeding with the scan read.
>> Another change in this file is the signature of extent_scan_init():
>> as mentioned above, it needs to accept the src_total_size value.
>> src/extent-scan.c: implement the new extent_scan_read() through
>> SEEK_DATA/SEEK_HOLE; it is used if those two values are defined in
>> <unistd.h>.
>> src/copy.c: pass src_total_size to extent_scan_init().
>>
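
As a standalone illustration of the check described in the quoted
text (a sketch, not the patch code): first_hole_at_eof() compares the
result of lseek(fd, 0, SEEK_HOLE) with the file size, and
list_data_extents() walks the file with alternating SEEK_DATA and
SEEK_HOLE calls.  Both function names and struct data_extent are
hypothetical.  Where SEEK_DATA/SEEK_HOLE are not defined at compile
time, the sketch assumes it is acceptable to report the whole file as
a single extent.

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE 1  /* glibc exposes SEEK_DATA/SEEK_HOLE with this */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

struct data_extent { off_t start; off_t length; };

/* Return 1 if the first hole is at or beyond EOF (nothing to skip),
   0 if there is a hole before EOF, -1 on error.  Moves the offset.  */
static int
first_hole_at_eof (int fd)
{
  struct stat sb;
  if (fstat (fd, &sb) < 0)
    return -1;
  if (sb.st_size == 0)
    return 1;
#if defined SEEK_DATA && defined SEEK_HOLE
  off_t hole = lseek (fd, (off_t) 0, SEEK_HOLE);
  if (hole < 0)
    return -1;
  return hole >= sb.st_size;
#else
  return 1;
#endif
}

/* Store up to CAP data extents of FD in EXT.  Return the number
   stored, or -1 on error with errno set.  Moves the offset.  */
static int
list_data_extents (int fd, struct data_extent *ext, int cap)
{
  assert (ext != NULL && cap > 0);

  struct stat sb;
  if (fstat (fd, &sb) < 0)
    return -1;

  int n = 0;
#if defined SEEK_DATA && defined SEEK_HOLE
  off_t pos = 0;
  while (pos < sb.st_size && n < cap)
    {
      off_t data = lseek (fd, pos, SEEK_DATA);
      if (data < 0)
        {
          if (errno == ENXIO)   /* only a hole remains before EOF */
            break;
          return -1;
        }
      off_t hole = lseek (fd, data, SEEK_HOLE);
      if (hole < 0)
        return -1;
      ext[n].start = data;
      ext[n].length = hole - data;
      n++;
      pos = hole;
    }
#else
  if (sb.st_size > 0)
    {
      ext[0].start = 0;
      ext[0].length = sb.st_size;
      n = 1;
    }
#endif
  return n;
}
```

On a file system that does not track holes, lseek typically reports
the whole file as one data extent, so a caller gets a single entry
covering everything.
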
>> In my test environment (Solaris 10, SunOS 5.10 Generic_142910-17), I
>> have tried a few simple cases, and they work for me.
>>
>> For now, I am using diff(1) to verify the copy result.  Does anyone
>> know of utilities that could be used to write the test script?
>> I sent an email to the ZFS dev mailing list yesterday to ask this
>> question, and a nice guy suggested ZDB
>> (http://cuddletech.com/blog/?p=407); I am still studying this utility.
>> I also noticed there is a patch to add SEEK_HOLE/SEEK_DATA support to
>> the os module in the Python community; please refer to:
>> http://bugs.python.org/file19566/z.patch
>> but I think it requires a very recent Python build.  Could anyone
>> offer other advice on this point?
>>
>> The patch is shown below; any help with testing and any comments are
>> appreciated!
>>
>>
>> Thanks,
>> -Jeff
>>
>>
>> From: Jie Liu <jeff.liu@HIDDEN>
>> Date: Thu, 17 Feb 2011 21:14:23 +0800
>> Subject: [PATCH 1/1] copy: add SEEK_DATA/SEEK_HOLE support to extent_scan module
>>
>> * src/extent-scan.h: add src_total_size to struct extent_scan; we need
>>   to check the SEEK_HOLE result against it for the initial extent scan.
>>   Modify the extent_scan_init() signature to add size_t src_total_size.
>> * src/extent-scan.c: implement a new extent_scan_read() through SEEK_DATA
>>   and SEEK_HOLE.
>> * src/copy.c: pass src_total_size to extent_scan_init().
>>
>> Signed-off-by: Jie Liu <jeff.liu@HIDDEN>
>> ---
>>  src/copy.c        |    2 +-
>>  src/extent-scan.c |  113 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
>>  src/extent-scan.h |    9 +++-
>>  3 files changed, 120 insertions(+), 4 deletions(-)
>>
>> diff --git a/src/copy.c b/src/copy.c
>> index 104652d..22b9911 100644
>> --- a/src/copy.c
>> +++ b/src/copy.c
>> @@ -306,7 +306,7 @@ extent_copy (int src_fd, int dest_fd, char *buf, size_t buf_size,
>>       We may need this at the end, for a final ftruncate.  */
>>    off_t dest_pos = 0;
>>  
>> -  extent_scan_init (src_fd, &scan);
>> +  extent_scan_init (src_fd, src_total_size, &scan);
>>  
>>    *require_normal_copy = false;
>>    bool wrote_hole_at_eof = true;
>> diff --git a/src/extent-scan.c b/src/extent-scan.c
>> index 1ba59db..ffeab7a 100644
>> --- a/src/extent-scan.c
>> +++ b/src/extent-scan.c
>> @@ -32,13 +32,17 @@
>>  /* Allocate space for struct extent_scan, initialize the entries if
>>     necessary and return it as the input argument of extent_scan_read().  */
>>  extern void
>> -extent_scan_init (int src_fd, struct extent_scan *scan)
>> +extent_scan_init (int src_fd, size_t src_total_size,
>> +                  struct extent_scan *scan)
>>  {
>>    scan->fd = src_fd;
>>    scan->ei_count = 0;
>>    scan->scan_start = 0;
>>    scan->initial_scan_failed = false;
>>    scan->hit_final_extent = false;
>> +#if defined(SEEK_HOLE) && defined(SEEK_DATA)
>> +  scan->src_total_size = src_total_size;
>> +#endif
>>  }
>>  
>>  #ifdef __linux__
>> @@ -106,6 +110,113 @@ extent_scan_read (struct extent_scan *scan)
>>  
>>    return true;
>>  }
>> +#elif defined(SEEK_HOLE) && defined(SEEK_DATA)
>> +extern bool
>> +extent_scan_read (struct extent_scan *scan)
>> +{
>> +  off_t data_pos, hole_pos;
>> +  union { struct extent_info ei; char c[4096]; } extent_buf;
>> +  struct extent_info *ext_info = &extent_buf.ei;
>> +  enum { count = (sizeof extent_buf / sizeof *ext_info) };
>> +  verify (count != 0);
>> +
>> +  memset (&extent_buf, 0, sizeof extent_buf);
>> +
>> +  if (scan->scan_start == 0)
>> +    {
>> +# ifdef _PC_MIN_HOLE_SIZE
>> +      /* Determine whether the underlying file system supports
>> +         SEEK_HOLE; if not, fall back to the standard copy.  */
>> +      if (fpathconf (scan->fd, _PC_MIN_HOLE_SIZE) < 0)
>> +        {
>> +          scan->initial_scan_failed = true;
>> +          return false;
>> +        }
>> +# endif
>> +
>> +      /* If we have been compiled on an OS that supports SEEK_HOLE
>> +         but run on an OS that does not support SEEK_HOLE, we get
>> +         EINVAL.  If the underlying filesystem does not support the
>> +         SEEK_HOLE call, we get ENOTSUP, fall back to standard copy
>> +         in either case.  */
>> +      hole_pos = lseek (scan->fd, (off_t) 0, SEEK_HOLE);
>> +      if (hole_pos < 0)
>> +        {
>> +          if (errno == EINVAL || errno == ENOTSUP)
>> +            scan->initial_scan_failed = true;
>> +          return false;
>> +        }
>> +
>> +      /* Seek back to position 0 first if we detected a real hole.  */
>> +      if (hole_pos > 0)
>> +        {
>> +          off_t tmp_pos;
>> +          tmp_pos = lseek (scan->fd, (off_t) 0, SEEK_SET);
>> +          if (tmp_pos != (off_t) 0)
>> +              return false;
>> +
>> +          /* The source file is definitely not a sparse file, or it
>> +             may be a sparse file but SEEK_HOLE returns the source file's
>> +             total size, fall back to the standard copy too.  */
>> +          if (hole_pos >= scan->src_total_size)
>> +            {
>> +              scan->initial_scan_failed = true;
>> +              return false;
>> +            }
>> +        }
>> +    }
>> +
>> +  unsigned int i = 0;
>> +  /* If lseek(2) failed and the errno is set to ENXIO, for
>> +     SEEK_DATA there are no more data regions past the supplied
>> +     offset.  For SEEK_HOLE, there are no more holes past the
>> +     supplied offset.  Set scan->hit_final_extent to true for
>> +     either case.  */
>> +  do {
>> +    data_pos = lseek (scan->fd, scan->scan_start, SEEK_DATA);
>> +    if (data_pos < 0)
>> +      {
>> +        if (errno != ENXIO)
>> +          return false;
>> +        else
>> +          {
>> +            scan->hit_final_extent = true;
>> +            return true;
>> +          }
>> +      }
>> +
>> +    hole_pos = lseek (scan->fd, data_pos, SEEK_HOLE);
>> +    if (hole_pos < 0)
>> +      {
>> +        if (errno != ENXIO)
>> +          return false;
>> +        else
>> +          {
>> +            scan->hit_final_extent = true;
>> +            return true;
>> +          }
>> +      }
>> +
>> +    ext_info[i].ext_logical = data_pos;
>> +    ext_info[i].ext_length = hole_pos - data_pos;
>> +    scan->scan_start = hole_pos;
>> +    ++i;
>> +  } while (scan->scan_start < scan->src_total_size && i < count);
>> +
>> +  i--;
>> +  scan->ei_count = i;
>> +  scan->ext_info = xnmalloc (scan->ei_count, sizeof (struct extent_info));
>> +
>> +  for (i = 0; i < scan->ei_count; i++)
>> +    {
>> +      assert (ext_info[i].ext_logical <= OFF_T_MAX);
>> +
>> +      scan->ext_info[i].ext_logical = ext_info[i].ext_logical;
>> +      scan->ext_info[i].ext_length = ext_info[i].ext_length;
>> +    }
>> +
>> +  return true;
>> +}
>>  #else
>>  extern bool
>>  extent_scan_read (struct extent_scan *scan ATTRIBUTE_UNUSED)
>> diff --git a/src/extent-scan.h b/src/extent-scan.h
>> index 4724b25..a271b95 100644
>> --- a/src/extent-scan.h
>> +++ b/src/extent-scan.h
>> @@ -18,7 +18,6 @@
>>  
>>  #ifndef EXTENT_SCAN_H
>>  # define EXTENT_SCAN_H
>> -
>>  /* Structure used to store information of each extent.  */
>>  struct extent_info
>>  {
>> @@ -38,6 +37,11 @@ struct extent_scan
>>    /* File descriptor of extent scan run against.  */
>>    int fd;
>>  
>> +#if defined(SEEK_DATA) && defined(SEEK_HOLE)
>> +  /* Source file size, i.e., st_size as reported by fstat.  */
>> +  size_t src_total_size;
>> +#endif
>> +
>>    /* Next scan start offset.  */
>>    off_t scan_start;
>>  
>> @@ -55,7 +59,8 @@ struct extent_scan
>>    struct extent_info *ext_info;
>>  };
>>  
>> -void extent_scan_init (int src_fd, struct extent_scan *scan);
>> +void extent_scan_init (int src_fd, size_t src_total_size,
>> +                       struct extent_scan *scan);
>>  
>>  bool extent_scan_read (struct extent_scan *scan);
>>  
>> -- 
>> 1.7.4
>




--Apple-Mail-1-625867617
Content-Transfer-Encoding: quoted-printable
Content-Type: text/html;
	charset=us-ascii

<html><head></head><body style=3D"word-wrap: break-word; =
-webkit-nbsp-mode: space; -webkit-line-break: after-white-space; =
"><div><br></div><div><blockquote type=3D"cite"><div style=3D"word-wrap: =
break-word; -webkit-nbsp-mode: space; -webkit-line-break: =
after-white-space; ">Hi All,<div><br></div><div>Please ignore the =
current patch, I will submit another patch with a few fixes =
soon.</div></div></blockquote><br>Now the new patch set coming, =
&nbsp;</div><div><br></div><div>In previous post, I have tried to change =
the extent_scan_init() interface by adding a new argument to indicate =
the source file size,</div><div>this will reduce the overhead of call =
fstat(2) &nbsp;in extent_scan_read(), since the file size&nbsp;is =
definitely needed for SEEK* stuff, however, the file size is redundant =
for FIEMAP.</div><div>so I changed my idea to keep extent_scan_init() as =
before, &nbsp;instead, to retrieve&nbsp;the file size in =
extent_scan_read() when launching the first scan, one benefit is, there =
is nothing need to</div><div>be modified&nbsp;in extent_copy() for this =
patch.</div><div><br></div><div>Tests:</div><div>=3D=3D=3D=3D</div><div>A =
new test sparse-lseek was introduced in this post, it make use of the =
sparse file generation function in Perl, and do `cmp` against the target =
copied file.</div><div>I have also took a look at the `sdb` utility =
shipped with ZFS, but did not found any interesting stuff can be used =
for this test.</div><div><br></div><div>Test run passed on my =
environment as below,</div><div><br></div><div><div>bash-3.00# make =
check TESTS=3Dcp/sparse-lseek VERBOSE=3Dyes</div><div>make =
&nbsp;check-TESTS</div><div>make[1]: Entering directory =
`/coreutils/tests'</div><div>make[2]: Entering directory =
`/coreutils/tests'</div><div>PASS: =
cp/sparse-lseek</div><div>=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D</div><di=
v>1 test passed</div><div>=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D</div><di=
v>make[2]: Leaving directory `/coreutils/tests'</div><div>make[1]: =
Leaving directory `/coreutils/tests'</div><div>&nbsp;&nbsp;GEN &nbsp; =
&nbsp;vc_exe_in_TESTS</div><div>No differences =
encountered</div><div><br></div></div><div>Manual =
tests:</div><div>=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D</div><div>1. Ensure =
trailing blanks, test 0 size sparse file, non-sparse file, &nbsp;sparse =
file with hole start and hole end.</div><div>2. make syntax-check =
failed, I have no idea of this issue at the moment, &nbsp;I also tried =
to run make distcheck, looks the package building, install and uninstall =
procedures all passed,</div><div>but it also failed&nbsp;at the final =
stage, am I missing something here?&nbsp;</div><div><br></div><div>The =
logs which&nbsp;were shown as following,</div><div>bash-3.00# make =
syntax-check</div><div>GFDL_version</div><div>awk: syntax error near =
line 1</div><div>awk: bailing out near line 1</div><div>make: *** =
[sc_GFDL_version.z] Error 2</div><div><br></div><div>make =
distcheck:</div><div>=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D</div><div>=
......</div><div>make[1]: Entering directory =
`/coreutils'</div><div>&nbsp;&nbsp;GEN &nbsp; =
&nbsp;check-ls-dircolors</div><div>make my-distcheck</div><div>make[2]: =
Entering directory `/coreutils'</div><div>make =
syntax-check</div><div>make[3]: Entering directory =
`/coreutils'</div><div>GFDL_version</div><div>awk: syntax error near =
line 1</div><div>awk: bailing out near line 1</div><div>make[3]: *** =
[sc_GFDL_version.z] Error 2</div><div>make[3]: Leaving directory =
`/coreutils'</div><div>make[2]: *** [my-distcheck] Error =
2</div><div>make[2]: Leaving directory `/coreutils'</div><div>make[1]: =
*** [distcheck-hook] Error 2</div><div>make[1]: Leaving directory =
`/coreutils'</div><div>make: *** [distcheck] Error =
1</div><div><br></div><div><br></div><div><br></div><div>Below is the =
revised patch,</div><div><br></div><div><div>=46rom =
4f966c1fe6226f3f711faae120cd8bea78e722b8 Mon Sep 17 00:00:00 =
2001</div><div>From: Jie Liu &lt;<a =
href=3D"mailto:jeff.liu@HIDDEN">jeff.liu@HIDDEN</a>&gt;</div><div>=
Date: Tue, 19 Apr 2011 15:24:50 -0700</div><div>Subject: [PATCH 1/1] =
copy: add SEEK_DATA/SEEK_HOLE support to extent_scan =
module</div><div><br></div><div>* src/extent_scan.h: introduce =
src_total_size to struct extent_info, we</div><div>&nbsp;&nbsp;need it =
for lseek(2) iteration.</div><div>* src/extent_scan.c: implement a new =
extent_scan_read() through SEEK_DATA</div><div>&nbsp;&nbsp;and SEEK_HOLE =
if those stuff are supported.</div><div>* tests/cp/sparse-lseek: add a =
new test for lseek(2) extent =
copy.</div><div><br></div><div>Signed-off-by: Jie Liu &lt;<a =
href=3D"mailto:jeff.liu@HIDDEN">jeff.liu@HIDDEN</a>&gt;</div><div>=
---</div><div>&nbsp;src/extent-scan.c &nbsp; &nbsp; | &nbsp;119 =
+++++++++++++++++++++++++++++++++++++++++++++++++</div><div>&nbsp;src/exte=
nt-scan.h &nbsp; &nbsp; | &nbsp; &nbsp;5 =
++</div><div>&nbsp;tests/Makefile.am &nbsp; &nbsp; | &nbsp; &nbsp;1 =
+</div><div>&nbsp;tests/cp/sparse-lseek | &nbsp; 56 =
+++++++++++++++++++++++</div><div>&nbsp;4 files changed, 181 =
insertions(+), 0 deletions(-)</div><div>&nbsp;create mode 100755 =
tests/cp/sparse-lseek</div><div><br></div><div>diff --git =
a/src/extent-scan.c b/src/extent-scan.c</div><div>index da7eb9d..a54eca0 =
100644</div><div>--- a/src/extent-scan.c</div><div>+++ =
b/src/extent-scan.c</div><div>@@ -17,7 +17,9 @@</div><div>&nbsp;&nbsp; =
&nbsp;Written by Jie Liu (<a =
href=3D"mailto:jeff.liu@HIDDEN">jeff.liu@HIDDEN</a>). =
&nbsp;*/</div><div>&nbsp;</div><div>&nbsp;#include =
&lt;config.h&gt;</div><div>+#include =
&lt;fcntl.h&gt;</div><div>&nbsp;#include =
&lt;sys/types.h&gt;</div><div>+#include =
&lt;sys/stat.h&gt;</div><div>&nbsp;#include =
&lt;sys/ioctl.h&gt;</div><div>&nbsp;#include =
&lt;sys/utsname.h&gt;</div><div>&nbsp;#include =
&lt;assert.h&gt;</div><div>@@ -71,6 +73,9 @@ extent_scan_init (int =
src_fd, struct extent_scan *scan)</div><div>&nbsp;&nbsp; =
scan-&gt;initial_scan_failed =3D false;</div><div>&nbsp;&nbsp; =
scan-&gt;hit_final_extent =3D false;</div><div>&nbsp;&nbsp; =
scan-&gt;fm_flags =3D extent_need_sync () ? FIEMAP_FLAG_SYNC : =
0;</div><div>+#if defined (SEEK_DATA) &amp;&amp; defined =
(SEEK_HOLE)</div><div>+ &nbsp;scan-&gt;src_total_size =3D =
0;</div><div>+#endif</div><div>&nbsp;}</div><div>&nbsp;</div><div>&nbsp;#i=
fdef __linux__</div><div>@@ -204,6 +209,120 @@ extent_scan_read (struct =
extent_scan *scan)</div><div>&nbsp;</div><div>&nbsp;&nbsp; return =
true;</div><div>&nbsp;}</div><div>+#elif defined (SEEK_HOLE) &amp;&amp; =
defined (SEEK_DATA)</div><div>+extern bool</div><div>+extent_scan_read =
(struct extent_scan *scan)</div><div>+{</div><div>+ &nbsp;off_t =
data_pos, hole_pos;</div><div>+ &nbsp;union { struct extent_info ei; =
char c[4096]; } extent_buf;</div><div>+ &nbsp;struct extent_info =
*ext_info =3D &amp;extent_buf.ei;</div><div>+ &nbsp;enum { count =3D =
(sizeof extent_buf / sizeof *ext_info) };</div><div>+ &nbsp;verify =
(count !=3D 0);</div><div>+</div><div>+ &nbsp;memset (&amp;extent_buf, =
0, sizeof extent_buf);</div><div>+</div><div>+ &nbsp;if =
(scan-&gt;scan_start =3D=3D 0)</div><div>+ &nbsp; &nbsp;{</div><div>+# =
ifdef _PC_MIN_HOLE_SIZE</div><div>+ &nbsp; &nbsp; &nbsp;/* To determine =
if the underlying file system supports</div><div>+ &nbsp; &nbsp; &nbsp; =
&nbsp; SEEK_HOLE. &nbsp;If not, fall back to the standard copy. =
&nbsp;*/</div><div>+ &nbsp; &nbsp; &nbsp;if (fpathconf (scan-&gt;fd, =
_PC_MIN_HOLE_SIZE) &lt; 0)</div><div>+ &nbsp; &nbsp; &nbsp; =
&nbsp;{</div><div>+ &nbsp; &nbsp; &nbsp; &nbsp; =
&nbsp;scan-&gt;initial_scan_failed =3D true;</div><div>+ &nbsp; &nbsp; =
&nbsp; &nbsp; &nbsp;return false;</div><div>+ &nbsp; &nbsp; &nbsp; =
&nbsp;}</div><div>+# endif</div><div>+</div><div>+ &nbsp; &nbsp; =
&nbsp;/* If we have been compiled on an OS that supports =
SEEK_HOLE</div><div>+ &nbsp; &nbsp; &nbsp; &nbsp; but run on an OS that =
does not support SEEK_HOLE, we get</div><div>+ &nbsp; &nbsp; &nbsp; =
&nbsp; EINVAL. &nbsp;If the underlying file system does not support =
the</div><div>+ &nbsp; &nbsp; &nbsp; &nbsp; SEEK_HOLE call, we get =
ENOTSUP, setting initial_scan_failed</div><div>+ &nbsp; &nbsp; &nbsp; =
&nbsp; to true to fall back to the standard copy in either case. =
&nbsp;*/</div><div>+ &nbsp; &nbsp; &nbsp;hole_pos =3D lseek =
(scan-&gt;fd, (off_t) 0, SEEK_HOLE);</div><div>+ &nbsp; &nbsp; &nbsp;if =
(hole_pos &lt; 0)</div><div>+ &nbsp; &nbsp; &nbsp; &nbsp;{</div><div>+ =
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;if (errno =3D=3D EINVAL || errno =3D=3D =
ENOTSUP)</div><div>+ &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
&nbsp;scan-&gt;initial_scan_failed =3D true;</div><div>+ &nbsp; &nbsp; =
&nbsp; &nbsp; &nbsp;return false;</div><div>+ &nbsp; &nbsp; &nbsp; =
&nbsp;}</div><div>+</div><div>+ &nbsp; &nbsp; &nbsp;/* Seek back to =
position 0 first. &nbsp;*/</div><div>+ &nbsp; &nbsp; &nbsp;if (hole_pos =
&gt; 0)</div><div>+ &nbsp; &nbsp; &nbsp; &nbsp;{</div><div>+ &nbsp; =
&nbsp; &nbsp; &nbsp; &nbsp;if (lseek (scan-&gt;fd, (off_t) 0, SEEK_SET) =
&lt; 0)</div><div>+ &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;return =
false;</div><div>+ &nbsp; &nbsp; &nbsp; &nbsp;}</div><div>+</div><div>+ =
&nbsp; &nbsp; &nbsp;struct stat sb;</div><div>+ &nbsp; &nbsp; &nbsp;if =
(fstat (scan-&gt;fd, &amp;sb) &lt; 0)</div><div>+ &nbsp; &nbsp; &nbsp; =
&nbsp;return false;</div><div>+</div><div>+ &nbsp; &nbsp; &nbsp;/* This =
is definitely not a sparse file, we treat it as a big extent. =
&nbsp;*/</div><div>+ &nbsp; &nbsp; &nbsp;if (hole_pos &gt;=3D =
sb.st_size)</div><div>+ &nbsp; &nbsp; &nbsp; &nbsp;{</div><div>+ &nbsp; =
&nbsp; &nbsp; &nbsp; &nbsp;scan-&gt;ei_count =3D 1;</div><div>+ &nbsp; =
&nbsp; &nbsp; &nbsp; &nbsp;scan-&gt;ext_info =3D xnmalloc =
(scan-&gt;ei_count, sizeof (struct extent_info));</div><div>+ &nbsp; =
&nbsp; &nbsp; &nbsp; &nbsp;scan-&gt;ext_info[0].ext_logical =3D =
0;</div><div>+ &nbsp; &nbsp; &nbsp; &nbsp; =
&nbsp;scan-&gt;ext_info[0].ext_length =3D sb.st_size;</div><div>+ &nbsp; =
&nbsp; &nbsp; &nbsp; &nbsp;scan-&gt;hit_final_extent =3D =
true;</div><div>+ &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;return =
true;</div><div>+ &nbsp; &nbsp; &nbsp; &nbsp;}</div><div>+ &nbsp; &nbsp; =
&nbsp;scan-&gt;src_total_size =3D sb.st_size;</div><div>+ &nbsp; =
&nbsp;}</div><div>+</div><div>+ &nbsp;unsigned int i =3D 0;</div><div>+ =
&nbsp;/* If lseek(2) failed and the errno is set to ENXIO, =
for</div><div>+ &nbsp; &nbsp; SEEK_DATA there are no more data regions =
past the supplied</div><div>+ &nbsp; &nbsp; offset. &nbsp;For SEEK_HOLE, =
there are no more holes past the</div><div>+ &nbsp; &nbsp; supplied =
offset. &nbsp;Set scan-&gt;hit_final_extent to true in</div><div>+ =
&nbsp; &nbsp; either case. &nbsp;*/</div><div>+ &nbsp;while =
(scan-&gt;scan_start &lt; scan-&gt;src_total_size &amp;&amp; i &lt; =
count)</div><div>+ &nbsp; &nbsp;{</div><div>+ &nbsp; &nbsp; =
&nbsp;data_pos =3D lseek (scan-&gt;fd, scan-&gt;scan_start, =
SEEK_DATA);</div><div>+ &nbsp; &nbsp; &nbsp;if (data_pos &lt; =
0)</div><div>+ &nbsp; &nbsp; &nbsp; &nbsp;{</div><div>+ &nbsp; &nbsp; =
&nbsp; &nbsp; &nbsp;if (errno =3D=3D ENXIO)</div><div>+ &nbsp; &nbsp; =
&nbsp; &nbsp; &nbsp; &nbsp;{</div><div>+ &nbsp; &nbsp; &nbsp; &nbsp; =
&nbsp; &nbsp; &nbsp;scan-&gt;hit_final_extent =3D true;</div><div>+ =
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;break;</div><div>+ =
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;}</div><div>+ &nbsp; &nbsp; =
&nbsp; &nbsp; &nbsp;return false;</div><div>+ &nbsp; &nbsp; &nbsp; =
&nbsp;}</div><div>+</div><div>+ &nbsp; &nbsp; &nbsp;hole_pos =3D lseek =
(scan-&gt;fd, data_pos, SEEK_HOLE);</div><div>+ &nbsp; &nbsp; &nbsp;if =
(hole_pos &lt; 0)</div><div>+ &nbsp; &nbsp; &nbsp; &nbsp;{</div><div>+ =
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;if (errno =3D=3D ENXIO)</div><div>+ =
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;{</div><div>+ &nbsp; &nbsp; =
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;scan-&gt;hit_final_extent =3D =
true;</div><div>+ &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
&nbsp;hole_pos =3D scan-&gt;src_total_size;</div><div>+ &nbsp; &nbsp; =
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;if (data_pos &lt; =
hole_pos)</div><div>+ &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
&nbsp;goto preserve_ext_info;</div><div>+ &nbsp; &nbsp; &nbsp; &nbsp; =
&nbsp; &nbsp; &nbsp;break;</div><div>+ &nbsp; &nbsp; &nbsp; &nbsp; =
&nbsp; &nbsp;}</div><div>+ &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;return =
false;</div><div>+ &nbsp; &nbsp; &nbsp; =
&nbsp;}</div><div>+</div><div>+preserve_ext_info:</div><div>+ &nbsp; =
&nbsp; &nbsp;ext_info[i].ext_logical =3D data_pos;</div><div>+ &nbsp; =
&nbsp; &nbsp;ext_info[i].ext_length =3D hole_pos - data_pos;</div><div>+ =
&nbsp; &nbsp; &nbsp;scan-&gt;scan_start =3D hole_pos;</div><div>+ &nbsp; =
&nbsp; &nbsp;++i;</div><div>+ &nbsp; &nbsp;}</div><div>+</div><div>+ =
&nbsp;scan-&gt;ei_count =3D i;</div><div>+ &nbsp;scan-&gt;ext_info =3D =
xnmalloc (scan-&gt;ei_count, sizeof (struct =
extent_info));</div><div>+</div><div>+ &nbsp;for (i =3D 0; i &lt; =
scan-&gt;ei_count; i++)</div><div>+ &nbsp; &nbsp;{</div><div>+ &nbsp; =
&nbsp; &nbsp;assert (ext_info[i].ext_logical &lt;=3D =
OFF_T_MAX);</div><div>+</div><div>+ &nbsp; &nbsp; =
&nbsp;scan-&gt;ext_info[i].ext_logical =3D =
ext_info[i].ext_logical;</div><div>+ &nbsp; &nbsp; =
&nbsp;scan-&gt;ext_info[i].ext_length =3D =
ext_info[i].ext_length;</div><div>+ &nbsp; =
&nbsp;}</div><div>+</div><div>+ &nbsp;return (lseek (scan-&gt;fd, =
(off_t) 0, SEEK_SET) &lt; 0) ? false : =
true;</div><div>+}</div><div>&nbsp;#else</div><div>&nbsp;extern =
bool</div><div>&nbsp;extent_scan_read (struct extent_scan *scan =
ATTRIBUTE_UNUSED)</div><div>diff --git a/src/extent-scan.h =
b/src/extent-scan.h</div><div>index 5b4ded5..4fc05c6 =
100644</div><div>--- a/src/extent-scan.h</div><div>+++ =
b/src/extent-scan.h</div><div>@@ -38,6 +38,11 @@ struct =
extent_scan</div><div>&nbsp;&nbsp; /* File descriptor of extent scan run =
against. &nbsp;*/</div><div>&nbsp;&nbsp; int =
fd;</div><div>&nbsp;</div><div>+# if defined (SEEK_DATA) &amp;&amp; =
defined (SEEK_HOLE)</div><div>+ &nbsp;/* Source file size, i.e., (struct =
stat) &amp;statbuf.st_size. &nbsp;*/</div><div>+ &nbsp;off_t =
src_total_size;</div><div>+#endif</div><div>+</div><div>&nbsp;&nbsp; /* =
Next scan start offset. &nbsp;*/</div><div>&nbsp;&nbsp; off_t =
scan_start;</div><div>&nbsp;</div><div>diff --git a/tests/Makefile.am =
b/tests/Makefile.am</div><div>index 685eb52..6c596b9 =
100644</div><div>--- a/tests/Makefile.am</div><div>+++ =
b/tests/Makefile.am</div><div>@@ -28,6 +28,7 @@ root_tests =3D<span =
class=3D"Apple-tab-span" style=3D"white-space:pre">				=
	</span>\</div><div>&nbsp;&nbsp; cp/cp-mv-enotsup-xattr<span =
class=3D"Apple-tab-span" style=3D"white-space:pre">			=
</span>\</div><div>&nbsp;&nbsp; cp/capability<span =
class=3D"Apple-tab-span" style=3D"white-space:pre">				=
	</span>\</div><div>&nbsp;&nbsp; cp/sparse-fiemap<span =
class=3D"Apple-tab-span" style=3D"white-space:pre">				=
</span>\</div><div>+ &nbsp;cp/sparse-lseek &nbsp; &nbsp; &nbsp; &nbsp; =
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
&nbsp; \</div><div>&nbsp;&nbsp; dd/skip-seek-past-dev<span =
class=3D"Apple-tab-span" style=3D"white-space:pre">				=
</span>\</div><div>&nbsp;&nbsp; install/install-C-root<span =
class=3D"Apple-tab-span" style=3D"white-space:pre">			=
</span>\</div><div>&nbsp;&nbsp; ls/capability<span =
class=3D"Apple-tab-span" style=3D"white-space:pre">				=
	</span>\</div><div>diff --git a/tests/cp/sparse-lseek =
b/tests/cp/sparse-lseek</div><div>new file mode 100755</div><div>index =
0000000..5b8f2c1</div><div>--- /dev/null</div><div>+++ =
b/tests/cp/sparse-lseek</div><div>@@ -0,0 +1,56 =
@@</div><div>+#!/bin/sh</div><div>+# Test cp --sparse=3Dalways through =
lseek(SEEK_DATA/SEEK_HOLE) copy</div><div>+</div><div>+# Copyright (C) =
2010-2011 Free Software Foundation, Inc.</div><div>+</div><div>+# This =
program is free software: you can redistribute it and/or =
modify</div><div>+# it under the terms of the GNU General Public License =
as published by</div><div>+# the Free Software Foundation, either =
version 3 of the License, or</div><div>+# (at your option) any later =
version.</div><div>+</div><div>+# This program is distributed in the =
hope that it will be useful,</div><div>+# but WITHOUT ANY WARRANTY; =
without even the implied warranty of</div><div>+# MERCHANTABILITY or =
FITNESS FOR A PARTICULAR PURPOSE. &nbsp;See the</div><div>+# GNU General =
Public License for more details.</div><div>+</div><div>+# You should =
have received a copy of the GNU General Public License</div><div>+# =
along with this program. &nbsp;If not, see &lt;<a =
href=3D"http://www.gnu.org/licenses/">http://www.gnu.org/licenses/</a>&gt;=
.</div><div>+</div><div>+. "${srcdir=3D.}/init.sh"; path_prepend_ =
../src</div><div>+print_ver_ cp</div><div>+$PERL -e 1 || skip_test_ 'you =
lack =
perl'</div><div>+</div><div>+zfsdisk=3DdiskX</div><div>+zfspool=3Dseektest=
</div><div>+</div><div>+require_root_</div><div>+</div><div>+cwd=3D$PWD</d=
iv><div>+cleanup_() { zpool destroy $zfspool; =
}</div><div>+</div><div>+skip=3D0</div><div>+mkfile 128m "$cwd/$zfsdisk" =
|| skip=3D1</div><div>+</div><div>+# Check whether the seektest pool =
already exists</div><div>+zpool list $zfspool 2&gt;/dev/null =
&amp;&amp;</div><div>+ &nbsp;skip_test_ "$zfspool already =
exists"</div><div>+</div><div>+# Create pool and verify if it is mounted =
automatically</div><div>+zpool create $zfspool "$cwd/$zfsdisk" || =
skip=3D1</div><div>+zpool list $zfspool &gt;/dev/null || =
skip=3D1</div><div>+</div><div>+test $skip =3D 1 &amp;&amp; skip_test_ =
"insufficient ZFS support"</div><div>+</div><div>+for i in $(seq 1 2 =
21); do</div><div>+ &nbsp;for j in 1 2 31 100; do</div><div>+ &nbsp; =
&nbsp;$PERL -e 'BEGIN { $n =3D '$i' * 1024; *F =3D *STDOUT }' =
\</div><div>+ &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;-e 'for (1..'$j') { =
sysseek (*F, $n, 1)' \</div><div>+ &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;-e =
'&amp;&amp; syswrite (*F, chr($_)x$n) or die "$!"}' &gt; /$zfspool/j1 || =
fail=3D1</div><div>+</div><div>+ &nbsp; &nbsp;cp --sparse=3Dalways =
/$zfspool/j1 /$zfspool/j2 || fail=3D1</div><div>+ &nbsp; &nbsp;cmp =
/$zfspool/j1 /$zfspool/j2 || fail=3D1</div><div>+ &nbsp; &nbsp;test =
$fail =3D 1 &amp;&amp; break 2</div><div>+ =
&nbsp;done</div><div>+done</div><div>+</div><div>+Exit =
$fail</div><div>--&nbsp;</div><div>1.7.4</div><div><br></div><div><br></di=
v><div>Any comments are =
appreciated!</div><div><br></div><div>Thanks,</div><div>-Jeff</div></div><=
div><br></div><div><br></div><div><blockquote type=3D"cite"><div =
style=3D"word-wrap: break-word; -webkit-nbsp-mode: space; =
-webkit-line-break: after-white-space; =
"><div><br></div><div><br></div><div>Thanks,</div><div>-Jeff</div><div><br=
><div><br class=3D"Apple-interchange-newline"><blockquote =
type=3D"cite"><div style=3D"word-wrap: break-word; -webkit-nbsp-mode: =
space; -webkit-line-break: after-white-space; "><div>Hello =
All,</div><div><br></div><div>This is the first attempt to introduce =
SEEK_DATA/SEEK_HOLE support to the extent_scan module for efficient =
sparse file copy on ZFS. &nbsp;I have delayed it for a long time, sorry =
for that!</div><div><br></div><div>Below is the list of code =
changes:</div><div>src/extent-scan.h: add a new member =
'src_total_size' to "struct extent_scan", since I need this value to =
determine</div><div>whether a file is sparse during the initial scan. =
&nbsp;If lseek(fd, 0, SEEK_HOLE) returns an offset greater than or =
equal to src_total_size, the source file</div><div>is either =
definitely not sparse, or it is sparse but there is no point in =
proceeding with the extent scan.</div><div>The other change in this =
file is the signature of extent_scan_init(), which, as mentioned =
above, needs to accept the src_total_size =
value.</div><div>src/extent-scan.c: implement the new =
extent_scan_read() through SEEK_DATA/SEEK_HOLE; it is used if those =
two values are defined in &lt;unistd.h&gt;.</div><div>src/copy.c: =
pass src_total_size to extent_scan_init().</div><div><br></div><div>On =
my test environment (Solaris 10, SunOS 5.10 Generic_142910-17), I have =
tried a few simple cases and they work for =
me.</div><div><br></div><div>For now, I am using diff(1) to verify the =
copy result. &nbsp;Does anyone know of utilities that can be used to =
write the test script?</div><div>I sent an email to the ZFS dev mailing =
list yesterday to ask this question, and someone kindly suggested =
using ZDB (<a =
href=3D"http://cuddletech.com/blog/?p=3D407">http://cuddletech.com/blog/?p=
=3D407</a>) for that; I am</div><div>still studying this utility. =
&nbsp;I also noticed there is a patch to add SEEK_HOLE/SEEK_DATA =
support to the os module in the Python community; please refer =
to:</div><div><a =
href=3D"http://bugs.python.org/file19566/z.patch">http://bugs.python.org/f=
ile19566/z.patch</a></div><div>but I think it requires a very recent =
Python build, so could anyone give some other advice on this =
point?</div><div><br></div><div>The patch follows; any help with =
testing and any comments are =
appreciated!!</div><div><br></div><div><br></div><div>Thanks,</div><div>-J=
eff</div><div><br></div><div><br></div><div>From: Jie Liu &lt;<a =
href=3D"mailto:jeff.liu@HIDDEN">jeff.liu@HIDDEN</a>&gt;</div><div>=
Date: Thu, 17 Feb 2011 21:14:23 +0800</div><div>Subject: [PATCH 1/1] =
copy: add SEEK_DATA/SEEK_HOLE support to extent_scan =
module</div><div><br></div><div>* src/extent-scan.h: add src_total_size =
to struct extent_scan; we need</div><div>&nbsp;&nbsp;to check the =
SEEK_HOLE result against it for the initial extent =
scan.</div><div>&nbsp;&nbsp;Modify the extent_scan_init() signature to =
add size_t src_total_size.</div><div>* src/extent-scan.c: implement a =
new extent_scan_read() through SEEK_DATA</div><div>&nbsp;&nbsp;and =
SEEK_HOLE.</div><div>* src/copy.c: pass src_total_size to =
extent_scan_init().</div><div><br></div><div>Signed-off-by: Jie Liu =
&lt;<a =
href=3D"mailto:jeff.liu@HIDDEN">jeff.liu@HIDDEN</a>&gt;</div><div>=
---</div><div>&nbsp;src/copy.c &nbsp; &nbsp; &nbsp; &nbsp;| &nbsp; =
&nbsp;2 +-</div><div>&nbsp;src/extent-scan.c | &nbsp;113 =
++++++++++++++++++++++++++++++++++++++++++++++++++++-</div><div>&nbsp;src/=
extent-scan.h | &nbsp; &nbsp;9 +++-</div><div>&nbsp;3 files changed, 120 =
insertions(+), 4 deletions(-)</div><div><br></div><div>diff --git =
a/src/copy.c b/src/copy.c</div><div>index 104652d..22b9911 =
100644</div><div>--- a/src/copy.c</div><div>+++ =
b/src/copy.c</div><div>@@ -306,7 +306,7 @@ extent_copy (int src_fd, int =
dest_fd, char *buf, size_t buf_size,</div><div>&nbsp;&nbsp; &nbsp; =
&nbsp;We may need this at the end, for a final ftruncate. =
&nbsp;*/</div><div>&nbsp;&nbsp; off_t dest_pos =3D =
0;</div><div>&nbsp;</div><div>- &nbsp;extent_scan_init (src_fd, =
&amp;scan);</div><div>+ &nbsp;extent_scan_init (src_fd, src_total_size, =
&amp;scan);</div><div>&nbsp;</div><div>&nbsp;&nbsp; *require_normal_copy =
=3D false;</div><div>&nbsp;&nbsp; bool wrote_hole_at_eof =3D =
true;</div><div>diff --git a/src/extent-scan.c =
b/src/extent-scan.c</div><div>index 1ba59db..ffeab7a =
100644</div><div>--- a/src/extent-scan.c</div><div>+++ =
b/src/extent-scan.c</div><div>@@ -32,13 +32,17 @@</div><div>&nbsp;/* =
Allocate space for struct extent_scan, initialize the entries =
if</div><div>&nbsp;&nbsp; &nbsp;necessary and return it as the input =
argument of extent_scan_read(). &nbsp;*/</div><div>&nbsp;extern =
void</div><div>-extent_scan_init (int src_fd, struct extent_scan =
*scan)</div><div>+extent_scan_init (int src_fd, size_t =
src_total_size,</div><div>+ &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
&nbsp; &nbsp; &nbsp;struct extent_scan =
*scan)</div><div>&nbsp;{</div><div>&nbsp;&nbsp; scan-&gt;fd =3D =
src_fd;</div><div>&nbsp;&nbsp; scan-&gt;ei_count =3D =
0;</div><div>&nbsp;&nbsp; scan-&gt;scan_start =3D =
0;</div><div>&nbsp;&nbsp; scan-&gt;initial_scan_failed =3D =
false;</div><div>&nbsp;&nbsp; scan-&gt;hit_final_extent =3D =
false;</div><div>+#if defined(SEEK_HOLE) &amp;&amp; =
defined(SEEK_DATA)</div><div>+ &nbsp;scan-&gt;src_total_size =3D =
src_total_size;</div><div>+#endif</div><div>&nbsp;}</div><div>&nbsp;</div>=
<div>&nbsp;#ifdef __linux__</div><div>@@ -106,6 +110,113 @@ =
extent_scan_read (struct extent_scan =
*scan)</div><div>&nbsp;</div><div>&nbsp;&nbsp; return =
true;</div><div>&nbsp;}</div><div>+#elif defined(SEEK_HOLE) &amp;&amp; =
defined(SEEK_DATA)</div><div>+extern bool</div><div>+extent_scan_read =
(struct extent_scan *scan)</div><div>+{</div><div>+ &nbsp;off_t =
data_pos, hole_pos;</div><div>+ &nbsp;union { struct extent_info ei; =
char c[4096]; } extent_buf;</div><div>+ &nbsp;struct extent_info =
*ext_info =3D &amp;extent_buf.ei;</div><div>+ &nbsp;enum { count =3D =
(sizeof extent_buf / sizeof *ext_info) };</div><div>+ &nbsp;verify =
(count !=3D 0);</div><div>+</div><div>+ &nbsp;memset (&amp;extent_buf, =
0, sizeof extent_buf);</div><div>+</div><div>+ &nbsp;if =
(scan-&gt;scan_start =3D=3D 0)</div><div>+ &nbsp; &nbsp;{</div><div>+# =
ifdef _PC_MIN_HOLE_SIZE</div><div>+ &nbsp; &nbsp; &nbsp;/* To determine =
if the underlying file system supports</div><div>+ &nbsp; &nbsp; &nbsp; =
&nbsp; SEEK_HOLE, if not, fall back to the standard copy. =
&nbsp;*/</div><div>+ &nbsp; &nbsp; &nbsp;if (fpathconf (scan-&gt;fd, =
_PC_MIN_HOLE_SIZE) &lt; 0)</div><div>+ &nbsp; &nbsp; &nbsp; =
&nbsp;{</div><div>+ &nbsp; &nbsp; &nbsp; &nbsp; =
&nbsp;scan-&gt;initial_scan_failed =3D true;</div><div>+ &nbsp; &nbsp; =
&nbsp; &nbsp; &nbsp;return false;</div><div>+ &nbsp; &nbsp; &nbsp; =
&nbsp;}</div><div>+# endif</div><div>+</div><div>+ &nbsp; &nbsp; =
&nbsp;/* If we have been compiled on an OS that supports =
SEEK_HOLE</div><div>+ &nbsp; &nbsp; &nbsp; &nbsp; but run on an OS that =
does not support SEEK_HOLE, we get</div><div>+ &nbsp; &nbsp; &nbsp; =
&nbsp; EINVAL. &nbsp;If the underlying filesystem does not support =
the</div><div>+ &nbsp; &nbsp; &nbsp; &nbsp; SEEK_HOLE call, we get =
ENOTSUP, fall back to standard copy</div><div>+ &nbsp; &nbsp; &nbsp; =
&nbsp; in either case. &nbsp;*/</div><div>+ &nbsp; &nbsp; &nbsp;hole_pos =
=3D lseek (scan-&gt;fd, (off_t) 0, SEEK_HOLE);</div><div>+ &nbsp; &nbsp; =
&nbsp;if (hole_pos &lt; 0)</div><div>+ &nbsp; &nbsp; &nbsp; =
&nbsp;{</div><div>+ &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;if (errno =3D=3D =
EINVAL || errno =3D=3D ENOTSUP)</div><div>+ &nbsp; &nbsp; &nbsp; &nbsp; =
&nbsp; &nbsp;scan-&gt;initial_scan_failed =3D true;</div><div>+ &nbsp; =
&nbsp; &nbsp; &nbsp; &nbsp;return false;</div><div>+ &nbsp; &nbsp; =
&nbsp; &nbsp;}</div><div>+</div><div>+ &nbsp; &nbsp; &nbsp;/* Seek back =
to position 0 first if we detected a real hole. &nbsp;*/</div><div>+ =
&nbsp; &nbsp; &nbsp;if (hole_pos &gt; 0)</div><div>+ &nbsp; &nbsp; =
&nbsp; &nbsp;{</div><div>+ &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;off_t =
tmp_pos;</div><div>+ &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;tmp_pos =3D lseek =
(scan-&gt;fd, (off_t) 0, SEEK_SET);</div><div>+ &nbsp; &nbsp; &nbsp; =
&nbsp; &nbsp;if (tmp_pos !=3D (off_t) 0)</div><div>+ &nbsp; &nbsp; =
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;return false;</div><div>+</div><div>+ =
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;/* The source file is definitely not a =
sparse file, or it</div><div>+ &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
may be a sparse file but SEEK_HOLE returns the source file's</div><div>+ =
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; total size, fall back to the =
standard copy too. &nbsp;*/</div><div>+ &nbsp; &nbsp; &nbsp; &nbsp; =
&nbsp;if (hole_pos &gt;=3D scan-&gt;src_total_size)</div><div>+ &nbsp; =
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;{</div><div>+ &nbsp; &nbsp; &nbsp; =
&nbsp; &nbsp; &nbsp; &nbsp;scan-&gt;initial_scan_failed =3D =
true;</div><div>+ &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;return =
false;</div><div>+ &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
&nbsp;}</div><div>+ &nbsp; &nbsp; &nbsp; &nbsp;}</div><div>+ &nbsp; =
&nbsp;}</div><div>+</div><div>+ &nbsp;unsigned int i =3D 0;</div><div>+ =
&nbsp;/* If lseek(2) failed and the errno is set to ENXIO, =
for</div><div>+ &nbsp; &nbsp; SEEK_DATA there are no more data regions =
past the supplied</div><div>+ &nbsp; &nbsp; offset. &nbsp;For SEEK_HOLE, =
there are no more holes past the&nbsp;</div><div>+ &nbsp; &nbsp; =
supplied offset. &nbsp;Set scan-&gt;hit_final_extent to true =
for</div><div>+ &nbsp; &nbsp; either case. &nbsp;*/</div><div>+ &nbsp;do =
{</div><div>+ &nbsp; &nbsp;data_pos =3D lseek (scan-&gt;fd, =
scan-&gt;scan_start, SEEK_DATA);</div><div>+ &nbsp; &nbsp;if (data_pos =
&lt; 0)</div><div>+ &nbsp; &nbsp; &nbsp;{</div><div>+ &nbsp; &nbsp; =
&nbsp; &nbsp;if (errno !=3D ENXIO)</div><div>+ &nbsp; &nbsp; &nbsp; =
&nbsp; &nbsp;return false;</div><div>+ &nbsp; &nbsp; &nbsp; =
&nbsp;else</div><div>+ &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;{</div><div>+ =
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;scan-&gt;hit_final_extent =3D =
true;</div><div>+ &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;return =
true;</div><div>+ &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;}</div><div>+ &nbsp; =
&nbsp; &nbsp;}</div><div>+</div><div>+ &nbsp; &nbsp;hole_pos =3D lseek =
(scan-&gt;fd, data_pos, SEEK_HOLE);</div><div>+ &nbsp; &nbsp;if =
(hole_pos &lt; 0)</div><div>+ &nbsp; &nbsp; &nbsp;{</div><div>+ &nbsp; =
&nbsp; &nbsp; &nbsp;if (errno !=3D ENXIO)</div><div>+ &nbsp; &nbsp; =
&nbsp; &nbsp; &nbsp;return false;</div><div>+ &nbsp; &nbsp; &nbsp; =
&nbsp;else</div><div>+ &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;{</div><div>+ =
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;scan-&gt;hit_final_extent =3D =
true;</div><div>+ &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;return =
true;</div><div>+ &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;}</div><div>+ &nbsp; =
&nbsp; &nbsp;}</div><div>+</div><div>+ &nbsp; =
&nbsp;ext_info[i].ext_logical =3D data_pos;</div><div>+ &nbsp; =
&nbsp;ext_info[i].ext_length =3D hole_pos - data_pos;</div><div>+ &nbsp; =
&nbsp;scan-&gt;scan_start =3D hole_pos;</div><div>+ &nbsp; =
&nbsp;++i;</div><div>+ &nbsp;} while (scan-&gt;scan_start &lt; =
scan-&gt;src_total_size &amp;&amp; i &lt; =
count);</div><div>+</div><div>+ &nbsp;i--;</div><div>+ =
&nbsp;scan-&gt;ei_count =3D i;</div><div>+ &nbsp;scan-&gt;ext_info =3D =
xnmalloc (scan-&gt;ei_count, sizeof (struct =
extent_info));</div><div>+</div><div>+ &nbsp;for (i =3D 0; i &lt; =
scan-&gt;ei_count; i++)</div><div>+ &nbsp; &nbsp;{</div><div>+ &nbsp; =
&nbsp; &nbsp;assert (ext_info[i].ext_logical &lt;=3D =
OFF_T_MAX);</div><div>+</div><div>+ &nbsp; &nbsp; =
&nbsp;scan-&gt;ext_info[i].ext_logical =3D =
ext_info[i].ext_logical;</div><div>+ &nbsp; &nbsp; =
&nbsp;scan-&gt;ext_info[i].ext_length =3D =
ext_info[i].ext_length;</div><div>+ &nbsp; =
&nbsp;}</div><div>+</div><div>+ &nbsp;return =
true;&nbsp;</div><div>+}</div><div>&nbsp;#else</div><div>&nbsp;extern =
bool</div><div>&nbsp;extent_scan_read (struct extent_scan *scan =
ATTRIBUTE_UNUSED)</div><div>diff --git a/src/extent-scan.h =
b/src/extent-scan.h</div><div>index 4724b25..a271b95 =
100644</div><div>--- a/src/extent-scan.h</div><div>+++ =
b/src/extent-scan.h</div><div>@@ -18,7 +18,6 =
@@</div><div>&nbsp;</div><div>&nbsp;#ifndef =
EXTENT_SCAN_H</div><div>&nbsp;# define =
EXTENT_SCAN_H</div><div>-</div><div>&nbsp;/* Structure used to store =
information of each extent. &nbsp;*/</div><div>&nbsp;struct =
extent_info</div><div>&nbsp;{</div><div>@@ -38,6 +37,11 @@ struct =
extent_scan</div><div>&nbsp;&nbsp; /* File descriptor of extent scan run =
against. &nbsp;*/</div><div>&nbsp;&nbsp; int =
fd;</div><div>&nbsp;</div><div>+#if defined(SEEK_DATA) &amp;&amp; =
defined(SEEK_HOLE)</div><div>+ &nbsp;/* Source file size, i.e, (struct =
stat) &amp;statbuf.st_size. &nbsp;*/</div><div>+ &nbsp;size_t =
src_total_size;</div><div>+#endif</div><div>+</div><div>&nbsp;&nbsp; /* =
Next scan start offset. &nbsp;*/</div><div>&nbsp;&nbsp; off_t =
scan_start;</div><div>&nbsp;</div><div>@@ -55,7 +59,8 @@ struct =
extent_scan</div><div>&nbsp;&nbsp; struct extent_info =
*ext_info;</div><div>&nbsp;};</div><div>&nbsp;</div><div>-void =
extent_scan_init (int src_fd, struct extent_scan *scan);</div><div>+void =
extent_scan_init (int src_fd, size_t src_total_size,</div><div>+ &nbsp; =
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; =
struct extent_scan *scan);</div><div>&nbsp;</div><div>&nbsp;bool =
extent_scan_read (struct extent_scan =
*scan);</div><div>&nbsp;</div><div>--&nbsp;</div><div>1.7.4</div></div></b=
lockquote></div><br></div></div></blockquote><div><div style=3D"word-wrap:=
 break-word; -webkit-nbsp-mode: space; -webkit-line-break: =
after-white-space; =
"><div><br></div><div><br></div></div></div></div><br></body></html>=

--Apple-Mail-1-625867617--




Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils@HIDDEN:
bug#8061; Package coreutils. Full text available.

Message received at submit <at> debbugs.gnu.org:


Received: (at submit) by debbugs.gnu.org; 18 Apr 2011 14:16:14 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Mon Apr 18 10:16:14 2011
Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.69)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1QBpFK-0001ha-Jh
	for submit <at> debbugs.gnu.org; Mon, 18 Apr 2011 10:16:14 -0400
Received: from eggs.gnu.org ([140.186.70.92])
	by debbugs.gnu.org with esmtp (Exim 4.69)
	(envelope-from <jeff.liu@HIDDEN>) id 1QBpFH-0001h8-0q
	for submit <at> debbugs.gnu.org; Mon, 18 Apr 2011 10:16:08 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <jeff.liu@HIDDEN>) id 1QBpF7-0000h8-P2
	for submit <at> debbugs.gnu.org; Mon, 18 Apr 2011 10:16:01 -0400
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on eggs.gnu.org
X-Spam-Level: 
X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00,HTML_MESSAGE,
	RCVD_IN_DNSWL_MED, T_RP_MATCHES_RCVD autolearn=unavailable version=3.3.1
Received: from lists.gnu.org ([140.186.70.17]:44165)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <jeff.liu@HIDDEN>) id 1QBpF7-0000h4-NG
	for submit <at> debbugs.gnu.org; Mon, 18 Apr 2011 10:15:57 -0400
Received: from eggs.gnu.org ([140.186.70.92]:42879)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <jeff.liu@HIDDEN>) id 1QBpF3-0006wU-I4
	for bug-coreutils@HIDDEN; Mon, 18 Apr 2011 10:15:57 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <jeff.liu@HIDDEN>) id 1QBpEx-0000g4-8J
	for bug-coreutils@HIDDEN; Mon, 18 Apr 2011 10:15:53 -0400
Received: from rcsinet10.oracle.com ([148.87.113.121]:56885)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <jeff.liu@HIDDEN>) id 1QBpEw-0000fo-UI
	for bug-coreutils@HIDDEN; Mon, 18 Apr 2011 10:15:47 -0400
Received: from acsinet15.oracle.com (acsinet15.oracle.com [141.146.126.227])
	by rcsinet10.oracle.com (Switch-3.4.2/Switch-3.4.2) with ESMTP id
	p3IEFga9001504
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK);
	Mon, 18 Apr 2011 14:15:44 GMT
Received: from acsmt358.oracle.com (acsmt358.oracle.com [141.146.40.158])
	by acsinet15.oracle.com (Switch-3.4.2/Switch-3.4.1) with ESMTP id
	p3IEFe4G019882
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Mon, 18 Apr 2011 14:15:40 GMT
Received: from abhmt016.oracle.com (abhmt016.oracle.com [141.146.116.25])
	by acsmt358.oracle.com (8.12.11.20060308/8.12.11) with ESMTP id
	p3IEFdvj013419; Mon, 18 Apr 2011 09:15:40 -0500
Received: from [192.168.1.100] (/123.119.97.134)
	by default (Oracle Beehive Gateway v4.0)
	with ESMTP ; Mon, 18 Apr 2011 07:15:34 -0700
Subject: Re: Introduce SEEK_DATA/SEEK_HOLE to extent_scan module
Mime-Version: 1.0 (Apple Message framework v1082)
Content-Type: multipart/alternative; boundary=Apple-Mail-3-558882838
From: Jeff liu <jeff.liu@HIDDEN>
In-Reply-To: <2DB776C1-EF34-423D-8BE5-71C2F49DFF01@HIDDEN>
Date: Mon, 18 Apr 2011 22:15:13 +0800
Message-Id: <BE690E2C-C275-4B28-8ACB-9616D367EF96@HIDDEN>
References: <2DB776C1-EF34-423D-8BE5-71C2F49DFF01@HIDDEN>
To: Jeff liu <jeff.liu@HIDDEN>
X-Mailer: Apple Mail (2.1082)
X-Source-IP: acsmt358.oracle.com [141.146.40.158]
X-Auth-Type: Internal IP
X-CT-RefId: str=0001.0A090201.4DAC478D.005F:SCFSTAT5015188, ss=1, pt=DBB_65838,
	fgs=0
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 3)
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 3)
X-Received-From: 140.186.70.17
X-Spam-Score: -6.4 (------)
X-Debbugs-Envelope-To: submit
Cc: bug-coreutils@HIDDEN, Jim Meyering <jim@HIDDEN>
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.11
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <http://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>,
	<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <http://debbugs.gnu.org/pipermail/debbugs-submit>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <http://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>,
	<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Sender: debbugs-submit-bounces <at> debbugs.gnu.org
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
X-Spam-Score: -6.4 (------)


--Apple-Mail-3-558882838
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain;
	charset=GB2312

Hi All,

Please ignore the current patch; I will submit another patch with a few
fixes soon.


Thanks,
-Jeff

On 2011-2-17, at 9:57 PM, Jeff liu wrote:

> Hello All,
>
> This is the first attempt to introduce SEEK_DATA/SEEK_HOLE support to the
> extent_scan module for efficient sparse file copy on ZFS.  I have delayed
> it for a long time, sorry for that!
>
> Below is the list of code changes:
> src/extent-scan.h: add a new member 'src_total_size' to "struct
> extent_scan", since I need this value to determine whether a file is
> sparse during the initial scan.  If lseek(fd, 0, SEEK_HOLE) returns an
> offset greater than or equal to src_total_size, the source file is either
> definitely not sparse, or it is sparse but there is no point in proceeding
> with the extent scan.
> The other change in this file is the signature of extent_scan_init(),
> which, as mentioned above, needs to accept the src_total_size value.
> src/extent-scan.c: implement the new extent_scan_read() through
> SEEK_DATA/SEEK_HOLE; it is used if those two values are defined in
> <unistd.h>.
> src/copy.c: pass src_total_size to extent_scan_init().
>
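For readers following the description above, here is a minimal,
standalone sketch of the SEEK_DATA/SEEK_HOLE walk.  It is not the patch
itself: struct data_range, file_has_hole() and list_data_ranges() are
hypothetical names used only for illustration, and the sketch assumes a
platform whose <unistd.h> defines both SEEK_DATA and SEEK_HOLE (it
reports ENOTSUP otherwise).

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE    /* glibc exposes SEEK_DATA/SEEK_HOLE under this */
#endif
#include <errno.h>
#include <fcntl.h>      /* open(), used by callers of this sketch */
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical record for this sketch; not the patch's struct extent_info.  */
struct data_range
{
  off_t start;
  off_t length;
};

/* Return 1 if FD has a hole before end of file, 0 if it does not,
   -1 on error.  Mirrors the initial check described above: a first
   hole at or past st_size means there is nothing sparse to exploit.  */
int
file_has_hole (int fd)
{
#if defined SEEK_DATA && defined SEEK_HOLE
  struct stat sb;
  if (fstat (fd, &sb) < 0)
    return -1;
  if (sb.st_size == 0)
    return 0;                   /* lseek would fail with ENXIO here */
  off_t hole = lseek (fd, 0, SEEK_HOLE);
  if (hole < 0)
    return -1;
  if (lseek (fd, 0, SEEK_SET) < 0)      /* rewind after probing */
    return -1;
  return hole < sb.st_size;
#else
  (void) fd;
  errno = ENOTSUP;
  return -1;
#endif
}

/* Store up to MAX data ranges of FD in OUT, scanning from offset 0.
   Return the number of ranges stored, or -1 on error.  */
int
list_data_ranges (int fd, struct data_range *out, int max)
{
#if defined SEEK_DATA && defined SEEK_HOLE
  struct stat sb;
  if (fstat (fd, &sb) < 0)
    return -1;

  off_t pos = 0;
  int n = 0;
  while (pos < sb.st_size && n < max)
    {
      off_t data = lseek (fd, pos, SEEK_DATA);
      if (data < 0)
        {
          if (errno == ENXIO)   /* only a trailing hole remains */
            break;
          return -1;
        }
      /* Every file has an implicit hole at EOF, so this finds the
         end of the data range that starts at DATA.  */
      off_t hole = lseek (fd, data, SEEK_HOLE);
      if (hole < 0)
        return -1;
      out[n].start = data;
      out[n].length = hole - data;
      n++;
      pos = hole;
    }

  if (lseek (fd, 0, SEEK_SET) < 0)      /* leave the offset at the start */
    return -1;
  return n;
#else
  (void) fd; (void) out; (void) max;
  errno = ENOTSUP;
  return -1;
#endif
}
```

A caller would open the source with open (path, O_RDONLY), call
file_has_hole() once, and fall back to a plain copy when it returns 0 or
fails; otherwise it copies only the ranges from list_data_ranges().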
> On my test environment (Solaris 10, SunOS 5.10 Generic_142910-17), I have
> tried a few simple cases and they work for me.
>
> For now, I am using diff(1) to verify the copy result.  Does anyone know
> of utilities that can be used to write the test script?
> I sent an email to the ZFS dev mailing list yesterday to ask this
> question, and someone kindly suggested using ZDB
> (http://cuddletech.com/blog/?p=407) for that; I am still studying this
> utility.  I also noticed there is a patch to add SEEK_HOLE/SEEK_DATA
> support to the os module in the Python community; please refer to:
> http://bugs.python.org/file19566/z.patch
> but I think it requires a very recent Python build, so could anyone give
> some other advice on this point?
>
> The patch follows; any help with testing and any comments are appreciated!
>
>
> Thanks,
> -Jeff
>
>
> From: Jie Liu <jeff.liu@HIDDEN>
> Date: Thu, 17 Feb 2011 21:14:23 +0800
> Subject: [PATCH 1/1] copy: add SEEK_DATA/SEEK_HOLE support to =
extent_scan module
>=20
> * src/extent_scan.h: add src_total_size to struct extent_info, we need
>   to check the SEEK_HOLE result against it for initial extent scan.
>   modify the extent_scan_init() signature, to add size_t =
src_total_size.
> * src/extent_scan.c: implement a new extent_scan_read() through =
SEEK_DATA
>   and SEEK_HOLE.
> * src/copy.c: pass src_total_size to extent_scan_init().
>=20
> Signed-off-by: Jie Liu <jeff.liu@HIDDEN>
> ---
>  src/copy.c        |    2 +-
>  src/extent-scan.c |  113 =
++++++++++++++++++++++++++++++++++++++++++++++++++++-
>  src/extent-scan.h |    9 +++-
>  3 files changed, 120 insertions(+), 4 deletions(-)
>=20
> diff --git a/src/copy.c b/src/copy.c
> index 104652d..22b9911 100644
> --- a/src/copy.c
> +++ b/src/copy.c
> @@ -306,7 +306,7 @@ extent_copy (int src_fd, int dest_fd, char *buf, =
size_t buf_size,
>       We may need this at the end, for a final ftruncate.  */
>    off_t dest_pos =3D 0;
> =20
> -  extent_scan_init (src_fd, &scan);
> +  extent_scan_init (src_fd, src_total_size, &scan);
> =20
>    *require_normal_copy =3D false;
>    bool wrote_hole_at_eof =3D true;
> diff --git a/src/extent-scan.c b/src/extent-scan.c
> index 1ba59db..ffeab7a 100644
> --- a/src/extent-scan.c
> +++ b/src/extent-scan.c
> @@ -32,13 +32,17 @@
>  /* Allocate space for struct extent_scan, initialize the entries if
>     necessary and return it as the input argument of =
extent_scan_read().  */
>  extern void
> -extent_scan_init (int src_fd, struct extent_scan *scan)
> +extent_scan_init (int src_fd, size_t src_total_size,
> +                  struct extent_scan *scan)
>  {
>    scan->fd =3D src_fd;
>    scan->ei_count =3D 0;
>    scan->scan_start =3D 0;
>    scan->initial_scan_failed =3D false;
>    scan->hit_final_extent =3D false;
> +#if defined(SEEK_HOLE) && defined(SEEK_DATA)
> +  scan->src_total_size =3D src_total_size;
> +#endif
>  }
> =20
>  #ifdef __linux__
> @@ -106,6 +110,113 @@ extent_scan_read (struct extent_scan *scan)
> =20
>    return true;
>  }
> +#elif defined(SEEK_HOLE) && defined(SEEK_DATA)
> +extern bool
> +extent_scan_read (struct extent_scan *scan)
> +{
> +  off_t data_pos, hole_pos;
> +  union { struct extent_info ei; char c[4096]; } extent_buf;
> +  struct extent_info *ext_info =3D &extent_buf.ei;
> +  enum { count =3D (sizeof extent_buf / sizeof *ext_info) };
> +  verify (count !=3D 0);
> +
> +  memset (&extent_buf, 0, sizeof extent_buf);
> +
> +  if (scan->scan_start =3D=3D 0)
> +    {
> +# ifdef _PC_MIN_HOLE_SIZE
> +      /* To determine if the underlaying file system support
> +         SEEK_HOLE, if not, fall back to the standard copy.  */
> +      if (fpathconf (scan->fd, _PC_MIN_HOLE_SIZE) < 0)
> +        {
> +          scan->initial_scan_failed =3D true;
> +          return false;
> +        }
> +# endif
> +
> +      /* If we have been compiled on an OS that supports SEEK_HOLE
> +         but run on an OS that does not support SEEK_HOLE, we get
> +         EINVAL.  If the underlying filesystem does not support the
> +         SEEK_HOLE call, we get ENOTSUP, fall back to standard copy
> +         in either case.  */
> +      hole_pos =3D lseek (scan->fd, (off_t) 0, SEEK_HOLE);
> +      if (hole_pos < 0)
> +        {
> +          if (errno =3D=3D EINVAL || errno =3D=3D ENOTSUP)
> +            scan->initial_scan_failed =3D true;
> +          return false;
> +        }
> +
> +      /* Seek back to position 0 first if we detected a real hole.  =
*/
> +      if (hole_pos > 0)
> +        {
> +          off_t tmp_pos;
> +          tmp_pos =3D lseek (scan->fd, (off_t) 0, SEEK_SET);
> +          if (tmp_pos !=3D (off_t) 0)
> +              return false;
> +
> +          /* The source file is definitely not a sparse file, or it
> +             maybe a sparse file but SEEK_HOLE returns the source =
file's
> +             total size, fall back to the standard copy too.  */
> +          if (hole_pos >=3D scan->src_total_size)
> +            {
> +              scan->initial_scan_failed =3D true;
> +              return false;
> +            }
> +        }
> +    }
> +
> +  unsigned int i =3D 0;
> +  /* If lseek(2) failed and the errno is set to ENXIO, for
> +     SEEK_DATA there are no more data regions past the supplied
> +     offset.  For SEEK_HOLE, there are no more holes past the=20
> +     supplied offset.  Set scan->hit_final_extent to true for
> +     either case.  */
> +  do {
> +    data_pos =3D lseek (scan->fd, scan->scan_start, SEEK_DATA);
> +    if (data_pos < 0)
> +      {
> +        if (errno !=3D ENXIO)
> +          return false;
> +        else
> +          {
> +            scan->hit_final_extent =3D true;
> +            return true;
> +          }
> +      }
> +
> +    hole_pos =3D lseek (scan->fd, data_pos, SEEK_HOLE);
> +    if (hole_pos < 0)
> +      {
> +        if (errno !=3D ENXIO)
> +          return false;
> +        else
> +          {
> +            scan->hit_final_extent =3D true;
> +            return true;
> +          }
> +      }
> +
> +    ext_info[i].ext_logical =3D data_pos;
> +    ext_info[i].ext_length =3D hole_pos - data_pos;
> +    scan->scan_start =3D hole_pos;
> +    ++i;
> +  } while (scan->scan_start < scan->src_total_size && i < count);
> +
> +  i--;
> +  scan->ei_count =3D i;
> +  scan->ext_info =3D xnmalloc (scan->ei_count, sizeof (struct =
extent_info));
> +
> +  for (i =3D 0; i < scan->ei_count; i++)
> +    {
> +      assert (ext_info[i].ext_logical <=3D OFF_T_MAX);
> +
> +      scan->ext_info[i].ext_logical =3D ext_info[i].ext_logical;
> +      scan->ext_info[i].ext_length =3D ext_info[i].ext_length;
> +    }
> +
> +  return true;=20
> +}
>  #else
>  extern bool
>  extent_scan_read (struct extent_scan *scan ATTRIBUTE_UNUSED)
> diff --git a/src/extent-scan.h b/src/extent-scan.h
> index 4724b25..a271b95 100644
> --- a/src/extent-scan.h
> +++ b/src/extent-scan.h
> @@ -18,7 +18,6 @@
> =20
>  #ifndef EXTENT_SCAN_H
>  # define EXTENT_SCAN_H
> -
>  /* Structure used to store information of each extent.  */
>  struct extent_info
>  {
> @@ -38,6 +37,11 @@ struct extent_scan
>    /* File descriptor of extent scan run against.  */
>    int fd;
> =20
> +#if defined(SEEK_DATA) && defined(SEEK_HOLE)
> +  /* Source file size, i.e, (struct stat) &statbuf.st_size.  */
> +  size_t src_total_size;
> +#endif
> +
>    /* Next scan start offset.  */
>    off_t scan_start;
> =20
> @@ -55,7 +59,8 @@ struct extent_scan
>    struct extent_info *ext_info;
>  };
> =20
> -void extent_scan_init (int src_fd, struct extent_scan *scan);
> +void extent_scan_init (int src_fd, size_t src_total_size,
> +                       struct extent_scan *scan);
> =20
>  bool extent_scan_read (struct extent_scan *scan);
> =20
> --=20
> 1.7.4



--Apple-Mail-3-558882838--




Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils@HIDDEN:
bug#8061; Package coreutils. Full text available.

Message received at submit <at> debbugs.gnu.org:


Received: (at submit) by debbugs.gnu.org; 17 Feb 2011 13:49:27 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Thu Feb 17 08:49:27 2011
Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.69)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1Pq4EY-0005Z7-Aw
	for submit <at> debbugs.gnu.org; Thu, 17 Feb 2011 08:49:27 -0500
Received: from eggs.gnu.org ([140.186.70.92])
	by debbugs.gnu.org with esmtp (Exim 4.69)
	(envelope-from <jeff.liu@HIDDEN>) id 1Pq4EV-0005Yv-T5
	for submit <at> debbugs.gnu.org; Thu, 17 Feb 2011 08:49:24 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <jeff.liu@HIDDEN>) id 1Pq4NH-00089I-L7
	for submit <at> debbugs.gnu.org; Thu, 17 Feb 2011 08:58:29 -0500
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on eggs.gnu.org
X-Spam-Level: 
X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00,HTML_MESSAGE,
	RCVD_IN_DNSWL_MED, T_RP_MATCHES_RCVD autolearn=unavailable version=3.3.1
Received: from lists.gnu.org ([199.232.76.165]:43637)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <jeff.liu@HIDDEN>) id 1Pq4NH-00089B-Ae
	for submit <at> debbugs.gnu.org; Thu, 17 Feb 2011 08:58:27 -0500
Received: from [140.186.70.92] (port=39195 helo=eggs.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1Pq4Mw-0004mB-2Q
	for bug-coreutils@HIDDEN; Thu, 17 Feb 2011 08:58:26 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <jeff.liu@HIDDEN>) id 1Pq4MU-0007zp-Lx
	for bug-coreutils@HIDDEN; Thu, 17 Feb 2011 08:57:40 -0500
Received: from rcsinet10.oracle.com ([148.87.113.121]:18456)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <jeff.liu@HIDDEN>) id 1Pq4MU-0007zR-8M
	for bug-coreutils@HIDDEN; Thu, 17 Feb 2011 08:57:38 -0500
Received: from acsinet15.oracle.com (acsinet15.oracle.com [141.146.126.227])
	by rcsinet10.oracle.com (Switch-3.4.2/Switch-3.4.2) with ESMTP id
	p1HDvT4c001402
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK);
	Thu, 17 Feb 2011 13:57:34 GMT
Received: from acsmt355.oracle.com (acsmt355.oracle.com [141.146.40.155])
	by acsinet15.oracle.com (Switch-3.4.2/Switch-3.4.1) with ESMTP id
	p1HD39m2027007; Thu, 17 Feb 2011 13:57:29 GMT
Received: from abhmt014.oracle.com by acsmt354.oracle.com
	with ESMTP id 1063839691297951043; Thu, 17 Feb 2011 05:57:23 -0800
Received: from [192.168.1.101] (/221.223.109.12)
	by default (Oracle Beehive Gateway v4.0)
	with ESMTP ; Thu, 17 Feb 2011 05:57:21 -0800
From: Jeff liu <jeff.liu@HIDDEN>
Content-Type: multipart/alternative; boundary=Apple-Mail-3--331228896
Subject: Introduce SEEK_DATA/SEEK_HOLE to extent_scan module
Date: Thu, 17 Feb 2011 21:57:14 +0800
Message-Id: <2DB776C1-EF34-423D-8BE5-71C2F49DFF01@HIDDEN>
To: bug-coreutils@HIDDEN
Mime-Version: 1.0 (Apple Message framework v1081)
X-Mailer: Apple Mail (2.1081)
X-Source-IP: acsmt355.oracle.com [141.146.40.155]
X-Auth-Type: Internal IP
X-CT-RefId: str=0001.0A090203.4D5D2949.0106:SCFMA4539814,ss=1,fgs=0
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 3)
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 2)
X-Received-From: 199.232.76.165
X-Spam-Score: -6.4 (------)
X-Debbugs-Envelope-To: submit
Cc: "jeff.liu" <jeff.liu@HIDDEN>, Jim Meyering <jim@HIDDEN>
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.11
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <http://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>,
	<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <http://debbugs.gnu.org/pipermail/debbugs-submit>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <http://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>,
	<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Sender: debbugs-submit-bounces <at> debbugs.gnu.org
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
X-Spam-Score: -6.4 (------)


--Apple-Mail-3--331228896
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain;
	charset=us-ascii

Hello All,

This is a first attempt at adding SEEK_DATA/SEEK_HOLE support to the
extent_scan module, for efficient sparse file copying on ZFS.  I have
delayed it for a long time, sorry for that!

The code changes are listed below:
src/extent-scan.h: add a new member 'src_total_size' to "struct
extent_scan", since I need this value to determine whether a file is
sparse during the initial scan.  If lseek(fd, 0, SEEK_HOLE) returns a
value equal to or larger than src_total_size, the source file is either
definitely not sparse, or it is sparse but there is no point in
proceeding with the extent scan.
Another change in this file is the signature of extent_scan_init(): as
mentioned above, it needs to accept the src_total_size value.
src/extent-scan.c: implement a new extent_scan_read() using
SEEK_DATA/SEEK_HOLE; it is used if those two values are defined in
<unistd.h>.
src/copy.c: pass src_total_size to extent_scan_init().
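
As a rough illustration of the initial check described above (a sketch
only, not the code from the patch), the helper below reports whether the
first hole in a file starts before its end.  The name worth_extent_scan
and its parameters are made up for this example; it assumes a platform
that defines SEEK_HOLE and SEEK_DATA, and otherwise always answers false.

```c
#define _GNU_SOURCE           /* exposes SEEK_HOLE/SEEK_DATA on glibc */
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not part of the patch: return true if an extent
   scan looks worthwhile, i.e. the first hole of FD starts before
   TOTAL_SIZE.  Leaves the file offset at 0 on success.  */
static bool
worth_extent_scan (int fd, off_t total_size)
{
  assert (total_size >= 0);
#if defined SEEK_HOLE && defined SEEK_DATA
  off_t hole_pos = lseek (fd, (off_t) 0, SEEK_HOLE);
  if (hole_pos < 0)
    return false;             /* unsupported or failed: use a normal copy */

  /* The probe moved the file offset, so restore it before copying.  */
  if (lseek (fd, (off_t) 0, SEEK_SET) != 0)
    return false;

  /* A first hole at or past EOF means there is nothing to skip.  */
  return hole_pos < total_size;
#else
  (void) fd;
  (void) total_size;
  return false;
#endif
}
```

A caller would presumably pass the st_size obtained from fstat() as
total_size and fall back to a normal copy whenever the function returns
false.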

In my test environment (Solaris 10, SunOS 5.10 Generic_142910-17), I have
tried a few simple cases, and they work for me.

For now, I am using diff(1) to verify the copy result.  Does anyone know
of utilities that could be used to write the test script?
I sent an email to the ZFS dev mailing list yesterday to ask this
question, and someone kindly suggested using ZDB
(http://cuddletech.com/blog/?p=407) for that; I am still studying this
utility.  I also noticed that there is a patch in the Python community
that adds SEEK_HOLE/SEEK_DATA support to the os module, please refer to:
http://bugs.python.org/file19566/z.patch
but I think it requires a very recent Python build, so could anyone give
some other advice on this point?
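
One possible approach for such a test, sketched here under assumptions
rather than taken from any existing tool, would be to list the data
extents of the source and the destination with SEEK_DATA/SEEK_HOLE and
compare them, in addition to running diff(1).  The struct data_extent
type and the functions list_data_extents() and print_data_extents() are
hypothetical names for this example.  File systems may report holes at
different granularity, so equal contents need not imply identical
extent lists.

```c
#define _GNU_SOURCE           /* exposes SEEK_DATA/SEEK_HOLE on glibc */
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical type: one contiguous data region of a file.  */
struct data_extent
{
  off_t start;
  off_t length;
};

/* Store up to MAX data extents of FD in EXT.  Return the number of
   extents stored, or -1 on error.  Changes the file offset of FD.  */
static int
list_data_extents (int fd, struct data_extent *ext, int max)
{
  off_t pos = 0;
  int n = 0;

  assert (ext != NULL && max >= 0);

  while (n < max)
    {
      off_t data = lseek (fd, pos, SEEK_DATA);
      if (data < 0)
        return errno == ENXIO ? n : -1;   /* ENXIO: no data at or after POS */

      /* The end of file counts as a hole, so this finds the end of
         the data region starting at DATA.  */
      off_t hole = lseek (fd, data, SEEK_HOLE);
      if (hole <= data)
        return -1;

      ext[n].start = data;
      ext[n].length = hole - data;
      n++;
      pos = hole;
    }
  return n;
}

/* Print one "start length" line per extent.  */
static void
print_data_extents (FILE *out, const struct data_extent *ext, int n)
{
  for (int i = 0; i < n; i++)
    fprintf (out, "%lld %lld\n",
             (long long) ext[i].start, (long long) ext[i].length);
}
```

A test script could run such a program on both files and compare the
printed lists; an empty file yields no extents at all.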

The patch follows; any help with testing and any comments are appreciated!


Thanks,
-Jeff


From: Jie Liu <jeff.liu@HIDDEN>
Date: Thu, 17 Feb 2011 21:14:23 +0800
Subject: [PATCH 1/1] copy: add SEEK_DATA/SEEK_HOLE support to extent_scan module

* src/extent-scan.h: add src_total_size to struct extent_scan, we need
  to check the SEEK_HOLE result against it for initial extent scan.
  modify the extent_scan_init() signature, to add size_t src_total_size.
* src/extent-scan.c: implement a new extent_scan_read() through SEEK_DATA
  and SEEK_HOLE.
* src/copy.c: pass src_total_size to extent_scan_init().

Signed-off-by: Jie Liu <jeff.liu@HIDDEN>
---
 src/copy.c        |    2 +-
 src/extent-scan.c |  113 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
 src/extent-scan.h |    9 +++-
 3 files changed, 120 insertions(+), 4 deletions(-)

diff --git a/src/copy.c b/src/copy.c
index 104652d..22b9911 100644
--- a/src/copy.c
+++ b/src/copy.c
@@ -306,7 +306,7 @@ extent_copy (int src_fd, int dest_fd, char *buf, size_t buf_size,
      We may need this at the end, for a final ftruncate.  */
   off_t dest_pos = 0;
 
-  extent_scan_init (src_fd, &scan);
+  extent_scan_init (src_fd, src_total_size, &scan);
 
   *require_normal_copy = false;
   bool wrote_hole_at_eof = true;
diff --git a/src/extent-scan.c b/src/extent-scan.c
index 1ba59db..ffeab7a 100644
--- a/src/extent-scan.c
+++ b/src/extent-scan.c
@@ -32,13 +32,17 @@
 /* Allocate space for struct extent_scan, initialize the entries if
    necessary and return it as the input argument of extent_scan_read().  */
 extern void
-extent_scan_init (int src_fd, struct extent_scan *scan)
+extent_scan_init (int src_fd, size_t src_total_size,
+                  struct extent_scan *scan)
 {
   scan->fd = src_fd;
   scan->ei_count = 0;
   scan->scan_start = 0;
   scan->initial_scan_failed = false;
   scan->hit_final_extent = false;
+#if defined(SEEK_HOLE) && defined(SEEK_DATA)
+  scan->src_total_size = src_total_size;
+#endif
 }
 
 #ifdef __linux__
@@ -106,6 +110,113 @@ extent_scan_read (struct extent_scan *scan)
 
   return true;
 }
+#elif defined(SEEK_HOLE) && defined(SEEK_DATA)
+extern bool
+extent_scan_read (struct extent_scan *scan)
+{
+  off_t data_pos, hole_pos;
+  union { struct extent_info ei; char c[4096]; } extent_buf;
+  struct extent_info *ext_info =3D &extent_buf.ei;
+  enum { count =3D (sizeof extent_buf / sizeof *ext_info) };
+  verify (count !=3D 0);
+
+  memset (&extent_buf, 0, sizeof extent_buf);
+
+  if (scan->scan_start =3D=3D 0)
+    {
+# ifdef _PC_MIN_HOLE_SIZE
+      /* To determine if the underlaying file system support
+         SEEK_HOLE, if not, fall back to the standard copy.  */
+      if (fpathconf (scan->fd, _PC_MIN_HOLE_SIZE) < 0)
+        {
+          scan->initial_scan_failed =3D true;
+          return false;
+        }
+# endif
+
+      /* If we have been compiled on an OS that supports SEEK_HOLE
+         but run on an OS that does not support SEEK_HOLE, we get
+         EINVAL.  If the underlying filesystem does not support the
+         SEEK_HOLE call, we get ENOTSUP, fall back to standard copy
+         in either case.  */
+      hole_pos =3D lseek (scan->fd, (off_t) 0, SEEK_HOLE);
+      if (hole_pos < 0)
+        {
+          if (errno =3D=3D EINVAL || errno =3D=3D ENOTSUP)
+            scan->initial_scan_failed =3D true;
+          return false;
+        }
+
+      /* Seek back to position 0 first if we detected a real hole.  */
+      if (hole_pos > 0)
+        {
+          off_t tmp_pos;
+          tmp_pos =3D lseek (scan->fd, (off_t) 0, SEEK_SET);
+          if (tmp_pos !=3D (off_t) 0)
+              return false;
+
+          /* The source file is definitely not a sparse file, or it
+             maybe a sparse file but SEEK_HOLE returns the source =
file's
+             total size, fall back to the standard copy too.  */
+          if (hole_pos >=3D scan->src_total_size)
+            {
+              scan->initial_scan_failed =3D true;
+              return false;
+            }
+        }
+    }
+
+  unsigned int i =3D 0;
+  /* If lseek(2) failed and the errno is set to ENXIO, for
+     SEEK_DATA there are no more data regions past the supplied
+     offset.  For SEEK_HOLE, there are no more holes past the=20
+     supplied offset.  Set scan->hit_final_extent to true for
+     either case.  */
+  do {
+    data_pos =3D lseek (scan->fd, scan->scan_start, SEEK_DATA);
+    if (data_pos < 0)
+      {
+        if (errno != ENXIO)
+          return false;
+        else
+          {
+            scan->hit_final_extent = true;
+            return true;
+          }
+      }
+
+    hole_pos = lseek (scan->fd, data_pos, SEEK_HOLE);
+    if (hole_pos < 0)
+      {
+        if (errno != ENXIO)
+          return false;
+        else
+          {
+            scan->hit_final_extent = true;
+            return true;
+          }
+      }
+
+    ext_info[i].ext_logical = data_pos;
+    ext_info[i].ext_length = hole_pos - data_pos;
+    scan->scan_start = hole_pos;
+    ++i;
+  } while (scan->scan_start < scan->src_total_size && i < count);
+
+  i--;
+  scan->ei_count = i;
+  scan->ext_info = xnmalloc (scan->ei_count, sizeof (struct extent_info));
+
+  for (i = 0; i < scan->ei_count; i++)
+    {
+      assert (ext_info[i].ext_logical <= OFF_T_MAX);
+
+      scan->ext_info[i].ext_logical = ext_info[i].ext_logical;
+      scan->ext_info[i].ext_length = ext_info[i].ext_length;
+    }
+
+  return true;
+}
 #else
 extern bool
 extent_scan_read (struct extent_scan *scan ATTRIBUTE_UNUSED)
diff --git a/src/extent-scan.h b/src/extent-scan.h
index 4724b25..a271b95 100644
--- a/src/extent-scan.h
+++ b/src/extent-scan.h
@@ -18,7 +18,6 @@
 
 #ifndef EXTENT_SCAN_H
 # define EXTENT_SCAN_H
-
 /* Structure used to store information of each extent.  */
 struct extent_info
 {
@@ -38,6 +37,11 @@ struct extent_scan
   /* File descriptor of extent scan run against.  */
   int fd;
 
+#if defined(SEEK_DATA) && defined(SEEK_HOLE)
+  /* Source file size, i.e., (struct stat) &statbuf.st_size.  */
+  size_t src_total_size;
+#endif
+
   /* Next scan start offset.  */
   off_t scan_start;
 
@@ -55,7 +59,8 @@ struct extent_scan
   struct extent_info *ext_info;
 };
 
-void extent_scan_init (int src_fd, struct extent_scan *scan);
+void extent_scan_init (int src_fd, size_t src_total_size,
+                       struct extent_scan *scan);
 
 bool extent_scan_read (struct extent_scan *scan);
 
-- 
1.7.4
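
For anyone who wants to try the SEEK_DATA/SEEK_HOLE loop outside of
coreutils, here is a minimal standalone sketch of the same idea. It is not
the patch itself: the names data_extent, scan_data_extents and
print_data_extents are hypothetical, and it assumes a libc that exposes
SEEK_DATA and SEEK_HOLE (glibc needs _GNU_SOURCE for that). Unlike the loop
in the patch, it counts every extent it records and still hands back the
extents gathered so far when lseek reports ENXIO; both are choices made for
this sketch only.

```c
#define _GNU_SOURCE             /* glibc: expose SEEK_DATA and SEEK_HOLE */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical record for one data region (not the coreutils struct).  */
struct data_extent
{
  off_t start;                  /* logical offset of the data region */
  off_t length;                 /* number of bytes up to the next hole */
};

/* Collect up to MAX data regions of FD, beginning at offset START.
   Return the number of regions stored in OUT, or -1 on error with
   errno set.  *HIT_FINAL is set to true once lseek reports ENXIO,
   i.e. there is no data at or after the current offset.  */
long
scan_data_extents (int fd, off_t start, struct data_extent *out,
                   size_t max, bool *hit_final)
{
  size_t n = 0;

  assert (max > 0);
  *hit_final = false;

  while (n < max)
    {
      off_t data_pos = lseek (fd, start, SEEK_DATA);
      if (data_pos < 0)
        {
          if (errno != ENXIO)
            return -1;          /* e.g. EINVAL when unsupported */
          *hit_final = true;    /* only a hole (or EOF) remains */
          break;
        }

      /* Every file has an implicit hole at EOF, so after a successful
         SEEK_DATA this call is expected to succeed; treat any failure
         as an error.  */
      off_t hole_pos = lseek (fd, data_pos, SEEK_HOLE);
      if (hole_pos < 0)
        return -1;

      out[n].start = data_pos;
      out[n].length = hole_pos - data_pos;
      n++;
      start = hole_pos;         /* resume after this data region */
    }

  return (long) n;
}

/* Print each region as "start length", one per line.  */
void
print_data_extents (const struct data_extent *ext, long n)
{
  for (long i = 0; i < n; i++)
    printf ("%lld %lld\n", (long long) ext[i].start,
            (long long) ext[i].length);
}
```

Note that lseek moves the file offset, so a caller that goes on to read
the data would need to seek back (or use pread) before copying.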

--Apple-Mail-3--331228896--




Acknowledgement sent to Jeff liu <jeff.liu@HIDDEN>:
New bug report received and forwarded. Copy sent to bug-coreutils@HIDDEN. Full text available.
Report forwarded to owner <at> debbugs.gnu.org, bug-coreutils@HIDDEN:
bug#8061; Package coreutils. Full text available.
Last modified: Fri, 31 Oct 2014 17:00:04 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.