From: Maurice van der Pot <griffon26@HIDDEN>
Subject: bug#21665: Use of mmap for large files
Date: Sun, 11 Oct 2015 15:42:03 +0200
Message-ID: <20151011134203.GA1315@HIDDEN>

I am working on an application suitable for visually merging large files.
This application delegates determination of differences to GNU diff.

Unfortunately I have found that diff reads the entire input files into
memory, leading to "/usr/bin/diff: memory exhausted" messages on the
types of files I'd like to support.

Would you be open to patches that enable diffing large files by using
mmap?

Kind regards,
Maurice.

-- 
Maurice van der Pot
Kdiff3 developer   griffon26@HIDDEN   http://kdiff3.sourceforge.net
Tdiff3 developer   https://github.com/Griffon26/tdiff3

From: help-debbugs@HIDDEN (GNU bug Tracking System)
To: Maurice van der Pot <griffon26@HIDDEN>
Subject: bug#21665: Acknowledgement (Use of mmap for large files)
Date: Sun, 11 Oct 2015 17:09:02 +0000

Thank you for filing a new bug report with debbugs.gnu.org.

This is an automatically generated reply to let you know your message
has been received.

Your message is being forwarded to the package maintainers and other
interested parties for their attention; they will reply in due course.

Your message has been sent to the package maintainer(s):
 bug-diffutils@HIDDEN

If you wish to submit further information on this problem, please
send it to 21665 <at> debbugs.gnu.org.

Please do not send mail to help-debbugs@HIDDEN unless you wish
to report a problem with the Bug-tracking system.

-- 
21665: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=21665
GNU Bug Tracking System
Contact help-debbugs@HIDDEN with problems

From: Paul Eggert <eggert@HIDDEN>
Subject: bug#21665: [bug-diffutils] bug#21665: Use of mmap for large files
Date: Sun, 11 Oct 2015 14:14:53 -0700
Message-ID: <561AD14D.5080009@HIDDEN>
In-Reply-To: <20151011134203.GA1315@HIDDEN>

Maurice van der Pot wrote:
> Would you be open to patches that enable diffing large files by using
> mmap?

I doubt whether that would help that much, as it still needs to construct
information about each line, and that information consumes memory too. Doing
this in secondary storage would be a bear.

In practice when I've run into this problem, I've either gotten a bigger
machine or made my input lines shorter. Preferably the former.

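
[Editorial sketch] The point about per-line information can be illustrated
with a small Python example. This is not how GNU diff is implemented; it is a
minimal sketch that assumes a hypothetical record of one 8-byte offset per
line. The function name index_lines is invented for illustration. The file
contents are mapped with mmap, yet the line index is still ordinary heap
memory that grows with the number of lines, so a file made of many short
lines can need an index comparable in size to the file itself.

```python
import mmap
import os
from array import array


def index_lines(path):
    """Map `path` read-only and return (line_count, index_bytes).

    The file bytes are paged in on demand through mmap, but the array of
    line-start offsets lives on the heap and grows with the line count.
    The one-offset-per-line layout is an assumption made for illustration.
    """
    offsets = array("Q")  # hypothetical record: one unsigned 64-bit offset per line
    if os.path.getsize(path) == 0:
        return 0, 0  # an empty file cannot be mapped
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        pos = 0
        size = len(m)
        while pos < size:
            offsets.append(pos)        # remember where this line starts
            nl = m.find(b"\n", pos)    # locate the end of the line
            if nl == -1:
                break                  # final line has no trailing newline
            pos = nl + 1
    return len(offsets), len(offsets) * offsets.itemsize
```

Calling index_lines on a file and comparing index_bytes with the file size
gives a rough feel for the overhead: with lines of only a few bytes each, the
index alone approaches or exceeds the size of the data that mmap avoided
copying. A real diff also needs further per-line data beyond an offset, so
this estimate is if anything on the low side.
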
From: Jim Meyering <jim@HIDDEN>
To: Maurice van der Pot <griffon26@HIDDEN>
Cc: 21665 <at> debbugs.gnu.org
Subject: bug#21665: [bug-diffutils] bug#21665: Use of mmap for large files
Date: Sun, 1 May 2016 18:27:03 -0700
Message-ID: <CA+8g5KFEe0+YrXJQtVH4Tb6YUr2JEWrPz_xbEj5aE6-Rqvp1mQ@HIDDEN>
In-Reply-To: <20151011134203.GA1315@HIDDEN>

tags 21665 notabug
close 21665
done

On Sun, Oct 11, 2015 at 6:42 AM, Maurice van der Pot <griffon26@HIDDEN> wrote:
> I am working on an application suitable for visually merging large files.
> This application delegates determination of differences to GNU diff.
>
> Unfortunately I have found that diff reads the entire input files into
> memory, leading to "/usr/bin/diff: memory exhausted" messages on the
> types of files I'd like to support.
>
> Would you be open to patches that enable diffing large files by using
> mmap?

As Paul responded in http://bugs.gnu.org/21665#8, using mmap seems
unlikely to help much, but if you write the patch and demonstrate that
it does make a difference, we'll be very interested, and I will
happily reopen the issue.

For now, I'm marking this as notabug and closing it.

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997 nCipher Corporation Ltd,
1994-97 Ian Jackson.