GNU bug report logs - #21665
Use of mmap for large files

Please note: This is a static page, with minimal formatting, updated once a day.
Click here to see this page with the latest information and nicer formatting.

Package: diffutils; Reported by: Maurice van der Pot <griffon26@HIDDEN>; dated Sun, 11 Oct 2015 17:09:01 UTC; Maintainer for diffutils is bug-diffutils@HIDDEN.

Message received at 21665 <at> debbugs.gnu.org:


In-Reply-To: <20151011134203.GA1315@HIDDEN>
References: <20151011134203.GA1315@HIDDEN>
From: Jim Meyering <jim@HIDDEN>
Date: Sun, 1 May 2016 18:27:03 -0700
Message-ID: <CA+8g5KFEe0+YrXJQtVH4Tb6YUr2JEWrPz_xbEj5aE6-Rqvp1mQ@HIDDEN>
Subject: Re: [bug-diffutils] bug#21665: Use of mmap for large files
To: Maurice van der Pot <griffon26@HIDDEN>
Cc: 21665 <at> debbugs.gnu.org

tags 21665 notabug
close 21665
done

On Sun, Oct 11, 2015 at 6:42 AM, Maurice van der Pot
<griffon26@HIDDEN> wrote:
> I am working on an application suitable for visually merging large files.
> This application delegates determination of differences to GNU diff.
>
> Unfortunately I have found that diff reads the entire input files into
> memory, leading to "/usr/bin/diff: memory exhausted" messages on the
> types of files I'd like to support.
>
> Would you be open to patches that enable diffing large files by using
> mmap?

As Paul responded in http://bugs.gnu.org/21665#8, using mmap seems
unlikely to help much, but if you write the patch and demonstrate that
it does make a difference, we'll be very interested, and I will
happily reopen the issue.

For now, I'm marking this as notabug and closing it.




Information forwarded to bug-diffutils@HIDDEN:
bug#21665; Package diffutils. Full text available.

Message received at submit <at> debbugs.gnu.org:


Subject: Re: [bug-diffutils] bug#21665: Use of mmap for large files
To: bug-diffutils@HIDDEN
References: <20151011134203.GA1315@HIDDEN>
From: Paul Eggert <eggert@HIDDEN>
Organization: UCLA Computer Science Department
Message-ID: <561AD14D.5080009@HIDDEN>
Date: Sun, 11 Oct 2015 14:14:53 -0700
In-Reply-To: <20151011134203.GA1315@HIDDEN>

Maurice van der Pot wrote:
> Would you be open to patches that enable diffing large files by using
> mmap?

I doubt that would help much, since diff still needs to construct
information about each line, and that information consumes memory too.  Doing
this in secondary storage would be a bear.  In practice, when I've run into this
problem, I've either gotten a bigger machine or made my input lines shorter.
Preferably the former.
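
The point about per-line information can be illustrated with a minimal
sketch, assuming a simplified model in which every line needs an
(offset, length, hash) record. The names `LineRecord` and `index_lines` are
hypothetical, and this is not how GNU diff is implemented; it only shows that
the bookkeeping lives in ordinary heap memory even when the file contents are
accessed through mmap.

```python
import mmap
import os
import tempfile
from typing import List, NamedTuple


class LineRecord(NamedTuple):
    """Hypothetical per-line bookkeeping: position, size, and a hash."""
    offset: int
    length: int
    digest: int


def index_lines(path: str) -> List[LineRecord]:
    """Map the file read-only and build one record per line.

    The file bytes are paged in on demand by the OS, but the returned
    list is normal heap memory that grows with the number of lines.
    """
    records: List[LineRecord] = []
    if os.path.getsize(path) == 0:
        return records  # mmap cannot map a zero-length file
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        size = len(m)
        start = 0
        while start < size:
            newline = m.find(b"\n", start)
            # A final line without a trailing newline runs to end of file.
            stop = size if newline == -1 else newline + 1
            records.append(
                LineRecord(start, stop - start, hash(m[start:stop]))
            )
            start = stop
    return records
```

In this model, `len(index_lines(path))` equals the number of lines in the
file, so the memory taken by the records scales with the line count no matter
how the file bytes themselves are read.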




Information forwarded to bug-diffutils@HIDDEN:
bug#21665; Package diffutils. Full text available.

Message received at submit <at> debbugs.gnu.org:


Date: Sun, 11 Oct 2015 15:42:03 +0200
From: Maurice van der Pot <griffon26@HIDDEN>
To: bug-diffutils@HIDDEN
Subject: Use of mmap for large files
Message-ID: <20151011134203.GA1315@HIDDEN>

I am working on an application suitable for visually merging large files.
This application delegates determination of differences to GNU diff.

Unfortunately I have found that diff reads the entire input files into
memory, leading to "/usr/bin/diff: memory exhausted" messages on the
types of files I'd like to support.

Would you be open to patches that enable diffing large files by using
mmap?

Kind regards,
Maurice.

-- 
Maurice van der Pot

Kdiff3 developer   griffon26@HIDDEN   http://kdiff3.sourceforge.net
Tdiff3 developer                            https://github.com/Griffon26/tdiff3




Acknowledgement sent to Maurice van der Pot <griffon26@HIDDEN>:
New bug report received and forwarded. Copy sent to bug-diffutils@HIDDEN. Full text available.
Report forwarded to bug-diffutils@HIDDEN:
bug#21665; Package diffutils. Full text available.
Last modified: Mon, 25 Nov 2019 12:00:02 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.