GNU bug report logs - #25427
strip-trailing-cr makes -q not be quick

Please note: This is a static page, with minimal formatting, updated once a day.
Click here to see this page with the latest information and nicer formatting.

Package: diffutils; Reported by: Tom Vajzovic <t.gnutools@HIDDEN>; dated Thu, 12 Jan 2017 16:12:01 UTC; Maintainer for diffutils is bug-diffutils@HIDDEN.

Message received at 25427 <at> debbugs.gnu.org:


Date: Sun, 22 Jan 2017 20:37:34 +0000
From: Tom Vajzovic <t.gnutools@HIDDEN>
To: Paul Eggert <eggert@HIDDEN>
Subject: Re: [bug-diffutils] bug#25427: strip-trailing-cr makes -q not be quick
Message-ID: <20170122203734.75f3723c@HIDDEN>
In-Reply-To: <1768de52-8245-5358-1912-1ef73f0858d1@HIDDEN>
References: <20170112143745.7cf5914e@HIDDEN>
 <1768de52-8245-5358-1912-1ef73f0858d1@HIDDEN>
Cc: 25427 <at> debbugs.gnu.org, Tom Vajzovic <t.gnutools@HIDDEN>

Hi,

On Thu, 12 Jan 2017 15:38:27 -0800
Paul Eggert <eggert@HIDDEN> wrote:

> On 01/12/2017 06:37 AM, Tom Vajzovic wrote:
> > Are there active developers working on diff that want to look at and
> > fix this?  I am a C programmer so would be happy to do it myself,
> > but I have never contributed to GNU tools so I might not do it the
> > way you would like.  I'd rather not spend time on it if you are
> > likely to reject it and have to re-write it yourselves anyway.  
> 
> That option is low on my priority list so it'd be nice if you could
> fix it. Please use the style the code is already using; you can see
> the GNU Coding Standards for more details. Most likely your change
> would be significant enough that we'd need copyright assignment from
> you and your employer; I can send you details about how to do that,
> privately, if you like.

I'll take this one then.  I'll have time to start looking at it about
a month from now.

Tom




Information forwarded to bug-diffutils@HIDDEN:
bug#25427; Package diffutils. Full text available.

Message received at 25427 <at> debbugs.gnu.org:


Subject: Re: [bug-diffutils] bug#25427: strip-trailing-cr makes -q not be quick
To: Tom Vajzovic <t.gnutools@HIDDEN>, 25427 <at> debbugs.gnu.org
References: <20170112143745.7cf5914e@HIDDEN>
From: Paul Eggert <eggert@HIDDEN>
Organization: UCLA Computer Science Department
Message-ID: <1768de52-8245-5358-1912-1ef73f0858d1@HIDDEN>
Date: Thu, 12 Jan 2017 15:38:27 -0800
In-Reply-To: <20170112143745.7cf5914e@HIDDEN>

On 01/12/2017 06:37 AM, Tom Vajzovic wrote:
> Are there active developers working on diff that want to look at and
> fix this?  I am a C programmer so would be happy to do it myself, but I
> have never contributed to GNU tools so I might not do it the way you
> would like.  I'd rather not spend time on it if you are likely to reject
> it and have to re-write it yourselves anyway.

That option is low on my priority list so it'd be nice if you could fix
it. Please use the style the code is already using; you can see the GNU
Coding Standards for more details. Most likely your change would be
significant enough that we'd need copyright assignment from you and your
employer; I can send you details about how to do that, privately, if you
like.





Information forwarded to bug-diffutils@HIDDEN:
bug#25427; Package diffutils. Full text available.

Message received at submit <at> debbugs.gnu.org:


Date: Thu, 12 Jan 2017 14:37:45 +0000
From: Tom Vajzovic <t.gnutools@HIDDEN>
To: bug-diffutils@HIDDEN
Subject: strip-trailing-cr makes -q not be quick
Message-ID: <20170112143745.7cf5914e@HIDDEN>

Hi,

I have diff aliased to 'diff -d --strip-trailing-cr'.

When I run 'diff -qs file1 file2' it takes a very long time to return
(minutes).

If I run without the -d it still takes a noticeable amount of time (a
second or two).

If I run without --strip-trailing-cr it takes next to no time, whether
or not I specify -d.

These are two very large text files, but they differ within the first
few lines.

I suspect that it is calculating the whole diff internally and
discarding it when --strip-trailing-cr is given at the same time as -q.

Obviously I understand that by putting --strip-trailing-cr in an alias
I am requiring the whole file to be read and modified in memory; this
is fine.  The fact that -d has any effect, though, suggests that once
the line endings are processed the wrong algorithm is being applied.
With -q there is no need to run the diff algorithm at all: something
almost as simple as a strcmp would do, in which case -d would have no
effect.
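
A minimal sketch of the kind of strcmp-like comparison meant here.  It
is not taken from the diffutils source: the names next_byte and
equal_ignoring_trailing_cr are hypothetical, and it assumes that
stripping a trailing CR means treating each "\r\n" as "\n".  How diff
itself treats a CR at the very end of a file with no final newline is
not modelled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of diffutils.  Return the next byte of
   BUF (of length LEN) at *POS and advance *POS, first skipping a '\r'
   that is immediately followed by '\n'.  Return -1 at end of buffer.  */
static int
next_byte (const unsigned char *buf, size_t len, size_t *pos)
{
  if (*pos < len && buf[*pos] == '\r'
      && *pos + 1 < len && buf[*pos + 1] == '\n')
    ++*pos;                     /* Drop the CR of a CR LF pair.  */
  if (*pos == len)
    return -1;
  return buf[(*pos)++];
}

/* Return true if A and B are equal once every "\r\n" is read as "\n".
   The scan stops at the first differing byte, so two large inputs that
   differ near the start are rejected almost immediately.  */
static bool
equal_ignoring_trailing_cr (const unsigned char *a, size_t alen,
                            const unsigned char *b, size_t blen)
{
  size_t i = 0, j = 0;
  for (;;)
    {
      int ca = next_byte (a, alen, &i);
      int cb = next_byte (b, blen, &j);
      if (ca != cb)
        return false;           /* Differing byte, or one input ended.  */
      if (ca < 0)
        return true;            /* Both inputs ended together.  */
    }
}
```

A single pass like this does no line matching, so an option such as -d
would have nothing to influence.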

Are there active developers working on diff that want to look at and
fix this?  I am a C programmer so would be happy to do it myself, but I
have never contributed to GNU tools so I might not do it the way you
would like.  I'd rather not spend time on it if you are likely to reject
it and have to re-write it yourselves anyway.

I'm using version 3.3 from Ubuntu.  Let me know if you want me to
reproduce from master.

Many thanks/best regards,

Tom Vajzovic




Acknowledgement sent to Tom Vajzovic <t.gnutools@HIDDEN>:
New bug report received and forwarded. Copy sent to bug-diffutils@HIDDEN. Full text available.
Report forwarded to bug-diffutils@HIDDEN:
bug#25427; Package diffutils. Full text available.
Last modified: Mon, 25 Nov 2019 12:00:02 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.