GNU bug report logs - #32993
Pathologically slow operation

Please note: This is a static page, with minimal formatting, updated once a day.
Click here to see this page with the latest information and nicer formatting.

Package: diffutils; Reported by: Stefan Monnier <monnier@HIDDEN>; dated Mon, 8 Oct 2018 21:35:01 UTC; Maintainer for diffutils is bug-diffutils@HIDDEN.

Message received at 32993 <at> debbugs.gnu.org:


Received: (at 32993) by debbugs.gnu.org; 9 Oct 2018 00:47:07 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Mon Oct 08 20:47:07 2018
Received: from localhost ([127.0.0.1]:40996 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1g9gB1-0005Ui-3h
	for submit <at> debbugs.gnu.org; Mon, 08 Oct 2018 20:47:07 -0400
Received: from zimbra.cs.ucla.edu ([131.179.128.68]:43640)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <eggert@HIDDEN>) id 1g9gAy-0005UE-V7
 for 32993 <at> debbugs.gnu.org; Mon, 08 Oct 2018 20:47:05 -0400
Received: from localhost (localhost [127.0.0.1])
 by zimbra.cs.ucla.edu (Postfix) with ESMTP id 99F7B16173B;
 Mon,  8 Oct 2018 17:46:58 -0700 (PDT)
Received: from zimbra.cs.ucla.edu ([127.0.0.1])
 by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032)
 with ESMTP id WMVOHcmMXvS7; Mon,  8 Oct 2018 17:46:57 -0700 (PDT)
Received: from localhost (localhost [127.0.0.1])
 by zimbra.cs.ucla.edu (Postfix) with ESMTP id BAC65161747;
 Mon,  8 Oct 2018 17:46:57 -0700 (PDT)
X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu
Received: from zimbra.cs.ucla.edu ([127.0.0.1])
 by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026)
 with ESMTP id uth-8xtj8Zb1; Mon,  8 Oct 2018 17:46:57 -0700 (PDT)
Received: from Penguin.CS.UCLA.EDU (Penguin.CS.UCLA.EDU [131.179.64.200])
 by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id 9E34016173B;
 Mon,  8 Oct 2018 17:46:57 -0700 (PDT)
Subject: Re: [bug-diffutils] bug#32993: Pathologically slow operation
To: Stefan Monnier <monnier@HIDDEN>
References: <jwv4ldwqh32.fsf-monnier+gmane.linux.debian.user@HIDDEN>
 <0904c3c5-90d3-dd53-680f-7867eb7a60c4@HIDDEN>
 <jwvftxg2d2n.fsf-monnier+INBOX@HIDDEN>
From: Paul Eggert <eggert@HIDDEN>
Openpgp: preference=signencrypt
Autocrypt: addr=eggert@HIDDEN; prefer-encrypt=mutual; keydata=
 xsFNBEyAcmQBEADAAyH2xoTu7ppG5D3a8FMZEon74dCvc4+q1XA2J2tBy2pwaTqfhpxxdGA9
 Jj50UJ3PD4bSUEgN8tLZ0san47l5XTAFLi2456ciSl5m8sKaHlGdt9XmAAtmXqeZVIYX/UFS
 96fDzf4xhEmm/y7LbYEPQdUdxu47xA5KhTYp5bltF3WYDz1Ygd7gx07Auwp7iw7eNvnoDTAl
 KAl8KYDZzbDNCQGEbpY3efZIvPdeI+FWQN4W+kghy+P6au6PrIIhYraeua7XDdb2LS1en3Ss
 mE3QjqfRqI/A2ue8JMwsvXe/WK38Ezs6x74iTaqI3AFH6ilAhDqpMnd/msSESNFt76DiO1ZK
 QMr9amVPknjfPmJISqdhgB1DlEdw34sROf6V8mZw0xfqT6PKE46LcFefzs0kbg4GORf8vjG2
 Sf1tk5eU8MBiyN/bZ03bKNjNYMpODDQQwuP84kYLkX2wBxxMAhBxwbDVZudzxDZJ1C2VXujC
 OJVxq2kljBM9ETYuUGqd75AW2LXrLw6+MuIsHFAYAgRr7+KcwDgBAfwhPBYX34nSSiHlmLC+
 KaHLeCLF5ZI2vKm3HEeCTtlOg7xZEONgwzL+fdKo+D6SoC8RRxJKs8a3sVfI4t6CnrQzvJbB
 n6gxdgCu5i29J1QCYrCYvql2UyFPAK+do99/1jOXT4m2836j1wARAQABzSBQYXVsIEVnZ2Vy
 dCA8ZWdnZXJ0QGNzLnVjbGEuZWR1PsLBfgQTAQIAKAUCTIByZAIbAwUJEswDAAYLCQgHAwIG
 FQgCCQoLBBYCAwECHgECF4AACgkQ7ZfpDmKqfjRRGw/+Ij03dhYfYl/gXVRiuzV1gGrbHk+t
 nfrI/C7fAeoFzQ5tVgVinShaPkZo0HTPf18x6IDEdAiO8Mqo1yp0CtHmzGMCJ50o4Grgfjlr
 6g/+vtEOKbhleszN2XpJvpwM2QgGvn/laTLUu8PH9aRWTs7qJJZKKKAb4sxYc92FehPu6FOD
 0dDiyhlDAq4lOV2mdBpzQbiojoZzQLMQwjpgCTK2572eK9EOEQySUThXrSIz6ASenp4NYTFH
 s9tuJQvXk9gZDdPSl3bp+47dGxlxEWLpBIM7zIONw4ks4azgT8nvDZxA5IZHtvqBlJLBObYY
 0Le61Wp0y3TlBDh2qdK8eYL426W4scEMSuig5gb8OAtQiBW6k2sGUxxeiv8ovWu8YAZgKJfu
 oWI+uRnMEddruY8JsoM54KaKvZikkKs2bg1ndtLVzHpJ6qFZC7QVjeHUh6/BmgvdjWPZYFTt
 N+KA9CWX3GQKKgN3uu988yznD7LnB98T4EUH1HA/GnfBqMV1gpzTvPc4qVQinCmIkEFp83zl
 +G5fCjJJ3W7ivzCnYo4KhKLpFUm97okTKR2LW3xZzEW4cLSWO387MTK3CzDOx5qe6s4a91Zu
 ZM/j/TQdTLDaqNn83kA4Hq48UHXYxcIh+Nd8k/3w6lFuoK0wrOFiywjLx+0ur5jmmbecBGHc
 1xdhAFHOwU0ETIByZAEQAKaF678T9wyH4wjTrV1Pz3cDEoSnV/0ZUrOT37p1dcGyj/IXq1x6
 70HRVahAmk0sZpYc25PF9D5GPYHFWlNjuPU96rDndXB3hedmBRhLdC4bAXjI4DV+bmdVe+q/
 IMnlZRaVlm9EiMCVAR6w13sReu7qXkW9r3RwY2AzXskp/tAe4BRKr1Zmbvi2nbnQ6epEC42r
 Rbx0B1EhjbIQZ5JHGk24iPT7LdBgnNmos5wYjzwNlkMQD5T0Ydzhk7J+UxwA5m46mOhRDC2r
 FV/A0gm5TLy8DXjv/Esc4gYnYai6SQqnUEVh5LuV8YCJBnijs+Tiw71x1icmn6xGI45EugJO
 gec+rLypYgpVp4x0HI5T88qBRYCkxH3Kg8Qo+EWNA9A4LRQ9DX8njona0gf0s03tocK8kBN6
 6UoqqPtHBnc4eMgBymCflK12eKfd2YYxnyg9cZazWA5VslvTxpm76hbg5oiAEH/Vg/8MxHyA
 nPhfrgwyPrmJEcVBafdspJnYQxBYNco2LFPIhlOvWh8r4at+s+M3Lb26oUTczlgdW1Sf3SDA
 77BMRnF0FQyE+7AzV79MBN4ykiqaezQxtaF1Fy/tvkhffSo8u+dwG0EgJh+te38gTcISVr0G
 IPplLz6YhjrbHrPRF1CN5UuL9DBGjxuN35RLNVEfta6RUFlR6NctTjvrABEBAAHCwWUEGAEC
 AA8FAkyAcmQCGwwFCRLMAwAACgkQ7ZfpDmKqfjSrHA/+KzAKvTxRhA9MWNLxIyJ7S5uJ16gs
 T3oCjZrBKGEhKMOGX4O0GA6VOEryO7QRCCYah3oxSG38IAnNeiwJXgU9Bzkk85UGbPEd7HGF
 /VSeHCQwWou6jqUDTSDvn9YhNTdG0KXPM74aC+xr2Zow1O2mhXihgWKD0Dw+0LYPnUOsQ0KO
 FxHXXYHmRrS1OZPU59BLvc+TRhIhafSHKLwbXK+6ckkxBx6h8z5ccpG0Qs4bFhdFYnFrEieD
 LoGmnE2YLhdV6swJ9VNCS6pLiEohT3fm7aXm15tZOIyzMZhHRSAPblXxQ0ZSWjq8oRrcYNFx
 c4W1URpAkBCOYJoXvQfD5L3lqAl8TCqDUzYxhH/tJhbDdHrqHH767jaDaTB1+Talp/2AMKwc
 XNOdiklGxbmHVG6YGl6g8Lrbsu9NZEI4yLlHzuikthJWgz+3vZhVGyNlt+HNIoF6CjDL2omu
 5cEq4RDHM44QqPk6l7O0pUvN1mT4B+S1b08RKpqm/ff015E37HNV/piIvJlxGAYz8PSfuGCB
 1thMYqlmgdhd9/BabGFbGGYHA6U4/T5zqU+f6xHy1SsAQZ1MSKlLwekBIT+4/cLRGqCHjnV0
 q5H/T6a7t5mPkbzSrOLSo4puj+IToNjYyYIDBWzhlA19avOa+rvUjmHtD3sFN7cXWtkGoi8b
 uNcby4U=
Organization: UCLA Computer Science Department
Message-ID: <f71a3817-72fa-dd02-a4ad-52b6910609a1@HIDDEN>
Date: Mon, 8 Oct 2018 17:46:57 -0700
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101
 Thunderbird/60.0
MIME-Version: 1.0
In-Reply-To: <jwvftxg2d2n.fsf-monnier+INBOX@HIDDEN>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Content-Language: en-US
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: 32993
Cc: 32993 <at> debbugs.gnu.org
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -3.3 (---)

On 10/8/18 5:14 PM, Stefan Monnier wrote:
> What other way is there?

It'd require hacking the diff source code, alas. Also, understanding the 
algorithm. Although the best guy to do it would probably be the 
algorithm's co-inventor Eugene Myers, he's kind of busy these days 
<https://en.wikipedia.org/wiki/Eugene_Myers>.





Information forwarded to bug-diffutils@HIDDEN:
bug#32993; Package diffutils. Full text available.

Message received at 32993 <at> debbugs.gnu.org:


Received: (at 32993) by debbugs.gnu.org; 9 Oct 2018 00:15:12 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Mon Oct 08 20:15:12 2018
Received: from localhost ([127.0.0.1]:40986 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1g9fg7-0004kR-Ty
	for submit <at> debbugs.gnu.org; Mon, 08 Oct 2018 20:15:12 -0400
Received: from chene.dit.umontreal.ca ([132.204.246.20]:52211)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <monnier@HIDDEN>) id 1g9fg6-0004kJ-BD
 for 32993 <at> debbugs.gnu.org; Mon, 08 Oct 2018 20:15:11 -0400
Received: from milanesa.home (lechon.iro.umontreal.ca [132.204.27.242])
 by chene.dit.umontreal.ca (8.14.7/8.14.1) with ESMTP id w990ESaV010572;
 Mon, 8 Oct 2018 20:14:28 -0400
Received: by milanesa.home (Postfix, from userid 20848)
 id 0549366100; Mon,  8 Oct 2018 20:14:28 -0400 (EDT)
From: Stefan Monnier <monnier@HIDDEN>
To: Paul Eggert <eggert@HIDDEN>
Subject: Re: [bug-diffutils] bug#32993: Pathologically slow operation
Message-ID: <jwvftxg2d2n.fsf-monnier+INBOX@HIDDEN>
References: <jwv4ldwqh32.fsf-monnier+gmane.linux.debian.user@HIDDEN>
 <0904c3c5-90d3-dd53-680f-7867eb7a60c4@HIDDEN>
Date: Mon, 08 Oct 2018 20:14:27 -0400
In-Reply-To: <0904c3c5-90d3-dd53-680f-7867eb7a60c4@HIDDEN> (Paul Eggert's
 message of "Mon, 8 Oct 2018 15:49:09 -0700")
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-NAI-Spam-Flag: NO
X-NAI-Spam-Threshold: 5
X-NAI-Spam-Score: 0
X-NAI-Spam-Rules: 2 Rules triggered
	EDT_SA_DN_PASS=0, RV6390=0
X-NAI-Spam-Version: 2.3.0.9418 : core <6390> : inlines <6919> : streams
 <1800759> : uri <2726587>
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: 32993
Cc: 32993 <at> debbugs.gnu.org
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -3.3 (---)

>> tries to do word-level diffs (by basically turning every word
>> into N copies of this word, each one on its own line (where N is the
>> number of chars in the word, used to indicate to `diff` that long words
>> are "more costly" than short ones)
> Ouch.  That's a really inefficient way of mucking with the cost
> algorithm.

What other way is there?

> Diff should be able to do taht directly, without your having to
> repeat the words. (Just a simple matter of programming....)

"man diff" doesn't seem to offer much room for "programming".


        Stefan




Information forwarded to bug-diffutils@HIDDEN:
bug#32993; Package diffutils. Full text available.

Message received at 32993 <at> debbugs.gnu.org:


Received: (at 32993) by debbugs.gnu.org; 8 Oct 2018 22:49:19 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Mon Oct 08 18:49:19 2018
Received: from localhost ([127.0.0.1]:40898 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1g9eL1-0000ZM-7X
	for submit <at> debbugs.gnu.org; Mon, 08 Oct 2018 18:49:19 -0400
Received: from zimbra.cs.ucla.edu ([131.179.128.68]:52720)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <eggert@HIDDEN>) id 1g9eKy-0000Z6-MH
 for 32993 <at> debbugs.gnu.org; Mon, 08 Oct 2018 18:49:17 -0400
Received: from localhost (localhost [127.0.0.1])
 by zimbra.cs.ucla.edu (Postfix) with ESMTP id 41872161747;
 Mon,  8 Oct 2018 15:49:10 -0700 (PDT)
Received: from zimbra.cs.ucla.edu ([127.0.0.1])
 by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032)
 with ESMTP id 9Xj84CwsF5-q; Mon,  8 Oct 2018 15:49:09 -0700 (PDT)
Received: from localhost (localhost [127.0.0.1])
 by zimbra.cs.ucla.edu (Postfix) with ESMTP id 834BD161749;
 Mon,  8 Oct 2018 15:49:09 -0700 (PDT)
X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu
Received: from zimbra.cs.ucla.edu ([127.0.0.1])
 by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026)
 with ESMTP id ny9WKHMZ7_TY; Mon,  8 Oct 2018 15:49:09 -0700 (PDT)
Received: from Penguin.CS.UCLA.EDU (Penguin.CS.UCLA.EDU [131.179.64.200])
 by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id 60920161747;
 Mon,  8 Oct 2018 15:49:09 -0700 (PDT)
Subject: Re: [bug-diffutils] bug#32993: Pathologically slow operation
To: Stefan Monnier <monnier@HIDDEN>, 32993 <at> debbugs.gnu.org
References: <jwv4ldwqh32.fsf-monnier+gmane.linux.debian.user@HIDDEN>
From: Paul Eggert <eggert@HIDDEN>
Openpgp: preference=signencrypt
Autocrypt: addr=eggert@HIDDEN; prefer-encrypt=mutual; keydata=
 xsFNBEyAcmQBEADAAyH2xoTu7ppG5D3a8FMZEon74dCvc4+q1XA2J2tBy2pwaTqfhpxxdGA9
 Jj50UJ3PD4bSUEgN8tLZ0san47l5XTAFLi2456ciSl5m8sKaHlGdt9XmAAtmXqeZVIYX/UFS
 96fDzf4xhEmm/y7LbYEPQdUdxu47xA5KhTYp5bltF3WYDz1Ygd7gx07Auwp7iw7eNvnoDTAl
 KAl8KYDZzbDNCQGEbpY3efZIvPdeI+FWQN4W+kghy+P6au6PrIIhYraeua7XDdb2LS1en3Ss
 mE3QjqfRqI/A2ue8JMwsvXe/WK38Ezs6x74iTaqI3AFH6ilAhDqpMnd/msSESNFt76DiO1ZK
 QMr9amVPknjfPmJISqdhgB1DlEdw34sROf6V8mZw0xfqT6PKE46LcFefzs0kbg4GORf8vjG2
 Sf1tk5eU8MBiyN/bZ03bKNjNYMpODDQQwuP84kYLkX2wBxxMAhBxwbDVZudzxDZJ1C2VXujC
 OJVxq2kljBM9ETYuUGqd75AW2LXrLw6+MuIsHFAYAgRr7+KcwDgBAfwhPBYX34nSSiHlmLC+
 KaHLeCLF5ZI2vKm3HEeCTtlOg7xZEONgwzL+fdKo+D6SoC8RRxJKs8a3sVfI4t6CnrQzvJbB
 n6gxdgCu5i29J1QCYrCYvql2UyFPAK+do99/1jOXT4m2836j1wARAQABzSBQYXVsIEVnZ2Vy
 dCA8ZWdnZXJ0QGNzLnVjbGEuZWR1PsLBfgQTAQIAKAUCTIByZAIbAwUJEswDAAYLCQgHAwIG
 FQgCCQoLBBYCAwECHgECF4AACgkQ7ZfpDmKqfjRRGw/+Ij03dhYfYl/gXVRiuzV1gGrbHk+t
 nfrI/C7fAeoFzQ5tVgVinShaPkZo0HTPf18x6IDEdAiO8Mqo1yp0CtHmzGMCJ50o4Grgfjlr
 6g/+vtEOKbhleszN2XpJvpwM2QgGvn/laTLUu8PH9aRWTs7qJJZKKKAb4sxYc92FehPu6FOD
 0dDiyhlDAq4lOV2mdBpzQbiojoZzQLMQwjpgCTK2572eK9EOEQySUThXrSIz6ASenp4NYTFH
 s9tuJQvXk9gZDdPSl3bp+47dGxlxEWLpBIM7zIONw4ks4azgT8nvDZxA5IZHtvqBlJLBObYY
 0Le61Wp0y3TlBDh2qdK8eYL426W4scEMSuig5gb8OAtQiBW6k2sGUxxeiv8ovWu8YAZgKJfu
 oWI+uRnMEddruY8JsoM54KaKvZikkKs2bg1ndtLVzHpJ6qFZC7QVjeHUh6/BmgvdjWPZYFTt
 N+KA9CWX3GQKKgN3uu988yznD7LnB98T4EUH1HA/GnfBqMV1gpzTvPc4qVQinCmIkEFp83zl
 +G5fCjJJ3W7ivzCnYo4KhKLpFUm97okTKR2LW3xZzEW4cLSWO387MTK3CzDOx5qe6s4a91Zu
 ZM/j/TQdTLDaqNn83kA4Hq48UHXYxcIh+Nd8k/3w6lFuoK0wrOFiywjLx+0ur5jmmbecBGHc
 1xdhAFHOwU0ETIByZAEQAKaF678T9wyH4wjTrV1Pz3cDEoSnV/0ZUrOT37p1dcGyj/IXq1x6
 70HRVahAmk0sZpYc25PF9D5GPYHFWlNjuPU96rDndXB3hedmBRhLdC4bAXjI4DV+bmdVe+q/
 IMnlZRaVlm9EiMCVAR6w13sReu7qXkW9r3RwY2AzXskp/tAe4BRKr1Zmbvi2nbnQ6epEC42r
 Rbx0B1EhjbIQZ5JHGk24iPT7LdBgnNmos5wYjzwNlkMQD5T0Ydzhk7J+UxwA5m46mOhRDC2r
 FV/A0gm5TLy8DXjv/Esc4gYnYai6SQqnUEVh5LuV8YCJBnijs+Tiw71x1icmn6xGI45EugJO
 gec+rLypYgpVp4x0HI5T88qBRYCkxH3Kg8Qo+EWNA9A4LRQ9DX8njona0gf0s03tocK8kBN6
 6UoqqPtHBnc4eMgBymCflK12eKfd2YYxnyg9cZazWA5VslvTxpm76hbg5oiAEH/Vg/8MxHyA
 nPhfrgwyPrmJEcVBafdspJnYQxBYNco2LFPIhlOvWh8r4at+s+M3Lb26oUTczlgdW1Sf3SDA
 77BMRnF0FQyE+7AzV79MBN4ykiqaezQxtaF1Fy/tvkhffSo8u+dwG0EgJh+te38gTcISVr0G
 IPplLz6YhjrbHrPRF1CN5UuL9DBGjxuN35RLNVEfta6RUFlR6NctTjvrABEBAAHCwWUEGAEC
 AA8FAkyAcmQCGwwFCRLMAwAACgkQ7ZfpDmKqfjSrHA/+KzAKvTxRhA9MWNLxIyJ7S5uJ16gs
 T3oCjZrBKGEhKMOGX4O0GA6VOEryO7QRCCYah3oxSG38IAnNeiwJXgU9Bzkk85UGbPEd7HGF
 /VSeHCQwWou6jqUDTSDvn9YhNTdG0KXPM74aC+xr2Zow1O2mhXihgWKD0Dw+0LYPnUOsQ0KO
 FxHXXYHmRrS1OZPU59BLvc+TRhIhafSHKLwbXK+6ckkxBx6h8z5ccpG0Qs4bFhdFYnFrEieD
 LoGmnE2YLhdV6swJ9VNCS6pLiEohT3fm7aXm15tZOIyzMZhHRSAPblXxQ0ZSWjq8oRrcYNFx
 c4W1URpAkBCOYJoXvQfD5L3lqAl8TCqDUzYxhH/tJhbDdHrqHH767jaDaTB1+Talp/2AMKwc
 XNOdiklGxbmHVG6YGl6g8Lrbsu9NZEI4yLlHzuikthJWgz+3vZhVGyNlt+HNIoF6CjDL2omu
 5cEq4RDHM44QqPk6l7O0pUvN1mT4B+S1b08RKpqm/ff015E37HNV/piIvJlxGAYz8PSfuGCB
 1thMYqlmgdhd9/BabGFbGGYHA6U4/T5zqU+f6xHy1SsAQZ1MSKlLwekBIT+4/cLRGqCHjnV0
 q5H/T6a7t5mPkbzSrOLSo4puj+IToNjYyYIDBWzhlA19avOa+rvUjmHtD3sFN7cXWtkGoi8b
 uNcby4U=
Organization: UCLA Computer Science Department
Message-ID: <0904c3c5-90d3-dd53-680f-7867eb7a60c4@HIDDEN>
Date: Mon, 8 Oct 2018 15:49:09 -0700
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101
 Thunderbird/60.0
MIME-Version: 1.0
In-Reply-To: <jwv4ldwqh32.fsf-monnier+gmane.linux.debian.user@HIDDEN>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Content-Language: en-US
X-Spam-Score: -2.3 (--)
X-Debbugs-Envelope-To: 32993
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -3.3 (---)

On 10/8/18 2:34 PM, Stefan Monnier wrote:
> Is there a chance this performance behavior is the
> result of a performance bug, or is the algorithm really that costly?

It's O(N*D), where N is the input length and D is the number of 
differences. In your case that might be about 10*11, which is getting up 
there.

> tries to do word-level diffs (by basically turning every word
> into N copies of this word, each one on its own line (where N is the
> number of chars in the word, used to indicate to `diff` that long words
> are "more costly" than short ones)

Ouch. That's a really inefficient way of mucking with the cost 
algorithm. Diff should be able to do taht directly, without your having 
to repeat the words. (Just a simple matter of programming....)





Information forwarded to bug-diffutils@HIDDEN:
bug#32993; Package diffutils. Full text available.

Message received at submit <at> debbugs.gnu.org:


Received: (at submit) by debbugs.gnu.org; 8 Oct 2018 21:34:42 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Mon Oct 08 17:34:42 2018
Received: from localhost ([127.0.0.1]:40890 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84_2)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1g9dAo-0007I2-EK
	for submit <at> debbugs.gnu.org; Mon, 08 Oct 2018 17:34:42 -0400
Received: from eggs.gnu.org ([208.118.235.92]:50950)
 by debbugs.gnu.org with esmtp (Exim 4.84_2)
 (envelope-from <monnier@HIDDEN>) id 1g9dAm-0007Hp-G3
 for submit <at> debbugs.gnu.org; Mon, 08 Oct 2018 17:34:41 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from <monnier@HIDDEN>) id 1g9dAd-0006Hz-VL
 for submit <at> debbugs.gnu.org; Mon, 08 Oct 2018 17:34:35 -0400
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org
X-Spam-Level: 
X-Spam-Status: No, score=-0.0 required=5.0 tests=BAYES_20 autolearn=disabled
 version=3.3.2
Received: from lists.gnu.org ([2001:4830:134:3::11]:46403)
 by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32)
 (Exim 4.71) (envelope-from <monnier@HIDDEN>)
 id 1g9dAd-0006Hs-RB
 for submit <at> debbugs.gnu.org; Mon, 08 Oct 2018 17:34:31 -0400
Received: from eggs.gnu.org ([2001:4830:134:3::10]:60527)
 by lists.gnu.org with esmtp (Exim 4.71)
 (envelope-from <monnier@HIDDEN>) id 1g9dAc-0001VX-T3
 for bug-diffutils@HIDDEN; Mon, 08 Oct 2018 17:34:31 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from <monnier@HIDDEN>) id 1g9dAV-00068P-3y
 for bug-diffutils@HIDDEN; Mon, 08 Oct 2018 17:34:27 -0400
Received: from chene.dit.umontreal.ca ([132.204.246.20]:59186)
 by eggs.gnu.org with esmtp (Exim 4.71)
 (envelope-from <monnier@HIDDEN>) id 1g9dAU-000674-DK
 for bug-diffutils@HIDDEN; Mon, 08 Oct 2018 17:34:23 -0400
Received: from pastel.home (lechon.iro.umontreal.ca [132.204.27.242])
 by chene.dit.umontreal.ca (8.14.7/8.14.1) with ESMTP id w98LYI7f021434;
 Mon, 8 Oct 2018 17:34:19 -0400
Received: by pastel.home (Postfix, from userid 20848)
 id A20D56A41C; Mon,  8 Oct 2018 17:34:18 -0400 (EDT)
From: Stefan Monnier <monnier@HIDDEN>
To: bug-diffutils@HIDDEN
Subject: Pathologically slow operation
Message-ID: <jwv4ldwqh32.fsf-monnier+gmane.linux.debian.user@HIDDEN>
Date: Mon, 08 Oct 2018 17:34:18 -0400
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-NAI-Spam-Flag: NO
X-NAI-Spam-Threshold: 5
X-NAI-Spam-Score: 0
X-NAI-Spam-Rules: 2 Rules triggered
	EDT_SA_DN_PASS=0, RV6390=0
X-NAI-Spam-Version: 2.3.0.9418 : core <6390> : inlines <6919> : streams
 <1800748> : uri <2726508>
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not
 recognized.
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x
X-Received-From: 2001:4830:134:3::11
X-Spam-Score: -4.0 (----)
X-Debbugs-Envelope-To: submit
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -5.0 (-----)


I recently bumped into a `diff` operation that I killed after several
minutes while diffing two files (on 3.7GHz core i3, which is the fastest
machine I have).

These files were generated as part of Emacs's "refine-hunk" processing
which tries to do word-level diffs (by basically turning every word
into N copies of this word, each one on its own line (where N is the
number of chars in the word, used to indicate to `diff` that long words
are "more costly" than short ones)).

So the files's sizes were:

    % wc tmp/diff-bug-* 
    1038026  851160 4963190 tmp/diff-bug-1
      65041   54877  314788 tmp/diff-bug-2
    1103067  906037 5277978 total
    %

With --speed-large-files, diff still took almost a minute to return an
answer (which is 973026 lines long).

Those file aren't exactly security sensitive, but they contain personal
info that I'd rather not make public (I can make send them in private
upon request, tho).  Is there a chance this performance behavior is the
result of a performance bug, or is the algorithm really that costly?


        Stefan




Acknowledgement sent to Stefan Monnier <monnier@HIDDEN>:
New bug report received and forwarded. Copy sent to bug-diffutils@HIDDEN. Full text available.
Report forwarded to bug-diffutils@HIDDEN:
bug#32993; Package diffutils. Full text available.
Please note: This is a static page, with minimal formatting, updated once a day.
Click here to see this page with the latest information and nicer formatting.
Last modified: Mon, 25 Nov 2019 12:00:02 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.