X-Loop: help-debbugs@HIDDEN
Subject: bug#32993: Pathologically slow operation
Resent-From: Stefan Monnier <monnier@HIDDEN>
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-diffutils@HIDDEN
Resent-Date: Mon, 08 Oct 2018 21:35:01 +0000
Resent-Message-ID: <handler.32993.B.153903448228030 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: report 32993
X-GNU-PR-Package: diffutils
X-GNU-PR-Keywords:
To: 32993 <at> debbugs.gnu.org
X-Debbugs-Original-To: bug-diffutils@HIDDEN
Received: via spool by submit <at> debbugs.gnu.org id=B.153903448228030
(code B ref -1); Mon, 08 Oct 2018 21:35:01 +0000
Received: (at submit) by debbugs.gnu.org; 8 Oct 2018 21:34:42 +0000
Received: from localhost ([127.0.0.1]:40890 helo=debbugs.gnu.org)
by debbugs.gnu.org with esmtp (Exim 4.84_2)
(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
id 1g9dAo-0007I2-EK
for submit <at> debbugs.gnu.org; Mon, 08 Oct 2018 17:34:42 -0400
Received: from eggs.gnu.org ([208.118.235.92]:50950)
by debbugs.gnu.org with esmtp (Exim 4.84_2)
(envelope-from <monnier@HIDDEN>) id 1g9dAm-0007Hp-G3
for submit <at> debbugs.gnu.org; Mon, 08 Oct 2018 17:34:41 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
(envelope-from <monnier@HIDDEN>) id 1g9dAd-0006Hz-VL
for submit <at> debbugs.gnu.org; Mon, 08 Oct 2018 17:34:35 -0400
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org
X-Spam-Level:
X-Spam-Status: No, score=-0.0 required=5.0 tests=BAYES_20 autolearn=disabled
version=3.3.2
Received: from lists.gnu.org ([2001:4830:134:3::11]:46403)
by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32)
(Exim 4.71) (envelope-from <monnier@HIDDEN>)
id 1g9dAd-0006Hs-RB
for submit <at> debbugs.gnu.org; Mon, 08 Oct 2018 17:34:31 -0400
Received: from eggs.gnu.org ([2001:4830:134:3::10]:60527)
by lists.gnu.org with esmtp (Exim 4.71)
(envelope-from <monnier@HIDDEN>) id 1g9dAc-0001VX-T3
for bug-diffutils@HIDDEN; Mon, 08 Oct 2018 17:34:31 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
(envelope-from <monnier@HIDDEN>) id 1g9dAV-00068P-3y
for bug-diffutils@HIDDEN; Mon, 08 Oct 2018 17:34:27 -0400
Received: from chene.dit.umontreal.ca ([132.204.246.20]:59186)
by eggs.gnu.org with esmtp (Exim 4.71)
(envelope-from <monnier@HIDDEN>) id 1g9dAU-000674-DK
for bug-diffutils@HIDDEN; Mon, 08 Oct 2018 17:34:23 -0400
Received: from pastel.home (lechon.iro.umontreal.ca [132.204.27.242])
by chene.dit.umontreal.ca (8.14.7/8.14.1) with ESMTP id w98LYI7f021434;
Mon, 8 Oct 2018 17:34:19 -0400
Received: by pastel.home (Postfix, from userid 20848)
id A20D56A41C; Mon, 8 Oct 2018 17:34:18 -0400 (EDT)
From: Stefan Monnier <monnier@HIDDEN>
Message-ID: <jwv4ldwqh32.fsf-monnier+gmane.linux.debian.user@HIDDEN>
Date: Mon, 08 Oct 2018 17:34:18 -0400
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-NAI-Spam-Flag: NO
X-NAI-Spam-Threshold: 5
X-NAI-Spam-Score: 0
X-NAI-Spam-Rules: 2 Rules triggered
EDT_SA_DN_PASS=0, RV6390=0
X-NAI-Spam-Version: 2.3.0.9418 : core <6390> : inlines <6919> : streams
<1800748> : uri <2726508>
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not
recognized.
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x
X-Received-From: 2001:4830:134:3::11
X-Spam-Score: -4.0 (----)
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>,
<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>,
<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -5.0 (-----)
I recently bumped into a `diff` operation that I killed after several
minutes while diffing two files (on 3.7GHz core i3, which is the fastest
machine I have).
These files were generated as part of Emacs's "refine-hunk" processing
which tries to do word-level diffs (by basically turning every word
into N copies of this word, each one on its own line (where N is the
number of chars in the word, used to indicate to `diff` that long words
are "more costly" than short ones)).
So the files's sizes were:
% wc tmp/diff-bug-*
1038026 851160 4963190 tmp/diff-bug-1
65041 54877 314788 tmp/diff-bug-2
1103067 906037 5277978 total
%
With --speed-large-files, diff still took almost a minute to return an
answer (which is 973026 lines long).
Those file aren't exactly security sensitive, but they contain personal
info that I'd rather not make public (I can make send them in private
upon request, tho). Is there a chance this performance behavior is the
result of a performance bug, or is the algorithm really that costly?
Stefan
Content-Disposition: inline Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-Mailer: MIME-tools 5.505 (Entity 5.505) Content-Type: text/plain; charset=utf-8 X-Loop: help-debbugs@HIDDEN From: help-debbugs@HIDDEN (GNU bug Tracking System) To: Stefan Monnier <monnier@HIDDEN> Subject: bug#32993: Acknowledgement (Pathologically slow operation) Message-ID: <handler.32993.B.153903448228030.ack <at> debbugs.gnu.org> References: <jwv4ldwqh32.fsf-monnier+gmane.linux.debian.user@HIDDEN> X-Gnu-PR-Message: ack 32993 X-Gnu-PR-Package: diffutils Reply-To: 32993 <at> debbugs.gnu.org Date: Mon, 08 Oct 2018 21:35:02 +0000 Thank you for filing a new bug report with debbugs.gnu.org. This is an automatically generated reply to let you know your message has been received. Your message is being forwarded to the package maintainers and other interested parties for their attention; they will reply in due course. Your message has been sent to the package maintainer(s): bug-diffutils@HIDDEN If you wish to submit further information on this problem, please send it to 32993 <at> debbugs.gnu.org. Please do not send mail to help-debbugs@HIDDEN unless you wish to report a problem with the Bug-tracking system. --=20 32993: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=3D32993 GNU Bug Tracking System Contact help-debbugs@HIDDEN with problems
X-Loop: help-debbugs@HIDDEN
Subject: bug#32993: [bug-diffutils] bug#32993: Pathologically slow operation
Resent-From: Paul Eggert <eggert@HIDDEN>
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-diffutils@HIDDEN
Resent-Date: Mon, 08 Oct 2018 22:50:02 +0000
Resent-Message-ID: <handler.32993.B32993.15390389592196 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: followup 32993
X-GNU-PR-Package: diffutils
X-GNU-PR-Keywords:
To: Stefan Monnier <monnier@HIDDEN>, 32993 <at> debbugs.gnu.org
Received: via spool by 32993-submit <at> debbugs.gnu.org id=B32993.15390389592196
(code B ref 32993); Mon, 08 Oct 2018 22:50:02 +0000
Received: (at 32993) by debbugs.gnu.org; 8 Oct 2018 22:49:19 +0000
Received: from localhost ([127.0.0.1]:40898 helo=debbugs.gnu.org)
by debbugs.gnu.org with esmtp (Exim 4.84_2)
(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
id 1g9eL1-0000ZM-7X
for submit <at> debbugs.gnu.org; Mon, 08 Oct 2018 18:49:19 -0400
Received: from zimbra.cs.ucla.edu ([131.179.128.68]:52720)
by debbugs.gnu.org with esmtp (Exim 4.84_2)
(envelope-from <eggert@HIDDEN>) id 1g9eKy-0000Z6-MH
for 32993 <at> debbugs.gnu.org; Mon, 08 Oct 2018 18:49:17 -0400
Received: from localhost (localhost [127.0.0.1])
by zimbra.cs.ucla.edu (Postfix) with ESMTP id 41872161747;
Mon, 8 Oct 2018 15:49:10 -0700 (PDT)
Received: from zimbra.cs.ucla.edu ([127.0.0.1])
by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032)
with ESMTP id 9Xj84CwsF5-q; Mon, 8 Oct 2018 15:49:09 -0700 (PDT)
Received: from localhost (localhost [127.0.0.1])
by zimbra.cs.ucla.edu (Postfix) with ESMTP id 834BD161749;
Mon, 8 Oct 2018 15:49:09 -0700 (PDT)
X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu
Received: from zimbra.cs.ucla.edu ([127.0.0.1])
by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026)
with ESMTP id ny9WKHMZ7_TY; Mon, 8 Oct 2018 15:49:09 -0700 (PDT)
Received: from Penguin.CS.UCLA.EDU (Penguin.CS.UCLA.EDU [131.179.64.200])
by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id 60920161747;
Mon, 8 Oct 2018 15:49:09 -0700 (PDT)
References: <jwv4ldwqh32.fsf-monnier+gmane.linux.debian.user@HIDDEN>
From: Paul Eggert <eggert@HIDDEN>
Openpgp: preference=signencrypt
Autocrypt: addr=eggert@HIDDEN; prefer-encrypt=mutual; keydata=
xsFNBEyAcmQBEADAAyH2xoTu7ppG5D3a8FMZEon74dCvc4+q1XA2J2tBy2pwaTqfhpxxdGA9
Jj50UJ3PD4bSUEgN8tLZ0san47l5XTAFLi2456ciSl5m8sKaHlGdt9XmAAtmXqeZVIYX/UFS
96fDzf4xhEmm/y7LbYEPQdUdxu47xA5KhTYp5bltF3WYDz1Ygd7gx07Auwp7iw7eNvnoDTAl
KAl8KYDZzbDNCQGEbpY3efZIvPdeI+FWQN4W+kghy+P6au6PrIIhYraeua7XDdb2LS1en3Ss
mE3QjqfRqI/A2ue8JMwsvXe/WK38Ezs6x74iTaqI3AFH6ilAhDqpMnd/msSESNFt76DiO1ZK
QMr9amVPknjfPmJISqdhgB1DlEdw34sROf6V8mZw0xfqT6PKE46LcFefzs0kbg4GORf8vjG2
Sf1tk5eU8MBiyN/bZ03bKNjNYMpODDQQwuP84kYLkX2wBxxMAhBxwbDVZudzxDZJ1C2VXujC
OJVxq2kljBM9ETYuUGqd75AW2LXrLw6+MuIsHFAYAgRr7+KcwDgBAfwhPBYX34nSSiHlmLC+
KaHLeCLF5ZI2vKm3HEeCTtlOg7xZEONgwzL+fdKo+D6SoC8RRxJKs8a3sVfI4t6CnrQzvJbB
n6gxdgCu5i29J1QCYrCYvql2UyFPAK+do99/1jOXT4m2836j1wARAQABzSBQYXVsIEVnZ2Vy
dCA8ZWdnZXJ0QGNzLnVjbGEuZWR1PsLBfgQTAQIAKAUCTIByZAIbAwUJEswDAAYLCQgHAwIG
FQgCCQoLBBYCAwECHgECF4AACgkQ7ZfpDmKqfjRRGw/+Ij03dhYfYl/gXVRiuzV1gGrbHk+t
nfrI/C7fAeoFzQ5tVgVinShaPkZo0HTPf18x6IDEdAiO8Mqo1yp0CtHmzGMCJ50o4Grgfjlr
6g/+vtEOKbhleszN2XpJvpwM2QgGvn/laTLUu8PH9aRWTs7qJJZKKKAb4sxYc92FehPu6FOD
0dDiyhlDAq4lOV2mdBpzQbiojoZzQLMQwjpgCTK2572eK9EOEQySUThXrSIz6ASenp4NYTFH
s9tuJQvXk9gZDdPSl3bp+47dGxlxEWLpBIM7zIONw4ks4azgT8nvDZxA5IZHtvqBlJLBObYY
0Le61Wp0y3TlBDh2qdK8eYL426W4scEMSuig5gb8OAtQiBW6k2sGUxxeiv8ovWu8YAZgKJfu
oWI+uRnMEddruY8JsoM54KaKvZikkKs2bg1ndtLVzHpJ6qFZC7QVjeHUh6/BmgvdjWPZYFTt
N+KA9CWX3GQKKgN3uu988yznD7LnB98T4EUH1HA/GnfBqMV1gpzTvPc4qVQinCmIkEFp83zl
+G5fCjJJ3W7ivzCnYo4KhKLpFUm97okTKR2LW3xZzEW4cLSWO387MTK3CzDOx5qe6s4a91Zu
ZM/j/TQdTLDaqNn83kA4Hq48UHXYxcIh+Nd8k/3w6lFuoK0wrOFiywjLx+0ur5jmmbecBGHc
1xdhAFHOwU0ETIByZAEQAKaF678T9wyH4wjTrV1Pz3cDEoSnV/0ZUrOT37p1dcGyj/IXq1x6
70HRVahAmk0sZpYc25PF9D5GPYHFWlNjuPU96rDndXB3hedmBRhLdC4bAXjI4DV+bmdVe+q/
IMnlZRaVlm9EiMCVAR6w13sReu7qXkW9r3RwY2AzXskp/tAe4BRKr1Zmbvi2nbnQ6epEC42r
Rbx0B1EhjbIQZ5JHGk24iPT7LdBgnNmos5wYjzwNlkMQD5T0Ydzhk7J+UxwA5m46mOhRDC2r
FV/A0gm5TLy8DXjv/Esc4gYnYai6SQqnUEVh5LuV8YCJBnijs+Tiw71x1icmn6xGI45EugJO
gec+rLypYgpVp4x0HI5T88qBRYCkxH3Kg8Qo+EWNA9A4LRQ9DX8njona0gf0s03tocK8kBN6
6UoqqPtHBnc4eMgBymCflK12eKfd2YYxnyg9cZazWA5VslvTxpm76hbg5oiAEH/Vg/8MxHyA
nPhfrgwyPrmJEcVBafdspJnYQxBYNco2LFPIhlOvWh8r4at+s+M3Lb26oUTczlgdW1Sf3SDA
77BMRnF0FQyE+7AzV79MBN4ykiqaezQxtaF1Fy/tvkhffSo8u+dwG0EgJh+te38gTcISVr0G
IPplLz6YhjrbHrPRF1CN5UuL9DBGjxuN35RLNVEfta6RUFlR6NctTjvrABEBAAHCwWUEGAEC
AA8FAkyAcmQCGwwFCRLMAwAACgkQ7ZfpDmKqfjSrHA/+KzAKvTxRhA9MWNLxIyJ7S5uJ16gs
T3oCjZrBKGEhKMOGX4O0GA6VOEryO7QRCCYah3oxSG38IAnNeiwJXgU9Bzkk85UGbPEd7HGF
/VSeHCQwWou6jqUDTSDvn9YhNTdG0KXPM74aC+xr2Zow1O2mhXihgWKD0Dw+0LYPnUOsQ0KO
FxHXXYHmRrS1OZPU59BLvc+TRhIhafSHKLwbXK+6ckkxBx6h8z5ccpG0Qs4bFhdFYnFrEieD
LoGmnE2YLhdV6swJ9VNCS6pLiEohT3fm7aXm15tZOIyzMZhHRSAPblXxQ0ZSWjq8oRrcYNFx
c4W1URpAkBCOYJoXvQfD5L3lqAl8TCqDUzYxhH/tJhbDdHrqHH767jaDaTB1+Talp/2AMKwc
XNOdiklGxbmHVG6YGl6g8Lrbsu9NZEI4yLlHzuikthJWgz+3vZhVGyNlt+HNIoF6CjDL2omu
5cEq4RDHM44QqPk6l7O0pUvN1mT4B+S1b08RKpqm/ff015E37HNV/piIvJlxGAYz8PSfuGCB
1thMYqlmgdhd9/BabGFbGGYHA6U4/T5zqU+f6xHy1SsAQZ1MSKlLwekBIT+4/cLRGqCHjnV0
q5H/T6a7t5mPkbzSrOLSo4puj+IToNjYyYIDBWzhlA19avOa+rvUjmHtD3sFN7cXWtkGoi8b
uNcby4U=
Organization: UCLA Computer Science Department
Message-ID: <0904c3c5-90d3-dd53-680f-7867eb7a60c4@HIDDEN>
Date: Mon, 8 Oct 2018 15:49:09 -0700
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101
Thunderbird/60.0
MIME-Version: 1.0
In-Reply-To: <jwv4ldwqh32.fsf-monnier+gmane.linux.debian.user@HIDDEN>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Content-Language: en-US
X-Spam-Score: -2.3 (--)
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>,
<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>,
<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -3.3 (---)
On 10/8/18 2:34 PM, Stefan Monnier wrote:
> Is there a chance this performance behavior is the
> result of a performance bug, or is the algorithm really that costly?
It's O(N*D), where N is the input length and D is the number of
differences. In your case that might be about 10*11, which is getting up
there.
> tries to do word-level diffs (by basically turning every word
> into N copies of this word, each one on its own line (where N is the
> number of chars in the word, used to indicate to `diff` that long words
> are "more costly" than short ones)
Ouch. That's a really inefficient way of mucking with the cost
algorithm. Diff should be able to do taht directly, without your having
to repeat the words. (Just a simple matter of programming....)
X-Loop: help-debbugs@HIDDEN
Subject: bug#32993: [bug-diffutils] bug#32993: Pathologically slow operation
Resent-From: Stefan Monnier <monnier@HIDDEN>
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-diffutils@HIDDEN
Resent-Date: Tue, 09 Oct 2018 00:16:02 +0000
Resent-Message-ID: <handler.32993.B32993.153904411218260 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: followup 32993
X-GNU-PR-Package: diffutils
X-GNU-PR-Keywords:
To: Paul Eggert <eggert@HIDDEN>
Cc: 32993 <at> debbugs.gnu.org
Received: via spool by 32993-submit <at> debbugs.gnu.org id=B32993.153904411218260
(code B ref 32993); Tue, 09 Oct 2018 00:16:02 +0000
Received: (at 32993) by debbugs.gnu.org; 9 Oct 2018 00:15:12 +0000
Received: from localhost ([127.0.0.1]:40986 helo=debbugs.gnu.org)
by debbugs.gnu.org with esmtp (Exim 4.84_2)
(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
id 1g9fg7-0004kR-Ty
for submit <at> debbugs.gnu.org; Mon, 08 Oct 2018 20:15:12 -0400
Received: from chene.dit.umontreal.ca ([132.204.246.20]:52211)
by debbugs.gnu.org with esmtp (Exim 4.84_2)
(envelope-from <monnier@HIDDEN>) id 1g9fg6-0004kJ-BD
for 32993 <at> debbugs.gnu.org; Mon, 08 Oct 2018 20:15:11 -0400
Received: from milanesa.home (lechon.iro.umontreal.ca [132.204.27.242])
by chene.dit.umontreal.ca (8.14.7/8.14.1) with ESMTP id w990ESaV010572;
Mon, 8 Oct 2018 20:14:28 -0400
Received: by milanesa.home (Postfix, from userid 20848)
id 0549366100; Mon, 8 Oct 2018 20:14:28 -0400 (EDT)
From: Stefan Monnier <monnier@HIDDEN>
Message-ID: <jwvftxg2d2n.fsf-monnier+INBOX@HIDDEN>
References: <jwv4ldwqh32.fsf-monnier+gmane.linux.debian.user@HIDDEN>
<0904c3c5-90d3-dd53-680f-7867eb7a60c4@HIDDEN>
Date: Mon, 08 Oct 2018 20:14:27 -0400
In-Reply-To: <0904c3c5-90d3-dd53-680f-7867eb7a60c4@HIDDEN> (Paul Eggert's
message of "Mon, 8 Oct 2018 15:49:09 -0700")
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-NAI-Spam-Flag: NO
X-NAI-Spam-Threshold: 5
X-NAI-Spam-Score: 0
X-NAI-Spam-Rules: 2 Rules triggered
EDT_SA_DN_PASS=0, RV6390=0
X-NAI-Spam-Version: 2.3.0.9418 : core <6390> : inlines <6919> : streams
<1800759> : uri <2726587>
X-Spam-Score: -2.3 (--)
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>,
<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>,
<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -3.3 (---)
>> tries to do word-level diffs (by basically turning every word
>> into N copies of this word, each one on its own line (where N is the
>> number of chars in the word, used to indicate to `diff` that long words
>> are "more costly" than short ones)
> Ouch. That's a really inefficient way of mucking with the cost
> algorithm.
What other way is there?
> Diff should be able to do taht directly, without your having to
> repeat the words. (Just a simple matter of programming....)
"man diff" doesn't seem to offer much room for "programming".
Stefan
X-Loop: help-debbugs@HIDDEN
Subject: bug#32993: [bug-diffutils] bug#32993: Pathologically slow operation
Resent-From: Paul Eggert <eggert@HIDDEN>
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-diffutils@HIDDEN
Resent-Date: Tue, 09 Oct 2018 00:48:01 +0000
Resent-Message-ID: <handler.32993.B32993.153904602721129 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: followup 32993
X-GNU-PR-Package: diffutils
X-GNU-PR-Keywords:
To: Stefan Monnier <monnier@HIDDEN>
Cc: 32993 <at> debbugs.gnu.org
Received: via spool by 32993-submit <at> debbugs.gnu.org id=B32993.153904602721129
(code B ref 32993); Tue, 09 Oct 2018 00:48:01 +0000
Received: (at 32993) by debbugs.gnu.org; 9 Oct 2018 00:47:07 +0000
Received: from localhost ([127.0.0.1]:40996 helo=debbugs.gnu.org)
by debbugs.gnu.org with esmtp (Exim 4.84_2)
(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
id 1g9gB1-0005Ui-3h
for submit <at> debbugs.gnu.org; Mon, 08 Oct 2018 20:47:07 -0400
Received: from zimbra.cs.ucla.edu ([131.179.128.68]:43640)
by debbugs.gnu.org with esmtp (Exim 4.84_2)
(envelope-from <eggert@HIDDEN>) id 1g9gAy-0005UE-V7
for 32993 <at> debbugs.gnu.org; Mon, 08 Oct 2018 20:47:05 -0400
Received: from localhost (localhost [127.0.0.1])
by zimbra.cs.ucla.edu (Postfix) with ESMTP id 99F7B16173B;
Mon, 8 Oct 2018 17:46:58 -0700 (PDT)
Received: from zimbra.cs.ucla.edu ([127.0.0.1])
by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032)
with ESMTP id WMVOHcmMXvS7; Mon, 8 Oct 2018 17:46:57 -0700 (PDT)
Received: from localhost (localhost [127.0.0.1])
by zimbra.cs.ucla.edu (Postfix) with ESMTP id BAC65161747;
Mon, 8 Oct 2018 17:46:57 -0700 (PDT)
X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu
Received: from zimbra.cs.ucla.edu ([127.0.0.1])
by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026)
with ESMTP id uth-8xtj8Zb1; Mon, 8 Oct 2018 17:46:57 -0700 (PDT)
Received: from Penguin.CS.UCLA.EDU (Penguin.CS.UCLA.EDU [131.179.64.200])
by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id 9E34016173B;
Mon, 8 Oct 2018 17:46:57 -0700 (PDT)
References: <jwv4ldwqh32.fsf-monnier+gmane.linux.debian.user@HIDDEN>
<0904c3c5-90d3-dd53-680f-7867eb7a60c4@HIDDEN>
<jwvftxg2d2n.fsf-monnier+INBOX@HIDDEN>
From: Paul Eggert <eggert@HIDDEN>
Openpgp: preference=signencrypt
Autocrypt: addr=eggert@HIDDEN; prefer-encrypt=mutual; keydata=
xsFNBEyAcmQBEADAAyH2xoTu7ppG5D3a8FMZEon74dCvc4+q1XA2J2tBy2pwaTqfhpxxdGA9
Jj50UJ3PD4bSUEgN8tLZ0san47l5XTAFLi2456ciSl5m8sKaHlGdt9XmAAtmXqeZVIYX/UFS
96fDzf4xhEmm/y7LbYEPQdUdxu47xA5KhTYp5bltF3WYDz1Ygd7gx07Auwp7iw7eNvnoDTAl
KAl8KYDZzbDNCQGEbpY3efZIvPdeI+FWQN4W+kghy+P6au6PrIIhYraeua7XDdb2LS1en3Ss
mE3QjqfRqI/A2ue8JMwsvXe/WK38Ezs6x74iTaqI3AFH6ilAhDqpMnd/msSESNFt76DiO1ZK
QMr9amVPknjfPmJISqdhgB1DlEdw34sROf6V8mZw0xfqT6PKE46LcFefzs0kbg4GORf8vjG2
Sf1tk5eU8MBiyN/bZ03bKNjNYMpODDQQwuP84kYLkX2wBxxMAhBxwbDVZudzxDZJ1C2VXujC
OJVxq2kljBM9ETYuUGqd75AW2LXrLw6+MuIsHFAYAgRr7+KcwDgBAfwhPBYX34nSSiHlmLC+
KaHLeCLF5ZI2vKm3HEeCTtlOg7xZEONgwzL+fdKo+D6SoC8RRxJKs8a3sVfI4t6CnrQzvJbB
n6gxdgCu5i29J1QCYrCYvql2UyFPAK+do99/1jOXT4m2836j1wARAQABzSBQYXVsIEVnZ2Vy
dCA8ZWdnZXJ0QGNzLnVjbGEuZWR1PsLBfgQTAQIAKAUCTIByZAIbAwUJEswDAAYLCQgHAwIG
FQgCCQoLBBYCAwECHgECF4AACgkQ7ZfpDmKqfjRRGw/+Ij03dhYfYl/gXVRiuzV1gGrbHk+t
nfrI/C7fAeoFzQ5tVgVinShaPkZo0HTPf18x6IDEdAiO8Mqo1yp0CtHmzGMCJ50o4Grgfjlr
6g/+vtEOKbhleszN2XpJvpwM2QgGvn/laTLUu8PH9aRWTs7qJJZKKKAb4sxYc92FehPu6FOD
0dDiyhlDAq4lOV2mdBpzQbiojoZzQLMQwjpgCTK2572eK9EOEQySUThXrSIz6ASenp4NYTFH
s9tuJQvXk9gZDdPSl3bp+47dGxlxEWLpBIM7zIONw4ks4azgT8nvDZxA5IZHtvqBlJLBObYY
0Le61Wp0y3TlBDh2qdK8eYL426W4scEMSuig5gb8OAtQiBW6k2sGUxxeiv8ovWu8YAZgKJfu
oWI+uRnMEddruY8JsoM54KaKvZikkKs2bg1ndtLVzHpJ6qFZC7QVjeHUh6/BmgvdjWPZYFTt
N+KA9CWX3GQKKgN3uu988yznD7LnB98T4EUH1HA/GnfBqMV1gpzTvPc4qVQinCmIkEFp83zl
+G5fCjJJ3W7ivzCnYo4KhKLpFUm97okTKR2LW3xZzEW4cLSWO387MTK3CzDOx5qe6s4a91Zu
ZM/j/TQdTLDaqNn83kA4Hq48UHXYxcIh+Nd8k/3w6lFuoK0wrOFiywjLx+0ur5jmmbecBGHc
1xdhAFHOwU0ETIByZAEQAKaF678T9wyH4wjTrV1Pz3cDEoSnV/0ZUrOT37p1dcGyj/IXq1x6
70HRVahAmk0sZpYc25PF9D5GPYHFWlNjuPU96rDndXB3hedmBRhLdC4bAXjI4DV+bmdVe+q/
IMnlZRaVlm9EiMCVAR6w13sReu7qXkW9r3RwY2AzXskp/tAe4BRKr1Zmbvi2nbnQ6epEC42r
Rbx0B1EhjbIQZ5JHGk24iPT7LdBgnNmos5wYjzwNlkMQD5T0Ydzhk7J+UxwA5m46mOhRDC2r
FV/A0gm5TLy8DXjv/Esc4gYnYai6SQqnUEVh5LuV8YCJBnijs+Tiw71x1icmn6xGI45EugJO
gec+rLypYgpVp4x0HI5T88qBRYCkxH3Kg8Qo+EWNA9A4LRQ9DX8njona0gf0s03tocK8kBN6
6UoqqPtHBnc4eMgBymCflK12eKfd2YYxnyg9cZazWA5VslvTxpm76hbg5oiAEH/Vg/8MxHyA
nPhfrgwyPrmJEcVBafdspJnYQxBYNco2LFPIhlOvWh8r4at+s+M3Lb26oUTczlgdW1Sf3SDA
77BMRnF0FQyE+7AzV79MBN4ykiqaezQxtaF1Fy/tvkhffSo8u+dwG0EgJh+te38gTcISVr0G
IPplLz6YhjrbHrPRF1CN5UuL9DBGjxuN35RLNVEfta6RUFlR6NctTjvrABEBAAHCwWUEGAEC
AA8FAkyAcmQCGwwFCRLMAwAACgkQ7ZfpDmKqfjSrHA/+KzAKvTxRhA9MWNLxIyJ7S5uJ16gs
T3oCjZrBKGEhKMOGX4O0GA6VOEryO7QRCCYah3oxSG38IAnNeiwJXgU9Bzkk85UGbPEd7HGF
/VSeHCQwWou6jqUDTSDvn9YhNTdG0KXPM74aC+xr2Zow1O2mhXihgWKD0Dw+0LYPnUOsQ0KO
FxHXXYHmRrS1OZPU59BLvc+TRhIhafSHKLwbXK+6ckkxBx6h8z5ccpG0Qs4bFhdFYnFrEieD
LoGmnE2YLhdV6swJ9VNCS6pLiEohT3fm7aXm15tZOIyzMZhHRSAPblXxQ0ZSWjq8oRrcYNFx
c4W1URpAkBCOYJoXvQfD5L3lqAl8TCqDUzYxhH/tJhbDdHrqHH767jaDaTB1+Talp/2AMKwc
XNOdiklGxbmHVG6YGl6g8Lrbsu9NZEI4yLlHzuikthJWgz+3vZhVGyNlt+HNIoF6CjDL2omu
5cEq4RDHM44QqPk6l7O0pUvN1mT4B+S1b08RKpqm/ff015E37HNV/piIvJlxGAYz8PSfuGCB
1thMYqlmgdhd9/BabGFbGGYHA6U4/T5zqU+f6xHy1SsAQZ1MSKlLwekBIT+4/cLRGqCHjnV0
q5H/T6a7t5mPkbzSrOLSo4puj+IToNjYyYIDBWzhlA19avOa+rvUjmHtD3sFN7cXWtkGoi8b
uNcby4U=
Organization: UCLA Computer Science Department
Message-ID: <f71a3817-72fa-dd02-a4ad-52b6910609a1@HIDDEN>
Date: Mon, 8 Oct 2018 17:46:57 -0700
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101
Thunderbird/60.0
MIME-Version: 1.0
In-Reply-To: <jwvftxg2d2n.fsf-monnier+INBOX@HIDDEN>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Content-Language: en-US
X-Spam-Score: -2.3 (--)
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>,
<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>,
<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -3.3 (---)
On 10/8/18 5:14 PM, Stefan Monnier wrote:
> What other way is there?
It'd require hacking the diff source code, alas. Also, understanding the
algorithm. Although the best guy to do it would probably be the
algorithm's co-inventor Eugene Myers, he's kind of busy these days
<https://en.wikipedia.org/wiki/Eugene_Myers>.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997 nCipher Corporation Ltd,
1994-97 Ian Jackson.