Received: (at 32993) by debbugs.gnu.org; 9 Oct 2018 00:47:07 +0000 From debbugs-submit-bounces <at> debbugs.gnu.org Mon Oct 08 20:47:07 2018 Received: from localhost ([127.0.0.1]:40996 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1g9gB1-0005Ui-3h for submit <at> debbugs.gnu.org; Mon, 08 Oct 2018 20:47:07 -0400 Received: from zimbra.cs.ucla.edu ([131.179.128.68]:43640) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <eggert@HIDDEN>) id 1g9gAy-0005UE-V7 for 32993 <at> debbugs.gnu.org; Mon, 08 Oct 2018 20:47:05 -0400 Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 99F7B16173B; Mon, 8 Oct 2018 17:46:58 -0700 (PDT) Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id WMVOHcmMXvS7; Mon, 8 Oct 2018 17:46:57 -0700 (PDT) Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id BAC65161747; Mon, 8 Oct 2018 17:46:57 -0700 (PDT) X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id uth-8xtj8Zb1; Mon, 8 Oct 2018 17:46:57 -0700 (PDT) Received: from Penguin.CS.UCLA.EDU (Penguin.CS.UCLA.EDU [131.179.64.200]) by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id 9E34016173B; Mon, 8 Oct 2018 17:46:57 -0700 (PDT) Subject: Re: [bug-diffutils] bug#32993: Pathologically slow operation To: Stefan Monnier <monnier@HIDDEN> References: <jwv4ldwqh32.fsf-monnier+gmane.linux.debian.user@HIDDEN> <0904c3c5-90d3-dd53-680f-7867eb7a60c4@HIDDEN> <jwvftxg2d2n.fsf-monnier+INBOX@HIDDEN> From: Paul Eggert <eggert@HIDDEN> Openpgp: preference=signencrypt Autocrypt: addr=eggert@HIDDEN; prefer-encrypt=mutual; keydata= xsFNBEyAcmQBEADAAyH2xoTu7ppG5D3a8FMZEon74dCvc4+q1XA2J2tBy2pwaTqfhpxxdGA9 Jj50UJ3PD4bSUEgN8tLZ0san47l5XTAFLi2456ciSl5m8sKaHlGdt9XmAAtmXqeZVIYX/UFS 96fDzf4xhEmm/y7LbYEPQdUdxu47xA5KhTYp5bltF3WYDz1Ygd7gx07Auwp7iw7eNvnoDTAl KAl8KYDZzbDNCQGEbpY3efZIvPdeI+FWQN4W+kghy+P6au6PrIIhYraeua7XDdb2LS1en3Ss mE3QjqfRqI/A2ue8JMwsvXe/WK38Ezs6x74iTaqI3AFH6ilAhDqpMnd/msSESNFt76DiO1ZK QMr9amVPknjfPmJISqdhgB1DlEdw34sROf6V8mZw0xfqT6PKE46LcFefzs0kbg4GORf8vjG2 Sf1tk5eU8MBiyN/bZ03bKNjNYMpODDQQwuP84kYLkX2wBxxMAhBxwbDVZudzxDZJ1C2VXujC OJVxq2kljBM9ETYuUGqd75AW2LXrLw6+MuIsHFAYAgRr7+KcwDgBAfwhPBYX34nSSiHlmLC+ KaHLeCLF5ZI2vKm3HEeCTtlOg7xZEONgwzL+fdKo+D6SoC8RRxJKs8a3sVfI4t6CnrQzvJbB n6gxdgCu5i29J1QCYrCYvql2UyFPAK+do99/1jOXT4m2836j1wARAQABzSBQYXVsIEVnZ2Vy dCA8ZWdnZXJ0QGNzLnVjbGEuZWR1PsLBfgQTAQIAKAUCTIByZAIbAwUJEswDAAYLCQgHAwIG FQgCCQoLBBYCAwECHgECF4AACgkQ7ZfpDmKqfjRRGw/+Ij03dhYfYl/gXVRiuzV1gGrbHk+t nfrI/C7fAeoFzQ5tVgVinShaPkZo0HTPf18x6IDEdAiO8Mqo1yp0CtHmzGMCJ50o4Grgfjlr 6g/+vtEOKbhleszN2XpJvpwM2QgGvn/laTLUu8PH9aRWTs7qJJZKKKAb4sxYc92FehPu6FOD 0dDiyhlDAq4lOV2mdBpzQbiojoZzQLMQwjpgCTK2572eK9EOEQySUThXrSIz6ASenp4NYTFH s9tuJQvXk9gZDdPSl3bp+47dGxlxEWLpBIM7zIONw4ks4azgT8nvDZxA5IZHtvqBlJLBObYY 0Le61Wp0y3TlBDh2qdK8eYL426W4scEMSuig5gb8OAtQiBW6k2sGUxxeiv8ovWu8YAZgKJfu oWI+uRnMEddruY8JsoM54KaKvZikkKs2bg1ndtLVzHpJ6qFZC7QVjeHUh6/BmgvdjWPZYFTt N+KA9CWX3GQKKgN3uu988yznD7LnB98T4EUH1HA/GnfBqMV1gpzTvPc4qVQinCmIkEFp83zl +G5fCjJJ3W7ivzCnYo4KhKLpFUm97okTKR2LW3xZzEW4cLSWO387MTK3CzDOx5qe6s4a91Zu ZM/j/TQdTLDaqNn83kA4Hq48UHXYxcIh+Nd8k/3w6lFuoK0wrOFiywjLx+0ur5jmmbecBGHc 1xdhAFHOwU0ETIByZAEQAKaF678T9wyH4wjTrV1Pz3cDEoSnV/0ZUrOT37p1dcGyj/IXq1x6 70HRVahAmk0sZpYc25PF9D5GPYHFWlNjuPU96rDndXB3hedmBRhLdC4bAXjI4DV+bmdVe+q/ IMnlZRaVlm9EiMCVAR6w13sReu7qXkW9r3RwY2AzXskp/tAe4BRKr1Zmbvi2nbnQ6epEC42r Rbx0B1EhjbIQZ5JHGk24iPT7LdBgnNmos5wYjzwNlkMQD5T0Ydzhk7J+UxwA5m46mOhRDC2r FV/A0gm5TLy8DXjv/Esc4gYnYai6SQqnUEVh5LuV8YCJBnijs+Tiw71x1icmn6xGI45EugJO gec+rLypYgpVp4x0HI5T88qBRYCkxH3Kg8Qo+EWNA9A4LRQ9DX8njona0gf0s03tocK8kBN6 6UoqqPtHBnc4eMgBymCflK12eKfd2YYxnyg9cZazWA5VslvTxpm76hbg5oiAEH/Vg/8MxHyA nPhfrgwyPrmJEcVBafdspJnYQxBYNco2LFPIhlOvWh8r4at+s+M3Lb26oUTczlgdW1Sf3SDA 77BMRnF0FQyE+7AzV79MBN4ykiqaezQxtaF1Fy/tvkhffSo8u+dwG0EgJh+te38gTcISVr0G IPplLz6YhjrbHrPRF1CN5UuL9DBGjxuN35RLNVEfta6RUFlR6NctTjvrABEBAAHCwWUEGAEC AA8FAkyAcmQCGwwFCRLMAwAACgkQ7ZfpDmKqfjSrHA/+KzAKvTxRhA9MWNLxIyJ7S5uJ16gs T3oCjZrBKGEhKMOGX4O0GA6VOEryO7QRCCYah3oxSG38IAnNeiwJXgU9Bzkk85UGbPEd7HGF /VSeHCQwWou6jqUDTSDvn9YhNTdG0KXPM74aC+xr2Zow1O2mhXihgWKD0Dw+0LYPnUOsQ0KO FxHXXYHmRrS1OZPU59BLvc+TRhIhafSHKLwbXK+6ckkxBx6h8z5ccpG0Qs4bFhdFYnFrEieD LoGmnE2YLhdV6swJ9VNCS6pLiEohT3fm7aXm15tZOIyzMZhHRSAPblXxQ0ZSWjq8oRrcYNFx c4W1URpAkBCOYJoXvQfD5L3lqAl8TCqDUzYxhH/tJhbDdHrqHH767jaDaTB1+Talp/2AMKwc XNOdiklGxbmHVG6YGl6g8Lrbsu9NZEI4yLlHzuikthJWgz+3vZhVGyNlt+HNIoF6CjDL2omu 5cEq4RDHM44QqPk6l7O0pUvN1mT4B+S1b08RKpqm/ff015E37HNV/piIvJlxGAYz8PSfuGCB 1thMYqlmgdhd9/BabGFbGGYHA6U4/T5zqU+f6xHy1SsAQZ1MSKlLwekBIT+4/cLRGqCHjnV0 q5H/T6a7t5mPkbzSrOLSo4puj+IToNjYyYIDBWzhlA19avOa+rvUjmHtD3sFN7cXWtkGoi8b uNcby4U= Organization: UCLA Computer Science Department Message-ID: <f71a3817-72fa-dd02-a4ad-52b6910609a1@HIDDEN> Date: Mon, 8 Oct 2018 17:46:57 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.0 MIME-Version: 1.0 In-Reply-To: <jwvftxg2d2n.fsf-monnier+INBOX@HIDDEN> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Content-Language: en-US X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 32993 Cc: 32993 <at> debbugs.gnu.org X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -3.3 (---) On 10/8/18 5:14 PM, Stefan Monnier wrote: > What other way is there? It'd require hacking the diff source code, alas. Also, understanding the algorithm. Although the best guy to do it would probably be the algorithm's co-inventor Eugene Myers, he's kind of busy these days <https://en.wikipedia.org/wiki/Eugene_Myers>.
bug-diffutils@HIDDEN
:bug#32993
; Package diffutils
.
Full text available.Received: (at 32993) by debbugs.gnu.org; 9 Oct 2018 00:15:12 +0000 From debbugs-submit-bounces <at> debbugs.gnu.org Mon Oct 08 20:15:12 2018 Received: from localhost ([127.0.0.1]:40986 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1g9fg7-0004kR-Ty for submit <at> debbugs.gnu.org; Mon, 08 Oct 2018 20:15:12 -0400 Received: from chene.dit.umontreal.ca ([132.204.246.20]:52211) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <monnier@HIDDEN>) id 1g9fg6-0004kJ-BD for 32993 <at> debbugs.gnu.org; Mon, 08 Oct 2018 20:15:11 -0400 Received: from milanesa.home (lechon.iro.umontreal.ca [132.204.27.242]) by chene.dit.umontreal.ca (8.14.7/8.14.1) with ESMTP id w990ESaV010572; Mon, 8 Oct 2018 20:14:28 -0400 Received: by milanesa.home (Postfix, from userid 20848) id 0549366100; Mon, 8 Oct 2018 20:14:28 -0400 (EDT) From: Stefan Monnier <monnier@HIDDEN> To: Paul Eggert <eggert@HIDDEN> Subject: Re: [bug-diffutils] bug#32993: Pathologically slow operation Message-ID: <jwvftxg2d2n.fsf-monnier+INBOX@HIDDEN> References: <jwv4ldwqh32.fsf-monnier+gmane.linux.debian.user@HIDDEN> <0904c3c5-90d3-dd53-680f-7867eb7a60c4@HIDDEN> Date: Mon, 08 Oct 2018 20:14:27 -0400 In-Reply-To: <0904c3c5-90d3-dd53-680f-7867eb7a60c4@HIDDEN> (Paul Eggert's message of "Mon, 8 Oct 2018 15:49:09 -0700") User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-NAI-Spam-Flag: NO X-NAI-Spam-Threshold: 5 X-NAI-Spam-Score: 0 X-NAI-Spam-Rules: 2 Rules triggered EDT_SA_DN_PASS=0, RV6390=0 X-NAI-Spam-Version: 2.3.0.9418 : core <6390> : inlines <6919> : streams <1800759> : uri <2726587> X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 32993 Cc: 32993 <at> debbugs.gnu.org X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -3.3 (---) >> tries to do word-level diffs (by basically turning every word >> into N copies of this word, each one on its own line (where N is the >> number of chars in the word, used to indicate to `diff` that long words >> are "more costly" than short ones) > Ouch. That's a really inefficient way of mucking with the cost > algorithm. What other way is there? > Diff should be able to do taht directly, without your having to > repeat the words. (Just a simple matter of programming....) "man diff" doesn't seem to offer much room for "programming". Stefan
bug-diffutils@HIDDEN
:bug#32993
; Package diffutils
.
Full text available.Received: (at 32993) by debbugs.gnu.org; 8 Oct 2018 22:49:19 +0000 From debbugs-submit-bounces <at> debbugs.gnu.org Mon Oct 08 18:49:19 2018 Received: from localhost ([127.0.0.1]:40898 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1g9eL1-0000ZM-7X for submit <at> debbugs.gnu.org; Mon, 08 Oct 2018 18:49:19 -0400 Received: from zimbra.cs.ucla.edu ([131.179.128.68]:52720) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <eggert@HIDDEN>) id 1g9eKy-0000Z6-MH for 32993 <at> debbugs.gnu.org; Mon, 08 Oct 2018 18:49:17 -0400 Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 41872161747; Mon, 8 Oct 2018 15:49:10 -0700 (PDT) Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id 9Xj84CwsF5-q; Mon, 8 Oct 2018 15:49:09 -0700 (PDT) Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 834BD161749; Mon, 8 Oct 2018 15:49:09 -0700 (PDT) X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id ny9WKHMZ7_TY; Mon, 8 Oct 2018 15:49:09 -0700 (PDT) Received: from Penguin.CS.UCLA.EDU (Penguin.CS.UCLA.EDU [131.179.64.200]) by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id 60920161747; Mon, 8 Oct 2018 15:49:09 -0700 (PDT) Subject: Re: [bug-diffutils] bug#32993: Pathologically slow operation To: Stefan Monnier <monnier@HIDDEN>, 32993 <at> debbugs.gnu.org References: <jwv4ldwqh32.fsf-monnier+gmane.linux.debian.user@HIDDEN> From: Paul Eggert <eggert@HIDDEN> Openpgp: preference=signencrypt Autocrypt: addr=eggert@HIDDEN; prefer-encrypt=mutual; keydata= xsFNBEyAcmQBEADAAyH2xoTu7ppG5D3a8FMZEon74dCvc4+q1XA2J2tBy2pwaTqfhpxxdGA9 Jj50UJ3PD4bSUEgN8tLZ0san47l5XTAFLi2456ciSl5m8sKaHlGdt9XmAAtmXqeZVIYX/UFS 96fDzf4xhEmm/y7LbYEPQdUdxu47xA5KhTYp5bltF3WYDz1Ygd7gx07Auwp7iw7eNvnoDTAl KAl8KYDZzbDNCQGEbpY3efZIvPdeI+FWQN4W+kghy+P6au6PrIIhYraeua7XDdb2LS1en3Ss mE3QjqfRqI/A2ue8JMwsvXe/WK38Ezs6x74iTaqI3AFH6ilAhDqpMnd/msSESNFt76DiO1ZK QMr9amVPknjfPmJISqdhgB1DlEdw34sROf6V8mZw0xfqT6PKE46LcFefzs0kbg4GORf8vjG2 Sf1tk5eU8MBiyN/bZ03bKNjNYMpODDQQwuP84kYLkX2wBxxMAhBxwbDVZudzxDZJ1C2VXujC OJVxq2kljBM9ETYuUGqd75AW2LXrLw6+MuIsHFAYAgRr7+KcwDgBAfwhPBYX34nSSiHlmLC+ KaHLeCLF5ZI2vKm3HEeCTtlOg7xZEONgwzL+fdKo+D6SoC8RRxJKs8a3sVfI4t6CnrQzvJbB n6gxdgCu5i29J1QCYrCYvql2UyFPAK+do99/1jOXT4m2836j1wARAQABzSBQYXVsIEVnZ2Vy dCA8ZWdnZXJ0QGNzLnVjbGEuZWR1PsLBfgQTAQIAKAUCTIByZAIbAwUJEswDAAYLCQgHAwIG FQgCCQoLBBYCAwECHgECF4AACgkQ7ZfpDmKqfjRRGw/+Ij03dhYfYl/gXVRiuzV1gGrbHk+t nfrI/C7fAeoFzQ5tVgVinShaPkZo0HTPf18x6IDEdAiO8Mqo1yp0CtHmzGMCJ50o4Grgfjlr 6g/+vtEOKbhleszN2XpJvpwM2QgGvn/laTLUu8PH9aRWTs7qJJZKKKAb4sxYc92FehPu6FOD 0dDiyhlDAq4lOV2mdBpzQbiojoZzQLMQwjpgCTK2572eK9EOEQySUThXrSIz6ASenp4NYTFH s9tuJQvXk9gZDdPSl3bp+47dGxlxEWLpBIM7zIONw4ks4azgT8nvDZxA5IZHtvqBlJLBObYY 0Le61Wp0y3TlBDh2qdK8eYL426W4scEMSuig5gb8OAtQiBW6k2sGUxxeiv8ovWu8YAZgKJfu oWI+uRnMEddruY8JsoM54KaKvZikkKs2bg1ndtLVzHpJ6qFZC7QVjeHUh6/BmgvdjWPZYFTt N+KA9CWX3GQKKgN3uu988yznD7LnB98T4EUH1HA/GnfBqMV1gpzTvPc4qVQinCmIkEFp83zl +G5fCjJJ3W7ivzCnYo4KhKLpFUm97okTKR2LW3xZzEW4cLSWO387MTK3CzDOx5qe6s4a91Zu ZM/j/TQdTLDaqNn83kA4Hq48UHXYxcIh+Nd8k/3w6lFuoK0wrOFiywjLx+0ur5jmmbecBGHc 1xdhAFHOwU0ETIByZAEQAKaF678T9wyH4wjTrV1Pz3cDEoSnV/0ZUrOT37p1dcGyj/IXq1x6 70HRVahAmk0sZpYc25PF9D5GPYHFWlNjuPU96rDndXB3hedmBRhLdC4bAXjI4DV+bmdVe+q/ IMnlZRaVlm9EiMCVAR6w13sReu7qXkW9r3RwY2AzXskp/tAe4BRKr1Zmbvi2nbnQ6epEC42r Rbx0B1EhjbIQZ5JHGk24iPT7LdBgnNmos5wYjzwNlkMQD5T0Ydzhk7J+UxwA5m46mOhRDC2r FV/A0gm5TLy8DXjv/Esc4gYnYai6SQqnUEVh5LuV8YCJBnijs+Tiw71x1icmn6xGI45EugJO gec+rLypYgpVp4x0HI5T88qBRYCkxH3Kg8Qo+EWNA9A4LRQ9DX8njona0gf0s03tocK8kBN6 6UoqqPtHBnc4eMgBymCflK12eKfd2YYxnyg9cZazWA5VslvTxpm76hbg5oiAEH/Vg/8MxHyA nPhfrgwyPrmJEcVBafdspJnYQxBYNco2LFPIhlOvWh8r4at+s+M3Lb26oUTczlgdW1Sf3SDA 77BMRnF0FQyE+7AzV79MBN4ykiqaezQxtaF1Fy/tvkhffSo8u+dwG0EgJh+te38gTcISVr0G IPplLz6YhjrbHrPRF1CN5UuL9DBGjxuN35RLNVEfta6RUFlR6NctTjvrABEBAAHCwWUEGAEC AA8FAkyAcmQCGwwFCRLMAwAACgkQ7ZfpDmKqfjSrHA/+KzAKvTxRhA9MWNLxIyJ7S5uJ16gs T3oCjZrBKGEhKMOGX4O0GA6VOEryO7QRCCYah3oxSG38IAnNeiwJXgU9Bzkk85UGbPEd7HGF /VSeHCQwWou6jqUDTSDvn9YhNTdG0KXPM74aC+xr2Zow1O2mhXihgWKD0Dw+0LYPnUOsQ0KO FxHXXYHmRrS1OZPU59BLvc+TRhIhafSHKLwbXK+6ckkxBx6h8z5ccpG0Qs4bFhdFYnFrEieD LoGmnE2YLhdV6swJ9VNCS6pLiEohT3fm7aXm15tZOIyzMZhHRSAPblXxQ0ZSWjq8oRrcYNFx c4W1URpAkBCOYJoXvQfD5L3lqAl8TCqDUzYxhH/tJhbDdHrqHH767jaDaTB1+Talp/2AMKwc XNOdiklGxbmHVG6YGl6g8Lrbsu9NZEI4yLlHzuikthJWgz+3vZhVGyNlt+HNIoF6CjDL2omu 5cEq4RDHM44QqPk6l7O0pUvN1mT4B+S1b08RKpqm/ff015E37HNV/piIvJlxGAYz8PSfuGCB 1thMYqlmgdhd9/BabGFbGGYHA6U4/T5zqU+f6xHy1SsAQZ1MSKlLwekBIT+4/cLRGqCHjnV0 q5H/T6a7t5mPkbzSrOLSo4puj+IToNjYyYIDBWzhlA19avOa+rvUjmHtD3sFN7cXWtkGoi8b uNcby4U= Organization: UCLA Computer Science Department Message-ID: <0904c3c5-90d3-dd53-680f-7867eb7a60c4@HIDDEN> Date: Mon, 8 Oct 2018 15:49:09 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.0 MIME-Version: 1.0 In-Reply-To: <jwv4ldwqh32.fsf-monnier+gmane.linux.debian.user@HIDDEN> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Content-Language: en-US X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 32993 X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -3.3 (---) On 10/8/18 2:34 PM, Stefan Monnier wrote: > Is there a chance this performance behavior is the > result of a performance bug, or is the algorithm really that costly? It's O(N*D), where N is the input length and D is the number of differences. In your case that might be about 10*11, which is getting up there. > tries to do word-level diffs (by basically turning every word > into N copies of this word, each one on its own line (where N is the > number of chars in the word, used to indicate to `diff` that long words > are "more costly" than short ones) Ouch. That's a really inefficient way of mucking with the cost algorithm. Diff should be able to do taht directly, without your having to repeat the words. (Just a simple matter of programming....)
bug-diffutils@HIDDEN
:bug#32993
; Package diffutils
.
Full text available.Received: (at submit) by debbugs.gnu.org; 8 Oct 2018 21:34:42 +0000 From debbugs-submit-bounces <at> debbugs.gnu.org Mon Oct 08 17:34:42 2018 Received: from localhost ([127.0.0.1]:40890 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>) id 1g9dAo-0007I2-EK for submit <at> debbugs.gnu.org; Mon, 08 Oct 2018 17:34:42 -0400 Received: from eggs.gnu.org ([208.118.235.92]:50950) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from <monnier@HIDDEN>) id 1g9dAm-0007Hp-G3 for submit <at> debbugs.gnu.org; Mon, 08 Oct 2018 17:34:41 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from <monnier@HIDDEN>) id 1g9dAd-0006Hz-VL for submit <at> debbugs.gnu.org; Mon, 08 Oct 2018 17:34:35 -0400 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-0.0 required=5.0 tests=BAYES_20 autolearn=disabled version=3.3.2 Received: from lists.gnu.org ([2001:4830:134:3::11]:46403) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from <monnier@HIDDEN>) id 1g9dAd-0006Hs-RB for submit <at> debbugs.gnu.org; Mon, 08 Oct 2018 17:34:31 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:60527) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from <monnier@HIDDEN>) id 1g9dAc-0001VX-T3 for bug-diffutils@HIDDEN; Mon, 08 Oct 2018 17:34:31 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from <monnier@HIDDEN>) id 1g9dAV-00068P-3y for bug-diffutils@HIDDEN; Mon, 08 Oct 2018 17:34:27 -0400 Received: from chene.dit.umontreal.ca ([132.204.246.20]:59186) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from <monnier@HIDDEN>) id 1g9dAU-000674-DK for bug-diffutils@HIDDEN; Mon, 08 Oct 2018 17:34:23 -0400 Received: from pastel.home (lechon.iro.umontreal.ca [132.204.27.242]) by chene.dit.umontreal.ca (8.14.7/8.14.1) with ESMTP id w98LYI7f021434; Mon, 8 Oct 2018 17:34:19 -0400 Received: by pastel.home (Postfix, from userid 20848) id A20D56A41C; Mon, 8 Oct 2018 17:34:18 -0400 (EDT) From: Stefan Monnier <monnier@HIDDEN> To: bug-diffutils@HIDDEN Subject: Pathologically slow operation Message-ID: <jwv4ldwqh32.fsf-monnier+gmane.linux.debian.user@HIDDEN> Date: Mon, 08 Oct 2018 17:34:18 -0400 User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-NAI-Spam-Flag: NO X-NAI-Spam-Threshold: 5 X-NAI-Spam-Score: 0 X-NAI-Spam-Rules: 2 Rules triggered EDT_SA_DN_PASS=0, RV6390=0 X-NAI-Spam-Version: 2.3.0.9418 : core <6390> : inlines <6919> : streams <1800748> : uri <2726508> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x X-Received-From: 2001:4830:134:3::11 X-Spam-Score: -4.0 (----) X-Debbugs-Envelope-To: submit X-BeenThere: debbugs-submit <at> debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: <debbugs-submit.debbugs.gnu.org> List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe> List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/> List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org> List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help> List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe> Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org> X-Spam-Score: -5.0 (-----) I recently bumped into a `diff` operation that I killed after several minutes while diffing two files (on 3.7GHz core i3, which is the fastest machine I have). These files were generated as part of Emacs's "refine-hunk" processing which tries to do word-level diffs (by basically turning every word into N copies of this word, each one on its own line (where N is the number of chars in the word, used to indicate to `diff` that long words are "more costly" than short ones)). So the files's sizes were: % wc tmp/diff-bug-* 1038026 851160 4963190 tmp/diff-bug-1 65041 54877 314788 tmp/diff-bug-2 1103067 906037 5277978 total % With --speed-large-files, diff still took almost a minute to return an answer (which is 973026 lines long). Those file aren't exactly security sensitive, but they contain personal info that I'd rather not make public (I can make send them in private upon request, tho). Is there a chance this performance behavior is the result of a performance bug, or is the algorithm really that costly? Stefan
Stefan Monnier <monnier@HIDDEN>
:bug-diffutils@HIDDEN
.
Full text available.bug-diffutils@HIDDEN
:bug#32993
; Package diffutils
.
Full text available.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997 nCipher Corporation Ltd,
1994-97 Ian Jackson.