Message-ID: <1489168903.4876.2.camel@HIDDEN>
Subject: Feature suggestion: Identifying identical files
From: Victor Porton <porton@HIDDEN>
To: bug-diffutils@HIDDEN
Date: Fri, 10 Mar 2017 20:01:43 +0200

Suppose we have N files (or directories). I want to partition this N-element set by the equivalence relation "is identical to" and display the partition to the user.

Note that "identical" may sometimes be relaxed; for example, the amount of whitespace may not matter.

This utility would be useful when refactoring (restructuring) code in which several functions or modules are being merged, to find which code is identical and which is not.

Concerning implementation:

For strict equality (not relaxed to ignore whitespace or the like), it can be implemented efficiently using hashes.

For relaxed equality, `diff` needs to be invoked.

Sometimes, though, relaxed equality can also be done with hashes: for example, replace every sequence of spaces with a single space before calculating the hash.
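The hash-based approach described above could be sketched roughly as follows. This is only an illustration in Python, not anything that exists in diffutils: the names `content_key` and `partition_identical` and the `ignore_whitespace` parameter are hypothetical, SHA-256 is an arbitrary choice of hash, and the sketch handles regular files only (not directories).

```python
import hashlib
import re
from collections import defaultdict
from pathlib import Path


def content_key(path, ignore_whitespace=False):
    """Return a SHA-256 hex digest of the file's (optionally normalized) bytes."""
    data = Path(path).read_bytes()
    if ignore_whitespace:
        # Relaxed equality: collapse every run of spaces into a single space
        # before hashing, as suggested in the message.
        data = re.sub(rb" +", b" ", data)
    return hashlib.sha256(data).hexdigest()


def partition_identical(paths, ignore_whitespace=False):
    """Group paths into classes whose contents hash to the same digest."""
    groups = defaultdict(list)
    for p in paths:
        groups[content_key(p, ignore_whitespace)].append(str(p))
    # Sort inside and across groups so the output order is deterministic.
    return sorted(sorted(group) for group in groups.values())
```

Calling `partition_identical(["a.c", "b.c", "c.c"])` (hypothetical file names) would return a list of groups, each group holding the paths whose contents are byte-for-byte equal as far as the hash can tell. Equal digests make identical content overwhelmingly likely but not certain, so a cautious tool might confirm each group with a direct byte comparison. Forms of relaxed equality that cannot be reduced to a normalization step would still need `diff`, as noted above.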
bug#26052; Package diffutils. Submitted by Victor Porton <porton@HIDDEN> to bug-diffutils@HIDDEN.