GNU bug report logs - #25146
grep unusable on mingw - SAME_INODE woes

Please note: This is a static page, with minimal formatting, updated once a day.
Click here to see this page with the latest information and nicer formatting.

Package: grep; Severity: wishlist; Reported by: Bruno Haible <bruno@HIDDEN>; dated Fri, 9 Dec 2016 15:32:02 UTC; Maintainer for grep is bug-grep@HIDDEN.
Severity set to 'wishlist' from 'normal'. Request was from Paul Eggert <eggert@HIDDEN> to control <at> debbugs.gnu.org.

Message received at 25146 <at> debbugs.gnu.org:


From: Bruno Haible <bruno@HIDDEN>
To: Paul Eggert <eggert@HIDDEN>
Subject: Re: bug#25146: grep unusable on mingw - SAME_INODE woes
Date: Fri, 09 Dec 2016 18:42:20 +0100
Message-ID: <5033004.eAgOMkh6zq@HIDDEN>
User-Agent: KMail/4.8.5 (Linux/3.8.0-44-generic; KDE/4.8.5; x86_64; ; )
In-Reply-To: <a5fb9bfd-cb7e-7cd8-8dd2-2d1d863f4e94@HIDDEN>
References: <CA+8g5KG1_hHOorU9MuCHynST0CazMpfO97rLLs6pbJQiD+rdkA@HIDDEN>
 <1842652.iKEY8aY0XY@HIDDEN>
 <a5fb9bfd-cb7e-7cd8-8dd2-2d1d863f4e94@HIDDEN>
MIME-Version: 1.0
Content-Transfer-Encoding: 7Bit
Content-Type: text/plain; charset="us-ascii"
Cc: 25146 <at> debbugs.gnu.org

> I installed the attached patch into Gnulib. This 
> isn't perfect (it means MinGW grep won't detect that the input and 
> output are the same file), but it should be good enough to fix the 
> glaring bugs and to conform to POSIX.

Thanks, Paul. Yes, it surely fixes the immediate issue. I agree.

Bruno





Information forwarded to bug-grep@HIDDEN:
bug#25146; Package grep. Full text available.

Message received at 25146 <at> debbugs.gnu.org:


Subject: Re: bug#25146: grep unusable on mingw - SAME_INODE woes
To: Bruno Haible <bruno@HIDDEN>, 25146 <at> debbugs.gnu.org
References: <CA+8g5KG1_hHOorU9MuCHynST0CazMpfO97rLLs6pbJQiD+rdkA@HIDDEN>
 <1842652.iKEY8aY0XY@HIDDEN>
From: Paul Eggert <eggert@HIDDEN>
Organization: UCLA Computer Science Department
Message-ID: <a5fb9bfd-cb7e-7cd8-8dd2-2d1d863f4e94@HIDDEN>
Date: Fri, 9 Dec 2016 08:34:32 -0800
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
 Thunderbird/45.5.1
MIME-Version: 1.0
In-Reply-To: <1842652.iKEY8aY0XY@HIDDEN>
Content-Type: multipart/mixed; boundary="------------9DFCB630D3BDCFD20D5A716D"

This is a multi-part message in MIME format.
--------------9DFCB630D3BDCFD20D5A716D
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit

On 12/09/2016 07:30 AM, Bruno Haible wrote:
>
> How can we go forward from here? I would propose a gnulib module that defines
> a data structure that combines a 'struct stat' with the FILE_ID_INFO for native
> Windows, and rebase the 'same-inode' module on it.
>
> The other approach, to override mingw's 'struct stat' and stat/fstat/lstat()
> functions, would imply a performance hit to all stat calls, even those that
> don't want to access the st_ino field.

For grep's purposes a simple workaround is to have SAME_INODE always
return 0 on MinGW, so I installed the attached patch into Gnulib. This 
isn't perfect (it means MinGW grep won't detect that the input and 
output are the same file), but it should be good enough to fix the 
glaring bugs and to conform to POSIX.
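
As an illustration of the workaround just described, here is a minimal
sketch; it is not the exact Gnulib header. The macro is hard-wired to 0
on native Windows and compares st_dev and st_ino elsewhere. The attached
patch notes that callers can use !a.st_ino to deduce that the
information is unknown; the enum and the helper classify_files are
hypothetical names added here only to show that three-way outcome.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Sketch only: mirrors the idea of the workaround, not the exact
   Gnulib source.  */
#if (defined _WIN32 || defined __WIN32__) && ! defined __CYGWIN__
/* struct stat carries no usable file identity here, so never claim
   that two files are the same.  */
# define SAME_INODE(a, b) 0
#else
# define SAME_INODE(a, b) \
    ((a).st_ino == (b).st_ino && (a).st_dev == (b).st_dev)
#endif

/* Hypothetical helper, not part of Gnulib.  */
enum file_identity { FILES_SAME, FILES_DIFFERENT, FILES_UNKNOWN };

static enum file_identity
classify_files (struct stat const *a, struct stat const *b)
{
  if (SAME_INODE (*a, *b))
    return FILES_SAME;
  /* A zero st_ino means the platform gave us nothing to compare.  */
  return a->st_ino ? FILES_DIFFERENT : FILES_UNKNOWN;
}
```

With this shape, a caller such as grep that only acts on FILES_SAME
simply never triggers its "input file is also the output" check on
MinGW.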

Although it might be helpful to have a fancier module that does the work 
of SAME_INODE but does it more accurately on MinGW, I'm not sure it's 
worth the hassle. A lot of code assumes that 'struct stat' suffices to 
identify files, and it would be a pain to clutter it with another struct 
of our own design that contains a 'struct stat' as a component. Even if 
we had another module like that, we'd need to keep SAME_INODE for the 
benefit of programs that cannot easily adopt the new struct.

It seems more plausible to override MinGW's struct stat and stat/etc. 
functions. To my mind it's OK to take a performance hit in the interest 
of portability. The performance hit would occur only on programs that 
need to deduce the equivalent of SAME_INODE.

--------------9DFCB630D3BDCFD20D5A716D
Content-Type: text/plain; charset=UTF-8;
 name="0001-same-inode-port-to-MinGW.txt"
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
 filename="0001-same-inode-port-to-MinGW.txt"

RnJvbSA3ODU1YTJlM2FlNGFhY2FhMDZhNzVhNWUyOTkzMGFjNTVlZDBlOWFlIE1vbiBTZXAg
MTcgMDA6MDA6MDAgMjAwMQpGcm9tOiBQYXVsIEVnZ2VydCA8ZWdnZXJ0QGNzLnVjbGEuZWR1
PgpEYXRlOiBGcmksIDkgRGVjIDIwMTYgMDg6MTY6MTMgLTA4MDAKU3ViamVjdDogW1BBVENI
XSBzYW1lLWlub2RlOiBwb3J0IHRvIE1pbkdXCk1JTUUtVmVyc2lvbjogMS4wCkNvbnRlbnQt
VHlwZTogdGV4dC9wbGFpbjsgY2hhcnNldD1VVEYtOApDb250ZW50LVRyYW5zZmVyLUVuY29k
aW5nOiA4Yml0CgpIZXJlIHN0X2lubyBpcyBhbHdheXMgMCwgc28gY2hhbmdlIHRoZSBkZWZp
bml0aW9uIG9mIFNBTUVfSU5PREUgc28KdGhhdCAxIG1lYW5zIHRoZSB0d28gZmlsZXMgYXJl
IHRoZSBzYW1lLCAwIHdpdGggc3RfaW5vICE9IDAgbWVhbnMKdGhleSBkaWZmZXIsIGFuZCAw
IHdpdGggc3RfaW5vID09IDAgbWVhbnMgd2UgZG9u4oCZdCBrbm93LiAgUHJvYmxlbQpyZXBv
cnRlZCBieSBCcnVubyBIYWlibGUgKEJ1ZyMyNTE0NikuCiogZG9jL3Bvc2l4LWhlYWRlcnMv
c3lzX3N0YXQudGV4aSAoc3lzL3N0YXQuaCk6IFVwZGF0ZS4KKiBsaWIvc2FtZS1pbm9kZS5o
IChTQU1FX0lOT0RFKTogUmV0dXJuIDAgb24gTWluR1cuCi0tLQogQ2hhbmdlTG9nICAgICAg
ICAgICAgICAgICAgICAgICB8IDEwICsrKysrKysrKysKIGRvYy9wb3NpeC1oZWFkZXJzL3N5
c19zdGF0LnRleGkgfCAgNCArKy0tCiBsaWIvc2FtZS1pbm9kZS5oICAgICAgICAgICAgICAg
IHwgIDYgKysrKystCiAzIGZpbGVzIGNoYW5nZWQsIDE3IGluc2VydGlvbnMoKyksIDMgZGVs
ZXRpb25zKC0pCgpkaWZmIC0tZ2l0IGEvQ2hhbmdlTG9nIGIvQ2hhbmdlTG9nCmluZGV4IDgx
MDNlYmQuLmZkM2U5ZDggMTAwNjQ0Ci0tLSBhL0NoYW5nZUxvZworKysgYi9DaGFuZ2VMb2cK
QEAgLTEsMyArMSwxMyBAQAorMjAxNi0xMi0wOSAgUGF1bCBFZ2dlcnQgIDxlZ2dlcnRAY3Mu
dWNsYS5lZHU+CisKKwlzYW1lLWlub2RlOiBwb3J0IHRvIE1pbkdXCisJSGVyZSBzdF9pbm8g
aXMgYWx3YXlzIDAsIHNvIGNoYW5nZSB0aGUgZGVmaW5pdGlvbiBvZiBTQU1FX0lOT0RFIHNv
CisJdGhhdCAxIG1lYW5zIHRoZSB0d28gZmlsZXMgYXJlIHRoZSBzYW1lLCAwIHdpdGggc3Rf
aW5vICE9IDAgbWVhbnMKKwl0aGV5IGRpZmZlciwgYW5kIDAgd2l0aCBzdF9pbm8gPT0gMCBt
ZWFucyB3ZSBkb27igJl0IGtub3cuICBQcm9ibGVtCisJcmVwb3J0ZWQgYnkgQnJ1bm8gSGFp
YmxlIChCdWcjMjUxNDYpLgorCSogZG9jL3Bvc2l4LWhlYWRlcnMvc3lzX3N0YXQudGV4aSAo
c3lzL3N0YXQuaCk6IFVwZGF0ZS4KKwkqIGxpYi9zYW1lLWlub2RlLmggKFNBTUVfSU5PREUp
OiBSZXR1cm4gMCBvbiBNaW5HVy4KKwogMjAxNi0xMi0wNCAgQnJ1bm8gSGFpYmxlICA8YnJ1
bm9AY2xpc3Aub3JnPgogCiAJamF2YWNvbXA6IFN1cHBvcnQgSmF2YSA3IGFuZCA4LgpkaWZm
IC0tZ2l0IGEvZG9jL3Bvc2l4LWhlYWRlcnMvc3lzX3N0YXQudGV4aSBiL2RvYy9wb3NpeC1o
ZWFkZXJzL3N5c19zdGF0LnRleGkKaW5kZXggYmQ2NDRmNi4uNGMxNzZhYSAxMDA2NDQKLS0t
IGEvZG9jL3Bvc2l4LWhlYWRlcnMvc3lzX3N0YXQudGV4aQorKysgYi9kb2MvcG9zaXgtaGVh
ZGVycy9zeXNfc3RhdC50ZXhpCkBAIC00NCw4ICs0NCw4IEBAIG5vdCBhIHNpbmdsZSB2YWx1
ZS4KIEBpdGVtCiBUbyBwYXJ0aWFsbHkgd29yayBhcm91bmQgdGhlIHByZXZpb3VzIHR3byBw
cm9ibGVtcywgeW91IGNhbiB0ZXN0IGZvcgogbm9uemVybyBAY29kZXtzdF9pbm99IGFuZCB1
c2UgdGhlIEdudWxpYiBAY29kZXtzYW1lLWlub2RlfSBtb2R1bGUgdG8KLWNvbXBhcmUgbm9u
emVybyB2YWx1ZXMuICBGb3IgZXhhbXBsZSwgQGNvZGV7KGEuc3RfaW5vICYmIFNBTUVfSU5P
REUKLShhLCBiKSl9IGlzIHRydWUgaWYgdGhlIEBjb2Rle3N0cnVjdCBzdGF0fSB2YWx1ZXMg
QGNvZGV7YX0gYW5kCitjb21wYXJlIG5vbnplcm8gdmFsdWVzLiAgRm9yIGV4YW1wbGUsIEBj
b2Rle1NBTUVfSU5PREUgKGEsIGIpfQoraXMgdHJ1ZSBpZiB0aGUgQGNvZGV7c3RydWN0IHN0
YXR9IHZhbHVlcyBAY29kZXthfSBhbmQKIEBjb2Rle2J9IGFyZSBrbm93biB0byByZXByZXNl
bnQgdGhlIHNhbWUgZmlsZSwgQGNvZGV7KGEuc3RfaW5vICYmCiAhU0FNRV9JTk9ERSAoYSwg
YikpfSBpcyB0cnVlIGlmIHRoZXkgYXJlIGtub3duIHRvIHJlcHJlc2VudCBkaWZmZXJlbnQK
IGZpbGVzLCBhbmQgQGNvZGV7IWEuc3RfaW5vfSBpcyB0cnVlIGlmIGl0IGlzIG5vdCBrbm93
biB3aGV0aGVyIHRoZXkKZGlmZiAtLWdpdCBhL2xpYi9zYW1lLWlub2RlLmggYi9saWIvc2Ft
ZS1pbm9kZS5oCmluZGV4IGJmNDU2MzUuLmM3YThmYjUgMTAwNjQ0Ci0tLSBhL2xpYi9zYW1l
LWlub2RlLmgKKysrIGIvbGliL3NhbWUtaW5vZGUuaApAQCAtMSw0ICsxLDQgQEAKLS8qIERl
dGVybWluZSB3aGV0aGVyIHR3byBzdGF0IGJ1ZmZlcnMgcmVmZXIgdG8gdGhlIHNhbWUgZmls
ZS4KKy8qIERldGVybWluZSB3aGV0aGVyIHR3byBzdGF0IGJ1ZmZlcnMgYXJlIGtub3duIHRv
IHJlZmVyIHRvIHRoZSBzYW1lIGZpbGUuCiAKICAgIENvcHlyaWdodCAoQykgMjAwNiwgMjAw
OS0yMDE2IEZyZWUgU29mdHdhcmUgRm91bmRhdGlvbiwgSW5jLgogCkBAIC0yNCw2ICsyNCwx
MCBAQAogICAgICAmJiAoYSkuc3RfaW5vWzFdID09IChiKS5zdF9pbm9bMV0gXAogICAgICAm
JiAoYSkuc3RfaW5vWzJdID09IChiKS5zdF9pbm9bMl0gXAogICAgICAmJiAoYSkuc3RfZGV2
ID09IChiKS5zdF9kZXYpCisjIGVsaWYgKGRlZmluZWQgX1dJTjMyIHx8IGRlZmluZWQgX19X
SU4zMl9fKSAmJiAhIGRlZmluZWQgX19DWUdXSU5fXworLyogT24gTWluR1csIHN0cnVjdCBz
dGF0IGxhY2tzIG5lY2Vzc2FyeSBpbmZvLCBzbyBhbHdheXMgcmV0dXJuIDAuCisgICBDYWxs
ZXJzIGNhbiB1c2UgIWEuc3RfaW5vIHRvIGRlZHVjZSB0aGF0IHRoZSBpbmZvcm1hdGlvbiBp
cyB1bmtub3duLiAgKi8KKyMgIGRlZmluZSBTQU1FX0lOT0RFKGEsIGIpIDAKICMgZWxzZQog
IyAgZGVmaW5lIFNBTUVfSU5PREUoYSwgYikgICAgXAogICAgICgoYSkuc3RfaW5vID09IChi
KS5zdF9pbm8gXAotLSAKMi43LjQKCg==
--------------9DFCB630D3BDCFD20D5A716D--




Information forwarded to bug-grep@HIDDEN:
bug#25146; Package grep. Full text available.

Message received at submit <at> debbugs.gnu.org:


From: Bruno Haible <bruno@HIDDEN>
To: bug-grep@HIDDEN
Subject: grep unusable on mingw - SAME_INODE woes
Date: Fri, 09 Dec 2016 16:30:45 +0100
Message-ID: <1842652.iKEY8aY0XY@HIDDEN>
User-Agent: KMail/4.8.5 (Linux/3.8.0-44-generic; KDE/4.8.5; x86_64; ; )
In-Reply-To: <CA+8g5KG1_hHOorU9MuCHynST0CazMpfO97rLLs6pbJQiD+rdkA@HIDDEN>
References: <CA+8g5KG1_hHOorU9MuCHynST0CazMpfO97rLLs6pbJQiD+rdkA@HIDDEN>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="nextPart2833148.zOgL9ZBi2u"
Content-Transfer-Encoding: 7Bit


--nextPart2833148.zOgL9ZBi2u
Content-Transfer-Encoding: 7Bit
Content-Type: text/plain; charset="us-ascii"

> grep snapshot:
>   http://meyering.net/grep/grep-ss.tar.xz      1.4 MB
>   http://meyering.net/grep/grep-ss.tar.xz.sig
>   http://meyering.net/grep/grep-2.26.39-ae3f.tar.xz

This release, built for mingw, is hardly usable:
  - 33 out of 107 tests fail,
  - A simple "grep.exe o xx > yy" fails with error
    grep.exe: input file 'xx' is also the output

More details:
- This happens both in a Cygwin mintty.exe window and in a cmd.exe window.
- It's the same for 32-bit mingw builds and 64-bit mingw builds
  (recipe: http://git.savannah.gnu.org/gitweb/?p=gperf.git;a=blob_plain;f=README.windows;hb=HEAD )
- The error is signalled in grep.c:1874.
  At this point, 'st' (of type 'struct _stat64') contains
    { st_dev = 0, st_ino = 0,
      st_mode = 0x81B6 = _S_IFREG | _S_IREAD | _S_IWRITE | 0x36,
      st_nlink = 1,
      st_uid = 0, st_gid = 0, st_rdev = 0, st_size = 4,
      st_atime = 1481099615, st_mtime = 1481099615, st_ctime = 1481099615 }
  Obviously, such a struct cannot reliably distinguish two different regular files.
  In other words, SAME_INODE cannot work.
- So, how do you determine identity of files in Windows?
  http://stackoverflow.com/questions/562701/best-way-to-determine-if-two-path-reference-to-same-file-in-windows
  But even this is wrong: a BY_HANDLE_FILE_INFORMATION is not
  sufficient, because it contains only a 64-bit identifier for each
  file. See https://msdn.microsoft.com/en-us/library/windows/desktop/aa363788(v=vs.85).aspx
  The best approach is to use GetFileInformationByHandleEx to produce a
  FILE_ID_INFO, which carries a 128-bit file ID.
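
To make the failure mode above concrete, here is a small sketch (not
grep's code). NAIVE_SAME_INODE is an assumed shape for a plain
st_dev/st_ino comparison, written out by hand, and fake_mingw_stat is a
hypothetical helper that fills a 'struct stat' with the values from the
dump above. Two unrelated files then compare as identical.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Assumed shape of a plain inode comparison; written out here for
   illustration rather than copied from Gnulib.  */
#define NAIVE_SAME_INODE(a, b) \
  ((a).st_ino == (b).st_ino && (a).st_dev == (b).st_dev)

/* Fill ST like the dump in the report: st_dev and st_ino are 0, only
   mode, link count and size carry information.  */
static void
fake_mingw_stat (struct stat *st, long size)
{
  memset (st, 0, sizeof *st);
  st->st_mode = 0x81B6;   /* regular file, as in the dump */
  st->st_nlink = 1;
  st->st_size = size;
}

/* Return nonzero if two unrelated files would be reported as the
   same file by the naive comparison.  */
static int
naive_check_confuses_files (void)
{
  struct stat in, out;
  fake_mingw_stat (&in, 4);    /* stands in for 'xx' */
  fake_mingw_stat (&out, 0);   /* stands in for a freshly created 'yy' */
  return NAIVE_SAME_INODE (in, out);
}
```

Because both identity fields are zero on each side, the comparison
succeeds even though the sizes differ, which matches the spurious
"input file 'xx' is also the output" error.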

Find attached a proof-of-concept patch. (Really rough - needs
-D_WIN32_WINNT=_WIN32_WINNT_WIN8, and lacks good error handling.)

With it, I get:
$ ./grep.exe o xx > yy
$ ./grep.exe o xx > xx
grep.exe: input file 'xx' is also the output

That is, now the detection of identical regular files works.

How can we go forward from here? I would propose a gnulib module that defines
a data structure that combines a 'struct stat' with the FILE_ID_INFO for native
Windows, and rebase the 'same-inode' module on it.
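
A rough sketch of what such a combined data structure could look like
follows. Nothing here is an existing gnulib API: struct file_ident,
file_ident_from_fd and file_ident_same are hypothetical names. On
native Windows the sketch assumes FILE_ID_INFO is available (the
proof-of-concept above needs -D_WIN32_WINNT=_WIN32_WINNT_WIN8 for
that); elsewhere it falls back to comparing st_dev and st_ino.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>

#if (defined _WIN32 || defined __WIN32__) && ! defined __CYGWIN__
# define WIN32_LEAN_AND_MEAN
# include <io.h>
# include <windows.h>
# define FILE_IDENT_NATIVE_WINDOWS 1
#endif

/* Hypothetical type: a 'struct stat' plus, on native Windows, the
   volume serial number and 128-bit file ID.  */
struct file_ident
{
  struct stat st;
#ifdef FILE_IDENT_NATIVE_WINDOWS
  FILE_ID_INFO id;
  int id_valid;          /* nonzero if 'id' was filled in */
#endif
};

/* Fill *FI for the open descriptor FD.
   Return 0 on success, -1 if fstat fails.  */
static int
file_ident_from_fd (int fd, struct file_ident *fi)
{
  memset (fi, 0, sizeof *fi);
  if (fstat (fd, &fi->st) != 0)
    return -1;
#ifdef FILE_IDENT_NATIVE_WINDOWS
  {
    HANDLE h = (HANDLE) _get_osfhandle (fd);
    fi->id_valid = (h != INVALID_HANDLE_VALUE
                    && GetFileInformationByHandleEx (h, FileIdInfo,
                                                     &fi->id,
                                                     sizeof fi->id));
  }
#endif
  return 0;
}

/* Return 1 if A and B are known to be the same file, else 0.  */
static int
file_ident_same (struct file_ident const *a, struct file_ident const *b)
{
#ifdef FILE_IDENT_NATIVE_WINDOWS
  return (a->id_valid && b->id_valid
          && a->id.VolumeSerialNumber == b->id.VolumeSerialNumber
          && memcmp (&a->id.FileId, &b->id.FileId,
                     sizeof a->id.FileId) == 0);
#else
  return a->st.st_ino == b->st.st_ino && a->st.st_dev == b->st.st_dev;
#endif
}
```

The id_valid flag keeps a failed GetFileInformationByHandleEx call from
being mistaken for a match, so an unknown identity is reported as "not
known to be the same".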

The other approach, to override mingw's 'struct stat' and stat/fstat/lstat()
functions, would imply a performance hit to all stat calls, even those that
don't want to access the st_ino field.

Bruno


--nextPart2833148.zOgL9ZBi2u
Content-Disposition: attachment; filename="grep-same-inode-fix.diff"
Content-Transfer-Encoding: 7Bit
Content-Type: text/x-patch; charset="UTF-8"; name="grep-same-inode-fix.diff"

--- grep.c.orig	2016-11-21 18:31:31.000000000 +0100
+++ grep.c	2016-12-09 16:12:51.294888100 +0100
@@ -27,6 +27,11 @@
 #include <stdarg.h>
 #include <stdio.h>
 #include "system.h"
+#if (defined _WIN32 || defined __WIN32__) && ! defined __CYGWIN__
+# define WIN32_LEAN_AND_MEAN /* avoid conflict due to DATADIR */
+# include <io.h>
+# include <windows.h>
+#endif
 
 #include "argmatch.h"
 #include "c-ctype.h"
@@ -62,6 +67,9 @@
    information here, so that we can automatically skip it, thus
    avoiding a potential (racy) infinite loop.  */
 static struct stat out_stat;
+#if (defined _WIN32 || defined __WIN32__) && ! defined __CYGWIN__
+static FILE_ID_INFO out_id;
+#endif
 
 /* if non-zero, display usage information and exit */
 static int show_help;
@@ -1868,13 +1876,26 @@
      input==output, while there is no risk of infloop, there is a race
      condition that could result in "alternate" output.  */
   if (!out_quiet && list_files == LISTFILES_NONE && 1 < max_count
-      && S_ISREG (st.st_mode) && SAME_INODE (st, out_stat))
+      && S_ISREG (st.st_mode))
     {
-      if (! suppress_errors)
-        error (0, 0, _("input file %s is also the output"),
-               quote (input_filename ()));
-      errseen = true;
-      goto closeout;
+#if (defined _WIN32 || defined __WIN32__) && ! defined __CYGWIN__
+      FILE_ID_INFO desc_id;
+      if (!GetFileInformationByHandleEx (_get_osfhandle (desc), FileIdInfo, &desc_id, sizeof (desc_id)))
+        {
+          fprintf (stderr, "GetFileInformationByHandleEx failed -> %d\n", GetLastError ());
+        }
+      if (desc_id.VolumeSerialNumber == out_id.VolumeSerialNumber
+          && memcmp (&desc_id.FileId, &out_id.FileId, sizeof (FILE_ID_128)) == 0)
+#else
+      if (SAME_INODE (st, out_stat))
+#endif
+        {
+          if (! suppress_errors)
+            error (0, 0, _("input file %s is also the output"),
+                   quote (input_filename ()));
+          errseen = true;
+          goto closeout;
+        }
     }
 
   /* Set input to binary mode.  Pipes are simulated with files
@@ -2763,7 +2784,15 @@
   if (! exit_on_match && fstat (STDOUT_FILENO, &tmp_stat) == 0)
     {
       if (S_ISREG (tmp_stat.st_mode))
-        out_stat = tmp_stat;
+        {
+          out_stat = tmp_stat;
+#if (defined _WIN32 || defined __WIN32__) && ! defined __CYGWIN__
+          if (!GetFileInformationByHandleEx (_get_osfhandle (STDOUT_FILENO), FileIdInfo, &out_id, sizeof (out_id)))
+            {
+              fprintf (stderr, "GetFileInformationByHandleEx failed -> %d\n", GetLastError ());
+            }
+#endif
+        }
       else if (S_ISCHR (tmp_stat.st_mode))
         {
           struct stat null_stat;

--nextPart2833148.zOgL9ZBi2u--





Acknowledgement sent to Bruno Haible <bruno@HIDDEN>:
New bug report received and forwarded. Copy sent to bug-grep@HIDDEN. Full text available.
Report forwarded to bug-grep@HIDDEN:
bug#25146; Package grep. Full text available.
Last modified: Mon, 25 Nov 2019 12:00:02 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.