GNU bug report logs - #16871
problems about matching newline (with -z)

Please note: This is a static page, with minimal formatting, updated once a day.

Package: grep; Severity: wishlist; Reported by: Stephane Chazelas <stephane.chazelas@HIDDEN>; dated Tue, 25 Feb 2014 07:33:01 UTC; Maintainer for grep is bug-grep@HIDDEN.

Message received at 16871 <at> debbugs.gnu.org:


Date: Fri, 18 Nov 2016 17:40:44 +0000
From: Stephane Chazelas <stephane.chazelas@HIDDEN>
To: 16871 <at> debbugs.gnu.org
Subject: doc/test confusions with grep -P
Message-ID: <20161118174044.GD10084@HIDDEN>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.21 (2010-09-15)

For the record, the doc/test confusion was fixed by commit
b73296ace186451b096b075461634c153d1fa525
http://git.savannah.gnu.org/cgit/grep.git/commit/?id=b73296ace186451b096b075461634c153d1fa525

See also https://debbugs.gnu.org/cgi/bugreport.cgi?bug=22655#47
and below about PCRE_MULTILINE.




Information forwarded to bug-grep@HIDDEN:
bug#16871; Package grep. Full text available.
Severity set to 'wishlist' from 'normal'. Request was from Paul Eggert <eggert@HIDDEN> to control <at> debbugs.gnu.org. Full text available.

Message received at 16871 <at> debbugs.gnu.org:


Message-ID: <5359E43A.2070709@HIDDEN>
Date: Thu, 24 Apr 2014 21:27:38 -0700
From: Paul Eggert <eggert@HIDDEN>
Organization: UCLA Computer Science Department
User-Agent: Mozilla/5.0 (X11; Linux x86_64;
 rv:24.0) Gecko/20100101 Thunderbird/24.4.0
MIME-Version: 1.0
To: Stephane Chazelas <stephane.chazelas@HIDDEN>, 
 16871 <at> debbugs.gnu.org
Subject: Re: bug#16871: problems about matching newline (with -z)
References: <20140225073218.GA18853@HIDDEN>
In-Reply-To: <20140225073218.GA18853@HIDDEN>
Content-Type: multipart/mixed; boundary="------------010104050602070501040309"

This is a multi-part message in MIME format.
--------------010104050602070501040309
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit

Stephane Chazelas wrote:
> The doc has a confusing statement ... Same confusion in tests/pcre:

Thanks, I installed the attached patch to fix those.

> We can match a newline with grep -zP 'a\nb' (or '\x0a' or '\012'
> or '[\n]'...) but not easily without -P. Same for NUL
> characters.

Yes, that's a downside of the POSIX notation, and it'd be nice to extend 
POSIX to allow easy matching for newlines and/or null bytes.  I'll mark 
this bug report as a wishlist bug.


--------------010104050602070501040309
Content-Type: text/plain; charset=UTF-8;
 name="0001-misc-fix-doc-and-test-bugs-re-grep-z.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename="0001-misc-fix-doc-and-test-bugs-re-grep-z.patch"

From 2fd9ce3bcf55ec80aa5b1d3775fe00ccb078d7dd Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@HIDDEN>
Date: Thu, 24 Apr 2014 21:24:22 -0700
Subject: [PATCH] misc: fix doc and test bugs re grep -z

Problem reported by Stephane Chazelas in: http://bugs.gnu.org/16871
* doc/grep.texi (Usage): Remove incorrect example with -P.
* tests/pcre: Improve test so that it actually tests whether \s
matches a newline.
---
 doc/grep.texi | 16 +++++-----------
 tests/pcre    |  4 ++--
 2 files changed, 7 insertions(+), 13 deletions(-)

diff --git a/doc/grep.texi b/doc/grep.texi
index f631f03..59d0d3c 100644
--- a/doc/grep.texi
+++ b/doc/grep.texi
@@ -1719,25 +1719,19 @@ How can I match across lines?
 
 Standard grep cannot do this, as it is fundamentally line-based.
 Therefore, merely using the @code{[:space:]} character class does not
-match newlines in the way you might expect.  However, if your grep is
-compiled with Perl patterns enabled, the Perl @samp{s}
-modifier (which makes @code{.} match newlines) can be used:
-
-@example
-printf 'foo\nbar\n' | grep -P '(?s)foo.*?bar'
-@end example
+match newlines in the way you might expect.
 
 With the GNU @command{grep} option @code{-z} (@pxref{File and
 Directory Selection}), the input is terminated by null bytes.  Thus,
-you can match newlines in the input, but the output will be the whole
-file, so this is really only useful to determine if the pattern is
-present:
+you can match newlines in the input, but typically if there is a match
+the entire input is output, so this usage is often combined with
+output-suppressing options like @option{-q}, e.g.:
 
 @example
 printf 'foo\nbar\n' | grep -z -q 'foo[[:space:]]\+bar'
 @end example
 
-Failing either of those options, you need to transform the input
+If this does not suffice, you can transform the input
 before giving it to @command{grep}, or turn to @command{awk},
 @command{sed}, @command{perl}, or many other utilities that are
 designed to operate across lines.
diff --git a/tests/pcre b/tests/pcre
index cbe6884..7efa560 100755
--- a/tests/pcre
+++ b/tests/pcre
@@ -1,5 +1,5 @@
 #! /bin/sh
-# Ensure that with -P, \s*$ matches a newline.
+# Ensure that with -P, \s matches a newline.
 #
 # Copyright (C) 2001, 2006, 2009-2014 Free Software Foundation, Inc.
 #
@@ -12,7 +12,7 @@ require_pcre_
 
 fail=0
 
-# See CVS revision 1.32 of "src/search.c".
 echo | grep -P '\s*$' || fail=1
+echo | grep -zP '\s$' || fail=1
 
 Exit $fail
-- 
1.9.0
--------------010104050602070501040309--




Information forwarded to bug-grep@HIDDEN:
bug#16871; Package grep. Full text available.

Message received at 16871 <at> debbugs.gnu.org:


Date: Tue, 25 Feb 2014 11:32:43 +0000
From: Stephane Chazelas <stephane.chazelas@HIDDEN>
To: 16871 <at> debbugs.gnu.org
Subject: Re: bug#16871: Acknowledgement (problems about matching newline
 (with -z))
Message-ID: <20140225113243.GB18853@HIDDEN>
References: <20140225073218.GA18853@HIDDEN>
 <handler.16871.B.13933135695090.ack <at> debbugs.gnu.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <handler.16871.B.13933135695090.ack <at> debbugs.gnu.org>
User-Agent: Mutt/1.5.21 (2010-09-15)

Also:

$ printf 'a\nb\0' | grep -z 'a$'
$ printf 'a\nb\0' | grep -zP 'a$'
a
b
$ printf 'a\nb\0' | grep -zxP a
a
b

Why use PCRE_MULTILINE here?




Information forwarded to bug-grep@HIDDEN:
bug#16871; Package grep. Full text available.

Message received at submit <at> debbugs.gnu.org:


Date: Tue, 25 Feb 2014 07:32:18 +0000
From: Stephane Chazelas <stephane.chazelas@HIDDEN>
To: bug-grep@HIDDEN
Subject: problems about matching newline (with -z)
Message-ID: <20140225073218.GA18853@HIDDEN>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.21 (2010-09-15)

The doc has a confusing statement:

> 15. How can I match across lines?
>
>    Standard grep cannot do this, as it is fundamentally line-based.
>    Therefore, merely using the '[:space:]' character class does not
>    match newlines in the way you might expect.  However, if your grep
>    is compiled with Perl patterns enabled, the Perl 's' modifier
>    (which makes '.' match newlines) can be used:
>
>         printf 'foo\nbar\n' | grep -P '(?s)foo.*?bar'
>
>    With the GNU 'grep' option '-z' (*note File and Directory
>    Selection::), the input is terminated by null bytes.  Thus, you can
>    match newlines in the input, but the output will be the whole file,
>    so this is really only useful to determine if the pattern is
>    present:
>
>         printf 'foo\nbar\n' | grep -z -q 'foo[[:space:]]\+bar'
>
>    Failing either of those options, you need to transform the input
>    before giving it to 'grep', or turn to 'awk', 'sed', 'perl', or
>    many other utilities that are designed to operate across lines.

printf 'foo\nbar\n' | grep -P '(?s)foo.*?bar'

Will never match, since grep stays line-based even with -P. -P doesn't
help here; it only makes things harder, as you then need that (?s).

printf 'foo\nbar\n\0' | grep -z 'foo.*bar'

would match.
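
The difference between line mode and -z mode can be sketched as
follows. This is only an illustration, assuming GNU grep (for -z and
the \+ extension); the sample data and the variable names (input,
line_mode, z_mode, z_lines) are made up for the example, and the
pattern is the one from the quoted doc rather than 'foo.*bar'.

```shell
#!/bin/sh
# Assumes GNU grep (-z and the \+ BRE extension are GNU features).
input='foo\nbar\nbaz\n'   # hypothetical sample data, expanded by printf

# Line mode: grep sees "foo", "bar" and "baz" as separate lines, so a
# pattern that needs both "foo" and "bar" has nothing to match against.
if printf "$input" | grep -q 'foo[[:space:]]\+bar'; then
  line_mode=match
else
  line_mode=no-match
fi

# -z mode: records end at NUL, so the whole input is a single record
# and the newline between "foo" and "bar" is ordinary data.
if printf "$input" | grep -z -q 'foo[[:space:]]\+bar'; then
  z_mode=match
else
  z_mode=no-match
fi

# Without -q the matching record (here the entire input) is printed,
# followed by a NUL. Strip the NUL and count the lines that came out.
z_lines=$(printf "$input" | grep -z 'foo[[:space:]]\+bar' | tr -d '\000' | wc -l | tr -d ' ')

echo "line mode: $line_mode; -z mode: $z_mode; lines printed with -z: $z_lines"
```

The last command shows why -z is usually paired with -q: the "baz"
line is printed too, because it belongs to the same record.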

Same confusion in tests/pcre:

> #! /bin/sh
> # Ensure that with -P, \s*$ matches a newline.
> #
> # Copyright (C) 2001, 2006, 2009-2014 Free Software Foundation, Inc.
> #
> # Copying and distribution of this file, with or without modification,
> # are permitted in any medium without royalty provided the copyright
> # notice and this notice are preserved.
> 
> . "${srcdir=.}/init.sh"; path_prepend_ ../src
> require_pcre_
> 
> fail=0
> 
> # See CVS revision 1.32 of "src/search.c".
> echo | grep -P '\s*$' || fail=1
> 
> Exit $fail

'\s*$' doesn't match a newline, but an empty string.

You need echo | grep -zP '\s' to match the newline.
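
A small sketch of the same point, under the assumption of GNU grep.
To avoid depending on a grep built with PCRE support, it uses the
POSIX class [[:space:]] as a stand-in for \s; that swap is mine and
is not what tests/pcre does. The variable names are hypothetical.

```shell
#!/bin/sh
# Assumes GNU grep. [[:space:]] stands in for PCRE's \s so that the
# sketch does not require -P.

# "echo" emits a lone newline. In line mode grep strips it as the line
# terminator, leaving one empty line to match against.

# "Zero or more spaces, then end of line" is satisfied by the empty
# string, so this counts 1 line without ever seeing a newline.
empty_ok=$(echo | grep -c '^[[:space:]]*$' || true)

# Requiring at least one space character finds nothing: the line is empty.
space_line=$(echo | grep -c '[[:space:]]' || true)

# With -z the newline stays inside the record, so it can be matched.
space_z=$(echo | grep -z -c '[[:space:]]' || true)

echo "star-anchored: $empty_ok, space in line mode: $space_line, space with -z: $space_z"
```

The "|| true" only keeps the script going under "set -e", since grep
exits with status 1 when it finds no match.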

Also:

We can match a newline with grep -zP 'a\nb' (or '\x0a' or '\012'
or '[\n]'...) but not easily without -P. Same for NUL
characters.

Without -P, the only way I could think of was with
[^\0-\011\013-\377], but that would only work for single-byte
locales, and you can't pass a nul character on the command line,
so it would have to be with -f but:

$ printf 'a\nb\0' | LC_ALL=C grep -zf <(LC_ALL=C printf 'a[^\0-\011\013-\377]b')
zsh: done                printf 'a\nb\0' |
zsh: segmentation fault  LC_ALL=C grep -zf <(LC_ALL=C printf 'a[^\0-\011\013-\377]b')

Having said that:

grep -z $'a[^\01-\011\013-\0377]b'

would work (in single-byte locales): nul is the delimiter, so it
never appears in the input.

and grep -a $'[^\01-\0377]' can match nul (in single-byte
locales).

But it would be handy to be able to do the same as with -P.

-- 
Stephane




Acknowledgement sent to Stephane Chazelas <stephane.chazelas@HIDDEN>:
New bug report received and forwarded. Copy sent to bug-grep@HIDDEN. Full text available.
Report forwarded to bug-grep@HIDDEN:
bug#16871; Package grep. Full text available.
Last modified: Mon, 25 Nov 2019 12:00:02 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.