GNU logs - #22059, boring messages


Message sent to bug-grep@HIDDEN:


X-Loop: help-debbugs@HIDDEN
Subject: bug#22059: grep -E: unexpected behaviour
Resent-From: Charles <c@HIDDEN>
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-grep@HIDDEN
Resent-Date: Mon, 30 Nov 2015 07:23:02 +0000
Resent-Message-ID: <handler.22059.B.144886815619681 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: report 22059
X-GNU-PR-Package: grep
X-GNU-PR-Keywords: 
To: 22059 <at> debbugs.gnu.org
X-Debbugs-Original-To: bug-grep@HIDDEN
Received: via spool by submit <at> debbugs.gnu.org id=B.144886815619681
          (code B ref -1); Mon, 30 Nov 2015 07:23:02 +0000
Received: (at submit) by debbugs.gnu.org; 30 Nov 2015 07:22:36 +0000
Received: from localhost ([127.0.0.1]:60025 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.80)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1a3Imv-000572-UF
	for submit <at> debbugs.gnu.org; Mon, 30 Nov 2015 02:22:36 -0500
Received: from eggs.gnu.org ([208.118.235.92]:55914)
 by debbugs.gnu.org with esmtp (Exim 4.80)
 (envelope-from <c@HIDDEN>) id 1a3GXP-0001d7-Js
 for submit <at> debbugs.gnu.org; Sun, 29 Nov 2015 23:58:08 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from <c@HIDDEN>) id 1a3GXO-0005F1-Ad
 for submit <at> debbugs.gnu.org; Sun, 29 Nov 2015 23:58:07 -0500
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org
X-Spam-Level: 
X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50 autolearn=disabled
 version=3.3.2
Received: from lists.gnu.org ([2001:4830:134:3::11]:42777)
 by eggs.gnu.org with esmtp (Exim 4.71)
 (envelope-from <c@HIDDEN>) id 1a3GXO-0005Ex-7Y
 for submit <at> debbugs.gnu.org; Sun, 29 Nov 2015 23:58:06 -0500
Received: from eggs.gnu.org ([2001:4830:134:3::10]:37677)
 by lists.gnu.org with esmtp (Exim 4.71)
 (envelope-from <c@HIDDEN>) id 1a3GXN-0004dc-Dy
 for bug-grep@HIDDEN; Sun, 29 Nov 2015 23:58:06 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from <c@HIDDEN>) id 1a3GXJ-0005EY-Dr
 for bug-grep@HIDDEN; Sun, 29 Nov 2015 23:58:05 -0500
Received: from smtp5.emailarray.com ([65.39.216.39]:37633)
 by eggs.gnu.org with esmtp (Exim 4.71)
 (envelope-from <c@HIDDEN>) id 1a3GXJ-0005ET-8v
 for bug-grep@HIDDEN; Sun, 29 Nov 2015 23:58:01 -0500
Received: (qmail 82799 invoked by uid 89); 30 Nov 2015 04:57:58 -0000
Received: from unknown (HELO ?192.168.10.17?)
 (Y2hhcmxlc0BjaGFybGVzbWF0a2luc29uLm9yZ0A1OS45OS4yMzkuODg=) (POLARISLOCAL) 
 by smtp5.emailarray.com with SMTP; 30 Nov 2015 04:57:58 -0000
Message-ID: <565BD753.7020507@HIDDEN>
Date: Mon, 30 Nov 2015 10:27:55 +0530
From: Charles <c@HIDDEN>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-detected-operating-system: by eggs.gnu.org: FreeBSD 9.x
X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address
 (bad octet value).
X-Received-From: 2001:4830:134:3::11
X-Spam-Score: -5.0 (-----)
X-Mailman-Approved-At: Mon, 30 Nov 2015 02:22:17 -0500
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -5.0 (-----)

As expected:

# grep -E 'udisksd\[[[:digit:]]+\]: The string .* ' /var/log/syslog.1
Nov 30 07:16:38 CW8 udisksd[2650]: The string `TSSTcorp CDDVDW SHQeò? ±?¾MUæíE³èBãÄL' is not valid UTF-8. Invalid characters begins at `eò? ±?¾MUæíE³èBãÄL'
Nov 30 07:16:38 CW8 udisksd[2650]: The string `TSSTcorp CDDVDW SHQeò? ±?¾MUæíE³èBãÄL' is not valid UTF-8. Invalid characters begins at `eò? ±?¾MUæíE³èBãÄL'

But add an i to the end of the pattern and the behaviour is unexpected:

# grep -E 'udisksd\[[[:digit:]]+\]: The string .* i' /var/log/syslog.1
[no output]

Apparently grep silently stops processing when it encounters the invalid UTF-8:

# grep -E --only-matching 'udisksd\[[[:digit:]]+\]: The string .* ' /var/log/syslog.1 | tail -1
udisksd[2650]: The string `TSSTcorp CDDVDW

In case the specific unusual characters are relevant, here they are in hex:

# grep -E 'udisksd\[[[:digit:]]+\]: The string .* ' /var/log/syslog.1 | head -1 | cut --delimiter=' ' --fields=10-11 | od -x
0000000 4853 8251 f265 88d0 b120 b8d3 4dbe e655
0000020 45ed e8b3 e342 4cc4 0a27
0000032
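
Note on reading the dump: od -x prints 16-bit words in host byte order, so on a little-endian machine each pair of bytes appears swapped (4853 is the bytes 53 48, "SH"). A minimal sketch of a byte-order-independent dump using od -t x1 follows. The sample input is just the first four bytes of the dump, assuming it was taken on a little-endian host; the variable name is only for illustration.

```shell
# First four bytes of the field, in file order: 'S' 'H' 'Q' 0x82 (octal 202).
# -An suppresses the offset column, -tx1 prints one byte per hex group.
# xargs (which runs echo by default) collapses od's padding to single spaces.
bytes=$(printf 'SHQ\202' | od -An -tx1 | xargs)
echo "$bytes"   # prints: 53 48 51 82
```

Piping the original field through od -An -tx1 instead of od -x would show the bytes in the order they occur in the log.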

When the input has invalid characters so grep cannot process it, a message could be expected, perhaps controllable by the -s/--no-messages option, because the input is (sort of) unreadable.
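
Until grep reports such input itself, one possible pre-check is to let iconv validate the file first. This is a sketch, not part of grep: it relies on iconv exiting with a non-zero status when a UTF-8 to UTF-8 conversion hits an invalid sequence. The function name is_valid_utf8 and the temporary sample file are hypothetical.

```shell
# Succeed only if the whole file decodes cleanly as UTF-8.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# Sample file with a stray 0xF2 byte (octal 362), which is not valid UTF-8 here.
tmp=$(mktemp)
printf 'SH\362Q\n' > "$tmp"

if is_valid_utf8 "$tmp"; then
    result=valid
else
    result=invalid
fi
echo "sample file is $result UTF-8"
rm -f "$tmp"
```

A wrapper script could run this check before grep and print its own warning when the result is invalid.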

Version: 2.20 from the Debian Jessie package 2.20-4.1

Charles





Message sent:


Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
X-Mailer: MIME-tools 5.503 (Entity 5.503)
Content-Type: text/plain; charset=utf-8
X-Loop: help-debbugs@HIDDEN
From: help-debbugs@HIDDEN (GNU bug Tracking System)
To: Charles <c@HIDDEN>
Subject: bug#22059: Acknowledgement (grep -E: unexpected behaviour)
Message-ID: <handler.22059.B.144886815619681.ack <at> debbugs.gnu.org>
References: <565BD753.7020507@HIDDEN>
X-Gnu-PR-Message: ack 22059
X-Gnu-PR-Package: grep
Reply-To: 22059 <at> debbugs.gnu.org
Date: Mon, 30 Nov 2015 07:23:02 +0000

Thank you for filing a new bug report with debbugs.gnu.org.

This is an automatically generated reply to let you know your message
has been received.

Your message is being forwarded to the package maintainers and other
interested parties for their attention; they will reply in due course.

Your message has been sent to the package maintainer(s):
 bug-grep@HIDDEN

If you wish to submit further information on this problem, please
send it to 22059 <at> debbugs.gnu.org.

Please do not send mail to help-debbugs@HIDDEN unless you wish
to report a problem with the Bug-tracking system.

-- 
22059: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=22059
GNU Bug Tracking System
Contact help-debbugs@HIDDEN with problems


Message sent to bug-grep@HIDDEN:


X-Loop: help-debbugs@HIDDEN
Subject: bug#22059: grep -E: unexpected behaviour
Resent-From: Paul Eggert <eggert@HIDDEN>
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-grep@HIDDEN
Resent-Date: Mon, 30 Nov 2015 17:28:02 +0000
Resent-Message-ID: <handler.22059.B22059.144890444924700 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: followup 22059
X-GNU-PR-Package: grep
X-GNU-PR-Keywords: 
To: Charles <c@HIDDEN>, 22059 <at> debbugs.gnu.org
Received: via spool by 22059-submit <at> debbugs.gnu.org id=B22059.144890444924700
          (code B ref 22059); Mon, 30 Nov 2015 17:28:02 +0000
Received: (at 22059) by debbugs.gnu.org; 30 Nov 2015 17:27:29 +0000
Received: from localhost ([127.0.0.1]:33094 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.80)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1a3SEa-0006QK-Pz
	for submit <at> debbugs.gnu.org; Mon, 30 Nov 2015 12:27:28 -0500
Received: from zimbra.cs.ucla.edu ([131.179.128.68]:41764)
 by debbugs.gnu.org with esmtp (Exim 4.80)
 (envelope-from <eggert@HIDDEN>) id 1a3SEZ-0006QA-5k
 for 22059 <at> debbugs.gnu.org; Mon, 30 Nov 2015 12:27:27 -0500
Received: from localhost (localhost [127.0.0.1])
 by zimbra.cs.ucla.edu (Postfix) with ESMTP id 54CAA1601D0;
 Mon, 30 Nov 2015 09:27:26 -0800 (PST)
Received: from zimbra.cs.ucla.edu ([127.0.0.1])
 by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032)
 with ESMTP id mxRZTcT7DNDn; Mon, 30 Nov 2015 09:27:25 -0800 (PST)
Received: from localhost (localhost [127.0.0.1])
 by zimbra.cs.ucla.edu (Postfix) with ESMTP id AA339160E3D;
 Mon, 30 Nov 2015 09:27:25 -0800 (PST)
X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu
Received: from zimbra.cs.ucla.edu ([127.0.0.1])
 by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026)
 with ESMTP id CKHkCot-qxiy; Mon, 30 Nov 2015 09:27:25 -0800 (PST)
Received: from Penguin.CS.UCLA.EDU (Penguin.CS.UCLA.EDU [131.179.64.200])
 by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id 91B181601D0;
 Mon, 30 Nov 2015 09:27:25 -0800 (PST)
References: <565BD753.7020507@HIDDEN>
From: Paul Eggert <eggert@HIDDEN>
Organization: UCLA Computer Science Department
Message-ID: <565C86FD.70909@HIDDEN>
Date: Mon, 30 Nov 2015 09:27:25 -0800
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101
 Thunderbird/38.1.0
MIME-Version: 1.0
In-Reply-To: <565BD753.7020507@HIDDEN>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
X-Spam-Score: -0.0 (/)
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -0.0 (/)

On 11/29/2015 08:57 PM, Charles wrote:
> Apparently grep silently stops processing when it encounters the invalid UTF-8:

The regular expression "." matches a single character, and ".*" matches 
a string of characters. In your example, there is an encoding error, and 
encoding errors are not characters so "." and ".*" do not match them. I 
don't see any bug here.
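
The difference can be reproduced with a short sketch. In the C locale every byte is treated as a character, so ".*" can span the stray byte; in a UTF-8 locale the same byte is an encoding error and the pattern should not match across it. The sample line is made up for illustration, and the locale name C.UTF-8 is an assumption: if it is not installed, grep falls back to the C locale and the second count will not show the difference.

```shell
# Sample line with a stray 0xF2 byte (octal 362) between "SH" and "Q".
sample() { printf 'The string SH\362Q is not valid\n'; }

# C locale: one byte is one character, so ".*" can cross the stray byte.
c_count=$(sample | LC_ALL=C grep -c -E 'The string .* i')
echo "C locale matching lines: $c_count"

# UTF-8 locale (assumed to be installed under the name C.UTF-8).
# grep -c exits non-zero when the count is 0, hence the "|| true".
utf8_count=$(sample | LC_ALL=C.UTF-8 grep -c -E 'The string .* i' || true)
echo "UTF-8 locale matching lines: $utf8_count"
```

Running grep under LC_ALL=C is therefore one way to search logs that may contain invalid UTF-8, at the cost of multibyte characters being matched byte by byte.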

> When the input has invalid characters so grep cannot process it, a message could be expected

That's a good suggestion, yes.




Message received at control <at> debbugs.gnu.org:


Received: (at control) by debbugs.gnu.org; 31 Dec 2015 08:55:23 +0000
From debbugs-submit-bounces <at> debbugs.gnu.org Thu Dec 31 03:55:23 2015
Received: from localhost ([127.0.0.1]:50938 helo=debbugs.gnu.org)
	by debbugs.gnu.org with esmtp (Exim 4.84)
	(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
	id 1aEZ11-0005OJ-J5
	for submit <at> debbugs.gnu.org; Thu, 31 Dec 2015 03:55:23 -0500
Received: from zimbra.cs.ucla.edu ([131.179.128.68]:41868)
 by debbugs.gnu.org with esmtp (Exim 4.84)
 (envelope-from <eggert@HIDDEN>) id 1aEZ0z-0005O1-Ht
 for control <at> debbugs.gnu.org; Thu, 31 Dec 2015 03:55:21 -0500
Received: from localhost (localhost [127.0.0.1])
 by zimbra.cs.ucla.edu (Postfix) with ESMTP id 2C836160ED6
 for <control <at> debbugs.gnu.org>; Thu, 31 Dec 2015 00:55:16 -0800 (PST)
Received: from zimbra.cs.ucla.edu ([127.0.0.1])
 by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032)
 with ESMTP id Dsi6TvFEL1b3 for <control <at> debbugs.gnu.org>;
 Thu, 31 Dec 2015 00:55:15 -0800 (PST)
Received: from localhost (localhost [127.0.0.1])
 by zimbra.cs.ucla.edu (Postfix) with ESMTP id 4FFD0160ED7
 for <control <at> debbugs.gnu.org>; Thu, 31 Dec 2015 00:55:15 -0800 (PST)
X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu
Received: from zimbra.cs.ucla.edu ([127.0.0.1])
 by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026)
 with ESMTP id EnJ9ehm4awCT for <control <at> debbugs.gnu.org>;
 Thu, 31 Dec 2015 00:55:15 -0800 (PST)
Received: from [192.168.1.9] (pool-100-32-155-148.lsanca.fios.verizon.net
 [100.32.155.148])
 by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id 3587F160ED6
 for <control <at> debbugs.gnu.org>; Thu, 31 Dec 2015 00:55:15 -0800 (PST)
To: control <at> debbugs.gnu.org
From: Paul Eggert <eggert@HIDDEN>
Subject: grep bug maintenance
Organization: UCLA Computer Science Department
Message-ID: <5684ED73.6060403@HIDDEN>
Date: Thu, 31 Dec 2015 00:55:15 -0800
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101
 Thunderbird/38.4.0
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
X-Spam-Score: -0.0 (/)
X-Debbugs-Envelope-To: control
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>, 
 <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -0.0 (/)

severity 22059 wishlist
severity 21865 wishlist
close 22278
close 22279
close 21755
close 21700
tags 21554 wontfix
tags 21527 moreinfo





Last modified: Mon, 25 Nov 2019 12:00:02 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.