GNU bug report logs - #38223
grep >=2.28 cannot handle -wF correctly under LANG=ja_JP.eucjp


Package: grep

Reported by: "NIDE, Naoyuki" <nide <at> ics.nara-wu.ac.jp>

Date: Fri, 15 Nov 2019 19:54:02 UTC

Severity: normal

Tags: moreinfo

Done: Jim Meyering <jim <at> meyering.net>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 38223 in the body.
You can then email your comments to 38223 AT debbugs.gnu.org in the normal way.



Report forwarded to bug-grep <at> gnu.org:
bug#38223; Package grep. (Fri, 15 Nov 2019 19:54:02 GMT)

Acknowledgement sent to "NIDE, Naoyuki" <nide <at> ics.nara-wu.ac.jp>:
New bug report received and forwarded. Copy sent to bug-grep <at> gnu.org. (Fri, 15 Nov 2019 19:54:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: "NIDE, Naoyuki" <nide <at> ics.nara-wu.ac.jp>
To: bug-grep <at> gnu.org
Subject: grep >=2.28 cannot handle -wF correctly under LANG=ja_JP.eucjp
Date: Sat, 16 Nov 2019 04:06:51 +0900 (JST)

echo ba | LANG=ja_JP.eucjp grep -F -w a
outputs ba, but should output nothing.

NIDE, Naoyuki
nide <at> ics.nara-wu.ac.jp




Information forwarded to bug-grep <at> gnu.org:
bug#38223; Package grep. (Fri, 15 Nov 2019 21:39:02 GMT)

Message #8 received at 38223 <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: "NIDE, Naoyuki" <nide <at> ics.nara-wu.ac.jp>
Cc: 38223 <at> debbugs.gnu.org
Subject: Re: bug#38223: grep >=2.28 cannot handle -wF correctly under LANG=ja_JP.eucjp
Date: Fri, 15 Nov 2019 13:38:38 -0800

On 11/15/19 11:06 AM, NIDE, Naoyuki wrote:
> echo ba | LANG=ja_JP.eucjp grep -F -w a
> outputs ba, but should output nothing.

I don't observe this problem with GNU grep 3.3 on Fedora 31. Please try 
upgrading to grep 3.3, the current release. If that doesn't work, please send 
more details about your configuration: what OS you're using, how you built 
'grep', etc. Thanks.




Added tag(s) moreinfo. Request was from Paul Eggert <eggert <at> cs.ucla.edu> to control <at> debbugs.gnu.org. (Fri, 15 Nov 2019 21:40:01 GMT)

Information forwarded to bug-grep <at> gnu.org:
bug#38223; Package grep. (Sat, 16 Nov 2019 00:31:02 GMT)

Message #13 received at submit <at> debbugs.gnu.org:

From: Stephane Chazelas <stephane.chazelas <at> gmail.com>
To: bug-grep <at> gnu.org
Subject: Re: bug#38223: grep >=2.28 cannot handle -wF correctly under LANG=ja_JP.eucjp
Date: Sat, 16 Nov 2019 00:27:25 +0000

2019-11-15 13:38:38 -0800, Paul Eggert:
> On 11/15/19 11:06 AM, NIDE, Naoyuki wrote:
> > echo ba | LANG=ja_JP.eucjp grep -F -w a
> > outputs ba, but should output nothing.
> 
> I don't observe this problem with GNU grep 3.3 on Fedora 31. Please try
> upgrading to grep 3.3, the current release. If that doesn't work, please
> send more details about your configuration: what OS you're using, how you
> built 'grep', etc. Thanks.
[...]

I can reproduce on Linux Mint 19.2 Tina amd64, based on Ubuntu 18.04
with grep 3.1 and 3.3 and glibc 2.27-3ubuntu1.

$ echo ba | LC_ALL=ja_JP.eucjp ./src/grep  -o '[[:alnum:]]'
b
a
$ echo \\nba\\n | LC_ALL=ja_JP.eucjp ./src/grep  -wF a
ba

Also in these locales:

ja_JP.eucjp
ko_KR.euckr
zh_CN.gb18030
zh_CN.gb2312
zh_CN.gbk
zh_HK.big5hkscs
zh_SG.gb2312
zh_SG.gbk
zh_TW.big5
zh_TW.euctw

-- 
Stephane
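
The locale list above can be checked mechanically. What follows is only a minimal sketch, not something posted in the thread: the helper names `locale_available` and `check_wF` are made up for illustration, and the script assumes that `locale -a` prints locale names in the same spelling used in this report (for example `ja_JP.eucjp`). It skips locales that are not installed, because an uninstalled locale would otherwise fall back to "C" and hide the problem.

```shell
#!/bin/sh
# Hypothetical helpers (not from the thread) to re-run the reproducer
# `printf 'ba\n' | grep -wF a` under several locales.

# Succeed if the named locale is installed. "C" and "POSIX" always exist.
locale_available() {
    case $1 in
        C|POSIX) return 0 ;;
    esac
    command -v locale >/dev/null 2>&1 || return 1
    # Exact, fixed-string match against the installed locale names.
    locale -a 2>/dev/null | LC_ALL=C grep -qxF -- "$1"
}

# Print "skip" if the locale is missing, "bug" if the line "ba" is
# (wrongly) reported as containing the word "a", and "ok" otherwise.
check_wF() {
    if ! locale_available "$1"; then
        echo skip
    elif printf 'ba\n' | LC_ALL=$1 grep -wF a >/dev/null; then
        echo bug
    else
        echo ok
    fi
}

# A few of the locales named in this report.
for loc in ja_JP.eucjp ko_KR.euckr zh_CN.gb18030 zh_TW.big5; do
    printf '%s: %s\n' "$loc" "$(check_wF "$loc")"
done
```

To test a freshly built binary instead of the system one, replace `grep` inside `check_wF` with the path to that binary, such as the `./src/grep` used above.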





Information forwarded to bug-grep <at> gnu.org:
bug#38223; Package grep. (Sat, 16 Nov 2019 06:16:02 GMT)

Message #16 received at 38223 <at> debbugs.gnu.org:

From: "NIDE, Naoyuki" <nide <at> ics.nara-wu.ac.jp>
To: eggert <at> cs.ucla.edu
Cc: nide <at> ics.nara-wu.ac.jp, 38223 <at> debbugs.gnu.org
Subject: Re: bug#38223: grep >=2.28 cannot handle -wF correctly under LANG=ja_JP.eucjp
Date: Sat, 16 Nov 2019 14:11:44 +0900 (JST)

In Message <fd9ae09c-8ecf-b998-0f63-ea95357b5158 <at> cs.ucla.edu>,
	Paul Eggert <eggert <at> cs.ucla.edu> writes:
> On 11/15/19 11:06 AM, NIDE, Naoyuki wrote:
> > echo ba | LANG=ja_JP.eucjp grep -F -w a
> > outputs ba, but should output nothing.
> 
> I don't observe this problem with GNU grep 3.3 on Fedora 31. Please
> try upgrading to grep 3.3, the current release. If that doesn't work,
> please send more details about your configuration: what OS you're
> using, how you built 'grep', etc. Thanks.

I am using grep 3.3 on Debian buster (the version packaged by Debian).

$ uname -a
Linux myhost 4.19.0-6-amd64 #1 SMP Debian 4.19.67-2+deb10u1 (2019-09-20) x86_64 GNU/Linux
$ cat /etc/debian_version 
10.1
$ which grep
/bin/grep
$ grep --version
grep (GNU grep) 3.3
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Mike Haertel and others; see
<https://git.sv.gnu.org/cgit/grep.git/tree/AUTHORS>.
$ echo ba | LANG=ja_JP.eucjp grep -F -w a
ba

The bug appears.
Perhaps you tested in an environment that does not have the ja_JP.eucjp locale?
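
One way to check whether a locale is really in effect is to ask for its character map. This is a sketch under assumptions rather than a definitive test: the function name `effective_charmap` is made up for illustration, and the script relies on the `locale` utility being installed. With glibc, a program that asks for an uninstalled locale typically ends up in the "C" locale, so the reported charmap would not be a multibyte encoding such as EUC-JP.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print the character map that
# is actually in effect when LC_ALL is set to the given locale name.
# Warnings about an unknown locale are discarded; only the charmap is kept.
effective_charmap() {
    LC_ALL=$1 locale charmap 2>/dev/null
}

loc=ja_JP.eucjp
printf '%s uses charmap: %s\n' "$loc" "$(effective_charmap "$loc")"
```

If the printed charmap is not the expected EUC-JP encoding, the locale is probably missing and the `grep -F -w` reproducer would be running in the "C" locale.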

I also tried grep 3.3 built from source (in the same environment).

$ wget http://ftp.jaist.ac.jp/pub/GNU/grep/grep-3.3.tar.xz
$ tar xf grep-3.3.tar.xz
$ cd grep-3.3
$ ./configure --prefix=/tmp/test
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /bin/mkdir -p
checking for gawk... gawk
checking whether make sets $(MAKE)... yes
checking whether make supports nested variables... yes
checking whether make supports nested variables... (cached) yes
checking build system type... x86_64-pc-linux-gnu
checking host system type... x86_64-pc-linux-gnu
checking for gawk... (cached) gawk
checking for gcc... gcc
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables... 
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether the compiler supports GNU C... yes
checking whether gcc accepts -g... yes
checking for gcc option to enable C11 features... none needed
checking whether make supports the include directive... yes (GNU style)
checking dependency style of gcc... gcc3
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking for ucontext.h... yes
checking for sys/param.h... yes
checking for sys/socket.h... yes
checking for dirent.h... yes
checking for fnmatch.h... yes
checking for wctype.h... yes
checking for stdio_ext.h... yes
checking for sys/vfs.h... yes
checking for getopt.h... yes
checking for sys/cdefs.h... yes
checking for iconv.h... yes
checking for limits.h... yes
checking for wchar.h... yes
checking for crtdefs.h... no
checking for langinfo.h... yes
checking for xlocale.h... no
checking for sys/mman.h... yes
checking for malloc.h... yes
checking for sys/time.h... yes
checking for features.h... yes
checking for arpa/inet.h... yes
checking for netdb.h... yes
checking for netinet/in.h... yes
checking for sys/select.h... yes
checking for sys/wait.h... yes
checking for sys/ioctl.h... yes
checking for sys/uio.h... yes
checking for minix/config.h... no
checking whether it is safe to define __EXTENSIONS__... yes
checking whether _XOPEN_SOURCE should be defined... no
checking how to run the C preprocessor... gcc -E
checking for grep that handles long lines and -e... fn_grep
checking for egrep... (cached) fn_grep
checking for Minix Amsterdam compiler... no
checking for ar... ar
checking for ranlib... ranlib
checking for special C compiler options needed for large files... no
checking for _FILE_OFFSET_BITS value needed for large files... no
checking for ranlib... (cached) ranlib
checking for pkg-config... /usr/bin/pkg-config
checking pkg-config is at least version 0.9.0... yes
checking for size_t... yes
checking for an ANSI C-conforming const... yes
checking for working alloca.h... yes
checking for alloca... yes
checking whether <wchar.h> uses 'inline' correctly... yes
checking for btowc... yes
checking for setrlimit... yes
checking for sigaltstack... yes
checking for _set_invalid_parameter_handler... no
checking for fchdir... yes
checking for strerror_r... yes
checking for fcntl... yes
checking for symlink... yes
checking for fdopendir... yes
checking for mempcpy... yes
checking for fnmatch... yes
checking for isblank... yes
checking for iswctype... yes
checking for mbsrtowcs... yes
checking for wmemchr... yes
checking for wmemcpy... yes
checking for wmempcpy... yes
checking for fstatat... yes
checking for openat... yes
checking for fstatfs... yes
checking for getdtablesize... yes
checking for getprogname... no
checking for getexecname... no
checking for iswcntrl... yes
checking for iswblank... yes
checking for lstat... yes
checking for mbsinit... yes
checking for mbrtowc... yes
checking for mbrlen... yes
checking for mbslen... no
checking for mprotect... yes
checking for nl_langinfo... yes
checking for sigaction... yes
checking for siginterrupt... yes
checking for strdup... yes
checking for __xpg_strerror_r... yes
checking for strtoimax... yes
checking for strtoumax... yes
checking for pipe... yes
checking for wcrtomb... yes
checking for wctob... yes
checking for wcwidth... yes
checking for ftruncate... yes
checking for gettimeofday... yes
checking for newlocale... yes
checking for uselocale... yes
checking for duplocale... yes
checking for freelocale... yes
checking for setenv... yes
checking for sleep... yes
checking for snprintf... yes
checking for catgets... yes
checking for shutdown... yes
checking for vasnprintf... no
checking for isascii... yes
checking for setlocale... yes
checking for nl_langinfo and CODESET... yes
checking for a traditional french locale... none
checking for working C stack overflow detection... yes
checking for correct stack_t interpretation... yes
checking for precise C stack overflow detection... no
checking for ld used by gcc... /usr/bin/ld
checking if the linker (/usr/bin/ld) is GNU ld... yes
checking for shared library run path origin... done
checking 32-bit host C ABI... no
checking for the common suffixes of directories in the library search path... lib,lib
checking for libsigsegv... no, consider installing GNU libsigsegv
checking how gcc reports undeclared, standard C functions... error
checking whether the preprocessor supports include_next... yes
checking whether system header files limit the line length... no
checking whether // is distinct from /... no
checking for complete errno.h... yes
checking whether strerror_r is declared... yes
checking whether strerror_r returns char *... yes
checking whether fchdir is declared... yes
checking for working fcntl.h... yes
checking for pid_t... yes
checking for mode_t... yes
checking for promoted mode_t type... mode_t
checking for mbstate_t... yes
checking whether stat file-mode macros are broken... no
checking for nlink_t... yes
checking whether lstat correctly handles trailing slash... yes
checking for O_CLOEXEC... yes
checking whether getcwd (NULL, 0) allocates memory for result... yes
checking for getcwd with POSIX signature... yes
checking whether getdtablesize is declared... yes
checking for getopt.h... (cached) yes
checking for getopt_long_only... yes
checking whether getopt is POSIX compatible... yes
checking for working GNU getopt function... yes
checking for working GNU getopt_long function... yes
checking for iconv... yes
checking for working iconv... yes
checking for iconv declaration... 
         extern size_t iconv (iconv_t cd, char * *inbuf, size_t *inbytesleft, char * *outbuf, size_t *outbytesleft);
checking for inline... inline
checking whether limits.h has LLONG_MAX, WORD_BIT, ULLONG_WIDTH etc.... yes
checking for wint_t... yes
checking whether wint_t is too small... no
checking for unsigned long long int... yes
checking for long long int... yes
checking whether stdint.h conforms to C99... yes
checking whether stdint.h predates C++11... no
checking whether stdint.h has UINTMAX_WIDTH etc.... yes
checking for inttypes.h... (cached) yes
checking whether the inttypes.h PRIxNN macros are broken... no
checking whether iswcntrl works... yes
checking for towlower... yes
checking for wctype_t... yes
checking for wctrans_t... yes
checking for wchar_t... yes
checking for good max_align_t... yes
checking whether NULL can be used in arbitrary expressions... yes
checking whether imported symbols can be declared weak... yes
checking whether the linker supports --as-needed... yes
checking whether the linker supports --push-state... yes
checking for pthread.h... yes
checking for multithread API to use... posix
checking for a sed that does not truncate output... /bin/sed
checking whether malloc, realloc, calloc are POSIX compliant... yes
checking for stdlib.h... yes
checking for GNU libc compatible malloc... yes
checking for a traditional japanese locale... ja_JP
checking for a transitional chinese locale... none
checking for a french Unicode locale... none
checking whether mbrtowc handles incomplete characters... yes
checking whether mbrtowc works as well as mbtowc... guessing yes
checking whether mbrtowc handles a NULL pwc argument... guessing yes
checking whether mbrtowc handles a NULL string argument... guessing yes
checking whether mbrtowc has a correct return value... yes
checking whether mbrtowc returns 0 when parsing a NUL character... guessing yes
checking whether mbrtowc works on empty input... (cached) assume yes
checking whether the C locale is free of encoding errors... no
checking for mmap... yes
checking for MAP_ANONYMOUS... yes
checking whether memchr works... yes
checking for C/C++ restrict keyword... __restrict
checking whether memrchr is declared... yes
checking whether <limits.h> defines MIN and MAX... no
checking whether <sys/param.h> defines MIN and MAX... yes
checking for sigset_t... yes
checking whether alarm is declared... yes
checking whether we are using the GNU C Library >= 2.1 or uClibc... yes
checking for ssize_t... yes
checking for uid_t in sys/types.h... yes
checking for stdbool.h that conforms to C99... yes
checking for _Bool... yes
checking whether strdup is declared... yes
checking whether strerror(0) succeeds... yes
checking for strerror_r with POSIX signature... no
checking whether __xpg_strerror_r works... yes
checking whether strnlen is declared... yes
checking whether strstr works... no
checking whether strtoimax is declared... yes
checking whether strtoumax is declared... yes
checking for struct timespec in <time.h>... yes
checking whether clearerr_unlocked is declared... yes
checking whether feof_unlocked is declared... yes
checking whether ferror_unlocked is declared... yes
checking whether fflush_unlocked is declared... yes
checking whether fgets_unlocked is declared... yes
checking whether fputc_unlocked is declared... yes
checking whether fputs_unlocked is declared... yes
checking whether fread_unlocked is declared... yes
checking whether fwrite_unlocked is declared... yes
checking whether getc_unlocked is declared... yes
checking whether getchar_unlocked is declared... yes
checking whether putc_unlocked is declared... yes
checking whether putchar_unlocked is declared... yes
checking whether <sys/socket.h> is self-contained... yes
checking for shutdown... (cached) yes
checking whether <sys/socket.h> defines the SHUT_* macros... yes
checking for struct sockaddr_storage... yes
checking for sa_family_t... yes
checking for struct sockaddr_storage.ss_family... yes
checking if environ is properly declared... yes
checking for struct timeval... yes
checking for wide-enough struct timeval.tv_sec member... yes
checking for IPv4 sockets... yes
checking for IPv6 sockets... yes
checking for off_t... yes
checking for LC_MESSAGES... yes
checking whether uselocale works... yes
checking for fake locale system (OpenBSD)... no
checking for Solaris 11.4 locale system... no
checking for getlocalename_l... no
checking for CFPreferencesCopyAppValue... no
checking for CFLocaleCopyCurrent... no
checking for CFLocaleCopyPreferredLanguages... no
checking whether <sys/select.h> is self-contained... yes
checking for library containing setsockopt... none needed
checking whether select supports a 0 argument... yes
checking whether select detects invalid fds... yes
checking whether setenv is declared... yes
checking for search.h... yes
checking for tsearch... yes
checking whether snprintf returns a byte count as in C99... yes
checking whether snprintf is declared... yes
checking whether unsetenv is declared... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for intmax_t... yes
checking where to find the exponent in a 'double'... word 1 bit 20
checking for snprintf... (cached) yes
checking for strnlen... yes
checking for wcslen... yes
checking for wcsnlen... yes
checking for mbrtowc... (cached) yes
checking for wcrtomb... (cached) yes
checking whether _snprintf is declared... no
checking for alloca as a compiler built-in... yes
checking whether to enable assertions... yes
checking whether btowc(0) is correct... yes
checking whether btowc(EOF) is correct... guessing yes
checking for __builtin_expect... yes
checking whether sigaltstack is declared... yes
checking for stack_t... yes
checking whether this system has an arbitrary file name length limit... yes
checking for closedir... yes
checking for d_ino member in directory struct... yes
checking for d_type member in directory struct... yes
checking for dirfd... yes
checking whether dirfd is declared... yes
checking whether dirfd is a macro... no
checking whether // is distinct from /... (cached) no
checking whether dup works... yes
checking whether dup2 works... yes
checking for error_at_line... yes
checking whether fcntl handles F_DUPFD correctly... yes
checking whether fcntl understands F_DUPFD_CLOEXEC... needs runtime check
checking whether fdopendir is declared... yes
checking whether fdopendir works... yes
checking for flexible array member... yes
checking for working POSIX fnmatch... yes
checking for __fpending... yes
checking whether __fpending is declared... yes
checking whether fstatat (..., 0) works... yes
checking for struct statfs.f_type... yes
checking for __fsword_t... yes
checking whether getdtablesize works... yes
checking for getpagesize... yes
checking whether getpagesize is declared... yes
checking whether program_invocation_name is declared... yes
checking whether program_invocation_short_name is declared... yes
checking whether __argv is declared... no
checking whether the compiler generally respects inline... yes
checking whether INT32_MAX < INTMAX_MAX... yes
checking whether INT64_MAX == LONG_MAX... yes
checking whether UINT32_MAX < UINTMAX_MAX... yes
checking whether UINT64_MAX == ULONG_MAX... yes
checking whether iswblank is declared... yes
checking whether langinfo.h defines CODESET... yes
checking whether langinfo.h defines T_FMT_AMPM... yes
checking whether langinfo.h defines ALTMON_1... yes
checking whether langinfo.h defines ERA... yes
checking whether langinfo.h defines YESEXPR... yes
checking whether the compiler supports the __inline keyword... yes
checking for libsigsegv... (cached) no, consider installing GNU libsigsegv
checking whether locale.h conforms to POSIX:2001... yes
checking whether struct lconv is properly defined... yes
checking for pthread_rwlock_t... yes
checking whether pthread_rwlock_rdlock prefers a writer to a reader... no
checking whether lseek detects pipes... yes
checking for stdlib.h... (cached) yes
checking for GNU libc compatible malloc... (cached) yes
checking whether mbrtowc handles incomplete characters... (cached) yes
checking whether mbrtowc works as well as mbtowc... (cached) guessing yes
checking whether mbrtowc handles a NULL pwc argument... (cached) guessing yes
checking whether mbrtowc handles a NULL string argument... (cached) guessing yes
checking whether mbrtowc has a correct return value... (cached) yes
checking whether mbrtowc returns 0 when parsing a NUL character... (cached) guessing yes
checking whether mbrtowc works on empty input... (cached) assume yes
checking whether the C locale is free of encoding errors... (cached) no
checking whether mbrtowc handles incomplete characters... (cached) yes
checking whether mbrtowc works as well as mbtowc... (cached) guessing yes
checking whether mbrtowc handles incomplete characters... (cached) yes
checking whether mbrtowc works as well as mbtowc... (cached) guessing yes
checking whether mbsrtowcs works... yes
checking for mempcpy... (cached) yes
checking for memrchr... yes
checking whether YESEXPR works... yes
checking for obstacks that work with any size object... no
checking whether open recognizes a trailing slash... yes
checking for opendir... yes
checking for perl5.005 or newer... yes
checking for raise... yes
checking for sigprocmask... yes
checking for readdir... yes
checking for stdlib.h... (cached) yes
checking for GNU libc compatible realloc... yes
checking for working re_compile_pattern... no
checking for libintl.h... yes
checking whether isblank is declared... yes
checking for struct sigaction.sa_sigaction... yes
checking for volatile sig_atomic_t... yes
checking for sighandler_t... yes
checking for sigprocmask... (cached) yes
checking for ssize_t... (cached) yes
checking whether stat handles trailing slashes on files... yes
checking for struct stat.st_atim.tv_nsec... yes
checking whether struct stat.st_atim is of type struct timespec... yes
checking for struct stat.st_birthtimespec.tv_nsec... no
checking for struct stat.st_birthtimensec... no
checking for struct stat.st_birthtim.tv_nsec... no
checking for working stdalign.h... yes
checking for va_copy... yes
checking for good max_align_t... (cached) yes
checking whether NULL can be used in arbitrary expressions... (cached) yes
checking which flavor of printf attribute matches inttypes macros... system
checking for stpcpy... yes
checking for working strerror function... yes
checking for working strnlen... yes
checking whether strstr works... (cached) no
checking whether strtoimax works... yes
checking for strtoll... yes
checking for strtoull... yes
checking for nlink_t... (cached) yes
checking whether mbrtowc handles incomplete characters... (cached) yes
checking whether mbrtowc works as well as mbtowc... (cached) guessing yes
checking whether wcrtomb return value is correct... yes
checking whether wctob works... guessing yes
checking whether wctob is declared... yes
checking whether iswcntrl works... (cached) yes
checking for towlower... (cached) yes
checking for wctype_t... (cached) yes
checking for wctrans_t... (cached) yes
checking whether wcwidth is declared... yes
checking whether wcwidth works reasonably in UTF-8 locales... yes
checking for a traditional french locale... (cached) none
checking for a french Unicode locale... (cached) none
checking for a traditional french locale... (cached) none
checking for a turkish Unicode locale... none
checking whether fdopen sets errno... yes
checking whether conversion from 'int' to 'long double' works... yes
checking whether gettimeofday clobbers localtime buffer... no
checking for gettimeofday with POSIX signature... almost
checking for library containing inet_pton... none required
checking whether inet_pton is declared... yes
checking whether byte ordering is bigendian... no
checking for ioctl... yes
checking for ioctl with POSIX signature... no
checking for setlocale... (cached) yes
checking for a turkish Unicode locale... (cached) none
checking for a french Unicode locale... (cached) none
checking for a traditional french locale... (cached) none
checking for a french Unicode locale... (cached) none
checking for a traditional japanese locale... (cached) ja_JP
checking for a transitional chinese locale... (cached) none
checking for a french Unicode locale... (cached) none
checking for a transitional chinese locale... (cached) none
checking for mmap... (cached) yes
checking for MAP_ANONYMOUS... yes
checking for mmap... (cached) yes
checking for MAP_ANONYMOUS... yes
checking for mmap... (cached) yes
checking for MAP_ANONYMOUS... yes
checking for library containing nanosleep... none required
checking for working nanosleep... no (mishandles large arguments)
checking whether <netinet/in.h> is self-contained... yes
checking for a traditional french locale... (cached) none
checking for a french Unicode locale... (cached) none
checking whether perror matches strerror... yes
checking for putenv compatible with GNU and SVID... yes
checking for mmap... (cached) yes
checking for MAP_ANONYMOUS... yes
checking whether select supports a 0 argument... (cached) yes
checking whether select detects invalid fds... (cached) yes
checking whether setenv validates arguments... yes
checking for a traditional french locale... (cached) none
checking for a french Unicode locale... (cached) none
checking for a traditional japanese locale... (cached) ja_JP
checking for a transitional chinese locale... (cached) none
checking for stdint.h... (cached) yes
checking for SIZE_MAX... yes
checking whether sleep is declared... yes
checking for working sleep... yes
checking for snprintf... (cached) yes
checking whether snprintf respects a size of 1... yes
checking whether printf supports POSIX/XSI format strings with positions... yes
checking for socklen_t... yes
checking for mmap... (cached) yes
checking for MAP_ANONYMOUS... yes
checking for mmap... (cached) yes
checking for MAP_ANONYMOUS... yes
checking whether symlink handles trailing slash correctly... yes
checking whether <sys/ioctl.h> declares ioctl... yes
checking for unsetenv... yes
checking for unsetenv() return type... int
checking whether unsetenv obeys POSIX... yes
checking for ptrdiff_t... yes
checking for a traditional french locale... (cached) none
checking for a french Unicode locale... (cached) none
checking for a traditional japanese locale... (cached) ja_JP
checking for a transitional chinese locale... (cached) none
checking for stdint.h... (cached) yes
checking for dirent.h that defines DIR... yes
checking for library containing opendir... none required
checking whether closedir returns void... no
checking whether NLS is requested... yes
checking for msgfmt... /usr/bin/msgfmt
checking for gmsgfmt... /usr/bin/msgfmt
checking for xgettext... /usr/bin/xgettext
checking for msgmerge... /usr/bin/msgmerge
checking for CFPreferencesCopyAppValue... (cached) no
checking for CFLocaleCopyCurrent... (cached) no
checking for CFLocaleCopyPreferredLanguages... (cached) no
checking for GNU gettext in libc... yes
checking whether to use NLS... yes
checking where the gettext function comes from... libc
checking for PCRE... yes
checking for pcre_compile... yes
checking that generated files are newer than configure... done
configure: creating ./config.status
config.status: creating Makefile
config.status: creating lib/Makefile
config.status: creating src/Makefile
config.status: creating tests/Makefile
config.status: creating po/Makefile.in
config.status: creating doc/Makefile
config.status: creating gnulib-tests/Makefile
config.status: creating config.h
config.status: executing depfiles commands
config.status: executing po-directories commands
config.status: creating po/POTFILES
config.status: creating po/Makefile
$ make
make  all-recursive
make[1]: Entering directory '/tmp/grep-3.3'
Making all in po
make[2]: Entering directory '/tmp/grep-3.3/po'
make[2]: Nothing to be done for 'all'.
make[2]: Leaving directory '/tmp/grep-3.3/po'
Making all in lib
make[2]: Entering directory '/tmp/grep-3.3/lib'
  GEN      alloca.h
  GEN      configmake.h
  GEN      ctype.h
  GEN      dirent.h
  GEN      fcntl.h
  GEN      iconv.h
  GEN      inttypes.h
  GEN      langinfo.h
  GEN      limits.h
  GEN      locale.h
  GEN      signal.h
  GEN      stdio.h
  GEN      stdlib.h
  GEN      string.h
  GEN      sys/stat.h
  GEN      sys/types.h
  GEN      time.h
  GEN      unistd.h
  GEN      unistr.h
  GEN      unitypes.h
  GEN      uniwidth.h
  GEN      wchar.h
  GEN      wctype.h
make  all-am
make[3]: Entering directory '/tmp/grep-3.3/lib'
  CC       argmatch.o
  CC       binary-io.o
  CC       bitrotate.o
  CC       c-ctype.o
  CC       c-stack.o
  CC       c-strcasecmp.o
  CC       c-strncasecmp.o
  CC       cloexec.o
  CC       close-stream.o
  CC       closeout.o
  CC       cycle-check.o
  CC       dfa.o
  CC       localeinfo.o
  CC       dirname-lgpl.o
  CC       basename-lgpl.o
  CC       stripslash.o
  CC       exclude.o
  CC       exitfail.o
  CC       creat-safer.o
  CC       open-safer.o
  CC       fd-hook.o
  CC       fd-safer-flag.o
  CC       dup-safer-flag.o
  CC       filenamecat-lgpl.o
  CC       getprogname.o
  CC       hard-locale.o
  CC       hash.o
  CC       i-ring.o
  CC       localcharset.o
  CC       glthread/lock.o
  CC       malloca.o
  CC       mbchar.o
  CC       mbiter.o
  CC       mbscasecmp.o
  CC       mbslen.o
  CC       mbsstr.o
  CC       mbuiter.o
  CC       memchr2.o
  CC       openat-die.o
  CC       openat-safer.o
  CC       opendirat.o
  CC       propername.o
  CC       quotearg.o
  CC       safe-read.o
  CC       save-cwd.o
  CC       sig-handler.o
  CC       stat-time.o
  CC       striconv.o
  CC       strnlen1.o
  CC       glthread/threadlib.o
  CC       trim.o
  CC       unistd.o
  CC       dup-safer.o
  CC       fd-safer.o
  CC       pipe-safer.o
  CC       unistr/u8-mbtoucr.o
  CC       unistr/u8-uctomb.o
  CC       unistr/u8-uctomb-aux.o
  CC       uniwidth/width.o
  CC       version-etc.o
  CC       version-etc-fsf.o
  CC       wctype-h.o
  CC       xmalloc.o
  CC       xalloc-die.o
  CC       xbinary-io.o
  CC       xstriconv.o
  CC       xstrtoimax.o
  CC       xstrtol.o
  CC       xstrtoul.o
  CC       xstrtol-error.o
  CC       colorize.o
  CC       chdir-long.o
  CC       fcntl.o
  CC       fts.o
  CC       mbrlen.o
  CC       mbrtowc.o
  CC       obstack.o
  CC       openat-proc.o
  CC       regex.o
  CC       strstr.o
  AR       libgreputils.a
make[3]: Leaving directory '/tmp/grep-3.3/lib'
make[2]: Leaving directory '/tmp/grep-3.3/lib'
Making all in doc
make[2]: Entering directory '/tmp/grep-3.3/doc'
  GEN      grep.1
  GEN      fgrep.1
  GEN      egrep.1
make[2]: Leaving directory '/tmp/grep-3.3/doc'
Making all in src
make[2]: Entering directory '/tmp/grep-3.3/src'
  CC       dfasearch.o
  CC       grep.o
  CC       kwsearch.o
  CC       kwset.o
  CC       pcresearch.o
  CC       searchutils.o
  CCLD     grep
  GEN      egrep
  GEN      fgrep
make[2]: Leaving directory '/tmp/grep-3.3/src'
Making all in tests
make[2]: Entering directory '/tmp/grep-3.3/tests'
make[2]: Nothing to be done for 'all'.
make[2]: Leaving directory '/tmp/grep-3.3/tests'
Making all in gnulib-tests
make[2]: Entering directory '/tmp/grep-3.3/gnulib-tests'
  GEN      arpa/inet.h
  GEN      sys/ioctl.h
  GEN      sys/select.h
  GEN      sys/socket.h
  GEN      sys/time.h
  GEN      sys/uio.h
make  all-recursive
make[3]: Entering directory '/tmp/grep-3.3/gnulib-tests'
Making all in .
make[4]: Entering directory '/tmp/grep-3.3/gnulib-tests'
  CC       test-localcharset.o
  CC       hash-pjw.o
  CC       imaxtostr.o
  CC       inttostr.o
  CC       offtostr.o
  CC       uinttostr.o
  CC       umaxtostr.o
  CC       localename.o
  CC       localename-table.o
  CC       sockets.o
  CC       sys_socket.o
  CC       xsize.o
  CC       asnprintf.o
  CC       ioctl.o
  CC       localtime-buffer.o
  CC       nanosleep.o
  CC       printf-args.o
  CC       printf-parse.o
  CC       strerror_r.o
  CC       vasnprintf.o
  AR       libtests.a
  CCLD     test-localcharset
make[4]: Leaving directory '/tmp/grep-3.3/gnulib-tests'
make[3]: Leaving directory '/tmp/grep-3.3/gnulib-tests'
make[2]: Leaving directory '/tmp/grep-3.3/gnulib-tests'
make[2]: Entering directory '/tmp/grep-3.3'
make[2]: Leaving directory '/tmp/grep-3.3'
make[1]: Leaving directory '/tmp/grep-3.3'
$ make install
Making install in po
make[1]: Entering directory '/tmp/grep-3.3/po'
installing af.gmo as /tmp/test/share/locale/af/LC_MESSAGES/grep.mo
installing be.gmo as /tmp/test/share/locale/be/LC_MESSAGES/grep.mo
installing bg.gmo as /tmp/test/share/locale/bg/LC_MESSAGES/grep.mo
installing ca.gmo as /tmp/test/share/locale/ca/LC_MESSAGES/grep.mo
installing cs.gmo as /tmp/test/share/locale/cs/LC_MESSAGES/grep.mo
installing da.gmo as /tmp/test/share/locale/da/LC_MESSAGES/grep.mo
installing de.gmo as /tmp/test/share/locale/de/LC_MESSAGES/grep.mo
installing el.gmo as /tmp/test/share/locale/el/LC_MESSAGES/grep.mo
installing eo.gmo as /tmp/test/share/locale/eo/LC_MESSAGES/grep.mo
installing es.gmo as /tmp/test/share/locale/es/LC_MESSAGES/grep.mo
installing et.gmo as /tmp/test/share/locale/et/LC_MESSAGES/grep.mo
installing eu.gmo as /tmp/test/share/locale/eu/LC_MESSAGES/grep.mo
installing fi.gmo as /tmp/test/share/locale/fi/LC_MESSAGES/grep.mo
installing fr.gmo as /tmp/test/share/locale/fr/LC_MESSAGES/grep.mo
installing ga.gmo as /tmp/test/share/locale/ga/LC_MESSAGES/grep.mo
installing gl.gmo as /tmp/test/share/locale/gl/LC_MESSAGES/grep.mo
installing he.gmo as /tmp/test/share/locale/he/LC_MESSAGES/grep.mo
installing hr.gmo as /tmp/test/share/locale/hr/LC_MESSAGES/grep.mo
installing hu.gmo as /tmp/test/share/locale/hu/LC_MESSAGES/grep.mo
installing id.gmo as /tmp/test/share/locale/id/LC_MESSAGES/grep.mo
installing it.gmo as /tmp/test/share/locale/it/LC_MESSAGES/grep.mo
installing ja.gmo as /tmp/test/share/locale/ja/LC_MESSAGES/grep.mo
installing ko.gmo as /tmp/test/share/locale/ko/LC_MESSAGES/grep.mo
installing ky.gmo as /tmp/test/share/locale/ky/LC_MESSAGES/grep.mo
installing lt.gmo as /tmp/test/share/locale/lt/LC_MESSAGES/grep.mo
installing nb.gmo as /tmp/test/share/locale/nb/LC_MESSAGES/grep.mo
installing nl.gmo as /tmp/test/share/locale/nl/LC_MESSAGES/grep.mo
installing pa.gmo as /tmp/test/share/locale/pa/LC_MESSAGES/grep.mo
installing pl.gmo as /tmp/test/share/locale/pl/LC_MESSAGES/grep.mo
installing pt.gmo as /tmp/test/share/locale/pt/LC_MESSAGES/grep.mo
installing pt_BR.gmo as /tmp/test/share/locale/pt_BR/LC_MESSAGES/grep.mo
installing ro.gmo as /tmp/test/share/locale/ro/LC_MESSAGES/grep.mo
installing ru.gmo as /tmp/test/share/locale/ru/LC_MESSAGES/grep.mo
installing sk.gmo as /tmp/test/share/locale/sk/LC_MESSAGES/grep.mo
installing sl.gmo as /tmp/test/share/locale/sl/LC_MESSAGES/grep.mo
installing sr.gmo as /tmp/test/share/locale/sr/LC_MESSAGES/grep.mo
installing sv.gmo as /tmp/test/share/locale/sv/LC_MESSAGES/grep.mo
installing th.gmo as /tmp/test/share/locale/th/LC_MESSAGES/grep.mo
installing tr.gmo as /tmp/test/share/locale/tr/LC_MESSAGES/grep.mo
installing uk.gmo as /tmp/test/share/locale/uk/LC_MESSAGES/grep.mo
installing vi.gmo as /tmp/test/share/locale/vi/LC_MESSAGES/grep.mo
installing zh_CN.gmo as /tmp/test/share/locale/zh_CN/LC_MESSAGES/grep.mo
installing zh_TW.gmo as /tmp/test/share/locale/zh_TW/LC_MESSAGES/grep.mo
if test "grep" = "gettext-tools"; then \
  /bin/mkdir -p /tmp/test/share/gettext/po; \
  for file in Makefile.in.in remove-potcdate.sin quot.sed boldquot.sed en@quot.header en@boldquot.header insert-header.sin Rules-quot   Makevars.template; do \
    /usr/bin/install -c -m 644 ./$file \
		    /tmp/test/share/gettext/po/$file; \
  done; \
  for file in Makevars; do \
    rm -f /tmp/test/share/gettext/po/$file; \
  done; \
else \
  : ; \
fi
make[1]: Leaving directory '/tmp/grep-3.3/po'
Making install in lib
make[1]: Entering directory '/tmp/grep-3.3/lib'
make  install-am
make[2]: Entering directory '/tmp/grep-3.3/lib'
make[3]: Entering directory '/tmp/grep-3.3/lib'
make[3]: Nothing to be done for 'install-exec-am'.
make[3]: Nothing to be done for 'install-data-am'.
make[3]: Leaving directory '/tmp/grep-3.3/lib'
make[2]: Leaving directory '/tmp/grep-3.3/lib'
make[1]: Leaving directory '/tmp/grep-3.3/lib'
Making install in doc
make[1]: Entering directory '/tmp/grep-3.3/doc'
make[2]: Entering directory '/tmp/grep-3.3/doc'
make[2]: Nothing to be done for 'install-exec-am'.
 /bin/mkdir -p '/tmp/test/share/info'
 /usr/bin/install -c -m 644 ./grep.info '/tmp/test/share/info'
 install-info --info-dir='/tmp/test/share/info' '/tmp/test/share/info/grep.info'
 /bin/mkdir -p '/tmp/test/share/man/man1'
 /usr/bin/install -c -m 644 grep.1 fgrep.1 egrep.1 '/tmp/test/share/man/man1'
make[2]: Leaving directory '/tmp/grep-3.3/doc'
make[1]: Leaving directory '/tmp/grep-3.3/doc'
Making install in src
make[1]: Entering directory '/tmp/grep-3.3/src'
make[2]: Entering directory '/tmp/grep-3.3/src'
 /bin/mkdir -p '/tmp/test/bin'
  /usr/bin/install -c grep '/tmp/test/bin'
 /bin/mkdir -p '/tmp/test/bin'
 /usr/bin/install -c egrep fgrep '/tmp/test/bin'
make[2]: Nothing to be done for 'install-data-am'.
make[2]: Leaving directory '/tmp/grep-3.3/src'
make[1]: Leaving directory '/tmp/grep-3.3/src'
Making install in tests
make[1]: Entering directory '/tmp/grep-3.3/tests'
make[2]: Entering directory '/tmp/grep-3.3/tests'
make[2]: Nothing to be done for 'install-exec-am'.
make[2]: Nothing to be done for 'install-data-am'.
make[2]: Leaving directory '/tmp/grep-3.3/tests'
make[1]: Leaving directory '/tmp/grep-3.3/tests'
Making install in gnulib-tests
make[1]: Entering directory '/tmp/grep-3.3/gnulib-tests'
make  install-recursive
make[2]: Entering directory '/tmp/grep-3.3/gnulib-tests'
Making install in .
make[3]: Entering directory '/tmp/grep-3.3/gnulib-tests'
make[4]: Entering directory '/tmp/grep-3.3/gnulib-tests'
make[4]: Nothing to be done for 'install-exec-am'.
make[4]: Nothing to be done for 'install-data-am'.
make[4]: Leaving directory '/tmp/grep-3.3/gnulib-tests'
make[3]: Leaving directory '/tmp/grep-3.3/gnulib-tests'
make[2]: Leaving directory '/tmp/grep-3.3/gnulib-tests'
make[1]: Leaving directory '/tmp/grep-3.3/gnulib-tests'
make[1]: Entering directory '/tmp/grep-3.3'
make[2]: Entering directory '/tmp/grep-3.3'
make[2]: Nothing to be done for 'install-exec-am'.
make[2]: Nothing to be done for 'install-data-am'.
make[2]: Leaving directory '/tmp/grep-3.3'
make[1]: Leaving directory '/tmp/grep-3.3'
$ echo ba | LANG=ja_JP.eucjp /tmp/test/bin/grep -F -w a
ba

The bug still appears.
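
For comparison, the behavior that -F -w is expected to have can be seen in the C locale, where a fixed string matches only when it is not adjacent to another word character. The following is a minimal sketch, not part of the original report; the helper name expect_w is hypothetical, and it assumes any POSIX-style grep that supports -w.

```shell
#!/bin/sh
# Sketch: what -F -w is expected to do, shown in the C locale.
# expect_w is a hypothetical helper, not something from the grep sources.
expect_w() {
  # $1 = input line, $2 = fixed-string pattern
  if printf '%s\n' "$1" | LC_ALL=C grep -F -w -e "$2" >/dev/null; then
    echo "match"
  else
    echo "no match"
  fi
}

expect_w 'ba'  a   # "a" directly follows the word character "b"
expect_w 'b a' a   # "a" stands alone as a word
```

The first call should report no match and the second a match; the report above is about the first case wrongly matching under ja_JP.eucjp.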




Information forwarded to bug-grep <at> gnu.org:
bug#38223; Package grep. (Sat, 16 Nov 2019 17:35:03 GMT)

Message #19 received at 38223 <at> debbugs.gnu.org:

From: Jim Meyering <jim <at> meyering.net>
To: "NIDE, Naoyuki" <nide <at> ics.nara-wu.ac.jp>
Cc: 38223 <at> debbugs.gnu.org
Subject: Re: bug#38223: grep >=2.28 cannot handle -wF correctly under
 LANG=ja_JP.eucjp
Date: Sat, 16 Nov 2019 09:34:06 -0800
On Fri, Nov 15, 2019 at 11:54 AM NIDE, Naoyuki <nide <at> ics.nara-wu.ac.jp> wrote:
> echo ba | LANG=ja_JP.eucjp grep -F -w a
> outputs ba, but should output nothing.

Thank you for that report. It is reproducible for me on Fedora 30.
Here is a fix, but the commit is incomplete: I am still in the process
of preparing a test case and the NEWS entry.
Will also fix the erroneous comment just below in a separate patch.
[grep-Fw-mb-non-utf8.diff (application/octet-stream, attachment)]

Reply sent to Jim Meyering <jim <at> meyering.net>:
You have taken responsibility. (Sat, 16 Nov 2019 19:01:03 GMT)

Notification sent to "NIDE, Naoyuki" <nide <at> ics.nara-wu.ac.jp>:
bug acknowledged by developer. (Sat, 16 Nov 2019 19:01:03 GMT)

Message #24 received at 38223-done <at> debbugs.gnu.org:

From: Jim Meyering <jim <at> meyering.net>
To: "NIDE, Naoyuki" <nide <at> ics.nara-wu.ac.jp>
Cc: 38223-done <at> debbugs.gnu.org
Subject: Re: bug#38223: grep >=2.28 cannot handle -wF correctly under
 LANG=ja_JP.eucjp
Date: Sat, 16 Nov 2019 11:00:38 -0800
On Sat, Nov 16, 2019 at 9:34 AM Jim Meyering <jim <at> meyering.net> wrote:
> On Fri, Nov 15, 2019 at 11:54 AM NIDE, Naoyuki <nide <at> ics.nara-wu.ac.jp> wrote:
> > echo ba | LANG=ja_JP.eucjp grep -F -w a
> > outputs ba, but should output nothing.
>
> Thank you for that report. It is reproducible for me on Fedora 30.
> Here is a fix, but the commit is incomplete: I am still in the process
> of preparing a test case and the NEWS entry.
> Will also fix the erroneous comment just below in a separate patch.

I've pushed the complete fix here:
https://git.savannah.gnu.org/cgit/grep.git/commit/?id=090a4dbe03951e427f03f83be424caacc3303799

I've also fixed the comment and a variable name and updated gnulib to latest.




Information forwarded to bug-grep <at> gnu.org:
bug#38223; Package grep. (Sun, 17 Nov 2019 00:02:02 GMT)

Message #27 received at 38223 <at> debbugs.gnu.org:

From: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
To: 38223 <at> debbugs.gnu.org
Subject: Re: bug#38223: grep >=2.28 cannot handle -wF correctly under
 LANG=ja_JP.eucjp
Date: Sun, 17 Nov 2019 09:01:21 +0900
On Sat, 16 Nov 2019 11:00:38 -0800
Jim Meyering <jim <at> meyering.net> wrote:

> I've pushed the complete fix here:
> https://git.savannah.gnu.org/cgit/grep.git/commit/?id=090a4dbe03951e427f03f83be424caacc3303799
> 
> I've also fixed the comment and a variable name and updated gnulib to latest.

After applying the patch, I found an extreme slowdown.

  yes $(printf %040d 0) | head -1000000 >k
  time -p env LC_ALL=ja_JP.eucjp src/grep -F -w 0 k

The first patch fixes it, and the second improves performance further.
[0001-grep-fix-performance-degration-with-previous-patch.patch (text/plain, attachment)]
[0002-grep-performance-improvement-for-grep-F-w-in-non-UTF.patch (text/plain, attachment)]
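
The reproduction above can be wrapped in a small script that runs the same search in the C locale and in ja_JP.eucjp, so that the two elapsed times can be compared. This is only a sketch under stated assumptions: the GREP and LINES_N variables are hypothetical names, the default line count is smaller than the 1000000 used in the report, timing uses whole seconds from date, and the ja_JP.eucjp locale must be installed for the second run to be meaningful.

```shell
#!/bin/sh
# Sketch: compare grep -F -w run time in the C locale and in ja_JP.eucjp.
# GREP and LINES_N are hypothetical knobs; the report used 1000000 lines.
GREP=${GREP:-grep}
LINES_N=${LINES_N:-100000}

input=$(mktemp) || exit 1
trap 'rm -f "$input"' EXIT

# Each line is 40 zeros, so "0" never matches as a whole word.
yes "$(printf %040d 0)" | head -n "$LINES_N" > "$input"

for loc in C ja_JP.eucjp; do
  start=$(date +%s)
  LC_ALL="$loc" "$GREP" -F -w 0 "$input"
  status=$?
  end=$(date +%s)
  # grep exits with status 1 when no line is selected.
  echo "$loc: status=$status elapsed=$((end - start))s"
done
```

To test a freshly built binary, something like GREP=src/grep LINES_N=1000000 can be set in the environment before running the script.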

Information forwarded to bug-grep <at> gnu.org:
bug#38223; Package grep. (Sun, 17 Nov 2019 04:37:02 GMT)

Message #30 received at 38223 <at> debbugs.gnu.org:

From: Jim Meyering <jim <at> meyering.net>
To: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
Cc: 38223 <at> debbugs.gnu.org
Subject: Re: bug#38223: grep >=2.28 cannot handle -wF correctly under
 LANG=ja_JP.eucjp
Date: Sat, 16 Nov 2019 20:36:18 -0800
On Sat, Nov 16, 2019 at 4:02 PM Norihiro Tanaka <noritnk <at> kcn.ne.jp> wrote:
> On Sat, 16 Nov 2019 11:00:38 -0800
> Jim Meyering <jim <at> meyering.net> wrote:
>
> > I've pushed the complete fix here:
> > https://git.savannah.gnu.org/cgit/grep.git/commit/?id=090a4dbe03951e427f03f83be424caacc3303799
> >
> > I've also fixed the comment and a variable name and updated gnulib to latest.
>
> After applying the patch, I found an extreme slowdown.
>
>   yes $(printf %040d 0) | head -1000000 >k
>   time -p env LC_ALL=ja_JP.eucjp src/grep -F -w 0 k
>
> The first patch fixes it, and the second improves performance further.

Nice. Thank you!
Those look fine, at first glance, modulo these minor changes that I
expect to merge into the latter:
[grep-touchup.diff (application/octet-stream, attachment)]

Information forwarded to bug-grep <at> gnu.org:
bug#38223; Package grep. (Sun, 17 Nov 2019 06:47:01 GMT)

Message #33 received at 38223 <at> debbugs.gnu.org:

From: Jim Meyering <jim <at> meyering.net>
To: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
Cc: 38223 <at> debbugs.gnu.org
Subject: Re: bug#38223: grep >=2.28 cannot handle -wF correctly under
 LANG=ja_JP.eucjp
Date: Sat, 16 Nov 2019 22:45:56 -0800
On Sat, Nov 16, 2019 at 8:36 PM Jim Meyering <jim <at> meyering.net> wrote:
> On Sat, Nov 16, 2019 at 4:02 PM Norihiro Tanaka <noritnk <at> kcn.ne.jp> wrote:
> > On Sat, 16 Nov 2019 11:00:38 -0800
> > Jim Meyering <jim <at> meyering.net> wrote:
> >
> > > I've pushed the complete fix here:
> > > https://git.savannah.gnu.org/cgit/grep.git/commit/?id=090a4dbe03951e427f03f83be424caacc3303799
> > >
> > > I've also fixed the comment and a variable name and updated gnulib to latest.
> >
> > After applying the patch, I found an extreme slowdown.
> >
> >   yes $(printf %040d 0) | head -1000000 >k
> >   time -p env LC_ALL=ja_JP.eucjp src/grep -F -w 0 k
> >
> > The first patch fixes it, and the second improves performance further.
>
> Nice. Thank you!
> Those look fine, at first glance, modulo these minor changes that I
> expect to merge into the latter:

Thanks again, Norihiro Tanaka.
I have also adjusted commit log wording and added comments for the new
mbclen parameter. I've attached the two commits that I expect to push
tomorrow, assuming no objection.
[grep-Fw-performance-fix.diff (application/octet-stream, attachment)]

Information forwarded to bug-grep <at> gnu.org:
bug#38223; Package grep. (Sun, 17 Nov 2019 08:05:02 GMT)

Message #36 received at 38223 <at> debbugs.gnu.org:

From: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
To: Jim Meyering <jim <at> meyering.net>
Cc: 38223 <at> debbugs.gnu.org
Subject: Re: bug#38223: grep >=2.28 cannot handle -wF correctly under
 LANG=ja_JP.eucjp
Date: Sun, 17 Nov 2019 17:04:18 +0900
On Sat, 16 Nov 2019 22:45:56 -0800
Jim Meyering <jim <at> meyering.net> wrote:

> On Sat, Nov 16, 2019 at 8:36 PM Jim Meyering <jim <at> meyering.net> wrote:
> > On Sat, Nov 16, 2019 at 4:02 PM Norihiro Tanaka <noritnk <at> kcn.ne.jp> wrote:
> > > On Sat, 16 Nov 2019 11:00:38 -0800
> > > Jim Meyering <jim <at> meyering.net> wrote:
> > >
> > > > I've pushed the complete fix here:
> > > > https://git.savannah.gnu.org/cgit/grep.git/commit/?id=090a4dbe03951e427f03f83be424caacc3303799
> > > >
> > > > I've also fixed the comment and a variable name and updated gnulib to latest.
> > >
> > > After applying the patch, I found an extreme slowdown.
> > >
> > >   yes $(printf %040d 0) | head -1000000 >k
> > >   time -p env LC_ALL=ja_JP.eucjp src/grep -F -w 0 k
> > >
> > > The first patch fixes it, and the second improves performance further.
> >
> > Nice. Thank you!
> > Those look fine, at first glance, modulo these minor changes that I
> > expect to merge into the latter:
> 
> Thanks again, Norihiro Tanaka.
> I have also adjusted commit log wording and added comments for the new
> mbclen parameter. I've attached the two commits that I expect to push
> tomorrow, assuming no objection.

Thanks for the adjustment.  I have no objection to the content.





Information forwarded to bug-grep <at> gnu.org:
bug#38223; Package grep. (Sun, 17 Nov 2019 15:17:01 GMT)

Message #39 received at 38223 <at> debbugs.gnu.org:

From: Jim Meyering <jim <at> meyering.net>
To: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
Cc: 38223 <at> debbugs.gnu.org
Subject: Re: bug#38223: grep >=2.28 cannot handle -wF correctly under
 LANG=ja_JP.eucjp
Date: Sun, 17 Nov 2019 07:16:24 -0800
On Sun, Nov 17, 2019 at 12:04 AM Norihiro Tanaka <noritnk <at> kcn.ne.jp> wrote:
> On Sat, 16 Nov 2019 22:45:56 -0800
> Jim Meyering <jim <at> meyering.net> wrote:
>
> > On Sat, Nov 16, 2019 at 8:36 PM Jim Meyering <jim <at> meyering.net> wrote:
> > > On Sat, Nov 16, 2019 at 4:02 PM Norihiro Tanaka <noritnk <at> kcn.ne.jp> wrote:
> > > > On Sat, 16 Nov 2019 11:00:38 -0800
> > > > Jim Meyering <jim <at> meyering.net> wrote:
> > > >
> > > > > I've pushed the complete fix here:
> > > > > https://git.savannah.gnu.org/cgit/grep.git/commit/?id=090a4dbe03951e427f03f83be424caacc3303799
> > > > >
> > > > > I've also fixed the comment and a variable name and updated gnulib to latest.
> > > >
> > > > After applying the patch, I found an extreme slowdown.
> > > >
> > > >   yes $(printf %040d 0) | head -1000000 >k
> > > >   time -p env LC_ALL=ja_JP.eucjp src/grep -F -w 0 k
> > > >
> > > > The first patch fixes it, and the second improves performance further.
> > >
> > > Nice. Thank you!
> > > Those look fine, at first glance, modulo these minor changes that I
> > > expect to merge into the latter:
> >
> > Thanks again, Norihiro Tanaka.
> > I have also adjusted commit log wording and added comments for the new
> > mbclen parameter. I've attached the two commits that I expect to push
> > tomorrow, assuming no objection.
>
> Thanks for the adjustment.  I have no objection to the content.

Pushed both.




Information forwarded to bug-grep <at> gnu.org:
bug#38223; Package grep. (Sun, 17 Nov 2019 16:29:02 GMT)

Message #42 received at 38223 <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Jim Meyering <jim <at> meyering.net>
Cc: Norihiro Tanaka <noritnk <at> kcn.ne.jp>, 38223 <at> debbugs.gnu.org
Subject: Re: bug#38223: grep >=2.28 cannot handle -wF correctly under
 LANG=ja_JP.eucjp
Date: Sun, 17 Nov 2019 08:28:04 -0800
Thanks for fixing that. Although the patch says "[Bug#38223 introduced in grep 
3.0]", the original bug report is against grep 2.28 and later. Can I take it 
that we tried to fix the bug in 3.0 but the fix was incomplete?




Information forwarded to bug-grep <at> gnu.org:
bug#38223; Package grep. (Sun, 17 Nov 2019 18:57:01 GMT)

Message #45 received at 38223 <at> debbugs.gnu.org:

From: Jim Meyering <jim <at> meyering.net>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: Norihiro Tanaka <noritnk <at> kcn.ne.jp>, 38223 <at> debbugs.gnu.org
Subject: Re: bug#38223: grep >=2.28 cannot handle -wF correctly under
 LANG=ja_JP.eucjp
Date: Sun, 17 Nov 2019 10:55:59 -0800
On Sun, Nov 17, 2019 at 8:28 AM Paul Eggert <eggert <at> cs.ucla.edu> wrote:
> Thanks for fixing that. Although the patch says "[Bug#38223 introduced in grep
> 3.0]", the original bug report is against grep 2.28 and later. Can I take it
> that we tried to fix the bug in 3.0 but the fix was incomplete?

Thanks for noting that. I confirm that 2.10 through 2.27 are fine, and
that this does afflict 2.28, but see it affected no other release
until 3.0. Probably deserves more investigation.

$ for i in grep-*; do echo $i: $(echo ab | LC_CTYPE=ja_JP.eucjp $i/bin/grep -Fw b); done|sed 's,.*-,,'
2.10:
2.11:
2.12:
2.13:
2.14:
2.15:
2.16:
2.17:
2.18:
2.19:
2.20:
2.21:
2.22:
2.23:
2.24:
2.25:
2.26:
2.27:
2.28: ab
2.3:
2.4:
2.4.1:
2.4.2:
2.5:
2.5.1:
2.5.3:
2.5.4:
2.6:
2.6.1:
2.6.2:
2.6.3:
2.7:
2.8:
2.9:
3.0: ab
3.1: ab
3.2: ab
3.3: ab
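
Because the shell expands grep-* in lexical order, 2.28 is listed between 2.27 and 2.3 above. A variant that sorts the builds by version number may be easier to read. This is a sketch rather than the command that was actually run: it assumes a sort that supports -V (as GNU sort does) and one build per grep-VERSION directory with the binary at bin/grep; PREFIX_DIR and list_versions are hypothetical names.

```shell
#!/bin/sh
# Sketch: run the reproducer against each installed grep, in version order.
# Assumes each build lives in $PREFIX_DIR/grep-VERSION/bin/grep.
PREFIX_DIR=${PREFIX_DIR:-.}

list_versions() {
  # Print the VERSION part of every grep-VERSION directory, version-sorted.
  for d in "$PREFIX_DIR"/grep-*/; do
    [ -d "$d" ] || continue
    d=${d%/}
    echo "${d##*/grep-}"
  done | sort -V
}

list_versions | while read -r v; do
  # An empty result after the colon means "b" was not matched as a word.
  out=$(echo ab | LC_CTYPE=ja_JP.eucjp "$PREFIX_DIR/grep-$v/bin/grep" -Fw b)
  echo "$v: $out"
done
```

With this ordering the affected releases (2.28 and 3.0 onward) appear as one contiguous run at the end of the listing.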




Information forwarded to bug-grep <at> gnu.org:
bug#38223; Package grep. (Mon, 25 Nov 2019 12:04:01 GMT)

Message #48 received at 38223 <at> debbugs.gnu.org:

From: Jim Meyering <jim <at> meyering.net>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: Norihiro Tanaka <noritnk <at> kcn.ne.jp>, 38223 <at> debbugs.gnu.org
Subject: Re: bug#38223: grep >=2.28 cannot handle -wF correctly under
 LANG=ja_JP.eucjp
Date: Sun, 24 Nov 2019 08:43:08 -0800
On Sun, Nov 17, 2019 at 10:55 AM Jim Meyering <jim <at> meyering.net> wrote:
>
> On Sun, Nov 17, 2019 at 8:28 AM Paul Eggert <eggert <at> cs.ucla.edu> wrote:
> > Thanks for fixing that. Although the patch says "[Bug#38223 introduced in grep
> > 3.0]", the original bug report is against grep 2.28 and later. Can I take it
> > that we tried to fix the bug in 3.0 but the fix was incomplete?
>
> Thanks for noting that. I confirm that 2.10 through 2.27 are fine, and
> that this does afflict 2.28, but see it affected no other release
> until 3.0. Probably deserves more investigation.
>
> $ for i in grep-*; do echo $i: $(echo ab | LC_CTYPE=ja_JP.eucjp $i/bin/grep -Fw b); done|sed 's,.*-,,'

I've corrected NEWS.
Thanks!




Information forwarded to bug-grep <at> gnu.org:
bug#38223; Package grep. (Sat, 30 Nov 2019 22:43:01 GMT)

Message #51 received at 38223 <at> debbugs.gnu.org:

From: Jim Meyering <jim <at> meyering.net>
To: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
Cc: 38223 <at> debbugs.gnu.org
Subject: Re: bug#38223: grep >=2.28 cannot handle -wF correctly under
 LANG=ja_JP.eucjp
Date: Sun, 1 Dec 2019 06:42:29 +0800
On Sun, Nov 17, 2019 at 2:45 PM Jim Meyering <jim <at> meyering.net> wrote:
>
> On Sat, Nov 16, 2019 at 8:36 PM Jim Meyering <jim <at> meyering.net> wrote:
> > On Sat, Nov 16, 2019 at 4:02 PM Norihiro Tanaka <noritnk <at> kcn.ne.jp> wrote:
> > > On Sat, 16 Nov 2019 11:00:38 -0800
> > > Jim Meyering <jim <at> meyering.net> wrote:
> > >
> > > > I've pushed the complete fix here:
> > > > https://git.savannah.gnu.org/cgit/grep.git/commit/?id=090a4dbe03951e427f03f83be424caacc3303799
> > > >
> > > > I've also fixed the comment and a variable name and updated gnulib to latest.
> > >
> > > After applying the patch, I found an extreme slowdown.
> > >
> > >   yes $(printf %040d 0) | head -1000000 >k
> > >   time -p env LC_ALL=ja_JP.eucjp src/grep -F -w 0 k
> > >
> > > The first patch fixes it, and the second improves performance further.
> >
> > Nice. Thank you!
> > Those look fine, at first glance, modulo these minor changes that I
> > expect to merge into the latter:
>
> Thanks again, Norihiro Tanaka.
> I have also adjusted commit log wording and added comments for the new
> mbclen parameter. I've attached the two commits that I expect to push
> tomorrow, assuming no objection.

That performance regression deserved a test suite addition, so I've done this:

    tests: add test that would have detected -Fw perf regression
    * tests/mb-non-UTF8-perf-Fw: New file. Detect v3.3-22-g090a4db's
    performance regression.
    * tests/Makefile.am (TESTS): Add it.

Pushed as https://git.sv.gnu.org/cgit/grep.git/commit/?id=fdd45db167c9e5




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 29 Dec 2019 12:24:04 GMT)

This bug report was last modified 4 years and 91 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.