X-Loop: help-debbugs@HIDDEN
Subject: bug#36718: uniq treats distinct Korean characters equal
Resent-From: Felix Hamme <fhamme@HIDDEN>
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-coreutils@HIDDEN
Resent-Date: Thu, 18 Jul 2019 14:49:01 +0000
Resent-Message-ID: <handler.36718.B.156346133810329 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: report 36718
X-GNU-PR-Package: coreutils
X-GNU-PR-Keywords:
To: 36718 <at> debbugs.gnu.org
Cc: Gerhard Dittes <gerhard.dittes@HIDDEN>
X-Debbugs-Original-To: bug-coreutils@HIDDEN
Received: via spool by submit <at> debbugs.gnu.org id=B.156346133810329
(code B ref -1); Thu, 18 Jul 2019 14:49:01 +0000
Received: (at submit) by debbugs.gnu.org; 18 Jul 2019 14:48:58 +0000
Received: from localhost ([127.0.0.1]:54473 helo=debbugs.gnu.org)
by debbugs.gnu.org with esmtp (Exim 4.84_2)
(envelope-from <debbugs-submit-bounces <at> debbugs.gnu.org>)
id 1ho7iM-0002gX-0N
for submit <at> debbugs.gnu.org; Thu, 18 Jul 2019 10:48:58 -0400
Received: from lists.gnu.org ([209.51.188.17]:37545)
by debbugs.gnu.org with esmtp (Exim 4.84_2)
(envelope-from <fhamme@HIDDEN>) id 1ho75l-0001jP-K7
for submit <at> debbugs.gnu.org; Thu, 18 Jul 2019 10:09:06 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10]:46336)
by lists.gnu.org with esmtp (Exim 4.86_2)
(envelope-from <fhamme@HIDDEN>) id 1ho75k-00077Q-Os
for bug-coreutils@HIDDEN; Thu, 18 Jul 2019 10:09:05 -0400
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org
X-Spam-Level:
X-Spam-Status: No, score=-0.0 required=5.0 tests=BAYES_40,RCVD_IN_DNSWL_NONE
autolearn=disabled version=3.3.2
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
(envelope-from <fhamme@HIDDEN>) id 1ho75j-0002dL-Rv
for bug-coreutils@HIDDEN; Thu, 18 Jul 2019 10:09:04 -0400
Received: from moint.1and1.com ([212.227.15.8]:49344)
by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16)
(Exim 4.71) (envelope-from <fhamme@HIDDEN>)
id 1ho75j-0002cV-K3
for bug-coreutils@HIDDEN; Thu, 18 Jul 2019 10:09:03 -0400
Received: from [82.165.232.198] (helo=[10.21.18.246])
by mrint.1and1.com with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128)
(Exim 4.84_2) (envelope-from <fhamme@HIDDEN>)
id 1ho75d-0006pm-U6; Thu, 18 Jul 2019 16:08:58 +0200
From: Felix Hamme <fhamme@HIDDEN>
Openpgp: preference=signencrypt
Autocrypt: addr=fhamme@HIDDEN; prefer-encrypt=mutual; keydata=
mQINBFnU9gQBEACpRolVcloxAchzV386GmfEgvPIAKtPoFyVJW5E1vW7wLmZotB8Mx8JZ5uV
ek5Gz4vNNaSNdowlDFw563t8PPiE7cbZ3VKqPfUnHml9Ky/jYQjFmdkWu9ffdbcKlqHk3rsc
rdY5AzYjbnfuF0kJgXgwlm85csucCiwdBvGf4xvyqwx7u2kP0V5FoDbOpPRiSSNDBTR9dxT7
RJmM9D+P48x90G0E2sXRb7oAmLYzID/38slgtDGhtfTj80E5qJR1Uy78iY7yM8JADBHTxwa3
7ytW6lR2QcwTkm0nF5BHX47t5jbu/y7We4OQgr4koQo3Xc93ybtaMEIfVyCtD/r/x4jQ39cE
/Eu+yRqGKeiHB6dzndT+8MSwVsRZf7isDUHBNfUC7P3fe1zqH3wdC6UX83FBxLkRVd2Nxmay
OdaK3oggYPN8FL3TUOQhGL6QNGdMdxy98vFZvZZ23Wg/k5fRKTvA6r+RFZpFN04TWmfJq7W4
GoYZJ7zs7oPBksOWQAi63d+P3fQi2DreMyqBFUuhXpf3g6ZncS1jZ/yI/RLzjKw08t3UUD1+
CrzHDAgcx6v1y/WVWtXDEV+heH92HRlaokfNoMLxmX/T97Mcru5OSndhCVXVLizvFNR7x8F+
mjvKyNvN9KI9k7dKGAlaYB4T6bcCqGpgA8UO7Xh1wlb/p/jKvwARAQABtCdGZWxpeCBIYW1t
ZSA8ZmhhbW1lQHVuaXRlZC1pbnRlcm5ldC5kZT6JAj0EEwEIACcFAlnU9gQCGyMFCQlmAYAF
CwkIBwIGFQgJCgsCBBYCAwECHgECF4AACgkQCzkgEoCLFB+b6g/5AY6pYi52p9qh2oBZjvMW
1rX8+9vwrcVXEX1dJdr/ZbniHRguoYVog7V08zyZzAC84dzA6zGDYzc83leWUdOV9W17NMUY
8J4DdyMX7mmCY9nAmfviloR+D00PR0d2MjBTzQJePRe/487pVMueBARWuWhl27sGE5G0KLmF
Tngi4j9xx3yKveSwREwEMEsdA1T95ZB+tLCuiR1S8gqWw1gVtjMIEuA43ScF+bkSg4eP91zw
TIlbsvwIoe/+i/T4Hev8fdfKVPd4mHNvK0oNHLnaOdF5UdCHuSirNeJ56bjOyJvdHjjgLXsU
blBJX4zG0J7Pc/xtCIIFzgzmhdApy3QpdOQ/FInoqxfn+Zd8QD/9IZSvPGJlctsjuetcyQbi
09uQWsBl4tFWO7OQkkxcPkPBFDQ7beMFfZPL/zNjACzirG7wRQY8xzReBHsTWxolWM1vcn0u
IkrENdF3yeMBiti4zLrMnlw1hw+C2TpLwNEh6hP8UqR6qGEgZiwW8lOl63Heh8EJllZnKW88
0IY6K41e/VUkZN4yKIWg6K5/FNxoUo5+ZCA6i9dfcytPOvK/pQRQvAp6qlbqfKzqvClCfnQd
BWqt0QsK1c4BEcvvkwphwBWrRPdx4AiglVfiousKDOax7gjha/4HsRaAOi0YQZ8zHx8nPtmE
z2fAXjwQjgxpBvK5Ag0EWdT2BAEQAKXuBKgOh69+jMhP+DqADZbx323n1FxgaAZ2pwLvgic5
cq00G0cJbsf/WE5ri4yRpMS2EK37YqNrWttyFc0fopZD4tGoNmZY2qdzlEtGyfp4AxfUw+FW
78cfL3fSNPDvNMKORKSgWy0VcPFGl+5Fc+Gr4xhcPL6I/Pyg+U3NWlevTwdyvHacR5fV4loa
V+ULhaN+zR4ZSaYsLnthgEWTWzC9tqORQ/O34iLUiyd8+XjRmgXvZVYFgAm+5cOVv8xic1xp
1TRGQmOIKF1TFR3kcp0Xtv3U/lS8bCZrYek56Ptn56pUxbu52HEolJTifAEB2lpOBRDrHc0b
KueAP6rnirN02waPjK22B62Ujt66S0mHHHnovsvK+oCDjfe8ITe9g5nrm11EVSNiv9+MJh6g
FCdIHr8CaOhipZn9gLjY0PCMt9PPIVAmuvG6x3HABvse5PHH4tXHcvatKVhqFbmZu4ryRvER
27qNzyLGAcssrXWTzYuo7Q/mVFkk6pMHJd9uXa8fIYmrdVM9yQgj7Gs0TixcaHVtsL/StfCk
3P7CfflE6pnnc0yRTwC82XXRpK8GA7TUtEg38z2G0C1O0vH6C4sWuRwXbCq3HQb8wuW8ukxj
XFW8rUkbY/z9rneaSMsTjUJmZKgWDBrqlTVbRU+gVhO2RCoaJNnf4ZrmJkFof0RvABEBAAGJ
AiUEGAEIAA8FAlnU9gQCGwwFCQlmAYAACgkQCzkgEoCLFB9T3w//cczKxXaX+dlW5QDMSZUK
8qmedA7bi7O2fiUZbISddnjYPdR+BJDwTnYCBeKGS8eXlGeNVTi+X9LycV0LqDxJX+NLVUXD
QfCCtQyZhh/lojD+RTs/KBV6nXD6Ad2rXeYZFB6fr8ABv4YzSGI3z/pDbZpa/1Pop9VATS97
VPFUUY8p3fNbmNyndANyOrtyTpkeK5TF4euQDJXg9CJJKwcN60LYz63K+hjGmgXlhMqEgBMJ
7s/ywQCA4gY4RLVeUZK9ZisRixqoHcza3GsgFdFaImU2l3KK7vwdff2QXC7IWWWgnIHk+U2Y
mZ5qU3QUXsv80b24JQzGUT/MnwFx7D56eICNqzIxe77NGi8XGCLMtBb9ZA5jFvd0lBeOm9L7
F29ta1l8z0maHttlR1g+FufV8a2yZF5vHL5jBLJ+6WJfqMvEF3lBjtv9KGltkr2YDhs14Jap
yRGAwD3J7Rzq1AjEAa6qUdNXbTLXRofabWFQS5NQ0V8iEuMQ3jSNiuS3RnXyNphXR3og6myU
u+uP5vt2mKLjADeywl3tufDqkKXZ2IgAvKwccJaMRkZtcmbMG1oN6pb5Z/WENCwNsXNm0DvF
+54VhvMowEeOihiTuvrqGcIpkt2ZpCvQvMbtsYqXIJfsxQOgQnjbVNxyeop9gLOeTi88BHmz
WvwzPrxf0Ensy7U=
Message-ID: <b63c29f8-dbed-b445-6ab9-ad0d0872481d@HIDDEN>
Date: Thu, 18 Jul 2019 16:08:57 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101
Thunderbird/60.8.0
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------D989C0C3F5A783E7E6CBFD03"
Content-Language: en-US
X-Virus-Scanned: ClamAV@mvs-ha-bs
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic]
X-Received-From: 212.227.15.8
X-Spam-Score: -1.4 (-)
X-Mailman-Approved-At: Thu, 18 Jul 2019 10:48:55 -0400
X-BeenThere: debbugs-submit <at> debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id: <debbugs-submit.debbugs.gnu.org>
List-Unsubscribe: <https://debbugs.gnu.org/cgi-bin/mailman/options/debbugs-submit>,
<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=unsubscribe>
List-Archive: <https://debbugs.gnu.org/cgi-bin/mailman/private/debbugs-submit/>
List-Post: <mailto:debbugs-submit <at> debbugs.gnu.org>
List-Help: <mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=help>
List-Subscribe: <https://debbugs.gnu.org/cgi-bin/mailman/listinfo/debbugs-submit>,
<mailto:debbugs-submit-request <at> debbugs.gnu.org?subject=subscribe>
Errors-To: debbugs-submit-bounces <at> debbugs.gnu.org
Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
X-Spam-Score: -2.4 (--)
This is a multi-part message in MIME format.
--------------D989C0C3F5A783E7E6CBFD03
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Dear all,
I found that uniq treats some distinct Korean characters as equal and
counts them as duplicates. Specifically, this happens with the
characters 프 (U+D504) and 틀 (U+D2C0).
An example (input, expected output, actual output) is in the
attachment.
I tested this with uniq (GNU coreutils) 8.30.
Greetings
Felix Hamme
--------------D989C0C3F5A783E7E6CBFD03
Content-Type: application/gzip;
name="uniq-korean-characters-bug.tar.gz"
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
filename="uniq-korean-characters-bug.tar.gz"
H4sIAA59MF0AA+3Wv07CQADH8c48xb3A0bvrXevrVCxKRIqlTRgdiiZi4uKIiSzMTYyRwfgu
XfnzDrY1EgylloTUiL/PwDW9ElqOb2nQaV3Sc9dz7A5tnNme3fAdr0ePg1Nd2xeWsJRKR24p
tj5+0bihuGSWaViGxrgSnGlE7e0MCgQ93/YI0ZpOu9UvOO6n+T8q2L7+m3uTIbDb1A38buDT
pude0Oz9DXpUN1jd7/u5n5EusCnl1vXnyW/j2/oL07SkRlglX8D/Xn+SEWT5EM7Ho8XT4/J2
uri+m0UjvfY5x8lyeDWfRLPX6XzytojeZ9NIz9m1OjwOh/FgHIf32Wuy/RKHN/HgWa/99sXC
hr30X1h/if7Fqn9lSDPtXybT6L8Ced2X7B2dH4Cd+nf6XSfZOtm4AxTEr5X5/+er/oUQWf9M
oP8qlOwf94QDtVP/rU5WveuViH5N+f5NxUX6/C+FwPN/JdA/AAAAAAAAAAAAAMDh+QC5PrXF
ACgAAA==
--------------D989C0C3F5A783E7E6CBFD03--
X-Loop: help-debbugs@HIDDEN
Subject: bug#36718: uniq treats distinct Korean characters equal
Resent-From: Paul Eggert <eggert@HIDDEN>
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-coreutils@HIDDEN
Resent-Date: Thu, 18 Jul 2019 22:44:02 +0000
Resent-Message-ID: <handler.36718.B36718.156348983810513 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: followup 36718
X-GNU-PR-Package: coreutils
X-GNU-PR-Keywords:
To: Felix Hamme <fhamme@HIDDEN>, 36718 <at> debbugs.gnu.org
Cc: Gerhard Dittes <gerhard.dittes@HIDDEN>
References: <b63c29f8-dbed-b445-6ab9-ad0d0872481d@HIDDEN>
From: Paul Eggert <eggert@HIDDEN>
Organization: UCLA Computer Science Department
Message-ID: <1395fa6c-9741-c5af-e9b0-36d2677b7cc4@HIDDEN>
Date: Thu, 18 Jul 2019 15:43:48 -0700
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101
Thunderbird/60.8.0
MIME-Version: 1.0
In-Reply-To: <b63c29f8-dbed-b445-6ab9-ad0d0872481d@HIDDEN>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 7bit
X-Spam-Score: -2.3 (--)
uniq just calls strcoll, and if strcoll (A, B) returns 0 then uniq assumes the
lines are equal. So my guess is that your problem has something to do with
strcoll, not with coreutils per se.
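
To check whether strcoll is the cause, one can compare the two characters directly under different collation locales. The following is a minimal sketch in Python, whose `locale.strcoll` wraps the C library's collation routine; the helper name `collate_equal` and the list of locales tried are illustrative assumptions, and the result for any locale other than "C" depends on the locale data installed on the system.

```python
import locale


def collate_equal(a, b, locale_name="C"):
    """Return True if strcoll reports a and b as equal under locale_name.

    Returns None when that locale is not installed on this system.
    (Hypothetical helper, for illustration only.)
    """
    # Calling setlocale without a value only queries the current setting.
    saved = locale.setlocale(locale.LC_COLLATE)
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        return None
    try:
        # A result of 0 is what uniq would interpret as "same line".
        return locale.strcoll(a, b) == 0
    finally:
        # setlocale changes process-wide state, so restore it.
        locale.setlocale(locale.LC_COLLATE, saved)


if __name__ == "__main__":
    # The two characters named in the report.
    for name in ("C", "de_DE.utf8", "ko_KR.utf8"):
        print(name, collate_equal("프", "틀", name))
```

If a locale prints True here, uniq run under that locale would be expected to merge the two lines as well.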
X-Loop: help-debbugs@HIDDEN
Subject: bug#36718: uniq treats distinct Korean characters equal
Resent-From: Felix Hamme <fhamme@HIDDEN>
Original-Sender: "Debbugs-submit" <debbugs-submit-bounces <at> debbugs.gnu.org>
Resent-CC: bug-coreutils@HIDDEN
Resent-Date: Fri, 19 Jul 2019 11:58:02 +0000
Resent-Message-ID: <handler.36718.B36718.156353748116041 <at> debbugs.gnu.org>
Resent-Sender: help-debbugs@HIDDEN
X-GNU-PR-Message: followup 36718
X-GNU-PR-Package: coreutils
X-GNU-PR-Keywords:
To: Paul Eggert <eggert@HIDDEN>, 36718 <at> debbugs.gnu.org
Cc: Gerhard Dittes <gerhard.dittes@HIDDEN>
References: <b63c29f8-dbed-b445-6ab9-ad0d0872481d@HIDDEN>
<1395fa6c-9741-c5af-e9b0-36d2677b7cc4@HIDDEN>
From: Felix Hamme <fhamme@HIDDEN>
Message-ID: <7476770c-e8c9-a8fc-4564-1613be9c9a9f@HIDDEN>
Date: Fri, 19 Jul 2019 12:18:32 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101
Thunderbird/60.8.0
MIME-Version: 1.0
In-Reply-To: <1395fa6c-9741-c5af-e9b0-36d2677b7cc4@HIDDEN>
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 8bit
X-Virus-Scanned: ClamAV@mvs-ha-bs
X-Spam-Score: 0.0 (/)
X-Mailman-Approved-At: Fri, 19 Jul 2019 07:57:59 -0400
Thanks @Paul Eggert, it seems this isn't a bug at all.
My locale (de_DE.utf8) appears to lack collation definitions for the
Korean characters mentioned. After setting my system language to Korean
(ko_KR.utf8), uniq produces the expected output.
For my purposes, I'll set LC_COLLATE=C in my environment, which forces
byte-wise comparison and should work for all languages.
Admittedly, I could have found this by searching:
https://unix.stackexchange.com/questions/373848/why-does-uniq-think-%E3%81%82%E3%81%84-and-%E3%81%84%E3%81%82-are-the-same
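
The LC_COLLATE=C workaround can also be limited to a single uniq invocation instead of the whole environment. Below is a rough sketch, assuming a `uniq` binary is on the PATH; the wrapper name `uniq_bytewise` and the sample input are made up for illustration and are not the contents of the attachment. Note that LC_ALL, when set, takes precedence over LC_COLLATE, so the sketch removes it from the child's environment.

```python
import os
import subprocess


def uniq_bytewise(data: bytes) -> bytes:
    """Run uniq on data with LC_COLLATE=C (hypothetical wrapper)."""
    env = dict(os.environ)
    # LC_ALL, if present, would override the LC_COLLATE setting below.
    env.pop("LC_ALL", None)
    env["LC_COLLATE"] = "C"
    result = subprocess.run(
        ["uniq"],
        input=data,
        env=env,
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Illustrative input: the two characters from the report, one per line.
    sample = "프\n틀\n".encode("utf-8")
    print(uniq_bytewise(sample).decode("utf-8"), end="")
```

With byte-wise comparison the two lines differ, so both should survive; only adjacent lines with identical bytes are collapsed.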