GNU bug report logs - #30574
Support different charsets

Please note: This is a static page, with minimal formatting, updated once a day.
Click here to see this page with the latest information and nicer formatting.

Package: diffutils; Reported by: Victor Porton <porton@HIDDEN>; dated Thu, 22 Feb 2018 16:21:02 UTC; Maintainer for diffutils is bug-diffutils@HIDDEN.

Message received at submit <at> debbugs.gnu.org:


Message-ID: <1519316385.2225.34.camel@HIDDEN>
Subject: Support different charsets
From: Victor Porton <porton@HIDDEN>
To: bug-diffutils@HIDDEN
Date: Thu, 22 Feb 2018 18:19:45 +0200

1. `diff` should be able to compare a source and a destination file
that use different encodings and/or charsets.

Add options to specify the source encoding and the destination encoding.

2. The conversion to a common encoding should have an option not to
fail on invalid or unconvertible characters (compare //IGNORE in GNU
iconv, which discards them), instead replacing each such character
with a placeholder character. This is useful for comparing wrongly
encoded files.

3. More generally, we could add an option to pass every compared file
through a filter first. Item 1 could then be implemented by passing an
`iconv` or `recode` command as the filter.
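
To make the three items concrete, here is a rough sketch in Python. It is
only an illustration of the idea, not a proposal for how diffutils itself
should implement it. All function names (`decode_lenient`, `diff_filtered`,
`diff_encoded`) are hypothetical, and the filters are plain Python callables
standing in for an external `iconv` or `recode` command.

```python
import difflib


def decode_lenient(data: bytes, encoding: str) -> str:
    """Item 2: decode without failing; undecodable input becomes U+FFFD."""
    return data.decode(encoding, errors="replace")


def diff_filtered(a: bytes, b: bytes, filter_a, filter_b) -> list:
    """Item 3: pass each input through its own filter, then compare.

    filter_a and filter_b are assumed to be callables taking bytes and
    returning str (hypothetical interface for this sketch).
    """
    a_lines = filter_a(a).splitlines(keepends=True)
    b_lines = filter_b(b).splitlines(keepends=True)
    return list(difflib.unified_diff(a_lines, b_lines, "a", "b"))


def diff_encoded(a: bytes, enc_a: str, b: bytes, enc_b: str) -> list:
    """Item 1: per-file encodings, expressed as two decoding filters."""
    return diff_filtered(
        a,
        b,
        lambda data: decode_lenient(data, enc_a),
        lambda data: decode_lenient(data, enc_b),
    )


# The same text stored in two encodings compares as equal.
latin1_copy = "caf\u00e9\n".encode("latin-1")
utf8_copy = "caf\u00e9\n".encode("utf-8")
print(diff_encoded(latin1_copy, "latin-1", utf8_copy, "utf-8"))
```

Expressing item 1 in terms of item 3 keeps the comparison code unaware of
encodings: it only ever sees the text that the filters produce.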




Acknowledgement sent to Victor Porton <porton@HIDDEN>:
New bug report received and forwarded. Copy sent to bug-diffutils@HIDDEN. Full text available.
Report forwarded to bug-diffutils@HIDDEN:
bug#30574; Package diffutils. Full text available.
Last modified: Mon, 25 Nov 2019 12:00:02 UTC

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997 nCipher Corporation Ltd, 1994-97 Ian Jackson.