Assaf Gordon <assafgordon@HIDDEN>
to control <at> debbugs.gnu.org
.
Assaf Gordon <assafgordon@HIDDEN>
to control <at> debbugs.gnu.org
.
Received: (at 11187) by debbugs.gnu.org; 6 Apr 2012 06:35:35 +0000
From: Jim Meyering <jim@HIDDEN>
To: Vladimir 'φ-coder/phcoder' Serbinenko <phcoder@HIDDEN>
Cc: 11187 <at> debbugs.gnu.org
Subject: Re: bug#11187: [PATCH] Fix incorrect width handling of multibyte characters in fmt
Date: Fri, 06 Apr 2012 08:34:46 +0200
Message-ID: <87hawxpdvd.fsf@HIDDEN>
In-Reply-To: <4F7E240F.3030401@HIDDEN>
References: <4F7DE02D.9050106@HIDDEN> <87d37mq9lz.fsf@HIDDEN> <4F7E240F.3030401@HIDDEN>

Vladimir 'φ-coder/phcoder' Serbinenko wrote:
> On 05.04.2012 21:09, Jim Meyering wrote:
>> Vladimir "'φ-coder/phcoder'" Serbinenko wrote:
>>> Currently fmt assumes that 1 byte= 1 column which creates wrongly
>>> formatted strings. Attached patch fixes it
>> Hi Vlad,
>>
>> Thank you for contributing.
>> This is a large enough change that we'll need an FSF copyright
>> assignment from you. If you haven't already sent in the one for
>> gnulib, please just add coreutils to the list of affected projects.
>> (you can do up to 4 projects at a time)
> Ok, will do so. I'll also wait till more or less definitive version is
> ready for gnulib before updating the one for coreutils.
> Can I add TP in the same time?

Translation Project?
I don't know if they use the same forms/addresses.
Please ask coordinator@HIDDEN to be sure.
bug-coreutils@HIDDEN
:bug#11187
; Package coreutils
.
Received: (at 11187) by debbugs.gnu.org; 5 Apr 2012 23:01:27 +0000
From: Vladimir 'φ-coder/phcoder' Serbinenko <phcoder@HIDDEN>
To: Jim Meyering <jim@HIDDEN>
Cc: 11187 <at> debbugs.gnu.org
Subject: Re: bug#11187: [PATCH] Fix incorrect width handling of multibyte characters in fmt
Date: Fri, 06 Apr 2012 01:00:31 +0200
Message-ID: <4F7E240F.3030401@HIDDEN>
In-Reply-To: <87d37mq9lz.fsf@HIDDEN>
References: <4F7DE02D.9050106@HIDDEN> <87d37mq9lz.fsf@HIDDEN>

On 05.04.2012 21:09, Jim Meyering wrote:
> Vladimir "'φ-coder/phcoder'" Serbinenko wrote:
>> Currently fmt assumes that 1 byte= 1 column which creates wrongly
>> formatted strings. Attached patch fixes it
> Hi Vlad,
>
> Thank you for contributing.
> This is a large enough change that we'll need an FSF copyright
> assignment from you. If you haven't already sent in the one for
> gnulib, please just add coreutils to the list of affected projects.
> (you can do up to 4 projects at a time)
Ok, will do so. I'll also wait till more or less definitive version is
ready for gnulib before updating the one for coreutils.
Can I add TP in the same time?

-- 
Regards
Vladimir 'φ-coder/phcoder' Serbinenko

[Attachment: signature.asc (OpenPGP digital signature)]
bug-coreutils@HIDDEN
:bug#11187
; Package coreutils
.
Received: (at 11187) by debbugs.gnu.org; 5 Apr 2012 19:09:55 +0000
From: Jim Meyering <jim@HIDDEN>
To: Vladimir 'φ-coder/phcoder' Serbinenko <phcoder@HIDDEN>
Cc: 11187 <at> debbugs.gnu.org
Subject: Re: bug#11187: [PATCH] Fix incorrect width handling of multibyte characters in fmt
Date: Thu, 05 Apr 2012 21:09:12 +0200
Message-ID: <87d37mq9lz.fsf@HIDDEN>
In-Reply-To: <4F7DE02D.9050106@HIDDEN>
References: <4F7DE02D.9050106@HIDDEN>

Vladimir "'φ-coder/phcoder'" Serbinenko wrote:
> Currently fmt assumes that 1 byte= 1 column which creates wrongly
> formatted strings. Attached patch fixes it

Hi Vlad,

Thank you for contributing.
This is a large enough change that we'll need an FSF copyright
assignment from you. If you haven't already sent in the one for
gnulib, please just add coreutils to the list of affected projects.
(you can do up to 4 projects at a time)

Here are a few suggested adjustments:

It'd be great to have a test-suite addition that fails without
your patch, yet succeeds with it. If you simply provide small
sample input/output pairs along with a selected locale name, we can
convert that to an actual test suite script for you.

Also, since this is a NEWS-worthy change, it is customary to add
an entry in the NEWS file, too. I'd put this in a section
entitled "Improvements".

> diff --git a/src/fmt.c b/src/fmt.c
> index 89d13a6..56f7c0b 100644
> --- a/src/fmt.c
> +++ b/src/fmt.c
> @@ -20,6 +20,7 @@
>  #include <stdio.h>
>  #include <sys/types.h>
>  #include <getopt.h>
> +#include <wchar.h>
>
>  /* Redefine. Otherwise, systems (Unicos for one) with headers that define
>     it to be a type get syntax errors for the variable declaration below. */
> @@ -135,6 +136,7 @@ struct Word
>
>    const char *text;        /* the text of the word */
>    int length;              /* length of this word */
> +  int width;

Please don't follow the bad example there.
This value is always unsigned, so it is better to use size_t,
to match the type of your new get_display_width function.

>    int space;               /* the size of the following space */
>    unsigned int paren:1;    /* starts with open paren */
>    unsigned int period:1;   /* ends in [.?!])* */
> @@ -259,6 +261,42 @@ static int next_prefix_indent;
>     paragraphs chosen by fmt_paragraph(). */
>  static int last_line_length;
>

Please add a comment saying what the function does/returns,
and naming/describing the arguments.

> +static size_t
> +get_display_width (const char *beg, const char *end)
> +{
> +  const char *ptr;
> +  size_t r = 0;
> +  mbstate_t ps;
> +
> +  memset (&ps, 0, sizeof (ps));

We prefer to initialize mbstate_t variables all on one line, like this:

    mbstate_t ps = { 0, };

> +  for (ptr = beg; *ptr && ptr < end; )
> +    {
> +      wchar_t wc;
> +      size_t s;

Oops. You've used TABs for indentation.
Note how mixing TABs and spaces makes the indentation look invalid.
Please use only spaces instead.

> +      s = mbrtowc (&wc, ptr, end - ptr, &ps);
> +      if (s == (size_t) -1)
> +        break;
> +      if (s == (size_t) -2)
> +        {
> +          ptr++;
> +          r++;
> +          continue;
> +        }
> +      if (wc == '\e' && ptr + 3 < end
> +          && ptr[1] == '[' && (ptr[2] == '0' || ptr[2] == '1')
> +          && ptr[3] == 'm')
> +        {
> +          ptr += 4;
> +          continue;
> +        }
> +      r += wcwidth (wc);
> +      ptr += s;
> +    }
> +  return r;
> +}
> +
>  void
>  usage (int status)
>  {
> @@ -669,7 +707,9 @@ get_line (FILE *f, int c)
>            c = getc (f);
>          }
>        while (c != EOF && !isspace (c));
> -      in_column += word_limit->length = wptr - word_limit->text;
> +      word_limit->length = wptr - word_limit->text;
> +      in_column += word_limit->width = get_display_width (word_limit->text,
> +                                                           wptr);
>        check_punctuation (word_limit);
>
>        /* Scan inter-word space. */
> @@ -871,13 +911,13 @@ fmt_paragraph (void)
>            if (w == word_limit)
>              break;
>
> -          len += (w - 1)->space + w->length;   /* w > start >= word */
> +          len += (w - 1)->space + w->width;    /* w > start >= word */
>          }
>        while (len < max_width);
>        start->best_cost = best + base_cost (start);
>      }
>
> -  word_limit->length = saved_length;
> +  word_limit->width = saved_length;
>  }
>
>  /* Return the constant component of the cost of breaking before the
> @@ -902,13 +942,13 @@ base_cost (WORD *this)
>        else if ((this - 1)->punct)
>          cost -= PUNCT_BONUS;
>        else if (this > word + 1 && (this - 2)->final)
> -        cost += WIDOW_COST ((this - 1)->length);
> +        cost += WIDOW_COST ((this - 1)->width);
>      }
>
>    if (this->paren)
>      cost -= PAREN_BONUS;
>    else if (this->final)
> -    cost += ORPHAN_COST (this->length);
> +    cost += ORPHAN_COST (this->width);
>
>    return cost;
>  }
> @@ -983,7 +1023,7 @@ put_word (WORD *w)
>    s = w->text;
>    for (n = w->length; n != 0; n--)
>      putchar (*s++);
> -  out_column += w->length;
> +  out_column += w->width;
>  }
>
>  /* Output to stdout SPACE spaces, or equivalent tabs. */
bug-coreutils@HIDDEN
:bug#11187
; Package coreutils
.
Received: (at submit) by debbugs.gnu.org; 5 Apr 2012 18:22:49 +0000
From: Vladimir 'φ-coder/phcoder' Serbinenko <phcoder@HIDDEN>
To: bug-coreutils@HIDDEN
Subject: [PATCH] Fix incorrect width handling of multibyte characters in fmt
Date: Thu, 05 Apr 2012 20:10:53 +0200
Message-ID: <4F7DE02D.9050106@HIDDEN>

Currently fmt assumes that 1 byte= 1 column which creates wrongly
formatted strings. Attached patch fixes it

-- 
Regards
Vladimir 'φ-coder/phcoder' Serbinenko

[Attachment: fmt_width.diff (text/x-diff)]

diff --git a/src/fmt.c b/src/fmt.c
index 89d13a6..56f7c0b 100644
--- a/src/fmt.c
+++ b/src/fmt.c
@@ -20,6 +20,7 @@
 #include <stdio.h>
 #include <sys/types.h>
 #include <getopt.h>
+#include <wchar.h>
 
 /* Redefine. Otherwise, systems (Unicos for one) with headers that define
    it to be a type get syntax errors for the variable declaration below. */
@@ -135,6 +136,7 @@ struct Word
 
   const char *text;        /* the text of the word */
   int length;              /* length of this word */
+  int width;
   int space;               /* the size of the following space */
   unsigned int paren:1;    /* starts with open paren */
   unsigned int period:1;   /* ends in [.?!])* */
@@ -259,6 +261,42 @@ static int next_prefix_indent;
    paragraphs chosen by fmt_paragraph(). */
 static int last_line_length;
 
+static size_t
+get_display_width (const char *beg, const char *end)
+{
+  const char *ptr;
+  size_t r = 0;
+  mbstate_t ps;
+
+  memset (&ps, 0, sizeof (ps));
+
+  for (ptr = beg; *ptr && ptr < end; )
+    {
+      wchar_t wc;
+      size_t s;
+
+      s = mbrtowc (&wc, ptr, end - ptr, &ps);
+      if (s == (size_t) -1)
+        break;
+      if (s == (size_t) -2)
+        {
+          ptr++;
+          r++;
+          continue;
+        }
+      if (wc == '\e' && ptr + 3 < end
+          && ptr[1] == '[' && (ptr[2] == '0' || ptr[2] == '1')
+          && ptr[3] == 'm')
+        {
+          ptr += 4;
+          continue;
+        }
+      r += wcwidth (wc);
+      ptr += s;
+    }
+  return r;
+}
+
 void
 usage (int status)
 {
@@ -669,7 +707,9 @@ get_line (FILE *f, int c)
           c = getc (f);
         }
       while (c != EOF && !isspace (c));
-      in_column += word_limit->length = wptr - word_limit->text;
+      word_limit->length = wptr - word_limit->text;
+      in_column += word_limit->width = get_display_width (word_limit->text,
+                                                          wptr);
       check_punctuation (word_limit);
 
       /* Scan inter-word space. */
@@ -871,13 +911,13 @@ fmt_paragraph (void)
           if (w == word_limit)
             break;
 
-          len += (w - 1)->space + w->length;   /* w > start >= word */
+          len += (w - 1)->space + w->width;    /* w > start >= word */
         }
       while (len < max_width);
       start->best_cost = best + base_cost (start);
     }
 
-  word_limit->length = saved_length;
+  word_limit->width = saved_length;
 }
 
 /* Return the constant component of the cost of breaking before the
@@ -902,13 +942,13 @@ base_cost (WORD *this)
       else if ((this - 1)->punct)
         cost -= PUNCT_BONUS;
       else if (this > word + 1 && (this - 2)->final)
-        cost += WIDOW_COST ((this - 1)->length);
+        cost += WIDOW_COST ((this - 1)->width);
     }
 
   if (this->paren)
     cost -= PAREN_BONUS;
   else if (this->final)
-    cost += ORPHAN_COST (this->length);
+    cost += ORPHAN_COST (this->width);
 
   return cost;
 }
@@ -983,7 +1023,7 @@ put_word (WORD *w)
   s = w->text;
   for (n = w->length; n != 0; n--)
     putchar (*s++);
-  out_column += w->length;
+  out_column += w->width;
 }
 
 /* Output to stdout SPACE spaces, or equivalent tabs. */

[Attachment: signature.asc (OpenPGP digital signature)]
Vladimir 'φ-coder/phcoder' Serbinenko <phcoder@HIDDEN>
:bug-coreutils@HIDDEN
.
bug-coreutils@HIDDEN
:bug#11187
; Package coreutils
.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997 nCipher Corporation Ltd,
1994-97 Ian Jackson.