GNU bug report logs - #34463
fill-paragraph ruined URL


Package: emacs

Reported by: 積丹尼 Dan Jacobson <jidanni <at> jidanni.org>

Date: Tue, 12 Feb 2019 22:08:01 UTC

Severity: minor

Tags: fixed, patch

Merged with 9286

Fixed in version 26.1

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 34463 in the body.
You can then email your comments to 34463 AT debbugs.gnu.org in the normal way.



Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#34463; Package emacs. (Tue, 12 Feb 2019 22:08:03 GMT)

Acknowledgement sent to 積丹尼 Dan Jacobson <jidanni <at> jidanni.org>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Tue, 12 Feb 2019 22:08:03 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: 積丹尼 Dan Jacobson <jidanni <at> jidanni.org>
To: bug-gnu-emacs <at> gnu.org
Cc: yamaoka <at> jpl.org
Subject: fill-paragraph ruined URL
Date: Wed, 13 Feb 2019 06:06:28 +0800

Big bug:

Do M-q (fill-paragraph) on

特此提案,
266 公車由原「東勢 - 谷關」改為「中興嶺 - 谷關」如圖
https://goo.gl/maps/rkkBr6jX41m
謝謝!

It becomes

特此提案,266 公車由原「東勢 - 谷關」改為「中興嶺 - 谷關」如圖
https://goo.gl/maps/rkkBr6jX41m謝謝!

Ruining the URL, and thus our whole proposal to the government.




Merged 9286 34463. Request was from Glenn Morris <rgm <at> gnu.org> to control <at> debbugs.gnu.org. (Tue, 12 Feb 2019 22:14:02 GMT)

Added tag(s) patch. Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Wed, 09 Oct 2019 22:29:01 GMT)

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#34463; Package emacs. (Wed, 09 Oct 2019 22:31:03 GMT)

Message #12 received at 34463 <at> debbugs.gnu.org:

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Chong Yidong <cyd <at> stupidchicken.com>
Cc: 34463 <at> debbugs.gnu.org, 9286 <at> debbugs.gnu.org, jidanni <at> jidanni.org,
 Kenichi Handa <handa <at> m17n.org>
Subject: Re: bug#9286: fill-paragraph destroys URLs
Date: Thu, 10 Oct 2019 00:30:54 +0200

Chong Yidong <cyd <at> stupidchicken.com> writes:

> If I am decoding the jidanni-speak correctly, his complaint is that doing
> M-q on a buffer containing
>
> asdf
> 國
>
> turns the text into
>
> asdf國
>
> instead of what he wants:
>
> asdf 國
>
> This is because line joining does not include a space if *either*
> character on each side of the newline has the ?| (line-breakable)
> category and an entry in fill-nospace-between-words-table.  To get the
> behavior jidanni wants, we could change it so that *both* the characters
> must have this property; see attached patch.
>
> But I am not sure this is TRT in general.  Handa-san, could you weigh in
> with an opinion?  Adding a space seems more or less correct to me, but I
> am no expert.

This problem is still present in Emacs 27.  This patch, from 2011, was
never applied.  I think Chong's proposal sounds logical, but like him,
I'm (ahem) no expert.

> *** lisp/textmodes/fill.el	2011-07-16 20:05:54 +0000
> --- lisp/textmodes/fill.el	2011-08-20 19:52:41 +0000
> ***************
> *** 482,491 ****
>   	    (replace-match (get-text-property (match-beginning 0) 'fill-space))
>   	  (let ((prev (char-before (match-beginning 0)))
>   		(next (following-char)))
> ! 	    (if (and (or (aref (char-category-set next) ?|)
> ! 			 (aref (char-category-set prev) ?|))
> ! 		     (or (aref fill-nospace-between-words-table next)
> ! 			 (aref fill-nospace-between-words-table prev)))
>   		(delete-char -1))))))
>
>     (goto-char from)
> --- 482,491 ----
>   	    (replace-match (get-text-property (match-beginning 0) 'fill-space))
>   	  (let ((prev (char-before (match-beginning 0)))
>   		(next (following-char)))
> ! 	    (if (and (aref (char-category-set next) ?|)
> ! 		     (aref (char-category-set prev) ?|)
> ! 		     (aref fill-nospace-between-words-table next)
> ! 		     (aref fill-nospace-between-words-table prev))
>   		(delete-char -1))))))
>
>     (goto-char from)
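
The difference between the two conditions in the patch above can be tried in isolation. What follows is a minimal sketch, not the code in fill.el: the function name `my-fill-join-without-space-p` and its BOTH argument are hypothetical, introduced only for illustration. Only `char-category-set` and `fill-nospace-between-words-table` are taken from the patch.

```elisp
;; Hypothetical helper, for illustration only.  PREV is the character
;; before the newline, NEXT the character after it.
(defun my-fill-join-without-space-p (prev next both)
  "Return non-nil if joining PREV and NEXT would leave no space.
With BOTH nil, use the `or' test from the old side of the patch;
with BOTH non-nil, use the stricter `and' test from the new side."
  (let ((cat-prev (aref (char-category-set prev) ?|))
        (cat-next (aref (char-category-set next) ?|))
        (tab-prev (aref fill-nospace-between-words-table prev))
        (tab-next (aref fill-nospace-between-words-table next)))
    (if both
        ;; Patched rule: every property must hold on both sides.
        (and cat-prev cat-next tab-prev tab-next)
      ;; Old rule: one side having each property is enough.
      (and (or cat-next cat-prev)
           (or tab-next tab-prev)))))

;; Compare the two rules on the pair from the report:
;; the last character of "asdf", then 國.
(list (my-fill-join-without-space-p ?f ?國 nil)
      (my-fill-join-without-space-p ?f ?國 t))
```

With BOTH non-nil, a pair in which only one side is line-breakable can never qualify, so the space is kept. This assumes, as the thread implies, that ASCII letters do not carry the ?| category.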

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#34463; Package emacs. (Thu, 10 Oct 2019 07:44:02 GMT)

Message #15 received at 34463 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: 9286 <at> debbugs.gnu.org, 34463 <at> debbugs.gnu.org, cyd <at> stupidchicken.com,
 handa <at> m17n.org, jidanni <at> jidanni.org
Subject: Re: bug#9286: fill-paragraph destroys URLs
Date: Thu, 10 Oct 2019 10:43:07 +0300

> From: Lars Ingebrigtsen <larsi <at> gnus.org>
> Date: Thu, 10 Oct 2019 00:30:54 +0200
> Cc: 34463 <at> debbugs.gnu.org, jidanni <at> jidanni.org, Kenichi Handa <handa <at> m17n.org>,
>  9286 <at> debbugs.gnu.org
> 
> > This is because line joining does not include a space if *either*
> > character on each side of the newline has the ?| (line-breakable)
> > category and an entry in fill-nospace-between-words-table.  To get the
> > behavior jidanni wants, we could change it so that *both* the characters
> > must have this property; see attached patch.
> >
> > But I am not sure this is TRT in general.  Handa-san, could you weigh in
> > with an opinion?  Adding a space seems more or less correct to me, but I
> > am no expert.
> 
> This problem is still present in Emacs 27.  This patch, from 2011, was
> never applied.  I think Chong's proposal sounds logical, but like him,
> I'm (ahem) no expert.

Since Kenichi didn't respond, I think we should study what the Unicode
Line-breaking Algorithm has to say about that.  Can you look there for
relevant guidance?  We don't yet implement the complete algorithm, but
some of what they say could nevertheless be used to resolve this
issue.

Thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#34463; Package emacs. (Fri, 11 Oct 2019 07:00:02 GMT)

Message #18 received at 34463 <at> debbugs.gnu.org:

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 9286 <at> debbugs.gnu.org, 34463 <at> debbugs.gnu.org, cyd <at> stupidchicken.com,
 handa <at> m17n.org, jidanni <at> jidanni.org
Subject: Re: bug#9286: fill-paragraph destroys URLs
Date: Fri, 11 Oct 2019 08:58:52 +0200

Eli Zaretskii <eliz <at> gnu.org> writes:

> Since Kenichi didn't respond, I think we should study what the Unicode
> Line-breaking Algorithm has to say about that.  Can you look there for
> relevant guidance?  We don't yet implement the complete algorithm, but
> some of what they say could nevertheless be used to resolve this
> issue.

That would be this:

https://unicode.org/reports/tr14/

I have just skimmed it, but I can't see that it says anything helpful
about filling/folding lines.

If I read it correctly, then it's perfectly allowed to line-break

asdf國

into

asdf
國

But it doesn't say what software should do when filling

asdf
國

Presumably filling that into

asdf國

would be correct in many circumstances, but as Dan said, if it's really

http://google.com
國

then filling that into 

http://google.com國

is most likely wrong.  So if we want to be cautious, then applying
Chong's patch seems to be the right thing:  Adding the space will lead
to things working more of the time, while the downside is that somebody
might prefer 

asdf國

visually.  I think.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#34463; Package emacs. (Sat, 23 Nov 2019 14:02:04 GMT)

Message #21 received at 34463 <at> debbugs.gnu.org:

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 9286 <at> debbugs.gnu.org, cyd <at> stupidchicken.com, handa <at> m17n.org,
 jidanni <at> jidanni.org, 34463 <at> debbugs.gnu.org
Subject: Re: bug#9286: fill-paragraph destroys URLs
Date: Sat, 23 Nov 2019 15:00:55 +0100

Lars Ingebrigtsen <larsi <at> gnus.org> writes:

> That would be this:
>
> https://unicode.org/reports/tr14/
>
> I have just skimmed it, but I can't see that it says anything helpful
> about filling/folding lines.

Ah, this is all moot -- in Emacs 26, the
fill-separate-heterogeneous-words-with-space variable was introduced,
which gives the behaviour that Dan wants (and is similar to Chong's
patch, only guarded by that variable).

So I'm closing this bug report.
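
For reference, a minimal init-file fragment that opts in to the behaviour described above. It assumes Emacs 26.1 or later, where the variable exists, and that the option is off unless set, which is consistent with the report being filed in 2019.

```elisp
;; Keep a space when filling joins two lines whose boundary words are of
;; different kinds (e.g. ASCII text followed by a CJK character).
(setq fill-separate-heterogeneous-words-with-space t)
```

Setting it globally changes how M-q joins lines in every buffer. Where only some buffers contain URLs next to CJK text, a buffer-local setting via `setq-local` in a mode hook may be a gentler choice.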

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Added tag(s) fixed. Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Sat, 23 Nov 2019 14:03:03 GMT)

Bug marked as fixed in version 26.1; send any further explanations to 9286 <at> debbugs.gnu.org and jidanni <at> jidanni.org. Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Sat, 23 Nov 2019 14:03:07 GMT)

Bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 22 Dec 2019 12:24:05 GMT)

This bug report was last modified 4 years and 98 days ago.


