GNU bug report logs - #51762
13.0.14; environment formating

Package: auctex;

Reported by: "Pierre L. Nageoire" <devel <at> pollock-nageoire.net>

Date: Thu, 11 Nov 2021 07:15:02 UTC

Severity: normal

Tags: confirmed

Found in version 13.0.14

Done: Ikumi Keita <ikumi <at> ikumi.que.jp>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 51762 in the body.
You can then email your comments to 51762 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-auctex <at> gnu.org:
bug#51762; Package auctex. (Thu, 11 Nov 2021 07:15:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to "Pierre L. Nageoire" <devel <at> pollock-nageoire.net>:
New bug report received and forwarded. Copy sent to bug-auctex <at> gnu.org. (Thu, 11 Nov 2021 07:15:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: "Pierre L. Nageoire" <devel <at> pollock-nageoire.net>
To: bug-auctex <at> gnu.org
Subject: 13.0.14; environment formating
Date: Thu, 11 Nov 2021 07:22:59 +0100
Hi, here is the environment before hitting C-c C-q C-e:

  %%
  \begin{exoenonce}
    On note alors
    $$ \fundefline p{A \times B}{A}{(x,y)}{x}\ %%
    \et\ \fundefline q{A \times B}{B}{(x,y)}{y}\ %%
    \text{les projections canoniques}\ .$$
  \end{exoenonce}

  And after the formatting operation I get:
    %%
  \begin{exoenonce}
    On note alors
    $$ \fundefline p{A \times B}{A}{(x,y)}{x}\ %%
    \et\ \fundefline q{A \times
      B}{B}{(x,y)}{y}\ %% \text{les projections canoniques}\ .$$
  \end{exoenonce}

  This causes part of the code to be commented out that should not be
  and, in this instance, leaves an unbalanced dollar sign that makes the
  code impossible to compile!

  Best regards 









Emacs  : GNU Emacs 28.0.50 (build 1, x86_64-pc-linux-gnu)
 of 2021-09-22
Package: 13.0.14

current state:
==============
(setq
 AUCTeX-date "2021-08-26"
 window-system nil
 LaTeX-version "2e"
 TeX-style-path '("~/.emacs.d/auctex"
		  "/home/devel/.emacs.d/elpa/auctex-13.0.11/style"
		  "/usr/local/share/texmf/tex/latex/auto/"
		  "/usr/local/share/texmf/tex/latex/style/" "auto" "style")
 TeX-auto-save t
 TeX-parse-self t
 TeX-master nil
 TeX-command-list '(("pdf" "ps2pdf13 %s.ps" TeX-run-command nil t)
		    ("mp" "mpost %s.mp" TeX-run-command nil t)
		    ("Hacha" "hacha %s.html" TeX-run-command nil t)
		    ("PHPindex" "hacha -o index.php %s.html" TeX-run-command
		     nil t)
		    ("PHP" "hevea -fix -o %s.php %t" TeX-run-command nil t)
		    ("Info" "hevea -fix -info %t" TeX-run-command nil t)
		    ("Txt" "hevea -fix -text %t" TeX-run-command nil t)
		    ("Hevea" "hevea -fix %t" TeX-run-command nil t)
		    ("TeX" "%(PDF)%(tex) %`%S%(PDFout)%(mode)%' %t"
		     TeX-run-TeX nil
		     (plain-tex-mode texinfo-mode ams-tex-mode) :help
		     "Run plain TeX")
		    ("LaTeX" "%`%l%(mode)%' %t" TeX-run-TeX nil
		     (latex-mode doctex-mode) :help "Run LaTeX")
		    ("Makeinfo" "makeinfo %t" TeX-run-compile nil
		     (texinfo-mode) :help "Run Makeinfo with Info output")
		    ("Makeinfo HTML" "makeinfo --html %t" TeX-run-compile nil
		     (texinfo-mode) :help "Run Makeinfo with HTML output")
		    ("AmSTeX" "%(PDF)amstex %`%S%(PDFout)%(mode)%' %t"
		     TeX-run-TeX nil (ams-tex-mode) :help "Run AMSTeX")
		    ("ConTeXt" "texexec --once --texutil %(execopts)%t"
		     TeX-run-TeX nil (context-mode) :help "Run ConTeXt once")
		    ("ConTeXt Full" "texexec %(execopts)%t" TeX-run-TeX nil
		     (context-mode) :help "Run ConTeXt until completion")
		    ("BibTeX" "%(bibtex) %s" TeX-run-BibTeX nil t :help
		     "Run BibTeX")
		    ("View" "dvi2tty -q -w 132 %s" TeX-run-command t t :help
		     "Run Text viewer")
		    ("Print" "%p" TeX-run-command t t :help "Print the file")
		    ("Queue" "%q" TeX-run-background nil t :help
		     "View the printer queue" :visible TeX-queue-command)
		    ("File" "%(o?)dvips %d -o %f " TeX-run-command t t :help
		     "Generate PostScript file")
		    ("Index" "%(makeindex) %s" TeX-run-command nil t :help
		     "Create index file")
		    ("Check" "lacheck %s" TeX-run-compile nil (latex-mode)
		     :help "Check LaTeX file for correctness")
		    ("Spell" "(TeX-ispell-document \"\")" TeX-run-function nil
		     t :help "Spell-check the document")
		    ("Browse" "(plnltx-browse)" TeX-run-function nil t)
		    ("Clean" "TeX-clean" TeX-run-function nil t :help
		     "Delete generated intermediate files")
		    ("Clean All" "(TeX-clean t)" TeX-run-function nil t :help
		     "Delete generated intermediate and output files")
		    ("Other" "" TeX-run-command t t :help
		     "Run an arbitrary command")
		    )
 )




Added tag(s) confirmed. Request was from Ikumi Keita <ikumi <at> ikumi.que.jp> to control <at> debbugs.gnu.org. (Fri, 12 Nov 2021 06:50:01 GMT) Full text and rfc822 format available.

Information forwarded to bug-auctex <at> gnu.org:
bug#51762; Package auctex. (Fri, 12 Nov 2021 10:20:02 GMT) Full text and rfc822 format available.

Message #10 received at 51762 <at> debbugs.gnu.org (full text, mbox):

From: Ikumi Keita <ikumi <at> ikumi.que.jp>
To: "Pierre L. Nageoire" <devel <at> pollock-nageoire.net>
Cc: 51762 <at> debbugs.gnu.org
Subject: Re: bug#51762: 13.0.14; environment formating
Date: Fri, 12 Nov 2021 19:19:15 +0900
Hi Pierre,

>>>>> "Pierre L. Nageoire" <devel <at> pollock-nageoire.net> writes:
> Hi here is the environment before hitting C-c C-q C-e
>   %%
>   \begin{exoenonce}
>     On note alors
>     $$ \fundefline p{A \times B}{A}{(x,y)}{x}\ %%
>     \et\ \fundefline q{A \times B}{B}{(x,y)}{y}\ %%
>     \text{les projections canoniques}\ .$$
>   \end{exoenonce}

>   And after the formatting operation I get:
>     %%
>   \begin{exoenonce}
>     On note alors
>     $$ \fundefline p{A \times B}{A}{(x,y)}{x}\ %%
>     \et\ \fundefline q{A \times
>       B}{B}{(x,y)}{y}\ %% \text{les projections canoniques}\ .$$
>   \end{exoenonce}

>   This causes part of the code to be commented out that should not be
>   and, in this instance, leaves an unbalanced dollar sign that makes the
>   code impossible to compile!

Thanks for the report. I can reproduce the issue.

It seems that the regexp in `LaTeX-fill-region-as-paragraph' to identify
a code comment isn't wise enough:
----------------------------------------------------------------------
          (if (re-search-forward
               (concat "\\("
                       ;; Code comments.
                       "\\([^ \r\n%\\]\\|\\\\%\\)\\([ \t]\\|\\\\\\\\\\)*"
                       TeX-comment-start-regexp
                       "\\|"
[...]
----------------------------------------------------------------------
This doesn't match lines which end with "\ %%". I'll try to find
something better.
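
A minimal sketch of this mismatch, assuming `TeX-comment-start-regexp' is
the plain string "%" (hard-coded here so that the snippet runs without
AUCTeX loaded). The name `demo-code-comment-regexp' is hypothetical, and
only the first alternative of the regexp quoted above is reproduced:

```elisp
;; First alternative of the regexp quoted above, with "%" standing in
;; for `TeX-comment-start-regexp' (an assumption for this sketch).
(defvar demo-code-comment-regexp
  (concat "\\([^ \r\n%\\]\\|\\\\%\\)\\([ \t]\\|\\\\\\\\\\)*"
          "%")
  "Hypothetical stand-in for the code comment part of the regexp.")

;; An ordinary code comment (text, whitespace, then `%') is found:
;; this returns a match position.
(string-match-p demo-code-comment-regexp "some text % a comment")

;; The line from the report ends with "\ %%".  Between `}' and `%' there
;; is a single backslash followed by a space, which is neither
;; whitespace alone nor a double backslash, so this returns nil.
(string-match-p demo-code-comment-regexp
                "\\et\\ \\fundefline q{A \\times B}{B}{(x,y)}{y}\\ %%")
```

Because the search finds nothing on such a line, the comment start is not
treated as a boundary, which is consistent with the reported result where
the following line is pulled in after "%%".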

Regards,
Ikumi Keita




Information forwarded to bug-auctex <at> gnu.org:
bug#51762; Package auctex. (Fri, 12 Nov 2021 15:29:01 GMT) Full text and rfc822 format available.

Message #13 received at 51762 <at> debbugs.gnu.org (full text, mbox):

From: Ikumi Keita <ikumi <at> ikumi.que.jp>
To: "Pierre L. Nageoire" <devel <at> pollock-nageoire.net>
Cc: 51762 <at> debbugs.gnu.org
Subject: Re: bug#51762: 13.0.14; environment formating
Date: Sat, 13 Nov 2021 00:28:08 +0900
[Message part 1 (text/plain, inline)]
>>>>> Ikumi Keita <ikumi <at> ikumi.que.jp> writes:
> It seems that the regexp in `LaTeX-fill-region-as-paragraph' to identify
> a code comment isn't wise enough:
> ----------------------------------------------------------------------
>           (if (re-search-forward
>                (concat "\\("
>                        ;; Code comments.
>                        "\\([^ \r\n%\\]\\|\\\\%\\)\\([ \t]\\|\\\\\\\\\\)*"
>                        TeX-comment-start-regexp
>                        "\\|"
> [...]
> ----------------------------------------------------------------------
> This doesn't match lines which end with "\ %%". I'll try to find
> something better.

I think the attached patch fixes the problem. Could you test whether it
works on your side?

I confirmed that this fix passes the regression tests, but I would
appreciate it if others could also have a look to find any possible
regressions.

Regards,
Ikumi Keita

[patch (text/x-diff, attachment)]

Information forwarded to bug-auctex <at> gnu.org:
bug#51762; Package auctex. (Sat, 13 Nov 2021 04:38:01 GMT) Full text and rfc822 format available.

Message #16 received at 51762 <at> debbugs.gnu.org (full text, mbox):

From: Ikumi Keita <ikumi <at> ikumi.que.jp>
To: "Pierre L. Nageoire" <devel <at> pollock-nageoire.net>
Cc: 51762 <at> debbugs.gnu.org
Subject: Re: bug#51762: 13.0.14; environment formating
Date: Sat, 13 Nov 2021 13:37:16 +0900
>>>>> Ikumi Keita <ikumi <at> ikumi.que.jp> writes:
> I think the attached patch fixes the problem. Could you test whether it
> works on your side?

Hmm, the patch also fails to identify the following form of code
comment:
 \\% This is a code comment.
That is, a line beginning with zero or more whitespace characters,
followed by an even number of backslashes, followed by percent sign(s)
and the comment body.
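
The "even number of backslashes" rule can be sketched without any regexp
by counting the backslashes directly in front of each percent sign. The
function name `demo-unescaped-percent' is hypothetical, and the sketch
deliberately ignores \verb and similar verbatim constructs:

```elisp
(defun demo-unescaped-percent (line)
  "Return the index of the first unescaped `%' in LINE, or nil.
A `%' counts as unescaped when an even number of backslashes
(possibly zero) immediately precedes it."
  (let ((pos 0)
        (found nil))
    (while (and (not found)
                (setq pos (string-match "%" line pos)))
      (let ((count 0)
            (i (1- pos)))
        ;; Count the backslashes directly before this `%'.
        (while (and (>= i 0) (eq (aref line i) ?\\))
          (setq count (1+ count)
                i (1- i)))
        (if (zerop (% count 2))
            (setq found pos)
          ;; Odd count: this is the control symbol \%, keep looking.
          (setq pos (1+ pos)))))
    found))

;; Two backslashes, then `%': the comment starts at index 3.
(demo-unescaped-percent " \\\\% This is a code comment.")
;; One backslash, then `%': the control symbol \%, so the result is nil.
(demo-unescaped-percent "50\\% of the cases")
```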

Maybe we should give up the regexp-based approach in order to find code
comments accurately.

Regards,
Ikumi Keita




Information forwarded to bug-auctex <at> gnu.org:
bug#51762; Package auctex. (Sat, 13 Nov 2021 16:06:01 GMT) Full text and rfc822 format available.

Message #19 received at 51762 <at> debbugs.gnu.org (full text, mbox):

From: Arash Esbati <arash <at> gnu.org>
To: Ikumi Keita <ikumi <at> ikumi.que.jp>
Cc: 51762 <at> debbugs.gnu.org, "Pierre L. Nageoire" <devel <at> pollock-nageoire.net>
Subject: Re: bug#51762: 13.0.14; environment formating
Date: Sat, 13 Nov 2021 17:04:33 +0100
Hi Keita,

Ikumi Keita <ikumi <at> ikumi.que.jp> writes:

>>>>>> Ikumi Keita <ikumi <at> ikumi.que.jp> writes:
>> I think the attached patch fixes the problem. Could you test whether it
>> works on your side?
>
> Hmm, the patch also fails to identify the following form of code
> comment:
>  \\% This is a code comment.
> That is, a line beginning with 0 or more whitespaces, followed by even
> number of back slashes, followed by percent sign(s) and comment body.

Thank you for looking into this.  The way I understand this regexp:

  "\\([^ \r\n%\\]\\|\\\\%\\)\\([ \t]\\|\\\\\\\\\\)*"
                    ^^^^^^^

is there to exclude the control symbol \%, i.e., to keep it from being
parsed as a comment start.  Would it help if we generalize the control
symbol idea by saying:

  "\\([^ \r\n%\\]\\|\\\\[^a-zA-Z0-9\\]\\)\\([ \t]\\|\\\\\\\\\\)*"
                        ^^^^^^^^^^^^^^

> Maybe we should give up regexp-based approach to find out code comments
> accurately.

Are you thinking about `syntax-ppss'?

Best, Arash




Information forwarded to bug-auctex <at> gnu.org:
bug#51762; Package auctex. (Sun, 14 Nov 2021 06:20:02 GMT) Full text and rfc822 format available.

Message #22 received at 51762 <at> debbugs.gnu.org (full text, mbox):

From: Ikumi Keita <ikumi <at> ikumi.que.jp>
To: Arash Esbati <arash <at> gnu.org>
Cc: 51762 <at> debbugs.gnu.org, "Pierre L. Nageoire" <devel <at> pollock-nageoire.net>
Subject: Re: bug#51762: 13.0.14; environment formating
Date: Sun, 14 Nov 2021 15:19:48 +0900
[Message part 1 (text/plain, inline)]
Hi Arash, thanks for your comment.

>>>>> Arash Esbati <arash <at> gnu.org> writes:
> Thank you for looking into this.  The way I understand this regexp:

>   "\\([^ \r\n%\\]\\|\\\\%\\)\\([ \t]\\|\\\\\\\\\\)*"
>                     ^^^^^^^

> is there to exclude the control symbol \%, i.e., being parsed as comment
> start.

I think so, too. Tassilo added it to fix bug#48937 this June.

> Would it help if we generalize the control symbol idea by saying:

>   "\\([^ \r\n%\\]\\|\\\\[^a-zA-Z0-9\\]\\)\\([ \t]\\|\\\\\\\\\\)*"
>                         ^^^^^^^^^^^^^^

I'm afraid that it doesn't match the line
 \\% This is a code comment.
either. Try typing M-q on the following paragraph in a LaTeX mode
buffer:
----------------------------------------------------------------------
Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Donec hendrerit
tempor tellus. Donec pretium posuere tellus. Proin quam nisl, tincidunt et,
 \\% This is a code comment.
mattis eget, convallis nec, purus.
----------------------------------------------------------------------

>> Maybe we should give up regexp-based approach to find out code comments
>> accurately.

> Are you thinking about `syntax-ppss'?

No, other parts of latex.el identify code comments using different
logic, like:
       ;; A line with some code, followed by a comment?
       ((and (setq code-comment-start (save-excursion
                                        (beginning-of-line)
                                        (TeX-search-forward-comment-start
                                         (line-end-position))))
             (> (point) code-comment-start)
             (not (TeX-in-commented-line))
             (save-excursion
               (goto-char code-comment-start)
               ;; See if there is at least one non-whitespace character
               ;; before the comment starts.
               (re-search-backward "[^ \t\n]" (line-beginning-position) t)))

So it would be better to follow this logic than to rely on a regexp. In
addition, a regexp-based approach is easily fooled by a percent sign in
\verb, while `TeX-search-forward-comment-start' (which in turn calls
`LaTeX-search-forward-comment-start') takes care of such cases.

I ended up with the attached tentative patch. I hope this doesn't slow
down the filling loop significantly. What do you think about it?

Regards,
Ikumi Keita

[patch (text/x-diff, attachment)]

Information forwarded to bug-auctex <at> gnu.org:
bug#51762; Package auctex. (Mon, 15 Nov 2021 17:59:01 GMT) Full text and rfc822 format available.

Message #25 received at 51762 <at> debbugs.gnu.org (full text, mbox):

From: Arash Esbati <arash <at> gnu.org>
To: Ikumi Keita <ikumi <at> ikumi.que.jp>
Cc: 51762 <at> debbugs.gnu.org, "Pierre L. Nageoire" <devel <at> pollock-nageoire.net>
Subject: Re: bug#51762: 13.0.14; environment formating
Date: Mon, 15 Nov 2021 18:57:33 +0100
Hi Keita,

Ikumi Keita <ikumi <at> ikumi.que.jp> writes:

>>>>>> Arash Esbati <arash <at> gnu.org> writes:
>> Thank you for looking into this.  The way I understand this regexp:
>
>>   "\\([^ \r\n%\\]\\|\\\\%\\)\\([ \t]\\|\\\\\\\\\\)*"
>>                     ^^^^^^^
>
>> is there to exclude the control symbol \%, i.e., being parsed as comment
>> start.
>
> I think so, too. Tassilo added it to fix bug#48937 this June.
>
>> Would it help if we generalize the control symbol idea by saying:
>
>>   "\\([^ \r\n%\\]\\|\\\\[^a-zA-Z0-9\\]\\)\\([ \t]\\|\\\\\\\\\\)*"
>>                         ^^^^^^^^^^^^^^
>
> I'm afraid that it doesn't match a line
>  \\% This is a code comment.
> , either. Try typing M-q on the following paragraph in latex mode
> buffer:
> ----------------------------------------------------------------------
> Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Donec hendrerit
> tempor tellus. Donec pretium posuere tellus. Proin quam nisl, tincidunt et,
>  \\% This is a code comment.
> mattis eget, convallis nec, purus.
> ----------------------------------------------------------------------

I agree; it's hard to come up with a regexp to catch all possible
comment lines.

>>> Maybe we should give up regexp-based approach to find out code comments
>>> accurately.
>
>> Are you thinking about `syntax-ppss'?
>
> No, other parts of latex.el identify code comments using different
> logic, like:
>        ;; A line with some code, followed by a comment?
>        ((and (setq code-comment-start (save-excursion
>                                         (beginning-of-line)
>                                         (TeX-search-forward-comment-start
>                                          (line-end-position))))
>              (> (point) code-comment-start)
>              (not (TeX-in-commented-line))
>              (save-excursion
>                (goto-char code-comment-start)
>                ;; See if there is at least one non-whitespace character
>                ;; before the comment starts.
>                (re-search-backward "[^ \t\n]" (line-beginning-position) t)))
>
> So it would be better to follow this logic than to rely on regexp. In
> addition, regexp-based approach is easily fooled by percent sign in
> \verb, while `TeX-search-forward-comment-start' (which in turn calls
> `LaTeX-search-forward-comment-start') takes care of such cases.
>
> I ended up with the attached tentative patch. I hope this doesn't slow
> down the filling loop significantly. What do you think about it?

Do you have an idea about the performance hit?  I'd say we have to bite
the bullet and use the code.  Our current approach is not the best.  And
while we're at it, we'll have to take care of this comment in
`LaTeX-verbatim-macro-boundaries':

    ;; XXX: Here we assume we are dealing with \verb which
    ;; expects the delimiter right behind the command.
    ;; However, \lstinline can also cope with whitespace as
    ;; well as an optional argument after the command.

Other packages like fancyvrb and minted do the same: Inline verb macros
can have an optional and a mandatory argument.  So the regexp fun will
continue :-)

Best, Arash




Information forwarded to bug-auctex <at> gnu.org:
bug#51762; Package auctex. (Tue, 16 Nov 2021 06:30:02 GMT) Full text and rfc822 format available.

Message #28 received at 51762 <at> debbugs.gnu.org (full text, mbox):

From: Ikumi Keita <ikumi <at> ikumi.que.jp>
To: Arash Esbati <arash <at> gnu.org>
Cc: 51762 <at> debbugs.gnu.org, "Pierre L. Nageoire" <devel <at> pollock-nageoire.net>
Subject: Re: bug#51762: 13.0.14; environment formating
Date: Tue, 16 Nov 2021 15:29:30 +0900
[Message part 1 (text/plain, inline)]
>>>>> Arash Esbati <arash <at> gnu.org> writes:
>>> Would it help if we generalize the control symbol idea by saying:
>> 
>>> "\\([^ \r\n%\\]\\|\\\\[^a-zA-Z0-9\\]\\)\\([ \t]\\|\\\\\\\\\\)*"
>>> ^^^^^^^^^^^^^^
>> 
>> I'm afraid that it doesn't match a line
>> \\% This is a code comment.
>> , either. Try typing M-q on the following paragraph in latex mode
>> buffer:
>> ----------------------------------------------------------------------
>> Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Donec hendrerit
>> tempor tellus. Donec pretium posuere tellus. Proin quam nisl, tincidunt et,
>> \\% This is a code comment.
>> mattis eget, convallis nec, purus.
>> ----------------------------------------------------------------------

> I agree; it's hard to come up with a regexp to catch all possible
> comment lines.

Indeed.

>> I ended up with the attached tentative patch. I hope this doesn't slow
>> down the filling loop significantly. What do you think about it?

First of all, my previous proposal doesn't work correctly for lines
ending with "\par". The first regexp group in the current code actually
spans the highlighted interval in the attached screenshot, so the
criterion "(not (match-beginning 0))" in the previous proposal is wrong.
I attach the revised patch at the end of this message.

> Do you have an idea about the performance hit?

Unfortunately, no. I just tried several examples by hand and checked
that "make check" passes. They worked smoothly, so I expect there isn't
a serious performance problem.

> And while we're at it, we'll have to take care of this comment in
> `LaTeX-verbatim-macro-boundaries':

>     ;; XXX: Here we assume we are dealing with \verb which
>     ;; expects the delimiter right behind the command.
>     ;; However, \lstinline can also cope with whitespace as
>     ;; well as an optional argument after the command.

> Other packages like fancyvrb and minted do the same: Inline verb macros
> can have an optional and a mandatory argument.  So the regexp fun will
> continue :-)

OMG! 😖

Anyway, I'd like to commit the attached revised fix along with some
additional regression tests if no one objects.

Regards,
Ikumi Keita

[screenshot.png (image/png, attachment)]
[patch (text/x-diff, attachment)]

Information forwarded to bug-auctex <at> gnu.org:
bug#51762; Package auctex. (Wed, 17 Nov 2021 10:19:02 GMT) Full text and rfc822 format available.

Message #31 received at 51762 <at> debbugs.gnu.org (full text, mbox):

From: Arash Esbati <arash <at> gnu.org>
To: Ikumi Keita <ikumi <at> ikumi.que.jp>
Cc: 51762 <at> debbugs.gnu.org, "Pierre L. Nageoire" <devel <at> pollock-nageoire.net>
Subject: Re: bug#51762: 13.0.14; environment formating
Date: Wed, 17 Nov 2021 11:17:41 +0100
Ikumi Keita <ikumi <at> ikumi.que.jp> writes:

>>>>>> Arash Esbati <arash <at> gnu.org> writes:
>
>> Do you have an idea about the performance hit?
>
> Unfortunately, no. I just tried several examples by hand and checked
> that "make check" passes. They worked smoothly, so I expect there isn't
> a serious performance problem.

There is `benchmark-run' but I don't think the performance hit is big
enough to go through the trouble and benchmark it.
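
For anyone who does want a number, a rough measurement with
`benchmark-run' could look like the sketch below. The function name
`demo-benchmark-fill' is hypothetical, and plain `fill-region' is used
only as a stand-in so that the snippet runs without AUCTeX; to measure
the change discussed here, one would enable LaTeX mode in the buffer and
call `LaTeX-fill-region' instead.

```elisp
(require 'benchmark)

(defun demo-benchmark-fill ()
  "Time ten rounds of filling a long paragraph in a temporary buffer.
Return the list produced by `benchmark-run': total elapsed time,
number of garbage collections, and time spent in garbage collection."
  (with-temp-buffer
    ;; Build one long unfilled paragraph.
    (dotimes (_ 200)
      (insert "Lorem ipsum dolor sit amet, consectetuer adipiscing elit. "))
    (let ((fill-column 70))
      ;; Stand-in for the AUCTeX filling command (an assumption).
      (benchmark-run 10
        (fill-region (point-min) (point-max))))))

(demo-benchmark-fill)
```

Running the same form before and after applying a patch gives two
elapsed times that can be compared directly.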

>> And while we're at it, we'll have to take care of this comment in
>> `LaTeX-verbatim-macro-boundaries':
>
>>     ;; XXX: Here we assume we are dealing with \verb which
>>     ;; expects the delimiter right behind the command.
>>     ;; However, \lstinline can also cope with whitespace as
>>     ;; well as an optional argument after the command.
>
>> Other packages like fancyvrb and minted do the same: Inline verb macros
>> can have an optional and a mandatory argument.  So the regexp fun will
>> continue :-)
>
> OMG! 😖

:-)

> Anyway, I'd like to commit the attached revised fix along with some
> additional regression tests if no one objects.

Yes, please go ahead.

Best, Arash




bug closed, send any further explanations to 51762 <at> debbugs.gnu.org and "Pierre L. Nageoire" <devel <at> pollock-nageoire.net> Request was from Ikumi Keita <ikumi <at> ikumi.que.jp> to control <at> debbugs.gnu.org. (Wed, 17 Nov 2021 14:25:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-auctex <at> gnu.org:
bug#51762; Package auctex. (Wed, 17 Nov 2021 16:20:01 GMT) Full text and rfc822 format available.

Message #36 received at 51762 <at> debbugs.gnu.org (full text, mbox):

From: Ikumi Keita <ikumikeita <at> jcom.home.ne.jp>
To: Arash Esbati <arash <at> gnu.org>
Cc: 51762 <at> debbugs.gnu.org, "Pierre L. Nageoire" <devel <at> pollock-nageoire.net>
Subject: Re: bug#51762: 13.0.14; environment formating
Date: Wed, 17 Nov 2021 23:23:57 +0900
>>>>> Arash Esbati <arash <at> gnu.org> writes:
>> Anyway, I'd like to commit the attached revised fix along with some
>> additional regression tests if no one objects.

> Yes, please go ahead.

Pushed. Pierre, I'll close this bug.

Bye,
Ikumi Keita




Information forwarded to bug-auctex <at> gnu.org:
bug#51762; Package auctex. (Thu, 18 Nov 2021 07:18:02 GMT) Full text and rfc822 format available.

Message #39 received at 51762 <at> debbugs.gnu.org (full text, mbox):

From: Ikumi Keita <ikumi <at> ikumi.que.jp>
To: Arash Esbati <arash <at> gnu.org>, 51762 <at> debbugs.gnu.org
Cc: "Pierre L. Nageoire" <devel <at> pollock-nageoire.net>
Subject: Re: bug#51762: 13.0.14; environment formating
Date: Thu, 18 Nov 2021 16:17:53 +0900
>>>>> Ikumi Keita <ikumikeita <at> jcom.home.ne.jp> writes:
>>>>> Arash Esbati <arash <at> gnu.org> writes:
>>> Anyway, I'd like to commit the attached revised fix along with some
>>> additional regression tests if no one objects.

>> Yes, please go ahead.

> Pushed. Pierre, I'll close this bug.

Ah, no, the logic is incomplete. If there is a line matching the regexp
before the first code comment, my code skips it. So we always have to
perform both code comment detection and regexp search, and take the
earlier match if both exist.
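
The "take the earlier match" idea can be sketched as follows. Both
function names are hypothetical, and the two searches are simplified
stand-ins (a bare "%" search and a regexp for lines ending in a double
backslash) rather than the real detection code in latex.el:

```elisp
(defun demo-earlier-position (comment-pos regexp-pos)
  "Return the smaller of COMMENT-POS and REGEXP-POS; either may be nil."
  (cond ((and comment-pos regexp-pos) (min comment-pos regexp-pos))
        (t (or comment-pos regexp-pos))))

(defun demo-first-break (text)
  "Return the earliest break position in TEXT, or nil if there is none.
Both searches start from the beginning of the text, so neither can
hide a match found by the other."
  (with-temp-buffer
    (insert text)
    (let ((regexp-pos (save-excursion
                        (goto-char (point-min))
                        ;; Simplified: a line ending with two backslashes.
                        (and (re-search-forward "\\\\\\\\[ \t]*$" nil t)
                             (match-beginning 0))))
          (comment-pos (save-excursion
                         (goto-char (point-min))
                         ;; Simplified: any percent sign.
                         (and (search-forward "%" nil t)
                              (match-beginning 0)))))
      (demo-earlier-position comment-pos regexp-pos))))

;; The double backslash on the first line comes before the comment on
;; the second line, so its position is the one returned.
(demo-first-break "first line\\\\\nsecond % comment")
```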

I'll revise my code...

Regards,
Ikumi Keita




Information forwarded to bug-auctex <at> gnu.org:
bug#51762; Package auctex. (Thu, 18 Nov 2021 17:03:02 GMT) Full text and rfc822 format available.

Message #42 received at 51762 <at> debbugs.gnu.org (full text, mbox):

From: Ikumi Keita <ikumi <at> ikumi.que.jp>
To: Arash Esbati <arash <at> gnu.org>
Cc: "Pierre L. Nageoire" <devel <at> pollock-nageoire.net>, 51762 <at> debbugs.gnu.org
Subject: Re: bug#51762: 13.0.14; environment formating
Date: Fri, 19 Nov 2021 02:02:13 +0900
>>>>> Ikumi Keita <ikumi <at> ikumi.que.jp> writes:
> Ah, no, the logic is incomplete. If there is a line matching the regexp
> before the first code comment, my code skips it. So we always have to
> perform both code comment detection and regexp search, and take up the
> earlier match if both exist.

> I'll revise my code...

Done.

Bye,
Ikumi Keita




Information forwarded to bug-auctex <at> gnu.org:
bug#51762; Package auctex. (Thu, 18 Nov 2021 20:42:02 GMT) Full text and rfc822 format available.

Message #45 received at 51762 <at> debbugs.gnu.org (full text, mbox):

From: Arash Esbati <arash <at> gnu.org>
To: Ikumi Keita <ikumi <at> ikumi.que.jp>
Cc: "Pierre L. Nageoire" <devel <at> pollock-nageoire.net>, 51762 <at> debbugs.gnu.org
Subject: Re: bug#51762: 13.0.14; environment formating
Date: Thu, 18 Nov 2021 21:40:28 +0100
Hi Keita,

Ikumi Keita <ikumi <at> ikumi.que.jp> writes:

>>>>>> Ikumi Keita <ikumi <at> ikumi.que.jp> writes:
>> Ah, no, the logic is incomplete. If there is a line matching the regexp
>> before the first code comment, my code skips it. So we always have to
>> perform both code comment detection and regexp search, and take up the
>> earlier match if both exist.
>
>> I'll revise my code...
>
> Done.

Thanks for the update.  I have a question, though: You have also
expanded the `LaTeX-filling' ert test.  The bug#51762-5 test turns this:

--8<---------------cut here---------------start------------->8---
% bug#51762-5 "\\" before code comment shouldn't be skipped.
Mauris ac felis vel velit tristique imperdiet.  Vestibulum convallis, lorem a 
tempus semper, dui dui euismod elit, vitae placerat urna tortor vitae lacus.\\
  Fusce sagittis, libero non molestie mollis, magna orci ultrices dolor, at vulputate neque nulla lacinia eros.  Aliquam posuere.  Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus.%
--8<---------------cut here---------------end--------------->8---

into this:

--8<---------------cut here---------------start------------->8---
% bug#51762-5 "\\" before code comment shouldn't be skipped.
Mauris ac felis vel velit tristique imperdiet.  Vestibulum convallis,
lorem a
tempus semper, dui dui euismod elit, vitae placerat urna tortor vitae lacus.\\
Fusce sagittis, libero non molestie mollis, magna orci ultrices dolor,
at vulputate neque nulla lacinia eros.  Aliquam posuere.  Cum sociis
natoque penatibus et magnis dis parturient montes, nascetur ridiculus
mus.%
--8<---------------cut here---------------end--------------->8---

I would have expected something like this:

--8<---------------cut here---------------start------------->8---
% bug#51762-5 "\\" before code comment shouldn't be skipped.
Mauris ac felis vel velit tristique imperdiet.  Vestibulum convallis,
lorem a tempus semper, dui dui euismod elit, vitae placerat urna
tortor vitae lacus.\\
Fusce sagittis, libero non molestie mollis, magna orci ultrices dolor,
at vulputate neque nulla lacinia eros.  Aliquam posuere.  Cum sociis
natoque penatibus et magnis dis parturient montes, nascetur ridiculus
mus.%
--8<---------------cut here---------------end--------------->8---

Am I missing something?  It works fine if I replace \\ with \par.

Best, Arash




Information forwarded to bug-auctex <at> gnu.org:
bug#51762; Package auctex. (Fri, 19 Nov 2021 03:09:01 GMT) Full text and rfc822 format available.

Message #48 received at 51762 <at> debbugs.gnu.org (full text, mbox):

From: Ikumi Keita <ikumi <at> ikumi.que.jp>
To: Arash Esbati <arash <at> gnu.org>
Cc: "Pierre L. Nageoire" <devel <at> pollock-nageoire.net>, 51762 <at> debbugs.gnu.org
Subject: Re: bug#51762: 13.0.14; environment formating
Date: Fri, 19 Nov 2021 12:08:37 +0900
Hi Arash,

> Thanks for the update.  I have a question, though: You have also
> expanded the `LaTeX-filling' ert test.
[...]
> I would have expected something like this:
> % bug#51762-5 "\\" before code comment shouldn't be skipped.
> Mauris ac felis vel velit tristique imperdiet.  Vestibulum convallis,
> lorem a tempus semper, dui dui euismod elit, vitae placerat urna
> tortor vitae lacus.\\
> Fusce sagittis, libero non molestie mollis, magna orci ultrices dolor,
> at vulputate neque nulla lacinia eros.  Aliquam posuere.  Cum sociis
> natoque penatibus et magnis dis parturient montes, nascetur ridiculus
> mus.%
> Am I missing something?  It works fine if I replace \\ with \par.

I agree it looks odd, but it seems to be intended behavior. In
`LaTeX-fill-region-as-paragraph', we have:
----------------------------------------------------------------------
                  ;; Code comments and lines ending with `\par' are
                  ;; included in filling.  Lines ending with `\\' are
                  ;; skipped.
                  (if (or has-code-comment
                          (match-beginning 1))
                      (LaTeX-fill-region-as-para-do from (point) justify-flag)
                    (LaTeX-fill-region-as-para-do
                     from (line-beginning-position 0) justify-flag)
                    ;; At least indent the line ending with `\\'.
                    (indent-according-to-mode)))
----------------------------------------------------------------------
According to the comment, lines ending with `\\' are excluded from
filling on purpose, though I'm not sure why. Perhaps for the cases
inside environments like tabular, array, align etc.?

By the way, detection of code comments still needs revision to mimic the
regexp search.
----------------------------------------------------------------------
          ;; Code comments.
          (when (setq has-code-comment
                      (TeX-search-forward-comment-start end-marker))
            ;; See if there is at least one non-whitespace
            ;; character before the comment starts.
            (goto-char has-code-comment)
            (skip-chars-backward " \t" (line-beginning-position))
            (if (bolp)
                ;; Not a code comment.
                (setq has-code-comment nil)))
----------------------------------------------------------------------
It's OK to skip when `TeX-search-forward-comment-start' finds no match,
but we must continue searching when the candidate it found turns out not
to be a code comment. That's what the regexp search would have done.
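
A self-contained sketch of such a "keep searching" loop is given below.
The function name `demo-next-code-comment' is hypothetical, and a plain
search for "%" with a backslash count stands in for
`TeX-search-forward-comment-start', so \verb and similar constructs are
not handled here:

```elisp
(defun demo-next-code-comment (text)
  "Return the buffer position of the first code comment in TEXT, or nil.
A code comment is an unescaped `%' with at least one
non-whitespace character before it on the same line."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (let ((found nil))
      (while (and (not found)
                  (search-forward "%" nil t))
        (let ((start (1- (point))))
          (goto-char start)
          (cond
           ;; Odd number of backslashes before `%': the control symbol
           ;; \%, not a comment.  Resume right after it.
           ((/= 0 (% (skip-chars-backward "\\\\") 2))
            (goto-char (1+ start)))
           ;; Only whitespace before `%': a whole-line comment.  Skip the
           ;; rest of the line and keep searching instead of giving up.
           ((progn (goto-char start)
                   (skip-chars-backward " \t")
                   (bolp))
            (goto-char start)
            (end-of-line))
           ;; Something other than whitespace precedes `%': code comment.
           (t (setq found start)))))
      found)))

;; The first `%' starts a whole-line comment, so the search goes on and
;; returns the position of the `%' on the second line.
(demo-next-code-comment "% whole-line comment\nsome code % trailing\n")
```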

I'll fix this later.

Regards,
Ikumi Keita




Information forwarded to bug-auctex <at> gnu.org:
bug#51762; Package auctex. (Fri, 19 Nov 2021 19:07:02 GMT) Full text and rfc822 format available.

Message #51 received at 51762 <at> debbugs.gnu.org (full text, mbox):

From: Arash Esbati <arash <at> gnu.org>
To: Ikumi Keita <ikumi <at> ikumi.que.jp>
Cc: "Pierre L. Nageoire" <devel <at> pollock-nageoire.net>, 51762 <at> debbugs.gnu.org
Subject: Re: bug#51762: 13.0.14; environment formating
Date: Fri, 19 Nov 2021 20:05:26 +0100
Hi Keita,

Ikumi Keita <ikumi <at> ikumi.que.jp> writes:

> I agree it looks odd, but it seems to be intended behavior. In
> `LaTeX-fill-region-as-paragraph', we have:
> ----------------------------------------------------------------------
>                   ;; Code comments and lines ending with `\par' are
>                   ;; included in filling.  Lines ending with `\\' are
>                   ;; skipped.
>                   (if (or has-code-comment
>                           (match-beginning 1))
>                       (LaTeX-fill-region-as-para-do from (point) justify-flag)
>                     (LaTeX-fill-region-as-para-do
>                      from (line-beginning-position 0) justify-flag)
>                     ;; At least indent the line ending with `\\'.
>                     (indent-according-to-mode)))
> ----------------------------------------------------------------------
> According to the comment, lines ending with `\\' are excluded from
> filling on purpose, though I'm not sure why. Perhaps for the cases
> inside environments like tabular, array, align etc.?

Ah, I see.  Sorry for missing that.  While we're at it, should we also
cater for \newline and \linebreak?  Currently, they are handled as normal
macros.  It should be easy to add them to the regexp.
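
As a rough illustration of that idea (not the regexp actually used in
latex.el), a pattern recognizing lines that end with `\\', \newline or
\linebreak could look like the sketch below. The names
`my-forced-linebreak-regexp' and `my-ends-with-forced-linebreak-p' are
hypothetical, and the sketch deliberately ignores variants such as `\\*',
`\\[2ex]' or \linebreak[3].

```elisp
;; Hypothetical sketch, not the regexp used in latex.el.
(defconst my-forced-linebreak-regexp
  (rx "\\" (or "\\" "newline" "linebreak")
      (* (any " \t")) eol)
  "Match a line ending with `\\\\', `\\newline' or `\\linebreak'.")

(defun my-ends-with-forced-linebreak-p (line)
  "Return non-nil if the string LINE ends with a forced line break."
  (let ((case-fold-search nil))         ; macro names are case-sensitive
    (and (string-match-p my-forced-linebreak-regexp line) t)))
```

For example, (my-ends-with-forced-linebreak-p "text \\newline") is
non-nil, while a line with text after the macro is not matched.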

WDYT?

Best, Arash




Information forwarded to bug-auctex <at> gnu.org:
bug#51762; Package auctex. (Sat, 20 Nov 2021 09:43:01 GMT)

Message #54 received at 51762 <at> debbugs.gnu.org:

From: "Pierre L. Nageoire" <devel <at> pollock-nageoire.net>
To: Arash Esbati <arash <at> gnu.org>
Cc: Ikumi Keita <ikumi <at> ikumi.que.jp>, 51762 <at> debbugs.gnu.org
Subject: Re: bug#51762: 13.0.14; environment formating
Date: Sat, 20 Nov 2021 09:50:51 +0100
Hi,

Anyway, thanks for taking care of this question! I hope you'll find a
good solution; unfortunately, I have no time to investigate this subtle
regexp question. Moreover, I am not familiar with this part of the AUCTeX
code.

So thanks again for doing it for me and, hopefully, for the rest of the
community!

Arash Esbati <arash <at> gnu.org> writes:

> Hi Keita,
>
> Ikumi Keita <ikumi <at> ikumi.que.jp> writes:
>
>>>>>>> Arash Esbati <arash <at> gnu.org> writes:
>>> Thank you for looking into this.  The way I understand this regexp:
>>
>>>   "\\([^ \r\n%\\]\\|\\\\%\\)\\([ \t]\\|\\\\\\\\\\)*"
>>>                     ^^^^^^^
>>
>>> is there to exclude the control symbol \%, i.e., being parsed as comment
>>> start.
>>
>> I think so, too. Tassilo added it to fix bug#48937 this June.
>>
> >>> Would it help if we generalize the control symbol idea by saying:
>>
>>>   "\\([^ \r\n%\\]\\|\\\\[^a-zA-Z0-9\\]\\)\\([ \t]\\|\\\\\\\\\\)*"
>>>                         ^^^^^^^^^^^^^^
>>
>> I'm afraid that it doesn't match a line
>>  \\% This is a code comment.
>> , either. Try typing M-q on the following paragraph in latex mode
>> buffer:
>> ----------------------------------------------------------------------
>> Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Donec hendrerit
>> tempor tellus. Donec pretium posuere tellus. Proin quam nisl, tincidunt et,
>>  \\% This is a code comment.
>> mattis eget, convallis nec, purus.
>> ----------------------------------------------------------------------
>
> I agree; it's hard to come up with a regexp to catch all possible
> comment lines.
>
>>>> Maybe we should give up regexp-based approach to find out code comments
>>>> accurately.
>>
>>> Are you thinking about `syntax-ppss'?
>>
> >> No, other parts of latex.el identify code comments by a different logic
> >> like:
>>        ;; A line with some code, followed by a comment?
>>        ((and (setq code-comment-start (save-excursion
>>                                         (beginning-of-line)
>>                                         (TeX-search-forward-comment-start
>>                                          (line-end-position))))
>>              (> (point) code-comment-start)
>>              (not (TeX-in-commented-line))
>>              (save-excursion
>>                (goto-char code-comment-start)
>>                ;; See if there is at least one non-whitespace character
>>                ;; before the comment starts.
>>                (re-search-backward "[^ \t\n]" (line-beginning-position) t)))
>>
> >> So it would be better to follow this logic than to rely on regexp. In
> >> addition, a regexp-based approach is easily fooled by a percent sign in
> >> \verb, while `TeX-search-forward-comment-start' (which in turn calls
> >> `LaTeX-search-forward-comment-start') takes care of such cases.
>>
>> I ended up with the attached tentative patch. I hope this doesn't slow
>> down the filling loop significantly. What do you think about it?
>
> Do you have an idea about the performance hit?  I'd say we have to bite
> the bullet and use the code.  Our current approach is not the best.  And
> while we're at it, we'll have to take care of this comment in
> `LaTeX-verbatim-macro-boundaries':
>
>     ;; XXX: Here we assume we are dealing with \verb which
>     ;; expects the delimiter right behind the command.
>     ;; However, \lstinline can also cope with whitespace as
>     ;; well as an optional argument after the command.
>
> Other packages like fancyvrb and minted do the same: Inline verb macros
> can have an optional and a mandatory argument.  So the regexp fun will
> continue :-)
>
> Best, Arash
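
To make the failure discussed in the quoted text concrete, here is a
small check of both regexps against the line from Keita's example. It is
only a sketch: it assumes that each regexp is immediately followed by a
literal `%' (the quoted text shows only the leading part), and the names
`my-re-current', `my-re-proposed' and `my-line' are made up for
illustration.

```elisp
;; Sketch only.  Assumption: the comment start that follows each quoted
;; regexp is a literal "%".
(defconst my-re-current
  (concat "\\([^ \r\n%\\]\\|\\\\%\\)\\([ \t]\\|\\\\\\\\\\)*" "%"))
(defconst my-re-proposed
  (concat "\\([^ \r\n%\\]\\|\\\\[^a-zA-Z0-9\\]\\)\\([ \t]\\|\\\\\\\\\\)*" "%"))

;; The line from the example: a space, `\\', then a comment.
(defconst my-line " \\\\% This is a code comment.")

(list (string-match-p my-re-current my-line)
      (string-match-p my-re-proposed my-line)
      ;; With ordinary text before `\\', the current regexp does match.
      (string-match-p my-re-current "foo \\\\% comment"))
;; => (nil nil 2)
```

Neither regexp sees the comment when `\\' is the only code on the line,
because the backslash just before `%' gets paired with it as `\%'.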




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sat, 18 Dec 2021 12:24:07 GMT)

This bug report was last modified 2 years and 91 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.