GNU bug report logs - #71685
[PATCH] fix shr rendering in tables without tbody

Package: emacs;

Reported by: JD Smith <jdtsmith <at> gmail.com>

Date: Thu, 20 Jun 2024 19:16:01 UTC

Severity: normal

Tags: patch

Fixed in version 30.1

Done: Stefan Kangas <stefankangas <at> gmail.com>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 71685 in the body.
You can then email your comments to 71685 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#71685; Package emacs. (Thu, 20 Jun 2024 19:16:01 GMT)

Acknowledgement sent to JD Smith <jdtsmith <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Thu, 20 Jun 2024 19:16:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: JD Smith <jdtsmith <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: [PATCH] fix shr rendering in tables without tbody
Date: Thu, 20 Jun 2024 15:15:32 -0400
It is very common for HTML tables to include a header (<thead>) and/or footer (<tfoot>) without using <tbody>.  Modern browsers simply supply an implicit <tbody>...</tbody> around all the unparented rows in a table.  `shr' does not handle this common case correctly.  Below is an example with <thead> but no <tbody>.  shr prints the header, but then also pulls the <thead> into the body it derives, so the header is printed a second time, collapsed into a single cell.

The function that should handle this is `shr--fix-tbody'.  The attached patch to this function simply keeps `thead' and `tfoot' children out of the implicit body.

(let ((shr-table-vertical-line ?|)
      (shr-table-horizontal-line ?-))
  (shr-insert-document
   (with-temp-buffer
     (insert "<table>
<thead><tr><th>A</th><th>B</th></tr></thead>
<tr><td>1</td><td>2</td></tr>
<tr><td>3</td><td>4</td></tr>
</table>")
     (libxml-parse-html-region))))

 ---------  
| ---  --   |
||A |B | |
| ---  --   |
||AB | |
| ---  --   |
||1 |2 | |
| ---  --   |
||3 |4 | |
| ---  --   |
 ---------  
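
For illustration, the change described above can be sketched as a standalone function.  This is only a sketch under stated assumptions, not the attached patch: the name `my-implicit-tbody' is hypothetical, and it assumes the <table> node itself is passed in when the table has no <tbody>.  It uses only the accessors from dom.el.

```elisp
(require 'cl-lib)
(require 'dom)

;; Hypothetical helper, for illustration only; not the code in shr.el.
(defun my-implicit-tbody (table)
  "Return a `tbody' node built from the children of TABLE.
Children tagged `thead' or `tfoot' are left out.  Any other child
that is not a `tr' (including bare strings) is wrapped in a row
of its own."
  (nconc (list 'tbody (dom-attributes table))
         (cl-loop for child in (dom-children table)
                  ;; Strings have no tag; treat them as tag nil.
                  for tag = (and (not (stringp child)) (dom-tag child))
                  unless (memq tag '(thead tfoot))
                  collect (if (eq tag 'tr)
                              child
                            (list 'tr nil (list 'td nil child))))))

;; Hand-written DOM for the table above, without the whitespace
;; text nodes a real parse would contain.
(my-implicit-tbody
 '(table nil
         (thead nil (tr nil (th nil "A") (th nil "B")))
         (tr nil (td nil "1") (td nil "2"))
         (tr nil (td nil "3") (td nil "4"))))
;; => (tbody nil
;;           (tr nil (td nil "1") (td nil "2"))
;;           (tr nil (td nil "3") (td nil "4")))
```

Without the `unless' clause, the <thead> would be treated like any other non-row child and wrapped in a single cell, which matches the duplicated "AB" row in the output above.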


[shr_fix_tbody.patch (application/octet-stream, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#71685; Package emacs. (Sat, 06 Jul 2024 07:37:01 GMT)

Message #8 received at 71685 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: JD Smith <jdtsmith <at> gmail.com>
Cc: 71685 <at> debbugs.gnu.org
Subject: Re: bug#71685: [PATCH] fix shr rendering in tables without tbody
Date: Sat, 06 Jul 2024 10:36:31 +0300
> From: JD Smith <jdtsmith <at> gmail.com>
> Date: Thu, 20 Jun 2024 15:15:32 -0400
> 
> It is very common for HTML tables to include a header (<thead>) and/or footer (<tfoot>) without using <tbody>.  Modern browsers simply supply an implicit <tbody>..</tbody> around all the unparented rows in a table.  `shr' does not handle this common case correctly.  Below is an example with <thead> but not <tbody>.  It prints the header, but then subsumes it again inside the derived body, printing the header again in a single cell.  
> 
> The relevant function which should handle this is `shr--fix-tbody'.   The included patch to this function simply avoids including `thead` and `tfoot` children in the implicit body.

Thanks.  I don't see any experts chiming in, so if you are confident
in the patch, and it doesn't fail the existing tests, please install
on the emacs-30 branch, and thanks.  Bonus points for adding a test
for this case.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#71685; Package emacs. (Sat, 06 Jul 2024 18:15:01 GMT)

Message #11 received at 71685 <at> debbugs.gnu.org:

From: JD Smith <jdtsmith <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 71685 <at> debbugs.gnu.org
Subject: Re: bug#71685: [PATCH] fix shr rendering in tables without tbody
Date: Sat, 6 Jul 2024 14:13:30 -0400

> On Jul 6, 2024, at 3:36 AM, Eli Zaretskii <eliz <at> gnu.org> wrote:
> 
>> From: JD Smith <jdtsmith <at> gmail.com>
>> Date: Thu, 20 Jun 2024 15:15:32 -0400
>> 
>> It is very common for HTML tables to include a header (<thead>) and/or footer (<tfoot>) without using <tbody>.  Modern browsers simply supply an implicit <tbody>..</tbody> around all the unparented rows in a table.  `shr' does not handle this common case correctly.  Below is an example with <thead> but not <tbody>.  It prints the header, but then subsumes it again inside the derived body, printing the header again in a single cell.  
>> 
>> The relevant function which should handle this is `shr--fix-tbody'.   The included patch to this function simply avoids including `thead` and `tfoot` children in the implicit body.
> 
> Thanks.  I don't see any experts chiming in, so if you are confident
> in the patch, and it doesn't fail the existing tests, please install
> on the emacs-30 branch, and thanks.  Bonus points for adding a test
> for this case.

Thanks.  I'm afraid I don't have write access on Savannah.  I've added a test and formatted the patch, below.  All shr tests succeed.

[0001-Fix-formatting-of-tables-with-thead-tfoot-but-no-tbo.patch (application/octet-stream, attachment)]

Reply sent to Stefan Kangas <stefankangas <at> gmail.com>:
You have taken responsibility. (Sat, 06 Jul 2024 19:13:02 GMT)

Notification sent to JD Smith <jdtsmith <at> gmail.com>:
bug acknowledged by developer. (Sat, 06 Jul 2024 19:13:02 GMT)

Message #16 received at 71685-done <at> debbugs.gnu.org:

From: Stefan Kangas <stefankangas <at> gmail.com>
To: JD Smith <jdtsmith <at> gmail.com>, Eli Zaretskii <eliz <at> gnu.org>
Cc: 71685-done <at> debbugs.gnu.org
Subject: Re: bug#71685: [PATCH] fix shr rendering in tables without tbody
Date: Sat, 6 Jul 2024 19:11:00 +0000
Version: 30.1

JD Smith <jdtsmith <at> gmail.com> writes:

> I've added a test and formatted the patch, below.  All shr tests
> succeed.

Thanks, installed on emacs-30 as commit 9625e4af994.

I'm therefore closing this bug report.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 04 Aug 2024 11:24:17 GMT)

This bug report was last modified 34 days ago.
