GNU bug report logs - #45477
27.1; RFE: Make full RSS fragments available for nnrss servers


Packages: emacs, gnus

Reported by: Tim Landscheidt <tim <at> tim-landscheidt.de>

Date: Sun, 27 Dec 2020 21:31:01 UTC

Severity: wishlist

Found in version 27.1

To reply to this bug, email your comments to 45477 AT debbugs.gnu.org.


Report forwarded to bug-gnu-emacs <at> gnu.org, bugs <at> gnus.org:
bug#45477; Package emacs,gnus. (Sun, 27 Dec 2020 21:31:02 GMT)

Acknowledgement sent to Tim Landscheidt <tim <at> tim-landscheidt.de>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org, bugs <at> gnus.org. (Sun, 27 Dec 2020 21:31:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Tim Landscheidt <tim <at> tim-landscheidt.de>
To: bug-gnu-emacs <at> gnu.org
Subject: 27.1; RFE: Make full RSS fragments available for nnrss servers
Date: Sun, 27 Dec 2020 21:30:36 +0000
Severity: wishlist

Some RSS feeds provide additional data in extra tags; for
example, http://feeds.feedburner.com/DougLovesMovies (and
others) includes information for/from iTunes:

|     <item>
|       <title>Jimmy Pardo, Matt Belknap, Eliot Hochberg and Garon Cockrell guest</title>
|       <description><![CDATA[<p>In a special holiday Doug Loves Movies-Never Not Funny cross-over event, Doug welcomes Jimmy Pardo, Matt Belknap, Eliot Hochberg and Garon Cockrell to the show.</p>]]></description>
|       <itunes:title>Jimmy Pardo, Matt Belknap, Eliot Hochberg and Garon Cockrell guest</itunes:title>
|       <itunes:episodeType>full</itunes:episodeType>
|       <itunes:episode>1286</itunes:episode>
|       <itunes:summary>In a special holiday Doug Loves Movies-Never Not Funny cross-over event, Doug welcomes Jimmy Pardo, Matt Belknap, Eliot Hochberg and Garon Cockrell to the show.</itunes:summary>
|       <content:encoded><![CDATA[<p>In a special holiday Doug Loves Movies-Never Not Funny cross-over event, Doug welcomes Jimmy Pardo, Matt Belknap, Eliot Hochberg and Garon Cockrell to the show.</p>]]></content:encoded>
|       <guid isPermaLink="false">gid://art19-episode-locator/V0/R8l82ylk4BaF_BypOVQuG89EyfRunLiF485aNFQW_mA</guid>
|       <pubDate>Thu, 24 Dec 2020 08:00:00 -0000</pubDate>
|       <itunes:explicit>yes</itunes:explicit>
|       <itunes:image href="https://content.production.cdn.art19.com/images/82/2f/79/e8/822f79e8-722d-4a6b-8647-8cc9741e87a9/f933e361a2ab293aa85d74f63a4ea343522012cae57c6efa04c518b09d5ad28201c3312b452e93b09b6e349f33b211553dedcf0b7ab4da2702d3680700b416ec.jpeg"/>
|       <itunes:keywords>DLM</itunes:keywords>
|       <itunes:duration>00:48:13</itunes:duration>
|       <enclosure url="http://feedproxy.google.com/~r/DougLovesMovies/~5/4ig02kQVqkQ/7d50103f-d685-4f0f-814e-b38fd2d643d8.mp3" length="46292323" type="audio/mpeg"/>
|       <feedburner:origEnclosureLink>https://rss.art19.com/episodes/7d50103f-d685-4f0f-814e-b38fd2d643d8.mp3</feedburner:origEnclosureLink>
|     </item>

AFAICT, all information except title, date, description and
enclosure gets thrown away by nnrss-check-group.  This makes
it impossible to process this information when displaying an
article.

It would be very useful to have this information available.
A very simplistic solution would be to add item to the tuple
that gets pushed to nnrss-group-data by nnrss-check-group so
that it can be accessed via:

| (nth 9 (alist-get
|         (gnus-summary-article-number)
|         nnrss-group-data))

(This method, with 9 replaced by 2 or 6, already allows
access to "pure" representations of title, URL & Co.)

However, it might be prudent to have a more stable
interface :-).
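
One possible shape for such a more stable interface is a small accessor that hides the positional lookup. This is only a sketch: the name `my-nnrss-entry-field` and the sample data are hypothetical, the positions follow the ones mentioned above (2 for the title, 9 for the proposed raw item), and the alist is passed in explicitly instead of being read from nnrss-group-data.

```elisp
;; Hypothetical helper, not part of nnrss.
(defun my-nnrss-entry-field (group-data article-number index)
  "Return element INDEX of ARTICLE-NUMBER's entry in GROUP-DATA.
GROUP-DATA is an alist keyed by article number, shaped like
`nnrss-group-data'.  Return nil if the article or field is missing."
  (nth index (alist-get article-number group-data)))

;; Toy data standing in for `nnrss-group-data' (assumed layout:
;; article number, then a timestamp, a URL and the title).
(defvar my-nnrss-sample-data
  '((1 time "http://example.org/1" "Episode title")))

;; Look up the title (position 2) of article 1.
(my-nnrss-entry-field my-nnrss-sample-data 1 2)
```

In a summary buffer the call would presumably take `nnrss-group-data` and `(gnus-summary-article-number)` as its first two arguments, as in the snippet above.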




Message #8 received at 45477 <at> debbugs.gnu.org:

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Tim Landscheidt <tim <at> tim-landscheidt.de>
Cc: 45477 <at> debbugs.gnu.org
Subject: Re: bug#45477: 27.1; RFE: Make full RSS fragments available for nnrss servers
Date: Mon, 28 Dec 2020 01:05:45 +0100

Tim Landscheidt <tim <at> tim-landscheidt.de> writes:

> Some RSS feeds provide additional data in extra tags; for
> example, http://feeds.feedburner.com/DougLovesMovies (and
> others) includes information for/from iTunes:

[...]

> |       <itunes:title>Jimmy Pardo, Matt Belknap, Eliot Hochberg and
> | Garon Cockrell guest</itunes:title>
> |       <itunes:episodeType>full</itunes:episodeType>
> |       <itunes:episode>1286</itunes:episode>
> |       <itunes:summary>In a special holiday Doug Loves Movies-Never
> | Not Funny cross-over event, Doug welcomes Jimmy Pardo, Matt Belknap,
> | Eliot Hochberg and Garon Cockrell to the show.</itunes:summary>

This information?

> |       <content:encoded><![CDATA[<p>In a special holiday Doug Loves
> | Movies-Never Not Funny cross-over event, Doug welcomes Jimmy Pardo,
> | Matt Belknap, Eliot Hochberg and Garon Cockrell to the
> | show.</p>]]></content:encoded>

[...]

> |       <enclosure
> | url="http://feedproxy.google.com/~r/DougLovesMovies/~5/4ig02kQVqkQ/7d50103f-d685-4f0f-814e-b38fd2d643d8.mp3"
> | length="46292323" type="audio/mpeg"/>

I'm not very familiar with nnrss, but it looks like the itunes: info
mostly replicates the info in the other fields?

> It would be very useful to have this information available.
> A very simplistic solution would be to add item to the tuple
> that gets pushed to nnrss-group-data by nnrss-check-group so
> that it can be accessed via:
>
> | (nth 9 (alist-get
> |         (gnus-summary-article-number)
> |         nnrss-group-data))
>
> (This method, with 9 replaced by 2 or 6, already allows ac-
> cess to "pure" representations of title, URL & Co.)
>
> However it might be prudent to have a more stable inter-
> face :-).

Sure, I guess stashing it there would make sense, but it would require
people that want to use the info to write a bit of code, right?  Just
stashing all the info there seems a bit... odd to me somehow.  I don't
think any other backends do that?

So I'm wondering whether this could be fixed in some other way, that
would be useful to everybody without writing further code to use the
data.  So would it make sense just to include the data from the extra
fields here in the message body?

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Message #11 received at 45477 <at> debbugs.gnu.org:

From: Tim Landscheidt <tim <at> tim-landscheidt.de>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: 45477 <at> debbugs.gnu.org
Subject: Re: bug#45477: 27.1; RFE: Make full RSS fragments available for nnrss servers
Date: Tue, 29 Dec 2020 16:50:07 +0000

Lars Ingebrigtsen <larsi <at> gnus.org> wrote:

>> Some RSS feeds provide additional data in extra tags; for
>> example, http://feeds.feedburner.com/DougLovesMovies (and
>> others) includes information for/from iTunes:

> [...]

>> |       <itunes:title>Jimmy Pardo, Matt Belknap, Eliot Hochberg and
>> | Garon Cockrell guest</itunes:title>
>> |       <itunes:episodeType>full</itunes:episodeType>
>> |       <itunes:episode>1286</itunes:episode>
>> |       <itunes:summary>In a special holiday Doug Loves Movies-Never
>> | Not Funny cross-over event, Doug welcomes Jimmy Pardo, Matt Belknap,
>> | Eliot Hochberg and Garon Cockrell to the show.</itunes:summary>

> This information?

>> |       <content:encoded><![CDATA[<p>In a special holiday Doug Loves
>> | Movies-Never Not Funny cross-over event, Doug welcomes Jimmy Pardo,
>> | Matt Belknap, Eliot Hochberg and Garon Cockrell to the
>> | show.</p>]]></content:encoded>

> [...]

>> |       <enclosure
>> | url="http://feedproxy.google.com/~r/DougLovesMovies/~5/4ig02kQVqkQ/7d50103f-d685-4f0f-814e-b38fd2d643d8.mp3"
>> | length="46292323" type="audio/mpeg"/>

> I'm not very familiar with nnrss, but it looks like the itunes: info
> mostly replicates the info in the other fields?

No; for example and especially in my use case at hand, the
content of itunes:episode is not contained anywhere else.

>> It would be very useful to have this information available.
>> A very simplistic solution would be to add item to the tuple
>> that gets pushed to nnrss-group-data by nnrss-check-group so
>> that it can be accessed via:

>> | (nth 9 (alist-get
>> |         (gnus-summary-article-number)
>> |         nnrss-group-data))

>> (This method, with 9 replaced by 2 or 6, already allows ac-
>> cess to "pure" representations of title, URL & Co.)

>> However it might be prudent to have a more stable inter-
>> face :-).

> Sure, I guess stashing it there would make sense, but it would require
> people that want to use the info to write a bit of code, right?  Just
> stashing all the info there seems a bit... odd to me somehow.  I don't
> think any other backends do that?

> So I'm wondering whether this could be fixed in some other way, that
> would be useful to everybody without writing further code to use the
> data.  So would it make sense just to include the data from the extra
> fields here in the message body?

To draw a bigger picture: My use case (and existing workflow
with newsticker) is, when displaying an episode's Gnus
article, to provide the user (me) with a command to import the
episode's data (feed title, episode number, episode title,
episode URL) into my database for further processing.
Therefore, I need the data, and I need it in a format that
can be processed further, and there will be a need for a
custom user function to process the data because each use
case will be different.

So just unconditionally mangling and dumping the data into
the message body will help neither me nor the users who just
want to use nnrss "normally".

I thought about including the raw XML fragments as either a
Base64-encoded X-Gnus-nnrss-Entry-XML header or a
multipart/alternative MIME part.  However, regardless of the
solution, Gnus would need to provide a function that returns
the DOM for the current/an article, and with the current
design, adding an element to nnrss-group-data is probably the
easiest path of the three.
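
For comparison, the header variant could be sketched roughly as follows. The function names are hypothetical, and a real implementation would still have to deal with folding very long header lines; the sketch only shows the Base64 round trip of one XML fragment.

```elisp
;; Hypothetical helpers for an X-Gnus-nnrss-Entry-XML style header.
(defun my-nnrss-encode-entry-header (xml)
  "Return the string XML as a single-line Base64 header value."
  ;; Encode to UTF-8 bytes first; Base64 works on unibyte strings.
  (base64-encode-string (encode-coding-string xml 'utf-8) t))

(defun my-nnrss-decode-entry-header (value)
  "Return the XML string stored in the Base64 header VALUE."
  (decode-coding-string (base64-decode-string value) 'utf-8))

;; Round trip of a small fragment.
(my-nnrss-decode-entry-header
 (my-nnrss-encode-entry-header
  "<item><itunes:episode>1286</itunes:episode></item>"))
```

The decoded string would then still need to be parsed, for example with `xml-parse-region` in a temporary buffer, which is the extra step that makes this path longer than extending nnrss-group-data.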

(If there was a major overhaul of nnrss, it could be
interesting to forgo the intermediate nnrss-group-data saved in
~/News/rss/* and either store the feeds as pure XML files,
re-parsed on demand and available for further processing, or
write out all the articles as mbox files after parsing the
feeds, with the entries' fragments as MIME parts.)




Message #14 received at 45477 <at> debbugs.gnu.org:

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Tim Landscheidt <tim <at> tim-landscheidt.de>
Cc: 45477 <at> debbugs.gnu.org
Subject: Re: bug#45477: 27.1; RFE: Make full RSS fragments available for nnrss servers
Date: Wed, 30 Dec 2020 04:02:55 +0100

Tim Landscheidt <tim <at> tim-landscheidt.de> writes:

> Therefore, I need the data, and I need it in a format that
> can be processed further, and there will be a need for a
> custom user function to process the data because each use
> case will be different.

Sure, that sounds reasonable.  However:

> (If there was a major overhaul of nnrss, it could be inter-
> esting to forego the intermediate nnrss-group-data saved in
> ~/News/rss/* and either store the feeds as pure XML files,
> re-parsed on demand and available for further processing, or
> write out all the articles as mbox files after parsing the
> feeds, with the entries' fragments as MIME parts.)

I've not used nnrss myself, but reading the code, it seems like it's
storing all the data needed for Gnus to read an nnrss group in
`nnrss-group-data', so storing all the XML data in case somebody is
going to use it would require orders of magnitude more storage?

I think a way to implement this would be to add an nnrss variable that
says what "extra" XML fields to store -- like (nnrss-extra-fields
'(itunes:episodeType ...)).
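
A minimal sketch of how such a variable might be used, assuming item nodes in the list form that xml.el's `xml-parse-region` produces without namespace processing. The names `my-nnrss-extra-fields` and `my-nnrss-collect-extra-fields` are hypothetical and not part of nnrss.

```elisp
(require 'xml)

;; Hypothetical user option: child tags of <item> worth keeping.
(defvar my-nnrss-extra-fields '(itunes:episode itunes:episodeType)
  "List of extra item child tags (symbols) to store.")

(defun my-nnrss-collect-extra-fields (item fields)
  "Return an alist of (TAG . TEXT) for each tag in FIELDS found in ITEM.
ITEM is an XML node as returned by `xml-parse-region'.  Tags that do
not occur in ITEM are skipped."
  (let (result)
    (dolist (tag fields (nreverse result))
      (let ((child (car (xml-get-children item tag))))
        (when child
          (push (cons tag (car (xml-node-children child))) result))))))

;; Parse a shortened version of the item from the report and
;; collect the configured fields.
(with-temp-buffer
  (insert "<item><title>T</title>"
          "<itunes:episodeType>full</itunes:episodeType>"
          "<itunes:episode>1286</itunes:episode></item>")
  (my-nnrss-collect-extra-fields
   (car (xml-parse-region (point-min) (point-max)))
   my-nnrss-extra-fields))
```

Storing only the resulting alist keeps the saved data small, at the cost of having to refetch the feed whenever the list of fields changes.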

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Message #17 received at 45477 <at> debbugs.gnu.org:

From: Tim Landscheidt <tim <at> tim-landscheidt.de>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: 45477 <at> debbugs.gnu.org
Subject: Re: bug#45477: 27.1; RFE: Make full RSS fragments available for nnrss servers
Date: Wed, 30 Dec 2020 08:22:15 +0000

Lars Ingebrigtsen <larsi <at> gnus.org> wrote:

>> Therefore, I need the data, and I need it in a format that
>> can be processed further, and there will be a need for a
>> custom user function to process the data because each use
>> case will be different.

> Sure, that sounds reasonable.  However:

>> (If there was a major overhaul of nnrss, it could be inter-
>> esting to forego the intermediate nnrss-group-data saved in
>> ~/News/rss/* and either store the feeds as pure XML files,
>> re-parsed on demand and available for further processing, or
>> write out all the articles as mbox files after parsing the
>> feeds, with the entries' fragments as MIME parts.)

> I've not used nnrss myself, but reading the code, it seems like it's
> storing all the data needed for Gnus to read an nnrss group in
> `nnrss-group-data', so storing all the XML data in case somebody is
> going to use it would require orders of magnitude more storage?

In my practice (so far), not even one order of magnitude.  Random
on-disk sample:

|   -rw-r--r--. 1 root root  132680 Dec 30 06:56 Conan O’Brien Needs A Friend.el
|   -rw-r--r--. 1 root root  471799 Dec 30 07:28 Conan O’Brien Needs A Friend.xml
|   -rw-r--r--. 1 root root   72249 Dec 30 06:56 Doug Loves Movies.el
|   -rw-r--r--. 1 root root  245312 Dec 30 07:01 Doug Loves Movies.xml
|   -rw-r--r--. 1 root root  630495 Dec 30 06:56 ID10T with Chris Hardwick.el
|   -rw-r--r--. 1 root root 2100500 Dec 27 21:30 ID10T with Chris Hardwick.xml
|   -rw-r--r--. 1 root root   21754 Dec 30 06:56 Sprechen wir über Mord?! Der SWR2 True Crime Podcast.el
|   -rw-r--r--. 1 root root   47741 Dec 30 06:36 Sprechen wir über Mord?! Der SWR2 True Crime Podcast.xml
|   -rw-r--r--. 1 root root   93927 Dec 30 06:56 Stone Clearing With Richard Herring.el
|   -rw-r--r--. 1 root root  221040 Dec 30 07:25 Stone Clearing With Richard Herring.xml
|   -rw-r--r--. 1 root root   17002 Dec 30 06:56 Taskmaster The Podcast.el
|   -rw-r--r--. 1 root root   53080 Dec 30 07:28 Taskmaster The Podcast.xml
|   -rw-r--r--. 1 root root  265970 Dec 30 06:56 You Made It Weird with Pete Holmes.el
|   -rw-r--r--. 1 root root  650710 Dec 30 04:07 You Made It Weird with Pete Holmes.xml

Even if the XML gets bloated when saved in nnrss-group-data
(it holds one feed at most), IMHO almost all feeds will be
small enough to be negligible in a typical Emacs/Gnus setup
(the largest feed above holds data from February 2010 till
now; usually feeds only contain the most recent x entries).
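
The size ratios in the listing above can be checked with a few lines of Emacs Lisp. This is only an illustration: the variable name is hypothetical, the byte counts are copied from the listing, and it assumes that each .el file is the saved nnrss data and each .xml file the corresponding raw feed.

```elisp
;; (EL-BYTES . XML-BYTES) pairs copied from the listing above.
(defvar my-feed-sizes
  '((132680 . 471799)
    (72249 . 245312)
    (630495 . 2100500)
    (21754 . 47741)
    (93927 . 221040)
    (17002 . 53080)
    (265970 . 650710))
  "Hypothetical alist of saved-data size and raw-feed size per feed.")

;; Ratio of raw XML size to saved nnrss data size for each feed.
(mapcar (lambda (pair) (/ (float (cdr pair)) (car pair)))
        my-feed-sizes)
```

Every ratio in this sample comes out between 2 and 4, so the raw XML is a small multiple of the saved data rather than ten times larger.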

> I think a way to implement this would be to add an nnrss variable that
> says what "extra" XML fields to store -- like (nnrss-extra-fields
> '(itunes:episodeType ...)).

That would allow my use case.  (In a major overhaul, another
way to approach this could be a hook/function variable
(configurable per group) that gets called in addition to/in lieu of
nnrss-request-article with the raw XML data and then has
free rein to format the Gnus article as it wishes to.)
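
Such a function variable might look roughly like the following sketch. All names are hypothetical, the per-group configuration is left out, and the item is assumed to be a node in xml.el's list form, that is (TAG ATTRIBUTES CHILDREN...).

```elisp
(require 'xml)

;; Hypothetical function variable, not part of nnrss.
(defvar my-nnrss-format-article-function nil
  "Function called with a parsed <item> node, or nil.
It should return the article body as a string, or nil to fall
back to the default formatting.")

(defun my-nnrss-format-article (item)
  "Return the article body for ITEM.
Use `my-nnrss-format-article-function' if it is set and returns
non-nil; otherwise fall back to the item's title text."
  (or (and my-nnrss-format-article-function
           (funcall my-nnrss-format-article-function item))
      (let ((title (car (xml-get-children item 'title))))
        (or (car (xml-node-children title)) ""))))

;; A user-supplied formatter that shows the episode number.
(let ((my-nnrss-format-article-function
       (lambda (item)
         (format "Episode %s"
                 (car (xml-node-children
                       (car (xml-get-children item 'itunes:episode))))))))
  (my-nnrss-format-article
   '(item nil (title nil "T") (itunes:episode nil "1286"))))
```

With the variable left at nil, the same call falls back to the title text, so users who want nnrss to behave "normally" would not be affected.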




Message #20 received at 45477 <at> debbugs.gnu.org:

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Tim Landscheidt <tim <at> tim-landscheidt.de>
Cc: 45477 <at> debbugs.gnu.org
Subject: Re: bug#45477: 27.1; RFE: Make full RSS fragments available for nnrss servers
Date: Thu, 31 Dec 2020 05:33:00 +0100

Tim Landscheidt <tim <at> tim-landscheidt.de> writes:

> Even if the XML gets bloated when saved in nnrss-group-data
> (it holds one feed at most), IMHO almost all feeds will be
> small enough to be negligible in a typical Emacs/Gnus setup
> (the largest feed above holds data from February 2010 till
> now; usually feeds only contain the most recent x entries).

Doesn't nnrss-group-data store older entries, though?  I just skimmed
the nnrss code, and I didn't see any pruning...

>> I think a way to implement this would be to add an nnrss variable that
>> says what "extra" XML fields to store -- like (nnrss-extra-fields
>> '(itunes:episodeType ...)).
>
> That would allow my use case.

Patches welcome.  :-)

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Message #23 received at 45477 <at> debbugs.gnu.org:

From: Tim Landscheidt <tim <at> tim-landscheidt.de>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: 45477 <at> debbugs.gnu.org
Subject: Re: bug#45477: 27.1; RFE: Make full RSS fragments available for nnrss servers
Date: Fri, 01 Jan 2021 17:09:02 +0000

Lars Ingebrigtsen <larsi <at> gnus.org> wrote:

>> Even if the XML gets bloated when saved in nnrss-group-data
>> (it holds one feed at most), IMHO almost all feeds will be
>> small enough to be negligible in a typical Emacs/Gnus setup
>> (the largest feed above holds data from February 2010 till
>> now; usually feeds only contain the most recent x entries).

> Doesn't nnrss-group-data store older entries, though?  I just skimmed
> the nnrss code, and I didn't see any pruning...

I assumed that was done by the normal expire process, but I
didn't look deeper into that.

>>> I think a way to implement this would be to add an nnrss variable that
>>> says what "extra" XML fields to store -- like (nnrss-extra-fields
>>> '(itunes:episodeType ...)).

>> That would allow my use case.

> Patches welcome.  :-)

Well, in that case I'd rather work on a new, clean, shiny
backend that accepts Atom and RSS feeds and does everything
The Right Way™ :-).




Message #26 received at 45477 <at> debbugs.gnu.org:

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Tim Landscheidt <tim <at> tim-landscheidt.de>
Cc: 45477 <at> debbugs.gnu.org
Subject: Re: bug#45477: 27.1; RFE: Make full RSS fragments available for nnrss servers
Date: Sat, 02 Jan 2021 06:52:18 +0100

Tim Landscheidt <tim <at> tim-landscheidt.de> writes:

> Well, in that case I rather work on a new, clean, shiny
> backend that accepts Atom and RSS feeds and does everything
> The Right Way™ :-).

That sounds more fun.  :-)  But nnrss users would probably prefer that
nnrss gets improved instead of getting a new backend that does kinda
sorta the same thing, though.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




This bug report was last modified 3 years and 113 days ago.
