What's the problem this feature will solve?
I run a PyPI mirror that uses the RSS feeds to keep in sync with new packages and releases. The feeds always return the 40 most recent items. Currently I query the feeds every 5 minutes, but sometimes there are more than 40 releases within 5 minutes. To work around this with the current API I would have to query much more frequently, which most of the time puts unnecessary load on PyPI and, depending on how rapidly packages are being published, might still be insufficient.
Describe the solution you'd like
The RSS feeds could accept a pair of query parameters:
limit: allow returning up to limit items instead of the current 40, probably up to some cap for performance reasons
max_age: return only items published in the last max_age seconds
This would make mirroring much easier, as I could set max_age to my polling interval plus epsilon. I would not be retrieving any more items than necessary, so I think the load on PyPI would be lighter.
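To make the intended semantics concrete, here is a minimal client-side sketch of what `limit` and `max_age` would select from a feed. It is only an illustration of the proposal, not how PyPI would implement it: `select_items`, `SAMPLE_FEED`, and `NOW` are hypothetical names, and the feed contents and dates are made up.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

# Made-up feed with three items, newest first (an assumption for this sketch).
SAMPLE_FEED = """<rss version="2.0"><channel>
<item><title>pkg-a 1.0</title><link>https://example.invalid/pkg-a/1.0/</link>
<pubDate>Sat, 07 Dec 2019 10:04:00 +0000</pubDate></item>
<item><title>pkg-b 2.1</title><link>https://example.invalid/pkg-b/2.1/</link>
<pubDate>Sat, 07 Dec 2019 10:01:00 +0000</pubDate></item>
<item><title>pkg-c 0.3</title><link>https://example.invalid/pkg-c/0.3/</link>
<pubDate>Sat, 07 Dec 2019 09:40:00 +0000</pubDate></item>
</channel></rss>"""

# Fixed "current time" so the example is reproducible.
NOW = datetime(2019, 12, 7, 10, 5, 0, tzinfo=timezone.utc)


def select_items(rss_xml, now, limit=40, max_age=None):
    """Return up to `limit` (title, link) pairs published in the last `max_age` seconds."""
    selected = []
    for item in ET.fromstring(rss_xml).iter("item"):
        published = parsedate_to_datetime(item.findtext("pubDate"))
        if published.tzinfo is None:
            # Treat dates without an offset as UTC (an assumption).
            published = published.replace(tzinfo=timezone.utc)
        if max_age is not None and now - published > timedelta(seconds=max_age):
            continue  # older than the requested window
        selected.append((item.findtext("title"), item.findtext("link")))
        if len(selected) >= limit:
            break
    return selected


# Polling interval of 5 minutes (300 s) plus a 30 s epsilon.
recent = select_items(SAMPLE_FEED, NOW, max_age=330)
```

With a 330 second window, only the two items published after 09:59:30 are selected; the third falls outside the window and would not be transferred at all if the server applied the filter.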
Additional context
This accomplishes something similar to the deprecated XML-RPC changelog, but the documentation recommends using the RSS feeds instead.
This request is probably of interest to the same audience as #1683, but the feature itself is orthogonal.
Thanks for the feature request! Apologies that we haven't been able to respond before you started work in #7117.
The XML-RPC changelog is deprecated largely because it has this feature, which makes it extremely challenging to cache and thus pretty resource-intensive for PyPI.
Adding limit and max_age parameters would introduce the same problem for our RSS feeds. Instead of the single cache entry per feed that we currently have, there would be a huge number of unique cache entries, one per combination of parameter values. Each of them would need to be invalidated when a new package affects it, and we'd need to know whether a new package affects it.
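A rough way to see the scale of the problem is to count distinct cache keys. The sketch below is purely illustrative: the feed names, the cap of 100 on limit, and the 10 minute ceiling on max_age are assumptions, and it does not reflect how PyPI's cache is actually keyed.

```python
from itertools import product


def cache_keys(feeds, limits=(None,), max_ages=(None,)):
    """Enumerate the distinct (feed, limit, max_age) combinations a cache would hold."""
    return set(product(feeds, limits, max_ages))


feeds = ["updates", "packages"]  # illustrative feed names

# Today: no query parameters, so one cache entry per feed.
current = cache_keys(feeds)

# Hypothetical: limit in 1..100 and max_age in 1..600 seconds.
proposed = cache_keys(feeds, limits=range(1, 101), max_ages=range(1, 601))

print(len(current), len(proposed))
```

Even with these modest assumed ranges, two entries become 120,000, and each one would have to be tracked for invalidation.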
You seem concerned about putting too much load on PyPI, which we definitely appreciate, but I think that concern is probably unnecessary in this instance, since you'll almost always be hitting our cache. You could increase your request rate to 1/min or even 1/s and it would be a small blip in our total traffic.
That would probably be the best solution in the short term, but I understand that it might not be ideal for you or might be too resource-intensive on your end.
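For what it's worth, polling more often mostly needs deduplication on the mirror side. Here is a minimal sketch of that, assuming each feed item carries a unique link; `FeedTracker` and `make_feed` are hypothetical names, and the feed XML is simplified to just what the example needs.

```python
import xml.etree.ElementTree as ET


class FeedTracker:
    """Remember which feed items were already seen across polls."""

    def __init__(self):
        self.seen = set()  # grows without bound; a real mirror would prune it
        self.first_poll = True

    def update(self, rss_xml):
        """Return (new_links, possible_gap) for one fetched feed document."""
        links = [item.findtext("link") for item in ET.fromstring(rss_xml).iter("item")]
        new = [link for link in links if link not in self.seen]
        # If nothing in this batch overlaps with earlier polls, items may have
        # scrolled out of the feed between two requests.
        possible_gap = (not self.first_poll) and bool(links) and len(new) == len(links)
        self.seen.update(new)
        self.first_poll = False
        return new, possible_gap


def make_feed(links):
    """Build a tiny RSS document for demonstration purposes."""
    items = "".join(f"<item><link>{link}</link></item>" for link in links)
    return f"<rss><channel>{items}</channel></rss>"


tracker = FeedTracker()
print(tracker.update(make_feed(["b", "a"])))  # first poll: everything is new
print(tracker.update(make_feed(["c", "b"])))  # overlap with last poll: only "c" is new
print(tracker.update(make_feed(["y", "x"])))  # no overlap: flags a possible gap
```

The `possible_gap` flag is a heuristic: a poll with no overlap suggests the interval was too long for the current publishing rate, so the mirror could fall back to a fuller resync.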
I think the best long-term alternative would be to implement the hypermedia based API discussed in #284. Short term, we could potentially permanently increase the number of items returned by our RSS feeds (not sure why this is 40 right now, but it does seem low).
Short term, we could potentially permanently increase the number of items returned by our RSS feeds (not sure why this is 40 right now, but it does seem low).
Maybe we should create an issue for this? Maybe returning 100 items?