Feeds requiring cookie support #39
I was hoping this would fix the redirect issue but it doesn't---seems to be a different bug: feedreader/pluto#39
Thanks for highlighting the neuroscience planets. The HTTP library used is Ruby's standard Net::HTTP with a little convenience wrapper (for handling redirections and caching, but not cookies); see https://github.com/rubycoco/webclient/tree/master/fetcher for details. If you have pluto installed locally, you should also have the fetcher gem installed, which includes a debug command-line tool called fetch. Try:
That should give you the same result; please post what results you get (theoretically it's the same, but "isolated" for easy testing). We can take it from there. You might check what the Springer policy on feeds is and whether you need a token or something. OK, I tried:
That gets redirected to https://idp.springer.com/authorize?redirect_uri=[url] - not sure what to do here. Not sure if you can do Ruby scripts - if you can, then use the standard Net::HTTP library and make a script with a successful request with cookies, and I am happy to patch / update the fetcher based on your script. This cheatsheet might help to get you started: https://yukimotopress.github.io/http
Thanks for looking into it, @geraldb. I've dug in a little more. Using something like
However, if I stop wget from dealing with cookies, I get the same redirection loop:
So I guess the fix/workaround here would be to get fetcher etc. to support cookies. I don't do a lot of Ruby, but I'll see if someone from the team can find the time to work on this.
Thanks for the wget analysis - as you say, that makes it look like the Set-Cookie values need to be passed along on the redirect. Not sure if wget also has an option to show the request and response HTTP headers - it should be something like Set-Cookie in the first HTTP response and Cookie in the second HTTP request. I'll try to put together a test script in about a week - if someone on your team can try out a request with plain Ruby Net::HTTP, that would be great, or even Python if you set / manage the cookies yourself.
Yes, here are the outputs in debug mode which includes the headers: With cookies:
Without cookies:
Thanks for the HTTP headers, that makes it clear - the second request (authorize) sets the cookies and the third request then needs to pass along the cookies to succeed - so you are right, it's all about cookies. I'll try to put together a test script next week and, if it works, add the cookie support to the fetcher library / gem (that will update pluto).
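For readers following along, here is a minimal sketch in plain Ruby Net::HTTP of the flow described above: follow each redirect, remember every cookie set along the way, and send the collected cookies on the next request. This is not the fetcher gem's code. The names fetch_with_cookies and http_get and the transport: parameter are hypothetical, introduced only for illustration, and the sketch ignores cookie domain, path and expiry rules, so it sends every collected cookie to every host in the redirect chain.

```ruby
require 'net/http'
require 'uri'

# Follow redirects and carry along any cookies set on the way.
# `transport` is a hypothetical hook (a callable taking a URI and a Cookie
# header string and returning a Net::HTTPResponse) so the loop can be tried
# without network access; by default it performs a real GET request.
def fetch_with_cookies(url, limit: 10, transport: method(:http_get))
  cookies = {} # name => value, collected across all hops (no domain/path scoping)
  uri = URI(url)

  limit.times do
    cookie_str = cookies.map { |name, value| "#{name}=#{value}" }.join('; ')
    res = transport.call(uri, cookie_str)

    # Each Set-Cookie header carries one cookie; keep only its name=value pair.
    (res.get_fields('set-cookie') || []).each do |field|
      name, value = field.split(';', 2).first.to_s.split('=', 2)
      next if name.nil? || name.strip.empty?
      cookies[name.strip] = value.to_s.strip
    end

    # Stop as soon as the response is not a redirect with a target.
    return res unless res.is_a?(Net::HTTPRedirection) && res['location']

    uri = URI.join(uri.to_s, res['location']) # Location may be relative
  end

  raise "too many redirects for #{url}"
end

# Default transport: a single GET request with an optional Cookie header.
def http_get(uri, cookie_str)
  req = Net::HTTP::Get.new(uri)
  req['Cookie'] = cookie_str unless cookie_str.empty?
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request(req)
  end
end
```

Calling fetch_with_cookies with the feed URL from the original report would exercise the same chain of requests, though whether Springer's server then returns the expected feed is a separate question.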
For what it's worth, this script seems to work. The
Thanks for highlighting the script on Stack Overflow - I also searched / browsed some - one gotcha was:
will only work if there's only one cookie, but in the Springer (or more generic) case there can be three or more. I'll try to get to it early next week. Cheers.
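The snippet referred to above is not reproduced in this thread, so the following only illustrates the general multiple-cookie pitfall in Net::HTTP, not the exact line from the Stack Overflow script. Hash-style access such as res['set-cookie'] joins repeated headers into one comma-separated string, while get_fields returns one entry per header. The helper name sample_redirect_response is hypothetical, and the response object is built by hand purely so the example runs offline.

```ruby
require 'net/http'

# Build a redirect response carrying two Set-Cookie headers, without any
# network access. Constructing Net::HTTPResponse directly is for illustration only.
def sample_redirect_response
  res = Net::HTTPResponse.new('1.1', '302', 'Found')
  res.add_field('Set-Cookie', 'a=1; Path=/')
  res.add_field('Set-Cookie', 'b=2; Path=/; HttpOnly')
  res
end

res = sample_redirect_response

# Hash-style access merges both headers into a single string joined by ", ".
puts res['set-cookie']

# get_fields keeps each Set-Cookie header as its own array element.
puts res.get_fields('set-cookie').inspect
```

Passing the merged string straight back as a Cookie header is what breaks once a server sets more than one cookie.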
In relation to feedreader/pluto#39, the request cookie gets saved for each redirect. This should support cookie based authentication.
* Add support for redirect with cookie authentication
  In relation to feedreader/pluto#39, the request cookie gets saved for each redirect. This should support cookie-based authentication.
* Support multiple cookies
  Some websites return more than one cookie. This iterates over the multiple cookies and concatenates them into a single one. If only one cookie is returned, just pass that one to the next request.
* Make cookie header compliant with RFC 6265
  RFC 6265 states that the client should pass a semicolon-separated cookie header to the server as a single string. This commit should make the cookie request compliant with this RFC.
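As a rough illustration of what the last two commits describe (not the actual patch), the helper below turns the Set-Cookie values of one response into a single Cookie header string, with name=value pairs joined by "; " as RFC 6265 specifies. The name cookie_header is hypothetical, and cookie attributes such as Path or Expires are simply dropped rather than honoured.

```ruby
# Turn the Set-Cookie header values from one response into a single
# Cookie request header value of the form "name=value; name2=value2".
def cookie_header(set_cookie_fields)
  pairs = Array(set_cookie_fields).map do |field|
    # Keep only the leading name=value pair; attributes such as Path,
    # Expires or HttpOnly are not sent back to the server.
    field.to_s.split(';', 2).first.to_s.strip
  end
  pairs.reject(&:empty?).join('; ')
end
```

With Net::HTTP this would typically be used as request['Cookie'] = cookie_header(response.get_fields('set-cookie')), since get_fields returns nil when no cookie was set and the helper then yields an empty string.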
tl;dr: Adding support for cookies might let pluto fetch a feed, but it will probably be the wrong feed.
I have bad news. I don't think that adding cookie support will completely fix the issue with Springer's feeds. I am getting a random feed back from Springer's server after completing the request with the cookie. Here is the trace for the request from the OP using the patched fetcher gem:
You can see that the request succeeds with the cookie, but somehow the server "eats" the search parameters that were passed to the original request. The parameters for the first request were
I tried to trace the same request in curl and I get the same problem. The same problem is present in the wget request from here; notice how the last request is just for
I am working around this issue by moving
Note that this is hit-or-miss depending on how the server responds. That is, if the server doesn't ask the client for authentication, the correct feed gets served. This is similar to how pluto was able to pull the feed sometimes before the cookie patch. In fact, now I am getting this response when requesting the same feed as the one above:
The request doesn't get re-routed through the authentication server and the correct feed gets served (notice the difference in content_length). Probably the difference is that the new request is hitting a CDN cache while the original one went through Springer's servers.
Edit: After some tests, setting the three cookies from the authentication process and redoing the request to the original URL returns the correct feed. If you want to test the "bugged" request, you can use an incognito session in Chrome. It should get the wrong feed unless the feed is cached on the server.
Thanks for your detailed report on cookies. Really appreciate your attention to detail and sharing your findings. I'll try to wrap up the code (in the fetcher gem) early next week. Cheers. Prost.
Hi @geraldb, happy new year---anything we can do to help with this one (or has the code landed already)?
Happy new year. Prosit 2022. Thanks for the reminder. I'll sure try to get cookie support added / done this year in 2022. Thanks for your patience. Cheers.
Hello,
Thanks for pluto, we moved the neuroscience planets over from venus a few days ago:
https://github.com/neurofedora/planet-neuroscience
https://github.com/neurofedora/planet-neuroscientists
For the neuroscience planet, which collects feeds from research journals, we're seeing errors which seem to be related to cookies:
https://link.springer.com/search.rss?search-within=Journal&facet-journal-id=10827 works fine in a browser, but not on pluto---and it looks like they keep redirecting which causes pluto to error out eventually.
Is there a way to use these feeds with pluto?