I have noticed that for multi-page articles, Readability only gets the first page. Postlight Parser does not have this limitation: it usually manages to page through to the end and capture the whole article.
Ars Technica, for example, has multi-page articles, like this one:
https://arstechnica.com/tech-policy/2024/05/how-dark-money-groups-help-private-isps-lobby-against-municipal-broadband/
This looks like how Postlight does it:
https://github.com/postlight/parser/blob/main/src/extractors/generic/next-page-url/extractor.js
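To make the request concrete, here is a rough sketch of the general idea of following "next page" links. It is not Postlight's algorithm and not part of Readability's API; it is a much simpler heuristic that only looks for `rel="next"` or a link whose text is "next". The names `NextLinkFinder`, `find_next_page_url`, `collect_pages`, and the `fetch` callback are all hypothetical and exist only for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class NextLinkFinder(HTMLParser):
    """Finds the first link marked rel="next" or labelled "next" (illustrative only)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.next_url = None
        self._pending_href = None  # href of the <a> whose text is being read
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        href = attrs.get("href")
        if tag not in ("a", "link") or not href or self.next_url:
            return
        rel = (attrs.get("rel") or "").lower().split()
        if "next" in rel:
            # Explicit hint from the page: <a rel="next"> or <link rel="next">.
            self.next_url = urljoin(self.base_url, href)
        elif tag == "a":
            # No rel hint; remember the href and wait for the link text.
            self._pending_href = href
            self._text = []

    def handle_data(self, data):
        if self._pending_href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._pending_href is not None:
            label = "".join(self._text).strip().lower()
            if self.next_url is None and label in ("next", "next page"):
                self.next_url = urljoin(self.base_url, self._pending_href)
            self._pending_href = None


def find_next_page_url(html, base_url):
    """Return an absolute next-page URL, or None if no candidate is found."""
    finder = NextLinkFinder(base_url)
    finder.feed(html)
    return finder.next_url


def collect_pages(first_url, fetch, max_pages=20):
    """Follow next-page links from first_url; fetch(url) must return HTML text."""
    pages, seen, url = [], set(), first_url
    # Stop on a missing link, a loop back to a visited URL, or the page cap.
    while url and url not in seen and len(pages) < max_pages:
        seen.add(url)
        html = fetch(url)
        pages.append(html)
        url = find_next_page_url(html, url)
    return pages
```

Each page's HTML could then be run through the article extractor and the results concatenated. The `seen` set and `max_pages` cap are there so a site whose "next" link points back to an earlier page cannot cause an endless loop.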