Add translator for Jewish Currents #3182
base: master
let item = new Zotero.Item(itemType);
item.title = doc.title;
item.date = ZU.strToISO(text(doc, '.feature .absolute')); // Bad!
Yes, I see what you mean here. A logically cleaner solution seems to be parsing the Schema.org JSON-LD <script> element and extracting the datePublished property.
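To make that concrete, here is a rough sketch of the lookup, not a tested drop-in. It assumes the page exposes a JSON-LD block with a string datePublished, which I have not verified for every page type. findDatePublished is a hypothetical helper name, and the input is the raw text of each JSON-LD <script> element.

```javascript
// Sketch only: findDatePublished is a hypothetical helper, not existing translator code.
// Input: an array of raw JSON-LD strings (the text of each ld+json <script> element).
// Output: the first string datePublished found, or null if there is none.
function findDatePublished(jsonLdTexts) {
	for (let raw of jsonLdTexts) {
		let data;
		try {
			data = JSON.parse(raw);
		}
		catch (e) {
			continue; // skip a malformed block instead of aborting the lookup
		}
		// A block may be a single object, an array of objects, or an object with "@graph"
		let nodes = [data];
		if (Array.isArray(data)) {
			nodes = data;
		}
		else if (data && Array.isArray(data['@graph'])) {
			nodes = data['@graph'];
		}
		for (let node of nodes) {
			if (node && typeof node.datePublished === 'string') {
				return node.datePublished;
			}
		}
	}
	return null;
}
```

In the translator, the strings could presumably be collected from doc.querySelectorAll('script[type="application/ld+json"]') via textContent, and the result passed through ZU.strToISO() as before, falling back to the current selector when the helper returns null.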
	item.runningTime = text(doc, '.podcasts .total');
}

let authors = doc.querySelectorAll('.lockup a[href*="/author/"]');
Same with this selector, I presume. The class name lockup is a bit cryptic; it seems to be design jargon for an exception to responsive design for certain elements. But it isn't actually used in the CSS (so not really locked up), and in your words it may not be "very stable". This one may be better semantically: .bioblock strong a (it locates the author links after the end of the article).
else if (doc.querySelector('audio#podcast')) {
	return 'podcast';
}
else if (url.includes('/results') && getSearchResults(doc, true)) {
Here, I think we can also URL-match the issue pages (example issue) and category pages (example category). The selectors in getSearchResults() will still work. The following tests for the paths /results (with a query), /archive ("All Articles", with a query or hash, though that may be unnecessary), and paths with /issue/ or /category/ as the parent:
- else if (url.includes('/results') && getSearchResults(doc, true)) {
+ else if (/\/(results\?|archive($|[#?])|issue\/.+|category\/.+)/.test(url) && getSearchResults(doc, true)) {
The selectors don't work for the list of podcast episodes at https://jewishcurrents.org/podcast, though.
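For reference, here is a small standalone check of how the suggested pattern behaves on a few URL shapes. The pattern is the one from the suggestion; isMultiplePage is a hypothetical wrapper name, and the issue and category slugs are made-up placeholders rather than real pages.

```javascript
// The regex is copied from the suggested change; isMultiplePage is a hypothetical name.
const MULTIPLE_PATH = /\/(results\?|archive($|[#?])|issue\/.+|category\/.+)/;

function isMultiplePage(url) {
	// No "g" flag on the pattern, so repeated test() calls do not carry state
	return MULTIPLE_PATH.test(url);
}

// Placeholder URLs (the slugs are invented for illustration)
const samples = [
	'https://jewishcurrents.org/results?q=test',
	'https://jewishcurrents.org/archive',
	'https://jewishcurrents.org/archive#page-2',
	'https://jewishcurrents.org/issue/example-issue',
	'https://jewishcurrents.org/category/example-category',
	'https://jewishcurrents.org/podcast',
	'https://jewishcurrents.org/issue/'
];
for (let url of samples) {
	console.log(url, isMultiplePage(url));
}
```

The first five samples match and the last two do not: /podcast is not covered by any alternative, and a bare /issue/ fails because .+ needs at least one character after the slash. In detectWeb() the match is still combined with getSearchResults(doc, true), so a matching URL with no results does not get reported as multiple.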
Hello @AbeJellinek, in addition to the inline comments, here are a few more observations about the idiosyncrasies:
I don't think these are high priority, though it would be nice to have some way to deal with a few of them (the author-name component, maybe?). And thanks for your patience.
And possibly |
Tries to distinguish magazine articles from online-only articles, and makes a rudimentary attempt at handling podcast episodes (without creators).