Add publishers media info request API #13115
@diracdeltas - i'm still reviewing this; however, the overall approach looks great! it is certainly less work than i had anticipated! however, isn't the functionality considerably broader than the name being used? instead of |
Getting the background page webcontents is only needed so that the main process has a reference to the correct webcontents that the requests will go in. I assume ledger doesn't need generic access to the webcontents but only a few properties of the page (currently: whether there was an error in the request, the page image, the page title, and the page's final URL). Those properties are passed to the |
sadly, that's not a good assumption. the purpose of the webscraper is in fact to scrape a whole bunch of tags. i can put together a list i suppose. |
unfortunately you can't pass the entire webContents or parsed HTML object over IPC without serializing. it'd be easy to add more properties though. |
that's what we'll do then! |
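Since a webContents or parsed HTML object cannot cross IPC without serializing, one option is to copy only the needed fields into a plain object before sending. A minimal sketch, assuming a hypothetical `toPublisherInfo` helper and field names (`error`, `image`, `title`, `url`) based on the four properties listed above; neither the helper nor the field names are confirmed by this PR:

```javascript
// Hypothetical helper (not from the PR): copies only plain, serializable
// fields from a page-like object. Field names are assumptions.
const toPublisherInfo = (page) => ({
  error: page.error ? String(page.error) : null,
  image: page.image || null,
  title: page.title || '',
  url: page.url || ''
})

// Stand-in for a real page object; the function property below would be
// dropped by serialization, which is why it is not copied.
const fakePage = {
  title: 'Example channel',
  url: 'https://example.com/channel',
  image: 'https://example.com/a.png',
  reload: () => {}
}

const info = toPublisherInfo(fakePage)
// A JSON round trip approximates what crossing a process boundary does.
const received = JSON.parse(JSON.stringify(info))
console.log(received)
```

Adding more scraped properties later then only means adding fields to the plain object.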
Codecov Report
@@            Coverage Diff             @@
##           master   #13115      +/-   ##
==========================================
+ Coverage   56.25%   56.49%   +0.23%
==========================================
  Files         283      284       +1
  Lines       28353    28656     +303
  Branches     4674     4734      +60
==========================================
+ Hits        15950    16189     +239
- Misses      12403    12467      +64
@diracdeltas - thanks for the update. it turns out that for twitch, we won't be using the scraper, but i want to get this integrated in regardless (for youtube and future others); however, i'm going to need to add a bunch more rules so it is as "clever" as metascraper (which is really clever!) |
7e45c2b to 7706b50
PR blocked on brave-intl/bat-publisher#21 |
a4fd9e6 to cc19157
@yan can you please check my last commit |
  })
}

// https://github.com/microlinkhq/metascraper
this approach looks good to me, but what's the process to update this file in the future to include upstream changes in metascraper? please add either a comment with manual steps to update this file or an automated script
you can add the metascraper functions as a separate JS file in https://github.com/brave/browser-laptop/pull/13115/files#diff-a533e12744082c16911d52f54573e441R41 so it's easier to update automatically
problem is that they are using jquery library for selectors, and we use native html selectors
metascraper: $('meta[property="og:image:secure_url"]').attr('content')
we: html.querySelector('meta[property="og:image:secure_url"]').content
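One difference worth noting when porting rules: jQuery's `.attr()` on an empty selection returns `undefined`, whereas `querySelector` returns `null` when nothing matches, so reading `.content` directly would throw. A minimal sketch of a null-safe wrapper, assuming a hypothetical `getContent` helper and a `fakeDocument` stand-in (both invented here so the example runs outside a browser):

```javascript
// Hypothetical helper (not from the PR): reads the `content` attribute of the
// first element matching `selector`, or returns null when nothing matches.
const getContent = (root, selector) => {
  const el = root.querySelector(selector)
  return el ? el.getAttribute('content') : null
}

// Minimal stand-in for a parsed document: maps selectors to content values.
const fakeDocument = (metaBySelector) => ({
  querySelector: (selector) => {
    if (!Object.prototype.hasOwnProperty.call(metaBySelector, selector)) {
      return null
    }
    const value = metaBySelector[selector]
    return { getAttribute: (name) => (name === 'content' ? value : null) }
  }
})

const html = fakeDocument({
  'meta[property="og:image:secure_url"]': 'https://example.com/a.png'
})

console.log(getContent(html, 'meta[property="og:image:secure_url"]'))
console.log(getContent(html, 'meta[property="og:title"]'))
```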
I can move the selectors to a separate file, so that it will be easier to upgrade
rules moved
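Keeping the selectors in their own module could look roughly like the sketch below: each field maps to an ordered list of selectors, and the first one that yields a value wins. The `rules` table, the `scrape` function, and the `fakeRoot` stand-in are all hypothetical; the selector list is illustrative and is not metascraper's actual rule set.

```javascript
// Hypothetical rules table: selectors are tried in order for each field.
const rules = {
  image: [
    'meta[property="og:image:secure_url"]',
    'meta[property="og:image:url"]',
    'meta[property="og:image"]'
  ],
  title: [
    'meta[property="og:title"]',
    'meta[name="twitter:title"]'
  ]
}

// Applies the rules to a document-like root; the first non-empty match wins,
// and fields with no match stay null.
const scrape = (root, ruleSet) => {
  const result = {}
  for (const field of Object.keys(ruleSet)) {
    result[field] = null
    for (const selector of ruleSet[field]) {
      const el = root.querySelector(selector)
      const value = el ? el.getAttribute('content') : null
      if (value) {
        result[field] = value
        break
      }
    }
  }
  return result
}

// Stand-in for a parsed page so the sketch runs without a DOM.
const fakeRoot = {
  querySelector: (selector) => {
    const tags = {
      'meta[property="og:image"]': 'https://example.com/fallback.png',
      'meta[name="twitter:title"]': 'Example channel'
    }
    if (!Object.prototype.hasOwnProperty.call(tags, selector)) return null
    return { getAttribute: () => tags[selector] }
  }
}

console.log(scrape(fakeRoot, rules))
```

With this layout, picking up an upstream change means editing the table only, not the matching logic.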
25741fa to 37e6151
@diracdeltas can you please re-review? |
app/browser/api/ledger.js
Outdated
 * @param {boolean} options.binaryP - are we receiving binary payload back
 * @param {string} options.rawP - are we receiving raw payload back
 * @param {string} options.scrapeP - are we doing scraping
 * @param {string} options.windowP - do we want to run this request in the window process
thanks, this is helpful. but are rawP, scrapeP, and windowP supposed to be boolean, not string?
oh yeah, my bad, copy-paste mistake
done
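With the types corrected, the annotations might read as in the sketch below. This assumes all four flags are booleans, as the review suggests; the `normalizeOptions` helper is hypothetical and only illustrates coercing the flags, it is not part of the PR.

```javascript
/**
 * Hypothetical helper (not from the PR) that coerces the request flags.
 * @param {object} options
 * @param {boolean} options.binaryP - are we receiving binary payload back
 * @param {boolean} options.rawP - are we receiving raw payload back
 * @param {boolean} options.scrapeP - are we doing scraping
 * @param {boolean} options.windowP - do we want to run this request in the window process
 * @returns {object} a copy of the flags with every value coerced to boolean
 */
const normalizeOptions = (options = {}) => ({
  binaryP: Boolean(options.binaryP),
  rawP: Boolean(options.rawP),
  scrapeP: Boolean(options.scrapeP),
  windowP: Boolean(options.windowP)
})

console.log(normalizeOptions({ scrapeP: true, windowP: 1 }))
```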
app/browser/api/ledger.js
Outdated
 * @param {object} params - contains params from roundtrip
 * @param {string} params.url - url of the site that we want to scrape
 */
const roundTripFromWindow = (params, callback) => {
please document what the arguments to callback are supposed to be, either here or in roundtrip
done
@@ -1436,6 +1484,11 @@ const roundtrip = (params, options, callback) => {
    parts.pathname = parts.path
  }

  if (options.windowP) {
    roundTripFromWindow({url: urlFormat(parts), verboseP: options.verboseP}, callback)
minor: params.verboseP is never used in roundTripFromWindow
added some logging, so we use it now 😃
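The dispatch and the verbose logging discussed in this thread can be sketched as below. This is an illustration only: `roundTripFromWindow` here is a stub rather than the PR's implementation, the `(err, response, body)` callback shape is an assumption, and the `server` / `path` parameter names are invented for the example.

```javascript
const url = require('url')

// Stub for the window-process request (the real function is not shown in
// this excerpt). The (err, response, body) callback shape is an assumption.
const roundTripFromWindow = (params, callback) => {
  if (params.verboseP) {
    console.log('roundTripFromWindow: requesting ' + params.url)
  }
  callback(null, { statusCode: 200 }, { url: params.url })
}

// Simplified roundtrip: only the windowP branch is sketched.
const roundtrip = (params, options, callback) => {
  const parts = {
    protocol: 'https:',
    hostname: params.server, // hypothetical parameter name
    pathname: params.path
  }

  if (options.windowP) {
    roundTripFromWindow(
      { url: url.format(parts), verboseP: options.verboseP },
      callback
    )
    return
  }

  callback(new Error('non-window request path omitted in this sketch'))
}

roundtrip(
  { server: 'example.com', path: '/channel' },
  { windowP: true, verboseP: true },
  (err, response, body) => {
    if (err) return console.error(err.message)
    console.log(response.statusCode, body.url)
  }
)
```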
there is a unit test error in travis: https://travis-ci.org/brave/browser-laptop/jobs/353053977#L9779 |
I see that we are using a different call for travis; I will adjust those values so that it passes @diracdeltas |
2133473 to 6d62de8
@diracdeltas updated and ready for another review |
this LGTM.
github won't let me approve since i opened this PR, but this lgtm :) |
6d62de8 to 6138059
Approving it based on previous approvals from @mrose17 and @diracdeltas
Add publishers media info request API
0.21 14741d4 |
…t-api" This reverts commit 14741d4.
reverted from 0.21.x with 5593d21; milestone updated to 0.22.x |
Needed for brave-intl/bat-publisher#11 (review) and #11889
Resolves #13114
https://github.com/brave-intl/bat-publisher/blob/8dc0b6d016e81fe777983017ed1eebc4ce048770/getMedia.js#L116-L168 can be refactored to use request.fetchPublisherInfo so that it doesn't need to use metascraper.
request.fetchPublisherInfo calls a callback whose input is the object:
Submitter Checklist:
git rebase -i to squash commits (if needed).
Test Plan:
Reviewer Checklist:
Tests