-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ID Library #5863
ID Library #5863
Conversation
This pull request introduces 2 alerts when merging 7440f9e into 4662f06 - view on LGTM.com new alerts:
|
This pull request introduces 2 alerts when merging 021cd69 into 926ccc6 - view on LGTM.com new alerts:
|
This pull request introduces 2 alerts when merging a44f22a into 926ccc6 - view on LGTM.com new alerts:
|
This pull request introduces 2 alerts when merging 9a20eaf into 926ccc6 - view on LGTM.com new alerts:
|
This pull request introduces 2 alerts when merging 0aec7ea into 926ccc6 - view on LGTM.com new alerts:
|
docs pr: prebid/prebid.github.io#2429 |
@SKOCHERI or @jdwieland8282 can you provide a bit of context as to what this is for, please? |
@smenzer context is in the docs pr, but specifically to answer your question: "The ID Library module gathers and generates a map of identities present on the page. The primary usecase for this adapter is for Publishers who have included multiple UserId subadapters in their prebid.js implementation, and want to store the resulting user ids server side for modeling or graphing purposes. " |
@jdwieland8282 ah thanks...I opened the docs PR looking for some context but didn't actually go read the .md file 🤦 from the docs and code, it looks like this is all tied to a hashed email, is that correct? what if there isn't an email available, can the publisher provide their own UID or something? |
its all tied to a persistent identifier, hashed email is one option but not the only option. If the id library module doesn't find a persistent id nothing happens. The publisher can declare an element id for where they store the users userName for example on the page: ex. "2. a publisher configured element, for example when configuration |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still need to do some more review but had a few concerns / comments
logInfo('Email is found, body observer disconnected'); | ||
} | ||
|
||
const body = document.body.innerHTML; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are scanning the entire page with regex for an email?
Can you explain the use case here? This doesn't seem very prebid-y.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless the publisher has configured the element from which the email needs to be fetched, we will have to scan the entire page to get email.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why can't the publisher just pass the hashed email directly to the library when they initiate it? i agree that scanning an entire page is not ideal
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recommend we make this scanning of entire document a configurable option via the setConfig
call to this module.
We will leave it defaulting to true
but I think it is in our best interest to document the behavior of the module thoroughly, and make sure the publisher has the control necessary.
@jdwieland8282 Can we make sure the associated docs PR contains the steps outlined and the fallbacks?
Something that outlines the flow:
How the target is found:
- Uses publisher defined target element
- Searches for HTML input (text/email) element
- Searches entire document for email using regex.
And outline that the latter (3) can be configured off by something like:
pbjs.setConfig({
idLibrary: {
fallbackSearch: false // defaults to true (I am bad at naming vars)
}
});
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@robertrmartinez the existing Docs PR describes the following:
A persistent identifier can be extracted in the following ways:
1. From a generic `<div>` element
2. From a publisher configured HTML element id
3. From an `<input>` element of type text/email
Are you suggesting your 1-3 above replace the existing 1-3?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well just something that outlines exactly the flow of the module.
Could be another flow chart or something.
modules/idLibrary.js
Outdated
const body = document.body.innerHTML; | ||
|
||
if (hasEmail(body)) { | ||
email = getEmail(body); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not think it is a good idea to call hasEmail
which calls getEmail
and does a full regex lookup on the innerHtml of the whole page, then call getEmail again. Seems very inefficient.
If you are gonna run getEmail
anyways, we should just check the response instead of wrapping it in hasEmail
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has been fixed
adding some context behind the motivation for creating this module Our point of view with regard to identity is that a 1st party relationship between publishers and viewers of their content exists. As such, we believe that publishers have a right to anchor all user ids generated by the Prebid userid module, to a persistent (hashed) identifier such as an email address. We believe that it is in the publisher, and by extension the open web’s interest, to create a deterministic id graph composed of the linkage between persistent ids and 1st party ids generated by the user id sub-adapters. As browsers limit our ability to generate cross domain identifiers, we believe a publisher controlled id graph is the right response. We (Magnite) may decide to help publishers build this graph in the future if asked. |
This pull request introduces 2 alerts when merging 4a6c09c into ae1b4d7 - view on LGTM.com new alerts:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's some changes needed.
modules/idLibrary.js
Outdated
function debounce(func, wait, immediate) { | ||
let timeout; | ||
|
||
return function (...args) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So debounce is just returning a function to be called later?
However, anywhere debounce is called, the return is not used.
So this is effectively not doing anything.
I think the real fix here is to add the method bindings as requested for the other methods (first param anytime debounce is called).
And to make this function a regular function that just executes, not one which returns a function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, did you mean to use the keyword arguments
instead of args
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@robertrmartinez The debounce issue is fixed and full body scan config is provided.
modules/idLibrary.js
Outdated
function handleTargetElement(conf) { | ||
const targetObserver = new MutationObserver(function (mutations, observer) { | ||
logInfo('target observer called'); | ||
debounce(targetAction(conf, mutations, observer), conf.debounce, false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the intent here? Please elaborate on the expectation of things to happen when this line is hit.
debounce
takes in a func
param that looks like the intent is for it to be called later.
But you are executing the targetAction
here with those params, which returns undefined. So really func
is always undefined when debounce
is run.
Is the intent to not call targetAction
here but just pass its reference to debounce?
If so, recommend you use bind
.
debounce(targetAction.bind(null, conf, mutations, observer), conf.debounce, false);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The debounce issue is fixed
modules/idLibrary.js
Outdated
addInputElementsElementListner(conf); | ||
const bodyObserver = new MutationObserver(function (mutations, observer) { | ||
logInfo('Body observer called'); | ||
debounce(bodyAction(conf, mutations, observer), conf.debounce, false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as debounce
change needed here.
@jdwieland8282 is there a use case where the publisher doesn't want this library to scan the page for a hashed email, but has their own identifier already (maybe in a 1P cookie, session, etc) but they still want to collect all of the userId module IDs and post them to themselves? in other words...is it possible to turn off the first 4 steps in "Application Flow" (https://github.com/prebid/prebid.github.io/pull/2429/files) and just capture and post the user IDs someplace? |
Hi @smenzer , Not exactly, the intention here is to make the output deterministic in nature, and not require a bunch of probabilistic ML. If a user wanted to do what you suggested, they could deploy the pubprovided userid module, but they would be limited to page loads where a persistent value was observed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Several more questions / comments
modules/idLibrary.js
Outdated
logError('The required url is not configured'); | ||
return; | ||
} | ||
if (!config.debounce) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is debounce === 0 not a valid use case?
Maybe this should be:
if (typeof config.debounce !== 'number')
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
modules/idLibrary.js
Outdated
config.debounce = CONF_DEFAULT_OBSERVER_DEBOUNCE_MS; | ||
logInfo('Set default observer debounce to ' + CONF_DEFAULT_OBSERVER_DEBOUNCE_MS); | ||
} | ||
if (config.fullscan == undefined) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would reccomend just checking if it is type of boolean here to make sure it is always one.
if (typeof config.fullscan !== 'boolean')
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also when possible, it is best to use strict ===.
Sometimes you want to catch null
and undefined
so == will do that,
but we know we want it to be a number for sure, so I think it is best to explicitly check if it is a number.
Same with fullscan (boolean).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
modules/idLibrary.js
Outdated
function postData() { | ||
(getGlobal()).refreshUserIds(); | ||
const userIds = (getGlobal()).getUserIds(); | ||
if (!userIds || userIds.length === 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Think it may be better to use
!Array.isArray(userIds)
instead of !userIds
If for some reason userIds comes back with a prop that does not have .length this will throw an error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
userIds is an object, so updated the check to Object.keys(userIds).length === 0
modules/idLibrary.js
Outdated
|
||
function syncCallback() { | ||
return { | ||
success: function (responseBody) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not using responseBody, can remove I guess.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed
} | ||
|
||
function postData() { | ||
(getGlobal()).refreshUserIds(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we refreshing userIds every single call?
Not sure that was the intent when pbjs.refreshUserIds
was implemented.
CC @bretg
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
refreshUserIds is called prior to posting data which happens only when a email is found.
modules/idLibrary.js
Outdated
const value = event.target.value; | ||
logInfo(`Modified Value of input ${event.target.value}`); | ||
email = getEmail(value); | ||
if (email != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same, strict check since getEmail explicitly will return null.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
logInfo(` Original Value in Input = ${inputs[i].value}`); | ||
inputs[i].addEventListener('change', event => processInputChange(event)); | ||
inputs[i].addEventListener('blur', event => processInputChange(event)); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could use a forEach
instead of a for loop since you do not need to exit early or anything.
Just a personal preference, can leave it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
using forEach LGTM analysis: JavaScript test failed so used for loop
modules/idLibrary.js
Outdated
return function () { | ||
var context = this; | ||
var args = arguments; | ||
var later = function () { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
usually we use let / or const now instead of actual var declaration.
No big deal though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
for (const node of mutation.addedNodes) { | ||
email = node.textContent; | ||
|
||
if (email) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we get an email do we need to keep going through all other mutations?
Should there only ever be one email on page?
Should we break here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once email is found, data is posted and returned now
modules/idLibrary.js
Outdated
|
||
const body = document.body.innerHTML; | ||
email = getEmail(body); | ||
if (email != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
!==
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
@robertrmartinez @smenzer All the review comments has been addressed. Can you take a look and approve if no more comments. |
I have no other comments, thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, pending review by @robertrmartinez
modules/idLibrary.js
Outdated
targetObserver.observe(targetElement, OBSERVER_CONFIG); | ||
} else { | ||
logInfo(' Target found with target ' + email); | ||
logInfo(' Post data on email found in target with target'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the space is already included in the logInfo function you created, are these needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
modules/idLibrary.js
Outdated
email = getEmail(document.body.innerHTML); | ||
if (email !== null) { | ||
logInfo('Email found in body ' + email); | ||
logInfo(' Post data on email found in the body without observer'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unnecessary space?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
modules/idLibrary.js
Outdated
logInfo(' Data synced successfully.'); | ||
}, | ||
error: function () { | ||
logInfo(' Data sync failed.'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
spaces still
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
Code looks good now, I am going to do some on page testing today and if it goes well we should be good to merge for tomorrows release! |
* ID Library * build fix * build fix * build fix * build fix * Fixing review comments * Fixing review comments (debounce, full body scan option * Fixing review comments (var declaration, strict check) * Fixing space Co-authored-by: skocheri <skocheri@rubiconproject.com>
Type of change
Description of change
Be sure to test the integration with your adserver using the Hello World sample page.
For any changes that affect user-facing APIs or example code documented on http://prebid.org, please provide:
Other information