Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ID Library #5863

Merged
merged 9 commits into from
Nov 12, 2020
Merged

ID Library #5863

merged 9 commits into from
Nov 12, 2020

Conversation

SKOCHERI
Copy link
Contributor

@SKOCHERI SKOCHERI commented Oct 16, 2020

Type of change

  • Bugfix
  • Feature
  • New bidder adapter
  • Code style update (formatting, local variables)
  • Refactoring (no functional changes, no api changes)
  • Build related changes
  • CI related changes
  • Does this change affect user-facing APIs or examples documented on http://prebid.org?
  • Other

Description of change

  • test parameters for validating bids
{
  bidder: '<bidder name>',
  params: {
    // ...
  }
}

Be sure to test the integration with your adserver using the Hello World sample page.

For any changes that affect user-facing APIs or example code documented on http://prebid.org, please provide:

Other information

@SKOCHERI SKOCHERI closed this Oct 16, 2020
@lgtm-com
Copy link

lgtm-com bot commented Oct 16, 2020

This pull request introduces 2 alerts when merging 7440f9e into 4662f06 - view on LGTM.com

new alerts:

  • 2 for Use of returnless function

@SKOCHERI SKOCHERI reopened this Oct 16, 2020
@lgtm-com
Copy link

lgtm-com bot commented Oct 16, 2020

This pull request introduces 2 alerts when merging 021cd69 into 926ccc6 - view on LGTM.com

new alerts:

  • 2 for Use of returnless function

@lgtm-com
Copy link

lgtm-com bot commented Oct 16, 2020

This pull request introduces 2 alerts when merging a44f22a into 926ccc6 - view on LGTM.com

new alerts:

  • 2 for Use of returnless function

@lgtm-com
Copy link

lgtm-com bot commented Oct 16, 2020

This pull request introduces 2 alerts when merging 9a20eaf into 926ccc6 - view on LGTM.com

new alerts:

  • 2 for Use of returnless function

@lgtm-com
Copy link

lgtm-com bot commented Oct 16, 2020

This pull request introduces 2 alerts when merging 0aec7ea into 926ccc6 - view on LGTM.com

new alerts:

  • 2 for Use of returnless function

@jdwieland8282
Copy link
Member

docs pr: prebid/prebid.github.io#2429

@smenzer
Copy link
Collaborator

smenzer commented Oct 20, 2020

@SKOCHERI or @jdwieland8282 can you provide a bit of context as to what this is for, please?

@jdwieland8282
Copy link
Member

@smenzer context is in the docs pr, but specifically to answer your question:

"The ID Library module gathers and generates a map of identities present on the page. The primary usecase for this adapter is for Publishers who have included multiple UserId subadapters in their prebid.js implementation, and want to store the resulting user ids server side for modeling or graphing purposes. "

@smenzer
Copy link
Collaborator

smenzer commented Oct 20, 2020

@jdwieland8282 ah thanks...I opened the docs PR looking for some context but didn't actually go read the .md file 🤦

from the docs and code, it looks like this is all tied to a hashed email, is that correct? what if there isn't an email available, can the publisher provide their own UID or something?

@jdwieland8282
Copy link
Member

jdwieland8282 commented Oct 20, 2020

its all tied to a persistent identifier, hashed email is one option but not the only option. If the id library module doesn't find a persistent id nothing happens. The publisher can declare an element id for where they store the users userName for example on the page:

ex. "2. a publisher configured element, for example when configuration target='username', the value is extracted from an element having id 'username'"

@gglas gglas assigned gglas and robertrmartinez and unassigned gglas Oct 21, 2020
Copy link
Collaborator

@robertrmartinez robertrmartinez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still need to do some more review but had a few concerns / comments

logInfo('Email is found, body observer disconnected');
}

const body = document.body.innerHTML;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are scanning the entire page with regex for an email?

Can you explain the use case here? This doesn't seem very prebid-y.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless the publisher has configured the element from which the email needs to be fetched, we will have to scan the entire page to get email.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why can't the publisher just pass the hashed email directly to the library when they initiate it? i agree that scanning an entire page is not ideal

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend we make this scanning of entire document a configurable option via the setConfig call to this module.

We will leave it defaulting to true but I think it is in our best interest to document the behavior of the module thoroughly, and make sure the publisher has the control necessary.

@jdwieland8282 Can we make sure the associated docs PR contains the steps outlined and the fallbacks?

Something that outlines the flow:

How the target is found:

  1. Uses publisher defined target element
  2. Searches for HTML input (text/email) element
  3. Searches entire document for email using regex.

And outline that the latter (3) can be configured off by something like:

pbjs.setConfig({
   idLibrary: {
      fallbackSearch: false // defaults to true (I am bad at naming vars)
   }
});

@SKOCHERI

Copy link
Member

@jdwieland8282 jdwieland8282 Nov 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@robertrmartinez the existing Docs PR describes the following:

A persistent identifier can be extracted in the following ways:

1. From a generic `<div>` element
2. From a publisher configured HTML element id
3. From an `<input>` element of type text/email

Are you suggesting your 1-3 above replace the existing 1-3?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well just something that outlines exactly the flow of the module.

Could be another flow chart or something.

const body = document.body.innerHTML;

if (hasEmail(body)) {
email = getEmail(body);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think it is a good idea to call hasEmail which calls getEmail and does a full regex lookup on the innerHtml of the whole page, then call getEmail again. Seems very inefficient.

If you are gonna run getEmail anyways, we should just check the response instead of wrapping it in hasEmail

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been fixed

@jdwieland8282
Copy link
Member

adding some context behind the motivation for creating this module

Our point of view with regard to identity is that a 1st party relationship between publishers and viewers of their content exists. As such, we believe that publishers have a right to anchor all user ids generated by the Prebid userid module, to a persistent (hashed) identifier such as an email address. We believe that it is in the publisher, and by extension the open web’s interest, to create a deterministic id graph composed of the linkage between persistent ids and 1st party ids generated by the user id sub-adapters. As browsers limit our ability to generate cross domain identifiers, we believe a publisher controlled id graph is the right response. We (Magnite) may decide to help publishers build this graph in the future if asked.

@lgtm-com
Copy link

lgtm-com bot commented Nov 2, 2020

This pull request introduces 2 alerts when merging 4a6c09c into ae1b4d7 - view on LGTM.com

new alerts:

  • 2 for Use of returnless function

Copy link
Collaborator

@robertrmartinez robertrmartinez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's some changes needed.

function debounce(func, wait, immediate) {
let timeout;

return function (...args) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So debounce is just returning a function to be called later?

However, anywhere debounce is called, the return is not used.

So this is effectively not doing anything.

I think the real fix here is to add the method bindings as requested for the other methods (first param anytime debounce is called).

And to make this function a regular function that just executes, not one which returns a function.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, did you mean to use the keyword arguments instead of args ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@robertrmartinez The debounce issue is fixed and full body scan config is provided.

function handleTargetElement(conf) {
const targetObserver = new MutationObserver(function (mutations, observer) {
logInfo('target observer called');
debounce(targetAction(conf, mutations, observer), conf.debounce, false);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SKOCHERI

What is the intent here? Please elaborate on the expectation of things to happen when this line is hit.

debounce takes in a func param that looks like the intent is for it to be called later.

But you are executing the targetAction here with those params, which returns undefined. So really func is always undefined when debounce is run.

Is the intent to not call targetAction here but just pass its reference to debounce?

If so, recommend you use bind.

debounce(targetAction.bind(null, conf, mutations, observer), conf.debounce, false);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The debounce issue is fixed

addInputElementsElementListner(conf);
const bodyObserver = new MutationObserver(function (mutations, observer) {
logInfo('Body observer called');
debounce(bodyAction(conf, mutations, observer), conf.debounce, false);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as debounce change needed here.

@smenzer
Copy link
Collaborator

smenzer commented Nov 3, 2020

@jdwieland8282 is there a use case where the publisher doesn't want this library to scan the page for a hashed email, but has their own identifier already (maybe in a 1P cookie, session, etc) but they still want to collect all of the userId module IDs and post them to themselves? in other words...is it possible to turn off the first 4 steps in "Application Flow" (https://github.com/prebid/prebid.github.io/pull/2429/files) and just capture and post the user IDs someplace?

@jdwieland8282
Copy link
Member

Hi @smenzer , Not exactly, the intention here is to make the output deterministic in nature, and not require a bunch of probabilistic ML. If a user wanted to do what you suggested, they could deploy the pubprovided userid module, but they would be limited to page loads where a persistent value was observed.

Copy link
Collaborator

@robertrmartinez robertrmartinez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Several more questions / comments

logError('The required url is not configured');
return;
}
if (!config.debounce) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is debounce === 0 not a valid use case?

Maybe this should be:

if (typeof config.debounce !== 'number')

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

config.debounce = CONF_DEFAULT_OBSERVER_DEBOUNCE_MS;
logInfo('Set default observer debounce to ' + CONF_DEFAULT_OBSERVER_DEBOUNCE_MS);
}
if (config.fullscan == undefined) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would reccomend just checking if it is type of boolean here to make sure it is always one.

if (typeof config.fullscan !== 'boolean')

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also when possible, it is best to use strict ===.

Sometimes you want to catch null and undefined so == will do that,

but we know we want it to be a number for sure, so I think it is best to explicitly check if it is a number.

Same with fullscan (boolean).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

function postData() {
(getGlobal()).refreshUserIds();
const userIds = (getGlobal()).getUserIds();
if (!userIds || userIds.length === 0) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think it may be better to use

!Array.isArray(userIds) instead of !userIds

If for some reason userIds comes back with a prop that does not have .length this will throw an error.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

userIds is an object, so updated the check to Object.keys(userIds).length === 0


function syncCallback() {
return {
success: function (responseBody) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not using responseBody, can remove I guess.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed

}

function postData() {
(getGlobal()).refreshUserIds();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we refreshing userIds every single call?

Not sure that was the intent when pbjs.refreshUserIds was implemented.

@jdwieland8282

CC @bretg

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

refreshUserIds is called prior to posting data which happens only when a email is found.

const value = event.target.value;
logInfo(`Modified Value of input ${event.target.value}`);
email = getEmail(value);
if (email != null) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same, strict check since getEmail explicitly will return null.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

logInfo(` Original Value in Input = ${inputs[i].value}`);
inputs[i].addEventListener('change', event => processInputChange(event));
inputs[i].addEventListener('blur', event => processInputChange(event));
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could use a forEach instead of a for loop since you do not need to exit early or anything.

Just a personal preference, can leave it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

using forEach LGTM analysis: JavaScript test failed so used for loop

return function () {
var context = this;
var args = arguments;
var later = function () {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

usually we use let / or const now instead of actual var declaration.

No big deal though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

for (const node of mutation.addedNodes) {
email = node.textContent;

if (email) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we get an email do we need to keep going through all other mutations?

Should there only ever be one email on page?

Should we break here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once email is found, data is posted and returned now


const body = document.body.innerHTML;
email = getEmail(body);
if (email != null) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

!==

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@SKOCHERI
Copy link
Contributor Author

SKOCHERI commented Nov 5, 2020

@robertrmartinez @smenzer All the review comments has been addressed. Can you take a look and approve if no more comments.

@smenzer
Copy link
Collaborator

smenzer commented Nov 6, 2020

I have no other comments, thanks.

Copy link
Collaborator

@smenzer smenzer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, pending review by @robertrmartinez

targetObserver.observe(targetElement, OBSERVER_CONFIG);
} else {
logInfo(' Target found with target ' + email);
logInfo(' Post data on email found in target with target');
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the space is already included in the logInfo function you created, are these needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

email = getEmail(document.body.innerHTML);
if (email !== null) {
logInfo('Email found in body ' + email);
logInfo(' Post data on email found in the body without observer');
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unnecessary space?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

logInfo(' Data synced successfully.');
},
error: function () {
logInfo(' Data sync failed.');
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

spaces still

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@robertrmartinez
Copy link
Collaborator

Code looks good now,

I am going to do some on page testing today and if it goes well we should be good to merge for tomorrows release!

@robertrmartinez robertrmartinez merged commit 954acdb into prebid:master Nov 12, 2020
@patmmccann patmmccann mentioned this pull request Nov 19, 2020
stsepelin pushed a commit to cointraffic/Prebid.js that referenced this pull request May 28, 2021
* ID Library

* build fix

* build fix

* build fix

* build fix

* Fixing review comments

* Fixing review comments (debounce, full body scan option

* Fixing review comments (var declaration, strict check)

* Fixing space

Co-authored-by: skocheri <skocheri@rubiconproject.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants