Add crawler functionality for identifying sites' usage of GPP 1.0 vs 1.1 and write to database #110
Comments
Thanks, @patmmccann! @katehausladen already looked into this issue. We will further evolve our code to reflect people's move from GPP v1.0 to v1.1. @franciscawijaya, can you take the lead on this issue and implement the functionality @katehausladen described and as outlined below for our June crawl? @Mattm27, can you work with @franciscawijaya as needed to bounce ideas off each other and discuss? And, @katehausladen, it would be great if you were available for any remaining questions that @franciscawijaya and @Mattm27 have, and for general observations, to make sure we are not making any mistakes here 😄. What we need before the June crawl is logic for:
@katehausladen prepared this move already in the analysis.js. So, @franciscawijaya, that file is a good starting point together with @katehausladen's description. Please go ahead and create a new Once we have a record of which site is using which version, we can interpret the results in our analysis after the crawl accordingly. |
As an additional reference, the Prebid change that deletes GPP 1.0 has been merged but not yet released: prebid/Prebid.js#11461 Thanks! |
After reading more about the GPP string and the CMP API, it seems that what was updated to version 1.1 is the CMP API, which captures the information of the GPP string. I have also confirmed with Kate that our current code looks for version 1.1 first and then 1.0, and then stores only one string value. While the string value would be the same, the difference lies in how we access the value (i.e., the getGPPData function in v1.0 or just the ping function in v1.1). (Reference: #60 (comment))
So, the values captured by CMP API v1.0 and v1.1 should be the same, given that the only change between the versions is the removal of the getGPPData function and its merging into the ping function. |
Hypothesis as of now:
What our code has: Possible solution: |
@patmmccann Could we clarify if, by changes in the GPP versions, you were referring to the changes in the CMP API versions? Currently, according to IAB, there is only one GPP version (1.0) but there are 2 CMP API versions (1.0 and 1.1) [CMP API captures the information of the GPP] |
Action plan on our end:
|
Yes the CMP API version 1.1, which we probably should have called 2.0 but oh well. InteractiveAdvertisingBureau/Global-Privacy-Platform#70 cc @lamrowena |
I suggest you get the version out of the ping response instead of testing for the absence of getGPPData. Some commercial vendors, e.g. @janwinkler, have backported getGPPData to assist in transitions yet still conform to the 1.1 spec, and would have a signal recognized by platforms gathering the signal with the newly formatted event listener model. |
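To illustrate the suggestion above, here is a minimal sketch of reading the version from the ping response rather than probing for getGPPData. It is not the code in analysis.js: `detectGppVersion`, `gppFn` (a stand-in for the page's `__gpp` function), and `timeoutMs` are hypothetical names, and the sketch assumes, as described in this thread, that v1.1 passes the ping object to a callback while v1.0 implementations may return it directly.

```javascript
// Sketch only: `detectGppVersion` and its parameters are hypothetical names,
// not functions from analysis.js. `gppFn` stands in for the page's `__gpp`.
function detectGppVersion(gppFn, timeoutMs = 1000) {
  return new Promise((resolve) => {
    if (typeof gppFn !== "function") {
      resolve(null); // no CMP API exposed on the page
      return;
    }

    let settled = false;
    let timer = null;
    const settle = (value) => {
      if (settled) return;
      settled = true;
      if (timer !== null) clearTimeout(timer);
      resolve(value);
    };

    // Accept a ping object only if it actually carries a version field.
    const tryPing = (ping) => {
      if (ping && typeof ping === "object" && ping.gppVersion != null) {
        settle(String(ping.gppVersion));
      }
    };

    // Resolve with null if nothing answers in time.
    timer = setTimeout(() => settle(null), timeoutMs);

    try {
      // Assumption from the thread: v1.1 hands the ping object to the
      // callback, while v1.0 may return it directly. Check both paths.
      const returned = gppFn("ping", tryPing);
      tryPing(returned);
    } catch (err) {
      settle(null); // a throwing stub is treated as "no usable API"
    }
  });
}
```

Because the version comes from the ping object itself, a CMP that backports getGPPData but reports 1.1 in its ping response would still be recorded as 1.1.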
Thank you @patmmccann! Your insight was very helpful in guiding the steps that I need to take to enhance the crawler functionality. A note to self: This would prioritize the 1.1 spec, since all default GPP functions (including ping and getGPPData) that returned values in v1.0 now use callback functions in v1.1. On the other hand, v1.0 would return values as expected, with some functions executing callbacks and some not. Hence, sites would fall into 3 categories: In order to add the column in the crawler data, I believe these are the steps I should take:
|
Logs/Update on adding the new column:
A side note: while figuring out the code for the addition of the gpp_version column, I also encountered some questions about some functions in analysis.js that I need to clarify, and I am currently asking Kate about them. Next step: I will repackage the gpc-analysis-extension into an xpi file and test the extension locally before making a commit. |
Excellent! |
Excellent! Well done, @franciscawijaya! |
Using the April crawl data, I tested the crawl on sites that output GPP strings (as tested in April) to check the gpp_version. Of the 20 sites that I picked from the data, all of them appear to use v1.1, and that is reflected accurately in the gpp_version column. I also tested sites that do not output GPP strings before and after the GPC signal is sent and, as expected, the column reflects a 'null' value for gpp_version, since their gpp_before_gpc and gpp_after_gpc also output a 'null' value. In my testing and debugging of 20 sites, I have yet to encounter a site (that was crawled and identified as having a GPP string in the April crawl) that uses v1.0. I'm not sure whether this confirms that most sites have switched to v1.1. While I'm thinking of continuing my manual testing of other sites from the site list that had GPP strings in the April crawl to verify this switch, I wonder if there is a way for me to get hold of sites that are still using v1.0 right now and test those, instead of going through our site list. |
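The manual check described above could be partly automated along these lines. This is a hypothetical helper, not part of the crawler: `checkGppRow` is an invented name, the column names are the ones mentioned in this thread, and treating both `null` and the string `'null'` as "missing" is an assumption about how the values are stored.

```javascript
// Hypothetical consistency check for one crawl row; not part of the crawler.
// Column names (gpp_version, gpp_before_gpc, gpp_after_gpc) come from the thread.
function isMissing(value) {
  // Assumption: a missing value may be stored as null, undefined, or 'null'.
  return value === null || value === undefined || value === "null";
}

function checkGppRow(row) {
  const hasString =
    !isMissing(row.gpp_before_gpc) || !isMissing(row.gpp_after_gpc);
  const hasVersion = !isMissing(row.gpp_version);

  if (hasVersion && !hasString) return "version-without-string";
  if (hasString && !hasVersion) return "string-without-version";
  return "ok";
}

// Example: filter a list of rows down to the ones worth inspecting by hand.
function findSuspiciousRows(rows) {
  return rows.filter((row) => checkGppRow(row) !== "ok");
}
```

Rows flagged this way are only candidates for a closer look; a mismatch could also come from a site changing its setup between page loads.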
I tried searching BuiltWith to find sites with GPP. But it does not detect GPP. Maybe, there are similar lead generation sites like BuiltWith that do, though. Another option may be to try the Internet Archive and Archive.today to see if they store sites with all their third parties. It is also possible to create your own site with GPP v1.0. But let's not go there unless it is absolutely necessary. |
Other than that, Google search for GPP v1.0 code snippets may get some relevant search results. |
@patmmccann Hello! Would you mind sharing the sites that were still using v1.0 when you came across this issue? My sample set of sites seems to have switched to 1.1, but I'm still looking to test our crawler on sites that use v1.0. I would greatly appreciate any help. Thank you in advance! |
Sounds all good, @franciscawijaya! |
I am having trouble tracking down some of the old gpp implementations at the moment. Perhaps other outreach has been quite successful! |
Merged to the main branch! |
GPP 1.0 is no longer supported. If a site is broadcasting a GPP 1.0 signal, other entities on the page (eg Prebid.js or Google Ad Manager) generally will not understand it. You should just fail any site providing an API that no one understands. At Prebid, we're removing support for reading GPP 1.0 signals entirely and GAM already has.