-
Notifications
You must be signed in to change notification settings - Fork 181
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Greenbids Real Time Data Module #3242
Add Greenbids Real Time Data Module #3242
Conversation
f483f33
to
58ee63f
Compare
94be37c
to
5cac1d2
Compare
54dd419
to
2caf0d9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are definitely not the last comments, but at least a bunch of them might lead to a lot of changes, so it makes sense to continue the review once they are tackled
...server/hooks/modules/greenbids/real/time/data/config/GreenbidsRealTimeDataConfiguration.java
Outdated
Show resolved
Hide resolved
.expireAfterWrite(cacheExpirationMinutes, TimeUnit.MINUTES) | ||
.build(); | ||
|
||
final ReentrantLock lock = new ReentrantLock(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
regarding this locking thing my suggestion is following:
- get rid of locking threads
- if the value in cache is expired
- remove the expired data from cache
- fetch the actual data and put it in cache asynchronously
- while any other thread sees the cache is empty - skip the module logic
- the very first fetching might be done on the very first request ideally OR on the app start up alternatively
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have removed the reentrant locks. Added CompletableFuture.runAsync
to fetch from GCS in case of cache miss + added Executors.newSingleThreadExecutor
to make only one thread access GCS (to avoid race condition)
Now the logic is as follows
- thread -> if cache miss -> fetch from GCS -> cache.put -> return null and skip filtering
- thread -> if cache hit -> cache.getIfPresent -> return OnnxModelRunner and ThrottlingThresholds -> do filtering
The very first call when cache is empty is handled in the same way as cache miss -> cache.getIfPresent
returns null and we fetch the artefacts and skip filtering after
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When the data expires in cache, I suppose the cache is invalidated and when we do cache.getIfPresent(cacheKey)
it should return null. So it should be handled automatically by cache with expiration ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@EvgeniiMunin I didn't understand your take on avoiding race conditions. If you point to the cache as a shared resource for threads you shouldn't worry since that's the main point of having in-memory cache where all the fetch data is preserved until expiration. I don't see any race condition problem here.
What I see you created a bottleneck for all the threads that would be waiting until each of them could fetch the data one by one. As I understand the GCS returns the same data for all the requests for 15 min, correct? If so and you want to minimize the number of calls to GCS you should make a synchronization/notification point where you decide whether the GCS should be called or not.
For example, you can use atomic boolean to notify all the threads that one thread started fetching process.
Another question: does the GCS support async execution? It's worth using them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added AtomicBoolean isFetching
flag and changed executorService newCachedThreadPool
. So now the fetch is not blocking for other threads.
By default the flag is set to false. When the thread is fetching from GCS it is switched to true with compareAndSet
. So other threads are not blocked and skip. When the fetching is done the flag is set back to false.
Concerning GCS doc I don't see the functionallity specifically for async download of objects
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@EvgeniiMunin Okay, if the GCS doesn't provide any async interface then please wrap it's call with the vertx.executeBlocking
. It will uses the vertx-based thread management of PBS. Also it's fine to operate with vertx Futures since the module eventually returns the Future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
...ata/src/main/java/org/prebid/server/hooks/modules/greenbids/real/time/data/core/Partner.java
Outdated
Show resolved
Hide resolved
...java/org/prebid/server/hooks/modules/greenbids/real/time/data/core/ThrottlingThresholds.java
Outdated
Show resolved
Hide resolved
...bid/server/hooks/modules/greenbids/real/time/data/model/GreenbidsRealTimeDataProperties.java
Outdated
Show resolved
Hide resolved
.../java/org/prebid/server/hooks/modules/greenbids/real/time/data/model/GreenbidsUserAgent.java
Outdated
Show resolved
Hide resolved
...ks/modules/greenbids/real/time/data/v1/GreenbidsRealTimeDataProcessedAuctionRequestHook.java
Outdated
Show resolved
Hide resolved
...ks/modules/greenbids/real/time/data/v1/GreenbidsRealTimeDataProcessedAuctionRequestHook.java
Show resolved
Hide resolved
...ata/src/main/java/org/prebid/server/hooks/modules/greenbids/real/time/data/core/Partner.java
Outdated
Show resolved
Hide resolved
...server/hooks/modules/greenbids/real/time/data/config/GreenbidsRealTimeDataConfiguration.java
Outdated
Show resolved
Hide resolved
...server/hooks/modules/greenbids/real/time/data/config/GreenbidsRealTimeDataConfiguration.java
Outdated
Show resolved
Hide resolved
...bid/server/hooks/modules/greenbids/real/time/data/model/GreenbidsRealTimeDataProperties.java
Outdated
Show resolved
Hide resolved
...bid/server/hooks/modules/greenbids/real/time/data/model/GreenbidsRealTimeDataProperties.java
Outdated
Show resolved
Hide resolved
...server/hooks/modules/greenbids/real/time/data/config/GreenbidsRealTimeDataConfiguration.java
Outdated
Show resolved
Hide resolved
.../org/prebid/server/hooks/modules/greenbids/real/time/data/model/predictor/FilterService.java
Outdated
Show resolved
Hide resolved
.../org/prebid/server/hooks/modules/greenbids/real/time/data/model/predictor/FilterService.java
Outdated
Show resolved
Hide resolved
...ava/org/prebid/server/hooks/modules/greenbids/real/time/data/model/predictor/ModelCache.java
Outdated
Show resolved
Hide resolved
...ava/org/prebid/server/hooks/modules/greenbids/real/time/data/model/predictor/ModelCache.java
Outdated
Show resolved
Hide resolved
...er/hooks/modules/greenbids/real/time/data/model/predictor/OnnxModelRunnerWithThresholds.java
Outdated
Show resolved
Hide resolved
...ks/modules/greenbids/real/time/data/v1/GreenbidsRealTimeDataProcessedAuctionRequestHook.java
Outdated
Show resolved
Hide resolved
throw new PreBidException("Cache was empty, fetching and put artefacts for next request"); | ||
} | ||
|
||
final GreenbidsInferenceData greenbidsInferenceData = new GreenbidsInferenceData(jacksonMapper, database); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this one is also can be created once
P.S. I've already asked you to check which objects can be created only once, so once more time, double-check the code and define such objects as beans
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this one I've modified the class with Builder
annotation and staticprepareData
that returns the built GreenbidsInferenceData, because the fields are private and final and we need to recreate it at each call with different throttlingMessages
, throttlingInferenceRows
. So GreenbidsInferenceData
has to be defined at each call
I've rechecked other object created at hook call
FilterService
moved to GreenbidsRealTimeDataConfig as beanGreenbidsInvocationResult
. Same as for GreenbidsInferenceData class here we need to rebuilt the object with different private finalupdatedBidRequest
,invocationAction
,analyticsResult
, so it has to be kept at hook call
...ks/modules/greenbids/real/time/data/v1/GreenbidsRealTimeDataProcessedAuctionRequestHook.java
Outdated
Show resolved
Hide resolved
...id/server/hooks/modules/greenbids/real/time/data/model/result/GreenbidsInvocationResult.java
Outdated
Show resolved
Hide resolved
.../org/prebid/server/hooks/modules/greenbids/real/time/data/model/data/GreenbidsUserAgent.java
Outdated
Show resolved
Hide resolved
...ks/modules/greenbids/real/time/data/v1/GreenbidsRealTimeDataProcessedAuctionRequestHook.java
Outdated
Show resolved
Hide resolved
...ks/modules/greenbids/real/time/data/v1/GreenbidsRealTimeDataProcessedAuctionRequestHook.java
Outdated
Show resolved
Hide resolved
...ks/modules/greenbids/real/time/data/v1/GreenbidsRealTimeDataProcessedAuctionRequestHook.java
Outdated
Show resolved
Hide resolved
.../prebid/server/hooks/modules/greenbids/real/time/data/model/data/GreenbidsInferenceData.java
Outdated
Show resolved
Hide resolved
...id/server/hooks/modules/greenbids/real/time/data/model/result/GreenbidsInvocationResult.java
Outdated
Show resolved
Hide resolved
...id/server/hooks/modules/greenbids/real/time/data/model/result/GreenbidsInvocationResult.java
Outdated
Show resolved
Hide resolved
...id/server/hooks/modules/greenbids/real/time/data/model/result/GreenbidsInvocationResult.java
Outdated
Show resolved
Hide resolved
...ks/modules/greenbids/real/time/data/v1/GreenbidsRealTimeDataProcessedAuctionRequestHook.java
Outdated
Show resolved
Hide resolved
...ks/modules/greenbids/real/time/data/v1/GreenbidsRealTimeDataProcessedAuctionRequestHook.java
Outdated
Show resolved
Hide resolved
97300c4
to
54722b7
Compare
.../prebid/server/hooks/modules/greenbids/real/time/data/model/data/GreenbidsInferenceData.java
Outdated
Show resolved
Hide resolved
.../prebid/server/hooks/modules/greenbids/real/time/data/model/data/GreenbidsInferenceData.java
Outdated
Show resolved
Hide resolved
.../prebid/server/hooks/modules/greenbids/real/time/data/model/data/GreenbidsInferenceData.java
Outdated
Show resolved
Hide resolved
.../prebid/server/hooks/modules/greenbids/real/time/data/model/data/GreenbidsInferenceData.java
Outdated
Show resolved
Hide resolved
.../prebid/server/hooks/modules/greenbids/real/time/data/model/data/GreenbidsInferenceData.java
Outdated
Show resolved
Hide resolved
.../org/prebid/server/hooks/modules/greenbids/real/time/data/model/predictor/FilterService.java
Outdated
Show resolved
Hide resolved
.../org/prebid/server/hooks/modules/greenbids/real/time/data/model/predictor/FilterService.java
Outdated
Show resolved
Hide resolved
final GreenbidsInferenceData greenbidsInferenceData = GreenbidsInferenceData | ||
.of(bidRequest, dbReader, mapper); | ||
|
||
impsBiddersFilterMap = filterService.filterBidders( | ||
onnxModelRunner, | ||
greenbidsInferenceData.getThrottlingMessages(), | ||
threshold); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd probably make a service bean from the GreenbidsInferenceData
with the method like parseData(BidRequest bidRequest)
(please come up with better name) that returns List<ThrottlingMessage>
in that way you can remove dbReader
from hook's dependencies
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added Service annotation to GreenbidsInferenceDataService
and then defined it in config GreenbidsRealTimeDataConfiguration
as Bean. Named the method that outputs List<ThrottlingMessage>
extractThrottlingMessagesFromBidRequest
. Removed dbReader from hook field, as it is added to GreenbidsInferenceDataService
bean
final ModelCache modelCache = new ModelCache( | ||
onnxModelPath, | ||
storage, | ||
gcsBucketName, | ||
modelCacheWithExpiration, | ||
onnxModelCacheKeyPrefix, | ||
vertx); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
create each time a new ModelCache is something counter intuitive
create ModelCache bean with the modelCache.get(onnxModelPath, partner.getPbuid())
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
defined modelCache
and ThresholdCache
as beans + provided paths at and get
methods
...er/hooks/modules/greenbids/real/time/data/model/predictor/OnnxModelRunnerWithThresholds.java
Outdated
Show resolved
Hide resolved
2bf2ee0
to
8a3284d
Compare
...server/hooks/modules/greenbids/real/time/data/config/GreenbidsRealTimeDataConfiguration.java
Outdated
Show resolved
Hide resolved
@Service | ||
public class GreenbidsInferenceDataService { | ||
|
||
@Autowired |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Redundant annotation, since you create bean yourself inside spring config.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed redundant autowired
annotation. Also removed removed service
annotation from GreenbidsInferenceDataService
because otherwise the DatabaseReader
bean is not injected properly. Kept GreenbidsInferenceDataService
and DatabaseReader
as beans in spring config
.../server/hooks/modules/greenbids/real/time/data/model/data/GreenbidsInferenceDataService.java
Outdated
Show resolved
Hide resolved
.../server/hooks/modules/greenbids/real/time/data/model/data/GreenbidsInferenceDataService.java
Outdated
Show resolved
Hide resolved
.../server/hooks/modules/greenbids/real/time/data/model/data/GreenbidsInferenceDataService.java
Outdated
Show resolved
Hide resolved
...id/server/hooks/modules/greenbids/real/time/data/model/result/GreenbidsInvocationResult.java
Outdated
Show resolved
Hide resolved
@And1sS @AntoxaAntoxic The fixes on comments are applied, could you please have a look |
...ata/src/main/java/org/prebid/server/hooks/modules/greenbids/real/time/data/core/Partner.java
Outdated
Show resolved
Hide resolved
...server/hooks/modules/greenbids/real/time/data/config/GreenbidsRealTimeDataConfiguration.java
Outdated
Show resolved
Hide resolved
...java/org/prebid/server/hooks/modules/greenbids/real/time/data/core/ThrottlingThresholds.java
Outdated
Show resolved
Hide resolved
.../server/hooks/modules/greenbids/real/time/data/model/data/GreenbidsInferenceDataService.java
Outdated
Show resolved
Hide resolved
.../server/hooks/modules/greenbids/real/time/data/model/data/GreenbidsInferenceDataService.java
Outdated
Show resolved
Hide resolved
.../server/hooks/modules/greenbids/real/time/data/model/data/GreenbidsInferenceDataService.java
Outdated
Show resolved
Hide resolved
.../org/prebid/server/hooks/modules/greenbids/real/time/data/model/data/GreenbidsUserAgent.java
Outdated
Show resolved
Hide resolved
.../org/prebid/server/hooks/modules/greenbids/real/time/data/model/predictor/FilterService.java
Outdated
Show resolved
Hide resolved
.../org/prebid/server/hooks/modules/greenbids/real/time/data/model/predictor/FilterService.java
Outdated
Show resolved
Hide resolved
...ava/org/prebid/server/hooks/modules/greenbids/real/time/data/model/predictor/ModelCache.java
Outdated
Show resolved
Hide resolved
...rg/prebid/server/hooks/modules/greenbids/real/time/data/model/predictor/OnnxModelRunner.java
Outdated
Show resolved
Hide resolved
...org/prebid/server/hooks/modules/greenbids/real/time/data/model/predictor/ThresholdCache.java
Outdated
Show resolved
Hide resolved
...org/prebid/server/hooks/modules/greenbids/real/time/data/model/predictor/ThresholdCache.java
Outdated
Show resolved
Hide resolved
...d/server/hooks/modules/greenbids/real/time/data/model/result/GreenbidsInvocationService.java
Outdated
Show resolved
Hide resolved
...d/server/hooks/modules/greenbids/real/time/data/model/result/GreenbidsInvocationService.java
Outdated
Show resolved
Hide resolved
...ks/modules/greenbids/real/time/data/v1/GreenbidsRealTimeDataProcessedAuctionRequestHook.java
Outdated
Show resolved
Hide resolved
b01f018
to
452e928
Compare
.../org/prebid/server/hooks/modules/greenbids/real/time/data/model/predictor/FilterService.java
Outdated
Show resolved
Hide resolved
src/main/java/org/prebid/server/analytics/reporter/greenbids/GreenbidsAnalyticsReporter.java
Outdated
Show resolved
Hide resolved
final Map<String, Ortb2ImpExtResult> analyticsResultFromAnalyticsTag = extractAnalyticsResultFromAnalyticsTag( | ||
bidResponse); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
analyticsResultFromAnalyticsTag == null check happens at least 3 times further in the code, could you check it once, and skip everything that is redundant in that case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have removed one redundant null check in createBidMessage
.
Now we have the one in greenbidsId
private String greenbidsId(Map<String, Ortb2ImpExtResult> analyticsResultFromAnalyticsTag) {
return Optional.ofNullable(analyticsResultFromAnalyticsTag) // <--- null check 1
.map(Map::values)
.map(Collection::stream)
.flatMap(Stream::findFirst)
.map(Ortb2ImpExtResult::getGreenbids)
.map(ExplorationResult::getFingerprint)
.orElseGet(() -> UUID.randomUUID().toString());
}
And another one in createAdUnit
final Ortb2ImpResult ortb2ImpResult = Optional.ofNullable(analyticsResultFromAnalyticsTag)
.map(analyticsResult -> analyticsResult.get(imp.getId()))
.map(Ortb2ImpResult::of)
.orElse(null);
I prefer to keep 2 checks because of the following
if RTD module is called -> then we populate analyticsResultFromAnalyticsTag and extract it in AnalyticsAdapter
- here in processEvent greenbidsId = ExplorationResult.getFingerprint
- then in createAdunit we populate Ortb2ImpResult with ExplorationResult and tid for each adunit
if RTD module is not called but AnalyticsAdapter is enabled -> then in AnalyticsAdapter analyticsResultFromAnalyticsTag = null but it does not mean we skip it. We should still populate the fields as follows
- greenbidsId is random = UUID.randomUUID()
- and Ortb2ImpResult = null for each adunit
should still keep the check for greenbidsId and another one for Ortb2ImpResult extracted from analyticsResultFromAnalyticsTag
src/main/java/org/prebid/server/analytics/reporter/greenbids/model/ExplorationResult.java
Show resolved
Hide resolved
src/main/java/org/prebid/server/analytics/reporter/greenbids/GreenbidsAnalyticsReporter.java
Outdated
Show resolved
Hide resolved
private GreenbidsRealTimeDataProcessedAuctionRequestHook target; | ||
|
||
private JacksonMapper jacksonMapper; | ||
|
||
private Cache<String, OnnxModelRunner> modelCacheWithExpiration; | ||
|
||
private Cache<String, ThrottlingThresholds> thresholdsCacheWithExpiration; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we usually mock dependencies
P.S. Waiting for the rest of unit tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added new tests:
FilterServiceTest
shouldReturnFilteredBiddersWhenValidThrottlingMessagesProvided
shouldThrowPreBidExceptionWhenOrtExceptionOccurs
shouldThrowPreBidExceptionWhenThrottlingMessagesIsEmpty
shouldThrowPreBidExceptionWhenTensorSizeMismatchOccurs
GreenbidsInferenceDataServiceTest
shouldReturnValidThrottlingMessages
shouldHandleMissingIp
shouldThrowPreBidExceptionOnGeoIpFailure
GreenbidsInvocationServiceTest
shouldReturnUpdateBidRequestWhenNotExploration
shouldReturnNoActionWhenExploration
GreenbidsRealTimeDataModuleTest
shouldHaveValidInitialConfigs
GreenbidsUserAgentTest
getDeviceShouldReturnPCForWindowsNT
getDeviceShouldReturnDeviceIPhoneWhenIOS
getBrowserShouldReturnBrowserNameAndVersionWhenUserAgentIsPresent
getBrowserShouldReturnEmptyStringWhenBrowserIsNull
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed on Slack, have applied @ExtendWith(MockitoExtension.class)
annotation to test classes to mock dependencies
Also have moved the bid request creation into a separate utility class TestBidRequestProvider
because its methods are reused in GreenbidsInvocationServiceTest
, GreenbidsInferenceDataServiceTest
, shouldReturnFilteredBiddersWhenValidThrottlingMessagesProvided
...ks/modules/greenbids/real/time/data/v1/GreenbidsRealTimeDataProcessedAuctionRequestHook.java
Outdated
Show resolved
Hide resolved
...server/hooks/modules/greenbids/real/time/data/config/GreenbidsRealTimeDataConfiguration.java
Outdated
Show resolved
Hide resolved
.../org/prebid/server/hooks/modules/greenbids/real/time/data/model/predictor/FilterService.java
Outdated
Show resolved
Hide resolved
.../org/prebid/server/hooks/modules/greenbids/real/time/data/model/predictor/FilterService.java
Outdated
Show resolved
Hide resolved
...ks/modules/greenbids/real/time/data/v1/GreenbidsRealTimeDataProcessedAuctionRequestHook.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see any tests that cover ModelCache, ThresholdCache and OnnxModel classes, the most important ones I believe
after you add them, I believe the hook test can be simplified
...test/java/org/prebid/server/hooks/modules/greenbids/real/time/data/v1/FilterServiceTest.java
Outdated
Show resolved
Hide resolved
...test/java/org/prebid/server/hooks/modules/greenbids/real/time/data/v1/FilterServiceTest.java
Outdated
Show resolved
Hide resolved
...test/java/org/prebid/server/hooks/modules/greenbids/real/time/data/v1/FilterServiceTest.java
Outdated
Show resolved
Hide resolved
...test/java/org/prebid/server/hooks/modules/greenbids/real/time/data/v1/FilterServiceTest.java
Outdated
Show resolved
Hide resolved
...ebid/server/hooks/modules/greenbids/real/time/data/v1/GreenbidsInferenceDataServiceTest.java
Outdated
Show resolved
Hide resolved
...ebid/server/hooks/modules/greenbids/real/time/data/v1/GreenbidsInferenceDataServiceTest.java
Outdated
Show resolved
Hide resolved
.../prebid/server/hooks/modules/greenbids/real/time/data/v1/GreenbidsInvocationServiceTest.java
Outdated
Show resolved
Hide resolved
.../prebid/server/hooks/modules/greenbids/real/time/data/v1/GreenbidsInvocationServiceTest.java
Outdated
Show resolved
Hide resolved
.../prebid/server/hooks/modules/greenbids/real/time/data/v1/GreenbidsInvocationServiceTest.java
Outdated
Show resolved
Hide resolved
...prebid/server/hooks/modules/greenbids/real/time/data/v1/GreenbidsRealTimeDataModuleTest.java
Outdated
Show resolved
Hide resolved
Hi, have added the tests
Also added Then removed some redundant test cases from hook test Remarks on OnnxModelRunnerTest: OrtEnvironment is not very much mockable as when mocking it NPE on |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some of UT fails, please check before pushing
...ebid/server/hooks/modules/greenbids/real/time/data/v1/GreenbidsInferenceDataServiceTest.java
Outdated
Show resolved
Hide resolved
...test/java/org/prebid/server/hooks/modules/greenbids/real/time/data/v1/FilterServiceTest.java
Outdated
Show resolved
Hide resolved
...odules/greenbids/real/time/data/v1/GreenbidsRealTimeDataProcessedAuctionRequestHookTest.java
Outdated
Show resolved
Hide resolved
...st/java/org/prebid/server/hooks/modules/greenbids/real/time/data/v1/OnnxModelRunnerTest.java
Outdated
Show resolved
Hide resolved
...java/org/prebid/server/hooks/modules/greenbids/real/time/data/v1/TestBidRequestProvider.java
Outdated
Show resolved
Hide resolved
8119e7b
to
b18de6a
Compare
The errors in UT were due to Unnecessary stubbings detected when mocking GCS storage, bucket, blob. But we have to mock them to be able to properly simulate test cases. So I've added |
19a9632
to
bb0a589
Compare
...odules/greenbids/real/time/data/v1/GreenbidsRealTimeDataProcessedAuctionRequestHookTest.java
Show resolved
Hide resolved
.../server/hooks/modules/greenbids/real/time/data/model/data/GreenbidsInferenceDataService.java
Outdated
Show resolved
Hide resolved
...va/org/prebid/server/hooks/modules/greenbids/real/time/data/util/TestBidRequestProvider.java
Outdated
Show resolved
Hide resolved
d9ec9f1
to
6cbd929
Compare
e3641ce
to
43fd9f3
Compare
private Map<String, Ortb2ImpExtResult> extractAnalyticsResultFromAnalyticsTag(BidResponse bidResponse) { | ||
return Optional.ofNullable(bidResponse) | ||
.map(BidResponse::getExt) | ||
.map(ExtBidResponse::getPrebid) | ||
.map(ExtBidResponsePrebid::getModules) | ||
.map(ExtModules::getTrace) | ||
.map(ExtModulesTrace::getStages) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Trace log will be available only in debug mode, maybe it is better to use AuctionContext.hookExecutionContext
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replaced extraction of analytics tags from bid respone to AuctionContext.hookExecutionContext + modified UT
src/main/java/org/prebid/server/analytics/reporter/greenbids/GreenbidsAnalyticsReporter.java
Outdated
Show resolved
Hide resolved
final Map<String, Ortb2ImpExtResult> analyticsResultFromAnalyticsTag = extractAnalyticsResultFromAnalyticsTag( | ||
bidResponse); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
URL url = new URL(properties.getGeoLiteCountryPath()); | ||
Path databasePath = Files.createTempFile("GeoLite2-Country", ".mmdb"); | ||
|
||
try (InputStream inputStream = url.openStream(); | ||
FileOutputStream outputStream = new FileOutputStream(databasePath.toFile())) { | ||
inputStream.transferTo(outputStream); | ||
} | ||
|
||
try { | ||
return new DatabaseReader.Builder(databasePath.toFile()).build(); | ||
} catch (IOException e) { | ||
throw new PreBidException("Failed build DatabaseReader from DB", e); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it's conceptually better to create a proxy (or factory) for DatabaseReader
that also implements the Initializable
interface? The application will still wait for the initialize
method to complete before it can start working, but this will be easier to extend if additional logic is needed to download this file.
Besides, I just don't want to download anything while we're constructing beans :p
@And1sS what is your opinion on it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added DatabaseReaderFactory
that implements Initializable. It also contains AtomicReference<DatabaseReader> databaseReaderRef
to hold the DatabaseReader.
Provided factory into GreenbidsInferenceDataService
in constructor and then get the DatabaseReader at getCountry
from atomic reference
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@EvgeniiMunin Thanks for all the work you’ve done! 🎉🚀💪
Finally, we’re ready to merge.
Approved!! ✅
🔧 Type of changes
✨ What's the context?
Greenbids Real Time Data module to filter bidders SSP listed in the
imp[].ext.prebid.bidder
of the bid request at theprocessed-auction-request
stage.🧠 Rationale behind the change
Artefacts and caching
To perform the filtering the module uses the ML pipeline that outputs the probability of bid per bidder for each imp for the given BidRequest. Then this probability of bid is compared with the threshold to ensure target True Positive Rate (TPR) defined for each partner publisher. Thus the RTD module uses 2 artefacts that are fetched from the Greenbids Google Cloud Storage bucket
.onnx
format.json
format with the list of thresholds and their corresponding TPRsOn Greenbids side these artefacts are updated every 15 min in the storage. In the classes
ModelCache
andThresholdCache
we introduce the caching of these artefacts on RTD side with the cache expiration delay of 15 min using Caffeine. When the artefacts are fetched they are put into the cache with their corresponding keys. When the new thread withe the BidRequest is hooked by the module we search the artefacts from the cache.no_action
and BidRequest unchangedWhen the artefacts are fetched we craft ThrottlingMessage with the features necessary for inference. The inference time for a single row of data is up to 0.2 ms in normal regime and 3.5 ms at the first call after fetching (warmup).
Communication with AnalyticsReporter and Exploration
The RTD module also communicates the filtering results with the
GreenbidsAnalyticsReporter
via AnalyticsTags. Here we populate AnalyticsResult of AnalyticsTag for each Imp the withfingerprint
(greenbidsId),isKeptInAuction
map of booleans for each bidder wether we keep them in auction or not for the given imp,isExploration
flag that is necessary to isolate the training logThen the AnalyticsTag is then parsed by the AnalyticsReporter from
ExtBidResponsePrebid
and its content added to the analytics payload sent to Greenbids server.The Exploration part of traffic is split randomly with the ratio defined for each partner publisher per bid requests and is not filtered by the RTD module.
Publisher settings
We have observed 2 options how to handle the settings by the activated partner publishers to avoid adding them to the code base
prebid-config-with-modules.yaml
common config of prebid and update it buy sending manually to PBS teambid-request.json
where they indicate the activation of our module in bid request extenstionbid-request.ext.prebid.analytics.greenbids
andbid-request.ext.prebid.analytics.greenbids-rtd
. Here we use analytics extension in the hierarchy of both moduels as there is no RTD extensions in ExtRequestPrebid , so this is a workaround to add both modules configs but to name them differently, something likeanalytics.greenbids-analytics
and.analytics.greenbids-rtd
At the given moment the 2nd option is implemented in the RTD module.
🧪 Test plan
We are testing the following cases
shouldExitEarlyIfPartnerNotActivatedInBidRequest
when the partner is not activated so the extensionbidrequest.ext.prebid.analytics.greenbids-rtd
is not defined -> we skip filtering and returnFutureSucceeded
without modifiying BidRequestshouldExitEarlyIfModelIsNotAvailable
when ONNX model is not in cache on the partner and GCS bucket or the artefact init is not available -> we skip filtering and returnFutureSucceeded
without modifiying BidRequestshouldExitEarlyIfThresholdIsNotAvailable
same as aboveshouldNotFilterBiddersAndReturnAnalyticsTagWhenExploration
when we are at the exploration, the model should do the inference but the BidRequest stays unchanged. We populate the AnalyticsTag with the inference results and send it to AnalyticsReporter by indicating all bidders in the result asisKeptInAuction = true
shouldFilterBiddersBasedOnModelResults
the nominal case when the module operates end2end: we filter the bidders and return the update BidRequest with modifiedbidrequest.imp[].ext.prebid.bidders
and sendAnalyticsTags
to AnalyticsReporter.shouldFilterBiddersBasedOnModelIfAnyFeatureNotAvailable
if by any reason some features are not available in the BidRequest (ex: user IP address or device are null) -> we do the inference anyway same as in normal case but with filling the absent features with empty strings🏎 Quality check