Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Add Antifraud Detection to Impressions (#2290)
* Change variable names for accuracy
* Generalize spike labeling to label spikes of either clicks or impressions
  + I set the default threshold for impressions 500x higher than that of clicks because we get about 500x as many impressions as clicks, but that decision needs review
* Parametrize the query param addition in spike labeling to facilitate different key-value pairs for different types of event
* We had ParseReportingURL and ParsedReportingURL, and those two names are similar enough that it's unclear why we need both and easy to confuse them. This commit renames them for better disambiguation.
* Aggregate presumed genuine and suspected fraudulent impressions separately
* I had the conditions for genuine and fraudulent filtering on their opposite cases. This fixes that.
* Ran ./bin/mvn spotless:apply to fix Maven's formatting complaints, which were more or less all indentation-related
* Acquiesce to Maven's formatting demands
* Individually import all dependent objects in contextualservices rather than using an asterisk
* Bubble up event type exception through tests
* Move ghost counter metric initialization into the constructor to switch based on eventType
* More formatting
* Adjust import order
* Import order...again
* Add run script for contextual services job in staging
  + To run in prod, change the PROJECT param (but you probably won't have access). Jobs are managed in prod by Terraform.
* Update run options to the appropriate values -- see comments for details
* Rename variables to clarify where we have generalized click counting to count either clicks OR impressions
* We were explicitly filtering by impression spike status to form two separate collections, but we don't think we need to do that: #2290 (comment)
  + So this change set re-combines those.
* Address style checker concerns
* Run ./bin/mvn spotless:apply
* IDE keeps automatically inserting a star import, which our style checker hates. Undoing that here.
* Change job names so they will be easier to disambiguate in debugging
* Remove superfluous filter step
* I swear to you, the list of things I will do before I will make a debugging handle less specific and useful in order to pass a line length checkstyle error is both long and scandalous
* Run bin/mvn spotless:apply
* Make error message trio for UNSET variable as WELL as blank one instead of just for blank one
* Actually show the stack trace if something goes wrong
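The spike-labeling generalization described in the commit message can be sketched in shell as follows. This is only an illustration of the decision logic, not the job's Java implementation: the function names `threshold_for` and `label_spike` are hypothetical, the thresholds are borrowed from the staging run script in this commit (10 for clicks, 20 for impressions), and treating a count strictly above the threshold as a spike is an assumption.

```shell
#!/bin/sh

# Hypothetical helper: print the spike threshold for an event type.
# The values mirror the staging run options; real defaults may differ.
threshold_for() {
  case "$1" in
    click) echo 10 ;;
    impression) echo 20 ;;
    *) echo "unknown event type: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: label a per-window event count.
# Usage: label_spike EVENT_TYPE COUNT
# Assumption: a count strictly greater than the threshold is a spike.
label_spike() {
  threshold=$(threshold_for "$1") || return 1
  if [ "$2" -gt "$threshold" ]; then
    echo "suspicious"
  else
    echo "genuine"
  fi
}
```

For example, `label_spike impression 21` prints `suspicious`, while an unrecognized event type fails with a non-zero status instead of being silently labeled.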
1 parent 65f9ed7, commit 7eadf66
Showing 13 changed files with 327 additions and 149 deletions.
@@ -0,0 +1,53 @@
#!/bin/bash

set -ux

PROJECT="contextual-services-dev"
JOB_NAME="contextual-services-reporter-$(whoami)"

SCRIPT_DIR=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )

# Bail out with instructions when the credentials variable is unset or blank.
if [ -z "${GOOGLE_APPLICATION_CREDENTIALS:-}" ]
then
    cat << EOF
You need to authenticate with gcloud. The commands are:
gcloud auth login youremail@mozilla.com --update-adc
export GOOGLE_APPLICATION_CREDENTIALS=$HOME/.config/gcloud/application_default_credentials.json
Then you can run
bash $0
again.
EOF
    exit 1
fi

"$SCRIPT_DIR/mvn" compile exec:java -Dexec.mainClass=com.mozilla.telemetry.ContextualServicesReporter -e -Dexec.args="\
    --runner=Dataflow \
    --jobName=$JOB_NAME \
    --project=$PROJECT \
    --inputType=pubsub \
    --input='projects/contextual-services-dev/subscriptions/ctxsvc-input' \
    --outputTableRowFormat=payload \
    --errorBqWriteMethod=streaming \
    --errorOutputType=bigquery \
    --errorOutput=$PROJECT:contextual_services.reporting_errors \
    --region=us-central1 \
    --usePublicIps=true \
    --gcsUploadBufferSizeBytes=16777216 \
    --urlAllowList=gs://contextual-services-data-dev/urlAllowlist.csv \
    --allowedDocTypes=topsites-click,topsites-impression,quicksuggest-impression,quicksuggest-click \
    --allowedNamespaces=contextual-services,org-mozilla-fenix,org-mozilla-firefox-beta,org-mozilla-firefox,org-mozilla-ios-firefox,org-mozilla-ios-firefoxbeta,org-mozilla-ios-fennec \
    --aggregationWindowDuration=10m \
    --clickSpikeWindowDuration=3m \
    --clickSpikeThreshold=10 \
    --impressionSpikeWindowDuration=3m \
    --impressionSpikeThreshold=20 \
    --reportingEnabled=false \
    --logReportingUrls=true \
    --maxNumWorkers=2 \
    --numWorkers=1 \
    --autoscalingAlgorithm=THROUGHPUT_BASED \
"
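The commit message mentions handling an unset credentials variable as well as a blank one. A minimal sketch of how the two cases can be told apart in POSIX shell, assuming a hypothetical helper named `credentials_state` (it is not part of the run script): `${VAR+x}` expands to nothing only when the variable is unset, so it stays safe under `set -u`, while a plain `-z` test on the value catches the blank case.

```shell
#!/bin/sh

# Hypothetical helper: report whether GOOGLE_APPLICATION_CREDENTIALS
# is unset, set but blank, or set to a non-empty value.
credentials_state() {
  if [ -z "${GOOGLE_APPLICATION_CREDENTIALS+x}" ]; then
    # ${VAR+x} is empty only when VAR is unset.
    echo "unset"
  elif [ -z "${GOOGLE_APPLICATION_CREDENTIALS}" ]; then
    # The variable exists but holds an empty string.
    echo "blank"
  else
    echo "set"
  fi
}
```

A script that wants a single error message for both failure cases can collapse the first two branches with `[ -z "${GOOGLE_APPLICATION_CREDENTIALS:-}" ]`, which is also safe under `set -u`.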