-
Notifications
You must be signed in to change notification settings - Fork 104
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Strip the user's email address out of the location used by GA #466
Conversation
Since this keeps coming up is there an opportunity to avoid this in the server side applications? For example if we know certain routes allow user input:
output this route to the frontend as a sort of 'mask' which then can take the current URL and remove this section from the URL.
This means we're not relying on matching arbitrary patterns, which as we've seen seems prone to letting things slip through.. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've left a few comments, but happy to approve.
@@ -20,6 +22,10 @@ | |||
sendToGa('set', 'displayFeaturesTask', null) | |||
} | |||
|
|||
function stripLocationPII () { | |||
sendToGa('set', 'location', stripPIIFromString(window.location.href.toString())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think window.location.href
is already a string so we don't need to call toString
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, of course. I was thinking of window.location
, which is an Object.
@@ -160,6 +167,11 @@ | |||
} | |||
} | |||
|
|||
function stripPIIFromString (string) { | |||
var stripped = string.replace(EMAIL_PATTERN, '[email]') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if the EMAIL_PATTERN
variable should be in this function so it's closer to the code where it gets used and then maybe rename this to stripEmailAddressesFromString
?
@NickColley - thanks that's a great suggestion. We've had a chat about it with some of the email-alert alumni and others on the platform-health team. We do want to do the work to obfuscate things server side, but we want to take the time to do that work correctly, both from a product point of view and from a user-journey point of view and with proper oversight about what is and isn't PII. I wonder if we could push this through as a temporary band-aid since it does stop the immediate and time-sensitive issue we have. |
@bevanloon sure, I won't block this pull request just wanted to highlight that previous to this we've already had two 'temporary band-aids', so we should probably try a more robust approach in the future. |
7de00b7
to
9791be0
Compare
We have seen some paths in Google Analytics that include the user's email address, normally as part of the query string. These URLs always have zero pageviews, but a number of hits (GA measures hit as any activity, e.g. event, timing measurement, etc).
Although there is existing code in this repo to strip personal information (PII) from the path, there are some cases where GA generates a new request (e.g. page timings) that we have not covered. In this case, it reports the actual URL rather than our custom path. This aligns with the observation of these hits having no pageviews, since pageviews are already stripped of this information.
This PR addresses that problem by stripping email addresses out of all URLs and setting this as a global location for GA to use on all requests from that page.
https://trello.com/c/jKitCyZI/311-remove-emails-from-ga-events-on-email-alert-frontend-pages