Log session ID instead of client IP during captcha flow #17956
base: develop
How can I effectively test against the
logger.info(
    "Submitting recaptcha response %s with remoteip %s", user_response, clientip
)
I can certainly see that logging the client's IP in this line is redundant, because it should already be logged in the request log line that Synapse emits for every request.
I could get behind just removing the client IP from this log line and even downgrading it to `debug` instead of `info`, since it seems a bit needless to just log the input from the client.
I don't personally see much benefit in logging the `session` for this flow on its own, so we could leave that off too.
However, I feel like I should mention that the Synapse logs still contain more PII (user IDs, IP addresses on the request lines, etc.), so I don't feel this PR really makes much difference on that particular front.
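The suggestion in the comment above could look roughly like the following. This is only a sketch under stated assumptions: the function name `log_captcha_submission` and the logger name are hypothetical and not taken from the Synapse source; only the message text and the `user_response` and `clientip` names come from the quoted diff.

```python
import logging

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("captcha_sketch")


def log_captcha_submission(user_response: str, clientip: str) -> None:
    """Log a captcha submission at debug level without the client IP.

    `clientip` is still accepted, since a real caller would presumably
    forward it to the verification request, but it is deliberately kept
    out of the log message.
    """
    logger.debug("Submitting recaptcha response %s", user_response)
```

With the default `WARNING` level this line produces no output at all, and when debug logging is enabled the IP address still does not appear in the message.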
> I could get behind just removing the client IP from this log line and even downgrading it to `debug` instead of `info`, since it seems a bit needless to just log the input from the client.
> I don't personally see much benefit in logging the `session` for this flow on its own, so we could leave that off too.

OK, sure. I will modify my PR to avoid logging the session ID and change `info` to `debug`.

> However, I feel like I should mention that the Synapse logs still contain more PII (user IDs, IP addresses on the request lines, etc.), so I don't feel this PR really makes much difference on that particular front.

Point taken about PII logging in other locations. To focus on client IPs specifically, I searched the codebase with the following regex and couldn't find locations where IPs are logged:
logger\.\w*\(\n*.*(?:ip|IP|Ip)
Are you aware of specific locations where the client IP ends up in the logs, either explicitly or as part of a larger data structure?
My overall goal in this PR is to minimize or eliminate logging of client IPs across the codebase.
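To make the reach of that regex concrete, here is a small sketch that runs it over a few source snippets. The snippets other than the quoted `logger.info` call are hypothetical and exist only to illustrate the pattern's behavior; they are not taken from Synapse.

```python
import re

# The pattern quoted in the comment above, unchanged.
IP_LOG_PATTERN = re.compile(r"logger\.\w*\(\n*.*(?:ip|IP|Ip)")

# Illustrative snippets. Only "explicit_ip" mirrors the line under review;
# the other two are made-up examples.
SAMPLES = {
    "explicit_ip": (
        "logger.info(\n"
        '    "Submitting recaptcha response %s with remoteip %s", '
        "user_response, clientip\n"
        ")"
    ),
    # An address logged through a variable whose name lacks "ip".
    "renamed_variable": 'logger.info("Request from %s", addr)',
    # "description" contains the letters "ip" but has nothing to do with IPs.
    "false_positive": 'logger.debug("Updating room description")',
}


def find_matches(samples):
    """Return, per snippet name, whether the pattern matches it."""
    return {name: bool(IP_LOG_PATTERN.search(src)) for name, src in samples.items()}


if __name__ == "__main__":
    for name, matched in find_matches(SAMPLES).items():
        print(name, matched)
```

The pattern finds the line under review and also the unrelated "description" message, but it misses the call that logs an address through a variable named `addr`. A search like this can therefore both over-report and under-report, so it may not be enough on its own to show that no client IPs reach the logs.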
I have made the changes you suggested. Can you approve another run of the test suite please? Thanks
Logging the IP address of a client during the captcha registration flow is a privacy issue. Client IPs are stored in the database along with the timestamp and user agent, so it seems redundant to log them here, where it may be easier for an attacker to extract them, depending on how the Postgres DB is protected.
Therefore, I have modified the code to log the user `session` instead, which at least isn't PII. I'm not entirely sure how sensitive these session IDs are in the context of this application, or in the captcha flow. Does logging the session ID pose a security risk? Please let me know. I am happy to modify the PR and tests to log neither the session ID nor the IP address instead. Thank you.