Update popularity.php #1436
Reflect GDPR and store hashed IPs instead of the real IP
I've not read the entirety of the GDPR or its legal provisions, but does it specify that personally identifiable data is acceptable once obscured? Prior EU law has only required that websites inform visitors about what data is gathered, typically by listing it on a privacy policy page.
You may extend my humble patch, but AFAIK there is also room for pseudonymization, which is what this patch would do. You could add some salt as well. I'm just not able to do this at this point.
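To illustrate the salting idea, here is a minimal sketch, not code from this patch. The helper name `pseudonymize_ip`, the salt value, and the choice of `hash_hmac()` with SHA256 are all assumptions made for the example.

```php
<?php
// Hypothetical helper, not part of Grav: pseudonymize an IP with a secret salt.
function pseudonymize_ip(string $ip, string $salt, string $algo = 'sha256'): string
{
    // hash_hmac() keys the digest with the salt, so the same IP maps to a
    // different value on every site that uses a different salt.
    return hash_hmac($algo, $ip, $salt);
}

// Assumption: the salt is a long random string kept in the site configuration.
$salt = 'replace-with-a-long-random-site-secret';
$hashed = pseudonymize_ip('203.0.113.7', $salt);
```

Without a salt, the IPv4 space is small enough that every possible address can be hashed and compared, so a secret salt is what makes the stored value hard to map back to an address.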
A court ruling by the CJEU maintains that IP addresses are personal information, and Recital 26 of the GDPR confirms that pseudonymization would be sufficient. Although the cost of hashing an IP with SHA512 won't be noticeable on small to medium-sized sites, on a large site it is an inefficiency that can ill be afforded. This will of course also vary with the server running Grav. Since many of the algorithms available in PHP are unnecessarily slow or insecure, I ran a test with MD5, SHA1, SHA256, and SHA512, running 100,000 iterations over the string "255.255.255.255". On PHP 7.0 this results:
So at the very least the end-user should have a choice between a handful of algorithms if Popularity is enabled, but I agree with the sentiment, @campino2k: this would legally be required under EU law. Given the nature of the data, defaulting to a weaker hash such as MD5 or even MD4 could be considered adequate anonymization. Both are cryptographically insecure, but they perform well in a more comprehensive test on the same system, and reversing a large number of hashes would in any case be inefficient.
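A comparison like the one described above could be reproduced with something along these lines. This is a rough sketch rather than the script actually used; the helper name `time_hash` is made up for the example, and timings will differ between machines and PHP versions.

```php
<?php
// Rough timing sketch: hash the same string repeatedly with one algorithm
// and return the elapsed wall-clock time in seconds.
function time_hash(string $algo, string $input, int $iterations): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        hash($algo, $input);
    }
    return microtime(true) - $start;
}

// Same algorithms, input string, and iteration count as mentioned in the comment.
foreach (['md5', 'sha1', 'sha256', 'sha512'] as $algo) {
    printf("%-7s %.4f s\n", $algo, time_hash($algo, '255.255.255.255', 100000));
}
```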
MD5 is out of the game, since collisions have been found for similar inputs (which IPv4 addresses are). SHA1 is technically broken, and the current recommendation is to use at least 256-bit hashing.
I think that SHA1 would be good enough, because there's really no benefit to be had from guessing the IP address. I see no benefit in using longer hashes that just waste disk space. :)
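For reference on the disk-space point, the hex digest lengths of the algorithms under discussion can be checked directly. A small sketch (the helper name `digest_length` is made up for the example):

```php
<?php
// Length of the hex digest each algorithm produces, in characters.
function digest_length(string $algo): int
{
    return strlen(hash($algo, '255.255.255.255'));
}

foreach (['md5', 'sha1', 'sha256', 'sha512'] as $algo) {
    printf("%-7s %3d hex chars\n", $algo, digest_length($algo));
}
// md5: 32, sha1: 40, sha256: 64, sha512: 128
```

A plain IPv4 address string is at most 15 characters, so every option stores more per entry than the raw IP does, and SHA512 stores roughly three times as much as SHA1.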
What about using `SHA1(IP + date("ymd"))` as a basis?
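In PHP, string concatenation uses `.` rather than `+`, so the suggestion could look roughly like this. The helper name `daily_ip_hash` is made up for the example, and passing the day in as a parameter is a choice made here to keep the function easy to test.

```php
<?php
// Hypothetical helper: hash the IP together with a day stamp such as date('ymd').
function daily_ip_hash(string $ip, string $day): string
{
    // The day stamp acts like a salt that changes every day.
    return sha1($ip . $day);
}

$today = daily_ip_hash('203.0.113.7', date('ymd'));
```

With this scheme the same visitor produces the same hash within a single day, so repeat visits on that day can still be recognized, but the hash changes on the next day.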
I think SHA1 is plenty good enough for this purpose.
Updated manually. Thanks.