multiprocessing.connection challenge implicitly uses MD5 #61460
Comments
Within multiprocessing.connection, deliver_challenge() and hmac implicitly default to using MD5. MD5 should no longer be used for security purposes. See e.g. […] This fails in a FIPS-compliant environment (e.g. with the patches I […] There's thus a possibility of an attacker defeating the multiprocessing […] I'm attaching a patch which changes multiprocessing to use a clearly […] It's not clear to me whether hmac.py should also be changed (this would […] [Note to self: I'm tracking this downstream for RHEL as […]
The statement "MD5 should no longer be used for security purposes" is not entirely correct. MD5 should no longer be used as a cryptographic hash function for signatures. However, HMAC-MD5 is a different story. From https://tools.ietf.org/html/rfc6151: The attacks on HMAC-MD5 do not seem to indicate a practical […] I agree that we should slowly migrate to a more modern MAC such as HMAC-SHA256. AES-CBC is too hard to get right, and most AES implementations are vulnerable to timing attacks, too. How about we include the name of the MAC in multiprocessing's wire protocol and define "no MAC name given" as HMAC-MD5? Also, please call it HMAC-SHA256 rather than SHA256.
Banning md5 as a matter of policy may be perfectly sensible. However, I think the way multiprocessing uses hmac authentication is *not* affected by the collision attacks the advisory talks about. Those attacks depend on the attacker being able to determine for himself whether a particular candidate string is a "solution". But with the way multiprocessing uses hmac authentication, there is no way for the attacker to check for himself whether a candidate string has the desired hash: he does not know what the desired hash value is, or even what the hash function is. (The effective hash function, though built on top of md5, depends on the secret key.)
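To make the argument above concrete, here is a minimal sketch of a keyed challenge-response check using the standard `hmac` module. The function names and the 20-byte challenge size are assumptions for illustration, not the code in `multiprocessing.connection`. The digest is a parameter because the reasoning is the same for any underlying hash; the sketch defaults to SHA-256 so that it also runs where MD5 is disabled.

```python
import hmac
import os


def make_challenge(nbytes: int = 20) -> bytes:
    """Return fresh random bytes that the peer must authenticate."""
    return os.urandom(nbytes)


def answer_challenge(authkey: bytes, challenge: bytes, digest: str = "sha256") -> bytes:
    """Compute the keyed MAC of the challenge. Requires knowing authkey."""
    return hmac.new(authkey, challenge, digest).digest()


def verify_response(authkey: bytes, challenge: bytes, response: bytes,
                    digest: str = "sha256") -> bool:
    """Recompute the MAC locally and compare in constant time."""
    expected = hmac.new(authkey, challenge, digest).digest()
    return hmac.compare_digest(expected, response)
```

Without `authkey`, an attacker cannot compute `expected` and so has no local test for whether a candidate response is correct; every guess has to be sent to the verifier.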
Dave, are you still interested in addressing the issue? I think it's a good idea to replace HMAC-MD5 in the long run. But instead of hard-coding another hash algorithm, I would like to see an improved handshake protocol that supports flexible authentication algorithms. You could send an algorithm indicator (e.g. HMAC_SHA256) in the request. It would be really cool to run the multiprocessing protocol over TLS with support for SASL with SCRAM or EXTERNAL (TLS cert auth, AF_UNIX PEERCRED, GSSAPI)...
Indeed, we probably want a flexible handshake mechanism. This needn't be very optimized: probably a magic number followed by a JSON-encoded dict is sufficient. (Of course, several years down the road, someone will engineer a downgrade attack.)
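A rough sketch of what "a magic number followed by a JSON-encoded dict" could look like. Everything here is hypothetical: the `MAGIC` value, the 4-byte length field, the `"mac"` key and the algorithm names are assumptions made for illustration, and none of it is an existing multiprocessing wire format.

```python
import json
import struct

MAGIC = b"MPH1"  # hypothetical magic number, not a real protocol constant
_HEADER_LEN = len(MAGIC) + 4  # magic plus a 4-byte big-endian body length


def encode_hello(params: dict) -> bytes:
    """Serialize handshake parameters as MAGIC + length + JSON body."""
    body = json.dumps(params, sort_keys=True).encode("utf-8")
    return MAGIC + struct.pack("!I", len(body)) + body


def decode_hello(data: bytes) -> dict:
    """Parse a handshake message, rejecting anything malformed."""
    if len(data) < _HEADER_LEN or not data.startswith(MAGIC):
        raise ValueError("not a handshake message")
    (length,) = struct.unpack_from("!I", data, len(MAGIC))
    body = data[_HEADER_LEN:_HEADER_LEN + length]
    if len(body) != length:
        raise ValueError("truncated handshake message")
    params = json.loads(body.decode("utf-8"))
    if not isinstance(params, dict):
        raise ValueError("handshake body must be a JSON object")
    return params


def choose_mac(offered, supported=("HMAC-SHA256", "HMAC-MD5")):
    """Pick the first locally preferred algorithm that the peer offered."""
    for name in supported:
        if name in offered:
            return name
    raise ValueError("no common MAC algorithm")
```

Choosing by the local preference order, rather than the peer's, is one simple way to avoid picking a weaker algorithm when a stronger one is available to both sides. It does not by itself prevent the downgrade attack mentioned above, since the offer is not authenticated.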
This isn't something to backport to a release as what we've got isn't broken. Changing this is a feature.
So #20380 is a more complicated version of my draft PR above. In other PRs and issues related to this in the past, I see one claim by @tiran in particular that bothers me - #16264 (comment) - "The change breaks backward compatibility. multiprocessing supports distributed computing across multiple machines and works with multiple Python versions. With the change a controller with Python 3.N+1 would no longer be able to talk to a 3.N server or the other way around." What evidence is there that multi-Python-version use of multiprocessing, rather than use as a single Python process launching and controlling a bunch of children, is a supported use case? People really should not be using multiprocessing that way. This isn't a distributed computing system.
…tocol (gh-99623) Describe the multiprocessing connection protocol. It isn't a good protocol, but it is what it is. This way we can more easily reason about making changes to it in a backwards compatible way.
Just to note, this came up again in #100755, an apparent duplicate of this issue.
bpo-17258: `multiprocessing` now supports stronger HMAC algorithms for inter-process connection authentication rather than only HMAC-MD5.

Signed-off-by: Christian Heimes <christian@python.org>

gpshead: I reworked this to be more robust while keeping the idea. The protocol modification idea remains, but we now take advantage of the message length as an indicator of legacy vs modern protocol version. No more regular expression usage. We now default to HMAC-SHA256, but do so in a way that will be compatible when communicating with older clients or older servers. No protocol transition period is needed. More integration tests are required to verify that these claims remain true. I'm unaware of anyone depending on multiprocessing connections between different Python versions.

---------

Signed-off-by: Christian Heimes <christian@python.org>
Co-authored-by: Gregory P. Smith [Google] <greg@krypto.org>
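The idea of using message length to tell the legacy protocol from the modern one can be sketched roughly as follows. This is an illustration under stated assumptions, not the merged code: the `{digest}` prefix format, the constants (a 20-byte legacy challenge, a bare 16-byte MD5 response), the allowed digest list and all function names are chosen for the example.

```python
import hmac
import os

LEGACY_CHALLENGE_LEN = 20   # assumed size of an old-style, unprefixed challenge
LEGACY_DIGEST_LEN = 16      # a bare HMAC-MD5 digest is 16 bytes
ALLOWED_DIGESTS = ("sha256", "sha384", "md5")


def make_challenge(digest: str = "sha256", nbytes: int = 40) -> bytes:
    """Modern challenge: '{digest}' prefix plus random bytes (never 20 long)."""
    return b"{" + digest.encode("ascii") + b"}" + os.urandom(nbytes)


def _digest_name(message: bytes) -> str:
    """Extract and validate the digest name from a '{name}' prefix."""
    end = message.find(b"}")
    if not message.startswith(b"{") or end == -1:
        raise ValueError("missing {digest} prefix")
    name = message[1:end].decode("ascii")
    if name not in ALLOWED_DIGESTS:
        raise ValueError("unsupported digest: %r" % name)
    return name


def respond(authkey: bytes, challenge: bytes) -> bytes:
    """Answer a challenge, falling back to bare HMAC-MD5 for legacy peers."""
    if len(challenge) == LEGACY_CHALLENGE_LEN:
        return hmac.new(authkey, challenge, "md5").digest()
    name = _digest_name(challenge)
    mac = hmac.new(authkey, challenge, name).digest()
    return b"{" + name.encode("ascii") + b"}" + mac


def check(authkey: bytes, challenge: bytes, response: bytes) -> bool:
    """Verify a response; a 16-byte response is treated as legacy HMAC-MD5."""
    if len(response) == LEGACY_DIGEST_LEN:
        name, mac = "md5", response
    else:
        name = _digest_name(response)
        mac = response[len(name) + 2:]
    expected = hmac.new(authkey, challenge, name).digest()
    return hmac.compare_digest(expected, mac)
```

Length works as the discriminator here because a prefixed message can never be exactly 20 or 16 bytes long, whereas a random legacy challenge could happen to begin with `{`. The legacy branch needs MD5 to be available, so it would still fail in a FIPS-restricted environment.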
3.12 beta 1 is going out using HMAC-SHA256 by default now that the above PR is merged.