New PEP 546: Backport MemoryBIO to Python 2.7 #272
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,141 @@ | ||
PEP: 546 | ||
Title: Backport ssl.MemoryBIO and ssl.SSLObject to Python 2.7 | ||
Version: $Revision$ | ||
Last-Modified: $Date$ | ||
Author: Victor Stinner <victor.stinner@gmail.com> | ||
Status: Draft | ||
Type: Standards Track | ||
Content-Type: text/x-rst | ||
Created: 30-May-2017 | ||
|
||
|
||
Abstract | ||
======== | ||
|
||
Backport ssl.MemoryBIO and ssl.SSLObject classes from Python 3 to Python | ||
2.7 to enhance the overall security of Python 2.7. | ||
|
||
|
||
Rationale | ||
========= | ||
|
||
While Python 2.7 is getting closer to its end-of-life (scheduled for | ||
2020), it is still used in production and the Python community is still | ||
responsible for its security. Backporting these classes would also | ||
facilitate the future adoption of :pep:`543`, which will improve | ||
security for Python 3 users. | ||
|
||
This PEP does NOT propose a general exception for backporting new | ||
features to Python 2.7 - every new feature proposed for backporting will | ||
still need to be justified independently. In particular, it will need to | ||
be explained why relying on an independently updated backport on the | ||
Python Package Index instead is not an acceptable solution. | ||
|
||
|
||
PEP 543 | ||
------- | ||
|
||
:pep:`543` defines a new TLS API for Python that would enhance | ||
Python's security by giving access to the root certificate authorities on | ||
Windows and macOS through native APIs instead of OpenSSL. A side effect | ||
is that it gives access to certificates installed locally by system | ||
administrators, which allows "company certificates" to be used without | ||
modifying each Python application, so that TLS certificates are validated | ||
correctly (instead of ignoring or bypassing TLS certificate | ||
validation). | ||
|
||
For practical reasons, Cory Benfield would like to first implement an | ||
I/O-less class similar to ssl.MemoryBIO and ssl.SSLObject for | ||
:pep:`543`, and then provide a second class, based on the first one, that | ||
uses sockets or file descriptors. This design would help to structure the | ||
code to support more backends and would simplify testing and auditing. | ||
Later, optimized classes that use sockets or file descriptors directly | ||
may be added for performance. | ||
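
To illustrate what "I/O-less" means here, the following is a minimal sketch using the existing Python 3 ``ssl`` API (not the backport itself, and not the :pep:`543` API). It starts a client handshake without creating any socket: the TLS bytes end up in an in-memory buffer, and the caller decides how to transport them. The hostname ``example.org`` is a placeholder chosen for this example.

```python
import ssl

# Client-side TLS context; no socket is created anywhere in this sketch.
context = ssl.create_default_context()

# Two in-memory buffers stand in for the network:
# "incoming" holds bytes received from the peer, "outgoing" holds bytes to send.
incoming = ssl.MemoryBIO()
outgoing = ssl.MemoryBIO()

# "example.org" is a placeholder hostname (an assumption of this sketch).
tls = context.wrap_bio(incoming, outgoing, server_hostname="example.org")

try:
    tls.do_handshake()
except ssl.SSLWantReadError:
    # Expected: the ClientHello has been written to "outgoing", and the
    # handshake cannot continue until the peer's reply is fed to "incoming".
    pass

# The caller is now free to send these bytes over a socket, a pipe,
# an IOCP handle, or anything else.
client_hello = outgoing.read()
```

A socket-based class can then be layered on top by copying bytes between a transport and the two buffers, which is the layering the paragraph above describes.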
|
||
While :pep:`543` defines an API, the PEP would only make sense if it | ||
comes with at least one complete and good implementation. The first | ||
implementation will be based on the ``ssl`` module of the Python | ||
standard library. | ||
|
||
In a perfect world, all applications would have moved to Python 3 once | ||
Python 3.0 was released. In practice, many applications still run in | ||
production on top of Python 2.7. To make the new TLS API more widely | ||
used, it should be usable on all Python versions currently supported: | ||
Python 2.7, 3.5, 3.6. Otherwise, some applications would have to wait | ||
until they drop Python 2 support to be able to use the new TLS API. | ||
Review comment: Just to connect all the dots: delaying adoption of the PEP 543 API means delaying the adoption for security improvements for Python3 users as well.
Review comment: done |
||
|
||
Delaying adoption of the PEP 543 API means delaying the adoption of | ||
security improvements for Python 3 users as well. | ||
|
||
|
||
requests, pip and ensurepip | ||
--------------------------- | ||
|
||
There are plans afoot to look at moving Requests to a more event-loop-y | ||
model, and doing so basically mandates a MemoryBIO. In the absence of a | ||
Review comment: Tornado has been doing TLS in an event-loop model in python 2.5+ with just
Review comment: @Lukasa can maybe reply to this question. In my experience, on Windows, you really want to use IOCP rather than select() to implement an event loop, and you need MemoryBIO for IOCP. (Hum, but you also need C code to access to IOCP.)
Review comment: @bdarnell So the short answer is that wrap_socket interferes awkwardly with event loop management. A wrapped socket does not respond to selecting like a regular socket does, in the following ways:
Essentially, for all selecting models that use level-triggering as their approach, a wrapped socket behaves very strangely. It's at best an edge-triggered object (due to point 3), but even then it's an edge triggered object that may consistently refuse to behave the way you want it to due to the fact that triggering the socket into either readable or writable state may still prevent you from reading or writing any data at all.
The MemoryBIO object gives you much more predictable behaviour because it doesn't intercept socket calls. When the FD is marked readable, it really is: there just may be no data to transfer further up the chain. This means that the event loop doesn't dirty its hands with special knowledge about the way TLS works and handle all of the wacky TLS edge case behaviours.
Is it possible to write an event loop with just a wrapped socket? Sure. But the MemoryBIO provides a much more reasonable interface to do so. Most notably, Twisted does not use the wrapped socket approach any longer and I wouldn't propose that they should.
Review comment: Thanks for the answer @Lukasa :-) IMHO it's worth it to include this answer into the PEP since it was a very good question :-) (For example, I was unable to find the proper answer.)
Review comment: I should also note that @Haypo's concerns about alternative styles of event loop also come into play here and are worth including. ;)
Review comment: For what it's worth, Twisted used to implement TLS via the wrap_socket-esque approach (since this is what OpenSSL strongly encourages you to do).
Despite extensive test coverage and plenty of real-world usage, it never really worked right, and when we finally managed to switch over to the in-memory BIO model, dozens of bugs were fixed overnight, the whole system got considerably more reliable. These edge cases manifest most significantly when using embedded systems, low-spec hardware (think raspberry pi), or weird operating systems. I do still occasionally experience weird flakiness when using Tornado event loops on this kind of hardware that doesn't happen with Twisted, and tellingly, doesn't happen with Twisted's TLS support using
Review comment: Also, while this PEP doesn't mention it: wrap_socket doesn't work with pipes, and it's helpful to be able to speak wire protocols over UNIX pipes (or other non-socket transports) between processes for things like multi-process parallelism.
Review comment: Does it support Unix domain sockets?
Review comment: Still, I agree with @bdarnell that the sentence "There are plans afoot to look at moving Requests to a more event-loop-y model, and doing so basically mandates a MemoryBIO" is basically wrong. Besides, the idea that future directions for Requests should guide our 2.7 backport strategy also sounds entirely bogus. |
||
Python 2.7 backport, Requests is required to basically use the same | ||
solution that Twisted currently does: namely, a mandatory dependency on | ||
`pyOpenSSL <https://pypi.python.org/pypi/pyOpenSSL>`_. | ||
|
||
The `pip <https://pip.pypa.io/>`_ program has to embed all its | ||
dependencies for practical reasons. Since pip depends on requests, it | ||
would have to embed a copy of pyOpenSSL, which would make pip painful to | ||
install. Currently, pip doesn't support embedding C extensions, which | ||
must be compiled on each platform and so require a C compiler. | ||
|
||
Since Python 2.7.9, Python embeds a copy of pip both for default | ||
installation and for use in virtual environments: the new ``ensurepip`` | ||
module. If pip ends up bundling pyOpenSSL, then Python will end up | ||
bundling pyOpenSSL. Backporting only ``ssl.MemoryBIO`` and | ||
``ssl.SSLObject`` would avoid having to embed pyOpenSSL: it would add only | ||
the strict minimum of features required by requests and fix the bootstrap | ||
issue (python -> ensurepip -> pip -> requests -> MemoryBIO). | ||
|
||
|
||
Changes | ||
======= | ||
|
||
Add ``MemoryBIO`` and ``SSLObject`` classes to the ``ssl`` module of | ||
Python 2.7. | ||
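
Because the classes would only exist in Python 2.7 releases that include the backport, code that must also run on older 2.7 releases would probably need to detect them at run time. A minimal sketch, where ``has_memory_bio`` is a hypothetical helper name introduced for this example and not part of the ``ssl`` module:

```python
import ssl


def has_memory_bio(module=ssl):
    """Return True if the given ssl module exposes the in-memory TLS classes.

    The helper name and the "module" parameter are assumptions of this
    sketch; the parameter only exists to make the check easy to test.
    """
    return all(hasattr(module, name) for name in ("MemoryBIO", "SSLObject"))


# A library could pick its TLS backend based on the result,
# falling back to pyOpenSSL when the classes are missing.
backend = "stdlib" if has_memory_bio() else "pyOpenSSL"
```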
|
||
The code will be backported and adapted from the master branch | ||
(Python 3). | ||
Review comment: I'd prefer to backport from the Python 3.6 maintenance branch rather than from the development branch - that way it's a true backport of a released version, rather than potentially including code that hasn't previously been published in a stable release.
Review comment: I expect the code to be exactly the same. I don't think that MemoryBIO or SSLObject changed much since Python 3.5. I prefer to backport from master to ease comparison of 2.7 and master branches, to easy re-sync later. As explained in another paragraph: reduce the diff between these two branches. |
||
|
||
Review comment: This may be too in the weeds for a PEP, but when I worked on this in 2014, it also significantly reduced the size of the Python2/Python3 diff of the
Review comment: done |
||
The backport also significantly reduced the size of the Python 2/Python | ||
3 difference of the ``_ssl`` module, which makes maintenance easier. | ||
|
||
|
||
Links | ||
===== | ||
|
||
* :pep:`543` | ||
* `[backport] ssl.MemoryBIO | ||
<https://bugs.python.org/issue22559>`_: Implementation of this PEP | ||
written by Alex Gaynor (first version written in October 2014) | ||
* :pep:`466` | ||
|
||
|
||
Discussions | ||
=========== | ||
|
||
* `[Python-Dev] Backport ssl.MemoryBIO on Python 2.7? | ||
<https://mail.python.org/pipermail/python-dev/2017-May/147981.html>`_ | ||
(May 2017) | ||
|
||
|
||
Copyright | ||
========= | ||
|
||
This document has been placed in the public domain. | ||
|
||
|
||
|
||
|
||
.. | ||
Local Variables: | ||
mode: indented-text | ||
indent-tabs-mode: nil | ||
sentence-end-double-space: t | ||
fill-column: 70 | ||
coding: utf-8 | ||
End: |
Review comment: I don't think this paragraph is necessary - wanting pip, which we ship as part of CPython 2.7, to be able to reliably access the system provided TLS APIs, which we want it to do by way of PEP 543, is the real reason we want to update the standard library instead of just relying on PyOpenSSL.
Relying on PyOpenSSL used to be a major problem due to the difficulty of building it on Windows and Mac OS X, but the introduction of the wheel format largely addressed that problem (you still need to build the underlying cryptography library for *nix systems, but that isn't that difficult).