Drop usage of SO_EXCLUSIVEADDRUSE on Windows #928

Closed
njsmith opened this issue Feb 14, 2019 · 6 comments

Comments

@njsmith
Member

njsmith commented Feb 14, 2019

I recently saw a claim that contra the MSDN documentation, using SO_REUSEADDR on Windows no longer creates a bizarre security hole. We have a script in notes-to-self/ that demonstrates that SO_EXCLUSIVEADDRUSE doesn't break things the way MSDN claims it does; but I never did do a test about whether it fixes things the way MSDN claims it does. So... for all I know, this might be true! In fact, it's possible that both SO_REUSEADDR and SO_EXCLUSIVEADDRUSE are no-ops these days. Is this true? If so, when did it become true? Which versions of Windows are affected?

I suspect that the answers won't affect what we do in Trio itself, but it seems worthwhile to know in any case, so at least we know why we do the quirky platform-specific things we do.

See #39, #72 for more background.

@njsmith
Member Author

njsmith commented Feb 14, 2019

How do these things work?

The official docs on this are here: https://docs.microsoft.com/en-us/windows/desktop/WinSock/using-so-reuseaddr-and-so-exclusiveaddruse

Unfortunately, the formatting on that page is mangled. There are some big tables that include crucial information, but they're completely unreadable. Also, they changed the URL format at some point after the mangling happened, so if you just paste that URL into archive.org you don't get anything useful. But, anyway, here's the old URL, in archive.org, with working tables. I'll link to two versions, for reasons that will become apparent soon: May 2015, March 2017

The first table is just a bunch of version info. The second purports to tell you how SO_REUSEADDR and SO_EXCLUSIVEADDRUSE work, but it applies to Windows XP and earlier, so we don't care about that. The third table, in the "Enhanced Socket Security" section, is the one we care about. According to the text, it tells you what happens on "Windows Server 2003 and later operating systems".

The most important entry in the table is at the intersection of row "Default / specific" and column "SO_REUSEADDR / specific". This tells you what happens if we make a regular socket, bind it to a specific hostname/port, and then later on some miscreant creates a socket with SO_REUSEADDR and attempts to hijack our socket.

In the May 2015 version of the table (and earlier versions), this entry is "Success", indicating that miscreants can use SO_REUSEADDR to hijack sockets, and that we need to use SO_EXCLUSIVEADDRUSE to protect against this.

In the March 2017 version of the table (and later versions ... including the version actually up on the website right now, if you squint at it long enough to figure out how it's been mangled), this entry is "ACCESS", indicating that miscreants cannot use SO_REUSEADDR to hijack sockets.

Except for this table, almost everything else between these two snapshots is identical (I used some janky SEO website's "page comparison tool" to check). The only other difference is that in the May 2015 version, the first table has an entry for "Windows 7", and in the March 2017 version, it instead has an entry for "Windows 7 and newer". In particular, they both claim that the table we care about describes what happens on "Windows Server 2003 and later operating systems", which seems unlikely, given that their tables are totally different from each other.

So, I wrote a little script to check. Here's the output on Windows 10:

                                                       second bind
                               | default             | SO_REUSEADDR        | SO_EXCLUSIVEADDRUSE
                               | wildcard | specific | wildcard | specific | wildcard | specific
first bind                     -----------------------------------------------------------------
            default | wildcard |    INUSE |  Success |   ACCESS |  Success |    INUSE |  Success
            default | specific |  Success |    INUSE |  Success |   ACCESS |    INUSE |    INUSE
       SO_REUSEADDR | wildcard |    INUSE |  Success |  Success |  Success |    INUSE |  Success
       SO_REUSEADDR | specific |  Success |    INUSE |  Success |  Success |    INUSE |    INUSE
SO_EXCLUSIVEADDRUSE | wildcard |    INUSE |   ACCESS |   ACCESS |   ACCESS |    INUSE |   ACCESS
SO_EXCLUSIVEADDRUSE | specific |  Success |    INUSE |  Success |   ACCESS |    INUSE |    INUSE

This exactly matches the "March 2017" version of the table, not the "May 2015" version.
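For readers who want to reproduce a single cell of that table, here is a minimal sketch of the kind of probe involved. It is not the script from notes-to-self/; the names `probe`, `make_socket`, `ADDRESSES`, `INUSE_CODES`, and `ACCESS_CODES` are made up for illustration, and the Winsock error numbers are hard-coded on the assumption that Python reports them via `OSError.winerror` on Windows.

```python
import errno
import socket

# Hypothetical names for illustration; not taken from the original script.
ADDRESSES = {"wildcard": "0.0.0.0", "specific": "127.0.0.1"}

# Winsock codes (WSAEADDRINUSE, WSAEACCES) are listed explicitly next to the
# errno values so the classification works on Windows and elsewhere.
INUSE_CODES = {errno.EADDRINUSE, 10048}
ACCESS_CODES = {errno.EACCES, 10013}


def make_socket(mode):
    """Create a TCP socket with 'default', 'SO_REUSEADDR' or 'SO_EXCLUSIVEADDRUSE'."""
    # SO_EXCLUSIVEADDRUSE only exists in the socket module on Windows, so
    # this lookup raises AttributeError on other platforms.
    option = None if mode == "default" else getattr(socket, mode)
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    if option is not None:
        sock.setsockopt(socket.SOL_SOCKET, option, 1)
    return sock


def probe(first_mode, first_kind, second_mode, second_kind):
    """Bind one socket, then try to bind a second one to the same port."""
    with make_socket(first_mode) as first:
        # Port 0 lets the OS pick a free port for the first bind.
        first.bind((ADDRESSES[first_kind], 0))
        port = first.getsockname()[1]
        with make_socket(second_mode) as second:
            try:
                second.bind((ADDRESSES[second_kind], port))
            except OSError as exc:
                code = getattr(exc, "winerror", None) or exc.errno
                if code in INUSE_CODES:
                    return "INUSE"
                if code in ACCESS_CODES:
                    return "ACCESS"
                raise
            return "Success"
```

The hijacking case discussed above corresponds to `probe("default", "specific", "SO_REUSEADDR", "specific")`, which is the cell the table reports as ACCESS on Windows 10. Looping over all mode and address combinations would rebuild the full grid.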

So... it seems like they actually changed things at some point, to plug up this security hole. You still shouldn't use SO_REUSEADDR, because that opens you up to port hijacking. But if you don't use SO_REUSEADDR, then you're OK, whether or not you use SO_EXCLUSIVEADDRUSE.

Also they still have all this text on the page about how if you use SO_EXCLUSIVEADDRUSE then you can't bind a listening socket to any port that still has active connected sockets, but that never made any sense and they secretly fixed it at some point, as we discovered a while ago (#72 (comment)).
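A rough way to check that claim yourself is sketched below: open a listener, accept one connection, close the listener while the connection stays open, then try to bind a fresh listener to the same port. This is an illustrative sketch, not the TIME_WAIT script referenced in this thread, and `can_rebind_with_live_connection` is a made-up name.

```python
import socket


def can_rebind_with_live_connection(option=None):
    """Return True if a new listener can bind a port that still has a
    connected socket on it. `option` is an optional SOL_SOCKET option
    (for example socket.SO_REUSEADDR) applied to both listeners."""

    def new_socket():
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        if option is not None:
            sock.setsockopt(socket.SOL_SOCKET, option, 1)
        return sock

    listener = new_socket()
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    port = listener.getsockname()[1]

    # Establish one connection, then close only the listening socket.
    client = socket.create_connection(("127.0.0.1", port))
    accepted, _ = listener.accept()
    listener.close()

    second = new_socket()
    try:
        second.bind(("127.0.0.1", port))
        second.listen(1)
        return True
    except OSError:
        return False
    finally:
        for sock in (second, accepted, client):
            sock.close()
```

On Windows, passing `socket.SO_EXCLUSIVEADDRUSE` as `option` exercises the scenario the docs warn about; the constant is not defined in Python's socket module on other platforms.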

What does SO_EXCLUSIVEADDRUSE actually do then?

It looks like they decided it should be used to control another unrelated bit of IP configuration.

Quirky IP lesson: you can bind a socket to a specific host, like 127.0.0.1, in which case it only accepts connections directed at that host. (This is important for making sockets that can only be accessed over loopback, or through specific ethernet cards.) Or you can bind to a wildcard address, like 0.0.0.0, in which case it accepts connections over any interface (so loopback is fine, any ethernet card is fine, etc.).
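As a small illustration of the two kinds of bind, this sketch binds a throwaway socket to each address and reports what the OS says it is bound to. `bound_address` is a made-up helper, and port 0 simply asks the OS for any free port.

```python
import socket


def bound_address(host):
    """Bind a TCP socket to (host, any free port) and return its local address."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        return sock.getsockname()


# Specific address: only reachable via the loopback interface.
print(bound_address("127.0.0.1"))
# Wildcard address: reachable via any interface on the machine.
print(bound_address("0.0.0.0"))
```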

Obviously if two sockets try to bind to the same specific host + port at the same time, that can't work. And ditto if two sockets try to bind to the same port on the wildcard address at the same time. But what if one socket binds to a specific host, and another binds to the wildcard address, on the same port at the same time? Is that a conflict? Conceptually, the answer is .... sorta? Honestly to me it feels like it should conflict. But there's at least a little more wiggle room. And here, platforms differ: On Linux, this is always a conflict. On BSDs, it's not a conflict, if the second socket has SO_REUSEADDR set. And we always want to set SO_REUSEADDR to deal with TIME_WAIT sockets, so in practice we always get this behavior tagging along for the ride.

OK, so what about Windows? Well, they did both! By default they act like the BSDs with SO_REUSEADDR set, but if both sockets set SO_EXCLUSIVEADDRUSE then they act like Linux, and if one socket sets it and the other doesn't, then they act like a mix of the two.

So... that's nice I guess?

Which versions do what?

OK that's how Windows 10 works. But when did this change happen? Does the security hole still exist in any versions we care about? So... I downloaded Windows 7 and installed it into a VM, and ran my script again. And I got the same results as on Windows 10, so that's nice. (I also got the same results for the TIME_WAIT test script.) What about older versions?

Windows Vista's "extended support" period ended April 11, 2017. Python 3.6 was released December 23, 2016. According to PEP 11, this means that Python 3.6 will support Windows Vista until 3.6 itself goes EOL. Also, Windows Server 2008 doesn't go out of "extended support" until January 2020, so even Python 3.8 will support Windows Server 2008. Of course, we don't necessarily have to support the same versions that Python itself does...

According to NetMarketShare's estimates, Windows XP is currently 3.96% of desktop installs worldwide. But I definitely don't think we should try to support Windows XP (and it's not even supported by any Python version that we can run on). And Vista is currently 0.26%, which seems like we can probably ignore it?

Well, that's good enough for now. Maybe if I'm really bored later I'll try installing a Vista VM.

What should we do?

So it sounds like SO_EXCLUSIVEADDRUSE's modern purpose is actually unrelated to what its docs say.

In terms of the actual semantics, I don't feel like there's a strong argument for or against SO_EXCLUSIVEADDRUSE. The Linux behavior seems slightly more sensible to me, but "acting like BSD" is perfectly defensible, and I guess the default behavior is probably closer to what locals expect, since it's, y'know, default? (Actually I doubt the locals have any expectations about this tiny edge case at all, but I'm grasping at straws here.)

So I have a weak feeling that perhaps we should stop using SO_EXCLUSIVEADDRUSE after all.

And of course, we should continue to never ever set SO_REUSEADDR.
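Put together, the proposal amounts to something like the sketch below, written against the stdlib socket module. It is not Trio's actual code; `make_listener` and its `backlog` default are made up for illustration.

```python
import socket
import sys


def make_listener(host, port, backlog=128):
    """Open a listening TCP socket following the conclusions above."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        if sys.platform != "win32":
            # On Unix-like systems, SO_REUSEADDR lets us rebind a port that
            # only has leftover TIME_WAIT sockets on it.
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        # On Windows: set neither SO_REUSEADDR (allows port hijacking) nor
        # SO_EXCLUSIVEADDRUSE (no longer needed for protection).
        sock.bind((host, port))
        sock.listen(backlog)
    except BaseException:
        sock.close()
        raise
    return sock
```

The only platform-specific branch left is the SO_REUSEADDR call, which is skipped on Windows because its meaning there is different and dangerous.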

@njsmith
Member Author

njsmith commented Feb 15, 2019

Can now confirm that Windows Vista SP2 (released 2009) also follows the table from the March 2017 docs, and both of my test scripts behave exactly the same way they do on Windows 10.

So AFAICT, the timeline is: Win XP did a thing that was unfortunate but made sense given compatibility concerns (SO_REUSEADDR was dangerous, SO_EXCLUSIVEADDRUSE disabled the dangerousness but had no other effects), Windows Server 2003 did some bizarro thing where they made SO_EXCLUSIVEADDRUSE unnecessary in some cases, but not all, but also decided to make it configure wildcard vs. specific interface binding; and then Vista switched to a system where the original version of SO_EXCLUSIVEADDRUSE is enabled by default and it only does the new unrelated thing, and so far Vista's system has stuck. Also along the way they switched the default way of handling wildcard vs. specific addresses from Linux-style to BSD-style, because, why not.

The Windows IP stack is a land of contrasts.

@njsmith njsmith changed the title from "Do more research on SO_REUSEADDR/SO_EXCLUSIVEADDRUSE on Windows" to "Drop usage of SO_EXCLUSIVEADDRUSE on Windows" Feb 15, 2019
@njsmith
Member Author

njsmith commented Apr 25, 2019

Added good first issue label. The specific todo items are:

njsmith added a commit to njsmith/trio that referenced this issue Apr 25, 2019
- Add the test script used in python-triogh-928 to decipher Windows' port reuse
  semantics
- Note the existence of the 'tree-format' package (found thanks to
  @itamarst)
esnyder added a commit to esnyder/trio that referenced this issue Jun 12, 2019
esnyder added a commit to esnyder/trio that referenced this issue Jun 17, 2019
njsmith added a commit that referenced this issue Jun 18, 2019
Drop usage of SO_EXCLUSIVEADDRUSE on win32 (#928)
@sorcio
Contributor

sorcio commented Jul 3, 2019

Closed by #1108

@sorcio sorcio closed this as completed Jul 3, 2019
@jay
Contributor

jay commented Feb 7, 2021

                           | default           | SO_REUSEADDR      | SO_EXCLUSIVEADDRUSE
                           | specific| wildcard| specific| wildcard| specific| wildcard

@njsmith I'm struggling to make sense of this. Did you possibly reverse it? In other words, is it actually wildcard | specific and not specific | wildcard?

@njsmith
Member Author

njsmith commented Feb 8, 2021

@jay Entirely possible. The script I ran is here if you want to check definitively: https://github.com/python-trio/trio/blob/master/notes-to-self/how-does-windows-so-reuseaddr-work.py

jay added a commit to jay/trio that referenced this issue Feb 8, 2021
- Generate the table headers dynamically.

Prior to this change the table headers were static and incorrect,
'specific' and 'wildcard' labels were reversed.

------------------------------------------

Example from before the change (incorrect):

                   | default           |
                   | specific| wildcard|
                   ---------------------
default | wildcard |   INUSE | Success |
default | specific | Success |   INUSE |

Example from after the change (correct):

                   | default             |
                   | wildcard | specific |
                   -----------------------
default | wildcard |    INUSE |  Success |
default | specific |  Success |    INUSE |

------------------------------------------

Bug: python-trio#928 (comment)

Closes #xxxx