-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
IPv6 support #50
IPv6 support #50
Conversation
Damn. Develop on Linux, so it works on Linux, but macOS is busted 😂 |
I am a guest, so please have at it. May it be enjoyable to you. I have been developing some Tcp6Suite cases to try to prove to myself that I/we are actually using IPv6. As you have probably discovered Something to keep in mind, is that the sockaddr_in6 on SN 0.4.n Looks like you have also discovered that macOS, freeBSD, use a |
import scala.scalanative.posix.netinet.inOps._ | ||
import scala.scalanative.unsafe._ | ||
|
||
private[ch] object SocketHelpers { | ||
|
||
lazy val preferIPv4Stack = | ||
java.lang.Boolean.parseBoolean(System.getProperty("java.net.preferIPv4Stack", "false")) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does parseBoolean do a case-blind comparison?
The java docs are pretty unclear about if the property is
Boolean or String. They say
However, in the case an application would rather use IPv4 only sockets, then this property can be set to true. "true" is bolded
but not in double quotes.
You are probably correct here and I should fix in SN, with
your approval.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably needs a placeholder that java.net.preferIPv6Addresses is known, but not implemented this evolution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question. Here's what the Javadoc says.
Parses the string argument as a boolean. The
boolean
returned represents the valuetrue
if the string argument is notnull
and is equal, ignoring case, to the string"true"
.
Example:
Boolean.parseBoolean("True")
returnstrue
.
Example:Boolean.parseBoolean("yes")
returnsfalse
.
https://docs.oracle.com/javase/8/docs/api/java/lang/Boolean.html#parseBoolean-java.lang.String-
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to know, thank you for the snippet. So, presumably 0 and 1 would both be Boolean:False?
Sometimes on unix command lines (perhaps bash?) those are also taken as boolean. There
are a bunch of quirks about the case of true/false, where something work and others (True?)
that one would expect to work do not. Like black flies in Maine, just an annoyance one
learn to live with (or pretend to live with).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably needs a placeholder that java.net.preferIPv6Addresses is known, but not implemented this evolution.
I am a little confused what we would do with preferIPv6Addresses
. It seems like this is a useful option for getByName
, which has the choice to resolve either an IPv4 address or an IPv6 address.
But in our case, we are directly handed an IP address and we have already created a socket according to the value of preferIPv4Stack
. So it's not obvious to me how we should modify our behavior based on this setting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kind of punted preferIPv6Addresses
in SN 0.5.0 and realized doing this work that I had forgotten that
I had done so. Oops!
You can see why I moved peferIPv6Addresses
to the rear echelons. It is a complex feature which
will probably never be used but some reasonable answer must be made for API compatibility.
I will have to think about and do some design studies on this one. I think the place where
I would start is if a host/machine has both IPv4 and IPv6 addresses, preferIPv6Addresses, is
set, and an IPv4 address is presented to getadderinfo. The "pure" IPv6 address should be
used. I just do not know if addrinfo will order the list that way. I suspect it must be done
manually. I believe this a live concern for SN. I am not convinced that such an IPv4
can sneak thru to epoll getadderinfo. I suspect it can.
Again, when this becomes our biggest & most urgent problem, it will be a good day.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Congratulations, looks like you did a lot of hard work.
And it passes CI.
-
I suggest that the call chain containing and under
getLocalAddress
is
complex. Now that you have the Big Picture down, you might want to
give that area focused pass. -
Now comes the hard part, convincing ourselves & others that IPv6 is actually
being used. After this is merged, I can run my IPv6 tests (only a few) and
run some dedicated tools to gather evidence that IPv6 is being used. -
By the light of day tomorrow I will go back and look at the getaddrinfo flags.
Now that they are in one place, it will be easier to do.
Later: Done, suggested change in its place in the file. -
There are a couple of "Due Bills", "preferIPv6Addesses" and initial determination
if IPv6 is available. Those can be fixed in follow-on PRs. Just mentioning to
keep track of them before they become bugs.
So, if there is urgency to getting IPv6 support available, I suggest you merge what
is here or the next immediate generation.
import scala.scalanative.posix.netinet.inOps._ | ||
import scala.scalanative.unsafe._ | ||
|
||
private[ch] object SocketHelpers { | ||
|
||
lazy val preferIPv4Stack = | ||
java.lang.Boolean.parseBoolean(System.getProperty("java.net.preferIPv4Stack", "false")) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably needs a placeholder that java.net.preferIPv6Addresses is known, but not implemented this evolution.
@@ -97,12 +97,12 @@ class TcpSuite extends EpollcatSuite { | |||
_ <- IO(assertEquals(res, "pong")) | |||
} yield () | |||
|
|||
serverCh.bind(new InetSocketAddress(0)) *> server.both(client).void | |||
serverCh.bind(new InetSocketAddress("localhost", 0)) *> server.both(client).void |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I usually try to use a "building up" strategy of testing.
Convincing myself that the shortest paths and primitives
work. I try to test things before depending upon their working.
That is not always possible.
The point is that I think there are some cases where you
want "127.0.0.1" explicitly, so that you know that you have
either an IPv4 address or an IPv4mappedIPv6 address.
As we have experienced, at least one case with "localhost"
is prudent. "localhost" takes one on a trip to the name (bind)
server and that depends on local configuration. There is
a certain point to testing "Who care what I get back", but doing
only it may not give you the coverage you/we want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Heh, it's a good strategy. I apply it mostly sometimes, usually when patching small bugs. Today I cheated!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Btw, the initial macOS failures were when using new InetSocketAddress(0)
by itself. Each of the following work:
new InetSocketAddress("localhost", 0)
new InetSocketAddress("127.0.0.1", 0)
new InetSocketAddress("0.0.0.0", 0)
new InetSocketAddress("0:0:0:0:0:0:0:0")
So there is something strange going on in the no-argument case. After investigating aimlessly, I determined it wasn't important enough to block at this stage. And is possibly an issue in Scala Native core, and not here anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The no argument case means that the underlying code is supposed to
use the "wildcard" address. On SN 0.4.n, this would be the IPv4
0.0.0.0. I think that this is an SN 0.4.0 issue.
I would have check why the one arg case and the 3rd case in your
list 0.0.0.0 differ. I would have expected them to be the same.
What errors were you seeing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error I observed was that the client socket was unable to connect to the server: ConnectionException
. But only on macOS.
@@ -97,12 +97,12 @@ class TcpSuite extends EpollcatSuite { | |||
_ <- IO(assertEquals(res, "pong")) | |||
} yield () | |||
|
|||
serverCh.bind(new InetSocketAddress(0)) *> server.both(client).void | |||
serverCh.bind(new InetSocketAddress("localhost", 0)) *> server.both(client).void | |||
} | |||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a matter of style and approach. I find it useful to have the IPv6 tests in their own Suite.
Yes, there can be considerations of duplication. The regular TcpSuite does, in this PR,
exercise "can not tell the difference" IPv6 usage. I find that I can do pinpoint "there
ought to be a difference here" tests in a separate, dedicated Suite. By having the
two separate, I find there are fewer mental context switches involved when I am
reading either.
I know that the primary use of these tests is for CI and epollcat development. The
latter means that they may be run in the wild. It is increasingly unlikely these days
but IPv6 may not be available.
I have a note-to-future self in my IPv6 Suite to test the expected errors if
preferIPv4Stack is true or if IPv6 is not available.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find it useful to have the IPv6 tests in their own Suite.
Yes, this is a very good idea. In fact, we should also considering creating completely isolated sbt modules: one for testing preferIPv4Stack = true
and another for testing preferIPv4Stack = false
. Since this can be initialized only once, even separate suites is not sufficient insulation.
@@ -98,18 +106,26 @@ private[ch] object SocketHelpers { | |||
} | |||
|
|||
def getLocalAddress(fd: CInt): SocketAddress = { | |||
val addr = stackalloc[posix.netinet.in.sockaddr_in]() | |||
val sz = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I spent quite a bit of time at the end of my long day
working with getLocalAddress
. It was, understandably, causing my connect tests to fail.
I read through getLocalAddress
and the things it calls.
I think the entire section can be simplified. If this is working
that could happen in a later pass.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I will make another pass through this.
import scala.scalanative.posix.netinet.inOps._ | ||
import scala.scalanative.unsafe._ | ||
|
||
private[ch] object SocketHelpers { | ||
|
||
lazy val preferIPv4Stack = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SN 0.5.0 does a getaddrinfo of a known IPv6 site in a crude
attempt to ascertain if IPv6 support is available.
It is possible to get a list of interfaces and check if any
of them are configured for IPv6. The getaddrinfo() is
an implementation felicity avoid that the hairy code
dealing directly with interfaces.
In my mind, I have been trying to get rid of that startup execution cost. I did exactly what you did in my (various)
epoll IPv6 studies. Thinking of them and reviewing this
now makes me think the initial execution test is worthwhile/necessary.
It also affects preferIPv6Addresses.
The difference is that the runtime test means that IPv4 only systems work without effort. With the "check property only
approach", a developer, not an end user, has to set the
property. The land of ugly.
So, current code is good enough, IMHO, for scaffolding,
but it carries a "due bill" for a runtime check (or better solution)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed. This sounds like a due bill and one to chew on as well. I've opened a followup issue.
#include <sys/socket.h> | ||
|
||
typedef unsigned short scalanative_sa_family_t; | ||
struct scalanative_sockaddr { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was just dealing with this today. You wonder where my
time goes ;-)
I am not quite catching why this "glue" is necessary, but
that is probably lack of cleverness on my end.
I believe that you want to allocate the largest size possible that
accept4 can possibly fill in. Generally, one would use
a sockaddr_storage, defined specifically for that purpose.
That is not usable on 0.4.n (I have not checked 0.4.7 to
see what the merge elf did). So, I suggest sockaddr_in6
and the adjusted length.
One is supposed to do a check after accept to see if the
addrlen passed back exceeded the addrlen passed in.
In this context, sockaddr_in6 is the largest that can be
passed back, so that check can be skipped. However,
it is good to know about that behavior. (if the return size
is larger, that means the buffer passed in was too small
and got truncated/mangled).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is one of the mysteries of SN 0.4.x. I believe it is necessary due to the use of an intermediate scalanative_sockaddr
?
For what it is worth, this code is entirely cribbed from Scala Native itself except switched to use accept4
instead of accept
. So any improvements should ideally be done in tandem, as we rely on the Scala Native version of this code when calling accept
on macOS.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After years of sustained effort, I finally eliminated "glue" in straight passthru code in SN 0.5.0.
There is some code there to check, at compile time, that the structures are compatible.
IIRC, accept() was one of the ones which were simplified..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this file can go away completely, with
the corresponding change in socket.scala.
If you want to be extremely conservative,
the "@name" and the code here can go away, and this
file becomes just the _Static_assert statements relative
to struct sockaddr
.
The 0.5.0 code which does away with these conversions
has been merged into the SN mainline and seem to
have caused no problems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, are you sure? That's what I did at first and it was not working actually. I had to add this glue code to fix it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Not working"? in what way?
For the sake of expediency, I suggest the merge proceed
with the file and I will do private builds afterwards, from a
stable baseline, to see why reality is not matching my
expectations. It has a habit of doing that, and pulling
my leash hard.
If there is something wrong with removing the file, there may be
an issue in 0.5.0.
Sorry for taking time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not at all, I would like to get to the bottom of this and would be happy to see the glue code go away. See #53.
@@ -98,18 +106,26 @@ private[ch] object SocketHelpers { | |||
} | |||
|
|||
def getLocalAddress(fd: CInt): SocketAddress = { | |||
val addr = stackalloc[posix.netinet.in.sockaddr_in]() | |||
val sz = | |||
if (preferIPv4Stack) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From reading the getsockname
man page, I believe
that you can unconditionally stackalloc a sockaddr_in6
with appropriate addrlen.
Yes, in the IPv4 case it over-allocates a few (10?) bytes
on the stack, & stack is a precious resource. If does
simplify the code flow and probably speeds things up
by removing a branch.
If only that level of detail were our biggest concern.
It is more the reduction in code complexity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah that does simplify things quite a bit. I will do that.
} | ||
|
||
def toInetSocketAddress( | ||
def toInet4SocketAddress( | ||
addr: Ptr[posix.netinet.in.sockaddr_in] | ||
): InetSocketAddress = { | ||
val port = posix.arpa.inet.htons(addr.sin_port).toInt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you double check my thinking? The port in the sockaddr_in passed in should already be in
network order. If it came from the OS, it is. If it was manually constructed, some blundered if it
is not.
On some machines, htons() is actually a no-op. I forget which architectures and often
have to either look at the system C header (where is is often a macro) or, more quickly,
write a C program for the architecture of interest. I am saying this because this code
may be working (as a no-op) on some machines but may scramble the port on others.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Somewhere around here there is byte copy which, I think, I have seen done more simply.
Now that the 50,000 foot issues seem to be resolving, I can see if I can create a candidate
for a PR after this merges.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found the code that I had been looking for which might simplify this section.
I will do a design study after this merges to see if what I think will work actually
does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am saying this because this code may be working (as a no-op) on some machines but may scramble the port on others.
Yikes 😂 I stumbled on the htons
in various sources, such as here:
https://www.gta.ufrj.br/ensino/eel878/sockets/sockaddr_inman.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think may "alert!" was at the proper places, but the wrong function. I can check again
in the light, but it dawned on me that the port in the various Java structures is almost
surely host-order. That would mean that a "network-order-to-host-order", aka ntohs()
would need to be done here.
Somewhere, and I can track it down, some does a final htons() (host-to-network)
before filling the sockaddr.
The network-to-host-to-network trip seems messy but I think it is required by
the containers we are using. Intel is a little-endian machine, network order
is big endian, so both htons() and ntohs() do real work and are not noops.
Re: the URL. Yes, they are describing the final htons() call I described.
I will check again, but here the port passed is inside a sockaddr, hence
in network order.
Let me see if I can confirm my suspicions about the port being stored in
Java structures in host order.
Sorry for the jerking around. This stuff is complicated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is getaddrinfo
that is doing the htons() under the covers. That is why grepping the
file does not show that function in the code. The port is being stored in the Java structures
as host order, which makes sense.
So, the fix here is, I believe, to use ntohs()
. Of course, this invites the question of why
it is working the way it is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I figured out why htons() was working. htons() and, the correction, ntohs() both swap
bytes, in the case of a 2-byte short they do exactly the same thing. AB becomes BA in
both cases. Of course, this is on little-endian systems. On big endian systems, they
are both no-ops.
I think it is worth using the right name ntohs() here to help readers follow along and
reduce the astonishment factor.
Not a huge accomplishment, but figuring out a "not knowing why?" is a good
way to start my day.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahhh. I'm sorry, I completely missed the subtlety here between htons
and ntohs
. Now I see exactly what you are talking about 😂 I will take a look and fix this.
else | ||
posix.sys.socket.AF_INET6 | ||
hints.ai_flags = posix.netdb.AI_NUMERICHOST | posix.netdb.AI_NUMERICSERV | ||
if (!preferIPv4Stack) hints.ai_flags |= posix.netdb.AI_V4MAPPED |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am pretty sure this wants to be
(posix.netdb.AI_V4MAPPED | posix.netdb.AI_ADDRCONFIG)
That is, if you naturally have no IPv6 on the system, do not
take a prefectly good IPv4 address and map it to a useless
IPv4mappedIPv6 address 😄
Gotta love the wide variety of cases we are trying to cover.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is another one that I'm confused about. The problem is, we've already created the socket according to the value of preferIPv4Stack
. So it seems to me that using AI_ADDRCONFIG
could get us into trouble?
In the specific case you mention, if preferIPv4Stack = false
and the system has no IPv6, then we are already hosed regardless of our choice here 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the difference is between java.net.preferIPv4Stack
which can and usually is false,
and the variable preferIPv4Stack. The latter is the reason for the startup runtime test
to see if IPv6 is available. If there is no IPv6, I believe that would set the variable
to true, even if the property is false. The property is "prefer", like a hint.
This is reasoning from the fact that the system property defaults to false,
yet Java appears to run just fine on IPv4 only systems (if you can find one).
My head is spinning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there is no IPv6, I believe that would set the variable
to true, even if the property is false.
Yes, I agree. In fact, when we implement this, we should really introduce a new variable useIPv4Stack
that is derived from the combination of preferIPv4Stack
with the startup runtime test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. I thought of that for SN 0.5.0 and do not remember what I did. If I left it as preferIPv4
it is
becoming evident that was a mistake.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For historical reference, as we move towards closure:
In tracing this yet again, the line
if (!preferIPv4Stack) hints.ai_flags |= posix.netdb.AI_V4MAPPED
is directly equivalent to OR'in in AI_ADDRCONFIG. The mapping
only gets done when IPv6 is available.
My eye is just trained to expect (AI_V4MAPPED | AI_ADDRCONFIG)
and alerted.
No need for the OR in this case.
Knowing that there is some urgency, I took a focused look at two area.
|
Lee, thank you so much for the very detailed review!! I will go through each of your comments carefully.
My urgency was mostly to run the Skunk test suite which was completely blocked on this, as it also exercises the Cats Effect runtime, TCP sockets, and TLS, so I could become aware of any other issues as soon as possible. I was able to use e7dd9a7 to get a green CI run in typelevel/skunk#703, hallelujah 😁 So now that my immediate concerns have been alleviated, we have some time and space to make some iterations on this initial effort. If the stars align, I would love to release an epollcat v0.1.0 this week, but we shall see :) |
It must feel good to get to the point where you see the pieces you dreamed would My suggestion is to merge early, merge often, once you have considered the It is going to be far easier to iterate with something reasonable, but known I think the 'preferIPv6Addresses" is not implemented is a line in the code I do think that doing an initial getaddrinfo at startup is worth getting in I am enjoying the collaboration, even if there is some duplicated work. |
Thank you, yes, getting Skunk running is very exciting. In fact epollcat is an homage to Skunk's author, Rob Norris aka tpolecat :) this is a shared milestone, I couldn't have done this without you!!
I do apologize about the duplicated efforts, but I'm glad you feel that way. This is an entirely new space to me: I have done little C programming in my life let alone any networking, so getting my hands dirty has been crucial to finding my way around. I will make a cleanup+simplication passthrough based on your notes. |
re: homage to Skunk's author Rob Norris aka tpolecat. Very interesting. Over the years, I have seen/heard Rob's name but |
I'm going to merge to keep things unblocked, but I hope we will keep iterating on this! |
Understood and agreed. Good course of action. You created an Issue for the initialization issue, so that I think that it more likely to to keep iterating on us than the other |
Congratulations for both getting this off the shipping dock and doing so in such a short time. |
Lee, I hope you don't mind that I gave this a shot too, following your notes from #43. I ran into some troubles (particularly with
accept4
) so I am hoping we can compare strategies and find the best path forward. I am certainly curious to see your approach :)