Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial replacement of Helios with DotNetty #2444

Merged
merged 20 commits into from
Feb 5, 2017

Conversation

Horusiath
Copy link
Contributor

@Horusiath Horusiath commented Jan 11, 2017

This PR contains changes necessary to replace existing Helios transport with DotNetty.

@@ -611,7 +611,7 @@ public void ClusterClient_must_reestablish_connection_to_receptionist_after_serv
// start new system on same port
var sys2 = ActorSystem.Create(
Sys.Name,
ConfigurationFactory.ParseString("akka.remote.helios.tcp.port=" + Cluster.Get(Sys).SelfAddress.Port).WithFallback(Sys.Settings.Config));
ConfigurationFactory.ParseString("akka.remote.dot-netty.tcp.port=" + Cluster.Get(Sys).SelfAddress.Port).WithFallback(Sys.Settings.Config));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should remove the dash and use dotnetty ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Original name is DotNetty (which is PascalCase) therefore I've proposed dot-netty, as name convention for HOCON usually conforms to DashCase.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Horusiath IMHO the - looks weird. I'd condense it down to just dotnetty too, although it doesn't really matter

@alexvaluyskiy
Copy link
Contributor

Will you plan to support SSL and/or UDP in the initial version?

@Horusiath
Copy link
Contributor Author

SSL will be supported. UDP probably won't - UDP is very useful when designing higher level protocol on top of it as it allows us to go faster than TCP (see QUIC or Aeron), but unless we build one by ourselves, UDP itself probably won't be useful enough for Akka.

@Aaronontheweb
Copy link
Member

@Horusiath we should add UDP support anyway. Let the user make that decision. I get requests for it on occasion the time from IOT users.

@Aaronontheweb
Copy link
Member

Although, doesn't need to be in THIS pull request. Can add it later.

@Horusiath Horusiath force-pushed the dotnetty branch 4 times, most recently from 64fca9b to f47ed27 Compare January 18, 2017 13:01
@Horusiath
Copy link
Contributor Author

I've added test certificate for SSL tests to .gitignore, however it seems that git ignores it anyway.

@Horusiath
Copy link
Contributor Author

Horusiath commented Jan 20, 2017

At this point there are still few remaining issues:

  • SSL support is working, but in order to be able to test it, I have to add a test certificate to certificate store. I'm adding/removing cert before/after the test itself. This allows us to correctly work with those tests in R# test runner. However this also requires to run VS with admin permissions. Same will apply to running build script on CI agent.
  • DNS resolution specs inherited from Helios are failing for unknown reason. They are FsCheck-based and can fail i.e. after 78 valid assertions. However when I'm trying to use the input data that caused an error and run this as a straightforward Xunit fact, everything works.
  • The last truly missing piece is compatibility with Helios. Right now when DotNetty transport detects akka.remote.helios.tcp, it can use to as primary source of configuration. This gives us backward compatibility for configuration. However actor systems using DotNetty and Helios at this moment still are unable to talk with each other - I'm guessing that the reason for that is the endianness of both (DotNetty → big endian, Helios → little endian), that needs to be adjusted. However the problem will be with further verification as none of the transport layers gives any useful info about exceptions and I haven't found a good hook point to insert breakpoint before that error happens.

@Horusiath Horusiath force-pushed the dotnetty branch 3 times, most recently from 5febf78 to 8cd6fd4 Compare January 23, 2017 10:56
@Horusiath
Copy link
Contributor Author

At this point all tests are passing locally! Now we need to figure out how to enable secure socket layer tests working in CI agent in secure manner. I guess running pull request tests on agent with admin permissions is not the smartest move possible :P

@alexvaluyskiy
Copy link
Contributor

Have you compared the performance between Helios and DotNetty?

@alexvaluyskiy
Copy link
Contributor

alexvaluyskiy commented Jan 23, 2017

DNS resolution specs inherited from Helios are failing for unknown reason.

There is a difference between Helios and DotNetty as I remember. DotNetty can't resolve 'localhost' @Aaronontheweb Am I Right?

@alexvaluyskiy
Copy link
Contributor

The last truly missing piece is compatibility with Helios.

Do we need this compatibility? The best idea - to release a Helios transport as a separate package

@Horusiath
Copy link
Contributor Author

Horusiath commented Jan 23, 2017

@alexvaluyskiy I haven't made perf test yet. First I want CI build passing.

Resolving DNS is not a problem, as this is already implemented, in similar but not exactly the same way it works in Helios. And compatibility is desired, when people are upgrading their nuget packages, they expect to end up with working system. Helios will be released as a separate package (legacy transport layer), but I wanted to make DotNetty working as drop-in replacement.

I've managed to make DotNetty working with Helios already - major change was to switch default byte order in DotNetty to little endian (I've made it configurable).

@Aaronontheweb
Copy link
Member

The backwards compatibility test still fails - is this more DNS stuff?

@Horusiath
Copy link
Contributor Author

Nope, I've accidentally removed helios dependency from akka.remote.tests when fixing solution file.

Copy link
Member

@Aaronontheweb Aaronontheweb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have some questions and some requested changes

<Compile Include="HeliosTcpTransport.cs" />
<Compile Include="HeliosTransport.cs" />
<Compile Include="HeliosTransportSettings.cs" />
<Compile Include="Properties\AssemblyInfo.cs" />
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to include SharedAssemblyInfo.cs, like the other projects, for versioning

@@ -611,7 +611,7 @@ public void ClusterClient_must_reestablish_connection_to_receptionist_after_serv
// start new system on same port
var sys2 = ActorSystem.Create(
Sys.Name,
ConfigurationFactory.ParseString("akka.remote.helios.tcp.port=" + Cluster.Get(Sys).SelfAddress.Port).WithFallback(Sys.Settings.Config));
ConfigurationFactory.ParseString("akka.remote.dot-netty.tcp.port=" + Cluster.Get(Sys).SelfAddress.Port).WithFallback(Sys.Settings.Config));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Horusiath IMHO the - looks weird. I'd condense it down to just dotnetty too, although it doesn't really matter

}
}");

using (var heliosSystem = ActorSystem.Create("helios-system", heliosConfig))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the build server, fails on this line:

System.TypeLoadException : Cannot instantiate transport [Akka.Remote.Transport.Helios.HeliosTcpTransport, Akka.Remote.Transport.Helios]. Cannot find the type.

@@ -310,14 +310,17 @@ akka {
trttl = "Akka.Remote.Transport.ThrottlerProvider,Akka.Remote"
}

### Default configuration for the Helios based transport drivers
# necessary to keep backwards compatibility
helios.tcp.transport-class = "Akka.Remote.Transport.Helios.HeliosTcpTransport, Akka.Remote.Transport.Helios"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the error above, this FQN probably isn't correct


namespace Akka.Remote.Transport.DotNetty
{
internal class AkkaLoggingHandler : ChannelHandlerAdapter
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's this class for? Debugging when using Akka.Remote?

var resolved = await Dns.GetHostEntryAsync(address.Host);
//NOTE: for some reason while Helios takes first element from resolved address list
// on the DotNetty side we need to take the last one in order to be compatible
return new IPEndPoint(resolved.AddressList[resolved.AddressList.Length - 1], address.Port);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this, but moreover we should probably take the first address that matches the preferred DNS family settings. It's probably a bug that we don't do this today for Helios in the first place.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that the overload below does this - does that get called by default?

Copy link
Contributor Author

@Horusiath Horusiath Jan 23, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It depends on enforce-ip-family setting. It IP address family enforcement is on, second option will be used. I also think, we should pick first correct address, this however was causing a failure between Helios and DotNetty backward compatibility spec - I don't know yet why.

#endregion
}

internal class HeliosBackwardsCompatabilityLengthFramePrepender : LengthFieldPrepender
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this ported over from the Helios 2.1 bits?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I've taken it from there + added ByteOrder.LittleEndian (because of DotNetty :) )

tcpKeepAlive: config.GetBoolean("tcp-keepalive", true),
tcpNoDelay: config.GetBoolean("tcp-nodelay", true),
backlog: config.GetInt("backlog", 4096),
enforceIpFamily: RuntimeDetector.IsMono || config.GetBoolean("enforce-ip-family", false),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't checked yet, but do we know if the newer (4.6+) versions of Mono function correctly with IPV6 now?

public readonly int? WriteBufferHighWaterMark;
public readonly int? WriteBufferLowWaterMark;
public readonly bool BackwardsCompatibilityModeEnabled;
public readonly bool LogTransport;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for activating that Logging ChannelHandlerAdapter from earlier, right?

helios.tcp {
### Default configuration for the DotNetty based transport drivers

dot-netty.tcp {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a new setting in here for turning on transport logging?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I've added log-transport value in order to add LoggingHandler to DotNetty pipeline. It's used to log user events on DotNetty side.

@Aaronontheweb
Copy link
Member

Aaronontheweb commented Feb 3, 2017

Looking at a measured performance difference of about ~3k msg / s (edit: in favor of Helios) but I've seen that vary quite a bit on some of the benchmarks for this spec; been a difference sometimes as large as 6k.

The culprit to me looks like GEN 1 GC, which is roughly 10x is high as it is in the Helios transport. My guess is that @Horusiath is right as to the reason why: we're allocating a number of SwappedByteBuffer objects in the process of doing this.

Other possibility is that something like the log-transport option defaulting to true, although after looking through the code it shouldn't be the case.

@maxcherednik
Copy link
Contributor

@Aaronontheweb @Horusiath guys, I haven't seen the exact changes here. Last time Aaron mentioned that dotnetty is nearly the same as Helios. I hope we keep the logging clean for the known case which we recently fixed on Helios.

@Horusiath Horusiath removed the WIP label Feb 4, 2017
@Horusiath Horusiath changed the title [WIP] Initial replacement of Helios with DotNetty Initial replacement of Helios with DotNetty Feb 4, 2017
@Aaronontheweb Aaronontheweb merged commit c95b2d2 into akkadotnet:dev Feb 5, 2017
@Aaronontheweb Aaronontheweb modified the milestone: 1.1.4 Feb 8, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants