Akka.Remote.EndpointException: Error while decoding incoming Akka PDU #3273
Comments
cc @Horusiath looks like this was an issue with the wire format not being totally compatible between 1.3.2 and 1.3.3 once We had to modify the |
@Aaronontheweb yes, new member status was added. However from what I was reading, it was marked as not breaking binary format compatibility. |
Yeah, that's what I thought upon looking at the changes to the .proto file too. Could also be that this cluster was using @nvivo's custom .NET Core / .NET Desktop interop stuff. |
Some more info on this, I noticed this error appearing when having 2 different nightlies running: 1.3.3 beta-475 and another one from a few days ago, probably 472 or 470. After updating all nodes to the same version again, it stopped. |
Can this still be an issue? I am on 1.3.9, and I am seeing this as well. The ActorSystem runs for about an hour (on one system) and then it just seems to stop nodes as well. |
Might be similar, or not related at all: (enable pooling was set to false (as seen on another similar issue)) |
This might still be an issue - mind submitting some information about your runtime environment @AndreSteenbergen ? It'd be helpful to know if it's Linux-specific or not. |
Of course. The problem is that I can't seem to reproduce it reliably: one time it happened after 5 minutes, another time after an hour. I am using MessagePack as the serializer, if that has anything to do with this.
Could it be connected with gossip messages? I let my cluster sit overnight without giving it any tasks. I see this in the logs on one side (port 4062), and this on the 4060 side:
|
If it is of any help: I migrated from Azure to a VPS, and the dotnet version is 2.0.9 because of the dotnet core 2.1 issue. Can this also be a DotNetty thing? |
Could it be my own deserializer? Not reading to the end?
|
Could this be an issue: I am also running squid on that machine. I used an old config from the machine I was migrating away from. It received constant forbiddens. Dotnet reported I was on |
Is this related? This is on a Lighthouse instance; it's unlikely that any of the messages from the actor system get sent to that instance.
Later in the logs I see these kinds of messages. To me it looks like a stream of data where one hiccup results in errors later in the stream, as these errors are quite constant after a while. When I restart Lighthouse, the errors seem to stop (for a while).
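The "one hiccup breaks the rest of the stream" behaviour described above is typical of length-prefixed framing in general. Below is a minimal sketch of that idea, not the actual DotNetty or Akka.Remote code: the 4-byte little-endian prefix, the `frame` and `decode_frames` helpers, and the sample payloads are all assumptions made for illustration. The 256000-byte limit mirrors the `maximum-frame-size` value quoted elsewhere in this thread.

```python
import struct


def frame(payload: bytes) -> bytes:
    """Prefix a payload with its length as a 4-byte little-endian integer."""
    return struct.pack("<I", len(payload)) + payload


def decode_frames(stream: bytes, max_frame_size: int = 256_000) -> list:
    """Split a byte stream into payloads; raise ValueError on a bad length prefix."""
    frames, offset = [], 0
    while offset < len(stream):
        if offset + 4 > len(stream):
            raise ValueError(f"truncated length prefix at offset {offset}")
        (length,) = struct.unpack_from("<I", stream, offset)
        if length > max_frame_size:
            raise ValueError(f"frame of {length} bytes exceeds limit at offset {offset}")
        offset += 4
        if offset + length > len(stream):
            raise ValueError(f"truncated payload at offset {offset}")
        frames.append(stream[offset:offset + length])
        offset += length
    return frames


# Hypothetical payloads; a healthy stream decodes cleanly.
good = frame(b"gossip-1") + frame(b"gossip-2") + frame(b"gossip-3")
assert decode_frames(good) == [b"gossip-1", b"gossip-2", b"gossip-3"]

# Drop a single byte from the first payload. The decoder still trusts the first
# length prefix, so it swallows one byte of the next prefix, and every later
# prefix is then read from the wrong position.
damaged = good[:6] + good[7:]
try:
    decode_frames(damaged)
except ValueError as err:
    print("decode failed:", err)
```

Once a reader is misaligned like this it cannot resynchronise on its own, which would be consistent with errors that persist until the connection (or the process) is restarted.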
|
Just upgraded to .NET Core 2.2 on Ubuntu 16.04. This issue came back. It is the Lighthouse service node which throws these messages. I have configured my Lighthouse system to not create custom actors, so I don't really understand what is going on. |
I need to upgrade the lighthouse image - has to do with the old Akka.NET version running on it.
|
Could this be an issue? I have set a max frame size of 256K
|
Nah, it's old DotNetty code. Just need to update the dependencies and rebuild the image.
On Feb 18, 2019, at 2:29 PM, AndreSteenbergen ***@***.***> wrote:
Could this be an issue? I have set a max frame size of 256K
dot-netty.tcp {
transport-class = "Akka.Remote.Transport.DotNetty.TcpTransport, Akka.Remote"
applied-adapters = []
transport-protocol = tcp
#will be populated with a dynamic host-name at runtime if left uncommented
hostname = "0.0.0.0"
public-hostname = "10.0.0.31"
port = 10160
maximum-frame-size = 256000b
}
|
I am not using LightHouse from docker, I compiled one myself. I run native on linux, without docker.
and:
|
What version of Akka.NET are you running on Lighthouse?
|
1.3.11
|
That defeats my theory then
|
Issue resolved (I think). I checked out LightHouse from the webcrawler example, with petabridge 0.4 something, which is packaged with Akka.Cluster 1.3.10, not 1.3.11. There were no build errors, because Cluster was part of the project. So I had a Cluster version mismatch. Sorry. |
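A mismatch like this builds without errors, so it can help to compare the Akka package versions across every node's project before deploying. The following is a rough sketch under stated assumptions: it expects SDK-style `.csproj` files (no XML namespace) that pull Akka in through `PackageReference` elements, and the `akka_versions` and `mismatches` helpers, the node names, and the sample project contents are hypothetical. It would not detect a version that comes in through a project reference.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def akka_versions(csproj_by_node):
    """Map each Akka* package to {node name: referenced version}."""
    versions = defaultdict(dict)
    for node, xml_text in csproj_by_node.items():
        root = ET.fromstring(xml_text)
        for ref in root.iter("PackageReference"):
            name = ref.get("Include", "")
            if name.startswith("Akka"):
                versions[name][node] = ref.get("Version", "unknown")
    return dict(versions)


def mismatches(csproj_by_node):
    """Return only the packages referenced at more than one distinct version."""
    return {
        pkg: by_node
        for pkg, by_node in akka_versions(csproj_by_node).items()
        if len(set(by_node.values())) > 1
    }


# Hypothetical project files for two nodes of the same cluster.
projects = {
    "lighthouse": """<Project Sdk="Microsoft.NET.Sdk"><ItemGroup>
        <PackageReference Include="Akka.Cluster" Version="1.3.10" />
    </ItemGroup></Project>""",
    "worker": """<Project Sdk="Microsoft.NET.Sdk"><ItemGroup>
        <PackageReference Include="Akka.Cluster" Version="1.3.11" />
    </ItemGroup></Project>""",
}

for package, by_node in mismatches(projects).items():
    print(package, by_node)
```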
Related issue: Exception thrown: 'Google.Protobuf.InvalidProtocolBufferException' in Google.Protobuf.dll
|
This looks to me like it has to be a message framing issue somewhere further up the food chain, but I'm skeptical about that because we've tested the hell out of the DotNetty message framing sitting in front of it. Best idea is to probably add some additional Info logging inside the |
LOL welp, my fault - my issue was the result of a unit test I wrote intentionally injecting a malformed packet into the transport. Disregard my latest comments. |
Akka.NET v1.3.3 nightlies
Happened while the endpointManager actor was decoding a message.
Have also seen this occur on an endpointReader actor inside the same cluster: