Skip to content

Conversation

@solongordon
Copy link
Contributor

The RPC heartbeat now checks for version compatibility between the two
nodes. This prevents a node from establishing a gRPC connection if its
binary version is lower than the cluster version on the other end.

Fixes #18058.

Release note (bug fix): Prevent RPC connections between nodes with
incompatible versions.

@solongordon solongordon requested a review from tbg December 9, 2017 03:41
@solongordon solongordon requested a review from a team as a code owner December 9, 2017 03:41
@solongordon solongordon requested review from a team December 9, 2017 03:41
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@bdarnell
Copy link
Contributor

:lgtm:


Reviewed 33 of 33 files at r1.
Review status: all files reviewed at latest revision, 2 unresolved discussions, some commit checks failed.


pkg/rpc/heartbeat.go, line 65 at r1 (raw file):

func checkVersion(
	clusterVersion *cluster.ExposedClusterVersion, serverVersion roachpb.Version,

s/server/peer/


pkg/rpc/heartbeat_test.go, line 226 at r1 (raw file):

}

func TestVersionCheck(t *testing.T) {

What's the difference between this test and the similar one in context_test.go? (comments would be nice)


Comments from Reviewable

@a-robinson
Copy link
Contributor

:lgtm:


Reviewed 33 of 33 files at r1.
Review status: all files reviewed at latest revision, 3 unresolved discussions, some commit checks failed.


pkg/rpc/context.go, line 585 at r1 (raw file):

		if err == nil {
			err = checkVersion(ctx.version, response.ServerVersion)

The wording of the "but peer has version" error message is ambiguous depending on where it's returned from. If it's evaluated on the server and returned by heartbeatClient.Ping it means that the client is too old. If it's evaluated here it means the server is too old. In both cases we stick the error into the same place without distinguishing which case it was.

Could you disambiguate the situation in some way so that we don't get stuck wondering which server was at the bad version in a debugging scenario where we only have the logs to go by?


Comments from Reviewable

@a-robinson a-robinson requested review from a-robinson and removed request for tbg December 11, 2017 16:24
@solongordon
Copy link
Contributor Author

Review status: all files reviewed at latest revision, 3 unresolved discussions, some commit checks failed.


pkg/rpc/context.go, line 585 at r1 (raw file):

Previously, a-robinson (Alex Robinson) wrote…

The wording of the "but peer has version" error message is ambiguous depending on where it's returned from. If it's evaluated on the server and returned by heartbeatClient.Ping it means that the client is too old. If it's evaluated here it means the server is too old. In both cases we stick the error into the same place without distinguishing which case it was.

Could you disambiguate the situation in some way so that we don't get stuck wondering which server was at the bad version in a debugging scenario where we only have the logs to go by?

Good point. I wrapped the error to indicate if it's on the request or response side. Hopefully that's sufficient.


pkg/rpc/heartbeat.go, line 65 at r1 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

s/server/peer/

Done.


pkg/rpc/heartbeat_test.go, line 226 at r1 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

What's the difference between this test and the similar one in context_test.go? (comments would be nice)

Yup, added comments. They test similar logic but at different levels. Basically this one tests the handler side of the heartbeat and the other tests the request side. I would have only added this test except that I needed to verify that the check was bidirectional. Hopefully the new comments make this clearer.


Comments from Reviewable

@a-robinson
Copy link
Contributor

👍


Reviewed 5 of 5 files at r2.
Review status: all files reviewed at latest revision, 3 unresolved discussions, some commit checks pending.


Comments from Reviewable

@solongordon
Copy link
Contributor Author

Update to fix acceptance tests: I added a rpc.GRPCDialRaw function for when we want a gRPC connection without running validation. This was useful for the failing test which needed to create a client connection to an older binary. I have a feeling it will come in handy in other situations too.


Review status: all files reviewed at latest revision, 3 unresolved discussions, some commit checks pending.


Comments from Reviewable

The RPC heartbeat now checks for version compatibility between the two
nodes. This prevents a node from establishing a gRPC connection if its
binary version is lower than the cluster version on the other end.

Fixes cockroachdb#18058

Release note (bug fix): Prevent RPC connections between nodes with
incompatible versions.
@solongordon solongordon changed the title Check version compatibility on RPC handshake core: check version compatibility on RPC handshake Dec 14, 2017
@solongordon solongordon merged commit 3d4066a into cockroachdb:master Dec 14, 2017
@solongordon solongordon deleted the check-version branch December 14, 2017 16:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

rpc: fail at handshake time when node versions are incompatible

4 participants