Tablet move and group removal #2880

Merged
merged 7 commits into from
Jan 15, 2019
@srfrog srfrog commented Jan 8, 2019

This PR adds tests for moving a tablet to another group and for removing nodes from a group. The tests verify that these operations don't break the continuity of the cluster.

Test methodology:

  1. Load test data into a 3-group cluster.
  2. Move all predicates from group 3 to group 2.
  3. Remove group 3.
  4. Run a test query using the data.
  5. Move all predicates from group 2 to group 1.
  6. Remove group 2.
  7. Run the test query again; the result must match step 4.
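
The ordering in steps 2-3 and 5-6 (move tablets first, then remove nodes) can be sketched as a list of Zero HTTP requests. This is a minimal sketch, not the test code from this PR: the `/moveTablet` and `/removeNode` paths and the `localhost:6080` address are taken from URLs quoted elsewhere in this thread, while `drainPlan`, the predicate names, and the node ID are hypothetical and only for illustration.

```go
package main

import (
	"fmt"
	"net/url"
)

// zeroAddr is the Zero HTTP address that appears in the reviewed test code.
const zeroAddr = "http://localhost:6080"

// moveTabletURL builds the request that moves one predicate (tablet) to a group.
func moveTabletURL(pred string, group int) string {
	return fmt.Sprintf("%s/moveTablet?tablet=%s&group=%d", zeroAddr, url.QueryEscape(pred), group)
}

// removeNodeURL builds the request that removes one node from a group.
func removeNodeURL(group, id int) string {
	return fmt.Sprintf("%s/removeNode?group=%d&id=%d", zeroAddr, group, id)
}

// drainPlan (hypothetical helper) returns the requests in the order the
// methodology uses: every tablet is moved away before any node is removed.
func drainPlan(tablets []string, nodeIDs []int, from, to int) []string {
	var plan []string
	for _, pred := range tablets {
		plan = append(plan, moveTabletURL(pred, to))
	}
	for _, id := range nodeIDs {
		plan = append(plan, removeNodeURL(from, id))
	}
	return plan
}

func main() {
	// Made-up predicates and node ID; a real run would read them from Zero's state.
	for _, u := range drainPlan([]string{"name", "release_date"}, []int{3}, 3, 2) {
		fmt.Println(u)
	}
}
```

Keeping the plan as plain URLs separates the ordering logic from the HTTP calls, so the order can be checked without a running cluster.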

Closes #2466



@codexnull codexnull left a comment

Reviewable status: 0 of 4 files reviewed, 1 unresolved discussion (waiting on @srfrog, @gitlw, @manishrjain, and @codexnull)


systest/nodes_test.go, line 97 at r1 (raw file):

			})
			x.Check(err)
			cnt = 0

cnt = 0 is not necessary since you're using cnt%100 above.
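
A minimal sketch of why the reset is redundant, assuming a loop that flushes a batch whenever the counter reaches a multiple of the batch size. `countFlushes` is a hypothetical stand-in, not code from `systest/nodes_test.go`.

```go
package main

import "fmt"

// countFlushes reports how many times a batch is flushed when the flush
// is triggered by a modulo check on a counter that is never reset.
func countFlushes(total, batchSize int) int {
	flushes := 0
	cnt := 0
	for i := 0; i < total; i++ {
		cnt++
		if cnt%batchSize == 0 {
			flushes++ // stand-in for committing the batch
			// No "cnt = 0" needed: the check fires again at the next multiple.
		}
	}
	return flushes
}

func main() {
	fmt.Println(countFlushes(250, 100)) // flushes at item 100 and item 200
}
```

Any items left after the last multiple (50 in the call above) still need a final flush after the loop, with or without the reset.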

@srfrog srfrog left a comment

Reviewable status: 0 of 4 files reviewed, 1 unresolved discussion (waiting on @codexnull, @gitlw, and @manishrjain)


systest/nodes_test.go, line 97 at r1 (raw file):

Previously, codexnull (Javier Alvarado) wrote…

cnt = 0 is not necessary since you're using cnt%100 above.

nice catch, thx!

@manishrjain manishrjain left a comment

:lgtm: I left one comment; please address it before merging.

Reviewed 3 of 4 files at r1, 1 of 1 files at r2.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @codexnull, @srfrog, and @gitlw)


systest/nodes_test.go, line 141 at r2 (raw file):

	for pred := range state1.Groups["3"].Tablets {
		url := fmt.Sprintf("http://localhost:6080/moveTablet?tablet=%s&group=2", pred)
		resp, err := http.Get(url)

resp could also contain errors, even when err is nil. We typically check for those as well, so do that here.

@srfrog srfrog left a comment

Reviewable status: 3 of 4 files reviewed, 2 unresolved discussions (waiting on @manishrjain, @codexnull, and @gitlw)


systest/nodes_test.go, line 141 at r2 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

resp could also contain errors, even when err is nil. We typically check for those as well, so do that here.

Done.

@srfrog srfrog closed this Jan 11, 2019
@srfrog srfrog deleted the srfrog/delete_node_group_fixes branch January 11, 2019 04:07
@srfrog srfrog restored the srfrog/delete_node_group_fixes branch January 11, 2019 04:11
@srfrog srfrog reopened this Jan 11, 2019
@srfrog srfrog left a comment

Reviewable status: 3 of 4 files reviewed, 2 unresolved discussions (waiting on @manishrjain, @codexnull, and @gitlw)


systest/nodes_test.go, line 97 at r1 (raw file):

Previously, srfrog (Gus) wrote…

nice catch, thx!

Done.

@srfrog srfrog merged commit b5de0f4 into master Jan 15, 2019
@srfrog srfrog deleted the srfrog/delete_node_group_fixes branch January 15, 2019 22:01
dna2github pushed a commit to dna2fork/dgraph that referenced this pull request Jul 19, 2019
* send error when removing last node in a group would orphan tablets.

* reusing backup test data, moved to top systest dir.

* tests for tablet move and node removal.

* removed cnt reset.

* restructured tests to fail in order, and observing all errors from http content.

* created new group-delete dir for custom cluster testing

* docker-compose file for group-delete tests
kuberxy commented Sep 5, 2024

This PR adds tests to the process of moving a tablet to another group, and removing nodes from a group. We use this to test that these processes don't break the continuity of the cluster.

Test methodology:

  1. Load test data into 3-group cluster
  2. Move all predicates from group 3 to group 2
  3. Remove group 3
  4. Run a test query using the data.
  5. Move all predicates from group 2 to group 1
  6. Remove group 2
  7. Run a test query using the data, must match step 4.

Closes #2466


Hello, how exactly do you remove a group?

There are 4 groups in my cluster.

After I removed all the instances in group 4 (with curl "http://127.0.0.1:6080/removeNode?group=4&id=10"), my cluster became read-only.

When I write data, I get an error like this:

{"errors":[{"message":"mutation addProduct failed because Dgraph execution failed because cannot retrieve predicate information: No connection exists","locations":[{"line":2,"column":3}],"path":["addProduct"]},{"message":"Mutation addCustomer was not executed because of a previous error.","locations":[{"line":11,"column":3}],"path":["addCustomer"]}],"data":{"addProduct":null},"extensions":{"tracing":{"version":1,"startTime":"2024-09-05T14:44:38.672030942+08:00","endTime":"2024-09-05T14:44:38.675771422+08:00","duration":3740518,"execution":{"resolvers":[{"path":["addProduct"],"parentType":"Mutation","fieldName":"addProduct","returnType":"AddProductPayload","startOffset":209005,"duration":3487989,"dgraph":[{"label":"preMutationQuery","startOffset":0,"duration":0},{"label":"mutation","startOffset":277442,"duration":0},{"label":"query","startOffset":0,"duration":0}]}]}}}}

My question is similar to this one: https://discuss.dgraph.io/t/option-to-gracefully-remove-a-undead-group/2921/5
