Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make vttestserver compatible with persistent data directories #7718

Merged
merged 4 commits into from
Mar 24, 2021

Conversation

hkdsun
Copy link
Member

@hkdsun hkdsun commented Mar 19, 2021

Description

This is an attempt at making the vttestserver container honour a persisted data directory.

There were a few changes required to get here and I'm happy to split into multiple PRs:

  • Previously, the vttestserver binary did not correctly handle interruption signals. As such, the defer cluster.TearDown() was not run in go/cmd/vttestserver/main.go. I have added a signal handler to correctly run deferred functions.
  • The mysqlctl binary was previously always run with the init subcommand. We now check if the directory provided in -data_dir argument already exists, and if so, we use the mysqlctl start command instead!
  • I have added a -keep_data flag to vttestserver binary. This will guard all existing invocations from the new codepath and keeps things backwards compatible.

Related Issue(s)

Fixes #7717

Checklist

  • Should this PR be backported?
    • I'm not sure!
  • Tests were added or are not required
  • Documentation was added or is not required

Deployment Notes

Impacted Areas in Vitess

Components that this PR will affect:

  • Query Serving
  • VReplication
  • Cluster Management
  • Build/CI
  • VTAdmin

Copy link
Member

@deepthi deepthi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a nice improvement to vttestserver.
We have a unit test and an integration test (vttest_sample) for this package.
vttest_sample runs in docker on CI. Can you add some tests to cover the new functionality?

https://github.com/vitessio/vitess/blob/master/go/test/endtoend/vtcombo/vttest_sample_test.go

Makefile Outdated Show resolved Hide resolved
@hkdsun
Copy link
Member Author

hkdsun commented Mar 22, 2021

I discovered an issue while integrating the new container into our development environment.

The LocalCluster instance actually calls db.loadSchema() in its Setup method.

loadSchema applies sql and vschema migrations respectively for each keyspace in the topology

This becomes problematic across reboots of the test container if the SQL migrations destroy tables – which is the case in the typical output of a Rails application's "schema dump" tool. Something like:

DROP TABLE IF EXISTS `test_table`;
CREATE TABLE `test_table` (....) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 ROW_FORMAT=DYNAMIC;
DROP TABLE IF EXISTS `another_table`;
CREATE TABLE `another_table` (....) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 ROW_FORMAT=DYNAMIC;

What would the Vitess team want this workflow to look like? I see two options:

  1. We short-circuit SQL migrations in the LocalCluster code if running with the -keep_data option (except when Initializeing), or
  2. We avoid using destructive migrations and assume that all .sql files are run every time the container starts

@deepthi
Copy link
Member

deepthi commented Mar 22, 2021

What would the Vitess team want this workflow to look like? I see two options:

  1. We short-circuit SQL migrations in the LocalCluster code if running with the -keep_data option (except when Initializeing), or
  2. We avoid using destructive migrations and assume that all .sql files are run every time the container starts

Which one is easier for you as a user? I suspect it is the first one.

Signed-off-by: Hormoz K <hkheradm@gmail.com>
Signed-off-by: Hormoz K <hkheradm@gmail.com>
@hkdsun
Copy link
Member Author

hkdsun commented Mar 23, 2021

Which one is easier for you as a user? I suspect it is the first one.

Correct suspicion indeed 😄

I have addressed all your feedback, tidied up the code, added the migration short-circuit, and wrote some tests. Would you please have a look at the latest commits?

I found the e2e test in go/test/endtoend/vtcombo/vttest_sample_test.go to be redundant compared to the go/cmd/vttestserver/vttestserver_test.go but I may be overlooking context. Please let me know if you were expecting coverage in both.

@deepthi
Copy link
Member

deepthi commented Mar 23, 2021

I'll re-review, but tests are failing

F0323 15:32:34.243946   15971 main.go:252] mkdir /tmp/vttestserver_persistent_mode_510768164/logs: no such file or directory
goroutine 375 [running]:
github.com/golang/glog.stacks(0xc000127500, 0xc0004b8140, 0x7d, 0x96)
	/home/runner/go/pkg/mod/github.com/golang/glog@v0.0.0-20160126235308-23def4e6c14b/glog.go:769 +0xb9
github.com/golang/glog.(*loggingT).output(0x18d1f60, 0xc000000003, 0xc0001291f0, 0x16bdb9e, 0x7, 0xfc, 0x0)
	/home/runner/go/pkg/mod/github.com/golang/glog@v0.0.0-20160126235308-23def4e6c14b/glog.go:720 +0x3b3
github.com/golang/glog.(*loggingT).printDepth(0x18d1f60, 0x3, 0x1, 0xc00012ad90, 0x1, 0x1)
	/home/runner/go/pkg/mod/github.com/golang/glog@v0.0.0-20160126235308-23def4e6c14b/glog.go:646 +0x12d
github.com/golang/glog.(*loggingT).print(...)
	/home/runner/go/pkg/mod/github.com/golang/glog@v0.0.0-20160126235308-23def4e6c14b/glog.go:637
github.com/golang/glog.Fatal(0xc00012ad90, 0x1, 0x1)
	/home/runner/go/pkg/mod/github.com/golang/glog@v0.0.0-20160126235308-23def4e6c14b/glog.go:1128 +0x5c
vitess.io/vitess/go/cmd/vttestserver.runCluster(0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
	/home/runner/work/vitess/vitess/go/cmd/vttestserver/main.go:252 +0x342
vitess.io/vitess/go/cmd/vttestserver.startCluster(0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
	/home/runner/work/vitess/vitess/go/cmd/vttestserver/vttestserver_test.go:267 +0x2f8
vitess.io/vitess/go/cmd/vttestserver.TestCanVtGateExecute(0xc000102600)
	/home/runner/work/vitess/vitess/go/cmd/vttestserver/vttestserver_test.go:128 +0xa5

@deepthi
Copy link
Member

deepthi commented Mar 23, 2021

I found the e2e test in go/test/endtoend/vtcombo/vttest_sample_test.go to be redundant compared to the go/cmd/vttestserver/vttestserver_test.go but I may be overlooking context. Please let me know if you were expecting coverage in both.

The endtoend test runs in docker, so it verifies that things work as expected in a docker environment.

Signed-off-by: Hormoz K <hkheradm@gmail.com>
Signed-off-by: Hormoz K <hkheradm@gmail.com>
@hkdsun
Copy link
Member Author

hkdsun commented Mar 23, 2021

@deepthi this should be ready for review I think, I fixed the failing test and added the e2e test.

The failing test is flaky on master as well, it seems

Copy link
Member

@deepthi deepthi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -251,6 +251,13 @@ docker_local:
docker_mini:
${call build_docker_image,docker/mini/Dockerfile,vitess/mini}

DOCKER_VTTESTSERVER_SUFFIX = mysql57 mysql80
DOCKER_VTTESTSERVER_TARGETS = $(addprefix docker_vttestserver_,$(DOCKER_VTTESTSERVER_SUFFIX))
$(DOCKER_VTTESTSERVER_TARGETS): docker_vttestserver_%:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a neat trick

func startPersistentCluster(dir string, flags ...string) (vttest.LocalCluster, error) {
flags = append(flags, []string{
"-persistent_mode",
// FIXME: if port is not provided, data_dir is not respected
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The FIXME can be removed now?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still a bug unfortunately. -data_dir= needs to always be passed along with -port= or it won't be respected 😬

It's not really something that concerns us day-to-day but I may give it an hour or two today to make a PR for it anyway, if it's a simple fix

Copy link
Member Author

@hkdsun hkdsun Mar 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I propose we move forward without this fix in this PR and tackle this work separately. I can open an issue.

Edit: opened an issue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

vttestserver cannot work with a persistent -data_dir= volume
3 participants