Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add option to GetSchema to only send the row count and data length over the wire #6985

Merged
merged 5 commits into from
Nov 29, 2020

Conversation

ajm188
Copy link
Contributor

@ajm188 ajm188 commented Nov 2, 2020

If you are attempting to get aggregate table sizes across a keyspace, this requires one call to GetSchema per shard, which result in the caller to vtctl needing to json2.Unmarshal the same info (create table statement, column info) N times, when we know it's not going to change. This flag will allow a user to make one GetSchema call to get the schema, and then N-1 GetSchema -table_sizes_only calls for the rest of the shards, and results in much less data transfer and much faster (3-4x faster by some metrics I took) response times.

Closes #6987

…er the wire

Signed-off-by: Andrew Mason <amason@slack-corp.com>
@ajm188
Copy link
Contributor Author

ajm188 commented Nov 2, 2020

A comment from @ameetkotian:

think we need to handle the case when both tableNamesOnly and tableSizesOnly are specified. The logic should be clear about what is returned. Now, it looks like only table names are returned.

I think for now I'm just going to make the help text for -table_sizes_only say that this is ignored if -table_names_only is passed, but I'm open to other suggestions.

Signed-off-by: Andrew Mason <amason@slack-corp.com>
Copy link
Member

@deepthi deepthi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change looks fine, but I'm debating whether you should be adding a test case. It will have to be an endtoend test, I think.
@vitessio/ps-vitess any opinions?

Signed-off-by: Andrew Mason <amason@slack-corp.com>
Signed-off-by: Andrew Mason <amason@slack-corp.com>
Signed-off-by: Andrew Mason <amason@slack-corp.com>
Copy link
Member

@deepthi deepthi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@deepthi deepthi merged commit 27c8a86 into vitessio:master Nov 29, 2020
@ajm188 ajm188 deleted the am_get_schema_sizes_only branch November 29, 2020 23:55
ajm188 added a commit to tinyspeck/vitess that referenced this pull request Dec 1, 2020
…er the wire (vitessio#6985)

* Add option to GetSchema to only send the row count and data length over the wire

Signed-off-by: Andrew Mason <amason@slack-corp.com>

* Update help text to clarify behavior

Signed-off-by: Andrew Mason <amason@slack-corp.com>

* Add endtoend test for vtctl GetSchema

Signed-off-by: Andrew Mason <amason@slack-corp.com>

* Alias the proto imports

Signed-off-by: Andrew Mason <amason@slack-corp.com>

* Format imports to match vtctl.go

Signed-off-by: Andrew Mason <amason@slack-corp.com>
@askdba askdba added this to the v9.0 milestone Dec 10, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Better support for pulling schema sizes from across a keyspace
3 participants