Add support for query payload limit #6143

Merged
merged 4 commits from the query-payload-limit branch into vitessio:master on Jun 22, 2020

Conversation

@spark4 (Contributor) commented on May 1, 2020

Overview

Currently, there is no way to enforce a limit on the query payload for insert operations. This PR introduces a new comment directive that lets us configure a threshold for insert query payloads.

Implementation

  1. Define a new comment directive, DirectiveMaxPayloadSize, for the maximum payload size in bytes.
  2. Introduce validation functions that determine whether a given insert query is within the configured maximum payload size threshold, if one is set (see the sketch after this list).
  3. At the plan-building step, run the above validation check before returning a valid insert plan.
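
As context, here is a minimal sketch of what such a directive-based check could look like. Only the directive name (MAX_PAYLOAD_SIZE, matching the test cases below) comes from this PR's description; the package placement, regular expression, helper names, and byte-length measure are illustrative assumptions, not the PR's actual code:

// Illustrative sketch only; the real PR uses Vitess's sqlparser directive
// machinery, which differs in detail.
package sqlparser

import (
    "fmt"
    "regexp"
    "strconv"
)

// maxPayloadSizeRE extracts the byte threshold from a hypothetical
// /*vt+ MAX_PAYLOAD_SIZE=n */ comment directive.
var maxPayloadSizeRE = regexp.MustCompile(`/\*vt\+.*MAX_PAYLOAD_SIZE=(\d+)`)

// maxPayloadSizeDirective returns the configured threshold in bytes, or 0
// when the directive is absent or malformed.
func maxPayloadSizeDirective(query string) int {
    m := maxPayloadSizeRE.FindStringSubmatch(query)
    if m == nil {
        return 0
    }
    n, err := strconv.Atoi(m[1])
    if err != nil {
        return 0
    }
    return n
}

// validatePayloadSize rejects a query whose raw SQL text exceeds the
// directive's byte threshold; a zero threshold means no limit is set.
func validatePayloadSize(query string) error {
    limit := maxPayloadSizeDirective(query)
    if limit > 0 && len(query) > limit {
        return fmt.Errorf("query payload size %d exceeds threshold %d", len(query), limit)
    }
    return nil
}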

Testing

Testing was done primarily via unit tests and covers the following cases (a sketch of such a table-driven test follows the list):

  • unsharded inserts, within the MAX_PAYLOAD_SIZE threshold
  • unsharded inserts, above the MAX_PAYLOAD_SIZE threshold
  • sharded inserts with multiple rows, within the MAX_PAYLOAD_SIZE threshold
  • sharded inserts with multiple rows, above the MAX_PAYLOAD_SIZE threshold
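
A table-driven test in roughly this shape could cover those cases; it targets the hypothetical validatePayloadSize sketch above, not the PR's actual test suite:

import "testing"

func TestValidatePayloadSize(t *testing.T) {
    tests := []struct {
        name    string
        query   string
        wantErr bool
    }{
        {"within threshold", "/*vt+ MAX_PAYLOAD_SIZE=1000 */ insert into t values (1)", false},
        {"above threshold", "/*vt+ MAX_PAYLOAD_SIZE=10 */ insert into t values (1)", true},
        {"no directive set", "insert into t values (1)", false},
    }
    for _, tc := range tests {
        t.Run(tc.name, func(t *testing.T) {
            err := validatePayloadSize(tc.query)
            if gotErr := err != nil; gotErr != tc.wantErr {
                t.Errorf("validatePayloadSize(%q) error = %v, wantErr %v", tc.query, err, tc.wantErr)
            }
        })
    }
}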

@sougou (Contributor) commented on May 2, 2020

Haven't looked deeply into the code yet. Some high-level considerations:

  • This feature overlaps a bit with the gRPC max packet size.
  • We can think about what this means for the rowcount limit. People have previously expressed concern that rowcount is not an accurate measure of payload size.

@rafael (Member) commented on May 4, 2020

@sougou - This is how I’m thinking about the considerations you point out:

  1. I would see the gRPC limit as a last line of defense.
  2. Similarly, the queryserver-config-max-result-size limit is a hard stop, but only for select queries.
  3. This feature adds to those capabilities by enforcing limits on a per-query basis for inserts. In the future, I think we could expand the same directive to enforce these limits for selects as well.

Although there is some overlap, I think this kind of granularity is a powerful lever to have in the toolkit.

For context, our motivation for this is that it is difficult for us to enforce the hard limits right now. We are moving towards that state, but we still have many places in our application where we can't enforce them yet. With this new feature, we will be able to start enforcing limits with more granularity.

I agree with the concern that rowcount is not an accurate measure of payload size. That's why we leaned more towards payload size. We have run into situations where the rowcount is not that big, but the payload is.

@spark4 force-pushed the query-payload-limit branch 2 times, most recently from 7daff90 to c36d8d6 on May 4, 2020
@zmagg (Contributor) commented on May 5, 2020

Agreed with everything Rafael said. Expanding on that a little:

The gRPC limit is also a single global limit for both SELECTs and INSERTs. We're not yet in a place where we can bring that down across the board.

We've also seen a few incidents internally where the query payload size is large enough to start causing replication lag.

Having this granularity of control around limits lets us fail those queries before they even hit the database. We also have a large number of tables that have started putting large protobufs in LONGBLOB columns. We'd like to be able to protect ourselves from upstream application mistakes where we suddenly start sending all 4GB in that insert. 😁

Resolved review threads (outdated) on go/vt/sqlparser/comments.go
@spark4 force-pushed the query-payload-limit branch 2 times, most recently from fac7e60 to e15947c on May 14, 2020
@spark4 (Contributor, Author) commented on May 14, 2020

Summary of changes

I have incorporated the feedback and made the following updates to the initial implementation (a sketch of the new flag wiring follows the list):

  • Flags over query comments: the threshold can now be configured as a flag in the vtgate package instead of via a query comment directive.
  • Payload validation occurs in the query execution layer: we no longer perform the check within the plan builder.
  • Apply the threshold to all operations, not just inserts: the primary motivation for this shift was to use this flag to catch queries with large IN lists across all operations, in addition to large insert payloads.
  • Add a warn threshold: this will allow us to tune the max_payload_size limit over time.
  • Add a query comment override: this will allow us to bypass the max_payload_size limit for specific queries.
  • Set 0 as the default value for the flag to indicate no restriction on payload size.
  • Use vterrors instead of plain errors.
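
To make the new shape concrete, here is a minimal sketch of the flag wiring described above. The flag names max_payload_size and warn_payload_size and the PayloadSizeExceeded counter are taken from this PR's description; everything else (package placement, helper name, counter type) is an illustrative assumption:

// Illustrative sketch only; not the PR's actual code.
package vtgate

import (
    "flag"
    "fmt"
)

var (
    // 0, the default, disables each check entirely.
    maxPayloadSize  = flag.Int("max_payload_size", 0, "error out on query payloads larger than this many bytes")
    warnPayloadSize = flag.Int("warn_payload_size", 0, "warn on query payloads larger than this many bytes")

    // Stand-in for the PayloadSizeExceeded stats counter mentioned below.
    payloadSizeExceeded int64
)

// checkPayloadSize runs in the execution layer for all operations: it bumps
// the warning counter past the warn threshold, and rejects the query past
// the hard threshold unless it carries the override comment directive.
func checkPayloadSize(query string, hasOverrideDirective bool) error {
    size := len(query)
    if *warnPayloadSize > 0 && size > *warnPayloadSize {
        payloadSizeExceeded++
    }
    if *maxPayloadSize > 0 && size > *maxPayloadSize && !hasOverrideDirective {
        return fmt.Errorf("query payload size %d above threshold %d", size, *maxPayloadSize)
    }
    return nil
}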

Testing

In addition to unit tests, I performed the following manual checks on a vtgate host:

  • Ensured vtgate was able to execute queries with the default max_payload_size value
  • Ensured vtgate returned the expected error if the query payload exceeded the configured limit
  • Validated that a query comment will bypass the max_payload_size limit
  • Confirmed that the PayloadSizeExceeded counter incremented as expected when the warn_payload_size was set and exceeded

// Review excerpt (vtgate execution layer): reject an oversized payload
// unless the query carries the override comment directive.
query := sql
statement := stmt
bindVarNeeds := sqlparser.BindVarNeeds{}
if !isValidPayloadSize(query) && !sqlparser.MaxPayloadSizeOverrideDirective(statement) {
    return nil, vterrors.New(vtrpcpb.Code_RESOURCE_EXHAUSTED, "query payload size above threshold")
}
@spark4 (Contributor, Author) commented:

Should this be Code_FAILED_PRECONDITION instead?

A contributor replied:

I think ResourceExhausted is fine, but it'd be nice to use a MySQL error code that indicates it's not retryable. In the upstream community Slack, we were discussing using MySQL error code 1153 (ER_NET_PACKET_TOO_LARGE) for things like this and for gRPC packet-size-exceeded errors going forward. I'm not sure where that MySQL error code is set or where the gRPC error code is set.
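
For context, one plausible shape of the isValidPayloadSize helper referenced in the excerpt above, reusing the hypothetical max_payload_size flag from the earlier sketch; the PR's actual implementation may differ:

// Hypothetical helper: a query payload is valid when no hard limit is set
// (the flag defaults to 0) or when the raw SQL text fits within it.
func isValidPayloadSize(query string) bool {
    if *maxPayloadSize <= 0 {
        return true
    }
    return len(query) <= *maxPayloadSize
}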

@spark4 force-pushed the query-payload-limit branch 4 times, most recently from 466e197 to e43c927 on May 20, 2020
@spark4 requested a review from deepthi on June 4, 2020
@spark4 removed the request for review from arka-g on June 22, 2020
@deepthi (Member) left a review:

A few questions and some style / coding standard type changes.

Resolved review threads (outdated) on go/vt/sqlparser/comments.go, go/vt/sqlparser/comments_test.go, go/vt/vtgate/executor.go, and go/vt/vtgate/executor_test.go
@deepthi merged commit aed641b into vitessio:master on Jun 22, 2020
@spark4 mentioned this pull request on Jun 25, 2020
ajm188 pushed a commit to tinyspeck/vitess that referenced this pull request on Jun 26, 2020
@deepthi added this to the v7.0 milestone on Jul 27, 2020
ameetkotian pushed 3 commits to tinyspeck/vitess that referenced this pull request on Aug 19, 2020