Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add optional tag-based filtering for query logging #4895

Merged
merged 3 commits into from
Jun 1, 2019

Conversation

demmer
Copy link
Member

@demmer demmer commented May 31, 2019

Description

Add optional tag-based filtering for query logging.

Details

Although the query logging can be very useful for debugging why certain queries are slow, in large deployments the overhead of logging every query can be burdensome and we'd like more control over what is logged.

As a simple enhancement, this PR adds a flag -querylog-filter-tag that allows us to specify a string and only emits query logs for queries that contain the specified string.

This allows the application to control logging by including a marker in a trailing comment.

demmer added 2 commits May 31, 2019 12:10
Add a flag -querylog-filter-tag to only emit query logging
for queries that contain the specified string.

Signed-off-by: Michael Demmer <mdemmer@slack-corp.com>
Signed-off-by: Michael Demmer <mdemmer@slack-corp.com>
@demmer demmer requested a review from sougou as a code owner May 31, 2019 19:20
Signed-off-by: Michael Demmer <mdemmer@slack-corp.com>
@demmer demmer force-pushed the querylog-filter-tag branch from 28db76a to 1bfa92d Compare May 31, 2019 19:23
@@ -121,6 +121,10 @@ func (stats *LogStats) RemoteAddrUsername() (string, string) {
// Logf formats the log record to the given writer, either as
// tab-separated list of logged fields or as JSON.
func (stats *LogStats) Logf(w io.Writer, params url.Values) error {
if !streamlog.ShouldEmitLog(stats.SQL) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this include the full query or just the comments? Do you have the parsed sql in this context? I wonder if a better approach is to have a proper directive in the trailing comments. If stats.SQL contains the full original query, one thing that it's not ideal about the current approach is that random queries that contain the LogFilterTag will sneak into the logging.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is the full query but without the bind variables, which is what’s logged today.

It is true that a more robust system would use properly parsed comment directives but I think this isn’t too big of a deal since users can easily come up with a filter tag that doesn’t match any of the table names / column names, and since the bind vars aren’t in the string there’s no worry about accidental matching of any column value.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also fwiw there isn’t an easy way to plumb the parsed query down into the logging subsystem so we’d probably need to parse again which seems overly conservative compared to the simple string search.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, parsing again just for this does not make sense to me.

Thanks for the clarifications.

Copy link
Member

@rafael rafael left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM +1

@demmer demmer merged commit 3f90008 into vitessio:master Jun 1, 2019
setassociative pushed a commit to tinyspeck/vitess that referenced this pull request Jun 5, 2019
setassociative pushed a commit to tinyspeck/vitess that referenced this pull request Jun 5, 2019
setassociative pushed a commit to tinyspeck/vitess that referenced this pull request Jun 5, 2019
* Adds timeouts for all kinds of statements. Power to the clients!

Signed-off-by: Rafael Chacon <rafael@slack-corp.com>

* Cherry-pick: suppress begin...commit in autocommit logs

Original Message:
The previous (untested) implementation turned out to be in the wrong
place in the TabletServer execution tier and did not properly log
the actual statements being executed.

Implement this the right way by returning the statements that were
really executed out from the TxPool, then using those to determine
whether or not to log the statement.

Original PR: vitessio#4827

* Cherry pick: add optional tag-based filtering for query logging

Original PR: vitessio#4895

* bootstrap.sh: Remove unused unused command.

The deprecated `unused` command has been removed from the source
repository, so this now fails. We no longer use `unused` so we can just
drop the line that tries to install it.

Signed-off-by: Anthony Yeh <enisoc@planetscale.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants