Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support tagfiltertree for fast matching metricIDs to queries #4310

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

shaan420
Copy link
Contributor

This PR introduces a new data structure which we call "tagfiltertree".
A query consists of several tag filters. A tag filter is nothing but a tag and value pair where the value is a pattern much like a regex. For instance, "service:foo" is a valid tag filter. A query is nothing but a list of such tag filters that need to be matched with Conjunction within an incoming metricID.

The key observation behind creating a tree structure is that the presence of a tag in a metricID and a query has the potential to drastically prune the search space of the possible queries going forward. So much so that within a couple comparisons we expect an incoming metricID to output the entire list of matched queries.

This package optimizes CPU and memory for match time as opposed to creation of the tree itself. It is not thread-safe so make sure to protect the tree appropriately if it is going to be used in a multithreaded context.

Refer to the README.md for more details.

What this PR does / why we need it:
Use this data structure when you want to quickly match an input metricID to a list of queries.

Does this PR introduce a user-facing and/or backwards incompatible change?:

NONE

Does this PR require updating code package or user-facing documentation?:

NONE

}

// IsVarTagValue returns true if the value is a variable tag value.
func IsVarTagValue(value string) bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add some explanation? e.g. I am not sure why len<4 means false

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a Var value type is of the form "{{...}}" so the minimum is 4 chars. Anything lesser and we know it's not a Var value type

Copy link
Contributor

@fengcheng1518 fengcheng1518 Oct 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: add comments to explain what Var Tag means. No reviewer can correlate "VarTagValue" with "...{{...}}..." pattern

type Tag struct {
Name string
Val string
Var string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where is the usage of Var? I don't find any

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Var will be used in Namespace Attribution to support capture-groups like functionality in Regex.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so it will not be used in query matching? nit: add some comments to make the code more readable

return false
}

// IsMatchNoneTag returns true if the tag is a match none tag.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IsNegation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Done.

}
tagValue, tagNameFound := tags[name]
absVal, absValFound := node.AbsoluteValues[tagValue]
if tagNameFound && absValFound {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if isMatchAny==false, then within this if condition, you will never return true? is that correct logic?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ha! good catch! yeah this is a bug. We cannot rely on the returned "matched" bool flag. I'll fix it.

@@ -186,7 +186,6 @@ linters:
- gci
- goconst
- gocritic
- golint
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

golint is deprecated now and mainly not detecting the type correctly when using Generics.

a set of tag filters. One such use-case is metric attribution to namespaces.
Iterating through each filter individually and matching them is extremely expensive
since it has to be done on each incoming metricID. Therefore, this data structure
pre-compiles a set of tag filters in order to optimize matches against an input metricID.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really understand this paragraph.

  1. "attribution to namespaces" - what does this mean? What is a namespace in this context?
  2. How does this pre-compiled data structure prevent you from having to do matching on each incoming metricID?

Perhaps a diagram or example would help here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the Readme

pre-compiles a set of tag filters in order to optimize matches against an input metricID.

## Usage
First create a trie using New() and then add tagFilters using AddTagFilter().
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And then I guess you use Match somehow? A code example here would be useful.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants