Search v0.1.0 (ingester only) #806
Merged
Signed-off-by: Martin Disibio <mdisibio@gmail.com>
…search everything generically Signed-off-by: Martin Disibio <mdisibio@gmail.com>
…ue value in-memory Signed-off-by: Martin Disibio <mdisibio@gmail.com>
…sults are found Signed-off-by: Martin Disibio <mdisibio@gmail.com>
annanay25
reviewed
Aug 30, 2021
…h bytes exceed per trace limit Signed-off-by: Martin Disibio <mdisibio@gmail.com>
mdisibio
changed the title
WIP: Search v0.0.00000001 (ingester only)
Search v0.1.0 (ingester only)
Aug 30, 2021
annanay25
reviewed
Aug 31, 2021
annanay25
reviewed
Aug 31, 2021
joe-elliott
approved these changes
Aug 31, 2021
EL GEE TEE EM
annanay25
approved these changes
Sep 1, 2021
LGTM 🚀
We determined that this also fixed #216.
What this PR does:
This PR adds basic search of traces in the ingesters. The approach should be decently fast and powerful enough for a first pass. Flatbuffer-encoded metadata for each trace is stored in a new file alongside the block data. When searching, the flatbuffer metadata is evaluated and matching traces are returned. The OTLP/protobuf block data is not involved at all. A new search api is exposed in the query-frontend/querier/ingester, and it can be called directly (by an experimental Grafana UI that is in the works), or via tempo-query, which is also updated to translate the jaeger search conditions. There are also apis to look up attribute names and values (for autocomplete and jaeger dropdowns).
Flatbuffers are quite fast and this basic implementation can search 3+GB/s of files, based on the benchmark in /tempodb/search/. With additional tuning I believe we could increase it further, but this is already enough to saturate most disk and network i/o, so a better direction might be to look at indexing or compression.
Functionality must be enabled by setting `search_enabled: true` at the root yaml config for query-frontend, querier, and distributor. This causes the search apis to be registered and the distributor to start capturing search data of traces as they are received. Ingester is backwards compatible and tolerates data present or not.
Query Details
This approach is tag-oriented. The search metadata effectively flattens a trace down to all unique key/value pairs for span and resource attributes, and the min/max start and end times. All attributes are coerced to strings. Therefore this first version is quite basic and can answer questions about hits anywhere within a trace, or the overall trace duration. But it cannot satisfy complex searches on individual spans except the root.
Examples:
Combining conditions is matching multiple hits anywhere in the trace:
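As a rough illustration of the tag-oriented matching described above, here is a minimal Go sketch. It is not the PR's code: the `Span` and `SearchEntry` types, the `Flatten` and `Matches` functions, and the use of exact string equality for values are all assumptions made for the example. It shows how flattening a trace to unique key/value pairs lets two conditions match even when they hit different spans.

```go
package main

import (
	"fmt"
	"time"
)

// Span is a hypothetical, simplified span: attributes (already coerced
// to strings) plus start and end timestamps.
type Span struct {
	Attrs map[string]string
	Start time.Time
	End   time.Time
}

// SearchEntry is a hypothetical flattened view of one whole trace.
type SearchEntry struct {
	Tags  map[string]map[string]struct{} // key -> set of unique values
	Start time.Time                      // earliest span start
	End   time.Time                      // latest span end
}

// Flatten collapses all spans of a trace into unique key/value pairs
// and the overall min start / max end time.
func Flatten(spans []Span) SearchEntry {
	e := SearchEntry{Tags: map[string]map[string]struct{}{}}
	for i, s := range spans {
		for k, v := range s.Attrs {
			if e.Tags[k] == nil {
				e.Tags[k] = map[string]struct{}{}
			}
			e.Tags[k][v] = struct{}{}
		}
		if i == 0 || s.Start.Before(e.Start) {
			e.Start = s.Start
		}
		if i == 0 || s.End.After(e.End) {
			e.End = s.End
		}
	}
	return e
}

// Matches reports whether every requested key/value pair occurs somewhere
// in the trace (assumption: exact equality) and whether the overall trace
// duration is at least minDuration.
func (e SearchEntry) Matches(want map[string]string, minDuration time.Duration) bool {
	for k, v := range want {
		if _, ok := e.Tags[k][v]; !ok {
			return false
		}
	}
	return e.End.Sub(e.Start) >= minDuration
}

func main() {
	t0 := time.Unix(0, 0)
	entry := Flatten([]Span{
		{Attrs: map[string]string{"service.name": "frontend"}, Start: t0, End: t0.Add(300 * time.Millisecond)},
		{Attrs: map[string]string{"db.system": "postgres"}, Start: t0.Add(50 * time.Millisecond), End: t0.Add(200 * time.Millisecond)},
	})
	// The two conditions hit different spans, but the trace as a whole matches.
	want := map[string]string{"service.name": "frontend", "db.system": "postgres"}
	fmt.Println(entry.Matches(want, 100*time.Millisecond))
}
```

Because per-span structure is discarded, this model cannot tell whether both conditions held on the same span, which is the limitation noted above.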
Implementation Details
Search data is extracted in the distributor since this is the only location where the trace is available in "cleartext". Flatbuffer metadata is built and byte slices are sent from distributor->ingester. Ingester stores this data alongside live traces, and it is eventually flushed to the WAL (separate files in /wal), and completed to the local backend (new files in /wal/blocks///search). When searching, the ingester checks in all 3 locations.
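The "check all 3 locations" step can be sketched as follows. This is a simplified model under stated assumptions, not the ingester's implementation: `Entry`, `Source`, `Results`, and `Search` are hypothetical names, each location is modeled as an in-memory slice, and deduplicating by trace ID across locations is an assumption of the sketch.

```go
package main

import "fmt"

// Entry is a hypothetical stand-in for one trace's search metadata.
type Entry struct {
	TraceID string
	Tags    map[string]string
	Bytes   int // size of the encoded metadata
}

// Source is a hypothetical abstraction over one place search data lives
// (live traces, WAL files, or completed local blocks).
type Source struct {
	Name    string
	Entries []Entry
}

// Results carries matching trace IDs plus simple inspection metrics.
type Results struct {
	TraceIDs        []string
	InspectedTraces int
	InspectedBytes  int
}

// Search scans every source in order. A trace ID is reported once even if
// it appears in several sources (an assumption of this sketch).
func Search(sources []Source, key, value string) Results {
	var res Results
	seen := map[string]bool{}
	for _, src := range sources {
		for _, e := range src.Entries {
			res.InspectedTraces++
			res.InspectedBytes += e.Bytes
			v, ok := e.Tags[key]
			if ok && v == value && !seen[e.TraceID] {
				seen[e.TraceID] = true
				res.TraceIDs = append(res.TraceIDs, e.TraceID)
			}
		}
	}
	return res
}

func main() {
	sources := []Source{
		{Name: "live", Entries: []Entry{{TraceID: "a", Tags: map[string]string{"service": "api"}, Bytes: 40}}},
		{Name: "wal", Entries: []Entry{{TraceID: "b", Tags: map[string]string{"service": "db"}, Bytes: 30}}},
		{Name: "completed", Entries: []Entry{{TraceID: "c", Tags: map[string]string{"service": "api"}, Bytes: 50}}},
	}
	res := Search(sources, "service", "api")
	fmt.Printf("matches=%v traces=%d bytes=%d\n", res.TraceIDs, res.InspectedTraces, res.InspectedBytes)
}
```

Counting inspected traces and bytes during the scan is cheap and gives a rough way to compare how much work different queries do.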
Flatbuffers are a compiled schema. There is a new pkg/tempofb with this and `make gen-flat` to compile it. One key detail is the use of `flatbuffers.CreateSharedString`, which interns a string in the block of data, leading to efficient storage for common tags and values.
Search results are trace headers and not entire trace bodies. They include basic details like id, duration, service, operation. Also included in the api response are some basic metrics for how many traces, blocks, and bytes were inspected. This will let us quickly gauge the performance of various queries.
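To make the string-interning idea mentioned above concrete, here is a minimal Go sketch. It deliberately does not call the flatbuffers library; the `Interner` type and its methods are hypothetical and only model the concept of storing each distinct string once and reusing its offset for repeats.

```go
package main

import "fmt"

// Interner is a hypothetical, simplified model of shared-string interning:
// each distinct string is written to the buffer once, and later requests
// for the same string return the original offset.
type Interner struct {
	buf     []byte
	offsets map[string]int
}

// NewInterner returns an empty interner.
func NewInterner() *Interner {
	return &Interner{offsets: map[string]int{}}
}

// Intern returns the offset of s in the buffer, appending s only the
// first time it is seen.
func (in *Interner) Intern(s string) int {
	if off, ok := in.offsets[s]; ok {
		return off
	}
	off := len(in.buf)
	in.buf = append(in.buf, s...)
	in.offsets[s] = off
	return off
}

// Size reports how many bytes of string data are stored.
func (in *Interner) Size() int { return len(in.buf) }

func main() {
	in := NewInterner()
	// A common tag key repeated across many traces is stored only once.
	for i := 0; i < 1000; i++ {
		in.Intern("service.name")
	}
	fmt.Println(in.Size())
}
```

With many traces sharing the same tag keys and values, the stored string data grows with the number of distinct strings rather than the number of occurrences.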
Large remaining gaps / next steps
Consensus is that these are not needed for this first pass and will be addressed in the future.
Small remaining gaps / next steps. Not necessarily required in this PR but open to feedback
Which issue(s) this PR fixes:
Fixes: #471
Checklist
CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]