Expected Usage Question #37
Comments
p.s. I realize i could use the |
Hey @ccollie, since ART is a prefix tree, trying to insert a key that is a prefix of another key results in an error during insertion. For example, "hell" is a prefix of "hello", so when inserting there are not enough bytes to distinguish between these two. If you know your labels are ASCII only, use a |
The trait that dictates if a type is safe to not use the |
Metric names and values are utf-8. I'm hesitant to bring in something like ICU for this though. region=us-east-1 |
Another thought. I can suffix my keys with a non-utf8 sentinel byte if that makes it function similar to |
IDK if you are going to have more keys than this, but it's obvious that this is ascii only so |
Also in the new version we will have a |
Very nice! I'm also a |
Last question: is there a cheap way to get a count of entries with a given prefix? If so, I'd like to raise it as a feature request. |
Yeah typically if you're going to have variable length keys, you're going to need some sort of suffix value that is not present in any other part of the key. I like your idea of using non-UTF8 sentinel byte, or you could append a This does make the string a sorta subset of UTF-8.
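To make the sentinel idea concrete, here is a minimal sketch using only the Rust standard library. It does not call any BLART API; `SENTINEL`, `terminated_key`, and `prefix_conflict` are hypothetical names chosen for illustration. It relies on the fact that the byte `0xFF` never occurs in valid UTF-8, so it cannot collide with anything inside a `&str` key.

```rust
/// Byte that never occurs in valid UTF-8, so it cannot appear inside a `&str` key.
/// (Assumption: keys are always built from valid UTF-8 strings.)
const SENTINEL: u8 = 0xFF;

/// Returns the key's bytes followed by the sentinel terminator.
fn terminated_key(key: &str) -> Vec<u8> {
    let mut bytes = Vec::with_capacity(key.len() + 1);
    bytes.extend_from_slice(key.as_bytes());
    bytes.push(SENTINEL);
    bytes
}

/// True when one byte string is a prefix of the other (equal keys included),
/// which is the situation a prefix tree cannot represent as two separate leaves.
fn prefix_conflict(a: &[u8], b: &[u8]) -> bool {
    a.starts_with(b) || b.starts_with(a)
}
```

With raw bytes, "region" and "region=us-east1" conflict; after passing both through `terminated_key`, neither is a prefix of the other, because the terminator of the shorter key differs from the byte at the same position in the longer key.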
At the moment no, the internal nodes of the trie don't track "number of leaf node descendants" as a field. The check for number of entries would need to essentially count the leaf nodes. Feel free to cut a separate ticket as a feature request (doesn't need to be too detailed), I'd be happy to discuss that feature more. I think it would be a tradeoff between increasing the size of the trie inner nodes and being able to speed up "num children" functions for the various iterators
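As a rough illustration of what "count the leaf nodes" costs, the sketch below counts entries under a prefix by walking them one at a time. It uses `std::collections::BTreeMap` as a stand-in ordered map, not BLART itself, and `count_with_prefix` is a hypothetical helper. The work is proportional to the number of matching entries, which is the cost a per-node descendant counter would avoid.

```rust
use std::collections::BTreeMap;

/// Counts entries whose key starts with `prefix` by scanning the ordered range
/// that begins at `prefix` and stopping at the first key that no longer matches.
/// Runs in O(log n + matches); there is no cached per-prefix count.
fn count_with_prefix<V>(map: &BTreeMap<Vec<u8>, V>, prefix: &[u8]) -> usize {
    map.range(prefix.to_vec()..)
        .take_while(|(key, _)| key.starts_with(prefix))
        .count()
}
```

Because keys sharing a prefix are contiguous in sorted order, the scan can stop at the first non-matching key instead of visiting the whole map.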
I'm working on this this week, I'll try to cut a release quickly! |
@declanvk was faster than me haha, but yeah, we don't keep track of this number, only at the root, but with the new implementation doing a |
I wasn't expecting tracking of descendants, only that we iterate the nodes and sum their children. I'll put in a request... |
I released version 0.3.0, let me know what you think! Any other problems or questions are welcome. I was also thinking about some usability improvements on the tree APIs have the |
I have a library that needs to index Prometheus-like metrics (e.g. call_latency{instance="101", region="us-east1", env="staging"}).
Each metric is indexed by name as well as labels (region="us-east1"), with each entry being a roaring bitmap of the metric ids containing the label.
One reason for looking at an ART is because of path compression, since there is great redundancy in label names. The proposed schema is
"${label}_${value}" -> bitmap # all metrics with the key value pair
"${label}" -> bitmap # all metrics with a given label, irrespective of value
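The schema above can be sketched with standard-library types only. This is an illustration under assumptions, not the actual library code: `BTreeMap` stands in for the ART, `BTreeSet<u32>` stands in for the roaring bitmap, and `LabelIndex` and `index_label` are hypothetical names.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Stand-in index: key bytes -> set of metric ids.
/// (Assumption: a roaring bitmap would replace `BTreeSet<u32>` in the real design.)
type LabelIndex = BTreeMap<Vec<u8>, BTreeSet<u32>>;

/// Records `metric_id` under both keys of the proposed schema:
/// the bare label and the "label_value" pair.
fn index_label(index: &mut LabelIndex, metric_id: u32, label: &str, value: &str) {
    let label_key = label.as_bytes().to_vec();
    let pair_key = format!("{label}_{value}").into_bytes();
    index.entry(label_key).or_default().insert(metric_id);
    index.entry(pair_key).or_default().insert(metric_id);
}
```

Note that the bare label key is always a byte prefix of the pair key, which is exactly the shape of key set that triggers the prefix errors described next.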
I've tried to implement this in BLART but it seems like this is not supported as I get Prefix errors when adding "region" then "region=us-east1" (or the inverse).
My assumption was that it functioned as a trie, which means my use case would be possible. Is this a valid assumption, or am I using the API incorrectly (try_insert)?