Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Queryable Sub-Trees #22

Open
vassudanagunta opened this issue Oct 10, 2018 · 5 comments
Open

Queryable Sub-Trees #22

vassudanagunta opened this issue Oct 10, 2018 · 5 comments

Comments

@vassudanagunta
Copy link

vassudanagunta commented Oct 10, 2018

I see that the implementation is recursive, but this isn't exposed in the public API.

For example:

  • Get(k []byte) is defined on *Node, but the only Node that is publicly accessible is the root node. It would be powerful to be able to make queries relative to any Node (e.g. Get("xyz") on the "abc" Node finding "abcxyz").
  • The docs state "Iterator is used to return an iterator at the given node to walk the tree", but the only possible "given node" is the root node.

I might be able to figure it out reverse engineering the code, but I'm hoping the maintainers know whether this is something that is (a) already supported; just need to provide public access to non-root nodes, (b) not supported but could be with some reasonalble changes or (c) technically not possible given the current implementation.

This is a great library. I'm trying to do some great open source code with it.

@vassudanagunta
Copy link
Author

@banks, since you just worked on the code, and @i0rek and @kyhavlov since you just reviewed said code, might any of you have an idea?

@banks
Copy link
Member

banks commented May 23, 2019

Hey @vassudanagunta,

Yeah the API could certainly be more flexible in general - it's really the minimum needed for specific use case (go-memdb).

That said, a prefix based lookup can in most cases give you essentially the same thing - it doesn't expose a subtree's actual node directly as it's wrapped in the iterator but generally allows you to walk any subset of the tree from a given node. Given that the only thing you can do with a Node currently is further prefix/exact match based lookups it's kinda of equivalent but if you have a use case where exposing the actual Node API of a subtree is convenient then it's probably not hard to add.

The Iterator itself has a reference to the node that was sought with any call to SeekPrefix so you could just make it return that node explicitly. I've not thought through whether this is a good idea or not in detail but off the to of my head it seems reasonable and likely a smallish change.

@vassudanagunta
Copy link
Author

vassudanagunta commented May 23, 2019

Thanks for the detail reply, @banks.

My use-case is fast prefix-based lookups of file/URL paths, as well as glob matching, which I can implement by seeking to glob's wildcard-free prefix, then walking the tree from there to find matches on the wildcard-containing suffix.

I will often have to do repeated lookups (not walks) relative to a specific node (e.g. a directory). You might call these subprefix lookups. Having to do it from the root every time means several points of redundancy: (1) having to concatenate the prefix (the directory/base path) and the subprefix (the relative path), (2) all the redundant string-[]byte conversions and then (3) go-immutable-radix having to rewalk the tree down to the prefix, even though it already did that for me many times just a microsecond ago.

Being a Go newb and only an open source programmer on the side at that, I'm not sure which makes most sense:

  1. The cost is relatively insignificant; just pay it! (context: this is for Hugo, must support hundreds/thousand/sometimes tens of thousands of paths)
  2. Update go-immutable-radix to provide a way of getting a public handle on an arbitrary Node. (I could do that, with tests, and submit a PR)
  3. Roll my own file-path optimized radix tree-like index, because for file paths "/" nodes are special.

@banks
Copy link
Member

banks commented May 24, 2019 via email

@vassudanagunta
Copy link
Author

@banks I have too many things to learn! I wanted to know if it would be relatively trivial (It didn't seem trivial to figure out if it would be trivial; perhaps that should have been my first clue. Thanks again for your help and for your work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants