TSI Proposal #7174

Closed
wants to merge 3 commits into from


benbjohnson
Contributor

Proposal for implementing #7151.

/cc @jwilder @pauldix @e-dard


### Series Blocks

Series blocks contain a `uint32` block ID followed by a list of `uint32` local series IDs. Combining the block ID and local series ID provides a globally addressable `uint64` series ID: `(blockID,seriesID)`.
Contributor

Is the local series ID the index position within the list? For example, if you had block ID 1 and ten series within the block, the global series IDs would be (1,0), (1,1), (1,2), ... (1,9)? Then, if you had series ID (1,4), you could go directly to block 1 in the file and use 4 as the index into the KEYPOS list to find the offset in the block for that key?

I'm assuming this is the case, but it would be good to explain how this works in more detail.
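For readers following along, the addressing scheme described in this comment can be sketched in Go as below. This is a minimal illustration, not the proposal's implementation: the names `SeriesID`, `MakeSeriesID`, `SeriesBlock`, and `KeyOffset` are hypothetical, the in-memory `KeyPos` slice stands in for the KEYPOS list, and placing the block ID in the high 32 bits is an assumption about how the `(blockID,seriesID)` pair might be packed.

```go
package main

import "fmt"

// SeriesID is a hypothetical packed ID: block ID in the high 32 bits,
// local series ID in the low 32 bits (the bit layout is an assumption).
type SeriesID uint64

// MakeSeriesID combines a block ID and a local series ID into one uint64.
func MakeSeriesID(blockID, localID uint32) SeriesID {
	return SeriesID(uint64(blockID)<<32 | uint64(localID))
}

// Split recovers the (blockID, localID) pair from a packed ID.
func (id SeriesID) Split() (blockID, localID uint32) {
	return uint32(id >> 32), uint32(id)
}

// SeriesBlock is an illustrative in-memory stand-in for a series block.
type SeriesBlock struct {
	ID     uint32
	KeyPos []uint32 // KeyPos[i] is the offset of local series i's key in the block
}

// KeyOffset uses the local series ID as a direct index into KeyPos.
// It reports false if the ID belongs to another block or is out of range.
func (b *SeriesBlock) KeyOffset(id SeriesID) (uint32, bool) {
	blockID, localID := id.Split()
	if blockID != b.ID || uint64(localID) >= uint64(len(b.KeyPos)) {
		return 0, false
	}
	return b.KeyPos[localID], true
}

func main() {
	blk := &SeriesBlock{ID: 1, KeyPos: []uint32{0, 12, 30, 41, 57}}
	id := MakeSeriesID(1, 4)
	off, ok := blk.KeyOffset(id)
	fmt.Println(uint64(id), off, ok)
}
```

Under these assumptions, resolving series ID (1,4) needs no search: the block ID selects the block and the local ID indexes the position list directly.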

Contributor Author

Yes, that's correct. I can add more detail.

Contributor Author

Detail added in 008dc1e.

@jwilder
Contributor

jwilder commented Aug 18, 2016

I think this is a good start and we should keep iterating on this. Some diagrams like these would really help to visualize the format of the file IMO.

@toddboom
Contributor

@benbjohnson gets to use Monodraw!

@jwilder jwilder added this to the 1.1.0 milestone Aug 18, 2016

### Tag Value Blocks

Tag value blocks contain a sorted list of tag values.
Contributor

@e-dard e-dard Aug 19, 2016

I think we could consider storing the maximum and minimum tag values in the block. We could store them in the header, or we could avoid needing to store them if we could easily jump to the first and last values in the block.

I would imagine if the tag values were container IDs for example, we could have millions of them, and a binary search for a value outside of [min, max] would be wasteful.
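The bounds check suggested here can be sketched in Go as follows. This is only an illustration under stated assumptions: `TagValueBlock` and `Contains` are hypothetical names, and the block is modeled as an in-memory sorted slice, so the first and last elements serve as the minimum and maximum. An on-disk block with variable-length values would need either header fields or an offset to the last value to get the same effect.

```go
package main

import (
	"fmt"
	"sort"
)

// TagValueBlock is an illustrative in-memory stand-in for a tag value block.
type TagValueBlock struct {
	Values []string // tag values, sorted ascending
}

// Contains rejects values outside [min, max] before doing a binary search.
// Because Values is sorted, min and max are the first and last elements.
func (b *TagValueBlock) Contains(v string) bool {
	n := len(b.Values)
	if n == 0 || v < b.Values[0] || v > b.Values[n-1] {
		return false // out of range: skip the search entirely
	}
	i := sort.SearchStrings(b.Values, v)
	return i < n && b.Values[i] == v
}

func main() {
	blk := &TagValueBlock{Values: []string{"c01", "c07", "c12", "c30"}}
	fmt.Println(blk.Contains("c07"), blk.Contains("c99"), blk.Contains("c08"))
}
```

The range check costs two comparisons and only helps for values outside the block's range; values inside the range still pay for the full binary search.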

@jwilder jwilder mentioned this pull request Aug 22, 2016
@benbjohnson
Contributor Author

@jwilder @e-dard @pauldix This proposal has been updated with the hash index and dictionary encoding from #7186.

@e-dard e-dard added the RFC label Sep 12, 2016
@rw
Contributor

rw commented Sep 19, 2016

  • How are the search structures created? In particular, what is the maximum memory usage when creating the TSI hash tables and/or sorted lists?
  • Can we future proof by using uint64 instead of uint32 (to support more than 4 billion series)?

@jwilder jwilder modified the milestones: 1.2.0, 1.1.0 Oct 6, 2016
@timhallinflux timhallinflux modified the milestones: 1.3.0, 1.2.0 Dec 19, 2016

7 participants