The TSM storage engine #4308

Merged
merged 142 commits into from
Oct 6, 2015

Conversation

pauldix (Member) commented Oct 2, 2015

Internal people have seen the docs, external will see it soon. I haven't had a chance to refactor any of this yet, but let me have it. I'll be pushing up some refactoring changes over the weekend I think.

Will fix the merge conflicts with the stress tool in a bit, but I wanted to get people commenting on the code so I can get some feedback and fixes in.

pauldix and others added 30 commits September 24, 2015 15:36
Also fixed shard to work again with b1 and bz1 engines.
Also ensure that queries don't try to use files that have been deleted.
Prevents index out of bounds panic
Prevents a panic when response size is less than 100.  Also allows
data to be posted when it is less than the batch size.
Time compression uses an adaptive approach using delta-encoding,
frame-of-reference, run length encoding as well as compressed integer
encoding.
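
As a rough illustration of the delta-encoding and run-length steps named above, here is a minimal sketch. The function names are hypothetical and this is not the engine's actual encoder: it only shows that regularly spaced timestamps reduce to a single repeated delta, which is what makes a run-length representation possible.

```go
package main

import "fmt"

// deltaEncode returns the first timestamp followed by the successive
// differences between neighbouring timestamps. (Hypothetical helper.)
func deltaEncode(ts []int64) []int64 {
	if len(ts) == 0 {
		return nil
	}
	out := make([]int64, len(ts))
	out[0] = ts[0]
	for i := 1; i < len(ts); i++ {
		out[i] = ts[i] - ts[i-1]
	}
	return out
}

// constantDelta reports whether every delta after the first value is the
// same. If so, the block could be stored as (first, delta, count), i.e. a
// run-length form. (Hypothetical helper.)
func constantDelta(deltas []int64) (int64, bool) {
	if len(deltas) < 2 {
		return 0, false
	}
	d := deltas[1]
	for _, v := range deltas[2:] {
		if v != d {
			return 0, false
		}
	}
	return d, true
}

func main() {
	ts := []int64{1000, 1010, 1020, 1030}
	deltas := deltaEncode(ts)
	d, ok := constantDelta(deltas)
	fmt.Println(deltas, d, ok) // [1000 10 10 10] 10 true
}
```

When the deltas are not all equal, an adaptive encoder would presumably fall through to another scheme such as compressed integer encoding; that selection logic is left out here.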

Float compression uses an implementation of the Gorilla paper's float value
encoding, based on XOR deltas and leading and trailing null suppression.
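
To make the XOR-delta idea concrete, here is a small sketch using only the Go standard library. `xorDelta` is a hypothetical name, and the bit packing that a real encoder performs on the result is omitted. It shows why nearby float values compress well: their XOR has long runs of leading and trailing zero bits that need not be stored.

```go
package main

import (
	"fmt"
	"math"
	"math/bits"
)

// xorDelta XORs the IEEE 754 bit patterns of two consecutive values and
// reports how many leading and trailing zero bits the result has.
// (Hypothetical helper; a real encoder would then write only the
// meaningful bits between those zero runs.)
func xorDelta(prev, cur float64) (x uint64, leading, trailing int) {
	x = math.Float64bits(prev) ^ math.Float64bits(cur)
	return x, bits.LeadingZeros64(x), bits.TrailingZeros64(x)
}

func main() {
	// Identical values XOR to zero, so all 64 bits are suppressible.
	fmt.Println(xorDelta(1.5, 1.5)) // 0 64 64

	// 1.0 and 2.0 differ only in exponent bits.
	x, lead, trail := xorDelta(1.0, 2.0)
	fmt.Printf("%#x %d %d\n", x, lead, trail) // 0x7ff0000000000000 1 52
}
```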
Integer compression uses zig zag encoding to convert int64 values to uint64 and
then simple8b to compress them, falling back to uncompressed if a value exceeds
1 << 60.  A patched encoding scheme would likely be better in general but this
provides decent compression for integers that are not at the ends of the int64 range.
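
The zig zag step can be sketched as follows. The function names are hypothetical, and the 60-bit check assumes the fallback limit comes from simple8b's 60-bit payload per 64-bit word (4 bits go to the selector); the exact boundary used by the engine is not stated here.

```go
package main

import "fmt"

// zigZagEncode maps signed integers to unsigned ones so that values of
// small magnitude, positive or negative, become small unsigned values:
// 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ...
func zigZagEncode(v int64) uint64 {
	return uint64(v<<1) ^ uint64(v>>63)
}

// zigZagDecode reverses zigZagEncode.
func zigZagDecode(u uint64) int64 {
	return int64(u>>1) ^ -int64(u&1)
}

// fitsIn60Bits reports whether an encoded value fits in 60 bits.
// Assumption: values that do not fit trigger the uncompressed fallback.
func fitsIn60Bits(u uint64) bool {
	return u < 1<<60
}

func main() {
	for _, v := range []int64{0, -1, 1, -2, 2} {
		u := zigZagEncode(v)
		fmt.Println(v, u, zigZagDecode(u), fitsIn60Bits(u))
	}
	// Values near the ends of the int64 range do not fit.
	fmt.Println(fitsIn60Bits(zigZagEncode(-1 << 63))) // false
}
```

This is why the scheme works well only for integers away from the ends of the int64 range: a large magnitude in either direction produces an encoded value that needs more than 60 bits.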
jwilder and others added 23 commits October 5, 2015 20:09
Shard path can be a directory.
Not implemented for tsm1 engine
Will make it less error-prone to add new encodings in the future
since each encoder has its own set of constants.  There are some placeholder
constants for uncompressed encodings which are not in all encoders currently.
Avoid panicking in lower level code and allow the engine to decide what
it should do.
Should never get a block smaller than 9 bytes since Encode always returns the min
timestamp and a 1 byte header.  If we get this, the engine is confused.
Closing the store did not properly return an error for in-flight
writes because the closing channel was set to nil when closed.  A
nil channel is not selectable so writes continue on past the guard
checks and trigger panics.
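
The nil-channel behaviour described above can be reproduced in isolation. This is a minimal sketch, assuming the guard is a `select` with a `default` branch; `guard` and `errClosed` are hypothetical names, not the store's actual code.

```go
package main

import (
	"errors"
	"fmt"
)

// errClosed is a hypothetical sentinel error for writes after close.
var errClosed = errors.New("store closed")

// guard mimics a write-path check: it should return an error once the
// closing channel has been closed, and nil while the store is open.
func guard(closing chan struct{}) error {
	select {
	case <-closing:
		return errClosed
	default:
		return nil
	}
}

func main() {
	closing := make(chan struct{})
	fmt.Println(guard(closing)) // <nil>: store is open

	close(closing)
	fmt.Println(guard(closing)) // store closed: a closed channel is always ready

	// Setting the channel to nil defeats the guard: a receive on a nil
	// channel is never ready, so select takes the default branch.
	closing = nil
	fmt.Println(guard(closing)) // <nil>: the write would proceed
}
```

Leaving the channel closed, rather than setting it to nil, keeps the receive case ready for every later caller.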
7 participants