Compactions
There are two distinct compaction processes:
TSM compactions, which aim to replace a set of TSM files with denser versions while also removing any tombstoned data.
WAL snapshots, which write time / value data from the in-memory cache to level-0 TSM files.
For reference, issue #9981 implements the columnar decoders and readers.
For the sake of simplicity, some examples will refer to the Float data type; however, any of the other supported data types (Integer, String, etc.) could be substituted.
TSM Compactions
When merging blocks of data for a series + field, the existing compaction implementation decodes and encodes compressed data using the iterative decoder / encoder APIs. This process should be refactored to use batch-oriented APIs to improve performance.
Starting from the top and working down, the key component responsible for merging a set of TSM files is the tsmKeyIterator, which is:
an implementation of the LSM Tree merge operation
that merges a specific set of TSM files
in series-key order
producing compressed and merged blocks of time and value data for each series
in ascending time order
for all series keys across the set of TSM files
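The merge order described above can be illustrated with a small k-way merge. This is a minimal sketch, not the tsmKeyIterator itself: each "file" is reduced to a sorted slice of series keys, and the names cursor, cursorHeap and mergeKeys are hypothetical.

```go
package main

import (
	"container/heap"
	"fmt"
)

// cursor walks one sorted list of series keys (a stand-in for one TSM file).
type cursor struct {
	keys []string
	pos  int
}

// cursorHeap is a min-heap of cursors ordered by their current key.
type cursorHeap []*cursor

func (h cursorHeap) Len() int            { return len(h) }
func (h cursorHeap) Less(i, j int) bool  { return h[i].keys[h[i].pos] < h[j].keys[h[j].pos] }
func (h cursorHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *cursorHeap) Push(x interface{}) { *h = append(*h, x.(*cursor)) }
func (h *cursorHeap) Pop() interface{} {
	old := *h
	n := len(old)
	c := old[n-1]
	*h = old[:n-1]
	return c
}

// mergeKeys returns the distinct keys across all files in sorted order.
// Every input slice must already be sorted.
func mergeKeys(files [][]string) []string {
	h := &cursorHeap{}
	for _, f := range files {
		if len(f) > 0 {
			*h = append(*h, &cursor{keys: f})
		}
	}
	heap.Init(h)

	var out []string
	for h.Len() > 0 {
		c := (*h)[0] // cursor with the smallest current key
		k := c.keys[c.pos]
		if len(out) == 0 || out[len(out)-1] != k {
			out = append(out, k) // emit each key once, even if several files hold it
		}
		c.pos++
		if c.pos < len(c.keys) {
			heap.Fix(h, 0) // the cursor advanced, so restore heap order
		} else {
			heap.Pop(h) // the cursor is exhausted
		}
	}
	return out
}

func main() {
	files := [][]string{
		{"cpu,host=a", "mem,host=a"},
		{"cpu,host=a", "cpu,host=b"},
	}
	fmt.Println(mergeKeys(files))
}
```

In a real compaction, each emitted key would additionally carry the blocks from every file that contains it, which are then merged in ascending time order.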
Under certain conditions, as an optimization, blocks may not require decoding and re-encoding. This behavior can be observed for Float values in influxdb/tsdb/engine/tsm1/compact.gen.go (lines 104 to 118 in a85306c).
The KeyIterator, defined in influxdb/tsdb/engine/tsm1/compact.go (lines 1204 to 1221 in 1c0e49e) as:
type KeyIterator interface {
	// Next returns true if there are any values remaining in the iterator.
	Next() bool
	// Read returns the key, time range, and raw data for the next block,
	// or any error that occurred.
	Read() (key []byte, minTime int64, maxTime int64, data []byte, err error)
	// Close closes the iterator.
	Close() error
	// Err returns any errors encountered during iteration.
	Err() error
	// EstimatedIndexSize returns the estimated size of the index that would
	// be required to store all the series and entries in the KeyIterator.
	EstimatedIndexSize() int
}
is an abstraction used by the Compactor to read encoded TSM blocks for each series key. tsmKeyIterator conforms to this interface, such that it can be used by the write function (influxdb/tsdb/engine/tsm1/compact.go, lines 1128 to 1163 in 1c0e49e):
		// If we have a max file size configured and we're over it, close out the file
		// and return the error.
		if w.Size() > maxTSMFileSize {
			if err := w.WriteIndex(); err != nil {
				return err
			}
			return errMaxFileExceeded
		}
	}
to produce a new set of TSM files. Note that iter.Next() is called for each iteration of the loop to move to the next merged block and iter.Read() is called to obtain the data to be encoded. The tsmKeyIterator provides two strategies for compacting data, per the fast option (influxdb/tsdb/engine/tsm1/compact.go, lines 1240 to 1246 in 1c0e49e):
	// indicates whether the iterator should choose a faster merging strategy over a more
	// optimally compressed one. If fast is true, multiple blocks will just be added as is
	// and not combined. In some cases, a slower path will need to be utilized even when
	// fast is true to prevent overlapping blocks of time for the same key.
	// If false, the blocks will be decoded and duplicated (if needed) and
	// then chunked into the maximally sized blocks.
	fast bool
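The comment above notes that the fast strategy must still fall back to the slower path when blocks for the same key overlap in time. The check below is a simplified sketch of that decision, assuming blocks are described only by their time range and are sorted by minimum time; block and needsSlowPath are hypothetical names, not the actual implementation.

```go
package main

import "fmt"

// block describes one encoded block for a key; only the time range matters here.
type block struct {
	minTime, maxTime int64
}

// needsSlowPath reports whether any two blocks overlap in time, in which case
// they cannot be copied as is and must be decoded and merged.
// blocks must be sorted by minTime; under that ordering, any overlap implies
// an overlap between two adjacent blocks.
func needsSlowPath(blocks []block) bool {
	for i := 1; i < len(blocks); i++ {
		if blocks[i].minTime <= blocks[i-1].maxTime {
			return true
		}
	}
	return false
}

func main() {
	disjoint := []block{{0, 9}, {10, 19}, {20, 29}}
	overlapping := []block{{0, 15}, {10, 19}}
	fmt.Println(needsSlowPath(disjoint), needsSlowPath(overlapping))
}
```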
The focus for the remainder of this document will be to define the requirements to implement a new KeyIterator, which merges a set of TSM files using the existing batch-oriented decoders and new batch-oriented encoders.
Cache Snapshots
The cache (WAL) is periodically snapshotted to generate a level-0 TSM file. A separate investigation is necessary to understand what would be required to utilize batch encoders for this process.
Tasks
Similar to moving the query / read path to batch-oriented APIs, it makes sense to start from the lowest layers and work up to defining a new KeyIterator.
Unit tests and benchmarks
In order to demonstrate correctness and show the scale of the performance improvements, unit tests should be replicated from the existing encoders. A diary of benchmarks was created to keep track of the improvements to the decoders. It would be useful to maintain a similar diary for the encoders.
Add batch encoders
Similarly to the array decoders, new implementations of the encoders should provide a single API for encoding a block of values. These encoders are responsible for encoding a single column of data, such as timestamps, floats or strings.
An example of a batch decoder in tsdb/engine/tsm1/batch_float.go decodes the byte slice b, using dst (to potentially avoid allocations), and returns the decoded slice or an error if decoding fails. The Float decoder was rewritten from the iterative approach (influxdb/tsdb/engine/tsm1/float.go, lines 142 to 155 in 426a9a0), which required the client to call Next() bool to decode the next value and Values() float64 to fetch the value. The improvements resulted in a single loop to decode an entire block of values, allowing a host of compiler optimizations.
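To make the difference between the two API styles concrete, here is a minimal sketch that contrasts them. It assumes a trivial encoding of 8 little-endian bytes per value rather than the compressed float encoding used by TSM, and iterDecoder, rawFloatDecodeAll and encodeRaw are hypothetical names.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"math"
)

// encodeRaw is a demo helper: 8 little-endian bytes per value.
func encodeRaw(vals ...float64) []byte {
	b := make([]byte, 0, len(vals)*8)
	for _, v := range vals {
		b = binary.LittleEndian.AppendUint64(b, math.Float64bits(v))
	}
	return b
}

// iterDecoder mimics the iterative style: one value per Next call.
type iterDecoder struct {
	b   []byte
	val float64
}

// Next decodes the next value and reports whether one was available.
func (d *iterDecoder) Next() bool {
	if len(d.b) < 8 {
		return false
	}
	d.val = math.Float64frombits(binary.LittleEndian.Uint64(d.b))
	d.b = d.b[8:]
	return true
}

// Values returns the value decoded by the last successful Next call.
func (d *iterDecoder) Values() float64 { return d.val }

// rawFloatDecodeAll decodes every value in b in a single loop,
// reusing dst when its capacity is large enough.
func rawFloatDecodeAll(b []byte, dst []float64) ([]float64, error) {
	if len(b)%8 != 0 {
		return nil, errors.New("rawFloatDecodeAll: truncated input")
	}
	n := len(b) / 8
	if cap(dst) < n {
		dst = make([]float64, n)
	} else {
		dst = dst[:n]
	}
	for i := range dst {
		dst[i] = math.Float64frombits(binary.LittleEndian.Uint64(b[i*8:]))
	}
	return dst, nil
}

func main() {
	b := encodeRaw(1.5, 2.5, 3.5)

	// Iterative style: two method calls per value.
	it := &iterDecoder{b: b}
	for it.Next() {
		fmt.Println(it.Values())
	}

	// Batch style: one call per block, with a caller-owned buffer.
	buf := make([]float64, 0, 1000)
	vals, err := rawFloatDecodeAll(b, buf)
	fmt.Println(vals, err)
}
```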
A similar API for a batch encoder:
func FloatArrayEncodeAll(src []float64, b []byte) ([]byte, error)
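which encodes the slice src, using the byte slice b, and returns the encoded slice or an error if encoding fails. The definition of the FloatEncoder can be found in influxdb/tsdb/engine/tsm1/float.go (lines 29 to 41 in 426a9a0).
Implement batch encoders
IntegerArrayEncodeAll
UnsignedArrayEncodeAll
FloatArrayEncodeAll
StringArrayEncodeAll
BooleanArrayEncodeAll
TimestampArrayEncodeAll
NOTE: FloatBatchDecodeAll should be renamed to FloatArrayDecodeAll, which follows the Array naming convention established throughout the remainder of the columnar work.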
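The shape of such an encoder can be sketched as follows. This is only an illustration of the proposed signature and of reusing the caller's buffer: it writes 8 little-endian bytes per value instead of the compressed float encoding, and rawFloatEncodeAll is a hypothetical name.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// rawFloatEncodeAll encodes every value in src in a single loop, reusing b
// when its capacity is large enough. The error return is kept only to mirror
// the proposed signature; this simplified encoding cannot fail.
func rawFloatEncodeAll(src []float64, b []byte) ([]byte, error) {
	n := len(src) * 8
	if cap(b) < n {
		b = make([]byte, n)
	} else {
		b = b[:n]
	}
	for i, v := range src {
		binary.LittleEndian.PutUint64(b[i*8:], math.Float64bits(v))
	}
	return b, nil
}

func main() {
	buf := make([]byte, 0, 8000) // room for a 1,000-value block
	out, err := rawFloatEncodeAll([]float64{1.5, 2.5, 3.5}, buf)
	fmt.Println(len(out), cap(out), err)
}
```

Passing the same buffer to successive calls avoids one allocation per block, which is the main reason for the b parameter.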
Add batch block encoders
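The block encoders are responsible for TSM blocks, encoding columns of up to 1,000 timestamps and values.
An example of a Float block decoder decodes the block of time and value data into the FloatArray a using the block decoders. FloatArray, defined in influxdb/tsdb/arrayvalues.gen.go (lines 9 to 12 in 0841c51), maintains the timestamps and values as separate slices. This is known as a Struct of Arrays layout.
FloatArray replaces FloatValues, which is an array of structs. Struct of Arrays improves cache locality, is better suited to leverage SIMD, and is also the layout specified by Apache Arrow.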
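The two layouts can be compared side by side. The type and field names below are assumptions chosen for illustration and may not match the definitions in the tsdb package.

```go
package main

import "fmt"

// FloatValue is one (timestamp, value) pair: the element of an array of structs.
type FloatValue struct {
	UnixNano int64
	Value    float64
}

// FloatArray keeps each column in its own slice: a struct of arrays.
type FloatArray struct {
	Timestamps []int64
	Values     []float64
}

// toFloatArray converts the array-of-structs layout to struct of arrays.
func toFloatArray(vs []FloatValue) FloatArray {
	a := FloatArray{
		Timestamps: make([]int64, len(vs)),
		Values:     make([]float64, len(vs)),
	}
	for i, v := range vs {
		a.Timestamps[i] = v.UnixNano
		a.Values[i] = v.Value
	}
	return a
}

// sum touches only the Values column, so it reads contiguous float64s
// without skipping over interleaved timestamps.
func sum(a FloatArray) float64 {
	var s float64
	for _, v := range a.Values {
		s += v
	}
	return s
}

func main() {
	vs := []FloatValue{{1, 1.5}, {2, 2.5}, {3, 3.0}}
	a := toFloatArray(vs)
	fmt.Println(a.Timestamps, a.Values, sum(a))
}
```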
The tsmKeyIterator uses the FloatValues#Encode API to encode a block of values (influxdb/tsdb/engine/tsm1/encoding.gen.go, lines 454 to 456 in 9cd3152), which forwards to encodeFloatValuesBlock (lines 458 to 496 in 9cd3152):
		// Prepend the first timestamp of the block in the first 8 bytes and the block
		// in the next byte, followed by the block
		b = packBlock(buf, BlockFloat64, tb, vb)
		return nil
	}()
	putTimeEncoder(tsenc)
	putFloatEncoder(venc)
	return b, err
}
The existing implementation uses the iterative encoders; therefore, a new implementation is required to encode a FloatArray, leveraging the batch encoders:
func encodeFloatArrayBlock(a *tsdb.FloatArray, b []byte) ([]byte, error)
where a is encoded using b, and the result is returned, or an error if there was a problem encoding the data.
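A rough sketch of what a batch block encoder could look like is shown below. The block layout here (a type byte, the byte length of the timestamp column as a uvarint, then the two columns written uncompressed) is an assumption made for illustration and is not the TSM format; encodeFloatArrayBlockSketch, blockFloat64 and the local FloatArray type are hypothetical.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"math"
)

// blockFloat64 is an assumed type tag for float blocks.
const blockFloat64 = byte(0)

// FloatArray is a local stand-in for the struct-of-arrays type.
type FloatArray struct {
	Timestamps []int64
	Values     []float64
}

// encodeFloatArrayBlockSketch encodes both columns of a into b and returns
// the encoded block. Each column is written in its own single loop.
func encodeFloatArrayBlockSketch(a *FloatArray, b []byte) ([]byte, error) {
	if len(a.Timestamps) != len(a.Values) {
		return nil, errors.New("encode: timestamp and value counts differ")
	}
	b = b[:0]
	if len(a.Timestamps) == 0 {
		return b, nil
	}
	b = append(b, blockFloat64)
	b = binary.AppendUvarint(b, uint64(len(a.Timestamps)*8))
	for _, t := range a.Timestamps {
		b = binary.LittleEndian.AppendUint64(b, uint64(t))
	}
	for _, v := range a.Values {
		b = binary.LittleEndian.AppendUint64(b, math.Float64bits(v))
	}
	return b, nil
}

// Encode forwards to the block encoder, mirroring how FloatValues#Encode
// forwards to encodeFloatValuesBlock.
func (a *FloatArray) Encode(b []byte) ([]byte, error) {
	return encodeFloatArrayBlockSketch(a, b)
}

func main() {
	a := &FloatArray{Timestamps: []int64{1, 2}, Values: []float64{1.5, 2.5}}
	b, err := a.Encode(nil)
	fmt.Println(len(b), err)
}
```

A real implementation would call the batch timestamp and float encoders for the two columns instead of writing raw bytes.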
Implement batch block encoders
encodeIntegerArrayBlock
encodeUnsignedArrayBlock
encodeFloatArrayBlock
encodeStringArrayBlock
encodeBooleanArrayBlock
Add Encode to FloatArray
Add a FloatArray#Encode method to the code generation template arrayvalues.gen.go.tmpl.
As noted above, the existing FloatValues#Encode method calls encodeFloatValuesBlock. This task is to add the equivalent FloatArray#Encode methods, calling the new encodeFloatArrayBlock API.
Implement Encode methods
IntegerArray#Encode
UnsignedArray#Encode
FloatArray#Encode
StringArray#Encode
BooleanArray#Encode
Implement batch-oriented KeyIterator
This task must implement a new version of the tsmKeyIterator using the batch-oriented types, including FloatArray, DecodeFloatArrayBlock and FloatArray#Encode APIs.
tsmKeyIterator replacement
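At the core of a batch-oriented iterator is merging decoded arrays for the same key. The following is a minimal sketch under stated assumptions: both inputs are sorted by timestamp with no internal duplicates, and when a timestamp appears in both, the value from the second (newer) array wins. mergeFloatArrays and the local FloatArray type are hypothetical.

```go
package main

import "fmt"

// FloatArray is a local stand-in for the struct-of-arrays type.
type FloatArray struct {
	Timestamps []int64
	Values     []float64
}

// mergeFloatArrays merges two time-sorted arrays into one time-sorted array.
// For timestamps present in both inputs, the value from newer is kept.
func mergeFloatArrays(older, newer FloatArray) FloatArray {
	n := len(older.Timestamps) + len(newer.Timestamps)
	out := FloatArray{
		Timestamps: make([]int64, 0, n),
		Values:     make([]float64, 0, n),
	}
	i, j := 0, 0
	for i < len(older.Timestamps) && j < len(newer.Timestamps) {
		switch {
		case older.Timestamps[i] < newer.Timestamps[j]:
			out.Timestamps = append(out.Timestamps, older.Timestamps[i])
			out.Values = append(out.Values, older.Values[i])
			i++
		case older.Timestamps[i] > newer.Timestamps[j]:
			out.Timestamps = append(out.Timestamps, newer.Timestamps[j])
			out.Values = append(out.Values, newer.Values[j])
			j++
		default: // same timestamp: keep the newer value, drop the older one
			out.Timestamps = append(out.Timestamps, newer.Timestamps[j])
			out.Values = append(out.Values, newer.Values[j])
			i++
			j++
		}
	}
	// Append whatever remains of either input.
	out.Timestamps = append(out.Timestamps, older.Timestamps[i:]...)
	out.Values = append(out.Values, older.Values[i:]...)
	out.Timestamps = append(out.Timestamps, newer.Timestamps[j:]...)
	out.Values = append(out.Values, newer.Values[j:]...)
	return out
}

func main() {
	older := FloatArray{Timestamps: []int64{1, 3, 5}, Values: []float64{1, 3, 5}}
	newer := FloatArray{Timestamps: []int64{3, 4}, Values: []float64{30, 40}}
	m := mergeFloatArrays(older, newer)
	fmt.Println(m.Timestamps, m.Values)
}
```

The merged array would then be chunked into blocks of at most the maximum block size and passed to the block encoder.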
Possible optimizations
It has been observed that large compactions generate a considerable amount of garbage. This is likely due to the merging of blocks:
influxdb/tsdb/engine/tsm1/compact.gen.go, line 162 in a85306c
and the later encoding of blocks:
influxdb/tsdb/engine/tsm1/compact.gen.go, line 175 in a85306c
and
influxdb/tsdb/engine/tsm1/compact.gen.go, line 193 in a85306c
It would be worth investigating ways of recycling memory, as this should also improve performance and reduce GC pressure.
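One common way to recycle memory in Go is sync.Pool. The sketch below shows the general pattern for reusing block buffers; whether it suits the compaction code paths referenced above would need to be measured, and blockPool, getBuf, putBuf and encodeInto are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// blockPool hands out reusable byte buffers. Pointers to slices are stored
// so that returning a buffer to the pool does not itself allocate.
var blockPool = sync.Pool{
	New: func() interface{} {
		b := make([]byte, 0, 4096)
		return &b
	},
}

// getBuf fetches a buffer from the pool, allocating one if the pool is empty.
func getBuf() *[]byte { return blockPool.Get().(*[]byte) }

// putBuf resets the buffer length and returns it to the pool.
func putBuf(b *[]byte) {
	*b = (*b)[:0]
	blockPool.Put(b)
}

// encodeInto stands in for encoding a block into a pooled buffer and
// returns the number of bytes written.
func encodeInto(dst *[]byte, payload []byte) int {
	*dst = append((*dst)[:0], payload...)
	return len(*dst)
}

func main() {
	for i := 0; i < 3; i++ {
		buf := getBuf()
		n := encodeInto(buf, []byte("block data"))
		fmt.Println(n, cap(*buf))
		putBuf(buf) // the buffer must not be used after this point
	}
}
```

The pool may discard buffers between garbage collections, so it reduces allocation pressure without guaranteeing reuse.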