-
Notifications
You must be signed in to change notification settings - Fork 24.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add additional transport compression options #74587
Conversation
Pinging @elastic/es-distributed (Team:Distributed) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a number of detailed comments. In addition, I would like to have the documentation added in this PR, marking the new options experimental. Finally, I wonder if there is additional testing we can do (added one such comment only but did not investigate more deeply which tests to add).
server/src/main/java/org/elasticsearch/transport/DeflateTransportDecompressor.java
Outdated
Show resolved
Hide resolved
return bytesConsumed; | ||
} | ||
|
||
private int decodeBlock(BytesReference reference) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would think we need a test or assertion to verify at least one of:
- Randomized compression and decompression works.
- The java library is the expected version
Ideally we would split this method into a class that can be contributed upstream to the the lz4 library (can do in follow-up).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The java library is the expected version
I'm not sure what you meant here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The testing added (plus the ones I suggested) should suffice here. I was only worried that we would not catch discrepancies fast if the library is updated to a newer version which changes the block format. And one way to catch that would be to assert something on the library version. But I prefer adding randomized testing so all good here.
server/src/main/java/org/elasticsearch/transport/Lz4TransportDecompressor.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/transport/CompressionScheme.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/transport/CompressionScheme.java
Outdated
Show resolved
Hide resolved
server/src/test/java/org/elasticsearch/transport/RemoteClusterServiceTests.java
Outdated
Show resolved
Hide resolved
server/src/test/java/org/elasticsearch/transport/RemoteConnectionStrategyTests.java
Show resolved
Hide resolved
@@ -440,6 +441,12 @@ private static Settings getRandomNodeSettings(long seed) { | |||
Random random = new Random(seed); | |||
Builder builder = Settings.builder(); | |||
builder.put(TransportSettings.TRANSPORT_COMPRESS.getKey(), rarely(random)); | |||
builder.put(TransportSettings.TRANSPORT_COMPRESS_RAW_DATA.getKey(), random.nextBoolean()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I imagine we use rarely above to avoid the overhead of compression. Can you run the full test suite with raw data compression on vs compression off to verify the penalty on builds? That would also give some indication of the overhead of lz4/raw_data.
I wonder if we only want LZ4/raw_data to run "every second run" and the other combinations (like deflate/raw_data and full compression) to only run rarely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I put all compression back to rarely
. I will test this out a bit soon. Hopefully this is not a blocker as it is easy to fix if build times are impacted?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, let us see how it goes, would be good to do a bit of checking after this goes in.
server/src/main/java/org/elasticsearch/transport/OutboundMessage.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/transport/Lz4TransportDecompressor.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some quick comments, unfortunately I didn't get through all of it today.
server/src/main/java/org/elasticsearch/transport/RawDataTransportRequest.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/transport/Lz4TransportDecompressor.java
Outdated
Show resolved
Hide resolved
*/ | ||
static final int COMPRESSION_LEVEL_BASE = 10; | ||
|
||
static final int MIN_BLOCK_SIZE = 64; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems unused?
|
||
static final int MIN_BLOCK_SIZE = 64; | ||
static final int MAX_BLOCK_SIZE = 1 << COMPRESSION_LEVEL_BASE + 0x0F; // 32 M | ||
static final int DEFAULT_BLOCK_SIZE = 1 << 16; // 64 KB |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unused as well?
server/src/main/java/org/elasticsearch/transport/Lz4TransportDecompressor.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/transport/Lz4TransportDecompressor.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/transport/OutboundMessage.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/transport/Lz4TransportDecompressor.java
Outdated
Show resolved
Hide resolved
} | ||
byte firstByte = bytes.get(0); | ||
byte[] header; | ||
if (firstByte == CompressionScheme.DEFLATE_HEADER[0]) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we should just interpret the headers as int
, then we don't have to have a mutable public static byte[]
and can just switch
on that int here (faster and less code)?
} | ||
byte firstByte = bytes.get(0); | ||
byte[] header; | ||
if (firstByte == CompressionScheme.DEFLATE_HEADER[0]) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we should just interpret the header as int
, then we don't have to have a mutable public static byte[]
and can just switch
on that int here (faster and less code)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I left a number of smaller things to address.
qa/full-cluster-restart/src/test/java/org/elasticsearch/upgrades/FullClusterRestartIT.java
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/transport/NetworkMessage.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/transport/RawDataTransportRequest.java
Outdated
Show resolved
Hide resolved
server/src/test/java/org/elasticsearch/transport/OutboundHandlerTests.java
Outdated
Show resolved
Hide resolved
server/src/test/java/org/elasticsearch/transport/OutboundHandlerTests.java
Outdated
Show resolved
Hide resolved
test/framework/src/main/java/org/elasticsearch/transport/AbstractSimpleTransportTestCase.java
Show resolved
Hide resolved
return bytesConsumed; | ||
} | ||
|
||
private int decodeBlock(BytesReference reference) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The testing added (plus the ones I suggested) should suffice here. I was only worried that we would not catch discrepancies fast if the library is updated to a newer version which changes the block format. And one way to catch that would be to assert something on the library version. But I prefer adding randomized testing so all good here.
@@ -440,6 +441,12 @@ private static Settings getRandomNodeSettings(long seed) { | |||
Random random = new Random(seed); | |||
Builder builder = Settings.builder(); | |||
builder.put(TransportSettings.TRANSPORT_COMPRESS.getKey(), rarely(random)); | |||
builder.put(TransportSettings.TRANSPORT_COMPRESS_RAW_DATA.getKey(), random.nextBoolean()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, let us see how it goes, would be good to do a bit of checking after this goes in.
This commit is related to elastic#73497. It adds two new settings. The first setting is transport.compression_scheme. This setting allows the user to configure LZ4 or DEFLATE as the transport compression. Additionally, it modifies transport.compress to support the value indexing_data. When this setting is set to indexing_data only messages which are primarily composed of raw source data will be compressed. This is bulk, operations recovery, and shard changes messages.
PR elastic#74587 added lz4-java as a dependency. This broken multiple third party audits which had ignored those missing classes. This commit fixes the issue.
This commit reenables the BWC tests after #74587 was backported.
Compression using indexing_data or lz4 as well as recovery from snapshot are primarily intended for ESS and is therefore marked ESS only in docs. Relates elastic#76237 and elastic#74587
This commit is related to #73497. It adds two new settings. The first setting
is
transport.compression_scheme
. This setting allows the user toconfigure LZ4 or DEFLATE as the transport compression. Additionally, it
modifies
transport.compress
to support the valueindexing_data
. Whenthis setting is set to
indexing_data
only messages which are primarilycomposed of raw source data will be compressed. This is bulk, operations
recovery, and shard changes messages.