[v24.1.x] `compression`: correct endianness in `snappy_java_compressor` (Manual backport) #25137

WillemKauf · 2025-02-22T02:05:58Z

Cherry-pick conflict in setup.py and bazel BUILD files.

Also removed test_upgrade_java_compression from java_compression_test.py in backports.

Closes issue #25136.

Backports Required

Release Notes

Bug Fixes

Fix the endianness of snappy_java_compressor headers to match that of snappy-java.

(cherry picked from commit 0352201)

The versions in the snappy header are written using big-endian format in the `snappy-java` client used by kafka. Mistakenly, `redpanda` would write them using little-endian format in our `snappy_java_compressor` implementation.

Correct this by encoding and decoding the `version` and `compatible_version` headers using big-endian format in `snappy_java_compressor`.

For references to `snappy-java`'s big-endian implementation, see:

* https://github.com/xerial/snappy-java/blob/65e1ec3de1a0d447b137c6dd6393629aa3d75b8b/src/main/java/org/xerial/snappy/SnappyOutputStream.java#L343-L349
* https://github.com/xerial/snappy-java/blob/65e1ec3de1a0d447b137c6dd6393629aa3d75b8b/src/main/java/org/xerial/snappy/SnappyCodec.java#L78-L81

(cherry picked from commit 1c1b006)
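To make the byte-order difference concrete, here is a minimal Python sketch of the two encodings. It is an illustration only, not the code from `snappy_java_compressor`: the 8-byte magic, the value `1` for both version fields, and the helper names `encode_header` and `decode_versions` are assumptions modeled on the `snappy-java` stream format.

```python
import struct

# Assumed header layout, modeled on snappy-java's stream format:
# an 8-byte magic followed by two 4-byte signed integers.
SNAPPY_MAGIC = b"\x82SNAPPY\x00"
VERSION = 1             # assumed default version
COMPATIBLE_VERSION = 1  # assumed minimum compatible version


def encode_header(byteorder: str) -> bytes:
    """Build a header; byteorder is '>' (big-endian) or '<' (little-endian)."""
    return SNAPPY_MAGIC + struct.pack(f"{byteorder}ii", VERSION, COMPATIBLE_VERSION)


def decode_versions(header: bytes, byteorder: str) -> tuple:
    """Read the two version fields that follow the magic."""
    return struct.unpack_from(f"{byteorder}ii", header, len(SNAPPY_MAGIC))


fixed = encode_header(">")   # big-endian, as snappy-java writes it
legacy = encode_header("<")  # little-endian, as the buggy encoder wrote it

print(fixed[8:].hex())               # 0000000100000001
print(legacy[8:].hex())              # 0100000001000000
print(decode_versions(legacy, ">"))  # (16777216, 16777216)
```

A reader that expects big-endian fields sees version `16777216` instead of `1` when given a little-endian header, which is the kind of mismatch a strict client-side version check would reject.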

Most `snappy` clients do not perform this version check, and furthermore, it is implemented incorrectly here. (cherry picked from commit 5723eb4)

(cherry picked from commit 72d02ee)

The two committed files in `snappy_payload` are a raw uncompressed data file and a `snappy`-compressed data file generated by `redpanda` using the incorrect little-endian encoding for the version fields in the `snappy` header. A unit test uses them to ensure that, with the big-endian fix for `snappy` in place, we can still decompress the legacy buffer and get the same decompressed data as before the fix. (cherry picked from commit a84252d)
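One way such backward compatibility can hold is for the reader to validate only the magic and skip the version fields, which is consistent with the version check being removed in this PR. The sketch below illustrates that idea in Python. It is hypothetical and is not Redpanda's decoder: the magic bytes, the 16-byte header length, and the helper `payload_offset` are assumptions, and the body is a placeholder rather than real `snappy` data.

```python
import struct

SNAPPY_MAGIC = b"\x82SNAPPY\x00"    # assumed 8-byte magic
HEADER_LEN = len(SNAPPY_MAGIC) + 8  # magic + two 4-byte version fields


def payload_offset(buf: bytes) -> int:
    """Return the offset of the first byte after the header.

    Only the magic is validated; the version fields are skipped,
    so headers written in either byte order are accepted.
    """
    if len(buf) < HEADER_LEN or not buf.startswith(SNAPPY_MAGIC):
        raise ValueError("not a snappy-java framed buffer")
    return HEADER_LEN


body = b"<compressed blocks>"  # placeholder, not real snappy data
legacy = SNAPPY_MAGIC + struct.pack("<ii", 1, 1) + body  # pre-fix encoding
fixed = SNAPPY_MAGIC + struct.pack(">ii", 1, 1) + body   # post-fix encoding

# Both buffers yield the same payload once the header is skipped.
assert legacy[payload_offset(legacy):] == fixed[payload_offset(fixed):] == body
```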

In order to allow `kafka-python` to use these compression types, we must be able to import the respective module. (cherry picked from commit 17a2e55)
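A quick way to confirm that a test environment has these dependencies is to try importing them. The snippet below is a sketch under assumptions: the helper `is_importable` is hypothetical, and the module names `snappy` (from `python-snappy`) and `lz4.frame` (from `lz4`) are the import names these packages are generally known to provide, not names taken from this PR.

```python
import importlib


def is_importable(module_name: str) -> bool:
    """Return True if the named module can be imported in this environment."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


# Assumed import names for the python-snappy and lz4 packages.
for name in ("snappy", "lz4.frame"):
    print(f"{name}: {'available' if is_importable(name) else 'missing'}")
```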

These tests check compression compatibility with Java-based Kafka consumers and producers. They are parameterized for all compression types, but most notably serve as reproducers for an outstanding header-field encoding bug in `snappy_java_compressor.cc`. (cherry picked from commit 379f380)

vbotbuildovich · 2025-02-22T05:11:29Z

CI test results

test results on build#62123

| test_id | test_kind | job_url | test_status | passed |
|---|---|---|---|---|
| rptest.tests.delete_records_test.DeleteRecordsTest.test_delete_records_concurrent_truncations.cloud_storage_enabled=True.truncate_point=at_high_watermark | ducktape | https://buildkite.com/redpanda/redpanda/builds/62123#01952bae-8820-492c-954f-9e031376519f | FLAKY | 5/6 |

WillemKauf added 7 commits February 21, 2025 21:00

compression: move snappy_magic to header

b4ec300

(cherry picked from commit 0352201)

compression: remove version check for snappy_java_compressor

a456236

Most `snappy` clients do not perform this version check, and furthermore, it is implemented incorrectly here. (cherry picked from commit 5723eb4)

compression: add snappy_tests.cc

4207b14

(cherry picked from commit 72d02ee)

tests: add python-snappy and lz4 as rptest dependencies

9a86d84

In order to allow `kafka-python` to use these compression types, we must be able to import the respective module. (cherry picked from commit 17a2e55)

WillemKauf requested a review from rockwotj February 22, 2025 02:05

github-actions bot added the area/redpanda label Feb 22, 2025

WillemKauf enabled auto-merge February 22, 2025 05:13

rockwotj approved these changes Feb 22, 2025

WillemKauf merged commit a26ca70 into redpanda-data:v24.1.x Feb 22, 2025
19 checks passed
