Releases: redpanda-data/redpanda
v24.1.8
Bug Fixes
- fixed overflow that may lead to unnecessary moves by @mmaslankaprv in #19803
- rpk cluster config get: does not round float numbers anymore. by @r-vasquez in #18850
- PR #18835 [v24.1.x] Fixed possible log discrepancy when using forced reconfiguration by @mmaslankaprv
- PR #19313 [v24.1.x] cpu_profiler: prealloc result buffers by @StephanDollberg
- PR #19801 [v24.1.x] s/disk_log_impl: don't prefix-truncate empty segments by @ztlpn
- PR #19808 [v24.1.x] t/fetch_test: wait longer for leadership after stepping down by @mmaslankaprv
Improvements
- Run directory walking during cache trimming concurrently. On some deployments, walking 600K objects with a busy reactor was observed to take hours, during which fetch operations that need to cache data were blocked. by @nvartolomei in #19815
- #19831 Don't try to transfer leadership to just restarted nodes when balancing leaders. by @ztlpn in #19832
- new metric providing more insight into recovery process by @mmaslankaprv in #18840
- reduced the amount of data required to transfer over the network by @mmaslankaprv in #19834
- PR #18854 [v24.1.x] cst: manual backport of chunk download changes PR 18278 by @abhijat
Full Changelog: v24.1.7...v24.1.8
v23.3.17
Features
- Adds configuration options to trigger cache trim before the cache reaches its maximum size. by @jcipar in #19624
  - cloud_storage_cache_trim_threshold_size
  - cloud_storage_cache_trim_threshold_objects
  These mirror the options for controlling maximum size: cloud_storage_cache_size and cloud_storage_cache_max_objects
- The new default behavior, if these are not set, is to trigger a trim when the cache is 100% full. by @jcipar in #19624
- #18739 Schema Registry: Support /mode endpoints for READONLY by @BenPope in #18742
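As a rough illustration of the cache trim threshold options listed above, the following Python sketch shows one way such a check could behave. It is not Redpanda's implementation: the helper name `should_trim` is hypothetical, and treating the threshold as an absolute value in the same units as the maximum (bytes or object count) is an assumption made only for this example.

```python
from typing import Optional


def should_trim(current: int, maximum: int, threshold: Optional[int] = None) -> bool:
    """Return True once cache usage has reached the trim threshold.

    Hypothetical helper: `threshold` is assumed to use the same units as
    `maximum`. When it is unset, fall back to trimming only when the cache
    is 100% full, which is the default described in the notes above.
    """
    limit = maximum if threshold is None else min(threshold, maximum)
    return current >= limit
```

With a threshold set below the maximum, the trim starts earlier, leaving headroom before the cache is completely full.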
Bug Fixes
- Fixes a bug where crashes within the redpanda http client could occur when encountering tls exceptions by @graphcareful in #18696
- #18633 rpk: fixes an error in rpk topic consume that prevented the usage of the --regex flag. by @r-vasquez in #18634
- #18734 Fixes incorrect ordering of arguments in the cloud cache trim admin endpoint. by @andrwng in #18764
- #18770 Fixes a bug that would allow requests to complete that created acls for topics with invalid kafka topic names by @graphcareful in #19791
- fixed overflow that may lead to unnecessary moves by @mmaslankaprv in #19805
- rpk cluster config get: does not round float numbers anymore. by @r-vasquez in #18849
- PR #18784 [v23.3.x] raft: fix node_id mismatch log message by @ztlpn
- PR #18855 [v23.3.x] Fixed possible log discrepancy when using forced reconfiguration by @mmaslankaprv
- PR #18573 rm_stm: couple of stability fixes noticed when down scaling max_concurrent_producer_ids by @bharathv
Improvements
- #18645 rpk: topic describe supports --regex flag by @daisukebe in #18646
- made fast partition movements easier to debug. by @mmaslankaprv in #18689
- reduced the amount of data required to transfer over the network by @mmaslankaprv in #19835
- PR #18741 [v23.3.x] cloud_storage_clients: check for BlobNotFound in abs_client::do_delete_path() by @WillemKauf
- PR #19838 [v23.3.x] s/disk_log_impl: don't prefix-truncate empty segments by @ztlpn
Full Changelog: v23.3.16...v23.3.17
v24.1.7
Features
- Split cache into buckets using cloud_storage_cache_num_buckets configuration parameter. by @Lazin in #18780
Bug Fixes
- Fixes a bug that would allow requests to complete that created acls for topics with invalid kafka topic names by @graphcareful in #18769
- #18735 Fixes incorrect ordering of arguments in the cloud cache trim admin endpoint. by @andrwng in #18763
Full Changelog: v24.1.6...v24.1.7
v24.1.6
Full Changelog: v24.1.5...v24.1.6
v24.1.5
v24.1.4
Bug Fixes
Improvements
- #18643 rpk: topic describe supports --regex flag by @daisukebe in #18644
- #18675 rpk now will exit (1) when running rpk with unknown commands by @r-vasquez in #18676
- made fast partition movements easier to debug. by @mmaslankaprv in #18690
Full Changelog: v24.1.3...v24.1.4
v24.1.3
Features
- Schema Registry: Support /mode endpoints for READONLY by @BenPope in #18623
- Schema Registry: Support for deleted=true query parameter on POST /subjects/<subject>. by @BenPope in #18433
- #18458 rpk: ability to transfer partition leadership by @daisukebe in #18459
Bug Fixes
- Don't mark partition rebalance complete if some partitions are not moveable (e.g. due to partial recovery mode) by @ztlpn in #18518
- Enforce client quota throttling in a Kafka-compatible way, meaning we enforce the throttle delay on the next request if the client did not enforce it on its side. by @pgellert in #18568
- Fixes a bug in the http client where a crash may occur in the event certain tls verification errors are observed by @graphcareful in #18428
- #18439 Fixed an assertion triggering in a full-disk scenario by @andijcr in #18440
- #18565 Fix an edge case where a timequery returns no results if it races with tiered storage retention and garbage collection. This is important at least for consumers that fall behind retention. They interpret such a response as meaning the partition is empty and jump to the HWM instead of resuming consumption from the first available message. by @nvartolomei in #18597
- #18631 rpk: fixes an error in rpk topic consume that prevented the usage of the --regex flag. by @r-vasquez in #18632
- fixes possible stall in raft::state_machine_manager by @mmaslankaprv in #18638
- PR #18392 [v24.1.x] archival: clamp uploads to committed offset by @nvartolomei
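The client quota throttling fix in the list above can be pictured with a small Python sketch. This is only an illustration of the described behavior (the broker applies the remaining throttle delay to the next request if the client did not wait on its own), not the actual broker code. The class and function names are hypothetical, and time is passed in explicitly as seconds to keep the example deterministic.

```python
from dataclasses import dataclass


@dataclass
class ClientThrottleState:
    """Hypothetical per-client record of when the last throttle expires."""
    throttled_until: float = 0.0  # absolute time, in seconds


def record_throttle(state: ClientThrottleState, now: float, throttle_s: float) -> None:
    """Remember a throttle delay that was reported to the client in a response."""
    state.throttled_until = max(state.throttled_until, now + throttle_s)


def delay_before_processing(state: ClientThrottleState, now: float) -> float:
    """Seconds the broker should hold the client's next request.

    Zero if the client already waited out the throttle on its side;
    otherwise the remainder that the broker enforces itself.
    """
    return max(0.0, state.throttled_until - now)
```

A well-behaved client that sleeps for the full delay sees no extra wait, while a client that ignores the delay is held for whatever time remains.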
Improvements
- Made electing a leader faster by @mmaslankaprv in #18493
- PR #18448 [v24.1.x] cloud_storage: correct list_object() request headers and parameters (manual backport) by @WillemKauf
- PR #18476 [v24.1.x] rptest: be more permissive with errors in stress fibers test by @andrwng
- PR #18488 [v24.1.x] tests: wait for messages before adding a node to cluster by @mmaslankaprv
- PR #18503 [v24.1.x] storage: change map type for _db in kvstore by @WillemKauf
- PR #18520 [v24.1.x] Made client id parsing vcluster aware by @mmaslankaprv
- PR #18560 [v24.1.x] cst/ducktape: Accept errors due to gap in manifest by @abhijat
- PR #18588 [v24.1.x] archival: Disable housekeeping jobs on startup by @Lazin
- PR #18620 [v24.1.x] tests: fix replaced segments accounting in TopicRecoveryTest by @ztlpn
- PR #18639 [v24.1.x] schema_registry: Make mode_mutability: true by default by @BenPope
Full Changelog: v24.1.2...v24.1.3
v23.3.16
Features
- Schema Registry: Support for deleted=true query parameter on POST /subjects/<subject>. by @BenPope in #18432
- #18460 rpk: ability to transfer partition leadership by @daisukebe in #18461
Bug Fixes
- Fix initial_leader_epoch/KIP-320 handling in fetch requests. It was ignored until now, which prevented consumers from correctly detecting suffix truncation. For Redpanda (and Raft), this is a minor problem since suffix truncation is a very improbable event. by @nvartolomei in #17728
- #17957 Fix incorrect log truncations caused by delayed replication requests. by @ztlpn in #18523
- #18282 #18566 Fix a scenario where list_offset with a timestamp could return a lower offset than partition start after a trim-prefix command. This could lead to consumers being stuck with an out-of-range-offset exception if they began consuming from an offset below the one which was used in the trim-prefix command. by @nvartolomei in #18599
- #18282 #18566 Fix an edge case where a timequery returns no results if it races with tiered storage retention and garbage collection. This is important at least for consumers that fall behind retention. They interpret such a response as meaning the partition is empty and jump to the HWM instead of resuming consumption from the first available message. by @nvartolomei in #18599
- #18443 Fixed an assertion triggering in a full-disk scenario by @andijcr in #18444
- #18517 Don't mark partition rebalance complete if some partitions are not moveable (e.g. due to partial recovery mode) by @ztlpn in #18522
- #18569 Enforce client quota throttling in a Kafka-compatible way, meaning we enforce the throttle delay on the next request if the client did not enforce it on its side. by @pgellert in #18575
- concurrent requests of set_log_level + expiration now work as expected by @andijcr in #18438
- fixes possible stall in raft::state_machine_manager by @mmaslankaprv in #18637
Improvements
- Made electing a leader faster by @mmaslankaprv in #18625
- #17951 Schema Registry: Improve retry logic for delete_config and delete_subject_permanent by @BenPope in #18624
- #17951 Schema Registry: Improve tombstoning when deleting a subject by @BenPope in #18624
Full Changelog: v23.3.15...v23.3.16
v24.1.2
Features
- Re-adds the fetch_read_strategy cluster config property to select between polling and non-polling fetch implementations. Uses the non-polling fetch implementation by default. by @StephanDollberg in #18176
- #18163 rpk container start: now starts a Redpanda Console container connected with the cluster. by @r-vasquez in #18164
- rpk container now has a set of flags to specify ports for node to start on. by @r-vasquez in #18148
Bug Fixes
- Fix a bug validating WebAssembly when global constants are specific values that have the encoded byte 0x0B. by @rockwotj in #18108
- Fix a bug where an invalid buffer passed into the WebAssembly host from the guest could cause Redpanda to abort. by @rockwotj in #18234
- Fix a scenario where list_offset with a timestamp could return a lower offset than partition start after a trim-prefix command. This could lead to consumers being stuck with an out-of-range-offset exception if they began consuming from an offset below the one which was used in the trim-prefix command. by @nvartolomei in #18281
- #18100 Better mapping of REST error codes by @mmaslankaprv in #18102
- #18158 Fix issuing timequeries to cloud storage if remote.read is not enabled. by @WillemKauf in #18159
- #18240 Fixes a crash caused by a race between a client disconnect and a segment reader in tiered storage. by @andrwng in #18241
- #18317 Fixes expiration for transactions that have begun and not produced any data batches. This prevents a stalling LSO. by @bharathv in #18324
- PR #18051 [v24.1.x] Address oversized allocs across kafka API and schema registry by @oleiman
- PR #18125 [v24.1.x] cluster_recovery_backend_test: fix unsafe iteration by @andrwng
- PR #18141 [v24.1.x] Fixes for wait_ms cpu profiler mode by @StephanDollberg
- PR #18216 [v24.1.x] controller_backend: prevent busy-looping when removing partitions by @ztlpn
- PR #18222 [v24.1.x] tx/tm_stm: fix unboundedness of _pid_tx_id by @bharathv
- PR #18328 [v24.1.x] Change information stored in _topic_node_index to avoid oversized alloc by @ballard26
- PR #18406 [v24.1.x] Fix some concurrent memory access problems in partition balancer by @ztlpn
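To make the list_offset fix in the list above concrete, here is a minimal Python sketch of a timestamp lookup that never returns an offset below the partition start. It is an illustration under stated assumptions rather than the broker's code: `timequery` is a hypothetical function, and `records` is an in-memory stand-in for the log, given as (offset, timestamp) pairs sorted by offset.

```python
from typing import Optional, Sequence, Tuple


def timequery(
    records: Sequence[Tuple[int, int]], start_offset: int, ts: int
) -> Optional[int]:
    """Return the first offset >= start_offset whose timestamp is >= ts.

    Data below `start_offset` may still be physically present after a
    trim-prefix command, so it has to be skipped explicitly.
    """
    for offset, timestamp in records:
        if offset < start_offset:
            continue  # logically truncated; must not be exposed to consumers
        if timestamp >= ts:
            return offset
    return None
```

Without the `start_offset` check, the lookup could return an offset that the consumer can no longer fetch, which matches the out-of-range-offset symptom described in that entry.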
Improvements
- Improve cloud storage cache to prevent readers from being blocked during cache eviction. by @Lazin in #18134
- #18150 rpk container start: You can now select the subnet and gateway to create your 'redpanda' network. by @r-vasquez in #18151
- allow interpreting 'retention_duration' = -1 in a topic_manifest.json file as infinite time retention by @andijcr in #18243
- rpk container now starts the seed broker using the default listener ports. by @r-vasquez in #18148
- PR #18117 [v24.1.x] wasm/parser: better global support by @rockwotj
- PR #18128 [v24.1.x] c/balancer_backend: first initialize planner and then call plan by @mmaslankaprv
- PR #18194 [v24.1.x] configuration to enable delete retention for consumer offsets by @bharathv
- PR #18228 [v24.1.x] CORE-1752: cst: Downgrade error logs to debug by @abhijat
- PR #18269 [v24.1.x] [CORE-2581] cst: move chunk downloads to remote segment bg loop by @abhijat
- PR #18321 [v24.1.x] rpk: stop using args[0] in cloud cluster select by @r-vasquez
- PR #18318 [v24.1.x] offline_log_viewer: fix get_control_record_type by @bharathv
Full Changelog: v24.1.1...v24.1.2
v23.3.15
Bug Fixes
- Fix a bug where an invalid buffer passed into the WebAssembly host from the guest could cause Redpanda to abort. by @rockwotj in #18235
- Fixes expiration for transactions that have begun and not produced any data batches. This prevents a stalling LSO. by @bharathv in #18248
- #18237 Fixes a crash caused by a race between a client disconnect and a segment reader in tiered storage. by @andrwng in #18238
- PR #18223 [v23.3.x] tx/tm_stm: fix unboundedness of _pid_tx_id by @bharathv
Improvements
- allow interpreting 'retention_duration' = -1 in a topic_manifest.json file as infinite time retention by @andijcr in #18242
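The retention_duration handling above could look roughly like the following Python sketch. It is an assumption-laden illustration, not the actual manifest parser: the function name is hypothetical, the manifest is treated as a flat JSON object, and a missing key is treated the same as -1 purely for the sake of the example. The value's unit is left as whatever the manifest uses.

```python
import json
from typing import Optional


def parse_retention_duration(manifest_text: str) -> Optional[int]:
    """Return the retention duration, or None to mean infinite retention.

    Assumption: 'retention_duration' sits at the top level of the JSON
    document; -1 (or an absent key, in this sketch) means no time limit.
    """
    value = json.loads(manifest_text).get("retention_duration")
    if value is None or value == -1:
        return None
    return int(value)
```

Mapping -1 to a distinct "no limit" value keeps callers from accidentally comparing record ages against a negative duration.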
Full Changelog: v23.3.14...v23.3.15