
[Hitless Upgrades] React to maintenance events #3345 #3354

Draft
wants to merge 3 commits into
base: main
from

Conversation

tishun
Collaborator

@tishun tishun commented Jul 10, 2025

Draft for introducing maintenance events to Lettuce

Make sure that:

  • You have read the contribution guidelines.
  • You have created a feature request first to discuss your contribution intent. Please reference the feature request ticket number in the pull request.
  • You applied code formatting rules using the mvn formatter:format target. Don’t submit any formatting related changes.
  • You submit test cases (unit or integration tests) that back your changes.

* v0.1

* Simple reconnect now working

* Bind address from message is now considered

* Self-register the handler

* Format code

* Filter push messages in a more stable way

* (very hacky) Relax command expire timers globally

* Configure if timeout relaxing should be applied

* Proper way to close channel

* Configure the timeout relaxing

* Sequential handover implemented

* Did not address formatting

* Prolong the rebind window for relaxed timeouts

* PubSub no longer required; CommandExpiryWriter is now channel aware; Polishing

* Use the new MOVING push message from the RE server

* Unit test was not chaining delegates in the same way that the RedisClient/RedisClusterClient was

* Fix REBIND message validation

* Fixed the expiry mechanism

* Polishing

* Fix NPE.

Seems like AttributeMap.attr is not accurate and actually returns null, causing some unit test failures.

* Add support for MIGRATING/MIGRATED message handling in command expiry

This commit adds the ability to listen for MIGRATING and MIGRATED messages
and trigger extended command expiry timeouts during Redis shard migration.

Key changes:
- Enhanced RebindAwareConnectionWatchdog to detect MIGRATING/MIGRATED messages
- RebindAwareExpiryWriter now triggers timeout relaxation whenever a MIGRATING message is received

This feature allows commands to have relaxed timeouts during shard migration
operations, preventing unnecessary timeouts when Redis is temporarily busy
with migration tasks.
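
The relaxation described above can be pictured with a minimal, JDK-only sketch. It is not the code from this PR: `RelaxedTimeoutPolicy`, `onPushMessage`, and `effectiveTimeout` are hypothetical names, and treating the extended timeout as "normal timeout plus a configured relaxation" is an assumption made for illustration.

```java
import java.time.Duration;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch; these are not the classes introduced by this PR.
class RelaxedTimeoutPolicy {

    private final Duration normal;
    private final Duration relaxation;
    private final AtomicBoolean relaxed = new AtomicBoolean(false);

    RelaxedTimeoutPolicy(Duration normal, Duration relaxation) {
        this.normal = normal;
        this.relaxation = relaxation;
    }

    // Called with the type of each push message; unrelated types are ignored.
    void onPushMessage(String type) {
        if ("MIGRATING".equals(type)) {
            relaxed.set(true);   // shard migration started: extend timeouts
        } else if ("MIGRATED".equals(type)) {
            relaxed.set(false);  // migration finished: back to the normal timeout
        }
    }

    // Timeout to apply to a command issued right now.
    // Assumption: the extended timeout is normal + relaxation.
    Duration effectiveTimeout() {
        return relaxed.get() ? normal.plus(relaxation) : normal;
    }
}
```

A command expiry component could consult `effectiveTimeout()` when scheduling each command's deadline, so commands issued while a migration is in progress get the longer window.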

* Formatting

* Fix: Disabling relaxTimeouts after an upgrade can interfere with an ongoing relaxation from a re-bind

* Additional fix for timeout relaxing disabled

* Fix push message listener registered multiple times after rebind.

* Fix: Report correct command timeout when relaxTimeout is configured

* Disable relaxedTimeout after configured grace period

- Introduce a delay before disabling relaxedTimeout
- Grace period duration is provided via push notification
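
One way to picture the grace period is a deadline-based sketch like the one below. It is an illustration under stated assumptions, not this PR's implementation: `GracefulRelaxation` and its methods are hypothetical names, and the injected millisecond clock exists only to keep the example deterministic.

```java
import java.time.Duration;
import java.util.function.LongSupplier;

// Hypothetical sketch; names are illustrative and not taken from this PR.
class GracefulRelaxation {

    private static final long NOT_RELAXED = Long.MIN_VALUE;
    private static final long OPEN_ENDED = Long.MAX_VALUE;

    private final LongSupplier clockMillis;
    private volatile long relaxedUntilMillis = NOT_RELAXED;

    GracefulRelaxation(LongSupplier clockMillis) {
        this.clockMillis = clockMillis;
    }

    // Maintenance started: stay relaxed until an end notification arrives.
    void onMaintenanceStarted() {
        relaxedUntilMillis = OPEN_ENDED;
    }

    // Maintenance ended: stay relaxed for the grace period carried by the notification.
    void onMaintenanceEnded(Duration grace) {
        relaxedUntilMillis = clockMillis.getAsLong() + grace.toMillis();
    }

    // True while timeouts should still be relaxed.
    boolean isRelaxed() {
        return clockMillis.getAsLong() < relaxedUntilMillis;
    }
}
```

Because a new start notification simply overwrites the deadline, a grace period left over from an earlier event cannot switch relaxation off in the middle of a later maintenance window. In production code the clock could be `System::currentTimeMillis`.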

* Code clean up
  - Remove reading from pub/sub channel and rely only on push notifications

* Add FAILING_OVER/FAILED_OVER

* Polishing: Rename components to use the word 'maintenance'

---------

Co-authored-by: Igor Malinovskiy <u.glide@gmail.com>
Co-authored-by: ggivo <ivo.gaydazhiev@redis.com>
@tishun tishun added this to the 7.0.0.RELEASE milestone Jul 10, 2025
@tishun tishun added the type: feature A new feature label Jul 10, 2025
@redis redis deleted a comment from dengliming Jul 11, 2025
 (#3356)

* Unit tests for the maintenance-aware classes

* Did not format properly

* Proper license
Contributor

@ggivo ggivo left a comment

LGTM

@kiryazovi-redis kiryazovi-redis force-pushed the feature/maintenance-events branch from 7f3c59b to a133028 Compare July 28, 2025 10:04
3 participants