Skip to content

Conversation

@nickyc975
Copy link
Contributor

As discussed in #975, this PR implemented the following mechenism:

  1. Add client_id parameter to put_start/put_end/put_revoke, and save the client_id in object metadata. This is needed because now we may have more than one client that is processing the same key. We must identify the valid client and prevent unexpected put_end/put_revoke.
  2. For a successful put_start, save the client_id and the current timestamp in object metadata. When another put_start on the same object arrived, if the object has been in processing state for more than 30s (maybe configurable), we discard existing replicas of the object and renew its client_id and timestamp.
  3. Discarded replicas will be put into a staging list, and be released after 10-20 minutes (maybe configurable).
  4. A new thread put_start_monitor_thread_ is added to monitor the ongoing Put operations. If an object is found stucking in processing state for more than 10-20 minutes (can be the same as timeout of discarded replicas), we remove it and release the space immediately.

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @nickyc975, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the robustness of Put operations within the store by introducing a comprehensive mechanism to handle and clean up potentially stuck or abandoned processing objects. By uniquely identifying clients, implementing timeouts for incomplete operations, and actively monitoring these states, the system can prevent resource leaks and ensure more reliable resource allocation, especially in scenarios with concurrent or failing client interactions.

Highlights

  • Client Identification for Put Operations: Introduced a client_id parameter to put_start, put_end, and put_revoke operations, and stored this ID in object metadata. This ensures that only the valid client can complete or revoke an ongoing Put operation, preventing conflicts when multiple clients might attempt to process the same key.
  • Timeout for Stuck Put Operations: Implemented a mechanism to detect and handle Put operations that get stuck in a processing state. If an object remains in processing for more than a configurable put_start_discard_timeout_sec (default 30 seconds) and has no completed replicas, its existing replicas are discarded, allowing a new put_start for the same key to proceed.
  • Staging and Delayed Release of Discarded Replicas: Discarded replicas are now placed into a staging list and their allocated space is released after a configurable put_start_release_timeout_sec (default 10-20 minutes). This allows for monitoring and accounting of resources temporarily held by failed or superseded operations.
  • New Put Start Monitor Thread: A dedicated put_start_monitor_thread_ has been added to continuously monitor ongoing Put operations. This thread is responsible for identifying and cleaning up objects that exceed the defined processing timeouts, ensuring timely resource recovery.
  • Enhanced Metrics for Put Operations: New metrics have been introduced to track the number of discarded PutStart operations, the number of released PutStart operations, and the total size of memory replicas currently in the discarded staging area, providing better visibility into resource management.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a mechanism to clean up timed-out Put operations, which is a great improvement for the store's robustness. The changes are quite extensive, involving API modifications to include client_id, new configuration options for timeouts, and a new monitor thread. The logic for handling expired put operations, including staging and releasing discarded replicas, seems well-thought-out and correctly implemented. The addition of a dedicated test for the new monitor thread is also a valuable contribution.

I have a few suggestions to improve code clarity and maintainability by using more standard C++ idioms.

Comment on lines 560 to 562
bool isExpired(const std::chrono::steady_clock::time_point& now) {
return ttl_ <= now;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The isExpired method does not modify the state of the DiscardedReplicas object. It should be marked as const to reflect this. This is good practice and allows the method to be called on const objects, which can enable further optimizations and cleaner code, for example when using it with standard library algorithms like std::list::remove_if.

        bool isExpired(const std::chrono::steady_clock::time_point& now) const {
            return ttl_ <= now;
        }

Comment on lines 1139 to 1146
auto it = discarded_replicas_.begin();
while (it != discarded_replicas_.end()) {
if (it->isExpired(now)) {
it = discarded_replicas_.erase(it);
} else {
it++;
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This while loop for erasing expired replicas from the list can be simplified by using the std::list::remove_if member function. This makes the code more concise and expressive of its intent. Note that this requires the isExpired method to be const-qualified.

            discarded_replicas_.remove_if([
                &now
            ](const DiscardedReplicas& item) { return item.isExpired(now); });

Signed-off-by: Chen Jinlong <chenjinlong.cjl@alibaba-inc.com>
… by the same client

Signed-off-by: Chen Jinlong <chenjinlong.cjl@alibaba-inc.com>
Signed-off-by: Chen Jinlong <chenjinlong.cjl@alibaba-inc.com>
Signed-off-by: Chen Jinlong <chenjinlong.cjl@alibaba-inc.com>
Signed-off-by: Chen Jinlong <chenjinlong.cjl@alibaba-inc.com>
@nickyc975 nickyc975 force-pushed the jinlong/fix-put-start-upstream branch from 2c393b4 to 80c1684 Compare October 31, 2025 08:58
@ykwd ykwd self-requested a review November 3, 2025 03:00
@ykwd ykwd self-assigned this Nov 3, 2025
Copy link
Collaborator

@ykwd ykwd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the great work! I have a few suggestions:

  1. processing_metadata only needs to be an unordered_set<string> rather than a holding a std::shared_ptr<ObjectMetadata> for each key. Since we can retrieve the metadata directly from the key, there is no need to change metadata to shared_ptr. Otherwise, it would require modifying many places, cause a lot of conflicts with other PRs, and there is no real necessity to use shared_ptr here.

  2. Can we move the logic of the PutStartMonitor thread into BatchEvict (specifically in the first pass of batch eviction)? In practice, we don’t need to reclaim memory immediately when the put-start timer expires, nor do we need to scan all shards every second (which also requires locking each shard). Instead, we can simply check this as part of the eviction process.


auto& metadata = accessor.Get();
if (client_id != metadata.client_id) {
LOG(ERROR) << "key=" << key << " putEnd client=" << client_id
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This error log is a little bit misleading. Here the parameter is correct. The problem is that the previous put start from this client has been aborted by master


auto& metadata = accessor.Get();
if (client_id != metadata.client_id) {
LOG(ERROR) << "key=" << key << " putRevoke client=" << client_id
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar issue with put start error log

@ykwd
Copy link
Collaborator

ykwd commented Nov 3, 2025

Could you also add this new mechanism to the documents? Thanks

@nickyc975
Copy link
Contributor Author

Hello, @ykwd:

  1. Can we move the logic of the PutStartMonitor thread into BatchEvict (specifically in the first pass of batch eviction)? In practice, we don’t need to reclaim memory immediately when the put-start timer expires, nor do we need to scan all shards every second (which also requires locking each shard). Instead, we can simply check this as part of the eviction process.

BatchEvict itself has been too complicated, I don't think adding more logic into it is a good idea. I prefer wrapping the PutStartMonitor stuff into a single function and call it in EvictionThreadFunc, right before calling BatchEvict. If PutStartMonitor released enough space, we could just skip BatchEvict.

@ykwd
Copy link
Collaborator

ykwd commented Nov 4, 2025

Hello, @ykwd:

  1. Can we move the logic of the PutStartMonitor thread into BatchEvict (specifically in the first pass of batch eviction)? In practice, we don’t need to reclaim memory immediately when the put-start timer expires, nor do we need to scan all shards every second (which also requires locking each shard). Instead, we can simply check this as part of the eviction process.

BatchEvict itself has been too complicated, I don't think adding more logic into it is a good idea. I prefer wrapping the PutStartMonitor stuff into a single function and call it in EvictionThreadFunc, right before calling BatchEvict. If PutStartMonitor released enough space, we could just skip BatchEvict.

I think it’s reasonable to wrap this logic into separate functions. However, a put_start timeout is a very rare case, and in the current architecture, each check requires acquiring locks for all shards. That adds a relatively expensive overhead for something that almost never happens.

Would it make sense to wrap it as one or several standalone functions and call it at an appropriate point during batch eviction instead?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants