Fix backingstore deletion #9215
Walkthrough
In the WIPING stage of node activity, when
Changes
Sequence Diagram(s)
sequenceDiagram
participant NM as NodeMonitor
participant N as Node (item)
participant RC as RemovalCollector
rect rgba(240,240,255,0.6)
Note over NM,N: WIPING stage handling
NM->>N: Complete WIPING stage for item
alt item.node.deleting == true
NM->>N: Set item.ready_to_be_deleted = true
Note right of NM: Marks item for deletion collector
else item.node.deleting != true
NM-->>N: No deletion flag set
end
end
rect rgba(240,255,240,0.6)
Note over RC,N: Existing deletion flow
RC->>RC: Collect items with ready_to_be_deleted == true
RC->>N: Trigger full removal processing
end
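The flow in the diagram can be sketched in a few lines of JavaScript. This is a minimal illustration, not the actual `nodes_monitor.js` code: `completeWipingStage` and `collectReadyForRemoval` are hypothetical helper names, and only `item.node.deleting` and `item.ready_to_be_deleted` come from the diagram.

```javascript
// Hypothetical helper: called once the WIPING stage finishes for an item.
// It only sets the flag when the node is actually being deleted.
function completeWipingStage(item) {
    if (item.node.deleting === true) {
        item.ready_to_be_deleted = true;
    }
    return item;
}

// Hypothetical helper: stands in for the removal collector, which picks up
// every item flagged as ready to be deleted.
function collectReadyForRemoval(items) {
    return items.filter(item => item.ready_to_be_deleted === true);
}

const items = [
    { node: { name: 'node-a', deleting: true } },
    { node: { name: 'node-b', deleting: false } },
];

items.forEach(completeWipingStage);
const toRemove = collectReadyForRemoval(items);
console.log(toRemove.map(item => item.node.name)); // [ 'node-a' ]
```

Without the flag set in the first step, the collector would never see the item, which matches the regression this PR describes.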
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Pre-merge checks: ✅ 3 passed
Actionable comments posted: 0
🧹 Nitpick comments (2)
src/server/node_services/nodes_monitor.js (2)
1907-1911: Avoid futile retries for unsupported internal nodes.
`_update_deleted_nodes()` throws for `is_internal_node`, so marking these as ready-to-delete will cause endless retries/noise. Gate the flag:
    - if (item.node.deleting) {
    + if (item.node.deleting && !item.node.is_internal_node) {
          // We mark it in order to remove the agent fully (process and tokens etc)
          // Only after successfully completing the removal we assign the deleted date
          item.ready_to_be_deleted = true;
      }
1907-1913: Eagerly queue a store update to ensure prompt processing.
Although `_update_data_activity_schedule()` will add the item to `_set_need_update`, add it here to avoid any race if later steps error out.
      if (item.node.deleting) {
          // We mark it in order to remove the agent fully (process and tokens etc)
          // Only after successfully completing the removal we assign the deleted date
          item.ready_to_be_deleted = true;
    +     this._set_need_update.add(item);
      }
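Taken together, the two nitpicks above amount to a guarded flag plus an explicit enqueue. The sketch below is one possible way to combine them, assuming `_set_need_update` is a `Set` (as the `.add(item)` call in the suggestion implies). `markReadyToBeDeleted` is a hypothetical helper for illustration and not a function in the reviewed file.

```javascript
// Hypothetical helper combining both review suggestions.
// Returns true when the item was flagged and queued, false otherwise.
function markReadyToBeDeleted(item, setNeedUpdate) {
    // Skip nodes that are not being deleted, and internal nodes, for which
    // the review says _update_deleted_nodes() throws.
    if (!item.node.deleting || item.node.is_internal_node) return false;
    item.ready_to_be_deleted = true;
    // Set.prototype.add is idempotent, so queueing the same item twice is harmless.
    setNeedUpdate.add(item);
    return true;
}

const setNeedUpdate = new Set(); // stands in for this._set_need_update
const cloudNode = { node: { deleting: true, is_internal_node: false } };
const internalNode = { node: { deleting: true, is_internal_node: true } };

markReadyToBeDeleted(cloudNode, setNeedUpdate);
markReadyToBeDeleted(cloudNode, setNeedUpdate); // second call does not duplicate
markReadyToBeDeleted(internalNode, setNeedUpdate);

console.log(setNeedUpdate.size); // 1
console.log(internalNode.ready_to_be_deleted); // undefined
```

Because a `Set` ignores duplicate entries, the eager `add` does not conflict with a later add from `_update_data_activity_schedule()`.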
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/server/node_services/nodes_monitor.js (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
- GitHub Check: mint-nc-tests / Mint NC Tests
- GitHub Check: warp-nc-tests / Warp NC Tests
- GitHub Check: mint-tests / Mint Tests
- GitHub Check: ceph-s3-tests / Ceph S3 Tests
- GitHub Check: warp-tests / Warp Tests
- GitHub Check: run-unit-tests-postgres / Unit Tests with Postgres
- GitHub Check: ceph-nsfs-s3-tests / NSFS Ceph S3 Tests
- GitHub Check: run-nc-unit-tests / Non Containerized Unit Tests
- GitHub Check: run-unit-tests / Unit Tests
- GitHub Check: run-sanity-ssl-tests / Sanity SSL Tests
- GitHub Check: run-sanity-tests / Sanity Tests
- GitHub Check: run-jest-unit-tests
🔇 Additional comments (1)
src/server/node_services/nodes_monitor.js (1)
1907-1911: Deletion flow restored at the correct state transition. LGTM.
Setting `item.ready_to_be_deleted = true` right after `STAGE_WIPING` completes (and only when `node.deleting` is set) reactivates the intended deletion pipeline via `_update_nodes_store()` → `_update_deleted_nodes()`. This should fix the regression.
Please run a quick e2e: delete a cloud/mongo backingstore node, wait for WIPING completion, and assert that `_update_deleted_nodes()` is invoked (logs show removal attempt) and the node is eventually removed from DB.
Signed-off-by: Utkarsh Srivastava <srivastavautkarsh8097@gmail.com>
3f25a0f to fc14fb1
Describe the Problem
Backingstore deletion was not working because some code had been accidentally removed.
Explain the Changes
This PR simply adds back the deleted code, which marks `deleting` nodes as `ready_to_be_deleted`.
Issues: Fixed #xxx / Gap #xxx
Testing Instructions:
Summary by CodeRabbit