Skip to content

Conversation

@tangledbytes
Copy link
Member

@tangledbytes tangledbytes commented Sep 11, 2025

Describe the Problem

The backingstore deletion was not working because of accidental removal of some code

Explain the Changes

This PR simply adds back the deleted code which marks the 'deletingnodesready_to_be_deleted`.

Issues: Fixed #xxx / Gap #xxx

Testing Instructions:

  • Doc added/updated
  • Tests added

Summary by CodeRabbit

  • Bug Fixes
    • Nodes marked for deletion are now reliably flagged for final removal immediately after the wipe stage, preventing stalled or lingering items.
    • Improves reliability and consistency of cleanup workflows, reducing residual data after node removal.
    • Shortens the end-to-end deletion process with no action required by users.
    • No changes to public APIs or user-facing behavior otherwise.

@coderabbitai
Copy link

coderabbitai bot commented Sep 11, 2025

Walkthrough

In the WIPING stage of node activity, when item.node.deleting is truthy the monitor now sets item.ready_to_be_deleted = true, so the node becomes visible to the existing removal collector. No public/exported signatures were changed.

Changes

Cohort / File(s) Summary
Node deletion readiness flagging
src/server/node_services/nodes_monitor.js
In the WIPING stage, if item.node.deleting is truthy, set item.ready_to_be_deleted = true so the existing deletion collector will pick it up for full removal.

Sequence Diagram(s)

sequenceDiagram
  participant NM as NodeMonitor
  participant N as Node (item)
  participant RC as RemovalCollector

  rect rgba(240,240,255,0.6)
    Note over NM,N: WIPING stage handling
    NM->>N: Complete WIPING stage for item
    alt item.node.deleting == true
      NM->>N: Set item.ready_to_be_deleted = true
      Note right of NM: Marks item for deletion collector
    else item.node.deleting != true
      NM-->>N: No deletion flag set
    end
  end

  rect rgba(240,255,240,0.6)
    Note over RC,N: Existing deletion flow
    RC->>RC: Collect items with ready_to_be_deleted == true
    RC->>N: Trigger full removal processing
  end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Pre-merge checks (3 passed)

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title “Fix backingstore deletion” directly captures the primary intent of the changeset, which is to restore the code that marks deleting nodes as ready_to_be_deleted so that backingstore deletion functions correctly, and it does so concisely and without extraneous detail.
Docstring Coverage ✅ Passed No functions found in the changes. Docstring coverage check skipped.

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3f25a0f and fc14fb1.

📒 Files selected for processing (1)
  • src/server/node_services/nodes_monitor.js (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/server/node_services/nodes_monitor.js
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: ceph-s3-tests / Ceph S3 Tests
✨ Finishing touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
src/server/node_services/nodes_monitor.js (2)

1907-1911: Avoid futile retries for unsupported internal nodes.

_update_deleted_nodes() throws for is_internal_node, so marking these as ready-to-delete will cause endless retries/noise. Gate the flag:

-            if (item.node.deleting) {
+            if (item.node.deleting && !item.node.is_internal_node) {
                 // We mark it in order to remove the agent fully (process and tokens etc)
                 // Only after successfully completing the removal we assign the deleted date
                 item.ready_to_be_deleted = true;
             }

1907-1913: Eagerly queue a store update to ensure prompt processing.

Although _update_data_activity_schedule() will add the item to _set_need_update, add it here to avoid any race if later steps error out.

             if (item.node.deleting) {
                 // We mark it in order to remove the agent fully (process and tokens etc)
                 // Only after successfully completing the removal we assign the deleted date
                 item.ready_to_be_deleted = true;
+                this._set_need_update.add(item);
             }
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f187594 and 3f25a0f.

📒 Files selected for processing (1)
  • src/server/node_services/nodes_monitor.js (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: mint-nc-tests / Mint NC Tests
  • GitHub Check: warp-nc-tests / Warp NC Tests
  • GitHub Check: mint-tests / Mint Tests
  • GitHub Check: ceph-s3-tests / Ceph S3 Tests
  • GitHub Check: warp-tests / Warp Tests
  • GitHub Check: run-unit-tests-postgres / Unit Tests with Postgres
  • GitHub Check: ceph-nsfs-s3-tests / NSFS Ceph S3 Tests
  • GitHub Check: run-nc-unit-tests / Non Containerized Unit Tests
  • GitHub Check: run-unit-tests / Unit Tests
  • GitHub Check: run-sanity-ssl-tests / Sanity SSL Tests
  • GitHub Check: run-sanity-tests / Sanity Tests
  • GitHub Check: run-jest-unit-tests
🔇 Additional comments (1)
src/server/node_services/nodes_monitor.js (1)

1907-1911: Deletion flow restored at the correct state transition. LGTM.

Setting item.ready_to_be_deleted = true right after STAGE_WIPING completes (and only when node.deleting is set) reactivates the intended deletion pipeline via _update_nodes_store()_update_deleted_nodes(). This should fix the regression.

Please run a quick e2e: delete a cloud/mongo backingstore node, wait for WIPING completion, and assert that _update_deleted_nodes() is invoked (logs show removal attempt) and the node is eventually removed from DB.

Signed-off-by: Utkarsh Srivastava <srivastavautkarsh8097@gmail.com>
@tangledbytes tangledbytes force-pushed the utkarsh/fix/backinstore-deletion branch from 3f25a0f to fc14fb1 Compare September 11, 2025 10:23
@tangledbytes tangledbytes merged commit 7521471 into noobaa:master Sep 11, 2025
29 of 30 checks passed
nimrod-becker added a commit that referenced this pull request Sep 15, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants