@nk-ag
Contributor

@nk-ag nk-ag commented Aug 16, 2025

No description provided.

@coderabbitai
Contributor

coderabbitai bot commented Aug 16, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough

Summary by CodeRabbit

  • New Features
    • Workflows with no subsequent steps now complete successfully and are marked as finished.
  • Bug Fixes
    • Improved state validation to prevent missing/invalid ID errors.
    • More reliable progression to next steps by standardizing parent-state references.
    • Fixed occasional failures when fetching secrets by aligning ID handling.
  • Refactor
    • Streamlined how state updates are persisted to improve consistency and reduce edge-case errors.
    • Simplified background task inputs for better stability without changing behavior.

Walkthrough

State handling now mutates the in-memory state and calls .save() instead of issuing DB set operations. The parents map stores PydanticObjectId values, and the ObjectId conversions were removed. create_next_state builds an expanded parents map that includes the current state.id and marks terminal nodes SUCCESS. Controllers now pass raw state IDs to State.get.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Controller: executed state handling (state-manager/app/controller/executed_state.py) | Add guard for missing state/state.id; on non-empty outputs mutate state.outputs, state.status, state.parents and await state.save(); on empty outputs persist status/outputs/parents using state.id; schedule background task unchanged. |
| Model: state parents type (state-manager/app/models/db/state.py) | Remove bson.ObjectId import, add beanie.PydanticObjectId; change parents: dict[str, ObjectId] to parents: dict[str, PydanticObjectId]. |
| Task: next-state creation & resolution (state-manager/app/tasks/create_next_state.py) | Assert state and state.id; if no next_nodes set state SUCCESS and return; build `parents = state.parents …` |
| Controller: secrets retrieval (state-manager/app/controller/get_secrets.py) | Remove ObjectId conversion and call State.get(state_id) directly. |

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant ExecCtrl as ExecutedStateController
  participant DB as State DB
  participant BG as BackgroundTaskScheduler
  participant Creator as CreateNextState

  Client->>ExecCtrl: POST executed_state (with outputs)
  ExecCtrl->>DB: load State (ensure state and state.id)
  alt outputs non-empty
    ExecCtrl->>DB: mutate state (outputs/status/parents) and await state.save()
  else outputs empty
    ExecCtrl->>DB: persist empty outputs/status/parents using state.id
  end
  ExecCtrl->>BG: schedule CreateNextState(state_id)
  ExecCtrl-->>Client: 202 Accepted

  BG->>Creator: run(state_id)
  Creator->>DB: fetch State
  alt has next nodes
    Creator->>Creator: parents = state.parents ∪ {state.identifier: state.id}
    loop each next node
      Creator->>DB: create new State with parents
    end
    Creator->>DB: set original state SUCCESS
  else no next nodes
    Creator->>DB: set original state SUCCESS and return
  end
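
The CreateNextState half of the diagram can be sketched in plain Python. This is a minimal illustration, not the PR's code: `FakeState`, `Status`, `create_next_states`, and the `CREATED` default are hypothetical stand-ins for the Beanie `State` document and `StateStatusEnum`, and no database is involved.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(str, Enum):
    """Stand-in for StateStatusEnum; CREATED is an assumed member."""
    CREATED = "CREATED"
    EXECUTED = "EXECUTED"
    SUCCESS = "SUCCESS"


@dataclass
class FakeState:
    """Hypothetical stand-in for the Beanie State document."""
    id: str
    identifier: str
    status: Status = Status.CREATED
    parents: dict[str, str] = field(default_factory=dict)


def create_next_states(state: FakeState, next_nodes: list[str]) -> list[FakeState]:
    """Build child states for `state` and mark `state` SUCCESS in both branches."""
    if not next_nodes:
        # Terminal node: nothing to schedule, so finish immediately.
        state.status = Status.SUCCESS
        return []

    # Children see every ancestor plus the state that just executed.
    parents = state.parents | {state.identifier: state.id}
    children = [
        FakeState(id=f"{state.id}/{node}", identifier=node, parents=dict(parents))
        for node in next_nodes
    ]
    state.status = Status.SUCCESS
    return children
```

Each child gets its own copy of the expanded map, so a later change to one child's lineage does not leak into its siblings.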

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Poem

I twitch my whiskers at the state we save,
From memory to disk — a neat little cave.
When no next hops wait, we declare "SUCCESS" with glee,
Parents map aligned, IDs snug as can be.
A rabbit's small cheer for code that's tidy and free. 🥕

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 3b50c84 and 201de0a.

📒 Files selected for processing (1)
  • state-manager/app/tasks/create_next_state.py (4 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
state-manager/app/tasks/create_next_state.py (3)
state-manager/app/models/state_status_enum.py (1)
  • StateStatusEnum (4-14)
state-manager/app/models/db/state.py (1)
  • State (8-18)
state-manager/app/tasks/verify_graph.py (1)
  • verify_node_identifiers (29-55)
🔇 Additional comments (5)
state-manager/app/tasks/create_next_state.py (5)

15-16: Good hardening: replaced assert with explicit runtime check

Using an explicit check avoids asserts being stripped with -O and makes failures clearer.
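
For illustration, an explicit guard of this kind might look like the sketch below. The helper name `require_state_id` and the choice of `ValueError` are assumptions made for the example; the PR's actual check is not shown in this thread.

```python
from types import SimpleNamespace


def require_state_id(state):
    """Return state.id, raising if the state or its id is missing."""
    # An `assert state and state.id` here would be skipped under `python -O`;
    # an ordinary `if` plus `raise` always runs.
    if state is None or getattr(state, "id", None) is None:
        raise ValueError("state is missing or has no id")
    return state.id


# Usage with a throwaway object standing in for a loaded State document.
state_id = require_state_id(SimpleNamespace(id="abc123"))
```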


40-42: Terminal node handling changed to SUCCESS + early return — confirm lifecycle alignment

Marking leaf nodes SUCCESS and returning is reasonable. Please verify this aligns with upstream expectations (e.g., whether controllers or monitors rely on EXECUTED vs SUCCESS at this point) and won’t cause double-finalization in any reentrant paths.


104-114: Propagating the expanded parents map to child states — aligns with new typing

Creating child states with parents=parents ensures the current state is included and types align with PydanticObjectId.


118-119: Finalization to SUCCESS after enqueuing next nodes — OK

Assuming create_next_state is only invoked post-execution, setting the current state to SUCCESS here is consistent with the updated lifecycle.


61-103: Potential multi-parent (fan-in) resolution gap

Inputs are resolved against parents = state.parents | {state.identifier: state.id}. This union contains ancestors + current node, but not sibling nodes that may also be required by a downstream node. If a next node’s template references outputs from multiple immediate predecessors (siblings), this will raise “Parent identifier not found” here and mark the current state ERRORED.

Confirm whether the graph semantics forbid multi-parent joins in input placeholders, or if there’s a separate coordinator that triggers creation only when all required parents have completed and can supply a combined parents map. If joins are expected, we’ll need a different strategy (e.g., defer creation until all inputs resolvable, or compute a merged parents set across ready parents).
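
A small sketch of the gap, using plain dicts and string IDs in place of State documents. The graph shape, the helper `resolve_parent`, and the exact error text are hypothetical; only the union expression comes from the review.

```python
def resolve_parent(parents: dict[str, str], identifier: str) -> str:
    """Look up the state id recorded for a parent identifier."""
    if identifier not in parents:
        raise KeyError(f"Parent identifier not found: {identifier}")
    return parents[identifier]


# Hypothetical graph: root -> a, root -> b, and both a and b feed `join`.
# When `a` finishes, the map handed to `join` holds a's ancestors plus a itself.
parents_from_a = {"root": "id-root"} | {"a": "id-a"}

resolve_parent(parents_from_a, "a")  # fine: a is in its own lineage

try:
    resolve_parent(parents_from_a, "b")  # sibling b is not an ancestor of a
except KeyError as exc:
    missing = str(exc)
```

Under these assumptions, any placeholder in `join` that refers to `b` cannot be resolved from `a`'s map alone, which is the failure mode described above.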

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Summary of Changes

Hello @nk-ag, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses several bugs within the state management system. It refines how state objects are updated, correctly handles scenarios where a state has no subsequent nodes by marking it as successful, and improves the tracking of parent states to ensure accurate lineage for new state creations. Additionally, it updates type hints for Object IDs to align with the beanie ODM.

Highlights

  • State Object Update Mechanism: Changed the state update logic in executed_state.py from using State.find_one().set() to directly modifying the state object properties (outputs, status, parents) and then calling state.save(). This ensures a more direct and potentially more reliable update of the state document.
  • Terminal State Handling: Modified create_next_state.py to gracefully handle cases where a node template has no next nodes. Instead of raising an exception, the current state's status is now set to SUCCESS, and the function returns, indicating a successful completion of a state branch.
  • Parent State Tracking: Improved the parent tracking logic in create_next_state.py by explicitly including the current state's ID in its own parents dictionary before passing it down to subsequent state creations. This ensures that the parents dictionary accurately reflects the lineage when new states are generated.
  • Type Hinting for Object IDs: Updated the State model in state.py to import PydanticObjectId from beanie and use it as the type hint for the parents dictionary values, replacing the generic ObjectId. This aligns with the beanie ODM's specific type for MongoDB Object IDs.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces several important bug fixes and improvements. The fix in executed_state.py to update the state object before passing it to a background task is crucial for correctness. The change in create_next_state.py to correctly handle terminal graph nodes as successful states is also a great improvement. The refactoring to use PydanticObjectId in the State model aligns with beanie best practices.

However, I've found a couple of issues related to the transition to PydanticObjectId. The code in a few places still attempts to use bson.ObjectId where a PydanticObjectId is now expected. I've left specific comments on the affected lines in the diff.

Additionally, please note that the change to PydanticObjectId in state.py likely introduces a bug in create_next_state.py on line 87: dependent_state = await State.get(ObjectId(parent_id)). Since parent_id will now be a PydanticObjectId, ObjectId(parent_id) will likely fail. This should probably be changed to await State.get(parent_id). As this line is not in the current diff, I am mentioning it here for your consideration.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 10b8357 and 6304210.

📒 Files selected for processing (3)
  • state-manager/app/controller/executed_state.py (1 hunks)
  • state-manager/app/models/db/state.py (2 hunks)
  • state-manager/app/tasks/create_next_state.py (3 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (3)
state-manager/app/models/db/state.py (2)
state-manager/app/models/enqueue_response.py (1)
  • StateModel (6-11)
state-manager/app/models/db/base.py (1)
  • BaseDatabaseModel (7-15)
state-manager/app/controller/executed_state.py (4)
state-manager/app/models/state_status_enum.py (1)
  • StateStatusEnum (4-14)
state-manager/app/routes.py (1)
  • executed_state_route (84-94)
state-manager/app/models/executed_models.py (1)
  • ExecutedRequestModel (5-6)
state-manager/app/controller/errored_state.py (1)
  • errored_state (11-34)
state-manager/app/tasks/create_next_state.py (1)
state-manager/app/models/state_status_enum.py (1)
  • StateStatusEnum (4-14)
🔇 Additional comments (2)
state-manager/app/tasks/create_next_state.py (2)

40-42: Good fix: short-circuit leaf nodes by marking SUCCESS.

Early return on nodes without successors prevents unnecessary processing and double-finalization later. Looks correct.


113-114: LGTM: new_state inherits the expanded parents mapping.

Passing the merged parents to the child state ensures placeholders can resolve fields from the current state too.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🔭 Outside diff range comments (6)
state-manager/app/controller/get_secrets.py (2)

5-5: Fix pipeline failure: remove unused bson.ObjectId import

Ruff flagged this as unused. Removing it will fix the lint failure.

Apply this diff:

-from bson import ObjectId

28-35: Return proper HTTP status codes instead of ValueError

Raising ValueError bubbles up as 500s unless an exception handler is installed. Use HTTPException with appropriate status codes (404 for not found) to match executed_state and improve API ergonomics.

Apply this diff to use HTTPException:

@@
-from app.models.db.graph_template_model import GraphTemplate
-from bson import ObjectId
+from app.models.db.graph_template_model import GraphTemplate
+from fastapi import HTTPException, status
@@
-        if not state:
-            logger.error(f"State {state_id} not found", x_exosphere_request_id=x_exosphere_request_id)
-            raise ValueError(f"State {state_id} not found")
+        if not state:
+            logger.error(f"State {state_id} not found", x_exosphere_request_id=x_exosphere_request_id)
+            raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail=f"State {state_id} not found")
@@
-        if state.namespace_name != namespace_name:
-            logger.error(f"State {state_id} does not belong to namespace {namespace_name}", x_exosphere_request_id=x_exosphere_request_id)
-            raise ValueError(f"State {state_id} does not belong to namespace {namespace_name}")
+        if state.namespace_name != namespace_name:
+            logger.error(f"State {state_id} does not belong to namespace {namespace_name}", x_exosphere_request_id=x_exosphere_request_id)
+            raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail=f"State {state_id} not found in namespace {namespace_name}")
@@
-        if not graph_template:
-            logger.error(f"Graph template {state.graph_name} not found in namespace {namespace_name}", x_exosphere_request_id=x_exosphere_request_id)
-            raise ValueError(f"Graph template {state.graph_name} not found in namespace {namespace_name}")
+        if not graph_template:
+            logger.error(f"Graph template {state.graph_name} not found in namespace {namespace_name}", x_exosphere_request_id=x_exosphere_request_id)
+            raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail=f"Graph template {state.graph_name} not found in namespace {namespace_name}")
@@
-    except Exception as e:
+    except Exception as e:
         logger.error(f"Error getting secrets for state {state_id}: {str(e)}", x_exosphere_request_id=x_exosphere_request_id)
         raise e 

Also applies to: 43-45, 54-56

state-manager/app/controller/executed_state.py (1)

2-2: Drop bson.ObjectId in favor of PydanticObjectId and use State.get

To complete the migration away from BSON and align with Beanie IDs end-to-end.

Apply this diff:

-from bson import ObjectId
+from beanie import PydanticObjectId
@@
-async def executed_state(namespace_name: str, state_id: ObjectId, body: ExecutedRequestModel, x_exosphere_request_id: str, background_tasks: BackgroundTasks) -> ExecutedResponseModel:
+async def executed_state(namespace_name: str, state_id: PydanticObjectId, body: ExecutedRequestModel, x_exosphere_request_id: str, background_tasks: BackgroundTasks) -> ExecutedResponseModel:
@@
-        state = await State.find_one(State.id == state_id)
+        state = await State.get(state_id)

Also applies to: 12-12, 17-17

state-manager/app/tasks/create_next_state.py (3)

4-4: Fix pipeline failure: remove unused bson.ObjectId import

bson.ObjectId is no longer used; Ruff flags this (F401).

Apply this diff:

-from bson import ObjectId

63-70: Fix unreachable branch and guard missing inputs in template

len(splits) == 0 is never true; also, indexing next_node_template.inputs[field_name] can KeyError. Simplify placeholder detection and validate presence.

Apply this diff:

-            for field_name, _ in next_node_input_model.model_fields.items():
-                temporary_input = next_node_template.inputs[field_name]
-                splits = temporary_input.split("${{")
-                    
-                if len(splits) == 0:
-                    next_node_input_data[field_name] = temporary_input
-                    continue
+            for field_name, _ in next_node_input_model.model_fields.items():
+                if field_name not in next_node_template.inputs:
+                    raise Exception(f"Input '{field_name}' not provided in template for node '{next_node_template.identifier}'")
+                temporary_input = next_node_template.inputs[field_name]
+                if "${{" not in temporary_input:
+                    next_node_input_data[field_name] = temporary_input
+                    continue
+                splits = temporary_input.split("${{")
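
The claim that `len(splits) == 0` is never true follows from how `str.split` behaves with an explicit separator: it always returns at least one element, even for an empty string. A quick check, with `has_placeholder` as a hypothetical helper name for the membership test the diff uses instead:

```python
def has_placeholder(template: str) -> bool:
    """True when the template contains at least one `${{` opener."""
    return "${{" in template


# The split result is never empty, so its length cannot signal "no placeholder".
for text in ("", "plain", "id=${{ a.outputs.id }}"):
    parts = text.split("${{")
    print(repr(text), len(parts), has_placeholder(text))
```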

99-99: Be robust to non-string outputs in placeholder substitution

If dependent_state.outputs[input_field] can be non-string, concatenation will fail. Cast to str.

Apply this diff:

-                        constructed_string += dependent_state.outputs[input_field] + split.split("}}")[1]
+                        constructed_string += str(dependent_state.outputs[input_field]) + split.split("}}")[1]
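
A simplified sketch of why the cast matters. The function `fill_placeholders` is hypothetical, and the placeholder here names an output key directly, which is an assumption made to keep the example short; the real templates also identify a parent state.

```python
def fill_placeholders(template: str, outputs: dict[str, object]) -> str:
    """Replace each `${{ key }}` with str(outputs[key])."""
    head, *rest = template.split("${{")
    result = head
    for chunk in rest:
        name, _, tail = chunk.partition("}}")
        # Without str(), an int or dict output would raise TypeError on `+`.
        result += str(outputs[name.strip()]) + tail
    return result


message = fill_placeholders("retry ${{ count }} times", {"count": 3})
```

Whether `str()` is the right rendering for structured outputs such as dicts or lists is a separate design question that the review does not settle.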
♻️ Duplicate comments (1)
state-manager/app/controller/executed_state.py (1)

25-27: Critical: Mixing .set() with passing a stale in-memory State can clobber DB fields later

You update via a query .set(...) but then pass the stale state object to the background task; create_next_state eventually calls await state.save(), which can overwrite the DB-updated parents/outputs with stale values. Unify by mutating the in-memory state and saving it before scheduling.

Apply this diff:

-        if len(body.outputs) == 0:
-            await State.find_one(State.id == state_id).set(
-                {"status": StateStatusEnum.EXECUTED, "outputs": {}, "parents": {**state.parents, state.identifier: state.id}}
-            )
-
-            background_tasks.add_task(create_next_state, state)
+        if len(body.outputs) == 0:
+            state.outputs = {}
+            state.status = StateStatusEnum.EXECUTED
+            state.parents = {**state.parents, state.identifier: state.id}
+            await state.save()
+
+            background_tasks.add_task(create_next_state, state)
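
The clobbering risk can be reproduced without a database. The sketch below uses a dict as a hypothetical stand-in for the collection and assumes that saving writes every field of the in-memory object; all names are illustrative and none are Beanie APIs.

```python
import copy


def fresh_db() -> dict:
    """Hypothetical in-memory collection holding one state document."""
    return {"s1": {"identifier": "a", "status": "CREATED", "outputs": None, "parents": {}}}


def load(db: dict, state_id: str) -> dict:
    """Return a detached copy, like an in-memory document object."""
    return copy.deepcopy(db[state_id])


def save(db: dict, state_id: str, doc: dict) -> None:
    """Assumed behaviour: write every field of the in-memory object."""
    db[state_id] = copy.deepcopy(doc)


def background_task(db: dict, state_id: str, state: dict) -> None:
    """Finalize the state it was handed, as create_next_state does."""
    state["status"] = "SUCCESS"
    save(db, state_id, state)


def set_then_schedule() -> dict:
    db = fresh_db()
    state = load(db, "s1")
    # Query-style update: the database changes, `state` does not.
    db["s1"].update(status="EXECUTED", outputs={}, parents={"a": "s1"})
    background_task(db, "s1", state)  # saves the stale copy
    return db["s1"]


def mutate_save_then_schedule() -> dict:
    db = fresh_db()
    state = load(db, "s1")
    state.update(status="EXECUTED", outputs={}, parents={"a": "s1"})
    save(db, "s1", state)
    background_task(db, "s1", state)  # saves an up-to-date copy
    return db["s1"]
```

In the first path the stored `parents` and `outputs` revert to their pre-update values once the task saves; in the second they survive.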
📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 6304210 and 3747e37.

📒 Files selected for processing (4)
  • state-manager/app/controller/executed_state.py (2 hunks)
  • state-manager/app/controller/get_secrets.py (1 hunks)
  • state-manager/app/models/db/state.py (2 hunks)
  • state-manager/app/tasks/create_next_state.py (4 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (4)
state-manager/app/controller/get_secrets.py (4)
state-manager/app/models/db/state.py (1)
  • State (8-18)
state-manager/app/routes.py (1)
  • get_secrets_route (181-191)
state-manager/app/models/db/graph_template_model.py (2)
  • get_secrets (71-74)
  • get_secret (76-81)
python-sdk/exospherehost/runtime.py (1)
  • _get_secrets_endpoint (120-124)
state-manager/app/controller/executed_state.py (3)
state-manager/app/models/state_status_enum.py (1)
  • StateStatusEnum (4-14)
state-manager/app/models/db/state.py (1)
  • State (8-18)
state-manager/app/tasks/create_next_state.py (1)
  • create_next_state (14-127)
state-manager/app/models/db/state.py (1)
state-manager/app/models/db/base.py (1)
  • BaseDatabaseModel (7-15)
state-manager/app/tasks/create_next_state.py (2)
state-manager/app/models/state_status_enum.py (1)
  • StateStatusEnum (4-14)
state-manager/app/models/db/state.py (1)
  • State (8-18)
🪛 GitHub Actions: Ruff check on changed files only
state-manager/app/controller/get_secrets.py

[error] 5-5: Ruff lint error: Unused import 'bson.ObjectId' detected (F401). Remove the import or run 'ruff --fix' to auto-fix.

state-manager/app/tasks/create_next_state.py

[error] 4-4: Ruff lint error: Unused import 'bson.ObjectId' detected (F401). Remove the import or run 'ruff --fix' to auto-fix.

🔇 Additional comments (9)
state-manager/app/models/db/state.py (2)

4-4: Using PydanticObjectId is the right move

Switching to Beanie’s PydanticObjectId aligns the model with Beanie’s expected ID type and removes BSON coupling.


18-18: No bson.ObjectId wrappers for state.id found

I searched the entire state-manager module for ObjectId(state.id), for re-wrapping in State.get(ObjectId(...)), and for ObjectId values in parent maps, and found none. All callers appear to pass and use the new PydanticObjectId directly.

state-manager/app/controller/get_secrets.py (1)

27-27: Directly using State.get(state_id) is correct and consistent with PydanticObjectId

This aligns with the migration away from bson.ObjectId conversions and leverages Beanie’s native ID handling.

state-manager/app/controller/executed_state.py (3)

18-18: Good guard: ensure state is present and has an id

This prevents downstream failures in create_next_state.


32-37: LGTM on in-memory mutation + save path

Consistent with the PR objective to avoid DB .set and pass up-to-date state to the background task.


51-53: Parents now store state.id directly — correct with PydanticObjectId

Using state.id instead of ObjectId(state.id) keeps types consistent.

state-manager/app/tasks/create_next_state.py (3)

42-44: Terminal-node handling is correct

Marking the current state SUCCESS and returning when there are no next nodes matches the intended semantics.


46-49: Correct: include the current state in parents and use state.id directly

parents = state.parents | {state.identifier: state.id} both deduplicates and ensures type correctness.
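
As a quick illustration of the dict union operator (Python 3.9+), with string IDs standing in for PydanticObjectId values: the right-hand entry wins on a duplicate key, and the left operand is not modified.

```python
parents = {"root": "id-root", "a": "id-a-old"}

# `|` builds a new dict; a key present on both sides takes the right-hand value.
merged = parents | {"a": "id-a"}

# Equivalent to the unpacking form used in executed_state.py.
same = {**parents, "a": "id-a"}
```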


88-90: LGTM: stop re-wrapping ids with ObjectId when fetching

Using await State.get(parent_id) avoids type mismatches with PydanticObjectId.

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>