Conversation

@NiveditJain

  • Added a retry policy model to manage state retries with configurable methods (fixed, linear, exponential).
  • Updated the errored state function to create a retry state if the maximum retries have not been reached, improving error recovery.
  • Enhanced the ErroredResponseModel to include a flag indicating whether a retry state was created.
  • Modified the GraphTemplate and State models to incorporate retry policy attributes, ensuring better state management.
  • Improved validation and error handling in the upsert_graph_template function to accommodate the new retry policy structure.

@NiveditJain NiveditJain added this to the 0.0.2 milestone Aug 31, 2025
@NiveditJain NiveditJain requested a review from nk-ag August 31, 2025 06:51
@NiveditJain NiveditJain self-assigned this Aug 31, 2025
@NiveditJain NiveditJain added the enhancement New feature or request label Aug 31, 2025
@coderabbitai

coderabbitai bot commented Aug 31, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📥 Commits

Reviewing files that changed from the base of the PR and between fded75b and 462cfde.

📒 Files selected for processing (7)
  • .github/workflows/publish-state-mangaer.yml (0 hunks)
  • docs/docs/exosphere/retry-policy.md (1 hunks)
  • state-manager/app/controller/errored_state.py (2 hunks)
  • state-manager/app/models/retry_policy_model.py (1 hunks)
  • state-manager/tests/unit/controller/test_errored_state.py (8 hunks)
  • state-manager/tests/unit/controller/test_upsert_graph_template.py (5 hunks)
  • state-manager/tests/unit/models/test_retry_policy_model.py (1 hunks)
📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • Added configurable graph-level retry policy (strategies: exponential/linear/fixed with jitter, backoff, optional max delay).
    • Automatic retries on node errors up to max_retries; new state queued after computed delay, prior state marked errored.
    • Graph upsert now accepts retry_policy; graph templates include retry_policy.
    • Error response now indicates whether a retry was created (retry_created).
  • Documentation

    • New Retry Policy guide with configuration examples and best practices.
    • Updated create-graph docs and SDK examples to include retry_policy.
    • Docs navigation updated to include Retry Policy.

Walkthrough

Adds a graph-level RetryPolicy model and field, persists retry_policy on GraphTemplate, tracks per-State retry_count and fanout_id, computes retry delays, enqueues scheduled retry States from errored_state, and surfaces retry_created in errored responses and upsert APIs.

Changes

| Cohort | File(s) | Summary |
| --- | --- | --- |
| Controller: errored retry flow | state-manager/app/controller/errored_state.py | Loads GraphTemplate and its retry_policy, computes delay, attempts to insert a cloned retry State (incremented retry_count, reset outputs/error, set enqueue_after), marks original state ERRORED with error, and returns ErroredResponseModel(status=ERRORED, retry_created=<bool>). |
| Controller: upsert graph template | state-manager/app/controller/upsert_graph_template.py | Wires retry_policy through create/update paths and returns it in UpsertGraphTemplateResponse. |
| Model: Retry policy (new module) | state-manager/app/models/retry_policy_model.py | New RetryStrategy enum and RetryPolicyModel with max_retries, strategy, backoff_factor, exponent, optional max_delay, and compute_delay(retry_count) implementing exponential/linear/fixed strategies with jitter variants. |
| Model: GraphTemplate (DB) | state-manager/app/models/db/graph_template_model.py | Adds persisted retry_policy: RetryPolicyModel = Field(default_factory=RetryPolicyModel) to GraphTemplate. |
| Model: State (DB) | state-manager/app/models/db/state.py | Adds retry_count: int = Field(default=0) and fanout_id: str = Field(default_factory=...), includes retry_count in fingerprint generation, overrides insert_many to precompute fingerprints, and adds unique index including retry_count and fanout_id. |
| Model: Errored response (API) | state-manager/app/models/errored_models.py | Adds retry_created: bool = Field(default=False) to ErroredResponseModel. |
| Model: Graph API schemas | state-manager/app/models/graph_models.py | Adds retry_policy: RetryPolicyModel to UpsertGraphTemplateRequest and UpsertGraphTemplateResponse. |
| Docs | docs/docs/exosphere/create-graph.md, docs/docs/exosphere/retry-policy.md, docs/mkdocs.yml | Adds Retry Policy docs, examples, SDK examples and nav entry; documents strategies, jitter variants, compute formulas, and runtime behavior for scheduled retries. |
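
To make the retry policy row above concrete, here is a minimal sketch of how the three base strategies could compute a delay. It is not the code from retry_policy_model.py: the field names and defaults follow the summary and the docs excerpt quoted later in this thread, but the formulas, the 1-based retry_count, and the use of a plain dataclass instead of a Pydantic model are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class RetryStrategy(str, Enum):
    EXPONENTIAL = "EXPONENTIAL"
    LINEAR = "LINEAR"
    FIXED = "FIXED"


@dataclass(frozen=True)
class RetryPolicy:
    """Illustrative stand-in for RetryPolicyModel (jitter variants omitted)."""

    max_retries: int = 3
    strategy: RetryStrategy = RetryStrategy.EXPONENTIAL
    backoff_factor: int = 2000  # base delay in milliseconds
    exponent: int = 2
    max_delay: Optional[int] = None  # optional cap in milliseconds

    def compute_delay(self, retry_count: int) -> int:
        """Return the delay in ms before retry number `retry_count` (assumed 1-based)."""
        if retry_count < 1:
            raise ValueError("retry_count must be >= 1")
        if self.strategy is RetryStrategy.EXPONENTIAL:
            # Assumed formula: base * exponent^(attempt - 1)
            delay = self.backoff_factor * self.exponent ** (retry_count - 1)
        elif self.strategy is RetryStrategy.LINEAR:
            # Assumed formula: base * attempt
            delay = self.backoff_factor * retry_count
        else:
            delay = self.backoff_factor
        if self.max_delay is not None:
            delay = min(delay, self.max_delay)
        return delay
```

Under these assumed formulas the default policy would wait 2000 ms, 4000 ms, and 8000 ms before the first, second, and third retry.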

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Client
  participant Controller as errored_state
  participant StateDB as State Store
  participant GTDB as GraphTemplate Store

  Client->>Controller: POST /state/{id}/errored { error }
  Controller->>StateDB: Get State by id
  Controller->>GTDB: Get GraphTemplate (state.graph_name)
  alt retry allowed (state.retry_count < policy.max_retries)
    Controller->>Controller: delay = policy.compute_delay(state.retry_count + 1)
    Controller->>StateDB: Insert new State(status=CREATED, retry_count+=1, enqueue_after=now+delay, outputs={}, error=None)
    Note right of Controller: retry_created = true
  else no retry
    Note right of Controller: retry_created = false
  end
  Controller->>StateDB: Update original State -> status=ERRORED, error=payload.error
  Controller-->>Client: { status: ERRORED, retry_created }
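
The sequence above can be sketched with an in-memory list standing in for the State store. This is an illustration only: the State dataclass, the handle_errored name, the id scheme for the retry state, and the compute_delay callable are hypothetical, and the real controller talks to the database and may fail the insert on a unique-index conflict.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class State:
    """Hypothetical, trimmed-down stand-in for the State document."""

    id: str
    status: str = "QUEUED"
    retry_count: int = 0
    enqueue_after: int = 0  # epoch milliseconds
    outputs: Dict[str, str] = field(default_factory=dict)
    error: Optional[str] = None


def handle_errored(
    state: State,
    error: str,
    max_retries: int,
    compute_delay: Callable[[int], int],
    store: List[State],
    now_ms: Optional[int] = None,
) -> dict:
    """Queue a retry if allowed, then mark the original state ERRORED."""
    now_ms = int(time.time() * 1000) if now_ms is None else now_ms
    retry_created = False
    if state.retry_count < max_retries:
        attempt = state.retry_count + 1
        store.append(
            State(
                id=f"{state.id}-retry{attempt}",  # hypothetical id scheme
                status="CREATED",
                retry_count=attempt,
                enqueue_after=now_ms + compute_delay(attempt),
            )
        )
        retry_created = True
    state.status = "ERRORED"
    state.error = error
    return {"status": "ERRORED", "retry_created": retry_created}
```

Passing now_ms explicitly keeps the sketch deterministic in tests; omitting it falls back to the wall clock.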

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

I nibble code and count each try,
A jittered hop beneath the sky.
When nodes fall short, I plan anew—
I queue the retry and hop on through. 🐇✨



@gemini-code-assist gemini-code-assist bot left a comment

Summary of Changes

Hello @NiveditJain, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly improves the system's resilience by implementing a configurable retry mechanism for errored states. It allows for automatic re-processing of failed operations based on defined retry policies, enhancing the overall robustness and reliability of state transitions within the graph execution flow.

Highlights

  • New Retry Policy Model: Introduced a new RetryPolicyModel to define retry behavior, including max_retries, method (fixed, linear, exponential), and backoff_factor.
  • Enhanced Errored State Handling: The errored_state function now checks the GraphTemplate's retry policy. If the maximum retry count has not been reached, a new 'CREATED' state is generated with an enqueue_after time calculated based on the configured retry method and backoff factor.
  • Updated Data Models: The GraphTemplate model now includes a retry_policy field, and the State model has a new retry_count field, which is also incorporated into the state's fingerprint for uniqueness.
  • API Response Enhancement: The ErroredResponseModel now includes a retry_created boolean flag, indicating whether a new retry state was successfully generated after an error.
Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@NiveditJain NiveditJain linked an issue Aug 31, 2025 that may be closed by this pull request

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a retry mechanism for errored states, which is a great enhancement for resilience. The implementation is mostly solid, with new models for retry policies and updates to the state and graph template models. My review focuses on the logic for creating retry states in errored_state.py, where I've found a couple of critical issues regarding the enqueue_after timestamp calculation and the handling of the error field on the new retry state. Addressing these will ensure the retry mechanism works as expected.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (5)
state-manager/app/models/db/graph_template_model.py (2)

303-307: Fix Beanie AttributeError: use dict filters in queries.

CI shows AttributeError on GraphTemplate.namespace during find_one. Switch to dict-based filters to bypass the field descriptor lookup.

-        graph_template = await GraphTemplate.find_one(GraphTemplate.namespace == namespace, GraphTemplate.name == graph_name)
+        graph_template = await GraphTemplate.find_one({"namespace": namespace, "name": graph_name})

17-19: Harmonize GraphTemplate namespace naming
Rename the namespace field on GraphTemplate to namespace_name to align with other models and prevent confusion; update all usages accordingly:

  • state-manager/app/controller/list_graph_templates.py
  • state-manager/app/controller/upsert_graph_template.py
  • state-manager/app/controller/get_secrets.py
  • state-manager/app/controller/create_states.py
  • state-manager/app/controller/get_graph_template.py
  • GraphTemplate.get method in state-manager/app/models/db/graph_template_model.py
state-manager/app/controller/upsert_graph_template.py (3)

14-17: Prevent field-descriptor lookup error in find_one.

Same AttributeError risk here; use dict-based filter.

-        graph_template = await GraphTemplate.find_one(
-            GraphTemplate.name == graph_name,
-            GraphTemplate.namespace == namespace_name
-        )
+        graph_template = await GraphTemplate.find_one({"name": graph_name, "namespace": namespace_name})

58-64: Fix response validation error for retry_policy.

Pydantic validation fails; explicitly coerce to RetryPolicyModel.

-from app.models.graph_models import UpsertGraphTemplateRequest, UpsertGraphTemplateResponse
+from app.models.graph_models import UpsertGraphTemplateRequest, UpsertGraphTemplateResponse
+from app.models.retry_policy_model import RetryPolicyModel
@@
-        return UpsertGraphTemplateResponse(
+        return UpsertGraphTemplateResponse(
@@
-            retry_policy=graph_template.retry_policy,
+            retry_policy=RetryPolicyModel.model_validate(graph_template.retry_policy),
             created_at=graph_template.created_at,
             updated_at=graph_template.updated_at
         )

26-33: Optional: serialize complex value on update.

When updating via Set, ensure nested models are encoded predictably.

-                        GraphTemplate.retry_policy: body.retry_policy # type: ignore
+                        GraphTemplate.retry_policy: body.retry_policy.model_dump() # type: ignore
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 722b208 and 8f7bcc1.

📒 Files selected for processing (7)
  • state-manager/app/controller/errored_state.py (2 hunks)
  • state-manager/app/controller/upsert_graph_template.py (3 hunks)
  • state-manager/app/models/db/graph_template_model.py (2 hunks)
  • state-manager/app/models/db/state.py (2 hunks)
  • state-manager/app/models/errored_models.py (1 hunks)
  • state-manager/app/models/graph_models.py (1 hunks)
  • state-manager/app/models/retry_policy_model.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (5)
state-manager/app/models/errored_models.py (1)
state-manager/app/models/state_status_enum.py (1)
  • StateStatusEnum (4-13)
state-manager/app/models/db/graph_template_model.py (1)
state-manager/app/models/retry_policy_model.py (1)
  • RetryPolicyModel (9-12)
state-manager/app/controller/errored_state.py (4)
state-manager/app/models/retry_policy_model.py (2)
  • RetryPolicyModel (9-12)
  • RetryMethod (4-7)
state-manager/app/models/db/graph_template_model.py (2)
  • GraphTemplate (16-330)
  • get (303-307)
state-manager/app/models/db/state.py (1)
  • State (12-82)
state-manager/app/models/errored_models.py (1)
  • ErroredResponseModel (9-11)
state-manager/app/models/graph_models.py (1)
state-manager/app/models/retry_policy_model.py (1)
  • RetryPolicyModel (9-12)
state-manager/app/controller/upsert_graph_template.py (1)
state-manager/app/models/db/graph_template_model.py (1)
  • GraphTemplate (16-330)
🪛 GitHub Actions: State Manager Unit Tests
state-manager/app/models/db/graph_template_model.py

[error] 304-304: AttributeError: 'namespace' encountered while querying GraphTemplate; GraphTemplate.namespace is not defined on the GraphTemplate model.



state-manager/app/controller/errored_state.py

[error] 39-39: Step 'uv run pytest tests/ --cov=app --cov-report=xml --cov-report=term-missing --cov-report=html -v --junitxml=full-pytest-report.xml' failed: AttributeError('namespace') raised during GraphTemplate.get in errored_state (app/controller/errored_state.py:39).



state-manager/app/controller/upsert_graph_template.py

[error] 58-58: Step 'uv run pytest tests/ --cov=app --cov-report=xml --cov-report=term-missing --cov-report=html -v --junitxml=full-pytest-report.xml' failed: 1 validation error for UpsertGraphTemplateResponse; retry_policy: Input should be a valid dictionary or instance of RetryPolicyModel (app/controller/upsert_graph_template.py:58).



🔇 Additional comments (6)
state-manager/app/models/db/state.py (1)

41-41: Confirm fingerprint semantics with retries.

Including retry_count in the fingerprint prevents deduplication across retries: each retry of the same state gets a distinct fingerprint. If that’s intentional, all good; otherwise exclude it.
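
A small sketch of the effect being discussed, assuming the fingerprint is a hash over a canonical serialization of selected fields. The field names (node_name, inputs) and the include_retry switch are hypothetical; the actual fingerprint inputs in state.py are not shown in this thread.

```python
import hashlib
import json


def fingerprint(node_name: str, inputs: dict, retry_count: int, include_retry: bool = True) -> str:
    """Hash a canonical JSON form of the chosen fields (illustrative only)."""
    payload = {"node_name": node_name, "inputs": inputs}
    if include_retry:
        payload["retry_count"] = retry_count
    # sort_keys and compact separators make the serialization deterministic
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

With retry_count included, attempt 0 and attempt 1 of the same node hash differently, so a unique index on the fingerprint would not reject the retry state. With it excluded, the two attempts collide.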

state-manager/app/models/errored_models.py (1)

10-11: LGTM: response now signals retry creation.

Field is well-typed and defaulted.

state-manager/app/models/db/graph_template_model.py (1)

14-24: LGTM: retry policy added to GraphTemplate.

The field and import look correct with a safe default_factory.

state-manager/app/models/graph_models.py (1)

6-6: Retry policy fields wired into request/response — LGTM

Import and defaulted retry_policy on both models look correct and compatible with the new RetryPolicyModel.

Also applies to: 12-12, 18-18

state-manager/app/controller/errored_state.py (2)

67-67: Response shape update — LGTM

Returning retry_created alongside status is appropriate and backward-compatible for consumers that ignore unknown fields.


39-39: Ignore the namespace_name fix—use namespace as defined. The GraphTemplate model declares a namespace field (line 18), so replacing it with namespace_name is incorrect. The AttributeError('namespace') arises elsewhere—verify the import and the model instance being returned instead of renaming this field.

Likely an incorrect or invalid review comment.

- Introduced a new RetryStrategy enum with additional strategies for retrying operations, enhancing flexibility in retry mechanisms.
- Updated the RetryPolicyModel to include a compute_delay method for calculating delays based on the selected strategy.
- Refactored the errored_state function to utilize the new retry policy structure, improving error handling and state management.
- Removed the previous _calculate_enqueue_after function, streamlining the code and enhancing clarity.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
state-manager/app/controller/errored_state.py (2)

23-28: Tighten status checks.

The two checks effectively only allow QUEUED. Make it explicit and simplify.

-        if state.status != StateStatusEnum.QUEUED and state.status != StateStatusEnum.EXECUTED:
-            raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="State is not queued or executed")
-        
-        if state.status == StateStatusEnum.EXECUTED:
-            raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="State is already executed")
+        if state.status != StateStatusEnum.QUEUED:
+            if state.status == StateStatusEnum.EXECUTED:
+                raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="State is already executed")
+            raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="State is not queued")
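
The suggested check can be exercised in isolation with a framework-free sketch. StateStatus and BadRequest below are stand-ins (assumptions) for StateStatusEnum and FastAPI's HTTPException with a 400 status; only the branching mirrors the diff.

```python
from enum import Enum


class StateStatus(str, Enum):
    CREATED = "CREATED"
    QUEUED = "QUEUED"
    EXECUTED = "EXECUTED"
    ERRORED = "ERRORED"


class BadRequest(Exception):
    """Stand-in for HTTPException(status_code=400, detail=...)."""


def ensure_can_error(status: StateStatus) -> None:
    """Only QUEUED states may be marked errored; EXECUTED gets a specific message."""
    if status is StateStatus.QUEUED:
        return
    if status is StateStatus.EXECUTED:
        raise BadRequest("State is already executed")
    raise BadRequest("State is not queued")
```

Keeping the EXECUTED branch separate preserves the more specific error message for that case while making the single accepted status explicit.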

59-61: Log with full traceback for easier debugging.

Use exception logging so CI failures capture stack traces.

-    except Exception as e:
-        logger.error(f"Error errored state {state_id} for namespace {namespace_name}", x_exosphere_request_id=x_exosphere_request_id, error=e)
-        raise e
+    except Exception:
+        logger.exception(f"Error errored state {state_id} for namespace {namespace_name}", x_exosphere_request_id=x_exosphere_request_id)
+        raise
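
As a rough illustration of what the change buys, the following uses the standard library logging module, whose Logger.exception logs at ERROR level and appends the active traceback. The project's logger accepts keyword arguments such as x_exosphere_request_id, so it is presumably a structured logger with a different call signature; the logger name and message here are made up.

```python
import io
import logging


def run_with_logging(logger: logging.Logger) -> None:
    try:
        raise ValueError("boom")
    except Exception:
        # exception() must be called from an except block; it records the traceback
        logger.exception("Error marking state as errored")
        raise  # bare raise re-raises the active exception with its original traceback


stream = io.StringIO()
logger = logging.getLogger("errored_state_sketch")
logger.addHandler(logging.StreamHandler(stream))
logger.propagate = False  # keep the sketch's output out of the root logger

try:
    run_with_logging(logger)
except ValueError:
    pass

log_text = stream.getvalue()
```

After this runs, log_text holds the message followed by the full traceback, which is what makes CI failures easier to diagnose.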
♻️ Duplicate comments (1)
state-manager/app/models/retry_policy_model.py (1)

21-22: Clarify naming: backoff_factor represents a base delay in milliseconds.

Prior feedback suggested renaming to base_delay_seconds and allowing sub-second floats; here it’s milliseconds. At minimum, update the description to “Base delay in milliseconds”; consider a follow-up rename to base_delay_ms for clarity.

-    backoff_factor: int = Field(default=2000, description="The backoff factor in milliseconds (default: 2000 = 2 seconds)", gt=0)
+    backoff_factor: int = Field(default=2000, description="Base delay in milliseconds (default: 2000 = 2 seconds)", gt=0)

📥 Commits

Reviewing files that changed from the base of the PR and between a00b97c and f6a3bb2.

📒 Files selected for processing (2)
  • state-manager/app/controller/errored_state.py (2 hunks)
  • state-manager/app/models/retry_policy_model.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
state-manager/app/controller/errored_state.py (4)
state-manager/app/models/errored_models.py (1)
  • ErroredResponseModel (9-11)
state-manager/app/models/db/state.py (1)
  • State (12-82)
state-manager/app/models/db/graph_template_model.py (2)
  • GraphTemplate (16-330)
  • get (303-307)
state-manager/app/models/retry_policy_model.py (1)
  • compute_delay (24-59)
🪛 GitHub Actions: State Manager Unit Tests
state-manager/app/controller/errored_state.py

[error] 61-61: AttributeError: 'namespace' encountered while querying GraphTemplate in errored_state. GraphTemplate.get attempted to access GraphTemplate.namespace on the class.



🔇 Additional comments (3)
state-manager/app/models/retry_policy_model.py (1)

5-16: Enum coverage looks comprehensive.

Nine strategies with jitter variants provide flexible control. No issues here.
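
The thread does not spell out how the jitter variants work. One plausible reading, used here purely as an assumption, is that each base strategy (exponential, linear, fixed) has a plain form plus "full" and "equal" jitter forms, which would account for nine members. The helper names below are hypothetical.

```python
import random


def full_jitter(base_ms: int, rng: random.Random) -> int:
    """Pick a delay uniformly in [0, base_ms]."""
    return int(rng.uniform(0, base_ms))


def equal_jitter(base_ms: int, rng: random.Random) -> int:
    """Keep half of the base delay and randomize the other half: [base/2, base]."""
    half = base_ms / 2
    return int(half + rng.uniform(0, half))
```

Jitter spreads retries from many failing states over time so they do not all hit a recovering dependency at the same instant. Taking the random source as a parameter lets tests seed it.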

state-manager/app/controller/errored_state.py (2)

46-47: Confirm delay units and attempt indexing.

compute_delay returns milliseconds; you’re adding to int(time.time()*1000), which is correct. Passing state.retry_count + 1 assumes compute_delay expects a 1-based attempt index (first retry = 1). Verify this contract is documented and consistent.
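
The contract in question can be written down as a short sketch. next_enqueue_after is a hypothetical helper, and compute_delay is passed in as a callable that is assumed to take a 1-based attempt index and return milliseconds.

```python
import time
from typing import Callable, Optional


def next_enqueue_after(
    retry_count: int,
    compute_delay: Callable[[int], int],
    now_ms: Optional[int] = None,
) -> int:
    """Return the epoch-millisecond timestamp at which the retry may be picked up."""
    attempt = retry_count + 1  # 1-based: a state with retry_count 0 yields attempt 1
    now = int(time.time() * 1000) if now_ms is None else now_ms
    # Both operands are in milliseconds; mixing in time.time() seconds here would
    # silently schedule the retry far in the past.
    return now + compute_delay(attempt)
```

If compute_delay were instead 0-based, the first retry of an exponential policy would use a different exponent, which is why the indexing needs to be documented alongside the units.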


41-44: Good: fresh retry State doesn’t carry over outputs or error.

This prevents stale data from contaminating the next attempt.
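
A minimal sketch of that cloning step, using dataclasses.replace on a hypothetical, trimmed-down State. The real model is a database document with more fields; only the reset of outputs and error and the retry_count increment mirror what the review describes.

```python
from dataclasses import dataclass, field, replace
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class State:
    """Hypothetical stand-in for the State document."""

    node_name: str
    inputs: Dict[str, Any]
    retry_count: int = 0
    outputs: Dict[str, Any] = field(default_factory=dict)
    error: Optional[str] = None


def make_retry_state(failed: State) -> State:
    """Copy identity and inputs, bump retry_count, and start with clean outputs/error."""
    return replace(failed, retry_count=failed.retry_count + 1, outputs={}, error=None)
```

The failed state is left untouched, so its error and any partial outputs remain available for inspection after the retry is queued.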

- Introduced a new documentation file for the Retry Policy feature, detailing its configuration and usage within Exosphere.
- Updated the `create-graph.md` file to include a section on retry policies, explaining their structure and providing examples.
- Modified `mkdocs.yml` to include the new Retry Policy documentation in the navigation, enhancing accessibility for users.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 7


📥 Commits

Reviewing files that changed from the base of the PR and between f6a3bb2 and 66c794a.

📒 Files selected for processing (3)
  • docs/docs/exosphere/create-graph.md (2 hunks)
  • docs/docs/exosphere/retry-policy.md (1 hunks)
  • docs/mkdocs.yml (2 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/docs/exosphere/retry-policy.md

[grammar] ~1-~1: Use correct spacing
Context: # Retry Policy !!! beta "Beta Feature" Retry Policy...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~4-~4: Use correct spacing
Context: ...tionality may change in future releases. The Retry Policy feature in Exosphere pr...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~6-~6: Use correct spacing
Context: ...cution based on configurable strategies. ## Overview Retry policies are configured ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~8-~8: Use correct spacing
Context: ...on configurable strategies. ## Overview Retry policies are configured at the gra...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~10-~10: Use correct spacing
Context: ...delay before the next execution attempt. ## Configuration Retry policies are define...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~12-~12: Use correct spacing
Context: ...ext execution attempt. ## Configuration Retry policies are defined in your graph...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~14-~14: Use correct spacing
Context: ...ed in your graph template configuration: json { "secrets": { "api_key": "your-api-key" }, "nodes": [ { "node_name": "MyNode", "namespace": "MyProject", "identifier": "my_node", "inputs": { "data": "initial" }, "next_nodes": [] } ], "retry_policy": { "max_retries": 3, "strategy": "EXPONENTIAL", "backoff_factor": 2000, "exponent": 2 } } ## Parameters ### max_retries - Type: ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~41-~41: Use correct spacing
Context: ... "exponent": 2 } } ``` ## Parameters ### max_retries - Type: int - **Defaul...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~43-~43: There might be a mistake here.
Context: ... } } ``` ## Parameters ### max_retries - Type: int - Default: `3` - **Des...

(QB_NEW_EN)


[grammar] ~44-~44: There might be a mistake here.
Context: ...Parameters ### max_retries - Type: int - Default: 3 - Description: The ma...

(QB_NEW_EN)


[grammar] ~45-~45: There might be a mistake here.
Context: ...etries - Type: int - Default: 3 - Description: The maximum number of ret...

(QB_NEW_EN)


[grammar] ~46-~46: There might be a mistake here.
Context: ...umber of retry attempts before giving up - Constraints: Must be >= 0 ### strateg...

(QB_NEW_EN_OTHER)


[grammar] ~47-~47: There might be a mistake here.
Context: ...iving up - Constraints: Must be >= 0 ### strategy - Type: string - **Defaul...

(QB_NEW_EN_OTHER)


[grammar] ~50-~50: There might be a mistake here.
Context: ... Must be >= 0 ### strategy - Type: string - Default: "EXPONENTIAL" - **Descripti...

(QB_NEW_EN)


[grammar] ~51-~51: There might be a mistake here.
Context: ...egy - Type: string - Default: "EXPONENTIAL" - Description: The retry strategy to use...

(QB_NEW_EN)


[grammar] ~52-~52: There might be a mistake here.
Context: ...y strategy to use for calculating delays - Options: See [Retry Strategies](#retry...

(QB_NEW_EN_OTHER)


[grammar] ~53-~53: There might be a mistake here.
Context: ...try Strategies](#retry-strategies) below ### backoff_factor - Type: int - **Def...

(QB_NEW_EN_OTHER)


[grammar] ~55-~55: There might be a mistake here.
Context: ...ry-strategies) below ### backoff_factor - Type: int - Default: 2000 (2 s...

(QB_NEW_EN)


[grammar] ~56-~56: There might be a mistake here.
Context: ...) below ### backoff_factor - Type: int - Default: 2000 (2 seconds) - **Descri...

(QB_NEW_EN)


[grammar] ~57-~57: There might be a mistake here.
Context: ... int - Default: 2000 (2 seconds) - Description: The base delay factor in ...

(QB_NEW_EN)


[grammar] ~58-~58: There might be a mistake here.
Context: ...*: The base delay factor in milliseconds - Constraints: Must be > 0 ### exponent...

(QB_NEW_EN)


[grammar] ~59-~59: Use correct spacing
Context: ...liseconds - Constraints: Must be > 0 ### exponent - Type: int - Default...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~61-~61: There might be a mistake here.
Context: ...Constraints**: Must be > 0 ### exponent - Type: int - Default: 2 - **Des...

(QB_NEW_EN)


[grammar] ~62-~62: There might be a mistake here.
Context: ...: Must be > 0 ### exponent - Type: int - Default: 2 - Description: The ex...

(QB_NEW_EN)


[grammar] ~63-~63: There might be a mistake here.
Context: ...ponent - Type: int - Default: 2 - Description: The exponent used for exp...

(QB_NEW_EN)


[grammar] ~64-~64: There might be a mistake here.
Context: ...exponent used for exponential strategies - Constraints: Must be > 0 ## Retry Str...

(QB_NEW_EN)


[grammar] ~65-~65: Use correct spacing
Context: ...trategies - Constraints: Must be > 0 ## Retry Strategies Exosphere supports thr...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~67-~67: Use correct spacing
Context: ...ints**: Must be > 0 ## Retry Strategies Exosphere supports three main categories...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~69-~69: Use correct spacing
Context: ...nts to prevent thundering herd problems. ### Exponential Strategies Exponential stra...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~71-~71: Use correct spacing
Context: ...rd problems. ### Exponential Strategies Exponential strategies increase the dela...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~73-~73: Use correct spacing
Context: ...y exponentially with each retry attempt. #### EXPONENTIAL Standard exponential backoff...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~75-~75: There might be a mistake here.
Context: ...th each retry attempt. #### EXPONENTIAL Standard exponential backoff without jit...

(QB_NEW_EN)


[grammar] ~76-~76: Use correct spacing
Context: ...dard exponential backoff without jitter. Formula: `backoff_factor * (exponent ^...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~78-~78: Use correct spacing
Context: ...l backoff without jitter. Formula: backoff_factor * (exponent ^ retry_count) Example: - Retry 1: 2000ms (2 seconds)...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~80-~80: There might be a mistake here.
Context: ... (exponent ^ retry_count)` Example: - Retry 1: 2000ms (2 seconds) - Retry 2: 4...

(QB_NEW_EN)


[grammar] ~81-~81: There might be a mistake here.
Context: ... retry_count)` Example: - Retry 1: 2000ms (2 seconds) - Retry 2: 4000ms (4 second...

(QB_NEW_EN_OTHER)


[grammar] ~81-~81: There might be a mistake here.
Context: ...Example**: - Retry 1: 2000ms (2 seconds) - Retry 2: 4000ms (4 seconds) - Retry 3: 8...

(QB_NEW_EN)


[grammar] ~82-~82: There might be a mistake here.
Context: ... Retry 1: 2000ms (2 seconds) - Retry 2: 4000ms (4 seconds) - Retry 3: 8000ms (8 second...

(QB_NEW_EN_OTHER)


[grammar] ~82-~82: There might be a mistake here.
Context: ...2 seconds) - Retry 2: 4000ms (4 seconds) - Retry 3: 8000ms (8 seconds) #### EXPONE...

(QB_NEW_EN)


[grammar] ~83-~83: There might be a mistake here.
Context: ... Retry 2: 4000ms (4 seconds) - Retry 3: 8000ms (8 seconds) #### EXPONENTIAL_FULL_JITT...

(QB_NEW_EN_OTHER)


[grammar] ~83-~83: Use correct spacing
Context: ...4 seconds) - Retry 3: 8000ms (8 seconds) #### EXPONENTIAL_FULL_JITTER Exponential back...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~85-~85: There might be a mistake here.
Context: ...8 seconds) #### EXPONENTIAL_FULL_JITTER Exponential backoff with full jitter (ra...

(QB_NEW_EN)


[grammar] ~86-~86: Use correct spacing
Context: ...m delay between 0 and calculated delay). Formula: `random(0, backoff_factor * (...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~88-~88: Use correct spacing
Context: ... 0 and calculated delay). Formula: random(0, backoff_factor * (exponent ^ retry_count)) Example: - Retry 1: 0-2000ms (random) ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~90-~90: There might be a mistake here.
Context: ...(exponent ^ retry_count))` Example: - Retry 1: 0-2000ms (random) - Retry 2: 0-...

(QB_NEW_EN)


[grammar] ~91-~91: There might be a mistake here.
Context: ...Example*: - Retry 1: 0-2000ms (random) - Retry 2: 0-4000ms (random) - Retry 3: 0-...

(QB_NEW_EN)


[grammar] ~92-~92: There might be a mistake here.
Context: ...ms (random) - Retry 2: 0-4000ms (random) - Retry 3: 0-8000ms (random) #### EXPONEN...

(QB_NEW_EN)


[grammar] ~93-~93: Use correct spacing
Context: ...ms (random) - Retry 3: 0-8000ms (random) #### EXPONENTIAL_EQUAL_JITTER Exponential bac...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~95-~95: There might be a mistake here.
Context: ... (random) #### EXPONENTIAL_EQUAL_JITTER Exponential backoff with equal jitter (r...

(QB_NEW_EN)


[grammar] ~96-~96: Use correct spacing
Context: ...delay around half the calculated delay). Formula: `(backoff_factor * (exponent ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~98-~98: Use correct spacing
Context: ...lf the calculated delay). Formula: (backoff_factor * (exponent ^ retry_count)) / 2 + random(0, (backoff_factor * (exponent ^ retry_count)) / 2) Example: - Retry 1: 1000-2000ms (rando...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~100-~100: There might be a mistake here.
Context: ...nent ^ retry_count)) / 2)` Example: - Retry 1: 1000-2000ms (random) - Retry 2:...

(QB_NEW_EN)


[grammar] ~101-~101: There might be a mistake here.
Context: ...t)) / 2)` Example: - Retry 1: 1000-2000ms (random) - Retry 2: 2000-4000ms (random...

(QB_NEW_EN_OTHER)


[grammar] ~101-~101: There might be a mistake here.
Context: ...ample**: - Retry 1: 1000-2000ms (random) - Retry 2: 2000-4000ms (random) - Retry 3:...

(QB_NEW_EN)


[grammar] ~102-~102: There might be a mistake here.
Context: ...1: 1000-2000ms (random) - Retry 2: 2000-4000ms (random) - Retry 3: 4000-8000ms (random...

(QB_NEW_EN_OTHER)


[grammar] ~102-~102: There might be a mistake here.
Context: ...(random) - Retry 2: 2000-4000ms (random) - Retry 3: 4000-8000ms (random) ### Linea...

(QB_NEW_EN)


[grammar] ~103-~103: There might be a mistake here.
Context: ...2: 2000-4000ms (random) - Retry 3: 4000-8000ms (random) ### Linear Strategies Linear...

(QB_NEW_EN_OTHER)


[grammar] ~103-~103: Use correct spacing
Context: ...(random) - Retry 3: 4000-8000ms (random) ### Linear Strategies Linear strategies inc...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~105-~105: Use correct spacing
Context: ...0-8000ms (random) ### Linear Strategies Linear strategies increase the delay lin...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~107-~107: Use correct spacing
Context: ... delay linearly with each retry attempt. #### LINEAR Standard linear backoff without j...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~109-~109: There might be a mistake here.
Context: ...ly with each retry attempt. #### LINEAR Standard linear backoff without jitter. ...

(QB_NEW_EN)


[grammar] ~110-~110: Use correct spacing
Context: ... Standard linear backoff without jitter. Formula: `backoff_factor * retry_count...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~112-~112: Use correct spacing
Context: ...r backoff without jitter. Formula: backoff_factor * retry_count Example: - Retry 1: 2000ms (2 seconds)...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~114-~114: There might be a mistake here.
Context: ...koff_factor * retry_count` Example: - Retry 1: 2000ms (2 seconds) - Retry 2: 4...

(QB_NEW_EN)


[grammar] ~115-~115: There might be a mistake here.
Context: ...* retry_count` Example: - Retry 1: 2000ms (2 seconds) - Retry 2: 4000ms (4 second...

(QB_NEW_EN_OTHER)


[grammar] ~115-~115: There might be a mistake here.
Context: ...Example**: - Retry 1: 2000ms (2 seconds) - Retry 2: 4000ms (4 seconds) - Retry 3: 6...

(QB_NEW_EN)


[grammar] ~116-~116: There might be a mistake here.
Context: ... Retry 1: 2000ms (2 seconds) - Retry 2: 4000ms (4 seconds) - Retry 3: 6000ms (6 second...

(QB_NEW_EN_OTHER)


[grammar] ~116-~116: There might be a mistake here.
Context: ...2 seconds) - Retry 2: 4000ms (4 seconds) - Retry 3: 6000ms (6 seconds) #### LINEAR...

(QB_NEW_EN)


[grammar] ~117-~117: There might be a mistake here.
Context: ... Retry 2: 4000ms (4 seconds) - Retry 3: 6000ms (6 seconds) #### LINEAR_FULL_JITTER Li...

(QB_NEW_EN_OTHER)


[grammar] ~117-~117: Use correct spacing
Context: ...4 seconds) - Retry 3: 6000ms (6 seconds) #### LINEAR_FULL_JITTER Linear backoff with f...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~119-~119: There might be a mistake here.
Context: ...0ms (6 seconds) #### LINEAR_FULL_JITTER Linear backoff with full jitter. **Form...

(QB_NEW_EN)


[grammar] ~120-~120: Use correct spacing
Context: ..._JITTER Linear backoff with full jitter. Formula: `random(0, backoff_factor * r...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~122-~122: Use correct spacing
Context: ...backoff with full jitter. Formula: random(0, backoff_factor * retry_count) Example: - Retry 1: 0-2000ms (random) ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~124-~124: There might be a mistake here.
Context: ...off_factor * retry_count)` Example: - Retry 1: 0-2000ms (random) - Retry 2: 0-...

(QB_NEW_EN)


[grammar] ~125-~125: There might be a mistake here.
Context: ...etry_count)` Example: - Retry 1: 0-2000ms (random) - Retry 2: 0-4000ms (random) -...

(QB_NEW_EN_OTHER)


[grammar] ~125-~125: There might be a mistake here.
Context: ...Example*: - Retry 1: 0-2000ms (random) - Retry 2: 0-4000ms (random) - Retry 3: 0-...

(QB_NEW_EN)


[grammar] ~126-~126: There might be a mistake here.
Context: ...Retry 1: 0-2000ms (random) - Retry 2: 0-4000ms (random) - Retry 3: 0-6000ms (random) ...

(QB_NEW_EN_OTHER)


[grammar] ~126-~126: There might be a mistake here.
Context: ...ms (random) - Retry 2: 0-4000ms (random) - Retry 3: 0-6000ms (random) #### LINEAR_...

(QB_NEW_EN)


[grammar] ~127-~127: There might be a mistake here.
Context: ...Retry 2: 0-4000ms (random) - Retry 3: 0-6000ms (random) #### LINEAR_EQUAL_JITTER Line...

(QB_NEW_EN_OTHER)


[grammar] ~127-~127: Use correct spacing
Context: ...ms (random) - Retry 3: 0-6000ms (random) #### LINEAR_EQUAL_JITTER Linear backoff with ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~129-~129: There might be a mistake here.
Context: ...000ms (random) #### LINEAR_EQUAL_JITTER Linear backoff with equal jitter. **For...

(QB_NEW_EN)


[grammar] ~130-~130: Use correct spacing
Context: ...JITTER Linear backoff with equal jitter. Formula: `(backoff_factor * retry_coun...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~132-~132: Use correct spacing
Context: ...ackoff with equal jitter. Formula: (backoff_factor * retry_count) / 2 + random(0, (backoff_factor * retry_count) / 2) Example: - Retry 1: 1000-2000ms (rando...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~134-~134: There might be a mistake here.
Context: ...actor * retry_count) / 2)` Example: - Retry 1: 1000-2000ms (random) - Retry 2:...

(QB_NEW_EN)


[grammar] ~135-~135: There might be a mistake here.
Context: ...nt) / 2)` Example: - Retry 1: 1000-2000ms (random) - Retry 2: 2000-4000ms (random...

(QB_NEW_EN_OTHER)


[grammar] ~135-~135: There might be a mistake here.
Context: ...ample**: - Retry 1: 1000-2000ms (random) - Retry 2: 2000-4000ms (random) - Retry 3:...

(QB_NEW_EN)


[grammar] ~136-~136: There might be a mistake here.
Context: ...1: 1000-2000ms (random) - Retry 2: 2000-4000ms (random) - Retry 3: 3000-6000ms (random...

(QB_NEW_EN_OTHER)


[grammar] ~136-~136: There might be a mistake here.
Context: ...(random) - Retry 2: 2000-4000ms (random) - Retry 3: 3000-6000ms (random) ### Fixed...

(QB_NEW_EN)


[grammar] ~137-~137: There might be a mistake here.
Context: ...2: 2000-4000ms (random) - Retry 3: 3000-6000ms (random) ### Fixed Strategies Fixed s...

(QB_NEW_EN_OTHER)


[grammar] ~137-~137: Use correct spacing
Context: ...(random) - Retry 3: 3000-6000ms (random) ### Fixed Strategies Fixed strategies use a...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~139-~139: Use correct spacing
Context: ...00-6000ms (random) ### Fixed Strategies Fixed strategies use a constant delay fo...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~141-~141: Use correct spacing
Context: ...a constant delay for all retry attempts. #### FIXED Standard fixed delay without jitte...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~143-~143: There might be a mistake here.
Context: ...elay for all retry attempts. #### FIXED Standard fixed delay without jitter. **...

(QB_NEW_EN)


[grammar] ~144-~144: Use correct spacing
Context: ...XED Standard fixed delay without jitter. Formula: backoff_factor Example...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~146-~146: Use correct spacing
Context: ...xed delay without jitter. Formula: backoff_factor Example: - Retry 1: 2000ms (2 seconds)...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~148-~148: There might be a mistake here.
Context: ...ormula**: backoff_factor Example: - Retry 1: 2000ms (2 seconds) - Retry 2: 2...

(QB_NEW_EN)


[grammar] ~149-~149: There might be a mistake here.
Context: ...ackoff_factor` Example: - Retry 1: 2000ms (2 seconds) - Retry 2: 2000ms (2 second...

(QB_NEW_EN_OTHER)


[grammar] ~149-~149: There might be a mistake here.
Context: ...Example**: - Retry 1: 2000ms (2 seconds) - Retry 2: 2000ms (2 seconds) - Retry 3: 2...

(QB_NEW_EN)


[grammar] ~150-~150: There might be a mistake here.
Context: ... Retry 1: 2000ms (2 seconds) - Retry 2: 2000ms (2 seconds) - Retry 3: 2000ms (2 second...

(QB_NEW_EN_OTHER)


[grammar] ~150-~150: There might be a mistake here.
Context: ...2 seconds) - Retry 2: 2000ms (2 seconds) - Retry 3: 2000ms (2 seconds) #### FIXED_...

(QB_NEW_EN)


[grammar] ~151-~151: There might be a mistake here.
Context: ... Retry 2: 2000ms (2 seconds) - Retry 3: 2000ms (2 seconds) #### FIXED_FULL_JITTER Fix...

(QB_NEW_EN_OTHER)


[grammar] ~151-~151: Use correct spacing
Context: ...2 seconds) - Retry 3: 2000ms (2 seconds) #### FIXED_FULL_JITTER Fixed delay with full ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~153-~153: There might be a mistake here.
Context: ...00ms (2 seconds) #### FIXED_FULL_JITTER Fixed delay with full jitter. **Formula...

(QB_NEW_EN)


[grammar] ~154-~154: Use correct spacing
Context: ...ULL_JITTER Fixed delay with full jitter. Formula: random(0, backoff_factor) ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~156-~156: Use correct spacing
Context: ...d delay with full jitter. Formula: random(0, backoff_factor) Example: - Retry 1: 0-2000ms (random) ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~158-~158: There might be a mistake here.
Context: ...random(0, backoff_factor)` Example: - Retry 1: 0-2000ms (random) - Retry 2: 0-...

(QB_NEW_EN)


[grammar] ~159-~159: There might be a mistake here.
Context: ...off_factor)` Example: - Retry 1: 0-2000ms (random) - Retry 2: 0-2000ms (random) -...

(QB_NEW_EN_OTHER)


[grammar] ~159-~159: There might be a mistake here.
Context: ...Example*: - Retry 1: 0-2000ms (random) - Retry 2: 0-2000ms (random) - Retry 3: 0-...

(QB_NEW_EN)


[grammar] ~160-~160: There might be a mistake here.
Context: ...Retry 1: 0-2000ms (random) - Retry 2: 0-2000ms (random) - Retry 3: 0-2000ms (random) ...

(QB_NEW_EN_OTHER)


[grammar] ~160-~160: There might be a mistake here.
Context: ...ms (random) - Retry 2: 0-2000ms (random) - Retry 3: 0-2000ms (random) #### FIXED_E...

(QB_NEW_EN)


[grammar] ~161-~161: There might be a mistake here.
Context: ...Retry 2: 0-2000ms (random) - Retry 3: 0-2000ms (random) #### FIXED_EQUAL_JITTER Fixed...

(QB_NEW_EN_OTHER)


[grammar] ~161-~161: Use correct spacing
Context: ...ms (random) - Retry 3: 0-2000ms (random) #### FIXED_EQUAL_JITTER Fixed delay with equa...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~163-~163: There might be a mistake here.
Context: ...2000ms (random) #### FIXED_EQUAL_JITTER Fixed delay with equal jitter. **Formul...

(QB_NEW_EN)


[grammar] ~164-~164: Use correct spacing
Context: ...AL_JITTER Fixed delay with equal jitter. Formula: `backoff_factor / 2 + random(...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~166-~166: Use correct spacing
Context: ... delay with equal jitter. Formula: backoff_factor / 2 + random(0, backoff_factor / 2) Example: - Retry 1: 1000-2000ms (rando...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~168-~168: There might be a mistake here.
Context: ...om(0, backoff_factor / 2)` Example: - Retry 1: 1000-2000ms (random) - Retry 2:...

(QB_NEW_EN)


[grammar] ~169-~169: There might be a mistake here.
Context: ...tor / 2)` Example: - Retry 1: 1000-2000ms (random) - Retry 2: 1000-2000ms (random...

(QB_NEW_EN_OTHER)


[grammar] ~169-~169: There might be a mistake here.
Context: ...ample**: - Retry 1: 1000-2000ms (random) - Retry 2: 1000-2000ms (random) - Retry 3:...

(QB_NEW_EN)


[grammar] ~170-~170: There might be a mistake here.
Context: ...1: 1000-2000ms (random) - Retry 2: 1000-2000ms (random) - Retry 3: 1000-2000ms (random...

(QB_NEW_EN_OTHER)


[grammar] ~170-~170: There might be a mistake here.
Context: ...(random) - Retry 2: 1000-2000ms (random) - Retry 3: 1000-2000ms (random) ## Usage ...

(QB_NEW_EN)


[grammar] ~171-~171: There might be a mistake here.
Context: ...2: 1000-2000ms (random) - Retry 3: 1000-2000ms (random) ## Usage Examples ### Basic ...

(QB_NEW_EN_OTHER)


[grammar] ~171-~171: Use correct spacing
Context: ...(random) - Retry 3: 1000-2000ms (random) ## Usage Examples ### Basic Exponential Re...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~173-~173: Use correct spacing
Context: ... 1000-2000ms (random) ## Usage Examples ### Basic Exponential Retry ```json { "ret...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~175-~175: Use correct spacing
Context: ...ge Examples ### Basic Exponential Retry json { "retry_policy": { "max_retries": 3, "strategy": "EXPONENTIAL", "backoff_factor": 1000, "exponent": 2 } } ### Aggressive Retry with Jitter ```json { ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~187-~187: Use correct spacing
Context: ... } ### Aggressive Retry with Jitterjson { "retry_policy": { "max_retries": 5, "strategy": "EXPONENTIAL_FULL_JITTER", "backoff_factor": 500, "exponent": 3 } } ### Conservative Linear Retryjson { "r...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~199-~199: Use correct spacing
Context: ... } } ### Conservative Linear Retryjson { "retry_policy": { "max_retries": 2, "strategy": "LINEAR", "backoff_factor": 5000 } } ### Fixed Retry for Rate Limitingjson { ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~210-~210: Use correct spacing
Context: ...} ### Fixed Retry for Rate Limitingjson { "retry_policy": { "max_retries": 10, "strategy": "FIXED_EQUAL_JITTER", "backoff_factor": 1000 } } ``` ## When Retries Are Triggered Retries are ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~221-~221: Use correct spacing
Context: ... } } ``` ## When Retries Are Triggered Retries are automatically triggered when...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~223-~223: Use correct spacing
Context: ...etries are automatically triggered when: 1. A node execution fails with an error 2. ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~225-~225: There might be a mistake here.
Context: ... 1. A node execution fails with an error 2. The current retry count is less than `ma...

(QB_NEW_EN_OTHER)


[grammar] ~227-~227: Use correct spacing
Context: ...ies3. The state status isQUEUEDorEXECUTED` The retry mechanism: - Creates a new sta...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~229-~229: There might be a mistake here.
Context: ...UEDorEXECUTED The retry mechanism: - Creates a new state withretry_count` i...

(QB_NEW_EN)


[grammar] ~230-~230: There might be a mistake here.
Context: ...tate with retry_count incremented by 1 - Sets enqueue_after to the current time...

(QB_NEW_EN_OTHER)


[grammar] ~232-~232: There might be a mistake here.
Context: ...atus to ERRORED with the error message ## Best Practices ### Choose the Right Str...

(QB_NEW_EN_OTHER)


[grammar] ~234-~234: Use correct spacing
Context: ...ith the error message ## Best Practices ### Choose the Right Strategy - **EXPONENTIA...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~236-~236: There might be a mistake here.
Context: ...## Best Practices ### Choose the Right Strategy - EXPONENTIAL: Best for most transient f...

(QB_NEW_EN_OTHER)


[grammar] ~237-~237: There might be a mistake here.
Context: ...ssues, temporary service unavailability) - LINEAR: Good for predictable, consiste...

(QB_NEW_EN)


[grammar] ~238-~238: There might be a mistake here.
Context: ... Good for predictable, consistent delays - FIXED: Useful for rate limiting scenar...

(QB_NEW_EN)


[grammar] ~239-~239: There might be a mistake here.
Context: ...tent delays - FIXED: Useful for rate limiting scenarios ### Use Jitter for H...

(QB_NEW_EN_OTHER)


[grammar] ~239-~239: Use correct spacing
Context: ...ED**: Useful for rate limiting scenarios ### Use Jitter for High Concurrency - **FULL...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~241-~241: There might be a mistake here.
Context: ...ios ### Use Jitter for High Concurrency - FULL_JITTER: Best for high concurrency...

(QB_NEW_EN_OTHER)


[grammar] ~244-~244: There might be a mistake here.
Context: ...nly when you need deterministic behavior ### Set Appropriate Limits - max_retries...

(QB_NEW_EN_OTHER)


[grammar] ~246-~246: There might be a mistake here.
Context: ...tic behavior ### Set Appropriate Limits - max_retries: Consider the nature of yo...

(QB_NEW_EN)


[grammar] ~247-~247: There might be a mistake here.
Context: ...our failures and downstream dependencies - backoff_factor: Balance between respon...

(QB_NEW_EN_OTHER)


[grammar] ~249-~249: There might be a mistake here.
Context: ...er values create more aggressive backoff ### Monitor Retry Patterns - Track retry cou...

(QB_NEW_EN_OTHER)


[grammar] ~251-~251: There might be a mistake here.
Context: ...sive backoff ### Monitor Retry Patterns - Track retry counts in your monitoring sy...

(QB_NEW_EN_OTHER)


[grammar] ~254-~254: There might be a mistake here.
Context: ...try patterns to identify systemic issues ## Limitations - Retry policies apply to a...

(QB_NEW_EN_OTHER)


[grammar] ~256-~256: Use correct spacing
Context: ...identify systemic issues ## Limitations - Retry policies apply to all nodes in a g...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~258-~258: There might be a mistake here.
Context: ... apply to all nodes in a graph uniformly - Individual node-level retry policies are...

(QB_NEW_EN_OTHER)


[grammar] ~261-~261: There might be a mistake here.
Context: ... backoff_factor and exponent values) ## Error Handling If a retry policy config...

(QB_NEW_EN_OTHER)


[grammar] ~263-~263: Use correct spacing
Context: ...nd exponent values) ## Error Handling If a retry policy configuration is inval...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~265-~265: There might be a mistake here.
Context: ...a retry policy configuration is invalid: - The graph template validation will fail ...

(QB_NEW_EN)


[grammar] ~266-~266: There might be a mistake here.
Context: ... The graph template validation will fail - An error will be returned during graph c...

(QB_NEW_EN_OTHER)


[grammar] ~268-~268: There might be a mistake here.
Context: ...ved until the configuration is corrected ## Integration with Signals Retry policies...

(QB_NEW_EN_OTHER)


[grammar] ~270-~270: Use correct spacing
Context: ...s corrected ## Integration with Signals Retry policies work alongside Exosphere'...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~272-~272: Use correct spacing
Context: ...ork alongside Exosphere's signal system: - Nodes can still raise PruneSignal to s...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~274-~274: There might be a mistake here.
Context: ...PruneSignalto stop retries immediately - Nodes can raiseReQueueAfterSignal` to ...

(QB_NEW_EN_OTHER)


[grammar] ~275-~275: Make sure you are using the right part of speech
Context: ... ReQueueAfterSignal to re-queue after sometime, this will not mark nodes as failure. - ...

(QB_NEW_EN_OTHER_ERROR_IDS_21)


[grammar] ~275-~275: There might be a problem here.
Context: ...r sometime, this will not mark nodes as failure. - The retry count is preserved when using ...

(QB_NEW_EN_MERGED_MATCH)


[grammar] ~276-~276: There might be a mistake here.
Context: ...The retry count is preserved when using signals

(QB_NEW_EN_OTHER)

docs/docs/exosphere/create-graph.md

[grammar] ~136-~136: Use correct spacing
Context: ... (e.g., "42", "true"). ### Retry Policy Graphs can include a retry policy to han...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~138-~138: Use correct spacing
Context: ...d applies to all nodes within the graph. json { "retry_policy": { "max_retries": 3, "strategy": "EXPONENTIAL", "backoff_factor": 2000, "exponent": 2 } } For detailed information about retry pol...

(QB_NEW_EN_OTHER_ERROR_IDS_5)

🪛 markdownlint-cli2 (0.17.2)
docs/docs/exosphere/create-graph.md

55-55: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below

(MD022, blanks-around-headings)


56-56: Lists should be surrounded by blank lines

(MD032, blanks-around-lists)


135-135: Lists should be surrounded by blank lines

(MD032, blanks-around-lists)


143-143: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below

(MD022, blanks-around-headings)


149-149: Lists should be surrounded by blank lines

(MD032, blanks-around-lists)

🔇 Additional comments (3)
docs/mkdocs.yml (2)

104-104: Docs index: added Retry Policy to llmstxt sections — looks good.

Ensure the referenced file exists at docs/docs/exosphere/retry-policy.md (it does in this PR).


133-133: Navigation entry for Retry Policy — good placement.

Position under “Create Graph” fits the flow; no further nav adjustments needed.

docs/docs/exosphere/retry-policy.md (1)

221-233: Clarify state status preconditions for retry.

“State status is QUEUED or EXECUTED” during an error is confusing. Typically the failing state becomes ERRORED, then a retry state is created. Please confirm and adjust wording.

Suggested text:

  • “A node execution fails with an error.”
  • “If retry_count < max_retries, the failing state is marked ERRORED and a new retry state is created with retry_count+1 and enqueue_after = now + delay.”
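
A minimal sketch of the flow proposed in the suggested text, not the PR's actual `errored_state` implementation. The `State` dataclass, the `handle_error` helper, and the `"CREATED"` status for the new retry state are all hypothetical stand-ins chosen for illustration; only `retry_count`, `max_retries`, `enqueue_after`, and the `ERRORED` status come from the discussion above.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class State:
    # Hypothetical, simplified stand-in for the real State document.
    status: str
    retry_count: int = 0
    enqueue_after: int = 0  # epoch milliseconds
    error: Optional[str] = None


def handle_error(state: State, error: str, max_retries: int,
                 delay_ms: int, now_ms: int) -> Tuple[State, Optional[State]]:
    """Mark the failing state ERRORED and, if retries remain, build a retry state."""
    errored = replace(state, status="ERRORED", error=error)
    if state.retry_count >= max_retries:
        # Retries exhausted: no new state is created.
        return errored, None
    retry = State(
        status="CREATED",  # assumed initial status for a fresh state
        retry_count=state.retry_count + 1,
        enqueue_after=now_ms + delay_ms,
    )
    return errored, retry
```

Returning both states makes the ordering explicit: the failing state is always marked `ERRORED`, and the retry state exists only while `retry_count < max_retries`.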

- Added validation to ensure the retry count is greater than 0 in the compute_delay method of RetryPolicyModel, raising a ValueError for invalid inputs.
- Updated the compute_delay method to correctly calculate delays based on the retry count, adjusting the exponentiation logic.
- Refined error handling in the GraphTemplate model by replacing ValueError with HTTPException for better integration with FastAPI, ensuring a 404 response when a graph template is not found.
- Clarified the backoff_factor parameter in the retry policy documentation to specify its unit as milliseconds.
- Added retry policy examples in the create-graph.md file to demonstrate its usage in graph template creation and updates.
- Ensured consistency in the retry policy structure across documentation, enhancing user understanding and implementation.
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
state-manager/app/models/db/graph_template_model.py (1)

311-333: Consider handling “not found” during get_valid explicitly.

If get raises an HTTPException/ValueError, get_valid exits immediately. If intended, fine; otherwise catch-not-found and keep polling.

-            graph_template = await GraphTemplate.get(namespace, graph_name)
+            try:
+                graph_template = await GraphTemplate.get(namespace, graph_name)
+            except Exception as e:
+                # Re-raise non-not-found errors
+                if "not found" not in str(e).lower():
+                    raise
+                # Optionally: await asyncio.sleep(polling_interval); continue
+                raise

Clarify desired semantics in docstring.
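
If the "keep polling" semantics are the desired ones, the loop could look roughly like the sketch below. It is an illustration under assumptions, not the repository's `get_valid`: `TemplateNotFound`, `poll_until_valid`, the `fetch` callable (standing in for `GraphTemplate.get(namespace, graph_name)`), and the dict-shaped template with a `"valid"` key are all hypothetical. A dedicated exception type is used instead of matching on the error message text.

```python
import asyncio
import time


class TemplateNotFound(Exception):
    """Hypothetical dedicated error, so callers need not match on message text."""


async def poll_until_valid(fetch, timeout: float = 1.0, interval: float = 0.01):
    """Poll `fetch()` until it returns a valid template or the timeout expires.

    `fetch` is an assumed zero-argument coroutine function standing in for
    GraphTemplate.get(namespace, graph_name).
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            template = await fetch()
        except TemplateNotFound:
            template = None  # not created yet: keep polling
        # Any other exception propagates to the caller unchanged.
        if template is not None and template.get("valid"):
            return template
        await asyncio.sleep(interval)
    raise TimeoutError("graph template did not become valid in time")
```

With this shape, a missing template is treated the same as a not-yet-valid one, while database or network errors still surface immediately.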

♻️ Duplicate comments (1)
state-manager/app/models/retry_policy_model.py (1)

21-22: Clarify units/name; consider renaming backoff_factor.

“backoff_factor” reads like a multiplier but holds milliseconds. Prefer base_delay_ms (and optionally allow exponent: float). At minimum, fix the description to “Base delay in milliseconds”.

-    backoff_factor: int = Field(default=2000, description="The backoff factor in milliseconds (default: 2000 = 2 seconds)", gt=0)
+    backoff_factor: int = Field(default=2000, description="Base delay in milliseconds (default: 2000 = 2 seconds)", gt=0)

Optionally:

-    backoff_factor: int = Field(...)
-    exponent: int = Field(...)
+    base_delay_ms: int = Field(default=2000, description="Base delay in milliseconds", gt=0)
+    exponent: float = Field(default=2.0, description="Exponent for exponential backoff", gt=0)

This would require renaming usages.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro


📥 Commits

Reviewing files that changed from the base of the PR and between 66c794a and 728b225.

📒 Files selected for processing (2)
  • state-manager/app/models/db/graph_template_model.py (3 hunks)
  • state-manager/app/models/retry_policy_model.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
state-manager/app/models/db/graph_template_model.py (1)
state-manager/app/models/retry_policy_model.py (1)
  • RetryPolicyModel (18-62)
🪛 GitHub Actions: State Manager Unit Tests
state-manager/app/models/db/graph_template_model.py

[error] 306-306: AttributeError: 'namespace' during GraphTemplate.get query construction (GraphTemplate.find_one(GraphTemplate.namespace == namespace, GraphTemplate.name == graph_name)).



🔇 Additional comments (2)
state-manager/app/models/retry_policy_model.py (1)

31-38: No changes needed: compute_delay returns an int in every branch (multiplying ints yields an int and all jitter paths use int(...)), so explicit casts for non-jitter branches are unnecessary.
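
For reference, a standalone sketch of how the documented strategies could map to integer delays. This is not the code in `retry_policy_model.py`: it is a free function rather than a model method, it assumes the documented examples (2000, 4000, 8000 ms for `EXPONENTIAL`) mean the exponent is raised to `retry_count - 1`, and it assumes `max_delay` is applied as a simple upper bound.

```python
import random
from typing import Optional


def compute_delay(strategy: str, retry_count: int, backoff_factor: int = 2000,
                  exponent: int = 2, max_delay: Optional[int] = None) -> int:
    """Return the delay in milliseconds before retry number `retry_count` (1-based)."""
    if retry_count < 1:
        raise ValueError("retry_count must be greater than 0")

    # "LINEAR_FULL_JITTER" -> kind="LINEAR", jitter="FULL_JITTER"; "FIXED" -> jitter=""
    kind, _, jitter = strategy.partition("_")
    if kind == "EXPONENTIAL":
        base = backoff_factor * exponent ** (retry_count - 1)
    elif kind == "LINEAR":
        base = backoff_factor * retry_count
    elif kind == "FIXED":
        base = backoff_factor
    else:
        raise ValueError(f"unknown strategy: {strategy}")

    if jitter == "FULL_JITTER":
        delay = int(random.uniform(0, base))  # anywhere in [0, base]
    elif jitter == "EQUAL_JITTER":
        delay = int(base / 2 + random.uniform(0, base / 2))  # in [base/2, base]
    elif jitter == "":
        delay = base  # int * int stays int, no cast needed
    else:
        raise ValueError(f"unknown strategy: {strategy}")

    # Assumed capping semantics: max_delay is an upper bound on the result.
    return delay if max_delay is None else min(delay, max_delay)
```

For example, `compute_delay("LINEAR", 3)` gives 6000 and `compute_delay("EXPONENTIAL", 10, max_delay=5000)` gives 5000. The non-jitter branches stay `int` because every operand is an `int`, which is the point made in the comment above.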

state-manager/app/models/db/graph_template_model.py (1)

25-25: LGTM: retry_policy field integration.

Defaulting with default_factory ensures every graph persists a policy; aligns with PR goals.

- Added the `max_delay` parameter to the retry policy model, allowing users to cap the maximum delay for retry attempts.
- Updated the documentation to include detailed explanations of the `max_delay` parameter and its usage in retry strategies.
- Improved error handling in the `errored_state` function to log errors when fetching graph templates and raise appropriate HTTP exceptions.
- Refactored the `GraphTemplate` model to raise a ValueError instead of an HTTPException when a graph template is not found, enhancing error handling consistency.
- Updated the `compute_delay` method in the `RetryPolicyModel` to apply the new delay capping logic across all retry strategies.
- Added error handling for duplicate retry states in the `errored_state` function, logging when a retry state already exists.
- Introduced a new `fanout_id` field in the `State` model to support unique identification of retry states.
- Updated the database index to enforce uniqueness on the combination of relevant state fields, improving data integrity and query performance.
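To make the delay capping described in this commit concrete, here is a minimal sketch of an exponential delay with an optional cap. It assumes the formula quoted from the retry-policy docs, `backoff_factor * (exponent ^ (retry_count - 1))`, and the parameter names `backoff_factor`, `exponent`, and `max_delay`. The function name `capped_exponential_delay` is hypothetical and is not the actual `compute_delay` implementation.

```python
from typing import Optional


def capped_exponential_delay(
    retry_count: int,
    backoff_factor: int = 2000,
    exponent: int = 2,
    max_delay: Optional[int] = None,
) -> int:
    """Return the delay in milliseconds for a 1-based retry attempt (illustrative only)."""
    if retry_count < 1:
        raise ValueError(f"retry_count must be >= 1, got {retry_count}")
    # Exponential growth: 2000, 4000, 8000, ... with the defaults above.
    delay = int(backoff_factor * (exponent ** (retry_count - 1)))
    # Apply the cap only when one is configured.
    if max_delay is not None:
        delay = min(delay, max_delay)
    return delay


# With a 5000 ms cap, the third attempt is clamped instead of growing to 8000 ms.
delays = [capped_exponential_delay(n, max_delay=5000) for n in (1, 2, 3)]
print(delays)
```

Applying the cap after the strategy formula keeps the same clamping step reusable for the linear and fixed strategies as well.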
@NiveditJain
Member Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a robust retry policy for handling transient failures in graph execution. The changes include a new RetryPolicyModel with configurable strategies, updates to the State model to track retries, and modifications to the errored_state controller to create retry states. The documentation has also been updated with a comprehensive guide on the new retry policy feature.

My review focuses on ensuring the correctness and robustness of the new implementation. I've identified a critical issue with default value generation for fanout_id that could lead to incorrect behavior. I've also provided suggestions to improve error handling, logging clarity, and documentation accuracy. Overall, this is a great addition to improve the resilience of the workflows.

@NiveditJain
Member Author

@coderabbitai review

@coderabbitai
Contributor

coderabbitai bot commented Aug 31, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

NiveditJain and others added 5 commits August 31, 2025 14:11
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
state-manager/app/models/db/state.py (1)

31-45: Confirm fingerprint semantics for unites with retries.

Including retry_count alters the fingerprint across retries. If the intent is to treat each retry as a distinct unite instance, this is correct; if not, consider omitting retry_count (or including fanout_id instead) to dedupe across retries of the same fanout.
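The trade-off in this comment can be illustrated with a small sketch. It assumes the fingerprint is a hash over a set of state fields; the field set, the `fingerprint` helper, and the sample values below are hypothetical and do not reflect the actual `State` model.

```python
import hashlib
import json


def fingerprint(fields: dict) -> str:
    """Hash a dict of fields deterministically (illustrative stand-in)."""
    payload = json.dumps(fields, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Hypothetical subset of the fields that identify a unite state.
base = {"node_name": "MyNode", "run_id": "run-1", "identifier": "my_node"}

# Including retry_count: every retry produces a different fingerprint.
with_retry_0 = fingerprint({**base, "retry_count": 0})
with_retry_1 = fingerprint({**base, "retry_count": 1})

# Including fanout_id instead: retries of the same fanout share a fingerprint.
original_attempt = fingerprint({**base, "fanout_id": "fanout-a"})
retried_attempt = fingerprint({**base, "fanout_id": "fanout-a"})

print(with_retry_0 != with_retry_1)         # distinct per retry
print(original_attempt == retried_attempt)  # deduplicated across retries
```

Which behavior is right depends on whether a retry should count as a new unite instance, which is the question the comment asks the author to confirm.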

♻️ Duplicate comments (4)
state-manager/app/models/db/state.py (1)

28-28: Enforce non-negative retry_count.

Add ge=0 to guard against accidental negatives (matches prior feedback).

-    retry_count: int = Field(default=0, description="Number of times the state has been retried")
+    retry_count: int = Field(default=0, ge=0, description="Number of times the state has been retried")
state-manager/app/models/retry_policy_model.py (1)

32-34: Fix off-by-one wording and add type guard for retry_count.

Message says “greater than 1” but 1 is valid; also fail fast on non-int.

-        if retry_count < 1:
-            raise ValueError(f"Retry count must be greater than 1, got {retry_count}")
+        if not isinstance(retry_count, int):
+            raise TypeError(f"retry_count must be int, got {type(retry_count).__name__}")
+        if retry_count < 1:
+            raise ValueError(f"retry_count must be >= 1 (1-based attempt), got {retry_count}")
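As a standalone illustration of the guard suggested above, the following sketch validates a 1-based attempt number. The helper name `validate_retry_count` is hypothetical. It also rejects `bool`, which goes beyond the suggested diff: in Python `bool` is a subclass of `int`, so `isinstance(True, int)` is true and a plain `isinstance` check would let `True` through.

```python
def validate_retry_count(retry_count: object) -> int:
    """Return retry_count if it is a valid 1-based attempt number (illustrative only)."""
    # bool is a subclass of int, so exclude it explicitly (an extra assumption of this sketch).
    if isinstance(retry_count, bool) or not isinstance(retry_count, int):
        raise TypeError(f"retry_count must be int, got {type(retry_count).__name__}")
    if retry_count < 1:
        raise ValueError(f"retry_count must be >= 1 (1-based attempt), got {retry_count}")
    return retry_count


def describe(value: object) -> str:
    """Report which outcome the guard produces for a given input."""
    try:
        validate_retry_count(value)
        return "ok"
    except (TypeError, ValueError) as exc:
        return type(exc).__name__


outcomes = [describe(v) for v in (1, 0, "1", True)]
print(outcomes)
```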
state-manager/app/controller/errored_state.py (2)

40-64: Make retry creation idempotent and schedule from max(now, previous_enqueue); fix logging on DuplicateKeyError

  • De-dupe before insert to avoid noisy DuplicateKeyError races.
  • Compute enqueue_after using max(now_ms, state.enqueue_after) + compute_delay(next_attempt).
  • On DuplicateKeyError, don't reference retry_state.id (it’s None pre-insert); query and log the existing one.
  • Optional: Consider retry_created=False when we didn’t actually create (clarify API semantics).

Apply this diff:

-        if state.retry_count < graph_template.retry_policy.max_retries:
-            try:
-                retry_state = State(
-                    node_name=state.node_name,
-                    namespace_name=state.namespace_name,
-                    identifier=state.identifier,
-                    graph_name=state.graph_name,
-                    run_id=state.run_id,
-                    status=StateStatusEnum.CREATED,
-                    inputs=state.inputs,
-                    outputs={},
-                    error=None,
-                    parents=state.parents,
-                    does_unites=state.does_unites,
-                    enqueue_after= int(time.time() * 1000) + graph_template.retry_policy.compute_delay(state.retry_count + 1),
-                    retry_count=state.retry_count + 1,
-                    fanout_id=state.fanout_id
-                )
-                retry_state = await retry_state.insert()
-                logger.info(f"Retry state {retry_state.id} created for state {state_id}", x_exosphere_request_id=x_exosphere_request_id)
-                retry_created = True
-            except DuplicateKeyError:
-                logger.info(f"Retry state {retry_state.id} already exists for state {state_id}", x_exosphere_request_id=x_exosphere_request_id)
-                retry_created = True
+        if state.retry_count < graph_template.retry_policy.max_retries:
+            next_attempt = state.retry_count + 1
+            existing = await State.find_one({
+                "namespace_name": state.namespace_name,
+                "graph_name": state.graph_name,
+                "run_id": state.run_id,
+                "node_name": state.node_name,
+                "identifier": state.identifier,
+                "retry_count": next_attempt,
+                "status": {"$in": [StateStatusEnum.CREATED, StateStatusEnum.QUEUED]},
+                "fanout_id": state.fanout_id
+            })
+            if existing:
+                logger.info(f"Retry state {existing.id} already exists for state {state_id}", x_exosphere_request_id=x_exosphere_request_id)
+                retry_created = False
+            else:
+                now_ms = int(time.time() * 1000)
+                enqueue_at = max(now_ms, state.enqueue_after) + graph_template.retry_policy.compute_delay(next_attempt)
+                try:
+                    retry_state = State(
+                        node_name=state.node_name,
+                        namespace_name=state.namespace_name,
+                        identifier=state.identifier,
+                        graph_name=state.graph_name,
+                        run_id=state.run_id,
+                        status=StateStatusEnum.CREATED,
+                        inputs=state.inputs,
+                        outputs={},
+                        error=None,
+                        parents=state.parents,
+                        does_unites=state.does_unites,
+                        enqueue_after=enqueue_at,
+                        retry_count=next_attempt,
+                        fanout_id=state.fanout_id
+                    )
+                    retry_state = await retry_state.insert()
+                    logger.info(f"Retry state {retry_state.id} created for state {state_id}", x_exosphere_request_id=x_exosphere_request_id)
+                    retry_created = True
+                except DuplicateKeyError:
+                    existing = await State.find_one({
+                        "namespace_name": state.namespace_name,
+                        "graph_name": state.graph_name,
+                        "run_id": state.run_id,
+                        "node_name": state.node_name,
+                        "identifier": state.identifier,
+                        "retry_count": next_attempt,
+                        "fanout_id": state.fanout_id
+                    })
+                    logger.info(f"Retry state already exists for state {state_id} (existing_id={getattr(existing, 'id', None)})", x_exosphere_request_id=x_exosphere_request_id)
+                    retry_created = False

Follow-up:

  • Confirm compute_delay units are milliseconds to match enqueue_after. If seconds, multiply by 1000 here or in compute_delay.
#!/bin/bash
# Verify compute_delay units in RetryPolicyModel
rg -nC2 'class\s+RetryPolicyModel|def\s+compute_delay' state-manager/app/models/retry_policy_model.py
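The scheduling rule proposed in this comment, `max(now_ms, state.enqueue_after) + compute_delay(next_attempt)`, can be sketched as a pure function. This assumes the delay is already in milliseconds, which is exactly what the follow-up asks to verify. The function and parameter names are hypothetical.

```python
import time
from typing import Optional


def next_enqueue_after(
    previous_enqueue_after_ms: int,
    delay_ms: int,
    now_ms: Optional[int] = None,
) -> int:
    """Compute when a retry becomes eligible, in epoch milliseconds (illustrative only)."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    # Never schedule before the previous state's own enqueue time,
    # and never schedule relative to a point already in the past.
    return max(now_ms, previous_enqueue_after_ms) + delay_ms


# Previous enqueue time is in the past: schedule relative to "now".
from_now = next_enqueue_after(previous_enqueue_after_ms=1000, delay_ms=2000, now_ms=5000)

# Previous enqueue time is still in the future: schedule relative to it.
from_previous = next_enqueue_after(previous_enqueue_after_ms=9000, delay_ms=2000, now_ms=5000)

print(from_now, from_previous)
```

Passing `now_ms` explicitly keeps the function deterministic in tests; production code would presumably rely on the clock default.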

30-36: Use the correct fields for GraphTemplate.find_one and raise an explicit 404
Replace the lookup with:

         try:
-            graph_template = await GraphTemplate.get(namespace_name, state.graph_name)
+            graph_template = await GraphTemplate.find_one({"namespace": namespace_name, "name": state.graph_name})
             if not graph_template:
                 raise HTTPException(
                     status_code=status.HTTP_404_NOT_FOUND,
-                    detail="Graph template not found"
+                    detail=f"Graph template not found for namespace {namespace_name} and graph {state.graph_name}"
                 )
         except HTTPException:
             raise
         except Exception as e:
             logger.error(
                 f"Error getting graph template {state.graph_name} for namespace {namespace_name}",
                 x_exosphere_request_id=x_exosphere_request_id,
                 error=e,
             )
             raise
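The control flow in the suggestion above (raise a specific not-found error, re-raise it untouched, and log anything unexpected before re-raising) can be shown without the web framework. In this sketch `NotFoundError` stands in for `HTTPException` with a 404 status, and the dict-backed `store` stands in for the database lookup; both are hypothetical.

```python
import logging

logger = logging.getLogger("errored_state_sketch")


class NotFoundError(Exception):
    """Hypothetical stand-in for an HTTP 404 error."""


def load_template(store: dict, namespace: str, graph_name: str) -> dict:
    """Look up a template, distinguishing 'not found' from unexpected failures."""
    try:
        template = store.get((namespace, graph_name))
        if template is None:
            raise NotFoundError(
                f"Graph template not found for namespace {namespace} and graph {graph_name}"
            )
        return template
    except NotFoundError:
        # Expected outcome: propagate as-is, without logging it as an error.
        raise
    except Exception:
        # Unexpected failure: log with traceback, then propagate.
        logger.exception(
            "Error getting graph template %s for namespace %s", graph_name, namespace
        )
        raise


store = {("MyProject", "my_graph"): {"retry_policy": {"max_retries": 3}}}
found = load_template(store, "MyProject", "my_graph")

try:
    load_template(store, "MyProject", "missing_graph")
    missing_message = ""
except NotFoundError as exc:
    missing_message = str(exc)

print(found, missing_message)
```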

📥 Commits

Reviewing files that changed from the base of the PR and between 728b225 and 64592d8.

📒 Files selected for processing (6)
  • docs/docs/exosphere/create-graph.md (4 hunks)
  • docs/docs/exosphere/retry-policy.md (1 hunks)
  • state-manager/app/controller/errored_state.py (2 hunks)
  • state-manager/app/models/db/graph_template_model.py (2 hunks)
  • state-manager/app/models/db/state.py (4 hunks)
  • state-manager/app/models/retry_policy_model.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
state-manager/app/models/db/graph_template_model.py (1)
state-manager/app/models/retry_policy_model.py (1)
  • RetryPolicyModel (18-69)
state-manager/app/controller/errored_state.py (4)
state-manager/app/models/errored_models.py (1)
  • ErroredResponseModel (9-11)
state-manager/app/models/db/state.py (1)
  • State (13-97)
state-manager/app/models/db/graph_template_model.py (2)
  • GraphTemplate (16-330)
  • get (303-307)
state-manager/app/models/retry_policy_model.py (1)
  • compute_delay (25-69)
🪛 GitHub Actions: State Manager Unit Tests
state-manager/app/models/db/graph_template_model.py

[error] 304-304: AttributeError: 'namespace' encountered when querying GraphTemplate.namespace in GraphTemplate.get (GraphTemplate.find_one call).

state-manager/app/controller/errored_state.py

[error] 31-31: Command 'uv run pytest tests/ --cov=app --cov-report=xml --cov-report=term-missing --cov-report=html -v --junitxml=full-pytest-report.xml' failed: GraphTemplate.get() raised AttributeError: 'namespace' during graph template retrieval in errored_state.

🪛 LanguageTool
docs/docs/exosphere/retry-policy.md

[grammar] ~1-~1: Use correct spacing
Context: # Retry Policy !!! beta "Beta Feature" Retry Policy...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~4-~4: Use correct spacing
Context: ...tionality may change in future releases. The Retry Policy feature in Exosphere pr...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~6-~6: Use correct spacing
Context: ...cution based on configurable strategies. ## Overview Retry policies are configured ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~8-~8: Use correct spacing
Context: ...on configurable strategies. ## Overview Retry policies are configured at the gra...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~10-~10: Use correct spacing
Context: ...delay before the next execution attempt. ## Configuration Retry policies are define...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~12-~12: Use correct spacing
Context: ...ext execution attempt. ## Configuration Retry policies are defined in your graph...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~14-~14: Use correct spacing
Context: ...ed in your graph template configuration: json { "secrets": { "api_key": "your-api-key" }, "nodes": [ { "node_name": "MyNode", "namespace": "MyProject", "identifier": "my_node", "inputs": { "data": "initial" }, "next_nodes": [] } ], "retry_policy": { "max_retries": 3, "strategy": "EXPONENTIAL", "backoff_factor": 2000, "exponent": 2, "max_delay": 3600000 } } ## Parameters ### max_retries - Type:...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~42-~42: Use correct spacing
Context: ...delay": 3600000 } } ``` ## Parameters ### max_retries - Type: int - **Defau...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~44-~44: Use correct spacing
Context: ... } } ``` ## Parameters ### max_retries - Type: int - Default: `3` - **Des...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~46-~46: There might be a mistake here.
Context: ...arameters ### max_retries - Type: int - Default: 3 - Description: The ma...

(QB_NEW_EN)


[grammar] ~47-~47: There might be a mistake here.
Context: ...tries - Type: int - Default: 3 - Description: The maximum number of ret...

(QB_NEW_EN)


[grammar] ~48-~48: There might be a mistake here.
Context: ...umber of retry attempts before giving up - Constraints: Must be >= 0 ### strateg...

(QB_NEW_EN)


[grammar] ~49-~49: Use correct spacing
Context: ...iving up - Constraints: Must be >= 0 ### strategy - Type: string - **Defau...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~51-~51: Use correct spacing
Context: ...onstraints**: Must be >= 0 ### strategy - Type: string - Default: `"EXPONE...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~53-~53: There might be a mistake here.
Context: ...Must be >= 0 ### strategy - Type: string - Default: "EXPONENTIAL" - **Descripti...

(QB_NEW_EN)


[grammar] ~54-~54: There might be a mistake here.
Context: ...gy - Type: string - Default: "EXPONENTIAL" - Description: The retry strategy to use...

(QB_NEW_EN)


[grammar] ~55-~55: There might be a mistake here.
Context: ...y strategy to use for calculating delays - Options: See [Retry Strategies](#retry...

(QB_NEW_EN_OTHER)


[grammar] ~56-~56: There might be a mistake here.
Context: ...try Strategies](#retry-strategies) below ### backoff_factor - Type: int (milli...

(QB_NEW_EN_OTHER)


[grammar] ~58-~58: Use correct spacing
Context: ...ry-strategies) below ### backoff_factor - Type: int (milliseconds) - **Default...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~60-~60: There might be a mistake here.
Context: ...factor - Type: int (milliseconds) - Default: 2000 (2 seconds) - **Descri...

(QB_NEW_EN)


[grammar] ~61-~61: There might be a mistake here.
Context: ...conds) - Default: 2000 (2 seconds) - Description: The base delay factor in ...

(QB_NEW_EN)


[grammar] ~62-~62: There might be a mistake here.
Context: ...*: The base delay factor in milliseconds - Constraints: Must be > 0 ### exponent...

(QB_NEW_EN)


[grammar] ~63-~63: Use correct spacing
Context: ...liseconds - Constraints: Must be > 0 ### exponent - Type: int - *Default...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~65-~65: Use correct spacing
Context: ...Constraints**: Must be > 0 ### exponent - Type: int - Default: 2 - **Des...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~67-~67: There might be a mistake here.
Context: ... Must be > 0 ### exponent - Type: int - Default: 2 - Description: The ex...

(QB_NEW_EN)


[grammar] ~68-~68: There might be a mistake here.
Context: ...onent - Type: int - Default: 2 - Description: The exponent used for exp...

(QB_NEW_EN)


[grammar] ~69-~69: There might be a mistake here.
Context: ...exponent used for exponential strategies - Constraints: Must be > 0 ### max_dela...

(QB_NEW_EN)


[grammar] ~70-~70: Use correct spacing
Context: ...trategies - Constraints: Must be > 0 ### max_delay - Type: int | null (mil...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~72-~72: Use correct spacing
Context: ...onstraints**: Must be > 0 ### max_delay - Type: int | null (milliseconds) - **...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~76-~76: There might be a mistake here.
Context: ... at this value using the _cap function - Constraints: Must be > 0 when not null...

(QB_NEW_EN_OTHER)


[grammar] ~77-~77: There might be a mistake here.
Context: ...Constraints**: Must be > 0 when not null - Example: 3600000 (1 hour) would cap ...

(QB_NEW_EN_OTHER)


[grammar] ~78-~78: There might be a mistake here.
Context: ...ld cap all delays to a maximum of 1 hour ## Retry Strategies Exosphere supports thr...

(QB_NEW_EN_OTHER)


[grammar] ~80-~80: Use correct spacing
Context: ...a maximum of 1 hour ## Retry Strategies Exosphere supports three main categories...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~82-~82: Use correct spacing
Context: ...nts to prevent thundering herd problems. ### Exponential Strategies Exponential stra...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~84-~84: Use correct spacing
Context: ...rd problems. ### Exponential Strategies Exponential strategies increase the dela...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~86-~86: Use correct spacing
Context: ...y exponentially with each retry attempt. #### EXPONENTIAL Standard exponential backof...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~88-~88: Use correct spacing
Context: ...th each retry attempt. #### EXPONENTIAL Standard exponential backoff without jit...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~90-~90: Use correct spacing
Context: ...dard exponential backoff without jitter. Formula: `backoff_factor * (exponent ^...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~92-~92: Use correct spacing
Context: ...l backoff without jitter. Formula: backoff_factor * (exponent ^ (retry_count - 1)) Example: - Retry 1: 2000ms (2 seconds...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~94-~94: Use correct spacing
Context: ...nent ^ (retry_count - 1))` Example: - Retry 1: 2000ms (2 seconds) - Retry 2: 4...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~96-~96: There might be a mistake here.
Context: ..._count - 1))` Example: - Retry 1: 2000ms (2 seconds) - Retry 2: 4000ms (4 second...

(QB_NEW_EN_OTHER)


[grammar] ~96-~96: There might be a mistake here.
Context: ...xample**: - Retry 1: 2000ms (2 seconds) - Retry 2: 4000ms (4 seconds) - Retry 3: 8...

(QB_NEW_EN)


[grammar] ~97-~97: There might be a mistake here.
Context: ... Retry 1: 2000ms (2 seconds) - Retry 2: 4000ms (4 seconds) - Retry 3: 8000ms (8 second...

(QB_NEW_EN_OTHER)


[grammar] ~97-~97: There might be a mistake here.
Context: ...2 seconds) - Retry 2: 4000ms (4 seconds) - Retry 3: 8000ms (8 seconds) #### EXPONE...

(QB_NEW_EN)


[grammar] ~98-~98: There might be a mistake here.
Context: ... Retry 2: 4000ms (4 seconds) - Retry 3: 8000ms (8 seconds) #### EXPONENTIAL_FULL_JITT...

(QB_NEW_EN_OTHER)


[grammar] ~98-~98: Use correct spacing
Context: ...4 seconds) - Retry 3: 8000ms (8 seconds) #### EXPONENTIAL_FULL_JITTER Exponential bac...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~100-~100: Use correct spacing
Context: ...8 seconds) #### EXPONENTIAL_FULL_JITTER Exponential backoff with full jitter (ra...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~102-~102: Use correct spacing
Context: ...m delay between 0 and calculated delay). Formula: `random(0, backoff_factor * (...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~104-~104: Use correct spacing
Context: ... 0 and calculated delay). Formula: random(0, backoff_factor * (exponent ^ (retry_count - 1))) *Note: random(a, b) denotes a uniform r...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~106-~106: Use correct spacing
Context: ...om draw over the inclusive range [a, b].* Example: - Retry 1: 0-2000ms (random)...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~108-~108: Use correct spacing
Context: ...e inclusive range [a, b].* Example: - Retry 1: 0-2000ms (random) - Retry 2: 0-...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~110-~110: There might be a mistake here.
Context: ...Example**: - Retry 1: 0-2000ms (random) - Retry 2: 0-4000ms (random) - Retry 3: 0-...

(QB_NEW_EN)


[grammar] ~111-~111: There might be a mistake here.
Context: ...ms (random) - Retry 2: 0-4000ms (random) - Retry 3: 0-8000ms (random) #### EXPONEN...

(QB_NEW_EN)


[grammar] ~112-~112: Use correct spacing
Context: ...ms (random) - Retry 3: 0-8000ms (random) #### EXPONENTIAL_EQUAL_JITTER Exponential ba...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~114-~114: Use correct spacing
Context: ... (random) #### EXPONENTIAL_EQUAL_JITTER Exponential backoff with equal jitter (r...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~116-~116: Use correct spacing
Context: ...delay around half the calculated delay). Formula: `(backoff_factor * (exponent ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~118-~118: Use correct spacing
Context: ...lf the calculated delay). Formula: (backoff_factor * (exponent ^ (retry_count - 1))) / 2 + random(0, (backoff_factor * (exponent ^ (retry_count - 1))) / 2) *Note: random(a, b) denotes a uniform r...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~120-~120: Use correct spacing
Context: ...om draw over the inclusive range [a, b].* Example: - Retry 1: 1000-2000ms (rand...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~122-~122: Use correct spacing
Context: ...e inclusive range [a, b].* Example: - Retry 1: 1000-2000ms (random) - Retry 2:...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~124-~124: There might be a mistake here.
Context: ...[a, b].* Example: - Retry 1: 1000-2000ms (random) - Retry 2: 2000-4000ms (random...

(QB_NEW_EN_OTHER)


[grammar] ~124-~124: There might be a mistake here.
Context: ...mple**: - Retry 1: 1000-2000ms (random) - Retry 2: 2000-4000ms (random) - Retry 3:...

(QB_NEW_EN)


[grammar] ~125-~125: There might be a mistake here.
Context: ...1: 1000-2000ms (random) - Retry 2: 2000-4000ms (random) - Retry 3: 4000-8000ms (random...

(QB_NEW_EN_OTHER)


[grammar] ~125-~125: There might be a mistake here.
Context: ...(random) - Retry 2: 2000-4000ms (random) - Retry 3: 4000-8000ms (random) ### Linea...

(QB_NEW_EN)


[grammar] ~126-~126: There might be a mistake here.
Context: ...2: 2000-4000ms (random) - Retry 3: 4000-8000ms (random) ### Linear Strategies Linear...

(QB_NEW_EN_OTHER)


[grammar] ~126-~126: Use correct spacing
Context: ...(random) - Retry 3: 4000-8000ms (random) ### Linear Strategies Linear strategies inc...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~128-~128: Use correct spacing
Context: ...0-8000ms (random) ### Linear Strategies Linear strategies increase the delay lin...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~130-~130: Use correct spacing
Context: ... delay linearly with each retry attempt. #### LINEAR Standard linear backoff without ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~132-~132: Use correct spacing
Context: ...ly with each retry attempt. #### LINEAR Standard linear backoff without jitter. ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~134-~134: Use correct spacing
Context: ... Standard linear backoff without jitter. Formula: `backoff_factor * retry_count...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~136-~136: Use correct spacing
Context: ...r backoff without jitter. Formula: backoff_factor * retry_count Example: - Retry 1: 2000ms (2 seconds...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~138-~138: Use correct spacing
Context: ...koff_factor * retry_count` Example: - Retry 1: 2000ms (2 seconds) - Retry 2: 4...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~140-~140: There might be a mistake here.
Context: ... retry_count` Example: - Retry 1: 2000ms (2 seconds) - Retry 2: 4000ms (4 second...

(QB_NEW_EN_OTHER)


[grammar] ~140-~140: There might be a mistake here.
Context: ...xample**: - Retry 1: 2000ms (2 seconds) - Retry 2: 4000ms (4 seconds) - Retry 3: 6...

(QB_NEW_EN)


[grammar] ~141-~141: There might be a mistake here.
Context: ... Retry 1: 2000ms (2 seconds) - Retry 2: 4000ms (4 seconds) - Retry 3: 6000ms (6 second...

(QB_NEW_EN_OTHER)


[grammar] ~141-~141: There might be a mistake here.
Context: ...2 seconds) - Retry 2: 4000ms (4 seconds) - Retry 3: 6000ms (6 seconds) #### LINEAR...

(QB_NEW_EN)


[grammar] ~142-~142: There might be a mistake here.
Context: ... Retry 2: 4000ms (4 seconds) - Retry 3: 6000ms (6 seconds) #### LINEAR_FULL_JITTER L...

(QB_NEW_EN_OTHER)


[grammar] ~142-~142: Use correct spacing
Context: ...4 seconds) - Retry 3: 6000ms (6 seconds) #### LINEAR_FULL_JITTER Linear backoff with ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~144-~144: Use correct spacing
Context: ...0ms (6 seconds) #### LINEAR_FULL_JITTER Linear backoff with full jitter. **Form...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~146-~146: Use correct spacing
Context: ...JITTER Linear backoff with full jitter. Formula: `random(0, backoff_factor * r...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~148-~148: Use correct spacing
Context: ...backoff with full jitter. Formula: random(0, backoff_factor * retry_count) *Note: random(a, b) denotes a uniform r...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~150-~150: Use correct spacing
Context: ...om draw over the inclusive range [a, b].* Example: - Retry 1: 0-2000ms (random)...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~152-~152: Use correct spacing
Context: ...e inclusive range [a, b].* Example: - Retry 1: 0-2000ms (random) - Retry 2: 0-...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~154-~154: There might be a mistake here.
Context: ...Example**: - Retry 1: 0-2000ms (random) - Retry 2: 0-4000ms (random) - Retry 3: 0-...

(QB_NEW_EN)


[grammar] ~155-~155: There might be a mistake here.
Context: ...ms (random) - Retry 2: 0-4000ms (random) - Retry 3: 0-6000ms (random) #### LINEAR_...

(QB_NEW_EN)


[grammar] ~156-~156: Use correct spacing
Context: ...ms (random) - Retry 3: 0-6000ms (random) #### LINEAR_EQUAL_JITTER Linear backoff with...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~158-~158: Use correct spacing
Context: ...000ms (random) #### LINEAR_EQUAL_JITTER Linear backoff with equal jitter. **For...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~160-~160: Use correct spacing
Context: ...ITTER Linear backoff with equal jitter. Formula: `(backoff_factor * retry_coun...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~162-~162: Use correct spacing
Context: ...ackoff with equal jitter. Formula: (backoff_factor * retry_count) / 2 + random(0, (backoff_factor * retry_count) / 2) *Note: random(a, b) denotes a uniform r...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~164-~164: Use correct spacing
Context: ...om draw over the inclusive range [a, b].* Example: - Retry 1: 1000-2000ms (rand...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~166-~166: Use correct spacing
Context: ...e inclusive range [a, b].* Example: - Retry 1: 1000-2000ms (random) - Retry 2:...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~168-~168: There might be a mistake here.
Context: ...[a, b].* Example: - Retry 1: 1000-2000ms (random) - Retry 2: 2000-4000ms (random...

(QB_NEW_EN_OTHER)


[grammar] ~168-~168: There might be a mistake here.
Context: ...mple**: - Retry 1: 1000-2000ms (random) - Retry 2: 2000-4000ms (random) - Retry 3:...

(QB_NEW_EN)


[grammar] ~169-~169: There might be a mistake here.
Context: ...1: 1000-2000ms (random) - Retry 2: 2000-4000ms (random) - Retry 3: 3000-6000ms (random...

(QB_NEW_EN_OTHER)


[grammar] ~169-~169: There might be a mistake here.
Context: ...(random) - Retry 2: 2000-4000ms (random) - Retry 3: 3000-6000ms (random) ### Fixed...

(QB_NEW_EN)


[grammar] ~170-~170: There might be a mistake here.
Context: ...2: 2000-4000ms (random) - Retry 3: 3000-6000ms (random) ### Fixed Strategies Fixed s...

(QB_NEW_EN_OTHER)


[grammar] ~170-~170: Use correct spacing
Context: ...(random) - Retry 3: 3000-6000ms (random) ### Fixed Strategies Fixed strategies use a...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~172-~172: Use correct spacing
Context: ...00-6000ms (random) ### Fixed Strategies Fixed strategies use a constant delay fo...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~174-~174: Use correct spacing
Context: ...a constant delay for all retry attempts. #### FIXED Standard fixed delay without jitt...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~176-~176: Use correct spacing
Context: ...elay for all retry attempts. #### FIXED Standard fixed delay without jitter. **...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~178-~178: Use correct spacing
Context: ...ED Standard fixed delay without jitter. Formula: backoff_factor Example...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~180-~180: Use correct spacing
Context: ...xed delay without jitter. Formula: backoff_factor Example: - Retry 1: 2000ms (2 seconds...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~182-~182: Use correct spacing
Context: ...ormula**: backoff_factor Example: - Retry 1: 2000ms (2 seconds) - Retry 2: 2...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~184-~184: There might be a mistake here.
Context: ...ckoff_factor` Example: - Retry 1: 2000ms (2 seconds) - Retry 2: 2000ms (2 second...

(QB_NEW_EN_OTHER)


[grammar] ~184-~184: There might be a mistake here.
Context: ...xample**: - Retry 1: 2000ms (2 seconds) - Retry 2: 2000ms (2 seconds) - Retry 3: 2...

(QB_NEW_EN)


[grammar] ~185-~185: There might be a mistake here.
Context: ... Retry 1: 2000ms (2 seconds) - Retry 2: 2000ms (2 seconds) - Retry 3: 2000ms (2 second...

(QB_NEW_EN_OTHER)


[grammar] ~185-~185: There might be a mistake here.
Context: ...2 seconds) - Retry 2: 2000ms (2 seconds) - Retry 3: 2000ms (2 seconds) #### FIXED_...

(QB_NEW_EN)


[grammar] ~186-~186: There might be a mistake here.
Context: ... Retry 2: 2000ms (2 seconds) - Retry 3: 2000ms (2 seconds) #### FIXED_FULL_JITTER Fi...

(QB_NEW_EN_OTHER)


[grammar] ~186-~186: Use correct spacing
Context: ...2 seconds) - Retry 3: 2000ms (2 seconds) #### FIXED_FULL_JITTER Fixed delay with full...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~188-~188: Use correct spacing
Context: ...00ms (2 seconds) #### FIXED_FULL_JITTER Fixed delay with full jitter. **Formula...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~190-~190: Use correct spacing
Context: ...LL_JITTER Fixed delay with full jitter. Formula: random(0, backoff_factor) ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~192-~192: Use correct spacing
Context: ...d delay with full jitter. Formula: random(0, backoff_factor) *Note: random(a, b) denotes a uniform r...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~194-~194: Use correct spacing
Context: ...om draw over the inclusive range [a, b].* Example: - Retry 1: 0-2000ms (random)...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~196-~196: Use correct spacing
Context: ...e inclusive range [a, b].* Example: - Retry 1: 0-2000ms (random) - Retry 2: 0-...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~198-~198: There might be a mistake here.
Context: ...Example**: - Retry 1: 0-2000ms (random) - Retry 2: 0-2000ms (random) - Retry 3: 0-...

(QB_NEW_EN)


[grammar] ~199-~199: There might be a mistake here.
Context: ...ms (random) - Retry 2: 0-2000ms (random) - Retry 3: 0-2000ms (random) #### FIXED_E...

(QB_NEW_EN)


[grammar] ~200-~200: Use correct spacing
Context: ...ms (random) - Retry 3: 0-2000ms (random) #### FIXED_EQUAL_JITTER Fixed delay with equ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~202-~202: Use correct spacing
Context: ...2000ms (random) #### FIXED_EQUAL_JITTER Fixed delay with equal jitter. **Formul...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~204-~204: Use correct spacing
Context: ...L_JITTER Fixed delay with equal jitter. Formula: `backoff_factor / 2 + random(...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~206-~206: Use correct spacing
Context: ... delay with equal jitter. Formula: backoff_factor / 2 + random(0, backoff_factor / 2) *Note: random(a, b) denotes a uniform r...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~208-~208: Use correct spacing
Context: ...om draw over the inclusive range [a, b].* Example: - Retry 1: 1000-2000ms (rand...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~210-~210: Use correct spacing
Context: ...e inclusive range [a, b].* Example: - Retry 1: 1000-2000ms (random) - Retry 2:...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~212-~212: There might be a mistake here.
Context: ...[a, b].* Example: - Retry 1: 1000-2000ms (random) - Retry 2: 1000-2000ms (random...

(QB_NEW_EN_OTHER)


[grammar] ~212-~212: There might be a mistake here.
Context: ...mple**: - Retry 1: 1000-2000ms (random) - Retry 2: 1000-2000ms (random) - Retry 3:...

(QB_NEW_EN)


[grammar] ~213-~213: There might be a mistake here.
Context: ...1: 1000-2000ms (random) - Retry 2: 1000-2000ms (random) - Retry 3: 1000-2000ms (random...

(QB_NEW_EN_OTHER)


[grammar] ~213-~213: There might be a mistake here.
Context: ...(random) - Retry 2: 1000-2000ms (random) - Retry 3: 1000-2000ms (random) ## Delay ...

(QB_NEW_EN)


[grammar] ~214-~214: There might be a mistake here.
Context: ...2: 1000-2000ms (random) - Retry 3: 1000-2000ms (random) ## Delay Capping The retry p...

(QB_NEW_EN_OTHER)


[grammar] ~214-~214: Use correct spacing
Context: ...(random) - Retry 3: 1000-2000ms (random) ## Delay Capping The retry policy includes...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~216-~216: Use correct spacing
Context: ...: 1000-2000ms (random) ## Delay Capping The retry policy includes a built-in del...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~218-~218: Use correct spacing
Context: ...gressive exponential backoff strategies. ### How Delay Capping Works The _cap func...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~220-~220: Use correct spacing
Context: ...strategies. ### How Delay Capping Works The _cap function is applied to all ca...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~222-~222: Use correct spacing
Context: ...ion is applied to all calculated delays: python def _cap(value: int) -> int: if self.max_delay is not None: return min(value, self.max_delay) return value Behavior: - If max_delay is set, all...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~231-~231: There might be a mistake here.
Context: ...delay) return value ``` Behavior: - If max_delay is set, all calculated de...

(QB_NEW_EN)


[grammar] ~232-~232: There might be a mistake here.
Context: ...lculated delays are capped at this value - If max_delay is null (default), no c...

(QB_NEW_EN_OTHER)


[grammar] ~233-~233: There might be a mistake here.
Context: ... null (default), no capping is applied - The capping is applied after all strateg...

(QB_NEW_EN_OTHER)


[grammar] ~234-~234: Use correct spacing
Context: ...applied after all strategy calculations. ### Example with Delay Capping Consider an ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~236-~236: Use correct spacing
Context: ...lations. ### Example with Delay Capping Consider an exponential strategy with `b...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~238-~238: Use correct spacing
Context: ..., exponent: 2, and max_delay: 10000: With capping: - Retry 1: 2000ms - Retr...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~240-~240: There might be a mistake here.
Context: ...and max_delay: 10000: With capping: - Retry 1: 2000ms - Retry 2: 4000ms - Retr...

(QB_NEW_EN)


[grammar] ~241-~241: There might be a mistake here.
Context: ...: 10000`: With capping: - Retry 1: 2000ms - Retry 2: 4000ms - Retry 3: 8000ms - Retr...

(QB_NEW_EN_OTHER)


[grammar] ~242-~242: There might be a mistake here.
Context: ...capping:** - Retry 1: 2000ms - Retry 2: 4000ms - Retry 3: 8000ms - Retry 4: 10000ms (capp...

(QB_NEW_EN_OTHER)


[grammar] ~243-~243: There might be a mistake here.
Context: ... 1: 2000ms - Retry 2: 4000ms - Retry 3: 8000ms - Retry 4: 10000ms (capped at max_delay) ...

(QB_NEW_EN_OTHER)


[grammar] ~244-~244: There might be a mistake here.
Context: ... 2: 4000ms - Retry 3: 8000ms - Retry 4: 10000ms (capped at max_delay) ### When to Use ...

(QB_NEW_EN_OTHER)


[grammar] ~244-~244: Use correct spacing
Context: ...- Retry 4: 10000ms (capped at max_delay) ### When to Use Delay Capping - **Long-runn...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~246-~246: Use correct spacing
Context: ...ax_delay) ### When to Use Delay Capping - Long-running workflows: Prevent excess...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~248-~248: There might be a mistake here.
Context: ... impact overall workflow completion time - User-facing applications: Ensure retri...

(QB_NEW_EN_OTHER)


[grammar] ~249-~249: There might be a mistake here.
Context: ...ies don't create unacceptable wait times - Resource management: Control resource ...

(QB_NEW_EN_OTHER)


[grammar] ~250-~250: There might be a mistake here.
Context: ...rce consumption by limiting retry delays - Predictable behavior: Create more pred...

(QB_NEW_EN_OTHER)


[grammar] ~251-~251: There might be a mistake here.
Context: ...try patterns for monitoring and alerting ## Usage Examples ### Basic Exponential Re...

(QB_NEW_EN_OTHER)


[grammar] ~253-~253: Use correct spacing
Context: ...nitoring and alerting ## Usage Examples ### Basic Exponential Retry ```json { "re...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~255-~255: Use correct spacing
Context: ...ge Examples ### Basic Exponential Retry json { "retry_policy": { "max_retries": 3, "strategy": "EXPONENTIAL", "backoff_factor": 1000, "exponent": 2 } } ### Aggressive Retry with Jitter ```json { ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~268-~268: Use correct spacing
Context: ... } ### Aggressive Retry with Jitter json { "retry_policy": { "max_retries": 5, "strategy": "EXPONENTIAL_FULL_JITTER", "backoff_factor": 500, "exponent": 3 } } ### Conservative Linear Retry json { "...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~281-~281: Use correct spacing
Context: ... } } ### Conservative Linear Retry json { "retry_policy": { "max_retries": 2, "strategy": "LINEAR", "backoff_factor": 5000 } } ### Fixed Retry for Rate Limiting json {...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~293-~293: Use correct spacing
Context: ...} ### Fixed Retry for Rate Limiting json { "retry_policy": { "max_retries": 10, "strategy": "FIXED_EQUAL_JITTER", "backoff_factor": 1000 } } ``` ### Exponential Retry with Delay Capping ``...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~305-~305: Use correct spacing
Context: ...### Exponential Retry with Delay Capping json { "retry_policy": { "max_retries": 5, "strategy": "EXPONENTIAL", "backoff_factor": 2000, "exponent": 2, "max_delay": 30000 } } ### Conservative Retry with Maximum Delay `...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~319-~319: Use correct spacing
Context: ...## Conservative Retry with Maximum Delay json { "retry_policy": { "max_retries": 3, "strategy": "EXPONENTIAL_FULL_JITTER", "backoff_factor": 1000, "exponent": 3, "max_delay": 60000 } } ## When Retries Are Triggered Retries are ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~333-~333: Use correct spacing
Context: ... } } ``` ## When Retries Are Triggered Retries are automatically triggered when...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~335-~335: Use correct spacing
Context: ...etries are automatically triggered when: 1. A node execution fails with an error 2. ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~337-~337: There might be a mistake here.
Context: ... 1. A node execution fails with an error 2. The current retry count is less than `ma...

(QB_NEW_EN_OTHER)


[grammar] ~338-~338: There might be a mistake here.
Context: ... 2. The current retry count is less than max_retries 3. The state status is QUEUED or `EXECUTE...

(QB_NEW_EN_OTHER)


[grammar] ~339-~339: Use correct spacing
Context: ...ies 3. The state status is QUEUED or EXECUTED The retry mechanism: - Creates a new st...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~341-~341: Use correct spacing
Context: ...UED or EXECUTED The retry mechanism: - Creates a new state with retry_count i...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~343-~343: There might be a mistake here.
Context: ...tate with retry_count incremented by 1 - Sets enqueue_after to the current time...

(QB_NEW_EN_OTHER)


[grammar] ~344-~344: There might be a mistake here.
Context: ...e current time plus the calculated delay - Sets the original state status to `ERROR...

(QB_NEW_EN_OTHER)


[grammar] ~345-~345: There might be a mistake here.
Context: ...atus to ERRORED with the error message ## Best Practices ### Choose the Right Str...

(QB_NEW_EN_OTHER)


[grammar] ~347-~347: Use correct spacing
Context: ...ith the error message ## Best Practices ### Choose the Right Strategy - **EXPONENTI...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~349-~349: Use correct spacing
Context: ...Practices ### Choose the Right Strategy - EXPONENTIAL: Best for most transient f...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~351-~351: There might be a mistake here.
Context: ...ssues, temporary service unavailability) - LINEAR: Good for predictable, consiste...

(QB_NEW_EN)


[grammar] ~352-~352: There might be a mistake here.
Context: ... Good for predictable, consistent delays - FIXED: Useful for rate limiting scenar...

(QB_NEW_EN)


[grammar] ~353-~353: There might be a mistake here.
Context: ...tent delays - FIXED: Useful for rate limiting scenarios ### Use Jitter for H...

(QB_NEW_EN_OTHER)


[grammar] ~353-~353: Use correct spacing
Context: ...ED**: Useful for rate limiting scenarios ### Use Jitter for High Concurrency - **FUL...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~355-~355: Use correct spacing
Context: ...ios ### Use Jitter for High Concurrency - FULL_JITTER: Best for high concurrency...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~357-~357: There might be a mistake here.
Context: ...h concurrency to prevent thundering herd - EQUAL_JITTER: Good balance between pre...

(QB_NEW_EN)


[grammar] ~358-~358: There might be a mistake here.
Context: ...between predictability and randomization - No Jitter: Use only when you need dete...

(QB_NEW_EN)


[grammar] ~359-~359: Use correct spacing
Context: ...nly when you need deterministic behavior ### Set Appropriate Limits - *max_retries...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~361-~361: Use correct spacing
Context: ...tic behavior ### Set Appropriate Limits - max_retries: Consider the nature of yo...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~363-~363: There might be a mistake here.
Context: ...our failures and downstream dependencies - backoff_factor: Balance between respon...

(QB_NEW_EN_OTHER)


[grammar] ~364-~364: There might be a mistake here.
Context: ...etween responsiveness and resource usage - exponent: Higher values create more ag...

(QB_NEW_EN_OTHER)


[grammar] ~365-~365: There might be a mistake here.
Context: ...er values create more aggressive backoff - max_delay: Set a reasonable maximum de...

(QB_NEW_EN_OTHER)


[grammar] ~366-~366: There might be a mistake here.
Context: ...s, especially for exponential strategies ### Monitor Retry Patterns - Track retry co...

(QB_NEW_EN_OTHER)


[grammar] ~368-~368: Use correct spacing
Context: ...l strategies ### Monitor Retry Patterns - Track retry counts in your monitoring sy...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~370-~370: There might be a mistake here.
Context: ...k retry counts in your monitoring system - Set up alerts for graphs with high retry...

(QB_NEW_EN_OTHER)


[grammar] ~371-~371: There might be a mistake here.
Context: ... alerts for graphs with high retry rates - Analyze retry patterns to identify syste...

(QB_NEW_EN_OTHER)


[grammar] ~372-~372: There might be a mistake here.
Context: ...try patterns to identify systemic issues ## Limitations - Retry policies apply to a...

(QB_NEW_EN_OTHER)


[grammar] ~374-~374: Use correct spacing
Context: ...identify systemic issues ## Limitations - Retry policies apply to all nodes in a g...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~376-~376: There might be a mistake here.
Context: ... apply to all nodes in a graph uniformly - Individual node-level retry policies are...

(QB_NEW_EN_OTHER)


[grammar] ~377-~377: There might be a mistake here.
Context: ...e-level retry policies are not supported - Retry delays are calculated in milliseco...

(QB_NEW_EN_OTHER)


[grammar] ~378-~378: There might be a mistake here.
Context: ...ry delays are calculated in milliseconds - Maximum delay can be capped using the `m...

(QB_NEW_EN_OTHER)


[grammar] ~379-~379: There might be a mistake here.
Context: ...(recommended for long-running workflows) ## Error Handling If a retry policy config...

(QB_NEW_EN_OTHER)


[grammar] ~381-~381: Use correct spacing
Context: ...ng-running workflows) ## Error Handling If a retry policy configuration is inval...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~383-~383: Use correct spacing
Context: ...a retry policy configuration is invalid: - The graph template validation will fail ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~385-~385: There might be a mistake here.
Context: ... The graph template validation will fail - An error will be returned during graph c...

(QB_NEW_EN_OTHER)


[grammar] ~386-~386: There might be a mistake here.
Context: ...r will be returned during graph creation - The graph will not be saved until the co...

(QB_NEW_EN_OTHER)


[grammar] ~387-~387: There might be a mistake here.
Context: ...ved until the configuration is corrected ## Integration with Signals Retry policies...

(QB_NEW_EN_OTHER)


[grammar] ~389-~389: Use correct spacing
Context: ...s corrected ## Integration with Signals Retry policies work alongside Exosphere'...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~391-~391: Use correct spacing
Context: ... alongside Exosphere's signaling system: - Nodes can still raise PruneSignal to s...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~393-~393: There might be a mistake here.
Context: ...PruneSignal to stop retries immediately - Nodes can raise ReQueueAfterSignal to ...

(QB_NEW_EN_OTHER)

docs/docs/exosphere/create-graph.md

[grammar] ~136-~136: Use correct spacing
Context: ... (e.g., "42", "true"). ### Retry Policy Graphs can include a retry policy to han...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~138-~138: Use correct spacing
Context: ...d applies to all nodes within the graph. json { "retry_policy": { "max_retries": 3, "strategy": "EXPONENTIAL", "backoff_factor": 2000, // milliseconds "exponent": 2 } } For detailed information about retry pol...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~151-~151: Use correct spacing
Context: ... Policy](retry-policy.md) documentation. ## Creating Graph Templates The recommende...

(QB_NEW_EN_OTHER_ERROR_IDS_5)

🪛 markdownlint-cli2 (0.17.2)
docs/docs/exosphere/retry-policy.md

232-232: Lists should be surrounded by blank lines

(MD032, blanks-around-lists)


241-241: Lists should be surrounded by blank lines

(MD032, blanks-around-lists)


395-395: Trailing spaces
Expected: 0 or 2; Actual: 1

(MD009, no-trailing-spaces)


395-395: Files should end with a single newline character

(MD047, single-trailing-newline)

🔇 Additional comments (12)
state-manager/app/models/db/state.py (1)

54-61: Precomputing fingerprints in insert_many looks good.

Ensures before_event hooks aren’t missed on bulk insert; _generate_fingerprint is idempotent and cheap for non-unites states.

state-manager/app/models/db/graph_template_model.py (2)

23-23: Graph-level retry_policy addition looks good.

Defaulting via default_factory=RetryPolicyModel keeps existing templates backward compatible and documented.


302-307: Use mapping filter for find_one
GraphTemplate.find_one(GraphTemplate.namespace == namespace, GraphTemplate.name == graph_name) raises AttributeError; switch to:

- graph_template = await GraphTemplate.find_one(GraphTemplate.namespace == namespace, GraphTemplate.name == graph_name)
+ graph_template = await GraphTemplate.find_one({"namespace": namespace, "name": graph_name})

Align other find_one calls with this pattern.

docs/docs/exosphere/retry-policy.md (2)

92-99: Formulas now match 1-based retries — good catch.

Using exponent^(retry_count - 1) aligns with the examples.
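
As a quick sanity check of the 1-based formula, here is a minimal sketch that reproduces the documented examples. The function name `exponential_delay` and its keyword defaults are illustrative assumptions, not the repository's `compute_delay` implementation.

```python
from typing import Optional


def exponential_delay(
    retry_count: int,
    backoff_factor: int = 2000,
    exponent: int = 2,
    max_delay: Optional[int] = None,
) -> int:
    """Return the delay in milliseconds for a 1-based retry_count.

    Hypothetical helper: mirrors the documented formula
    backoff_factor * exponent ** (retry_count - 1), then applies the cap.
    """
    if retry_count < 1:
        raise ValueError("retry_count is 1-based and must be >= 1")
    delay = backoff_factor * exponent ** (retry_count - 1)
    # No cap when max_delay is None, matching the documented default.
    return delay if max_delay is None else min(delay, max_delay)


# Documented EXPONENTIAL example: backoff_factor=2000, exponent=2.
schedule = [exponential_delay(n) for n in (1, 2, 3)]

# Documented capping example: max_delay=10000 clamps the fourth retry.
capped = [exponential_delay(n, max_delay=10000) for n in (1, 2, 3, 4)]
```

With these inputs the first retry waits exactly `backoff_factor`, which is what the 1-based exponent is meant to guarantee.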


333-346: Verify trigger conditions vs. actual statuses.

Docs say retries trigger when status is QUEUED or EXECUTED. Cross-check with StateStatusEnum and errored_state flow; typically errors arise during execution (e.g., EXECUTING/RUNNING), not EXECUTED.

I can update the wording once you confirm the valid statuses.

docs/docs/exosphere/create-graph.md (3)

54-61: Good: example JSON includes retry_policy.

Including the field here reduces guesswork; unit comment on milliseconds is helpful.


136-151: Section reads cleanly and links to details.

The dedicated “Retry Policy” section plus cross-link is clear and actionable.


183-189: SDK examples pass retry_policy correctly.

Shows callers exactly how to supply the policy on create/update.

Also applies to: 301-307

state-manager/app/controller/errored_state.py (4)

1-1: Import looks correct

Needed for enqueue_after calculations.


6-6: Import looks correct

Handled explicitly later for idempotent retry creation.


11-11: Import OK, but lookup path below needs adjustment

See next comment on GraphTemplate lookup.


69-69: Response shape looks good

retry_created is surfaced as intended.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
state-manager/app/controller/errored_state.py (1)

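24-29: Simplify status validation to only allow QUEUED.
The second check rejects EXECUTED, so the first check's EXECUTED allowance is dead code and only QUEUED can ever pass; check for QUEUED directly and keep this consistent with the docs.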

-        if state.status != StateStatusEnum.QUEUED and state.status != StateStatusEnum.EXECUTED:
-            raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="State is not queued or executed")
-        
-        if state.status == StateStatusEnum.EXECUTED:
-            raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="State is already executed")
+        if state.status != StateStatusEnum.QUEUED:
+            raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="State is not queued")
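
To show that the simplification does not change which states are accepted, here is a small sketch that compares the two predicates over every status. The `StateStatus` enum is a hypothetical stand-in for `StateStatusEnum` and lists only the members mentioned in this review; the real enum may have more.

```python
from enum import Enum


class StateStatus(str, Enum):
    """Hypothetical stand-in for StateStatusEnum (members assumed)."""
    CREATED = "CREATED"
    QUEUED = "QUEUED"
    EXECUTED = "EXECUTED"
    ERRORED = "ERRORED"


def accepted_before(status: StateStatus) -> bool:
    """Original two-step validation: True means no HTTPException is raised."""
    if status != StateStatus.QUEUED and status != StateStatus.EXECUTED:
        return False  # "State is not queued or executed"
    if status == StateStatus.EXECUTED:
        return False  # "State is already executed"
    return True


def accepted_after(status: StateStatus) -> bool:
    """Suggested single check: only QUEUED passes."""
    return status == StateStatus.QUEUED


# The two predicates agree on every member of the stand-in enum.
equivalent = all(accepted_before(s) == accepted_after(s) for s in StateStatus)
```

The only observable difference is the error detail returned for an EXECUTED state, which changes from "State is already executed" to "State is not queued".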
♻️ Duplicate comments (6)
state-manager/app/models/db/state.py (2)

28-28: Enforce non-negative retry_count.
Add ge=0 to prevent invalid negative values at construction time.

-    retry_count: int = Field(default=0, description="Number of times the state has been retried")
+    retry_count: int = Field(default=0, ge=0, description="Number of times the state has been retried")

84-96: Consider partial index to reduce size/cost.
If uniqueness is only required for certain statuses, add partialFilterExpression to uniq_fanout_retry.

             IndexModel(
                 [
                     ("node_name", 1),
                     ("namespace_name", 1),
                     ("graph_name", 1),
                     ("identifier", 1),
                     ("run_id", 1),
                     ("retry_count", 1),
                     ("fanout_id", 1),
                 ],
                 unique=True,
-                name="uniq_fanout_retry"
+                name="uniq_fanout_retry",
+                partialFilterExpression={
+                    "status": {"$in": ["CREATED", "QUEUED"]},
+                }
             )
state-manager/app/controller/errored_state.py (2)

40-58: Optional: make retry creation idempotent without relying on exceptions.
Check for an existing CREATED/QUEUED next-attempt state before insert; still keep the unique index as a backstop.

-        if state.retry_count < graph_template.retry_policy.max_retries:
+        if state.retry_count < graph_template.retry_policy.max_retries:
+            next_attempt = state.retry_count + 1
+            existing = await State.find_one({
+                "namespace_name": state.namespace_name,
+                "graph_name": state.graph_name,
+                "run_id": state.run_id,
+                "node_name": state.node_name,
+                "identifier": state.identifier,
+                "retry_count": next_attempt,
+                "status": {"$in": [StateStatusEnum.CREATED, StateStatusEnum.QUEUED]}
+            })
+            if existing:
+                logger.info(f"Scheduled retry already exists for state {state_id} (attempt {next_attempt})", x_exosphere_request_id=x_exosphere_request_id)
+                retry_created = False
+            else:
             try:
-                retry_state = State(
+                delay_ms = graph_template.retry_policy.compute_delay(next_attempt)
+                retry_state = State(
                     node_name=state.node_name,
                     namespace_name=state.namespace_name,
                     identifier=state.identifier,
                     graph_name=state.graph_name,
                     run_id=state.run_id,
                     status=StateStatusEnum.CREATED,
                     inputs=state.inputs,
                     outputs={},
                     error=None,
                     parents=state.parents,
                     does_unites=state.does_unites,
-                    enqueue_after= int(time.time() * 1000) + graph_template.retry_policy.compute_delay(state.retry_count + 1),
-                    retry_count=state.retry_count + 1,
+                    enqueue_after=int(time.time() * 1000) + delay_ms,
+                    retry_count=next_attempt,
                     fanout_id=state.fanout_id
                 )
                 retry_state = await retry_state.insert()
-                logger.info(f"Retry state {retry_state.id} created for state {state_id}", x_exosphere_request_id=x_exosphere_request_id)
+                logger.info(f"Retry state {retry_state.id} created for state {state_id} (attempt {next_attempt}, delay_ms={delay_ms})", x_exosphere_request_id=x_exosphere_request_id)
                 retry_created = True
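
The check-then-insert pattern with a unique key as a backstop can be sketched without a database. Everything below is illustrative: `InMemoryStates` and `schedule_retry` are hypothetical names, the store only mimics a unique index, and the existence check ignores status for brevity (the suggested diff filters on CREATED/QUEUED).

```python
from typing import Dict, Tuple

Key = Tuple[object, ...]


class InMemoryStates:
    """Hypothetical stand-in for the State collection with a unique index."""

    def __init__(self) -> None:
        self._rows: Dict[Key, dict] = {}

    def exists(self, key: Key) -> bool:
        return key in self._rows

    def insert(self, key: Key, row: dict) -> None:
        if key in self._rows:
            # Plays the role of the database's duplicate-key error.
            raise KeyError("duplicate retry state")
        self._rows[key] = row


def schedule_retry(store: InMemoryStates, base_key: Key,
                   retry_count: int, max_retries: int) -> bool:
    """Return True only if a new retry state was created."""
    if retry_count >= max_retries:
        return False  # retries exhausted
    key = base_key + (retry_count + 1,)
    if store.exists(key):
        return False  # a retry for this attempt is already scheduled
    try:
        store.insert(key, {"status": "CREATED"})
    except KeyError:
        return False  # lost a race; the unique key is the backstop
    return True


store = InMemoryStates()
base = ("node", "namespace", "identifier", "graph", "run-1", "fanout-1")
first = schedule_retry(store, base, retry_count=0, max_retries=3)
second = schedule_retry(store, base, retry_count=0, max_retries=3)
```

Two notes on the suggested diff itself: the existing `try` block has to be indented under the new `else:` branch for the result to parse, and the lookup filter omits `fanout_id` even though the `uniq_fanout_retry` index includes it.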

30-36: GraphTemplate lookup robustness.
Prefer an explicit find_one with a dict filter and return 404 on None to avoid get() surprises.

-        try:
-            graph_template = await GraphTemplate.get(namespace_name, state.graph_name)
+        try:
+            graph_template = await GraphTemplate.find_one({"namespace": namespace_name, "name": state.graph_name})
+            if not graph_template:
+                raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="Graph template not found")
state-manager/app/models/retry_policy_model.py (2)

35-59: Normalize return types to int across all branches.
Minor consistency tweak; some branches already cast due to jitter.

-            return _cap(self.backoff_factor * (self.exponent ** (retry_count - 1)))
+            return _cap(int(self.backoff_factor * (self.exponent ** (retry_count - 1))))
...
-            return _cap(self.backoff_factor * retry_count)
+            return _cap(int(self.backoff_factor * retry_count))
...
-            return _cap(self.backoff_factor)
+            return _cap(int(self.backoff_factor))

Also applies to: 46-56, 57-59, 60-66
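
For the jitter branches, the same normalization amounts to casting once before capping. A minimal sketch for the equal-jitter formula from the docs follows; `fixed_equal_jitter` is a hypothetical helper, not the code in retry_policy_model.py.

```python
import random
from typing import Optional


def fixed_equal_jitter(backoff_factor: int, max_delay: Optional[int] = None) -> int:
    """Hypothetical helper: backoff_factor / 2 + random(0, backoff_factor / 2).

    The float result is truncated to int before the cap, so every branch
    returns whole milliseconds.
    """
    half = backoff_factor / 2
    delay = int(half + random.uniform(0, half))
    return delay if max_delay is None else min(delay, max_delay)


# With backoff_factor=2000 every sample falls within 1000-2000 ms.
samples = [fixed_equal_jitter(2000) for _ in range(1000)]
```

Casting before the cap keeps `min` comparing two integers, so the return type stays `int` whether or not `max_delay` is set.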


23-23: Fix description: default is None, not 1 hour.
Align Field description with the actual default.

-    max_delay: int | None = Field(default=None, description="The maximum delay in milliseconds (default: 3600000 = 1 hour)", gt=0)
+    max_delay: int | None = Field(default=None, description="The maximum delay in milliseconds (default: None = no cap)", gt=0)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 64592d8 and fded75b.

📒 Files selected for processing (4)
  • docs/docs/exosphere/retry-policy.md (1 hunks)
  • state-manager/app/controller/errored_state.py (2 hunks)
  • state-manager/app/models/db/state.py (4 hunks)
  • state-manager/app/models/retry_policy_model.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
state-manager/app/controller/errored_state.py (3)
state-manager/app/models/errored_models.py (2)
  • ErroredRequestModel (5-6)
  • ErroredResponseModel (9-11)
state-manager/app/models/db/state.py (1)
  • State (13-97)
state-manager/app/models/retry_policy_model.py (1)
  • compute_delay (25-69)
🪛 LanguageTool
docs/docs/exosphere/retry-policy.md

[grammar] ~1-~1: Use correct spacing
Context: # Retry Policy !!! beta "Beta Feature" Retry Policy...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~4-~4: Use correct spacing
Context: ...tionality may change in future releases. The Retry Policy feature in Exosphere pr...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~6-~6: Use correct spacing
Context: ...cution based on configurable strategies. ## Overview Retry policies are configured ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~8-~8: Use correct spacing
Context: ...on configurable strategies. ## Overview Retry policies are configured at the gra...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~10-~10: Use correct spacing
Context: ...delay before the next execution attempt. ## Configuration Retry policies are define...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~12-~12: Use correct spacing
Context: ...ext execution attempt. ## Configuration Retry policies are defined in your graph...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~14-~14: Use correct spacing
Context: ...ed in your graph template configuration: json { "secrets": { "api_key": "your-api-key" }, "nodes": [ { "node_name": "MyNode", "namespace": "MyProject", "identifier": "my_node", "inputs": { "data": "initial" }, "next_nodes": [] } ], "retry_policy": { "max_retries": 3, "strategy": "EXPONENTIAL", "backoff_factor": 2000, "exponent": 2, "max_delay": 3600000 } } ## Parameters ### max_retries - Type:...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~42-~42: Use correct spacing
Context: ...delay": 3600000 } } ``` ## Parameters ### max_retries - Type: int - **Defau...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~44-~44: Use correct spacing
Context: ... } } ``` ## Parameters ### max_retries - Type: int - Default: `3` - **Des...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~46-~46: There might be a mistake here.
Context: ...arameters ### max_retries - Type: int - Default: 3 - Description: The ma...

(QB_NEW_EN)


[grammar] ~47-~47: There might be a mistake here.
Context: ...tries - Type: int - Default: 3 - Description: The maximum number of ret...

(QB_NEW_EN)


[grammar] ~48-~48: There might be a mistake here.
Context: ...umber of retry attempts before giving up - Constraints: Must be >= 0 ### strateg...

(QB_NEW_EN)


[grammar] ~49-~49: Use correct spacing
Context: ...iving up - Constraints: Must be >= 0 ### strategy - Type: string - **Defau...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~51-~51: Use correct spacing
Context: ...onstraints**: Must be >= 0 ### strategy - Type: string - Default: `"EXPONE...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~53-~53: There might be a mistake here.
Context: ...Must be >= 0 ### strategy - Type: string - Default: "EXPONENTIAL" - **Descripti...

(QB_NEW_EN)


[grammar] ~54-~54: There might be a mistake here.
Context: ...gy - Type: string - Default: "EXPONENTIAL" - Description: The retry strategy to use...

(QB_NEW_EN)


[grammar] ~55-~55: There might be a mistake here.
Context: ...y strategy to use for calculating delays - Options: See [Retry Strategies](#retry...

(QB_NEW_EN_OTHER)


[grammar] ~56-~56: There might be a mistake here.
Context: ...try Strategies](#retry-strategies) below ### backoff_factor - Type: int (milli...

(QB_NEW_EN_OTHER)


[grammar] ~58-~58: Use correct spacing
Context: ...ry-strategies) below ### backoff_factor - Type: int (milliseconds) - **Default...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~60-~60: There might be a mistake here.
Context: ...factor - Type: int (milliseconds) - Default: 2000 (2 seconds) - **Descri...

(QB_NEW_EN)


[grammar] ~61-~61: There might be a mistake here.
Context: ...conds) - Default: 2000 (2 seconds) - Description: The base delay factor in ...

(QB_NEW_EN)


[grammar] ~62-~62: There might be a mistake here.
Context: ...*: The base delay factor in milliseconds - Constraints: Must be > 0 ### exponent...

(QB_NEW_EN)


[grammar] ~63-~63: Use correct spacing
Context: ...liseconds - Constraints: Must be > 0 ### exponent - Type: int - *Default...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~65-~65: Use correct spacing
Context: ...Constraints**: Must be > 0 ### exponent - Type: int - Default: 2 - **Des...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~67-~67: There might be a mistake here.
Context: ... Must be > 0 ### exponent - Type: int - Default: 2 - Description: The ex...

(QB_NEW_EN)


[grammar] ~68-~68: There might be a mistake here.
Context: ...onent - Type: int - Default: 2 - Description: The exponent used for exp...

(QB_NEW_EN)


[grammar] ~69-~69: There might be a mistake here.
Context: ...exponent used for exponential strategies - Constraints: Must be > 0 ### max_dela...

(QB_NEW_EN)


[grammar] ~70-~70: Use correct spacing
Context: ...trategies - Constraints: Must be > 0 ### max_delay - Type: int | null (mil...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~72-~72: Use correct spacing
Context: ...onstraints**: Must be > 0 ### max_delay - Type: int | null (milliseconds) - **...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~76-~76: There might be a mistake here.
Context: ... at this value using the _cap function - Constraints: Must be > 0 when not null...

(QB_NEW_EN_OTHER)


[grammar] ~77-~77: There might be a mistake here.
Context: ...Constraints**: Must be > 0 when not null - Example: 3600000 (1 hour) would cap ...

(QB_NEW_EN_OTHER)


[grammar] ~78-~78: There might be a mistake here.
Context: ...ld cap all delays to a maximum of 1 hour ## Retry Strategies Exosphere supports thr...

(QB_NEW_EN_OTHER)


[grammar] ~80-~80: Use correct spacing
Context: ...a maximum of 1 hour ## Retry Strategies Exosphere supports three main categories...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~82-~82: Use correct spacing
Context: ...nts to prevent thundering herd problems. ### Exponential Strategies Exponential stra...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~84-~84: Use correct spacing
Context: ...rd problems. ### Exponential Strategies Exponential strategies increase the dela...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~86-~86: Use correct spacing
Context: ...y exponentially with each retry attempt. #### EXPONENTIAL Standard exponential backof...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~88-~88: Use correct spacing
Context: ...th each retry attempt. #### EXPONENTIAL Standard exponential backoff without jit...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~90-~90: Use correct spacing
Context: ...dard exponential backoff without jitter. Formula: `backoff_factor * (exponent ^...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~92-~92: Use correct spacing
Context: ...l backoff without jitter. Formula: backoff_factor * (exponent ^ (retry_count - 1)) Example: - Retry 1: 2000ms (2 seconds...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~94-~94: Use correct spacing
Context: ...nent ^ (retry_count - 1))` Example: - Retry 1: 2000ms (2 seconds) - Retry 2: 4...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~96-~96: There might be a mistake here.
Context: ..._count - 1))` Example: - Retry 1: 2000ms (2 seconds) - Retry 2: 4000ms (4 second...

(QB_NEW_EN_OTHER)


[grammar] ~96-~96: There might be a mistake here.
Context: ...xample**: - Retry 1: 2000ms (2 seconds) - Retry 2: 4000ms (4 seconds) - Retry 3: 8...

(QB_NEW_EN)


[grammar] ~97-~97: There might be a mistake here.
Context: ... Retry 1: 2000ms (2 seconds) - Retry 2: 4000ms (4 seconds) - Retry 3: 8000ms (8 second...

(QB_NEW_EN_OTHER)


[grammar] ~97-~97: There might be a mistake here.
Context: ...2 seconds) - Retry 2: 4000ms (4 seconds) - Retry 3: 8000ms (8 seconds) #### EXPONE...

(QB_NEW_EN)


[grammar] ~98-~98: There might be a mistake here.
Context: ... Retry 2: 4000ms (4 seconds) - Retry 3: 8000ms (8 seconds) #### EXPONENTIAL_FULL_JITT...

(QB_NEW_EN_OTHER)


[grammar] ~98-~98: Use correct spacing
Context: ...4 seconds) - Retry 3: 8000ms (8 seconds) #### EXPONENTIAL_FULL_JITTER Exponential bac...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~100-~100: Use correct spacing
Context: ...8 seconds) #### EXPONENTIAL_FULL_JITTER Exponential backoff with full jitter (ra...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~102-~102: Use correct spacing
Context: ...m delay between 0 and calculated delay). Formula: `random(0, backoff_factor * (...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~104-~104: Use correct spacing
Context: ... 0 and calculated delay). Formula: random(0, backoff_factor * (exponent ^ (retry_count - 1))) *Note: random(a, b) denotes a uniform r...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~106-~106: Use correct spacing
Context: ...om draw over the inclusive range [a, b].* Example: - Retry 1: 0-2000ms (random)...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~108-~108: Use correct spacing
Context: ...e inclusive range [a, b].* Example: - Retry 1: 0-2000ms (random) - Retry 2: 0-...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~110-~110: There might be a mistake here.
Context: ...Example**: - Retry 1: 0-2000ms (random) - Retry 2: 0-4000ms (random) - Retry 3: 0-...

(QB_NEW_EN)


[grammar] ~111-~111: There might be a mistake here.
Context: ...ms (random) - Retry 2: 0-4000ms (random) - Retry 3: 0-8000ms (random) #### EXPONEN...

(QB_NEW_EN)


[grammar] ~112-~112: Use correct spacing
Context: ...ms (random) - Retry 3: 0-8000ms (random) #### EXPONENTIAL_EQUAL_JITTER Exponential ba...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~114-~114: Use correct spacing
Context: ... (random) #### EXPONENTIAL_EQUAL_JITTER Exponential backoff with equal jitter (r...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~116-~116: Use correct spacing
Context: ...delay around half the calculated delay). Formula: `(backoff_factor * (exponent ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~118-~118: Use correct spacing
Context: ...lf the calculated delay). Formula: (backoff_factor * (exponent ^ (retry_count - 1))) / 2 + random(0, (backoff_factor * (exponent ^ (retry_count - 1))) / 2) *Note: random(a, b) denotes a uniform r...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~120-~120: Use correct spacing
Context: ...om draw over the inclusive range [a, b].* Example: - Retry 1: 1000-2000ms (rand...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~122-~122: Use correct spacing
Context: ...e inclusive range [a, b].* Example: - Retry 1: 1000-2000ms (random) - Retry 2:...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~124-~124: There might be a mistake here.
Context: ...[a, b].* Example: - Retry 1: 1000-2000ms (random) - Retry 2: 2000-4000ms (random...

(QB_NEW_EN_OTHER)


[grammar] ~124-~124: There might be a mistake here.
Context: ...mple**: - Retry 1: 1000-2000ms (random) - Retry 2: 2000-4000ms (random) - Retry 3:...

(QB_NEW_EN)


[grammar] ~125-~125: There might be a mistake here.
Context: ...1: 1000-2000ms (random) - Retry 2: 2000-4000ms (random) - Retry 3: 4000-8000ms (random...

(QB_NEW_EN_OTHER)


[grammar] ~125-~125: There might be a mistake here.
Context: ...(random) - Retry 2: 2000-4000ms (random) - Retry 3: 4000-8000ms (random) ### Linea...

(QB_NEW_EN)


[grammar] ~126-~126: There might be a mistake here.
Context: ...2: 2000-4000ms (random) - Retry 3: 4000-8000ms (random) ### Linear Strategies Linear...

(QB_NEW_EN_OTHER)


[grammar] ~126-~126: Use correct spacing
Context: ...(random) - Retry 3: 4000-8000ms (random) ### Linear Strategies Linear strategies inc...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~128-~128: Use correct spacing
Context: ...0-8000ms (random) ### Linear Strategies Linear strategies increase the delay lin...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~130-~130: Use correct spacing
Context: ... delay linearly with each retry attempt. #### LINEAR Standard linear backoff without ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~132-~132: Use correct spacing
Context: ...ly with each retry attempt. #### LINEAR Standard linear backoff without jitter. ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~134-~134: Use correct spacing
Context: ... Standard linear backoff without jitter. Formula: `backoff_factor * retry_count...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~136-~136: Use correct spacing
Context: ...r backoff without jitter. Formula: backoff_factor * retry_count Example: - Retry 1: 2000ms (2 seconds...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~138-~138: Use correct spacing
Context: ...koff_factor * retry_count` Example: - Retry 1: 2000ms (2 seconds) - Retry 2: 4...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~140-~140: There might be a mistake here.
Context: ... retry_count` Example: - Retry 1: 2000ms (2 seconds) - Retry 2: 4000ms (4 second...

(QB_NEW_EN_OTHER)


[grammar] ~140-~140: There might be a mistake here.
Context: ...xample**: - Retry 1: 2000ms (2 seconds) - Retry 2: 4000ms (4 seconds) - Retry 3: 6...

(QB_NEW_EN)


[grammar] ~141-~141: There might be a mistake here.
Context: ... Retry 1: 2000ms (2 seconds) - Retry 2: 4000ms (4 seconds) - Retry 3: 6000ms (6 second...

(QB_NEW_EN_OTHER)


[grammar] ~141-~141: There might be a mistake here.
Context: ...2 seconds) - Retry 2: 4000ms (4 seconds) - Retry 3: 6000ms (6 seconds) #### LINEAR...

(QB_NEW_EN)


[grammar] ~142-~142: There might be a mistake here.
Context: ... Retry 2: 4000ms (4 seconds) - Retry 3: 6000ms (6 seconds) #### LINEAR_FULL_JITTER L...

(QB_NEW_EN_OTHER)


[grammar] ~142-~142: Use correct spacing
Context: ...4 seconds) - Retry 3: 6000ms (6 seconds) #### LINEAR_FULL_JITTER Linear backoff with ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~144-~144: Use correct spacing
Context: ...0ms (6 seconds) #### LINEAR_FULL_JITTER Linear backoff with full jitter. **Form...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~146-~146: Use correct spacing
Context: ...JITTER Linear backoff with full jitter. Formula: `random(0, backoff_factor * r...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~148-~148: Use correct spacing
Context: ...backoff with full jitter. Formula: random(0, backoff_factor * retry_count) *Note: random(a, b) denotes a uniform r...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~150-~150: Use correct spacing
Context: ...om draw over the inclusive range [a, b].* Example: - Retry 1: 0-2000ms (random)...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~152-~152: Use correct spacing
Context: ...e inclusive range [a, b].* Example: - Retry 1: 0-2000ms (random) - Retry 2: 0-...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~154-~154: There might be a mistake here.
Context: ...Example**: - Retry 1: 0-2000ms (random) - Retry 2: 0-4000ms (random) - Retry 3: 0-...

(QB_NEW_EN)


[grammar] ~155-~155: There might be a mistake here.
Context: ...ms (random) - Retry 2: 0-4000ms (random) - Retry 3: 0-6000ms (random) #### LINEAR_...

(QB_NEW_EN)


[grammar] ~156-~156: Use correct spacing
Context: ...ms (random) - Retry 3: 0-6000ms (random) #### LINEAR_EQUAL_JITTER Linear backoff with...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~158-~158: Use correct spacing
Context: ...000ms (random) #### LINEAR_EQUAL_JITTER Linear backoff with equal jitter. **For...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~160-~160: Use correct spacing
Context: ...ITTER Linear backoff with equal jitter. Formula: `(backoff_factor * retry_coun...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~162-~162: Use correct spacing
Context: ...ackoff with equal jitter. Formula: (backoff_factor * retry_count) / 2 + random(0, (backoff_factor * retry_count) / 2) *Note: random(a, b) denotes a uniform r...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~164-~164: Use correct spacing
Context: ...om draw over the inclusive range [a, b].* Example: - Retry 1: 1000-2000ms (rand...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~166-~166: Use correct spacing
Context: ...e inclusive range [a, b].* Example: - Retry 1: 1000-2000ms (random) - Retry 2:...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~168-~168: There might be a mistake here.
Context: ...[a, b].* Example: - Retry 1: 1000-2000ms (random) - Retry 2: 2000-4000ms (random...

(QB_NEW_EN_OTHER)


[grammar] ~168-~168: There might be a mistake here.
Context: ...mple**: - Retry 1: 1000-2000ms (random) - Retry 2: 2000-4000ms (random) - Retry 3:...

(QB_NEW_EN)


[grammar] ~169-~169: There might be a mistake here.
Context: ...1: 1000-2000ms (random) - Retry 2: 2000-4000ms (random) - Retry 3: 3000-6000ms (random...

(QB_NEW_EN_OTHER)


[grammar] ~169-~169: There might be a mistake here.
Context: ...(random) - Retry 2: 2000-4000ms (random) - Retry 3: 3000-6000ms (random) ### Fixed...

(QB_NEW_EN)


[grammar] ~170-~170: There might be a mistake here.
Context: ...2: 2000-4000ms (random) - Retry 3: 3000-6000ms (random) ### Fixed Strategies Fixed s...

(QB_NEW_EN_OTHER)


[grammar] ~170-~170: Use correct spacing
Context: ...(random) - Retry 3: 3000-6000ms (random) ### Fixed Strategies Fixed strategies use a...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~172-~172: Use correct spacing
Context: ...00-6000ms (random) ### Fixed Strategies Fixed strategies use a constant delay fo...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~174-~174: Use correct spacing
Context: ...a constant delay for all retry attempts. #### FIXED Standard fixed delay without jitt...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~176-~176: Use correct spacing
Context: ...elay for all retry attempts. #### FIXED Standard fixed delay without jitter. **...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~178-~178: Use correct spacing
Context: ...ED Standard fixed delay without jitter. Formula: backoff_factor Example...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~180-~180: Use correct spacing
Context: ...xed delay without jitter. Formula: backoff_factor Example: - Retry 1: 2000ms (2 seconds...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~182-~182: Use correct spacing
Context: ...ormula**: backoff_factor Example: - Retry 1: 2000ms (2 seconds) - Retry 2: 2...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~184-~184: There might be a mistake here.
Context: ...ckoff_factor` Example: - Retry 1: 2000ms (2 seconds) - Retry 2: 2000ms (2 second...

(QB_NEW_EN_OTHER)


[grammar] ~184-~184: There might be a mistake here.
Context: ...xample**: - Retry 1: 2000ms (2 seconds) - Retry 2: 2000ms (2 seconds) - Retry 3: 2...

(QB_NEW_EN)


[grammar] ~185-~185: There might be a mistake here.
Context: ... Retry 1: 2000ms (2 seconds) - Retry 2: 2000ms (2 seconds) - Retry 3: 2000ms (2 second...

(QB_NEW_EN_OTHER)


[grammar] ~185-~185: There might be a mistake here.
Context: ...2 seconds) - Retry 2: 2000ms (2 seconds) - Retry 3: 2000ms (2 seconds) #### FIXED_...

(QB_NEW_EN)


[grammar] ~186-~186: There might be a mistake here.
Context: ... Retry 2: 2000ms (2 seconds) - Retry 3: 2000ms (2 seconds) #### FIXED_FULL_JITTER Fi...

(QB_NEW_EN_OTHER)


[grammar] ~186-~186: Use correct spacing
Context: ...2 seconds) - Retry 3: 2000ms (2 seconds) #### FIXED_FULL_JITTER Fixed delay with full...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~188-~188: Use correct spacing
Context: ...00ms (2 seconds) #### FIXED_FULL_JITTER Fixed delay with full jitter. **Formula...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~190-~190: Use correct spacing
Context: ...LL_JITTER Fixed delay with full jitter. Formula: random(0, backoff_factor) ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~192-~192: Use correct spacing
Context: ...d delay with full jitter. Formula: random(0, backoff_factor) *Note: random(a, b) denotes a uniform r...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~194-~194: Use correct spacing
Context: ...om draw over the inclusive range [a, b].* Example: - Retry 1: 0-2000ms (random)...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~196-~196: Use correct spacing
Context: ...e inclusive range [a, b].* Example: - Retry 1: 0-2000ms (random) - Retry 2: 0-...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~198-~198: There might be a mistake here.
Context: ...Example**: - Retry 1: 0-2000ms (random) - Retry 2: 0-2000ms (random) - Retry 3: 0-...

(QB_NEW_EN)


[grammar] ~199-~199: There might be a mistake here.
Context: ...ms (random) - Retry 2: 0-2000ms (random) - Retry 3: 0-2000ms (random) #### FIXED_E...

(QB_NEW_EN)


[grammar] ~200-~200: Use correct spacing
Context: ...ms (random) - Retry 3: 0-2000ms (random) #### FIXED_EQUAL_JITTER Fixed delay with equ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~202-~202: Use correct spacing
Context: ...2000ms (random) #### FIXED_EQUAL_JITTER Fixed delay with equal jitter. **Formul...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~204-~204: Use correct spacing
Context: ...L_JITTER Fixed delay with equal jitter. Formula: `backoff_factor / 2 + random(...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~206-~206: Use correct spacing
Context: ... delay with equal jitter. Formula: backoff_factor / 2 + random(0, backoff_factor / 2) *Note: random(a, b) denotes a uniform r...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~208-~208: Use correct spacing
Context: ...om draw over the inclusive range [a, b].* Example: - Retry 1: 1000-2000ms (rand...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~210-~210: Use correct spacing
Context: ...e inclusive range [a, b].* Example: - Retry 1: 1000-2000ms (random) - Retry 2:...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~212-~212: There might be a mistake here.
Context: ...[a, b].* Example: - Retry 1: 1000-2000ms (random) - Retry 2: 1000-2000ms (random...

(QB_NEW_EN_OTHER)


[grammar] ~212-~212: There might be a mistake here.
Context: ...mple**: - Retry 1: 1000-2000ms (random) - Retry 2: 1000-2000ms (random) - Retry 3:...

(QB_NEW_EN)


[grammar] ~213-~213: There might be a mistake here.
Context: ...1: 1000-2000ms (random) - Retry 2: 1000-2000ms (random) - Retry 3: 1000-2000ms (random...

(QB_NEW_EN_OTHER)


[grammar] ~213-~213: There might be a mistake here.
Context: ...(random) - Retry 2: 1000-2000ms (random) - Retry 3: 1000-2000ms (random) ## Delay ...

(QB_NEW_EN)


[grammar] ~214-~214: There might be a mistake here.
Context: ...2: 1000-2000ms (random) - Retry 3: 1000-2000ms (random) ## Delay Capping The retry p...

(QB_NEW_EN_OTHER)


[grammar] ~214-~214: Use correct spacing
Context: ...(random) - Retry 3: 1000-2000ms (random) ## Delay Capping The retry policy includes...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~216-~216: Use correct spacing
Context: ...: 1000-2000ms (random) ## Delay Capping The retry policy includes a built-in del...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~218-~218: Use correct spacing
Context: ...gressive exponential backoff strategies. ### How Delay Capping Works The _cap func...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~220-~220: Use correct spacing
Context: ...strategies. ### How Delay Capping Works The _cap function is applied to all ca...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~222-~222: Use correct spacing
Context: ...ion is applied to all calculated delays: python def _cap(value: int) -> int: if self.max_delay is not None: return min(value, self.max_delay) return value Behavior: - If max_delay is set, all...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~231-~231: There might be a mistake here.
Context: ...delay) return value ``` Behavior: - If max_delay is set, all calculated de...

(QB_NEW_EN)


[grammar] ~232-~232: There might be a mistake here.
Context: ...lculated delays are capped at this value - If max_delay is null (default), no c...

(QB_NEW_EN_OTHER)


[grammar] ~233-~233: There might be a mistake here.
Context: ... null (default), no capping is applied - The capping is applied after all strateg...

(QB_NEW_EN_OTHER)


[grammar] ~234-~234: Use correct spacing
Context: ...applied after all strategy calculations. ### Example with Delay Capping Consider an ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~236-~236: Use correct spacing
Context: ...lations. ### Example with Delay Capping Consider an exponential strategy with `b...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~238-~238: Use correct spacing
Context: ..., exponent: 2, and max_delay: 10000: With capping: - Retry 1: 2000ms - Retr...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~240-~240: There might be a mistake here.
Context: ...and max_delay: 10000: With capping: - Retry 1: 2000ms - Retry 2: 4000ms - Retr...

(QB_NEW_EN)


[grammar] ~241-~241: There might be a mistake here.
Context: ...: 10000`: With capping: - Retry 1: 2000ms - Retry 2: 4000ms - Retry 3: 8000ms - Retr...

(QB_NEW_EN_OTHER)


[grammar] ~242-~242: There might be a mistake here.
Context: ...capping:** - Retry 1: 2000ms - Retry 2: 4000ms - Retry 3: 8000ms - Retry 4: 10000ms (capp...

(QB_NEW_EN_OTHER)


[grammar] ~243-~243: There might be a mistake here.
Context: ... 1: 2000ms - Retry 2: 4000ms - Retry 3: 8000ms - Retry 4: 10000ms (capped at max_delay) ...

(QB_NEW_EN_OTHER)


[grammar] ~244-~244: There might be a mistake here.
Context: ... 2: 4000ms - Retry 3: 8000ms - Retry 4: 10000ms (capped at max_delay) ### When to Use ...

(QB_NEW_EN_OTHER)


[grammar] ~244-~244: Use correct spacing
Context: ...- Retry 4: 10000ms (capped at max_delay) ### When to Use Delay Capping - **Long-runn...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~246-~246: Use correct spacing
Context: ...ax_delay) ### When to Use Delay Capping - Long-running workflows: Prevent excess...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~248-~248: There might be a mistake here.
Context: ... impact overall workflow completion time - User-facing applications: Ensure retri...

(QB_NEW_EN_OTHER)


[grammar] ~249-~249: There might be a mistake here.
Context: ...ies don't create unacceptable wait times - Resource management: Control resource ...

(QB_NEW_EN_OTHER)


[grammar] ~250-~250: There might be a mistake here.
Context: ...rce consumption by limiting retry delays - Predictable behavior: Create more pred...

(QB_NEW_EN_OTHER)


[grammar] ~251-~251: There might be a mistake here.
Context: ...try patterns for monitoring and alerting ## Usage Examples ### Basic Exponential Re...

(QB_NEW_EN_OTHER)


[grammar] ~253-~253: Use correct spacing
Context: ...nitoring and alerting ## Usage Examples ### Basic Exponential Retry ```json { "re...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~255-~255: Use correct spacing
Context: ...ge Examples ### Basic Exponential Retry json { "retry_policy": { "max_retries": 3, "strategy": "EXPONENTIAL", "backoff_factor": 1000, "exponent": 2 } } ### Aggressive Retry with Jitter ```json { ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~268-~268: Use correct spacing
Context: ... } ### Aggressive Retry with Jitter json { "retry_policy": { "max_retries": 5, "strategy": "EXPONENTIAL_FULL_JITTER", "backoff_factor": 500, "exponent": 3 } } ### Conservative Linear Retry json { "...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~281-~281: Use correct spacing
Context: ... } } ### Conservative Linear Retry json { "retry_policy": { "max_retries": 2, "strategy": "LINEAR", "backoff_factor": 5000 } } ### Fixed Retry for Rate Limiting json {...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~293-~293: Use correct spacing
Context: ...} ### Fixed Retry for Rate Limiting json { "retry_policy": { "max_retries": 10, "strategy": "FIXED_EQUAL_JITTER", "backoff_factor": 1000 } } ``` ### Exponential Retry with Delay Capping ``...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~305-~305: Use correct spacing
Context: ...### Exponential Retry with Delay Capping json { "retry_policy": { "max_retries": 5, "strategy": "EXPONENTIAL", "backoff_factor": 2000, "exponent": 2, "max_delay": 30000 } } ### Conservative Retry with Maximum Delay `...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~319-~319: Use correct spacing
Context: ...## Conservative Retry with Maximum Delay json { "retry_policy": { "max_retries": 3, "strategy": "EXPONENTIAL_FULL_JITTER", "backoff_factor": 1000, "exponent": 3, "max_delay": 60000 } } ## When Retries Are Triggered Retries are ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~333-~333: Use correct spacing
Context: ... } } ``` ## When Retries Are Triggered Retries are automatically triggered when...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~335-~335: Use correct spacing
Context: ...etries are automatically triggered when: 1. A node execution fails with an error 2. ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~337-~337: There might be a mistake here.
Context: ... 1. A node execution fails with an error 2. The current retry count is less than `ma...

(QB_NEW_EN_OTHER)


[grammar] ~338-~338: There might be a mistake here.
Context: ... 2. The current retry count is less than max_retries 3. The state status is QUEUED The retry ...

(QB_NEW_EN_OTHER)


[grammar] ~339-~339: Use correct spacing
Context: ...an max_retries 3. The state status is QUEUED The retry mechanism: - Creates a new st...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~341-~341: Use correct spacing
Context: ...status is QUEUED The retry mechanism: - Creates a new state with retry_count i...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~343-~343: There might be a mistake here.
Context: ...tate with retry_count incremented by 1 - Sets enqueue_after to the current time...

(QB_NEW_EN_OTHER)


[grammar] ~344-~344: There might be a mistake here.
Context: ...e current time plus the calculated delay - Sets the original state status to `ERROR...

(QB_NEW_EN_OTHER)


[grammar] ~345-~345: There might be a mistake here.
Context: ...atus to ERRORED with the error message ## Best Practices ### Choose the Right Str...

(QB_NEW_EN_OTHER)


[grammar] ~347-~347: Use correct spacing
Context: ...ith the error message ## Best Practices ### Choose the Right Strategy - **EXPONENTI...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~349-~349: Use correct spacing
Context: ...Practices ### Choose the Right Strategy - EXPONENTIAL: Best for most transient f...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~351-~351: There might be a mistake here.
Context: ...ssues, temporary service unavailability) - LINEAR: Good for predictable, consiste...

(QB_NEW_EN)


[grammar] ~352-~352: There might be a mistake here.
Context: ... Good for predictable, consistent delays - FIXED: Useful for rate limiting scenar...

(QB_NEW_EN)


[grammar] ~353-~353: There might be a mistake here.
Context: ...tent delays - FIXED: Useful for rate limiting scenarios ### Use Jitter for H...

(QB_NEW_EN_OTHER)


[grammar] ~353-~353: Use correct spacing
Context: ...ED**: Useful for rate limiting scenarios ### Use Jitter for High Concurrency - **FUL...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~355-~355: Use correct spacing
Context: ...ios ### Use Jitter for High Concurrency - FULL_JITTER: Best for high concurrency...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~357-~357: There might be a mistake here.
Context: ...h concurrency to prevent thundering herd - EQUAL_JITTER: Good balance between pre...

(QB_NEW_EN)


[grammar] ~358-~358: There might be a mistake here.
Context: ...between predictability and randomization - No Jitter: Use only when you need dete...

(QB_NEW_EN)


[grammar] ~359-~359: Use correct spacing
Context: ...nly when you need deterministic behavior ### Set Appropriate Limits - *max_retries...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~361-~361: Use correct spacing
Context: ...tic behavior ### Set Appropriate Limits - max_retries: Consider the nature of yo...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~363-~363: There might be a mistake here.
Context: ...our failures and downstream dependencies - backoff_factor: Balance between respon...

(QB_NEW_EN_OTHER)


[grammar] ~364-~364: There might be a mistake here.
Context: ...etween responsiveness and resource usage - exponent: Higher values create more ag...

(QB_NEW_EN_OTHER)


[grammar] ~365-~365: There might be a mistake here.
Context: ...er values create more aggressive backoff - max_delay: Set a reasonable maximum de...

(QB_NEW_EN_OTHER)


[grammar] ~366-~366: There might be a mistake here.
Context: ...s, especially for exponential strategies ### Monitor Retry Patterns - Track retry co...

(QB_NEW_EN_OTHER)


[grammar] ~368-~368: Use correct spacing
Context: ...l strategies ### Monitor Retry Patterns - Track retry counts in your monitoring sy...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~370-~370: There might be a mistake here.
Context: ...k retry counts in your monitoring system - Set up alerts for graphs with high retry...

(QB_NEW_EN_OTHER)


[grammar] ~371-~371: There might be a mistake here.
Context: ... alerts for graphs with high retry rates - Analyze retry patterns to identify syste...

(QB_NEW_EN_OTHER)


[grammar] ~372-~372: There might be a mistake here.
Context: ...try patterns to identify systemic issues ## Limitations - Retry policies apply to a...

(QB_NEW_EN_OTHER)


[grammar] ~374-~374: Use correct spacing
Context: ...identify systemic issues ## Limitations - Retry policies apply to all nodes in a g...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~376-~376: There might be a mistake here.
Context: ... apply to all nodes in a graph uniformly - Individual node-level retry policies are...

(QB_NEW_EN_OTHER)


[grammar] ~377-~377: There might be a mistake here.
Context: ...e-level retry policies are not supported - Retry delays are calculated in milliseco...

(QB_NEW_EN_OTHER)


[grammar] ~378-~378: There might be a mistake here.
Context: ...ry delays are calculated in milliseconds - Maximum delay can be capped using the `m...

(QB_NEW_EN_OTHER)


[grammar] ~379-~379: There might be a mistake here.
Context: ...(recommended for long-running workflows) ## Error Handling If a retry policy config...

(QB_NEW_EN_OTHER)


[grammar] ~381-~381: Use correct spacing
Context: ...ng-running workflows) ## Error Handling If a retry policy configuration is inval...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~383-~383: Use correct spacing
Context: ...a retry policy configuration is invalid: - The graph template validation will fail ...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~385-~385: There might be a mistake here.
Context: ... The graph template validation will fail - An error will be returned during graph c...

(QB_NEW_EN_OTHER)


[grammar] ~386-~386: There might be a mistake here.
Context: ...r will be returned during graph creation - The graph will not be saved until the co...

(QB_NEW_EN_OTHER)


[grammar] ~387-~387: There might be a mistake here.
Context: ...ved until the configuration is corrected ## Integration with Signals Retry policies...

(QB_NEW_EN_OTHER)


[grammar] ~389-~389: Use correct spacing
Context: ...s corrected ## Integration with Signals Retry policies work alongside Exosphere'...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~391-~391: Use correct spacing
Context: ... alongside Exosphere's signaling system: - Nodes can still raise PruneSignal to s...

(QB_NEW_EN_OTHER_ERROR_IDS_5)


[grammar] ~393-~393: There might be a mistake here.
Context: ...PruneSignalto stop retries immediately - Nodes can raiseReQueueAfterSignal` to ...

(QB_NEW_EN_OTHER)

🪛 markdownlint-cli2 (0.17.2)
docs/docs/exosphere/retry-policy.md

232-232: Lists should be surrounded by blank lines

(MD032, blanks-around-lists)


241-241: Lists should be surrounded by blank lines

(MD032, blanks-around-lists)


395-395: Files should end with a single newline character

(MD047, single-trailing-newline)

🔇 Additional comments (3)
state-manager/app/models/db/state.py (2)

29-29: Good fix: per-instance fanout_id.
default_factory with uuid ensures uniqueness and keeps the uniq_fanout_retry index meaningful.
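
To illustrate why a per-instance factory matters, here is a minimal sketch using the standard library's `dataclasses` rather than the project's actual model class. `StateSketch` and `node_name` are hypothetical names chosen for illustration; only `fanout_id` and the `default_factory` plus uuid idea come from the review comment.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StateSketch:
    # Hypothetical stand-in for the State model; only fanout_id matters here.
    node_name: str
    # default_factory is called once per instance, so each state gets a fresh id.
    # A plain default such as `fanout_id: str = str(uuid.uuid4())` would be
    # evaluated once at class definition time and shared by every instance.
    fanout_id: str = field(default_factory=lambda: str(uuid.uuid4()))


a = StateSketch(node_name="fetch")
b = StateSketch(node_name="fetch")
```

Because the two instances carry different `fanout_id` values, a unique index that includes `fanout_id` can still tell them apart.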


31-45: Fingerprint includes retry_count for uniqueness; LGTM.
This prevents accidental coalescing across attempts; canonical JSON and sha256 are appropriate.

Also applies to: 52-53
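
The fingerprint idea above can be sketched as follows. This is an assumption-laden illustration, not the code in `state.py`: the function name `state_fingerprint` and the fields `node_name` and `inputs` are hypothetical, and only `retry_count`, canonical JSON, and sha256 are taken from the review comment.

```python
import hashlib
import json


def state_fingerprint(node_name: str, inputs: dict, retry_count: int) -> str:
    """Return a stable hex digest for one attempt of a state (illustrative only)."""
    payload = {"node_name": node_name, "inputs": inputs, "retry_count": retry_count}
    # Canonical JSON: sorted keys and fixed separators so that logically equal
    # payloads always serialize to the same bytes.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


first_attempt = state_fingerprint("fetch", {"url": "x", "page": 1}, retry_count=0)
same_attempt = state_fingerprint("fetch", {"page": 1, "url": "x"}, retry_count=0)
second_attempt = state_fingerprint("fetch", {"url": "x", "page": 1}, retry_count=1)
```

Key order in `inputs` does not change the digest, while a different `retry_count` does, which is what keeps separate attempts from being coalesced.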

docs/docs/exosphere/retry-policy.md (1)

92-121: Formulas and jitter notes look correct.
Exponential examples match (retry_count - 1); jitter clarified as uniform.
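
As a cross-check of the documented formulas, the sketch below computes delays for the three strategy families with optional full or equal jitter and an optional cap. The function names `base_delay` and `compute_delay` are hypothetical, and the integer halving for equal jitter is an assumption; the formulas, strategy names, and the `backoff_factor`, `exponent`, and `max_delay` parameters follow the quoted documentation.

```python
import random
from typing import Optional


def base_delay(strategy: str, retry_count: int, backoff_factor: int, exponent: int) -> int:
    """Un-jittered delay in milliseconds for the strategy's family."""
    family = strategy.split("_")[0]
    if family == "EXPONENTIAL":
        return backoff_factor * exponent ** (retry_count - 1)
    if family == "LINEAR":
        return backoff_factor * retry_count
    if family == "FIXED":
        return backoff_factor
    raise ValueError(f"unknown strategy: {strategy}")


def compute_delay(
    strategy: str,
    retry_count: int,
    backoff_factor: int = 2000,
    exponent: int = 2,
    max_delay: Optional[int] = None,
) -> int:
    base = base_delay(strategy, retry_count, backoff_factor, exponent)
    if strategy.endswith("_FULL_JITTER"):
        # Uniform draw over the inclusive range [0, base].
        delay = random.randint(0, base)
    elif strategy.endswith("_EQUAL_JITTER"):
        # Half the base delay plus a uniform draw over [0, half].
        half = base // 2  # assumption: integer milliseconds
        delay = half + random.randint(0, half)
    else:
        delay = base
    # The cap is applied after the strategy calculation, as in the documented _cap helper.
    if max_delay is not None:
        delay = min(delay, max_delay)
    return delay
```

With the documented defaults, `compute_delay("EXPONENTIAL", 3)` gives 8000 and `compute_delay("LINEAR", 3)` gives 6000, matching the examples in the docs; a fourth exponential retry with `max_delay=10000` is capped at 10000.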

- Added missing newlines in the retry policy documentation for better readability.
- Removed redundant assignment in the errored state handling to streamline the code logic.
- Added retry policy integration in the test cases for errored state handling and upsert graph template.
- Updated test fixtures to include additional state attributes for better simulation of state behavior.
- Improved error handling assertions in the errored state tests to ensure proper HTTP exceptions are raised.
- Mocked retry policy in graph template tests to validate retry behavior during upsert operations.
@codecov

codecov bot commented Aug 31, 2025

Codecov Report

❌ Patch coverage is 98.76543% with 1 line in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| state-manager/app/models/retry_policy_model.py | 98.03% | 1 Missing ⚠️ |


- Removed the unused 'random' import from the test file for cleaner code.
- Updated the instantiation of RetryPolicyModel to improve readability.
- Introduced multiple test cases to cover scenarios where graph templates are not found, encounter other errors, or trigger DuplicateKeyError during state creation.
- Enhanced error handling assertions to ensure proper HTTP exceptions are raised for missing graph templates and other exceptions.
- Validated behavior when maximum retries are reached, ensuring no new state is created in such cases.
- Improved overall test coverage for errored state functionality.
- Updated assertions in the TestErroredState class to use 'not' instead of '== False' for improved readability.
- Ensured consistency in the test code style across multiple test cases.
- Eliminated the deploy-to-k8s job from the publish-state-manager workflow to streamline the CI/CD process.
- This change focuses on publishing the image without the deployment step, simplifying the workflow configuration.
@NiveditJain NiveditJain merged commit a69288d into exospherehost:main Aug 31, 2025
6 checks passed

Labels

enhancement New feature or request

Development

Successfully merging this pull request may close these issues.

Add support for retries in StateManager

2 participants