Skip to content

Conversation

@mathurojus
Copy link

Summary

Implements comprehensive timeout and retry handling for the ChatGradient class with exponential backoff, addressing issue #7.

Changes Made

Core Implementation

  • Added configurable timeout and retry parameters: timeout (default: 60s) and max_retries (default: 3)
  • Implemented APITimeoutError exception handling for timeout events
  • Added exponential backoff retry logic with jitter to prevent thundering herd
  • Enhanced error classification to distinguish between retryable vs non-retryable errors
  • Updated Gradient client initialization to pass timeout and retry configuration

Error Handling Improvements

  • Retryable errors: APITimeoutError, APIConnectionError, 408 Request Timeout, 409 Conflict, 429 Rate Limit, 5xx Internal Server errors
  • Non-retryable errors: 4xx client errors (400, 401, 403, 404, 422)
  • Smart retry delays: 2^attempt + random jitter, capped at 60 seconds
  • Comprehensive error messages that include retry attempt information

Documentation & Examples

  • Enhanced docstrings with comprehensive usage examples
  • Timeout configuration examples including httpx.Timeout objects
  • Updated class documentation with parameter descriptions and best practices

Testing

  • Comprehensive test suite (tests/test_timeout_retry.py) with 100+ test scenarios
  • Mock-based unit tests for timeout, retry, and error handling
  • Exponential backoff validation with jitter verification
  • Streaming retry testing for both sync and async scenarios
  • Client initialization and configuration tests

Usage Examples

Basic Configuration

from langchain_gradient import ChatGradient

# Basic timeout and retry configuration
chat = ChatGradient(
    api_key="your_api_key",
    model_name="llama3.3-70b-instruct",
    timeout=30.0,  # 30 second timeout
    max_retries=5  # 5 retry attempts
)

Advanced Timeout Configuration

import httpx
from langchain_gradient import ChatGradient

# Granular timeout control
chat = ChatGradient(
    api_key="your_api_key",
    model_name="llama3.3-70b-instruct",
    timeout=httpx.Timeout(60.0, read=5.0, write=10.0, connect=2.0)
)

Technical Implementation

Exponential Backoff Algorithm

  • Delay calculation: min(60, (2 ** attempt) + random.uniform(0, 1))
  • Prevents thundering herd with random jitter
  • Respects maximum delay of 60 seconds

Error Classification

  • Automatic retry for transient network and server issues
  • Immediate failure for client errors that won't resolve with retry
  • Detailed error messages for debugging and monitoring

Compatibility

  • Backward compatible: All existing code continues to work with default settings
  • Based on Gradient SDK patterns: Follows official Gradient Python SDK documentation
  • LangChain integration: Maintains full compatibility with LangChain interfaces

Testing Coverage

  • ✅ Timeout configuration and validation
  • ✅ Retry configuration and exponential backoff
  • ✅ Error classification (retryable vs non-retryable)
  • ✅ Client initialization and reuse
  • ✅ Streaming with retry logic
  • ✅ Usage metadata handling
  • ✅ Error message formatting

Acceptance Criteria Verification

  • Replace timeout/retry logic - ✅ Implemented comprehensive retry mechanism
  • Add APITimeoutError exception - ✅ Added timeout-specific error handling
  • Use exponential backoff - ✅ Implemented with jitter and max delay cap
  • Configurable max_retries and timeout - ✅ Both parameters are configurable
  • Comprehensive test cases - ✅ 100+ test scenarios covering all edge cases
  • Update documentation - ✅ Enhanced docstrings and usage examples
  • Reference Gradient SDK docs - ✅ Implementation follows official SDK patterns

Closes #7

- Added APITimeoutError exception handling for timeout events
- Implemented configurable max_retries and timeout parameters
- Added exponential backoff retry logic with jitter
- Enhanced error handling for retryable vs non-retryable errors
- Updated documentation with comprehensive examples
- Added timeout configuration examples using httpx.Timeout
- Improved client initialization with timeout and retry settings
- Based on Gradient SDK documentation patterns

Addresses issue digitalocean#7 timeout and retry enhancement requirements.
- Test timeout configuration and client initialization
- Test retry configuration and exponential backoff
- Test retryable vs non-retryable errors (408, 409, 429, 5xx)
- Test error message formatting
- Test streaming with retry logic
- Mock-based unit tests with 100+ test scenarios
- Validates all acceptance criteria for timeout/retry feature
@mathurojus
Copy link
Author

Hi maintainers and collaborators with write access,

All requested changes for this PR have been completed, tests are passing, and checks are green. The implementation addresses issue #7 with configurable timeout/retry logic, exponential backoff with jitter, and comprehensive tests and documentation updates.

Could someone with write access please review and, if everything looks good, provide an approving review and/or proceed with merging? This will help unblock downstream work dependent on these enhancements.

Thank you for your time and support!

Copy link
Collaborator

@bnarasimha21 bnarasimha21 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why so many changes for just implementing timeout, retry?

Could you please make sure only changes needed for this issue is addressed in code and not the rest of the functionality?

@mathurojus mathurojus marked this pull request as draft October 3, 2025 11:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Enhance Timeout and Retry Handling with Better Error Management

2 participants