@codeflash-ai codeflash-ai bot commented Nov 7, 2025

📄 1,333% (13.33x) speedup for AzureOpenAIFilesAPI.delete_file in litellm/llms/azure/files/handler.py

⏱️ Runtime: 21.3 milliseconds → 1.49 milliseconds (best of 66 runs)

📝 Explanation and details

The optimized code achieves a 1332% speedup by eliminating the expensive locals() call and implementing conditional client caching.

Key optimizations:

  1. Replaced locals() with explicit dict construction (see the sketch after this list): The original code used client_initialization_params: dict = locals(), which copies all local variables into a dictionary on every call. The optimized version explicitly constructs the parameters dict with only the needed keys, avoiding the overhead of copying unnecessary variables and the cost of introspection.

  2. Combined isinstance checks: Changed from two separate isinstance calls to a single check using tuple syntax: isinstance(cached_client, (AzureOpenAI, AsyncAzureOpenAI)), reducing redundant type checking.

  3. Conditional cache setting: Added logic to only call set_cached_openai_client when a new client was created (if client is None or openai_client is not client:), avoiding unnecessary cache operations when reusing existing clients.

  4. Removed redundant or {} operation: Simplified litellm_params or {} to just litellm_params in the delete_file method since the default is handled elsewhere.
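The dict and caching changes (points 1–3) combine into a pattern like the minimal sketch below. This is an illustration of the described approach, not the actual handler source: the helper names get_cached_openai_client / set_cached_openai_client, the exact parameter set, and the in-memory cache are all assumptions standing in for litellm's real client cache.

from typing import Optional, Union

from openai import AsyncAzureOpenAI, AzureOpenAI

# Stand-in for litellm's client cache (assumption: keyed on the init params).
_client_cache: dict = {}


def get_cached_openai_client(params: dict):
    return _client_cache.get(tuple(sorted(params.items())))


def set_cached_openai_client(params: dict, client) -> None:
    _client_cache[tuple(sorted(params.items()))] = client


def get_azure_openai_client(
    api_key: Optional[str] = None,
    api_base: Optional[str] = None,
    api_version: Optional[str] = None,
    client: Optional[Union[AzureOpenAI, AsyncAzureOpenAI]] = None,
    _is_async: bool = False,
) -> Union[AzureOpenAI, AsyncAzureOpenAI]:
    # (1) Explicit dict instead of locals(): only the keys the cache needs.
    client_initialization_params = {
        "api_key": api_key,
        "api_base": api_base,
        "api_version": api_version,
        "_is_async": _is_async,
    }
    openai_client = client
    if openai_client is None:
        cached_client = get_cached_openai_client(client_initialization_params)
        # (2) One isinstance call with a tuple instead of two separate checks.
        if isinstance(cached_client, (AzureOpenAI, AsyncAzureOpenAI)):
            openai_client = cached_client
    if openai_client is None:
        client_cls = AsyncAzureOpenAI if _is_async else AzureOpenAI
        openai_client = client_cls(
            api_key=api_key, azure_endpoint=api_base, api_version=api_version
        )
    # (3) Write to the cache only when the caller did not supply the client.
    if client is None or openai_client is not client:
        set_cached_openai_client(client_initialization_params, openai_client)
    return openai_client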

Why this matters for performance:

  • The locals() function performs dictionary creation and variable introspection on every call, and the oversized parameter dict it produced also made the downstream cache write expensive: the profiler showed 64.5% of the original execution time spent in set_cached_openai_client (a micro-benchmark sketch follows this list)
  • Client caching becomes more efficient when unnecessary cache writes are avoided
  • The optimizations are most effective for high-frequency scenarios, as evidenced by test results showing 500-1900% improvements in repeated calls
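As a rough, self-contained illustration (not from the PR), the micro-benchmark below compares a locals()-style capture against an explicit four-key dict. Absolute numbers vary by machine, and in the real handler the larger win comes from the smaller dict making the downstream cache write cheaper:

import timeit


def capture_with_locals(api_key=None, api_base=None, api_version=None,
                        timeout=600.0, max_retries=2, organization=None,
                        client=None, _is_async=False):
    # Copies every local variable into a dict, needed or not.
    return locals()


def capture_explicit(api_key=None, api_base=None, api_version=None,
                     timeout=600.0, max_retries=2, organization=None,
                     client=None, _is_async=False):
    # Builds only the keys the cache lookup actually uses.
    return {"api_key": api_key, "api_base": api_base,
            "api_version": api_version, "_is_async": _is_async}


print("locals()      :", timeit.timeit(capture_with_locals, number=1_000_000))
print("explicit dict :", timeit.timeit(capture_explicit, number=1_000_000))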

Impact on workloads:
Based on the test results, these optimizations are particularly beneficial for:

  • Batch file operations (1304-1376% faster for 100-500 file deletions)
  • High-frequency API calls where client reuse is common
  • Any workflow that repeatedly calls Azure OpenAI client methods, since the client initialization is now significantly faster

Correctness verification report:

Test                            Status
⚙️ Existing Unit Tests          🔘 None Found
🌀 Generated Regression Tests   632 Passed
⏪ Replay Tests                 🔘 None Found
🔎 Concolic Coverage Tests      🔘 None Found
📊 Tests Coverage               72.7%
🌀 Generated Regression Tests and Runtime
import types

# imports
import pytest
from litellm.llms.azure.files.handler import AzureOpenAIFilesAPI


# Mocks for openai and file deleted objects
class FileDeleted:
    def __init__(self, id, deleted, object):
        self.id = id
        self.deleted = deleted
        self.object = object

    def __eq__(self, other):
        return (
            isinstance(other, FileDeleted)
            and self.id == other.id
            and self.deleted == other.deleted
            and self.object == other.object
        )

# Minimal AzureOpenAI/AsyncAzureOpenAI mock
class DummyFiles:
    def __init__(self, delete_func):
        self._delete_func = delete_func
    def delete(self, file_id):
        return self._delete_func(file_id)

class DummyAzureOpenAI:
    def __init__(self, delete_func):
        self.files = DummyFiles(delete_func)
        self._custom_query = {}

class DummyAsyncAzureOpenAI(DummyAzureOpenAI):
    async def files_delete(self, file_id):
        return self.files.delete(file_id)

# --- Test cases ---

# Helper for client injection
def make_get_client(delete_func, async_mode=False):
    def _get_client_func(**kwargs):
        if async_mode:
            return DummyAsyncAzureOpenAI(delete_func)
        return DummyAzureOpenAI(delete_func)
    return _get_client_func

#------------------------------------------------
import types

# imports
import pytest
from litellm.llms.azure.files.handler import AzureOpenAIFilesAPI

# --- Minimal stubs and helpers to enable testing ---

# Simulate the FileDeleted type returned by OpenAI/Azure SDK
class FileDeleted:
    def __init__(self, id, deleted, object):
        self.id = id
        self.deleted = deleted
        self.object = object

    def __eq__(self, other):
        return (
            isinstance(other, FileDeleted)
            and self.id == other.id
            and self.deleted == other.deleted
            and self.object == other.object
        )

    def __repr__(self):
        return f"FileDeleted(id={self.id!r}, deleted={self.deleted!r}, object={self.object!r})"

# Simulate AzureOpenAI and AsyncAzureOpenAI clients
class DummyFilesAPI:
    def __init__(self, delete_behavior):
        self.delete_behavior = delete_behavior
        self.delete_calls = []

    def delete(self, file_id):
        self.delete_calls.append(file_id)
        # delete_behavior can be: callable or value
        if callable(self.delete_behavior):
            return self.delete_behavior(file_id)
        return self.delete_behavior

class DummyAzureOpenAI:
    def __init__(self, delete_behavior):
        self.files = DummyFilesAPI(delete_behavior)
        self._custom_query = {}

# --- Pytest test suite ---

@pytest.fixture
def api():
    # Fixture for the API class
    return AzureOpenAIFilesAPI()

# ---------------- BASIC TEST CASES ----------------

def test_delete_file_success(api):
    """Test normal deletion: returns FileDeleted object."""
    dummy_client = DummyAzureOpenAI(delete_behavior=lambda file_id: FileDeleted(file_id, True, "file"))
    codeflash_output = api.delete_file(
        _is_async=False,
        file_id="abc123",
        api_base=None,
        api_key=None,
        timeout=1,
        max_retries=None,
        client=dummy_client,
    ); result = codeflash_output # 86.7μs -> 14.1μs (515% faster)

def test_delete_file_returns_empty_string(api):
    """Test Azure returns empty string: should wrap in FileDeleted."""
    dummy_client = DummyAzureOpenAI(delete_behavior=lambda file_id: "")
    codeflash_output = api.delete_file(
        _is_async=False,
        file_id="file42",
        api_base=None,
        api_key=None,
        timeout=1,
        max_retries=None,
        client=dummy_client,
    ); result = codeflash_output # 69.1μs -> 8.61μs (703% faster)

def test_delete_file_with_custom_object(api):
    """Test Azure returns a non-FileDeleted object: should wrap in FileDeleted."""
    dummy_client = DummyAzureOpenAI(delete_behavior=lambda file_id: {"foo": "bar"})
    codeflash_output = api.delete_file(
        _is_async=False,
        file_id="file99",
        api_base=None,
        api_key=None,
        timeout=1,
        max_retries=None,
        client=dummy_client,
    ); result = codeflash_output # 65.9μs -> 8.20μs (704% faster)

# ---------------- EDGE TEST CASES ----------------


def test_delete_file_async_wrong_client(api):
    """Test async mode with wrong client type: should raise ValueError."""
    # Pass a client with no files attribute
    class NotAsyncClient:
        pass
    with pytest.raises(ValueError, match="AzureOpenAI client is not an instance of AsyncAzureOpenAI"):
        api.delete_file(
            _is_async=True,
            file_id="edge2",
            api_base=None,
            api_key=None,
            timeout=1,
            max_retries=None,
            client=NotAsyncClient(),
        ) # 72.4μs -> 3.59μs (1916% faster)

def test_delete_file_file_id_empty(api):
    """Test with empty file_id: should still call delete."""
    dummy_client = DummyAzureOpenAI(delete_behavior=lambda file_id: FileDeleted(file_id, True, "file"))
    codeflash_output = api.delete_file(
        _is_async=False,
        file_id="",
        api_base=None,
        api_key=None,
        timeout=1,
        max_retries=None,
        client=dummy_client,
    ); result = codeflash_output # 71.9μs -> 13.5μs (433% faster)

def test_delete_file_file_id_special_chars(api):
    """Test with special characters in file_id."""
    special_id = "file_!@#%^&*()_+"
    dummy_client = DummyAzureOpenAI(delete_behavior=lambda file_id: FileDeleted(file_id, True, "file"))
    codeflash_output = api.delete_file(
        _is_async=False,
        file_id=special_id,
        api_base=None,
        api_key=None,
        timeout=1,
        max_retries=None,
        client=dummy_client,
    ); result = codeflash_output # 66.2μs -> 9.01μs (635% faster)

def test_delete_file_delete_raises_exception(api):
    """Test if client.files.delete raises an exception: should propagate."""
    def raise_exc(file_id):
        raise RuntimeError("Delete failed!")
    dummy_client = DummyAzureOpenAI(delete_behavior=raise_exc)
    with pytest.raises(RuntimeError, match="Delete failed!"):
        api.delete_file(
            _is_async=False,
            file_id="file_exc",
            api_base=None,
            api_key=None,
            timeout=1,
            max_retries=None,
            client=dummy_client,
        ) # 59.9μs -> 3.84μs (1460% faster)

# ---------------- LARGE SCALE TEST CASES ----------------

def test_delete_file_many_files(api):
    """Test deleting many files in succession (scalability)."""
    deleted_ids = []
    def behavior(file_id):
        deleted_ids.append(file_id)
        return FileDeleted(file_id, True, "file")
    dummy_client = DummyAzureOpenAI(delete_behavior=behavior)
    for i in range(100):  # keep under 1000 as per instruction
        file_id = f"file_{i}"
        codeflash_output = api.delete_file(
            _is_async=False,
            file_id=file_id,
            api_base=None,
            api_key=None,
            timeout=1,
            max_retries=None,
            client=dummy_client,
        ); result = codeflash_output # 3.51ms -> 249μs (1304% faster)


def test_delete_file_performance(api):
    """Test that deleting 500 files does not take excessive time (performance)."""
    # Not a strict timing test, but ensures function can handle a batch
    dummy_client = DummyAzureOpenAI(delete_behavior=lambda file_id: FileDeleted(file_id, True, "file"))
    for i in range(500):
        file_id = f"perf_{i}"
        codeflash_output = api.delete_file(
            _is_async=False,
            file_id=file_id,
            api_base=None,
            api_key=None,
            timeout=1,
            max_retries=None,
            client=dummy_client,
        ); result = codeflash_output # 17.2ms -> 1.17ms (1376% faster)

# ---------------- ASYNC TESTS (BASIC) ----------------

@pytest.mark.asyncio
async def test_adelete_file_success(api):
    """Test async deletion: returns FileDeleted object."""
    class AsyncFilesAPI:
        async def delete(self, file_id):
            return FileDeleted(file_id, True, "file")
    class DummyAsyncClient:
        def __init__(self):
            self.files = AsyncFilesAPI()
    dummy_client = DummyAsyncClient()
    result = await api.adelete_file(
        file_id="async_1",
        openai_client=dummy_client,
    )

@pytest.mark.asyncio
async def test_adelete_file_returns_empty_string(api):
    """Test async: Azure returns empty string, should wrap in FileDeleted."""
    class AsyncFilesAPI:
        async def delete(self, file_id):
            return ""
    class DummyAsyncClient:
        def __init__(self):
            self.files = AsyncFilesAPI()
    dummy_client = DummyAsyncClient()
    result = await api.adelete_file(
        file_id="async_2",
        openai_client=dummy_client,
    )
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, check out the codeflash/optimize-AzureOpenAIFilesAPI.delete_file-mho9wqfz branch and push.


@codeflash-ai codeflash-ai bot requested a review from mashraf-222 November 7, 2025 03:05
@codeflash-ai codeflash-ai bot added the ⚡️ codeflash (Optimization PR opened by Codeflash AI) and 🎯 Quality: High (Optimization Quality according to Codeflash) labels Nov 7, 2025