Skip to content

Conversation

@strawgate
Copy link
Owner

@strawgate strawgate commented Oct 27, 2025

Summary

Adds optional native_storage mode to ElasticsearchStore for storing dictionaries as Elasticsearch flattened objects instead of JSON strings.

Changes

  • Add native_storage parameter to ElasticsearchStore constructor (defaults to False)
  • Update mapping to include value_flattened field (flattened type)
  • Store values in appropriate field based on mode
  • Read from both fields to support migration
  • Add comprehensive tests for native storage and migration

Benefits

  • More efficient storage and retrieval
  • Better aligned with Elasticsearch's document model
  • Enables basic querying on nested fields
  • Backward compatible with gradual migration support

Note

Sync library needs to be regenerated with make codegen using Python 3.10.

Closes #87


Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Optional native storage mode for Elasticsearch-backed stores: values may be stored as native flattened objects or as JSON strings, with migration support and serializer installation.
  • Bug Fixes

    • Improved detection and clearer reporting of serialization/deserialization errors during store operations.
  • Tests

    • Expanded tests covering native vs non‑native modes, migrations, TTL, round‑trip integrity, and index cleanup.
    • Added container lifecycle helpers for test setup/teardown.
  • Chores

    • Expanded type-checking scope (pyright).

Add optional `native_storage` parameter to ElasticsearchStore that stores
values as Elasticsearch flattened objects instead of JSON strings.

Key changes:
- Add `native_storage` parameter to constructor (defaults to False)
- Update mapping to include `value_flattened` field (flattened type)
- Update `managed_entry_to_document` to store in appropriate field
- Update `source_to_managed_entry` to read from both fields for migration
- Add comprehensive tests for native storage mode and migration

Benefits:
- More efficient storage and retrieval
- Better aligned with Elasticsearch's document model
- Enables basic querying on nested fields

The dual-field approach (value + value_flattened) enables backward
compatibility and gradual migration from legacy JSON string mode.

Related to #87

Co-authored-by: William Easton <strawgate@users.noreply.github.com>
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Oct 27, 2025

📝 Walkthrough

Walkthrough

Adds an optional native_storage mode to Elasticsearch stores (sync and async) to store values as flattened objects or as JSON strings; introduces custom Json/Ndjson serializers and bulk-action builders; updates read/write and error-mapping paths; and expands tests and test fixtures to validate native/non-native storage and migrations.

Changes

Cohort / File(s) Summary
Shared utilities
key-value/key-value-shared/src/key_value/shared/utils/managed_entry.py
Added value_as_dict property; value_as_json and dump_to_json now use it and serialize with sort_keys=True.
Async Elasticsearch store
key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py
Added _native_storage field and native_storage ctor param; managed_entry_to_document(..., native_storage=...) now writes to value.flattened (native) or value.string (non-native); source_to_managed_entry reads flattened first then string; installs custom serializers; maps client serialization errors to public SerializationError; updated DEFAULT_MAPPING.
Sync Elasticsearch store
key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/store.py
Mirror of async changes: native_storage flag added, (de)serialization updated to handle value.flattened vs value.string, serializer installation, error mapping, and ctor overload updates.
Async utils
key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py
Added new_bulk_action builder; introduced LessCapableJsonSerializer and LessCapableNdjsonSerializer (mimetype/compatibility_mimetype, install helpers); removed exported managed_entry_to_document.
Sync utils
key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/utils.py
Parallel additions: new_bulk_action, LessCapableJsonSerializer, LessCapableNdjsonSerializer, and serializer install helpers; removed managed_entry_to_document.
Async tests & fixtures
key-value/key-value-aio/tests/conftest.py, key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py
Added docker_get and docker_wait_container_gone; new BaseTestElasticsearchStore for setup/cleanup; added native/non-native test classes and migration tests; updated document-shape expectations and cleanup flow.
Sync tests & fixtures
key-value/key-value-sync/tests/code_gen/conftest.py, key-value/key-value-sync/tests/code_gen/stores/elasticsearch/test_elasticsearch.py
Added docker_get and docker_wait_container_gone; updated tests and scaffolding to validate native vs non-native storage, migrations, and adjusted expectations.
Test helpers (common)
key-value/key-value-aio/tests/conftest.py, key-value/key-value-sync/tests/code_gen/conftest.py
Added docker_get and docker_wait_container_gone utilities; docker lifecycle waits prior to run; docker_stop/docker_rm updated to use docker_get.
Misc / Config / Scripts
pyproject.toml, scripts/build_sync_library.py, key-value/key-value-sync/src/key_value/sync/code_gen/stores/base.py
Added Pyright include for package; added extra lint invocations in build script; minor blank-line cleanup in base store code.
Other tests updated
key-value/key-value-aio/tests/stores/redis/test_redis.py, key-value/key-value-sync/tests/code_gen/stores/redis/test_redis.py
Updated expected serialized JSON field order in test snapshots (created_at/expires_at before value).

Sequence Diagram(s)

sequenceDiagram
    participant App as Application
    participant Store as ElasticsearchStore
    participant Doc as Document Builder
    participant ES as Elasticsearch

    rect rgb(238,247,255)
    Note over Store,Doc: Put — Native mode (native_storage=true)
    App->>Store: put(key, ManagedEntry)
    Store->>Doc: managed_entry_to_document(..., native_storage=true)
    Doc->>Doc: write value under value.flattened
    Doc-->>Store: document(with value.flattened)
    Store->>ES: index / bulk index
    ES-->>Store: ack
    end

    rect rgb(255,247,238)
    Note over Store,Doc: Put — Non-native mode (native_storage=false)
    App->>Store: put(key, ManagedEntry)
    Store->>Doc: managed_entry_to_document(..., native_storage=false)
    Doc->>Doc: serialize value -> JSON string under value.string
    Doc-->>Store: document(with value.string)
    Store->>ES: index / bulk index
    ES-->>Store: ack
    end

    rect rgb(238,255,241)
    Note over Store,ES: Get — unified retrieval
    App->>Store: get(key)
    Store->>ES: fetch document
    ES-->>Store: source (may contain flattened or string)
    Store->>Store: source_to_managed_entry(source) — try flattened, else parse string
    alt deserialization error
        Store-->>App: raise DeserializationError / map to SerializationError
    else success
        Store-->>App: ManagedEntry
    end
    end
Loading

Possibly related PRs

Suggested labels

breaking-change

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 12.79% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The pull request title "feat: add native storage mode for Elasticsearch" is clear, concise, and accurately describes the primary change in the changeset. The title directly corresponds to the main feature being implemented—adding an optional native_storage mode to the ElasticsearchStore that allows dictionaries to be stored as flattened objects instead of JSON strings. The title is appropriately specific for quick scanning of commit history while avoiding unnecessary technical details.
Linked Issues Check ✅ Passed The pull request implementation fully addresses the requirements from issue #87. The changes add the native_storage parameter to ElasticsearchStore constructors defaulting to False [native_storage: bool = False], update the index mapping to include value field with string and flattened subfields supporting both storage modes, and implement dual-field read logic where source_to_managed_entry reads from value.flattened first then falls back to value.string for backward compatibility. Comprehensive tests validate both native and non-native storage modes, including migration scenarios between them, ensuring gradual migration support as specified in the issue.
Out of Scope Changes Check ✅ Passed The vast majority of changes directly support the native storage feature implementation. The Elasticsearch store and utilities modifications, comprehensive test coverage, and Docker testing infrastructure are all clearly in-scope. Supporting changes to managed_entry.py (adding value_as_dict property and sort_keys=True for deterministic serialization) serve as necessary infrastructure for proper value serialization in native mode. Changes to build scripts, pyproject.toml, and sync library code generation are justified supporting updates. Redis test snapshot updates reflect expected consequences of the serialization order change.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch claude/issue-87-20251027-0227

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py (2)

268-284: Harden mget: catch deserialization errors per document

A single bad doc raises and aborts the entire batch. Mirror the single-get behavior by guarding source_to_managed_entry.

Apply this diff:

-            if not (source := doc.get("_source")):
+            if not (source := doc.get("_source")):
                 entries_by_id[doc_id] = None
                 continue
-
-            entries_by_id[doc_id] = source_to_managed_entry(source=source)
+            try:
+                entries_by_id[doc_id] = source_to_managed_entry(source=source)
+            except DeserializationError:
+                entries_by_id[doc_id] = None

206-214: Upgrade mapping for existing indices when native mode is on

The review identifies a valid issue: if an index exists before native mode is enabled, _setup_collection returns early without applying the value_flattened mapping. First writes then create value_flattened dynamically as object, losing the benefits of the flattened type. However, the suggested diff uses an incorrect API signature—the Elasticsearch Python client's put_mapping method does not accept a body parameter. Use the properties parameter directly instead:

    async def _setup_collection(self, *, collection: str) -> None:
        index_name = self._sanitize_index_name(collection=collection)

-        if await self._client.options(ignore_status=404).indices.exists(index=index_name):
-            return
+        if await self._client.options(ignore_status=404).indices.exists(index=index_name):
+            if self._native_storage:
+                # Ensure `value_flattened` is mapped as `flattened` on existing indices
+                try:
+                    await self._client.options(ignore_status=404).indices.put_mapping(
+                        index=index_name,
+                        properties={"value_flattened": {"type": "flattened"}},
+                    )
+                except Exception:
+                    # Ignore if already mapped or not updatable
+                    pass
+            return
 
     _ = await self._client.options(ignore_status=404).indices.create(index=index_name, mappings=DEFAULT_MAPPING, settings={})

Consider checking the current field mapping before calling put_mapping to avoid noisy logs.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between db43abc and f8e7f01.

📒 Files selected for processing (2)
  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py (11 hunks)
  • key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Python code must adhere to Ruff formatting and linting with a maximum line length of 140 characters
Use Basedpyright in strict mode for type checking
Use Beartype for runtime type checking (can be disabled via PY_KEY_VALUE_DISABLE_BEARTYPE=true)
Code must be compatible with Python 3.10+ (codegen targets 3.10)

Files:

  • key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py
  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py
key-value/key-value-aio/**

📄 CodeRabbit inference engine (AGENTS.md)

Async-first: Make all code changes in the async package (key-value/key-value-aio) and regenerate sync

Files:

  • key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py
  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py
{key-value/key-value-aio/src/key_value/aio/stores/**,key-value/key-value-sync/src/key_value/sync/code_gen/stores/**}

📄 CodeRabbit inference engine (AGENTS.md)

Store implementations must store ManagedEntry wrappers, not raw values

Files:

  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py
key-value/key-value-aio/src/key_value/aio/stores/**

📄 CodeRabbit inference engine (AGENTS.md)

Async store implementations must conform to the AsyncKeyValue protocol from key_value/aio/protocols/key_value.py

Files:

  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py
🧬 Code graph analysis (2)
key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py (1)
key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py (5)
  • managed_entry_to_document (79-96)
  • source_to_managed_entry (99-122)
  • ElasticsearchStore (125-482)
  • _sanitize_index_name (215-221)
  • _sanitize_document_id (223-229)
key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py (4)
key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py (1)
  • managed_entry_to_document (111-123)
key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/store.py (2)
  • managed_entry_to_document (61-69)
  • source_to_managed_entry (72-80)
key-value/key-value-shared/src/key_value/shared/utils/managed_entry.py (3)
  • ManagedEntry (14-119)
  • to_json (65-75)
  • load_from_json (130-136)
key-value/key-value-shared/src/key_value/shared/utils/time_to_live.py (1)
  • try_parse_datetime_str (34-41)
🔇 Additional comments (10)
key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py (3)

72-95: Good native-storage round‑trip test

Covers document shape and deserialization via source_to_managed_entry; assertions look correct.


215-234: Migration test looks good

Validates native mode reads legacy JSON string docs; includes refresh. Solid coverage.


26-29: Tags 9.0.0 and 9.2.0 exist on the official Elasticsearch registry

The official Elasticsearch Docker images for versions 9.0.0 and 9.2.0 are available on docker.elastic.co. Elasticsearch publishes images on docker.elastic.co (the Docker Hub "official" elasticsearch is deprecated and no longer maintained by Elastic). The hardcoded versions in your test file will resolve correctly.

The suggestion to parameterize versions is a reasonable enhancement for flexibility, but it is not required to address an actual availability concern.

Likely an incorrect or invalid review comment.

key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py (7)

63-66: Mapping: value_flattened added

Appropriate use of flattened for native mode. Consider ensuring existing indices are upgraded too (see comment below).


79-90: Document serialization path LGTM

Correctly toggles between JSON string and flattened dict by native_storage.


139-157: Native-storage flag plumbing

Overloads, ctor arg, attribute, and docstring are consistent and backward compatible.

Also applies to: 166-178, 194-196


248-251: Graceful handling for single-get

Catching DeserializationError and returning None is appropriate.


301-304: Put paths correctly propagate native_storage

Both single and bulk use managed_entry_to_document(..., native_storage=self._native_storage). Good.

Also applies to: 335-338


100-113: Current load_from_json import is correct—no changes needed.

The imported load_from_json from key_value.shared.utils.managed_entry already enforces all required contracts: it returns dict[str, Any], validates dict structure via verify_dict, and raises DeserializationError on invalid input (JSON parse errors or non-dict types). The source_to_managed_entry code at lines 100-113 correctly handles both storage modes and properly validates types before use.


71-79: Sync codegen requires regeneration to include value_flattened support

Verification confirms the async version now conditionally writes to value_flattened (when native_storage=True) and reads from both value_flattened (with fallback to value), but the sync store still only reads the "value" field. The managed_entry_to_document function is also missing the native_storage parameter.

Run make codegen to regenerate the sync store and align it with the async changes.

@strawgate strawgate force-pushed the claude/issue-87-20251027-0227 branch from d9601fa to 302b39f Compare October 27, 2025 15:39
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/utils.py (1)

1-151: Critical: Do not edit generated sync files directly.

This file is auto-generated and should not be modified manually. All changes must be made in the async version (key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py) and then the sync library must be regenerated using make codegen.

As per coding guidelines.

key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/store.py (1)

1-413: Critical: Do not edit generated sync files directly.

This file is auto-generated and should not be modified manually. All changes must be made in the async version (key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py) and then the sync library must be regenerated using make codegen.

As per coding guidelines.

♻️ Duplicate comments (1)
key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py (1)

121-126: Fix index cleanup: iterate response body keys, not response object.

This is the same issue flagged in previous reviews. The response object from indices.get should not be iterated directly.

Apply this diff:

-    async def _cleanup(self):
-        elasticsearch_client = get_elasticsearch_client()
-        indices = await elasticsearch_client.options(ignore_status=404).indices.get(index="kv-store-e2e-test-*")
-        for index in indices:
-            _ = await elasticsearch_client.options(ignore_status=404).indices.delete(index=index)
+    async def _cleanup(self):
+        async with get_elasticsearch_client() as elasticsearch_client:
+            resp = await elasticsearch_client.options(ignore_status=404).indices.get(index="kv-store-e2e-test-*")
+            for index in list(resp.body or {}):
+                await elasticsearch_client.options(ignore_status=404).indices.delete(index=index)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f8e7f01 and 65b755d.

📒 Files selected for processing (9)
  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py (12 hunks)
  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py (2 hunks)
  • key-value/key-value-aio/tests/conftest.py (4 hunks)
  • key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py (3 hunks)
  • key-value/key-value-shared/src/key_value/shared/utils/managed_entry.py (1 hunks)
  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/store.py (12 hunks)
  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/utils.py (2 hunks)
  • key-value/key-value-sync/tests/code_gen/conftest.py (4 hunks)
  • key-value/key-value-sync/tests/code_gen/stores/elasticsearch/test_elasticsearch.py (3 hunks)
🧰 Additional context used
📓 Path-based instructions (6)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Python code must adhere to Ruff formatting and linting with a maximum line length of 140 characters
Use Basedpyright in strict mode for type checking
Use Beartype for runtime type checking (can be disabled via PY_KEY_VALUE_DISABLE_BEARTYPE=true)
Code must be compatible with Python 3.10+ (codegen targets 3.10)

Files:

  • key-value/key-value-shared/src/key_value/shared/utils/managed_entry.py
  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/store.py
  • key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py
  • key-value/key-value-sync/tests/code_gen/stores/elasticsearch/test_elasticsearch.py
  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py
  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py
  • key-value/key-value-sync/tests/code_gen/conftest.py
  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/utils.py
  • key-value/key-value-aio/tests/conftest.py
key-value/key-value-sync/**

📄 CodeRabbit inference engine (AGENTS.md)

Never edit files in key-value/key-value-sync; the sync package is generated and changes will be overwritten

Files:

  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/store.py
  • key-value/key-value-sync/tests/code_gen/stores/elasticsearch/test_elasticsearch.py
  • key-value/key-value-sync/tests/code_gen/conftest.py
  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/utils.py
{key-value/key-value-aio/src/key_value/aio/stores/**,key-value/key-value-sync/src/key_value/sync/code_gen/stores/**}

📄 CodeRabbit inference engine (AGENTS.md)

Store implementations must store ManagedEntry wrappers, not raw values

Files:

  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/store.py
  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py
  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py
  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/utils.py
key-value/key-value-sync/src/key_value/sync/code_gen/stores/**

📄 CodeRabbit inference engine (AGENTS.md)

Sync store implementations must conform to the KeyValue protocol (generated mirror) from key_value/sync/code_gen/protocols/key_value.py

Files:

  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/store.py
  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/utils.py
key-value/key-value-aio/**

📄 CodeRabbit inference engine (AGENTS.md)

Async-first: Make all code changes in the async package (key-value/key-value-aio) and regenerate sync

Files:

  • key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py
  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py
  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py
  • key-value/key-value-aio/tests/conftest.py
key-value/key-value-aio/src/key_value/aio/stores/**

📄 CodeRabbit inference engine (AGENTS.md)

Async store implementations must conform to the AsyncKeyValue protocol from key_value/aio/protocols/key_value.py

Files:

  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py
  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py
🧬 Code graph analysis (8)
key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/store.py (4)
key-value/key-value-shared/src/key_value/shared/errors/key_value.py (2)
  • SerializationError (10-11)
  • DeserializationError (14-15)
key-value/key-value-shared/src/key_value/shared/utils/managed_entry.py (5)
  • ManagedEntry (14-123)
  • load_from_json (134-140)
  • verify_dict (143-152)
  • value_as_dict (47-48)
  • value_as_json (42-44)
key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/utils.py (5)
  • LessCapableJsonSerializer (134-141)
  • NdjsonSerializer (144-151)
  • new_bulk_action (114-115)
  • install_serializer (125-126)
  • install_default_serializer (129-131)
key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py (2)
  • managed_entry_to_document (82-96)
  • source_to_managed_entry (99-125)
key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py (3)
key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py (4)
  • source_to_managed_entry (99-125)
  • managed_entry_to_document (82-96)
  • _sanitize_index_name (222-228)
  • _sanitize_document_id (230-236)
key-value/key-value-aio/src/key_value/aio/stores/base.py (3)
  • ttl (214-223)
  • get (175-201)
  • put (284-297)
key-value/key-value-shared/src/key_value/shared/utils/managed_entry.py (1)
  • ManagedEntry (14-123)
key-value/key-value-sync/tests/code_gen/stores/elasticsearch/test_elasticsearch.py (3)
key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/store.py (4)
  • source_to_managed_entry (79-101)
  • managed_entry_to_document (62-76)
  • _sanitize_index_name (187-193)
  • _sanitize_document_id (195-196)
key-value/key-value-sync/src/key_value/sync/code_gen/stores/base.py (3)
  • ttl (215-224)
  • get (181-202)
  • put (271-280)
key-value/key-value-shared/src/key_value/shared/utils/managed_entry.py (1)
  • ManagedEntry (14-123)
key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py (4)
key-value/key-value-shared/src/key_value/shared/errors/key_value.py (2)
  • SerializationError (10-11)
  • DeserializationError (14-15)
key-value/key-value-shared/src/key_value/shared/utils/managed_entry.py (4)
  • load_from_json (134-140)
  • verify_dict (143-152)
  • value_as_dict (47-48)
  • value_as_json (42-44)
key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py (5)
  • LessCapableJsonSerializer (137-146)
  • NdjsonSerializer (149-156)
  • new_bulk_action (112-113)
  • install_serializer (123-129)
  • install_default_serializer (132-134)
key-value/key-value-shared/src/key_value/shared/utils/time_to_live.py (1)
  • try_parse_datetime_str (34-41)
key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py (1)
key-value/key-value-shared/src/key_value/shared/errors/key_value.py (1)
  • SerializationError (10-11)
key-value/key-value-sync/tests/code_gen/conftest.py (1)
key-value/key-value-aio/tests/conftest.py (1)
  • get_docker_client (34-35)
key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/utils.py (2)
key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py (8)
  • NdjsonSerializer (149-156)
  • new_bulk_action (112-113)
  • InstallSerializerMixin (116-134)
  • install_serializer (123-129)
  • install_default_serializer (132-134)
  • LessCapableJsonSerializer (137-146)
  • default (143-146)
  • default (155-156)
key-value/key-value-shared/src/key_value/shared/errors/key_value.py (1)
  • SerializationError (10-11)
key-value/key-value-aio/tests/conftest.py (2)
key-value/key-value-shared/src/key_value/shared/code_gen/sleep.py (1)
  • sleep (13-17)
key-value/key-value-sync/tests/code_gen/conftest.py (1)
  • get_docker_client (37-38)
🪛 GitHub Actions: Run Tests
key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py

[error] 103-103: pyright: Type "Any | None" is not assignable to declared type "dict[str, Any]".

🪛 GitHub Check: static_analysis (key-value/key-value-aio)
key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py

[failure] 103-103:
Unnecessary isinstance call; "dict[str, Any]" is always an instance of "dict[Unknown, Unknown]" (reportUnnecessaryIsInstance)


[failure] 103-103:
Type "Any | None" is not assignable to declared type "dict[str, Any]"
  Type "Any | None" is not assignable to type "dict[str, Any]"
    "None" is not assignable to "dict[str, Any]" (reportAssignmentType)

key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py

[failure] 149-149:
Base class type is unknown, obscuring type of derived class (reportUntypedBaseClass)


[failure] 149-149:
Class cannot derive from itself (reportGeneralTypeIssues)


[failure] 134-134:
Cannot assign to attribute "default_serializer" for class "SerializerCollection"
  Expression of type "Self@InstallSerializerMixin" cannot be assigned to attribute "default_serializer" of class "SerializerCollection"
    Type "Self@InstallSerializerMixin" is not assignable to type "JsonSerializer | TextSerializer | NdjsonSerializer | Serializer"
      "InstallSerializerMixin*" is not assignable to "JsonSerializer"
      "InstallSerializerMixin*" is not assignable to "TextSerializer"
      "InstallSerializerMixin*" is not assignable to "NdjsonSerializer"
      "InstallSerializerMixin*" is not assignable to "Serializer" (reportAttributeAccessIssue)


[failure] 134-134:
"_transport" is protected and used outside of the class in which it is declared (reportPrivateUsage)


[failure] 125-125:
Argument of type "dict[str, Self@InstallSerializerMixin]" cannot be assigned to parameter "m" of type "Iterable[tuple[str, Serializer]]" in function "update"
  "str" is not assignable to "tuple[str, Serializer]"
  "str" is not assignable to "tuple[str, Serializer]" (reportArgumentType)


[failure] 124-124:
"_transport" is protected and used outside of the class in which it is declared (reportPrivateUsage)


[failure] 124-124:
No overloads for "update" match the provided arguments (reportCallIssue)

🔇 Additional comments (8)
key-value/key-value-shared/src/key_value/shared/utils/managed_entry.py (1)

42-49: LGTM: consistent dict view + JSON path

Using value_as_json -> value_as_dict removes duplication and keeps serialization consistent. Looks good.

key-value/key-value-sync/tests/code_gen/conftest.py (1)

64-75: Docker lifecycle helpers improve test stability

Refactors look good: central docker_get, short‑circuit on missing containers, and wait‑until‑gone before run should reduce flakiness.

Please confirm these changes eliminate occasional "Conflict. The container name is already in use" flakes in CI runs.

Also applies to: 90-105, 108-117, 119-134, 159-160

key-value/key-value-aio/tests/conftest.py (1)

62-75: Docker helpers look good

Refactors mirror the sync output and improve determinism. Nice.

Also applies to: 89-104, 106-115, 117-132, 157-160

key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py (1)

112-114: Bulk action header helper: LGTM

new_bulk_action returns the minimal, correct header structure for bulk APIs.

key-value/key-value-sync/tests/code_gen/stores/elasticsearch/test_elasticsearch.py (3)

154-185: Fix assertions for native_storage and avoid brittle timestamp snapshots

In native mode, documents should store under value.flattened, not value.string. The hard‑coded ISO timestamps will also flake. Assert only the stable parts.

Apply this diff (do not edit sync tests directly; update the aio test and regenerate sync):

-        response = es_client.get(index=index_name, id=doc_id)
-        assert response.body["_source"] == snapshot(
-            {
-                "collection": "test",
-                "key": "test_key",
-                "value": {"string": '{"name": "Alice", "age": 30}'},
-                "created_at": "2025-10-27T15:33:31.795253+00:00",
-            }
-        )
+        response = es_client.get(index=index_name, id=doc_id)
+        source = response.body["_source"]
+        assert source["collection"] == "test"
+        assert source["key"] == "test_key"
+        assert source["value"] == {"flattened": {"name": "Alice", "age": 30}}
+        assert "created_at" in source
@@
-        response = es_client.get(index=index_name, id=doc_id)
-        assert response.body["_source"] == snapshot(
-            {
-                "collection": "test",
-                "key": "test_key",
-                "value": {"string": '{"name": "Bob", "age": 25}'},
-                "created_at": "2025-10-27T15:33:32.040020+00:00",
-                "expires_at": "2025-10-27T15:33:42.040020+00:00",
-            }
-        )
+        response = es_client.get(index=index_name, id=doc_id)
+        source = response.body["_source"]
+        assert source["value"] == {"flattened": {"name": "Bob", "age": 25}}
+        assert "created_at" in source and "expires_at" in source

As per coding guidelines.

Likely an incorrect or invalid review comment.


186-199: Clarify or fix legacy compatibility test—file is auto-generated

This test file is auto-generated from key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py and will be overwritten on rebuild. Changes must be made to the source file.

The core issue is valid: the docstring claims "Verify native mode can read legacy JSON string data," but the test inserts a top-level string value and asserts None. The parser at source_to_managed_entry() requires a nested dict structure like {"value": {"string": "..."}}—it will raise DeserializationError for the inserted format, caught and converted to None by store.get().

Either:

  • Update the docstring to clarify this validates graceful error handling (not active legacy support)
  • Fix the test data to use the correct schema: {"value": {"string": '{"legacy": "data"}'}} and assert the parsed result
  • Implement legacy compatibility to accept top-level strings and update the implementation

Edit the source file key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py (not the sync version).

Likely an incorrect or invalid review comment.


30-31: Docker image tags verified—both Elasticsearch 9.0.0 and 9.2.0 are available on docker.elastic.co

The pinning to ES 9.x versions is safe; both referenced tags are officially published on the Elastic Docker registry and will pull successfully during testing.

key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py (1)

142-170: Verify that defaulting native_storage to True is intentional.

The native_storage parameter defaults to True, which changes the storage format for existing users who upgrade. This is a breaking change that could cause compatibility issues with existing indices created with the old string-based format.

Consider whether the default should be False for backward compatibility, or ensure this breaking change is clearly documented in migration notes.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (5)
key-value/key-value-sync/tests/code_gen/stores/elasticsearch/test_elasticsearch.py (2)

121-125: Fix in async version: iterate indices.body keys explicitly.

This file is auto-generated. The _cleanup method iterates the indices response object directly, which may be unreliable. Apply the fix in key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py at lines 120-124, then regenerate sync.


213-214: Fix in async version: correct the misleading docstring.

This file is auto-generated. The docstring incorrectly states "flattened objects" when non-native mode stores values as JSON strings. Apply the fix in key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py at line 220, then regenerate sync.

key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py (3)

120-124: Fix index cleanup: iterate response body keys explicitly.

The indices.get() call returns an ObjectApiResponse that may not iterate reliably. Access the response body and iterate its keys explicitly.

Apply this diff:

     async def _cleanup(self):
-        elasticsearch_client = get_elasticsearch_client()
-        indices = await elasticsearch_client.options(ignore_status=404).indices.get(index="kv-store-e2e-test-*")
-        for index in indices:
-            _ = await elasticsearch_client.options(ignore_status=404).indices.delete(index=index)
+        async with get_elasticsearch_client() as es_client:
+            resp = await es_client.options(ignore_status=404).indices.get(index="kv-store-e2e-test-*")
+            for index in list(resp.body or {}):
+                await es_client.options(ignore_status=404).indices.delete(index=index)

219-220: Fix misleading docstring.

The docstring incorrectly states "flattened objects" when this test verifies non-native mode, which stores values as JSON strings.

Apply this diff:

     async def test_value_stored_as_json_string(self, store: ElasticsearchStore, es_client: AsyncElasticsearch):
-        """Verify values are stored as flattened objects, not JSON strings"""
+        """Verify values are stored as JSON strings (non-native mode)"""

188-207: Fix legacy document structure to match schema.

Line 200 uses a bare JSON string for the value field, but the schema requires an object with a string or flattened subfield (see source_to_managed_entry at lines 98-124 of store.py). The current structure will fail deserialization.

Additionally, line 207 expects None, which suggests the test knows the read will fail. After fixing the document structure, verify whether the assertion should expect {"legacy": "data"} instead.

Apply this diff:

         await es_client.index(
             index=index_name,
             id=doc_id,
             body={
                 "collection": "test",
                 "key": "legacy_key",
-                "value": '{"legacy": "data"}',  # JSON string
+                "value": {"string": '{"legacy": "data"}'},  # Legacy JSON string format
             },
         )
         await es_client.indices.refresh(index=index_name)
 
         # Should be able to read stringified values too
         result = await store.get(collection="test", key="legacy_key")
-        assert result == snapshot(None)
+        assert result == snapshot({"legacy": "data"})
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 65b755d and d823f90.

📒 Files selected for processing (4)
  • key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py (5 hunks)
  • key-value/key-value-shared/src/key_value/shared/utils/managed_entry.py (2 hunks)
  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/base.py (0 hunks)
  • key-value/key-value-sync/tests/code_gen/stores/elasticsearch/test_elasticsearch.py (5 hunks)
💤 Files with no reviewable changes (1)
  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/base.py
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Python code must adhere to Ruff formatting and linting with a maximum line length of 140 characters
Use Basedpyright in strict mode for type checking
Use Beartype for runtime type checking (can be disabled via PY_KEY_VALUE_DISABLE_BEARTYPE=true)
Code must be compatible with Python 3.10+ (codegen targets 3.10)

Files:

  • key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py
  • key-value/key-value-shared/src/key_value/shared/utils/managed_entry.py
  • key-value/key-value-sync/tests/code_gen/stores/elasticsearch/test_elasticsearch.py
key-value/key-value-aio/**

📄 CodeRabbit inference engine (AGENTS.md)

Async-first: Make all code changes in the async package (key-value/key-value-aio) and regenerate sync

Files:

  • key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py
key-value/key-value-sync/**

📄 CodeRabbit inference engine (AGENTS.md)

Never edit files in key-value/key-value-sync; the sync package is generated and changes will be overwritten

Files:

  • key-value/key-value-sync/tests/code_gen/stores/elasticsearch/test_elasticsearch.py
🧬 Code graph analysis (2)
key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py (4)
key-value/key-value-sync/tests/code_gen/stores/elasticsearch/test_elasticsearch.py (4)
  • _cleanup (121-125)
  • get_elasticsearch_client (33-34)
  • store (153-155)
  • store (209-211)
key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py (2)
  • managed_entry_to_document (82-96)
  • source_to_managed_entry (99-125)
key-value/key-value-aio/tests/stores/base.py (2)
  • ContextManagerStoreTestMixin (266-278)
  • store (31-31)
key-value/key-value-aio/tests/conftest.py (1)
  • should_skip_docker_tests (219-220)
key-value/key-value-sync/tests/code_gen/stores/elasticsearch/test_elasticsearch.py (2)
key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py (7)
  • test_managed_entry_document_conversion_native_storage (72-94)
  • BaseTestElasticsearchStore (97-143)
  • _cleanup (120-124)
  • get_elasticsearch_client (32-33)
  • store (152-154)
  • store (215-217)
  • es_client (116-118)
key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/store.py (3)
  • managed_entry_to_document (62-76)
  • source_to_managed_entry (79-101)
  • ElasticsearchStore (104-413)
🔇 Additional comments (5)
key-value/key-value-shared/src/key_value/shared/utils/managed_entry.py (2)

42-48: Good refactoring to centralize dict conversion logic.

The new value_as_dict property effectively centralizes the conversion from Mapping[str, Any] to dict[str, Any], eliminating duplication and supporting the native storage approach. The refactored value_as_json correctly delegates to this property.

Minor consideration: The property creates a new dict on each access. If value_as_dict is accessed frequently in hot paths outside this file, consider using @functools.cached_property for efficiency. However, current usage within this file appears minimal.


128-128: Excellent addition of deterministic JSON serialization.

Adding sort_keys=True ensures consistent, predictable JSON output across serialization calls, which is crucial for testing, document comparison, and reliable Elasticsearch indexing in the native storage mode.

key-value/key-value-sync/tests/code_gen/stores/elasticsearch/test_elasticsearch.py (1)

8-8: LGTM: Import needed for new test assertions.

The IsStr import supports the new snapshot assertions that validate timestamp string formats.

key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py (2)

5-5: LGTM: Import needed for new test assertions.

The IsStr import supports the new snapshot assertions that validate timestamp string formats.


72-94: LGTM: Native storage document conversion test is well-structured.

The test correctly verifies that native_storage=True produces a flattened structure and validates round-trip conversion with proper TTL handling.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
key-value/key-value-sync/tests/code_gen/conftest.py (1)

150-160: Optionally check wait result before run

If the prior container lingers, fail fast or log at warning level to aid flakiness debugging.

-        docker_wait_container_gone(name=name, max_tries=10, wait_time=1.0)
+        if not docker_wait_container_gone(name=name, max_tries=10, wait_time=1.0):
+            logger.warning(f"Container {name} still present before run; proceeding may fail")
♻️ Duplicate comments (9)
key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py (1)

3-3: Fix serializer installation: typing, class shadowing, and dict update misuse

Basedpyright errors and shadowing stem from inheriting from identically named bases, using dict.update with wrong type, and missing Serializer cast. Apply the following changes.

-from elastic_transport import JsonSerializer, NdjsonSerializer, ObjectApiResponse, SerializationError
+from elastic_transport import (
+    JsonSerializer as BaseJsonSerializer,
+    NdjsonSerializer as BaseNdjsonSerializer,
+    ObjectApiResponse,
+    SerializationError,
+    Serializer,
+)
@@
 class InstallSerializerMixin:
@@
     @classmethod
     def install_serializer(cls, client: AsyncElasticsearch) -> None:
-        client.transport.serializers.serializers.update(
-            {
-                cls.mimetype: cls(),
-                cls.compatibility_mimetype: cls(),
-            }
-        )
+        transport = client.transport
+        sc = transport.serializers
+        serializer = cast(Serializer, cls())
+        sc.serializers[cls.mimetype] = serializer
+        sc.serializers[cls.compatibility_mimetype] = serializer
@@
     @classmethod
     def install_default_serializer(cls, client: AsyncElasticsearch) -> None:
         cls.install_serializer(client=client)
-        client.transport.serializers.default_serializer = cls()
+        transport = client.transport
+        transport.serializers.default_serializer = cast(Serializer, cls())
@@
-class LessCapableJsonSerializer(InstallSerializerMixin, JsonSerializer):
+class StrictJsonSerializer(InstallSerializerMixin, BaseJsonSerializer):
@@
-    def default(self, data: Any) -> Any:
+    def default(self, data: Any) -> Any:
         raise SerializationError(
             message=f"Unable to serialize to JSON: {data!r} (type: {type(data).__name__})",
         )
@@
-class NdjsonSerializer(InstallSerializerMixin, NdjsonSerializer):
+class StrictNdjsonSerializer(InstallSerializerMixin, BaseNdjsonSerializer):
@@
-    def default(self, data: Any) -> Any:
-        return LessCapableJsonSerializer.default(self, data)
+    def default(self, data: Any) -> Any:
+        return StrictJsonSerializer.default(self, data)

Note: If other modules reference the old class names, either update call sites or add back-compat aliases:
LessCapableJsonSerializer = StrictJsonSerializer and NdjsonSerializer = StrictNdjsonSerializer. As per coding guidelines.

Also applies to: 116-136, 137-147, 149-156

key-value/key-value-sync/tests/code_gen/stores/elasticsearch/test_elasticsearch.py (3)

121-126: Iterate indices.body keys, not the response object

The client returns an ObjectApiResponse; iterate over response.body keys explicitly.

Apply in aio test file and regenerate sync:

-        indices = elasticsearch_client.options(ignore_status=404).indices.get(index="kv-store-e2e-test-*")
-        for index in indices:
+        indices = elasticsearch_client.options(ignore_status=404).indices.get(index="kv-store-e2e-test-*")
+        for index in indices.body.keys():
             _ = elasticsearch_client.options(ignore_status=404).indices.delete(index=index)

189-201: Legacy document structure must use wrapper; fix assertion

Schema expects value to be under value.string; current test inserts a bare string and then expects None.

Change in aio test; regenerate sync:

-        es_client.index(index=index_name, id=doc_id, body={"collection": "test", "key": "legacy_key", "value": '{"legacy": "data"}'})
+        es_client.index(
+            index=index_name,
+            id=doc_id,
+            body={"collection": "test", "key": "legacy_key", "value": {"string": '{"legacy": "data"}'}},
+        )
         es_client.indices.refresh(index=index_name)
@@
-        result = store.get(collection="test", key="legacy_key")
-        assert result == snapshot(None)
+        result = store.get(collection="test", key="legacy_key")
+        assert result == snapshot({"legacy": "data"})

213-215: Fix misleading docstring

Non-native mode stores JSON strings; adjust the docstring.

-        """Verify values are stored as flattened objects, not JSON strings"""
+        """Verify values are stored as JSON strings (non-native mode)"""

Update in aio test; regenerate sync.

key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py (5)

120-124: Fix index cleanup: iterate response body keys and use context manager.

The cleanup iterates the response object directly which is unreliable. The Elasticsearch client is also not closed properly.

Apply this diff:

 async def _cleanup(self):
-    elasticsearch_client = get_elasticsearch_client()
-    indices = await elasticsearch_client.options(ignore_status=404).indices.get(index="kv-store-e2e-test-*")
-    for index in indices:
-        _ = await elasticsearch_client.options(ignore_status=404).indices.delete(index=index)
+    async with get_elasticsearch_client() as es_client:
+        resp = await es_client.options(ignore_status=404).indices.get(index="kv-store-e2e-test-*")
+        for index in list(resp.body or {}):
+            await es_client.options(ignore_status=404).indices.delete(index=index)

188-207: Legacy document structure doesn't match the schema.

The inserted document uses a bare JSON string for the value field, but the mapping requires an object with either a string or flattened subfield. The test comment suggests it should read legacy data, but the assertion expects None.

Apply this diff to fix the document structure:

         await es_client.index(
             index=index_name,
             id=doc_id,
             body={
                 "collection": "test",
                 "key": "legacy_key",
-                "value": '{"legacy": "data"}',  # JSON string
+                "value": {"string": '{"legacy": "data"}'},  # JSON string in correct field
             },
         )

Then update the assertion to verify the correct behavior:

-        result = await store.get(collection="test", key="legacy_key")
-        assert result == snapshot(None)
+        result = await store.get(collection="test", key="legacy_key")
+        assert result == snapshot({"legacy": "data"})

219-247: LGTM with minor suggestion for consistency.

The test correctly validates non-native storage format with JSON string values.

Optional: Use _get_destination helper for consistency.

Same suggestion as the native mode test:

-        index_name = store._sanitize_index_name(collection="test")  # pyright: ignore[reportPrivateUsage]
-        doc_id = store._sanitize_document_id(key="test_key")  # pyright: ignore[reportPrivateUsage]
+        index_name, doc_id = store._get_destination(collection="test", key="test_key")  # pyright: ignore[reportPrivateUsage]

249-267: LGTM with minor suggestion for consistency.

The test correctly validates that non-native mode can read native storage format, supporting migration scenarios.

Optional: Use _get_destination helper for consistency.

Same suggestion as previous tests:

-        index_name = store._sanitize_index_name(collection="test")  # pyright: ignore[reportPrivateUsage]
-        doc_id = store._sanitize_document_id(key="legacy_key")  # pyright: ignore[reportPrivateUsage]
+        index_name, doc_id = store._get_destination(collection="test", key="legacy_key")  # pyright: ignore[reportPrivateUsage]

156-186: LGTM with minor suggestion for consistency.

The test correctly validates native storage format with flattened objects.

Optional: Use _get_destination helper for consistency.

Consider using the _get_destination helper instead of calling sanitizers separately:

-        index_name = store._sanitize_index_name(collection="test")  # pyright: ignore[reportPrivateUsage]
-        doc_id = store._sanitize_document_id(key="test_key")  # pyright: ignore[reportPrivateUsage]
+        index_name, doc_id = store._get_destination(collection="test", key="test_key")  # pyright: ignore[reportPrivateUsage]

This aligns better with the store's internal logic and reduces duplication.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d823f90 and d6849e3.

📒 Files selected for processing (6)
  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py (2 hunks)
  • key-value/key-value-aio/tests/conftest.py (4 hunks)
  • key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py (5 hunks)
  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/utils.py (2 hunks)
  • key-value/key-value-sync/tests/code_gen/conftest.py (4 hunks)
  • key-value/key-value-sync/tests/code_gen/stores/elasticsearch/test_elasticsearch.py (5 hunks)
🧰 Additional context used
📓 Path-based instructions (6)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Python code must adhere to Ruff formatting and linting with a maximum line length of 140 characters
Use Basedpyright in strict mode for type checking
Use Beartype for runtime type checking (can be disabled via PY_KEY_VALUE_DISABLE_BEARTYPE=true)
Code must be compatible with Python 3.10+ (codegen targets 3.10)

Files:

  • key-value/key-value-sync/tests/code_gen/conftest.py
  • key-value/key-value-sync/tests/code_gen/stores/elasticsearch/test_elasticsearch.py
  • key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py
  • key-value/key-value-aio/tests/conftest.py
  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/utils.py
  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py
key-value/key-value-sync/**

📄 CodeRabbit inference engine (AGENTS.md)

Never edit files in key-value/key-value-sync; the sync package is generated and changes will be overwritten

Files:

  • key-value/key-value-sync/tests/code_gen/conftest.py
  • key-value/key-value-sync/tests/code_gen/stores/elasticsearch/test_elasticsearch.py
  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/utils.py
key-value/key-value-aio/**

📄 CodeRabbit inference engine (AGENTS.md)

Async-first: Make all code changes in the async package (key-value/key-value-aio) and regenerate sync

Files:

  • key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py
  • key-value/key-value-aio/tests/conftest.py
  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py
{key-value/key-value-aio/src/key_value/aio/stores/**,key-value/key-value-sync/src/key_value/sync/code_gen/stores/**}

📄 CodeRabbit inference engine (AGENTS.md)

Store implementations must store ManagedEntry wrappers, not raw values

Files:

  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/utils.py
  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py
key-value/key-value-sync/src/key_value/sync/code_gen/stores/**

📄 CodeRabbit inference engine (AGENTS.md)

Sync store implementations must conform to the KeyValue protocol (generated mirror) from key_value/sync/code_gen/protocols/key_value.py

Files:

  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/utils.py
key-value/key-value-aio/src/key_value/aio/stores/**

📄 CodeRabbit inference engine (AGENTS.md)

Async store implementations must conform to the AsyncKeyValue protocol from key_value/aio/protocols/key_value.py

Files:

  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py
🧬 Code graph analysis (6)
key-value/key-value-sync/tests/code_gen/conftest.py (2)
key-value/key-value-shared/src/key_value/shared/code_gen/sleep.py (1)
  • sleep (13-17)
key-value/key-value-aio/tests/conftest.py (4)
  • docker_get (62-72)
  • get_docker_client (34-35)
  • docker_wait_container_gone (106-114)
  • docker_rm (117-131)
key-value/key-value-sync/tests/code_gen/stores/elasticsearch/test_elasticsearch.py (1)
key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/store.py (2)
  • managed_entry_to_document (62-76)
  • source_to_managed_entry (79-101)
key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py (3)
key-value/key-value-sync/tests/code_gen/stores/elasticsearch/test_elasticsearch.py (3)
  • get_elasticsearch_client (33-34)
  • store (153-155)
  • store (209-211)
key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py (2)
  • managed_entry_to_document (82-96)
  • source_to_managed_entry (99-125)
key-value/key-value-aio/tests/stores/base.py (2)
  • ContextManagerStoreTestMixin (266-278)
  • store (31-31)
key-value/key-value-aio/tests/conftest.py (2)
key-value/key-value-shared/src/key_value/shared/code_gen/sleep.py (1)
  • sleep (13-17)
key-value/key-value-sync/tests/code_gen/conftest.py (4)
  • docker_get (64-74)
  • get_docker_client (37-38)
  • docker_wait_container_gone (108-116)
  • docker_rm (119-133)
key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/utils.py (2)
key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py (8)
  • NdjsonSerializer (149-156)
  • new_bulk_action (112-113)
  • InstallSerializerMixin (116-134)
  • install_serializer (123-129)
  • install_default_serializer (132-134)
  • LessCapableJsonSerializer (137-146)
  • default (143-146)
  • default (155-156)
key-value/key-value-shared/src/key_value/shared/errors/key_value.py (1)
  • SerializationError (10-11)
key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py (2)
key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/utils.py (8)
  • NdjsonSerializer (144-151)
  • new_bulk_action (114-115)
  • InstallSerializerMixin (118-131)
  • install_serializer (125-126)
  • install_default_serializer (129-131)
  • LessCapableJsonSerializer (134-141)
  • default (140-141)
  • default (150-151)
key-value/key-value-shared/src/key_value/shared/errors/key_value.py (1)
  • SerializationError (10-11)
🪛 GitHub Check: static_analysis (key-value/key-value-aio)
key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py

[failure] 149-149:
Base class type is unknown, obscuring type of derived class (reportUntypedBaseClass)


[failure] 149-149:
Class cannot derive from itself (reportGeneralTypeIssues)


[failure] 134-134:
Cannot assign to attribute "default_serializer" for class "SerializerCollection"
  Expression of type "Self@InstallSerializerMixin" cannot be assigned to attribute "default_serializer" of class "SerializerCollection"
    Type "Self@InstallSerializerMixin" is not assignable to type "JsonSerializer | TextSerializer | NdjsonSerializer | Serializer"
      "InstallSerializerMixin*" is not assignable to "JsonSerializer"
      "InstallSerializerMixin*" is not assignable to "TextSerializer"
      "InstallSerializerMixin*" is not assignable to "NdjsonSerializer"
      "InstallSerializerMixin*" is not assignable to "Serializer" (reportAttributeAccessIssue)


[failure] 125-125:
Argument of type "dict[str, Self@InstallSerializerMixin]" cannot be assigned to parameter "m" of type "Iterable[tuple[str, Serializer]]" in function "update"
  "str" is not assignable to "tuple[str, Serializer]"
  "str" is not assignable to "tuple[str, Serializer]" (reportArgumentType)


[failure] 124-124:
No overloads for "update" match the provided arguments (reportCallIssue)

🔇 Additional comments (7)
key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py (1)

112-113: Bulk action helper looks good

Header construction is correct and typed. No changes needed.

key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/utils.py (1)

114-115: Bulk action helper mirrors aio; looks good

Consistent with aio helper. No changes needed here.

key-value/key-value-sync/tests/code_gen/conftest.py (1)

14-16: LGTM: container helpers and sleep shim

Good use of Container typing and shared sleep shim; docker_get and docker_wait_container_gone implementations are clear.

Also applies to: 64-75, 108-117

key-value/key-value-aio/tests/conftest.py (1)

11-13: LGTM: Docker helpers and shared sleep shim

Consistent with sync tests; utilities improve stability and codegen friendliness.

Also applies to: 62-73, 106-115, 149-158

key-value/key-value-sync/tests/code_gen/stores/elasticsearch/test_elasticsearch.py (1)

141-145: Access indices via .body for assertions

The current code inconsistently accesses the elasticsearch response object: line 141 correctly uses indices.body, but lines 142-143 use in indices directly. While elasticsearch-py's ObjectApiResponse supports membership checking through __contains__, explicitly accessing .body maintains consistency and clarity about accessing the actual response dict.

Fix the source file key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py at lines 142-143 to use in indices.body, then regenerate the sync package.

-        assert "kv-store-e2e-test-test_collection" in indices
-        assert "kv-store-e2e-test-test_collection_2" in indices
+        assert "kv-store-e2e-test-test_collection" in indices.body
+        assert "kv-store-e2e-test-test_collection_2" in indices.body
key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py (2)

5-5: LGTM!

The import addition and test update correctly validate non-native storage mode behavior with proper round-trip verification.

Also applies to: 47-69


72-94: LGTM!

The native storage test correctly validates the flattened field format and ensures round-trip consistency.

@claude claude bot mentioned this pull request Oct 28, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py (2)

141-172: Restore default native_storage=False to match PR/backward compatibility

Public API currently defaults to True in overloads and init while the PR says default off.

Apply:

-    def __init__(..., native_storage: bool = True, ...): ...
+    def __init__(..., native_storage: bool = False, ...): ...
@@
-    def __init__(..., native_storage: bool = True, ...): ...
+    def __init__(..., native_storage: bool = False, ...): ...
@@
-        native_storage: bool = True,
+        native_storage: bool = False,
@@
-            native_storage: Whether to use native storage mode (flattened field type) or serialize
-                            all values to JSON strings. Defaults to True.
+            native_storage: Whether to use native storage mode (flattened field type) or serialize
+                            all values to JSON strings. Defaults to False.

Ensure tests don’t rely on the old default.

Also applies to: 176-183


277-293: Make batch get resilient to per‑doc deserialization errors

One bad document currently raises and aborts the whole batch.

Apply:

-            entries_by_id[doc_id] = source_to_managed_entry(source=source)
+            try:
+                entries_by_id[doc_id] = source_to_managed_entry(source=source)
+            except DeserializationError:
+                entries_by_id[doc_id] = None

Consistent with _get_managed_entry.

♻️ Duplicate comments (5)
key-value/key-value-sync/tests/code_gen/stores/elasticsearch/test_elasticsearch.py (2)

121-125: Use indices.body keys (ObjectApiResponse is not directly iterable)

Iterate and test membership against indices.body, not the response object.

Apply this diff:

 def _cleanup(self):
     elasticsearch_client = get_elasticsearch_client()
-    indices = elasticsearch_client.options(ignore_status=404).indices.get(index="kv-store-e2e-test-*")
-    for index in indices:
+    indices = elasticsearch_client.options(ignore_status=404).indices.get(index="kv-store-e2e-test-*")
+    for index in indices.body.keys():
         _ = elasticsearch_client.options(ignore_status=404).indices.delete(index=index)
@@
-    indices = es_client.options(ignore_status=404).indices.get(index="kv-store-e2e-test-*")
-    assert len(indices.body) == 2
-    assert "kv-store-e2e-test-test_collection" in indices
-    assert "kv-store-e2e-test-test_collection_2" in indices
+    indices = es_client.options(ignore_status=404).indices.get(index="kv-store-e2e-test-*")
+    assert len(indices.body) == 2
+    assert "kv-store-e2e-test-test_collection" in indices.body
+    assert "kv-store-e2e-test-test_collection_2" in indices.body

Note: Make this change in the async source test and regenerate sync.

Also applies to: 141-145


212-214: Fix misleading docstring (this is non‑native mode)

Docstring should reflect JSON‑string storage.

Apply:

-        """Verify values are stored as flattened objects, not JSON strings"""
+        """Verify values are stored as JSON strings (non-native mode)"""

Update in aio test and regenerate sync.

key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py (2)

135-147: Harden serializer installation with explicit assignments and typing cast

Cast to Serializer for Basedpyright strict, and assign dict entries directly.

Apply:

-from elastic_transport import (
-    JsonSerializer,
-    NdjsonSerializer,
-    ObjectApiResponse,
-    SerializationError,
-)
+from elastic_transport import (
+    JsonSerializer,
+    NdjsonSerializer,
+    ObjectApiResponse,
+    SerializationError,
+    Serializer,
+)
@@
     def install_default_serializer(cls, client: AsyncElasticsearch) -> None:
         cls.install_serializer(client=client)
-        client.transport.serializers.default_serializer = cls()
+        client.transport.serializers.default_serializer = cast(Serializer, cls())
@@
     def install_serializer(cls, client: AsyncElasticsearch) -> None:
-        client.transport.serializers.serializers.update(
-            {
-                cls.mimetype: cls(),
-                cls.compatibility_mimetype: cls(),
-            }
-        )
+        sc = client.transport.serializers.serializers
+        serializer = cast(Serializer, cls())
+        sc[cls.mimetype] = serializer
+        sc[cls.compatibility_mimetype] = serializer

As per coding guidelines.


156-158: Fix missing self in Ndjson default delegation (runtime TypeError)

Calling LessCapableJsonSerializer.default without self will raise TypeError.

Apply:

-    def default(self, data: Any) -> Any:
-        return LessCapableJsonSerializer.default(data=data)  # pyright: ignore[reportCallIssue, reportUnknownVariableType]
+    def default(self, data: Any) -> Any:
+        # Reuse the same strict behavior
+        return LessCapableJsonSerializer.default(self, data)  # type: ignore[arg-type]

Alternative: duplicate the raise here instead of delegating.

key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py (1)

156-187: Optional: use _get_destination helper for consistency.

Instead of calling both sanitizers separately, consider using _get_destination to derive (index, id) in one call. This keeps the test aligned with the store's canonical path resolution logic.

Apply this diff:

-        index_name = store._sanitize_index_name(collection="test")  # pyright: ignore[reportPrivateUsage]
-        doc_id = store._sanitize_document_id(key="test_key")  # pyright: ignore[reportPrivateUsage]
+        index_name, doc_id = store._get_destination(collection="test", key="test_key")  # pyright: ignore[reportPrivateUsage]

Repeat for the second test block at lines 176-177.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d6849e3 and fd99815.

📒 Files selected for processing (7)
  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py (12 hunks)
  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py (3 hunks)
  • key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py (5 hunks)
  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/store.py (12 hunks)
  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/utils.py (3 hunks)
  • key-value/key-value-sync/tests/code_gen/stores/elasticsearch/test_elasticsearch.py (5 hunks)
  • pyproject.toml (1 hunks)
🧰 Additional context used
📓 Path-based instructions (6)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Python code must adhere to Ruff formatting and linting with a maximum line length of 140 characters
Use Basedpyright in strict mode for type checking
Use Beartype for runtime type checking (can be disabled via PY_KEY_VALUE_DISABLE_BEARTYPE=true)
Code must be compatible with Python 3.10+ (codegen targets 3.10)

Files:

  • key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py
  • key-value/key-value-sync/tests/code_gen/stores/elasticsearch/test_elasticsearch.py
  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/store.py
  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py
  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/utils.py
  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py
key-value/key-value-aio/**

📄 CodeRabbit inference engine (AGENTS.md)

Async-first: Make all code changes in the async package (key-value/key-value-aio) and regenerate sync

Files:

  • key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py
  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py
  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py
key-value/key-value-sync/**

📄 CodeRabbit inference engine (AGENTS.md)

Never edit files in key-value/key-value-sync; the sync package is generated and changes will be overwritten

Files:

  • key-value/key-value-sync/tests/code_gen/stores/elasticsearch/test_elasticsearch.py
  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/store.py
  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/utils.py
{key-value/key-value-aio/src/key_value/aio/stores/**,key-value/key-value-sync/src/key_value/sync/code_gen/stores/**}

📄 CodeRabbit inference engine (AGENTS.md)

Store implementations must store ManagedEntry wrappers, not raw values

Files:

  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/store.py
  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py
  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/utils.py
  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py
key-value/key-value-sync/src/key_value/sync/code_gen/stores/**

📄 CodeRabbit inference engine (AGENTS.md)

Sync store implementations must conform to the KeyValue protocol (generated mirror) from key_value/sync/code_gen/protocols/key_value.py

Files:

  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/store.py
  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/utils.py
key-value/key-value-aio/src/key_value/aio/stores/**

📄 CodeRabbit inference engine (AGENTS.md)

Async store implementations must conform to the AsyncKeyValue protocol from key_value/aio/protocols/key_value.py

Files:

  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py
  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py
🧬 Code graph analysis (6)
key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py (3)
key-value/key-value-sync/tests/code_gen/stores/elasticsearch/test_elasticsearch.py (3)
  • get_elasticsearch_client (33-34)
  • store (153-155)
  • store (208-210)
key-value/key-value-shared/src/key_value/shared/utils/managed_entry.py (1)
  • ManagedEntry (14-123)
key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py (2)
  • managed_entry_to_document (82-96)
  • source_to_managed_entry (99-127)
key-value/key-value-sync/tests/code_gen/stores/elasticsearch/test_elasticsearch.py (2)
key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/store.py (2)
  • managed_entry_to_document (62-76)
  • source_to_managed_entry (79-103)
key-value/key-value-sync/tests/code_gen/stores/base.py (2)
  • ContextManagerStoreTestMixin (263-275)
  • store (28-28)
key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/store.py (4)
key-value/key-value-shared/src/key_value/shared/errors/key_value.py (2)
  • SerializationError (10-11)
  • DeserializationError (14-15)
key-value/key-value-shared/src/key_value/shared/utils/managed_entry.py (5)
  • ManagedEntry (14-123)
  • load_from_json (134-140)
  • verify_dict (143-152)
  • value_as_dict (47-48)
  • value_as_json (42-44)
key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/utils.py (6)
  • LessCapableJsonSerializer (119-135)
  • LessCapableNdjsonSerializer (138-149)
  • new_bulk_action (115-116)
  • install_serializer (134-135)
  • install_serializer (148-149)
  • install_default_serializer (129-131)
key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py (2)
  • managed_entry_to_document (82-96)
  • source_to_managed_entry (99-127)
key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py (4)
key-value/key-value-shared/src/key_value/shared/errors/key_value.py (2)
  • SerializationError (10-11)
  • DeserializationError (14-15)
key-value/key-value-shared/src/key_value/shared/utils/managed_entry.py (5)
  • ManagedEntry (14-123)
  • load_from_json (134-140)
  • verify_dict (143-152)
  • value_as_dict (47-48)
  • value_as_json (42-44)
key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py (5)
  • LessCapableJsonSerializer (124-147)
  • LessCapableNdjsonSerializer (150-166)
  • install_serializer (141-147)
  • install_serializer (160-166)
  • install_default_serializer (136-138)
key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/store.py (2)
  • managed_entry_to_document (62-76)
  • source_to_managed_entry (79-103)
key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/utils.py (1)
key-value/key-value-shared/src/key_value/shared/errors/key_value.py (1)
  • SerializationError (10-11)
key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py (2)
key-value/key-value-shared/src/key_value/shared/errors/key_value.py (1)
  • SerializationError (10-11)
key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/utils.py (8)
  • new_bulk_action (115-116)
  • LessCapableJsonSerializer (119-135)
  • default (125-126)
  • default (144-145)
  • install_default_serializer (129-131)
  • install_serializer (134-135)
  • install_serializer (148-149)
  • LessCapableNdjsonSerializer (138-149)
🔇 Additional comments (8)
pyproject.toml (1)

102-104: Pyright include LGTM

Including the whole key-value workspace is good for strict typing across aio/sync/shared. No changes needed.

key-value/key-value-sync/tests/code_gen/stores/elasticsearch/test_elasticsearch.py (1)

73-96: Tests for native/non‑native conversion and migration look good

Good coverage of round‑trip, TTL fields, and cross‑mode reads.

Also applies to: 157-187, 189-201, 242-256

key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py (1)

120-121: Bulk action helper LGTM

Header shape is correct for bulk ops.

key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py (1)

314-326: Good: wrap transport SerializationError into public SerializationError

This maps transport errors to the store’s error surface.

key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py (3)

72-95: LGTM! Good test coverage for native storage mode.

The test properly validates that managed_entry_to_document with native_storage=True produces the expected flattened structure and that source_to_managed_entry correctly deserializes it.


188-208: LGTM! Migration test correctly validates backward compatibility.

The test properly verifies that a store in native mode can read legacy documents using the string-based value format, ensuring smooth migration paths.


249-267: LGTM! Forward migration test validates bidirectional compatibility.

The test correctly verifies that a store in non-native mode can read documents stored using the native (flattened) format, confirming that migration works in both directions.

key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/store.py (1)

1-3: Auto-generated file—review only.

This is an auto-generated sync version (lines 1-3). Per the coding guidelines, changes should be made in the async source file (key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py) and regenerated using make codegen.

The generated code appears consistent with the async implementation and properly implements native storage support.

Based on coding guidelines.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

♻️ Duplicate comments (2)
key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/store.py (1)

50-51: Fix mapping: value.string must be a string field, not object. Apply in aio store and regenerate.

Elasticsearch will error when a concrete string is sent to a field mapped as type=object. Use a non-indexed keyword with doc_values disabled.

Apply in key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py:

-        "value": {
-            "properties": {"string": {"type": "object", "enabled": False}, "flattened": {"type": "flattened"}},
-        }
+        "value": {
+            "properties": {
+                "string": {"type": "keyword", "index": False, "doc_values": False},
+                "flattened": {"type": "flattened"},
+            },
+        }

Regenerate sync with make codegen.

key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py (1)

3-10: Import Serializer and use explicit casting in installers.

Strict typing benefits from importing Serializer and casting instances assigned into transport serializers.

 from elastic_transport import (
-    JsonSerializer,
-    NdjsonSerializer,
-    ObjectApiResponse,
-    SerializationError,
+    JsonSerializer,
+    NdjsonSerializer,
+    ObjectApiResponse,
+    SerializationError,
+    Serializer,
 )
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fd99815 and ed1b7cb.

📒 Files selected for processing (4)
  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py (12 hunks)
  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py (3 hunks)
  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/store.py (12 hunks)
  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/utils.py (3 hunks)
🧰 Additional context used
📓 Path-based instructions (6)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Python code must adhere to Ruff formatting and linting with a maximum line length of 140 characters
Use Basedpyright in strict mode for type checking
Use Beartype for runtime type checking (can be disabled via PY_KEY_VALUE_DISABLE_BEARTYPE=true)
Code must be compatible with Python 3.10+ (codegen targets 3.10)

Files:

  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py
  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/utils.py
  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/store.py
  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py
key-value/key-value-aio/**

📄 CodeRabbit inference engine (AGENTS.md)

Async-first: Make all code changes in the async package (key-value/key-value-aio) and regenerate sync

Files:

  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py
  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py
{key-value/key-value-aio/src/key_value/aio/stores/**,key-value/key-value-sync/src/key_value/sync/code_gen/stores/**}

📄 CodeRabbit inference engine (AGENTS.md)

Store implementations must store ManagedEntry wrappers, not raw values

Files:

  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py
  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/utils.py
  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/store.py
  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py
key-value/key-value-aio/src/key_value/aio/stores/**

📄 CodeRabbit inference engine (AGENTS.md)

Async store implementations must conform to the AsyncKeyValue protocol from key_value/aio/protocols/key_value.py

Files:

  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py
  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py
key-value/key-value-sync/**

📄 CodeRabbit inference engine (AGENTS.md)

Never edit files in key-value/key-value-sync; the sync package is generated and changes will be overwritten

Files:

  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/utils.py
  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/store.py
key-value/key-value-sync/src/key_value/sync/code_gen/stores/**

📄 CodeRabbit inference engine (AGENTS.md)

Sync store implementations must conform to the KeyValue protocol (generated mirror) from key_value/sync/code_gen/protocols/key_value.py

Files:

  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/utils.py
  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/store.py
🧬 Code graph analysis (4)
key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py (1)
key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/utils.py (8)
  • new_bulk_action (115-116)
  • LessCapableJsonSerializer (119-135)
  • default (125-126)
  • default (144-145)
  • install_default_serializer (129-131)
  • install_serializer (134-135)
  • install_serializer (148-149)
  • LessCapableNdjsonSerializer (138-149)
key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/utils.py (2)
key-value/key-value-shared/src/key_value/shared/errors/key_value.py (1)
  • SerializationError (10-11)
key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py (8)
  • new_bulk_action (120-121)
  • LessCapableJsonSerializer (124-147)
  • default (130-133)
  • default (156-157)
  • install_default_serializer (136-138)
  • install_serializer (141-147)
  • install_serializer (160-166)
  • LessCapableNdjsonSerializer (150-166)
key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/store.py (4)
key-value/key-value-shared/src/key_value/shared/errors/key_value.py (2)
  • SerializationError (10-11)
  • DeserializationError (14-15)
key-value/key-value-shared/src/key_value/shared/utils/managed_entry.py (5)
  • ManagedEntry (14-123)
  • load_from_json (134-140)
  • verify_dict (143-152)
  • value_as_dict (47-48)
  • value_as_json (42-44)
key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py (5)
  • LessCapableJsonSerializer (124-147)
  • LessCapableNdjsonSerializer (150-166)
  • install_serializer (141-147)
  • install_serializer (160-166)
  • install_default_serializer (136-138)
key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py (2)
  • managed_entry_to_document (84-98)
  • source_to_managed_entry (101-129)
key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py (5)
key-value/key-value-shared/src/key_value/shared/errors/key_value.py (2)
  • SerializationError (10-11)
  • DeserializationError (14-15)
key-value/key-value-shared/src/key_value/shared/utils/managed_entry.py (5)
  • ManagedEntry (14-123)
  • load_from_json (134-140)
  • verify_dict (143-152)
  • value_as_dict (47-48)
  • value_as_json (42-44)
key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py (6)
  • LessCapableJsonSerializer (124-147)
  • LessCapableNdjsonSerializer (150-166)
  • new_bulk_action (120-121)
  • install_serializer (141-147)
  • install_serializer (160-166)
  • install_default_serializer (136-138)
key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/store.py (2)
  • managed_entry_to_document (64-78)
  • source_to_managed_entry (81-105)
key-value/key-value-shared/src/key_value/shared/utils/time_to_live.py (1)
  • try_parse_datetime_str (34-41)
🔇 Additional comments (10)
key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/utils.py (3)

115-116: LGTM: bulk action builder is correct and minimal.


134-135: Serializer install: assign explicitly and reuse a single instance (fix in aio source, then regenerate).

Avoid dict.update and create one instance to register under both mimetypes; also cast to Serializer to satisfy strict typing. Apply in aio utils and run make codegen.
[ suggest_essential_refactor ][ request_verification ][ duplicate_comment ]
Example change to apply in key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py:

-from elastic_transport import (
-    JsonSerializer,
-    NdjsonSerializer,
-    ObjectApiResponse,
-    SerializationError,
-)
+from elastic_transport import (
+    JsonSerializer,
+    NdjsonSerializer,
+    ObjectApiResponse,
+    SerializationError,
+    Serializer,
+)
@@
-    def install_serializer(cls, client: AsyncElasticsearch) -> None:
-        client.transport.serializers.serializers.update(
-            {
-                cls.mimetype: cls(),
-                cls.compatibility_mimetype: cls(),
-            }
-        )
+    def install_serializer(cls, client: AsyncElasticsearch) -> None:
+        sc = client.transport.serializers
+        serializer = cast(Serializer, cls())
+        sc.serializers[cls.mimetype] = serializer
+        sc.serializers[cls.compatibility_mimetype] = serializer
@@
-    def install_default_serializer(cls, client: AsyncElasticsearch) -> None:
-        cls.install_serializer(client=client)
-        client.transport.serializers.default_serializer = cls()
+    def install_default_serializer(cls, client: AsyncElasticsearch) -> None:
+        cls.install_serializer(client=client)
+        client.transport.serializers.default_serializer = cast(Serializer, cls())

Run codegen to propagate. As per coding guidelines.


144-145: Remove pyright ignores by inlining the error raise (fix in aio source, then regenerate).

Calling LessCapableJsonSerializer.default with self from a different class forces ignores. Just raise the same SerializationError directly.
[ suggest_essential_refactor ][ request_verification ]
Apply in key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py:

 class LessCapableNdjsonSerializer(NdjsonSerializer):
@@
-    def default(self, data: Any) -> Any:
-        return LessCapableJsonSerializer.default(self=self, data=data)  # pyright: ignore[reportCallIssue, reportUnknownVariableType, reportArgumentType]
+    def default(self, data: Any) -> Any:
+        raise SerializationError(
+            message=f"Unable to serialize to JSON: {data!r} (type: {type(data).__name__})",
+        )

Then make codegen.

key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/store.py (1)

63-79: Deserialization handling: good migration behavior.

Reading value.flattened first, then falling back to value.string is correct and migration-friendly.

Please ensure tests cover mixed-index scenarios (some docs with string, some with flattened) for mget paths.

Also applies to: 219-223

key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py (6)

84-99: Document build/read paths look correct for migration.

Conditional write to value.flattened/value.string and symmetric read with verify_dict/load_from_json is sound.

Please keep tests for mixed-mode indices and invalid types (e.g., value.string non-string).

Also applies to: 101-129


201-204: Serializer installation order is fine.

Installing JSON, setting default, then NDJSON is reasonable; with the typing fixes it should satisfy strict mode.


248-263: Good: _get_managed_entry/_get_managed_entries wrap deserialization errors.

This keeps store semantics stable across migrations.

Also applies to: 265-296


410-422: Use of fields in search: verify field retrieval works against your ES version.

fields=[{"key": None}] may vary by client version; if issues arise, switch to fields=["key"].

Also applies to: 443-454


484-496: Cull query is correct and bounded.

Using a millisecond epoch range filter is appropriate.


58-69: Fix mapping: value.string must use keyword type, not object—this is a breaking change.

The current mapping is semantically incorrect: storing JSON strings in an object field with enabled: false works in practice but violates mapping semantics. The suggested fix (keyword with index: false, doc_values: false) is semantically correct and functionally equivalent in behavior.

However, this is a breaking change: existing indices cannot be updated in-place and require reindexing. Ensure:

  • A migration path exists for production data
  • The suggested mapping has been tested end-to-end
  • Backward compatibility strategy is documented if needed

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ed1b7cb and 5b07963.

📒 Files selected for processing (2)
  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/base.py (0 hunks)
  • scripts/build_sync_library.py (1 hunks)
💤 Files with no reviewable changes (1)
  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/base.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Python code must adhere to Ruff formatting and linting with a maximum line length of 140 characters
Use Basedpyright in strict mode for type checking
Use Beartype for runtime type checking (can be disabled via PY_KEY_VALUE_DISABLE_BEARTYPE=true)
Code must be compatible with Python 3.10+ (codegen targets 3.10)

Files:

  • scripts/build_sync_library.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: test_quick (3.10, macos-latest, key-value/key-value-sync)
  • GitHub Check: test_quick (3.10, ubuntu-latest, key-value/key-value-sync)
  • GitHub Check: test_quick (3.10, ubuntu-22.04, key-value/key-value-sync)
  • GitHub Check: test_quick (3.10, windows-latest, key-value/key-value-aio)
  • GitHub Check: test_quick (3.10, windows-latest, key-value/key-value-sync)
  • GitHub Check: test_quick (3.10, windows-2022, key-value/key-value-sync)
  • GitHub Check: test_quick (3.10, ubuntu-22.04, key-value/key-value-aio)
  • GitHub Check: test_quick (3.10, ubuntu-latest, key-value/key-value-aio)

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/store.py (1)

241-253: Harden mget path: ignore bad documents instead of raising.

Wrap source_to_managed_entry in try/except DeserializationError (as in _get_managed_entry) to avoid bulk get failures on mixed/migrating data.

-            entries_by_id[doc_id] = source_to_managed_entry(source=source)
+            try:
+                entries_by_id[doc_id] = source_to_managed_entry(source=source)
+            except DeserializationError:
+                entries_by_id[doc_id] = None
♻️ Duplicate comments (8)
key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/utils.py (2)

127-135: Serializer install: reuse a single instance and satisfy strict typing.

Create one Serializer instance, assign to both mimetypes, and cast default_serializer. Do this in aio source and regenerate sync.

Apply (illustrative) diff:

-from typing import Any, ClassVar, TypeVar, cast
+from typing import Any, ClassVar, TypeVar, cast
+from elastic_transport import Serializer
@@
     def install_default_serializer(cls, client: Elasticsearch) -> None:
-        cls.install_serializer(client=client)
-        client.transport.serializers.default_serializer = cls()
+        cls.install_serializer(client=client)
+        client.transport.serializers.default_serializer = cast(Serializer, cls())
@@
     def install_serializer(cls, client: Elasticsearch) -> None:
-        client.transport.serializers.serializers.update({cls.mimetype: cls(), cls.compatibility_mimetype: cls()})
+        sc = client.transport.serializers
+        serializer = cast(Serializer, cls())
+        sc.serializers[cls.mimetype] = serializer
+        sc.serializers[cls.compatibility_mimetype] = serializer
@@
     def install_serializer(cls, client: Elasticsearch) -> None:
-        client.transport.serializers.serializers.update({cls.mimetype: cls(), cls.compatibility_mimetype: cls()})
+        sc = client.transport.serializers
+        serializer = cast(Serializer, cls())
+        sc.serializers[cls.mimetype] = serializer
+        sc.serializers[cls.compatibility_mimetype] = serializer

Make this change in key-value-aio/src/.../elasticsearch/utils.py and run make codegen. As per coding guidelines.

Also applies to: 143-148


137-145: Avoid cross-class default() call; inline the error.

Remove the pyright ignores by raising the same error here directly. Apply in aio and regenerate.

 class LessCapableNdjsonSerializer(NdjsonSerializer):
@@
-    def default(self, data: Any) -> Any:
-        return LessCapableJsonSerializer.default(self=self, data=data)  # pyright: ignore[reportCallIssue, reportUnknownVariableType, reportArgumentType]
+    def default(self, data: Any) -> Any:
+        raise SerializationError(message=f"Unable to serialize to JSON: {data!r} (type: {type(data).__name__})")
key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/store.py (1)

121-130: Default must be opt‑in (False) per PR objective.

Flip native_storage default to False in overloads, init, and docstring. Change in aio source and regenerate.

-@overload
-def __init__(..., native_storage: bool = True, ...) -> None: ...
+@overload
+def __init__(..., native_storage: bool = False, ...) -> None: ...
@@
-    native_storage: bool = True,
+    native_storage: bool = False,
@@
-            native_storage: Whether to use native storage mode (flattened field type) or serialize
-                            all values to JSON strings. Defaults to True.
+            native_storage: Whether to use native storage mode (flattened field type) or serialize
+                            all values to JSON strings. Defaults to False.

Also applies to: 138-139, 148-149

key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py (3)

37-41: Minor: drop duplicated pyright ignore.

The generator’s ignore already covers the expression; remove the trailing duplicate.


135-147: Serializer installation: reuse instance and cast; use public transport.

Import Serializer; construct one instance and assign to both mimetypes; cast default_serializer. This satisfies Basedpyright strict. Then regenerate sync.

-from elastic_transport import (
-    JsonSerializer,
-    NdjsonSerializer,
-    ObjectApiResponse,
-    SerializationError,
-)
+from elastic_transport import (
+    JsonSerializer,
+    NdjsonSerializer,
+    ObjectApiResponse,
+    SerializationError,
+    Serializer,
+)
@@
     def install_default_serializer(cls, client: AsyncElasticsearch) -> None:
-        cls.install_serializer(client=client)
-        client.transport.serializers.default_serializer = cls()
+        cls.install_serializer(client=client)
+        client.transport.serializers.default_serializer = cast(Serializer, cls())
@@
     def install_serializer(cls, client: AsyncElasticsearch) -> None:
-        client.transport.serializers.serializers.update(
-            {
-                cls.mimetype: cls(),
-                cls.compatibility_mimetype: cls(),
-            }
-        )
+        sc = client.transport.serializers
+        serializer = cast(Serializer, cls())
+        sc.serializers[cls.mimetype] = serializer
+        sc.serializers[cls.compatibility_mimetype] = serializer

Also applies to: 140-147


156-158: Avoid cross-class default() call; inline the error.

Remove the ignores; raise the error directly.

-    def default(self, data: Any) -> Any:
-        return LessCapableJsonSerializer.default(self=self, data=data)  # pyright: ignore[reportCallIssue, reportUnknownVariableType, reportArgumentType]
+    def default(self, data: Any) -> Any:
+        raise SerializationError(
+            message=f"Unable to serialize to JSON: {data!r} (type: {type(data).__name__})",
+        )
key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py (2)

58-70: Critical: value.string mapping will fail with JSON string storage.

This is the same issue flagged in the past review. The mapping defines value.string as an object type (line 63), which will cause Elasticsearch to throw an error when attempting to store a JSON string. The field should be mapped as a keyword with indexing disabled.

Refer to the past review comment for the recommended fix.


145-164: Critical: native_storage must default to False, not True.

This is the same issue flagged in the past review. The constructor overloads (lines 151, 162) and init (line 173) all default native_storage to True, which:

  1. Contradicts the PR objective of making native storage opt-in
  2. Breaks backward compatibility by changing default behavior
  3. Is inconsistent with the helper function managed_entry_to_document which defaults to False

All three locations must be changed to native_storage: bool = False, and the docstring (lines 183-184) must state "Defaults to False."

Refer to the past review comment for the complete fix including sync regeneration via codegen.

Also applies to: 173-184

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5b07963 and a082156.

📒 Files selected for processing (6)
  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py (12 hunks)
  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py (3 hunks)
  • key-value/key-value-aio/tests/stores/redis/test_redis.py (1 hunks)
  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/store.py (12 hunks)
  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/utils.py (2 hunks)
  • key-value/key-value-sync/tests/code_gen/stores/redis/test_redis.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (6)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Python code must adhere to Ruff formatting and linting with a maximum line length of 140 characters
Use Basedpyright in strict mode for type checking
Use Beartype for runtime type checking (can be disabled via PY_KEY_VALUE_DISABLE_BEARTYPE=true)
Code must be compatible with Python 3.10+ (codegen targets 3.10)

Files:

  • key-value/key-value-aio/tests/stores/redis/test_redis.py
  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py
  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/utils.py
  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py
  • key-value/key-value-sync/tests/code_gen/stores/redis/test_redis.py
  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/store.py
key-value/key-value-aio/**

📄 CodeRabbit inference engine (AGENTS.md)

Async-first: Make all code changes in the async package (key-value/key-value-aio) and regenerate sync

Files:

  • key-value/key-value-aio/tests/stores/redis/test_redis.py
  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py
  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py
{key-value/key-value-aio/src/key_value/aio/stores/**,key-value/key-value-sync/src/key_value/sync/code_gen/stores/**}

📄 CodeRabbit inference engine (AGENTS.md)

Store implementations must store ManagedEntry wrappers, not raw values

Files:

  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py
  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/utils.py
  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py
  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/store.py
key-value/key-value-aio/src/key_value/aio/stores/**

📄 CodeRabbit inference engine (AGENTS.md)

Async store implementations must conform to the AsyncKeyValue protocol from key_value/aio/protocols/key_value.py

Files:

  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py
  • key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py
key-value/key-value-sync/**

📄 CodeRabbit inference engine (AGENTS.md)

Never edit files in key-value/key-value-sync; the sync package is generated and changes will be overwritten

Files:

  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/utils.py
  • key-value/key-value-sync/tests/code_gen/stores/redis/test_redis.py
  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/store.py
key-value/key-value-sync/src/key_value/sync/code_gen/stores/**

📄 CodeRabbit inference engine (AGENTS.md)

Sync store implementations must conform to the KeyValue protocol (generated mirror) from key_value/sync/code_gen/protocols/key_value.py

Files:

  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/utils.py
  • key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/store.py
🧬 Code graph analysis (4)
key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py (2)
key-value/key-value-shared/src/key_value/shared/errors/key_value.py (1)
  • SerializationError (10-11)
key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/utils.py (8)
  • new_bulk_action (114-115)
  • LessCapableJsonSerializer (118-134)
  • default (124-125)
  • default (143-144)
  • install_default_serializer (128-130)
  • install_serializer (133-134)
  • install_serializer (147-148)
  • LessCapableNdjsonSerializer (137-148)
key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/utils.py (2)
key-value/key-value-shared/src/key_value/shared/errors/key_value.py (1)
  • SerializationError (10-11)
key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py (8)
  • new_bulk_action (120-121)
  • LessCapableJsonSerializer (124-147)
  • default (130-133)
  • default (156-157)
  • install_default_serializer (136-138)
  • install_serializer (141-147)
  • install_serializer (160-166)
  • LessCapableNdjsonSerializer (150-166)
key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py (5)
key-value/key-value-shared/src/key_value/shared/errors/key_value.py (2)
  • SerializationError (10-11)
  • DeserializationError (14-15)
key-value/key-value-shared/src/key_value/shared/utils/managed_entry.py (5)
  • ManagedEntry (14-123)
  • load_from_json (134-140)
  • verify_dict (143-152)
  • value_as_dict (47-48)
  • value_as_json (42-44)
key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py (6)
  • LessCapableJsonSerializer (124-147)
  • LessCapableNdjsonSerializer (150-166)
  • new_bulk_action (120-121)
  • install_serializer (141-147)
  • install_serializer (160-166)
  • install_default_serializer (136-138)
key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/store.py (2)
  • managed_entry_to_document (64-78)
  • source_to_managed_entry (81-105)
key-value/key-value-shared/src/key_value/shared/utils/time_to_live.py (1)
  • try_parse_datetime_str (34-41)
key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/store.py (4)
key-value/key-value-shared/src/key_value/shared/errors/key_value.py (2)
  • SerializationError (10-11)
  • DeserializationError (14-15)
key-value/key-value-shared/src/key_value/shared/utils/managed_entry.py (5)
  • ManagedEntry (14-123)
  • load_from_json (134-140)
  • verify_dict (143-152)
  • value_as_dict (47-48)
  • value_as_json (42-44)
key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/utils.py (6)
  • LessCapableJsonSerializer (118-134)
  • LessCapableNdjsonSerializer (137-148)
  • new_bulk_action (114-115)
  • install_serializer (133-134)
  • install_serializer (147-148)
  • install_default_serializer (128-130)
key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py (2)
  • managed_entry_to_document (84-98)
  • source_to_managed_entry (101-129)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: test_quick (3.10, windows-2022, key-value/key-value-aio)
  • GitHub Check: test_quick (3.10, ubuntu-latest, key-value/key-value-sync)
  • GitHub Check: test_quick (3.10, ubuntu-22.04, key-value/key-value-sync)
  • GitHub Check: test_quick (3.10, ubuntu-latest, key-value/key-value-aio)
  • GitHub Check: test_quick (3.10, ubuntu-22.04, key-value/key-value-aio)
🔇 Additional comments (12)
key-value/key-value-aio/tests/stores/redis/test_redis.py (1)

39-40: LGTM: snapshot order update is fine.

Stable JSON ordering in tests; round-trip checks still validate semantics.

key-value/key-value-sync/tests/code_gen/stores/redis/test_redis.py (1)

39-41: LGTM: snapshot order update mirrors aio tests.

Generated file matches source; no behavior change.

key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/utils.py (1)

114-116: LGTM: new_bulk_action helper is correct and minimal.

key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/utils.py (1)

120-122: LGTM: new_bulk_action helper is correct and consistent with sync.

key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py (8)

5-8: LGTM: Import additions are appropriate.

All newly imported symbols are used correctly throughout the file for serialization/deserialization handling and error management.

Also applies to: 26-26


84-98: LGTM: Document builder correctly implements dual storage modes.

The function properly defaults native_storage to False (opt-in behavior) and correctly branches to store values either as flattened dicts or JSON strings.


101-129: LGTM: Deserialization correctly supports both storage modes.

The function properly prioritizes the flattened field (native storage) and falls back to the string field (legacy storage), enabling migration-friendly behavior. Type validation is thorough at each step.


143-143: LGTM: Attribute addition for native storage mode tracking.


201-203: LGTM: Serializer installation ensures consistent error handling.

The custom serializers correctly raise SerializationError for unserializable data, aligning with the error handling strategy throughout the store.


205-206: LGTM: Attribute assignment stores native storage configuration.


312-327: LGTM: Single-document put correctly handles serialization errors.

The error handling properly maps ElasticsearchSerializationError to the public SerializationError type with a descriptive message.


352-363: LGTM: Bulk put error handling now matches single-document pattern.

The implementation correctly catches ElasticsearchSerializationError and maps it to SerializationError, addressing the past review comment about consistent error handling.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py (2)

120-124: Fix resource leak and unreliable response iteration.

The Elasticsearch client is created without an async context manager, causing a resource leak. Additionally, iterating the response object directly is unreliable—you must iterate indices.body.keys() to get actual index names.

Apply this diff:

 async def _cleanup(self):
-    elasticsearch_client = get_elasticsearch_client()
-    indices = await elasticsearch_client.options(ignore_status=404).indices.get(index="kv-store-e2e-test-*")
-    for index in indices:
-        _ = await elasticsearch_client.options(ignore_status=404).indices.delete(index=index)
+    async with get_elasticsearch_client() as elasticsearch_client:
+        resp = await elasticsearch_client.options(ignore_status=404).indices.get(index="kv-store-e2e-test-*")
+        for index in list(resp.body or {}):
+            await elasticsearch_client.options(ignore_status=404).indices.delete(index=index)

After fixing, regenerate the sync version with make codegen.


156-207: Native mode tests look good; consider using _get_destination helper.

The test logic correctly validates native storage format and migration behavior. Both tests properly verify that flattened objects are stored (not JSON strings) and that legacy string-wrapped data can be read.

Optional: For consistency, consider using _get_destination instead of calling both sanitizers separately:

-        index_name = store._sanitize_index_name(collection="test")  # pyright: ignore[reportPrivateUsage]
-        doc_id = store._sanitize_document_id(key="test_key")  # pyright: ignore[reportPrivateUsage]
+        index_name, doc_id = store._get_destination(collection="test", key="test_key")  # pyright: ignore[reportPrivateUsage]

Apply this pattern at lines 162-163 and 190-191.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a082156 and 0ef5175.

📒 Files selected for processing (2)
  • key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py (5 hunks)
  • key-value/key-value-sync/tests/code_gen/stores/elasticsearch/test_elasticsearch.py (5 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Python code must adhere to Ruff formatting and linting with a maximum line length of 140 characters
Use Basedpyright in strict mode for type checking
Use Beartype for runtime type checking (can be disabled via PY_KEY_VALUE_DISABLE_BEARTYPE=true)
Code must be compatible with Python 3.10+ (codegen targets 3.10)

Files:

  • key-value/key-value-sync/tests/code_gen/stores/elasticsearch/test_elasticsearch.py
  • key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py
key-value/key-value-sync/**

📄 CodeRabbit inference engine (AGENTS.md)

Never edit files in key-value/key-value-sync; the sync package is generated and changes will be overwritten

Files:

  • key-value/key-value-sync/tests/code_gen/stores/elasticsearch/test_elasticsearch.py
key-value/key-value-aio/**

📄 CodeRabbit inference engine (AGENTS.md)

Async-first: Make all code changes in the async package (key-value/key-value-aio) and regenerate sync

Files:

  • key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py
🧬 Code graph analysis (2)
key-value/key-value-sync/tests/code_gen/stores/elasticsearch/test_elasticsearch.py (1)
key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/store.py (2)
  • managed_entry_to_document (64-78)
  • source_to_managed_entry (81-105)
key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py (3)
key-value/key-value-sync/tests/code_gen/stores/elasticsearch/test_elasticsearch.py (4)
  • _cleanup (121-125)
  • get_elasticsearch_client (33-34)
  • store (153-155)
  • store (209-211)
key-value/key-value-shared/src/key_value/shared/utils/managed_entry.py (1)
  • ManagedEntry (14-123)
key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py (2)
  • managed_entry_to_document (84-98)
  • source_to_managed_entry (101-129)
🔇 Additional comments (2)
key-value/key-value-sync/tests/code_gen/stores/elasticsearch/test_elasticsearch.py (1)

1-257: This file is auto-generated; review the async source instead.

Per coding guidelines, this sync test file is generated from key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py. Any issues should be fixed in the async source and the sync version regenerated.

key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py (1)

220-268: Non-native mode tests validate correct behavior.

Aside from the docstring issue at line 221, both tests correctly validate that non-native mode stores JSON strings and can read legacy flattened data. The migration test ensures backward compatibility during transitions between modes.

@sonarqubecloud
Copy link

Quality Gate Failed Quality Gate failed

Failed conditions
31.9% Duplication on New Code (required ≤ 3%)

See analysis details on SonarQube Cloud

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py (2)

146-149: Fix index membership assertions to check indices.body.

You computed len(indices.body) but then checked membership on the response object. Use indices.body.

Apply this diff:

-        indices = await es_client.options(ignore_status=404).indices.get(index="kv-store-e2e-test-*")
-        assert len(indices.body) == 2
-        assert "kv-store-e2e-test-test_collection" in indices
-        assert "kv-store-e2e-test-test_collection_2" in indices
+        indices = await es_client.options(ignore_status=404).indices.get(index="kv-store-e2e-test-*")
+        assert len(indices.body) == 2
+        assert "kv-store-e2e-test-test_collection" in indices.body
+        assert "kv-store-e2e-test-test_collection_2" in indices.body

121-124: Minor: reuse the helper to construct the ES client.

For consistency, use get_elasticsearch_client() here.

Apply this diff:

     @pytest.fixture
     async def es_client(self) -> AsyncGenerator[AsyncElasticsearch, None]:
-        async with AsyncElasticsearch(hosts=[ES_URL]) as es_client:
+        async with get_elasticsearch_client() as es_client:
             yield es_client

As per coding guidelines (consistency/readability).

♻️ Duplicate comments (3)
key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py (2)

165-169: Optional: use _get_destination instead of private sanitizers.

Use the helper to derive (index, id) in one call for consistency.

Apply this diff:

-        index_name = store._sanitize_index_name(collection="test")  # pyright: ignore[reportPrivateUsage]
-        doc_id = store._sanitize_document_id(key="test_key")  # pyright: ignore[reportPrivateUsage]
+        index_name, doc_id = store._get_destination(collection="test", key="test_key")  # pyright: ignore[reportPrivateUsage]

228-230: Optional: use _get_destination helper here as well.

Mirror the native-mode test for consistency.

Apply this diff:

-        index_name = store._sanitize_index_name(collection="test")  # pyright: ignore[reportPrivateUsage]
-        doc_id = store._sanitize_document_id(key="test_key")  # pyright: ignore[reportPrivateUsage]
+        index_name, doc_id = store._get_destination(collection="test", key="test_key")  # pyright: ignore[reportPrivateUsage]
key-value/key-value-sync/tests/code_gen/stores/elasticsearch/test_elasticsearch.py (1)

44-48: Fix ObjectApiResponse handling: iterate and assert against .body (apply in aio test, then regenerate sync).

Elasticsearch client returns an ObjectApiResponse; iterating it or using membership on the response object is ambiguous. Iterate indices.body.keys() and check membership against indices.body. This was flagged earlier and appears not regenerated into sync yet.

Apply this diff in the source async test file (key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py), then run make codegen to refresh sync:

 def cleanup_elasticsearch_indices(elasticsearch_client: Elasticsearch):
-    indices = elasticsearch_client.options(ignore_status=404).indices.get(index="kv-store-e2e-test-*")
-    for index in indices:
-        _ = elasticsearch_client.options(ignore_status=404).indices.delete(index=index)
+    indices = elasticsearch_client.options(ignore_status=404).indices.get(index="kv-store-e2e-test-*")
+    for index in indices.body.keys():
+        _ = elasticsearch_client.options(ignore_status=404).indices.delete(index=index)
         indices = es_client.options(ignore_status=404).indices.get(index="kv-store-e2e-test-*")
         assert len(indices.body) == 2
-        assert "kv-store-e2e-test-test_collection" in indices
-        assert "kv-store-e2e-test-test_collection_2" in indices
+        assert "kv-store-e2e-test-test_collection" in indices.body
+        assert "kv-store-e2e-test-test_collection_2" in indices.body

Verify no other places rely on iterating the response object:

#!/bin/bash
rg -nP --type=py -C1 '\bindices\.get\(index="kv-store-e2e-test-\*"\)' 
rg -nP --type=py -C1 '\bfor\s+index\s+in\s+indices\b'
rg -nP --type=py -C1 '\bin\s+indices\)'

Also applies to: 147-151

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0ef5175 and 1185087.

📒 Files selected for processing (2)
  • key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py (6 hunks)
  • key-value/key-value-sync/tests/code_gen/stores/elasticsearch/test_elasticsearch.py (6 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Python code must adhere to Ruff formatting and linting with a maximum line length of 140 characters
Use Basedpyright in strict mode for type checking
Use Beartype for runtime type checking (can be disabled via PY_KEY_VALUE_DISABLE_BEARTYPE=true)
Code must be compatible with Python 3.10+ (codegen targets 3.10)

Files:

  • key-value/key-value-sync/tests/code_gen/stores/elasticsearch/test_elasticsearch.py
  • key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py
key-value/key-value-sync/**

📄 CodeRabbit inference engine (AGENTS.md)

Never edit files in key-value/key-value-sync; the sync package is generated and changes will be overwritten

Files:

  • key-value/key-value-sync/tests/code_gen/stores/elasticsearch/test_elasticsearch.py
key-value/key-value-aio/**

📄 CodeRabbit inference engine (AGENTS.md)

Async-first: Make all code changes in the async package (key-value/key-value-aio) and regenerate sync

Files:

  • key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py
🧬 Code graph analysis (2)
key-value/key-value-sync/tests/code_gen/stores/elasticsearch/test_elasticsearch.py (2)
key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py (6)
  • cleanup_elasticsearch_indices (43-46)
  • cleanup_elasticsearch_indices (127-130)
  • test_managed_entry_document_conversion_native_storage (78-100)
  • es_client (122-124)
  • store (158-159)
  • store (221-222)
key-value/key-value-sync/src/key_value/sync/code_gen/stores/elasticsearch/store.py (2)
  • managed_entry_to_document (64-78)
  • source_to_managed_entry (81-105)
key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py (3)
key-value/key-value-sync/tests/code_gen/stores/elasticsearch/test_elasticsearch.py (4)
  • cleanup_elasticsearch_indices (44-47)
  • cleanup_elasticsearch_indices (128-131)
  • store (159-160)
  • store (214-215)
key-value/key-value-aio/src/key_value/aio/stores/elasticsearch/store.py (2)
  • managed_entry_to_document (84-98)
  • source_to_managed_entry (101-129)
key-value/key-value-aio/tests/stores/base.py (2)
  • ContextManagerStoreTestMixin (266-278)
  • store (31-31)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: test_quick (3.10, macos-14, key-value/key-value-sync)
  • GitHub Check: test_quick (3.10, windows-2022, key-value/key-value-sync)
  • GitHub Check: test_quick (3.10, windows-latest, key-value/key-value-sync)
  • GitHub Check: test_quick (3.10, macos-latest, key-value/key-value-aio)
  • GitHub Check: test_quick (3.10, windows-2022, key-value/key-value-aio)
  • GitHub Check: test_quick (3.10, ubuntu-latest, key-value/key-value-aio)
  • GitHub Check: test_quick (3.10, macos-14, key-value/key-value-aio)
  • GitHub Check: test_quick (3.10, windows-latest, key-value/key-value-aio)
  • GitHub Check: test_quick (3.10, ubuntu-latest, key-value/key-value-sync)
  • GitHub Check: test_quick (3.10, ubuntu-22.04, key-value/key-value-sync)
  • GitHub Check: test_quick (3.10, ubuntu-22.04, key-value/key-value-aio)
🔇 Additional comments (4)
key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py (3)

5-5: LGTM: imports are correct and used.

IsStr and IsFloat are used in snapshots; no issues.


64-64: LGTM: legacy document shape asserted via value["string"].

Matches managed_entry_to_document(..., native_storage=False).


78-101: LGTM: native-storage conversion and round-trip.

Asserts value["flattened"] and validates deserialization; good coverage.

key-value/key-value-sync/tests/code_gen/stores/elasticsearch/test_elasticsearch.py (1)

79-102: Native storage round‑trip test looks solid.

Covers both document shape and source_to_managed_entry round‑trip with TTL behavior. LGTM.

Comment on lines +43 to +46
async def cleanup_elasticsearch_indices(elasticsearch_client: AsyncElasticsearch):
indices = await elasticsearch_client.options(ignore_status=404).indices.get(index="kv-store-e2e-test-*")
for index in indices:
_ = await elasticsearch_client.options(ignore_status=404).indices.delete(index=index)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Iterate response body keys; don’t iterate the response object.

indices.get returns a response; iterating it is unreliable. Iterate resp.body keys and delete each index. Also guard for None.

Apply this diff:

 async def cleanup_elasticsearch_indices(elasticsearch_client: AsyncElasticsearch):
-    indices = await elasticsearch_client.options(ignore_status=404).indices.get(index="kv-store-e2e-test-*")
-    for index in indices:
-        _ = await elasticsearch_client.options(ignore_status=404).indices.delete(index=index)
+    resp = await elasticsearch_client.options(ignore_status=404).indices.get(index="kv-store-e2e-test-*")
+    for index in list(resp.body or {}):
+        await elasticsearch_client.options(ignore_status=404).indices.delete(index=index)

Based on past review comments.

🤖 Prompt for AI Agents
In key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py
around lines 43 to 46, the code iterates the AsyncElasticsearch response object
directly which is unreliable; instead assign the result to a variable (e.g.,
resp), guard that resp and resp.body are not None, then iterate resp.body.keys()
and call indices.delete for each key (use options(ignore_status=404) as before)
so you delete each index by name safely.

Comment on lines +126 to +130
@pytest.fixture(autouse=True)
async def cleanup_elasticsearch_indices(self, es_client: AsyncElasticsearch):
await cleanup_elasticsearch_indices(elasticsearch_client=es_client)
yield
await cleanup_elasticsearch_indices(elasticsearch_client=es_client)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Avoid helper/fixture name shadowing; rename the fixture for clarity.

The fixture name matches the module helper, which is confusing. Rename the fixture to reset_indices.

Apply this diff:

-    @pytest.fixture(autouse=True)
-    async def cleanup_elasticsearch_indices(self, es_client: AsyncElasticsearch):
-        await cleanup_elasticsearch_indices(elasticsearch_client=es_client)
-        yield
-        await cleanup_elasticsearch_indices(elasticsearch_client=es_client)
+    @pytest.fixture(autouse=True)
+    async def reset_indices(self, es_client: AsyncElasticsearch):
+        await cleanup_elasticsearch_indices(elasticsearch_client=es_client)
+        yield
+        await cleanup_elasticsearch_indices(elasticsearch_client=es_client)
🤖 Prompt for AI Agents
In key-value/key-value-aio/tests/stores/elasticsearch/test_elasticsearch.py
around lines 126 to 130, the autouse fixture is named
cleanup_elasticsearch_indices which shadows the module helper; rename the
fixture to reset_indices to avoid confusion, update the decorator and function
name (e.g., @pytest.fixture(autouse=True) async def reset_indices(...):) and
update any local references or yields in this test file to use the new fixture
name, leaving the helper function name unchanged.

Comment on lines +127 to 132
@pytest.fixture(autouse=True)
def cleanup_elasticsearch_indices(self, es_client: Elasticsearch):
cleanup_elasticsearch_indices(elasticsearch_client=es_client)
yield
cleanup_elasticsearch_indices(elasticsearch_client=es_client)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Avoid fixture/function name shadowing for clarity.

This autouse fixture’s name shadows the module‑level function cleanup_elasticsearch_indices. Not a bug (name lookup resolves to the module function), but it’s confusing. Rename the fixture to make intent obvious. Apply in aio test then regenerate sync.

-@pytest.fixture(autouse=True)
-def cleanup_elasticsearch_indices(self, es_client: Elasticsearch):
-    cleanup_elasticsearch_indices(elasticsearch_client=es_client)
-    yield
-    cleanup_elasticsearch_indices(elasticsearch_client=es_client)
+@pytest.fixture(autouse=True)
+def cleanup_elasticsearch_indices_fixture(self, es_client: Elasticsearch):
+    cleanup_elasticsearch_indices(elasticsearch_client=es_client)
+    yield
+    cleanup_elasticsearch_indices(elasticsearch_client=es_client)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
@pytest.fixture(autouse=True)
def cleanup_elasticsearch_indices(self, es_client: Elasticsearch):
cleanup_elasticsearch_indices(elasticsearch_client=es_client)
yield
cleanup_elasticsearch_indices(elasticsearch_client=es_client)
@pytest.fixture(autouse=True)
def cleanup_elasticsearch_indices_fixture(self, es_client: Elasticsearch):
cleanup_elasticsearch_indices(elasticsearch_client=es_client)
yield
cleanup_elasticsearch_indices(elasticsearch_client=es_client)
🤖 Prompt for AI Agents
In
key-value/key-value-sync/tests/code_gen/stores/elasticsearch/test_elasticsearch.py
around lines 127 to 132, the autouse fixture name cleanup_elasticsearch_indices
shadows the module-level function of the same name; rename the fixture (for
example to cleanup_elasticsearch_fixture or es_indices_cleanup_fixture) and
update its internal calls to invoke the module function by its original name
(cleanup_elasticsearch_indices) so there’s no ambiguity; apply the same rename
in the corresponding aio test file before regenerating the sync tests.

Comment on lines +162 to +192
def test_value_stored_as_flattened_object(self, store: ElasticsearchStore, es_client: Elasticsearch):
"""Verify values are stored as flattened objects, not JSON strings"""
store.put(collection="test", key="test_key", value={"name": "Alice", "age": 30})

# Check raw Elasticsearch document using public sanitization methods
# Note: We need to access these internal methods for testing the storage format
index_name = store._sanitize_index_name(collection="test") # pyright: ignore[reportPrivateUsage]
doc_id = store._sanitize_document_id(key="test_key") # pyright: ignore[reportPrivateUsage]

response = es_client.get(index=index_name, id=doc_id)
assert response.body["_source"] == snapshot(
{
"collection": "test",
"key": "test_key",
"value": {"flattened": {"name": "Alice", "age": 30}},
"created_at": IsStr(min_length=20, max_length=40),
}
)

# Test with TTL
store.put(collection="test", key="test_key", value={"name": "Bob", "age": 25}, ttl=10)
response = es_client.get(index=index_name, id=doc_id)
assert response.body["_source"] == snapshot(
{
"collection": "test",
"key": "test_key",
"value": {"flattened": {"name": "Bob", "age": 25}},
"created_at": IsStr(min_length=20, max_length=40),
"expires_at": IsStr(min_length=20, max_length=40),
}
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Minimize reliance on private methods in tests.

Using _sanitize_index_name and _sanitize_document_id couples tests to internals (note the pyright ignores). Prefer a public helper (e.g., store.index_name_for(collection) / store.document_id_for(key)) exposed for testing, or compute index/doc id via documented rules here. If not feasible now, keep as is.

🤖 Prompt for AI Agents
In
key-value/key-value-sync/tests/code_gen/stores/elasticsearch/test_elasticsearch.py
lines 162-192 the test accesses private methods _sanitize_index_name and
_sanitize_document_id (with pyright ignores), coupling tests to internals;
expose public helper methods on the ElasticsearchStore (e.g.,
index_name_for(collection) and document_id_for(key)) or implement the same
documented sanitization logic directly in the test, then update the test to call
those public helpers (or the inline documented rules) and remove the pyright
ignores so tests use public API instead of private methods.

Comment on lines +217 to +246
def test_value_stored_as_json_string(self, store: ElasticsearchStore, es_client: Elasticsearch):
"""Verify values are stored as JSON strings"""
store.put(collection="test", key="test_key", value={"name": "Alice", "age": 30})

index_name = store._sanitize_index_name(collection="test") # pyright: ignore[reportPrivateUsage]
doc_id = store._sanitize_document_id(key="test_key") # pyright: ignore[reportPrivateUsage]

response = es_client.get(index=index_name, id=doc_id)
assert response.body["_source"] == snapshot(
{
"collection": "test",
"key": "test_key",
"value": {"string": '{"age": 30, "name": "Alice"}'},
"created_at": IsStr(min_length=20, max_length=40),
}
)

# Test with TTL
store.put(collection="test", key="test_key", value={"name": "Bob", "age": 25}, ttl=10)
response = es_client.get(index=index_name, id=doc_id)
assert response.body["_source"] == snapshot(
{
"collection": "test",
"key": "test_key",
"value": {"string": '{"age": 25, "name": "Bob"}'},
"created_at": IsStr(min_length=20, max_length=40),
"expires_at": IsStr(min_length=20, max_length=40),
}
)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Make non‑native JSON assertion robust to serializer differences.

Asserting the exact JSON string is brittle (key order, spacing). Validate semantics by parsing the JSON string, while still snapshotting the rest.

Apply in aio test then regenerate sync:

         response = es_client.get(index=index_name, id=doc_id)
-        assert response.body["_source"] == snapshot(
+        src = response.body["_source"]
+        assert src == snapshot(
             {
                 "collection": "test",
                 "key": "test_key",
-                "value": {"string": '{"age": 30, "name": "Alice"}'},
+                # keep snapshot of structure; content validated separately
+                "value": {"string": IsStr(min_length=10)},
                 "created_at": IsStr(min_length=20, max_length=40),
             }
         )
+        import json as _json
+        assert _json.loads(src["value"]["string"]) == {"name": "Alice", "age": 30}

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In
key-value/key-value-sync/tests/code_gen/stores/elasticsearch/test_elasticsearch.py
around lines 217 to 246, the test asserts an exact JSON string which is brittle;
instead parse the stored JSON string into a dict and assert its semantic
equality to the expected dict while snapshotting the remaining fields. Replace
the direct string comparison for response.body["_source"]["value"]["string"]
with json.loads(response.body["_source"]["value"]["string"]) and compare that
dict to {"name":"Alice","age":30} (and {"name":"Bob","age":25} for the TTL
case), keep snapshot asserts for collection, key, created_at and expires_at
using IsStr, and mirror the same change in the aio test before regenerating the
sync tests.

@strawgate strawgate merged commit 7a3edc9 into main Oct 28, 2025
175 of 237 checks passed
@strawgate strawgate deleted the claude/issue-87-20251027-0227 branch October 28, 2025 18:46
claude bot pushed a commit that referenced this pull request Oct 28, 2025
Add support for storing dictionaries as native BSON objects in MongoDB,
similar to the native storage feature implemented for Elasticsearch in #149.

Changes:
- Add native_storage parameter to MongoDBStore (defaults to True)
- Use two-field approach: value.dict for native BSON, value.string for legacy JSON strings
- Support reading both formats for backward compatibility and migration
- Update serialization/deserialization functions to handle both modes
- Add comprehensive tests for native mode, legacy mode, and migration scenarios
- Update sync library via codegen

The native storage mode stores values as BSON documents instead of JSON strings,
enabling better query support and more efficient storage in MongoDB.

Resolves #159

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-authored-by: William Easton <strawgate@users.noreply.github.com>
@claude claude bot mentioned this pull request Oct 28, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add support for storing dictionaries directly in Elasticsearch

2 participants