fix(milvus): Enhanced Milvus VectorDB Instrumentation for Improved search Monitoring #2815

divyapathak24 · 2025-04-14T18:48:07Z

- I have added tests that cover my changes.
- If adding a new instrumentation or changing an existing one, I've added screenshots from some observability platform showing the change.
- PR name follows conventional commits format: feat(instrumentation): ... or fix(instrumentation): ....
- (If applicable) I have updated the documentation accordingly.

New search attributes for single-vector searches

New search attributes for multi-vector searches

Search Events to log results:

Important

Enhance Milvus VectorDB instrumentation with detailed search monitoring attributes and events, and add tests for single and multi-vector searches.

Instrumentation Enhancements:
- Add search duration and result status attributes in _wrap() in wrapper.py.
- Log search result events in _add_search_result_events() in wrapper.py.
- Add new search attributes like MILVUS_SEARCH_RADIUS, MILVUS_SEARCH_METRIC_TYPE, and MILVUS_SEARCH_INDEX_TYPE in semconv_ai/__init__.py.
Testing:
- Add test_search.py to test single and multi-vector searches, verifying span attributes and events.
- Test different radius values to ensure varying result counts.
Misc:
- Import time module in wrapper.py for measuring search duration.
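
The testing bullet about radius values can be illustrated without a Milvus server. This is a minimal pure-Python sketch, not the PR's test code: `range_search` and `summarize` are hypothetical helpers, the "L2 distance below radius" rule is an assumption about how the radius bounds results, and the attribute keys are copied from the diff hunks quoted later in this thread (they may differ from what was finally merged).

```python
from math import dist


def range_search(query, vectors, radius):
    """Return (index, distance) pairs with L2 distance below `radius`, nearest first.

    Hypothetical stand-in for a radius-bounded vector search.
    """
    hits = [(i, dist(query, v)) for i, v in enumerate(vectors)]
    return sorted((h for h in hits if h[1] < radius), key=lambda h: h[1])


def summarize(hits):
    """Build a flat dict of result statistics keyed by span-attribute names."""
    distances = [d for _, d in hits]
    summary = {"db.milvus.search.result_count": len(distances)}
    if distances:  # min/max/avg are undefined for an empty result set
        summary["db.milvus.search.result_min_distance"] = min(distances)
        summary["db.milvus.search.result_max_distance"] = max(distances)
        summary["db.milvus.search.result_avg_distance"] = sum(distances) / len(distances)
    return summary


# Three stored vectors at L2 distance 0, 5 and 10 from the query.
vectors = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
query = (0.0, 0.0)

for radius in (1.0, 6.0, 11.0):
    # A larger radius admits more hits, so the result count grows.
    print(radius, summarize(range_search(query, vectors, radius)))
```

Widening the radius from 1.0 to 6.0 to 11.0 admits one, two, and then three hits, which is the kind of variation the tests are described as checking.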

This description was created by the ellipsis-dev bot for 3fe4f70. It will automatically update as commits are pushed.

CLAassistant · 2025-04-14T18:48:13Z

All committers have signed the CLA.

ellipsis-dev

❌ Changes requested. Reviewed everything up to 3fe4f70 in 3 minutes and 16 seconds

More details

Looked at 454 lines of code in 3 files
Skipped 0 files when reviewing.
Skipped posting 5 drafted comments based on config settings.

1. packages/opentelemetry-instrumentation-milvus/opentelemetry/instrumentation/milvus/wrapper.py:313

Draft comment:
BUG: In function _add_search_result_events, the variable 'query_match_ids' used in set_global_stats is not defined in the outer scope. It should be aggregated outside the loop or passed appropriately.
Reason this comment was not posted:
Comment looked like it was already resolved.

2. packages/opentelemetry-instrumentation-milvus/opentelemetry/instrumentation/milvus/wrapper.py:222

Draft comment:
Potential Issue: The _encode_partition_name function is used on 'partition_names', which might be a list. Consider encoding lists appropriately if multiple partition names are expected.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 0% vs. threshold = 50%
The comment points out a potential issue but doesn't provide strong evidence that it's actually a problem. The str() function works fine on lists, converting them to a readable string representation. There's no evidence this is causing any issues. The code already handles partition_names as a potential list elsewhere (line 216 uses count_or_none on it).
The comment could be right that this wasn't the intended behavior - maybe partition names should be encoded differently than just str(). But without more context about requirements or bugs, we can't be sure.
While the encoding could potentially be done differently, the current implementation is not clearly wrong - str() is a reasonable way to encode both single values and lists for logging/tracing purposes.
The comment raises a speculative concern without clear evidence of an actual problem. The current implementation appears functional.

3. packages/opentelemetry-instrumentation-milvus/tests/test_search.py:103

Draft comment:
Note: The tests assert string representations for list attributes (e.g. '[5]' for query vector dimensions). Ensure that this string conversion is intended and documented.
Reason this comment was not posted:
Confidence changes required: 33% <= threshold 50%
None

4. packages/opentelemetry-instrumentation-milvus/opentelemetry/instrumentation/milvus/wrapper.py:313

Draft comment:
The variable 'query_match_ids' is used in set_global_stats without being defined in an outer scope. In a single-query scenario it relies on the loop variable. Consider initializing and aggregating match IDs outside the loop for clarity and to prevent potential issues if no results are returned.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50%
While technically the variable is defined in a narrower scope than where it's used, the code flow ensures it will always be defined before use. The single_query case only happens when len(kwargs) == 1, meaning we'll always go through the loop exactly once before calling set_global_stats(). The code is actually safe as written.
The reviewer raises a valid point about code clarity and potential fragility. If the code structure changes in the future, this implicit dependency could become problematic.
However, the current implementation is logically sound and the variable will always be defined when needed. The suggested change would make the code more verbose without adding real safety.
The comment should be deleted as it points out a theoretical issue that isn't actually a problem in practice, and the suggested change would not meaningfully improve the code.
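
The scoping concern in this draft comment can be reproduced in isolation. The sketch below is illustrative only: `last_ids_implicit` and `all_ids_explicit` are hypothetical functions, not code from wrapper.py, and the hit dictionaries are made-up data. It shows that a name bound only inside a loop body is safe when the loop runs at least once and fails when it runs zero times.

```python
def last_ids_implicit(results):
    """Rely on the loop variable surviving the loop (fragile)."""
    for hits in results:
        query_match_ids = [hit["id"] for hit in hits]
    # If `results` is empty the loop body never ran, so this name is unbound.
    return query_match_ids


def all_ids_explicit(results):
    """Initialise the accumulator before the loop (safe for empty input)."""
    all_match_ids = []
    for hits in results:
        all_match_ids.extend(hit["id"] for hit in hits)
    return all_match_ids


single_query = [[{"id": 1}, {"id": 2}]]
print(last_ids_implicit(single_query))
print(all_ids_explicit(single_query))

try:
    last_ids_implicit([])
except UnboundLocalError as exc:
    print("empty input fails:", type(exc).__name__)

print(all_ids_explicit([]))
```

With exactly one query the two versions agree, which matches the reasoning above; the explicit version only matters if the "always one iteration" guarantee is ever lost.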

5. packages/opentelemetry-semantic-conventions-ai/opentelemetry/semconv_ai/__init__.py:25

Draft comment:
There seems to be an inconsistency in the naming of the Pinecone usage unit keys. 'PINECONE_DB_USAGE_READ_UNITS' uses the dot notation in its value ('db.pinecone.usage.read_units'), but 'PINECONE_DB_USAGE_WRITE_UNITS' uses an underscore ('db.pinecone.usage_write_units'). Consider updating it to 'db.pinecone.usage.write_units' for consistency.
Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.

Workflow ID: wflow_5CBtqlqXvBq3Y90S


ellipsis-dev · 2025-04-14T18:51:29Z

packages/opentelemetry-instrumentation-milvus/opentelemetry/instrumentation/milvus/wrapper.py

        span, AISpanAttributes.MILVUS_SEARCH_ANNS_FIELD, kwargs.get("anns_field")
    )
+    _set_span_attribute(
+        span, AISpanAttributes.MILVUS_SEARCH_PARTITION_NAMES, _encode_partition_name(kwargs.get("partition_names"))


The _encode_partition_name function is used to encode partition_names, which may be a list. Consider enhancing this helper to handle list inputs appropriately if that's expected.
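
One way to handle list inputs is sketched below. This is an assumption-laden illustration: `encode_partition_names` is a hypothetical helper, not the `_encode_partition_name` function in wrapper.py (whose body is not shown in this thread), and JSON is just one possible encoding for a string-valued span attribute.

```python
import json


def encode_partition_names(partition_names):
    """Encode None, a single name, or a list of names as a JSON string.

    Hypothetical helper; returns None so that a missing value can be skipped.
    """
    if partition_names is None:
        return None
    if isinstance(partition_names, str):
        # Treat a bare string as a one-element list rather than iterating its characters.
        partition_names = [partition_names]
    return json.dumps([str(name) for name in partition_names])


print(encode_partition_names(None))
print(encode_partition_names("partition_a"))
print(encode_partition_names(["partition_a", "partition_b"]))
```

Compared with calling `str()` on the list, a JSON string can be parsed back by a consumer without depending on Python's repr format.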

ellipsis-dev · 2025-04-14T18:51:29Z

packages/opentelemetry-instrumentation-milvus/opentelemetry/instrumentation/milvus/wrapper.py

+    )
+    _set_span_attribute(
+        span, 
+        AISpanAttributes.MILVUS_SEARCH_RADIUS, 


Consider storing kwargs.get("search_params") in a local variable to avoid redundant calls and improve readability when extracting radius, metric_type, and index_type.
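
A minimal sketch of that suggestion, assuming the three values sit at the top level of `search_params` as the diff hunks in this thread imply. `extract_search_params` is a hypothetical helper name, not a function from the PR.

```python
def extract_search_params(kwargs):
    """Read search_params once and pull out the keys of interest.

    Missing keys (or a missing/None search_params) yield None values.
    """
    search_params = kwargs.get("search_params") or {}
    return {
        key: search_params.get(key)
        for key in ("radius", "metric_type", "index_type")
    }


print(extract_search_params({"search_params": {"radius": 0.8, "metric_type": "L2"}}))
print(extract_search_params({}))
```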

packages/opentelemetry-instrumentation-milvus/tests/test_search.py

nirga

Thanks @divyapathak24! Can you fix the broken CI?

divyapathak24 · 2025-04-15T04:25:38Z

> Thanks @divyapathak24! Can you fix the broken CI?

Sure. I will fix them @nirga

divyapathak24 · 2025-04-16T14:26:17Z

@nirga I think the correct approach is to first add the new attributes to semantic-conventions-ai and publish a new version of the package (say 0.4.4). Only after that can I safely use those attributes in Milvus; otherwise, the tests will fail because the new attributes won't be available to Milvus yet, right?

galkleinman

Hey @divyapathak24,

Thanks for submitting this one! please look at my comments - if you agree with them, please adjust (all occurrences of the things i mentioned, i didn't comment on all of them), otherwise let's discuss it here :)

galkleinman · 2025-04-17T12:39:29Z

packages/opentelemetry-instrumentation-milvus/opentelemetry/instrumentation/milvus/wrapper.py

+            _set_span_attribute(
+                span,
+                AISpanAttributes.MILVUS_SEARCH_DURATION_IN_MS,
+                ((end_time - start_time) * 1000),
+            )
+            _set_span_attribute(
+                span, AISpanAttributes.MILVUS_SEARCH_RESULT_STATUS, "success"
+            )


wondering whether these attrs are needed... the search time (the way it's being calculated here) can be inferred implicitly from the start and end time of the span.

for the MILVUS_SEARCH_RESULT_STATUS, wondering if we need it, because it's sort of a constant here and because we can just use the generic "span.error"/"span.status_code" to report errors.
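
To make the point concrete, here is a stdlib-only sketch. `SpanRecord` is a hypothetical stand-in for a finished span, not an OpenTelemetry class; it only mirrors the idea that a span carries nanosecond start and end timestamps plus a status, so duration and success/failure need no extra attributes. Swallowing the exception is a simplification for the demo; a real wrapper would re-raise it.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class SpanRecord:
    """Hypothetical stand-in for a span: timestamps are ns since the epoch."""
    name: str
    start_time: int
    end_time: Optional[int] = None
    status: str = "UNSET"


def record_span(name, operation):
    """Run `operation` and capture its timing and outcome on a SpanRecord."""
    span = SpanRecord(name=name, start_time=time.time_ns())
    try:
        operation()
        span.status = "OK"
    except Exception:
        span.status = "ERROR"  # demo only: a real wrapper would re-raise here
    finally:
        span.end_time = time.time_ns()
    return span


def duration_ms(span):
    """Derive the duration from the timestamps the span already holds."""
    return (span.end_time - span.start_time) / 1_000_000


ok_span = record_span("milvus.search", lambda: time.sleep(0.01))
failed_span = record_span("milvus.search", lambda: 1 / 0)
print(ok_span.status, round(duration_ms(ok_span), 1))
print(failed_span.status)
```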

galkleinman · 2025-04-17T15:26:05Z

packages/opentelemetry-instrumentation-milvus/opentelemetry/instrumentation/milvus/wrapper.py

+            if kwargs.get("search_params") and "radius" in kwargs.get("search_params")
+            else None


isn't it redundant?
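
The redundancy can be checked directly. In this sketch `radius_verbose` reconstructs the shape of the reviewed expression; only its condition appears in the hunk, so the value part (`.get("radius")`) is an assumption. `radius_concise` is a hypothetical alternative, and both function names are made up for illustration.

```python
def radius_verbose(kwargs):
    """Conditional form: membership check before reading the key (value part assumed)."""
    return (
        kwargs.get("search_params").get("radius")
        if kwargs.get("search_params") and "radius" in kwargs.get("search_params")
        else None
    )


def radius_concise(kwargs):
    """dict.get already returns None for a missing key, so no membership check is needed."""
    return (kwargs.get("search_params") or {}).get("radius")


cases = [
    {},
    {"search_params": None},
    {"search_params": {}},
    {"search_params": {"metric_type": "L2"}},
    {"search_params": {"radius": 0.5}},
    {"search_params": {"radius": 0}},
]

for case in cases:
    print(case, radius_verbose(case), radius_concise(case))
```

Both forms return the same value for every case listed, including a falsy radius of 0.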

galkleinman · 2025-04-17T15:26:22Z

packages/opentelemetry-instrumentation-milvus/opentelemetry/instrumentation/milvus/wrapper.py

+            if kwargs.get("search_params")
+            and "metric_type" in kwargs.get("search_params")
+            else None


same as above

galkleinman · 2025-04-17T15:27:22Z

packages/opentelemetry-semantic-conventions-ai/opentelemetry/semconv_ai/__init__.py

+    MILVUS_SEARCH_INDEX_TYPE = "db.milvus.search.index_type"
    MILVUS_SEARCH_TIMEOUT = "db.milvus.search.timeout"
+    MILVUS_SEARCH_RESULT_COUNT = "db.milvus.search.result_count"
+    MILVUS_SEARCH_RESULT_STATUS = "db.milvus.search.status"


Just to confirm, do you mean removing these lines entirely and preparing a separate PR on semconv_ai?

galkleinman · 2025-04-17T15:27:37Z

packages/opentelemetry-semantic-conventions-ai/opentelemetry/semconv_ai/__init__.py

+    MILVUS_SEARCH_RESULT_MIN_DISTANCE = "db.milvus.search.result_min_distance"
+    MILVUS_SEARCH_RESULT_MAX_DISTANCE = "db.milvus.search.result_max_distance"
+    MILVUS_SEARCH_RESULT_AVG_DISTANCE = "db.milvus.search.result_avg_distance"
+    MILVUS_SEARCH_DURATION_IN_MS = "db.milvus.search.duration_ms"


divyapathak24 · 2025-04-22T11:47:01Z

> Hey @divyapathak24,
>
> Thanks for submitting this one! please look at my comments - if you agree with them, please adjust (all occurrences of the things i mentioned, i didn't comment on all of them), otherwise let's discuss it here :)

Hi @galkleinman
The explicit attributes like MILVUS_SEARCH_DURATION_IN_MS, MILVUS_SEARCH_RESULT_STATUS, RADIUS, and METRIC_TYPE make Milvus search insights directly accessible, avoiding post-processing and assumptions about span internals.

galkleinman · 2025-04-26T12:53:59Z

> > Hey @divyapathak24,
> > Thanks for submitting this one! please look at my comments - if you agree with them, please adjust (all occurrences of the things i mentioned, i didn't comment on all of them), otherwise let's discuss it here :)
>
> Hi @galkleinman The explicit attributes like MILVUS_SEARCH_DURATION_IN_MS, MILVUS_SEARCH_RESULT_STATUS, RADIUS, and METRIC_TYPE make Milvus search insights directly accessible, avoiding post-processing and assumptions about span internals.

Hey again @divyapathak24,

It isn't a span internal, and no additional post-processing is needed, it's just the way OTEL works afaik... RADIUS and METRIC_TYPE are indeed unique to Milvus. But the duration of the action (in this case search) applies to all spans, and is therefore implemented OOTB via start_time, end_time and duration... So it feels (to me at least) like data duplication and a deviation from the standard/protocol.

@nirga wdyt?

divyapathak24 · 2025-04-29T04:27:14Z

> > > Hey @divyapathak24,
> > > Thanks for submitting this one! please look at my comments - if you agree with them, please adjust (all occurrences of the things i mentioned, i didn't comment on all of them), otherwise let's discuss it here :)
> >
> > Hi @galkleinman The explicit attributes like MILVUS_SEARCH_DURATION_IN_MS, MILVUS_SEARCH_RESULT_STATUS, RADIUS, and METRIC_TYPE make Milvus search insights directly accessible, avoiding post-processing and assumptions about span internals.
>
> Hey again @divyapathak24,
>
> It isn't a span internal, and no additional post-processing is needed, it's just the way OTEL works afaik... RADIUS and METRIC_TYPE are indeed unique to Milvus. But the duration of the action (in this case search) applies to all spans, and is therefore implemented OOTB via start_time, end_time and duration... So it feels (to me at least) like data duplication and a deviation from the standard/protocol.
>
> @nirga wdyt?

Hi @galkleinman @nirga,

Thanks for the feedback! I will remove the redundant attributes. Just wanted to confirm if everything else looks good to you both. Let me know if you have any other suggestions for other attributes.

divyapathak24 · 2025-04-30T05:57:09Z

@nirga @galkleinman Can you please review the changes? I have removed redundant attributes and pushed changes and updated test cases accordingly.
cc: @hk-bmi

nirga · 2025-04-30T06:33:04Z

Thanks @divyapathak24 - can you sign the CLA? - #2815 (comment)

divyapathak24 · 2025-04-30T07:18:55Z

@nirga done. Can you re-trigger the build?

divyapathak24 · 2025-04-30T18:20:25Z

> @nirga done. Can you re-trigger the build?

@nirga @galkleinman It seems my test cases are failing because they can't find the newly added attributes in semconv-ai. How do I resolve this? By changing the semconv-ai package version from 0.4.3 to 0.4.4?

nirga · 2025-04-30T19:34:02Z

Yes @divyapathak24 :)

divyapathak24 · 2025-05-05T04:03:30Z

@nirga @galkleinman I have opened another PR #2883 for semconv version update which has updated search attributes.


ellipsis-dev bot reviewed Apr 14, 2025


nirga reviewed Apr 14, 2025


divyapathak24 force-pushed the main branch from dc63cc5 to f0a54b2 Compare April 15, 2025 15:11

galkleinman reviewed Apr 17, 2025


divyapathak24 closed this May 5, 2025

divyapathak24 force-pushed the main branch from dcc4bbf to 7a1b8bb Compare May 5, 2025 03:29

updated milvus search instrumentation

8c168fc

divyapathak24 reopened this May 5, 2025

divyapathak24 and others added 4 commits May 7, 2025 15:49

modified instrumentation for events attributes

aabbb9e

bug fix

1ce0f8f

bug fix

d65594e

Merge branch 'traceloop:main' into main

068a102

nirga approved these changes May 10, 2025


nirga merged commit c3b3559 into traceloop:main May 10, 2025
9 checks passed

nina-kollman pushed a commit that referenced this pull request Aug 11, 2025

fix(milvus): Enhanced Milvus VectorDB Instrumentation for Improved se…

a1c4981

…arch Monitoring (#2815)


4 participants
