fix(bedrock): utilize invocation metrics from response body for AI21, Anthropic, Meta models when available to record usage on spans #1286
Conversation
Great work @aannirajpatel, thanks so much! I've been meaning to do that for a while. Left a small comment, and there's a small lint issue to fix 🙏
```
@@ -216,8 +216,18 @@ def _set_anthropic_completion_span_attributes(span, request_body, response_body):
    )

    if Config.enrich_token_usage:
        prompt_tokens = _count_anthropic_tokens([request_body.get("prompt")])
```
The reason we put this under `if Config.enrich_token_usage` is that `_count_anthropic_tokens` is expensive to run and we want to give users an option to disable it. If you're getting the data from the response, there's no need to put it under this `if`.
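Concretely, the suggested pattern looks something like the sketch below; the helper name `_record_anthropic_usage`, the span-attribute plumbing, and the `amazon-bedrock-invocationMetrics` response field are assumptions drawn from the PR description rather than the merged code:

```python
def _record_anthropic_usage(span, request_body, response_body):
    metrics = response_body.get("amazon-bedrock-invocationMetrics")
    if metrics is not None:
        # Actual counts reported by Bedrock in the response body:
        # no config gate needed.
        prompt_tokens = metrics.get("inputTokenCount")
        completion_tokens = metrics.get("outputTokenCount")
    elif Config.enrich_token_usage:
        # Fallback: local tokenization is expensive, so it stays opt-in.
        prompt_tokens = _count_anthropic_tokens([request_body.get("prompt")])
        completion_tokens = _count_anthropic_tokens([response_body.get("completion")])
    else:
        return

    span.set_attribute(SpanAttributes.LLM_USAGE_PROMPT_TOKENS, prompt_tokens)
    span.set_attribute(SpanAttributes.LLM_USAGE_COMPLETION_TOKENS, completion_tokens)
    span.set_attribute(
        SpanAttributes.LLM_USAGE_TOTAL_TOKENS, prompt_tokens + completion_tokens
    )
```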
```
@@ -361,6 +380,17 @@ def _set_llama_span_attributes(span, request_body, response_body):
        span, SpanAttributes.LLM_REQUEST_MAX_TOKENS, request_body.get("max_gen_len")
    )

    if Config.enrich_token_usage and response_body.get("prompt_token_count") is not None and response_body.get("generation_token_count") is not None:
```
Same here, no need for the `Config.enrich_token_usage` check.
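Applied to the Llama path, where the counts arrive in the response body itself, the suggestion would leave roughly this (field names per the diff above; the span-attribute calls are illustrative):

```python
prompt_tokens = response_body.get("prompt_token_count")
completion_tokens = response_body.get("generation_token_count")
if prompt_tokens is not None and completion_tokens is not None:
    # No expensive work here, so nothing to hide behind Config.enrich_token_usage.
    span.set_attribute(SpanAttributes.LLM_USAGE_PROMPT_TOKENS, prompt_tokens)
    span.set_attribute(SpanAttributes.LLM_USAGE_COMPLETION_TOKENS, completion_tokens)
    span.set_attribute(
        SpanAttributes.LLM_USAGE_TOTAL_TOKENS, prompt_tokens + completion_tokens
    )
```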
Thanks for the pointers, Nir! I've addressed the lint issue and consolidated the usage-attribute logic into a function call shared by the ai21, anthropic, and meta paths. I also implemented similar logic for Cohere based on their API documentation for Command R and related models. However, when I ran a local test, the model did not return token counts as the documentation suggested it would, so I've wrapped that logic in a try/except but kept it in place (an O(1) hit at worst, so not much to lose).
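A rough sketch of what that consolidation could look like; the helper names here are hypothetical, and the Cohere field names follow their Command R API docs, which the local test above did not reproduce, hence the try/except:

```python
def _record_usage_to_span(span, prompt_tokens, completion_tokens):
    # Hypothetical shared helper for the ai21, anthropic, cohere, and meta paths.
    span.set_attribute(SpanAttributes.LLM_USAGE_PROMPT_TOKENS, prompt_tokens)
    span.set_attribute(SpanAttributes.LLM_USAGE_COMPLETION_TOKENS, completion_tokens)
    span.set_attribute(
        SpanAttributes.LLM_USAGE_TOTAL_TOKENS, prompt_tokens + completion_tokens
    )


def _set_cohere_usage_attributes(span, response_body):
    # Cohere's docs suggest counts under meta.billed_units, but a local test
    # did not return them, so fail soft rather than break the span.
    try:
        billed = response_body["meta"]["billed_units"]
        _record_usage_to_span(
            span, int(billed["input_tokens"]), int(billed["output_tokens"])
        )
    except (KeyError, TypeError):
        pass
```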
…c, and meta models, add unit test for ai21 model instrumentation
…thub.com/aannirajpatel/openllmetry into fix-bedrock-instrumentation-token-counts
Force-pushed from aabd1e4 to a3cddea.
Great work @aannirajpatel, thank you so much for this!
TL;DR: report actual token counts, when available, when instrumenting Bedrock Anthropic and AI21 models. Add a test to cover instrumentation for Meta's Llama models on Bedrock.
`feat(instrumentation): ...` or `fix(instrumentation): ...`