
fix(bedrock): utilize invocation metrics from response body for AI21, Anthropic, Meta models when available to record usage on spans #1286

Merged

Conversation

@aannirajpatel (Contributor) commented Jun 9, 2024

TL;DR: report actual token usage from the response when available while instrumenting Bedrock Anthropic and AI21 models, and add a test covering instrumentation of Meta's Llama models on Bedrock.

  • ✅ I have added tests that cover my changes.
  • ✅ If adding a new instrumentation or changing an existing one, I've added screenshots from some observability platform showing the change.

[screenshot: observability platform showing token usage recorded on Bedrock spans]

  • ✅ PR name follows conventional commits format: feat(instrumentation): ... or fix(instrumentation): ....
  • ✅ (If applicable) I have updated the documentation accordingly - not applicable for this fix PR.

…opic, Meta models when available to record usage on spans
@CLAassistant commented Jun 9, 2024

CLA assistant check
All committers have signed the CLA.

@aannirajpatel aannirajpatel changed the title fix(bedrock): utilize invocation metrics from response body for Antropic, Meta models when available to record usage on spans fix(bedrock): utilize invocation metrics from response body for Anthropic, Meta models when available to record usage on spans Jun 9, 2024
@nirga (Member) left a comment

Great work @aannirajpatel, thanks so much! I've been meaning to do that for a while. Left a small comment, and there's a small lint issue to fix 🙏

@@ -216,8 +216,18 @@ def _set_anthropic_completion_span_attributes(span, request_body, response_body)
    )

    if Config.enrich_token_usage:
        prompt_tokens = _count_anthropic_tokens([request_body.get("prompt")])
@nirga (Member) commented:

The reason we put this under if Config.enrich_token_usage is that _count_anthropic_tokens is expensive to run, and we want to give users an option to disable it. If you're getting the data from the response, there's no need to put it under this if.
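The pattern the reviewer is asking for can be sketched as follows. This is a hypothetical illustration, not the PR's actual code: the response-body key names and the Config/count_tokens helpers are stand-ins for the real Bedrock fields and the instrumentation's internal helpers. The point is that metrics already present in the response are free to read and need no gate, while client-side counting stays behind the opt-in flag.

```python
class Config:
    enrich_token_usage = False  # expensive client-side counting is opt-in


def count_tokens(text):
    # Stand-in for an expensive tokenizer call such as _count_anthropic_tokens.
    return len(text.split())


def extract_prompt_tokens(request_body, response_body):
    # Metrics reported in the response body cost nothing to read,
    # so they are used unconditionally (key names are illustrative).
    metrics = response_body.get("amazon-bedrock-invocationMetrics", {})
    if "inputTokenCount" in metrics:
        return metrics["inputTokenCount"]
    # Only fall back to computing tokens ourselves when the user opted in.
    if Config.enrich_token_usage:
        return count_tokens(request_body.get("prompt", ""))
    return None
```

With the flag off, spans still get accurate usage whenever the model reports it; the flag only controls the expensive fallback.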

@@ -361,6 +380,17 @@ def _set_llama_span_attributes(span, request_body, response_body):
        span, SpanAttributes.LLM_REQUEST_MAX_TOKENS, request_body.get("max_gen_len")
    )

    if Config.enrich_token_usage and response_body.get("prompt_token_count") is not None and response_body.get("generation_token_count") is not None:
@nirga (Member) commented:

Same here: no need for the Config.enrich_token_usage check.
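For the Llama path, the simplified check the reviewer suggests might look like this hypothetical sketch (prompt_token_count and generation_token_count are the response fields from the diff above; the helper name is illustrative):

```python
def llama_token_counts(response_body):
    # Both counts come straight from the response body, so no opt-in
    # flag is needed; record usage only when both values are present.
    prompt = response_body.get("prompt_token_count")
    completion = response_body.get("generation_token_count")
    if prompt is not None and completion is not None:
        return prompt, completion
    return None
```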

@aannirajpatel (Contributor, Author) commented:

Thanks for the pointers, Nir! I've addressed the lint issue and consolidated the usage-attribute logic into a single function shared by the ai21, anthropic, and meta paths. I also implemented similar logic for Cohere based on their API documentation for Command R and related models; however, in a local test the model did not return token counts as the documentation suggested it would, so I've wrapped that logic in a try/except but kept it (an O(1) hit at worst, so there's little to lose).
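The consolidation described above can be sketched roughly as follows. This is a hypothetical illustration rather than the merged code: record_usage and cohere_usage are made-up names, the span is modeled as a plain dict, and the attribute keys and Cohere response fields are assumptions drawn from public conventions and Cohere's documented response shape.

```python
def record_usage(span_attrs, prompt_tokens, completion_tokens):
    """Shared helper: record usage attributes when both counts are known."""
    if prompt_tokens is None or completion_tokens is None:
        return
    span_attrs["gen_ai.usage.prompt_tokens"] = prompt_tokens
    span_attrs["gen_ai.usage.completion_tokens"] = completion_tokens
    span_attrs["gen_ai.usage.total_tokens"] = prompt_tokens + completion_tokens


def cohere_usage(response_body):
    # The documented fields were not observed in a live test, so guard the
    # lookup: the worst case is a cheap failed access, not a crashed span.
    try:
        units = response_body["meta"]["billed_units"]
        return units["input_tokens"], units["output_tokens"]
    except (KeyError, TypeError):
        return None, None
```

Keeping the guarded Cohere lookup costs almost nothing and starts working automatically if the provider begins returning the documented counts.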

…c, and meta models, add unit test for ai21 model instrumentation
@aannirajpatel aannirajpatel changed the title fix(bedrock): utilize invocation metrics from response body for Anthropic, Meta models when available to record usage on spans fix(bedrock): utilize invocation metrics from response body for AI21, Anthropic, Meta models when available to record usage on spans Jun 11, 2024
@aannirajpatel aannirajpatel force-pushed the fix-bedrock-instrumentation-token-counts branch from aabd1e4 to a3cddea Compare June 11, 2024 01:05
@nirga (Member) left a comment

Great work @aannirajpatel, thank you so much for this!

@nirga nirga merged commit b0d948f into traceloop:main Jun 11, 2024
8 checks passed
3 participants