-
Notifications
You must be signed in to change notification settings - Fork 297
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[OPIK-415] Compute traces cost based on token usage #703
[OPIK-415] Compute traces cost based on token usage #703
Conversation
00b21ea
to
dd0b37f
Compare
@@ -581,7 +581,7 @@ AND id in ( | |||
"""; | |||
|
|||
private static final String ESTIMATED_COST_VERSION = "1.0"; | |||
private static final BigDecimal ZERO_COST = new BigDecimal("0.00000000"); | |||
public static final BigDecimal ZERO_COST = new BigDecimal("0.00000000"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public static final BigDecimal ZERO_COST = new BigDecimal("0.00000000"); | |
public static final BigDecimal ZERO_COST = new BigDecimal.ZERO; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
@@ -747,6 +753,9 @@ private Publisher<Trace> mapToDto(Result result) { | |||
.filter(it -> !it.isEmpty()) | |||
.orElse(null)) | |||
.usage(row.get("usage", Map.class)) | |||
.totalEstimatedCost(row.get("total_estimated_cost", BigDecimal.class).equals(ZERO_COST) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably should use compareTo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, now it's a must, since we changed to BigDecimal.ZERO
.
equals
works only for the same precision comparison and is not robust. Good catch.
d8a8ec8
to
2be4465
Compare
@@ -581,7 +581,7 @@ AND id in ( | |||
"""; | |||
|
|||
private static final String ESTIMATED_COST_VERSION = "1.0"; | |||
private static final BigDecimal ZERO_COST = new BigDecimal("0.00000000"); | |||
public static final BigDecimal ZERO_COST = BigDecimal.ZERO; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need for this constant just use the one from BigDecimal
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
.toBuilder() | ||
.id(null) | ||
.projectName(projectName) | ||
.usage(Map.of("completion_tokens", 200 * 5L, "prompt_tokens", 300 * 5L, "total_tokens", 4 * 5L)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we generate random values to make the test more robust?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No we can't. It should be related to spans to assert and also usage keys should be specific for cost calculation and can't be random.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean something like this:
Spans
.usage(Map.of("completion_tokens", factory. manufacturePojo(Integer.class), "prompt_tokens", factory. manufacturePojo(Integer.class), "total_tokens", factory. manufacturePojo(Integer.class)))
Then, in the traces, we just group by usage name and calculate the avg expected
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated per your request. But could you please explain how hardcoded usage values might introduce flakiness?
2be4465
to
9215e66
Compare
9215e66
to
6a2f37b
Compare
6a2f37b
to
2ea154e
Compare
assertThat(traceExpectedCost.compareTo(BigDecimal.ZERO) == 0 ? | ||
createdTrace.totalEstimatedCost() == null : | ||
traceExpectedCost.compareTo(createdTrace.totalEstimatedCost()) == 0) | ||
.isEqualTo(true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assertThat(traceExpectedCost.compareTo(BigDecimal.ZERO) == 0 ? | |
createdTrace.totalEstimatedCost() == null : | |
traceExpectedCost.compareTo(createdTrace.totalEstimatedCost()) == 0) | |
.isEqualTo(true); | |
var actual = createdTrace.totalEstimatedCost(); | |
traceExpectedCost = traceExpectedCost.compareTo(BigDecimal.ZERO) == 0 ? null : traceExpectedCost; | |
assertThat(actual) | |
.usingRecursiveComparison(RecursiveComparisonConfiguration.builder() | |
.withComparatorForType(BigDecimal::compareTo, BigDecimal.class) | |
.build()) | |
.isEqualTo(traceExpectedCost); |
Details
Computes trace cost based on token usage
Resolves #OPIK-415
Testing
Added integration tests
Documentation
https://www.notion.so/cometml/Opik-Span-Cost-tracking-13c7124010a380a7ad36ced128a26dc7