@gary-huang
Contributor

What Does This Do

Motivation

Additional Notes

Contributor Checklist

Jira ticket: [PROJ-IDENT]

@github-actions
Contributor

Hi! 👋 Looks like you updated a Git Submodule.
If this was not intentional please make sure to:

@gary-huang gary-huang changed the base branch from master to gary/use-ctx-api April 10, 2025 04:21
@gary-huang gary-huang force-pushed the gary/submit-evals-2 branch 2 times, most recently from d7d8e6a to 53386c1 Compare April 10, 2025 04:26
@pr-commenter

pr-commenter bot commented Apr 10, 2025

Benchmarks

Startup

Parameters

| Parameter | Baseline | Candidate |
|---|---|---|
| baseline_or_candidate | baseline | candidate |
| git_branch | master | gary/submit-evals-2 |
| git_commit_date | 1751991953 | 1751992926 |
| git_commit_sha | 860a603 | a8ef30d |
| release_version | 1.51.0-SNAPSHOT~860a603678 | 1.51.0-SNAPSHOT~a8ef30deee |

See matching parameters

| Parameter | Baseline | Candidate |
|---|---|---|
| application | insecure-bank | insecure-bank |
| ci_job_date | 1751994744 | 1751994744 |
| ci_job_id | 1018911488 | 1018911488 |
| ci_pipeline_id | 69941737 | 69941737 |
| cpu_model | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz |
| kernel_version | Linux runner-zfyrx7zua-project-304-concurrent-2-6rdwpwku 6.8.0-1030-aws #32~22.04.1-Ubuntu SMP Thu Jun 5 08:38:24 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux | Linux runner-zfyrx7zua-project-304-concurrent-2-6rdwpwku 6.8.0-1030-aws #32~22.04.1-Ubuntu SMP Thu Jun 5 08:38:24 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |
| module | Agent | Agent |
| parent | None | None |

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 47 metrics; 6 metrics are unstable.

Startup time reports for insecure-bank
gantt
    title insecure-bank - global startup overhead: candidate=1.51.0-SNAPSHOT~a8ef30deee, baseline=1.51.0-SNAPSHOT~860a603678

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (995.804 ms) : 0, 995804
Total [baseline] (8.599 s) : 0, 8598983
Agent [candidate] (995.505 ms) : 0, 995505
Total [candidate] (8.575 s) : 0, 8575123
section iast
Agent [baseline] (1.135 s) : 0, 1135314
Total [baseline] (9.33 s) : 0, 9329655
Agent [candidate] (1.132 s) : 0, 1132229
Total [candidate] (9.287 s) : 0, 9287133
  • baseline results
| Module | Variant | Duration | Δ tracing |
|---|---|---|---|
| Agent | tracing | 995.804 ms | - |
| Agent | iast | 1.135 s | 139.51 ms (14.0%) |
| Total | tracing | 8.599 s | - |
| Total | iast | 9.33 s | 730.673 ms (8.5%) |
  • candidate results
| Module | Variant | Duration | Δ tracing |
|---|---|---|---|
| Agent | tracing | 995.505 ms | - |
| Agent | iast | 1.132 s | 136.725 ms (13.7%) |
| Total | tracing | 8.575 s | - |
| Total | iast | 9.287 s | 712.01 ms (8.3%) |
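The Δ tracing column appears to be the variant's duration minus the tracing duration for the same module, with the percentage taken relative to tracing. A minimal Python sketch under that assumption, using the baseline Agent values from the chart source above (in microseconds); the function name `overhead_vs_reference` is hypothetical and not part of the benchmark tooling:

```python
def overhead_vs_reference(variant_us, reference_us):
    """Return (delta in ms, delta as a percentage of the reference)."""
    delta_us = variant_us - reference_us
    return delta_us / 1000.0, 100.0 * delta_us / reference_us


# Baseline Agent startup: iast = 1135314 us, tracing = 995804 us.
delta_ms, pct = overhead_vs_reference(1_135_314, 995_804)
print(f"{delta_ms:.2f} ms ({pct:.1f}%)")  # prints "139.51 ms (14.0%)"
```

The other rows can differ from this calculation in the last digit, presumably because the report rounds from sub-microsecond measurements.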
gantt
    title insecure-bank - break down per module: candidate=1.51.0-SNAPSHOT~a8ef30deee, baseline=1.51.0-SNAPSHOT~860a603678

    dateFormat X
    axisFormat %s
section tracing
BytebuddyAgent [baseline] (687.09 ms) : 0, 687090
BytebuddyAgent [candidate] (687.494 ms) : 0, 687494
GlobalTracer [baseline] (242.328 ms) : 0, 242328
GlobalTracer [candidate] (242.103 ms) : 0, 242103
AppSec [baseline] (30.506 ms) : 0, 30506
AppSec [candidate] (30.115 ms) : 0, 30115
Debugger [baseline] (6.084 ms) : 0, 6084
Debugger [candidate] (6.056 ms) : 0, 6056
Remote Config [baseline] (674.963 µs) : 0, 675
Remote Config [candidate] (684.763 µs) : 0, 685
Telemetry [baseline] (8.275 ms) : 0, 8275
Telemetry [candidate] (8.24 ms) : 0, 8240
section iast
BytebuddyAgent [baseline] (809.561 ms) : 0, 809561
BytebuddyAgent [candidate] (807.593 ms) : 0, 807593
GlobalTracer [baseline] (232.674 ms) : 0, 232674
GlobalTracer [candidate] (231.726 ms) : 0, 231726
IAST [baseline] (29.642 ms) : 0, 29642
IAST [candidate] (24.667 ms) : 0, 24667
AppSec [baseline] (28.102 ms) : 0, 28102
AppSec [candidate] (32.985 ms) : 0, 32985
Debugger [baseline] (5.913 ms) : 0, 5913
Debugger [candidate] (5.826 ms) : 0, 5826
Remote Config [baseline] (585.206 µs) : 0, 585
Remote Config [candidate] (582.477 µs) : 0, 582
Telemetry [baseline] (8.049 ms) : 0, 8049
Telemetry [candidate] (7.988 ms) : 0, 7988
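Each task line in the chart source above has the shape `label [side] (duration) : start, end`, where the end value looks like the duration in microseconds (for example 687090 for 687.09 ms). A rough Python sketch for pulling per-module differences out of that text, assuming the layout holds for every line; `parse_gantt` and `candidate_minus_baseline` are hypothetical helpers, not part of the benchmark tooling:

```python
import re

# Matches lines such as: "Remote Config [baseline] (674.963 µs) : 0, 675"
TASK = re.compile(
    r"^(?P<module>.+?) \[(?P<side>baseline|candidate)\] "
    r"\((?P<shown>[^)]+)\) : (?P<start>\d+), (?P<end>\d+)$"
)


def parse_gantt(lines):
    """Return (section, module, side, microseconds) tuples for task lines."""
    section = None
    rows = []
    for raw in lines:
        line = raw.strip()
        if line.startswith("section "):
            section = line[len("section "):]
            continue
        match = TASK.match(line)
        if match:  # title, dateFormat and axisFormat lines are skipped
            micros = int(match["end"]) - int(match["start"])
            rows.append((section, match["module"], match["side"], micros))
    return rows


def candidate_minus_baseline(rows):
    """Map (section, module) to candidate minus baseline, in microseconds."""
    by_key = {}
    for section, module, side, micros in rows:
        by_key.setdefault((section, module), {})[side] = micros
    return {
        key: sides["candidate"] - sides["baseline"]
        for key, sides in by_key.items()
        if "baseline" in sides and "candidate" in sides
    }


SAMPLE = """\
section iast
IAST [baseline] (29.642 ms) : 0, 29642
IAST [candidate] (24.667 ms) : 0, 24667
AppSec [baseline] (28.102 ms) : 0, 28102
AppSec [candidate] (32.985 ms) : 0, 32985
""".splitlines()

print(candidate_minus_baseline(parse_gantt(SAMPLE)))
```

For the two sample modules this gives -4975 µs for IAST and +4883 µs for AppSec. These are raw differences between single reported values; the summary above reports no significant change for the startup benchmarks.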
Startup time reports for petclinic
gantt
    title petclinic - global startup overhead: candidate=1.51.0-SNAPSHOT~a8ef30deee, baseline=1.51.0-SNAPSHOT~860a603678

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (996.676 ms) : 0, 996676
Total [baseline] (10.721 s) : 0, 10720882
Agent [candidate] (995.911 ms) : 0, 995911
Total [candidate] (10.622 s) : 0, 10621746
section appsec
Agent [baseline] (1.178 s) : 0, 1177512
Total [baseline] (10.755 s) : 0, 10755346
Agent [candidate] (1.178 s) : 0, 1177850
Total [candidate] (10.799 s) : 0, 10799023
section iast
Agent [baseline] (1.132 s) : 0, 1132131
Total [baseline] (10.875 s) : 0, 10875425
Agent [candidate] (1.134 s) : 0, 1133900
Total [candidate] (10.829 s) : 0, 10829108
section profiling
Agent [baseline] (1.246 s) : 0, 1245905
Total [baseline] (10.921 s) : 0, 10921370
Agent [candidate] (1.253 s) : 0, 1253041
Total [candidate] (11.04 s) : 0, 11039709
  • baseline results
| Module | Variant | Duration | Δ tracing |
|---|---|---|---|
| Agent | tracing | 996.676 ms | - |
| Agent | appsec | 1.178 s | 180.835 ms (18.1%) |
| Agent | iast | 1.132 s | 135.455 ms (13.6%) |
| Agent | profiling | 1.246 s | 249.228 ms (25.0%) |
| Total | tracing | 10.721 s | - |
| Total | appsec | 10.755 s | 34.464 ms (0.3%) |
| Total | iast | 10.875 s | 154.543 ms (1.4%) |
| Total | profiling | 10.921 s | 200.489 ms (1.9%) |
  • candidate results
| Module | Variant | Duration | Δ tracing |
|---|---|---|---|
| Agent | tracing | 995.911 ms | - |
| Agent | appsec | 1.178 s | 181.939 ms (18.3%) |
| Agent | iast | 1.134 s | 137.989 ms (13.9%) |
| Agent | profiling | 1.253 s | 257.13 ms (25.8%) |
| Total | tracing | 10.622 s | - |
| Total | appsec | 10.799 s | 177.277 ms (1.7%) |
| Total | iast | 10.829 s | 207.362 ms (2.0%) |
| Total | profiling | 11.04 s | 417.963 ms (3.9%) |
gantt
    title petclinic - break down per module: candidate=1.51.0-SNAPSHOT~a8ef30deee, baseline=1.51.0-SNAPSHOT~860a603678

    dateFormat X
    axisFormat %s
section tracing
BytebuddyAgent [baseline] (688.53 ms) : 0, 688530
BytebuddyAgent [candidate] (687.669 ms) : 0, 687669
GlobalTracer [baseline] (242.07 ms) : 0, 242070
GlobalTracer [candidate] (242.302 ms) : 0, 242302
AppSec [baseline] (30.206 ms) : 0, 30206
AppSec [candidate] (30.212 ms) : 0, 30212
Debugger [baseline] (6.114 ms) : 0, 6114
Debugger [candidate] (6.074 ms) : 0, 6074
Remote Config [baseline] (671.864 µs) : 0, 672
Remote Config [candidate] (682.772 µs) : 0, 683
Telemetry [baseline] (8.234 ms) : 0, 8234
Telemetry [candidate] (8.189 ms) : 0, 8189
section appsec
BytebuddyAgent [baseline] (711.779 ms) : 0, 711779
BytebuddyAgent [candidate] (711.799 ms) : 0, 711799
GlobalTracer [baseline] (235.588 ms) : 0, 235588
GlobalTracer [candidate] (235.809 ms) : 0, 235809
AppSec [baseline] (171.47 ms) : 0, 171470
AppSec [candidate] (171.733 ms) : 0, 171733
Debugger [baseline] (5.762 ms) : 0, 5762
Debugger [candidate] (5.765 ms) : 0, 5765
Remote Config [baseline] (586.404 µs) : 0, 586
Remote Config [candidate] (602.991 µs) : 0, 603
Telemetry [baseline] (8.056 ms) : 0, 8056
Telemetry [candidate] (8.046 ms) : 0, 8046
IAST [baseline] (23.414 ms) : 0, 23414
IAST [candidate] (23.21 ms) : 0, 23210
section iast
BytebuddyAgent [baseline] (807.359 ms) : 0, 807359
BytebuddyAgent [candidate] (808.083 ms) : 0, 808083
GlobalTracer [baseline] (232.082 ms) : 0, 232082
GlobalTracer [candidate] (232.59 ms) : 0, 232590
AppSec [baseline] (29.697 ms) : 0, 29697
AppSec [candidate] (31.631 ms) : 0, 31631
Debugger [baseline] (5.822 ms) : 0, 5822
Debugger [candidate] (5.824 ms) : 0, 5824
Remote Config [baseline] (570.956 µs) : 0, 571
Remote Config [candidate] (575.431 µs) : 0, 575
Telemetry [baseline] (7.94 ms) : 0, 7940
Telemetry [candidate] (7.966 ms) : 0, 7966
IAST [baseline] (27.807 ms) : 0, 27807
IAST [candidate] (26.347 ms) : 0, 26347
section profiling
ProfilingAgent [baseline] (104.474 ms) : 0, 104474
ProfilingAgent [candidate] (103.989 ms) : 0, 103989
BytebuddyAgent [baseline] (677.959 ms) : 0, 677959
BytebuddyAgent [candidate] (683.174 ms) : 0, 683174
GlobalTracer [baseline] (361.861 ms) : 0, 361861
GlobalTracer [candidate] (363.751 ms) : 0, 363751
AppSec [baseline] (32.257 ms) : 0, 32257
AppSec [candidate] (30.998 ms) : 0, 30998
Debugger [baseline] (12.231 ms) : 0, 12231
Debugger [candidate] (13.546 ms) : 0, 13546
Remote Config [baseline] (660.292 µs) : 0, 660
Remote Config [candidate] (661.079 µs) : 0, 661
Telemetry [baseline] (7.974 ms) : 0, 7974
Telemetry [candidate] (8.057 ms) : 0, 8057
Profiling [baseline] (104.499 ms) : 0, 104499
Profiling [candidate] (104.013 ms) : 0, 104013

Load

Parameters

| Parameter | Baseline | Candidate |
|---|---|---|
| baseline_or_candidate | baseline | candidate |
| git_branch | master | gary/submit-evals-2 |
| git_commit_date | 1751991953 | 1751992926 |
| git_commit_sha | 860a603 | a8ef30d |
| release_version | 1.51.0-SNAPSHOT~860a603678 | 1.51.0-SNAPSHOT~a8ef30deee |

See matching parameters

| Parameter | Baseline | Candidate |
|---|---|---|
| application | insecure-bank | insecure-bank |
| ci_job_date | 1751994473 | 1751994473 |
| ci_job_id | 1018911489 | 1018911489 |
| ci_pipeline_id | 69941737 | 69941737 |
| cpu_model | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz |
| kernel_version | Linux runner-zfyrx7zua-project-304-concurrent-1-5n752ufk 6.8.0-1030-aws #32~22.04.1-Ubuntu SMP Thu Jun 5 08:38:24 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux | Linux runner-zfyrx7zua-project-304-concurrent-1-5n752ufk 6.8.0-1030-aws #32~22.04.1-Ubuntu SMP Thu Jun 5 08:38:24 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |

Summary

Found 1 performance improvement and 2 performance regressions! Performance is the same for 9 metrics; 12 metrics are unstable.

| scenario | Δ mean http_req_duration | Δ mean throughput | candidate mean http_req_duration | candidate mean throughput | baseline mean http_req_duration | baseline mean throughput |
|---|---|---|---|---|---|---|
| scenario:load:insecure-bank:profiling:high_load | worse: [+202.817µs; +502.416µs] or [+2.390%; +5.920%] | unstable: [-89.343op/s; +45.843op/s] or [-16.349%; +8.389%] | 8.840ms | 524.719op/s | 8.487ms | 546.469op/s |
| scenario:load:insecure-bank:iast_GLOBAL:high_load | worse: [+403.583µs; +822.873µs] or [+3.954%; +8.062%] | unstable: [-76.042op/s; +24.854op/s] or [-16.703%; +5.459%] | 10.820ms | 429.656op/s | 10.207ms | 455.250op/s |
| scenario:load:petclinic:profiling:high_load | better: [-4.015ms; -3.006ms] or [-7.960%; -5.960%] | unstable: [-0.219op/s; +13.994op/s] or [-0.235%; +15.079%] | 46.931ms | 99.688op/s | 50.442ms | 92.800op/s |
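Each Δ cell in the summary pairs an absolute interval with the same interval expressed as a percentage of the baseline mean, and the verdict follows the sign of the interval. A small Python sketch of that reading; the rule in `classify` is an assumption inferred from the three rows shown, not the bot's documented logic, and both function names are hypothetical. How the bot separates "unstable" from "same" is not visible here, so an interval that straddles zero is simply labelled "inconclusive":

```python
def relative_interval(low, high, baseline_mean):
    """Express an absolute delta interval as percentages of the baseline mean."""
    return 100.0 * low / baseline_mean, 100.0 * high / baseline_mean


def classify(low, high, lower_is_better=True):
    """Label a delta interval by its sign (assumed rule, see lead-in)."""
    if low > 0 and high > 0:
        return "worse" if lower_is_better else "better"
    if low < 0 and high < 0:
        return "better" if lower_is_better else "worse"
    return "inconclusive"  # the interval contains zero


# insecure-bank profiling: duration delta in microseconds, baseline mean 8.487 ms.
low_pct, high_pct = relative_interval(202.817, 502.416, 8487.0)
print(f"[{low_pct:+.3f}%; {high_pct:+.3f}%]", classify(202.817, 502.416))

# Throughput for the same scenario: higher is better, interval straddles zero.
print(classify(-89.343, 45.843, lower_is_better=False))
```

For the first row this reproduces the reported [+2.390%; +5.920%]. Other rows can differ in the last digit because the inputs printed in the report are already rounded.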
Request duration reports for insecure-bank
gantt
    title insecure-bank - request duration [CI 0.99] : candidate=1.51.0-SNAPSHOT~a8ef30deee, baseline=1.51.0-SNAPSHOT~860a603678
    dateFormat X
    axisFormat %s
section baseline
no_agent (4.427 ms) : 4377, 4477
.   : milestone, 4427,
iast (8.906 ms) : 8754, 9057
.   : milestone, 8906,
iast_FULL (13.867 ms) : 13591, 14144
.   : milestone, 13867,
iast_GLOBAL (10.207 ms) : 10007, 10407
.   : milestone, 10207,
profiling (8.487 ms) : 8350, 8624
.   : milestone, 8487,
tracing (7.609 ms) : 7494, 7724
.   : milestone, 7609,
section candidate
no_agent (4.376 ms) : 4318, 4433
.   : milestone, 4376,
iast (8.867 ms) : 8726, 9009
.   : milestone, 8867,
iast_FULL (13.763 ms) : 13492, 14034
.   : milestone, 13763,
iast_GLOBAL (10.82 ms) : 10630, 11010
.   : milestone, 10820,
profiling (8.84 ms) : 8698, 8981
.   : milestone, 8840,
tracing (7.416 ms) : 7312, 7520
.   : milestone, 7416,
  • baseline results
| Variant | Request duration [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 4.427 ms [4.377 ms, 4.477 ms] | - |
| iast | 8.906 ms [8.754 ms, 9.057 ms] | 4.478 ms (101.2%) |
| iast_FULL | 13.867 ms [13.591 ms, 14.144 ms] | 9.44 ms (213.2%) |
| iast_GLOBAL | 10.207 ms [10.007 ms, 10.407 ms] | 5.78 ms (130.5%) |
| profiling | 8.487 ms [8.35 ms, 8.624 ms] | 4.06 ms (91.7%) |
| tracing | 7.609 ms [7.494 ms, 7.724 ms] | 3.182 ms (71.9%) |
  • candidate results
| Variant | Request duration [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 4.376 ms [4.318 ms, 4.433 ms] | - |
| iast | 8.867 ms [8.726 ms, 9.009 ms] | 4.492 ms (102.7%) |
| iast_FULL | 13.763 ms [13.492 ms, 14.034 ms] | 9.387 ms (214.5%) |
| iast_GLOBAL | 10.82 ms [10.63 ms, 11.01 ms] | 6.445 ms (147.3%) |
| profiling | 8.84 ms [8.698 ms, 8.981 ms] | 4.464 ms (102.0%) |
| tracing | 7.416 ms [7.312 ms, 7.52 ms] | 3.04 ms (69.5%) |
Request duration reports for petclinic
gantt
    title petclinic - request duration [CI 0.99] : candidate=1.51.0-SNAPSHOT~a8ef30deee, baseline=1.51.0-SNAPSHOT~860a603678
    dateFormat X
    axisFormat %s
section baseline
no_agent (37.319 ms) : 37021, 37617
.   : milestone, 37319,
appsec (49.515 ms) : 49071, 49958
.   : milestone, 49515,
code_origins (44.428 ms) : 44064, 44793
.   : milestone, 44428,
iast (45.498 ms) : 45097, 45899
.   : milestone, 45498,
profiling (50.442 ms) : 49942, 50942
.   : milestone, 50442,
tracing (44.9 ms) : 44515, 45285
.   : milestone, 44900,
section candidate
no_agent (38.248 ms) : 37936, 38561
.   : milestone, 38248,
appsec (49.418 ms) : 48982, 49854
.   : milestone, 49418,
code_origins (44.793 ms) : 44412, 45174
.   : milestone, 44793,
iast (45.201 ms) : 44794, 45609
.   : milestone, 45201,
profiling (46.931 ms) : 46496, 47366
.   : milestone, 46931,
tracing (44.819 ms) : 44443, 45194
.   : milestone, 44819,
  • baseline results
| Variant | Request duration [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 37.319 ms [37.021 ms, 37.617 ms] | - |
| appsec | 49.515 ms [49.071 ms, 49.958 ms] | 12.196 ms (32.7%) |
| code_origins | 44.428 ms [44.064 ms, 44.793 ms] | 7.109 ms (19.0%) |
| iast | 45.498 ms [45.097 ms, 45.899 ms] | 8.179 ms (21.9%) |
| profiling | 50.442 ms [49.942 ms, 50.942 ms] | 13.123 ms (35.2%) |
| tracing | 44.9 ms [44.515 ms, 45.285 ms] | 7.581 ms (20.3%) |
  • candidate results
| Variant | Request duration [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 38.248 ms [37.936 ms, 38.561 ms] | - |
| appsec | 49.418 ms [48.982 ms, 49.854 ms] | 11.169 ms (29.2%) |
| code_origins | 44.793 ms [44.412 ms, 45.174 ms] | 6.544 ms (17.1%) |
| iast | 45.201 ms [44.794 ms, 45.609 ms] | 6.953 ms (18.2%) |
| profiling | 46.931 ms [46.496 ms, 47.366 ms] | 8.683 ms (22.7%) |
| tracing | 44.819 ms [44.443 ms, 45.194 ms] | 6.57 ms (17.2%) |

Dacapo

Parameters

| Parameter | Baseline | Candidate |
|---|---|---|
| baseline_or_candidate | baseline | candidate |
| git_branch | master | gary/submit-evals-2 |
| git_commit_date | 1751991953 | 1751992926 |
| git_commit_sha | 860a603 | a8ef30d |
| release_version | 1.51.0-SNAPSHOT~860a603678 | 1.51.0-SNAPSHOT~a8ef30deee |

See matching parameters

| Parameter | Baseline | Candidate |
|---|---|---|
| application | biojava | biojava |
| ci_job_date | 1751994909 | 1751994909 |
| ci_job_id | 1018911490 | 1018911490 |
| ci_pipeline_id | 69941737 | 69941737 |
| cpu_model | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz |
| kernel_version | Linux runner-zfyrx7zua-project-304-concurrent-3-aomqyz7n 6.8.0-1030-aws #32~22.04.1-Ubuntu SMP Thu Jun 5 08:38:24 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux | Linux runner-zfyrx7zua-project-304-concurrent-3-aomqyz7n 6.8.0-1030-aws #32~22.04.1-Ubuntu SMP Thu Jun 5 08:38:24 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 12 metrics; 0 metrics are unstable.

Execution time for biojava
gantt
    title biojava - execution time [CI 0.99] : candidate=1.51.0-SNAPSHOT~a8ef30deee, baseline=1.51.0-SNAPSHOT~860a603678
    dateFormat X
    axisFormat %s
section baseline
no_agent (14.915 s) : 14915000, 14915000
.   : milestone, 14915000,
appsec (14.774 s) : 14774000, 14774000
.   : milestone, 14774000,
iast (18.698 s) : 18698000, 18698000
.   : milestone, 18698000,
iast_GLOBAL (17.829 s) : 17829000, 17829000
.   : milestone, 17829000,
profiling (15.702 s) : 15702000, 15702000
.   : milestone, 15702000,
tracing (14.612 s) : 14612000, 14612000
.   : milestone, 14612000,
section candidate
no_agent (15.374 s) : 15374000, 15374000
.   : milestone, 15374000,
appsec (14.682 s) : 14682000, 14682000
.   : milestone, 14682000,
iast (18.242 s) : 18242000, 18242000
.   : milestone, 18242000,
iast_GLOBAL (17.726 s) : 17726000, 17726000
.   : milestone, 17726000,
profiling (15.129 s) : 15129000, 15129000
.   : milestone, 15129000,
tracing (14.964 s) : 14964000, 14964000
.   : milestone, 14964000,
  • baseline results
| Variant | Execution Time [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 14.915 s [14.915 s, 14.915 s] | - |
| appsec | 14.774 s [14.774 s, 14.774 s] | -141.0 ms (-0.9%) |
| iast | 18.698 s [18.698 s, 18.698 s] | 3.783 s (25.4%) |
| iast_GLOBAL | 17.829 s [17.829 s, 17.829 s] | 2.914 s (19.5%) |
| profiling | 15.702 s [15.702 s, 15.702 s] | 787.0 ms (5.3%) |
| tracing | 14.612 s [14.612 s, 14.612 s] | -303.0 ms (-2.0%) |
  • candidate results
| Variant | Execution Time [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 15.374 s [15.374 s, 15.374 s] | - |
| appsec | 14.682 s [14.682 s, 14.682 s] | -692.0 ms (-4.5%) |
| iast | 18.242 s [18.242 s, 18.242 s] | 2.868 s (18.7%) |
| iast_GLOBAL | 17.726 s [17.726 s, 17.726 s] | 2.352 s (15.3%) |
| profiling | 15.129 s [15.129 s, 15.129 s] | -245.0 ms (-1.6%) |
| tracing | 14.964 s [14.964 s, 14.964 s] | -410.0 ms (-2.7%) |
Execution time for tomcat
gantt
    title tomcat - execution time [CI 0.99] : candidate=1.51.0-SNAPSHOT~a8ef30deee, baseline=1.51.0-SNAPSHOT~860a603678
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.475 ms) : 1464, 1487
.   : milestone, 1475,
appsec (2.41 ms) : 2360, 2460
.   : milestone, 2410,
iast (2.197 ms) : 2134, 2259
.   : milestone, 2197,
iast_GLOBAL (2.231 ms) : 2168, 2294
.   : milestone, 2231,
profiling (2.032 ms) : 1982, 2082
.   : milestone, 2032,
tracing (2.01 ms) : 1961, 2059
.   : milestone, 2010,
section candidate
no_agent (1.475 ms) : 1464, 1487
.   : milestone, 1475,
appsec (2.398 ms) : 2348, 2447
.   : milestone, 2398,
iast (2.178 ms) : 2116, 2241
.   : milestone, 2178,
iast_GLOBAL (2.236 ms) : 2173, 2299
.   : milestone, 2236,
profiling (2.036 ms) : 1986, 2086
.   : milestone, 2036,
tracing (2.007 ms) : 1959, 2055
.   : milestone, 2007,
  • baseline results
| Variant | Execution Time [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 1.475 ms [1.464 ms, 1.487 ms] | - |
| appsec | 2.41 ms [2.36 ms, 2.46 ms] | 934.409 µs (63.3%) |
| iast | 2.197 ms [2.134 ms, 2.259 ms] | 721.287 µs (48.9%) |
| iast_GLOBAL | 2.231 ms [2.168 ms, 2.294 ms] | 756.002 µs (51.2%) |
| profiling | 2.032 ms [1.982 ms, 2.082 ms] | 556.565 µs (37.7%) |
| tracing | 2.01 ms [1.961 ms, 2.059 ms] | 534.327 µs (36.2%) |
  • candidate results
| Variant | Execution Time [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 1.475 ms [1.464 ms, 1.487 ms] | - |
| appsec | 2.398 ms [2.348 ms, 2.447 ms] | 922.34 µs (62.5%) |
| iast | 2.178 ms [2.116 ms, 2.241 ms] | 702.996 µs (47.7%) |
| iast_GLOBAL | 2.236 ms [2.173 ms, 2.299 ms] | 760.863 µs (51.6%) |
| profiling | 2.036 ms [1.986 ms, 2.086 ms] | 560.439 µs (38.0%) |
| tracing | 2.007 ms [1.959 ms, 2.055 ms] | 531.823 µs (36.1%) |

@gary-huang gary-huang force-pushed the gary/submit-evals-2 branch from 53386c1 to 573416d Compare April 10, 2025 18:50
@gary-huang gary-huang force-pushed the gary/submit-evals-2 branch from bb6d246 to a74e456 Compare April 21, 2025 14:13
@gary-huang gary-huang changed the title Gary/submit evals 2 LLM Obs SDK evaluation metrics submission Jun 26, 2025
@gary-huang gary-huang marked this pull request as ready for review July 8, 2025 15:57
@gary-huang gary-huang requested review from a team as code owners July 8, 2025 15:57
@gary-huang gary-huang requested a review from mcculls July 8, 2025 15:57
@gary-huang gary-huang marked this pull request as draft July 8, 2025 16:36
Base automatically changed from gary/use-ctx-api to gary/llmobs-sdk-merge July 8, 2025 19:33
@gary-huang gary-huang marked this pull request as ready for review July 8, 2025 19:59
@gary-huang gary-huang merged commit 54a3344 into gary/llmobs-sdk-merge Jul 8, 2025
494 of 504 checks passed
@gary-huang gary-huang deleted the gary/submit-evals-2 branch July 8, 2025 19:59
gary-huang added a commit that referenced this pull request Jul 9, 2025
* add APIs for llm obs

* add llm message class to support llm spans

* add llm message class to support llm spans

* impl llmobs agent and llmobs apis

* support llm messages with tool calls

* handle default model name and provider

* rm unneeded file

* impl llmobs agent and llmobs apis

* impl llmobs agent

* working writer

* add support for llm message and tool calls

* impl llmobs agent and llmobs apis

* use new ctx api to track parent span

* add api for evals

* working impl supporting both agentless and agent

* handle null tags and default to default ml app if null or empty string provided in the override

* cleaned up whitespace

* resolve merge conflicts

* remaining merge conflicts

* fix bad method call

* fixed llmobs intake creation if llmobs not enabled

* removed print statements

* ran spotless

* ran spotless

* added tests for llmobsspanmapper

* fixed coverage for tags

---------

Co-authored-by: Nayeem Kamal <nayeem.kamal@datadoghq.com>
Co-authored-by: Nayeem Kamal <kamal.nayeem12@gmail.com>
nayeem-kamal added a commit that referenced this pull request Jul 9, 2025
* add APIs for llm obs sdk (#8135)

* add APIs for llm obs

* add llm message class to support llm spans

* follow java convention of naming Id instead of ID

* add codeowners

* implement LLM Obs SDK spans APIs (#8390)

* add APIs for llm obs

* add llm message class to support llm spans

* add llm message class to support llm spans

* impl llmobs agent and llmobs apis

* support llm messages with tool calls

* handle default model name and provider

* rm unneeded file

* spotless

* add APIs for llm obs sdk (#8135)

* add APIs for llm obs

* add llm message class to support llm spans

* follow java convention of naming Id instead of ID

* add codeowners

* rename ID to Id according to java naming conventions

* Undo change to integrations-core submodule

* fix build gradle

* rm empty line

* fix test

* LLM Obs SDK Mapper (#8372)

* add APIs for llm obs

* add llm message class to support llm spans

* add llm message class to support llm spans

* impl llmobs agent and llmobs apis

* support llm messages with tool calls

* handle default model name and provider

* rm unneeded file

* impl llmobs agent and llmobs apis

* impl llmobs agent

* working writer

* add support for llm message and tool calls

* cleaned up whitespace

* resolve merge conflicts

* remaining merge conflicts

* fix bad method call

* fixed llmobs intake creation if llmobs not enabled

* removed print statements

* added tests for llmobsspanmapper

* fixed coverage for tags

---------

Co-authored-by: Nayeem Kamal <nayeem.kamal@datadoghq.com>

* updated to master submodule

* LLM Obs SDK use context API for parent children span linkage (#8711)

* add APIs for llm obs

* add llm message class to support llm spans

* add llm message class to support llm spans

* impl llmobs agent and llmobs apis

* support llm messages with tool calls

* handle default model name and provider

* rm unneeded file

* impl llmobs agent and llmobs apis

* impl llmobs agent

* working writer

* add support for llm message and tool calls

* impl llmobs agent and llmobs apis

* use new ctx api to track parent span

* cleaned up whitespace

* resolve merge conflicts

* remaining merge conflicts

* fix bad method call

* fixed llmobs intake creation if llmobs not enabled

* removed print statements

* ran spotless

* added tests for llmobsspanmapper

* fixed coverage for tags

---------

Co-authored-by: Nayeem Kamal <nayeem.kamal@datadoghq.com>
Co-authored-by: Nayeem Kamal <kamal.nayeem12@gmail.com>

* LLM Obs SDK evaluation metrics submission (#8688)

* add APIs for llm obs

* add llm message class to support llm spans

* add llm message class to support llm spans

* impl llmobs agent and llmobs apis

* support llm messages with tool calls

* handle default model name and provider

* rm unneeded file

* impl llmobs agent and llmobs apis

* impl llmobs agent

* working writer

* add support for llm message and tool calls

* impl llmobs agent and llmobs apis

* use new ctx api to track parent span

* add api for evals

* working impl supporting both agentless and agent

* handle null tags and default to default ml app if null or empty string provided in the override

* cleaned up whitespace

* resolve merge conflicts

* remaining merge conflicts

* fix bad method call

* fixed llmobs intake creation if llmobs not enabled

* removed print statements

* ran spotless

* ran spotless

* added tests for llmobsspanmapper

* fixed coverage for tags

---------

Co-authored-by: Nayeem Kamal <nayeem.kamal@datadoghq.com>
Co-authored-by: Nayeem Kamal <kamal.nayeem12@gmail.com>

* fix CODEOWNERS

---------

Co-authored-by: Nayeem Kamal <nayeem.kamal@datadoghq.com>
Co-authored-by: Nayeem Kamal <kamal.nayeem12@gmail.com>

3 participants