[BugFix][Refactor] Fix some bugs and refine codes for large scale simulator test #93

Status: Open, 55 commits to be merged into base branch main


@s5u13b s5u13b (Contributor) commented Jan 16, 2025

  1. Simplify request timestamps implementation and add metrics
  2. Set max-instances for auto_scale_up loop
  3. Support retrying address binding for the zmq server
  4. Support power-of-k-choice for dispatch
  5. Change num_cpus of ProxyActor from 1 to 0
  6. Fix some bugs: abort in AsyncStream, host in glocal launch mode, simulator in global launch mode
  7. Reorganize simulator files
  8. Reorganize the global_scheduler directory
  9. Re-sort manager and launcher functions
  10. Other minor changes
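
Item 4 mentions power-of-k-choice dispatch, but this thread does not show how it is implemented. As a rough illustration only, the textbook form of the idea samples k instances at random and sends the request to the least loaded of them. The actual change in this PR may differ (for example, by choosing at random among the k least-loaded instances). The function name, the `instance_loads` mapping, and the load values below are hypothetical and are not taken from the codebase.

```python
import random
from typing import Dict, Optional


def power_of_k_dispatch(
    instance_loads: Dict[str, float],
    k: int,
    rng: Optional[random.Random] = None,
) -> str:
    """Textbook power-of-k choices: sample k instances, return the least loaded.

    Hypothetical helper for illustration; not the PR's implementation.
    """
    if not instance_loads:
        raise ValueError("no instances available")
    rng = rng or random.Random()
    # Clamp k so we never ask for more candidates than there are instances.
    k = max(1, min(k, len(instance_loads)))
    # Sort the names first so the sample is reproducible for a seeded rng.
    candidates = rng.sample(sorted(instance_loads), k)
    # Among the sampled candidates, the lowest load wins.
    return min(candidates, key=lambda name: instance_loads[name])


# Made-up load values, purely for demonstration.
loads = {"instance_0": 0.9, "instance_1": 0.2, "instance_2": 0.5, "instance_3": 0.7}
chosen = power_of_k_dispatch(loads, k=2, rng=random.Random(0))
print(chosen)
```

With k equal to the number of instances this degenerates to always picking the least-loaded instance, and with k = 1 it is plain random dispatch; values in between trade balance quality against how many instances must be inspected per request.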

@s5u13b s5u13b changed the title [Observability] Refine request timestamps implementation and add more metrics [WIP][Observability] Refine request timestamps implementation and add more metrics Jan 16, 2025

| latency (ms) | p25 | p50 | p75 | p95 | p99 | mean |
|---|---|---|---|---|---|---|
| prefill | 10578.32 | 73505.90 | 133588.45 | 170028.33 | 171892.32 | 75987.91 |
| decode | 50.86 | 56.09 | 70.64 | 147.21 | 390.72 | 73.09 |
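
For reading the percentile columns in these benchmark comments: the thread does not say how the benchmark computes them, so the following is only a minimal sketch of one common definition (linear interpolation between closest ranks). `percentile` and `summarize_latencies` are hypothetical names, and the sample values are made up.

```python
import statistics
from typing import Dict, List, Sequence


def percentile(sorted_values: Sequence[float], q: float) -> float:
    """Linear-interpolation percentile (q in [0, 100]) over pre-sorted data."""
    if not sorted_values:
        raise ValueError("no samples")
    # Fractional rank of the requested percentile within the sorted samples.
    rank = (len(sorted_values) - 1) * q / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = rank - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * fraction


def summarize_latencies(latencies_ms: List[float]) -> Dict[str, float]:
    """Return the same columns as the tables: p25, p50, p75, p95, p99, mean."""
    ordered = sorted(latencies_ms)
    summary = {f"p{q}": percentile(ordered, q) for q in (25, 50, 75, 95, 99)}
    summary["mean"] = statistics.fmean(ordered)
    return summary


# Made-up per-request latencies in milliseconds.
summary = summarize_latencies([10.0, 20.0, 30.0, 40.0, 50.0])
print(summary)
```

Reporting the mean next to the percentiles helps expose tail behaviour: in the table above, the decode mean (73.09 ms) is higher than the decode p75 (70.64 ms), which indicates that a small number of slow requests pull the average up.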


migration_size 8.00 MB 16.00 MB 24.00 MB 32.00 MB 40.00 MB 48.00 MB 56.00 MB 64.00 MB 72.00 MB 80.00 MB 88.00 MB 96.00 MB 104.00 MB 112.00 MB 120.00 MB 128.00 MB 136.00 MB 144.00 MB 152.00 MB 160.00 MB 168.00 MB 176.00 MB 184.00 MB 192.00 MB 200.00 MB 208.00 MB 232.00 MB 256.00 MB 312.00 MB 352.00 MB 448.00 MB 472.00 MB 528.00 MB
rayrpc_speed(GB/s) 1.05 1.50 1.78 1.93 2.04 2.12 2.15 2.13 2.24 2.31 2.34 2.29 2.45 2.43 2.37 2.43 2.44 2.43 2.53 2.57 2.50 2.52 2.52 2.54 2.56 2.58 2.47 3.04 3.00 2.76 3.10 3.18 2.98
migration_size 8.00 MB 16.00 MB 24.00 MB 32.00 MB 40.00 MB 48.00 MB 56.00 MB 64.00 MB 72.00 MB 80.00 MB 88.00 MB 96.00 MB 104.00 MB 112.00 MB 120.00 MB 128.00 MB 136.00 MB 144.00 MB 152.00 MB 160.00 MB 168.00 MB 176.00 MB 184.00 MB 192.00 MB 200.00 MB 208.00 MB 216.00 MB 224.00 MB 232.00 MB 248.00 MB 264.00 MB 368.00 MB 416.00 MB 472.00 MB
gloo_speed(GB/s) 1.00 1.62 2.08 2.31 2.50 2.67 2.84 2.81 2.98 2.91 2.87 3.12 3.24 3.14 3.37 2.86 2.85 2.75 2.62 2.22 2.29 2.82 2.12 4.19 2.79 3.45 3.33 2.84 2.99 2.81 1.40 2.95 2.79 0.88
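
To relate the speed rows to wall-clock time, the figures can be inverted as time = size / speed. The sketch below does that for two rayrpc data points from the rows above. It assumes 1 GB = 1024 MB, which the thread does not state, and `implied_transfer_time_ms` is a hypothetical helper rather than part of the benchmark.

```python
def implied_transfer_time_ms(
    size_mb: float,
    speed_gb_per_s: float,
    mb_per_gb: float = 1024.0,  # assumption: binary units; not stated in the thread
) -> float:
    """Invert speed = size / time to recover the transfer time in milliseconds."""
    if speed_gb_per_s <= 0:
        raise ValueError("speed must be positive")
    size_gb = size_mb / mb_per_gb
    return size_gb / speed_gb_per_s * 1000.0


# Two rayrpc data points taken from the rows above.
small_ms = implied_transfer_time_ms(8.0, 1.05)     # 8.00 MB at 1.05 GB/s
large_ms = implied_transfer_time_ms(528.0, 2.98)   # 528.00 MB at 2.98 GB/s
print(f"{small_ms:.2f} ms, {large_ms:.2f} ms")
print(f"{small_ms / 8.0:.3f} ms/MB vs {large_ms / 528.0:.3f} ms/MB")
```

Under these assumptions the 8 MB transfer takes roughly 7.4 ms and the 528 MB transfer roughly 173 ms, so the cost per megabyte falls as the transfer grows. One plausible reading is that a fixed per-transfer overhead is amortized over larger migrations, though the thread itself does not draw that conclusion.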

@s5u13b s5u13b changed the title [WIP][Observability] Refine request timestamps implementation and add more metrics [Observability] Refine request timestamps implementation and add more metrics Jan 17, 2025
@s5u13b s5u13b force-pushed the request_timestamps branch from 5501476 to 8ec7ba7 Compare January 20, 2025 04:27

| latency (ms) | p25 | p50 | p75 | p95 | p99 | mean |
|---|---|---|---|---|---|---|
| prefill | 15971.81 | 75729.48 | 117620.09 | 161436.13 | 189060.75 | 71825.76 |
| decode | 52.30 | 56.66 | 69.52 | 117.12 | 372.92 | 80.02 |


migration_size 8.00 MB 16.00 MB 24.00 MB 32.00 MB 40.00 MB 48.00 MB 56.00 MB 64.00 MB 72.00 MB 80.00 MB 88.00 MB 96.00 MB 104.00 MB 112.00 MB 120.00 MB 128.00 MB 136.00 MB 144.00 MB 152.00 MB 160.00 MB 168.00 MB 176.00 MB 184.00 MB 192.00 MB 200.00 MB 208.00 MB 216.00 MB 232.00 MB 264.00 MB 336.00 MB 384.00 MB 472.00 MB 544.00 MB
rayrpc_speed(GB/s) 1.03 1.51 1.74 1.93 2.02 2.12 2.15 2.24 2.22 2.25 2.32 2.34 2.45 2.48 2.50 2.43 2.45 2.40 2.45 2.56 2.49 2.56 2.61 2.55 2.49 2.70 2.55 2.59 2.81 3.08 3.01 3.31 3.28
migration_size 8.00 MB 16.00 MB 24.00 MB 32.00 MB 40.00 MB 48.00 MB 56.00 MB 64.00 MB 72.00 MB 80.00 MB 88.00 MB 96.00 MB 104.00 MB 112.00 MB 120.00 MB 128.00 MB 136.00 MB 144.00 MB 152.00 MB 160.00 MB 168.00 MB 176.00 MB 184.00 MB 192.00 MB 200.00 MB 208.00 MB 216.00 MB 232.00 MB 240.00 MB 256.00 MB 368.00 MB
gloo_speed(GB/s) 1.04 1.74 2.22 2.47 2.64 2.81 3.12 3.20 3.18 3.41 3.54 3.20 3.54 3.40 3.37 3.17 3.14 2.44 2.73 2.66 3.06 3.07 1.88 3.21 2.99 2.35 2.04 4.90 2.55 0.48 2.44


| latency (ms) | p25 | p50 | p75 | p95 | p99 | mean |
|---|---|---|---|---|---|---|
| prefill | 3505.96 | 75484.03 | 129908.60 | 162583.37 | 185617.71 | 72493.51 |
| decode | 51.51 | 55.91 | 70.52 | 121.72 | 346.31 | 73.13 |


migration_size 1.59 GB 8.00 MB 16.00 MB 24.00 MB 32.00 MB 40.00 MB 48.00 MB 56.00 MB 64.00 MB 72.00 MB 80.00 MB 88.00 MB 96.00 MB 104.00 MB 112.00 MB 120.00 MB 128.00 MB 136.00 MB 144.00 MB 152.00 MB 160.00 MB 168.00 MB 176.00 MB 184.00 MB 192.00 MB 200.00 MB 208.00 MB 216.00 MB 224.00 MB 240.00 MB 256.00 MB 280.00 MB
rayrpc_speed(GB/s) 3.91 1.05 1.54 1.81 1.91 2.02 2.11 2.17 2.25 2.33 2.36 2.45 2.45 2.36 2.44 2.55 2.52 2.36 2.58 2.53 2.55 2.63 2.75 2.50 2.17 2.63 2.70 2.80 2.72 2.68 2.87 2.76
migration_size 8.00 MB 16.00 MB 24.00 MB 32.00 MB 40.00 MB 48.00 MB 56.00 MB 64.00 MB 72.00 MB 80.00 MB 88.00 MB 96.00 MB 104.00 MB 112.00 MB 120.00 MB 128.00 MB 136.00 MB 144.00 MB 152.00 MB 160.00 MB 168.00 MB 176.00 MB 184.00 MB 192.00 MB 200.00 MB 208.00 MB 216.00 MB 280.00 MB 448.00 MB
gloo_speed(GB/s) 1.00 1.68 2.06 2.30 2.60 2.82 2.94 2.94 2.86 2.89 2.96 3.54 3.13 3.25 3.63 2.73 2.52 2.39 2.72 2.77 2.66 2.38 3.13 2.53 3.04 1.61 2.69 2.30 1.35


| latency (ms) | p25 | p50 | p75 | p95 | p99 | mean |
|---|---|---|---|---|---|---|
| prefill | 15731.92 | 68576.88 | 122544.35 | 191770.90 | 192504.54 | 76629.42 |
| decode | 49.42 | 54.94 | 68.49 | 107.64 | 222.68 | 65.71 |


migration_size 1.59 GB 8.00 MB 16.00 MB 24.00 MB 32.00 MB 40.00 MB 48.00 MB 56.00 MB 64.00 MB 72.00 MB 80.00 MB 88.00 MB 96.00 MB 104.00 MB 112.00 MB 120.00 MB 128.00 MB 136.00 MB 144.00 MB 152.00 MB 160.00 MB 168.00 MB 176.00 MB 184.00 MB 192.00 MB 200.00 MB 208.00 MB 216.00 MB 224.00 MB 232.00 MB 256.00 MB 264.00 MB 288.00 MB 560.00 MB
rayrpc_speed(GB/s) 3.50 1.03 1.53 1.78 1.92 1.99 2.07 2.11 2.14 2.22 2.18 2.29 2.25 2.35 2.26 2.44 2.40 2.51 2.47 2.47 2.57 2.48 2.58 2.46 2.54 2.36 2.46 2.71 2.74 2.72 2.68 2.61 2.99 3.29
migration_size 8.00 MB 16.00 MB 24.00 MB 32.00 MB 40.00 MB 48.00 MB 56.00 MB 64.00 MB 72.00 MB 80.00 MB 88.00 MB 96.00 MB 104.00 MB 112.00 MB 120.00 MB 128.00 MB 136.00 MB 144.00 MB 152.00 MB 160.00 MB 168.00 MB 176.00 MB 184.00 MB 192.00 MB 200.00 MB 216.00 MB 224.00 MB 312.00 MB 336.00 MB 416.00 MB
gloo_speed(GB/s) 0.99 1.66 2.09 2.26 2.41 2.76 2.83 2.96 3.10 2.95 3.27 3.46 3.26 3.39 3.31 2.67 3.21 3.18 2.19 2.86 1.66 2.88 0.92 2.74 3.43 3.01 3.12 1.80 1.45 0.95


migration_size 1.59 GB 8.00 MB 16.00 MB 24.00 MB 32.00 MB 40.00 MB 48.00 MB 56.00 MB 64.00 MB 72.00 MB 80.00 MB 88.00 MB 96.00 MB 104.00 MB 112.00 MB 120.00 MB 128.00 MB 136.00 MB 144.00 MB 152.00 MB 160.00 MB 168.00 MB 176.00 MB 184.00 MB 192.00 MB 200.00 MB 208.00 MB 216.00 MB 224.00 MB 232.00 MB 240.00 MB 264.00 MB 272.00 MB 280.00 MB 328.00 MB 376.00 MB 416.00 MB 600.00 MB 656.00 MB
rayrpc_speed(GB/s) 3.45 1.03 1.52 1.80 1.92 2.07 2.10 2.13 2.18 2.27 2.29 2.29 2.34 2.37 2.46 2.42 2.47 2.52 2.49 2.52 2.56 2.59 2.54 2.53 2.33 2.44 2.51 2.71 2.62 2.59 2.69 2.67 2.62 2.80 3.00 3.10 3.23 3.22 3.64
migration_size 8.00 MB 16.00 MB 24.00 MB 32.00 MB 40.00 MB 48.00 MB 56.00 MB 64.00 MB 72.00 MB 80.00 MB 88.00 MB 96.00 MB 104.00 MB 112.00 MB 120.00 MB 128.00 MB 136.00 MB 144.00 MB 152.00 MB 160.00 MB 168.00 MB 176.00 MB 184.00 MB 200.00 MB 224.00 MB 232.00 MB 256.00 MB 416.00 MB
gloo_speed(GB/s) 1.04 1.74 2.13 2.46 2.73 2.91 2.94 3.02 3.00 3.36 3.07 3.42 3.25 3.42 3.12 2.86 3.01 3.14 2.20 3.14 2.23 3.94 1.98 2.90 3.12 4.37 3.33 3.39


| latency (ms) | p25 | p50 | p75 | p95 | p99 | mean |
|---|---|---|---|---|---|---|
| prefill | 14642.17 | 80519.75 | 130939.61 | 181272.94 | 181582.98 | 80402.36 |
| decode | 50.12 | 53.89 | 62.69 | 104.63 | 190.77 | 63.56 |


| latency (ms) | p25 | p50 | p75 | p95 | p99 | mean |
|---|---|---|---|---|---|---|
| prefill | 15635.69 | 74445.50 | 125064.97 | 185430.89 | 210975.15 | 76822.74 |
| decode | 50.45 | 54.65 | 65.08 | 95.30 | 191.98 | 62.26 |


migration_size 1.59 GB 8.00 MB 16.00 MB 24.00 MB 32.00 MB 40.00 MB 48.00 MB 56.00 MB 64.00 MB 72.00 MB 80.00 MB 88.00 MB 96.00 MB 104.00 MB 112.00 MB 120.00 MB 128.00 MB 136.00 MB 144.00 MB 152.00 MB 160.00 MB 168.00 MB 176.00 MB 184.00 MB 192.00 MB 208.00 MB 224.00 MB 232.00 MB 248.00 MB 264.00 MB 272.00 MB 280.00 MB 408.00 MB 416.00 MB 536.00 MB
rayrpc_speed(GB/s) 3.58 1.02 1.50 1.77 1.96 1.99 2.04 2.12 2.11 2.21 2.29 2.27 2.34 2.29 2.37 2.40 2.41 2.45 2.50 2.49 2.42 2.47 2.58 2.61 2.56 2.52 2.77 2.81 2.90 2.66 2.72 3.04 3.37 3.10 3.27
migration_size 8.00 MB 16.00 MB 24.00 MB 32.00 MB 40.00 MB 48.00 MB 56.00 MB 64.00 MB 72.00 MB 80.00 MB 88.00 MB 96.00 MB 104.00 MB 112.00 MB 120.00 MB 128.00 MB 136.00 MB 144.00 MB 152.00 MB 160.00 MB 168.00 MB 176.00 MB 184.00 MB 192.00 MB 200.00 MB 224.00 MB 272.00 MB 400.00 MB
gloo_speed(GB/s) 1.02 1.67 2.09 2.41 2.68 2.68 2.87 2.90 2.88 3.02 2.98 3.06 3.41 3.23 3.51 2.62 2.81 2.62 2.88 2.28 2.04 1.93 3.07 2.39 2.51 1.04 3.10 1.29

@s5u13b s5u13b changed the title [Observability] Refine request timestamps implementation and add more metrics [Observability] Refine request timestamps implementation and add more metrics & Reorg simulator files Jan 21, 2025

migration_size 1.59 GB 8.00 MB 16.00 MB 24.00 MB 32.00 MB 40.00 MB 48.00 MB 56.00 MB 64.00 MB 72.00 MB 80.00 MB 88.00 MB 96.00 MB 104.00 MB 112.00 MB 120.00 MB 128.00 MB 136.00 MB 144.00 MB 152.00 MB 160.00 MB 168.00 MB 176.00 MB 184.00 MB 192.00 MB 200.00 MB 208.00 MB 224.00 MB 232.00 MB 240.00 MB 264.00 MB 352.00 MB 480.00 MB 568.00 MB
rayrpc_speed(GB/s) 3.65 1.02 1.51 1.73 1.90 2.03 2.02 2.12 2.15 2.21 2.22 2.28 2.29 2.34 2.41 2.42 2.43 2.37 2.45 2.50 2.49 2.48 2.51 2.55 2.52 2.56 2.66 2.48 2.88 2.55 2.66 2.78 3.12 3.43
migration_size 8.00 MB 16.00 MB 24.00 MB 32.00 MB 40.00 MB 48.00 MB 56.00 MB 64.00 MB 72.00 MB 80.00 MB 88.00 MB 96.00 MB 104.00 MB 112.00 MB 120.00 MB 128.00 MB 136.00 MB 144.00 MB 152.00 MB 160.00 MB 168.00 MB 176.00 MB 184.00 MB 200.00 MB 224.00 MB 232.00 MB 376.00 MB
gloo_speed(GB/s) 0.99 1.66 2.07 2.39 2.53 2.79 2.87 2.99 2.85 3.22 3.23 3.57 3.22 3.23 3.15 2.95 2.23 2.55 2.74 3.05 1.87 2.17 3.44 3.00 3.41 3.23 2.92


| latency (ms) | p25 | p50 | p75 | p95 | p99 | mean |
|---|---|---|---|---|---|---|
| prefill | 15374.72 | 81117.49 | 134885.05 | 171695.47 | 188812.34 | 76513.39 |
| decode | 51.41 | 55.83 | 67.09 | 104.81 | 298.59 | 67.25 |

@s5u13b s5u13b changed the title [Observability] Refine request timestamps implementation and add more metrics & Reorg simulator files [Observability] Refine request timestamps implementation and add more metrics Jan 22, 2025

migration_size 8.00 MB 16.00 MB 24.00 MB 32.00 MB 40.00 MB 48.00 MB 56.00 MB 64.00 MB 72.00 MB 80.00 MB 88.00 MB 96.00 MB 104.00 MB 112.00 MB 120.00 MB 128.00 MB 136.00 MB 144.00 MB 152.00 MB 160.00 MB 168.00 MB 176.00 MB 184.00 MB 192.00 MB 200.00 MB 208.00 MB 216.00 MB 232.00 MB 248.00 MB 264.00 MB 280.00 MB 336.00 MB 360.00 MB 480.00 MB 544.00 MB
rayrpc_speed(GB/s) 1.03 1.54 1.80 1.89 2.03 2.06 2.15 2.14 2.28 2.22 2.38 2.30 2.38 2.38 2.39 2.46 2.43 2.33 2.42 2.56 2.46 2.66 2.33 2.62 2.38 2.74 2.59 2.73 2.79 2.92 2.89 3.17 2.99 3.15 3.22
migration_size 8.00 MB 16.00 MB 24.00 MB 32.00 MB 40.00 MB 48.00 MB 56.00 MB 64.00 MB 72.00 MB 80.00 MB 88.00 MB 96.00 MB 104.00 MB 112.00 MB 120.00 MB 128.00 MB 136.00 MB 144.00 MB 152.00 MB 160.00 MB 168.00 MB 176.00 MB 192.00 MB 200.00 MB 208.00 MB 216.00 MB 232.00 MB 312.00 MB 328.00 MB 432.00 MB
gloo_speed(GB/s) 1.03 1.72 2.17 2.39 2.56 2.85 3.03 2.96 3.02 3.12 3.13 3.15 3.26 2.93 3.43 2.73 2.86 3.03 2.64 2.70 2.42 3.27 2.04 3.46 2.44 2.72 2.09 3.18 2.30 3.43


| latency (ms) | p25 | p50 | p75 | p95 | p99 | mean |
|---|---|---|---|---|---|---|
| prefill | 16465.44 | 78088.89 | 130741.60 | 166650.61 | 173568.97 | 76939.03 |
| decode | 50.74 | 55.46 | 64.29 | 100.58 | 165.65 | 62.31 |

@s5u13b s5u13b force-pushed the request_timestamps branch from 1a49a08 to d2894ca Compare February 7, 2025 11:38

github-actions bot commented Feb 7, 2025

| latency (ms) | p25 | p50 | p75 | p95 | p99 | mean |
|---|---|---|---|---|---|---|
| prefill | 3106.97 | 4274.23 | 26191.15 | 88149.30 | 123252.74 | 21229.58 |
| decode | 71.95 | 106.97 | 150.60 | 1924.35 | 23199.80 | 1105.10 |


github-actions bot commented Feb 7, 2025

migration_size 1.59 GB 8.00 MB 16.00 MB 24.00 MB 32.00 MB 40.00 MB 48.00 MB 56.00 MB 64.00 MB 72.00 MB 80.00 MB 88.00 MB 96.00 MB 104.00 MB 112.00 MB 120.00 MB 128.00 MB 136.00 MB 144.00 MB 152.00 MB 160.00 MB 168.00 MB 176.00 MB 184.00 MB 200.00 MB 208.00 MB 248.00 MB 256.00 MB 272.00 MB 280.00 MB 312.00 MB 320.00 MB
rayrpc_speed(GB/s) 3.69 0.89 1.36 1.64 1.83 1.93 2.07 2.15 2.13 2.20 2.27 2.25 2.32 2.31 2.34 2.41 2.41 2.41 2.57 2.44 2.63 2.68 2.51 2.59 2.69 2.69 2.76 2.96 2.78 2.93 3.12 2.96
migration_size 8.00 MB 16.00 MB 24.00 MB 32.00 MB 40.00 MB 48.00 MB 56.00 MB 64.00 MB 72.00 MB 80.00 MB 88.00 MB 96.00 MB 104.00 MB 112.00 MB 120.00 MB 128.00 MB 136.00 MB 144.00 MB 152.00 MB 160.00 MB 168.00 MB 176.00 MB 184.00 MB 192.00 MB 200.00 MB 208.00 MB 232.00 MB 312.00 MB 392.00 MB
gloo_speed(GB/s) 0.87 1.45 1.88 2.11 2.24 2.41 2.42 2.44 2.67 2.64 2.45 2.78 2.65 2.70 2.60 2.39 2.61 1.74 1.75 2.64 2.11 2.91 2.92 0.91 2.71 2.67 2.52 2.29 2.42
migration_size 8.00 MB 16.00 MB 24.00 MB 32.00 MB 40.00 MB 48.00 MB 56.00 MB 64.00 MB 72.00 MB 80.00 MB 88.00 MB 96.00 MB 104.00 MB 112.00 MB 120.00 MB 128.00 MB 136.00 MB 144.00 MB 152.00 MB 160.00 MB 168.00 MB 176.00 MB 184.00 MB 192.00 MB 224.00 MB 232.00 MB 280.00 MB 320.00 MB 400.00 MB 464.00 MB
nccl_speed(GB/s) 0.20 0.40 0.67 0.81 0.96 1.21 1.44 1.59 1.66 2.04 1.92 2.13 1.98 2.06 2.34 2.34 2.78 3.03 2.56 3.00 3.30 2.97 3.68 2.10 3.64 2.97 1.98 2.84 4.06 3.62

commit 48c674b
Author: s5u13b <sunbiao.sun@alibaba-inc.com>
Date:   Fri Feb 7 09:41:05 2025 +0000

    Fix lint

commit 322862b
Author: s5u13b <sunbiao.sun@alibaba-inc.com>
Date:   Fri Feb 7 09:39:31 2025 +0000

    Fix entrypoints unit test

commit 75af824
Author: s5u13b <sunbiao.sun@alibaba-inc.com>
Date:   Fri Feb 7 08:07:26 2025 +0000

    Fix lint

commit 2818c8d
Author: s5u13b <sunbiao.sun@alibaba-inc.com>
Date:   Fri Feb 7 08:06:08 2025 +0000

    Fix cr

commit a172468
Author: s5u13b <sunbiao.sun@alibaba-inc.com>
Date:   Fri Feb 7 07:01:07 2025 +0000

    Fix lint

commit 3f863b2
Author: s5u13b <sunbiao.sun@alibaba-inc.com>
Date:   Fri Feb 7 06:54:18 2025 +0000

    Add back timestamp

commit 2e53b24
Author: s5u13b <sunbiao.sun@alibaba-inc.com>
Date:   Fri Feb 7 06:45:16 2025 +0000

    Fix lint

commit eea1a3a
Author: s5u13b <sunbiao.sun@alibaba-inc.com>
Date:   Fri Feb 7 06:37:30 2025 +0000

    Add back timestamps

commit b4a45ef
Author: s5u13b <sunbiao.sun@alibaba-inc.com>
Date:   Fri Feb 7 06:21:48 2025 +0000

    Remove old filter

commit f2df197
Author: s5u13b <sunbiao.sun@alibaba-inc.com>
Date:   Fri Feb 7 06:12:53 2025 +0000

    Add _process_model_outputs back

commit a51cf25
Author: s5u13b <sunbiao.sun@alibaba-inc.com>
Date:   Fri Feb 7 03:46:45 2025 +0000

    Fix abort

commit 1058ec0
Author: s5u13b <sunbiao.sun@alibaba-inc.com>
Date:   Fri Feb 7 02:43:14 2025 +0000

    Remove blank todo

commit 670018e
Author: s5u13b <sunbiao.sun@alibaba-inc.com>
Date:   Fri Feb 7 02:36:27 2025 +0000

    Filter out migrating request

commit fa2fc9c
Author: s5u13b <sunbiao.sun@alibaba-inc.com>
Date:   Fri Jan 24 06:25:35 2025 +0000

    Remove process_model_outputs request timestamps

commit 2a980ca
Author: s5u13b <sunbiao.sun@alibaba-inc.com>
Date:   Fri Jan 24 06:10:49 2025 +0000

    Fix linting

commit 78a1ab4
Author: s5u13b <sunbiao.sun@alibaba-inc.com>
Date:   Fri Jan 24 05:30:15 2025 +0000

    Fix request leaking bug of migration

commit 774205b
Author: s5u13b <sunbiao.sun@alibaba-inc.com>
Date:   Fri Jan 24 03:11:08 2025 +0000

    Fix

commit 814521e
Author: s5u13b <sunbiao.sun@alibaba-inc.com>
Date:   Fri Jan 24 02:57:20 2025 +0000

    Minors

commit b3f0688
Author: s5u13b <sunbiao.sun@alibaba-inc.com>
Date:   Fri Jan 24 01:56:09 2025 +0000

    Change ci timeout-minutes
@s5u13b s5u13b force-pushed the request_timestamps branch from 00d3273 to 2c4cc50 Compare February 12, 2025 07:57
@s5u13b s5u13b changed the title [Observability] Refine request timestamps implementation and add more metrics [Observability][GlobalScheduler][BugFix] Simplify request timestamps implementation & Support power-of-k-choice for dispatch & Fix some bugs and refine codes for large scale simulator test Feb 12, 2025
@s5u13b s5u13b changed the title [Observability][GlobalScheduler][BugFix] Simplify request timestamps implementation & Support power-of-k-choice for dispatch & Fix some bugs and refine codes for large scale simulator test [BugFix][Refactor] Fix some bugs and refine codes for large scale simulator test Feb 12, 2025
@s5u13b s5u13b requested a review from zhypku February 12, 2025 08:34

| latency (ms) | p25 | p50 | p75 | p95 | p99 | mean |
|---|---|---|---|---|---|---|
| prefill | 3115.62 | 20473.43 | 27978.57 | 88301.43 | 111353.63 | 27058.84 |
| decode | 70.73 | 95.35 | 142.00 | 1536.00 | 17277.53 | 857.66 |


migration_size 8.00 MB 16.00 MB 24.00 MB 32.00 MB 40.00 MB 48.00 MB 56.00 MB 64.00 MB 72.00 MB 80.00 MB 88.00 MB 96.00 MB 104.00 MB 112.00 MB 120.00 MB 128.00 MB 136.00 MB 144.00 MB 152.00 MB 160.00 MB 168.00 MB 176.00 MB 184.00 MB 192.00 MB 200.00 MB 224.00 MB 232.00 MB 248.00 MB 272.00 MB 280.00 MB 320.00 MB 424.00 MB 472.00 MB 512.00 MB 520.00 MB 560.00 MB
rayrpc_speed(GB/s) 0.90 1.35 1.64 1.85 1.94 2.01 2.15 2.23 2.23 2.29 2.35 2.28 2.25 2.35 2.32 2.42 2.51 2.41 2.51 2.61 2.57 2.53 2.59 2.54 2.66 2.66 2.55 2.52 2.73 2.97 2.86 3.21 3.27 3.19 3.12 3.17
migration_size 8.00 MB 16.00 MB 24.00 MB 32.00 MB 40.00 MB 48.00 MB 56.00 MB 64.00 MB 72.00 MB 80.00 MB 88.00 MB 96.00 MB 104.00 MB 112.00 MB 120.00 MB 128.00 MB 136.00 MB 144.00 MB 152.00 MB 160.00 MB 168.00 MB 176.00 MB 192.00 MB 216.00 MB 224.00 MB 232.00 MB 240.00 MB 280.00 MB 312.00 MB 416.00 MB
gloo_speed(GB/s) 0.88 1.47 1.83 2.00 2.13 2.28 2.53 2.56 2.32 2.53 2.61 2.47 2.34 2.57 2.60 2.43 2.39 2.07 2.71 2.49 2.23 2.53 2.41 3.30 2.18 1.99 1.19 1.94 2.24 2.19
migration_size 8.00 MB 16.00 MB 24.00 MB 32.00 MB 40.00 MB 48.00 MB 56.00 MB 64.00 MB 72.00 MB 80.00 MB 88.00 MB 96.00 MB 104.00 MB 112.00 MB 120.00 MB 128.00 MB 136.00 MB 144.00 MB 152.00 MB 160.00 MB 168.00 MB 176.00 MB 184.00 MB 192.00 MB 200.00 MB 208.00 MB 216.00 MB 232.00 MB 296.00 MB 392.00 MB 424.00 MB
nccl_speed(GB/s) 0.20 0.47 0.66 0.85 1.10 1.41 1.42 1.50 1.75 1.74 2.00 2.03 2.13 2.32 2.45 2.19 2.96 2.54 2.49 3.02 2.52 3.11 2.47 3.38 3.63 1.65 2.35 2.09 2.99 2.53 2.30
