Commit 60327c0

moving to 007
1 parent c5a0bb7 commit 60327c0

2 files changed: +23 -21 lines changed


005/README.md

Lines changed: 18 additions & 4 deletions
@@ -1,9 +1,23 @@
-# 006
+# 005
 
-In this version I move the player state to live in user memory.
+Since I'm blocked on ring buffers by what looks like a libxdp bug, I'm going to move ahead with testing the performance of using bpf hash maps to store player state.
 
-This should make player updates around twice as fast, because we no longer have to copy the state down from kernel memory -> user memory and then upload it from user memory -> kernel memory on every update; we only have to copy into kernel memory once.
+I'm going to spin up 16 threads and pin each one to its own CPU in [0,15], then simulate 50k players evenly distributed across these CPUs (i.e. 3125 players per CPU).
+
+Each player simulation step will read the player state from kernel memory, advance it forward by dt, and store the updated player state back into the bpf hash map.
+
+If the player maps are a bottleneck, that should show up here. It's possible they are, since each read and write to the map needs to be transferred between kernel and userspace memory, presumably via a syscall.
+
+Initially, I'm going to keep the player simulation very light. It will just touch all 1200 bytes of the player state (read and write) before committing it back to the map.
+
+I'm not sure whether I'll need to expand the number of cores dedicated to player simulation. I have a 64 core Threadripper running Linux, so if I need to go up to 64 cores to distribute the load of 50k players @ 100 Hz, I should be able to.
 
 # Results
 
-...
+I can do 345,541 player updates per second on 16 CPUs on my bare metal Linux box.
+
+This means I can do about 21k player updates per second per CPU on bare metal.
+
+Given that each player does 100 updates per second, for 50k players, I need 50,000 * 100 = 5,000,000 player updates per second.
+
+At about 21k updates per second per CPU, 5,000,000 updates per second would require roughly 238 CPUs.
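
The workload arithmetic in the 005 README can be checked with a short sketch. The figures (50k players, 100 updates per second, 16 CPUs, 345,541 measured updates per second) come from the README; the function names `players_per_cpu` and `cpus_required` are hypothetical helpers introduced only for illustration.

```python
from typing import List

PLAYERS = 50_000                      # from the README
UPDATE_HZ = 100                       # updates per player per second, from the README
NUM_CPUS = 16                         # one pinned worker thread per CPU
MEASURED_UPDATES_PER_SEC = 345_541    # measured on 16 CPUs, from the README


def players_per_cpu(players: int, cpus: int) -> List[int]:
    """Split players as evenly as possible across CPUs.

    The first (players % cpus) CPUs receive one extra player.
    """
    base, extra = divmod(players, cpus)
    return [base + (1 if i < extra else 0) for i in range(cpus)]


def cpus_required(target_updates_per_sec: float, updates_per_cpu_per_sec: float) -> float:
    """CPUs needed to reach the target rate, assuming throughput scales linearly."""
    return target_updates_per_sec / updates_per_cpu_per_sec


shares = players_per_cpu(PLAYERS, NUM_CPUS)
target = PLAYERS * UPDATE_HZ
per_cpu = MEASURED_UPDATES_PER_SEC / NUM_CPUS

print(shares[0])                               # 3125 players on each CPU
print(target)                                  # 5000000 updates per second needed
print(round(per_cpu))                          # 21596 updates per second per CPU
print(round(cpus_required(target, 21_000)))    # 238, using the rounded 21k figure
print(round(cpus_required(target, per_cpu)))   # 232, using the unrounded figure
```

The 238 in the README follows from rounding the per-CPU rate down to 21k. Both estimates assume throughput scales linearly with CPU count, which the measurements above do not confirm.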

006/README.md

Lines changed: 5 additions & 17 deletions
@@ -1,23 +1,11 @@
-# 005
+# 006
 
-Since I'm blocked on ring buffers by what looks like a libxdp bug, I'm going to move ahead with testing the performance of using bpf hash maps to store player state.
+In this version I move the player state to live in user memory.
 
-I'm going to spin up 16 threads and pin each one to its own CPU in [0,15], then simulate 50k players evenly distributed across these CPUs (i.e. 3125 players per CPU).
-
-Each player simulation step will read the player state from kernel memory, advance it forward by dt, and store the updated player state back into the bpf hash map.
-
-If the player maps are a bottleneck, that should show up here. It's possible they are, since each read and write to the map needs to be transferred between kernel and userspace memory, presumably via a syscall.
-
-Initially, I'm going to keep the player simulation very light. It will just touch all 1200 bytes of the player state (read and write) before committing it back to the map.
-
-I'm not sure whether I'll need to expand the number of cores dedicated to player simulation. I have a 64 core Threadripper running Linux, so if I need to go up to 64 cores to distribute the load of 50k players @ 100 Hz, I should be able to.
+This should make player updates maybe twice as fast, because we no longer have to copy the state down from kernel memory -> user memory and then upload it from user memory -> kernel memory on every update; we only have to copy into kernel memory once.
 
 # Results
 
-I can do 345,541 player updates per second on 16 CPUs on my bare metal Linux box.
-
-This means I can do about 21k player updates per second per CPU on bare metal.
-
-Given that each player does 100 updates per second, for 50k players, I need 5,000,000 player updates per second.
+I can now do around 550k updates per second with 16 CPUs.
 
-At about 21k updates per second per CPU, 5,000,000 updates per second would require roughly 238 CPUs.
+In theory, this means we should be able to handle 50k players at 100 Hz with about 144 CPUs.
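
The difference between the two versions can be pictured with a toy model that only counts copies per update. This is a sketch, not the author's code: `FakeKernelMap` is a plain-dict stand-in for a bpf hash map (it is not a real BPF API), and `touch` is a hypothetical placeholder for the light simulation step that reads and writes all 1200 bytes.

```python
STATE_BYTES = 1200  # player state size, from the README


class FakeKernelMap:
    """Plain-dict stand-in for a bpf hash map (hypothetical, not a real BPF API).

    Every lookup and update copies the value, mimicking a kernel <-> user transfer.
    """

    def __init__(self):
        self._values = {}
        self.copies = 0

    def lookup(self, key):
        self.copies += 1                  # kernel -> user copy
        return bytearray(self._values[key])

    def update(self, key, value):
        self.copies += 1                  # user -> kernel copy
        self._values[key] = bytes(value)


def touch(state):
    """Light simulation step: read and write every byte of the state."""
    for i in range(len(state)):
        state[i] = (state[i] + 1) & 0xFF


def step_state_in_map(kmap, key):
    """005 style: state lives in the map, so each update copies out and back in."""
    state = kmap.lookup(key)
    touch(state)
    kmap.update(key, state)


def step_state_in_user_memory(kmap, key, state):
    """006 style: state lives in user memory; only the result is copied in."""
    touch(state)
    kmap.update(key, state)


map_005 = FakeKernelMap()
map_005.update(0, bytes(STATE_BYTES))     # initial upload
map_005.copies = 0                        # count only the per-update copies
step_state_in_map(map_005, 0)

map_006 = FakeKernelMap()
user_state = bytearray(STATE_BYTES)
step_state_in_user_memory(map_006, 0, user_state)

print(map_005.copies, map_006.copies)     # prints "2 1"
```

Going from two copies to one per update matches the "maybe twice as fast" expectation. The measured change, from 345,541 to around 550k updates per second on 16 CPUs, is about 1.6x, so halving the copies did not quite double throughput.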
