Commit c4b33d2
KVM: x86/mmu: Split huge pages mapped by the TDP MMU on fault
Now that the TDP MMU has a mechanism to split huge pages, use it in the
fault path when a huge page needs to be replaced with a mapping at a
lower level.
This change reduces the negative performance impact of NX HugePages.
Prior to this change if a vCPU executed from a huge page and NX
HugePages was enabled, the vCPU would take a fault, zap the huge page,
and mapping the faulting address at 4KiB with execute permissions
enabled. The rest of the memory would be left *unmapped* and have to be
faulted back in by the guest upon access (read, write, or execute). If
guest is backed by 1GiB, a single execute instruction can zap an entire
GiB of its physical address space.
For example, it can take a VM longer to execute from its memory than to
populate that memory in the first place:
$ ./execute_perf_test -s anonymous_hugetlb_1gb -v96
Populating memory : 2.748378795s
Executing from memory : 2.899670885s
With this change, such faults split the huge page instead of zapping it,
which avoids the non-present faults on the rest of the huge page:
$ ./execute_perf_test -s anonymous_hugetlb_1gb -v96
Populating memory : 2.729544474s
Executing from memory : 0.111965688s <---
This change also reduces the performance impact of dirty logging when
eager_page_split=N. eager_page_split=N (abbreviated "eps=N" below) can
be desirable for read-heavy workloads, as it avoids allocating memory to
split huge pages that are never written and avoids increasing the TLB
miss cost on reads of those pages.
| Config: ept=Y, tdp_mmu=Y, 5% writes |
| Iteration 1 dirty memory time |
| --------------------------------------------- |
vCPU Count | eps=N (Before) | eps=N (After) | eps=Y |
------------ | -------------- | ------------- | ------------ |
2 | 0.332305091s | 0.019615027s | 0.006108211s |
4 | 0.353096020s | 0.019452131s | 0.006214670s |
8 | 0.453938562s | 0.019748246s | 0.006610997s |
16 | 0.719095024s | 0.019972171s | 0.007757889s |
32 | 1.698727124s | 0.021361615s | 0.012274432s |
64 | 2.630673582s | 0.031122014s | 0.016994683s |
96 | 3.016535213s | 0.062608739s | 0.044760838s |
Eager page splitting remains beneficial for write-heavy workloads, but
the gap is now reduced.
| Config: ept=Y, tdp_mmu=Y, 100% writes |
| Iteration 1 dirty memory time |
| --------------------------------------------- |
vCPU Count | eps=N (Before) | eps=N (After) | eps=Y |
------------ | -------------- | ------------- | ------------ |
2 | 0.317710329s | 0.296204596s | 0.058689782s |
4 | 0.337102375s | 0.299841017s | 0.060343076s |
8 | 0.386025681s | 0.297274460s | 0.060399702s |
16 | 0.791462524s | 0.298942578s | 0.062508699s |
32 | 1.719646014s | 0.313101996s | 0.075984855s |
64 | 2.527973150s | 0.455779206s | 0.079789363s |
96 | 2.681123208s | 0.673778787s | 0.165386739s |
Further study is needed to determine if the remaining gap is acceptable
for customer workloads or if eager_page_split=N still requires a-priori
knowledge of the VM workload, especially when considering these costs
extrapolated out to large VMs with e.g. 416 vCPUs and 12TB RAM.
Signed-off-by: David Matlack <dmatlack@google.com>
Reviewed-by: Mingwei Zhang <mizhang@google.com>
Message-Id: <20221109185905.486172-3-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>1 parent 92292c1 commit c4b33d2
1 file changed
+35
-38
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1146 | 1146 | | |
1147 | 1147 | | |
1148 | 1148 | | |
| 1149 | + | |
| 1150 | + | |
| 1151 | + | |
1149 | 1152 | | |
1150 | 1153 | | |
1151 | 1154 | | |
| |||
1171 | 1174 | | |
1172 | 1175 | | |
1173 | 1176 | | |
1174 | | - | |
1175 | | - | |
1176 | | - | |
1177 | | - | |
1178 | | - | |
| 1177 | + | |
1179 | 1178 | | |
1180 | | - | |
1181 | | - | |
1182 | | - | |
| 1179 | + | |
| 1180 | + | |
1183 | 1181 | | |
1184 | | - | |
1185 | | - | |
1186 | | - | |
1187 | | - | |
1188 | | - | |
1189 | | - | |
1190 | | - | |
| 1182 | + | |
| 1183 | + | |
| 1184 | + | |
| 1185 | + | |
| 1186 | + | |
| 1187 | + | |
1191 | 1188 | | |
1192 | | - | |
1193 | | - | |
1194 | | - | |
1195 | | - | |
1196 | | - | |
1197 | | - | |
1198 | | - | |
1199 | | - | |
| 1189 | + | |
| 1190 | + | |
| 1191 | + | |
| 1192 | + | |
| 1193 | + | |
| 1194 | + | |
1200 | 1195 | | |
1201 | | - | |
1202 | | - | |
| 1196 | + | |
1203 | 1197 | | |
1204 | | - | |
| 1198 | + | |
| 1199 | + | |
| 1200 | + | |
| 1201 | + | |
1205 | 1202 | | |
1206 | | - | |
1207 | | - | |
1208 | | - | |
1209 | | - | |
| 1203 | + | |
| 1204 | + | |
| 1205 | + | |
| 1206 | + | |
1210 | 1207 | | |
1211 | | - | |
1212 | | - | |
1213 | | - | |
1214 | | - | |
1215 | | - | |
1216 | | - | |
| 1208 | + | |
| 1209 | + | |
| 1210 | + | |
| 1211 | + | |
| 1212 | + | |
1217 | 1213 | | |
1218 | 1214 | | |
1219 | 1215 | | |
| |||
1477 | 1473 | | |
1478 | 1474 | | |
1479 | 1475 | | |
| 1476 | + | |
1480 | 1477 | | |
1481 | 1478 | | |
1482 | 1479 | | |
1483 | 1480 | | |
1484 | 1481 | | |
1485 | 1482 | | |
1486 | 1483 | | |
1487 | | - | |
1488 | | - | |
1489 | 1484 | | |
1490 | 1485 | | |
1491 | 1486 | | |
| |||
1561 | 1556 | | |
1562 | 1557 | | |
1563 | 1558 | | |
| 1559 | + | |
| 1560 | + | |
1564 | 1561 | | |
1565 | 1562 | | |
1566 | 1563 | | |
| |||
0 commit comments