4. Kernel
This section covers the technical background, functionality, and testing of the SMDK kernel. It explains how the OS recognizes a CXL memory device at boot by interpreting information provided by the BIOS, and how the CXL device is then exposed through three memory interfaces: System RAM, Swap, and DAX.
First, the base address and size of the attached CXL device must be provided by the BIOS and/or the device through SRAT, CEDT, and/or DVSEC. In addition, the CXL memory range presented in the EFI memory map must be typed as soft reserved, not as usable. Details are described below.
For the CXL device to be detected and to function properly, the OS must be able to retrieve the base address and size of the CXL device from the SRAT (System Resource Affinity Table). If a CXL device is not detected or does not operate normally on your system, check whether the SRAT contains an entry for the CXL device with its affinity, base address, and size.
The SRAT is one of the ACPI (Advanced Configuration and Power Interface) tables. To inspect it, dump the ACPI tables from the system, extract the SRAT from the dump, and disassemble it into a human-readable form.
# /path/to/SMDK/src/test/system/extract_system_info.sh
# Install packages
$ sudo apt install acpica-tools
# Extract ACPI Tables
$ sudo acpidump -o acpidump.out
# Separate Dumped files by tables
$ acpixtract -a acpidump.out
# Change raw data's format to human-readable through parser
$ iasl -d srat.dat
# Find the result
$ ls srat.dsl
srat.dsl
You can now check the details in the srat.dsl file. It lists SRAT subtables such as Processor Local Affinity and Memory Affinity. On a system where the CXL device is initialized normally, the CXL memory range appears as a Memory Affinity entry as follows. In the example below, the Base Address of the CXL memory region is 0x2380000000 and the Address Length is 0x2000000000, that is, 128GB. The Proximity Domain of the CXL memory region is 1; the OS uses this value to assign the NUMA node ID during kernel boot.
[78C0h 30912 1] Subtable Type : 01 [Memory Affinity]
[78C1h 30913 1] Length : 28
[78C2h 30914 4] Proximity Domain : 00000001
[78C6h 30918 2] Reserved1 : 0000
[78C8h 30920 8] Base Address : 0000002380000000
[78D0h 30928 8] Address Length : 0000002000000000
[78D8h 30936 4] Reserved2 : 00000000
[78DCh 30940 4] Flags (decoded below) : 00000001
Enabled : 1
Hot Pluggable : 0
Non-Volatile : 0
[78E0h 30944 8] Reserved3 : 0000000000000000
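As a rough aid for reading srat.dsl, the sketch below pulls the Proximity Domain, Base Address, and Address Length out of each Memory Affinity entry and prints the resulting physical range and size. The function name `srat_memory_ranges` is hypothetical (it is not part of SMDK), and the parsing assumes the field labels appear exactly as in the dump above.

```shell
# Hypothetical helper, not part of SMDK: summarize Memory Affinity entries
# from iasl-disassembled SRAT text read on stdin.
srat_memory_ranges() {
    local line value pxm=0 base="" len
    while IFS= read -r line; do
        value="${line##*: }"              # text after the last ": "
        value="${value//[[:space:]]/}"    # drop stray whitespace or CR
        case "$line" in
            *"Proximity Domain :"*) pxm=$(( 16#$value )) ;;
            *"Base Address :"*)     base=$(( 16#$value )) ;;
            *"Address Length :"*)
                [ -n "$base" ] || continue
                len=$(( 16#$value ))
                if (( len > 0 )); then
                    # end is inclusive: base + length - 1
                    printf 'pxm=%d start=0x%x end=0x%x size_gib=%d\n' \
                        "$pxm" "$base" "$(( base + len - 1 ))" "$(( len >> 30 ))"
                fi
                base=""
                ;;
        esac
    done
}
```

Running `srat_memory_ranges < srat.dsl` on the entry shown above prints `pxm=1 start=0x2380000000 end=0x437fffffff size_gib=128`, since 0x2000000000 bytes is 128 × 2^30.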
If there are multiple CXL memory devices, the SRAT contains multiple Memory Affinity entries, each with a different proximity domain. When the CXL memory range is listed as Memory Affinity, the kernel parses the SRAT and adds the CXL memory to a NUMA node during boot. You can check this in the kernel log with the dmesg command. In the example below, the CXL memory region with Proximity Domain (PXM) 1 is registered as NUMA node 1.
$ dmesg
...
[ 0.012865] ACPI: SRAT: Node 0 PXM 0 [mem 0x00000000-0x7fffffff]
[ 0.012868] ACPI: SRAT: Node 0 PXM 0 [mem 0x100000000-0x107fffffff]
[ 0.012877] ACPI: SRAT: Node 1 PXM 1 [mem 0x2380000000-0x437fffffff]
...
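To see at a glance which NUMA node each proximity domain landed on, a small filter over the kernel log can help. The sketch below is illustrative only: `srat_nodes_from_dmesg` is a hypothetical name, and the regular expression assumes the `ACPI: SRAT: Node N PXM P [mem start-end]` line format shown above.

```shell
# Hypothetical helper: print node, PXM, and size for every
# "ACPI: SRAT: Node N PXM P [mem start-end]" line read on stdin.
srat_nodes_from_dmesg() {
    local line start end
    local re='ACPI: SRAT: Node ([0-9]+) PXM ([0-9]+) \[mem (0x[0-9a-fA-F]+)-(0x[0-9a-fA-F]+)\]'
    while IFS= read -r line; do
        if [[ $line =~ $re ]]; then
            start=$(( ${BASH_REMATCH[3]} ))
            end=$(( ${BASH_REMATCH[4]} ))
            # The end address is inclusive, hence the +1.
            printf 'node=%d pxm=%d size_gib=%d\n' \
                "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}" "$(( (end - start + 1) >> 30 ))"
        fi
    done
}
```

With the log above, `dmesg | srat_nodes_from_dmesg` reports 2 GiB and 62 GiB for node 0 and 128 GiB for node 1.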
If you cannot extract the srat.dat file, your BIOS has not published the SRAT to the OS, so the BIOS option that enables SRAT support must be turned on. If srat.dat is extracted but srat.dsl has no Memory Affinity entry for the CXL memory, the BIOS may need to be updated to add that information to the SRAT.
The other means that the SMDK kernel uses to register a CXL device are CEDT (CXL Early Discovery Table) and DVSEC (Designated Vendor-Specific Extended Capability). DVSEC is a structure defined in the CXL specification that describes the capabilities of the CXL device that the vendor supports. In particular, the PCIe DVSEC for CXL Device (DVSEC ID=0) contains the base address and size of the CXL device. CEDT enables the OS to locate CXL Host Bridges and their registers during the boot process. Both CEDT and DVSEC contain the base address and size information. SMDK registers CXL devices as system memory using one of these three sources: SRAT, CEDT, or DVSEC.
You also need to verify that the CXL memory range is registered as soft reserved in the EFI memory map. The memory map can be found in the kernel boot log, as in the example below. Lines with the BIOS-e820 prefix show the e820 memory map received from the BIOS, listing each memory range and its type.
$ dmesg
...
[ 0.000000] BIOS-e820: [mem 0x0000002380000000-0x000000437fffffff] soft reserved
...
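The check described above can be scripted. The following is a minimal sketch, assuming the `BIOS-e820: [mem start-end] type` line format shown in the log; `e820_type_of` is a hypothetical helper name, not an SMDK tool.

```shell
# Hypothetical helper: print the e820 type of the range that contains ADDR.
# Reads kernel log text on stdin; returns 1 if no range contains ADDR.
e820_type_of() {
    local addr=$(( $1 )) line
    local re='BIOS-e820: \[mem (0x[0-9a-fA-F]+)-(0x[0-9a-fA-F]+)\] (.+)$'
    while IFS= read -r line; do
        if [[ $line =~ $re ]] &&
           (( addr >= ${BASH_REMATCH[1]} && addr <= ${BASH_REMATCH[2]} )); then
            printf '%s\n' "${BASH_REMATCH[3]}"
            return 0
        fi
    done
    return 1
}
```

For example, `dmesg | e820_type_of 0x2380000000` should print `soft reserved` on a correctly configured system; an answer of `usable` indicates that the range was not marked as specific-purpose memory.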
If the BIOS did not set the specific-purpose memory attribute (EFI_MEMORY_SP) for the range, the area is recognized as usable. To have it recognized as soft reserved, you can set the EFI_MEMORY_SP attribute by adding efi_fake_mem to the kernel command line (e.g., efi_fake_mem=<size>@<start address>:<memory attribute>). This kernel parameter sets a memory attribute for a specific memory range. During boot, you can edit the kernel command line by pressing 'e' on the GRUB kernel selection screen. Please refer to the Installation Guide for an example of the boot screen.
Below is an example of the efi_fake_mem setting to add to the kernel command line when the CXL memory region is reported as usable in the BIOS memory map. In the example below, the base address is 0x2380000000, the size of the CXL memory area is 128GB, and the memory attribute to be added is 0x40000 (=EFI_MEMORY_SP).
efi_fake_mem=128G@0x2380000000:0x40000
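If the base address and length are taken from the SRAT Memory Affinity entry, the parameter string can be generated rather than typed by hand. This is a sketch only: `efi_fake_mem_arg` is a hypothetical helper, and restricting the size to whole GiB is a simplification made here, not a kernel requirement.

```shell
# Hypothetical helper: build an efi_fake_mem argument that tags a range
# with EFI_MEMORY_SP (0x40000). Arguments: base address, size in bytes.
efi_fake_mem_arg() {
    local base=$(( $1 )) size=$(( $2 ))
    local gib=$(( 1 << 30 ))
    # Simplification for this sketch: only whole-GiB sizes are accepted.
    if (( size <= 0 || size % gib != 0 )); then
        echo "size must be a positive multiple of 1 GiB" >&2
        return 1
    fi
    printf 'efi_fake_mem=%dG@0x%x:0x40000\n' "$(( size / gib ))" "$base"
}
```

Calling `efi_fake_mem_arg 0x2380000000 0x2000000000` reproduces the line above.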
After adding the efi_fake_mem parameter and rebooting your system, check the e820 memory map for the CXL memory region in the boot log again. If the CXL memory region is recognized as soft reserved, the CXL driver of SMDK will register the area in the EXMEM zone.
Once the SMDK kernel has booted, the CXL memory channels in the system are grouped as a zone partition by default. A system administrator can later change the grouping policy through the CXL-CLI or the sysfs interface with root permission.
SMDK supports three grouping policies: zone, node, and noop. You can change the SMDK memory partition with CXL-CLI. Please refer to the CXL-CLI Guide section for more details.
# ./cxl group-zone ("group-zone", "group-node" or "group-noop")
By default, the grouping policy of the SMDK Memory Partition is group-zone: CXL memory is represented as an EXMEM zone in the same node as the CPU socket and its DDR memory. See the table below for more details of each policy.
Value | Desc. | Example: 3ch of CXL devices @Socket 0 |
---|---|---|
group-zone (default) | Zone Partition: Represent CXL memories as an EXMEM zone. CXL Memories : Zone = N : 1 | node 0 : CPU #1 + DDR Memory #1 + CXL #1, #2, #3 |
group-node | Node Partition: Represent CXL memories as independent nodes. CXL Memories : Node = N : 1 | node 0 : CPU #1 + DDR Memory #1; node 1 : CXL #1, #2, #3 |
group-noop | Single Node: Represent each CXL memory as an independent node. CXL Memories : Node = 1 : 1 | node 0 : CPU #1 + DDR Memory #1; node 1 : CXL #1; node 2 : CXL #2; node 3 : CXL #3 |
Note: As for group-zone, both ZONE_NORMAL and ZONE_EXMEM are in one NUMA Node. In this case, the priority of memory allocation order can be controlled via /proc/sys/vm/numa_zonelist_order interface. You can set the order by writing exmem or normal to that file.
online/offline
- Offline: CXL memory is not recognized as a system RAM but as a soft reserved area.
Node 0, zone DMA 1 0 0 1 2 1 1 0 1 1 3
Node 0, zone DMA32 3 8 3 4 4 5 3 5 4 4 436
Node 0, zone Normal 1897 128 43 108 119 76 37 13 5 1 0 45128
- Online: CXL is mapped to the EXMEM zone, so CXL and DRAM are mapped to different zones.
Node 0, zone DMA 1 0 0 1 2 1 1 0 1 1 3
Node 0, zone DMA32 3 8 3 4 4 5 3 5 4 4 436
Node 0, zone Normal 2238 196 75 40 17 31 29 10 5 2 43595
Node 0, zone ExMem 0 0 0 0 0 0 0 0 0 0 98304
SMDK Memory Partition
- group-zone (default): All CXL memory devices are added to the same node with DDR memory.
Node 0, zone DMA 1 0 0 1 2 1 1 0 1 1 3
Node 0, zone DMA32 3 8 3 4 4 5 3 5 4 4 436
Node 0, zone Normal 3 0 1 2 2 1 0 3 3 0 43772
Node 0, zone ExMem 0 0 0 0 0 0 0 0 0 0 98304
- group-node: All CXL memory devices are grouped by the installed socket, and devices of each socket are added as separate nodes.
Node 0, zone DMA 1 0 0 1 2 1 1 0 1 1 3
Node 0, zone DMA32 3 8 3 4 4 5 3 5 4 4 436
Node 0, zone Normal 33615 309 94 63 38 3 4 1 2 2 43597
Node 1, zone ExMem 0 0 0 0 0 0 0 0 0 0 98304
- group-noop: All CXL memory devices are added as nodes separate from normal DDR memory. Every single CXL device becomes a separate node.
Node 0, zone DMA 1 0 0 1 2 1 1 0 1 1 3
Node 0, zone DMA32 3 8 3 4 4 5 3 5 4 4 436
Node 0, zone Normal 224 705 660 443 166 103 50 31 16 15 44047
Node 1, zone ExMem 0 0 0 0 0 0 0 0 0 0 32768
Node 2, zone ExMem 0 0 0 0 0 0 0 0 0 0 32768
Node 3, zone ExMem 0 0 0 0 0 0 0 0 0 0 32768
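The ExMem rows above can be turned into sizes. In /proc/buddyinfo, the k-th count after the zone name (starting from 0) is the number of free blocks of 2^k pages, so 98304 blocks in the last column of an 11-column row correspond to 98304 × 1024 pages. The sketch below sums each row; `buddy_free_mib` is a hypothetical name, and the 4 KiB page size is an assumption that holds on typical x86-64 systems.

```shell
# Hypothetical helper: summarize /proc/buddyinfo-style text read on stdin.
# Assumes 4 KiB pages.
buddy_free_mib() {
    awk '$1 == "Node" && $3 == "zone" {
        node = $2; sub(/,$/, "", node)          # "0," -> "0"
        pages = 0
        # Field 5 holds order-0 blocks, field 6 order-1 blocks, and so on.
        for (i = 5; i <= NF; i++) pages += $i * 2 ^ (i - 5)
        printf "node=%s zone=%s free_mib=%d\n", node, $4, pages * 4 / 1024
    }'
}
```

Running `buddy_free_mib < /proc/buddyinfo` on the group-zone example reports 393216 MiB (384 GiB) for ExMem on node 0, while each ExMem node in the group-noop example holds 131072 MiB (128 GiB).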
If the memory of the CXL device is in use, the online/offline or memory partition change is canceled. You can check the result of a memory partition change with the command below.
# ./cxl group-list <--node | --dev>
In addition to group-zone, group-node, and group-noop commands of CXL-CLI, you can freely configure CXL node partitions through group-add and group-remove commands. Assuming that noop partition policy has been applied as in the example right above, you can combine Node 2 and Node 3 into one through the command below.
# ./cxl group-add --target_node 2 --dev cxl2
# cat /proc/buddyinfo
Node 0, zone DMA 1 0 0 1 2 1 1 0 1 1 3
Node 0, zone DMA32 3 8 3 4 4 5 3 5 4 4 436
Node 0, zone Normal 2907 1686 828 4416 2038 913 407 192 106 58 13430
Node 1, zone ExMem 0 0 0 0 0 0 0 0 0 0 32768
Node 2, zone ExMem 0 0 0 0 0 0 0 0 0 0 65536
# ./cxl group-list --node
[
{
"node_id" : -1,
"devices" : [ ]
}
{
"node_id" : 0,
"devices" : [ ]
}
{
"node_id" : 1,
"devices" : [ "cxl0" ]
}
{
"node_id" : 2,
"devices" : [ "cxl1" "cxl2" ] # Node 2 consists of cxl dev 1 and 2.
}
]
Please check CXL-CLI Guide for more details.
To use CXL memory as System RAM, the CXL memory device must be mapped to the EXMEM zone. SMDK provides one more way to use CXL memory: DAX. To bind the device to the DAX driver, you first need to take the CXL device offline through the sysfs interface. This task requires root permission.
Below is an example of offlining a CXL memory range. Setting node_id to -1 takes the CXL memory range offline.
$ echo -1 > /sys/kernel/cxl/devices/cxl0/node_id
Now the DAX driver can bind and use the CXL memory range. Device-dax provides an interface that lets a specific application use specific-purpose memory. Such memory ranges are reserved by HMEM platform devices. Because the CXL driver, like the HMEM device, operates on soft reserved areas, the CXL driver and the DAX driver are mutually exclusive for a given range. By default, the SMDK kernel registers the CXL memory range in the EXMEM zone at boot. Therefore, for the DAX driver to operate on the CXL memory range, the range must first be unbound from the CXL driver.
Below is an example of binding the CXL memory range to dax0.0 in a system equipped with a single channel of CXL memory expander device.
$ echo -1 > /sys/kernel/cxl/devices/cxl0/node_id
$ echo "880000000-287fffffff" > /sys/devices/platform/hmem.0/dax0.0/mapping
$ echo dax0.0 > /sys/bus/dax/drivers/device_dax/bind
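The mapping string in the example above is the inclusive physical range in hexadecimal without a 0x prefix. If you know the device's start address and size, a helper along these lines can produce it; `iomem_range` is a hypothetical name used only for illustration.

```shell
# Hypothetical helper: format START and SIZE as the "start-end" string
# (hex, no 0x prefix, inclusive end) used in the mapping example above.
iomem_range() {
    local start=$(( $1 )) size=$(( $2 ))
    (( size > 0 )) || { echo "size must be positive" >&2; return 1; }
    printf '%x-%x\n' "$start" "$(( start + size - 1 ))"
}
```

For a 128 GiB device at 0x880000000, `iomem_range 0x880000000 0x2000000000` yields `880000000-287fffffff`, which matches the range written to the mapping file above.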
To check if the DAX device successfully binds the CXL memory range, check /proc/iomem. Below is an example of a system equipped with a single channel of CXL memory expander device.
$ sudo cat /proc/iomem
...
880000000-287fffffff : hmem.1
880000000-287fffffff : Soft Reserved
880000000-287fffffff : hmem.0
880000000-287fffffff : dax0.0
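The same check can be automated by listing every resource registered for the range. The sketch below assumes the `range : name` layout of /proc/iomem shown above; `iomem_owners` is a hypothetical helper name.

```shell
# Hypothetical helper: print every resource name registered for exactly
# RANGE in /proc/iomem-style text read on stdin, outermost first.
iomem_owners() {
    awk -F ' : ' -v range="$1" '{
        r = $1; gsub(/[ \t]/, "", r)    # strip nesting indentation
        if (r == range) print $2
    }'
}
```

For example, `sudo cat /proc/iomem | iomem_owners 880000000-287fffffff | grep -qx 'dax0.0'` succeeds only when the DAX device owns the range. Root is needed because unprivileged reads of /proc/iomem show zeroed addresses.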
You can get the same result with CXL-CLI like below.
# ./cxl group-dax --dev cxl0
Now, this DAX device can be used through fio benchmark, etc. For more information, refer to Test section below.
If you want to unbind the CXL memory range from the DAX device and register it as the EXMEM zone again, execute the command below.
$ echo dax0.0 > /sys/bus/dax/drivers/device_dax/unbind
$ echo 0 > /sys/kernel/cxl/devices/cxl0/node_id
Or you can execute the command below using CXL-CLI.
# ./cxl group-add --target_node 0 --dev cxl0
CXL Swap is another memory interface for userspace applications. It allows a CXL device to be used as a swap device and, unlike zswap, it does not incur (de)compression overhead or the latency fluctuation caused by consuming host CPU cycles while pages are swapped out or in. When swapping takes place, CXL Swap intervenes in the Linux swap path before disk I/O is issued, and stores and retrieves the swapped pages in a ZONE_EXMEM memory pool that expands and shrinks dynamically.
CXL Swap implements the frontswap interface like zswap, and its usage is very similar to that of zswap. It is built into the kernel, so you do not need to insert a separate module; just turn on CONFIG_CXLSWAP in $ make menuconfig. After the system boots, you can enable CXL Swap as shown below.
echo 1 > /sys/module/cxlswap/parameters/enabled
Other parameters can be found in the following Configurations.
Note that zswap and CXL Swap should be used exclusively of each other, because the two modules make different trade-offs between CPU usage and memory density.
Note: The following configurations are located in /sys/module/cxlswap/parameters/ and can be changed by writing values in the following files or using CXL-CLI. Root privileges are required to change the settings.
Config. | Desc. | Default | Note |
---|---|---|---|
accept_threshold_percent | The threshold at which cxlswap would start accepting pages again after it became full. | 90 | |
cxlpool | The memory pool for cxlswap that grows on demand and shrinks as pages are freed. | cxlbud | |
enabled | Enable or disable cxlswap at runtime. | N | |
flush (experimental) | Flush all pages in cxlpool. CXL Swap should be disabled before executing flush. | N/A | |
max_pool_percent | The maximum percentage of memory that the cxlpool can occupy. | 20 | |
same_filled_pages_enabled | Identify same-value filled pages (i.e. contents of the page have same value or repetitive pattern) during store operation, and if true, the length of the page is set to zero and the pattern or same-filled value is stored. | Y | |
non_same_filled_pages_enabled | If the attribute is disabled, the handling of non-same-value pages by cxlswap is disabled. | Y | |
# echo 1 | sudo tee /sys/module/cxlswap/parameters/enabled
# cat /sys/module/cxlswap/parameters/enabled
Y
# echo 0 | sudo tee /sys/module/cxlswap/parameters/enabled
# cat /sys/module/cxlswap/parameters/enabled
N
# echo 1 | sudo tee /sys/module/cxlswap/parameters/flush
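To review all settings at once, the parameter files can be read in a loop. This is a minimal sketch: `cxlswap_params` is a hypothetical helper, the default directory is the one named in the note above, and the assumption that some entries (such as a trigger like flush) may not be readable is handled by skipping them.

```shell
# Hypothetical helper: print name=value for each readable parameter file.
# DIR defaults to the cxlswap parameter directory named in the note above.
cxlswap_params() {
    local dir="${1:-/sys/module/cxlswap/parameters}" f value
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        # Skip entries that cannot be read instead of failing.
        if value=$(cat "$f" 2>/dev/null); then
            printf '%s=%s\n' "${f##*/}" "$value"
        fi
    done
}
```

Calling `cxlswap_params` with no argument lists the live settings; passing a directory makes it easy to try the function against sample files.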
Below is a set of test cases and examples for verifying the operation of the SMDK kernel.
You can build the binaries required for the tests by running the make command once in /path/to/SMDK/src/test.
This is the test case to check whether UEFI BIOS properly provides CXL device related information to kernel.
The test case checks the following:
- SRAT table contains the Memory Affinity information of the CXL memory.
- CXL memory range is included in the EFI memory map on dmesg, and it is recognized as soft reserved.
- The CXL memory range of /proc/iomem is recognized as system RAM.
- The CXL memory is recognized as EXMEM zone in /proc/buddyinfo.
Command lines
$ cd /path/to/SMDK/src/test/system
$ ./extract_system_info.sh <CXL memory start address>
(Example) $ ./extract_system_info.sh 1080000000
Result
1. SRAT table:
[7850h 30800 1] Subtable Type : 01 [Memory Affinity]
[7851h 30801 1] Length : 28
[7852h 30802 4] Proximity Domain : 00000001
[7856h 30806 2] Reserved1 : 0000
[7858h 30808 8] Base Address : 0000001080000000
[7860h 30816 8] Address Length : 0000002000000000
[7868h 30824 4] Reserved2 : 00000000
[786Ch 30828 4] Flags (decoded below) : 00000001
Enabled : 1
Hot Pluggable : 0
Non-Volatile : 0
[7870h 30832 8] Reserved3 : 0000000000000000
2. e820 memory map:
[ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009efff] usable
[ 0.000000] BIOS-e820: [mem 0x000000000009f000-0x00000000000fffff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000006a5b9fff] usable
[ 0.000000] BIOS-e820: [mem 0x000000006a5ba000-0x000000006c6b1fff] reserved
[ 0.000000] BIOS-e820: [mem 0x000000006c6b2000-0x000000006d8c9fff] ACPI NVS
[ 0.000000] BIOS-e820: [mem 0x000000006d8ca000-0x000000006ddfdfff] ACPI data
[ 0.000000] BIOS-e820: [mem 0x000000006ddfe000-0x00000000777fffff] usable
[ 0.000000] BIOS-e820: [mem 0x0000000077800000-0x000000008fffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000fe010000-0x00000000fe010fff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000507fffffff] usable
3. /proc/iomem:
1080000000-507fffffff : System RAM
4. /proc/buddyinfo:
Node 0, zone DMA 0 1 0 0 0 0 0 0 1 1 3
Node 0, zone DMA32 2 5 5 4 5 3 3 4 4 4 440
Node 0, zone Normal 4 7 87 71 113 254 96 45 20 1 14327
Node 1, zone ExMem 0 0 0 0 0 0 0 0 0 0 32365
Node 2, zone ExMem 0 0 0 0 0 0 0 0 0 0 32368
SMDK kernel includes Subzone architecture for efficient memory management, and this test script is for verifying the operation of subzone function. For a detailed description of subzone architecture, refer to the Memory Partition of SMDK Architecture.
In this test, a single thread repeatedly requests allocations of a fixed size (1KiB, 4KiB, 128KiB, or 4MiB) until it has requested 4GiB in total. Then 10 threads allocate memory in the same way, each requesting 4GiB.
Command lines
$ cd /path/to/SMDK/src/test/subzone
$ ./run_4GB_malloc_test.sh
Result
Single Thread Testcases
TC test_malloc_1K_bytes_4M_times starts
cxl group-zone, size: 1.0K bytes, iteration: 4.0M times, elapsed time: ......
cxl group-node, size: 1.0K bytes, iteration: 4.0M times, elapsed time: ......
......
TC test_malloc_1K_bytes_4M_times done
TC test_malloc_4K_bytes_1M_times starts
......
TC test_malloc_4GB_10_threads_4M_unit done
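The iteration counts in test names such as test_malloc_1K_bytes_4M_times follow from dividing the 4GiB total by the request size. The arithmetic can be sketched as below; `alloc_iterations` is a hypothetical helper written for this illustration and is not part of the test scripts.

```shell
# Hypothetical helper: number of allocations of UNIT bytes needed to
# request TOTAL bytes in all.
alloc_iterations() {
    local total=$(( $1 )) unit=$(( $2 ))
    (( unit > 0 )) || { echo "unit must be positive" >&2; return 1; }
    echo $(( total / unit ))
}

total=$(( 4 << 30 ))                      # 4 GiB
for unit_kib in 1 4 128 4096; do
    printf '%d KiB x %d\n' "$unit_kib" \
        "$(alloc_iterations "$total" $(( unit_kib << 10 )))"
done
```

This prints 4194304 (4M) iterations for 1KiB, 1048576 (1M) for 4KiB, 32768 for 128KiB, and 1024 for 4MiB.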
This script is similar to run_4GB_malloc_test.sh above, but the requested size is random, i.e., it changes on every request. The total amount of memory requested is 4GiB.
Command lines
$ cd /path/to/SMDK/src/test/subzone
$ ./run_random_malloc_test.sh
Result
Allocation size: 4294990785
This is a test case to check whether the online/offline change and node id change of the CXL device work normally.
Command lines
$ cd /path/to/SMDK/src/test/driver
$ ./run_functional_test.sh
Result
[[Buddy Info]]
Node 0, zone DMA 0 0 0 0 0 0 0 0 1 1 3
Node 0, zone DMA32 4 3 8 4 5 5 4 4 5 3 437
Node 0, zone Normal 817 409 228 295 112 28 18 11 4 1 15199
Node 0, zone ExMem 3 2 1 2 2 2 2 1 0 1 8191
[[Device Info]]
start_address: 0x1080000000
size: 0x400000000
node_id: 0
socket_id: 0
state: online
[OFFLINE TEST]
[[Buddy Info]]
Node 0, zone DMA 0 0 0 0 0 0 0 0 1 1 3
Node 0, zone DMA32 4 3 8 4 5 5 4 4 5 3 437
Node 0, zone Normal 460 384 264 317 117 31 19 11 4 4 15261
Node 0, zone ExMem 0 0 0 0 0 0 0 0 0 0 4096
[[Device Info]]
start_address: 0x1080000000
size: 0x400000000
node_id: -1
socket_id: 0
state: offline
PASS
[ONLINE TEST]
[[Buddy Info]]
Node 0, zone DMA 0 0 0 0 0 0 0 0 1 1 3
Node 0, zone DMA32 4 3 8 4 5 5 4 4 5 3 437
Node 0, zone Normal 189 304 186 303 117 31 19 11 4 2 15198
Node 0, zone ExMem 0 0 0 0 0 0 0 0 0 0 8192
[[Device Info]]
start_address: 0x1080000000
size: 0x400000000
node_id: 0
socket_id: 0
state: online
PASS
[NODE CHANGE TEST]
[[Buddy Info]]
Node 0, zone DMA 0 0 0 0 0 0 0 0 1 1 3
Node 0, zone DMA32 4 3 8 4 5 5 4 4 5 3 437
Node 0, zone Normal 170 339 220 331 120 31 19 11 4 1 15199
Node 0, zone ExMem 0 0 0 0 0 0 0 0 0 0 4096
Node 1, zone ExMem 0 0 0 0 0 0 0 0 0 0 4096
[[Device Info]]
start_address: 0x1080000000
size: 0x400000000
node_id: 1
socket_id: 0
state: online
PASS
[KOBJECT RELEASE TEST]
kobject is released
PASS
memdev: /sys/devices/pci0000:16/0000:16:00.0/mem0
This is a test case to check whether the state of the device remains unchanged when attempting to change to offline/online in the case of a CXL device that is in use or bound to DAX.
Command lines
$ cd /path/to/SMDK/src/test/driver
$ ./run_rollback_test.sh
Result
[online rollback test]
[[Device Info]]
start_address: 0x1080000000
size: 0x400000000
node_id: 0
socket_id: 0
state: online
./run_rollback_test.sh: line 37: echo: write error: Device or resource busy
addr[0x7ff49536b000]
[[Device Info]]
start_address: 0x1080000000
size: 0x400000000
node_id: 0
socket_id: 0
state: online
[offline rollback test]
[[Device Info]]
start_address: 0x1080000000
size: 0x400000000
node_id: -1
socket_id: 0
state: offline
./run_rollback_test.sh: line 48: echo: write error: Device or resource busy
[[Device Info]]
start_address: 0x1080000000
size: 0x400000000
node_id: -1
socket_id: 0
state: offline
PASS
mmap description
#include <sys/mman.h>
void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);
mmap() creates a new mapping in the virtual address space of the calling process. The starting address for the new mapping is specified in addr.
The length argument specifies the length of the mapping in bytes.
The contents of a file mapping are initialized using length bytes starting at offset in the file referred to by the file descriptor fd.
mmap flag argument
The flags argument determines whether updates to the mapping are visible to other processes mapping the same region, and whether updates are carried through to the underlying file.
Flags | Desc. |
---|---|
MAP_SHARED | Share this mapping. Updates to the mapping are visible to other processes mapping the same region, and are carried through to the underlying file. |
MAP_PRIVATE | Create a private copy-on-write mapping. Updates to the mapping are not visible to other processes mapping the same file, and are not carried through to the underlying file. |
MAP_ANONYMOUS | The mapping is not backed by any file; its contents are initialized to zero. The fd argument is ignored. |
MAP_POPULATE | Populate (prefault) page tables for a mapping. This will help to reduce blocking on page faults later. |
MAP_EXMEM | SMDK adds this flag to target the memory zone. If MAP_EXMEM is set, caller's virtual address space will be mapped to physical page that belongs to EXMEM zone. |
MAP_NORMAL | SMDK adds this flag to target the memory zone. If MAP_NORMAL is set, caller's virtual address space will be mapped to physical page that belongs to zones other than EXMEM zone. e.g. NORMAL zone. |
This is a test case to check whether the new mmap flag MAP_EXMEM (see the code block below), added in the SMDK kernel for using CXL memory, works properly at the kernel level. It is different from the syscall test.
unsigned int flag = MAP_PRIVATE | MAP_ANONYMOUS | MAP_EXMEM;
char *addr = mmap(NULL, length, PROT_READ | PROT_WRITE, flag, -1, 0);
Command lines
$ cd /path/to/SMDK/src/test/mmap
$ ./test_mmap_cxl [options...]
(Example) $ ./test_mmap_cxl E
Options
Options | Desc. |
---|---|
e, E | Requests mmap with MAP_EXMEM flag. |
n, N | Requests mmap with MAP_NORMAL flag. |
loop <n> | Specifies iteration count. |
usleep <usec> | Specifies the sleep time in microseconds. (default: 1 second) |
buddyinfo | Writes /proc/buddyinfo to output file. |
Result
......
MAP_EXMEM
addr[0x7f8b3239d000], one='1' zero='0'
addr[0x7f8b31f9d000], one='1' zero='0'
addr[0x7f8b31b9d000], one='1' zero='0'
addr[0x7f8b3179d000], one='1' zero='0'
addr[0x7f8b3139d000], one='1' zero='0'
addr[0x7f8b30f9d000], one='1' zero='0'
addr[0x7f8b30b9d000], one='1' zero='0'
addr[0x7f8b3079d000], one='1' zero='0'
addr[0x7f8b3039d000], one='1' zero='0'
addr[0x7f8b2ff9d000], one='1' zero='0'
addr[0x7f8b2fb9d000], one='1' zero='0'
......
Command lines
$ cd /path/to/SMDK/src/test/mmap
$ ./run_zonebind_test.sh <ddr-only node id> <cxl-only node id> <ddr+cxl node id>
(Example) $ ./run_zonebind_test.sh 0 1 -1
Options
Options | Desc. |
---|---|
ddr-only node id | nid of a node containing only DDR memory. If there is no such node, use -1. |
cxl-only node id | nid of a node containing only CXL memory. If there is no such node, use -1. |
ddr+cxl node id | nid of a node containing both DDR and CXL memory. If there is no such node, use -1. |
Result
RUN TC4
1) target node: 0
2) numactl:
3) mmap: e
Expected Result: Node 0, zone ExMem
Before:
Node 0, zone ExMem 2 5 2 1 4 4 4 2 3 1 8183
After:
Node 0, zone ExMem 0 2 3 0 0 2 3 3 3 1 7682
Press any key to continue
...
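The Before and After rows in a result like the one above can be compared numerically to confirm that the pages really came out of the expected zone. The sketch below weights each column by its block size (column k counts free blocks of 2^k pages) and subtracts; `buddy_line_pages` is a hypothetical helper name.

```shell
# Hypothetical helper: total free pages for each buddyinfo line on stdin.
buddy_line_pages() {
    awk '{
        pages = 0
        # Field 5 holds order-0 blocks, field 6 order-1 blocks, and so on.
        for (i = 5; i <= NF; i++) pages += $i * 2 ^ (i - 5)
        printf "%d\n", pages
    }'
}

before=$(echo 'Node 0, zone ExMem 2 5 2 1 4 4 4 2 3 1 8183' | buddy_line_pages)
after=$(echo 'Node 0, zone ExMem 0 2 3 0 0 2 3 3 3 1 7682' | buddy_line_pages)
echo "pages taken from ExMem: $(( before - after ))"
```

For the TC4 output above this reports 513100 pages, which is roughly 2 GB if the page size is 4 KiB (an assumption, as the page size is not stated in the output).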
This test checks that CXL Swap works correctly in various swap-out/in scenarios and that the CXL Swap flush function works.
The detailed description and prerequisites for each test are written in each script file, so please read the comments in the script before you run a test.
1) run_cxlswap_storeload_test.sh
Test basic swap out/in data to/from CXL Swap. The data before swap out and the data after swap in must be the same.
Command lines
$ cd /path/to/SMDK/src/test/cxlswap/
$ ./run_cxlswap_storeload_test.sh
Result
Store Load Test Start
=======Test Info======
Process ID : 2034 / CXL Swap Enabled : Y
Test Name : store_load
Total Memory Size 512.00M / Memory Limit to 460.80M
======= RESULT =======
CXL Swap Stored Pages Before Swap : 81380
CXL Swap Stored Pages After Swap : 104871
====== PASS ======
...
2) run_cxlswap_multithread_test.sh
Test swap out/in of data to/from CXL Swap using multiple threads. Regardless of the thread, the data before swap out and the data after swap in must be the same.
Command lines
$ cd /path/to/SMDK/src/test/cxlswap/
$ ./run_cxlswap_multithread_test.sh
Result
Multi Thread Test Start
=======Test Info======
Process ID : 2072 / CXL Swap Enabled : Y
Test Name : multi_thread
Total Memory Size 1.00G / Memory Limit to 921.60M
======= RESULT =======
Elapsed Time 0.688225 using 10 threads
CXL Swap Stored Pages Before Swap : 81384
CXL Swap Stored Pages After Swap : 104861
====== PASS ======
...
3) run_cxlswap_sharedmemory_test.sh
Test swap out/in shared data to/from CXL Swap. The data before swap out and the data after swap in must be the same even using shared memory.
Command lines
$ cd /path/to/SMDK/src/test/cxlswap/
$ ./run_cxlswap_sharedmemory_test.sh
Result
Shared Memory Test Start
=======Test Info======
Process ID : 1980 / CXL Swap Enabled : Y
Test Name : shared_memory
Total Memory Size 512.00M / Memory Limit to 460.80M
Process 1980 Initialize Data [Shmid 0]...
Process 1992 Check Initialized Data [Shmid 0]...
Process 1992 Check Initialized Data [Shmid 0] Pass
Process 1992 Modify Data [Shmid 0]...
Process 1980 Check Modified Data [Shmid 0]...
Process 1980 Check Modified Data [Shmid 0] Pass
======= RESULT =======
CXL Swap Stored Pages Before Swap : 1
CXL Swap Stored Pages After Swap : 13801
====== PASS ======
...
4) run_cxlswap_flush_test.sh
Test CXL Swap Flush functionality. Note that even after a flush, a few pages can remain in CXL Swap. See the description in the script.
Command lines
$ cd /path/to/SMDK/src/test/cxlswap/
$ ./run_cxlswap_flush_test.sh
Result
Flush Test Start
Before Flush : 81373
After Flush : 5
Flush Test Finish
This test checks that registered CXL devices work correctly as DAX devices. The script takes the CXL device memory range offline, binds it to the DAX device, and checks through fio that it operates as a DAX device. Before running it, modify the number of devices and the device addresses in the script to match your system.
Note: In order to run the script below, you need to install fio in your system first. Please refer to fio GitHub for information related to the installation and usage of it.
Command lines
$ cd /path/to/SMDK/src/test/dax
$ vi ./run_dax_test.sh
# Change the number of devices and address of devices according to your system
NUM_DEVICE=3
ADDRESS=("1080000000-307fffffff" "3080000000-507fffffff" "5080000000-707fffffff")
# Download fio from https://github.com/axboe/fio.git
# Change FIO_PATH from /path/to to your system's path
FIO_PATH=/path/to/fio/
# After modifying the script
$ ./run_dax_test.sh
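Before running the script, it can be worth sanity-checking that each entry in ADDRESS covers the size you expect. The following sketch computes the size of a `start-end` range written in hexadecimal without a 0x prefix; `range_size_gib` is a hypothetical helper, not part of run_dax_test.sh.

```shell
# Hypothetical helper: size in GiB of a "start-end" range written in hex
# without a 0x prefix, as in the ADDRESS array above. The end is inclusive.
range_size_gib() {
    local start="${1%-*}" end="${1#*-}"
    echo $(( (16#$end - 16#$start + 1) >> 30 ))
}

for r in "1080000000-307fffffff" "3080000000-507fffffff" "5080000000-707fffffff"; do
    printf '%s -> %d GiB\n' "$r" "$(range_size_gib "$r")"
done
```

Each of the three example ranges works out to 128 GiB.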
Result
IOMEM
1080000000-307fffffff : hmem.1
1080000000-307fffffff : Soft Reserved
1080000000-307fffffff : hmem.0
1080000000-307fffffff : cxl
1080000000-307fffffff : System RAM (cxl)
[[Buddy Info]]
Node 0, zone DMA 0 0 0 0 0 0 0 0 1 1 3
Node 0, zone DMA32 6 6 4 6 5 3 4 5 3 5 436
Node 0, zone Normal 3 29 2 12 31 11 9 7 4 1 13831
Node 0, zone ExMem 0 0 0 0 0 0 0 0 0 0 32768
-----------------------------------------------------------------
IOMEM
1080000000-307fffffff : hmem.1
1080000000-307fffffff : Soft Reserved
1080000000-307fffffff : hmem.0
[[Buddy Info]]
Node 0, zone DMA 0 0 0 0 0 0 0 0 1 1 3
Node 0, zone DMA32 6 6 4 6 5 3 4 5 3 5 436
Node 0, zone Normal 1 2 14 63 109 28 3 3 1 0 13828
-----------------------------------------------------------------
FIO TEST
dev-dax-write: (g=0): rw=randwrite, bs=(R) 2048KiB-2048KiB, (W) 2048KiB-2048KiB, (T) 2048KiB-2048KiB, ioengine=dev-dax, iodepth=1
...
dev-dax-read: (g=1): rw=randread, bs=(R) 2048KiB-2048KiB, (W) 2048KiB-2048KiB, (T) 2048KiB-2048KiB, ioengine=dev-dax, iodepth=1
IOMEM
1080000000-307fffffff : hmem.1
1080000000-307fffffff : Soft Reserved
1080000000-307fffffff : hmem.0
1080000000-307fffffff : cxl
1080000000-307fffffff : System RAM (cxl)
[[Buddy Info]]
Node 0, zone DMA 0 0 0 0 0 0 0 0 1 1 3
Node 0, zone DMA32 6 6 4 6 5 3 4 5 3 5 436
Node 0, zone Normal 3 29 2 12 31 11 9 7 4 1 13831
Node 0, zone ExMem 0 0 0 0 0 0 0 0 0 0 32768
You can use system emulation with QEMU to check the operation of the SMDK kernel. First, build the QEMU.
$ cd /path/to/SMDK/lib/
$ ./build_lib.sh qemu
After downloading an Ubuntu ISO image file, update the ISO file path in /path/to/SMDK/lib/qemu/create_gui_image.sh, then run the script.
$ cd /path/to/SMDK/lib/qemu/
$ vi create_gui_image.sh # Update UBUNTU_ISO file path.
$ ./create_gui_image.sh
When the Ubuntu installation is finished, run the following command to boot to Ubuntu.
$ cd /path/to/SMDK/lib/qemu/
$ ./setup_gui_ssh.sh
After booting, update the APT repository if necessary and install the required packages, e.g., with $ sudo apt update and $ sudo apt install <packages e.g., openssh-server>.
Clone the SMDK repository from GitHub, then build and install the SMDK kernel. You can now emulate the SMDK kernel with the following script.
$ cd /path/to/SMDK/lib/qemu/
$ ./run_cxl_emu_gui.sh # default setting: 6 cores, 8GB RAM.
You can connect to the QEMU virtual machine through the QEMU monitor (port: 45454) and sshd (port: 2242) with the scripts below.
# Connect to QEMU Monitor
$ cd /path/to/SMDK/lib/qemu/
$ ./connect_monitor.sh
# Connect to sshd
$ cd /path/to/SMDK/lib/qemu/
$ ./connect_ssh.sh