This is the open-source repository for our paper TeRM: Extending RDMA-Attached Memory with SSD, published at FAST'24 and in ACM Transactions on Storage.
Note that the codename of TeRM is PDP (one step beyond ODP).
TeRM
|---- ae                  # artifact evaluation files
    |---- bin             # binaries generated by source code
    |---- scripts         # common scripts
    |---- figure-*.py     # scripts to execute experiments
    |---- run-all.sh      # run all experiments
|---- app                 # source code of octopus and xstore with some bugs fixed
|---- driver              # TeRM's driver
    |---- driver.patch    # patches to the official driver
    |---- mlnx-*.zip      # the patched driver
|---- libterm             # TeRM's userspace shared library
- OS: Ubuntu 22.04.2 LTS
- Kernel: Linux 5.19.0-50-generic
- OFED driver: 5.8-2.0.3
We recommend using the same environment as in our development. You may need to customize the source code for a different environment. The environment matters mainly for the driver, and the patched driver is needed only on the server side.
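Before building, it can help to confirm that a machine matches the kernel listed above. The following is a minimal sketch, not part of the repository: `check_kernel` is a hypothetical helper name, and the expected string is simply the kernel version from the list above.

```shell
# Hypothetical helper: compare the running kernel with the version listed above.
# $1 (optional): kernel string to check instead of `uname -r`, handy for dry runs.
check_kernel() {
    expected="5.19.0-50-generic"
    actual="${1:-$(uname -r)}"
    if [ "$actual" = "$expected" ]; then
        echo "kernel OK: $actual"
    else
        echo "kernel differs: running $actual, tested with $expected"
        return 1
    fi
}

# The OS release and OFED version can be inspected by hand in a similar way:
#   lsb_release -ds   # prints the distribution description
#   ofed_info -s      # shipped with MLNX_OFED, prints the installed driver version
```

A mismatch does not necessarily mean the build will fail; it only signals that the driver patches may need adapting.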
sudo apt install libfmt-dev libaio-dev libboost-coroutine-dev libmemcached-dev libgoogle-glog-dev libgflags-dev
We hard-coded some settings in the source code. Please modify them to match your cluster.
- memcached. TeRM uses memcached to synchronize cluster metadata. Please install memcached in your cluster and modify the IP address and port in ae/scripts/reset-memc.sh, libterm/ibverbs-pdp/global.cc, and libterm/include/node.hh.
- CPU affinity. The source code is in class Schedule of file libterm/include/util.hh. Please modify the constants according to your CPU hardware.
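To locate the memcached settings in the three files named above, a search along the following lines may help. This is a sketch under an assumption: it presumes the address is written as a dotted IPv4 literal, which we have not verified against the source, and `find_memc_endpoints` is a hypothetical helper name. The port is expected to be defined near each match.

```shell
# Hypothetical helper: list lines that contain an IPv4-looking literal
# in the three files that hold the memcached settings.
# $1 (optional): repository root, defaults to the current directory.
find_memc_endpoints() {
    root="${1:-.}"
    grep -nE '([0-9]{1,3}\.){3}[0-9]{1,3}' \
        "$root/ae/scripts/reset-memc.sh" \
        "$root/libterm/ibverbs-pdp/global.cc" \
        "$root/libterm/include/node.hh"
}
```

Each output line has the form file:line:content, so the matches can be opened directly in an editor. If the address is stored as a hostname instead, the search finds nothing and the files need to be read by hand.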
The patched driver is required on the server side. There are two ways to build it; the second uses an out-of-the-box driver zip file that we provide.
- Download the source code of the driver from the official website. Apply the official backport patches first and then apply the modifications listed in driver/driver.patch. Then, build the driver. Please note that we apply the minimum number of backport patches that make the driver work in our environment, instead of all of them. Do not run git apply on driver/driver.patch directly, because line numbers may differ. Read the patch and apply its changes manually.
- Use driver/mlnx-ofed-kernel-5.8-2.0.3.0.zip. Unzip it and run the contained build.sh.
We provide CMakeLists.txt for building.
It produces two outputs, the userspace shared library libpdp.so and a program perf.
Please copy both files to ae/bin before running the AE scripts.
$ cd libterm
$ mkdir -p build && cd build
$ cmake .. -DCMAKE_BUILD_TYPE=Release # Release for compiler optimizations and high performance
$ make -j
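Once the build finishes, copying the two outputs into ae/bin can be sketched as below. This is an illustration rather than a script from the repository: `install_term_artifacts` is a hypothetical helper, and it assumes that libpdp.so and perf end up at the top level of the build directory, which may differ depending on the CMake configuration.

```shell
# Hypothetical helper: copy the two build outputs into the AE binary directory.
# $1: build directory (assumed to hold libpdp.so and perf at its top level)
# $2: destination directory, i.e. ae/bin in the repository
install_term_artifacts() {
    build_dir="$1"
    dest_dir="$2"
    # Refuse to copy a partial set of outputs.
    for f in libpdp.so perf; do
        if [ ! -f "$build_dir/$f" ]; then
            echo "missing build output: $build_dir/$f" >&2
            return 1
        fi
    done
    mkdir -p "$dest_dir" && cp "$build_dir/libpdp.so" "$build_dir/perf" "$dest_dir/"
}
```

From libterm/build, the call would be `install_term_artifacts . ../../ae/bin`.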
- Replace the driver *.ko files on the server side with the modified ones and restart the openibd service.
- Restart the memcached instance. We provide a script ae/scripts/reset-memc.sh to do so.
- mmap an SSD in the RDMA program with MAP_SHARED and ibv_reg_mr the memory area as an ODP MR.
- Set LD_PRELOAD=libpdp.so on all nodes to enable TeRM. Also set the environment variables PDP_server_mmap_dev=nvmeXnY for the SSD backend and PDP_server_memory_gb=Z for the size of the mapped area. Set PDP_is_server=1 on the server side only.
- Run the RDMA application.
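The environment settings from the steps above can be assembled with a small helper such as the one below. It is a sketch with stated assumptions: `term_env` is a hypothetical name, the device and size values are placeholders, and the helper sets the two PDP_server_* variables on every node, which is one reading of the step above. Clients leave PDP_is_server unset.

```shell
# Hypothetical helper: print the environment assignments for one node.
# $1: role, "server" or "client"
# $2: SSD device name in the nvmeXnY form (placeholder value)
# $3: size of the mapped area in GB (placeholder value)
# $4 (optional): path to libpdp.so, defaults to the bare library name
term_env() {
    role="$1"; dev="$2"; gb="$3"; lib="${4:-libpdp.so}"
    printf 'LD_PRELOAD=%s PDP_server_mmap_dev=%s PDP_server_memory_gb=%s' "$lib" "$dev" "$gb"
    # PDP_is_server=1 is added on the server only.
    if [ "$role" = "server" ]; then
        printf ' PDP_is_server=1'
    fi
    printf '\n'
}
```

A server could then be started with `env $(term_env server nvme0n1 64) ./rdma_server`, where `./rdma_server` stands for your own RDMA application. The command substitution is left unquoted on purpose so that each assignment becomes a separate argument to `env`; this breaks if the library path contains spaces.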
libterm accepts a series of environment variables for configuration. Please refer to libterm/ibverbs-pdp/global.cc for more details.
If you have further questions about the repository or are interested in it, please feel free to open an issue or contact me via email (yangzhe.ac AT outlook.com). You can find my GitHub at yzim.
To cite our paper:
@inproceedings {fast24-term,
author = {Zhe Yang and Qing Wang and Xiaojian Liao and Youyou Lu and Keji Huang and Jiwu Shu},
title = {{TeRM}: Extending {RDMA-Attached} Memory with {SSD}},
booktitle = {22nd USENIX Conference on File and Storage Technologies (FAST 24)},
year = {2024},
isbn = {978-1-939133-38-0},
address = {Santa Clara, CA},
pages = {1--16},
url = {https://www.usenix.org/conference/fast24/presentation/yang-zhe},
publisher = {USENIX Association},
month = feb
}
@article{tos24-term,
author = {Yang, Zhe and Wang, Qing and Liao, Xiaojian and Lu, Youyou and Huang, Keji and Shu, Jiwu},
title = {Efficiently Enlarging RDMA-Attached Memory with SSD},
year = {2024},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
issn = {1553-3077},
url = {https://doi.org/10.1145/3700772},
doi = {10.1145/3700772},
journal = {ACM Trans. Storage},
month = oct
}