This project demonstrates a performance comparison between memory-mapped I/O (mmap()
) and the read()
system call in C programming. The focus is on scenarios where random access to large files is required, showcasing how mmap()
can outperform read()
under specific conditions.
To test and analyze the performance of mmap()
and read()
by:
- Measuring execution time, context switches, and page faults.
- Understanding the scenarios where
mmap()
provides significant advantages overread()
.
- GCC or compatible C compiler
- Linux system with
perf
installed - A 1 GB or larger file for testing
Generate a large file (e.g., 1 GB) to test the performance:
dd if=/dev/urandom of=large_file.txt bs=1M count=1024
Compile the mmap_test.c
and read_test.c
files:
gcc -o mmap_test mmap_test.c
gcc -o read_test read_test.c
Execute both programs with perf
to gather performance data:
sudo perf stat -e context-switches,page-faults,cpu-migrations,cycles,instructions ./mmap_test large_file.txt
sudo perf stat -e context-switches,page-faults,cpu-migrations,cycles,instructions ./read_test large_file.txt
Alternatively, use the provided benchmark.sh
script to automate the process:
chmod +x benchmark.sh
./benchmark.sh
Review the formatted output, which includes:
- Execution time (elapsed time for program completion)
- Context switches (how often the CPU switches between tasks)
- Page faults (memory access issues handled by the kernel)
- Kernel vs. user time (time spent in kernel-mode vs user-mode)
Below is an example of the formatted output from running the tests:
----------------------------------------------------------
| Metric | mmap_test | read_test | Observations |
----------------------------------------------------------
| Execution Time | 0.812s | 1.531s | mmap is faster for random access |
| Context Switches | 4 | 7 | mmap avoids repeated system calls |
| Page Faults | 16426 | 42 | mmap triggers on-demand paging |
| Kernel Time | 0.064s | 0.675s | read spends more time in kernel |
----------------------------------------------------------
mmap()
is expected to outperformread()
for random access due to:- Reduced System Calls:
mmap()
maps the file into memory once, allowing subsequent access directly in user space. - Efficient Random Access: Accessing random locations in memory is faster than repeatedly seeking and reading from disk.
- Reduced System Calls:
The benchmark.sh
script cleans up all temporary files after execution, including:
- The compiled binaries (
mmap_test
,read_test
). - The large test file (
large_file.txt
).
If running manually, you can clean up with:
rm mmap_test read_test large_file.txt
This project is licensed under the MIT License. See the LICENSE file for details.