Lock-Free Programming #14
Labels: enhancement, good first issue, help wanted
Most C++ developers have heard of `std::mutex` and `std::shared_mutex`, and some may have implemented their own oversimplified versions using `std::atomic`. However, there have been few attempts to design a scalable shared mutex that works efficiently on modern 100+ core NUMA systems.

To understand the foundational problems causing these limitations, one should study memory contention and profile how the system behaves when many threads attempt to read or modify the same memory address. Before attempting to design high-level STL-like abstractions, one should understand how some key instructions operate:

- x86:
  - `LOCK XADD`, `LOCK CMPXCHG`, and `PAUSE` for spinlocks and synchronization
  - `MFENCE`, `SFENCE`, `LFENCE`
  - `XBEGIN`/`XEND` (TSX)
- Arm:
  - `LDXR`/`STXR` and atomic compare-and-swap (CAS)
  - `DMB`, `DSB`, `ISB`
  - `YIELD`
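
To make the instruction list above more concrete, here is a minimal sketch of a test-and-test-and-set spinlock written with `std::atomic`. The names `cpu_relax`, `SpinLock`, and `count_with_spinlock` are hypothetical and exist only for this illustration. The comments about which instructions the compiler emits describe typical code generation, not a guarantee; checking the disassembly on your own target is part of the exercise.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

#if defined(__x86_64__) || defined(_M_X64)
#include <immintrin.h>
#endif

// Hypothetical helper: tell the core we are inside a spin-wait loop.
inline void cpu_relax() noexcept {
#if defined(__x86_64__) || defined(_M_X64)
    _mm_pause(); // emits PAUSE
#elif defined(__aarch64__) && (defined(__GNUC__) || defined(__clang__))
    __asm__ __volatile__("yield" ::: "memory"); // emits YIELD
#else
    std::this_thread::yield(); // fallback: an OS-level yield, not a CPU hint
#endif
}

// Hypothetical test-and-test-and-set spinlock.
class SpinLock {
    std::atomic<bool> locked_{false};

  public:
    void lock() noexcept {
        for (;;) {
            bool expected = false;
            // Atomic compare-and-swap. Typically LOCK CMPXCHG on x86-64 and
            // either an exclusive load/store loop or a CAS instruction on
            // AArch64, depending on the compiler and target flags.
            if (locked_.compare_exchange_weak(expected, true,
                                              std::memory_order_acquire,
                                              std::memory_order_relaxed))
                return;
            // Wait on plain loads so that waiters do not keep requesting
            // exclusive ownership of the cache line.
            while (locked_.load(std::memory_order_relaxed))
                cpu_relax();
        }
    }

    void unlock() noexcept {
        // Release pairs with the acquire in lock().
        locked_.store(false, std::memory_order_release);
    }
};

// Increments a plain integer under the spinlock from several threads
// and returns the final value.
std::uint64_t count_with_spinlock(unsigned threads, unsigned iters) {
    assert(threads > 0);
    SpinLock lock;
    std::uint64_t counter = 0; // protected by `lock`
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < threads; ++t)
        pool.emplace_back([&] {
            for (unsigned i = 0; i < iters; ++i) {
                std::lock_guard<SpinLock> guard(lock);
                ++counter;
            }
        });
    for (auto& th : pool)
        th.join();
    return counter;
}
```

Calling `count_with_spinlock(4, 10000)` should return `40000` if the lock provides mutual exclusion. Timing the same call with a growing thread count is one simple way to start observing the contention described above.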

Once the low-level profiling is complete, the next step could be to explore implementations of advanced concurrency primitives proposed in the Concurrency TS, like the Distributed Counters in P0261 or Byte-wise atomic `memcpy` in P1478.

This topic could also serve as the foundation for a research paper on concurrency primitives, especially for those pursuing a master’s or PhD in Systems Programming.
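
As a possible starting point for the distributed counters mentioned above, the sketch below spreads increments across cache-line-aligned shards so that threads rarely write to the same address. It is not the interface proposed in P0261; `ShardedCounter` and `count_sharded` are hypothetical names, the 64-byte alignment is an assumed cache-line size, and C++17 is assumed so that the over-aligned shards are allocated correctly.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical sharded counter: writers touch one shard, readers sum all.
class ShardedCounter {
    // Assumption: 64-byte cache lines. Alignment keeps shards from
    // sharing a line (false sharing).
    struct alignas(64) Shard {
        std::atomic<std::uint64_t> value{0};
    };
    std::vector<Shard> shards_;

  public:
    explicit ShardedCounter(std::size_t shards) : shards_(shards) {
        assert(shards > 0);
    }

    void add(std::uint64_t delta = 1) noexcept {
        // Pick a shard from the thread id once per thread. Two threads
        // may still collide on a shard; the result stays correct, only
        // slower.
        thread_local const std::size_t hash =
            std::hash<std::thread::id>{}(std::this_thread::get_id());
        // Atomic read-modify-write (a LOCK-prefixed instruction on x86-64).
        shards_[hash % shards_.size()].value.fetch_add(
            delta, std::memory_order_relaxed);
    }

    // Not an atomic snapshot: the sum is exact only when no writer is
    // running concurrently, for example after joining all threads.
    std::uint64_t load() const noexcept {
        std::uint64_t total = 0;
        for (const Shard& s : shards_)
            total += s.value.load(std::memory_order_relaxed);
        return total;
    }
};

// Increments the counter from several threads and returns the total.
std::uint64_t count_sharded(unsigned threads, unsigned iters,
                            std::size_t shards) {
    ShardedCounter counter(shards);
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < threads; ++t)
        pool.emplace_back([&] {
            for (unsigned i = 0; i < iters; ++i)
                counter.add();
        });
    for (auto& th : pool)
        th.join();
    return counter.load();
}
```

Comparing `count_sharded(threads, iters, 1)` against a larger shard count under a profiler gives a rough view of how much time goes to contention on a single address. The trade-off is that reads become more expensive and only approximate while writers are active.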