NBSNEBIUS-65: add RDMA read/write sequence diagrams
budevg committed Jan 17, 2024
1 parent eb07416 commit 8226341
Showing 4 changed files with 40 additions and 6 deletions.
15 changes: 9 additions & 6 deletions doc/rdma/architecture.md
@@ -1,4 +1,7 @@
# RDMA architecture
+## Read/Write sequence
+![Read/Write Sequence](diagrams/rw_sequence.svg)
+
## [RDMA library](../../cloud/blockstore/libs/rdma)

RDMA library provides interface to communicate
@@ -9,10 +12,10 @@ and `Server`. It allows multiple Clients to connect to a single Server.

Single client object (created using `NRdma::CreateClient`) can connect to multiple
servers using `StartEndpoint`. After successful connection `Endpoint` can be
-used to allocate request (`AllocateRequest`) and sending it (`SendRequest`).
+used to allocate request (`AllocateRequest`) and send it (`SendRequest`).

Each endpoint pre-allocates `QueueDepth` buffers for request/response
-metadata. Additional buffers for request/response data is allocated on demand
+metadata. Additional buffers for request/response data are allocated on demand
using `SendBuffers` or `RecvBuffers` buffer pools kept for each endpoint.

Calling `AllocateRequest` on endpoint will allocate enough buffers for the
@@ -75,12 +78,12 @@ connections. Once connection is established the rdma_target will handle
`Read/WriteDeviceBlocksRequest`.

For `WriteDeviceBlocksRequest`, it constructs `WriteBlocksRequest` using the
-same data buffers and writes data using `StorageAdaptor`.
+same data buffers and writes data using `StorageAdapter`.

-For `ReadDeviceBlocksRequest`, it uses `StorageAdaptor` to read data into
+For `ReadDeviceBlocksRequest`, it uses `StorageAdapter` to read data into
`ReadBlocksRequest` and then converts response to `ReadDeviceBlocksResponse`.

-There is extra buffer copy inside the `StorageAdaptor`. We use aio for
-read/write and it require page_size aligned buffers. When we call
+There is extra buffer copy inside the `StorageAdapter`. We use aio for
+read/write and it requires page_size aligned buffers. When we call
`device->Read/WriteBlocks`, the device will allocate aligned buffer, and copy
data to/from it.
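
The extra copy described in the last paragraph of the doc can be pictured with a small sketch. This is not the `StorageAdapter` code: `AlignUp`, `CopyToAligned`, and `AlignedBuffer` are hypothetical names used only for illustration, and the sketch assumes a POSIX system where `posix_memalign` is available. It shows the general pattern of allocating a page-aligned bounce buffer and copying the payload into it before handing it to an aligned-I/O interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <memory>
#include <unistd.h>

// Hypothetical helpers for illustration; not part of the blockstore sources.
struct FreeDeleter {
    void operator()(void* p) const { std::free(p); }
};
using AlignedBuffer = std::unique_ptr<char, FreeDeleter>;

// Round size up to a multiple of alignment (alignment must be a power of two).
inline std::size_t AlignUp(std::size_t size, std::size_t alignment) {
    return (size + alignment - 1) & ~(alignment - 1);
}

// Allocate a page-aligned buffer and copy the payload into it.
// Returns nullptr if the allocation fails.
inline AlignedBuffer CopyToAligned(
    const char* data, std::size_t size, std::size_t pageSize)
{
    void* ptr = nullptr;
    if (posix_memalign(&ptr, pageSize, AlignUp(size, pageSize)) != 0) {
        return nullptr;
    }
    std::memcpy(ptr, data, size);  // this is the "extra buffer copy"
    return AlignedBuffer(static_cast<char*>(ptr));
}
```

For a write, the copy goes from the request buffer into the aligned buffer before the I/O; for a read, the same copy runs in the opposite direction after the I/O completes.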
7 changes: 7 additions & 0 deletions doc/rdma/diagrams/build_diagrams.sh
@@ -0,0 +1,7 @@
#!/bin/bash

# see https://github.com/mermaid-js/mermaid-cli for installation instructions

set -x

mmdc -i rw_sequence.mmd -o rw_sequence.svg
23 changes: 23 additions & 0 deletions doc/rdma/diagrams/rw_sequence.mmd
@@ -0,0 +1,23 @@
sequenceDiagram
box Blockstore Server
participant P as Partition
participant C as Rdma Client
end
box Disk Agent
participant S as Rdma Server
participant T as Rdma Target
end

Note over P: Receive<br/>Read/WriteBlocksRequest
P-->>C: Serialize<br/>Read/WriteDeviceBlocksRequest<br/>(WRITE: COPY DATA BUFFER)
C->>S: IBV_WR_SEND<br/>(HEADER)
S->>C: IBV_WR_RDMA_READ<br/>(READ: MSG, WRITE: MSG + DATA)
S-->>T: Deserialize<br/>Read/WriteDeviceBlocksRequest
Note over T: Construct<br/>Read/WriteBlocksRequest
Note over T: AIO Read/Write<br/>(COPY DATA BUFFER)
Note over T: Read/WriteBlocksResponse
T-->>S: Serialize<br/>Read/WriteDeviceBlocksResponse
S->>C: IBV_WR_RDMA_WRITE<br/>(READ: MSG + DATA, WRITE: MSG)
S->>C: IBV_WR_SEND<br/>(HEADER)
C-->>P: Deserialize<br/>Read/WriteDeviceBlocksResponse<br/>(READ: COPY DATA BUFFER)
Note over P: Complete<br/>Read/WriteBlocksResponse
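
The four solid arrows between Rdma Client and Rdma Server in the diagram above can be summarized as a small table in code. This is only a reading aid under stated assumptions: `EOpcode`, `EPeer`, `TWireStep`, and `WireSteps` are hypothetical names, not types from the RDMA library, and the enum values merely mirror the `IBV_WR_SEND`, `IBV_WR_RDMA_READ`, and `IBV_WR_RDMA_WRITE` labels used in the diagram.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of the wire-level steps; illustrative names only.
enum class EOpcode { Send, RdmaRead, RdmaWrite };
enum class EPeer { Client, Server };

struct TWireStep {
    EPeer Initiator;   // side that posts the work request
    EOpcode Opcode;    // verb labelled on the arrow
    bool CarriesData;  // true when block data travels with the message
};

// Returns the client/server transfers in diagram order.
inline std::vector<TWireStep> WireSteps(bool isWrite) {
    return {
        {EPeer::Client, EOpcode::Send, false},          // request header
        {EPeer::Server, EOpcode::RdmaRead, isWrite},    // request msg (+ data on write)
        {EPeer::Server, EOpcode::RdmaWrite, !isWrite},  // response msg (+ data on read)
        {EPeer::Server, EOpcode::Send, false},          // response header
    };
}
```

Read this way, block data crosses the wire exactly once per request: inside the `IBV_WR_RDMA_READ` step for writes and inside the `IBV_WR_RDMA_WRITE` step for reads.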