net: use readv for reading frames from TAP device
Right now, we perform two copies when writing a frame from the TAP
device into guest memory. We first read the frame into an array held by
the Net device and then copy that array into a DescriptorChain.

In order to avoid the double copy, use the readv system call to read
directly from the TAP device into the buffers described by the
DescriptorChain.
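
A minimal sketch of the scatter-read idea, not the code from this commit. It uses only the Rust standard library; `read_frame_scattered` and its `bufs` parameter are hypothetical names, with plain byte slices standing in for the guest buffers of a DescriptorChain. On Linux, `read_vectored` on a `std::fs::File` is, to my understanding, backed by readv(2).

```rust
use std::io::{self, IoSliceMut, Read};

/// Reads from `src` directly into the caller's buffers, filling them in
/// order, and returns the number of bytes read. `bufs` stands in for the
/// guest memory regions described by a descriptor chain (an assumption
/// made for this sketch).
fn read_frame_scattered<R: Read>(src: &mut R, bufs: &mut [&mut [u8]]) -> io::Result<usize> {
    // One IoSliceMut per destination buffer: the equivalent of an iovec array.
    let mut iov: Vec<_> = bufs.iter_mut().map(|b| IoSliceMut::new(&mut **b)).collect();
    // A single vectored read; no intermediate frame-sized array is involved.
    src.read_vectored(&mut iov)
}
```

The caller passes the destination slices in chain order and gets back the frame length, so nothing needs to be copied afterwards.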

The main challenge with this is that DescriptorChain objects describe
memory that is at least 65562 bytes long when guest TSO4, TSO6 or UFO
are enabled, or 1526 bytes otherwise. Parsing the chain incurs overhead
that we pay even if the frame we are receiving is much smaller than
these sizes.
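
One way to read the two sizes above, as a sketch under stated assumptions: a 12-byte virtio-net header plus a 14-byte Ethernet header plus the largest payload (65536 bytes with offloads, 1500 without). The decomposition and the names `min_rx_buffer_len` and `guest_offloads` are my assumptions, not taken from the commit.

```rust
/// Length of the virtio-net header preceding each frame (assumed 12 bytes).
const VNET_HDR_LEN: usize = 12;
/// Length of an Ethernet header without VLAN tag.
const ETH_HDR_LEN: usize = 14;

/// Minimum RX buffer capacity the guest is expected to provide.
/// `guest_offloads` is a hypothetical flag: true when guest TSO4, TSO6
/// or UFO has been negotiated.
fn min_rx_buffer_len(guest_offloads: bool) -> usize {
    let max_payload = if guest_offloads { 65536 } else { 1500 };
    VNET_HDR_LEN + ETH_HDR_LEN + max_payload
}
```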

PR firecracker-microvm#4748 reduced
the overheads involved with parsing DescriptorChain objects. To further
avoid this overhead, move the parsing of DescriptorChain objects out of
the hot path of process_rx() where we are actually receiving a frame
into process_rx_queue_event() where we get the notification that the
guest added new buffers for network RX.
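
The split between the two paths can be sketched as follows. This is an illustrative model only: `ParsedBuffer`, `RxBuffers`, `on_queue_event` and `on_rx` are hypothetical names, and chain parsing is reduced to summing segment lengths.

```rust
use std::collections::VecDeque;

/// A guest RX buffer whose descriptor chain has already been parsed
/// (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
struct ParsedBuffer {
    head_index: u16,
    capacity: usize,
}

#[derive(Default)]
struct RxBuffers {
    ready: VecDeque<ParsedBuffer>,
}

impl RxBuffers {
    /// Cold path: runs when the guest signals that it added RX buffers.
    /// Each item is (head index, segment lengths) of one new chain.
    fn on_queue_event(&mut self, chains: impl IntoIterator<Item = (u16, Vec<usize>)>) {
        for (head_index, segment_lens) in chains {
            let capacity: usize = segment_lens.iter().sum();
            self.ready.push_back(ParsedBuffer { head_index, capacity });
        }
    }

    /// Hot path: runs per received frame and only takes a buffer that was
    /// parsed earlier. Returns None if no buffer is ready or the next one
    /// is too small for the frame.
    fn on_rx(&mut self, frame_len: usize) -> Option<ParsedBuffer> {
        let fits = self.ready.front().map_or(false, |b| b.capacity >= frame_len);
        if fits { self.ready.pop_front() } else { None }
    }
}
```

The parsing cost is paid once per guest notification, so the per-frame path does no chain walking in this model.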

Signed-off-by: Babis Chalios <bchalios@amazon.es>
bchalios authored and ShadowCurse committed Nov 5, 2024
1 parent 0023de3 commit ce60145
Showing 6 changed files with 557 additions and 297 deletions.
102 changes: 102 additions & 0 deletions resources/seccomp/aarch64-unknown-linux-musl.json
@@ -32,6 +32,108 @@
"syscall": "writev",
"comment": "Used by the VirtIO net device to write to tap"
},
{
"syscall": "readv",
"comment": "Used by the VirtIO net device to read from tap"
},
{
"syscall": "memfd_create",
"comment": "Used by the IovDeque implementation"
},
{
"syscall": "fcntl",
"comment": "Used by the IovDeque implementation",
"args": [
{
"index": 1,
"type": "dword",
"op": "eq",
"val": 1033,
"comment": "F_ADD_SEALS"
},
{
"index": 2,
"type": "dword",
"op": "eq",
"val": 6,
"comment": "F_SEAL_SHRINK|F_SEAL_GROW"
}
]
},
{
"syscall": "fcntl",
"comment": "Used by the IovDeque implementation",
"args": [
{
"index": 1,
"type": "dword",
"op": "eq",
"val": 1033,
"comment": "F_ADD_SEALS"
},
{
"index": 2,
"type": "dword",
"op": "eq",
"val": 1,
"comment": "F_SEAL_SEAL"
}
]
},
{
"syscall": "mmap",
"comment": "Used by the IovDeque implementation",
"args": [
{
"index": 1,
"type": "dword",
"op": "eq",
"val": 4096,
"comment": "Page size allocation"
},
{
"index": 2,
"type": "dword",
"op": "eq",
"val": 3,
"comment": "PROT_READ|PROT_WRITE"
},
{
"index": 3,
"type": "dword",
"op": "eq",
"val": 17,
"comment": "MAP_SHARED|MAP_FIXED"
}
]
},
{
"syscall": "mmap",
"comment": "Used by the IovDeque implementation",
"args": [
{
"index": 1,
"type": "dword",
"op": "eq",
"val": 8192,
"comment": "2 pages allocation"
},
{
"index": 2,
"type": "dword",
"op": "eq",
"val": 0,
"comment": "PROT_NONE"
},
{
"index": 3,
"type": "dword",
"op": "eq",
"val": 34,
"comment": "MAP_PRIVATE|MAP_ANONYMOUS"
}
]
},
{
"syscall": "fsync"
},
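
The numeric `val` fields in the rules above can be checked against the Linux constants they encode. The sketch below hard-codes the values as I know them from the kernel UAPI headers for x86_64 and aarch64; `mmap_flag_names` is a hypothetical helper that only knows these four mmap flags. Note that 1033 is the value of `F_ADD_SEALS` (1024 + 9).

```rust
// Linux constants, hard-coded for this sketch.
const F_ADD_SEALS: u32 = 1024 + 9; // F_LINUX_SPECIFIC_BASE + 9
const F_SEAL_SEAL: u32 = 0x1;
const F_SEAL_SHRINK: u32 = 0x2;
const F_SEAL_GROW: u32 = 0x4;
const PROT_NONE: u32 = 0x0;
const PROT_READ: u32 = 0x1;
const PROT_WRITE: u32 = 0x2;
const MAP_SHARED: u32 = 0x01;
const MAP_PRIVATE: u32 = 0x02;
const MAP_FIXED: u32 = 0x10;
const MAP_ANONYMOUS: u32 = 0x20;

/// Returns the names of the mmap flags set in `val`, in table order.
fn mmap_flag_names(val: u32) -> Vec<&'static str> {
    [
        (MAP_SHARED, "MAP_SHARED"),
        (MAP_PRIVATE, "MAP_PRIVATE"),
        (MAP_FIXED, "MAP_FIXED"),
        (MAP_ANONYMOUS, "MAP_ANONYMOUS"),
    ]
    .iter()
    .filter(|(bit, _)| val & *bit != 0)
    .map(|(_, name)| *name)
    .collect()
}
```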
102 changes: 102 additions & 0 deletions resources/seccomp/x86_64-unknown-linux-musl.json
@@ -32,6 +32,108 @@
"syscall": "writev",
"comment": "Used by the VirtIO net device to write to tap"
},
{
"syscall": "readv",
"comment": "Used by the VirtIO net device to read from tap"
},
{
"syscall": "memfd_create",
"comment": "Used by the IovDeque implementation"
},
{
"syscall": "fcntl",
"comment": "Used by the IovDeque implementation",
"args": [
{
"index": 1,
"type": "dword",
"op": "eq",
"val": 1033,
"comment": "F_ADD_SEALS"
},
{
"index": 2,
"type": "dword",
"op": "eq",
"val": 6,
"comment": "F_SEAL_SHRINK|F_SEAL_GROW"
}
]
},
{
"syscall": "fcntl",
"comment": "Used by the IovDeque implementation",
"args": [
{
"index": 1,
"type": "dword",
"op": "eq",
"val": 1033,
"comment": "F_ADD_SEALS"
},
{
"index": 2,
"type": "dword",
"op": "eq",
"val": 1,
"comment": "F_SEAL_SEAL"
}
]
},
{
"syscall": "mmap",
"comment": "Used by the IovDeque implementation",
"args": [
{
"index": 1,
"type": "dword",
"op": "eq",
"val": 4096,
"comment": "Page size allocation"
},
{
"index": 2,
"type": "dword",
"op": "eq",
"val": 3,
"comment": "PROT_READ|PROT_WRITE"
},
{
"index": 3,
"type": "dword",
"op": "eq",
"val": 17,
"comment": "MAP_SHARED|MAP_FIXED"
}
]
},
{
"syscall": "mmap",
"comment": "Used by the IovDeque implementation",
"args": [
{
"index": 1,
"type": "dword",
"op": "eq",
"val": 8192,
"comment": "2 pages allocation"
},
{
"index": 2,
"type": "dword",
"op": "eq",
"val": 0,
"comment": "PROT_NONE"
},
{
"index": 3,
"type": "dword",
"op": "eq",
"val": 34,
"comment": "MAP_PRIVATE|MAP_ANONYMOUS"
}
]
},
{
"syscall": "fsync"
},
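
To illustrate how the `args` conditions in these rules restrict a syscall, here is a simplified model rather than the actual filter compiler. `ArgEq` and `rule_matches` are hypothetical names, and `"dword"` is assumed to mean that only the low 32 bits of the argument are compared.

```rust
/// One "eq" condition on a syscall argument (hypothetical, simplified type).
struct ArgEq {
    index: usize,
    val: u32,
}

/// Returns true when every condition holds for the given syscall arguments.
/// A "dword" comparison is modeled as comparing the low 32 bits only.
fn rule_matches(conds: &[ArgEq], args: &[u64; 6]) -> bool {
    conds.iter().all(|c| (args[c.index] as u32) == c.val)
}
```

Under this model, an `mmap` call is allowed by the single-page rule only if its length, protection and flags arguments are exactly 4096, 3 and 17; any other combination falls through to the remaining rules.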