Netmap - a framework for fast packet I/O
VALE - a Virtual Local Ethernet using the netmap API
========================================================================
NETMAP is a framework for very fast packet I/O from userspace.
VALE is an equally fast in-kernel software switch using the netmap API.
Both are implemented as a single kernel module for FreeBSD, Linux and,
since summer 2015, Windows.

Netmap/VALE can handle tens of millions of packets per second, matching
the speed of 10G and 40G ports even with minimum-sized frames.

See details at

    http://info.iet.unipi.it/~luigi/netmap/

This repository, hosted at https://github.com/luigirizzo/netmap , contains
source code (BSD-Copyright) for FreeBSD, Linux and Windows.
Note that recent FreeBSD distributions already include both NETMAP and VALE.

A netmap tutorial is available at https://github.com/vmaffione/netmap-tutorial.

What is this good for
---------------------
Netmap is mostly useful for userspace applications that must deal with raw
packets: traffic generators, sinks, monitors, loggers, software switches
and routers, generic middleboxes, interconnection of virtual machines.
The examples/ directory includes pkt-gen.c (a fast traffic generator/receiver)
and bridge.c, a simple bidirectional interconnect between two ports.
The kernel module itself implements a learning ethernet bridge.
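
A learning bridge remembers on which port each source MAC address was
last seen and uses that table to forward later frames. The sketch below
is a generic illustration of that technique, not the code used by the
kernel module; the table size, the hash function and all toy_* names are
assumptions made for this example.

```c
#include <stdint.h>
#include <string.h>

#define TOY_TABLE_SIZE 64    /* hypothetical forwarding table size */
#define TOY_PORT_FLOOD 0xff  /* "destination unknown: send to all ports" */

struct toy_fdb_entry {
    uint8_t mac[6];
    uint8_t port;
    uint8_t valid;
};

static struct toy_fdb_entry toy_fdb[TOY_TABLE_SIZE];

/* Hash a MAC address into a table index (illustrative hash only). */
static unsigned toy_hash(const uint8_t mac[6])
{
    unsigned h = 0;
    for (int i = 0; i < 6; i++)
        h = h * 31 + mac[i];
    return h % TOY_TABLE_SIZE;
}

/* Learning step: record that src_mac was last seen on in_port.
 * A colliding entry is simply overwritten in this sketch. */
static void toy_learn(const uint8_t src_mac[6], uint8_t in_port)
{
    struct toy_fdb_entry *e = &toy_fdb[toy_hash(src_mac)];
    memcpy(e->mac, src_mac, 6);
    e->port = in_port;
    e->valid = 1;
}

/* Forwarding step: return the learned port, or TOY_PORT_FLOOD for
 * broadcast/multicast destinations and for addresses not yet learned. */
static uint8_t toy_lookup(const uint8_t dst_mac[6])
{
    const struct toy_fdb_entry *e = &toy_fdb[toy_hash(dst_mac)];
    if ((dst_mac[0] & 1) || !e->valid || memcmp(e->mac, dst_mac, 6) != 0)
        return TOY_PORT_FLOOD;
    return e->port;
}
```

For each incoming frame a bridge of this kind would call toy_learn() on
the source address and then toy_lookup() on the destination address.
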

More resources are hosted on other repositories. For example

    https://github.com/luigirizzo/netmap-libpcap

contains a netmap-enabled version of libpcap (which is also
included in the FreeBSD distribution) so you can run any libpcap client
on top of netmap at much higher speeds than using bpf.

    https://github.com/luigirizzo/netmap-ipfw

is a userspace version of ipfw and dummynet which can handle several
million packets per second in a single thread.

Qemu/kvm has native netmap support, so it can interconnect VMs at high speed
through netmap ports. There is experimental netmap support in FreeBSD's
bhyve hypervisor.

Netmap alone DOES NOT accelerate your TCP. For that you need to implement
your own TCP/IP stack, probably using some of the techniques indicated
below to reduce the processing costs.

Architecture
------------
netmap uses a number of techniques to establish a fast and efficient path
between applications and the network. In order of importance:
1. I/O batching
2. efficient device drivers
3. pre-allocated tx/rx buffers
4. memory mapped buffers
Despite the name, memory mapping is NOT the key feature for netmap's
speed; systems that do not apply all these techniques do not achieve
the same speed _and_ efficiency.
Netmap clients use a select()-able file descriptor to synchronize
with the network card/software switch, and exchange multiple packets
per system call through device-independent memory mapped buffers and
descriptors. Device drivers are completely in the kernel, and the system
does not rely on IOMMU or other special mechanisms.
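
The effect of batching over a shared ring can be illustrated with a toy
model. This is a minimal sketch and NOT the netmap API: the toy_ring
structure, its fields and both functions are hypothetical names invented
for this example. One call to toy_rx_sync() stands in for a single
system call that publishes a whole batch of received packets; the
application then drains every available slot from shared memory without
entering the kernel again.

```c
#include <stdint.h>

#define TOY_RING_SLOTS 8     /* hypothetical ring size */

struct toy_slot {
    uint16_t len;            /* length of the packet in the buffer */
    uint32_t buf_idx;        /* index of a pre-allocated buffer */
};

struct toy_ring {
    uint32_t head;           /* next slot the application will consume */
    uint32_t tail;           /* next slot the "kernel" will fill */
    struct toy_slot slot[TOY_RING_SLOTS];
};

static unsigned toy_sync_calls;  /* counts simulated system calls */

/* Stand-in for one system call: the "kernel" publishes up to n_arrived
 * packets by filling slots and advancing tail. One slot is always kept
 * empty so that head == tail means "no packets available". */
static void toy_rx_sync(struct toy_ring *r, uint32_t n_arrived)
{
    toy_sync_calls++;
    while (n_arrived-- > 0) {
        uint32_t next = (r->tail + 1) % TOY_RING_SLOTS;
        if (next == r->head)
            break;                       /* ring is full */
        r->slot[r->tail].len = 60;
        r->slot[r->tail].buf_idx = r->tail;
        r->tail = next;
    }
}

/* Application side: consume all published slots, no system call needed.
 * Returns the number of packets consumed. */
static uint32_t toy_rx_drain(struct toy_ring *r)
{
    uint32_t n = 0;
    while (r->head != r->tail) {
        /* a real application would process slot[head].buf_idx here */
        r->head = (r->head + 1) % TOY_RING_SLOTS;
        n++;
    }
    return n;
}
```

With a batch of 5 packets, toy_rx_sync() runs once and toy_rx_drain()
returns 5, so the cost of the simulated system call is spread over the
whole batch instead of being paid per packet.
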
Installation instructions
-------------------------
A single kernel module implements the core NETMAP functions, including
the VALE switch and access to physical NICs using unmodified device drivers
(at the price of much lower performance than netmap-aware drivers).

Netmap-aware device drivers are needed to use netmap at high speed
on Ethernet ports. To date, we have support for Intel ixgbe (10G),
ixl (10/40G), e1000/e1000e/igb (1G), Realtek 8169 (1G) and Nvidia (1G).
FreeBSD also has native netmap support for the Chelsio 10/40G cards.

FreeBSD
-------
Since recent FreeBSD distributions already include netmap, you only
need to build the new kernel or modules as below:

  + add 'device netmap' to your kernel config file and rebuild a kernel.
    This will include the netmap module and netmap support in the device
    drivers. Alternatively, you can build standalone modules
    (netmap, ixgbe, em, lem, re, igb)

  + sample applications are in the examples/ directory in this archive,
    or in src/tools/tools/netmap/ in FreeBSD distributions

Linux
-----
The ./configure && make build system in the LINUX/
directory will let you patch device driver sources and build
some netmap-enabled device drivers.
Please look at LINUX/README for details.

  + make sure you have kernel headers matching your installed kernel.

  + the sources for e1000e, igb, ixgbe and i40e will be downloaded
    from the Intel e1000 project on sourceforge.

  + if you need the netmap-enabled drivers for e1000, veth, forcedeth,
    virtio-net or r8169 you will also need the full kernel sources.

  + Configure netmap.
    To compile NETMAP/VALE and the Intel drivers above:

        ./configure

    (This will also download the Intel driver sources from sourceforge).

    To compile only NETMAP/VALE (using unmodified drivers):

        ./configure --no-drivers    # only netmap

    If you need the full kernel sources and you have installed them in
    /a/b/c/linux-A.B.C/, then you should do

        ./configure --kernel-dir=/a/b/c/linux-A.B.C/    # netmap+device drivers

    You can omit --kernel-dir if your kernel sources are in a standard place.

    If you use distribution packages, full sources and headers may be in
    different places (e.g., on Debian systems). Use

        ./configure --kernel-sources=/a/b/c/linux-sources-A.B/ \
                --kernel-dir=/a/b/c/linux-headers-A.B/

  + build kernel modules and sample applications:

        make

  + (optionally) install the new modules and the applications:

        make install    # as root

To have the new netmap-enabled driver modules alongside the original
ones, you may want to add --driver-suffix=-netmap to the configure
command above. The new drivers will then be called e1000e-netmap,
ixgbe-netmap and so on.

WINDOWS
-------
Netmap was ported to Windows in summer 2015 by Alessio Faina as part of
his Master's thesis. Please look at WINDOWS/README.txt for details.
Applications
------------
The directory examples/ contains some programs that use the netmap API:

    pkt-gen.c   a packet generator/receiver working at line rate at 10 Gbit/s
    vale-cfg.c  utility to configure ports of a VALE switch
    bridge.c    a utility that bridges two interfaces or one interface
                with the host stack

For libpcap and other applications look at the extra/ directory.

Testing
-------
pkt-gen is a generic test program which can act as a sender or receiver.
It has a large number of options, but the simplest form is:

    pkt-gen -i ix0 -f rx          # receive and print stats
    pkt-gen -i ix0 -f tx -l 60    # send a stream of 60-byte packets

(replace ix0 with the name of the interface or VALE port).
This should be able to work at line rate (up to 14.88 Mpps on 10
Gbit/s interfaces, even higher on VALE), but note the following.

OPERATING SPEED
---------------
Netmap is able to send packets at very high rates, and for simple
packet transmission and reception, speed is generally not limited by
the CPU but by other factors (link speed, bus or NIC hw limitations).
For a physical link, the maximum number of packets per second can
be computed with the formula:

    pps = line_rate / (160 + 8 * pkt_size)

where "line_rate" is the nominal link rate in bit/s (e.g. 10 Gbit/s) and
pkt_size is the actual packet size in bytes, including MAC headers and CRC.
The constant 160 accounts for the 20 bytes of preamble, start-of-frame
delimiter and inter-frame gap that accompany every frame on the wire.
The following table summarizes the resulting maximum rates, in Mpps:

                        LINE RATE
    pkt_size \  100M      1G      10G     40G

       64      .1488    1.488   14.88   59.52
      128      .0845    0.845    8.45   33.78
      256      .0453    0.453    4.53   18.12
      512      .0235    0.235    2.35    9.40
     1024      .0120    0.120    1.20    4.79
     1518      .0081    0.081    0.81    3.25

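
The line-rate computation can be sketched in C as follows. This is a
minimal illustration that assumes standard Ethernet framing, where every
frame on the wire is accompanied by 20 extra bytes (7-byte preamble,
1-byte start-of-frame delimiter, 12-byte inter-frame gap); the names
max_pps and ETH_WIRE_OVERHEAD_BYTES are made up for this example.

```c
/* Bytes occupied on the wire by each frame in addition to the packet:
 * preamble (7) + start-of-frame delimiter (1) + inter-frame gap (12).
 * Assumption: standard Ethernet framing. */
#define ETH_WIRE_OVERHEAD_BYTES 20

/* Maximum packets per second on a physical Ethernet link.
 *   line_rate_bps: nominal link rate in bit/s (e.g. 10e9 for 10 Gbit/s)
 *   pkt_size:      packet size in bytes, including MAC headers and CRC */
static double max_pps(double line_rate_bps, unsigned pkt_size)
{
    return line_rate_bps / (8.0 * (pkt_size + ETH_WIRE_OVERHEAD_BYTES));
}
```

Under these assumptions max_pps(10e9, 64) evaluates to roughly 14.88
million packets per second, the figure quoted for 10 Gbit/s links.
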
On VALE ports, there is no physical link and the throughput is
limited by CPU or memory depending on the packet size.
COMMON PROBLEMS
---------------
Before reporting slow send or receive speed on a physical interface,
check ALL of the following:
CANNOT SET THE DEVICE IN NETMAP MODE:
  + make sure that the netmap module and drivers are correctly
    loaded and can allocate all the memory they need (check
    /var/log/messages or equivalent)
  + check permissions on /dev/netmap
  + make sure the interface is up before invoking pkt-gen

SENDER DOES NOT TRANSMIT
  + some switches/interfaces take a long time to (re)negotiate
    the link after starting pkt-gen. This may cause inability to
    transmit, or lost packets for the first few seconds of
    transmission; in that case, use the -w N option to increase
    the initial delay to N seconds.

RECEIVER DOES NOT RECEIVE
  + make sure traffic uses a broadcast MAC address, or the unicast
    address of the receiving interface, or that the receiving interface
    is in promiscuous mode (this must be done with ifconfig; pkt-gen
    does not change the operating mode)

LOWER SPEED THAN LINE RATE
  + check that your CPUs are running at the maximum clock rate
    and are not throttled down by the governor/powerd.
    Linux:
        lscpu    # shows current cpu speed
        # install cpufrequtils
        # sudo apt-get install cpufrequtils

  + make sure that the sender/receiver interfaces and switch have
    flow control (FC) disabled (either via sysctl or ethtool).
    If FC is enabled and the receiving end is unable to cope
    with the traffic, the driver will try to slow down transmission,
    sometimes to very low rates.

  + a lot of hardware is not able to sustain line rate. For instance,
    ixgbe has problems with receiving frames whose size is not a multiple
    of 64 bytes (with/without CRC depending on the driver); also on
    transmission, ixgbe tops out at about 12.5 Mpps unless the driver
    prefetches tx descriptors. igb does line rate in all configurations.
    e1000/e1000e vary between 1.15 and 1.32 Mpps. re/r8169 is
    extremely slow in sending (max 400-500 Kpps)

Credits
-------
NETMAP and VALE are projects of the Università di Pisa,
partially supported by various entities including:
Intel Research Berkeley, EU FP7 projects CHANGE and OPENLAB,
Netapp/Silicon Valley Community Foundation, ICSI
Author: Luigi Rizzo
Contributors:
Giuseppe Lettieri
Michio Honda
Marta Carbone
Gaetano Catalli
Matteo Landi
Vincenzo Maffione
Stefano Garzarella
Alessio Faina
References
----------
There are a few academic papers describing netmap, VALE and applications.
You can find the papers at http://info.iet.unipi.it/~luigi/research.html
+ Luigi Rizzo,
netmap: a novel framework for fast packet I/O,
Usenix ATC'12, Boston, June 2012
+ Luigi Rizzo,
Revisiting network I/O APIs: the netmap framework,
Communications of the ACM 55 (3), 45-51, March 2012
+ Luigi Rizzo, Marta Carbone, Gaetano Catalli,
Transparent acceleration of software packet forwarding using netmap,
IEEE Infocom 2012, Orlando, March 2012
+ Luigi Rizzo, Giuseppe Lettieri,
VALE: a switched ethernet for virtual machines,
ACM CoNEXT 2012, Nice, Dec. 2012
+ Luigi Rizzo, Giuseppe Lettieri, Vincenzo Maffione,
Speeding up packet I/O in virtual machines,
IEEE/ACM ANCS 2013, San Jose, Oct. 2013
+ Stefano Garzarella, Giuseppe Lettieri, Luigi Rizzo,
Virtual device passthrough for high speed VM networking,
IEEE/ACM ANCS 2015, Oakland, May 2015
+ Vincenzo Maffione, Luigi Rizzo, Giuseppe Lettieri,
Flexible virtual machine networking using netmap passthrough,
IEEE Lanman 2016, Rome, June 2016