forked from oasis-tcs/virtio-spec
-
Notifications
You must be signed in to change notification settings - Fork 0
/
introduction.tex
244 lines (204 loc) · 10.2 KB
/
introduction.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
\chapter{Introduction}
\input{abstract.tex}
\begin{description}
\item[Straightforward:] Virtio devices use normal bus mechanisms of
interrupts and DMA which should be familiar to any device driver
author. There is no exotic page-flipping or COW mechanism: it's just
a normal device.\footnote{This lack of page-sharing implies that the implementation of the
device (e.g. the hypervisor or host) needs full access to the
guest memory. Communication with untrusted parties (i.e.
inter-guest communication) requires copying.
}
\item[Efficient:] Virtio devices consist of rings of descriptors
for both input and output, which are neatly laid out to avoid cache
effects from both driver and device writing to the same cache
lines.
\item[Standard:] Virtio makes no assumptions about the environment in which
it operates, beyond supporting the bus to which device is attached.
In this specification, virtio
devices are implemented over MMIO, Channel I/O and PCI bus transports
\footnote{The Linux implementation further separates the virtio
transport code from the specific virtio drivers: these drivers are shared
between different transports.
}, earlier drafts
have been implemented on other buses not included here.
\item[Extensible:] Virtio devices contain feature bits which are
acknowledged by the guest operating system during device setup.
This allows forwards and backwards compatibility: the device
offers all the features it knows about, and the driver
acknowledges those it understands and wishes to use.
\end{description}
\section{Normative References}\label{sec:Normative References}
\begin{longtable}{l p{5in}}
\phantomsection\label{intro:rfc2119}\textbf{[RFC2119]} &
Bradner S., ``Key words for use in RFCs to Indicate Requirement
Levels'', BCP 14, RFC 2119, March 1997. \newline\url{http://www.ietf.org/rfc/rfc2119.txt}\\
\phantomsection\label{intro:rfc4122}\textbf{[RFC4122]} &
Leach, P., Mealling, M., and R. Salz, ``A Universally Unique
IDentifier (UUID) URN Namespace'', RFC 4122, DOI 10.17487/RFC4122,
July 2005. \newline\url{http://www.ietf.org/rfc/rfc4122.txt}\\
\phantomsection\label{intro:S390 PoP}\textbf{[S390 PoP]} & z/Architecture Principles of Operation, IBM Publication SA22-7832, \newline\url{http://publibfi.boulder.ibm.com/epubs/pdf/dz9zr009.pdf}, and any future revisions\\
\phantomsection\label{intro:S390 Common I/O}\textbf{[S390 Common I/O]} & ESA/390 Common I/O-Device and Self-Description, IBM Publication SA22-7204, \newline\url{http://publibfp.dhe.ibm.com/cgi-bin/bookmgr/BOOKS/dz9ar501/CCONTENTS}, and any future revisions\\
\phantomsection\label{intro:PCI}\textbf{[PCI]} &
Conventional PCI Specifications,
\newline\url{http://www.pcisig.com/specifications/conventional/},
PCI-SIG\\
\phantomsection\label{intro:PCIe}\textbf{[PCIe]} &
PCI Express Specifications
\newline\url{http://www.pcisig.com/specifications/pciexpress/},
PCI-SIG\\
\phantomsection\label{intro:IEEE 802}\textbf{[IEEE 802]} &
IEEE Standard for Local and Metropolitan Area Networks: Overview and Architecture,
\newline\url{http://www.ieee802.org/},
IEEE\\
\phantomsection\label{intro:SAM}\textbf{[SAM]} &
SCSI Architectural Model,
\newline\url{http://www.t10.org/cgi-bin/ac.pl?t=f&f=sam4r05.pdf}\\
\phantomsection\label{intro:SCSI MMC}\textbf{[SCSI MMC]} &
SCSI Multimedia Commands,
\newline\url{http://www.t10.org/cgi-bin/ac.pl?t=f&f=mmc6r00.pdf}\\
\phantomsection\label{intro:FUSE}\textbf{[FUSE]} &
Linux FUSE interface,
\newline\url{https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/uapi/linux/fuse.h}\\
\phantomsection\label{intro:eMMC}\textbf{[eMMC]} &
eMMC Electrical Standard (5.1), JESD84-B51,
\newline\url{http://www.jedec.org/sites/default/files/docs/JESD84-B51.pdf}\\
\phantomsection\label{intro:HDA}\textbf{[HDA]} &
High Definition Audio Specification,
\newline\url{https://www.intel.com/content/dam/www/public/us/en/documents/product-specifications/high-definition-audio-specification.pdf}\\
\phantomsection\label{intro:I2C}\textbf{[I2C]} &
I2C-bus specification and user manual,
\newline\url{https://www.nxp.com/docs/en/user-guide/UM10204.pdf}\\
\phantomsection\label{intro:SCMI}\textbf{[SCMI]} &
Arm System Control and Management Interface, DEN0056,
\newline\url{https://developer.arm.com/docs/den0056/c}, version C and any future revisions\\
\end{longtable}
\section{Non-Normative References}
\begin{longtable}{l p{5in}}
\phantomsection\label{intro:Virtio PCI Draft}\textbf{[Virtio PCI Draft]} &
Virtio PCI Draft Specification
\newline\url{http://ozlabs.org/~rusty/virtio-spec/virtio-0.9.5.pdf}\\
\end{longtable}
\section{Terminology}\label{Terminology}
The key words ``MUST'', ``MUST NOT'', ``REQUIRED'', ``SHALL'', ``SHALL NOT'', ``SHOULD'', ``SHOULD NOT'', ``RECOMMENDED'', ``MAY'', and ``OPTIONAL'' in this document are to be interpreted as described in \hyperref[intro:rfc2119]{[RFC2119]}.
\subsection{Legacy Interface: Terminology}\label{intro:Legacy
Interface: Terminology}
Specification drafts preceding version 1.0 of this specification
(e.g. see \hyperref[intro:Virtio PCI Draft]{[Virtio PCI Draft]})
defined a similar, but different
interface between the driver and the device.
Since these are widely deployed, this specification
accommodates OPTIONAL features to simplify transition
from these earlier draft interfaces.
Specifically devices and drivers MAY support:
\begin{description}
\item[Legacy Interface]
is an interface specified by an earlier draft of this specification
(before 1.0)
\item[Legacy Device]
is a device implemented before this specification was released,
and implementing a legacy interface on the host side
\item[Legacy Driver]
is a driver implemented before this specification was released,
and implementing a legacy interface on the guest side
\end{description}
Legacy devices and legacy drivers are not compliant with this
specification.
To simplify transition from these earlier draft interfaces,
a device MAY implement:
\begin{description}
\item[Transitional Device]
a device supporting both drivers conforming to this
specification, and allowing legacy drivers.
\end{description}
Similarly, a driver MAY implement:
\begin{description}
\item[Transitional Driver]
a driver supporting both devices conforming to this
specification, and legacy devices.
\end{description}
\begin{note}
Legacy interfaces are not required; ie. don't implement them unless you
have a need for backwards compatibility!
\end{note}
Devices or drivers with no legacy compatibility are referred to as
non-transitional devices and drivers, respectively.
\subsection{Transition from earlier specification drafts}\label{sec:Transition from earlier specification drafts}
For devices and drivers already implementing the legacy
interface, some changes will have to be made to support this
specification.
In this case, it might be beneficial for the reader to focus on
sections tagged "Legacy Interface" in the section title.
These highlight the changes made since the earlier drafts.
\section{Structure Specifications}
Many device and driver in-memory structure layouts are documented using
the C struct syntax. All structures are assumed to be without additional
padding. To stress this, cases where common C compilers are known to insert
extra padding within structures are tagged using the GNU C
__attribute__((packed)) syntax.
For the integer data types used in the structure definitions, the following
conventions are used:
\begin{description}
\item[u8, u16, u32, u64] An unsigned integer of the specified length in bits.
\item[le16, le32, le64] An unsigned integer of the specified length in bits,
in little-endian byte order.
\item[be16, be32, be64] An unsigned integer of the specified length in bits,
in big-endian byte order.
\end{description}
Some of the fields to be defined in this specification don't
start or don't end on a byte boundary. Such fields are called bit-fields.
A set of bit-fields is always a sub-division of an integer typed field.
Bit-fields within integer fields are always listed in order,
from the least significant to the most significant bit. The
bit-fields are considered unsigned integers of the specified
width with the next in significance relationship of the bits
preserved.
For example:
\begin{lstlisting}
struct S {
be16 {
A : 15;
B : 1;
} x;
be16 y;
};
\end{lstlisting}
documents the value A stored in the low 15 bit of \field{x} and
the value B stored in the high bit of \field{x}, the 16-bit
integer \field{x} in turn stored using the big-endian byte order
at the beginning of the structure S,
and being followed immediately by an unsigned integer \field{y}
stored in big-endian byte order at an offset of 2 bytes (16 bits)
from the beginning of the structure.
Note that this notation somewhat resembles the C bitfield syntax but
should not be naively converted to a bitfield notation for portable
code: it matches the way bitfields are packed by C compilers on
little-endian architectures but not the way bitfields are packed by C
compilers on big-endian architectures.
Assuming that CPU_TO_BE16 converts a 16-bit integer from a native
CPU to the big-endian byte order, the following is the equivalent
portable C code to generate a value to be stored into \field{x}:
\begin{lstlisting}
CPU_TO_BE16(B << 15 | A)
\end{lstlisting}
\section{Constant Specifications}
In many cases, numeric values used in the interface between the device
and the driver are documented using the C \#define and /* */
comment syntax. Multiple related values are grouped together with
a common name as a prefix, using _ as a separator.
Using _XXX as a suffix refers to all values in a group.
For example:
\begin{lstlisting}
/* Field Fld value A description */
#define VIRTIO_FLD_A (1 << 0)
/* Field Fld value B description */
#define VIRTIO_FLD_B (1 << 1)
\end{lstlisting}
documents two numeric values for a field \field{Fld}, with
\field{Fld} having value 1 referring to \field{A} and \field{Fld}
having value 2 referring to \field{B}.
Note that $<<$ refers to the shift-left operation.
Further, in this case VIRTIO_FLD_A and VIRTIO_FLD_B
refer to values 1 and 2 of Fld respectively. Further, VIRTIO_FLD_XXX refers to
either VIRTIO_FLD_A or VIRTIO_FLD_B.
\newpage