<!DOCTYPE html>
<html lang="en" class="">
<head>
<meta charset='utf-8'>
<meta http-equiv="Content-Language" content="en">
<link rel="stylesheet" type="text/css" href="stylesheets/stylesheet.css" media="screen" />
<link rel="stylesheet" type="text/css" href="stylesheets/print.css" media="print" />
<title>RDL - Resource Description Language</title>
</head>
<body>
<header>
<div class="inner">
<h1>RDL</h1>
<h2>Resource Description Language</h2>
</div>
<div class="aside">
<a href="https://github.com/ardielle" class="button"><small>View project on </small>GitHub</a>
</div>
</header>
<div id="content-wrapper">
<div class="inner clearfix">
<section id="main-content">
<h2><a name="Overview">Overview</a></h2>
<p>
RDL is a machine-readable description of a <code>schema</code> that describes data types
and the resources that use those types. Such a schema can describe HTTP web services,
serve as the source of truth for data encoding mechanisms such as Protocol Buffers and Avro,
and augment JSON and other encoding schemes by providing data validation.
</p>
<p>
Types are defined by deriving from an already defined type, so every type is derived
(perhaps indirectly) from a primitive base type. Each base type may offer options
that further restrict the type.
</p>
<p>
For more information and source code, look at the
<a href="https://github.com/ardielle">GitHub</a> repository.
</p>
<h2><a name="Syntax">Syntax</a></h2>
<p>
RDL's syntax resembles that of C and Java and should look familiar to most programmers. For example:
</p>
<pre>
type Point Struct {
    Int32 x;
    Int32 y;
}
</pre>
<p>
The syntax is defined by an <a href="rdl_ebnf.txt" target="rdl_ebnf.txt">EBNF grammar</a>, which has been used to generate a <a href="syntax.xhtml">visual railroad diagram</a>.
</p>
<h2><a name="Primitives">Primitive Types</a></h2>
<table>
<colgroup>
<col style="text-align:left;"/>
<col style="text-align:left;"/>
</colgroup>
<thead>
<tr>
<th style="text-align:left;">Name</th>
<th style="text-align:left;">Description</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align:left;"><code>Null</code></td>
<td style="text-align:left;">No value</td>
</tr>
<tr>
<td style="text-align:left;"><code>Bool</code></td>
<td style="text-align:left;">Either <code>true</code> or <code>false</code></td>
</tr>
<tr>
<td style="text-align:left;"><code>Int8</code></td>
<td style="text-align:left;">An 8-bit signed integer</td>
</tr>
<tr>
<td style="text-align:left;"><code>Int16</code></td>
<td style="text-align:left;">A 16-bit signed integer</td>
</tr>
<tr>
<td style="text-align:left;"><code>Int32</code></td>
<td style="text-align:left;">A 32-bit signed integer</td>
</tr>
<tr>
<td style="text-align:left;"><code>Int64</code></td>
<td style="text-align:left;">A 64-bit signed integer</td>
</tr>
<tr>
<td style="text-align:left;"><code>Float32</code></td>
<td style="text-align:left;">A single precision (32-bit) IEEE 754 floating-point number</td>
</tr>
<tr>
<td style="text-align:left;"><code>Float64</code></td>
<td style="text-align:left;">A double precision (64-bit) IEEE 754 floating-point number</td>
</tr>
<tr>
<td style="text-align:left;"><code>Bytes</code></td>
<td style="text-align:left;">A sequence of 8-bit bytes</td>
</tr>
<tr>
<td style="text-align:left;"><code>String</code></td>
<td style="text-align:left;">A sequence of Unicode characters, encoded as UTF-8</td>
</tr>
<tr>
<td style="text-align:left;"><code>Symbol</code></td>
<td style="text-align:left;">A simple identifier: like a string, but restricted in the characters it accepts, generally following what most languages would consider a valid variable name</td>
</tr>
<tr>
<td style="text-align:left;"><code>UUID</code></td>
<td style="text-align:left;">A universally unique identifier, as defined by <a href="#References">RFC 4122 [UUID]</a></td>
</tr>
<tr>
<td style="text-align:left;"><code>Timestamp</code></td>
<td style="text-align:left;">An instant in time, expressed as a floating-point number of seconds since 1970. May also be represented as a string in UTC, as described in <a href="#References">RFC 3339 [Timestamp]</a></td>
</tr>
<tr>
<td style="text-align:left;"><code>Array</code></td>
<td style="text-align:left;">An ordered collection of other values</td>
</tr>
<tr>
<td style="text-align:left;"><code>Map</code></td>
<td style="text-align:left;">An unordered mapping of keys to values</td>
</tr>
<tr>
<td style="text-align:left;"><code>Enum</code></td>
<td style="text-align:left;">An enumerated set of symbolic identifiers.</td>
</tr>
<tr>
<td style="text-align:left;"><code>Union</code></td>
<td style="text-align:left;">A tagged union of other types</td>
</tr>
<tr>
<td style="text-align:left;"><code>Struct</code></td>
<td style="text-align:left;">An ordered collection of named fields, describable by a schema</td>
</tr>
</tbody>
</table>
<p> </p>
<p>Note: all type names in RDL are case-insensitive. Capitalized types are used in this document.</p>
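<p>As a rough illustration of the integer primitives in the table above, the following Python sketch computes the value range of each signed integer type and looks type names up case-insensitively. The helper names <code>int_range</code> and <code>fits</code> are hypothetical, and a two's-complement range is assumed; neither comes from RDL itself.</p>

```python
# Bit widths of the signed integer primitives listed in the table above.
# Keys are lowercase because RDL type names are case-insensitive.
INT_BITS = {"int8": 8, "int16": 16, "int32": 32, "int64": 64}


def int_range(type_name):
    """Return (min, max) for the named type, assuming two's complement."""
    bits = INT_BITS[type_name.lower()]
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def fits(type_name, value):
    """True if `value` is representable in the named integer type."""
    low, high = int_range(type_name)
    return low <= value <= high
```

<p>For instance, <code>fits("Int32", 2**31)</code> is false, because the largest Int32 value is one less than that.</p>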
<h2><a name="Representation">Representation</a></h2>
<p>
A structured type definition like this is compiled to a <code>Schema</code>, a data structure that describes
the typedefs. Although schemas could be written directly as data, e.g. in JSON or YAML, the
RDL source is designed to be more expressive, less noisy and easier to diff in a source control system. The
<code>Point</code> type defined above would be expressed as the following schema, shown
here in JSON:
</p>
<pre>
{
  "types": [
    {
      "StructTypeDef": {
        "type": "Struct",
        "name": "Point",
        "fields": [
          {
            "name": "x",
            "type": "Int32"
          },
          {
            "name": "y",
            "type": "Int32"
          }
        ]
      }
    }
  ]
}
</pre>
<p>
Each entry in the <code>types</code> array is of type <code>Type</code>, which is a Union of
a variety of type definition structures. The format of the Schema data structure is
itself defined in RDL; see <a href="">rdl.rdl</a> for this definition.
</p>
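<p>To show how a compiled schema might be consumed, here is a minimal Python sketch that loads the <code>Point</code> schema shown above and checks a value against it. The helpers <code>find_struct</code> and <code>check_struct</code> are hypothetical and not part of any RDL tooling. The sketch understands only <code>Int32</code> fields and assumes every field is required.</p>

```python
import json

# The Point schema from the JSON listing above.
SCHEMA_JSON = """
{
  "types": [
    {"StructTypeDef": {"type": "Struct", "name": "Point",
      "fields": [{"name": "x", "type": "Int32"},
                 {"name": "y", "type": "Int32"}]}}
  ]
}
"""

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def find_struct(schema, name):
    """Return the StructTypeDef called `name`, or None if it is absent."""
    for entry in schema["types"]:
        struct = entry.get("StructTypeDef")
        if struct is not None and struct["name"] == name:
            return struct
    return None


def check_struct(struct, value):
    """Hypothetical validator: return a list of problems found in `value`.

    Only Int32 fields are understood; every field is treated as required.
    """
    problems = []
    for field in struct["fields"]:
        fname, ftype = field["name"], field["type"]
        if fname not in value:
            problems.append("missing field: " + fname)
        elif ftype == "Int32":
            v = value[fname]
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(v, bool) or not isinstance(v, int):
                problems.append(fname + " is not an integer")
            elif not INT32_MIN <= v <= INT32_MAX:
                problems.append(fname + " is out of Int32 range")
        else:
            problems.append("unsupported type: " + ftype)
    return problems


schema = json.loads(SCHEMA_JSON)
point = find_struct(schema, "Point")
```

<p>With these definitions, <code>check_struct(point, {"x": 1, "y": 2})</code> returns an empty list, while a value with no <code>y</code> field yields one "missing field" entry.</p>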
<h2><a name="Type_Mappings">Type Mappings</a></h2>
<p>The following table maps RDL types to some other common type systems; notes follow the table.</p>
<table style="xtext-align:left;">
<colgroup>
<col />
<col />
<col />
<col />
<col />
<col />
<col />
<col />
<col />
<col />
</colgroup>
<thead style="text-align:left;">
<tr>
<th>RDL</th>
<th>JSON</th>
<th>Protobuf</th>
<th>Avro</th>
<th>Hive</th>
<th>XSD</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align:left;"><code>Null</code></td>
<td style="text-align:left;"><code>null</code></td>
<td style="text-align:left;"><code>-</code> [1]</td>
<td style="text-align:left;"><code>null</code></td>
<td style="text-align:left;"><code>-</code> [1]</td>
<td style="text-align:left;"><code>-</code> [1]</td>
</tr>
<tr>
<td style="text-align:left;"><code>Bool</code></td>
<td style="text-align:left;"><code>true</code> or <code>false</code></td>
<td style="text-align:left;"><code>bool</code></td>
<td style="text-align:left;"><code>boolean</code></td>
<td style="text-align:left;"><code>boolean</code></td>
<td style="text-align:left;"><code>boolean</code></td>
</tr>
<tr>
<td style="text-align:left;"><code>Int8</code></td>
<td style="text-align:left;"><code>number</code> [2]</td>
<td style="text-align:left;"><code>sint32</code> [3]</td>
<td style="text-align:left;"><code>int</code> [4]</td>
<td style="text-align:left;"><code>tinyint</code></td>
<td style="text-align:left;"><code>byte</code></td>
</tr>
<tr>
<td style="text-align:left;"><code>Int16</code></td>
<td style="text-align:left;"><code>number</code> [2]</td>
<td style="text-align:left;"><code>sint32</code> [3]</td>
<td style="text-align:left;"><code>int</code> [4]</td>
<td style="text-align:left;"><code>smallint</code></td>
<td style="text-align:left;"><code>short</code></td>
</tr>
<tr>
<td style="text-align:left;"><code>Int32</code></td>
<td style="text-align:left;"><code>number</code> [2]</td>
<td style="text-align:left;"><code>sint32</code> [5]</td>
<td style="text-align:left;"><code>int</code> [6]</td>
<td style="text-align:left;"><code>int</code> [5]</td>
<td style="text-align:left;"><code>integer</code></td>
</tr>
<tr>
<td style="text-align:left;"><code>Int64</code></td>
<td style="text-align:left;"><code>number</code> [2]</td>
<td style="text-align:left;"><code>sint64</code> [5]</td>
<td style="text-align:left;"><code>long</code> [6]</td>
<td style="text-align:left;"><code>bigint</code> [5]</td>
<td style="text-align:left;"><code>long</code></td>
</tr>
<tr>
<td style="text-align:left;"><code>Float32</code></td>
<td style="text-align:left;"><code>number</code> [2]</td>
<td style="text-align:left;"><code>float</code> [5]</td>
<td style="text-align:left;"><code>float</code> [6]</td>
<td style="text-align:left;"><code>float</code> [5]</td>
<td style="text-align:left;"><code>float</code></td>
</tr>
<tr>
<td style="text-align:left;"><code>Float64</code></td>
<td style="text-align:left;"><code>number</code> [2]</td>
<td style="text-align:left;"><code>double</code> [5]</td>
<td style="text-align:left;"><code>double</code> [6]</td>
<td style="text-align:left;"><code>double</code> [5]</td>
<td style="text-align:left;"><code>double</code></td>
</tr>
<tr>
<td style="text-align:left;"><code>Bytes</code></td>
<td style="text-align:left;"><code>string</code> [7]</td>
<td style="text-align:left;"><code>bytes</code></td>
<td style="text-align:left;"><code>bytes</code></td>
<td style="text-align:left;"><code>binary</code></td>
<td style="text-align:left;"><code>hexBinary</code></td>
</tr>
<tr>
<td style="text-align:left;"><code>String</code></td>
<td style="text-align:left;"><code>string</code> [8]</td>
<td style="text-align:left;"><code>string</code> [8]</td>
<td style="text-align:left;"><code>string</code> [9]</td>
<td style="text-align:left;"><code>string</code> [8]</td>
<td style="text-align:left;"><code>string</code></td>
</tr>
<tr>
<td style="text-align:left;"><code>Symbol</code></td>
<td style="text-align:left;"><code>string</code> [8]</td>
<td style="text-align:left;"><code>string</code> [8]</td>
<td style="text-align:left;"><code>string</code> [9]</td>
<td style="text-align:left;"><code>string</code> [8]</td>
<td style="text-align:left;"><code>string</code></td>
</tr>
<tr>
<td style="text-align:left;"><code>UUID</code></td>
<td style="text-align:left;"><code>string</code> [10]</td>
<td style="text-align:left;"><code>string</code> [10]</td>
<td style="text-align:left;"><code>string</code> [11]</td>
<td style="text-align:left;"><code>string</code> [10]</td>
<td style="text-align:left;"><code>string</code> [12,13]</td>
</tr>
<tr>
<td style="text-align:left;"><code>Timestamp</code></td>
<td style="text-align:left;"><code>string</code> [2]</td>
<td style="text-align:left;"><code>double</code> [2]</td>
<td style="text-align:left;"><code>double</code> [2]</td>
<td style="text-align:left;"><code>double</code> [2]</td>
<td style="text-align:left;"><code>dateTime</code> [2]</td>
</tr>
<tr>
<td style="text-align:left;"><code>Array</code></td>
<td style="text-align:left;"><code>array</code></td>
<td style="text-align:left;"><code>repeated &lt;V&gt;</code> [14]</td>
<td style="text-align:left;"><code>array</code> [14]</td>
<td style="text-align:left;"><code>array</code> [14, 8]</td>
<td style="text-align:left;"><code>sequence</code></td>
</tr>
<tr>
<td style="text-align:left;"><code>Map</code></td>
<td style="text-align:left;"><code>object</code></td>
<td style="text-align:left;"><code>repeated T&lt;K,V&gt;</code> [14,15]</td>
<td style="text-align:left;"><code>map</code> [14,13]</td>
<td style="text-align:left;"><code>map</code> [14, 8]</td>
<td style="text-align:left;"><code>all</code></td>
</tr>
<tr>
<td style="text-align:left;"><code>Struct</code></td>
<td style="text-align:left;"><code>object</code></td>
<td style="text-align:left;"><code>message</code></td>
<td style="text-align:left;"><code>record</code></td>
<td style="text-align:left;"><code>struct</code></td>
<td style="text-align:left;"><code>all</code></td>
</tr>
<tr>
<td style="text-align:left;"><code>Enum</code></td>
<td style="text-align:left;"><code>string</code> [2]</td>
<td style="text-align:left;"><code>enum</code></td>
<td style="text-align:left;"><code>enum</code></td>
<td style="text-align:left;"><code>string</code> [2]</td>
<td style="text-align:left;"><code>string</code> [2]</td>
</tr>
<tr>
<td style="text-align:left;"><code>Union</code></td>
<td style="text-align:left;"><code>value</code> [2]</td>
<td style="text-align:left;"><code>message optional</code> [2]</td>
<td style="text-align:left;"><code>union</code></td>
<td style="text-align:left;"><code>union</code> [8]</td>
<td style="text-align:left;"><code>union</code></td>
</tr>
<tr><td style="text-align:left;" colspan="8"> </td></tr>
<tr><td style="text-align:left;" colspan="8">Notes: </td></tr>
<tr><td style="text-align:left;" colspan="8">[1] null is not supported in this representation</td></tr>
<tr><td style="text-align:left;" colspan="8">[2] type information is lost</td></tr>
<tr><td style="text-align:left;" colspan="8">[3] mapped to larger size number, original type is lost</td></tr>
<tr><td style="text-align:left;" colspan="8">[4] mapped to larger size number, original type becomes an annotation</td></tr>
<tr><td style="text-align:left;" colspan="8">[5] mapped to the number, subtype info is lost</td></tr>
<tr><td style="text-align:left;" colspan="8">[6] mapped to the number, subtype info becomes an annotation</td></tr>
<tr><td style="text-align:left;" colspan="8">[7] base64 url-friendly encoding, subtype info is lost</td></tr>
<tr><td style="text-align:left;" colspan="8">[8] subtype information is lost</td></tr>
<tr><td style="text-align:left;" colspan="8">[9] subtype is preserved as an annotation</td></tr>
<tr><td style="text-align:left;" colspan="8">[10] RFC 4122 string, type information is lost</td></tr>
<tr><td style="text-align:left;" colspan="8">[11] fixed[16], type is preserved as an annotation</td></tr>
<tr><td style="text-align:left;" colspan="8">[12] RFC 4122 format URN, e.g. "urn:uuid:fae891e0-0538-11e3-851b-d875f41b36e4"</td></tr>
<tr><td style="text-align:left;" colspan="8">[13] keys converted to string, key type lost</td></tr>
<tr><td style="text-align:left;" colspan="8">[14] item type required</td></tr>
<tr><td style="text-align:left;" colspan="8">[15] key type required</td></tr>
</tbody>
</table>
<p> </p>
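<p>Several of the notes above describe string encodings. The Python sketch below illustrates three of them using only the standard library: URL-safe base64 for Bytes (note [7]), the RFC 4122 string and URN forms of a UUID (notes [10] and [12]), and an RFC 3339 string for a Timestamp. The function names are hypothetical, and keeping the base64 <code>=</code> padding is an assumption rather than documented RDL behavior.</p>

```python
import base64
import uuid
from datetime import datetime, timezone


def bytes_to_json(data):
    """Bytes to URL-safe base64 text, as in note [7] (padding kept)."""
    return base64.urlsafe_b64encode(data).decode("ascii")


def bytes_from_json(text):
    """Inverse of bytes_to_json."""
    return base64.urlsafe_b64decode(text.encode("ascii"))


def timestamp_to_json(seconds):
    """Seconds since 1970 to an RFC 3339 string in UTC."""
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.isoformat().replace("+00:00", "Z")


# The UUID used as the example in note [12].
u = uuid.UUID("fae891e0-0538-11e3-851b-d875f41b36e4")
plain_form = str(u)  # RFC 4122 string form, note [10]
urn_form = u.urn     # "urn:uuid:..." form, note [12]
```

<p>The URL-safe alphabet replaces <code>+</code> and <code>/</code> with <code>-</code> and <code>_</code>, so the two bytes <code>0xFB 0xFF</code> encode to <code>-_8=</code>.</p>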
<p>Note that most JSON implementations use <code>double</code> as the type to hold numbers, so Int64 values
with a magnitude above 2<sup>53</sup> cannot be represented exactly. Most other types can be represented in JSON
(usually as strings), but type information is lost. Decoding with a schema can recover this information.</p>
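<p>The precision limit is easy to reproduce. In the Python sketch below, <code>parse_int=float</code> simulates a JSON decoder that stores every number as a double. Carrying the value as a string is shown as one common workaround; that is an assumption for illustration, not something RDL prescribes.</p>

```python
import json

INT64_MAX = 2**63 - 1

# A double has a 53-bit significand, so 2**53 + 1 is the first positive
# integer it cannot hold exactly.
as_double = float(2**53 + 1)

# Python's own json module keeps integers exact by default.
exact = json.loads(json.dumps({"id": INT64_MAX}))["id"]

# A decoder that stores every number as a double does not.
# parse_int=float simulates such a decoder.
lossy = json.loads(json.dumps({"id": INT64_MAX}), parse_int=float)["id"]

# Workaround (assumed, not RDL-specified): carry the value as a string
# and convert it after decoding.
via_string = int(json.loads(json.dumps({"id": str(INT64_MAX)}))["id"])
```

<p>Here <code>lossy</code> comes back as 2<sup>63</sup>, one more than the value that was encoded, while <code>exact</code> and <code>via_string</code> round-trip unchanged.</p>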
<p>In Protobuf, not every type can serve as the base of a derived type. Numbers, Booleans, and String types
are encoded as their base type, and any other type information is lost.</p>
<p>Avro represents schemas in JSON, and a type structure can generally be annotated with additional
information, such as the RDL schema object itself. Such annotations preserve type (and subtype)
information, but recovering it requires a post-processing step after decoding.</p>
<h2><a name="References">References</a></h2>
<ul>
<li>[UUID] <a href="http://tools.ietf.org/html/rfc4122">http://tools.ietf.org/html/rfc4122</a></li>
<li>[Timestamp] <a href="http://tools.ietf.org/html/rfc3339">http://tools.ietf.org/html/rfc3339</a></li>
<li>[JSON] <a href="http://tools.ietf.org/html/rfc4627">http://tools.ietf.org/html/rfc4627</a></li>
<li>[Protobuf] <a href="https://developers.google.com/protocol-buffers/docs/proto">https://developers.google.com/protocol-buffers/docs/proto</a></li>
<li>[Avro] <a href="http://avro.apache.org/docs/current/spec.html#schemas">http://avro.apache.org/docs/current/spec.html#schemas</a></li>
<li>[Hive] <a href="https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types">https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types</a></li>
</ul>
</section>
</div>
</div>
</body>
</html>