<!DOCTYPE html>
<html>
<head>
<title>The Z-Machine Standards Document</title>
<link rel="stylesheet" type="text/css" href="zspec.css">
<meta http-equiv="Content-Type" content="text/html;charset=utf-8">
</head>
<body>
<img class="icon" src="images/icon04.gif" alt="">
<h1>4. How instructions are encoded</h1>
<blockquote>
<p>We do but teach bloody instructions</p>
<p>Which, being taught, return to plague th' inventor</p>
<p>Shakespeare, <strong>Macbeth</strong></p>
</blockquote>
<hr>
<p>4.1 <a href="#4.1">Instructions</a> /
4.2 <a href="#4.2">Operand types</a> /
4.3 <a href="#4.3">Form and operand count</a> /
4.4 <a href="#4.4">Specifying operand types</a> /
4.5 <a href="#4.5">Operands</a> /
4.6 <a href="#4.6">Stores</a> /
4.7 <a href="#4.7">Branches</a> /
4.8 <a href="#4.8">Text opcodes</a>
</p>
<hr>
<h2 id="4.1">4.1</h2>
<p>A single Z-machine instruction consists of the following
sections (and in the order shown):
</p>
<pre>
  Opcode               1 or 2 bytes
  (Types of operands)  1 or 2 bytes: 4 or 8 2-bit fields
  Operands             Between 0 and 8 of these: each 1 or 2 bytes
  (Store variable)     1 byte
  (Branch offset)      1 or 2 bytes
  (Text to print)      An encoded string (of unlimited length)
</pre>
<p>Bracketed sections are not present in all opcodes. (A few opcodes
take both "store" and "branch".)
</p>
<h2 id="4.2">4.2</h2>
<p>There are four 'types' of operand. These are often specified
by a number stored in 2 binary digits:
</p>
<pre>
  $$00    Large constant (0 to 65535)    2 bytes
  $$01    Small constant (0 to 255)      1 byte
  $$10    Variable                       1 byte
  $$11    Omitted altogether             0 bytes
</pre>
<h2 id="4.2.1">4.2.1</h2>
<p>Large constants, like all 2-byte words of data in the Z-machine,
are stored with most significant byte first (e.g. <strong>$2478</strong> is stored as
<strong>$24</strong> followed by <strong>$78</strong>). A 'large constant' may in fact be a small
number.
</p>
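<p>As a minimal sketch of the byte order described above, written in Python
purely for illustration: the helper name <strong>read_word</strong> and the
4-byte sample are assumptions of this example, not part of the Standard.
</p>

```python
def read_word(memory: bytes, address: int) -> int:
    """Return the 2-byte word at `address`, most significant byte first."""
    return memory[address] * 256 + memory[address + 1]


# Hypothetical sample memory: the word $2478, then the word $0005.
sample = bytes([0x24, 0x78, 0x00, 0x05])
print(hex(read_word(sample, 0)))  # 0x2478
print(read_word(sample, 2))       # 5: a 'large constant' holding a small number
```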
<h2 id="4.2.2">4.2.2</h2>
<p>Variable number <strong>$00</strong> refers to the top of the stack,
<strong>$01</strong> to <strong>$0f</strong> mean the local variables of the current routine
and <strong>$10</strong> to <strong>$ff</strong> mean the global variables. It is illegal to
refer to local variables which do not exist for the current routine
(there may even be none).
</p>
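<p>The three ranges can be sketched as follows, in Python purely for
illustration. The function name <strong>classify_variable</strong> and the
zero-based indices it returns are assumptions of this example. It does not
check whether a local actually exists, since that depends on the current routine.
</p>

```python
def classify_variable(number: int):
    """Map a 1-byte variable number to (kind, zero-based index)."""
    if number not in range(0x100):
        raise ValueError("a variable number must fit in one byte")
    if number == 0x00:
        return ("stack", 0)
    if number >= 0x10:
        return ("global", number - 0x10)
    return ("local", number - 0x01)


print(classify_variable(0x00))  # ('stack', 0)
print(classify_variable(0x02))  # ('local', 1)
print(classify_variable(0xFF))  # ('global', 239)
```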
<h2 id="4.2.3">4.2.3</h2>
<p>The type 'Variable' really means "variable by value". Some
instructions take as an operand a "variable by reference": for instance,
<a href="sect15.html#inc"><strong>inc</strong></a> has one operand, the
reference number of a variable to increment. This operand usually has type
'Small constant' (and Inform automatically assembles a line like
<strong>@inc turns</strong> by writing the operand <strong>turns</strong> as
a small constant whose value is the reference number of the variable
<strong>turns</strong>).
</p>
<h2 id="4.3">4.3</h2>
<p>Each instruction has a form (long, short, extended or variable)
and an operand count (0OP, 1OP, 2OP or VAR).
If the top two bits of the opcode are <strong>$$11</strong> the form is variable;
if <strong>$$10</strong>, the form is short. If the opcode is 190 (<strong>$BE</strong> in hexadecimal)
and the version is 5 or later, the form is "extended". Otherwise, the
form is "long".
</p>
<h2 id="4.3.1">4.3.1</h2>
<p>In short form, bits 4 and 5 of the opcode byte give an operand
type as above. If this is <strong>$$11</strong> then the operand count is
0OP; otherwise, 1OP. In either case the opcode number is
given in the bottom 4 bits.
</p>
<h2 id="4.3.2">4.3.2</h2>
<p>In long form the operand count is always 2OP. The opcode number
is given in the bottom 5 bits.
</p>
<h2 id="4.3.3">4.3.3</h2>
<p>In variable form, if bit 5 is 0 then the count is 2OP;
if it is 1, then the count is VAR. The opcode number is given
in the bottom 5 bits.
</p>
<h2 id="4.3.4">4.3.4</h2>
<p>In extended form, the operand count is VAR. The opcode number
is given in a second opcode byte.
</p>
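<p>The rules of 4.3 to 4.3.4 can be sketched as a single function, in Python
purely for illustration. The name <strong>decode_form</strong>, the default
version of 5 and the tuple it returns are assumptions of this example. For the
extended form it does not read the second opcode byte, so the opcode number is
reported as <strong>None</strong>.
</p>

```python
def decode_form(opcode: int, version: int = 5):
    """Return (form, operand count, opcode number) for a first opcode byte."""
    if opcode == 0xBE and version >= 5:
        # Extended form: the opcode number lives in the next byte (not read here).
        return ("extended", "VAR", None)
    top_two = opcode >> 6
    if top_two == 0b11:
        count = "VAR" if opcode & 0x20 else "2OP"  # bit 5 selects the count
        return ("variable", count, opcode & 0x1F)
    if top_two == 0b10:
        operand_type = (opcode >> 4) & 0b11        # bits 4 and 5
        count = "0OP" if operand_type == 0b11 else "1OP"
        return ("short", count, opcode & 0x0F)
    return ("long", "2OP", opcode & 0x1F)


print(decode_form(0x05))  # ('long', '2OP', 5)
print(decode_form(0xB2))  # ('short', '0OP', 2)
print(decode_form(0xD6))  # ('variable', '2OP', 22)
print(decode_form(0xBE))  # ('extended', 'VAR', None)
```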
<h2 id="4.4">4.4</h2>
<p>Next, the types of the operands are specified.</p>
<h2 id="4.4.1">4.4.1</h2>
<p>In short form, bits 4 and 5 of the opcode give the type.</p>
<h2 id="4.4.2">4.4.2</h2>
<p>In long form, bit 6 of the opcode gives the type of
the first operand, bit 5 of the second. A value of 0 means a small
constant and 1 means a variable. (If a 2OP instruction needs a
large constant as operand, then it should be assembled in variable
rather than long form.)
</p>
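<p>A small sketch of the long-form rule, in Python purely for illustration;
the name <strong>long_form_types</strong> is an assumption of this example.
</p>

```python
def long_form_types(opcode: int):
    """Return the types of the two operands of a long-form opcode byte."""
    first = "variable" if opcode & 0x40 else "small constant"   # bit 6
    second = "variable" if opcode & 0x20 else "small constant"  # bit 5
    return (first, second)


print(long_form_types(0x05))  # ('small constant', 'small constant')
print(long_form_types(0x45))  # ('variable', 'small constant')
print(long_form_types(0x65))  # ('variable', 'variable')
```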
<h2 id="4.4.3">4.4.3</h2>
<p>In variable or extended forms, a byte of 4 operand types is
given next. This contains 4 2-bit fields: bits 6 and 7 are the first
field, bits 0 and 1 the fourth. The values are operand types as above.
Once one type has been given as 'omitted', all subsequent ones must
be. Example: <strong>$$00101111</strong> means large constant followed by variable
(and no third or fourth operand).
</p>
<h2 id="4.4.3.1">4.4.3.1</h2>
<p>In the special case of the "double variable" VAR opcodes
<a href="sect15.html#call_vs2"><strong>call_vs2</strong></a> and
<a href="sect15.html#call_vn2"><strong>call_vn2</strong></a> (opcode
numbers 12 and 26), a second byte of types is given, containing the types
for the next four operands.
</p>
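<p>Reading the types byte (or bytes) might be sketched like this, in Python
purely for illustration. The names <strong>decode_types_byte</strong> and
<strong>decode_types</strong> are assumptions of this example. The sketch
stops at the first 'omitted' field and does not check that the later fields
are omitted too.
</p>

```python
TYPE_NAMES = {0b00: "large constant", 0b01: "small constant", 0b10: "variable"}


def decode_types_byte(types: int):
    """List the operand types in one types byte, up to the first 'omitted'."""
    result = []
    for shift in (6, 4, 2, 0):  # bits 6-7 hold the first field, bits 0-1 the fourth
        field = (types >> shift) & 0b11
        if field == 0b11:
            break
        result.append(TYPE_NAMES[field])
    return result


def decode_types(first_byte: int, second_byte=None):
    """Decode one types byte, or two for call_vs2 and call_vn2."""
    types = decode_types_byte(first_byte)
    if second_byte is not None and len(types) == 4:
        types += decode_types_byte(second_byte)
    return types


print(decode_types(0b00101111))        # ['large constant', 'variable']
print(len(decode_types(0x55, 0x6F)))   # 6
```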
<h2 id="4.5">4.5</h2>
<p>The operands are given next. Operand counts of 0OP, 1OP or 2OP
require 0, 1 or 2 operands to be given, respectively. If the count is VAR,
there must be as many operands as there were types other than 'omitted'.
</p>
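<p>Once the types are known, the operands can be read in order. A minimal
sketch in Python, purely for illustration: the name
<strong>read_operands</strong> and the use of type names as strings are
assumptions of this example. The sample bytes are those of
<strong>@mul 1000 c -&gt; sp</strong> from the disassembly notes at the end of
this section.
</p>

```python
def read_operands(code: bytes, offset: int, types):
    """Read one operand per entry of `types`; return (operands, next offset)."""
    operands = []
    for kind in types:
        if kind == "large constant":
            value = code[offset] * 256 + code[offset + 1]  # 2 bytes, high byte first
            offset += 2
        else:  # "small constant" or "variable": 1 byte
            value = code[offset]
            offset += 1
        operands.append((kind, value))
    return operands, offset


mul = bytes.fromhex("d62f03e80200")
# Operands start at offset 2, after the opcode byte and the types byte.
print(read_operands(mul, 2, ["large constant", "variable"]))
# ([('large constant', 1000), ('variable', 2)], 5)
```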
<h2 id="4.5.1">4.5.1</h2>
<p>Note that only <a href="sect15.html#call_vs2"><strong>call_vs2</strong></a> and
<a href="sect15.html#call_vn2"><strong>call_vn2</strong></a> can have more than 4
operands, and no instruction can have more than 8.
</p>
<h2 id="4.5.2">4.5.2</h2>
<p>Opcode operands are always evaluated from first to last - this order is
important when the stack pointer appears as an argument. Thus</p>
<pre>@sub sp sp -> i;</pre>
<p>subtracts the second-from-top stack item from the topmost
stack item.
</p>
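<p>The effect of the evaluation order can be sketched in Python, purely for
illustration. The stack is modelled as a list whose last item is the top, and
16-bit wraparound is ignored; both are simplifying assumptions of this
example, as is the name <strong>sub_sp_sp</strong>.
</p>

```python
def sub_sp_sp(stack):
    """Evaluate '@sub sp sp' on a list used as a stack (last item is the top)."""
    first = stack.pop()   # first operand: the topmost item
    second = stack.pop()  # second operand: what was second from top
    return first - second


stack = [3, 10]          # 10 is on top
print(sub_sp_sp(stack))  # 7
print(stack)             # []
```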
<h2 id="4.6">4.6</h2>
<p>"Store" instructions return a value: e.g.,
<a href="sect15.html#mul"><strong>mul</strong></a> multiplies
its two operands together. Such instructions must be followed by a
single byte giving the number of the variable in which to put the result.
</p>
<h2 id="4.7">4.7</h2>
<p>Instructions which test a condition are called "branch"
instructions. The branch information is stored in one or two bytes,
indicating what to do with the result of the test. If bit 7 of
the first byte is 0, a branch occurs when the condition was false;
if 1, then branch is on true. If bit 6 is set, then the branch
occupies 1 byte only, and the "offset" is in the range 0 to 63,
given in the bottom 6 bits. If bit 6 is clear, then the offset is
a signed 14-bit number given in bits 0 to 5 of the first byte
followed by all 8 of the second.
</p>
<h2 id="4.7.1">4.7.1</h2>
<p>An offset of 0 means "return false from the current
routine", and 1 means "return true from the current routine".
</p>
<h2 id="4.7.2">4.7.2</h2>
<p>Otherwise, a branch moves execution to the instruction at
address
</p>
<pre>
Address after branch data + Offset - 2.
</pre>
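<p>The branch rules of 4.7 to 4.7.2 can be sketched as follows, in Python
purely for illustration. The names <strong>decode_branch</strong> and
<strong>branch_target</strong>, and the strings used for the two return
cases, are assumptions of this example.
</p>

```python
def decode_branch(code: bytes, offset: int):
    """Decode branch data; return (branch on true?, offset, address after)."""
    first = code[offset]
    branch_on_true = bool(first & 0x80)          # bit 7
    if first & 0x40:                             # bit 6 set: 1-byte form
        return branch_on_true, first & 0x3F, offset + 1
    value = (first & 0x3F) * 256 + code[offset + 1]
    if value >= 0x2000:                          # sign bit of the 14-bit number
        value -= 0x4000
    return branch_on_true, value, offset + 2


def branch_target(address_after: int, branch_offset: int):
    """Apply 4.7.1 and 4.7.2 to a decoded branch offset."""
    if branch_offset == 0:
        return "return false"
    if branch_offset == 1:
        return "return true"
    return address_after + branch_offset - 2


inc_chk = bytes.fromhex("050200d4")  # branch data is the final byte
print(decode_branch(inc_chk, 3))     # (True, 20, 4)
print(branch_target(4, 20))          # 22, i.e. 18 bytes past the branch data
```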
<h2 id="4.8">4.8</h2>
<p>Two opcodes, <a href="sect15.html#print"><strong>print</strong></a> and
<a href="sect15.html#print_ret"><strong>print_ret</strong></a>, are followed
by a text string. This is stored according to the usual rules: in particular
execution continues after the last 2-byte word of text (the one with top bit
set).
</p>
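<p>Finding where execution resumes after such a string can be sketched in
Python, purely for illustration. The name <strong>text_end</strong> is an
assumption of this example. The sample bytes are those of
<strong>@print "Hello.^"</strong> from the disassembly notes at the end of
this section.
</p>

```python
def text_end(code: bytes, offset: int) -> int:
    """Return the address just after the encoded string starting at `offset`."""
    while True:
        high_byte = code[offset]
        offset += 2               # step over one 2-byte word
        if high_byte & 0x80:      # top bit set: that was the last word
            return offset


hello = bytes.fromhex("b211aa463416459ca5")
print(text_end(hello, 1))  # 9: the text occupies bytes 1 to 8
```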
<hr>
<h2 id="remarks">Remarks</h2>
<p>Some opcodes have type VAR only because the available codes for
the other types had run out; <a href="sect15.html#print_char"><strong>print_char</strong></a>,
for instance. Others, especially <a href="sect15.html#call"><strong>call</strong></a>,
need the flexibility to have between 1 and 4 operands.
</p>
<p>The Inform assembler can assemble branches in either form, though
the programmer should always use long form unless there's a good
reason. Inform automatically optimises branch statements so as to
force as many of them as possible into short form. (This optimisation
will happen to branches written by hand in assembler as well as
to branches compiled by Inform.)
</p>
<p>The disassembler <strong>Txd</strong> numbers locals from 0 to 14 and globals from
0 to 239 in its output (corresponding to variable numbers 1 to 15, and
16 to 255, respectively).
</p>
<p>The branch formula is sensible because in the natural implementation,
the program counter is at the address after the branch data when the branch
takes place: thus it can be regarded as
</p>
<pre>
PC = PC + Offset - 2.
</pre>
<p>If the rule were simply "add the offset" then, since the offset couldn't
be 0 or 1 (because of the return-false and return-true values), we would
never be able to skip past a 1-byte instruction (say, a 0OP like
<a href="sect15.html#quit"><strong>quit</strong></a>), or specify the
branch "don't branch at all" (sometimes useful to ignore the result of
the test altogether). Subtracting 2 means that the only effects we can't
achieve are
</p>
<pre>
PC = PC - 1 and PC = PC - 2
</pre>
<p>and we would never want these anyway, since they would put the program
counter somewhere back inside the same instruction, with horrid
consequences.
</p>
<hr>
<h2 id="disassembly">On disassembly</h2>
<p>Briefly, the first byte of an instruction can be decoded
using the following table:
</p>
<pre>
  $00 -- $1f  long      2OP     small constant, small constant
  $20 -- $3f  long      2OP     small constant, variable
  $40 -- $5f  long      2OP     variable, small constant
  $60 -- $7f  long      2OP     variable, variable
  $80 -- $8f  short     1OP     large constant
  $90 -- $9f  short     1OP     small constant
  $a0 -- $af  short     1OP     variable
  $b0 -- $bf  short     0OP
  except $be  extended  opcode given in next byte
  $c0 -- $df  variable  2OP     (operand types in next byte)
  $e0 -- $ff  variable  VAR     (operand types in next byte(s))
</pre>
<p>Here is an example disassembly:</p>
<pre>
  @inc_chk c 0 label;    05 02 00 d4
      long form; count 2OP; opcode number 5; operands:
          02     small constant (referring to variable c)
          00     small constant 0
      branch if true: 1-byte offset, 20 (since label is
      18 bytes forward from here).
  @print "Hello.^";      b2 11 aa 46 34 16 45 9c a5
      short form; count 0OP.
      literal string, Z-chars: 4 13 10 17 17 20 5 18 5 7 5 5.
  @mul 1000 c -> sp;     d6 2f 03 e8 02 00
      variable form; count 2OP; opcode number 22; operands:
          03 e8  large constant (1000 decimal)
          02     variable c
      store result to stack pointer (var number 00).
  @call_1n Message;      8f 01 56
      short form; count 1OP; opcode number 15; operand:
          01 56  large constant (packed address of routine)
  .label;
</pre>
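<p>The Z-character list in the <strong>@print</strong> example can be checked
with a short sketch, in Python purely for illustration. It assumes the usual
text packing of three 5-bit Z-characters per 2-byte word, with the top bit
marking the last word. The name <strong>unpack_zchars</strong> is an
assumption of this example.
</p>

```python
def unpack_zchars(text: bytes):
    """Split encoded text into its 5-bit Z-characters, stopping at the end bit."""
    zchars = []
    for i in range(0, len(text), 2):
        word = text[i] * 256 + text[i + 1]
        zchars += [(word >> 10) & 0x1F, (word >> 5) & 0x1F, word & 0x1F]
        if word & 0x8000:  # top bit set: last word of the string
            break
    return zchars


# The eight text bytes that follow the opcode byte b2 in the example.
print(unpack_zchars(bytes.fromhex("11aa463416459ca5")))
# [4, 13, 10, 17, 17, 20, 5, 18, 5, 7, 5, 5]
```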
<hr>
<p>
<a href="index.html">Contents</a> /
<a href="preface.html">Preface</a> /
<a href="overview.html">Overview</a>
</p>
<p>Section
<a href="sect01.html">1</a> / <a href="sect02.html">2</a> /
<a href="sect03.html">3</a> / <a href="sect04.html">4</a> /
<a href="sect05.html">5</a> / <a href="sect06.html">6</a> /
<a href="sect07.html">7</a> / <a href="sect08.html">8</a> /
<a href="sect09.html">9</a> / <a href="sect10.html">10</a> /
<a href="sect11.html">11</a> / <a href="sect12.html">12</a> /
<a href="sect13.html">13</a> / <a href="sect14.html">14</a> /
<a href="sect15.html">15</a> / <a href="sect16.html">16</a>
</p>
<p>Appendix
<a href="appa.html">A</a> / <a href="appb.html">B</a> /
<a href="appc.html">C</a> / <a href="appd.html">D</a> /
<a href="appe.html">E</a> / <a href="appf.html">F</a>
</p>
<hr>
</body>
</html>