<!doctype linuxdoc system>
<article>
<!-- Title information -->
<title>OKA (pipeline hazards description translator)
<author>Vladimir Makarov, <tt>vmakarov@gcc.gnu.org</tt>
<date>Apr 5, 2001
<abstract>
This document describes OKA (translator of a processor pipeline
hazards description into code for fast recognition of pipeline
hazards).
</abstract>
<toc>
<sect>Introduction
<p>
OKA is a translator of a processor pipeline hazards description (PHD)
into code for fast recognition of pipeline hazards.  Execution of an
instruction can start only if its issue conditions are satisfied.
If they are not, the instruction is interlocked until they are.
Such an "interlock (pipeline) delay" interrupts the fetching of
successor instructions (or requires NOP instructions, e.g. on MIPS).

There are two major kinds of interlock delay in modern superscalar
RISC processors.  The first is the data dependence delay.  Execution
of an instruction does not start until all its source data have been
computed by previous instructions (there are more complex cases in
which execution starts before the data are computed, provided that
they will be ready by a given time after the start of execution).
Taking this kind of delay into account is simple: the data
dependence (true, output, or anti-dependence) delay between two
instructions is given by a constant.  In most cases this approach is
adequate.  The second kind of interlock delay is the reservation
delay.  Two instructions under execution may both need shared
processor resources, i.e. buses, internal registers, and/or
functional units, which are reserved for some time.  Taking this
kind of delay into account is complex, especially for modern RISC
processors.  The goal of OKA is to generate code for fast recognition
of this kind of delay (pipeline hazards).
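<p>
To make the notion of a reservation delay concrete, the following C
sketch models the reservation of an instruction as a table of
per-cycle unit bit masks and computes how many cycles a second
instruction must wait after the first one is issued.  The unit
names, the type `reservation' and the functions `conflicts' and
`reservation_delay' are assumptions made for illustration only; they
are not part of OKA or of the code it generates, which replaces this
kind of table scan by automaton transitions.
<tscreen><verb>
```c
#include <stddef.h>

/* Hypothetical unit bits; a real description declares its own units. */
enum { UNIT_E0 = 1, UNIT_MUL = 2 };

#define MAX_CYCLES 8

/* One bit mask of reserved units per cycle, starting at the issue cycle. */
typedef struct {
  unsigned cycle[MAX_CYCLES];
  size_t length;
} reservation;

/* Return 1 if B, issued DELAY cycles after A, needs on some cycle
   a unit which A still holds on that cycle. */
int conflicts(const reservation *a, const reservation *b, size_t delay)
{
  size_t i;

  for (i = 0; i < b->length && delay + i < a->length; i++)
    if (a->cycle[delay + i] & b->cycle[i])
      return 1;
  return 0;
}

/* Smallest number of cycles B must wait after A is issued.  The loop
   ends because no conflict is possible once DELAY reaches A's length. */
size_t reservation_delay(const reservation *a, const reservation *b)
{
  size_t delay = 0;

  while (conflicts(a, b, delay))
    delay++;
  return delay;
}
```
</verb></tscreen>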
<sect>Pipeline hazards description language
<p>
A pipeline hazards description mainly describes the reservations of
processor functional units by an instruction during its execution.
The reservations of an instruction are given by a regular expression
describing a nondeterministic finite state automaton (NDFA).
<sect1>Layout of pipeline hazards description
<p>
A pipeline hazards description has the following layout, which is
similar to that of a YACC file.
<tscreen><verb>
DECLARATIONS
%%
EXPRESSIONS
%%
ADDITIONAL C/C++ CODE
</verb></tscreen>
The `%%' separates the sections of the description.  All sections
are optional.  The first `%%' starts the section of expressions and
is obligatory even if that section is empty; the second `%%' may be
absent if the section of additional C/C++ code is absent too.

The section of declarations contains declarations of the functional
units and instructions of a processor.  The section may also contain
declarations of the automata into which the result automaton is split
in order to decrease the size of the tables needed for fast
recognition of pipeline hazards.  Finally, the section may contain
subsections of C/C++ code.

The next section contains a list of expressions which describe the
reservations of functional units by instructions.  In the general
case, the regular expressions correspond to a nondeterministic finite
state automaton.  The expression list can be empty.  In this case the
result automaton contains only arcs marked by the special token
corresponding to advancing a cycle.

The additional C/C++ code can contain any C/C++ code you want to
use.  Functions which are not generated by the translator but are
needed by the instruction scheduler often go here.  This code is
placed without changes at the end of the file generated by the
translator.
<sect1>Declarations
<p>
The section of declarations may contain the following construction:
<tscreen><verb>
%instruction IDENTIFIER ...
</verb></tscreen>
Such constructions declare the instructions (or instruction classes)
of a processor.  All instructions must be declared in constructions
of this kind.  The same instruction identifier can be declared
repeatedly.

There is an analogous construction which serves to describe
functional unit reservations that are frequently repeated in the
reservations of real instructions.
<tscreen><verb>
%reservation IDENTIFIER ...
</verb></tscreen>
The functional unit declaration has the following form:
<tscreen><verb>
%unit <automaton identifier> IDENTIFIER ...
</verb></tscreen>
The construction is used to describe the functional units of the
given processor.  The optional identifier in angle brackets
describes how to split the result automaton into smaller automata.
Each such automaton will contain only states corresponding to
reservations of the functional units declared with the given
automaton identifier.  The same unit identifier can be declared
repeatedly with the same automaton identifier.

Sometimes it is necessary to state that some units cannot be
reserved simultaneously, e.g. a floating point unit is pipelined but
can execute only a single or a double floating point operation.  The
following construction is useful in such situations:
<tscreen><verb>
%exclude IDENTIFIER ... : IDENTIFIER ...
</verb></tscreen>
The functional units to the left of the colon cannot be reserved on
the same cycle as the units to the right of the colon (and vice
versa).  All units in the construction must belong to the same
automaton.
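<p>
The constraint expressed by the two unit lists of an `%exclude'
construction can be pictured as a test on the set of units requested
on one cycle.  The sketch below is only an illustration under assumed
names (the unit bits and the function `violates_exclusion' are
hypothetical); OKA itself takes exclusions into account when it
builds the automaton, so no such test is needed in generated code.
<tscreen><verb>
```c
/* Hypothetical unit bits: the single and double precision parts of a
   floating point unit, and an unrelated adder. */
enum { UNIT_FP_SINGLE = 1, UNIT_FP_DOUBLE = 2, UNIT_FP_ADD = 4 };

/* Return 1 if RESERVED, the set of units requested on one cycle,
   contains a unit from LEFT together with a unit from RIGHT. */
int violates_exclusion(unsigned reserved, unsigned left, unsigned right)
{
  return (reserved & left) != 0 && (reserved & right) != 0;
}
```
</verb></tscreen>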
All automaton identifiers present in unit declarations must be
declared in the following construction
<tscreen><verb>
%automaton IDENTIFIER ...
</verb></tscreen>
If there is an automaton declaration, every unit declaration must
name a declared automaton.  The declaration section may also contain
the following constructions:
<tscreen><verb>
%local {
C/C++ DECLARATIONS
}
%import {
C/C++ DECLARATION
}
and
%export {
C/C++ DECLARATION
}
</verb></tscreen>
which contain any C/C++ declarations (types, variables, macros, and so
on) used in the sections.  The local C/C++ declarations are inserted
at the beginning of the generated implementation file (see the
pipeline hazards description interface), but after the
include-directive for the interface file.  C/C++ declarations which
start with `%import' are inserted at the beginning of the generated
interface file.  C/C++ declarations which start with `%export' are
inserted at the end of the generated interface file.  For example,
such C/C++ code may contain definitions of external variables and
functions which refer to definitions generated by OKA.  All C/C++
declarations are placed in the same order as in the section of
declarations.
<sect1>Expressions
<p>
The section of declarations is followed by the section of expressions.
An expression is described by the following construction:
<tscreen><verb>
instruction or reservation identifiers : expression;
</verb></tscreen>
This construction means that the instructions given on the left-hand
side reserve units according to the expression.  A reservation
identifier on the left-hand side is usually used to describe a
sub-expression frequently used in other expressions.  Of course, each
declared instruction and reservation must be present in only one such
construction.  In the general case the expression describes a
nondeterministic finite state automaton (NDFA) and can contain the
following forms:
<tscreen><verb>
EXPRESSION EXPRESSION
EXPRESSION + EXPRESSION
EXPRESSION * NUMBER
EXPRESSION | EXPRESSION
%nothing
UNIT OR RESERVATION IDENTIFIER
[ EXPRESSION ]
( EXPRESSION )
</verb></tscreen>
The binary operators ` ' (blank) and `+' describe sequential
reservations given by the expressions.  In the first case, the first
unit of the right expression is reserved on the cycle following the
reservation of the last unit of the left expression (so-called
concatenation with cycle advancing).  The reservation of a unit is
described simply by the unit identifier.  In the second case, the
first unit of the right expression is reserved on the same cycle as
the last unit of the left expression (so-called concatenation
without cycle advancing).  Sometimes it is necessary to describe the
absence of unit reservations during several cycles.  The
construction `%nothing' serves this purpose.  `%nothing' can appear
wherever a unit identifier can.

A reservation identifier is simply replaced by the construction `(the
corresponding reservation expression)'.

The construction `EXPRESSION * NUMBER' is simply an abbreviation of
`EXPRESSION EXPRESSION ...' where the expression is repeated the given
positive number of times.

The construction `EXPRESSION | EXPRESSION' means that the instructions
reserve units according to either the left or the right expression
(a so-called alternative).  If a unit is present in only one
alternative, it must belong to the same automaton as the units in the
other alternative.  OKA checks this and reports any violation.

All binary operators are left-associative and have the following
priorities:
<tscreen><verb>
`*' - the highest priority
`+' and ` ' - the middle priority
`|' - the lowest priority
</verb></tscreen>
The construction `[EXPRESSION]' describes an optional construction
and is simply an abbreviation of the construction ` | EXPRESSION'.

Parentheses are used to group sub-expressions in an order other
than the one given by the priorities and associativity of the
operators.
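<p>
The effect of the sequencing and repetition operators can be
sketched in C by treating the value of an expression as a list of
per-cycle unit bit masks.  The type `reservation' and the functions
`unit', `nothing', `seq', `same_cycle' and `repeat' are hypothetical
names chosen for this illustration.  The sketch ignores alternatives
(`|') and optional parts, so it covers deterministic expressions
only, and it is not a description of how OKA represents expressions
internally.
<tscreen><verb>
```c
#include <stddef.h>

#define MAX_CYCLES 64

typedef struct {
  unsigned cycle[MAX_CYCLES]; /* units reserved on each cycle */
  size_t length;              /* number of cycles covered (at least 1) */
} reservation;

/* A unit identifier: one cycle on which that unit is reserved. */
reservation unit(unsigned mask)
{
  reservation r = { { 0 }, 1 };

  r.cycle[0] = mask;
  return r;
}

/* `%nothing': one cycle on which no unit is reserved. */
reservation nothing(void)
{
  return unit(0);
}

/* `A B': B starts on the cycle after the last cycle of A. */
reservation seq(reservation a, reservation b)
{
  size_t i;

  for (i = 0; i < b.length && a.length < MAX_CYCLES; i++)
    a.cycle[a.length++] = b.cycle[i];
  return a;
}

/* `A + B': B starts on the last cycle of A. */
reservation same_cycle(reservation a, reservation b)
{
  size_t i, start = a.length - 1;

  for (i = 0; i < b.length && start + i < MAX_CYCLES; i++) {
    if (start + i >= a.length) {   /* extend A by one empty cycle */
      a.cycle[start + i] = 0;
      a.length = start + i + 1;
    }
    a.cycle[start + i] |= b.cycle[i];
  }
  return a;
}

/* `A * N': A repeated N times with cycle advancing; N is assumed
   to be positive. */
reservation repeat(reservation a, unsigned n)
{
  reservation r = a;
  unsigned i;

  for (i = 1; i < n; i++)
    r = seq(r, a);
  return r;
}
```
</verb></tscreen>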
The full YACC syntax (with some hints for transforming it into an
LALR(1) grammar) of the pipeline hazards description language is given
in Appendix 1.
<sect>Generated code
<p>
A specification as described in the previous section is translated by
OKA (pipeline hazards description translator) into an interface file
and an implementation file.  Both have the same name as the
specification file, with the suffix `.h' for the interface file and
`.c' (C code) or `.cpp' (C++ code) for the implementation file.
<sect1>C++ code
<p>
The interface file of a PHD consists of the following definitions:
<descrip>
<tag>Class `OKA_chip'.</tag>
An object of the class describes the current state of the processor.
The class has the following public members:
<enum>
<item>Function
<tscreen><verb>
`int OKA_transition (int OKA_instruction)'
</verb></tscreen>
The function has one parameter: an instruction code.  If the
corresponding instruction can be started by the processor in
its current state, the function returns 1 and the object
(processor) changes its state to reflect the start of
execution of the given instruction.  Otherwise, when the
instruction cannot be started, the function returns 0 and
the object (processor) does not change its state.
<item>Function
<tscreen><verb>
`int OKA_is_dead_lock (void)'
</verb></tscreen>
The function returns 1 (TRUE) when no transition from the
current processor state is possible on any instruction.  The
only way to change the object (processor) state is then to
advance time (by one cycle) with the aid of the special
pseudo-instruction code `OKA__ADVANCE_CYCLE'.  For example,
a dead lock state of a dual-issue processor can be a state
reflecting the start of two instructions on one cycle.
<item>Function
<tscreen><verb>
`void OKA_reset (void)'
</verb></tscreen>
The function puts the object (processor) into its initial
state.  No processor unit is busy in this state.
<item>Constructor
<tscreen><verb>
`OKA_chip (void)'
</verb></tscreen>
The constructor simply calls function `OKA_reset'.
</enum>
<tag>Macros or enumeration (see option `-enum')</tag>
which declare instruction codes.  The macros or enumeration
constants have the same names as the instructions in the PHD,
with the prefix `OKA_' (see also option `-p').  OKA always
generates the additional code `OKA__ADVANCE_CYCLE'.  When this
pseudo-instruction starts, the processor makes a transition
into the state reflecting the advance of time by one cycle.
It is guaranteed that there is always a transition from any
processor state on this pseudo-instruction.  The macros or
enumeration are generated in the interface file only if option
`-export' is present on the OKA command line.  By default, the
macros or the enumeration are generated in the implementation
file.  Usually, the latter case means that the scheduler code
is placed in the additional C/C++ code of the PHD.
</descrip>
<sect1>C code
<p>
The interface file of a PHD consists of the following definitions of
the generated type and functions:
<enum>
<item>Structure `OKA_chip', which describes the state of the processor.
<item>Type definition `OKA_chip', which is simply structure `OKA_chip'.
<item>Function
<tscreen><verb>
`int OKA_transition (OKA_chip *OKA_chip,
int OKA_instruction)'
</verb></tscreen>
The function has two parameters: a pointer to the structure
describing the processor state and an instruction code.  If
the corresponding instruction can be started by the processor
in its current state, the function returns 1 and the structure
is changed to reflect the start of execution of the given
instruction.  Otherwise, when the instruction cannot be
started, the function returns 0 and the structure is not
changed.
<item>Function
<tscreen><verb>
`int OKA_is_dead_lock (OKA_chip *OKA_chip)'
</verb></tscreen>
The function returns 1 (TRUE) when no transition from the
processor state given by the structure is possible on any
instruction.  The only way to change the processor state is
then to advance time (by one cycle) with the aid of the special
pseudo-instruction code `OKA__ADVANCE_CYCLE'.  For example,
a dead lock state of a dual-issue processor can be a state
reflecting the start of two instructions on one cycle.
<item>Function
<tscreen><verb>
`void OKA_reset (OKA_chip *OKA_chip)'
</verb></tscreen>
The function puts the structure (processor) into its initial
state.  No processor unit is busy in this state.
<item>Macros or enumeration (see option `-enum') which declare
instruction codes.  The macros or enumeration constants have the
same names as the instructions in the PHD, with the prefix `OKA_'
(see also option `-p').  OKA always generates the additional code
`OKA__ADVANCE_CYCLE'.  When this pseudo-instruction starts, the
processor makes a transition into the state reflecting the
advance of time by one cycle.  It is guaranteed that there is
always a transition from any processor state on this
pseudo-instruction.  The macros or enumeration are generated in
the interface file only if option `-export' is present on the
OKA command line.  By default, the macros or the enumeration are
generated in the implementation file.  Usually, the latter case
means that the scheduler code is placed in the additional C/C++
code of the PHD.
</enum>
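<p>
A scheduler might drive the C interface described above as in the
following sketch.  Since real OKA output is not reproduced here, the
block supplies a small hand-written stand-in for `OKA_chip',
`OKA_transition', `OKA_is_dead_lock' and `OKA_reset' which models a
hypothetical single-issue processor with one instruction code
`OKA_ADD'.  Only the function signatures follow this section (with
the pointer parameter named `chip' for readability); the contents of
the structure and the helper `issue_cycle' are assumptions.  The
helper advances cycles until the instruction can be started and
returns the number of cycles it had to wait.
<tscreen><verb>
```c
/* Hand-written stand-in for generated code (NOT OKA output): a
   processor which can start one instruction per cycle. */
enum { OKA_ADD, OKA__ADVANCE_CYCLE };

typedef struct OKA_chip { int issued_this_cycle; } OKA_chip;

void OKA_reset(OKA_chip *chip)
{
  chip->issued_this_cycle = 0;
}

int OKA_transition(OKA_chip *chip, int OKA_instruction)
{
  if (OKA_instruction == OKA__ADVANCE_CYCLE) {
    chip->issued_this_cycle = 0;   /* a new cycle: the issue slot is free */
    return 1;
  }
  if (chip->issued_this_cycle)
    return 0;                      /* hazard: the state is not changed */
  chip->issued_this_cycle = 1;
  return 1;
}

int OKA_is_dead_lock(OKA_chip *chip)
{
  return chip->issued_this_cycle;
}

/* Scheduler side (assumed helper): start INSN as early as possible
   and return the number of cycles advanced while waiting.  It assumes
   that INSN can be started in some state reachable by advancing. */
int issue_cycle(OKA_chip *chip, int insn)
{
  int waited = 0;

  /* In a dead lock state no instruction can start, so the transition
     attempt is skipped and time is advanced at once. */
  while (OKA_is_dead_lock(chip) || !OKA_transition(chip, insn)) {
    OKA_transition(chip, OKA__ADVANCE_CYCLE);   /* always succeeds */
    waited++;
  }
  return waited;
}
```
</verb></tscreen>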
<sect>OKA Usage
<p>
<tscreen><verb>
%%%
</verb></tscreen>
<sect>Implementation
<p>
OKA is implemented with the other COCOM tools.  First, the NDFA(s)
are created.  After that, the NDFA(s) are transformed into DFA(s).
The DFA(s) are then minimized.  The tables representing the DFA(s)
are compacted with the aid of the comb-vector method.  To decrease
the size of the generated tables further, instructions are divided
into equivalence classes.  This is especially important when the
automaton is split into several automata.
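<p>
The general idea of a comb-vector table combined with instruction
equivalence classes can be sketched as follows.  The sparse rows of
the transition table are overlaid in one vector, a parallel vector
records which state owns each slot, and an instruction code is first
mapped to its equivalence class.  All table contents and names
(`insn_class', `base', `next', `check', `next_state') are invented
for this illustration of the technique; the layout of the tables
actually generated by OKA may differ.
<tscreen><verb>
```c
#define NO_TRANSITION (-1)
#define COMB_SIZE 4

/* Hypothetical data: five instruction codes fall into three
   equivalence classes, and three states share one comb vector.
   Uncompressed table (rows are states, columns are classes):
     state 0:  1  2  -
     state 1:  -  -  0
     state 2:  -  0  -                                            */
static const int insn_class[5] = { 0, 0, 1, 2, 1 };
static const int base[3] = { 0, 0, 2 };           /* row offsets */
static const int next[COMB_SIZE] = { 1, 2, 0, 0 };  /* target states */
static const int check[COMB_SIZE] = { 0, 0, 1, 2 }; /* owner of each slot */

/* Return the state reached from STATE on instruction code INSN, or
   NO_TRANSITION if the instruction cannot be started in STATE. */
int next_state(int state, int insn)
{
  int slot = base[state] + insn_class[insn];

  if (slot < COMB_SIZE && check[slot] == state)
    return next[slot];
  return NO_TRANSITION;
}
```
</verb></tscreen>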
<sect>Future of OKA development
<p>
<enum>
<item>Automatic splitting of the automaton into a given number of
automata.
<item>Code for determining which units are reserved in a given state.
It may be necessary in some very complex cases when the given
model is not sufficient for accurate recognition of pipeline
hazards.
<item>The possibility of generating a reverse automaton, which can be
used to insert an instruction into an already scheduled basic
block, e.g. for trace scheduling or scheduling super-blocks.
<item>Extension of the pipeline hazards description model in order to
enable descriptions of processors with dynamic execution and
register renaming.
</enum>
<sect>Appendix 1 - Syntax of pipeline Hazards Description (YACC grammar)
<p>
<tscreen><verb>
%token PERCENTS COMMA COLON SEMICOLON LEFT_PARENTHESIS RIGHT_PARENTHESIS
LEFT_BRACKET RIGHT_BRACKET LEFT_ANGLE_BRACKET RIGHT_ANGLE_BRACKET
PLUS BAR STAR
LOCAL IMPORT EXPORT EXCLUSION AUTOMATON
UNIT NOTHING INSTRUCTION RESERVATION
%token IDENTIFIER NUMBER CODE_INSERTION ADDITIONAL_C_CODE
%start description
%%
description : declaration_part PERCENTS
expression_definition_list ADDITIONAL_C_CODE
;
declaration_part :
| declaration_part declaration
;
declaration : identifier_declaration
| LOCAL CODE_INSERTION
| IMPORT CODE_INSERTION
| EXPORT CODE_INSERTION
;
identifier_declaration : instruction_declaration
| reservation_declaration
| unit_declaration
| automaton_declaration
| exclusion_clause
;
instruction_declaration : INSTRUCTION
| instruction_declaration IDENTIFIER
;
reservation_declaration : RESERVATION
| reservation_declaration IDENTIFIER
;
unit_declaration : UNIT optional_automaton_identifier
| unit_declaration IDENTIFIER
;
exclusion_clause : EXCLUSION identifier_list COLON identifier_list
;
identifier_list : IDENTIFIER
| identifier_list IDENTIFIER
;
optional_automaton_identifier :
| LEFT_ANGLE_BRACKET
IDENTIFIER RIGHT_ANGLE_BRACKET
;
automaton_declaration : AUTOMATON
| automaton_declaration IDENTIFIER
;
expression_definition_list
:
| expression_definition_list expression_definition
;
expression_definition : instruction_or_reservation_identifier_list COLON
expression SEMICOLON
;
instruction_or_reservation_identifier_list
: instruction_or_reservation_identifier
| instruction_or_reservation_identifier_list
COMMA instruction_or_reservation_identifier
;
instruction_or_reservation_identifier : IDENTIFIER
;
expression : expression expression
| expression PLUS expression
| expression STAR NUMBER
| LEFT_PARENTHESIS expression RIGHT_PARENTHESIS
| LEFT_BRACKET expression RIGHT_BRACKET
| expression BAR expression
| unit_or_reservation_identifier
| NOTHING
;
unit_or_reservation_identifier : IDENTIFIER
;
</verb></tscreen>
<sect>Appendix 2 - Description of Alpha architecture (EV5 version)
<p>
<tscreen><verb>
/* Problems of the description:
o Is divider_write_back necessary if floating divide does not have
a fixed latency? */
%automaton integer multiply float
%unit <integer> e0 e1 load_store_1 load_store_2 store_reservation
%unit <multiply> multiplier multiplier_write_back
%unit <float> fa fm float_divider divider_write_back
%instruction LDL LDQ LDQ_U LDS LDT STL STQ STQ_U STS STT
LDL_L LDQ_L MB WMB STL_C STQ_C FETCH
RS RC HW_MFPR HW_MTPR BLBC BLBS BEQ BNE BLT BLE BGT BGE
FBEQ FBNE FBLT FBLE FBGT FBGE
JMP JSR RET JSR_COROUTINE BSR BR HW_REI CALLPAL
LDAH LDA ADDL ADDLV ADDQ ADDQV S4ADDL S4ADDQ S8ADDL S8ADDQ
S4SUBL S4SUBQ S8SUBL S8SUBQ SUBL SUBLV SUBQ SUBQV
AND BIC BIS ORNOT XOR EQV
SLL SRA SRL EXTBL EXTWL EXTLL EXTQL
EXTWH EXTLH EXTQH INSBL INSWL INSLL INSQL INSWH INSLH INSQH
MSKBL MSKWL MSKLL MSKQL MSKWH MSKLH MSKQH ZAP ZAPNOT
CMOVEQ CMOVNE CMOVLBS CMOVLT CMOVGE CMOVLBC CMOVLE CMOVGT
CMPEQ CMPLT CMPLE CMPULT CMPULE CMPBGE
MULL MULLV MULL1 MULLV1 MULL2 MULLV2
MULQ MULQV MULQ1 MULQV1 MULQ2 MULQV2 UMULH UMULH1 UMULH2
ADDS ADDT SUBS SUBT CPYSN CPYSE CVTLQ CVTQL CVTTQ
FCMOVEQ FCMOVNE FCMOVLE FCMOVLT FCMOVGE FCMOVGT
DIVS DIVT MULS MULT CPYS RPCC TRAPB UNOP
%%
/* Class LD:
o An instruction of class LD can not be simultaneously issued with
an instruction of class ST;
o An instruction of class LD can not be issued in the second cycle
after an instruction of class ST is issued. */
LDL, LDQ, LDQ_U, LDS, LDT:
(e0 + multiplier_write_back | e1) + (load_store_1 | load_store_2)
+ store_reservation
;
/* Class ST:
o An instruction of class LD can not be simultaneously issued with
an instruction of class ST;
o An instruction of class LD can not be issued in the second cycle
after an instruction of class ST is issued. */
STL, STQ, STQ_U, STS, STT:
e0 + multiplier_write_back + load_store_1 + load_store_2 %nothing
store_reservation
;
/* Class MBX */
LDL_L, LDQ_L, MB, WMB, STL_C, STQ_C, FETCH:
e0 + multiplier_write_back
;
/* Class RX */
RS, RC: e0 + multiplier_write_back
;
/* Class MXPR */
HW_MFPR, HW_MTPR: %nothing
;
/* Class IBR */
BLBC, BLBS, BEQ, BNE, BLT, BLE, BGT, BGE: e1
;
/* Class FBR */
FBEQ, FBNE, FBLT, FBLE, FBGT, FBGE: fa
;
/* Class JSR */
JMP, JSR, RET, JSR_COROUTINE, BSR, BR, HW_REI, CALLPAL: e1
;
/* Class IADD */
LDAH, LDA, ADDL, ADDLV, ADDQ, ADDQV, S4ADDL, S4ADDQ, S8ADDL, S8ADDQ,
S4SUBL, S4SUBQ, S8SUBL, S8SUBQ, SUBL, SUBLV, SUBQ, SUBQV:
e0 + multiplier_write_back
;
/* Class ILOG */
AND, BIC, BIS, ORNOT, XOR, EQV : (e0 + multiplier_write_back | e1)
;
/* Class SHIFT */
SLL, SRA, SRL, EXTBL, EXTWL, EXTLL, EXTQL,
EXTWH, EXTLH, EXTQH, INSBL, INSWL, INSLL, INSQL, INSWH, INSLH, INSQH,
MSKBL, MSKWL, MSKLL, MSKQL, MSKWH, MSKLH, MSKQH, ZAP, ZAPNOT:
e0 + multiplier_write_back
;
/* Class CMOV */
CMOVEQ, CMOVNE, CMOVLBS, CMOVLT, CMOVGE, CMOVLBC, CMOVLE, CMOVGT:
(e0 + multiplier_write_back | e1)
;
/* Class ICMP */
CMPEQ, CMPLT, CMPLE, CMPULT, CMPULE, CMPBGE:
(e0 + multiplier_write_back | e1)
;
/* Class IMULL:
o Thirty-two-bit multiplies have an 8-cycle latency, and the
multiplier can start a second multiply after 4 cycles, provided
that the second multiply has no data dependency on the first;
o No instruction can be issued to pipe e0 exactly two cycles before
an integer multiplication completes. */
MULL, MULLV: e0 + multiplier_write_back + multiplier*4 %nothing*2
multiplier_write_back
;
/* Class IMULL with 1 cycle delay */
MULL1, MULLV1: e0 + multiplier_write_back %nothing + multiplier*4
%nothing*2 multiplier_write_back
;
/* Class IMULL with 2 cycles delay */
MULL2, MULLV2: e0 + multiplier_write_back %nothing*2 + multiplier*4
%nothing*2 multiplier_write_back
;
/* Class IMULQ:
o Sixty-four-bit signed multiplies have a 12-cycle latency, and the
multiplier can start a second multiply after 8 cycles, provided
that the second multiply has no data dependency on the first;
o No instruction can be issued to pipe e0 exactly two cycles before
an integer multiplication completes. */
MULQ, MULQV: e0 + multiplier_write_back + multiplier*8 %nothing*2
multiplier_write_back
;
/* Class IMULQ with 1 cycle delay */
MULQ1, MULQV1: e0 + multiplier_write_back %nothing + multiplier*8
%nothing*2 multiplier_write_back
;
/* Class IMULQ with 2 cycles delay */
MULQ2, MULQV2: e0 + multiplier_write_back %nothing*2 + multiplier*8
%nothing*2 multiplier_write_back
;
/* Class IMULH
o Sixty-four-bit unsigned multiplies have a 14-cycle latency, and
the multiplier can start a second multiply after 8 cycles, provided
that the second multiply has no data dependency on the first;
o No instruction can be issued to pipe e0 exactly two cycles before
an integer multiplication completes. */
UMULH: e0 + multiplier_write_back + multiplier*8 %nothing*4
multiplier_write_back
;
/* Class IMULH with 1 cycle delay */
UMULH1: e0 + multiplier_write_back %nothing + multiplier*8 %nothing*4
multiplier_write_back
;
/* Class IMULH with 2 cycles delay */
UMULH2: e0 + multiplier_write_back %nothing*2 + multiplier*8 %nothing*4
multiplier_write_back
;
/* Class FADD */
ADDS, ADDT, SUBS, SUBT, CPYSN, CPYSE, CVTLQ, CVTQL, CVTTQ,
FCMOVEQ, FCMOVNE, FCMOVLE, FCMOVLT, FCMOVGE, FCMOVGT:
fa + divider_write_back
;
/* Class FDIV:
o 2.4 bits per cycle average rate. The next floating divide can be
issued in the same cycle the result of the previous divide
is available.
o Instruction issue to the add pipeline continues while a divide
is in progress until the result is ready. At that point the issue
stage in the instruction unit stalls one cycle to allow the
quotient to be sent to the round adder and then be written into the
register file. */
DIVS: fa + float_divider*18 + divider_write_back
;
/* Class FDIV:
o 2.4 bits per cycle average rate. The next floating divide can be
issued in the same cycle the result of the previous divide
is available.
o Instruction issue to the add pipeline continues while a divide
is in progress until the result is ready. At that point the issue
stage in the instruction unit stalls one cycle to allow the
quotient to be sent to the round adder and then be written into the
register file. */
DIVT: fa + float_divider*30 + divider_write_back
;
/* Class FMUL */
MULS, MULT: fm
;
/* Class FCPYS */
CPYS: (fa + divider_write_back | fm)
;
/* Class MISC */
RPCC, TRAPB: e0 + multiplier_write_back
;
/* Class UNOP */
UNOP: %nothing
;
</verb></tscreen>
<sect>Appendix 3 - Output of pipeline Hazards Description Translator
<p>
The following output was generated under Linux 1.2.8 on Compaq Aero
(Intel SX-25, 8MB memory).
<tscreen><verb>
bash$ time oka -v alpha-ev5.oka
Automaton `integer'
36 NDFA states, 152 NDFA arcs
32 DFA states, 138 DFA arcs
24 minimal DFA states, 118 minimal DFA arcs
146 all instructions 7 instruction equivalence classes
Automaton `multiply'
186 NDFA states, 2283 NDFA arcs
261 DFA states, 2958 DFA arcs
236 minimal DFA states, 2748 minimal DFA arcs
146 all instructions 13 instruction equivalence classes
Automaton `float'
180 NDFA states, 720 NDFA arcs
209 DFA states, 867 DFA arcs
149 minimal DFA states, 687 minimal DFA arcs
146 all instructions 8 instruction equivalence classes
606 all allocated states, 3802 all allocated arcs
1281 all allocated alternative states
2177 all comb vector elements, 4428 all transition table elements
7.90user 1.09system 0:10.39elapsed 86%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (69major+247minor)pagefaults 0swaps
</verb></tscreen>
</article>