Improve register allocation (part 2) #48

Merged
merged 12 commits into from
May 31, 2024

janvrany
Owner

This PR further improves RA in Tinyrossa by adding support for spilling/reloading via
live interval splitting. Conflicting register dependencies are also solved via splitting.

The class comment of TRReverseLinearScanRegisterAllocator provides a basic overview
and pointers to various methods which contain further comments (hopefully)
explaining how the allocator works.

To test the allocator, all tests are now run both with and without the "stressRA" config option,
which severely limits the number of registers available for allocation. Also, when the "stressRA"
option is on, the RISC-V codegen artificially forces the first operand and the result of mul and
mulw to t0 to put more stress on RA.

Still, there's much to be desired:

  • For example, the allocator ignores dependencies when allocating registers - it should look ahead and allocate them so they satisfy the dependency rather than move things around later when needed.
  • There's no support for value rematerialization; for example, there's no need to spill a register that holds a constant or the value of a parameter/automatic.
  • Spill slot allocation is rather simplistic: it just creates a new automatic. This works fine, but some ABIs require the stack frame to contain a spill area (POWER, Windows x64). So ultimately, it is the linkage that should provide the spill slot.

All of this is left as future work.

janvrany added 12 commits May 22, 2024 19:20

This has been used to track whether or not the value is currently
spilled, but this information is not currently used anywhere.

Might be reintroduced in future when needed.

This commit changes the live interval structure to keep track of all
uses of a virtual register rather than just its definition (start)
and last use (end). It also keeps track of whether, at a given position,
the register was defined (written to), used (value read), or both.

This is a preparation for spilling / reloading when running out
of available registers, as well as for deciding which register to spill
(based on how many times and how "far" away a given virtual register is
defined/used).

Internally, both def and use positions are stored in a single ordered
array: def position `p` is stored as `(p*2) + 1` and use position `p`
as `(p*2)`.
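
The position encoding described above can be illustrated with a small Python sketch. This is not Tinyrossa code (Tinyrossa is written in Smalltalk); the class `LiveInterval` and all method names here are hypothetical and exist only to show how a single sorted array can hold both kinds of events.

```python
from bisect import insort


def encode_def(p: int) -> int:
    """Encode a definition (write) at instruction position p."""
    return p * 2 + 1


def encode_use(p: int) -> int:
    """Encode a use (read) at instruction position p."""
    return p * 2


def position(encoded: int) -> int:
    """Recover the instruction position from an encoded entry."""
    return encoded // 2


def is_def(encoded: int) -> bool:
    """Odd entries are definitions, even entries are uses."""
    return encoded % 2 == 1


class LiveInterval:
    """Hypothetical live interval keeping all defs and uses in one sorted list."""

    def __init__(self) -> None:
        self.encoded: list = []

    def add_def(self, p: int) -> None:
        self._add(encode_def(p))

    def add_use(self, p: int) -> None:
        self._add(encode_use(p))

    def _add(self, e: int) -> None:
        # Keep the list sorted and free of duplicates.
        if e not in self.encoded:
            insort(self.encoded, e)

    @property
    def start(self) -> int:
        return position(self.encoded[0])

    @property
    def end(self) -> int:
        return position(self.encoded[-1])

    def events(self) -> list:
        """Return (position, kind) pairs in order."""
        return [(position(e), "def" if is_def(e) else "use") for e in self.encoded]
```

For instance, recording a def at position 3, uses at positions 5 and 9, and a second def at position 5 gives the encoded array `[7, 10, 11, 18]`. A side effect of this encoding is that a use at position `p` sorts before a def at the same position, which matches an instruction that reads a register before overwriting it.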

This commit moves spills up to where the value is actually defined
(i.e., to the beginning of the live interval). This will (hopefully) make
interval splitting easier to implement. Also, this moves spills (memory
writes) further away from reloads (memory reads), which may (or may not)
help performance.

This change makes manual handling of thrashed registers (defined by
an instruction's register dependencies) simpler - no need to call
`#insertSpill:` when handling pre-dependencies - thrashed registers are
spilled automatically when expired.

This commit adds support for general register spilling, handling the
(previously unhandled) case where the allocator runs out of free registers.

This is done by splitting the (live) interval at the given allocation position.
A reload is inserted at allocation position + 1, and a new interval
representing the 'still-live' part of the split interval is set up to spill
and pushed back onto the worklist.

Also, thrashed registers are now handled the same way - when an instruction
at a given position thrashes a register, its interval is force-split,
causing the value to be spilled before and reloaded after the given position.

With these changes, Tinyrossa can compile recursive factorial with overflow
with just 2 registers available for allocation.
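
A rough Python sketch of the splitting step as described in this commit message follows. It is one possible reading, not the actual Smalltalk implementation: the `Interval` class, the `split` function, the `reloads` list and the choice to put positions at or before the split point into the earlier part are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Interval:
    """Hypothetical live interval; positions are sorted instruction indices."""
    vreg: str
    positions: List[int]
    register: Optional[str] = None   # real register assigned so far, if any
    spill_at_def: bool = False       # store the value to memory where it is defined


def split(interval: Interval, at: int,
          reloads: List[Tuple[int, str, Optional[str]]]) -> Interval:
    """Split `interval` at allocation position `at` (scanning backwards).

    The part after `at` has already been allocated and keeps its register;
    a reload is recorded at `at + 1` so the value is back in that register.
    The part at or before `at` (assumption: inclusive) becomes a new,
    unallocated interval that is marked to spill.
    """
    later = [p for p in interval.positions if p > at]
    earlier = [p for p in interval.positions if p <= at]
    if not later or not earlier:
        raise ValueError("split position must fall inside the interval")

    reloads.append((at + 1, interval.vreg, interval.register))
    interval.positions = later
    return Interval(interval.vreg, earlier, register=None, spill_at_def=True)
```

In this sketch the caller would push the returned interval back onto its worklist so that the earlier part can be given a register (possibly a different one) when the backwards scan reaches it. Splitting an interval with positions `[2, 6, 12]` held in `t1` at position 8 leaves `[12]` in `t1`, records a reload at position 9, and returns a new interval covering `[2, 6]`.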

This commit changes `#pickRegister:` to NOT assign a register to the interval
in order to behave like other 'pick' methods in RSLRA.

This commit adds `#isUnsatisfiedDependency`, which returns true if
there's a dependency of a virtual register on a real register which is not
satisfied (i.e., the virtual register is allocated a different register than
the desired real register).

Also removes `TRRegisterDependency >> #isDependency` as it is not used
any more.
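
The check described above can be pictured with a minimal Python sketch. The classes and field names below are hypothetical stand-ins, not the Tinyrossa classes, and treating a not-yet-allocated virtual register as unsatisfied is an assumption the commit message does not confirm.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VirtualRegister:
    """Hypothetical virtual register with its current allocation, if any."""
    name: str
    allocation: Optional[str] = None


@dataclass
class RegisterDependency:
    """Hypothetical dependency: `vreg` must live in real register `rreg`."""
    vreg: VirtualRegister
    rreg: str

    def is_unsatisfied(self) -> bool:
        # Unsatisfied when the virtual register sits in a different real
        # register than required. Assumption: an unallocated virtual
        # register (allocation is None) also counts as unsatisfied.
        return self.vreg.allocation != self.rreg
```

With this shape, a dependency of `v1` on `a0` reports unsatisfied while `v1` is allocated to `t1`, and satisfied once `v1` is allocated to `a0`.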

So far, handling of pre and post register dependencies was very limited:
it could only handle dependencies on non-allocatable registers. This
made things simple - all that was needed was to move the value to / from
the required real register. Since a dependency could only be on non-allocatable
registers, there was no way to introduce a conflict.

The downside was that this limited the number of registers
available - for example, argument / return registers could not be
allocated: what a waste of good registers!

This commit improves handling of dependencies by leveraging support for
interval splitting.

This commit adds one more parameter to run all tests with the `stressRA`
config option.

This commit forces the first operand and result of the `mul` and `mulw`
instructions to register `t0` when the user enables the `stressRA` config option.
This is to generate more pressure on the register allocator for testing
and debugging purposes; normally this option is not used.

@janvrany
Owner Author

FYI: @shingarov , @melkyades

@shingarov
Collaborator

I am subscribed to this repo so I am getting email notifications of all PR activity even without the FYI mention. Saying this just to double-ensure you have the right idea about my level of interest in TR.

@janvrany janvrany merged commit 1ac545a into master May 31, 2024
2 checks passed
@janvrany janvrany deleted the pr/improve-register-allocation-part02 branch May 31, 2024 13:08