This repository has been archived by the owner on Mar 20, 2024. It is now read-only.

Element width in whole register move load/stores affects misalignment exceptions? #529

Closed
kasanovic opened this issue Jul 10, 2020 · 4 comments
Labels
Resolve for v1.0 To be resolved for v1.0 draft

Comments

@kasanovic
Collaborator

The element width in whole register load/stores is currently treated as a hint, but it would somewhat simplify machines if this was treated as the actual transfer width in terms of reporting misalignment exceptions (on EEW boundaries). The width would then no longer be strictly a hint.
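A minimal sketch of the check this proposal implies, assuming the encoded width is one of 8, 16, 32 or 64 bits. The function name `is_misaligned` and the accepted width set are illustrative assumptions, not taken from the spec.

```python
def is_misaligned(addr: int, eew_bits: int) -> bool:
    """Return True if addr is not a multiple of the element width in bytes.

    Assumption: eew_bits is the width encoded in the whole register
    load/store, treated as the actual transfer width.
    """
    if eew_bits not in (8, 16, 32, 64):
        raise ValueError("unsupported EEW for this sketch")
    eew_bytes = eew_bits // 8
    # eew_bytes is a power of two, so the low bits of the address decide.
    return (addr & (eew_bytes - 1)) != 0
```

Under this reading, an EEW=8 access can never report misalignment, while an EEW=32 access to an address such as `0x1002` would.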

@David-Horner
Contributor

This may be problematic for use within traps/code unaware of last store values.

The width is now coupled to the spiller's/filler's memory alignment constraints, not to the application that invoked it (or was interrupted).

In such a case the code might resolve to the lowest common denominator, 8 bits, and that can be very bad for subsequent register access with EEW > 8 on a machine whose in-register layout does not match the in-memory layout.

@aswaterman
Collaborator

Spill/fill code can easily guarantee sufficient alignment for whichever width hint it chooses to use. It doesn't seem likely that this will force least-common-denominator use of EEW=8.
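A minimal sketch of how spill/fill code might pick a suitably aligned slot, assuming the chosen width is a power-of-two number of bits. The helper name `align_up` and the example addresses are hypothetical.

```python
def align_up(addr: int, eew_bits: int) -> int:
    """Round a non-negative addr up to the next multiple of the EEW in bytes.

    Assumption: eew_bits is a power of two and at least 8.
    """
    eew_bytes = eew_bits // 8
    return (addr + eew_bytes - 1) & ~(eew_bytes - 1)


# Hypothetical spill area base; the slot is aligned for an EEW=64 hint.
slot = align_up(0x1003, 64)
```

The cost is at most `eew_bytes - 1` bytes of padding per spill area, which is why the alignment itself is cheap to guarantee.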

@David-Horner
Contributor

The point is we are providing this formulation for the benefit of SLEN<VLEN-like (SlVl) machines.
The majority of machines are likely not to be SlVl.
What incentive is there for software developers to use multiple widths when one-size-(apparently)-fits-all?
Especially when
a) loads are the most impactful,
b) only SlVl downstream execution performance is impacted (typically outside the scope of software testing/profiling),
c) LCD (lowest-common-denominator) 8bit is convenient for common code collapse (thus software maintenance), and
d) tail return code is a good candidate for code consolidation, and is where fill loads are most likely to occur.

This is a recipe for LCD 8bit domination, as all the incentives drive that way: not forced, but many collective nudges and no guard rails to keep the code on track to benefit SlVl machines.

If SlVl machines were to gain dominance, then the metrics could change, but only if the performance impacts were also substantial.

More likely, for 8bit WR loads, SlVl machines will (eventually) ignore alignment issues (it is LCD alignment after all) and also ignore 8bit as an in-register format directive, instead using some heuristic, possibly of the type I suggest in the section Better Mechanisms, #503 (comment).

In other words, what we intended for good will end up dead weight.

The possible good takeaway from exploring the alignment issues is that we may as well designate 8bit, and not e1024, as the "unsure" case (which works well for #519, Reserve e1024 (and also e512, maybe e256)).

@David-Horner
Contributor

To the original point, I see no value in this width formulation enforcing alignment.

How does it help software?

Whole Register (WR) stores' width specification has no impact on subsequent SLEN<VLEN-like (SlVl) execution; only the loads do.
Alignment has performance impact, but that is independent of a width designation.
The typical usage (and original justification for WR load/store) is to vacate physical registers with no need to understand their current "vtype usage" (not the same as the current vtype setting), that is, the EMUL/EEW/EDIV/vl/ta/ma used to generate their contents, or the same regarding their future consumption.

Thus, given this predominant use case, WR store alignment will typically affect WR load alignment as the same memory location is typically used for both.
A more restrictive alignment on load than on store may result in runtime failure that is hardware implementation dependent.
We have generally been averse to non-portability and runtime failure.
#397 (comment) in a portable way wrt SLEN
#418 (comment) point 2. A certain class of code errors would not be caught

Further, as explained above in #529 (comment), software will be driven to 8bit WR loads, having agonized (lost energy) over the correct formulation for WR stores.

The net is that software is negatively affected by this change, which is not required for the correct functioning of WR loads/stores.

How does it help hardware?

it would somewhat simplify machines if this was treated as the actual transfer width in terms of reporting misalignment

But is there any misalignment to report?
A machine could handle all WR loads/stores as 8bit aligned from the memory-side perspective, and only such SlVl machines are going to care about width encoding from the in-register perspective.
A machine could derive the unit of load/store from the low bits of the address, optimize for "good alignment", and only such SlVl machines are going to care about width encoding from the in-register perspective.
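The second option can be sketched as follows: the transfer unit is the largest power of two that divides the address, capped at some implementation maximum. The name `transfer_unit_bytes` and the `max_unit_bytes` cap are hypothetical, chosen only for illustration.

```python
def transfer_unit_bytes(addr: int, max_unit_bytes: int = 8) -> int:
    """Pick a transfer unit from the low bits of a non-negative address.

    Assumption: max_unit_bytes is a power of two chosen by the
    implementation; it is not a value defined by the spec.
    """
    if addr == 0:
        return max_unit_bytes
    # addr & -addr isolates the lowest set bit, i.e. the largest
    # power of two that divides addr.
    lowest_set_bit = addr & -addr
    return min(lowest_set_bit, max_unit_bytes)
```

With this scheme no address is ever misaligned for the chosen unit, so the encoded width is free to act purely as an in-register format hint.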

A profile stipulation can be made that all machines of a certain class must only use a certain non-byte alignment (e.g. the XLEN of the system), in which case:

  • there is a loss of functionality (whole register shift up/down by less than this alignment factor is lost)
  • consistency with regular load/store is vacated; this should not be supported, and it risks further fragmentation of the whole ecosystem (although it might be OK for that profile, say a unix-like one).

And none of this assures that SlVl machines are going to have reasonable in-register format assignments for their next use.

This is not a win-win by any metric.
