[aapcs64] Unclear callee/caller wording in aapcs64.rst #266

Open
a74nh opened this issue Jun 17, 2024 · 17 comments · Fixed by #267

Comments

@a74nh

a74nh commented Jun 17, 2024

In aapcs64.rst

z0-z7 are used to pass scalable vector arguments to a subroutine, and to return scalable vector results from a function. If a subroutine takes at least one argument in scalable vector registers or scalable predicate registers, or if it is a function that returns results in such registers, it must ensure that the entire contents of z8-z23 are preserved across the call. In other cases it need only preserve the low 64 bits of z8-z15, as described in SIMD and Floating-Point registers.

p0-p3 are used to pass scalable predicate arguments to a subroutine and to return scalable predicate results from a function. If a subroutine takes at least one argument in scalable vector registers or scalable predicate registers, or if it is a function that returns results in such registers, it must ensure that p4-p15 are preserved across the call. In other cases it need not preserve any scalable predicate register contents.

In both cases, in "it must ensure that", it is not clear whether "it" refers to the caller or the callee.

E.g., if it is the callee, then the wording should be "the subroutine must ensure that".

This wording caused issues when designing SVE support for .NET.

@kunalspathak

.NET issue that describes SVE support: dotnet/runtime#93095

@smithp35
Contributor

I agree that the sentence can be clarified.

Assuming the confusion hasn't been resolved already, there are a couple of other parts that may help in parsing the text:
https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#22terms-and-abbreviations

Routine, subroutine
A fragment of program to which control can be transferred that, on completing its task, returns control to its caller at an instruction following the call. Routine is used for clarity where there are nested calls: a routine is the caller and a subroutine is the callee.

https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#3scope

Obligations on the called routine to preserve the program state of the caller across the call.

Combining this with the original text, the "it" is referring to the callee.

@a74nh
Author

a74nh commented Jun 17, 2024

Combining this with the original text, the "it" is referring to the callee.

Thanks! That was the conclusion we came to after reading around the issue elsewhere. But it would be nice for it to be clearer.

smithp35 added a commit to smithp35/abi-aa that referenced this issue Jun 18, 2024
At least one community got confused as to whether "it" referred
to the callee or the caller. Use "subroutine" instead of "it" to
make it clear that we are referring to the same subroutine
that takes z and p registers as arguments.

Fixes ARM-software#266
@smithp35
Contributor

#267 to update wording.

@kunalspathak

kunalspathak commented Jun 19, 2024

Just to be clear, here is my understanding. @rsandifo-arm @smithp35 - please correct if I missed anything.

Terminology

  • callee-save: Registers that should be saved/restored by the callee in its prolog/epilog
  • caller-save: Registers that should be saved/restored by the caller around the call it makes
  • sve method: Method that takes sve/predicate arguments or returns sve/predicate result
  • regular method: Method that neither takes sve/predicate arguments nor returns sve/predicate result
A()
{
prolog:
   save callee-save registers
   ...
   ...
   save caller-save registers
   B();
   restore caller-save registers
   ...   
   ...
epilog:
   restore callee-save registers
}
  • A to B, read it as method A calls method B
  • prolog/epilog of A: callee-save of method A
  • before/after call to B: caller-save by method A

Float/Scalable registers

| Scenario# | A to B | prolog/epilog of A | before/after call to B |
|---|---|---|---|
| 1 | regular to regular | bottom 64-bits v8-v15 [1] | v0-v7, v16-v31, top 64-bits v8-v15 [1] |
| 2 | regular to sve | bottom 64-bits v8-v15 [1] | z0-z7, z24-z31 |
| 3 | sve to regular | z8-z23 | v0-v7, v16-v31, top 64-bits v8-v15 [1] |
| 4 | sve to sve | z8-z23 | z0-z7, z24-z31 |

[1] This is the same specification we have for NEON, and it is only applicable when the registers are in use or live.

Predicate registers

| Scenario# | A to B | prolog/epilog of A | before/after call to B |
|---|---|---|---|
| 1 | regular to regular | NA | p0-p15 |
| 2 | regular to sve | NA | p0-p3 |
| 3 | sve to regular | p4-p15 | p0-p15 |
| 4 | sve to sve | p4-p15 | p0-p3 |

@smithp35
Contributor

smithp35 commented Jun 19, 2024

I'm going to use the official terminology of caller-save instead of callee_trash.

Just to be sure (apologies if this was already clear): caller-save and callee-save are more like responsibilities to save than they are requirements to save. For example, a callee only needs to save a callee-save register if it uses the register. A caller only needs to save a caller-save register before a call if there is a live value in the register that the caller needs to access after the call.
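A minimal sketch of this point, treating register sets as plain Python sets of names. The function name `registers_to_save` and the example register choices are hypothetical, chosen only for illustration; they do not come from the AAPCS64 text.

```python
def registers_to_save(responsibility, needed):
    """Return the registers that actually need a save/restore.

    responsibility: registers the function is responsible for (its own
        callee-save set, or the caller-save set for one particular call).
    needed: registers the function uses (callee side) or keeps live
        across the call (caller side).
    """
    return set(responsibility) & set(needed)


# Hypothetical sve callee that only touches z0, z8 and z9.
callee_save = {f"z{i}" for i in range(8, 24)}
prologue_saves = registers_to_save(callee_save, {"z0", "z8", "z9"})

# Hypothetical caller with p1 and p5 live across a call to an sve function.
caller_save = {f"p{i}" for i in range(0, 4)}
around_call_saves = registers_to_save(caller_save, {"p1", "p5"})
```

In this sketch the callee saves only z8 and z9 in its prologue, and the caller saves only p1 around the call, because p5 is the callee's responsibility.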

This is my reading of the document. I'm not an SVE expert like @rsandifo-arm, so if I've got this wrong please go with his answer/correction rather than mine. I'm more of a linker than a compiler person.

I found it easier to describe when not considering the different call scenarios as there is only a caller and a callee and the responsibilities of the caller don't change if the callee is sve or regular.

| Function type | callee-save | caller-save |
|---|---|---|
| regular | bottom 64-bits of v8-v15 | v0-v7, v16-v31, top 64-bits of v8-v15 |
| sve | z8-z23 | v0-v7, v16-v31 (*) |

(*) z16-z23 are extensions of v16-v23, so these are both callee- and caller-saved.

| Function type | callee-save | caller-save |
|---|---|---|
| regular | - | p0-p3 |
| sve | p4-p15 | p0-p3 |

I do hope I've got this right, if I haven't and it isn't a silly mistake then we may need more clarifications.

@kunalspathak

I found it easier to describe when not considering the different call scenarios

That's how I wanted it to be, but I wanted to be explicit about the situation. For example, in your table, for the "regular" function type, I interpret the "caller-save" column as: if a "regular" function is the caller, which registers does it need to save/restore across a function call? But that depends on what type of function it is calling. If it is calling a "regular" function, it needs to save/restore v0-v7, v16-v31, top 64-bits of v8-v15, but if it is calling an sve function, it needs to save/restore just z0-z7, z24-z31, because the sve function (which will be the callee in this case) will be responsible for preserving z8~z23. The same goes for the other combinations.

Also, for the "regular" function type, if it is calling a "regular" function, then it should save/restore the entire p0~p15, while if it is calling an "sve" function, it should preserve just p0~p3, because p4~p15 will be preserved by the "sve" function (which is the callee in this case).

Note: When I say the caller should preserve registers across a function call, I mean only the registers that are live across the call. So, in my table, out of the registers mentioned in the "callee-trash" column, only the registers that are live across the call will be preserved by the caller.

I do hope I've got this right

I feel the same :)

@kunalspathak

then we may need more clarifications

Regardless of whether we get this right or not, I think the document needs a clear way of stating these requirements, something equivalent to how we present this information in the table. A lot of time is being spent by multiple people trying to interpret a couple of lines of the document.

@smithp35
Contributor

OK I see where you are coming from. The safest assumption is that what is not callee-saved by the callee must be caller-saved. That would indeed imply that p0~p15 would need saving when calling a regular function.

I'll reopen this as I think more work is needed here.

@smithp35 smithp35 reopened this Jun 19, 2024
@pmsjt

pmsjt commented Jun 19, 2024

Functions without SVE types in the signature don't have to save any SVE state. If they had to, then existing functions would not be legal anymore. The only things a function without SVE types in the signature must worry about are:

  • If they want to preserve SVE state across calls they make, they may need to save them. This will depend on whether the callee takes SVE parameters or not. Callees with SVE parameters will, themselves, preserve a lot of registers so the caller may not need to save anything. When calling a function that doesn't have SVE types in the signature, you must assume all SVE state will be trashed.
  • If they need stack-bound local SVE variables, either because the function uses more SVE variables than there are registers, or because some variable has its address taken, then you must allocate space in the stack for them.
  • The only thing they might have to save in the prolog and restore in the epilog are D8->D15 (lower 64bits of Neon registers Q8->Q15). This is not new - this is the existing Neon callee-saved rule, but the compiler must take into consideration that using Z8->Z15 means D8->D15 will be affected. If a function that doesn't have SVE types in the signature uses SVE but avoids Z8->Z15 then it doesn't have to save anything in the prolog or restore it in the epilog.
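The last bullet can be sketched as a small helper. This is only an illustration under stated assumptions: the function name `d_registers_to_save` is hypothetical, and registers are identified by their index, relying on the point above that writing Zn affects Dn.

```python
def d_registers_to_save(z_indices_used):
    """For a function WITHOUT SVE types in its signature, list the D
    registers to save in the prolog and restore in the epilog, given the
    indices of the Z registers the function body writes.

    Only D8-D15 are callee-saved, and Zn overlaps Dn, so only Z8-Z15
    matter here.
    """
    return [f"d{i}" for i in sorted(set(z_indices_used)) if 8 <= i <= 15]


# Hypothetical function that uses z0, z9, z12 and z20.
saves = d_registers_to_save([0, 9, 12, 20])

# Hypothetical function that avoids z8-z15 entirely: nothing to save.
no_saves = d_registers_to_save([0, 1, 16, 31])
```

Any SVE values this function needs to keep live across calls it makes are a separate, caller-side concern and are not covered by this helper.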

@tannergooding

tannergooding commented Jun 19, 2024

the responsibilities of the caller don't change if the callee is sve or regular.

There is a lot of nuance here and it is easy for developers to miss considerations.

A callee x is responsible for saving (typically in the prologue) and restoring (typically in the epilogue) the callee-save set of its own calling convention a

A caller x is also responsible for saving (typically before the call) and restoring (typically after the call) the caller-save set of the calling convention b for callee y

Thus, if conventions a and b match (sve x->sve y -or- regular x->regular y), then this is relatively simple as you only have to consider the context of the individual methods x and y because the callee-save for a is the inverse mask to the caller-save for a

However, if conventions a and b do not match (sve x->regular y -or- regular x->sve y), then the caller-save set becomes more interesting as the callee-save for a will typically not be an inverse of the caller-save for b. Instead, they will have a union of some registers. This means that the caller x must also consider any registers that are disjoint.

The simplest example of this is that for a regular call, none of P0-P15 are considered callee-save. Thus a regular method is free to trash any and all predicate registers without consideration. However, P4-P15 are considered callee-save for an sve call, and thus an sve method must save P4-P15 if they are used.

What this means is that for regular x->regular y, x is free to trash any predicate registers. If it has a predicate register that needs to remain "live" across the call to y, it must save/restore them.

For sve x->sve y, x is free to trash P0-P3, but must save and restore P4-P15 if they are used. It must only save P0-P3 across the call to y if they need to remain live.

However, for regular x->sve y the sets differ and x now only has to save P0-P3 because y must be saving/restoring P4-P15.

It gets very interesting for sve x->regular y however, because the regular call (y) is free to trash any of P0-P15. This means that not only does x need to save the normal set of P0-P3 if it's using them and needs them to remain live across the call, it must also assume that y will trash P4-P15 and is now responsible for saving them across the call boundary (because any prior sve caller could itself be using them and expected x to have saved them).
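The four predicate-register cases above can be sketched as follows. This is a rough model, not wording from the specification: the names `callee_save_predicates` and `predicate_responsibilities` are hypothetical, and the returned sets are upper bounds, since only registers that are actually used (prologue) or live (around the call) need saving.

```python
P_ALL = frozenset(f"p{i}" for i in range(16))
P_SVE_CALLEE_SAVE = frozenset(f"p{i}" for i in range(4, 16))


def callee_save_predicates(kind):
    """Predicate registers a method of this kind preserves for its caller."""
    return P_SVE_CALLEE_SAVE if kind == "sve" else frozenset()


def predicate_responsibilities(x_kind, y_kind):
    """For method x calling method y, return (prologue_set, around_call_set).

    prologue_set: what x must preserve for its own caller (x's convention).
    around_call_set: what y may trash, so x must save if live (y's convention).
    """
    prologue = callee_save_predicates(x_kind)
    around_call = P_ALL - callee_save_predicates(y_kind)
    return prologue, around_call


for x_kind in ("regular", "sve"):
    for y_kind in ("regular", "sve"):
        prologue, around_call = predicate_responsibilities(x_kind, y_kind)
        print(f"{x_kind} -> {y_kind}: prologue={sorted(prologue)}, "
              f"around call={sorted(around_call)}")
```

Keeping the two sets separate reflects the point made above: the prologue set depends only on x's own convention, while the around-call set depends only on the convention of the method being called.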

@smithp35
Contributor

Thanks for the additional points. This has somewhat spiralled from the meaning of "it" :-) in a couple of sentences. I'll discuss with my colleagues to see if there is a better way of describing this.

@kunalspathak

kunalspathak commented Jun 19, 2024

I have updated #266 (comment) to use the terminology of "caller-save" instead of "callee-trash".

@smithp35
Contributor

smithp35 commented Jun 20, 2024

Looking at the table that you have updated, I think it is best not to try and enumerate the callee-save registers and caller-save registers in the same table.

The callee-save registers are a requirement for a function to preserve the values of registers across the call, so that the values of these registers on entry to the function are the same as the values on return. This requirement is invariant of the caller, or whether there are any calls at all. This looks right in your table.

The set of caller-save registers are determined per call (a function could call both regular and sve functions). They are the registers that are not guaranteed to be preserved by the function being called (registers not in the callee-saves of the function being called).

| Function type | Callee-saves |
|---|---|
| regular | bottom 64-bits v8-v15 |
| SVE | z8-z23, p4-p15 |

| Called function type | Caller-save registers for call |
|---|---|
| regular | All registers not in {bottom 64-bits of v8-v15} (*) |
| sve | All registers not in {z8-z23, p4-p15} |

(*) In practice this means all SVE state including predicate registers; it is going to be hard to work out that SVE values are always going to be within bottom 64-bits of v8-v15.
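The "all registers not in the callee-saves" rule can be sketched as a set difference. The model below is an assumption made for illustration: it covers only vector and predicate state, splits each vector register into a `low64` part and an `upper` part (all bits above the low 64), and uses hypothetical names throughout.

```python
from itertools import product

# Each vector register is modelled as two parts; each predicate as one.
VECTOR_STATE = frozenset(
    product((f"z{i}" for i in range(32)), ("low64", "upper"))
)
PREDICATE_STATE = frozenset((f"p{i}", "all") for i in range(16))
ALL_STATE = VECTOR_STATE | PREDICATE_STATE

# Callee-saves per function type, following the first table above.
CALLEE_SAVES = {
    "regular": frozenset((f"z{i}", "low64") for i in range(8, 16)),
    "sve": frozenset(
        product((f"z{i}" for i in range(8, 24)), ("low64", "upper"))
    ) | frozenset((f"p{i}", "all") for i in range(4, 16)),
}


def caller_saves_for_call(called_kind):
    """State not guaranteed to survive a call to a function of this type."""
    return ALL_STATE - CALLEE_SAVES[called_kind]


regular_call = caller_saves_for_call("regular")
sve_call = caller_saves_for_call("sve")
```

Because the set is computed per call, a function that calls both regular and sve functions would evaluate `caller_saves_for_call` separately at each call site, then narrow the result to the state that is live there.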

I've got more registers that need to be saved when calling regular functions than your table entries for caller-save.

Hope I haven't made any mistakes, I'm hoping that we can find the right wording to improve the AAPCS over the next few weeks.

@kunalspathak

kunalspathak commented Jun 20, 2024

All registers not in {bottom 64-bits of v8-v15} *

I assume that includes p0-p15 (might be better to clarify)

@smithp35
Contributor

I've edited my * comment to "In practice this means all SVE state including predicate registers". Hopefully that should cover it.

@kunalspathak

I've got more registers that need to be saved when calling regular functions than your table entries for caller-save.

Yes, I realized it and have updated #266 (comment) accordingly.
