-
Notifications
You must be signed in to change notification settings - Fork 110
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
passing struct by value does not respect x86_64 SysV ABI #443
Comments
Here is a small example below showing the eightbyte rule in action: short-short-int fits in a eightbyte ($RAX), but short-int-short does not, it requires 2 eightbytes ($RAX, $RDX) - as if alignment was taken into account. We cannot return 3 eightbytes registers (like 2 long, 1 double), so the struct is passed by pointer in $RDI int this case, and $RDI is copied in $RAX on return.
which disassemble to:
|
For passing parameters by value, my understanding is that we can pass up to 8 eigthbytes individually for each struct (either via register if struct fits on two eightbytes, and if we still have available registers, or via stack...). Only for more than 8 eightbytes, we will pass by MEMORY (a pointer to struct). Here is a little example:
double-int are passed by registers (by value) because fitting on 2 eightbytes Correction: struct are limited to 2 eightbytes, the case of 8 eigthbytes only apply to SSE vectors ( |
I've heard that Lua's FFI is wonderful. Maybe we should take a look how did they tackle this problem. After some quick digging around, here is the code setting up the arguments: Here are the x64 specific defines used by the setting up code: They definitely have all the type information. They get it from parsing the C declarations. Here is an example of the declaration part: |
PyPy/Python CFFI is similiarily good: |
I think I ran into this trying to build FFI to libclang.
It passes a lot of structs by value and some calls came back with nothing.
My message to vm-dev@lists.squeakfoundation.org on Nov 21, 2017 follows;
Apologies if this isn't relevant.
… I've been trying to track this down for a couple weeks now.
I have concluded that structs passed by value to functions on the 64 bit VM are not properly populated. The struct's memory is all zero'd.
I found this while trying to work with LibClang and found that functions that fetched code locations from code ranges always returned invalid zero'd locations. After spending some time with lldb I have traced the problem into the native code and found that the argument is not correct.
I've carved out the wee bit of clang to reproduce this in a tiny library.
The gist of it is below and the entire file is included. Basically the struct passed to the function clang_getRangeStart is zero'd memory regardless of the data I send from the image side.
The build command I used on sierra is clang -shared -undefined dynamic_lookup -o microclang.dylib microclang.c
I'm stuck. The FFI64 plugin is way beyond my comprehension.
// microclang.c
typedef struct {
const void *ptr_data[2];
unsigned int_data;
} CXSourceLocation;
/**
* \brief Identifies a half-open character range in the source code.
*
* Use clang_getRangeStart() and clang_getRangeEnd() to retrieve the
* starting and end locations from a source range, respectively.
*/
typedef struct {
const void *ptr_data[2];
unsigned begin_int_data;
unsigned end_int_data;
} CXSourceRange;
const char* first = "first_pointer";
const char* second = "second_pointer";
// return a fake range with non zero data
CXSourceRange clang_getArbitraryRange()
{
CXSourceRange range = {0};
range.ptr_data[0] = (void*)first;
range.ptr_data[1] = (void*)second;
range.begin_int_data = 17;
range.end_int_data = 24;
return range;
}
// Actual clang function - range is always zero'd here despite having values in the image
CXSourceLocation clang_getRangeStart(CXSourceRange range) {
// Special decoding for CXSourceLocations for CXLoadedDiagnostics.
if ((uintptr_t)range.ptr_data[0] & 0x1) {
CXSourceLocation Result = { { range.ptr_data[0], nullptr }, 0 };
return Result;
}
CXSourceLocation Result = { { range.ptr_data[0], range.ptr_data[1] },
range.begin_int_data };
return Result;
}
On Nov 6, 2019, at 12:57 PM, Nicolas Cellier ***@***.***> wrote:
For passing parameters by value, my understanding is that we can pass up to 8 eigthbytes individually for each struct (either via register if struct fits on two eightbytes, and if we still have available registers, or via stack...). Only for more than 8 eightbytes, we will pass by MEMORY (a pointer to struct).
Here is a little example:
double adi(sdi x) { return (double) x.a + (double) x.b ; }
double adi_2(sdi x,sdi y) { return adi(x) + adi(y); }
double adi_4(sdi x,sdi y,sdi z,sdi t) { return adi_2(x,y) + adi_2(z,t); }
double aslf(sslf x) { return (double) x.a + (double) x.b + (double) x.c ; }
double aslf_2(sslf x,sslf y) { return aslf(x) + aslf(y); }
double aslf_4(sslf x,sslf y,sslf z,sslf t) { return aslf_2(x,y) + aslf_2(z,t); }
double-int are passed by registers (by value) because fitting on 2 eightbytes
short-long-float are passed via stack (by value) because requiring 3 eightbytes
We should try longer (> 8 eightbytes) and verify that they are passed by pointer...
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub <#443?email_source=notifications&email_token=AIJPEW3KXWHZHIYSFPN56DDQSMVU3A5CNFSM4JJ3EAU2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEDH6XYY#issuecomment-550497251>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AIJPEW6526SF2E4V5NDITSTQSMVU3ANCNFSM4JJ3EAUQ>.
|
Hi Nicolas, Hi All, this is an issue of funding the correct specification. I fond this one:
This is far simpler than the alternatives. So at least what we have may be correct on FreeBSD ;-) Can someone find out what version of the standard is operative on linux and macOS? The spec that splits struct fields across available registers is horribly complex but doable. I did implement that spec for the VisualWorks FFI. We're in a much better position than the VisualWorks FFI because Andreas designed and implemented signature type information correctly in his (our) FFI. In his/our FFI, a function's type specification is associated with the function itself. In VisualWorks, what is associated with a function is a very simplified reduction of types, and rich type information is only available attached to actual parameters, so pass in a C object with the wrong type information in VW and it will be marshaled incorrectly. So while this is work to do, I'm sure we can do it quite straight-forwardly. The key issue is to determine the right versions of ABI specification to use before we start implementation. As I showed above I took the easy route; I found a version of the spec that had simple semantics and implemented it. Mea culpa. |
What frustrates me here is that we also have all the type info and we have our own code which mostly works. So why is no one suggesting looking at our code and fixing it instead of relying on other's code that doesn't suit as well? We can wipe our own chins. |
Follow up: for returning struct by value, we use a fake SixteenByteReturn struct {sqInt a; sqInt b;} We would need 4 different return cases sqInt-sqInt sqInt-double double-sqInt double-double For passing struct by value, I just corrected a bug in VMMaker slang about floatType & doubleType check of struct members. But it ain't gonna work in all cases. I will also push more FFI tests for some tricky cases. I have prototyped a method that enumerate the fields and re-compute the alignment, but it does not sound good, this job/information is already done at image side... |
These tests are related to #443
WIP https://source.squeak.org/VMMakerInbox/VMMaker.oscog-nice.2678.diff UPDATE: UPDATE: Next stuff: handle union and packed struct. |
ThreradedFFIPlugin: solve passing/returning struct by value on X64 See #443 On X64/SysV struct up to 16 byte long can be passed by value (& returned) into a pair of 8-byte registers. The problem is to know whether these are integer (RAX RDX) or float (XMM0 XMM1) registers or eventually a mix of... For each 8-byte, we must know if it contains at least an int (in which case we have to use an int register), or exclusively floating points (a pair of float or a double). Previous algorithm did check first two fields, or last two fields which does not correctly cover all cases... For example int-int-float has last two fields int-float, though it will use RAX XMM0. So we have to know about struct layout... Unfortunately, this information is not included into the compiledSpec. The idea here is to reconstruct the information. See #registerTypeForStructSpecs:OfLength: & It's also impossible to cover the exotic alignments like packed structure cases... if we really want to pass that, this will mean passing the alignment information, a more involved change of #compiledSpec (we need up to 16 bits by field to handle that information since our FFI struct are limited to 65535 bytes anyway). For returning a struct, that's the same problem. We have four possible combinations of int-float registers. Consequently, the idea is to analyze #registerType: and switch to appropriate case. I found convenient to pass the ffiRetSpec compiledSpec object thru CalloutState (it's the Smalltalk WordArray object, not a pointer to its firstIndexableField) for performing this analysis... Not sure if this is the best choice. Since we have 4 different SixteenByte types, I have changed value, since it's what will be used to memcpy to allocated ByteArray handle.
ThreadedFFIPlugins: See #443 FFI support for returning of a packed struct by value in X64 SysV On X64/SysV struct up to 16 byte long can be passed by value into a pair of 8-byte registers. The problem is to know whether these are int (RAX RDX) or float (XMM0 XMM1) registers or eventually a mix of... For each 8-byte, we must know if it contains at least an int (in which case we have to use an int register), or exclusively floating points (a pair of float or a double). Previous algorithm did check first two fields, or last two fields which does not correctly cover all cases... For example int-int-float has last two fields int-float, though it will use RAX XMM0. So we have to know about struct layout... Unfortunately, this information is not included into the compiledSpec. The idea here is to reconstruct the information. See #registerTypeForStructSpecs:OfLength: It's also impossible to cover the exotic alignments like packed structure cases... But if we really want to pass that, this will mean passing the alignment information, a more involved change of #compiledSpec (we need up to 16 bits by field to handle that information since our FFI struct are limited to 65535 bytes anyway). For returning a struct, that's the same problem. We have four possible combinations of int-float registers. Consequently, the idea is to analyze the ffiRetSpec compiledSpec object thru CalloutState (it's the Smalltalk WordArray object, not a pointer to its firstIndexableField) for performing this analysis... Not sure if the best choice. Since we have 4 different SixteenByte types, I have changed value, since it's what will be used to memcpy to allocated ByteArray handle. Checking the size of a struct is not the only condition for returning a struct via registers. Some ABI like X64 SysV also mandates that struct fields be properly aligned. Therefore, we cannot just rely on #returnStructInRegisters:. Rename #returnStructInRegisters: -> #canReturnInRegistersStructOfSize: Perform a more thorough analysis during the setup in #ffiCheckReturn:With:in: The ABI will #encodeStructReturnTypeIn: a new callout state. This structReturnType is telling how the struct should be returned - via registers (and which registers) - or via pointer to memory allocated by caller This structReturnType will be used at time of: - allocating the memory in caller - see #ffiCall:ArgArrayOrNil:NumArgs: - dispatching to the correct FFI prototype - see ThreadedX64SysVFFIPlugin>>#ffiCalloutTo:SpecOnStack:in: - copying back the struct contents to ExternalStructure handle (a ByteArray) - see #ffiReturnStruct:ofType:in: Since structReturnType is encoded, it is not necessarily accessed directly, but rather via new implementation of #returnStructInRegisters: whch now takes the calloutState and knows how to decode its structReturnType. Check for unaligned struct and pass them in MEMORY (alloca'd memory passed thru a pointer). Use a new (branchless) formulation for aligning the byteSize to next multiple of fieldAlignment. Encode registryType of invalid unaligned candidate as 2r110, and pass the struct address returned by the foreign function in $RAX register in place of callout limit when stuct is returned by MEMORY. CoInterpreter: eliminate all but one compiler warning. Cogit/Slang: fix several C compiler warnings re the Cogits. Cogit: DUAL_MAPPED_CODE_ZONE (require -DDUAL_MAPPED_CODE_ZONE=1 to enable) Fix denial of write/execute facility on modern Linuxes by dual mapping the code zone in to a read/execute address range for code execution and a read/write address range for code editing. Maintain codeToDataDelta and provide codeXXXAt:put: to write at address + codeToDataDelta to the offset writable address range. Hence DUAL_MAPPED_CODE_ZONE requires a new executbale permissions applyer that will also do the dual mapping, sqMakeMemoryExecutableFrom:To:CodeToDataDelta:. Provide writableMethodFor: as a convenience for obtaining a writable cogMethod. No longer have the fillInXXXHeaderYYY: methods answer anything since they're given the writable header, not the actual header. Cogit: Refactor indexForSelector:in:at: to indexForSelector:in: in the back end so it can be inlined (via a macro). Slang: emit constant for (M << N) and (M - N) - L for constant integers. Fix in slang case statement expansion labels. During expansion in case statements, trees are duplicated and expanded.
Should be completely fixed by c855035 |
… [ lastPointerOfWhileSwizzling: ]
…[ lastPointerOfWhileSwizzling: ] KILLED by 1/209 test cases.
…ver with false ] on method [ freeTreeNodesDo: ]
…er with false ] on method [ freeTreeNodesDo: ] KILLED by 28/234 test cases.
… method [ popObjStack: ] 14 test cases.
…method [ popObjStack: ] 14/14 test case are EQUIVALENT
The case when a structure passed by value is passed via memory (thru a pointer), via registers, or just stack, same for structure return value is much more complex than what is programmed currently in both FFI and Alien (FFI and Alien currently do not even agree...).
This badly need more test cases and more attention.
See squeak-dev thread [squeak-dev] Alien primFFICall returning struct with 64bit vm
http://lists.squeakfoundation.org/pipermail/squeak-dev/2019-November/204974.html
See draft for ABI, near page 21 and 24 (not very clear)
https://github.com/hjl-tools/x86-psABI/wiki/x86-64-psABI-1.0.pdf
The text was updated successfully, but these errors were encountered: