Is there interest in a SSE port of vec4, quat & mat4? #16

Open
mpolitzer opened this issue Nov 2, 2013 · 7 comments

Comments

@mpolitzer

I'm willing to do it. I just want to make sure there is a possibility of inclusion before starting.

Regards,
Marcelo

@Kazade
Owner

Kazade commented Nov 2, 2013

Definitely! It's always been on my "to do someday" list :)


@mpolitzer
Author

Sorry for the really long delay.
I am far from an SSE specialist, but I would like to talk about the API a little.
I know that breaking compatibility is probably not worth it, but here go my 2 cents:
Since SSE has its own register set, it is usually best to pass arguments by value rather than by reference.

For example:

  • kmVec4* kmVec4Add(kmVec4* pOut, const kmVec4* pV1, const kmVec4* pV2);

Would become this, alleviating the load/store cost:

  • kmVec4 kmVec4Add(const kmVec4 pV1, const kmVec4 pV2);
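To make the trade-off concrete, here is a minimal sketch of the two calling styles written directly against SSE intrinsics. The function names `vec4_add_by_value` and `vec4_add_by_pointer` are hypothetical and not part of kazmath. Whether the by-value form really avoids memory traffic depends on the calling convention and on inlining, so treat this as an illustration rather than a benchmark.

```c
#include <xmmintrin.h> /* SSE intrinsics: __m128, _mm_add_ps, ... */

/* Hypothetical by-value variant: operands and result are __m128 values,
 * which the compiler can usually keep in XMM registers. */
static inline __m128 vec4_add_by_value(__m128 a, __m128 b)
{
    return _mm_add_ps(a, b);
}

/* Pointer-based variant in the style of the existing API: operands are
 * loaded from memory and the result is stored back on every call.
 * Unaligned loads/stores are used so no alignment is assumed. */
static inline float *vec4_add_by_pointer(float *out, const float *a, const float *b)
{
    _mm_storeu_ps(out, _mm_add_ps(_mm_loadu_ps(a), _mm_loadu_ps(b)));
    return out;
}
```

With the by-value form, a chain such as `vec4_add_by_value(vec4_add_by_value(a, b), c)` never has to name an intermediate output buffer.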

A vec3 could be implemented as a vec4 and eat an extra float, or stay as float[3] and be converted to a vec4 and back to a vec3 for each operation, costing some performance.
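The second option (keep three floats in memory, widen to four lanes per operation) could look roughly like the sketch below. The type `vec3_sketch` and the helper names are hypothetical stand-ins, not kazmath's actual kmVec3 code. The extra load and store helpers are where the per-operation cost comes from.

```c
#include <xmmintrin.h>

/* Hypothetical stand-in for a three-float vector. */
typedef struct { float x, y, z; } vec3_sketch;

/* Widen to four lanes. _mm_set_ps takes lanes from highest to lowest,
 * so the unused fourth lane is set to zero. */
static inline __m128 vec3_load(const vec3_sketch *v)
{
    return _mm_set_ps(0.0f, v->z, v->y, v->x);
}

/* Narrow back to three floats, discarding the padding lane. */
static inline void vec3_store(vec3_sketch *out, __m128 r)
{
    float tmp[4];
    _mm_storeu_ps(tmp, r);
    out->x = tmp[0];
    out->y = tmp[1];
    out->z = tmp[2];
}

/* Each operation pays for one widen per operand and one narrow. */
static inline vec3_sketch *vec3_add(vec3_sketch *out, const vec3_sketch *a,
                                    const vec3_sketch *b)
{
    vec3_store(out, _mm_add_ps(vec3_load(a), vec3_load(b)));
    return out;
}
```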

Mat4 is good as it is regarding SSE operations. I've never used Mat3 with SSE, so I can't say, but seems ok.

So for SSE (with float): quat, plane, vec4, vec3, Mat4, Mat3 seem like the best targets.

@Kazade
Owner

Kazade commented Nov 11, 2013

Hmm, OK, obviously changing the API at this point isn't ideal. However, I
suggest we instead do something like this:

kmVec4 kmVec4AddSSE(...) // SSE postfix, pass by value

// Existing function thunks to the SSE version
kmVec4* kmVec4Add(kmVec4* pOut, const kmVec4* pV1, const kmVec4* pV2) {
    *pOut = kmVec4AddSSE(*pV1, *pV2);
    return pOut;
}

This way, we don't break the API, and users can choose to use the SSE
versions of functions directly if they wish. Make sense?
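A self-contained sketch of that thunking idea follows. The `kmVec4` struct here is a stand-in that assumes kmScalar is float, and the body of `kmVec4AddSSE` is an assumption about how it might be written; neither is taken from the library's source.

```c
#include <xmmintrin.h>

/* Stand-in for the library's struct, assuming kmScalar is float. */
typedef struct { float x, y, z, w; } kmVec4;

/* Hypothetical SSE entry point: takes and returns kmVec4 by value.
 * Unaligned loads/stores are used because this stand-in struct makes
 * no 16-byte alignment promise. */
static inline kmVec4 kmVec4AddSSE(kmVec4 v1, kmVec4 v2)
{
    kmVec4 out;
    __m128 sum = _mm_add_ps(_mm_loadu_ps((const float *)&v1),
                            _mm_loadu_ps((const float *)&v2));
    _mm_storeu_ps((float *)&out, sum);
    return out;
}

/* Existing pointer-based signature, unchanged for callers; it simply
 * forwards to the SSE version and returns pOut so calls still chain. */
kmVec4 *kmVec4Add(kmVec4 *pOut, const kmVec4 *pV1, const kmVec4 *pV2)
{
    *pOut = kmVec4AddSSE(*pV1, *pV2);
    return pOut;
}
```

Because the wrapper still returns `pOut`, existing chained calls such as `kmVec4Add(&r, kmVec4Add(&t, &a, &b), &c)` keep working.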

Everything else sounds awesome, I'm looking forward to reviewing your code
:)

Thanks,

Luke.


@chriscamacho
Contributor

I'm not up on SSE, so I don't see why you can't point to a struct of SSE
values. The API does allow chaining of functions, which is very powerful,
and the Java wrapper also relies on this API, so it would be a substantial change.

Will there be a compile-time switch so that non-x86 targets (ARM, for
example) can still use the lib?
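One way such a switch could work is sketched below. The macro `VEC4_SKETCH_USE_SSE` and the function `vec4_add` are hypothetical names; `__SSE__` is predefined by GCC and Clang when SSE is enabled, and `_M_X64` / `_M_IX86_FP` are the MSVC equivalents. On any other target the scalar branch is compiled instead.

```c
/* Detect SSE at compile time; fall back to plain C elsewhere (e.g. ARM). */
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#  include <xmmintrin.h>
#  define VEC4_SKETCH_USE_SSE 1
#else
#  define VEC4_SKETCH_USE_SSE 0
#endif

/* Same signature on every platform; only the body differs. */
float *vec4_add(float *out, const float *a, const float *b)
{
#if VEC4_SKETCH_USE_SSE
    _mm_storeu_ps(out, _mm_add_ps(_mm_loadu_ps(a), _mm_loadu_ps(b)));
#else
    for (int i = 0; i < 4; ++i) {
        out[i] = a[i] + b[i];
    }
#endif
    return out;
}
```

Keeping the signature identical in both branches means callers, and wrappers built on the API, do not need to know which path was compiled.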

Re vec3 / vec4: it's not uncommon, for alignment reasons, for vec3 to have an
unused value and just be a synonym for vec4...

@Kazade
Owner

Kazade commented Nov 14, 2013

Hey

Just want to let you know that I'm not ignoring you, just trying to find a
spare 5 minutes to look over the code. :)

Sorry!
On 13 Nov 2013 14:31, "Marcelo Politzer Couto" notifications@github.com
wrote:

Got some drafts, please have a look.

https://github.com/mpolitzer/kazmath/blob/master/kazmath/vec4_sse.h
https://github.com/mpolitzer/kazmath/blob/master/kazmath/vec4_sse.c
https://github.com/mpolitzer/kazmath/blob/master/kazmath/mat4_sse.h



@Kazade
Owner

Kazade commented Nov 17, 2013

Hi Marcelo,

So, I've had a quick look and it looks awesome. I just have one suggestion:
could you remove the typedef to kmVec4SSE? My instinct when I saw that in
the API was "Oh, now there's another kmVec4 struct...", which confused me
for a bit. If we just leave it as __m128, it's more obvious that it's an SSE
type, and not the same as kmVec4.

We'll have to ensure that our normal kmVec4 struct is aligned on 16-byte
boundaries, and then we can call through to SSE by dereferencing the
pointers, but at compile time we'll have to make sure that kmScalar ==
float.
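Both requirements can be checked at compile time in C11. The sketch below is an assumption about how that might look, not the library's actual definitions: the `kmScalar` typedef and the `kmVec4` layout are stand-ins, and the checks use `_Static_assert`, `_Generic`, and `_Alignas`.

```c
#include <xmmintrin.h>

typedef float kmScalar; /* assumption: the build has selected float */

/* Fails to compile if kmScalar is any type other than float. */
_Static_assert(_Generic((kmScalar)0, float: 1, default: 0),
               "SSE path requires kmScalar to be float");

/* Stand-in struct; aligning the first member aligns the whole struct. */
typedef struct {
    _Alignas(16) kmScalar x;
    kmScalar y, z, w;
} kmVec4;

_Static_assert(_Alignof(kmVec4) == 16, "kmVec4 must be 16-byte aligned");
_Static_assert(sizeof(kmVec4) == 16, "kmVec4 must be exactly four floats");

/* With alignment guaranteed, the aligned load/store intrinsics are safe. */
kmVec4 *kmVec4Add(kmVec4 *pOut, const kmVec4 *pV1, const kmVec4 *pV2)
{
    __m128 sum = _mm_add_ps(_mm_load_ps((const float *)pV1),
                            _mm_load_ps((const float *)pV2));
    _mm_store_ps((float *)pOut, sum);
    return pOut;
}
```

Note that the alignment guarantee only covers objects declared with this type; memory from a general-purpose allocator would need an aligned allocation to satisfy `_mm_load_ps`.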

When you are happy with your work, just send a pull request and I'll
happily connect up the current API with the SSE one.

Amazing work!

Luke.
