Motivation
Since we want this library to be portable, the first step would be to make 100% of it run correctly on the CPU alone (i.e. without requiring CUDA for any part of the functionality). This would serve two purposes:
Provide a baseline that contributors of ports can reference
Provide a fallback for partially implemented hardware platforms
Proposed solution
Implement all the CUDA kernels in "normal" C++
Make sure the unit tests all run on the CPU as well
Make sure unit test coverage is satisfactory
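To illustrate what a "normal" C++ port of a kernel could look like, here is a minimal sketch. The kernel `scale_add` and the functions `scale_add_cpu` and `scale_add_reference` are hypothetical names made up for this example and are not taken from this library. The point is only that the per-thread index computation of a CUDA kernel turns into a plain loop, which unit tests can then run without a GPU.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CUDA kernel being ported (shown only for comparison):
//   __global__ void scale_add(const float* x, float* y, float a, int n) {
//       int i = blockIdx.x * blockDim.x + threadIdx.x;
//       if (i < n) y[i] = a * x[i] + y[i];
//   }

// Plain C++ reference: the grid of threads becomes a single loop over i.
void scale_add_cpu(const float* x, float* y, float a, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}

// Convenience wrapper (hypothetical) that a unit test could call
// on a machine with no GPU present. Takes y by value and returns the result.
std::vector<float> scale_add_reference(const std::vector<float>& x,
                                       std::vector<float> y, float a) {
    assert(x.size() == y.size());
    scale_add_cpu(x.data(), y.data(), a, x.size());
    return y;
}
```

A test for a ported backend could then compare its output against `scale_add_reference` on the same inputs, which is the "baseline" role described above.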
Open questions
Which CPU architectures do we support? x86_64 and arm64 are givens, but are there any more?
How do we deal with SIMD intrinsics? Build separate libraries for each SIMD architecture? Or run-time selection based on CPU features?
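For the run-time selection option, a rough sketch of how it might look is below. It does not settle the question. It assumes GCC or Clang on x86_64, where the `__builtin_cpu_supports` builtin and the `target` function attribute are available, and it falls back to scalar code everywhere else (including arm64). All function names (`add`, `add_scalar`, `add_avx`, `select_add`) are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#include <immintrin.h>
#define HAVE_X86_DISPATCH 1
#else
#define HAVE_X86_DISPATCH 0
#endif

using AddFn = void (*)(const float*, const float*, float*, std::size_t);

// Portable scalar fallback, valid on every architecture.
static void add_scalar(const float* a, const float* b, float* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) out[i] = a[i] + b[i];
}

#if HAVE_X86_DISPATCH
// AVX variant. The target attribute lets this one function use AVX
// even when the rest of the file is compiled for baseline x86_64.
__attribute__((target("avx")))
static void add_avx(const float* a, const float* b, float* out, std::size_t n) {
    std::size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        __m256 va = _mm256_loadu_ps(a + i);
        __m256 vb = _mm256_loadu_ps(b + i);
        _mm256_storeu_ps(out + i, _mm256_add_ps(va, vb));
    }
    for (; i < n; ++i) out[i] = a[i] + b[i];  // remaining tail elements
}
#endif

// Pick an implementation based on what the running CPU reports.
static AddFn select_add() {
#if HAVE_X86_DISPATCH
    if (__builtin_cpu_supports("avx")) return add_avx;
#endif
    return add_scalar;
}

// Public entry point: the selection runs once, on the first call.
void add(const float* a, const float* b, float* out, std::size_t n) {
    static const AddFn impl = select_add();
    impl(a, b, out, n);
}
```

The trade-off against building separate libraries per SIMD level is that a single binary works on every CPU, at the cost of one indirect call per kernel invocation and some per-compiler attribute handling.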
@Titus-von-Koeller Feel free to edit this issue as you see fit, for example if you want a different structure for it.
@rickardp Where are we on this feature? Is some part of it already working, or is there another thread discussing it? There are not many comments here.
Hi @simepy, sorry, there is still not much to add here. I am still up for contributing towards this when 1) I have time to do so and 2) the dependencies that I do not have time to contribute myself are ready to use. More specifically, the idea is to take a gradual approach and use the reference implementation wherever MPS acceleration is not yet implemented. Currently, large parts of this codebase require CUDA, which does not run on Apple silicon, making a partial implementation virtually unusable.