Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possible extension for larger datasets - is it valid? #3

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

twistedmage
Copy link

Hi Serban, I emailed you yesterday about my experiences using your code for larger datasets. And the failure of the reduction when the number of threads exceeds the limit of 1024.

This is my attempt at avoiding this problem. Could you comment on it's validity?

As part of the solution, I used atomic expressions, which unfortunately require compute capability 1.1. There may be other ways to solve it.

@canselcik
Copy link

Successfully used your patch to process a large dataset containing more than 22 million entries.

The original code (the CUDA version) was throwing the error cuda invalid configuration error 9 and exiting.

@vmarkovtsev
Copy link

This patch is magic, it makes Lloyd converge in a single iteration. I guess something is wrong.

@calebsmith
Copy link

calebsmith commented May 6, 2019

I was also hitting the cuda error 9: Invalid configuration error with larger datasets.

After trying a small hardcoded dataset of 20 elements, I loaded in RGB float data for a jpeg that was
roughly 900x700 and it far exceeded the bounds of the original implementation (without this patch). Using this patch and the output seems correct

@calebsmith
Copy link

After working with a few more example inputs with this, I realized that it does in fact exit after 1 iteration as @vmarkovtsev mentioned in the above. I initially didn't fully grasp his meaning, but it appears only 1 iteration is performed, no matter the threshold. I've reproduced this on several different inputs.

This creates quite poor results in some contexts. Something in this patch is possibly leading to an incorrect delta result in the end of the loop? I haven't dug very far yet.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants