PART 2: NAIVE PREFIX SUM

As we can see from the graph, GPU naive scan is faster than CPU version when the length of the array is smaller than 1.2 million. And the time consumption for GPU naive version become worse when the array size increases.The reason for naive version become inefficient when array size is too big is because there are too many global memory access.

PART 3 : OPTIMIZING PREFIX SUM

As can see from the graph, GPU shared memory (green) is always faster than both GPU naive version (red) and CPU version (blue). In the experiment, I used 1024 block size. If block size is smaller, the GPU shared memory version can run much faster. The main reason for shared memory become not that efficient after 1 million is because the bank conflict starts to happen more frequently. So, some threads have to wait until the shared memory is ready to use.

PART 4 : ADDING SCATTER

According to the graph, the thrust version is the most inefficient version, and the GPU version is the most efficient version. The thrust version is that slow may due to the way I calculated clock time is in term of CPU instead of GPU. If thrust can handle the bank conflits well, then it should run faster than my GPU version.

Name	Name	Last commit message	Last commit date
Latest commit History 5 Commits
StreamCompaction	StreamCompaction	Finished project	Sep 30, 2014
.gitignore	.gitignore	Initial commit	Sep 17, 2014
README.md	README.md	Finished project	Sep 30, 2014
prefix_sum.jpg	prefix_sum.jpg	Finished project	Sep 30, 2014
stream_compact.jpg	stream_compact.jpg	Finished project	Sep 30, 2014

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

PART 2: NAIVE PREFIX SUM

PART 3 : OPTIMIZING PREFIX SUM

PART 4 : ADDING SCATTER

About

Releases

Packages

Languages

zxm5010/Project2-StreamCompaction

Folders and files

Latest commit

History

Repository files navigation

PART 2: NAIVE PREFIX SUM

PART 3 : OPTIMIZING PREFIX SUM

PART 4 : ADDING SCATTER

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages