Task03 Евгений Свирин ITMO #122

Open
wants to merge 1 commit into
base: task03

@EvgenySvirin EvgenySvirin commented Sep 29, 2024

Local output

aplusb:

OpenCL devices:
Device #0: GPU. NVIDIA GeForce RTX 4060 Laptop GPU. Total memory: 7940 Mb
Using device #0: GPU. NVIDIA GeForce RTX 4060 Laptop GPU. Total memory: 7940 Mb
Data generated for n=50000000!
Just example of printf usage: WARP_SIZE=32

mandelbrot:

OpenCL devices:
Device #0: GPU. NVIDIA GeForce RTX 4060 Laptop GPU. Total memory: 7940 Mb
Using device #0: GPU. NVIDIA GeForce RTX 4060 Laptop GPU. Total memory: 7940 Mb
CPU: 0.286341+-0.00418889 s
CPU: 34.9234 GFlops
Real iterations fraction: 56.2638%
GPU: 0.002554+-1.1547e-06 s
GPU: 4204.16 GFlops
Real iterations fraction: 56.2657%
GPU vs CPU average results difference: 0.942446%

sum:

CPU: 0.117916+-0.0015682 s
CPU: 848.058 millions/s
CPU OMP: 0.0202012+-0.00288981 s
CPU OMP: 4950.21 millions/s
OpenCL devices:
Device #0: GPU. NVIDIA GeForce RTX 4060 Laptop GPU. Total memory: 7940 Mb
Using device #0: GPU. NVIDIA GeForce RTX 4060 Laptop GPU. Total memory: 7940 Mb
GPU atomicSum1: 0.00332333+-1.10554e-06 s
GPU atomicSum1: 30090.3 millions/s
GPU loopSum2: 0.00980433+-0.000403407 s
GPU loopSum2: 10199.6 millions/s
GPU loopCoalescedSum3: 0.00934033+-9.42809e-07 s
GPU loopCoalescedSum3: 10706.3 millions/s
GPU localMemSum4: 0.00288017+-1.06719e-06 s
GPU localMemSum4: 34720.2 millions/s
GPU treeSum5: 0.00490683+-3.72678e-07 s
GPU treeSum5: 20379.7 millions/s

GitHub CI output

aplusb:

Run ./build/aplusb
OpenCL devices:
Device #0: CPU. AMD EPYC 7763 64-Core Processor . Intel(R) Corporation. Total memory: 15991 Mb
Using device #0: CPU. AMD EPYC 7763 64-Core Processor . Intel(R) Corporation. Total memory: 15991 Mb
Data generated for n=50000000!
Just example of printf usage: WARP_SIZE=1

mandelbrot:

Run ./mandelbrot
OpenCL devices:
Device #0: CPU. AMD EPYC 7763 64-Core Processor . Intel(R) Corporation. Total memory: 15991 Mb
Using device #0: CPU. AMD EPYC 7763 64-Core Processor . Intel(R) Corporation. Total memory: 15991 Mb
CPU: 0.603726+-0.000772361 s
CPU: 16.5638 GFlops
Real iterations fraction: 56.2638%
GPU: 0.162314+-0.00028869 s
GPU: 66.1522 GFlops
Real iterations fraction: 56.2663%
GPU vs CPU average results difference: 0.982458%

sum:

Run ./sum
CPU: 0.0322993+-0.000169735 s
CPU: 3096.04 millions/s
CPU OMP: 0.0179648+-0.000388956 s
CPU OMP: 5566.43 millions/s
OpenCL devices:
Device #0: CPU. AMD EPYC 7763 64-Core Processor . Intel(R) Corporation. Total memory: 15991 Mb
Using device #0: CPU. AMD EPYC 7763 64-Core Processor . Intel(R) Corporation. Total memory: 15991 Mb
GPU atomicSum1: 1.41342+-0.00113643 s
GPU atomicSum1: 70.7503 millions/s
GPU loopSum2: 1.69654+-0.00157669 s
GPU loopSum2: 58.9437 millions/s
GPU loopCoalescedSum3: 1.41982+-0.00104539 s
GPU loopCoalescedSum3: 70.4316 millions/s
GPU localMemSum4: 0.0284178+-3.72622e-05 s
GPU localMemSum4: 3518.92 millions/s
GPU treeSum5: 0.116099+-0.00186911 s
GPU treeSum5: 861.335 millions/s

arr.writeN(&as[0], n);

runBenchmark(arr, gpuSum, "atomicSum1", benchmarkingIters, reference_sum, n);
runBenchmark(arr, gpuSum, "loopSum2", benchmarkingIters, reference_sum, n);
Collaborator

In this version and the next one, a single thread does more work. Does that affect the work-space configuration in any way? (And does the configuration, in turn, affect performance?)
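To make the question concrete, here is a minimal C++ sketch of how the global work size could shrink once each work-item sums several values. It is not taken from the PR: `roundUp`, `WorkSize`, `workSizePerElement`, `workSizePerChunk` and the parameter `valuesPerWorkItem` are hypothetical names, and the launch scheme (one work-item per element for `atomicSum1`, one work-item per chunk for `loopSum2`) is an assumption about how such kernels are typically launched.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not from the PR): rounds x up to the nearest multiple of y.
constexpr std::size_t roundUp(std::size_t x, std::size_t y) {
    return (x + y - 1) / y * y;
}

// Hypothetical container for a 1D launch configuration.
struct WorkSize {
    std::size_t workGroupSize;
    std::size_t globalWorkSize;
};

// Assumed atomicSum1-style launch: one work-item per input element.
inline WorkSize workSizePerElement(std::size_t n, std::size_t workGroupSize) {
    assert(workGroupSize > 0);
    return {workGroupSize, roundUp(n, workGroupSize)};
}

// Assumed loopSum2-style launch: each work-item sums valuesPerWorkItem elements,
// so the number of work-items drops by roughly that factor.
inline WorkSize workSizePerChunk(std::size_t n, std::size_t workGroupSize,
                                 std::size_t valuesPerWorkItem) {
    assert(workGroupSize > 0 && valuesPerWorkItem > 0);
    // Ceiling division: the last work-item may get a partial chunk.
    const std::size_t workItems = (n + valuesPerWorkItem - 1) / valuesPerWorkItem;
    return {workGroupSize, roundUp(workItems, workGroupSize)};
}
```

Under these assumptions, a kernel that handles 64 values per work-item would be launched with about 64 times fewer work-items than the per-element version. If the global work size is left at the per-element value, most work-items would have nothing to do, which could plausibly distort the timings.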
