spin-wait when the hsa queue is full #69

Closed
wants to merge 1 commit into from

scchan
Collaborator

@scchan scchan commented Jun 2, 2016

No description provided.

@whchung
Collaborator

whchung commented Jun 3, 2016

@scchan As the potential impact of this change is quite large, could you provide the following test results?

  • "HCC_RUNTIME=HSA make test"
  • HIP unit tests

@scchan
Collaborator Author

scchan commented Jun 3, 2016

Here they are. The failures in the hcc tests and HIP Rodinia match those on the develop branch without this patch.

Testing Time: 106.98s


Failing Tests (17):
CPPAMP :: Unit/AmpMath/amp_math_erf_precise_math.cpp
CPPAMP :: Unit/AmpMath/amp_math_erfc_precise_math.cpp
CPPAMP :: Unit/AmpMath/amp_math_hypot_precise_math.cpp
CPPAMP :: Unit/DynamicTileStatic/test11.cpp
CPPAMP :: Unit/HC/memcpy_symbol1.cpp
CPPAMP :: Unit/HC/memcpy_symbol2.cpp
CPPAMP :: Unit/HC/memcpy_symbol3.cpp
CPPAMP :: Unit/HC/memcpy_symbol4.cpp
CPPAMP :: Unit/HSAIL/activelaneid.cpp
CPPAMP :: Unit/HSAIL/activelanepermute.cpp
CPPAMP :: Unit/HSAIL/shfl.cpp
CPPAMP :: Unit/ParallelSTL/sort_carray.cpp
CPPAMP :: Unit/ParallelSTL/sort_stdarray.cpp
CPPAMP :: Unit/ParallelSTL/sort_stdvector.cpp
CPPAMP :: Unit/ParallelSTL/stablesort_carray.cpp
CPPAMP :: Unit/ParallelSTL/stablesort_stdarray.cpp
CPPAMP :: Unit/ParallelSTL/stablesort_stdvector.cpp

Expected Passes : 657
Expected Failures : 25
Unsupported Tests : 10
Unexpected Failures: 17

HIP unit tests:

Running tests...
Test project /home/scchan/code/gerrit/hip/tests/build
Start 1: hip_ballot.tst
1/31 Test #1: hip_ballot.tst ......................... Passed 0.10 sec
Start 2: hip_anyall.tst
2/31 Test #2: hip_anyall.tst ......................... Passed 0.08 sec
Start 3: hip_popc.tst
3/31 Test #3: hip_popc.tst ........................... Passed 0.07 sec
Start 4: hip_brev.tst
4/31 Test #4: hip_brev.tst ........................... Passed 0.07 sec
Start 5: hip_clz.tst
5/31 Test #5: hip_clz.tst ............................ Passed 0.07 sec
Start 6: hip_ffs.tst
6/31 Test #6: hip_ffs.tst ............................ Passed 0.07 sec
Start 7: hipEventRecord--iterations10.tst
7/31 Test #7: hipEventRecord--iterations10.tst ....... Passed 0.15 sec
Start 8: hipMemset.tst
8/31 Test #8: hipMemset.tst .......................... Passed 0.08 sec
Start 9: hipMemset--N10--memsetval0x42.tst
9/31 Test #9: hipMemset--N10--memsetval0x42.tst ...... Passed 0.07 sec
Start 10: hipMemset--N10013--memsetval0x5a.tst
10/31 Test #10: hipMemset--N10013--memsetval0x5a.tst ... Passed 0.07 sec
Start 11: hipMemset--N256M--memsetval0xa6.tst
11/31 Test #11: hipMemset--N256M--memsetval0xa6.tst .... Passed 0.83 sec
Start 12: hipGridLaunch.tst
12/31 Test #12: hipGridLaunch.tst ...................... Passed 0.12 sec
Start 13: hipEnvVarDriver.tst
13/31 Test #13: hipEnvVarDriver.tst .................... Passed 0.71 sec
Start 14: hipMultiThreadStreams2.tst
14/31 Test #14: hipMultiThreadStreams2.tst ............. Passed 0.12 sec
Start 15: hipMemcpy_simple.tst
15/31 Test #15: hipMemcpy_simple.tst ................... Passed 0.37 sec
Start 16: hipMemcpy-modes
16/31 Test #16: hipMemcpy-modes ........................ Passed 4.87 sec
Start 17: hipMemcpy-size
17/31 Test #17: hipMemcpy-size ......................... Passed 21.46 sec
Start 18: hipMemcpy-multithreaded
18/31 Test #18: hipMemcpy-multithreaded ................ Passed 0.68 sec
Start 19: hipHostAlloc.tst
19/31 Test #19: hipHostAlloc.tst ....................... Passed 0.09 sec
Start 20: hipHcc.tst
20/31 Test #20: hipHcc.tst ............................. Passed 0.07 sec
Start 21: hipStreamL5.tst
21/31 Test #21: hipStreamL5.tst ........................ Passed 1.34 sec
Start 22: hipRandomMemcpyAsync.tst
22/31 Test #22: hipRandomMemcpyAsync.tst ............... Passed 0.07 sec
Start 23: hipMemoryAllocate.tst
23/31 Test #23: hipMemoryAllocate.tst .................. Passed 0.33 sec
Start 24: hipFuncSetDeviceFlags.tst
24/31 Test #24: hipFuncSetDeviceFlags.tst .............. Passed 0.06 sec
Start 25: hipFuncGetDevice.tst
25/31 Test #25: hipFuncGetDevice.tst ................... Passed 0.06 sec
Start 26: hipFuncSetDevice.tst
26/31 Test #26: hipFuncSetDevice.tst ................... Passed 0.06 sec
Start 27: hipFuncDeviceSynchronize.tst
27/31 Test #27: hipFuncDeviceSynchronize.tst ........... Passed 0.07 sec
Start 28: hipMultiThreadDevice-serial
28/31 Test #28: hipMultiThreadDevice-serial ............ Passed 0.48 sec
Start 29: hipMultiThreadDevice-pyramid
29/31 Test #29: hipMultiThreadDevice-pyramid ........... Passed 5.07 sec
Start 30: hipMultiThreadDevice-nearzero
30/31 Test #30: hipMultiThreadDevice-nearzero .......... Passed 3.97 sec
Start 31: specialFunc.cu.tst
31/31 Test #31: specialFunc.cu.tst ..................... Passed 0.02 sec

100% tests passed, 0 tests failed out of 31

Total Test time (real) = 41.69 sec

HIP Rodinia:

==== Rodinia ====
--CLEAN: b+tree
--CLEAN: backprop
--CLEAN: bfs
--CLEAN: cfd
--CLEAN: dwt2d
--CLEAN: gaussian
--CLEAN: heartwall
--CLEAN: hotspot
--CLEAN: hybridsort
--CLEAN: kmeans
--CLEAN: lavaMD
--CLEAN: lud
--CLEAN: myocyte
--CLEAN: nn
--CLEAN: nw
--CLEAN: pathfinder
--CLEAN: srad
--CLEAN: streamcluster
--TESTING: b+tree
executing: ../../test/b+tree/run0.cmd... PASSED!
--TESTING: backprop
executing: ../../test/backprop/run0.cmd... PASSED!
--TESTING: bfs
executing: ../../test/bfs/run0.cmd... PASSED!
executing: ../../test/bfs/run1.cmd... PASSED!
--TESTING: cfd
executing: ../../test/cfd/run0.cmd... PASSED!
executing: ../../test/cfd/run1.cmd... PASSED!
--TESTING: dwt2d
executing: ../../test/dwt2d/run0.cmd... PASSED!
executing: ../../test/dwt2d/run1.cmd... PASSED!
--TESTING: gaussian
executing: ../../test/gaussian/run0.cmd... PASSED!
executing: ../../test/gaussian/run1.cmd... PASSED!
executing: ../../test/gaussian/run2.cmd... PASSED!
executing: ../../test/gaussian/run3.cmd... PASSED!
executing: ../../test/gaussian/run4.cmd... PASSED!
--TESTING: heartwall
executing: ../../test/heartwall/run0.cmd... PASSED!
--TESTING: hotspot
executing: ../../test/hotspot/run0.cmd... PASSED!
--TESTING: hybridsort
executing: ../../test/hybridsort/run0.cmd... PASSED!
--TESTING: kmeans
executing: ../../test/kmeans/run0.cmd... PASSED!
executing: ../../test/kmeans/run1.cmd... PASSED!
executing: ../../test/kmeans/run2.cmd... PASSED!
executing: ../../test/kmeans/run3.cmd... PASSED!
--TESTING: lavaMD
executing: ../../test/lavaMD/run0.cmd... PASSED!
executing: ../../test/lavaMD/run1.cmd... PASSED!
executing: ../../test/lavaMD/run2.cmd... PASSED!
executing: ../../test/lavaMD/run3.cmd... PASSED!
executing: ../../test/lavaMD/run4.cmd... PASSED!
--TESTING: lud
executing: ../../test/lud/run0.cmd... PASSED!
--TESTING: myocyte
BUILD FAILURE!!
--TESTING: nn
executing: ../../test/nn/run0.cmd... PASSED!
--TESTING: nw
executing: ../../test/nw/run0.cmd... PASSED!
--TESTING: pathfinder
executing: ../../test/pathfinder/run0.cmd... PASSED!
--TESTING: srad
executing: ../../test/srad/run0.cmd... PASSED!
--TESTING: streamcluster
executing: ../../test/streamcluster/run0.cmd... PASSED!


uint64_t nextIndex = index + 1;
// spin-wait if the command queue is full
while(nextIndex - hsa_queue_load_read_index_acquire(commandQueue) >= commandQueue->size) ;
Collaborator

I'm a bit uneasy here because this could create an infinite loop in user mode while giving user applications little information about what is happening.

Is it possible to fabricate a scenario where the queue is nearly full, or full, and measure how long the loop may take?

Also, I'm wondering if there's a way (e.g. signal / pipe / named pipe) to interrupt the spin-wait and gracefully terminate the application?
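
One way to address the infinite-loop concern would be to bound the wait. The following is a minimal sketch under stated assumptions, not the code from this PR: `MockQueue` and `waitForSlot` are hypothetical names, and the mock replaces `hsa_queue_t` and `hsa_queue_load_read_index_acquire` with a `std::atomic` so the example runs without the HSA runtime. The queue-full condition is copied from the patch as-is.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <thread>

// Hypothetical stand-in for the hsa_queue_t state used by the check.
struct MockQueue {
    uint32_t size;                       // number of packet slots
    std::atomic<uint64_t> readIndex{0};  // advanced by the packet consumer
    explicit MockQueue(uint32_t slots) : size(slots) {}
};

// Waits until the patch's "queue full" condition clears for the packet at
// `index`, or until `timeout` has elapsed. Returns true if the condition
// cleared and false on timeout, so the caller can report an error instead
// of spinning forever.
inline bool waitForSlot(const MockQueue& queue, uint64_t index,
                        std::chrono::milliseconds timeout) {
    assert(queue.size > 0);
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    const uint64_t nextIndex = index + 1;
    while (nextIndex - queue.readIndex.load(std::memory_order_acquire) >= queue.size) {
        if (std::chrono::steady_clock::now() >= deadline) {
            return false;  // still full after the timeout
        }
        std::this_thread::yield();  // let other threads run between polls
    }
    return true;
}
```

A caller could log a diagnostic or raise an error when `waitForSlot` returns false. Timing the call against a deliberately filled queue would be one way to measure how long the wait takes in practice.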

Contributor

Error on queue full is fine and an improvement over the current state of silently corrupting the queue.

On Jun 3, 2016, at 7:32 PM, Jack Chung <notifications@github.com> wrote:

In lib/hsa/mcwamp_hsa.cpp (https://github.com//pull/69#discussion_r65793182):

@@ -2403,8 +2403,13 @@ HSADispatch::dispatchKernel(hsa_queue_t* commandQueue) {
 uint32_t queueMask = commandQueue->size - 1;
 // TODO: Need to check if package write is correct.
 uint64_t index = hsa_queue_load_write_index_relaxed(commandQueue);
+
+uint64_t nextIndex = index + 1;
+// spin-wait if the command queue is full
+while(nextIndex - hsa_queue_load_read_index_acquire(commandQueue) >= commandQueue->size) ;


@scchan
Copy link
Collaborator Author

scchan commented Jun 4, 2016

Will submit another PR that issues an error instead of spin-waiting.
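
For illustration, the error-on-full alternative could look roughly like the sketch below. It is not the follow-up PR: `MockCommandQueue` and `reserveSlotOrThrow` are hypothetical names, the mock uses `std::atomic` counters in place of the HSA queue API, a single producer is assumed, and throwing `std::runtime_error` is only one possible way to report the error. The queue-full condition and the `queueMask` computation are copied from the patch.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Hypothetical stand-in for the hsa_queue_t state used by the dispatch path.
struct MockCommandQueue {
    uint32_t size;                        // number of packet slots, a power of two
    std::atomic<uint64_t> readIndex{0};   // advanced by the packet consumer
    std::atomic<uint64_t> writeIndex{0};  // advanced by the (single) producer
    explicit MockCommandQueue(uint32_t slots) : size(slots) {}
};

// Reserves the next packet slot and returns its position in the ring buffer.
// Throws std::runtime_error instead of waiting when the patch's "queue full"
// condition holds.
inline uint32_t reserveSlotOrThrow(MockCommandQueue& queue) {
    assert(queue.size != 0 && (queue.size & (queue.size - 1)) == 0);
    const uint32_t queueMask = queue.size - 1;
    const uint64_t index = queue.writeIndex.load(std::memory_order_relaxed);
    const uint64_t nextIndex = index + 1;
    if (nextIndex - queue.readIndex.load(std::memory_order_acquire) >= queue.size) {
        throw std::runtime_error("HSA command queue is full");
    }
    queue.writeIndex.store(nextIndex, std::memory_order_relaxed);
    return static_cast<uint32_t>(index & queueMask);
}
```

With the condition as written in the patch, this sketch accepts three outstanding packets in a 4-slot queue before raising the error; the fourth reservation succeeds once the read index advances.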

@scchan scchan closed this Jun 4, 2016
@scchan scchan deleted the q_overflow_check branch December 28, 2016 20:53