This repository has been archived by the owner on Feb 6, 2023. It is now read-only.

Add CUDA 11.3 and fix broken cuTENSOR variations #658

Merged
asi1024 merged 7 commits into master from cuda113 on Jun 21, 2021

kmaehashi
Member

No description provided.

@kmaehashi
Member Author

pfnCI, test this please.

kmaehashi mentioned this pull request on May 20, 2021
kmaehashi changed the title from "Add CUDA 11.3" to "Add CUDA 11.3 and fix broken cuTENSOR variations" on May 20, 2021
@kmaehashi
Member Author

pfnCI, test this please.

docker.py (outdated review thread, resolved)
@kmaehashi
Member Author

pfnCI, test this please.

kmaehashi mentioned this pull request on May 20, 2021 (21 tasks)
@chainer-ci
Member

Jenkins CI test (for commit 77a4aee, target branch master) failed with status FAILURE.

@kmaehashi
Member Author

pfnCI, test this please.

@kmaehashi
Member Author

pfnCI, test this please.

@kmaehashi
Member Author

pfnCI, test this please.

@chainer-ci
Member

Jenkins CI test (for commit 29b2984, target branch master) failed with status FAILURE.

@kmaehashi
Member Author

kmaehashi commented May 21, 2021

Blocked by a CUDA driver update on the Jenkins slave. Update: done.

@kmaehashi
Member Author

pfnCI, test this please.

@chainer-ci
Member

Jenkins CI test (for commit 29b2984, target branch master) failed with status FAILURE.

@kmaehashi
Member Author

pfnCI, test this please.

@chainer-ci
Member

Jenkins CI test (for commit 29b2984, target branch master) failed with status FAILURE.

@kmaehashi
Member Author

cupy/cupy#5264 got merged!
pfnCI, test this please.

@chainer-ci
Member

Jenkins CI test (for commit 29b2984, target branch master) failed with status FAILURE.

@kmaehashi
Member Author

pfnCI, test this please.

@chainer-ci
Member

Jenkins CI test (for commit 29b2984, target branch master) failed with status FAILURE.

@kmaehashi
Member Author

Hmm there are still some async mempool failures...

00:13:32 FAILED cupy_tests/cuda_tests/test_memory.py::TestAllocator_param_1_{mempool='MemoryAsyncPool'}::test_allocator_thread_local
00:13:32 FAILED cupy_tests/cuda_tests/test_memory.py::TestAllocator_param_1_{mempool='MemoryAsyncPool'}::test_set_allocator
00:13:32 FAILED cupy_tests/cuda_tests/test_memory.py::TestMemoryAsyncPool::test_free_bytes
00:13:32 FAILED cupy_tests/cuda_tests/test_memory.py::TestMemoryAsyncPool::test_total_bytes
00:13:32 FAILED cupy_tests/cuda_tests/test_memory.py::TestMemoryAsyncPool::test_total_bytes2
00:13:32 FAILED cupy_tests/cuda_tests/test_memory.py::TestMemoryAsyncPool::test_total_bytes_stream
00:13:32 FAILED cupy_tests/cuda_tests/test_memory.py::TestMemoryAsyncPool::test_used_bytes
00:13:32 FAILED cupy_tests/cuda_tests/test_memory.py::TestMemoryAsyncPool::test_used_bytes2
00:13:32 FAILED cupy_tests/cuda_tests/test_memory.py::TestMemoryAsyncPool::test_used_bytes_stream

@leofang

leofang commented Jun 2, 2021

I could try to take a look later tonight 😅

@leofang

leofang commented Jun 2, 2021

Hmmm, I ran pytest tests/cupy_tests/ on bare metal but couldn't reproduce the errors; all tests in test_memory.py passed. @kmaehashi, is the CI hardware shared by more than one runner?

@kmaehashi
Member Author

@leofang Ah yes, several tests may run concurrently on the same device in Jenkins.

@kmaehashi
Member Author

pfnCI, test this please.

@leofang

leofang commented Jun 3, 2021

Let's see if cupy/cupy#5308 fixes it...🤞

@chainer-ci
Member

Jenkins CI test (for commit 29b2984, target branch master) failed with status FAILURE.

@leofang

leofang commented Jun 3, 2021

> Let's see if cupy/cupy#5308 fixes it...🤞

Well, only part of it...

11:51:03 =========================== short test summary info ============================
11:51:03 FAILED cupy_tests/cuda_tests/test_memory.py::TestMemoryAsyncPool::test_free_bytes
11:51:03 FAILED cupy_tests/cuda_tests/test_memory.py::TestMemoryAsyncPool::test_total_bytes
11:51:03 FAILED cupy_tests/cuda_tests/test_memory.py::TestMemoryAsyncPool::test_total_bytes2
11:51:03 FAILED cupy_tests/cuda_tests/test_memory.py::TestMemoryAsyncPool::test_total_bytes_stream
11:51:03 FAILED cupy_tests/cuda_tests/test_memory.py::TestMemoryAsyncPool::test_used_bytes
11:51:03 FAILED cupy_tests/cuda_tests/test_memory.py::TestMemoryAsyncPool::test_used_bytes2
11:51:03 FAILED cupy_tests/cuda_tests/test_memory.py::TestMemoryAsyncPool::test_used_bytes_stream
11:51:03 = 7 failed, 80662 passed, 15834 skipped, 641 deselected, 74 xfailed, 1464 warnings in 5759.79s (1:35:59) =

Let me check later...
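
Editorial sketch: the two failure lists in this thread can be compared mechanically to see which tests stopped failing between runs. The snippet below is a minimal illustration, assuming log lines have the "HH:MM:SS FAILED <test id>" shape shown above. The helper name failed_tests and the shortened sample logs are hypothetical and are not part of the CI tooling.

```python
import re

# Matches pytest short-summary lines such as
# "11:51:03 FAILED path/to/test.py::TestClass::test_name".
# The leading Jenkins timestamp is treated as optional.
FAILED_RE = re.compile(r"^(?:\d{2}:\d{2}:\d{2}\s+)?FAILED\s+(\S+)")


def failed_tests(log_text):
    """Return the set of test IDs reported as FAILED in a log excerpt."""
    ids = set()
    for line in log_text.splitlines():
        match = FAILED_RE.match(line.strip())
        if match:
            ids.add(match.group(1))
    return ids


# Shortened excerpts of the two summaries quoted in this thread.
before = """\
00:13:32 FAILED cupy_tests/cuda_tests/test_memory.py::TestAllocator_param_1_{mempool='MemoryAsyncPool'}::test_set_allocator
00:13:32 FAILED cupy_tests/cuda_tests/test_memory.py::TestMemoryAsyncPool::test_free_bytes
"""
after = """\
11:51:03 =========================== short test summary info ============================
11:51:03 FAILED cupy_tests/cuda_tests/test_memory.py::TestMemoryAsyncPool::test_free_bytes
"""

# Tests that failed in the earlier run but not in the later one.
fixed = failed_tests(before) - failed_tests(after)
for test_id in sorted(fixed):
    print("no longer failing:", test_id)
```

Applied to the full lists above, the same set difference would leave only the two TestAllocator entries, which matches "only part of it" being fixed.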

@leofang

leofang commented Jun 7, 2021

Should we mark these failing tests as xfail for the time being? I do not immediately see what else could go wrong, and I'm wondering if it's due to an architecture difference.

@kmaehashi
Member Author

> Should we mark these failing tests as xfail for the time being? I do not immediately see what else could go wrong, and I'm wondering if it's due to an architecture difference.

Agree, sent a PR. cupy/cupy#5350
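
Editorial sketch: for readers unfamiliar with the idea, marking a test as an expected failure keeps it running while stopping it from turning the whole run red. The actual change in cupy/cupy#5350 is not shown in this thread. The example below uses the standard library's unittest.expectedFailure as a stand-in for pytest's xfail marker, and FakeAsyncPool is a hypothetical toy class, not CuPy's MemoryAsyncPool. The "other user of the device" scenario is only an illustration, not the diagnosed cause.

```python
import unittest


class FakeAsyncPool:
    """Hypothetical pool whose byte counter may include other users' memory."""

    def __init__(self, used_by_others=0):
        self._used = used_by_others

    def malloc(self, nbytes):
        self._used += nbytes

    def used_bytes(self):
        return self._used


class TestFakeAsyncPool(unittest.TestCase):

    @unittest.expectedFailure  # stdlib analogue of an xfail marker
    def test_used_bytes(self):
        # Simulate a counter that already includes 512 bytes from elsewhere,
        # so the exact-value assertion below cannot hold.
        pool = FakeAsyncPool(used_by_others=512)
        pool.malloc(1024)
        self.assertEqual(pool.used_bytes(), 1024)


def run_suite():
    """Run the test case and return the unittest.TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestFakeAsyncPool)
    result = unittest.TestResult()
    suite.run(result)
    return result


result = run_suite()
print("expected failures:", len(result.expectedFailures))
print("run counted as successful:", result.wasSuccessful())
```

The trade-off is that an expected failure hides a real problem until someone revisits it, which is why the suggestion above is framed as "for the time being".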

@kmaehashi
Member Author

pfnCI, test this please.

@chainer-ci
Member

Jenkins CI test (for commit 29b2984, target branch master) failed with status FAILURE.

@kmaehashi
Member Author

pfnCI, test this please.

@chainer-ci
Member

Jenkins CI test (for commit 29b2984, target branch master) failed with status FAILURE.

@kmaehashi
Member Author

Jenkins test this please.

@chainer-ci
Member

Jenkins CI test (for commit 29b2984, target branch master) failed with status FAILURE.

@kmaehashi
Member Author

The test failure is unrelated to this PR.

@asi1024
Member

asi1024 commented Jun 21, 2021

LGTM.

asi1024 merged commit 459f051 into master on Jun 21, 2021
asi1024 deleted the cuda113 branch on June 21, 2021 05:06

4 participants