-
Notifications
You must be signed in to change notification settings - Fork 6.8k
Conversation
Jenkinsfile
Outdated
} | ||
} | ||
}, | ||
'Python3: MKLDNN-GPU-NOCUDNN': { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it needed to test python2 and python3 or is one enough?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one is enough , i will remove python2.
Jenkinsfile
Outdated
ws('workspace/build-mkldnn-gpu-nocudnn') { | ||
timeout(time: max_time, unit: 'MINUTES') { | ||
init_git() | ||
docker_run('ubuntu_build_cuda', 'build_ubuntu_gpu_mkldnn_cuda', false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
supposed to be build_ubuntu_gpu_mkldnn_nocudnn?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch! will fix.
ci/docker/runtime_functions.sh
Outdated
|
||
make \ | ||
DEV=1 \ | ||
USE_CPP_PACKAGE=1 \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need for the cpp package here since we don't run the cpp tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure about this . shouldnt we test if it builds fine with CUDNN with CPP_PACKAGE just in case cpp package depends on the flag to use cudnn specific functionality (unlikely) ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The CPP package just generates wrappers around the C-API and should not rely on backend-specific things like CUDA or CUDNN. Thus, I'd say that we should keep it separate and only build the CPP package if we're actually going to run it (which we do in another test)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay, removed!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lgtm besides the cpp package
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CI failure seems to be caused by the assumption that GPU CI runs all have cudnn enabled. Could you propose a fix for that?
@szha Yep, planning to fix that. |
include/mxnet/c_api.h
Outdated
* \param curr returns the current status. | ||
* \return 0 when success, -1 when failure happens | ||
*/ | ||
MXNET_DLL int MXCUDNNIsEnabled(bool* curr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't add CAPI for this.
Add a environment variable if you need it for testing
We have to take care of conflicting changes when merging this PR and #11630. |
I have resolved the conflicts and addressed the comments on #11630 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lgtm, thanks a lot!
This should fix #11339 |
* Fix build issue with CUDNN=0 * Fix nocudnn func name * Remove python2 tests * Remove CPP package test * Check assert raises when cudnn disabled for op tests on gpu * Add line * Remove whitespace * add decorator for other ops * Add and remove assert * Fix op and common * Fix merge issue * Remove C API * Fix * Fix lint * Add init git * Rename CUDNN_DISABLED env variable * Add a runtime function for nocudnn * Remove MXCudnnIsenabled * Add comment for disabled test * Add full link in comment
Description
This was found during the flaky test investigation #10973 . I found that the windows code path is different from linux, so it was testing the MXNet conv and deconv op which was not being tested for linux. Building with USE_CUDNN=0 and USE_CUDA=1 was failing and I have added the fix here. I have also added the build and test as a stage to jenkins. Not sure if there was a reason to not check this in CI (configuration not supported by some ops etc.)
@marcoabreu
Checklist
Essentials
Please feel free to remove inapplicable items for your PR.
Changes
Comments