This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[PIP] add build variant for cuda 11.2 #19764

Merged
merged 2 commits into from
Feb 5, 2021

Conversation

szha (Member) commented Jan 18, 2021

Description

add build variant for cuda 11.2

Checklist

Essentials

  • PR's title starts with a category (e.g. [BUGFIX], [MODEL], [TUTORIAL], [FEATURE], [DOC], etc)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage
  • Code is well-documented

Changes

  • add build variant for cuda 11.2

Comments

  • NVIDIA hasn't finished creating docker for cuda 11.2 with new cudnn yet

Signed-off-by: Sheng Zha <zhasheng@amazon.com>
szha requested a review from leezu January 18, 2021 01:15

mxnet-bot commented

Hey @szha, thanks for submitting the PR.
All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands:

  • To trigger all jobs: @mxnet-bot run ci [all]
  • To trigger specific jobs: @mxnet-bot run ci [job1, job2]

CI supported jobs: [website, sanity, windows-cpu, centos-gpu, miscellaneous, clang, unix-cpu, unix-gpu, edge, centos-cpu, windows-gpu]


Note:
Only the following 3 categories can trigger CI: PR Author, MXNet Committer, Jenkins Admin.
All CI tests must pass before the PR can be merged.

lanking520 added the pr-work-in-progress label (PR is still work in progress) Jan 18, 2021

szha (Member, Author) commented Jan 20, 2021

szha marked this pull request as ready for review February 5, 2021 04:25
lanking520 added the pr-awaiting-testing (PR is reviewed and waiting CI build and test) and pr-work-in-progress labels and removed the pr-work-in-progress and pr-awaiting-testing labels Feb 5, 2021
Signed-off-by: Sheng Zha <zhasheng@amazon.com>
lanking520 added the pr-awaiting-testing and pr-awaiting-review (PR is waiting for code review) labels and removed the pr-work-in-progress and pr-awaiting-testing labels Feb 5, 2021
leezu merged commit caa327b into apache:master Feb 5, 2021
leezu deleted the cu112 branch February 5, 2021 22:05

mseth10 (Contributor) commented Feb 8, 2021

Hey @szha @leezu, the CD pipeline for CUDA 11.2 currently fails because the config/distribution/linux_cu112.cmake file is missing.
https://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/restricted-mxnet-cd%2Fmxnet-cd-release-job/detail/mxnet-cd-release-job/2468/pipeline/76
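
A pre-flight check along the following lines could flag a missing per-variant config before the CD job runs. This is only a minimal sketch and not part of this PR: the function name `missing_variant_configs` and the variant list passed to it are hypothetical, and the only detail taken from the comment above is the `config/distribution/linux_<variant>.cmake` naming.

```python
from pathlib import Path


def missing_variant_configs(repo_root, variants):
    """Return the variants that lack a config/distribution/linux_<variant>.cmake file.

    `repo_root` is the path to a source checkout; `variants` is a list of
    build-variant names such as "cu112". Both are supplied by the caller
    (hypothetical helper, not an MXNet API).
    """
    config_dir = Path(repo_root) / "config" / "distribution"
    return [
        variant
        for variant in variants
        if not (config_dir / f"linux_{variant}.cmake").is_file()
    ]
```

Calling `missing_variant_configs(".", ["cu110", "cu112"])` from the repository root would return the variants whose config file is absent, so a CI step could fail early when the list is non-empty.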

szha restored the cu112 branch February 8, 2021 22:58
szha deleted the cu112 branch February 8, 2021 22:59

szha (Member, Author) commented Feb 8, 2021

#19870

access2rohit mentioned this pull request Feb 17, 2021
13 tasks
access2rohit pushed a commit to access2rohit/incubator-mxnet that referenced this pull request Feb 20, 2021
access2rohit pushed a commit to access2rohit/incubator-mxnet that referenced this pull request Feb 22, 2021
access2rohit pushed a commit to access2rohit/incubator-mxnet that referenced this pull request Feb 22, 2021
access2rohit pushed a commit to access2rohit/incubator-mxnet that referenced this pull request Feb 23, 2021
Zha0q1 pushed a commit that referenced this pull request Feb 25, 2021
) (#19930)

* Enable CUDA 11.0 on nightly development builds (#19295)

Remove CUDA 9.2 and CUDA 10.0

* [PIP] add build variant for cuda 11.2 (#19764)

* adding ci docker files for cu111 and cu112

* removing previous CUDA make versions and adding support for cuda11.2

Co-authored-by: waytrue17 <52505574+waytrue17@users.noreply.github.com>
Co-authored-by: Sheng Zha <szha@users.noreply.github.com>
Co-authored-by: Rohit Kumar Srivastava <srivastava.141@buckeyemail.osu.edu>
access2rohit added a commit to access2rohit/incubator-mxnet that referenced this pull request Mar 10, 2021
apache#19764) (apache#19930)
access2rohit added a commit to access2rohit/incubator-mxnet that referenced this pull request Mar 10, 2021
apache#19764) (apache#19930)
access2rohit added a commit to access2rohit/incubator-mxnet that referenced this pull request Mar 12, 2021
apache#19764) (apache#19930)
mseth10 added a commit that referenced this pull request Mar 14, 2021
…20015)

* [BACKPORT]Enable CUDA 11.0 on nightly + CUDA 11.2 on pip (#19295)(#19764) (#19930)

* Enable CUDA 11.0 on nightly development builds (#19295)

Remove CUDA 9.2 and CUDA 10.0

* [PIP] add build variant for cuda 11.2 (#19764)

* adding ci docker files for cu111 and cu112

* removing previous CUDA make versions and adding support for cuda11.2

Co-authored-by: waytrue17 <52505574+waytrue17@users.noreply.github.com>
Co-authored-by: Sheng Zha <szha@users.noreply.github.com>
Co-authored-by: Rohit Kumar Srivastava <srivastava.141@buckeyemail.osu.edu>

* [FEATURE]Migrating all CD pipelines to Ninja build + fix cu112 CD pipeline (#19974)

* migrating cd builds to ninja + removing static links to nvidia libs and legacy cuda versions

* installing NCCL manually for cuda11.2 container

* set MSHADOW_USE_CUDNN=1 in CMakeLists of mshadow to build properly for CUDNN support

* adding coverage to cd requirements file to fix cu100, cu101 and cu102 tests

* updating cd_test containers to ubuntu 18

* adding cmake config for linux native and adding USE_KV_STORE in linux_cpu

* updating zmq builds to statically link to libmxnet.so

* updating toolchains for r, clang and llvm for ubuntu18. OpenBLAS static link for 'distribution' build type only. Fix caffe build to use OpenCV 3. Remove legacy Clang 3.9 from CI

* fix versions for pip install in ubuntu_core_sh; add new search path for cuDNN

* fixing cudnn link problem for CUDA<=11.0

* adding library paths for libjpegturbo and lapack to fix failing CI on ubuntu 18 images

* removing ASAN integration test from miscellaneous CI as it's not required

* fix lapack path for gpu builds

* correctly installing libjpegturbo for ubuntu 18

* updating docker images of r, jekyll, julia etc. test containers + fix java version to 8

* installing libomp.so

* removing debug test as it's not required. Code clean-up

* adding alternate URL source for MNIST dataset as original website is down

* skipping flaky tests; issue tracked in #20011

Co-authored-by: Rohit Kumar Srivastava <srivastava.141@buckeyemail.osu.edu>

* update cudnn from 7 to 8 for cu102 (#19506)

* update cudnn from 7 to 8 for cu102 (#19522)

* downloading MNIST dataset from alternate URL (#20014)

Co-authored-by: Rohit Kumar Srivastava <srivastava.141@buckeyemail.osu.edu>

* fixing CI issue with v1.8.x

* addressing review comments

Co-authored-by: waytrue17 <52505574+waytrue17@users.noreply.github.com>
Co-authored-by: Sheng Zha <szha@users.noreply.github.com>
Co-authored-by: Rohit Kumar Srivastava <srivastava.141@buckeyemail.osu.edu>
Co-authored-by: Manu Seth <22492939+mseth10@users.noreply.github.com>
mseth10 added a commit to mseth10/incubator-mxnet that referenced this pull request Mar 15, 2021
…pache#20015)
Labels
pr-awaiting-review PR is waiting for code review
5 participants