[v1.7] Fix the monitor_callback invalid issue during calibration with variable input shapes #18703

ciyongch · 2020-07-14T02:30:58Z

Description

Backport #18632 to address the invalid monitor_callback issue during calibration with variable input shapes.

… variable input shapes (apache#18632)
* Fix the monitor_callback invalid issue during calibration with variable input shapes
* retrigger CI
* Add UT for monitor check and disable codecov
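The description above is terse, so here is a minimal toy sketch of one way a monitor callback can become invalid when input shapes vary during calibration. It is an illustration only and is not taken from the PR diff. `ToyExecutor`, `reshape`, `keep_callback`, and `calibrate` are hypothetical names, not MXNet APIs, and the assumption that the callback gets lost when an executor is rebuilt for a new shape is mine, not something this thread confirms.

```python
# Toy model only: ToyExecutor and calibrate are hypothetical names,
# not MXNet classes or functions.
from typing import Callable, List, Optional, Tuple

Shape = Tuple[int, ...]
Callback = Callable[[str, Shape], None]


class ToyExecutor:
    """Stand-in for an executor bound to one fixed input shape."""

    def __init__(self, shape: Shape) -> None:
        self.shape = shape
        self.monitor_callback: Optional[Callback] = None

    def set_monitor_callback(self, callback: Callback) -> None:
        self.monitor_callback = callback

    def reshape(self, new_shape: Shape, keep_callback: bool) -> "ToyExecutor":
        # A new executor is built for the new shape. Unless the callback is
        # copied over explicitly, the new executor has none (assumed failure mode).
        new_exec = ToyExecutor(new_shape)
        if keep_callback:
            new_exec.monitor_callback = self.monitor_callback
        return new_exec

    def forward(self) -> None:
        # The callback is how calibration observes layer outputs.
        if self.monitor_callback is not None:
            self.monitor_callback("layer_output", self.shape)


def calibrate(shapes: List[Shape], keep_callback: bool) -> List[Shape]:
    """Run one forward pass per batch and return the shapes the monitor saw."""
    seen: List[Shape] = []
    executor = ToyExecutor(shapes[0])
    executor.set_monitor_callback(lambda name, shape: seen.append(shape))
    for shape in shapes:
        if shape != executor.shape:
            executor = executor.reshape(shape, keep_callback)
        executor.forward()
    return seen


shapes = [(1, 3, 224, 224), (1, 3, 256, 256), (1, 3, 224, 224)]
lost = calibrate(shapes, keep_callback=False)  # only the first batch is observed
kept = calibrate(shapes, keep_callback=True)   # every batch is observed
print(len(lost), len(kept))
```

In this toy model, calibration statistics would be collected from the first batch only unless the callback is carried over to each rebuilt executor.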

mxnet-bot · 2020-07-14T02:31:02Z

Hey @ciyongch, thanks for submitting the PR.
All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands:

To trigger all jobs: @mxnet-bot run ci [all]
To trigger specific jobs: @mxnet-bot run ci [job1, job2]

CI supported jobs: [edge, centos-gpu, centos-cpu, windows-cpu, miscellaneous, website, clang, windows-gpu, unix-cpu, sanity, unix-gpu]

Note:
Only the following 3 categories can trigger CI: PR Author, MXNet Committer, Jenkins Admin.
All CI tests must pass before the PR can be merged.
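As a rough illustration of the command syntax the bot describes, the sketch below parses a comment of the form `@mxnet-bot run ci [job1, job2]` and checks the requested jobs against the supported list quoted above. This is not the bot's actual implementation. `parse_ci_command` and `COMMAND_RE` are hypothetical names, and the error handling is an assumption.

```python
import re
from typing import List

# Job names copied from the bot's message; everything else here is hypothetical.
SUPPORTED_JOBS = [
    "edge", "centos-gpu", "centos-cpu", "windows-cpu", "miscellaneous",
    "website", "clang", "windows-gpu", "unix-cpu", "sanity", "unix-gpu",
]

COMMAND_RE = re.compile(r"@mxnet-bot\s+run\s+ci\s+\[([^\]]*)\]")


def parse_ci_command(comment: str) -> List[str]:
    """Return the CI jobs requested in a comment, or [] if it holds no command."""
    match = COMMAND_RE.search(comment)
    if match is None:
        return []
    requested = [job.strip() for job in match.group(1).split(",") if job.strip()]
    if requested == ["all"]:
        return list(SUPPORTED_JOBS)
    unknown = [job for job in requested if job not in SUPPORTED_JOBS]
    if unknown:
        # Assumed behavior: reject the whole command on any unknown job name.
        raise ValueError(f"unsupported CI jobs: {unknown}")
    return requested


print(parse_ci_command("@mxnet-bot run ci [unix-gpu]"))
```

The permission check (PR Author, MXNet Committer, Jenkins Admin) is left out because this thread does not show how the bot determines a commenter's role.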

ciyongch · 2020-07-14T05:21:40Z

@mxnet-bot run ci [unix-gpu]

mxnet-bot · 2020-07-14T05:21:43Z

Jenkins CI successfully triggered : [unix-gpu]

ciyongch · 2020-07-14T10:05:44Z

@mxnet-bot run ci [unix-gpu]

mxnet-bot · 2020-07-14T10:05:50Z

Jenkins CI successfully triggered : [unix-gpu]

ciyongch · 2020-07-14T13:53:30Z

Hi @TaoLv @szha @leezu @ChaiBapchya, please help review and merge.

TaoLv · 2020-07-14T14:55:37Z

I just rebased your source branch. @ciyongch

ChaiBapchya

Thanks for the cherry-pick!

ciyongch · 2020-07-15T00:28:23Z

@mxnet-bot run ci [unix-gpu]

mxnet-bot · 2020-07-15T00:28:29Z

Jenkins CI successfully triggered : [unix-gpu]

ciyongch · 2020-07-15T01:32:47Z

@mxnet-bot run ci [unix-cpu]

mxnet-bot · 2020-07-15T01:32:51Z

Jenkins CI successfully triggered : [unix-cpu]

ciyongch · 2020-07-15T04:21:48Z

@mxnet-bot run ci [unix-cpu]

mxnet-bot · 2020-07-15T04:21:54Z

Jenkins CI successfully triggered : [unix-cpu]

ciyongch · 2020-07-15T06:49:51Z

Hi @TaoLv , please help to merge.

… variable input shapes (apache#18632) (apache#18703)
* Fix the monitor_callback invalid issue during calibration with variable input shapes
* retrigger CI
* Add UT for monitor check and disable codecov

Co-authored-by: Tao Lv <tao.a.lv@intel.com>

* Fix einsum gradient (#18482)
* [v1.7.x] Backport PRs of numpy features (#18653)
  * add zero grad for npi_unique (#18080)
  * fix np.clip scalar input case (#17788)
  * fix true_divide (#18393)
  Co-authored-by: Hao Jin <hjjn.amzn@gmail.com>
  Co-authored-by: Xi Wang <xidulu@gmail.com>
* [v1.7.x] backport mixed type binary ops to v1.7.x (#18649)
  * Fix Windows GPU CI (#17962)
    Update Windows CI to use VS 2019 and enable x64 bit toolchain. Previously we are using an older 32 bit toolchain causing OOM errors during linking. Switching to x64 bit toolchain on the older VS version previously used by the CI was attempted in #17912 and did not work.
    Update to Cuda 10.2 as it is required by VS 2019. Switch to ninja-build on Windows to speed up build as ninja-build is now preinstalled. Remove logic to install cmake 3.16 on every PR as cmake 3.17 is now preinstalled.
    Add build retrials due to cuda thrust + VS2019 flakyness.
    Co-authored-by: vexilligera <vexilligera@gmail.com>
  * backport mixed type
  Co-authored-by: Leonard Lausen <lausen@amazon.com>
  Co-authored-by: vexilligera <vexilligera@gmail.com>
* revise activations (#18700)
* [v1.6] Fix the monitor_callback invalid issue during calibration with variable input shapes (#18632) (#18703)
  * Fix the monitor_callback invalid issue during calibration with variable input shapes
  * retrigger CI
  * Add UT for monitor check and disable codecov
  Co-authored-by: Tao Lv <tao.a.lv@intel.com>
* Fail build_windows.py if all retries failed (#18177)
* Update to thrust 1.9.8 on Windows (#18218)
  * Update to thrust 1.9.8 on Windows
  * Remove debug logic
* Re-enable build retries on MSVC (#18230)
  Updating thrust alone did not help. Similar issues (though less often) still occur with updated thrust, and also with nvidia cub. Tracked upstream at NVIDIA/thrust#1090

Co-authored-by: Ke Han <38852697+hanke580@users.noreply.github.com>
Co-authored-by: Xingjian Shi <xshiab@connect.ust.hk>
Co-authored-by: Hao Jin <hjjn.amzn@gmail.com>
Co-authored-by: Xi Wang <xidulu@gmail.com>
Co-authored-by: Yijun Chen <chenyijun0902@gmail.com>
Co-authored-by: vexilligera <vexilligera@gmail.com>
Co-authored-by: ciyong <ciyong.chen@intel.com>
Co-authored-by: Tao Lv <tao.a.lv@intel.com>


[v1.6] Fix the monitor_callback invalid issue during calibration with variable input shapes (apache#18632)

8c24780

ciyongch requested a review from szha as a code owner July 14, 2020 02:30

TaoLv approved these changes Jul 14, 2020

Merge branch 'v1.7.x' into fix_calib_1.7

369cea8

ChaiBapchya approved these changes Jul 14, 2020

TaoLv merged commit 64f737c into apache:v1.7.x Jul 15, 2020
