@haojin2 @sxjscience Could you take a look as well, since you both worked on norm layers before?
Have you tried training MaskRCNN with this?
@Jerryzcn Not yet.
src/common/cuda_utils.h
Outdated
static_assert(NTHREADS <= warp_size * warp_size,
              "Number of threads too large for reduction");
__shared__ T scratch[NTHREADS / warp_size];
const int my_id = threadIdx.x % warp_size;
Could we use more informative names than my_*?
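To make the naming suggestion concrete, here is a minimal CPU-side sketch of a two-stage block reduction, written in plain C++ rather than CUDA. It is not the code in src/common/cuda_utils.h: the names lane_id, warp_id, kWarpSize, and block_reduce_sum_emulated are hypothetical, and the loops only stand in for what warp-level primitives would do on the GPU. It also shows why the static_assert bounds NTHREADS by warp_size * warp_size: the second stage is handled by a single warp, so there can be at most warp_size per-warp partial results.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption for this sketch: a warp has 32 lanes.
constexpr std::size_t kWarpSize = 32;

// Hypothetical CPU emulation of a two-stage block reduction (sum).
template <std::size_t NTHREADS>
float block_reduce_sum_emulated(const std::vector<float>& values) {
  static_assert(NTHREADS >= kWarpSize && NTHREADS % kWarpSize == 0,
                "Sketch assumes a whole number of warps");
  static_assert(NTHREADS <= kWarpSize * kWarpSize,
                "Number of threads too large for reduction");
  assert(values.size() == NTHREADS);

  constexpr std::size_t kNumWarps = NTHREADS / kWarpSize;
  // Stands in for the __shared__ scratch buffer: one slot per warp.
  std::vector<float> scratch(kNumWarps, 0.0f);

  // Stage 1: every warp reduces the values held by its own lanes.
  for (std::size_t warp_id = 0; warp_id < kNumWarps; ++warp_id) {
    for (std::size_t lane_id = 0; lane_id < kWarpSize; ++lane_id) {
      const std::size_t thread_id = warp_id * kWarpSize + lane_id;
      scratch[warp_id] += values[thread_id];
    }
  }

  // Stage 2: a single warp reduces the per-warp partials. Lane i reads
  // scratch[i], which only works if kNumWarps <= kWarpSize.
  float total = 0.0f;
  for (std::size_t lane_id = 0; lane_id < kWarpSize; ++lane_id) {
    if (lane_id < kNumWarps) {
      total += scratch[lane_id];
    }
  }
  return total;
}
```

With names such as lane_id (position inside the warp) and warp_id (index of the warp inside the block), the role of each index is visible at the point of use, which is the kind of clarity the comment above is asking for.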
Could we get some convergence results on fasterrcnn before merging? Thanks!
@Jerryzcn it's in progress, I'll post results here
@Kh4L Could you post results of your runs?
Lgtm
BTW, were you able to find the bug that causes master MXNet to fail on mask rcnn?
Reviving this PR. @apeforest I did change
@mxnet-bot run ci [centos-cpu]
Jenkins CI successfully triggered : [centos-cpu]
@mxnet-bot run ci [centos-cpu, unix-gpu]
Jenkins CI successfully triggered : [centos-cpu, unix-gpu]
@mxnet-bot run ci [unix-gpu]
Jenkins CI successfully triggered : [unix-gpu]
@apeforest Is this PR good to go now?
@apeforest Gentle ping
hmm not sure what's happening with the CI.
@mxnet-bot run ci [centos-cpu, miscellaneous]
Jenkins CI successfully triggered : [centos-cpu, miscellaneous]
@mxnet-bot run ci [centos-cpu, miscellaneous]
Jenkins CI successfully triggered : [miscellaneous, centos-cpu]
commit 8794a0a Author: Zhaoqi Zhu <zhaoqizh@usc.edu> Date: Tue Aug 18 17:36:42 2020 -0700 Numpy Dot Large Tensor Fix (apache#18925) * fix np dot * add test * fix test * tweak test Co-authored-by: Zhu <zhaoqzhu@3c22fbbb4e1a.ant.amazon.com> Co-authored-by: Ubuntu <ubuntu@ip-172-31-10-124.us-west-2.compute.internal> Co-authored-by: Ubuntu <ubuntu@ip-172-31-6-47.us-west-2.compute.internal> commit 32994bb Author: Przemyslaw Tredak <ptredak@nvidia.com> Date: Tue Aug 18 15:15:50 2020 -0700 Fix setting cudnn bias stride (apache#18905) commit c789d02 Author: Leonard Lausen <lausen@amazon.com> Date: Tue Aug 18 16:57:35 2020 +0000 Fix Python docs (apache#18924) * Fix Python docs * Fix * Fix commit 0afeb97 Author: nihui <shuizhuyuanluo@126.com> Date: Tue Aug 18 22:03:57 2020 +0800 Fix instancenorm math equation (apache#18955) commit e06ee4e Author: Przemyslaw Tredak <ptredak@nvidia.com> Date: Mon Aug 17 23:07:37 2020 -0700 Faster GPU frozen BatchNorm (apache#17368) * Better frozen batchnorm * Continue FreezeBN * Optimizations * Reduce number of mod operations * Cleaning * Fixing frozen bn with fix_gamma=False * Fix lint in BN * Backward frozen batchnorm * More work on backward of Frozen BN * Let it compile * NCHW Frozen BN backward * Frozen BN backward NHWC * Cleaning * Remove the change to Makefile * Fix from rebase * Temp space for BN backward * Fix from review * Fix lint * Changes from review commit 2610c10 Author: Serge Panev <spanev@nvidia.com> Date: Sat Aug 15 19:30:50 2020 -0700 Change Partition API's options_map to std::unordered_map (apache#18929) Signed-off-by: Serge Panev <spanev@nvidia.com> commit be12c8d Author: Sheng Zha <szha@users.noreply.github.com> Date: Fri Aug 14 17:35:34 2020 -0700 [Website] adjust website structure (apache#18839) * adjust website structure * update per comments * adjust ecosystem page * add ray tune * fix issues * update notebooks * fix breakage commit daf8b43 Author: Sam Skalicky <samskalicky@gmail.com> Date: Fri Aug 14 14:36:30 2020 
-0700 Support extra inputs for subgraph ops (apache#18779) Support additional inputs to custom subgraph ops that are not direct dependencies to ops in the subgraph. This will enable various use cases: custom control flow ops, custom ops that maintain a state that should be saved/loaded, etc. Highlights: * Added test that uses a graph pass (addInputPass) to add a new custom input to the subgraph op * Added new optional argument (clear) to hybridize & optimize_for APIs in Gluon Block to enable multiple optimizations * refactored lib_api.h JSON utilities * added new Graph data structure utilities to simplify custom graph passes * refactored custom op registration * enhanced custom subgraph op to support additional inputs to subgraph op that is not an input to ops in the subgraph * updated subgraph & graph pass READMEs * Added error messaging from external library commit 86e96dc Author: Przemyslaw Tredak <ptredak@nvidia.com> Date: Thu Aug 13 22:27:10 2020 -0700 Fix backward of arctan2 and rarctan2 scalar on GPU (apache#18440) commit ee80b77 Author: bgawrych <bartlomiej.gawrych@intel.com> Date: Fri Aug 14 07:19:52 2020 +0200 Fix default CPU allocator memory alignment (apache#18885) * Replace std::malloc to aligned memory allocation in Pooled StorageManager * Add test checking CPU memory alignment * Fix memory allocation success check * Fix sanity commit 344587f Author: MoisesHer <50716238+MoisesHer@users.noreply.github.com> Date: Thu Aug 13 22:18:26 2020 -0700 Safe accumulation for computing gradient in Embedding & Take (apache#18385) * Safe accumulation for computing gradient in Embedding & Take * Fix bug in TakeGrad: initialize temporal storage for safe_accumulation * fix lint * make MXNET_SAFE_ACCUMULATION compatible with Windows * Increase test coverage: small inputs & SAFE_ACCUMULATION commit a2b400c Author: Joshua Z. 
Zhang <cheungchih@gmail.com> Date: Wed Aug 12 22:47:47 2020 -0700 fix center element not being copied (apache#18917) commit e2cbf66 Author: Leonard Lausen <lausen@amazon.com> Date: Wed Aug 12 16:22:17 2020 +0000 Revert "drop list support for gluon trainer (apache#18877)" (apache#18892) This reverts commit d5fdcbf. commit 83d2af5 Author: Xi Wang <xidulu@gmail.com> Date: Wed Aug 12 15:25:17 2020 +0800 Gamma reparameterization gradient (apache#18852) * gamma grad wip * gamma grad wip * test tbd * fix grad * change scale to the frontend * fix bugs * change distributions.gamma * fix test and operator tune commit f2a8b97 Author: Leonard Lausen <lausen@amazon.com> Date: Wed Aug 12 01:51:11 2020 +0000 Remove manually created symbolic link to ninja-build (apache#18906) apache@6bcfce9 for master branch commit 016c166 Author: JackieWu <wkcn@live.cn> Date: Wed Aug 12 03:54:46 2020 +0800 remove upper bound (apache#18857) commit e101d68 Author: Xi Wang <xidulu@gmail.com> Date: Tue Aug 11 12:50:26 2020 +0800 [Gluon] Add VAE demo (apache#18758) * add VAE demo * minor changes * change format to md * minor changes * add liscence * Update VAE.md * update vae demo * remove unnecessary files commit d0e17e5 Author: Ke Han <38852697+hanke580@users.noreply.github.com> Date: Mon Aug 10 12:48:56 2020 +0800 [Numpy] FFI: sort, argsort, vstack etc (apache#17857) * * sort FFI * * argsort FFI * * vstack, row_stack FFI * * greater FFI * * inner FFI * multinomial FFI * rand FFI * randn FFI * * Fix input out of index and rscalar of greater * * Fix ndarray situation * * Fix sanity * fix lint * fix bugs * * Remove duplicate operator (greater) * * Fix Tuple downcast Error (Only Integer) * Fix segmentation fault(pointer) Co-authored-by: Sheng Zha <zhasheng@amazon.com> commit 5c50475 Author: Liu, Hao <haoliuhust@hotmail.com> Date: Mon Aug 10 08:15:22 2020 +0800 fix pooling_convention warning when convert model to onnx (apache#18529) * fix pooling_convention warning * fix pooling_convention warning * fix 
lint Co-authored-by: JackieWu <wkcn@live.cn> commit d52d9c6 Author: Sheng Zha <szha@users.noreply.github.com> Date: Sun Aug 9 13:33:03 2020 -0700 Revert "Add SOVERSION when build shared libmxnet.so library (apache#17815)" (apache#18882) This reverts commit d101c3c. commit 706c369 Author: Ziyue Huang <ziyue@apache.org> Date: Sun Aug 9 07:55:16 2020 +0800 fix trainer when the model involves share_parameters (apache#18880) * fix trainer when using shared_param * add unittest commit cf908fd Author: Xingjian Shi <xshiab@connect.ust.hk> Date: Fri Aug 7 19:55:36 2020 -0700 [Numpy][Bugfix] Add hybridization test to loss layers (apache#18876) * Test for hybridization * fix typo * fix * fix test * update * Update loss.py * fix bug of sum commit d5fdcbf Author: Haibin Lin <linhaibin.eric@gmail.com> Date: Fri Aug 7 18:11:16 2020 -0700 drop list support for gluon trainer (apache#18877) Co-authored-by: Ubuntu <ubuntu@ip-172-31-42-138.ec2.internal> commit dde635f Author: Leonard Lausen <lausen@amazon.com> Date: Fri Aug 7 21:16:24 2020 +0000 Re-enable the linker version scripts for binary distribution (apache#18872) * Symbol visibility * Fix commit 1694d2f Author: Sheng Zha <szha@users.noreply.github.com> Date: Fri Aug 7 11:21:22 2020 -0700 [CI] remove data.mxnet.io usage for CI stability (apache#18871) * remove duplicate mnist functions * remove data.mxnet.io usage in tests * add waitall commit 708a900 Author: Serge Panev <spanev@nvidia.com> Date: Fri Aug 7 10:46:22 2020 -0700 Fix a bug in MXNet-TensorRT (apache#18870) Signed-off-by: Serge Panev <spanev@nvidia.com> commit d101c3c Author: Gustavo Alvarez <462213+sl1pkn07@users.noreply.github.com> Date: Fri Aug 7 04:34:51 2020 +0200 Add SOVERSION when build shared libmxnet.so library (apache#17815) https://en.wikipedia.org/wiki/Soname https://cmake.org/cmake/help/latest/prop_tgt/SOVERSION.html Co-authored-by: Leonard Lausen <lausen@amazon.com> commit a3eabf0 Author: Leonard Lausen <lausen@amazon.com> Date: Thu Aug 6 15:52:52 2020 
+0000 Fix MXLibInfoCompiledWithCXX11ABI (apache#18864) * Fix MXLibInfoCompiledWithCXX11ABI * Fix test commit 84f8984 Author: bgawrych <bartlomiej.gawrych@intel.com> Date: Thu Aug 6 04:32:39 2020 +0200 ElementWiseSum fix for oneDNN (apache#18859) * Fix ElementwiseSum for DNNL * Add test for oneDNN ElemwiseSum Co-authored-by: Bart Gawrych <gawrych.bartlomiej@intel.com> commit a78f137 Author: Yang Shi <yangshia@amazon.com> Date: Wed Aug 5 14:24:46 2020 -0700 improve python api website ux - make toc sticky (apache#18863) commit 0f65ef6 Author: Xi Wang <xidulu@gmail.com> Date: Wed Aug 5 10:48:50 2020 +0800 nb fix (apache#18858) commit 7b7cef5 Author: Serge Panev <spanev@nvidia.com> Date: Tue Aug 4 18:23:48 2020 -0700 MXNet-TRT: Add PrePartition param caching - move init_tensorrt_params logic (apache#18490) * Update to TRT 7 API Signed-off-by: Serge Panev <spanev@nvidia.com> * Add PrePartition param caching - move init_tensorrt_params logic Signed-off-by: Serge Panev <spanev@nvidia.com> * Handle node with no defined input Signed-off-by: Serge Panev <spanev@nvidia.com> * Remove tmp comment Signed-off-by: Serge Panev <spanev@nvidia.com> commit 59e200a Author: Haibin Lin <linhaibin.eric@gmail.com> Date: Tue Aug 4 17:01:23 2020 -0700 fix nn.dense doc (apache#18830) Co-authored-by: Lin <haibilin@a483e7be4c92.ant.amazon.com> commit 2e97226 Author: Leonard Lausen <lausen@amazon.com> Date: Tue Aug 4 21:11:32 2020 +0000 Fix edge case when casting gluon Block before export (apache#18853) * Fix edge case when casting gluon Block before export Fixes apache#18843 * Fix gpu test commit b8eccc8 Author: Yang Shi <yangshia@amazon.com> Date: Tue Aug 4 14:08:09 2020 -0700 fix set default website version rewrite rule for cdn (apache#18856) commit 7a40219 Author: Serge Panev <spanev@nvidia.com> Date: Tue Aug 4 10:34:21 2020 -0700 Remove check for subgraph with cycles (apache#18555) * Remove check for subgraph with cycles Signed-off-by: Serge Panev <spanev@nvidia.com> * Add comments 
Signed-off-by: Serge Panev <spanev@nvidia.com> commit 95fa63f Author: Serge Panev <spanev@nvidia.com> Date: Mon Aug 3 17:15:02 2020 -0700 Update the onnx-tensorrt submodule - CI to TRT7 (apache#18574) commit 7f2e314 Author: Haibin Lin <linhaibin.eric@gmail.com> Date: Mon Aug 3 16:09:48 2020 -0700 update setup.py (apache#18850) * update setup.py * update python version Co-authored-by: Lin <haibilin@a483e7be4c92.ant.amazon.com> commit f872b43 Author: Leonard Lausen <lausen@amazon.com> Date: Mon Aug 3 20:11:06 2020 +0000 Protobuf_USE_STATIC_LIBS must be set on Apple too (apache#18851) Fixes apache#18840 commit 4bb8224 Author: Yang Shi <yangshia@amazon.com> Date: Mon Aug 3 12:30:13 2020 -0700 Fixed python website double scroller and improve UX (apache#18845) * make python site header scroll aware and avoid double scroller * add compiled assets * adjust python site second header height * add new line * set focus to main content on DOM load commit 7a5a488 Author: Iblis Lin <iblis@hs.ntnu.edu.tw> Date: Tue Aug 4 03:28:08 2020 +0800 Fix broken link in docs/README.md (apache#18847) commit 534cdbc Author: Sheng Zha <szha@users.noreply.github.com> Date: Mon Aug 3 11:58:33 2020 -0700 Create greetings.yml (apache#18842) commit 9fd2cce Author: kpuatamazon <56725192+kpuatamazon@users.noreply.github.com> Date: Mon Aug 3 17:40:44 2020 +0100 Update tests/README.md Docker instructions to match ci/README.md (apache#18848) Documentation was missing python3-docker and had an outdated platform. 
commit 54b9e9c Author: Sheng Zha <szha@users.noreply.github.com> Date: Mon Aug 3 08:59:33 2020 -0700 remove unnecessary usage of pretrained models, and prefer smaller size (apache#18844) commit 51340d8 Author: Haibin Lin <linhaibin.eric@gmail.com> Date: Sat Aug 1 16:23:03 2020 -0700 Add compiled_with_cxx11_abi API (apache#18836) * draft * add impl * add test * set default val Co-authored-by: Ubuntu <ubuntu@ip-172-31-42-138.ec2.internal> commit 5a22193 Author: Sheng Zha <szha@users.noreply.github.com> Date: Fri Jul 31 17:06:17 2020 -0700 [NumPy] allow mixed array types (apache#18562) * allow mixed types in array func protocol * fix apache#18746 * add support for memory share check commit 08a5ee3 Author: Tao Lv <tao.a.lv@intel.com> Date: Sat Aug 1 03:38:20 2020 +0800 fix gelu to use erf based algorithm (apache#18827) commit ac36089 Author: Leonard Lausen <lausen@amazon.com> Date: Fri Jul 31 04:54:10 2020 +0000 Fixup move gluon.metric api docs (apache#18748) * Fix metric API page * Update index.rst commit 7a24006 Author: Leonard Lausen <lausen@amazon.com> Date: Fri Jul 31 02:58:55 2020 +0000 Enable DIST_KVSTORE by default in staticbuild (apache#18796) * Enable DIST_KVSTORE by default in staticbuild set(USE_DIST_KVSTORE ON CACHE BOOL "Build with DIST_KVSTORE support") * Ensure static linkage of dependencies * Fix for OS X * Fix shell syntax * Alternate approach to force static linkage of libprotobuf commit aa53291 Author: Yang Shi <yangshia@amazon.com> Date: Thu Jul 30 19:53:27 2020 -0700 add adaptive left margin for python site document body (apache#18828) commit 045efb2 Author: Sheng Zha <szha@users.noreply.github.com> Date: Thu Jul 30 19:19:33 2020 -0700 [NumPy] DLPack refactor and npx.from_numpy (apache#18656) * refactor dlpack and add from_numpy to npx * remove reference of DeepNumPy * map platform-dependent types to fixed-size types * update DMLC_LOG_FATAL_THROW * fix flaky * fix flaky * test no error commit 608afef Author: Xi Wang <xidulu@gmail.com> Date: Fri 
Jul 31 02:30:25 2020 +0800 Fix dirichlet flaky tests (apache#18817) * make parameter smoother * minor changes commit 6bbd531 Author: Leonard Lausen <lausen@amazon.com> Date: Wed Jul 29 20:31:19 2020 +0000 Update clang-tidy integration (apache#18815) Run clang-tidy via cmake only on the code managed by mxnet (and not 3rdparty dependencies), update to clang-tidy-10 and run clang-tidy-10 -fix to fix all the warnings that are enforced on CI. Developers can run clang-tidy by specifying the -DCMAKE_CXX_CLANG_TIDY="clang-tidy-10" to cmake, or using the python ci/build.py -R --platform ubuntu_cpu /work/runtime_functions.sh build_ubuntu_cpu_clang_tidy script. commit b685fad Author: Yang Shi <yangshia@amazon.com> Date: Wed Jul 29 12:22:12 2020 -0700 use regex that is supported by all browsers (apache#18811) commit 9308aca Author: Yang Shi <yangshia@amazon.com> Date: Wed Jul 29 12:21:42 2020 -0700 remove other language bindings section from website api page (apache#18783) * remove other language bindings section from api page * remove language binding docs redirect * add call for contribution banner * modify call for contribution wording Co-authored-by: Aaron Markham <markhama@amazon.com> * more wording modification Co-authored-by: Aaron Markham <markhama@amazon.com> * add hyperlink to 1.x version in banner * add reference to the C api deprecation github issue Co-authored-by: Aaron Markham <markhama@amazon.com> commit 915f6b4 Author: Yang Shi <yangshia@amazon.com> Date: Wed Jul 29 11:28:37 2020 -0700 Remove deepnumpy reference and move Numpy tutorials to top level (apache#18798) * move np tutorials to top level * replace deepnumpy reference to np * add info in card * remove useless entry * replace NDArray API card with np.ndarray * python site refactor * remove duplicated drawer and refactor layout * extend document width to 100% for xl devices commit e9829e7 Author: Joe Evans <github@250hacks.net> Date: Tue Jul 28 18:53:29 2020 -0700 Cherry-pick large tensor support from 
apache#18752. (apache#18804) Co-authored-by: Joe Evans <joeev@amazon.com> commit 126636c Author: Leonard Lausen <lausen@amazon.com> Date: Tue Jul 28 22:11:20 2020 +0000 Fix naming in runtime_functions.sh (apache#18795) commit f83dbac Author: Haibin Lin <linhaibin.eric@gmail.com> Date: Tue Jul 28 11:48:05 2020 -0700 remove executor manager from API doc (apache#18802) Co-authored-by: Lin <haibilin@a483e7be4c92.ant.amazon.com> commit 7908d7e Author: Yiyan66 <57363390+Yiyan66@users.noreply.github.com> Date: Tue Jul 28 15:11:19 2020 +0800 [numpy] fix flaky mixed precision binary error (apache#18660) * temp * change test * fix bad func call * test * rectify * doc * change test commit a807f6d Author: Sheng Zha <szha@users.noreply.github.com> Date: Mon Jul 27 22:06:50 2020 -0700 [NumPy] loss for np array (apache#17196) * loss for np/nd array * fix flaky commit 74430a9 Author: phile <phile_999@126.com> Date: Tue Jul 28 06:44:54 2020 +0800 remove NLL in metric (apache#18794) commit 9e77e81 Author: Przemyslaw Tredak <ptredak@nvidia.com> Date: Mon Jul 27 14:27:52 2020 -0700 Update CUB and include it only for CUDA < 11 (apache#18799) commit 98b3f73 Author: Sheng Zha <szha@users.noreply.github.com> Date: Sat Jul 25 16:19:36 2020 -0700 add support for np.ndarray in autograd.function (apache#18790) commit c1db2d5 Author: Leonard Lausen <lausen@amazon.com> Date: Sat Jul 25 16:58:45 2020 +0000 Remove caffe plugin (apache#18787) * Remove caffe plugin * Fix * Remove CXX14 feature flag * Update test commit 2fbd182 Author: Leonard Lausen <lausen@amazon.com> Date: Sat Jul 25 02:48:30 2020 +0000 Split up CI sanity test functions to enable fine-grained trigger (apache#18786) Developers can now trigger fine grained checks: python ci/build.py -R --platform ubuntu_cpu /work/runtime_functions.sh sanity_python python ci/build.py -R --platform ubuntu_cpu /work/runtime_functions.sh sanity_license etc commit 06b5d22 Author: Serge Panev <spanev@nvidia.com> Date: Fri Jul 24 14:22:42 2020 -0700 ONNX 
import: use Conv pad attribute for symmetrical padding (apache#18675) Signed-off-by: Serge Panev <spanev@nvidia.com> commit e31ad77 Author: Yang Shi <yangshia@amazon.com> Date: Thu Jul 23 11:33:31 2020 -0700 set website default version to current stable (1.6) version (apache#18738) * set website default version - test redirect * enable first time redirect on all master website pages * update test code * remove unnecessary test code * fix typo * delete test code commit 02ae456 Author: Dick Carter <dick.carter@comcast.net> Date: Thu Jul 23 11:17:10 2020 -0700 Improve environment variable handling in unittests (apache#18424) This PR makes it easy to create unittests that require specific settings of environment variables, while avoiding the pitfalls (discussed in comments section). This PR can be considered a recasting and expansion of the great vision of @larroy in creating the EnvManager class in apache#13140. In its base form, the facility is a drop-in replacement for EnvManager, and is called 'environment': with environment('MXNET_MY_NEW_FEATURE', '1'): <test with feature enabled> with environment('MXNET_MY_NEW_FEATURE', '0'): <test with feature disabled> Like EnvManager, this facility takes care of the save/restore of the previous environment variable state, including when exceptions are raised. In addition though, this PR introduces the features: A similarly-named unittest decorator: @with_environment(key, value) The ability to pass in multiple env vars as a dict (as is needed for some tests) in both forms, so for example: with environment({'MXNET_FEATURE_A': '1', 'MXNET_FEATURE_B': '1'}): <test with both features enabled> Works on Windows! This PR includes a wrapping of the backend's setenv() and getenv() functions, and uses this direct access to the backend environment to keep it in sync with the python environment. This works around the problem that the C Runtime on Windows gets a snapshot of the Python environment at startup that is immutable from Python. 
with environment() has a simple implementation using the @contextmanager decorator Tests are included that validate the facility works with all combinations of before_val/set_val, namely unset/unset, unset/set, set/unset, set/set. There were 5 unittests previously using EnvManager, and this PR shifts those uses to with environment():, while converting over 20 other ad-hoc uses of os.environ[] within the unittests. This PR also enables those unittests that were bypassed on Windows (due to the inability to set environment variables) to run on all platforms. Further Comments Environment variables are a two-edged sword- they enable useful operating modes for testing, debugging or niche applications, but like all features they must be tested. The correct approach for testing with a particular env var setting is: def set_env_var(key, value): if value is None: os.environ.pop(key, None) else: os.environ[key] = value old_env_var_value = os.environ.get(env_var_name) try: set_env_var(env_var_name, test_env_var_value) <perform test> finally: set_env_var(env_var_name, old_env_var_value ) The above code makes no assumption about whether the before-test and within-test state of the env var is set or unset, and restores the prior environment even if the test raises an exception. This represents a lot of boiler-plate code that could be potentially mishandled. The with environment() context makes it simple to handle all this properly. If an entire unittest wants a forced env var setting, then using the @with_environment() decorator avoids the code indent of the with environment() approach if used otherwise within the test. 
commit 18af71e Author: Leonard Lausen <lausen@amazon.com> Date: Thu Jul 23 18:09:10 2020 +0000 CI: Migrate remaining Dockerfiles to docker-compose.yml and remove unused code (apache#18771) * Migrate remaining Dockerfiles to docker-compose.yml - Delete unused Dockerfiles - Delete unused install/*.sh scripts - Consolidate ubuntu_gpu_tensorrt and ubuntu_gpu - Remove deprecated logic in ci/build.py (no longer needed with docker-compose) - Remove ci/docker_cache.py (no longer needed with docker-compose) * Fix * Fix * Fix ubuntu_cpu_jekyll commit 1928117 Author: Przemyslaw Tredak <ptredak@nvidia.com> Date: Tue Jul 21 23:35:15 2020 -0700 Fix crash when accessing already destructed static variables (apache#18768) commit a330a02 Author: Leonard Lausen <lausen@amazon.com> Date: Wed Jul 22 06:31:47 2020 +0000 Fix mx.symbol.numpy._Symbol.__deepcopy__ logic error (apache#18686) * Fix mx.symbol.numpy._Symbol.__deepcopy__ logic error Performed shallow copy instead of deep copy * Test * Fix test commit 9548b0c Author: Leonard Lausen <lausen@amazon.com> Date: Tue Jul 21 21:42:01 2020 +0000 Remove duplicate settings in .codecov.yml (apache#18763) New PRs started showing the codecov/project badge again due apparent change in codecov's backend resolving these duplicate options specified in .codecov.yml
* Better frozen batchnorm
* Continue FreezeBN
* Optimizations
* Reduce number of mod operations
* Cleaning
* Fixing frozen bn with fix_gamma=False
* Fix lint in BN
* Backward frozen batchnorm
* More work on backward of Frozen BN
* Let it compile
* NCHW Frozen BN backward
* Frozen BN backward NHWC
* Cleaning
* Remove the change to Makefile
* Fix from rebase
* Temp space for BN backward
* Fix from review
* Fix lint
* Changes from review
Description
This PR introduces an improved implementation of GPU BatchNorm when use_global_stats is True.
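For context on what the frozen case computes: with use_global_stats set to True, BatchNorm normalizes with the stored moving mean and variance instead of batch statistics, so the forward pass is a per-channel affine transform. The following CPU sketch in plain C++ shows that arithmetic for NCHW data. It is only an illustration under stated assumptions, not the GPU kernel from this PR; ChannelAffine, fold_frozen_bn, and frozen_bn_forward_nchw are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper type: per-channel scale and shift.
struct ChannelAffine {
  std::vector<float> scale;
  std::vector<float> shift;
};

// Fold gamma, beta, and the stored statistics into one multiply-add per
// element: y = x * scale[c] + shift[c].
ChannelAffine fold_frozen_bn(const std::vector<float>& gamma,
                             const std::vector<float>& beta,
                             const std::vector<float>& moving_mean,
                             const std::vector<float>& moving_var,
                             float eps) {
  const std::size_t channels = gamma.size();
  assert(beta.size() == channels && moving_mean.size() == channels &&
         moving_var.size() == channels);
  ChannelAffine affine{std::vector<float>(channels),
                       std::vector<float>(channels)};
  for (std::size_t c = 0; c < channels; ++c) {
    affine.scale[c] = gamma[c] / std::sqrt(moving_var[c] + eps);
    affine.shift[c] = beta[c] - moving_mean[c] * affine.scale[c];
  }
  return affine;
}

// Apply the folded transform to a tensor stored in NCHW order.
std::vector<float> frozen_bn_forward_nchw(const std::vector<float>& x,
                                          std::size_t batch,
                                          std::size_t channels,
                                          std::size_t spatial,  // H * W
                                          const ChannelAffine& affine) {
  assert(x.size() == batch * channels * spatial);
  assert(affine.scale.size() == channels);
  std::vector<float> y(x.size());
  for (std::size_t n = 0; n < batch; ++n) {
    for (std::size_t c = 0; c < channels; ++c) {
      const std::size_t base = (n * channels + c) * spatial;
      for (std::size_t i = 0; i < spatial; ++i) {
        y[base + i] = x[base + i] * affine.scale[c] + affine.shift[c];
      }
    }
  }
  return y;
}
```

Because the statistics are fixed, no reduction over the batch is needed in the forward pass, and the per-channel constants can be computed once per call rather than once per element.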
Performance results (using a V100 PCIe card, shape of data = (208, 64, 112, 112)):
dtype = float32
dtype = float16
@Kh4L @Jerryzcn FYI
Checklist
Essentials
Please feel free to remove inapplicable items for your PR.