This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Apache MXNet (incubating) 2.0.0.beta1 Release Candidate 1

Pre-release
@barry-jin released this 08 Mar 23:34 · fc54fab

Features

Implementations and Improvements

Array-API Standardization

  • [API] Extend NumPy Array dtypes with int16, uint16, uint32, uint64 (#20478)
  • [API Standardization] Add Linalg kernels: (diagonal, outer, tensordot, cross, trace, matrix_transpose) (#20638)
  • [API Standardization] Standardize MXNet NumPy Statistical & Linalg Functions (#20592)
  • [2.0] Bump Python to >= 3.8 (#20593)
  • [API] Add positive (#20667)
  • [API] Add logaddexp (#20673)
  • [API] Add linalg.svdvals (#20696)
  • [API] Add floor_divide (#20620)
  • [API STD][SEARCH FUNC] Add keepdims=False to argmax/argmin (#20692)
  • [API NEW][METHOD] Add mT, permute_dims (#20688)
  • [API] Add bitwise_left/right_shift (#20587)
  • [API NEW][ARRAY METHOD] Add Index() and array_namespace() (#20689)
  • [API STD][LINALG] Standardize sort & linalg operators (#20694)
  • [API NEW][SET FUNC] Add set functions (#20693)
  • [API] Standardize MXNet NumPy creation functions (#20572)
  • [API NEW][LINALG] Add vector_norm, matrix_norm (#20703)
  • [API TESTS] Standardization and add more array api tests (#20725)
  • [API] Add new dlpack API (#20546)
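
A few of the standardized behaviors above can be sketched with NumPy, which implements the same array-API semantics for these functions (illustrative only; the MXNet versions live under `mxnet.np`):

```python
import math
import numpy as np

# logaddexp computes log(exp(a) + exp(b)) without overflowing,
# even when the inputs are large.
a = np.array([1.0, 1000.0])
b = np.array([2.0, 1000.0])
out = np.logaddexp(a, b)

# floor_divide rounds the quotient toward negative infinity.
q = np.floor_divide(np.array([7, -7]), 2)   # [3, -4]

# argmax/argmin now accept keepdims; keepdims=True preserves
# the reduced axis with length 1.
x = np.arange(6).reshape(2, 3)
idx = np.argmax(x, axis=1, keepdims=True)   # shape (2, 1)
```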

FFI Improvements

  • [FFI] Add new containers and Implementations (#19685)
  • [FFI] Randint (#20083)
  • [FFI] npx.softmax, npx.activation, npx.batch_norm, npx.fully_connected (#20087)
  • [FFI] expand_dims (#20073)
  • [FFI] npx.pick, npx.convolution, npx.deconvolution (#20101)
  • [FFI] npx.pooling, npx.dropout, npx.one_hot, npx.rnn (#20102)
  • [FFI] fix masked_softmax (#20114)
  • [FFI] part5: npx.batch_dot, npx.arange_like, npx.broadcast_like (#20110)
  • [FFI] part4: npx.embedding, npx.topk, npx.layer_norm, npx.leaky_relu (#20105)
  • make stack use faster API (#20059)
  • Add interleaved_matmul_* to npx namespace (#20375)

Operators

  • [FEATURE] AdaBelief operator (#20065)
  • [Op] Fix reshape and mean (#20058)
  • Fusing gelu post operator in Fully Connected symbol (#20228)
  • [operator] Add logsigmoid activation function (#20268)
  • [operator] Add Mish Activation Function (#20320)
  • [operator] add threshold for mish (#20339)
  • [NumPy] Wrap unravel_index backend implementation instead of fallback (#20730)
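
The new activation operators above have simple closed forms. A minimal pure-Python sketch of their semantics (the `threshold` cutoff mirrors the idea of #20339, though the exact value used there is not stated here):

```python
import math

def softplus(x, threshold=20.0):
    # For large x, softplus(x) ~= x; the cutoff avoids overflow in exp().
    return x if x > threshold else math.log1p(math.exp(x))

def mish(x):
    # Mish(x) = x * tanh(softplus(x))
    return x * math.tanh(softplus(x))

def logsigmoid(x):
    # log(sigmoid(x)) = -softplus(-x), computed stably
    return -softplus(-x)
```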

cuDNN & CUDA & RTC & GPU Engine

  • [FEATURE] Use RTC for reduction ops (#19426)
  • Improve add_bias_kernel for small bias length (#19744)
  • [PERF] Moving GPU softmax to RTC and optimizations (#19905)
  • [FEATURE] Load libcuda with dlopen instead of dynamic linking (#20484)
  • [FEATURE] Add backend MXGetMaxSupportedArch() and frontend get_rtc_compile_opts() for CUDA enhanced compatibility (#20443)
  • Expand NVTX usage (#18683)
  • Fast cuDNN BatchNorm NHWC kernels support (#20615)
  • Add async GPU dependency Engine (#20331)
  • Port convolutions to cuDNN v8 API (#20635)
  • Automatic Layout Management (#20718)
  • Use cuDNN for conv bias and bias grad (#20771)
  • Fix the regular expression in RTC code (#20810)

Miscellaneous

  • 1bit gradient compression implementation (#17952)
  • add inline for __half2float_warp (#20152)
  • [FEATURE] Add interleaved batch_dot oneDNN fuses for new GluonNLP models (#20312)
  • [ONNX] Forward port new mx2onnx into master (#20355)
  • Add new benchmark function for single operator comparison (#20388)
  • [BACKPORT] [FEATURE] Add API to control denormalized computations (#20387)
  • [v1.9.x] modify erfinv implementation based on scipy (#20517) (#20550)
  • [REFACTOR] Refactor test_quantize.py to use Gluon API (#20227)
  • Switch all HybridBlocks to use forward interface (#20262)
  • [FEATURE] MXIndexedRecordIO: avoid re-build index (#20549)
  • Split np_elemwise_broadcast_logic_op.cc (#20580)
  • [FEATURE] Add feature of retain_grad (#20500)
  • [v2.0] Split Large Source Files (#20604)
  • [submodule] Remove soon to be obsolete dnnl nomenclature from mxnet (#20606)
  • Added ::GCD and ::LCM: [c++17] contains gcd and lcm implementation (#20583)
  • [v2.0] RNN: use rnn_params (#20384)
  • Add quantized batch_dot (#20680)
  • [master] Add aliases for subgraph operators to be compatible with old models (#20679)
  • Optimize preparation of selfattn operators (#20682)
  • Fix scale bug in quantized batch_dot (#20735)
  • [master] Merge DNNL adaptive pooling with standard pooling (#20741)
  • Avoid redundant memcpy when reorder not in-place (#20746)
  • Add microbenchmark for FC + add fusion (#20780)
  • Optimize 'take' operator for CPU (#20745)
  • [FEATURE] Add g5 instance to CI (#20876)
  • Avoid modifying loaded library map while iterating in lib_close() (#20941)
  • quantized transpose operator (#20817)
  • Remove first_quantization_pass FC property (#20908)
  • Reduce after quantization memory usage (#20894)
  • [FEATURE] Add quantized version of reshape with DNNL reorder primitive. (#20835)
  • [FEATURE] Fuse dequantize with convolution (#20816)
  • [FEATURE] Add binomial sampling and fix multinomial sampling (#20734)
  • Refactor src/operator/subgraph/dnnl/dnnl_conv.cc file (#20849)
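
1-bit gradient compression (#17952) transmits only the sign of each gradient element plus one scale per tensor, carrying the quantization error forward into the next step. A minimal NumPy sketch of the general scheme; the function and variable names here are illustrative, not MXNet's actual API:

```python
import numpy as np

def onebit_compress(grad, residual):
    # Sign-based 1-bit compression with error feedback: send only the
    # sign of (gradient + accumulated quantization error), scaled by a
    # single scalar per tensor.
    corrected = grad + residual
    scale = np.abs(corrected).mean()
    compressed = scale * np.sign(corrected)
    new_residual = corrected - compressed   # carry the error forward
    return compressed, new_residual

g = np.array([0.5, -0.2, 0.1])
r = np.zeros_like(g)
c, r = onebit_compress(g, r)
```

By construction `c + r` equals the original gradient, so no information is permanently lost; it is just deferred to later updates.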

Language Bindings

  • Adding MxNet.Sharp package to the ecosystem page (#20162)
  • Add back cpp-package (#20131)

MKL & OneDNN

  • [operator] Integrate oneDNN layer normalization implementation (#19562)
  • Change inner mxnet flags nomenclature for oneDNN library (#19944)
  • Change MXNET_MKLDNN_DEBUG define name to MXNET_ONEDNN_DEBUG (#20031)
  • Change mx_mkldnn_lib to mx_onednn_lib in Jenkins_steps.groovy file (#20035)
  • Fix oneDNN feature name in MXNet (#20070)
  • Change MXNET_MKLDNN* flag names to MXNET_ONEDNN* (#20071)
  • Change _mkldnn test and build scenarios names to _onednn (#20034)
  • [submodule] Upgrade oneDNN to v2.2.1 (#20080)
  • [submodule] Upgrade oneDNN to v2.2.2 (#20267)
  • [operator] Integrate matmul primitive from oneDNN in batch dot (#20340)
  • [submodule] Upgrade oneDNN to v2.2.3 (#20345)
  • [submodule] Upgrade oneDNN to v2.2.4 (#20360)
  • [submodule] Upgrade oneDNN to v2.3 (#20418)
  • Fix backport of SoftmaxOutput implementation using onednn kernels (#20459)
  • [submodule] Upgrade oneDNN to v2.3.2 (#20502)
  • [FEATURE] Add oneDNN support for npx.reshape and np.reshape (#20563)
  • [Backport] Enabling BRGEMM FullyConnected based on shapes (#20568)
  • [BACKPORT][BUGFIX][FEATURE] Add oneDNN 1D and 3D deconvolution support and fix bias (#20292)
  • [FEATURE] Enable dynamic linking with MKL and compiler based OpenMP (#20474)
  • [Performance] Add oneDNN support for temperature parameter in Softmax (#20567)
  • [FEATURE] Add oneDNN support for numpy concatenate operator (#20652)
  • [master] Make warning message when oneDNN is turned off less confusing (#20700)
  • [FEATURE] add oneDNN support for numpy transpose (#20419)
  • Reintroduce next_impl in onednn deconvolution (#20663)
  • Unify all names used to refer to oneDNN library in logs and docs to oneDNN (#20719)
  • Improve stack operator performance by oneDNN (#20621)
  • [submodule] Upgrade oneDNN to v2.3.3 (#20752)
  • Unifying oneDNN post-quantization properties (#20724)
  • Add oneDNN support for reduce operators (#20669)
  • Remove identity operators from oneDNN optimized graph (#20712)
  • Fix oneDNN fallback for concat with scalar (#20772)
  • Fix identity fuse for oneDNN (#20767)
  • Improve split operator by oneDNN reorder primitive (#20757)
  • Remove doubled oneDNN memory descriptor creation (#20822)
  • [FEATURE] Integrate oneDNN support for add, subtract, multiply, divide. (#20713)
  • [master] Update MKL to version 2022.0 (#20865)
  • Add oneDNN support for "where" operator (#20862)
  • [master] Implemented oneDNN Backward Adaptive Pooling kernel (#20825)
  • Improve MaskedSoftmax by oneDNN (#20853)
  • [Feature] Add bfloat to oneDNN version of binary broadcast operators. (#20846)
  • [submodule] Upgrade oneDNN to v2.5.2 (#20843)
  • Make convolution operator fully work with oneDNN v2.4+ (#20847)
  • [FEATURE] Fuse FC + elemwise_add operators for oneDNN (#20821)
  • [master][submodule] Upgrade oneDNN to v2.5.1 (#20662)
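
The temperature parameter in Softmax (#20567) divides the logits by a scalar before normalizing; higher temperatures flatten the distribution. A minimal NumPy sketch of the semantics (not the oneDNN kernel itself):

```python
import numpy as np

def softmax_with_temperature(x, temperature=1.0):
    # Softmax over the last axis with a temperature parameter;
    # higher T flattens the output distribution.
    z = x / temperature
    z = z - z.max(axis=-1, keepdims=True)   # subtract max for stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)
```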

CI-CD

  • CI Infra updates (#19903)
  • Fix cd by adding to $PATH (#19939)
  • Fix nightly CD for python docker image releases (#19772)
  • pass version param (#19984)
  • Update ci/dev_menu.py file (#20053)
  • add gomp and quadmath (#20121)
  • [CD] Fix the name of the pip wheels in CD (#20115)
  • Attempt to fix nightly docker for master cu112 (#20126)
  • Disable codecov (#20173)
  • [BUGFIX] Fix CI slowdown issue after removing 3rdparty/openmp (#20367)
  • cudnn8 for cu101 in cd (#20408)
  • [wip] Re-enable code cov (#20427)
  • [CI] Fix centos CI & website build (#20512)
  • [CI] Move link check from jenkins to github action (#20526)
  • Pin jupyter-client (#20545)
  • [CI] Add node for website full build and nightly build (#20543)
  • use restricted g4 node (#20554)
  • [CI] Freeze array-api-test (#20631)
  • Fix os_x_mklbuild.yml (#20668)
  • [CI] Upgrade windows CI (#20676)
  • [master][bugfix] Remove exit 0 to avoid blocking in CI pipeline (#20683)
  • [CI] Add timeout and retry to linkcheck (#20708)
  • Prospector checker initial commit (#20684)
  • [master][ci][feature] Static code checker for CMake files (#20706)
  • Fix sanity CI (#20763)
  • [CI] Workaround MKL CI timeout issue (#20777)
  • [master] CI/CD updates to be more stable (#20740)

Website & Documentation & Style

  • Fix static website build (#19906)
  • [website] Fix broken website for master version (#19945)
  • add djl (#19970)
  • [website] Automate website artifacts uploading (#19955)
  • Grammar fix (added period to README) (#19998)
  • [website] Update for MXNet 1.8.0 website release (#20013)
  • fix format issue (#20022)
  • [DOC] Disabling hybridization steps added (#19986)
  • [DOC] Add Flower to MXNet ecosystem (#20038)
  • doc add relu (#20193)
  • Avoid UnicodeDecodeError in method doc on Windows (#20215)
  • updated news.md and readme.md for 1.8.0 release (#19975)
  • [DOC] Update Website to Add Prerequisites for GPU pip install (#20168)
  • update short desc for pip (#20236)
  • [website] Fix Jinja2 version for python doc (#20263)
  • [Master] Auto-formatter to keep the same coding style (#20472)
  • [DOC][v2.0] Part1: Link Check (#20487)
  • [DOC][v2.0] Part3: Evaluate Notebooks (#20490)
  • If variable is not used within the loop body, start the name with an underscore (#20505)
  • [v2.0][DOC] Add migration guide (#20473)
  • [Master] Clang-formatter: only src/ directory (#20571)
  • [Website] Fix website publish (#20573)
  • [v2.0] Update Examples (#20602)
  • Attempt to fix website build pipeline (#20634)
  • [Master] Ignoring mass reformatting commits with git blame (#20578)
  • [Feature][Master] Clang-format tool to perform additional formatting and semantic checking of code. (#20433)
  • [Master] Clang-format description on a wiki (#20612)
  • Add: break line entry before ternary (#20705)
  • Fix csr param description (#20698)
  • [master] Bring dnnl_readme.md on master up-to-date (#20670)
  • Remove extra spaces between 'if' (#20721)
  • [DOC] Fix migration guide document (#20716)
  • [master][clang-format] Re-format cc. .h. .cu files; cond. (#20704)
  • [master][style-fix] Clang-format comment style fix (#20744)
  • Port #20786 from v1.9.x (#20787)
  • remove broken links (#20793)
  • Fix broken download link, reformat download page to make links more clear. (#20794) (#20796)
  • [website] Move trusted-by section from main page to a new page (#20788)
  • [DOC] Add Kubeflow to MXNet ecosystem (#20804)
  • Add the 1.9 release notice in README (#20806)
  • fix python docs ci (#20903)
  • [website] Add CPU quantization tutorial (#20856)
  • [DOC] Large tensors documentation update (#20860)
  • [DOC] Change of confusing Large Tensors documentation (#20831)
  • Fix data-api links (#20879)
  • Add quantization API doc and oneDNN to migration guide (#20813)
  • Fix data-api links (#20867)
  • [master] Avoid dots, full path to a file. (#20751)

Build

  • add cmake config for cu112 (#19870)
  • Remove USE_MKL_IF_AVAILABLE flag (#20004)
  • Define NVML_NO_UNVERSIONED_FUNC_DEFS (#20146)
  • Fix ChooseBlas.cmake for CMake build dir name (#20072)
  • Update select_compute_arch.cmake from upstream (#20369)
  • Remove duplicated project command in CMakeLists.txt (#20481)
  • Add check for MKL version selection (#20562)
  • fix macos cmake with TVM_OP ON (#20570)
  • Fix Windows-GPU build for monolithic arch dll (#20466)
  • An option to colorize output during build (#20681)
  • [FEATURE] Hardcode build-time branch and commit hash into the library (#20755)


Bug Fixes and Others

  • Mark test_masked_softmax as flaky and skip subgraph tests on windows (#19908)
  • Removed 3rdparty/openmp submodule (#19953)
  • [BUGFIX] Fix AmpCast for float16 (#19749) (#20003)
  • fix bugs for encoding params (#20007)
  • Fix for test_lans failure (#20036)
  • add flaky to norm (#20091)
  • Fix dropout and doc (#20124)
  • Revert "add flaky to norm (#20091)" (#20125)
  • Fix broadcast_like (#20169)
  • [BUGFIX] Add check to make sure num_group is non-zero (#20186)
  • Update CONTRIBUTORS.md (#20200)
  • Update CONTRIBUTORS.md (#20201)
  • [Bugfix] Fix take gradient (#20203)
  • Fix workspace of BoxNMS (#20212)
  • [BUGFIX][BACKPORT] Impose a plain format on padded concat output (#20129)
  • [BUGFIX] Fix Windows GPU VS2019 build (#20206) (#20207)
  • [BUGFIX] Try to avoid the error in operator/tensor/amp_cast.h (#20188)
  • [BUGFIX] fix #18936, #18937 (#19878)
  • [BUGFIX] fix numpy op fallback bug when ndarray in kwargs (#20233)
  • [BUGFIX] Fix test_zero_sized_dim save/restore of np_shape state (#20365)
  • [BUGFIX] Fix quantized_op + requantize + dequantize fuse (#20323)
  • [BUGFIX] Switch hybrid_forward to forward in test_fc_int8_fp32_outputs (#20398)
  • [2.0] fix benchmark and nightly tests (#20370)
  • [BUGFIX] fix log_sigmoid bugs (#20372)
  • [BUGFIX] fix npi_concatenate quantization dim/axis (#20383)
  • [BUGFIX] enable test_fc_subgraph.py::test_fc_eltwise (#20393)
  • [2.0] make npx.load support empty .npz files (#20403)
  • change argument order (#20413)
  • [BUGFIX] Add checks in BatchNorm's infer shape (#20415)
  • [BUGFIX] Fix Precision (#20421)
  • [v2.0] Add Optim Warning (#20426)
  • fix (#20534)
  • Test_take, add additional axis (#20532)
  • [BUGFIX] Fix (de)conv (#20597)
  • [BUGFIX] Fix NightlyTestForBinary in master branch (#20601)
  • change nd -> np in imagenet_gen_qsym_onedenn.py (#20399)
  • [Master][CI][Bugfix] Clang-format-13 file needs to have right license header and install clang-format package. (#20658)
  • Disable debug log to avoid duplications (#20665)
  • Permlink changes (#20674)
  • A clang-format file can be removed from .gitignore (#20664)
  • [2.0] Update Sparse Feature Related Error Message (#20402)
  • [master][tests] 'init' file to avoid undefined variables (#20701)
  • [BUGFIX] Fix #20293 (#20462)
  • [master][bugfix] Zero initialization to avoid error message on a Centos (#20582)
  • [2.0] Fix devices issues (#20732)
  • Fix test_numpy_op tests & lacking asserts (#20756)
  • Fix link check (#20773)
  • [KEYS] remove keys on master branch (#20764)
  • [BUGFIX] Type fix for large tensors (#20922)
  • add Bartłomiej as committer (#20896)
  • [master] Fix issue with even number of channels in BatchNorm (#20907)
  • Resolve the conflict with PR#20499 (#20887)
  • The size of a stack needs to be greater than 4; by default it is 8 (#20581)
  • ensure type consistent with legacy nvml api (#20499)
  • Fix issue with LogMessageFatal (#20848)