Syncing forks #2

jinboci · 2020-07-08T10:09:54Z

Description

(Brief description on what this PR is about)

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
Changes are complete (i.e. I finished coding on this PR)
All changes have test coverage:
Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
Code is well-documented:
For user-facing API changes, API doc string has been updated.
For new C++ functions in header files, their functionalities and arguments are documented.
For new examples, README.md is added to explain what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
Check the API doc at https://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

Feature1, tests, (and when applicable, API doc)
Feature2, tests, (and when applicable, API doc)

Comments

If this change is backward incompatible, why must it be made?
Interesting edge cases to note here

…7841) * c++ dataloader and built-in image/bbox * update * fix error * fix import error * fix ci build * fix vs openmp loop type * fix warning as error with sign/unsign comp * sign/unsign comp * update to pytest * remove nose * fix tear_down * address comments * thread safe dataset * address comments * address comments * fix * serial pytest for data download

) * Load the user's locale before performing tests * Change the locale for the CentOS CI jobs to weed out locale-related bugs * Mark tests that fail due to the decimal point with xfail * Run localedef when generating the CentOS CI image * Cancel some Scala tests when C locale uses a non-standard decimal sep. * Rename xfail helper to xfail_when_nonstandard_decimal_separator * Fix scalastyle errors * Disable more Python tests that fail due to locale-related issues * Move assumeStandardDecimalSeparator into separate object to fix scaladoc * Disable the "symbol pow" test when running with non-standard decimal sep * Disable new tests that fail due to locale-related issues
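The locale changes listed above depend on knowing whether the active locale formats decimals with something other than `.`. The following is a minimal sketch of such a check using only Python's standard `locale` module. The helper names `uses_standard_decimal_separator` and `load_user_locale` are hypothetical and are not the helpers introduced by the commit.

```python
import locale


def uses_standard_decimal_separator() -> bool:
    """Return True if the active LC_NUMERIC locale formats decimals with '.'."""
    # localeconv() reports the conventions of the locale currently in effect.
    return locale.localeconv()["decimal_point"] == "."


def load_user_locale() -> None:
    """Adopt the locale named by the environment, keeping the current one on failure."""
    try:
        locale.setlocale(locale.LC_ALL, "")
    except locale.Error:
        # The environment names a locale that is not installed; keep the current locale.
        pass
```

A test helper could consult `uses_standard_decimal_separator()` after `load_user_locale()` and mark a test as an expected failure when it returns `False`.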

* update nvidiadocker command & remove cuda compat * replace cu101 with cuda since compat is no longer to be used * skip flaky tests * get rid of ubuntu_build_cuda and point ubuntu_cu101 to base gpu instead of cuda compat * Revert "skip flaky tests" This reverts commit 1c720fa. * revert removal of ubuntu_build_cuda * add linux gpu g4 node to all steps using g3 in unix-gpu pipeline

* Fix input gradient calculation for bidirectional LSTM. For a bidirectional LSTM with number of layers > 2, the input gradient calculation was incorrect. The cause was that the y derivative (dy) tensor was overwritten by the calculated x derivative (dx) tensor before the right2left layer could use dy for its own gradient calculations. The proposed fix uses additional space to avoid the overwrite.
* Fix gradient calculation for GRU. For a GRU with number of layers > 2, the i2h_weight gradient for the layers in the middle (all except the first and last) was incorrect. The wrong calculations were caused by assigning the output pointer to the input instead of calculating a new input pointer.
* Enable tests for GRU and LSTM gradients
* Fix comments
* Change loop iteration deduction
* Add more test cases for fused rnn layers
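The LSTM bug described above is a buffer-aliasing problem: one direction writes dx into the memory that the other direction still needs to read as dy. The toy sketch below illustrates that failure mode only. It is not the fused RNN kernel, and the scalar weights `w_l2r` and `w_r2l` and both function names are hypothetical.

```python
def backward_buggy(dy, w_l2r, w_r2l):
    """Toy backward pass that reuses the dy buffer for dx (shows the aliasing bug)."""
    buf = list(dy)  # one shared buffer that holds dy on entry
    # The left-to-right direction writes its dx contribution over dy.
    for i in range(len(buf)):
        buf[i] = w_l2r * buf[i]
    # The right-to-left direction now reads the overwritten values instead of dy.
    for i in range(len(buf)):
        buf[i] = buf[i] + w_r2l * buf[i]
    return buf


def backward_fixed(dy, w_l2r, w_r2l):
    """Same toy pass, but dx is accumulated in additional space so dy stays intact."""
    dx = [0.0] * len(dy)
    for i in range(len(dy)):
        dx[i] += w_l2r * dy[i]
    for i in range(len(dy)):
        dx[i] += w_r2l * dy[i]
    return dx
```

With `dy = [1.0, 2.0]`, `w_l2r = 2.0` and `w_r2l = 3.0`, the fixed version yields `(w_l2r + w_r2l) * dy`, while the buggy version yields a different result because the second pass no longer sees the original dy.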

* finish 5 changes * move metric.py to gluon, replace mx.metric with mx.gluon.metric in python/mxnet/ * fix importError * replace mx.metric with mx.gluon.metric in tests/python * remove global support * remove macro support * rewrite BinaryAccuracy * extend F1 to multiclass/multilabel * add tests for new F1, remove global tests * use mxnet.numpy instead of numpy * fix sanity * rewrite ce and ppl, improve some details * use mxnet.numpy.float64 * remove sklearn * remove reset_local() and get_global in other files * fix test_mlp * replace mx.metric with mx.gluon.metric in example * fix context difference * Disable -DUSE_TVM_OP on GPU builds * Fix disable tvm op for gpu runs * use label.ctx in metric.py; remove gluoncv dependency in test_cvnets * fix sanity * fix importError * remove nose Co-authored-by: Ubuntu <ubuntu@ip-172-31-12-243.us-east-2.compute.internal> Co-authored-by: Leonard Lausen <lausen@amazon.com>

…18588) Use CMAKE_SYSTEM_PROCESSOR to detect target architecture and make x86 related options available only when compiling for x86. Remove the code turning these options manually off on CI. Remove ANDROID cmake option which was used to decide if -lpthread needs to be specified explicitly (on most Linux systems) or not (on Android). Instead auto-detect the behavior.

* add default ctx to cachedop fwd * add test * perl fix * initial commit * update sparse tests * add aux_states * fix aux-state type * fix some tests * fix check symbolic forward/backward * fix symbolic grad check * arg_dict fixes * support init ops * support forward only graph * fix check symbolic backward stype * add missing file * replace extension test bind * replace bind with _bind * simplify backward_mul implementation * small fix * drop contrib.sparseembedding * remove simple_bind in test sparse ops * use simple_bind * replace simple bind in quantization * fix aux index * update amp simple_bind calls * drop ifft * fix a bug found in subgraph op * add aux_array method * replace symbols * minor fix * fix executor default context * fix import * bug fix for nd.where * add subgraph test * fix forward grad req * fix batch dot dtype * remove unused code * fix slice dtype * fix attach grad * remove tests for non-existing sparse ops * MXCachedOpGetOptimizedSymbol * fix foreach test * enhance err msg * skip failed test * add docs * add docs * fix lint * fix lint, remove quantization * fix lint * fix lint * fix lint * fix build and import * fix import * fix perl call * fix test * remove perl binding * remove reshape test * fix profiler, trt * remove tensorrt test * remove quantization tests * fix import * fix conflict * fix lint * skip buggy test Co-authored-by: EC2 Default User <ec2-user@ip-172-31-81-80.ec2.internal> Co-authored-by: Lin <haibilin@a483e7be4c92.ant.amazon.com>

* add default ctx to cachedop fwd * add test * perl fix * initial commit * update sparse tests * add aux_states * fix aux-state type * fix some tests * fix check symbolic forward/backward * fix symbolic grad check * arg_dict fixes * support init ops * support forward only graph * fix check symbolic backward stype * add missing file * replace extension test bind * replace bind with _bind * simplify backward_mul implementation * small fix * drop contrib.sparseembedding * remove simple_bind in test sparse ops * use simple_bind * replace simple bind in quantization * fix aux index * update amp simple_bind calls * drop ifft * fix a bug found in subgraph op * add aux_array method * replace symbols * minor fix * fix executor default context * fix import * bug fix for nd.where * add subgraph test * fix forward grad req * fix batch dot dtype * remove unused code * fix slice dtype * fix attach grad * remove tests for non-existing sparse ops * MXCachedOpGetOptimizedSymbol * fix foreach test * enhance err msg * skip failed test * add docs * add docs * fix lint * fix lint, remove quantization * fix lint * fix lint * fix lint * fix build and import * fix import * remove scala, R, julia, perl bindings * remove cpp, matlab bindings * fix perl call * fix test * remove perl binding * remove reshape test * fix profiler, trt * remove tensorrt test * remove quantization tests * fix import * fix conflict * fix lint * skip buggy test * remove clojure * remove executor c api * remove amalgamation * fix build * move executor folder * fix import * fix lint * fix cpp package * fix predict cpp * fix cpp make * remove jnilint * remove cpp package test * remove julia test pipeline * disable numpy tests * disable compat test for delete Co-authored-by: EC2 Default User <ec2-user@ip-172-31-81-80.ec2.internal> Co-authored-by: Lin <haibilin@a483e7be4c92.ant.amazon.com>

…17949) * Initial commit of input reordering in Gluon * Add test for Gluon input reorder * Fix backward in CachedOp for input reordering * Fix test_input_reorder for backward pass * Fix merge error in NaiveCachedOp * Include correct header for std::iota Co-authored-by: Vladimir Cherepanov <vcherepanov@nvidia.com>

* adding comments explaining code optimizations * fixing broadcast_axis kernel to int32 * fixing slice_axis kernel to int32 * combining CPU and GPU implementation method signatures and cleaned up code * adding new broadcast_axis to np_matmul Co-authored-by: Rohit Kumar Srivastava <srivastava.141@buckeyemail.osu.edu>

* user feedback widget implementation * add user feedback widget to python docs site * update margin * add apache license * one more license * turn off feedback widget on python site * update copy * format * add event value field * turn on widget on Python site

* package created * mvn WIP * normal wip, to be tested * update * docstring added, normal mostly done * add test file * Bernoulli WIP * bernoulli wip * bernoulli doc done * dense variational WIP * add kl infra * implement normal kl method * refactor kl * add not implemented handling, rename kl_storage * add abstract method and Categorical class * rewrite logit2prob prob2logit for multiclass support * normal broadcast_to implemented * categorical mostly done * update distributions/utils.py * add dot ahead of import * fix normal F * bernoulli, normal brief tests implemented * add hybridize tests * transformation infras done * affine transformation, implemented tested * add test cases * add sum_right_most * fix get F bug * compose transform implemented, tested * fix * add event_dim * fetch mvn from upstream * clean code, implement normal cdf and tests * constraint in bernoulli done * fix constraint * finish half normal * add cached_property * add test on cached_property * add more features to distribution and constraints * change constraint * fix bernoulli * add independent * add independent tests * update naming of cached_property * revert * add constraints * add Cat * add Stack for imperative mode * add Stack for imperative mode * add bernoulli entropy * categorical WIP * categorical sampling implemented * finish categorical log_prob, sampling * enumerate_support finished * polish StochasticBlock, add test * add test for stochastic sequential * clean loss list in __call__ * fix affine, implement sigmoid, softmax * add gumbel, relaxed bernoulli * relaxed one-hot sampling implemented * gamma done * gamma, dirichlet implemented * beta done * gumbel softmax log-likelihood implemented * refactor tests, implement exponential, fix compose transform * weibull implemented, transformed distribution cdf icdf added * pareto implemented * uniform wip * uniform done * rewrite lgamma, implement chi2 * fix chi2 scale * F distribution done * t implemented * fix tiny problem * cauchy done * add half cauchy * multinomial done, tests to be added * add multinomial test * MVN done, tests todo * mvn polished * fix a few precision issues * add erf, erfinv unified api and learnable transform * fix mvn attribute check * MVN done * poisson done * hack poisson for size support * geometric finished * negative binomial done * binomial done * implement some kl * add more kl * refactor kl test * add more kl * binomial kl todo * change constraint logical op implement * implement gamma entropy * finish beta dirichlet entropy * finish all entropy * kl finished * add constraint test * domain map done * remove bayesian dense * fix tiny problems * add kl uniform normal * add kl tests * acquire patch from upstream * add some doc * finish doc * refactor kl test(WIP) * add more kl, fix float32 underflow issue * make sampling more stable * handle inconsistent mode * replace boolean idx with np.where * fix file name * add more doc * add constraint check * add half_normal/cauchy pdf cdf support check * fix import problem * change nosetest to pytest * remove buggy lines * change alias register path * attempt to fix ci * fix lint, change a few tests * fix lint * modify hybrid sequential * fix lint * change import order * add test gluon probability v2 * fix hybridize flag * change implementation of stochastic block * fix lint * fix comments * fix block * modify domain map * add raises for improper add_loss * add raises for improper add_loss * add extra cases * change collectLoss decorator to mandatory * skip stochastic block tests * remove test cases * put gpu tests back * add test_gluon_stochastic_block back * remove export test * put a test back * tiny refactor * add memory leak flag * small changes Co-authored-by: Zheng <shzheng@a483e789dd93.ant.amazon.com>

zhreshold and others added 30 commits May 7, 2020 10:34

Remove duplicate large_vector_test (#18259)

353c243

Fixing #16655 (#18257)

7f24823

* fix the error message of reshape() * Fixing issue #16655 reshape() error message Co-authored-by: Ubuntu <ubuntu@ip-172-31-34-249.ap-northeast-1.compute.internal>

Change include to relative in nvvm_to_onnx.cc (#18249)

21b187b

Signed-off-by: Serge Panev <spanev@nvidia.com>

fixed overwrite of args/aux variables (#18232)

68cb955

* fixed overwrite of args/aux variables * fixed spacing

fix when clicking version dropdown it jumps to top of the page (#18238)

33dfbf7

fix mixed type backward (#18250)

f00b9ab

Fix interleave matmul doc (#18260)

de51058

* fix doc * fix doc * fix axis Co-authored-by: Lin <haibilin@a483e7be4c92.ant.amazon.com>

CI: Update Android, ARM build and ARM test containers (#18264)

1d14bf3

* Android build containers * ARM build containers * ARM test containers * Fix naming scheme * Set WORKDIR at correct location

[Numpy] Port nd.random.multinomial to npx.sample_categorical (#18272)

9d44086

* port nd.multinomial to npx.sample_categorical * move to npx.random

Deprecate dataset transform= argument in gluon data API (#17852)

18b6e05

[Numpy] New FFIs for Operator: squeeze, repeat, around, round, diagfl…

0523f09

…at (#18263) * FFI new feature * Feature ffi x 5 * Fix pylint error * Fix pylint error * Fix around error * repeat modified

add gelu doc (#18274)

8a5886a

Co-authored-by: Lin <haibilin@a483e7be4c92.ant.amazon.com>

Mark test_np_mixed_precision_binary_funcs flaky (#18290)

51844b2

Revert PR 17767 for fixing GPU memory usage regression (#18283)

47a38d1

* Revert "Fix and optimize handling of vectorized memory accesses (#17767)" This reverts commit 5542d03. * add license to reverted file

Mark CD quantization tests as xfail on , decimal separator locales (#…

6d5e471

…18307)

Reenable test_amp_conversion (#18292)

446ce14

TVMOP feature is now disabled on GPU builds, which caused this test to fail previously

xfail_when_nonstandard_decimal_separator for test_metric.py (#18312)

4d4cbd5

Fix missing MKLDNN headers (#18310)

fec534a

Fix memory leaks in Gluon (#18328)

3e676fc

Fix leak of ndarray objects in the frontend due to reference cycle.
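As background on how a reference cycle can keep frontend objects alive, here is a small CPython sketch. It does not reproduce the actual fix. The `Node` class and both helper functions are hypothetical stand-ins, and using a weak back-reference is just one common way to avoid a cycle.

```python
import gc
import weakref


class Node:
    """Hypothetical stand-in for a frontend object that owns a large buffer."""

    def __init__(self):
        self.buffer = bytearray(1024)
        self.owner = None


def make_cycle():
    """Create two objects that reference each other and return a weak handle to one."""
    a, b = Node(), Node()
    a.owner, b.owner = b, a  # a -> b -> a: reference counting alone cannot free these
    return weakref.ref(a)


def make_weak_link():
    """Same pair, but the back-reference is weak, so no cycle forms."""
    a, b = Node(), Node()
    a.owner = b
    b.owner = weakref.ref(a)
    return weakref.ref(a)
```

In CPython, the objects from `make_cycle()` stay alive until the cyclic garbage collector runs (for example via `gc.collect()`), whereas the objects from `make_weak_link()` are released as soon as the function returns.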

Fix deferred compute mode for operators using new FFI (#18284)

37280e4

Add a timeout to the storage profiler in case mem_counters_ is not ye…

09224c4

…t initialized (#18306) * avoid race condition in profiler init * Update storage_profiler.h Co-authored-by: Ubuntu <ubuntu@ip-172-31-61-76.ec2.internal>

fix native cd builds (#18337)

9482728

[CI] run operator tests with naive engine (#18252)

10b6b48

* run operator tests with naive engine * fix take tests * update skip mark * fix cuda error reset * adjust tests * disable parallel testing and naive engine for mkl/mkldnn #18244

[numpy] add dlpack functions to npx (#18342)

7ab326c

* add dlpack functions to npx * improve tests * further improve test * fix comment

[BUGFIX] Remove Profiler from the runtime feature list, since its alw…

7f5df07

…ays built (#18308) * remove Profiler from the runtime feature list, since its always built * Update libinfo.cc * Update RunTime.pm Co-authored-by: Lin <haibilin@a483e7be4c92.ant.amazon.com>

leezu and others added 29 commits June 19, 2020 14:46

Use chain.from_iterable in artifact_repository.py (#18578)

56cfd9c

redirect api reference on v-master to v1.6 (#18607)

74fcb99

* redirect api reference on v-master to v1.6 * update R docs

Update tutorials (#18609)

acf2d27

Update docs according to new Block APIs (#18413)

use new mxnet.gluon.block APIs (#18601)

1fcc7ea

Update disclaimer wording (#18616)

3f555f8

add epsilon to adamax (#18532)

e4c93e3

Co-authored-by: Ubuntu <ubuntu@ip-172-31-92-136.ec2.internal>

add version check on installation guide (#18587)

c9dcdd1

fix julia api redirect (#18613)

ecbda07

fix contrib interleaved_matmul_selfatt_valatt not render correctly (#…

8ee4600

…18621)

Add LANS optimizer (#18620)

d6c3578

* add lans optimizer * fix * fix Co-authored-by: Zheng <shzheng@a483e789dd93.ant.amazon.com>

Enhance license checker to cover multiple license header and md files (…

b12abbf

…#18633)

Remove mention of nightly in pypi (#18635)

becb9ca

[Numpy] FFI: tril_indices (#18546)

2158106

* add numpy tril_indices ffi * Update src/api/operator/numpy/np_matrix_op.cc Co-authored-by: Haozheng Fan <hzfan9@outlook.com> Co-authored-by: Haozheng Fan <hzfan9@outlook.com>

Fix BatchNorm backward synchronization (#18644)

37bed6e

* Add test for BatchNorm running variables synchronization * Fix BatchNorm backward synchronization It fixes issue #18610

Fix softmax, logsoftmax failed on empty ndarray (#18602)

9a122ca

* Fix failing empty array (log_)softmax * Modify test for npx (log_)softmax

update to onednn v1.4 (#18273)

a8c8dea

Clipboard refactor (#18605)

0c8b6b2

* refactor clipboard * make lang getter more extensible * trigger ci

build.py --no-pull (#18589)

d1b2cd9

Add --no-pull option which disables overwriting the local docker cache based on CI docker cache. It is useful when locally changing Dockerfiles.
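A boolean flag of this kind can be sketched with Python's `argparse`. This is an illustration under assumptions, not the code in `build.py`: the function names `parse_args` and `should_pull_ci_cache` and the help text are hypothetical.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line of a hypothetical container build wrapper."""
    parser = argparse.ArgumentParser(description="Hypothetical container build wrapper")
    parser.add_argument(
        "--no-pull",
        action="store_true",
        help="do not overwrite the local docker cache from the CI docker cache",
    )
    return parser.parse_args(argv)


def should_pull_ci_cache(args) -> bool:
    """Pull from the CI cache unless --no-pull was given."""
    return not args.no_pull
```

Because the flag uses `store_true`, the default behaviour (pulling) is unchanged for existing invocations, and only callers who pass `--no-pull` keep their local cache.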

Mark test_get_symbol as garbage_expected (#18595)

c519e0e

[numpy] FFI flip, rollaxis, stack (#18614)

d1b0a09

* flip * rollaxis * stack * fixed * retrigger ci Co-authored-by: Ubuntu <ubuntu@ip-172-31-18-97.us-east-2.compute.internal>

[numpy] Fix less/greater bug with scalar input (#18642)

6462887

* fix ffi * fix less/greater error * back * submodule * fixed Co-authored-by: Ubuntu <ubuntu@ip-172-31-8-94.us-east-2.compute.internal>

fix broken installation widget - remove empty entries (#18661)

348ab4d

jinboci merged commit cf87b37 into jinboci:master Jul 8, 2020
