Commit
* Source op per critical section (#6472) * backup code * EventRecord * auto format by CI * backup code * remove deprecated binary test cases * refactor valatile to atomic * add StreamType::InitInstructionStatusIf/StreamType::DeleteInstructionStatusIf * merge from branch profiling_nn_graph * address comments * EventRecordProvider * more comments for XXXStatusQuerier::SetLaunched * more comments for SharedEventRecord::Init * wait source op per critical section * rename a task_node.cpp * minor fix * backup code * fix compiler complaints * 1) remove AddCtrlEdgeBetweenSrcDstTickAndInputOutputInSameRank; 2) create CriticalSectionInstance buffers * fix compiler complaints * more profiler code * refactor vm preschedule * TryMoveFromWaitingToReady * revert flying_instruction_cnt * revert to single position to call DispatchInstruction * revert several code * reset instruction watermark * remove is_xxx_hook_empty * build with profiler * merge master * insert device ticks before and after critical sections * refactor register_num of cs_wait/cs_callback from 2 to 128 * fix static analysis complaints * fix complier complaints about JobBuilder::ParallelConf4OpName * Update oneflow/core/operator/critical_section_wait_tick_op.cpp Co-authored-by: daquexian <daquexian566@gmail.com> * address pr comments * add job example for InstructionsBuilder::LaunchLazyJob * address pr comments Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: ouyangyu <xuanjiuye@gmail.com> Co-authored-by: daquexian <daquexian566@gmail.com> * More details of error of getting op matched sbp signature (#7077) * more details of error msg * minor change * address review comment * avoid namesake iterator * Module apply only once (#7055) * add once apply of param * apply once on buffer * test reuse var on module to * test resue var * rm useless test * finish test * refine test Co-authored-by: oneflow-ci-bot 
<69100618+oneflow-ci-bot@users.noreply.github.com> * distributed test bugfix (#7057) * change spawn_shell to spawn_shell_and_check, sleep in script Signed-off-by: daquexian <daquexian566@gmail.com> * fix distributed test master addr Signed-off-by: daquexian <daquexian566@gmail.com> * remove sleep Signed-off-by: daquexian <daquexian566@gmail.com> * spawn_shell -> spawn_shell_ignoring_failure Signed-off-by: daquexian <daquexian566@gmail.com> * auto format by CI * fix bug Signed-off-by: daquexian <daquexian566@gmail.com> * auto format by CI * fix the reversed logic Signed-off-by: daquexian <daquexian566@gmail.com> * improve error msg Signed-off-by: daquexian <daquexian566@gmail.com> * resolve name conflict of MASTER_ADDR Signed-off-by: daquexian <daquexian566@gmail.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix promote_type matrix (#7066) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix chunk op dim=-1 bug (#7073) * fix chunk op dim=-1 bug * Update oneflow/core/functional/impl/array_functor.cpp Co-authored-by: Xiaoyu Zhang <35585791+BBuf@users.noreply.github.com> * Update oneflow/core/functional/impl/array_functor.cpp Co-authored-by: Xiaoyu Zhang <35585791+BBuf@users.noreply.github.com> Co-authored-by: Xiaoyu Zhang <35585791+BBuf@users.noreply.github.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Fix resource desc dump cudnn conf bug (#7038) * fix Resource::DumpCudnnConf * fix typo and error msg Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix concat bug (#7075) * fix * support concat single input * Clean TensorNameScope after graph build (#7076) * Clear tensor name scope after graph build * Add test case of 2 graph caught same free eager tensor * auto format by CI Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot 
<69100618+oneflow-ci-bot@users.noreply.github.com> * fix_abnormal_printing (#7099) * Fix bias add dropout fuse (#7081) * fix bias_add dropout fuse when p=0.0 * remove redundant op Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Support 1d to 2d eager boxing (#7083) * fix Resource::DumpCudnnConf * support_1d_to_2d_eager_boxing * rename stack to unflatten * add test case * of format * refine test case * Revert "fix Resource::DumpCudnnConf" This reverts commit f07278d. * support nd to 1d * add 2d to 1d test case Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Implement all User Ops with Op Schema (#7032) * add oneflow-tblgen: generate op schema (OpInterpCtx) from ods * cmake: add inja * tblgen: add oneflow_datatype * tblgen: use option cat * tblgen: fix error * tblgen: put impl in .cpp * tblgen: fix null attrs * tblgen: fix null ops * refine * refine * reifne * Refine op schema template and compilation * add base OpInterpCtx to finish compilation * fix * refine * fix * add custom infer code * generate op registrants automatically * refine * fix * update user op ods and fix shape attr * refine * refine * add custom code in op base * refine comments * add same_output_regst_num and infer * support declare hasxx * update op schema emitter * refine * emit output regist num * refine * refine * migrate acc op * migrate onerec_reader, ones_like, send, pack and padding ops * add has_sbp_signature_infer_fn * refine * migrate pad, parallel_cast, partial_fc and pooling ops * rm redundant has_device_infer_fn * migrate prelu, quantization, randperm, reduce and repeat ops * migrate reshape, reshape_like, roi_align, same_pad, selu and scalar related ops * back port * backport * migrate ops * refine * refine * refine * refine * add new op * fix llvm not found * fix mlir headers * fix mlir headers * fix llvm not found * irefine * mark override * fix merge * fix * fix * set op schema as obj lib to speed up * rewrite 
ops * add addn * add grdi * refien * add more def (#7051) * affine grid * refien * refine * refine * refine * fix * refien * refine * refine * refine * refine * refine * refien * refine * refine * refein * refine * refine * refine * refine * refien * refine * refine * refine * refien * refien * refien * refine * refine * refien * refine * refine * refine * refein * refine * refine * refine * refine * refine * refien * refine * refine * refine * refine * refine * refine * refine * refine * refine * refine * refine * refine * refine * refine * refine * refine * refine * refein * refine * refine * refine * move more ops * fix math_binary_broadcast/elementwise_ops * fix hardtanh * add norm * rename file and add CpuOnly no_grad * fix ir & fix norm op * fix oneflow-tblgen * fix math_unary_elementwise_op * fix norm * fix bn * fix op schema * refine * fix * refine physical_tensor_desc_infer_fn * refine * add ScalarLogicalNotEqualOp & RecvOp * refine * auto format by CI * fix fmt * add cuda only trait * delete unused inja * del inja_copy_headers_to_destination * delete unused inja * del inja_copy_headers_to_destination * add cuda only to tblgen * fix json inja url and md5 not used * fix json inja url and md5 not used * refine * revert * add with cuda * refine * delete GenUserOpODS * remove cuda only * revert cuda only after meeting * fix Co-authored-by: PragmaTwice <i@twice.moe> Co-authored-by: hjchen2 <chenhoujiangcug@gmail.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> * Feat/debug pass (#7054) * add pass debug * debug pass * refine comment of fuse add pass * auto format by CI Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Fix error message (#6930) * fix error message * fix dot doc * fix dot elem cnt * auto format by CI Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix simple ci: add 
of_op_schema target to tidy check (#7105) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Rename AnyType in .td (#7109) * AnyType => Tensor * refine * refine Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Feat graph reuse var (#7080) * add once apply of param * apply once on buffer * test reuse var on module to * test resue var * rm useless test * finish test * refine test * Clear tensor name scope after graph build * Add test case of 2 graph caught same free eager tensor * auto format by CI * refactor var build draft * add full func; add check * done * add test of call parameter ousite its moudule * fix break test Co-authored-by: chengtbf <472491134@qq.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Fix l2_normalize & add nn.functional.normalize (#6940) * fix l2_normalize * add normalize * add test for normalize * refine * clean l2_normalize and refine normalize * simplify normalize test * Fix l2norm block_size * refine Co-authored-by: Juncheng <liujuncheng1022@gmail.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Align api in swin transformer (#7058) * add linspace op * fix align error in swintransformer * add @ magic method * fix conflict * support tensor list * fix meshgrid bug * revert Co-authored-by: hjchen2 <chenhoujiangcug@gmail.com> * set CMAKE_LINK_DEPENDS_NO_SHARED to ON (#7063) Signed-off-by: daquexian <daquexian566@gmail.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Add other api graph autotest (#7091) * Clear tensor name scope after graph build * Add test case of 2 graph caught same free eager tensor * auto format by CI * add other api graph autotest * add more samples * fix comments * refine * refine * refine * refine * refine * fix error * fix test error * fix bug * fix flip bug * fix bug * fix bug * fix 
ci bug * fix ci error * fix bug * fix ci error Co-authored-by: chengtbf <472491134@qq.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: Li Xiang <54010254+lixiang007666@users.noreply.github.com> * [serving] dev graph run (#7008) * add cmake changes for liboneflow_cpp.so Signed-off-by: daquexian <daquexian566@gmail.com> * add separate target for cpp api test Signed-off-by: daquexian <daquexian566@gmail.com> * add cpp api test in ci Signed-off-by: daquexian <daquexian566@gmail.com> * graph run * reverse the order of cudnn and cuda library Signed-off-by: daquexian <daquexian566@gmail.com> * update logic of BUILD_MONOLITHIC_LIBONEFLOW Signed-off-by: daquexian <daquexian566@gmail.com> * rename BUILD_MONOLITHIC_LIBONEFLOW to BUILD_MONOLITHIC_LIBONEFLOW_CPP_SO Signed-off-by: daquexian <daquexian566@gmail.com> * refine * [draft] implement graph parameter load and save (#7010) * implement parameter save (python) and load (c++) Signed-off-by: daquexian <daquexian566@gmail.com> * revert accident changes Signed-off-by: daquexian <daquexian566@gmail.com> * fix circular reference Signed-off-by: daquexian <daquexian566@gmail.com> * pimpl * batching * share lib directory in test container Signed-off-by: daquexian <daquexian566@gmail.com> * fix typo; * add github actions debug Signed-off-by: daquexian <daquexian566@gmail.com> * Revert "add github actions debug" This reverts commit 7d9aef6. 
* add upterm debug after exe test Signed-off-by: daquexian <daquexian566@gmail.com> * sleep after fail Signed-off-by: daquexian <daquexian566@gmail.com> * set LD_LIBRARY_PATH in yml for cpp api test exe Signed-off-by: daquexian <daquexian566@gmail.com> * refine * add test file && input order * sleep Signed-off-by: daquexian <daquexian566@gmail.com> * upload liboneflow_cpp.so Signed-off-by: daquexian <daquexian566@gmail.com> * modify cmake to trigger compilation Signed-off-by: daquexian <daquexian566@gmail.com> * load job from ir && clean && add mlir model * [remove useless python code]save to .pb * add target of_common_obj to remove duplicate REGISTER_PASS && run of_format * remove openvino * remove openvino test * refine * IValue * Update oneflow/api/cpp/framework/graph.h Co-authored-by: daquexian <daquexian566@gmail.com> * refine * refine * refine * refine * refine * refine * rename in oneflow.cmake * refine oneflow.cmake * make of_api_common object library * move device util function in api to core * remove device check in New and ThreadLocalGetOrNew * refine * fix device test * refine graph test * refine GetExeDir() * refine GetExeDir() again * fix * refine * fix Co-authored-by: daquexian <daquexian566@gmail.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: mosout <mosout@qq.com> * disable autograd in lazy mode (#7070) * disable autograd in lazy mode * refine * Fix/rand source op in graph (#7092) * add test * fix rand consistent * add test * Fix powf (#7106) * quick fix power * add int scalar test case Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Dispatch stateful ops using functional api (#7046) * Dispatch functional stateful ops * fix * fix cmake * fix * disable attr check since it may not given when creating op expr. 
* fix * fix * fix * fix * fix * fix * fix * fix * refine Co-authored-by: VertexC <bob2420083992@gmail.com> * Fix HWLoc memory affinity (#7115) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * add_env_api_docs (#7100) * add_env_api_docs * minor fix * fix grammatical errors Co-authored-by: Yao Chi <later@usopp.net> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * tmp skip s0 print because of slice (#7065) * tmp skip s0 print because of slice * tmp skip s0 print in test case * fix Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * indexing first version (#7012) * indexing first version * complete * test * out loop * test skip * revise * revise * shape * docs * formatted * confict1 * confict2 * confict2 * confict * revise * auto format by CI Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> * fix maybe: add Maybe(T&&) to allow constructing from rvalue T (#7125) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * autotest_add_graph_log (#7126) * Meta info consistency check (#7085) * meta_info_consistency_check * refine check function * Update consistent_cast.cpp * move check to opinterpreter * refine * add note * refactor MetaInfoConsistencyCheck * of_format * refine * NonRecursiveMetaInfoConsistencyCheck * fix func name * add IsMetaInfoConsistencyCheckDisable() * mino fix * refine * minor fix * format * minor fix Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * cmake: use interface target instead of include_directories in pybind11 (#7128) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Import cmake dependence json and inja using FetchContent (#7124) * import cmake dependence json and inja using FetchContent * install-llvm: fix url hash * fix inja config * add cache var * fix ninja 
build * fix ninja build Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Add environment variable to set GRPC_ARG_MAX_MESSAGE_LENGTH (#7130) * env ONEFLOW_GRPC_MAX_MESSAGE_BYTE_SIZE * set default to -1 * Fea/nhwc (#6811) * legacy maxpool2d module * add legacy avgpool2d * add graph cudnn conv alg config * add conv2d nhwc * lazy create cuda_stream in CudaCopyD2HDeviceCtx CudaStreamHandleDeviceCtx * refine * conv bn pool nhwc for resnet perf * one hot with float * use BiasAddRowGpu * rm l2 with 0 * reformat * add nhwc env var * legacy pool merged into new * refine * fix style * fix and refine * address review * fix and refine * fix doc test Co-authored-by: luyang <flowingsun007@163.com> Co-authored-by: guo-ran <360112263@qq.com> Co-authored-by: lixinqi <lixinqi0703106@163.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * reduce memory usage caused by slice grad (#7144) * cmake: fix THIRD_PARTY build (#7146) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix fold op (#7156) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Support inplace for lazy consistent (#7112) * Support inplace for lazy consistent * fix single client sbp hint * refine Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix prelu bug (#7118) * support dtype and device in prelu * optimize PreluFunctor * fix prelu 1-dim error * update * update * auto format by CI Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> * use ibn2nd_sbp to get nd_sbp (#7155) Co-authored-by: Houjiang Chen <chenhoujiangcug@gmail.com> * fix copy bug (#7159) * fix copy bug * add to test case * refine * fix test case Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Fix laynorm backward bug (#7164) * fix layernorm backward index bug * add 
layernorm test case * auto format by CI Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * [Fix] graph support 0-Size tensor (#6957) * Add nn.functional.glu graph test * add filter to motify functional autotest * motify code * add test example * add test else * add test judging condition for test_masked_fill.py,test_constant.py,test_tile.py、test_repeat.py,test_expand.py * add test ok example * Clear tensor name scope after graph build * Add test case of 2 graph caught same free eager tensor * auto format by CI * Dev cc clean tensor name scope (#7082) * Clear tensor name scope after graph build * Add test case of 2 graph caught same free eager tensor * auto format by CI Co-authored-by: chengtbf <472491134@qq.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> * submit test success example * test success example * submit test code * fix a bug about relu module with 0 shape data * fixed a bug about relu module with 0 shape data * fix a bug about relu module with 0 shape data * fix a bug about relu module with 0 shape data * 0shape and 0d autotest * fix a bug about relu module with 0 shape data * 0shape changed to 0_size * modify test_var.py * modify test_eye.py * modify test_reshape.py * modify test_.py * modify ReshapeFunctor * modify some file * Fixed graph autotest bug with reshape op test * Fixed graph autotest bug with reshape op test * fixed test_sub.py * modify test_sub.py * modify tensor_methods.cpp * modify array_functor.cpp * graph support 0-Size tensor * rename 0shape to 0 size * modified check_graph=True * fix and refine Co-authored-by: Zhenhua <huangzhenhua@zhejianglab.com> Co-authored-by: tangnana925 <85614052+tangnana925@users.noreply.github.com> Co-authored-by: tangnana <tnn_personal@163.com> Co-authored-by: Zhenhua <1209435+hengzi@users.noreply.github.com> Co-authored-by: chengtbf <472491134@qq.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> 
Co-authored-by: Xiaoyu Xu <xiaoyulink@gmail.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Cumsum op implementation (#7050) * add cumsum op's forward definition * add cumsum forward test case * cumsum ver3 * remove calculating time * add cumsum forward gpu implementation * fix gpu forward error * change var name * remove annotation * add cumsum cpu forward multi-thread support * add multi-thread annotation * add cumsum grad definition * update * add cumsum cpu backward * add cumsum cpu backward functor * add cumsum autograd * update * remove user interface * use random method to test cumsum forward * add cumsum gpu backward * add cumsum gpu test * fix gpu backward bug * add a 3d cuda kernel try * Revert "add cumsum gpu test" This reverts commit 05c31556ba28ecb827b25e54c2f5fa38984e8096. * Revert "Revert "add cumsum gpu test"" This reverts commit 918ee1569863b008c1d419c3528257416cffd840. * change nele to ele_cnt * add test_cumsum.py in oneflow/test/modules * change original test_cumsum to autotest version * optimize cumsum for special up_space and down_space * add two special cu func * add cumsum doc * update doc * update doc * update code according to bbuf's review * ditto * change pin/pout to in_ptr/out_ptr * remove multi-thread func * update doc * use tensor processor * update by review * update by review * update * update * auto format by CI * auto format by CI * update doc * update Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> * Logical slice in tenosr str (#7116) * using logical slice in tensor str * add tensor str util file * refine * refine * refine * refine * add logical slice docs * fix bug * fix comment * auto format by CI * fix doc test bug * delete TODO Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Add install for oneflow py (#7107) * Add install for oneflow py * refine * refine * refine * refine * refine * refine * 
refine * refine * refien * refine * refine * refine * refine * refine * refine * refine * refine * refine * refine Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix bug: output key not exists when SavaJobToIR (#7139) * fix bug: output key not exists when SavaJobToIR * [test] makedirs when path not exists * remove useless comment Co-authored-by: Peihong Liu <mosout@qq.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Add linalg 2d norm op for clip_grad (#7160) * add linalg_2d_norm op for clip_grad * code format * revert sqrt * fix comment * refine * fix comment * fix ci error * fix ci error * fix docs bug * fix ci error * fix ci error Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * refine nn.graph autotest (#7111) * add linspace op * refine graph autotest * revert * add graph error trace * fix bug * fix autotest bug * auto format by CI * fix set_printoptions error * auto format by CI * CI test bug * auto format by CI * For CI * auto format by CI * For CI test * fix ci error * revert for ci * fix bug * fix ci error * fix bug * fix bug Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: Li Xiang <54010254+lixiang007666@users.noreply.github.com> Co-authored-by: lixiang <88304454@qq.com> * add oneflow/pytorch cudnn.deterministic (#7172) * add cudnn.deterministic * fix bug * auto format by CI * fix bug * fix generate fake program input bug * auto format by CI Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> * fix linalg vector norm scalar tensor print bug (#7178) * fix linalg vector norm scalar tensor print bug * auto format by CI Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> * use op schema for cumsum (#7175) * add op schema for cumsum * change cumsum's td definition to math group * update * fix get_sbp for scalar math ops (#7184) * add inplace mul for 
clip_grad (#7180) * add inplace mul for clip_grad * auto format by CI * fix format error Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> * Add swapaxes op (#7179) * Add swapaxes op * Modify runtime * fix docstr * Modify functor Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix install cuda include (#7191) * Support uneven split in eager slice boxing (#7123) * fix Resource::DumpCudnnConf * add shape para in boxing check function * fix GetBoxingFunction para * asymmetric_x_to_b support cpu * forbid uneven split in cellective boxing * refine slice boxing kernel to support uneven split * add test case and fix balanced_splitter error * fix test case * fix op/kernel bug * fix bug in symmetric_s_to_p * revert boxing_dividor_util.cpp * use const Shape& Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Add stack kernel (#7152) * fix arange bug * build init kernel * add stack backward * remove annotation * reformat and fix sbp * fix ops td format * fix format * fix comment * add more test case in dim * fiux user ops td * fix to use size_t * fix annotation * fix less than * fix userop tabelgen * fix bug when num of inputs greater than 128 Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Add erfinv op (#7163) * Add erfinv op pre * fix * add erfinv op * Add test * fix comment * add inplace version of erfinv * add inplace version docs * fix inplace cpu version kernel and ops td * add test and docs * fix back * fix unittest * fix const & Co-authored-by: MARD1NO <359521840@qq.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * add doc for pybind type (#7193) * fix linspace bug (#7185) * fix linspace bug * auto format by CI * fix comment * annotation adaptive_avgpool3d Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Add floor inplace version (#7187) 
* add floor inplace version * add docs * fix comment * fix comment * fix comment * auto format by CI * fix comment Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: ZZK <42901638+MARD1NO@users.noreply.github.com> * remove is_lazy check in nn.Graph inplace output (#7190) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Fix test case about eye (#7194) * fix eye test case * add test case Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix(narrow): fix consistent narrow gradient bug (#7195) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Fused kernel with broadcast (#6977) * add broadcast for fused kernel * fix cuda memcpy ilegal access error * add broadcast for fused_softmax kernel * fix errors * add more test sample * reformat * add one_elif * reformat * use different dispatch logic * Use simplified dims * add simplified dims for fused_scale_mask_softmax_dropout * add simplified broadcast for fused_scale_mask_softmax_dropout * add simplified dims for fused_scale_mask_softmax * try to merge duplicate code * simpified kernel code * fix test case * fix check * remove annotation * add new line Co-authored-by: MARD1NO <359521840@qq.com> Co-authored-by: ZZK <42901638+MARD1NO@users.noreply.github.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * skip drop if drop rate is zero (#7186) * Dev inplace clamp (#7182) * add inplace for clamp * first commit * fix conflict * add clip alias and docs * fix bug and add test * add more test case * skip functional adaptive pool3d test Co-authored-by: Zhanghuihong <garfield.gzhh@gmail.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: Xiaoyu Zhang <35585791+BBuf@users.noreply.github.com> Co-authored-by: Li Xiang 
<54010254+lixiang007666@users.noreply.github.com> * Revert "Fused kernel with broadcast (#6977)" (#7207) This reverts commit 80099aa. * [BUG] Fixed graph autotest bug with sub op (#7142) * fixed Fixed graph autotest bug with sub op test * fixed 0size data graph autotest bug with randperm op Co-authored-by: Xiaoyu Xu <xiaoyulink@gmail.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * var kernel (#7024) * var forward kernel * variance backward * add var backward * refine * refine * refine * refine * add GetSbpFn * refine * refine * refine * refine * refine * add TODO * replace 'axis' str using 'dim' str * change the way of getting cuda stream * add comment * auto format by CI * fix ref bug * fix static check error * auto format by CI * fix build many linux error * format * fix static check error * fix mut dptr error when size is 0 * refine * support 0 shape and nan * auto format by CI * refine * fix doctest because of accuracy error * fix backward unsqueeze dim bug * fix bug backward * refine * fix out of order bug Co-authored-by: Yao Chi <later@usopp.net> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Add Python code frame, debug(2) and debug(3) in nn.Graph (#7110) * add frame * test pass * refine loc str * refine code * refine code * refine debug * add debug * block forward with glog scope * refine debug * glog to stderr when v 2 * refine py str api * refine and fix py obj repr * refine pystr; use GetOrThrow at pyfunc; use alsologtostderr * refine pystr * move str * fix test * log 2 alsolog Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * update readme 0.6.0 (#7202) * update readme * add Publication section * reorder * update default version * Fix check graph bug part1 (#7197) * support randperm graph test * add diagonal graph test * fix eye op check graph bug * refine * fix to bug * refine * fix * format 
* restruct nn.graph autotest * format * fix bug * auto format by CI * fix where test bug * comment diagonal op * fix comment * fix ci error Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: Li Xiang <54010254+lixiang007666@users.noreply.github.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> * Add documentation file for nn.init.xxxx (#7181) * Add documentation file for nn.init.xxxx (#7168) * Modify document index order (#7168) Co-authored-by: Yao Chi <later@usopp.net> * Refactor to numpy (#7097) * tensor numpy method * to numpy * delete useless file * replace CHECK_JUST with JUST * tensor cpu method return self if it is in cpu * delete tensor buffer * delete useless code * refine * Update python/oneflow/nn/modules/tensor_ops.py Co-authored-by: Yinggang Wang <wyg19970408@gmail.com> * refine * add docstr of cpu method * delete useless code * refine * add comment * refine * add 'assert' info * refine * do .cpu if tensor is not in cpu memory * revert format change * fix tensor buffer numpy * support tensor buffer to invoke numpy * fix bug * fix nd sbp numpy bug * fix bug about test case because of numpy sharing memory with tensor * auto format by CI Co-authored-by: Yinggang Wang <wyg19970408@gmail.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> * Fix eager boxing bug (#7196) * fix_eager_boxing_bug * remove EagerBoxingCall * minor fix * fix error * fix error * rename d to dim Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Feat eager consistent 2d sbp infer (#7143) * feat(EagerConsistent): support 2d sbp infer * feat(EagerConsistent): support compute copy cost * refine 2d sbp cannot find error message * refactor(EagerConsistent): move functions to sbp_infer_util * feat(EagerConsistent): add same sbp judgement * refine code * feat(EagerConsistent): update 1d to 1d copy cost * 
feat(EagerConsistent): try to get boxing from eager_consistent_boxing_mgr * feat(EagerConsistent): update copy cost function * remove useless code * refine code * fix merge bug * refine code and fix copy cost function * Revert "Fused kernel with broadcast (#6977)" This reverts commit 80099aa. * Add comment * refine code * fix JUST * Revert "Revert "Fused kernel with broadcast (#6977)"" This reverts commit e7e2990. * fix P->B copy cost * fix error message error Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: Houjiang Chen <chenhoujiangcug@gmail.com> * fix split default arg (#7222) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix no grad inplace clamp (#7220) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix readthedocs auto update (#7223) * fix docs (#7227) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * allow_file_schema_in_mirror_third_party (#7231) * support_symmetric_cyclic_nd_sbp_boxing (#7210) * support_symmetric_cyclic_nd_sbp_boxing * rename func * minor fix * solve comment * minor fix * fix typo Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Fix erfinv and swapaxes (#7217) * Fix erfinv and swapaxes * Fix * Fix bug and add test * Modify name * Fix arg * Modify pi * Fix Co-authored-by: ZZK <42901638+MARD1NO@users.noreply.github.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Support nd sbp dim reduce (#7230) * support_symmetric_cyclic_nd_sbp_boxing * rename func * minor fix * solve comment * minor fix * support_nd_sbp_dim_reduce * fix_typo * add test case * fix bug * fix bug * refine * fix dead loop error Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix comm test cases (#7021) * fix comm test cases * auto format by CI * refine * refine * refine Co-authored-by: ZZK 
<42901638+MARD1NO@users.noreply.github.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> * fix_backward_bug_in_1d_to_2d_boxing (#7224) * fix_backward_bug_in_1d_to_2d_boxing * refine * of_format Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Skip layernorm warp test (#7243) * fix arange bug * skip * fix comment Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Print warning for non localhost proxy (#7228) * print warning for non localhost proxy Signed-off-by: daquexian <daquexian566@gmail.com> * reformat Signed-off-by: daquexian <daquexian566@gmail.com> * add more check Signed-off-by: daquexian <daquexian566@gmail.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * add ddp return type (#7232) * add dpp return type * add comment * auto format by CI Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Parameter support both inplace op and setter (#7249) * feat(Parameter): Parameter support both inplace op and setter * feat(Tensor): tensor support data's getter interface * test(Parameter): add getter test Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix(*): fix sbp filter function bug (#7229) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * refine (#7240) * Eager boxing status (#7150) * add eager boxing status * refine MakeBoxingInterpreterStatus * add blank line * del EagerBoxingCall * refine BoxingInterpreterStatus * refine BoxingInterpreterStatus * add eager boxing log * minor fix * minor fix * revert removed file * add indent arg * rename indent to prefix * solve comment * refine eager_boxing_logger * use Global<const EagerBoxingLogger> * minor fix Co-authored-by: oneflow-ci-bot 
<69100618+oneflow-ci-bot@users.noreply.github.com> * fix empty bug (#7239) * fix empty bug * simplify empty Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix empty debug str of hob primitive (#7245) * fix empty debug str of hob primitive Signed-off-by: daquexian <daquexian566@gmail.com> * fix 'OF_PP_STRINGIZE(op)' Signed-off-by: daquexian <daquexian566@gmail.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: ZZK <42901638+MARD1NO@users.noreply.github.com> * Add VSCode dev container (#7233) * add dev container * use oneflow/devcontainer * add settings for new lines and trailing ws * refine docs * add eol setting to config * Add '"--gpus", "all"' if running a CUDA image * set BUILD_HWLOC off in fast cmake init cache * Skip send and recv if dst and src are same. (#7255) * Maxpool op nhwc (#7214) * maxpool2d_support_nhwc * refine * add test case * format * refine * refine * fix comments * Implement consistent tensor detach (#7265) * Feat/zero optimization in nn.Graph (#7165) * debug * modify graph.py * fix bug about graph debug interface * Fix nn graph variable bind (#6895) * fix(AutoParallel): nn.Graph support auto_parallel change sbp * fix(AutoParallel): use tensor.set_data interface and add print sbp info * add comment * hack check * add test * refine test * refine test * refine code * add and refine zero * fix test * refine code * rm debug log * refine min size set * add note * debug zero * fix cudnn config * refine test doc * add comment of check * eager mode in graph pass * format * rebuid parameter according to sbp in synced plan * auto format by CI * fix code check * fix test * try init session at graph init * refine and revert session init * rm useless code * add back print of sys conf Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: grybd <52237830+grybd@users.noreply.github.com> Co-authored-by: oneflow-ci-bot 
<69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: wyg1997 <wyg19970408@gmail.com> * fix linspace limit bug (#7236) * fix linspace limit bug * auto format by CI Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: Liang Depeng <liangdepeng@gmail.com> * fix merge bugs * fix(NNGraph): create tensor in jobpass after pulling plan * fix code bug Co-authored-by: Li Xinqi <lixinqi2010@gmail.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: ouyangyu <xuanjiuye@gmail.com> Co-authored-by: daquexian <daquexian566@gmail.com> Co-authored-by: leaves-zwx <kunta0932@gmail.com> Co-authored-by: Xiaoyu Xu <xiaoyulink@gmail.com> Co-authored-by: Shijie <821898965@qq.com> Co-authored-by: Li Xiang <54010254+lixiang007666@users.noreply.github.com> Co-authored-by: Xiaoyu Zhang <35585791+BBuf@users.noreply.github.com> Co-authored-by: binbinHan <han_binbin@163.com> Co-authored-by: Luyang <flowingsun007@163.com> Co-authored-by: cheng cheng <472491134@qq.com> Co-authored-by: ZZK <42901638+MARD1NO@users.noreply.github.com> Co-authored-by: Shenghang Tsai <jackalcooper@gmail.com> Co-authored-by: PragmaTwice <i@twice.moe> Co-authored-by: hjchen2 <chenhoujiangcug@gmail.com> Co-authored-by: luqiang guo <702572275@qq.com> Co-authored-by: Peihong Liu <mosout@qq.com> Co-authored-by: Juncheng <liujuncheng1022@gmail.com> Co-authored-by: ZeKai Zhou <30856589+zzk0@users.noreply.github.com> Co-authored-by: VertexC <bob2420083992@gmail.com> Co-authored-by: Yao Chi <later@usopp.net> Co-authored-by: liufengwei0103 <2472937968@qq.com> Co-authored-by: lichunyou <33850693+lcylcy@users.noreply.github.com> Co-authored-by: guo-ran <360112263@qq.com> Co-authored-by: lixinqi <lixinqi0703106@163.com> Co-authored-by: wyushun <wyushun@foxmail.com> Co-authored-by: fengdaozhuo 
<52237830+grybd@users.noreply.github.com> Co-authored-by: Zhenhua <huangzhenhua@zhejianglab.com> Co-authored-by: tangnana925 <85614052+tangnana925@users.noreply.github.com> Co-authored-by: tangnana <tnn_personal@163.com> Co-authored-by: Zhenhua <1209435+hengzi@users.noreply.github.com> Co-authored-by: lixiang <88304454@qq.com> Co-authored-by: MARD1NO <359521840@qq.com> Co-authored-by: DangKai <dangkai4u@outlook.com> Co-authored-by: Zhanghuihong <garfield.gzhh@gmail.com> Co-authored-by: Tao Lei <96455870+taoteo@users.noreply.github.com> Co-authored-by: Liang Depeng <liangdepeng@gmail.com>