Feat/eager tensor to graph out and inplace #7254

strint · 2022-01-13T14:59:53Z

Support eager tensor in nn.Graph's:

* output
* inplace out to output

Refine auto test of nn.Graph

…github.com/Oneflow-Inc/oneflow into feat/eager_tensor_to_graph_out_and_inplace

oneflow/core/framework/op_interpreter/lazy_op_interpreter.cpp

python/oneflow/test_utils/automated_test_util/torch_flow_dual_object.py

github-actions · 2022-01-16T10:50:38Z

CI failed when running job: cuda-legacy-benchmark-experimental. PR label automerge has been removed

github-actions · 2022-01-17T12:28:53Z

Speed stats:

GPU Name: GeForce GTX 1080 

OneFlow resnet50 time: 136.9ms (= 13687.0ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 137.7ms (= 13771.2ms / 100, input_shape=[16, 3, 224, 224])
✔️ Relative speed: 1.01 (= 137.7ms / 136.9ms)

OneFlow resnet50 time: 78.7ms (= 7867.0ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 82.5ms (= 8252.4ms / 100, input_shape=[8, 3, 224, 224])
✔️ Relative speed: 1.05 (= 82.5ms / 78.7ms)

OneFlow resnet50 time: 52.9ms (= 10588.2ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 56.9ms (= 11388.4ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 1.08 (= 56.9ms / 52.9ms)

OneFlow resnet50 time: 43.3ms (= 8660.3ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 46.8ms (= 9369.5ms / 200, input_shape=[2, 3, 224, 224])
✔️ Relative speed: 1.08 (= 46.8ms / 43.3ms)

OneFlow resnet50 time: 40.4ms (= 8081.3ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 38.1ms (= 7628.0ms / 200, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 0.94 (= 38.1ms / 40.4ms)

OneFlow resnet50 time: 150.6ms (= 15064.6ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 157.2ms (= 15724.0ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.04 (= 157.2ms / 150.6ms)

OneFlow resnet50 time: 96.0ms (= 9603.3ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 99.7ms (= 9973.2ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.04 (= 99.7ms / 96.0ms)

OneFlow resnet50 time: 71.4ms (= 14288.3ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 73.9ms (= 14777.3ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.03 (= 73.9ms / 71.4ms)

OneFlow resnet50 time: 64.5ms (= 12895.1ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 62.3ms (= 12452.5ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 0.97 (= 62.3ms / 64.5ms)

OneFlow resnet50 time: 57.6ms (= 11526.4ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 60.6ms (= 12120.7ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.05 (= 60.6ms / 57.6ms)
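The arithmetic behind each entry can be re-checked from the printed lines themselves: the per-iteration time is the total time divided by the iteration count, and the relative speed is the PyTorch per-iteration time divided by the OneFlow one, so values above 1 mean OneFlow was faster. The sketch below is a reconstruction from the printed formulas only; the CI bot's actual script is not shown here, and the names `LINE_RE`, `parse_line` and `relative_speed` are hypothetical helpers introduced for illustration.

```python
import re

# Matches one benchmark line in the format printed by the CI bot above.
# (Assumed format, inferred from the log lines; not taken from the bot's source.)
LINE_RE = re.compile(
    r"(?P<framework>OneFlow|PyTorch) resnet50 time: (?P<avg>[\d.]+)ms "
    r"\(= (?P<total>[\d.]+)ms / (?P<iters>\d+),"
)


def parse_line(line):
    """Return (framework, total_ms, iterations) for one benchmark line."""
    m = LINE_RE.search(line)
    if m is None:
        raise ValueError(f"unrecognized benchmark line: {line!r}")
    return m["framework"], float(m["total"]), int(m["iters"])


def relative_speed(oneflow_line, pytorch_line):
    """PyTorch per-iteration time divided by OneFlow per-iteration time."""
    _, of_total, of_iters = parse_line(oneflow_line)
    _, pt_total, pt_iters = parse_line(pytorch_line)
    return (pt_total / pt_iters) / (of_total / of_iters)


if __name__ == "__main__":
    # First pair from the stats above (batch size 16, single GPU).
    oneflow = "OneFlow resnet50 time: 136.9ms (= 13687.0ms / 100, input_shape=[16, 3, 224, 224])"
    pytorch = "PyTorch resnet50 time: 137.7ms (= 13771.2ms / 100, input_shape=[16, 3, 224, 224])"
    print(f"Relative speed: {relative_speed(oneflow, pytorch):.2f}")  # prints "Relative speed: 1.01"
```

Computing the ratio from the unrounded totals rather than the rounded per-iteration values gives the same two-decimal result for the pairs checked here (batch sizes 16, 4 and 1 on a single GPU).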

* add ddp return type (#7232) * add dpp return type * add comment * auto format by CI Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Parameter support both inplace op and setter (#7249) * feat(Parameter): Parameter support both inplace op and setter * feat(Tensor): tensor support data's getter interface * test(Parameter): add getter test Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix(*): fix sbp filter function bug (#7229) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * refine (#7240) * Eager boxing status (#7150) * add eager boxing status * refine MakeBoxingInterpreterStatus * add blank line * del EagerBoxingCall * refine BoxingInterpreterStatus * refine BoxingInterpreterStatus * add eager boxing log * minor fix * minor fix * revert removed file * add indent arg * rename indent to prefix * solve comment * refine eager_boxing_logger * use Global<const EagerBoxingLogger> * minor fix Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix empty bug (#7239) * fix empty bug * simplify empty Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix empty debug str of hob primitive (#7245) * fix empty debug str of hob primitive Signed-off-by: daquexian <daquexian566@gmail.com> * fix 'OF_PP_STRINGIZE(op)' Signed-off-by: daquexian <daquexian566@gmail.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: ZZK <42901638+MARD1NO@users.noreply.github.com> * Add VSCode dev container (#7233) * add dev container * use oneflow/devcontainer * add settings for new lines and trailing ws * refine docs * add eol setting to config * Add '"--gpus", "all"' if running a CUDA image * set BUILD_HWLOC off in fast cmake init cache * Skip send and recv if dst and src are same. 
(#7255) * Maxpool op nhwc (#7214) * maxpool2d_support_nhwc * refine * add test case * format * refine * refine * fix comments * Implement consistent tensor detach (#7265) * Feat/zero optimization in nn.Graph (#7165) * debug * modify graph.py * fix bug about graph debug interface * Fix nn graph variable bind (#6895) * fix(AutoParallel): nn.Graph support auto_parallel change sbp * fix(AutoParallel): use tensor.set_data interface and add print sbp info * add comment * hack check * add test * refine test * refine test * refine code * add and refine zero * fix test * refine code * rm debug log * refine min size set * add note * debug zero * fix cudnn config * refine test doc * add comment of check * eager mode in graph pass * format * rebuid parameter according to sbp in synced plan * auto format by CI * fix code check * fix test * try init session at graph init * refine and revert session init * rm useless code * add back print of sys conf Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: grybd <52237830+grybd@users.noreply.github.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: wyg1997 <wyg19970408@gmail.com> * fix linspace limit bug (#7236) * fix linspace limit bug * auto format by CI Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: Liang Depeng <liangdepeng@gmail.com> * replace throw by OF_UNIMPLEMENTED or UNIMPLEMENTED, refine error message, replace CHECK by CHECK_OR_RETURN (#7121) * replace throw by OF_UNIMPLEMENTED in dim_scatter_ops.cpp * replace throw by OF_UNIMPLEMENTED in scatter ralated kernels * replace throw by OF_UNIMPLEMENTED in scatter ralated kernel * replace glog CHECK by oneflow CHECK_OR_RETURN * refine error message on modified UNIMPLEMENTED * replace CHECK by CHECK_OR_RETURN in dim_scatter_ops.cpp * refine error message on modified UNIMPLEMENTED * refine error message on 
modified UNIMPLEMENTED * refine error message on modified UNIMPLEMENTED * remove std::endl, add period, remove redundant maybe.h including * remove std::endl, add period Co-authored-by: Yao Chi <later@usopp.net> * Remove single client from CI (#7274) * remove single client ci * update get-oneflow * rm changed_files * refine workflow * Revert "refine workflow" This reverts commit f9cdcadf63f4634177471a06be5a2aa49e87df68. * Update test.yml * refine * refine * refine * reorder * rm changed_files * refine * add CHANGELOG.md * refine * Feat/eager tensor to graph out and inplace (#7254) * feat(Parameter): Parameter support both inplace op and setter * feat(Tensor): tensor support data's getter interface * test(Parameter): add getter test * debug * add test * open flatten graph test * add validated flase type * refine * foramt Co-authored-by: wyg1997 <wyg19970408@gmail.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Optimize LayerNorm backward param grad (#6996) * layer_norm forward * test case * rm useless * layer_norm backward dx * layer norm param grad * int count to T count * fix * fix T mask to int mask, refine code * refine * refine * test case * refine * format * fix * add dtype bfloat16 * refine * refine * refine * refine * sum_loss to sum_stats * x_buf to normalized_buf * refine * refine * address review * refine * add testcase * double use uncached impl to reduce compile time * Fix python apis and xla implementation (#7183) * Support save/load for lr_scheduler (#6948) * feat(LrScheduler): support save/load for lr_scheduler * refine document * auto format by CI * Refine test * auto format by CI Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> * Fix eye_op attr (#6973) * fix * add graph test * Update python/oneflow/test/graph/test_graph_eye.py Co-authored-by: daquexian <daquexian566@gmail.com> * refine * Update 
python/oneflow/test/graph/test_graph_eye.py Co-authored-by: daquexian <daquexian566@gmail.com> * auto format by CI Co-authored-by: daquexian <daquexian566@gmail.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> * softmax double use uncached impl to accelerate compile (#6992) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Add [[nodiscard]] for cpp api (#6997) * add [[nodiscard]] * refine * reformat Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Support Arange delta to decide dtype (#6998) * support delta dtype to decide output dtype * add more unittest Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Add clang as CUDA FE compiler in CI (#6954) * update action use * refine * refine * refine * refine * refine * refine * refine * refine * refine * refine * refine * refine * fix * add 80 and 86 * refine * refine * add CUDA_NVCC_THREADS_NUMBER * refine * address review * set CUDA_NVCC_THREADS_NUMBER 8 * fix * fix clang in init cmake * add script * refine * refine * refine * refine * refine * refien * refine * add flags to skip zlib * refine * refine * refine * refine * refine * refine * refine * refine * refine * refine * refine * refine * refine * refine * refine * refine * Migrate chunk python layer to functor (#6983) * Migrate chunk Python layer logic to functor * fix runtime * Fix splits bug and CI * Modify push to emplace Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Reduce memory usage when compiling oneflow dialect ops (#7000) * CudaAllocator device reset before OOM (#6976) * CudaAllocator device reset before OOM * Add NOTE Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Refactor vm stream desc (#6989) * remove StreamDesc::num_machines * Prepare one thread for one stream_type Co-authored-by: 
oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Add Diagonal Op (#6016) * format complete * python to cpp * py2cpp error * rm * auto format by CI * revise * auto format by CI * license * docstring * docstring * tensor * tensor attribute * auto format by CI * docstring * revise * test * revise * revise * rename * half * docs * doc,test * test times * revise * format Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * add all to all op (#6283) * add all to all op * add barrier * format * add import * fix test * delete barrier * delete barrier * Revert "delete barrier" This reverts commit aa397ea5ba815fe6df883b263b82735f126345c8. * Revert "delete barrier" This reverts commit 7ddf79afaa7ac072813e84ce9224440939a3f95c. * check tensor meta between ranks * add more assert * all_reduce operate in place * all_reduce operate in place * fix bug * assert tensor.is_local * fix bug in scatter * add more assert * delete meta check * add pytorch comparison test * add pytorch comparison test * refine * add ONEFLOW_TEST_CPU_ONLY * fix bug from torch gloo Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Dev ivalue for cpp api (#6890) * add api tensor * refine * add nn.relu * refine * clean shape & refine relu test * support void* for from_blob * add multithreading relu test * refine test * refine * refine * add comment for __internal_tensor() * convert to copy_util * reformat * refine * add ivalue * refine directory structure * refine cpp api test * refine test * add ivalue * refine ivalue * refine ivalue * refine * refine * refine * refine Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * default use cpu generator (#7001) * optimize reshape/slice/transpose functor (#6956) * optimize reshape/slice/transpose functor * update code according to reviewer's suggestion * judge negative dimension number besides -1 * judge 
negative shape value in view::Reshape * remove is_full_slice logic in SliceFunctor * update code according to yinggang's advice * move ordered permute judge to TransposeKernel * remove print sentence * abstract IsOrderedPermute func * support negative permute value in TransposeFunctor * delete tranpose_kernel optimization * Revert "delete tranpose_kernel optimization" This reverts commit e026434dc7c1ebad948c76bde475540e3bf4477a. * not return original tensor when reshape do nothing * simplify code * correct spell error Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix IsContinuosSubspace error (#6968) * fix IsContinuosSubspace error * recover original IsContinuosSubspace code * add test case * auto format by CI Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * add cpu group deconve impl (#6980) * add cpu group deconv impl * remove useless lines * remove useless lines * add deconv2d import * add groups test * remove check_allclose=False * add tf_prelu * add cpu group deconv impl * remove useless lines * remove useless lines * add deconv2d * add groups test * remove check_allclose=False * add tf_prelu * auto format by CI * add deconv2d impl * add deconv2d impl * remove useless lines * add deconv2d in functional api * auto format by CI * auto format by CI * Add variable initial * Add variable initial * auto format by CI * add conv2d impl * add conv2d impl * auto format by CI * remove useless lines Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> * Migrate the python layer logic of broadcastlike to functor (#7007) * Migrate the python layer logic of broadcastlike to functor * add var name Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Temporarily skip comm test cases (#7015) * Temporarily skip comm test cases * auto format by CI 
Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Fix nd_sbp attribute type and set nd_sbp in random functors (#7017) * fix * fix compile Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Save Job to IR and load Job from IR (#6885) * save to ir * test * fix bugs * impl load and test * rm useless code * fix conflict * fix issues * JobOp * fix issues * fix test_fuse_tril_scale * fix test jit-outline-func * fix test_mlir_opt.py * save * fix ods gen for max and avg pool * rename oneflow to oneflow_foundation * fix files checks * refine * refine * refine * refine * refine * refine * refine * refine * refine * refine * refine * refine * refine * refine * refine * refine * auto format by CI * check in changes * refine * Update oneflow/ir/test/OneFlow/test_mlir_opt.py * Update oneflow/ir/include/OneFlow/OneFlowOps.td * refine includes * printer & parser & verifier * code tidy * tidy include * address review * rm duplicated GetDataTypeType * TensorSource trait Co-authored-by: jackalcooper <jackalcooper@gmail.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> * Fix Simple CI linkage (#6986) * fix-simple-ci-linkage * refine * refine * fix * refine * refine * refine * refine * refien * refine * revert * refine * auto format by CI * refine * revert * refine Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> * fix sbp when weight is optional (#6984) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Feat from numpy (#7013) * feat(Tensor): support share memory with ndarray * test(FromNumpy): add test * enhancement test and add document * Fix merge error * fix bug in numpy c api * Fix(doctest): fix doctest error Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Add custom ShapeAttr in ODS (#7023) * add ShapeAttr * refine * fix doc * refine * fix (#7028) * Add linspace op (#7006) * add 
linspace op * refine doc * refine * fix comments * fix comment * auto format by CI * fix ci doc error Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix fasterrcnn infer (#7014) * fix fasterrcnn infer * roi_align 0shape * refine * refine Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * separate kernel state and cache (#6655) * support eager state except lazy dynamic Signed-off-by: daquexian <daquexian566@gmail.com> * modularize kernel contexts Signed-off-by: daquexian <daquexian566@gmail.com> * fix warning Signed-off-by: daquexian <daquexian566@gmail.com> * reformat Signed-off-by: daquexian <daquexian566@gmail.com> * remove duplicated license Signed-off-by: daquexian <daquexian566@gmail.com> * fix static check error Signed-off-by: daquexian <daquexian566@gmail.com> * make test gpu only Signed-off-by: daquexian <daquexian566@gmail.com> * temp Signed-off-by: daquexian <daquexian566@gmail.com> * revert opkernel context changes, align with master Signed-off-by: daquexian <daquexian566@gmail.com> * reformat Signed-off-by: daquexian <daquexian566@gmail.com> * refine cachecontext Signed-off-by: daquexian <daquexian566@gmail.com> * add separate cache context inferface, remove out-dated files Signed-off-by: daquexian <daquexian566@gmail.com> * add init and cache context aliases Signed-off-by: daquexian <daquexian566@gmail.com> * update eager kernel Signed-off-by: daquexian <daquexian566@gmail.com> * fix wrong AttrMayChanged value Signed-off-by: daquexian <daquexian566@gmail.com> * rename and add comment Signed-off-by: daquexian <daquexian566@gmail.com> * auto format by CI * fix combined_margin_loss_kernel.cpp Signed-off-by: daquexian <daquexian566@gmail.com> * rename op_kernel_state_wrapper.h to op_kernel_wrapper.h Signed-off-by: daquexian <daquexian566@gmail.com> * rename more classes, fix old cache in stateful op kernel Signed-off-by: daquexian 
<daquexian566@gmail.com> * rename more classes Signed-off-by: daquexian <daquexian566@gmail.com> * may changed -> not changed Signed-off-by: daquexian <daquexian566@gmail.com> * optimize away genrepeatedbn Signed-off-by: daquexian <daquexian566@gmail.com> * reformat Signed-off-by: daquexian <daquexian566@gmail.com> * refine Signed-off-by: daquexian <daquexian566@gmail.com> * update stateful local opkernel, use Cache** if possible Signed-off-by: daquexian <daquexian566@gmail.com> * remove TensorDesc4ArgNameAndIndex base method Signed-off-by: daquexian <daquexian566@gmail.com> * auto format by CI * fix clang-tidy error Signed-off-by: daquexian <daquexian566@gmail.com> * auto format by CI * fix conv kernel bug Signed-off-by: daquexian <daquexian566@gmail.com> * auto format by CI * fix group conv bug and fix warning Signed-off-by: daquexian <daquexian566@gmail.com> * fix avgpool error Signed-off-by: daquexian <daquexian566@gmail.com> * fix maxpool error Signed-off-by: daquexian <daquexian566@gmail.com> * auto format by CI * respect flag in deconv cpu kernel, rename cache to cache_ptr Signed-off-by: daquexian <daquexian566@gmail.com> * fix compile error Signed-off-by: daquexian <daquexian566@gmail.com> * auto format by CI * fix deconv cache bug Signed-off-by: daquexian <daquexian566@gmail.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: Li Xinqi <lixinqi2010@gmail.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Add fully support for all datatype (#7025) * add fully support for all datatype * Use max array size * add clang-format off to maintain the matrix * fix format * remove redundant numpy dtype Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Migrate split python layer to functor (#7030) * Migrate split python layer to functor * modify dim Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Add add_sparse_optimizer for Graph (#6988) * 
add_sparse_optimizer * format * fix bug * refine new interface by discuss * auto format by CI * address review * correct syntax * correct error message * rm debug print * auto format by CI * fix cpu-only test Co-authored-by: XIE Xuan <xiexuanx2@gmail.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Refine RUN_CUDA_KERNEL (#7003) * Refine RUN_CUDA_KERNEL * Added LaunchConfig Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Support llvm in tree build (#6995) * refine * refine * refine * refine * add61 * refien * refine * refine * refine * refine * refien * refine * refine * refine * refine * refine * refine * refine * rm * revert * refine * refine * refine * refine * return_self_in_to_consistent_if_necessary (#7004) * return_self_in_to_consistent_if_necessary * fix error and add test case * skip cpu test * fix error Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Decouple ep and global (#7027) * Decouple ep and global * NOLINT * fix * fix import Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * arange doc fix (#7035) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * add_consistency_check_in_consistent_tensor_set_data (#7002) * add_consistency_check_in_consistent_tensor_set_data * auto format by CI * minor fix * add just wrap Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * [cmake] add liboneflow_cpp target (#7005) * add cmake changes for liboneflow_cpp.so Signed-off-by: daquexian <daquexian566@gmail.com> * add separate target for cpp api test Signed-off-by: daquexian <daquexian566@gmail.com> * add cpp api test in ci Signed-off-by: daquexian <daquexian566@gmail.com> * reverse the order of cudnn and cuda library Signed-off-by: daquexian 
<daquexian566@gmail.com> * update logic of BUILD_MONOLITHIC_LIBONEFLOW Signed-off-by: daquexian <daquexian566@gmail.com> * rename BUILD_MONOLITHIC_LIBONEFLOW to BUILD_MONOLITHIC_LIBONEFLOW_CPP_SO Signed-off-by: daquexian <daquexian566@gmail.com> * share lib directory in test container Signed-off-by: daquexian <daquexian566@gmail.com> * add github actions debug Signed-off-by: daquexian <daquexian566@gmail.com> * Revert "add github actions debug" This reverts commit 7d9aef684a479285c690f38d25525c9b97865e45. * add upterm debug after exe test Signed-off-by: daquexian <daquexian566@gmail.com> * sleep after fail Signed-off-by: daquexian <daquexian566@gmail.com> * set LD_LIBRARY_PATH in yml for cpp api test exe Signed-off-by: daquexian <daquexian566@gmail.com> * sleep Signed-off-by: daquexian <daquexian566@gmail.com> * upload liboneflow_cpp.so Signed-off-by: daquexian <daquexian566@gmail.com> * modify cmake to trigger compilation Signed-off-by: daquexian <daquexian566@gmail.com> * remove sleep Signed-off-by: daquexian <daquexian566@gmail.com> * build cpp api in cpu mode Signed-off-by: daquexian <daquexian566@gmail.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Fix CUDA 52 and add it to CI (#7031) * refine * refine * refine * refine * revert * fix * refine * refine * refine Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Add check of placement constructor (#6991) * add_check_of_placement_constructor * move CheckDeviceIdsIsValid to runtime * handle comment * fix error * fix error Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Fix(FromNumpy): fix bug in stride (#7042) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * add non virtual destructor back (#6999) Signed-off-by: daquexian <daquexian566@gmail.com> Co-authored-by: Houjiang Chen <chenhoujiangcug@gmail.com> * move python code to cpp: eye (#7036) * 80% Sbp signature left to 
finish * refine functional_api.yaml * 90% docstr left to update * refine * add sbp check * refine docs * auto format by CI * refine * refine docstr * auto format by CI Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Fix l2norm block_size (#7044) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix undefined symbol: cudaGetDeviceCount (#7052) * fix_worker_orphan_process (#7048) * fix_worker_orphan_process * use SIGTERM instead * broadcast elemwise binary (#6871) * add * broadcast elementwise binary * fix * refine * fix * refine * refine * for compile * refine * refine * refine * refine * refine * revert kernels * revert kernel * refine * refine * refine * refine * nvcc thread to 4 Co-authored-by: ZZK <42901638+MARD1NO@users.noreply.github.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Source op per critical section (#6472) * backup code * EventRecord * auto format by CI * backup code * remove deprecated binary test cases * refactor valatile to atomic * add StreamType::InitInstructionStatusIf/StreamType::DeleteInstructionStatusIf * merge from branch profiling_nn_graph * address comments * EventRecordProvider * more comments for XXXStatusQuerier::SetLaunched * more comments for SharedEventRecord::Init * wait source op per critical section * rename a task_node.cpp * minor fix * backup code * fix compiler complaints * 1) remove AddCtrlEdgeBetweenSrcDstTickAndInputOutputInSameRank; 2) create CriticalSectionInstance buffers * fix compiler complaints * more profiler code * refactor vm preschedule * TryMoveFromWaitingToReady * revert flying_instruction_cnt * revert to single position to call DispatchInstruction * revert several code * reset instruction watermark * remove is_xxx_hook_empty * build with profiler * merge master * insert device ticks before and after critical sections * refactor register_num of 
cs_wait/cs_callback from 2 to 128 * fix static analysis complaints * fix complier complaints about JobBuilder::ParallelConf4OpName * Update oneflow/core/operator/critical_section_wait_tick_op.cpp Co-authored-by: daquexian <daquexian566@gmail.com> * address pr comments * add job example for InstructionsBuilder::LaunchLazyJob * address pr comments Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: ouyangyu <xuanjiuye@gmail.com> Co-authored-by: daquexian <daquexian566@gmail.com> * More details of error of getting op matched sbp signature (#7077) * more details of error msg * minor change * address review comment * avoid namesake iterator * Module apply only once (#7055) * add once apply of param * apply once on buffer * test reuse var on module to * test resue var * rm useless test * finish test * refine test Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * distributed test bugfix (#7057) * change spawn_shell to spawn_shell_and_check, sleep in script Signed-off-by: daquexian <daquexian566@gmail.com> * fix distributed test master addr Signed-off-by: daquexian <daquexian566@gmail.com> * remove sleep Signed-off-by: daquexian <daquexian566@gmail.com> * spawn_shell -> spawn_shell_ignoring_failure Signed-off-by: daquexian <daquexian566@gmail.com> * auto format by CI * fix bug Signed-off-by: daquexian <daquexian566@gmail.com> * auto format by CI * fix the reversed logic Signed-off-by: daquexian <daquexian566@gmail.com> * improve error msg Signed-off-by: daquexian <daquexian566@gmail.com> * resolve name conflict of MASTER_ADDR Signed-off-by: daquexian <daquexian566@gmail.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix promote_type matrix (#7066) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix chunk op dim=-1 
bug (#7073) * fix chunk op dim=-1 bug * Update oneflow/core/functional/impl/array_functor.cpp Co-authored-by: Xiaoyu Zhang <35585791+BBuf@users.noreply.github.com> * Update oneflow/core/functional/impl/array_functor.cpp Co-authored-by: Xiaoyu Zhang <35585791+BBuf@users.noreply.github.com> Co-authored-by: Xiaoyu Zhang <35585791+BBuf@users.noreply.github.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Fix resource desc dump cudnn conf bug (#7038) * fix Resource::DumpCudnnConf * fix typo and error msg Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix concat bug (#7075) * fix * support concat single input * Clean TensorNameScope after graph build (#7076) * Clear tensor name scope after graph build * Add test case of 2 graph caught same free eager tensor * auto format by CI Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix_abnormal_printing (#7099) * Fix bias add dropout fuse (#7081) * fix bias_add dropout fuse when p=0.0 * remove redundant op Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Support 1d to 2d eager boxing (#7083) * fix Resource::DumpCudnnConf * support_1d_to_2d_eager_boxing * rename stack to unflatten * add test case * of format * refine test case * Revert "fix Resource::DumpCudnnConf" This reverts commit f07278d71e3f344f435fc8f116a12cbd1c099b54. 
* support nd to 1d * add 2d to 1d test case Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Implement all User Ops with Op Schema (#7032) * add oneflow-tblgen: generate op schema (OpInterpCtx) from ods * cmake: add inja * tblgen: add oneflow_datatype * tblgen: use option cat * tblgen: fix error * tblgen: put impl in .cpp * tblgen: fix null attrs * tblgen: fix null ops * refine * refine * reifne * Refine op schema template and compilation * add base OpInterpCtx to finish compilation * fix * refine * fix * add custom infer code * generate op registrants automatically * refine * fix * update user op ods and fix shape attr * refine * refine * add custom code in op base * refine comments * add same_output_regst_num and infer * support declare hasxx * update op schema emitter * refine * emit output regist num * refine * refine * migrate acc op * migrate onerec_reader, ones_like, send, pack and padding ops * add has_sbp_signature_infer_fn * refine * migrate pad, parallel_cast, partial_fc and pooling ops * rm redundant has_device_infer_fn * migrate prelu, quantization, randperm, reduce and repeat ops * migrate reshape, reshape_like, roi_align, same_pad, selu and scalar related ops * back port * backport * migrate ops * refine * refine * refine * refine * add new op * fix llvm not found * fix mlir headers * fix mlir headers * fix llvm not found * irefine * mark override * fix merge * fix * fix * set op schema as obj lib to speed up * rewrite ops * add addn * add grdi * refien * add more def (#7051) * affine grid * refien * refine * refine * refine * fix * refien * refine * refine * refine * refine * refine * refien * refine * refine * refein * refine * refine * refine * refine * refien * refine * refine * refine * refien * refien * refien * refine * refine * refien * refine * refine * refine * refein * refine * refine * refine * refine * refine * refien * refine * refine * refine * refine * refine * refine * refine * refine * refine * 
refine * refine * refine * refine * refine * refine * refine * refine * refein * refine * refine * refine * move more ops * fix math_binary_broadcast/elementwise_ops * fix hardtanh * add norm * rename file and add CpuOnly no_grad * fix ir & fix norm op * fix oneflow-tblgen * fix math_unary_elementwise_op * fix norm * fix bn * fix op schema * refine * fix * refine physical_tensor_desc_infer_fn * refine * add ScalarLogicalNotEqualOp & RecvOp * refine * auto format by CI * fix fmt * add cuda only trait * delete unused inja * del inja_copy_headers_to_destination * delete unused inja * del inja_copy_headers_to_destination * add cuda only to tblgen * fix json inja url and md5 not used * fix json inja url and md5 not used * refine * revert * add with cuda * refine * delete GenUserOpODS * remove cuda only * revert cuda only after meeting * fix Co-authored-by: PragmaTwice <i@twice.moe> Co-authored-by: hjchen2 <chenhoujiangcug@gmail.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> * Feat/debug pass (#7054) * add pass debug * debug pass * refine comment of fuse add pass * auto format by CI Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Fix error message (#6930) * fix error message * fix dot doc * fix dot elem cnt * auto format by CI Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix simple ci: add of_op_schema target to tidy check (#7105) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Rename AnyType in .td (#7109) * AnyType => Tensor * refine * refine Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Feat graph reuse var (#7080) * add once apply of param * apply once on buffer * test reuse var on module to * test resue var * rm useless test * finish test * refine test * Clear tensor name scope after graph build * Add test 
case of 2 graph caught same free eager tensor * auto format by CI * refactor var build draft * add full func; add check * done * add test of call parameter ousite its moudule * fix break test Co-authored-by: chengtbf <472491134@qq.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Fix l2_normalize & add nn.functional.normalize (#6940) * fix l2_normalize * add normalize * add test for normalize * refine * clean l2_normalize and refine normalize * simplify normalize test * Fix l2norm block_size * refine Co-authored-by: Juncheng <liujuncheng1022@gmail.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Align api in swin transformer (#7058) * add linspace op * fix align error in swintransformer * add @ magic method * fix conflict * support tensor list * fix meshgrid bug * revert Co-authored-by: hjchen2 <chenhoujiangcug@gmail.com> * set CMAKE_LINK_DEPENDS_NO_SHARED to ON (#7063) Signed-off-by: daquexian <daquexian566@gmail.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Add other api graph autotest (#7091) * Clear tensor name scope after graph build * Add test case of 2 graph caught same free eager tensor * auto format by CI * add other api graph autotest * add more samples * fix comments * refine * refine * refine * refine * refine * fix error * fix test error * fix bug * fix flip bug * fix bug * fix bug * fix ci bug * fix ci error * fix bug * fix ci error Co-authored-by: chengtbf <472491134@qq.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: Li Xiang <54010254+lixiang007666@users.noreply.github.com> * [serving] dev graph run (#7008) * add cmake changes for liboneflow_cpp.so Signed-off-by: daquexian <daquexian566@gmail.com> * add separate target for cpp api test Signed-off-by: daquexian 
<daquexian566@gmail.com> * add cpp api test in ci Signed-off-by: daquexian <daquexian566@gmail.com> * graph run * reverse the order of cudnn and cuda library Signed-off-by: daquexian <daquexian566@gmail.com> * update logic of BUILD_MONOLITHIC_LIBONEFLOW Signed-off-by: daquexian <daquexian566@gmail.com> * rename BUILD_MONOLITHIC_LIBONEFLOW to BUILD_MONOLITHIC_LIBONEFLOW_CPP_SO Signed-off-by: daquexian <daquexian566@gmail.com> * refine * [draft] implement graph parameter load and save (#7010) * implement parameter save (python) and load (c++) Signed-off-by: daquexian <daquexian566@gmail.com> * revert accident changes Signed-off-by: daquexian <daquexian566@gmail.com> * fix circular reference Signed-off-by: daquexian <daquexian566@gmail.com> * pimpl * batching * share lib directory in test container Signed-off-by: daquexian <daquexian566@gmail.com> * fix typo; * add github actions debug Signed-off-by: daquexian <daquexian566@gmail.com> * Revert "add github actions debug" This reverts commit 7d9aef684a479285c690f38d25525c9b97865e45. 
* add upterm debug after exe test Signed-off-by: daquexian <daquexian566@gmail.com> * sleep after fail Signed-off-by: daquexian <daquexian566@gmail.com> * set LD_LIBRARY_PATH in yml for cpp api test exe Signed-off-by: daquexian <daquexian566@gmail.com> * refine * add test file && input order * sleep Signed-off-by: daquexian <daquexian566@gmail.com> * upload liboneflow_cpp.so Signed-off-by: daquexian <daquexian566@gmail.com> * modify cmake to trigger compilation Signed-off-by: daquexian <daquexian566@gmail.com> * load job from ir && clean && add mlir model * [remove useless python code]save to .pb * add target of_common_obj to remove duplicate REGISTER_PASS && run of_format * remove openvino * remove openvino test * refine * IValue * Update oneflow/api/cpp/framework/graph.h Co-authored-by: daquexian <daquexian566@gmail.com> * refine * refine * refine * refine * refine * refine * rename in oneflow.cmake * refine oneflow.cmake * make of_api_common object library * move device util function in api to core * remove device check in New and ThreadLocalGetOrNew * refine * fix device test * refine graph test * refine GetExeDir() * refine GetExeDir() again * fix * refine * fix Co-authored-by: daquexian <daquexian566@gmail.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: mosout <mosout@qq.com> * disable autograd in lazy mode (#7070) * disable autograd in lazy mode * refine * Fix/rand source op in graph (#7092) * add test * fix rand consistent * add test * Fix powf (#7106) * quick fix power * add int scalar test case Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Dispatch stateful ops using functional api (#7046) * Dispatch functional stateful ops * fix * fix cmake * fix * disable attr check since it may not given when creating op expr. 
* fix * fix * fix * fix * fix * fix * fix * fix * refine Co-authored-by: VertexC <bob2420083992@gmail.com> * Fix HWLoc memory affinity (#7115) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * add_env_api_docs (#7100) * add_env_api_docs * minor fix * fix grammatical errors Co-authored-by: Yao Chi <later@usopp.net> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * tmp skip s0 print because of slice (#7065) * tmp skip s0 print because of slice * tmp skip s0 print in test case * fix Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * indexing first version (#7012) * indexing first version * complete * test * out loop * test skip * revise * revise * shape * docs * formatted * confict1 * confict2 * confict2 * confict * revise * auto format by CI Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> * fix maybe: add Maybe(T&&) to allow constructing from rvalue T (#7125) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * autotest_add_graph_log (#7126) * Meta info consistency check (#7085) * meta_info_consistency_check * refine check function * Update consistent_cast.cpp * move check to opinterpreter * refine * add note * refactor MetaInfoConsistencyCheck * of_format * refine * NonRecursiveMetaInfoConsistencyCheck * fix func name * add IsMetaInfoConsistencyCheckDisable() * mino fix * refine * minor fix * format * minor fix Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * cmake: use interface target instead of include_directories in pybind11 (#7128) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Import cmake dependence json and inja using FetchContent (#7124) * import cmake dependence json and inja using FetchContent * install-llvm: fix url hash * fix inja config * add cache var * fix ninja 
build * fix ninja build Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Add environment variable to set GRPC_ARG_MAX_MESSAGE_LENGTH (#7130) * env ONEFLOW_GRPC_MAX_MESSAGE_BYTE_SIZE * set default to -1 * Fea/nhwc (#6811) * legacy maxpool2d module * add legacy avgpool2d * add graph cudnn conv alg config * add conv2d nhwc * lazy create cuda_stream in CudaCopyD2HDeviceCtx CudaStreamHandleDeviceCtx * refine * conv bn pool nhwc for resnet perf * one hot with float * use BiasAddRowGpu * rm l2 with 0 * reformat * add nhwc env var * legacy pool merged into new * refine * fix style * fix and refine * address review * fix and refine * fix doc test Co-authored-by: luyang <flowingsun007@163.com> Co-authored-by: guo-ran <360112263@qq.com> Co-authored-by: lixinqi <lixinqi0703106@163.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * reduce memory usage caused by slice grad (#7144) * cmake: fix THIRD_PARTY build (#7146) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix fold op (#7156) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Support inplace for lazy consistent (#7112) * Support inplace for lazy consistent * fix single client sbp hint * refine Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix prelu bug (#7118) * support dtype and device in prelu * optimize PreluFunctor * fix prelu 1-dim error * update * update * auto format by CI Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> * use ibn2nd_sbp to get nd_sbp (#7155) Co-authored-by: Houjiang Chen <chenhoujiangcug@gmail.com> * fix copy bug (#7159) * fix copy bug * add to test case * refine * fix test case Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Fix laynorm backward bug (#7164) * fix layernorm backward index bug * add 
layernorm test case * auto format by CI Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * [Fix] graph support 0-Size tensor (#6957) * Add nn.functional.glu graph test * add filter to motify functional autotest * motify code * add test example * add test else * add test judging condition for test_masked_fill.py,test_constant.py,test_tile.py、test_repeat.py,test_expand.py * add test ok example * Clear tensor name scope after graph build * Add test case of 2 graph caught same free eager tensor * auto format by CI * Dev cc clean tensor name scope (#7082) * Clear tensor name scope after graph build * Add test case of 2 graph caught same free eager tensor * auto format by CI Co-authored-by: chengtbf <472491134@qq.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> * submit test success example * test success example * submit test code * fix a bug about relu module with 0 shape data * fixed a bug about relu module with 0 shape data * fix a bug about relu module with 0 shape data * fix a bug about relu module with 0 shape data * 0shape and 0d autotest * fix a bug about relu module with 0 shape data * 0shape changed to 0_size * modify test_var.py * modify test_eye.py * modify test_reshape.py * modify test_.py * modify ReshapeFunctor * modify some file * Fixed graph autotest bug with reshape op test * Fixed graph autotest bug with reshape op test * fixed test_sub.py * modify test_sub.py * modify tensor_methods.cpp * modify array_functor.cpp * graph support 0-Size tensor * rename 0shape to 0 size * modified check_graph=True * fix and refine Co-authored-by: Zhenhua <huangzhenhua@zhejianglab.com> Co-authored-by: tangnana925 <85614052+tangnana925@users.noreply.github.com> Co-authored-by: tangnana <tnn_personal@163.com> Co-authored-by: Zhenhua <1209435+hengzi@users.noreply.github.com> Co-authored-by: chengtbf <472491134@qq.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> 
Co-authored-by: Xiaoyu Xu <xiaoyulink@gmail.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Cumsum op implementation (#7050) * add cumsum op's forward definition * add cumsum forward test case * cumsum ver3 * remove calculating time * add cumsum forward gpu implementation * fix gpu forward error * change var name * remove annotation * add cumsum cpu forward multi-thread support * add multi-thread annotation * add cumsum grad definition * update * add cumsum cpu backward * add cumsum cpu backward functor * add cumsum autograd * update * remove user interface * use random method to test cumsum forward * add cumsum gpu backward * add cumsum gpu test * fix gpu backward bug * add a 3d cuda kernel try * Revert "add cumsum gpu test" This reverts commit 05c31556ba28ecb827b25e54c2f5fa38984e8096. * Revert "Revert "add cumsum gpu test"" This reverts commit 918ee1569863b008c1d419c3528257416cffd840. * change nele to ele_cnt * add test_cumsum.py in oneflow/test/modules * change original test_cumsum to autotest version * optimize cumsum for special up_space and down_space * add two special cu func * add cumsum doc * update doc * update doc * update code according to bbuf's review * ditto * change pin/pout to in_ptr/out_ptr * remove multi-thread func * update doc * use tensor processor * update by review * update by review * update * update * auto format by CI * auto format by CI * update doc * update Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> * Logical slice in tenosr str (#7116) * using logical slice in tensor str * add tensor str util file * refine * refine * refine * refine * add logical slice docs * fix bug * fix comment * auto format by CI * fix doc test bug * delete TODO Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Add install for oneflow py (#7107) * Add install for oneflow py * refine * refine * refine * refine * refine * refine * 
refine * refine * refien * refine * refine * refine * refine * refine * refine * refine * refine * refine * refine Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix bug: output key not exists when SavaJobToIR (#7139) * fix bug: output key not exists when SavaJobToIR * [test] makedirs when path not exists * remove useless comment Co-authored-by: Peihong Liu <mosout@qq.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Add linalg 2d norm op for clip_grad (#7160) * add linalg_2d_norm op for clip_grad * code format * revert sqrt * fix comment * refine * fix comment * fix ci error * fix ci error * fix docs bug * fix ci error * fix ci error Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * refine nn.graph autotest (#7111) * add linspace op * refine graph autotest * revert * add graph error trace * fix bug * fix autotest bug * auto format by CI * fix set_printoptions error * auto format by CI * CI test bug * auto format by CI * For CI * auto format by CI * For CI test * fix ci error * revert for ci * fix bug * fix ci error * fix bug * fix bug Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: Li Xiang <54010254+lixiang007666@users.noreply.github.com> Co-authored-by: lixiang <88304454@qq.com> * add oneflow/pytorch cudnn.deterministic (#7172) * add cudnn.deterministic * fix bug * auto format by CI * fix bug * fix generate fake program input bug * auto format by CI Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> * fix linalg vector norm scalar tensor print bug (#7178) * fix linalg vector norm scalar tensor print bug * auto format by CI Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> * format * refine * format Co-authored-by: Yinggang Wang <wyg19970408@gmail.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: 
oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: liufengwei0103 <2472937968@qq.com> Co-authored-by: daquexian <daquexian566@gmail.com> Co-authored-by: guo ran <360112263@qq.com> Co-authored-by: Peihong Liu <mosout@qq.com> Co-authored-by: ZZK <42901638+MARD1NO@users.noreply.github.com> Co-authored-by: Shenghang Tsai <jackalcooper@gmail.com> Co-authored-by: Li Xiang <54010254+lixiang007666@users.noreply.github.com> Co-authored-by: cheng cheng <472491134@qq.com> Co-authored-by: Li Xinqi <lixinqi2010@gmail.com> Co-authored-by: lichunyou <33850693+lcylcy@users.noreply.github.com> Co-authored-by: Luyang <flowingsun007@163.com> Co-authored-by: wyushun <wyushun@foxmail.com> Co-authored-by: zhu wang <33675639+olojuwin@users.noreply.github.com> Co-authored-by: leaves-zwx <kunta0932@gmail.com> Co-authored-by: Xiaoyu Zhang <35585791+BBuf@users.noreply.github.com> Co-authored-by: Shijie <821898965@qq.com> Co-authored-by: XIE Xuan <xiexuanx2@gmail.com> Co-authored-by: Juncheng <liujuncheng1022@gmail.com> Co-authored-by: binbinHan <han_binbin@163.com> Co-authored-by: CHI LIU <42956025+thinksoso@users.noreply.github.com> Co-authored-by: Yao Chi <later@usopp.net> Co-authored-by: ouyangyu <xuanjiuye@gmail.com> Co-authored-by: Xiaoyu Xu <xiaoyulink@gmail.com> Co-authored-by: PragmaTwice <i@twice.moe> Co-authored-by: luqiang guo <702572275@qq.com> Co-authored-by: ZeKai Zhou <30856589+zzk0@users.noreply.github.com> Co-authored-by: VertexC <bob2420083992@gmail.com> Co-authored-by: lixinqi <lixinqi0703106@163.com> Co-authored-by: fengdaozhuo <52237830+grybd@users.noreply.github.com> Co-authored-by: Zhenhua <huangzhenhua@zhejianglab.com> Co-authored-by: tangnana925 <85614052+tangnana925@users.noreply.github.com> Co-authored-by: tangnana <tnn_personal@163.com> Co-authored-by: Zhenhua <1209435+hengzi@users.noreply.github.com> Co-authored-by: lixiang <88304454@qq.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: Houjiang Chen 
<chenhoujiangcug@gmail.com> Co-authored-by: Yinggang Wang <wyg19970408@gmail.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: liufengwei0103 <2472937968@qq.com> Co-authored-by: daquexian <daquexian566@gmail.com> Co-authored-by: Peihong Liu <mosout@qq.com> Co-authored-by: ZZK <42901638+MARD1NO@users.noreply.github.com> Co-authored-by: Shenghang Tsai <jackalcooper@gmail.com> Co-authored-by: Li Xiang <54010254+lixiang007666@users.noreply.github.com> Co-authored-by: cheng cheng <472491134@qq.com> Co-authored-by: Li Xinqi <lixinqi2010@gmail.com> Co-authored-by: lichunyou <33850693+lcylcy@users.noreply.github.com> Co-authored-by: Luyang <flowingsun007@163.com> Co-authored-by: wyushun <wyushun@foxmail.com> Co-authored-by: zhu wang <33675639+olojuwin@users.noreply.github.com> Co-authored-by: leaves-zwx <kunta0932@gmail.com> Co-authored-by: Xiaoyu Zhang <35585791+BBuf@users.noreply.github.com> Co-authored-by: Shijie <821898965@qq.com> Co-authored-by: XIE Xuan <xiexuanx2@gmail.com> Co-authored-by: Juncheng <liujuncheng1022@gmail.com> Co-authored-by: binbinHan <han_binbin@163.com> Co-authored-by: CHI LIU <42956025+thinksoso@users.noreply.github.com> Co-authored-by: Yao Chi <later@usopp.net> Co-authored-by: ouyangyu <xuanjiuye@gmail.com> Co-authored-by: Xiaoyu Xu <xiaoyulink@gmail.com> Co-authored-by: PragmaTwice <i@twice.moe> Co-authored-by: luqiang guo <702572275@qq.com> Co-authored-by: ZeKai Zhou <30856589+zzk0@users.noreply.github.com> Co-authored-by: VertexC <bob2420083992@gmail.com> Co-authored-by: lixinqi <lixinqi0703106@163.com> Co-authored-by: fengdaozhuo <52237830+grybd@users.noreply.github.com> Co-authored-by: Zhenhua <huangzhenhua@zhejianglab.com> Co-authored-by: tangnana925 <85614052+tangnana925@users.noreply.github.com> Co-authored-by: tangnana <tnn_personal@163.com> Co-authored-by: Zhenhua <1209435+hengzi@users.noreply.github.com> Co-authored-by: lixiang <88304454@qq.com> * Use normalize instead of l2_normalize (#7113) * use normalize 
instead of l2_normalize * refine * fix l2_norm * reformat Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * add_eager_naive_s_to_p_boxing (#7203) * add_eager_naive_s_to_p_boxing * fix typo * minor fix * fix test case Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * add timeout for distributed run (#7286) * add timeout * strict timeout * quick fix * cmake: import gflags and glog using FetchContent (#7176) * cmake: import gflags and glog using FetchContent * cmake: use set_mirror_url_with_hash * fix THIRD_PARTY build * fix lib path * fix gflags * remove gflags * format * auto format by CI * fix xrt gflags * fix name * remove oneflow_exe_third_party_libs * remove PUBLIC * revert some changes * Update oneflow.cmake * fix so * fix so * remove Custom op test and Single client dry run test Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Prune parallel dim with val eq one in parallel dim reduce (#7257) * fix Resource::DumpCudnnConf * prune_parallel_dim_with_val_eq_one_in_parallel_dim_reduce * minor fix * refine Prune * refine * refine * minor fix * fix bug Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * add sync (#7294) * add polynomial scheduler (#7260) Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Fix autotest inplace bug, hardsigmod (#7276) * Fix autotest inplace bug, hardsigmod * Fix * Format * Fix * Fix kwargs * auto format by CI Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Use flowvision replace flow.utils.vision (#6612) * use flowvision * del flow.utils.vision * add flow.utils.data * refine * update version * refine * align clip grad with torch in error_if_nonfinite (#7304) Co-authored-by: 
oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Use cmake install to copy cpp api related files (#7200) * install cpp api * install mlir related files * clean * handle third party dependences * support cpack * fix * Update oneflow.cmake * fix * fix compiling error * refine * add exe test as deps * install third party * refine * refine * revert install dir * install third party * refine Co-authored-by: Shenghang Tsai <jackalcooper@gmail.com> * Add_Tensor.T_and_Tensor.t()_ops (#7269) * Add_Tensor.T_and_Tensor.t()_ops * Update single test and docs * Update single test * auto format by CI * Update tensor.T * recover requirements.txt * auto format by CI * Update check_graph=False Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Refactor release tensor (#7071) * refactor ReleaseTensor instruction * support CurrentDevVmDepObjectConsumeMode for ReleaseTensor * rm useless Touch instruction * reset speed test threshold Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: Yu OuYang <xuanjiuye@gmail.com> * Fix functional dropout and Docs (#7237) * fix addend to kwargs * fix to an extra * fix test * fix to use key word arguments Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Modify graph and 0-D Tensor (#7208) * Fix 0 dim bug * part1 * Fix * Fix * Add 0dim to 1dim function * Fix * Fix * Fix * Fix * Fix * Fix * Delete test_logical_not_with_0dim_data * Fix * Format * FIx * Fix * Fix * Update test_movedim.py * Update test_narrow.py * Test bug * Test bug * Fix graph bug Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Refactor OneRecReader to stateful op and provide Module api (#7271) * refactor read_onerec to nn.OneRecReader * fix * refine doc * auto format by CI Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: Yao Chi <later@usopp.net> 
Co-authored-by: Houjiang Chen <chenhoujiangcug@gmail.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * [bug] Adam align torch params (#7318) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * add channel-last to resnet50 graph ci test (#7253) * add channel-last to resnet50 graph ci test * refine Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Add oneTBB (#7213) * add tbb * refien * refine * refine * refine * revert * add tbb * success add tbb * tbb onednn ok * fix ninja onednn * component * install tbb include file * updata tbb master zip * fix md5 * refine * refjine * fix * cmake option * modified clang 10 OMP * add line * fix add OMP flags * fix * fix * fix OF_RUNTIME_TBB * refine * clean * fix Co-authored-by: jackalcooper <jackalcooper@gmail.com> Co-authored-by: mosout <mosout@qq.com> * Dev all op bool (#6962) * fix typo * dev all op bool type * add bool testcase * support bool for ops and kernels * bool api * functional bool * ndarray bool * fix conflict * fix tabel gen * fix * refine * fix * fix * fix * fix * fix * fix * fix * add broadcast_to_compatible bool kernel * fix * fix * fix split api * fix docstr * fix setitem bool * fix char * fix * fix * fix Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Fix leaf tensor backward error (#7331) * fix(Autograd): fix leaf tensor backward error * test(Autograd): add scalar leaf tensor backward test * Create tensor in jobpass after pulling plan (#7315) * fix(NNGraph): create tensor in jobpass after pulling plan * fix(NNGraph): remove useless sync and fix typo * refine error message Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * [BUG] suport 0D tensor in eager consistent (#7242) * suport 0D tensor in eager consistent * fixed_0Size_0D_bug_with_eager_consistent * fixed_0Size_0D_bug_with_eager_consistent * fixed a bug with flatten op in graph 
autotest * fixed a bug with flattenfunctor in graph autotest * fixed a bug with flatten op in graph autotest * add Notes * modifyed some check_graph=False * auto format by CI * modifyed check_graph=True * modifyed test_add.py * modifyed test_index_select.py * modifyed test_mul.py Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: Xiaoyu Xu <xiaoyulink@gmail.com> * Use cudnn maxpool when possible (#7333) * use_cudnn_maxpool_when_possible * restruct * refine * Remove `register_tensor_op` decorator and use `add_docstr`[part] (#7306) * Remove register_tensor_op decorator and use add_docstr * Fix eq * Fix ne * Fix lt * Fix le * Fix to_local * Fix * test * Resolve conflict * disable nccl release_tensor sequential (#7341) * Rm local dep object pool (#7131) * refactor ReleaseTensor instruction * remove LocalDepObject::logical_object * remove LocalDepObjectPool * refine code by profiling * support CurrentDevVmDepObjectConsumeMode for ReleaseTensor * rm useless Touch instruction * set default value of EagerNcclBroadcastOp::async_launch to true * flow._C.stream_touch (#7209) * flow._C.stream_touch * fix compiler complaints * reset speed test threshold * reserve more size for vector dep_objects * fix static checker complaint * stream_touch does nothing if inputs empty * do not run stream_touch if inputs empty Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: Yu OuYang <xuanjiuye@gmail.com> * Autotest add graph backward (#7270) * add graph backward run in autotest * format * revert * fix ci * auto format by CI * fix bug * fix bug * auto format by CI Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Cumprod (#7278) * add comprod * fix name * rename * fix when specified dim is 1 * add docstr * add docstr * refine * add WITH_CUDA * refine * refine * 
Update python/oneflow/framework/docstr/math_ops.py fix docstr Co-authored-by: Yao Chi <later@usopp.net> * Update python/oneflow/framework/docstr/math_ops.py fix docstr Co-authored-by: Yao Chi <later@usopp.net> * fix docstr * refine * refine * fix include * refine * refine * refine Co-authored-by: Yao Chi <later@usopp.net> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Migrate tile python layer to functor (#7305) * tile implement * migrate repeat * of format * align document with pytorch * change input args to *size * change with review comments * migrate tile python layer to functor * fix tile document * fix tile document * add document's function signature * auto format by CI * add repeat document signature * fix document * fix document Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> * fix (#7338) * update_version_of_flowvision (#7346) * Release tensor storage as soon as possible (#7287) * Release tensor storage as soon as possible * refine Co-authored-by: Luyang <flowingsun007@163.com> * Dev eager consistent autotest (#7204) * add doc for pybind type * eager consistent autotest * align placement repr and api * export necessary apis to pickle object * broadcast rank 0 to other rank if consistent test * update consistent add unittest * refine * cmake: import gtest using FetchContent (#7292) * cmake: import gflags and glog using FetchContent * cmake: use set_mirror_url_with_hash * fix THIRD_PARTY build * fix lib path * fix gflags * remove gflags * format * auto format by CI * fix xrt gflags * fix name * remove oneflow_exe_third_party…

wyg1997 and others added 5 commits January 13, 2022 12:25

feat(Parameter): Parameter support both inplace op and setter

c47fda2

feat(Tensor): tensor support data's getter interface

4c47583

test(Parameter): add getter test

94595a2

debug

49e3193

add test

abc42e1

strint requested review from chengtbf, daquexian and jackalcooper as code owners January 13, 2022 14:59

strint added 4 commits January 13, 2022 23:10

open flatten graph test

bef91d5

Merge branch 'master' into feat/eager_tensor_to_graph_out_and_inplace

b2f4c33

add validated flase type

6149ce6

Merge branch 'feat/eager_tensor_to_graph_out_and_inplace' of https://github.com/Oneflow-Inc/oneflow into feat/eager_tensor_to_graph_out_and_inplace

fe6e92e

strint requested a review from BBuf January 13, 2022 15:27

strint added graph graph mode system test feature labels Jan 13, 2022

Merge branch 'master' into feat/eager_tensor_to_graph_out_and_inplace

e111000

chengtbf reviewed Jan 14, 2022

oneflow/core/framework/op_interpreter/lazy_op_interpreter.cpp (outdated, resolved)

chengtbf approved these changes Jan 14, 2022

strint added 3 commits January 14, 2022 11:07

refine

6121b3c

Merge branch 'feat/eager_tensor_to_graph_out_and_inplace' of https://github.com/Oneflow-Inc/oneflow into feat/eager_tensor_to_graph_out_and_inplace

bb428d3

foramt

713400d

strint commented Jan 14, 2022

oneflow/core/framework/op_interpreter/lazy_op_interpreter.cpp (resolved)

Merge branch 'master' into feat/eager_tensor_to_graph_out_and_inplace

403a77b

strint requested a review from oneflow-ci-bot January 14, 2022 07:21

strint mentioned this pull request Jan 14, 2022

[BUG] Enabling graph for hardsigmoid (under nn.functional) raises an error (test_functional_hardsigmoid_with_random_data) #7262

Closed

BBuf reviewed Jan 14, 2022

python/oneflow/test_utils/automated_test_util/torch_flow_dual_object.py

BBuf approved these changes Jan 14, 2022

strint added the automerge label Jan 14, 2022

Merge branch 'master' into feat/eager_tensor_to_graph_out_and_inplace

a18830a

oneflow-ci-bot requested review from oneflow-ci-bot and removed request for oneflow-ci-bot January 15, 2022 14:37

strint requested a review from oneflow-ci-bot January 15, 2022 16:39

oneflow-ci-bot removed their request for review January 15, 2022 17:45

strint requested a review from oneflow-ci-bot January 16, 2022 02:03

oneflow-ci-bot removed their request for review January 16, 2022 02:26

strint requested a review from oneflow-ci-bot January 16, 2022 05:28

oneflow-ci-bot removed their request for review January 16, 2022 05:30

strint requested a review from oneflow-ci-bot January 16, 2022 07:06

oneflow-ci-bot removed their request for review January 16, 2022 07:12

strint requested a review from oneflow-ci-bot January 16, 2022 08:53

github-actions bot removed the automerge label Jan 16, 2022

oneflow-ci-bot removed their request for review January 16, 2022 10:52

Merge branch 'master' into feat/eager_tensor_to_graph_out_and_inplace

bbcab0a

strint requested a review from oneflow-ci-bot January 17, 2022 01:48

strint added the automerge label Jan 17, 2022

oneflow-ci-bot removed their request for review January 17, 2022 05:10

Merge branch 'master' into feat/eager_tensor_to_graph_out_and_inplace

997c85f

oneflow-ci-bot self-requested a review January 17, 2022 06:06

Merge branch 'master' into feat/eager_tensor_to_graph_out_and_inplace

d616df4

oneflow-ci-bot requested review from oneflow-ci-bot and removed request for oneflow-ci-bot January 17, 2022 12:55

oneflow-ci-bot merged commit 4ed6784 into master Jan 17, 2022

oneflow-ci-bot deleted the feat/eager_tensor_to_graph_out_and_inplace branch January 17, 2022 15:43

strint mentioned this pull request Jan 19, 2022

Inplace mul check_graph=True unittest failed #7299

Open
