Use normalize instead of l2_normalize #7113

mosout · 2021-12-27T07:50:28Z

No description provided.
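For context on the title, here is a minimal sketch of what L2 normalization computes: a vector divided by its Euclidean norm. It is not taken from this PR or from OneFlow's implementation. The function name `l2_normalize_sketch` and the `eps` clamp are assumptions made for illustration, following the common convention of guarding against division by zero.

```python
import math


def l2_normalize_sketch(vec, eps=1e-12):
    """Scale vec to unit L2 norm (hypothetical helper, not OneFlow's API).

    eps is an assumed guard so an all-zero vector does not divide by zero.
    """
    norm = math.sqrt(sum(v * v for v in vec))
    denom = max(norm, eps)
    return [v / denom for v in vec]


unit = l2_normalize_sketch([3.0, 4.0])
print(unit)  # [0.6, 0.8]
```

Because of the clamp, an all-zero input comes back unchanged instead of raising an error.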

olojuwin · 2022-01-14T08:48:40Z

With this change, speed is now roughly on par with before.

* add ddp return type (#7232) * add dpp return type * add comment * auto format by CI Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Parameter support both inplace op and setter (#7249) * feat(Parameter): Parameter support both inplace op and setter * feat(Tensor): tensor support data's getter interface * test(Parameter): add getter test Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix(*): fix sbp filter function bug (#7229) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * refine (#7240) * Eager boxing status (#7150) * add eager boxing status * refine MakeBoxingInterpreterStatus * add blank line * del EagerBoxingCall * refine BoxingInterpreterStatus * refine BoxingInterpreterStatus * add eager boxing log * minor fix * minor fix * revert removed file * add indent arg * rename indent to prefix * solve comment * refine eager_boxing_logger * use Global<const EagerBoxingLogger> * minor fix Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix empty bug (#7239) * fix empty bug * simplify empty Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix empty debug str of hob primitive (#7245) * fix empty debug str of hob primitive Signed-off-by: daquexian <daquexian566@gmail.com> * fix 'OF_PP_STRINGIZE(op)' Signed-off-by: daquexian <daquexian566@gmail.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: ZZK <42901638+MARD1NO@users.noreply.github.com> * Add VSCode dev container (#7233) * add dev container * use oneflow/devcontainer * add settings for new lines and trailing ws * refine docs * add eol setting to config * Add '"--gpus", "all"' if running a CUDA image * set BUILD_HWLOC off in fast cmake init cache * Skip send and recv if dst and src are same. 
(#7255) * Maxpool op nhwc (#7214) * maxpool2d_support_nhwc * refine * add test case * format * refine * refine * fix comments * Implement consistent tensor detach (#7265) * Feat/zero optimization in nn.Graph (#7165) * debug * modify graph.py * fix bug about graph debug interface * Fix nn graph variable bind (#6895) * fix(AutoParallel): nn.Graph support auto_parallel change sbp * fix(AutoParallel): use tensor.set_data interface and add print sbp info * add comment * hack check * add test * refine test * refine test * refine code * add and refine zero * fix test * refine code * rm debug log * refine min size set * add note * debug zero * fix cudnn config * refine test doc * add comment of check * eager mode in graph pass * format * rebuid parameter according to sbp in synced plan * auto format by CI * fix code check * fix test * try init session at graph init * refine and revert session init * rm useless code * add back print of sys conf Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: grybd <52237830+grybd@users.noreply.github.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: wyg1997 <wyg19970408@gmail.com> * fix linspace limit bug (#7236) * fix linspace limit bug * auto format by CI Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: Liang Depeng <liangdepeng@gmail.com> * replace throw by OF_UNIMPLEMENTED or UNIMPLEMENTED, refine error message, replace CHECK by CHECK_OR_RETURN (#7121) * replace throw by OF_UNIMPLEMENTED in dim_scatter_ops.cpp * replace throw by OF_UNIMPLEMENTED in scatter ralated kernels * replace throw by OF_UNIMPLEMENTED in scatter ralated kernel * replace glog CHECK by oneflow CHECK_OR_RETURN * refine error message on modified UNIMPLEMENTED * replace CHECK by CHECK_OR_RETURN in dim_scatter_ops.cpp * refine error message on modified UNIMPLEMENTED * refine error message on 
modified UNIMPLEMENTED * refine error message on modified UNIMPLEMENTED * remove std::endl, add period, remove redundant maybe.h including * remove std::endl, add period Co-authored-by: Yao Chi <later@usopp.net> * Remove single client from CI (#7274) * remove single client ci * update get-oneflow * rm changed_files * refine workflow * Revert "refine workflow" This reverts commit f9cdcadf63f4634177471a06be5a2aa49e87df68. * Update test.yml * refine * refine * refine * reorder * rm changed_files * refine * add CHANGELOG.md * refine * Feat/eager tensor to graph out and inplace (#7254) * feat(Parameter): Parameter support both inplace op and setter * feat(Tensor): tensor support data's getter interface * test(Parameter): add getter test * debug * add test * open flatten graph test * add validated flase type * refine * foramt Co-authored-by: wyg1997 <wyg19970408@gmail.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Optimize LayerNorm backward param grad (#6996) * layer_norm forward * test case * rm useless * layer_norm backward dx * layer norm param grad * int count to T count * fix * fix T mask to int mask, refine code * refine * refine * test case * refine * format * fix * add dtype bfloat16 * refine * refine * refine * refine * sum_loss to sum_stats * x_buf to normalized_buf * refine * refine * address review * refine * add testcase * double use uncached impl to reduce compile time * Fix python apis and xla implementation (#7183) * Support save/load for lr_scheduler (#6948) * feat(LrScheduler): support save/load for lr_scheduler * refine document * auto format by CI * Refine test * auto format by CI Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> * Fix eye_op attr (#6973) * fix * add graph test * Update python/oneflow/test/graph/test_graph_eye.py Co-authored-by: daquexian <daquexian566@gmail.com> * refine * Update 
python/oneflow/test/graph/test_graph_eye.py Co-authored-by: daquexian <daquexian566@gmail.com> * auto format by CI Co-authored-by: daquexian <daquexian566@gmail.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> * softmax double use uncached impl to accelerate compile (#6992) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Add [[nodiscard]] for cpp api (#6997) * add [[nodiscard]] * refine * reformat Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Support Arange delta to decide dtype (#6998) * support delta dtype to decide output dtype * add more unittest Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Add clang as CUDA FE compiler in CI (#6954) * update action use * refine * refine * refine * refine * refine * refine * refine * refine * refine * refine * refine * refine * fix * add 80 and 86 * refine * refine * add CUDA_NVCC_THREADS_NUMBER * refine * address review * set CUDA_NVCC_THREADS_NUMBER 8 * fix * fix clang in init cmake * add script * refine * refine * refine * refine * refine * refien * refine * add flags to skip zlib * refine * refine * refine * refine * refine * refine * refine * refine * refine * refine * refine * refine * refine * refine * refine * refine * Migrate chunk python layer to functor (#6983) * Migrate chunk Python layer logic to functor * fix runtime * Fix splits bug and CI * Modify push to emplace Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Reduce memory usage when compiling oneflow dialect ops (#7000) * CudaAllocator device reset before OOM (#6976) * CudaAllocator device reset before OOM * Add NOTE Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Refactor vm stream desc (#6989) * remove StreamDesc::num_machines * Prepare one thread for one stream_type Co-authored-by: 
oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Add Diagonal Op (#6016) * format complete * python to cpp * py2cpp error * rm * auto format by CI * revise * auto format by CI * license * docstring * docstring * tensor * tensor attribute * auto format by CI * docstring * revise * test * revise * revise * rename * half * docs * doc,test * test times * revise * format Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * add all to all op (#6283) * add all to all op * add barrier * format * add import * fix test * delete barrier * delete barrier * Revert "delete barrier" This reverts commit aa397ea5ba815fe6df883b263b82735f126345c8. * Revert "delete barrier" This reverts commit 7ddf79afaa7ac072813e84ce9224440939a3f95c. * check tensor meta between ranks * add more assert * all_reduce operate in place * all_reduce operate in place * fix bug * assert tensor.is_local * fix bug in scatter * add more assert * delete meta check * add pytorch comparison test * add pytorch comparison test * refine * add ONEFLOW_TEST_CPU_ONLY * fix bug from torch gloo Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Dev ivalue for cpp api (#6890) * add api tensor * refine * add nn.relu * refine * clean shape & refine relu test * support void* for from_blob * add multithreading relu test * refine test * refine * refine * add comment for __internal_tensor() * convert to copy_util * reformat * refine * add ivalue * refine directory structure * refine cpp api test * refine test * add ivalue * refine ivalue * refine ivalue * refine * refine * refine * refine Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * default use cpu generator (#7001) * optimize reshape/slice/transpose functor (#6956) * optimize reshape/slice/transpose functor * update code according to reviewer's suggestion * judge negative dimension number besides -1 * judge 
negative shape value in view::Reshape * remove is_full_slice logic in SliceFunctor * update code according to yinggang's advice * move ordered permute judge to TransposeKernel * remove print sentence * abstract IsOrderedPermute func * support negative permute value in TransposeFunctor * delete tranpose_kernel optimization * Revert "delete tranpose_kernel optimization" This reverts commit e026434dc7c1ebad948c76bde475540e3bf4477a. * not return original tensor when reshape do nothing * simplify code * correct spell error Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix IsContinuosSubspace error (#6968) * fix IsContinuosSubspace error * recover original IsContinuosSubspace code * add test case * auto format by CI Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * add cpu group deconve impl (#6980) * add cpu group deconv impl * remove useless lines * remove useless lines * add deconv2d import * add groups test * remove check_allclose=False * add tf_prelu * add cpu group deconv impl * remove useless lines * remove useless lines * add deconv2d * add groups test * remove check_allclose=False * add tf_prelu * auto format by CI * add deconv2d impl * add deconv2d impl * remove useless lines * add deconv2d in functional api * auto format by CI * auto format by CI * Add variable initial * Add variable initial * auto format by CI * add conv2d impl * add conv2d impl * auto format by CI * remove useless lines Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> * Migrate the python layer logic of broadcastlike to functor (#7007) * Migrate the python layer logic of broadcastlike to functor * add var name Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Temporarily skip comm test cases (#7015) * Temporarily skip comm test cases * auto format by CI 
Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Fix nd_sbp attribute type and set nd_sbp in random functors (#7017) * fix * fix compile Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Save Job to IR and load Job from IR (#6885) * save to ir * test * fix bugs * impl load and test * rm useless code * fix conflict * fix issues * JobOp * fix issues * fix test_fuse_tril_scale * fix test jit-outline-func * fix test_mlir_opt.py * save * fix ods gen for max and avg pool * rename oneflow to oneflow_foundation * fix files checks * refine * refine * refine * refine * refine * refine * refine * refine * refine * refine * refine * refine * refine * refine * refine * refine * auto format by CI * check in changes * refine * Update oneflow/ir/test/OneFlow/test_mlir_opt.py * Update oneflow/ir/include/OneFlow/OneFlowOps.td * refine includes * printer & parser & verifier * code tidy * tidy include * address review * rm duplicated GetDataTypeType * TensorSource trait Co-authored-by: jackalcooper <jackalcooper@gmail.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> * Fix Simple CI linkage (#6986) * fix-simple-ci-linkage * refine * refine * fix * refine * refine * refine * refine * refien * refine * revert * refine * auto format by CI * refine * revert * refine Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> * fix sbp when weight is optional (#6984) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Feat from numpy (#7013) * feat(Tensor): support share memory with ndarray * test(FromNumpy): add test * enhancement test and add document * Fix merge error * fix bug in numpy c api * Fix(doctest): fix doctest error Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Add custom ShapeAttr in ODS (#7023) * add ShapeAttr * refine * fix doc * refine * fix (#7028) * Add linspace op (#7006) * add 
linspace op * refine doc * refine * fix comments * fix comment * auto format by CI * fix ci doc error Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix fasterrcnn infer (#7014) * fix fasterrcnn infer * roi_align 0shape * refine * refine Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * separate kernel state and cache (#6655) * support eager state except lazy dynamic Signed-off-by: daquexian <daquexian566@gmail.com> * modularize kernel contexts Signed-off-by: daquexian <daquexian566@gmail.com> * fix warning Signed-off-by: daquexian <daquexian566@gmail.com> * reformat Signed-off-by: daquexian <daquexian566@gmail.com> * remove duplicated license Signed-off-by: daquexian <daquexian566@gmail.com> * fix static check error Signed-off-by: daquexian <daquexian566@gmail.com> * make test gpu only Signed-off-by: daquexian <daquexian566@gmail.com> * temp Signed-off-by: daquexian <daquexian566@gmail.com> * revert opkernel context changes, align with master Signed-off-by: daquexian <daquexian566@gmail.com> * reformat Signed-off-by: daquexian <daquexian566@gmail.com> * refine cachecontext Signed-off-by: daquexian <daquexian566@gmail.com> * add separate cache context inferface, remove out-dated files Signed-off-by: daquexian <daquexian566@gmail.com> * add init and cache context aliases Signed-off-by: daquexian <daquexian566@gmail.com> * update eager kernel Signed-off-by: daquexian <daquexian566@gmail.com> * fix wrong AttrMayChanged value Signed-off-by: daquexian <daquexian566@gmail.com> * rename and add comment Signed-off-by: daquexian <daquexian566@gmail.com> * auto format by CI * fix combined_margin_loss_kernel.cpp Signed-off-by: daquexian <daquexian566@gmail.com> * rename op_kernel_state_wrapper.h to op_kernel_wrapper.h Signed-off-by: daquexian <daquexian566@gmail.com> * rename more classes, fix old cache in stateful op kernel Signed-off-by: daquexian 
<daquexian566@gmail.com> * rename more classes Signed-off-by: daquexian <daquexian566@gmail.com> * may changed -> not changed Signed-off-by: daquexian <daquexian566@gmail.com> * optimize away genrepeatedbn Signed-off-by: daquexian <daquexian566@gmail.com> * reformat Signed-off-by: daquexian <daquexian566@gmail.com> * refine Signed-off-by: daquexian <daquexian566@gmail.com> * update stateful local opkernel, use Cache** if possible Signed-off-by: daquexian <daquexian566@gmail.com> * remove TensorDesc4ArgNameAndIndex base method Signed-off-by: daquexian <daquexian566@gmail.com> * auto format by CI * fix clang-tidy error Signed-off-by: daquexian <daquexian566@gmail.com> * auto format by CI * fix conv kernel bug Signed-off-by: daquexian <daquexian566@gmail.com> * auto format by CI * fix group conv bug and fix warning Signed-off-by: daquexian <daquexian566@gmail.com> * fix avgpool error Signed-off-by: daquexian <daquexian566@gmail.com> * fix maxpool error Signed-off-by: daquexian <daquexian566@gmail.com> * auto format by CI * respect flag in deconv cpu kernel, rename cache to cache_ptr Signed-off-by: daquexian <daquexian566@gmail.com> * fix compile error Signed-off-by: daquexian <daquexian566@gmail.com> * auto format by CI * fix deconv cache bug Signed-off-by: daquexian <daquexian566@gmail.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: Li Xinqi <lixinqi2010@gmail.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Add fully support for all datatype (#7025) * add fully support for all datatype * Use max array size * add clang-format off to maintain the matrix * fix format * remove redundant numpy dtype Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Migrate split python layer to functor (#7030) * Migrate split python layer to functor * modify dim Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Add add_sparse_optimizer for Graph (#6988) * 
add_sparse_optimizer * format * fix bug * refine new interface by discuss * auto format by CI * address review * correct syntax * correct error message * rm debug print * auto format by CI * fix cpu-only test Co-authored-by: XIE Xuan <xiexuanx2@gmail.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Refine RUN_CUDA_KERNEL (#7003) * Refine RUN_CUDA_KERNEL * Added LaunchConfig Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Support llvm in tree build (#6995) * refine * refine * refine * refine * add61 * refien * refine * refine * refine * refine * refien * refine * refine * refine * refine * refine * refine * refine * rm * revert * refine * refine * refine * refine * return_self_in_to_consistent_if_necessary (#7004) * return_self_in_to_consistent_if_necessary * fix error and add test case * skip cpu test * fix error Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Decouple ep and global (#7027) * Decouple ep and global * NOLINT * fix * fix import Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * arange doc fix (#7035) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * add_consistency_check_in_consistent_tensor_set_data (#7002) * add_consistency_check_in_consistent_tensor_set_data * auto format by CI * minor fix * add just wrap Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * [cmake] add liboneflow_cpp target (#7005) * add cmake changes for liboneflow_cpp.so Signed-off-by: daquexian <daquexian566@gmail.com> * add separate target for cpp api test Signed-off-by: daquexian <daquexian566@gmail.com> * add cpp api test in ci Signed-off-by: daquexian <daquexian566@gmail.com> * reverse the order of cudnn and cuda library Signed-off-by: daquexian 
<daquexian566@gmail.com> * update logic of BUILD_MONOLITHIC_LIBONEFLOW Signed-off-by: daquexian <daquexian566@gmail.com> * rename BUILD_MONOLITHIC_LIBONEFLOW to BUILD_MONOLITHIC_LIBONEFLOW_CPP_SO Signed-off-by: daquexian <daquexian566@gmail.com> * share lib directory in test container Signed-off-by: daquexian <daquexian566@gmail.com> * add github actions debug Signed-off-by: daquexian <daquexian566@gmail.com> * Revert "add github actions debug" This reverts commit 7d9aef684a479285c690f38d25525c9b97865e45. * add upterm debug after exe test Signed-off-by: daquexian <daquexian566@gmail.com> * sleep after fail Signed-off-by: daquexian <daquexian566@gmail.com> * set LD_LIBRARY_PATH in yml for cpp api test exe Signed-off-by: daquexian <daquexian566@gmail.com> * sleep Signed-off-by: daquexian <daquexian566@gmail.com> * upload liboneflow_cpp.so Signed-off-by: daquexian <daquexian566@gmail.com> * modify cmake to trigger compilation Signed-off-by: daquexian <daquexian566@gmail.com> * remove sleep Signed-off-by: daquexian <daquexian566@gmail.com> * build cpp api in cpu mode Signed-off-by: daquexian <daquexian566@gmail.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Fix CUDA 52 and add it to CI (#7031) * refine * refine * refine * refine * revert * fix * refine * refine * refine Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Add check of placement constructor (#6991) * add_check_of_placement_constructor * move CheckDeviceIdsIsValid to runtime * handle comment * fix error * fix error Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Fix(FromNumpy): fix bug in stride (#7042) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * add non virtual destructor back (#6999) Signed-off-by: daquexian <daquexian566@gmail.com> Co-authored-by: Houjiang Chen <chenhoujiangcug@gmail.com> * move python code to cpp: eye (#7036) * 80% Sbp signature left to 
finish * refine functional_api.yaml * 90% docstr left to update * refine * add sbp check * refine docs * auto format by CI * refine * refine docstr * auto format by CI Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Fix l2norm block_size (#7044) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix undefined symbol: cudaGetDeviceCount (#7052) * fix_worker_orphan_process (#7048) * fix_worker_orphan_process * use SIGTERM instead * broadcast elemwise binary (#6871) * add * broadcast elementwise binary * fix * refine * fix * refine * refine * for compile * refine * refine * refine * refine * refine * revert kernels * revert kernel * refine * refine * refine * refine * nvcc thread to 4 Co-authored-by: ZZK <42901638+MARD1NO@users.noreply.github.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Source op per critical section (#6472) * backup code * EventRecord * auto format by CI * backup code * remove deprecated binary test cases * refactor valatile to atomic * add StreamType::InitInstructionStatusIf/StreamType::DeleteInstructionStatusIf * merge from branch profiling_nn_graph * address comments * EventRecordProvider * more comments for XXXStatusQuerier::SetLaunched * more comments for SharedEventRecord::Init * wait source op per critical section * rename a task_node.cpp * minor fix * backup code * fix compiler complaints * 1) remove AddCtrlEdgeBetweenSrcDstTickAndInputOutputInSameRank; 2) create CriticalSectionInstance buffers * fix compiler complaints * more profiler code * refactor vm preschedule * TryMoveFromWaitingToReady * revert flying_instruction_cnt * revert to single position to call DispatchInstruction * revert several code * reset instruction watermark * remove is_xxx_hook_empty * build with profiler * merge master * insert device ticks before and after critical sections * refactor register_num of 
cs_wait/cs_callback from 2 to 128 * fix static analysis complaints * fix complier complaints about JobBuilder::ParallelConf4OpName * Update oneflow/core/operator/critical_section_wait_tick_op.cpp Co-authored-by: daquexian <daquexian566@gmail.com> * address pr comments * add job example for InstructionsBuilder::LaunchLazyJob * address pr comments Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: ouyangyu <xuanjiuye@gmail.com> Co-authored-by: daquexian <daquexian566@gmail.com> * More details of error of getting op matched sbp signature (#7077) * more details of error msg * minor change * address review comment * avoid namesake iterator * Module apply only once (#7055) * add once apply of param * apply once on buffer * test reuse var on module to * test resue var * rm useless test * finish test * refine test Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * distributed test bugfix (#7057) * change spawn_shell to spawn_shell_and_check, sleep in script Signed-off-by: daquexian <daquexian566@gmail.com> * fix distributed test master addr Signed-off-by: daquexian <daquexian566@gmail.com> * remove sleep Signed-off-by: daquexian <daquexian566@gmail.com> * spawn_shell -> spawn_shell_ignoring_failure Signed-off-by: daquexian <daquexian566@gmail.com> * auto format by CI * fix bug Signed-off-by: daquexian <daquexian566@gmail.com> * auto format by CI * fix the reversed logic Signed-off-by: daquexian <daquexian566@gmail.com> * improve error msg Signed-off-by: daquexian <daquexian566@gmail.com> * resolve name conflict of MASTER_ADDR Signed-off-by: daquexian <daquexian566@gmail.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix promote_type matrix (#7066) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix chunk op dim=-1 
bug (#7073) * fix chunk op dim=-1 bug * Update oneflow/core/functional/impl/array_functor.cpp Co-authored-by: Xiaoyu Zhang <35585791+BBuf@users.noreply.github.com> * Update oneflow/core/functional/impl/array_functor.cpp Co-authored-by: Xiaoyu Zhang <35585791+BBuf@users.noreply.github.com> Co-authored-by: Xiaoyu Zhang <35585791+BBuf@users.noreply.github.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Fix resource desc dump cudnn conf bug (#7038) * fix Resource::DumpCudnnConf * fix typo and error msg Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix concat bug (#7075) * fix * support concat single input * Clean TensorNameScope after graph build (#7076) * Clear tensor name scope after graph build * Add test case of 2 graph caught same free eager tensor * auto format by CI Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix_abnormal_printing (#7099) * Fix bias add dropout fuse (#7081) * fix bias_add dropout fuse when p=0.0 * remove redundant op Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Support 1d to 2d eager boxing (#7083) * fix Resource::DumpCudnnConf * support_1d_to_2d_eager_boxing * rename stack to unflatten * add test case * of format * refine test case * Revert "fix Resource::DumpCudnnConf" This reverts commit f07278d71e3f344f435fc8f116a12cbd1c099b54. 
* support nd to 1d * add 2d to 1d test case Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Implement all User Ops with Op Schema (#7032) * add oneflow-tblgen: generate op schema (OpInterpCtx) from ods * cmake: add inja * tblgen: add oneflow_datatype * tblgen: use option cat * tblgen: fix error * tblgen: put impl in .cpp * tblgen: fix null attrs * tblgen: fix null ops * refine * refine * reifne * Refine op schema template and compilation * add base OpInterpCtx to finish compilation * fix * refine * fix * add custom infer code * generate op registrants automatically * refine * fix * update user op ods and fix shape attr * refine * refine * add custom code in op base * refine comments * add same_output_regst_num and infer * support declare hasxx * update op schema emitter * refine * emit output regist num * refine * refine * migrate acc op * migrate onerec_reader, ones_like, send, pack and padding ops * add has_sbp_signature_infer_fn * refine * migrate pad, parallel_cast, partial_fc and pooling ops * rm redundant has_device_infer_fn * migrate prelu, quantization, randperm, reduce and repeat ops * migrate reshape, reshape_like, roi_align, same_pad, selu and scalar related ops * back port * backport * migrate ops * refine * refine * refine * refine * add new op * fix llvm not found * fix mlir headers * fix mlir headers * fix llvm not found * irefine * mark override * fix merge * fix * fix * set op schema as obj lib to speed up * rewrite ops * add addn * add grdi * refien * add more def (#7051) * affine grid * refien * refine * refine * refine * fix * refien * refine * refine * refine * refine * refine * refien * refine * refine * refein * refine * refine * refine * refine * refien * refine * refine * refine * refien * refien * refien * refine * refine * refien * refine * refine * refine * refein * refine * refine * refine * refine * refine * refien * refine * refine * refine * refine * refine * refine * refine * refine * refine * 
refine * refine * refine * refine * refine * refine * refine * refine * refein * refine * refine * refine * move more ops * fix math_binary_broadcast/elementwise_ops * fix hardtanh * add norm * rename file and add CpuOnly no_grad * fix ir & fix norm op * fix oneflow-tblgen * fix math_unary_elementwise_op * fix norm * fix bn * fix op schema * refine * fix * refine physical_tensor_desc_infer_fn * refine * add ScalarLogicalNotEqualOp & RecvOp * refine * auto format by CI * fix fmt * add cuda only trait * delete unused inja * del inja_copy_headers_to_destination * delete unused inja * del inja_copy_headers_to_destination * add cuda only to tblgen * fix json inja url and md5 not used * fix json inja url and md5 not used * refine * revert * add with cuda * refine * delete GenUserOpODS * remove cuda only * revert cuda only after meeting * fix Co-authored-by: PragmaTwice <i@twice.moe> Co-authored-by: hjchen2 <chenhoujiangcug@gmail.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> * Feat/debug pass (#7054) * add pass debug * debug pass * refine comment of fuse add pass * auto format by CI Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Fix error message (#6930) * fix error message * fix dot doc * fix dot elem cnt * auto format by CI Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix simple ci: add of_op_schema target to tidy check (#7105) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Rename AnyType in .td (#7109) * AnyType => Tensor * refine * refine Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Feat graph reuse var (#7080) * add once apply of param * apply once on buffer * test reuse var on module to * test resue var * rm useless test * finish test * refine test * Clear tensor name scope after graph build * Add test 
case of 2 graph caught same free eager tensor * auto format by CI * refactor var build draft * add full func; add check * done * add test of call parameter ousite its moudule * fix break test Co-authored-by: chengtbf <472491134@qq.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Fix l2_normalize & add nn.functional.normalize (#6940) * fix l2_normalize * add normalize * add test for normalize * refine * clean l2_normalize and refine normalize * simplify normalize test * Fix l2norm block_size * refine Co-authored-by: Juncheng <liujuncheng1022@gmail.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Align api in swin transformer (#7058) * add linspace op * fix align error in swintransformer * add @ magic method * fix conflict * support tensor list * fix meshgrid bug * revert Co-authored-by: hjchen2 <chenhoujiangcug@gmail.com> * set CMAKE_LINK_DEPENDS_NO_SHARED to ON (#7063) Signed-off-by: daquexian <daquexian566@gmail.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Add other api graph autotest (#7091) * Clear tensor name scope after graph build * Add test case of 2 graph caught same free eager tensor * auto format by CI * add other api graph autotest * add more samples * fix comments * refine * refine * refine * refine * refine * fix error * fix test error * fix bug * fix flip bug * fix bug * fix bug * fix ci bug * fix ci error * fix bug * fix ci error Co-authored-by: chengtbf <472491134@qq.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: Li Xiang <54010254+lixiang007666@users.noreply.github.com> * [serving] dev graph run (#7008) * add cmake changes for liboneflow_cpp.so Signed-off-by: daquexian <daquexian566@gmail.com> * add separate target for cpp api test Signed-off-by: daquexian 
<daquexian566@gmail.com> * add cpp api test in ci Signed-off-by: daquexian <daquexian566@gmail.com> * graph run * reverse the order of cudnn and cuda library Signed-off-by: daquexian <daquexian566@gmail.com> * update logic of BUILD_MONOLITHIC_LIBONEFLOW Signed-off-by: daquexian <daquexian566@gmail.com> * rename BUILD_MONOLITHIC_LIBONEFLOW to BUILD_MONOLITHIC_LIBONEFLOW_CPP_SO Signed-off-by: daquexian <daquexian566@gmail.com> * refine * [draft] implement graph parameter load and save (#7010) * implement parameter save (python) and load (c++) Signed-off-by: daquexian <daquexian566@gmail.com> * revert accident changes Signed-off-by: daquexian <daquexian566@gmail.com> * fix circular reference Signed-off-by: daquexian <daquexian566@gmail.com> * pimpl * batching * share lib directory in test container Signed-off-by: daquexian <daquexian566@gmail.com> * fix typo; * add github actions debug Signed-off-by: daquexian <daquexian566@gmail.com> * Revert "add github actions debug" This reverts commit 7d9aef684a479285c690f38d25525c9b97865e45. 
* add upterm debug after exe test Signed-off-by: daquexian <daquexian566@gmail.com> * sleep after fail Signed-off-by: daquexian <daquexian566@gmail.com> * set LD_LIBRARY_PATH in yml for cpp api test exe Signed-off-by: daquexian <daquexian566@gmail.com> * refine * add test file && input order * sleep Signed-off-by: daquexian <daquexian566@gmail.com> * upload liboneflow_cpp.so Signed-off-by: daquexian <daquexian566@gmail.com> * modify cmake to trigger compilation Signed-off-by: daquexian <daquexian566@gmail.com> * load job from ir && clean && add mlir model * [remove useless python code]save to .pb * add target of_common_obj to remove duplicate REGISTER_PASS && run of_format * remove openvino * remove openvino test * refine * IValue * Update oneflow/api/cpp/framework/graph.h Co-authored-by: daquexian <daquexian566@gmail.com> * refine * refine * refine * refine * refine * refine * rename in oneflow.cmake * refine oneflow.cmake * make of_api_common object library * move device util function in api to core * remove device check in New and ThreadLocalGetOrNew * refine * fix device test * refine graph test * refine GetExeDir() * refine GetExeDir() again * fix * refine * fix Co-authored-by: daquexian <daquexian566@gmail.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: mosout <mosout@qq.com> * disable autograd in lazy mode (#7070) * disable autograd in lazy mode * refine * Fix/rand source op in graph (#7092) * add test * fix rand consistent * add test * Fix powf (#7106) * quick fix power * add int scalar test case Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Dispatch stateful ops using functional api (#7046) * Dispatch functional stateful ops * fix * fix cmake * fix * disable attr check since it may not given when creating op expr. 
* fix * fix * fix * fix * fix * fix * fix * fix * refine Co-authored-by: VertexC <bob2420083992@gmail.com> * Fix HWLoc memory affinity (#7115) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * add_env_api_docs (#7100) * add_env_api_docs * minor fix * fix grammatical errors Co-authored-by: Yao Chi <later@usopp.net> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * tmp skip s0 print because of slice (#7065) * tmp skip s0 print because of slice * tmp skip s0 print in test case * fix Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * indexing first version (#7012) * indexing first version * complete * test * out loop * test skip * revise * revise * shape * docs * formatted * confict1 * confict2 * confict2 * confict * revise * auto format by CI Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> * fix maybe: add Maybe(T&&) to allow constructing from rvalue T (#7125) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * autotest_add_graph_log (#7126) * Meta info consistency check (#7085) * meta_info_consistency_check * refine check function * Update consistent_cast.cpp * move check to opinterpreter * refine * add note * refactor MetaInfoConsistencyCheck * of_format * refine * NonRecursiveMetaInfoConsistencyCheck * fix func name * add IsMetaInfoConsistencyCheckDisable() * mino fix * refine * minor fix * format * minor fix Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * cmake: use interface target instead of include_directories in pybind11 (#7128) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Import cmake dependence json and inja using FetchContent (#7124) * import cmake dependence json and inja using FetchContent * install-llvm: fix url hash * fix inja config * add cache var * fix ninja 
build * fix ninja build Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Add environment variable to set GRPC_ARG_MAX_MESSAGE_LENGTH (#7130) * env ONEFLOW_GRPC_MAX_MESSAGE_BYTE_SIZE * set default to -1 * Fea/nhwc (#6811) * legacy maxpool2d module * add legacy avgpool2d * add graph cudnn conv alg config * add conv2d nhwc * lazy create cuda_stream in CudaCopyD2HDeviceCtx CudaStreamHandleDeviceCtx * refine * conv bn pool nhwc for resnet perf * one hot with float * use BiasAddRowGpu * rm l2 with 0 * reformat * add nhwc env var * legacy pool merged into new * refine * fix style * fix and refine * address review * fix and refine * fix doc test Co-authored-by: luyang <flowingsun007@163.com> Co-authored-by: guo-ran <360112263@qq.com> Co-authored-by: lixinqi <lixinqi0703106@163.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * reduce memory usage caused by slice grad (#7144) * cmake: fix THIRD_PARTY build (#7146) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix fold op (#7156) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Support inplace for lazy consistent (#7112) * Support inplace for lazy consistent * fix single client sbp hint * refine Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix prelu bug (#7118) * support dtype and device in prelu * optimize PreluFunctor * fix prelu 1-dim error * update * update * auto format by CI Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> * use ibn2nd_sbp to get nd_sbp (#7155) Co-authored-by: Houjiang Chen <chenhoujiangcug@gmail.com> * fix copy bug (#7159) * fix copy bug * add to test case * refine * fix test case Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Fix laynorm backward bug (#7164) * fix layernorm backward index bug * add 
layernorm test case * auto format by CI Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * [Fix] graph support 0-Size tensor (#6957) * Add nn.functional.glu graph test * add filter to motify functional autotest * motify code * add test example * add test else * add test judging condition for test_masked_fill.py,test_constant.py,test_tile.py、test_repeat.py,test_expand.py * add test ok example * Clear tensor name scope after graph build * Add test case of 2 graph caught same free eager tensor * auto format by CI * Dev cc clean tensor name scope (#7082) * Clear tensor name scope after graph build * Add test case of 2 graph caught same free eager tensor * auto format by CI Co-authored-by: chengtbf <472491134@qq.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> * submit test success example * test success example * submit test code * fix a bug about relu module with 0 shape data * fixed a bug about relu module with 0 shape data * fix a bug about relu module with 0 shape data * fix a bug about relu module with 0 shape data * 0shape and 0d autotest * fix a bug about relu module with 0 shape data * 0shape changed to 0_size * modify test_var.py * modify test_eye.py * modify test_reshape.py * modify test_.py * modify ReshapeFunctor * modify some file * Fixed graph autotest bug with reshape op test * Fixed graph autotest bug with reshape op test * fixed test_sub.py * modify test_sub.py * modify tensor_methods.cpp * modify array_functor.cpp * graph support 0-Size tensor * rename 0shape to 0 size * modified check_graph=True * fix and refine Co-authored-by: Zhenhua <huangzhenhua@zhejianglab.com> Co-authored-by: tangnana925 <85614052+tangnana925@users.noreply.github.com> Co-authored-by: tangnana <tnn_personal@163.com> Co-authored-by: Zhenhua <1209435+hengzi@users.noreply.github.com> Co-authored-by: chengtbf <472491134@qq.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> 
Co-authored-by: Xiaoyu Xu <xiaoyulink@gmail.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Cumsum op implementation (#7050) * add cumsum op's forward definition * add cumsum forward test case * cumsum ver3 * remove calculating time * add cumsum forward gpu implementation * fix gpu forward error * change var name * remove annotation * add cumsum cpu forward multi-thread support * add multi-thread annotation * add cumsum grad definition * update * add cumsum cpu backward * add cumsum cpu backward functor * add cumsum autograd * update * remove user interface * use random method to test cumsum forward * add cumsum gpu backward * add cumsum gpu test * fix gpu backward bug * add a 3d cuda kernel try * Revert "add cumsum gpu test" This reverts commit 05c31556ba28ecb827b25e54c2f5fa38984e8096. * Revert "Revert "add cumsum gpu test"" This reverts commit 918ee1569863b008c1d419c3528257416cffd840. * change nele to ele_cnt * add test_cumsum.py in oneflow/test/modules * change original test_cumsum to autotest version * optimize cumsum for special up_space and down_space * add two special cu func * add cumsum doc * update doc * update doc * update code according to bbuf's review * ditto * change pin/pout to in_ptr/out_ptr * remove multi-thread func * update doc * use tensor processor * update by review * update by review * update * update * auto format by CI * auto format by CI * update doc * update Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> * Logical slice in tenosr str (#7116) * using logical slice in tensor str * add tensor str util file * refine * refine * refine * refine * add logical slice docs * fix bug * fix comment * auto format by CI * fix doc test bug * delete TODO Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Add install for oneflow py (#7107) * Add install for oneflow py * refine * refine * refine * refine * refine * refine * 
refine * refine * refien * refine * refine * refine * refine * refine * refine * refine * refine * refine * refine Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix bug: output key not exists when SavaJobToIR (#7139) * fix bug: output key not exists when SavaJobToIR * [test] makedirs when path not exists * remove useless comment Co-authored-by: Peihong Liu <mosout@qq.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Add linalg 2d norm op for clip_grad (#7160) * add linalg_2d_norm op for clip_grad * code format * revert sqrt * fix comment * refine * fix comment * fix ci error * fix ci error * fix docs bug * fix ci error * fix ci error Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * refine nn.graph autotest (#7111) * add linspace op * refine graph autotest * revert * add graph error trace * fix bug * fix autotest bug * auto format by CI * fix set_printoptions error * auto format by CI * CI test bug * auto format by CI * For CI * auto format by CI * For CI test * fix ci error * revert for ci * fix bug * fix ci error * fix bug * fix bug Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: Li Xiang <54010254+lixiang007666@users.noreply.github.com> Co-authored-by: lixiang <88304454@qq.com> * add oneflow/pytorch cudnn.deterministic (#7172) * add cudnn.deterministic * fix bug * auto format by CI * fix bug * fix generate fake program input bug * auto format by CI Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> * fix linalg vector norm scalar tensor print bug (#7178) * fix linalg vector norm scalar tensor print bug * auto format by CI Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> * format * refine * format Co-authored-by: Yinggang Wang <wyg19970408@gmail.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: 
oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: liufengwei0103 <2472937968@qq.com> Co-authored-by: daquexian <daquexian566@gmail.com> Co-authored-by: guo ran <360112263@qq.com> Co-authored-by: Peihong Liu <mosout@qq.com> Co-authored-by: ZZK <42901638+MARD1NO@users.noreply.github.com> Co-authored-by: Shenghang Tsai <jackalcooper@gmail.com> Co-authored-by: Li Xiang <54010254+lixiang007666@users.noreply.github.com> Co-authored-by: cheng cheng <472491134@qq.com> Co-authored-by: Li Xinqi <lixinqi2010@gmail.com> Co-authored-by: lichunyou <33850693+lcylcy@users.noreply.github.com> Co-authored-by: Luyang <flowingsun007@163.com> Co-authored-by: wyushun <wyushun@foxmail.com> Co-authored-by: zhu wang <33675639+olojuwin@users.noreply.github.com> Co-authored-by: leaves-zwx <kunta0932@gmail.com> Co-authored-by: Xiaoyu Zhang <35585791+BBuf@users.noreply.github.com> Co-authored-by: Shijie <821898965@qq.com> Co-authored-by: XIE Xuan <xiexuanx2@gmail.com> Co-authored-by: Juncheng <liujuncheng1022@gmail.com> Co-authored-by: binbinHan <han_binbin@163.com> Co-authored-by: CHI LIU <42956025+thinksoso@users.noreply.github.com> Co-authored-by: Yao Chi <later@usopp.net> Co-authored-by: ouyangyu <xuanjiuye@gmail.com> Co-authored-by: Xiaoyu Xu <xiaoyulink@gmail.com> Co-authored-by: PragmaTwice <i@twice.moe> Co-authored-by: luqiang guo <702572275@qq.com> Co-authored-by: ZeKai Zhou <30856589+zzk0@users.noreply.github.com> Co-authored-by: VertexC <bob2420083992@gmail.com> Co-authored-by: lixinqi <lixinqi0703106@163.com> Co-authored-by: fengdaozhuo <52237830+grybd@users.noreply.github.com> Co-authored-by: Zhenhua <huangzhenhua@zhejianglab.com> Co-authored-by: tangnana925 <85614052+tangnana925@users.noreply.github.com> Co-authored-by: tangnana <tnn_personal@163.com> Co-authored-by: Zhenhua <1209435+hengzi@users.noreply.github.com> Co-authored-by: lixiang <88304454@qq.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: Houjiang Chen 
<chenhoujiangcug@gmail.com> Co-authored-by: Yinggang Wang <wyg19970408@gmail.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: liufengwei0103 <2472937968@qq.com> Co-authored-by: daquexian <daquexian566@gmail.com> Co-authored-by: Peihong Liu <mosout@qq.com> Co-authored-by: ZZK <42901638+MARD1NO@users.noreply.github.com> Co-authored-by: Shenghang Tsai <jackalcooper@gmail.com> Co-authored-by: Li Xiang <54010254+lixiang007666@users.noreply.github.com> Co-authored-by: cheng cheng <472491134@qq.com> Co-authored-by: Li Xinqi <lixinqi2010@gmail.com> Co-authored-by: lichunyou <33850693+lcylcy@users.noreply.github.com> Co-authored-by: Luyang <flowingsun007@163.com> Co-authored-by: wyushun <wyushun@foxmail.com> Co-authored-by: zhu wang <33675639+olojuwin@users.noreply.github.com> Co-authored-by: leaves-zwx <kunta0932@gmail.com> Co-authored-by: Xiaoyu Zhang <35585791+BBuf@users.noreply.github.com> Co-authored-by: Shijie <821898965@qq.com> Co-authored-by: XIE Xuan <xiexuanx2@gmail.com> Co-authored-by: Juncheng <liujuncheng1022@gmail.com> Co-authored-by: binbinHan <han_binbin@163.com> Co-authored-by: CHI LIU <42956025+thinksoso@users.noreply.github.com> Co-authored-by: Yao Chi <later@usopp.net> Co-authored-by: ouyangyu <xuanjiuye@gmail.com> Co-authored-by: Xiaoyu Xu <xiaoyulink@gmail.com> Co-authored-by: PragmaTwice <i@twice.moe> Co-authored-by: luqiang guo <702572275@qq.com> Co-authored-by: ZeKai Zhou <30856589+zzk0@users.noreply.github.com> Co-authored-by: VertexC <bob2420083992@gmail.com> Co-authored-by: lixinqi <lixinqi0703106@163.com> Co-authored-by: fengdaozhuo <52237830+grybd@users.noreply.github.com> Co-authored-by: Zhenhua <huangzhenhua@zhejianglab.com> Co-authored-by: tangnana925 <85614052+tangnana925@users.noreply.github.com> Co-authored-by: tangnana <tnn_personal@163.com> Co-authored-by: Zhenhua <1209435+hengzi@users.noreply.github.com> Co-authored-by: lixiang <88304454@qq.com> * Use normalize instead of l2_normalize (#7113) * use normalize 
instead of l2_normalize * refine * fix l2_norm * reformat Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * add_eager_naive_s_to_p_boxing (#7203) * add_eager_naive_s_to_p_boxing * fix typo * minor fix * fix test case Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * add timeout for distributed run (#7286) * add timeout * strict timeout * quick fix * cmake: import gflags and glog using FetchContent (#7176) * cmake: import gflags and glog using FetchContent * cmake: use set_mirror_url_with_hash * fix THIRD_PARTY build * fix lib path * fix gflags * remove gflags * format * auto format by CI * fix xrt gflags * fix name * remove oneflow_exe_third_party_libs * remove PUBLIC * revert some changes * Update oneflow.cmake * fix so * fix so * remove Custom op test and Single client dry run test Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Prune parallel dim with val eq one in parallel dim reduce (#7257) * fix Resource::DumpCudnnConf * prune_parallel_dim_with_val_eq_one_in_parallel_dim_reduce * minor fix * refine Prune * refine * refine * minor fix * fix bug Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * add sync (#7294) * add polynomial scheduler (#7260) Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Fix autotest inplace bug, hardsigmod (#7276) * Fix autotest inplace bug, hardsigmod * Fix * Format * Fix * Fix kwargs * auto format by CI Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Use flowvision replace flow.utils.vision (#6612) * use flowvision * del flow.utils.vision * add flow.utils.data * refine * update version * refine * align clip grad with torch in error_if_nonfinite (#7304) Co-authored-by: 
oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Use cmake install to copy cpp api related files (#7200) * install cpp api * install mlir related files * clean * handle third party dependences * support cpack * fix * Update oneflow.cmake * fix * fix compiling error * refine * add exe test as deps * install third party * refine * refine * revert install dir * install third party * refine Co-authored-by: Shenghang Tsai <jackalcooper@gmail.com> * Add_Tensor.T_and_Tensor.t()_ops (#7269) * Add_Tensor.T_and_Tensor.t()_ops * Update single test and docs * Update single test * auto format by CI * Update tensor.T * recover requirements.txt * auto format by CI * Update check_graph=False Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Refactor release tensor (#7071) * refactor ReleaseTensor instruction * support CurrentDevVmDepObjectConsumeMode for ReleaseTensor * rm useless Touch instruction * reset speed test threshold Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: Yu OuYang <xuanjiuye@gmail.com> * Fix functional dropout and Docs (#7237) * fix addend to kwargs * fix to an extra * fix test * fix to use key word arguments Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Modify graph and 0-D Tensor (#7208) * Fix 0 dim bug * part1 * Fix * Fix * Add 0dim to 1dim function * Fix * Fix * Fix * Fix * Fix * Fix * Delete test_logical_not_with_0dim_data * Fix * Format * FIx * Fix * Fix * Update test_movedim.py * Update test_narrow.py * Test bug * Test bug * Fix graph bug Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Refactor OneRecReader to stateful op and provide Module api (#7271) * refactor read_onerec to nn.OneRecReader * fix * refine doc * auto format by CI Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: Yao Chi <later@usopp.net> 
Co-authored-by: Houjiang Chen <chenhoujiangcug@gmail.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * [bug] Adam align torch params (#7318) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * add channel-last to resnet50 graph ci test (#7253) * add channel-last to resnet50 graph ci test * refine Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Add oneTBB (#7213) * add tbb * refien * refine * refine * refine * revert * add tbb * success add tbb * tbb onednn ok * fix ninja onednn * component * install tbb include file * updata tbb master zip * fix md5 * refine * refjine * fix * cmake option * modified clang 10 OMP * add line * fix add OMP flags * fix * fix * fix OF_RUNTIME_TBB * refine * clean * fix Co-authored-by: jackalcooper <jackalcooper@gmail.com> Co-authored-by: mosout <mosout@qq.com> * Dev all op bool (#6962) * fix typo * dev all op bool type * add bool testcase * support bool for ops and kernels * bool api * functional bool * ndarray bool * fix conflict * fix tabel gen * fix * refine * fix * fix * fix * fix * fix * fix * fix * add broadcast_to_compatible bool kernel * fix * fix * fix split api * fix docstr * fix setitem bool * fix char * fix * fix * fix Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Fix leaf tensor backward error (#7331) * fix(Autograd): fix leaf tensor backward error * test(Autograd): add scalar leaf tensor backward test * Create tensor in jobpass after pulling plan (#7315) * fix(NNGraph): create tensor in jobpass after pulling plan * fix(NNGraph): remove useless sync and fix typo * refine error message Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * [BUG] suport 0D tensor in eager consistent (#7242) * suport 0D tensor in eager consistent * fixed_0Size_0D_bug_with_eager_consistent * fixed_0Size_0D_bug_with_eager_consistent * fixed a bug with flatten op in graph 
autotest * fixed a bug with flattenfunctor in graph autotest * fixed a bug with flatten op in graph autotest * add Notes * modifyed some check_graph=False * auto format by CI * modifyed check_graph=True * modifyed test_add.py * modifyed test_index_select.py * modifyed test_mul.py Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: Xiaoyu Xu <xiaoyulink@gmail.com> * Use cudnn maxpool when possible (#7333) * use_cudnn_maxpool_when_possible * restruct * refine * Remove `register_tensor_op` decorator and use `add_docstr`[part] (#7306) * Remove register_tensor_op decorator and use add_docstr * Fix eq * Fix ne * Fix lt * Fix le * Fix to_local * Fix * test * Resolve conflict * disable nccl release_tensor sequential (#7341) * Rm local dep object pool (#7131) * refactor ReleaseTensor instruction * remove LocalDepObject::logical_object * remove LocalDepObjectPool * refine code by profiling * support CurrentDevVmDepObjectConsumeMode for ReleaseTensor * rm useless Touch instruction * set default value of EagerNcclBroadcastOp::async_launch to true * flow._C.stream_touch (#7209) * flow._C.stream_touch * fix compiler complaints * reset speed test threshold * reserve more size for vector dep_objects * fix static checker complaint * stream_touch does nothing if inputs empty * do not run stream_touch if inputs empty Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: Yu OuYang <xuanjiuye@gmail.com> * Autotest add graph backward (#7270) * add graph backward run in autotest * format * revert * fix ci * auto format by CI * fix bug * fix bug * auto format by CI Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Cumprod (#7278) * add comprod * fix name * rename * fix when specified dim is 1 * add docstr * add docstr * refine * add WITH_CUDA * refine * refine * 
Update python/oneflow/framework/docstr/math_ops.py fix docstr Co-authored-by: Yao Chi <later@usopp.net> * Update python/oneflow/framework/docstr/math_ops.py fix docstr Co-authored-by: Yao Chi <later@usopp.net> * fix docstr * refine * refine * fix include * refine * refine * refine Co-authored-by: Yao Chi <later@usopp.net> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Migrate tile python layer to functor (#7305) * tile implement * migrate repeat * of format * align document with pytorch * change input args to *size * change with review comments * migrate tile python layer to functor * fix tile document * fix tile document * add document's function signature * auto format by CI * add repeat document signature * fix document * fix document Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> * fix (#7338) * update_version_of_flowvision (#7346) * Release tensor storage as soon as possible (#7287) * Release tensor storage as soon as possible * refine Co-authored-by: Luyang <flowingsun007@163.com> * Dev eager consistent autotest (#7204) * add doc for pybind type * eager consistent autotest * align placement repr and api * export necessary apis to pickle object * broadcast rank 0 to other rank if consistent test * update consistent add unittest * refine * cmake: import gtest using FetchContent (#7292) * cmake: import gflags and glog using FetchContent * cmake: use set_mirror_url_with_hash * fix THIRD_PARTY build * fix lib path * fix gflags * remove gflags * format * auto format by CI * fix xrt gflags * fix name * remove oneflow_exe_third_party…

use normalize instead of l2_normalize

c5a7a08

mosout requested review from daquexian, doombeaker, hjchen2 and jackalcooper as code owners December 27, 2021 07:50

mosout requested review from BBuf and olojuwin December 27, 2021 07:50

mosout added the automerge, enhancement, and op labels Dec 27, 2021

Merge remote-tracking branch 'upstream/master' into remove_l2_norm

9341551

mosout changed the title from ~~use normalize instead of l2_normalize~~ to Use normalize instead of l2_normalize Dec 27, 2021

mosout added 3 commits January 13, 2022 16:18

Merge remote-tracking branch 'upstream/master' into remove_l2_norm

3e9c4da

refine

a0b0cd7

Merge remote-tracking branch 'upstream/master' into remove_l2_norm

37354ba

olojuwin approved these changes Jan 14, 2022

Merge branch 'master' into remove_l2_norm

2f9f528

oneflow-ci-bot self-requested a review January 14, 2022 08:56

olojuwin approved these changes Jan 14, 2022

mosout and others added 4 commits January 14, 2022 22:00

fix l2_norm

f5f20ca

Merge remote-tracking branch 'origin/remove_l2_norm' into remove_l2_norm

4c89852

reformat

a015fad

Merge branch 'master' into remove_l2_norm

1a65818

oneflow-ci-bot requested review from oneflow-ci-bot and removed request for oneflow-ci-bot January 14, 2022 15:59

Merge branch 'master' into remove_l2_norm

a60eaca

oneflow-ci-bot requested review from oneflow-ci-bot and removed request for oneflow-ci-bot January 14, 2022 22:22

Merge branch 'master' into remove_l2_norm

479d70f

oneflow-ci-bot self-requested a review January 15, 2022 00:53

Merge branch 'master' into remove_l2_norm

f07885e

oneflow-ci-bot requested review from oneflow-ci-bot and removed request for oneflow-ci-bot January 15, 2022 08:01

Merge branch 'master' into remove_l2_norm

87c0eec

oneflow-ci-bot requested review from oneflow-ci-bot and removed request for oneflow-ci-bot January 15, 2022 14:41

Merge branch 'master' into remove_l2_norm

7dfee8f

oneflow-ci-bot requested review from oneflow-ci-bot and removed request for oneflow-ci-bot January 16, 2022 12:57

MARD1NO approved these changes Jan 17, 2022

Merge branch 'master' into remove_l2_norm

45c7cf5

oneflow-ci-bot requested review from oneflow-ci-bot and removed request for oneflow-ci-bot January 17, 2022 06:14

Merge branch 'master' into remove_l2_norm

ed094ac

oneflow-ci-bot requested review from oneflow-ci-bot and removed request for oneflow-ci-bot January 17, 2022 12:54

Merge branch 'master' into remove_l2_norm

3614ae5

oneflow-ci-bot requested review from oneflow-ci-bot and removed request for oneflow-ci-bot January 17, 2022 15:52

Merge branch 'master' into remove_l2_norm

5e20b20

oneflow-ci-bot requested review from oneflow-ci-bot and removed request for oneflow-ci-bot January 17, 2022 17:37

oneflow-ci-bot merged commit 3e8be2e into Oneflow-Inc:master Jan 17, 2022

mosout deleted the remove_l2_norm branch January 21, 2022 02:19
