Add Python code frame, debug(2) and debug(3) in nn.Graph #7110

strint · 2021-12-25T09:49:10Z

Summary

This PR makes Graph debugging more precise: beyond the Module level, it now reaches the Op level and the source-code location.

Ops now carry Python frame information, which makes op-level errors easier to locate.

The field is named loc, following the naming used in MLIR.
OperatorConf loc: for debugging at the op-graph level.
user_op::InferContext loc: can be used for debugging kernels.

The loc of a matmul op, consisting of two stack frames and one C API:

Python stack[-2]: <frame at 0x7fd6f6c78430, file '/home/xuxiaoyu/oneflow/python/oneflow/nn/graph/block.py', line 238, code __block_forward>;
Python stack[-1]: <frame at 0x7fd6f6c6ad70, file '/home/xuxiaoyu/oneflow/python/oneflow/nn/modules/linear.py', line 110, code forward>;
C API: <func matmul>

The op name locates the op within the nn.Module hierarchy; the op loc locates the line of code where the op was called.

Two stack frames are recorded because some C APIs are separated from the user's Python code (nn.Module) by two layers of Python function calls.
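As a rough illustration of the loc format described above, the following pure-Python sketch walks two frames up from a call site and appends the API name. The helpers `format_frame` and `build_loc` are hypothetical names used only for this example; the PR itself captures frames in C++ and prints Python's built-in frame repr, so the exact text differs.

```python
import sys


def format_frame(frame):
    """Describe one frame by file, line and function name (simplified format)."""
    code = frame.f_code
    return f"<file '{code.co_filename}', line {frame.f_lineno}, code {code.co_name}>"


def build_loc(api_name, depth=2):
    """Return a loc string for the caller of build_loc (hypothetical helper)."""
    frames = []
    frame = sys._getframe(1)  # frame of whoever called build_loc
    while frame is not None and len(frames) < depth:
        frames.append(frame)
        frame = frame.f_back
    parts = []
    # Outermost frame first, so the innermost caller ends up as stack[-1].
    for offset, f in zip(range(-len(frames), 0), reversed(frames)):
        parts.append(f"Python stack[{offset}]: {format_frame(f)}")
    parts.append(f"C API: <func {api_name}>")
    return "; ".join(parts)


def forward():
    # Stands in for nn.Module.forward calling a functional API.
    return build_loc("matmul")


def block_forward():
    # Stands in for the graph block wrapper around forward.
    return forward()


loc = block_forward()
print(loc)
```

With a depth of two, the wrapper that calls `forward` shows up as stack[-2] and `forward` itself as stack[-1], mirroring the block.py / linear.py pair in the matmul example.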

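New: nn.Graph.debug(2) & debug(3)

debug(2) additionally prints the VLOG(2) messages from the graph build stage; currently this is op creation information.
debug(3) additionally prints the VLOG(3) messages from the graph build stage; currently this is detailed op attr information.

For now this mainly controls the debug output of the lazy interpreter: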

  VLOG(2) << "Lazy nn.Graph name " << infer_ctx->job().job_conf().job_name() << " try to add op: \n"
          << op_conf.DebugString() << std::endl;
  OpAttribute op_attr = *JUST(infer_ctx->AddAndInferConsistentOp(op_conf));
  VLOG(2) << "Lazy nn.Graph name " << infer_ctx->job().job_conf().job_name() << " add op : \n"
          << op_conf.DebugString() << std::endl;
  VLOG(3) << "Lazy nn.Graph name " << infer_ctx->job().job_conf().job_name()
          << " infer and and op attr : \n"
          << op_attr.DebugString() << std::endl;

Debugging at different levels of detail

debug() can be configured on the whole graph,
or on individual modules only, to get detailed logs for just those modules.
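The verbosity gating behind debug(2) and debug(3) can be sketched in plain Python. This is only a stand-in for glog's VLOG, where a message at level n is emitted when n is less than or equal to the current verbosity; `VerboseLog` and `add_op` are hypothetical names, and the exact mapping from the debug() argument to glog's verbosity flag is not shown here.

```python
class VerboseLog:
    """Tiny stand-in for glog's VLOG: a message is kept only if level <= v."""

    def __init__(self, v=0):
        self.v = v
        self.lines = []

    def vlog(self, level, message):
        if level <= self.v:
            self.lines.append(f"V{level}: {message}")


def add_op(log, op_name, attrs):
    """Mimic the three log points around adding an op in the lazy interpreter."""
    log.vlog(2, f"try to add op: {op_name}")
    log.vlog(2, f"add op: {op_name}")
    log.vlog(3, f"op attr: {attrs}")


quiet, level2, level3 = VerboseLog(0), VerboseLog(2), VerboseLog(3)
for log in (quiet, level2, level3):
    add_op(log, "matmul", {"transpose_a": False})
print(len(quiet.lines), len(level2.lines), len(level3.lines))
```

At verbosity 2 only the op creation messages appear; verbosity 3 adds the op attr details on top of them.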

strint · 2021-12-25T09:51:15Z

oneflow/user/ops/reshape_op.cpp

@@ -110,7 +110,8 @@ namespace oneflow {
      << " input shape is : " << in_shape.ToString()
      << " , output shape is : " << out_shape->ToString() << " , output logical shape is "
      << logical_shape.ToString()
-      << " , And reshape shape conf is : " << ctx->Attr<Shape>("shape").ToString();
+      << " , And reshape shape conf is : " << ctx->Attr<Shape>("shape").ToString()
+      << " op_loc: " << ctx->op_loc();


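Give this a try and see whether it helps locate the reshape problem in Switch-trans.

If this reshape is a reshape op inserted during a JobPass, is the op_loc here simply empty? Or is it the graph build stack?

In that case there is no stack information, because the stack information here is generated when Python calls a functional C API. When the C API is called directly, there is no corresponding Python stack.

If necessary, we could also try constructing synthetic stack information in the Job Pass to locate ops created there.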

jackalcooper · 2021-12-25T11:13:23Z

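Wouldn't it be better for this loc to be polymorphic? Sometimes it is not just a single string: it may be two or more locs combined (fuse, upstream/downstream of boxing), or it may be imported from another model definition such as ONNX.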

strint · 2021-12-27T03:40:43Z

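Wouldn't it be better for this loc to be polymorphic? Sometimes it is not just a single string: it may be two or more locs combined (fuse, upstream/downstream of boxing), or it may be imported from another model definition such as ONNX.

For multiple ops, we can use string appending for now; a string is still fairly extensible.

Do you have any suggestions for supporting polymorphism?

This PR aims to deliver a minimal feature first; the next PR can build on your suggestions to add more support.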

jackalcooper · 2021-12-27T03:58:07Z

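Wouldn't it be better for this loc to be polymorphic? Sometimes it is not just a single string: it may be two or more locs combined (fuse, upstream/downstream of boxing), or it may be imported from another model definition such as ONNX.

For multiple ops, we can use string appending for now; a string is still fairly extensible.

Do you have any suggestions for supporting polymorphism?

This PR aims to deliver a minimal feature first; the next PR can build on your suggestions to add more support.

You could use the curiously recurring template pattern (CRTP) to separate storage of the loc information from its presentation: put the display interface in the base class and the actual loc storage in the subclasses.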
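CRTP is a C++ idiom, but the split it is meant to achieve here (presentation in the base class, storage in the subclasses) can be sketched with ordinary subclassing in Python. All class names below (`Loc`, `FrameLoc`, `FusedLoc`) are hypothetical and are not part of this PR; the sketch only shows how a fused op could combine several locs without giving up a single string view.

```python
class Loc:
    """Base class: owns presentation only; subclasses own the stored data."""

    def parts(self):
        raise NotImplementedError

    def __str__(self):
        return "; ".join(self.parts())


class FrameLoc(Loc):
    """Stores Python frame descriptions plus the C API name."""

    def __init__(self, frames, api_name):
        self.frames = list(frames)
        self.api_name = api_name

    def parts(self):
        return self.frames + [f"C API: <func {self.api_name}>"]


class FusedLoc(Loc):
    """Stores the locs of the ops that were fused into one op."""

    def __init__(self, children):
        self.children = list(children)

    def parts(self):
        return [f"fused[{child}]" for child in self.children]


bias_add = FrameLoc(["Python stack[-1]: <file 'model.py', line 10, code forward>"], "bias_add")
dropout = FrameLoc(["Python stack[-1]: <file 'model.py', line 11, code forward>"], "dropout")
fused = FusedLoc([bias_add, dropout])
print(fused)
```

Because every subclass still renders to a string, the string-append approach used in this PR and a later polymorphic loc could share the same consumers.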


strint · 2021-12-29T02:42:26Z

oneflow/api/python/env/env.cpp

@@ -37,4 +37,8 @@ ONEFLOW_API_PYBIND11_MODULE("", m) {
  m.def("IsMultiClient", &IsMultiClient);
  m.def("SetIsMultiClient", &SetIsMultiClient);
  m.def("CudaGetDeviceCount", &CudaGetDeviceCount);
+  m.def("SetFLAGS_logtostderr", &SetFLAGS_logtostderr);


At the Python layer, use gflags to control glog on the fly.

strint · 2021-12-29T02:43:23Z

oneflow/api/python/functional/py_function.h

+  return back;
+}
+
+std::string get_cur_frame_stack_str() {


Gets the Python interpreter's stack frames and serializes them.

strint · 2021-12-29T02:45:02Z

oneflow/api/python/functional/py_function.h

@@ -83,10 +90,52 @@ class PyFunctionDispatcher {
  std::vector<const char*> signatures_;
 };

+namespace {
+static std::string get_obj_str(PyObject* obj) {
+  PyObject* repr = PyObject_Repr(obj);


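The PyFrameObject data structure is not yet stable, so for now we do not define a custom frame format and use Python's built-in frame format instead.

You can directly use PyStringAsString from oneflow/api/python/functional/common.h: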

Maybe<const char*> PyStringAsString(PyObject* obj);

strint · 2021-12-29T02:46:30Z

oneflow/api/python/functional/py_function.h

+        get_cur_frame_stack_str() + "; C API: <func " + dispatcher.get_func_name() + ">";
+    // User DispathFram to pass frame info to OpExpr or Interpreter.
+    DispatchFrame::Guard f_guard(cur_f_str);
+    return dispatcher.call(args, kwargs, std::make_index_sequence<sizeof...(SchemaT)>{});


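In lazy mode, every functional C API call captures its stack frame and records it in DispatchFrame.

This could also be used in eager mode, but it is not enabled for now to avoid the extra performance overhead.

This is where the Python stack information is generated.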
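The guard pattern in this hunk can be sketched in Python as a scoped, thread-local string. It assumes that DispatchFrame behaves like a thread-local value that the guard sets on entry and restores on exit; `frame_guard`, `get_frame_str` and `build_op_conf` are hypothetical stand-ins, not the PR's C++ API.

```python
import threading
from contextlib import contextmanager

_state = threading.local()  # assumed: one frame string per thread


def get_frame_str():
    """Return the frame string of the innermost active guard, or ''."""
    return getattr(_state, "frame_str", "")


@contextmanager
def frame_guard(frame_str):
    """Set the current frame string for the duration of one API call."""
    previous = get_frame_str()
    _state.frame_str = frame_str
    try:
        yield
    finally:
        _state.frame_str = previous  # restore even if the call raises


def build_op_conf(op_name):
    """Stand-in for building an op conf that records the current loc."""
    return {"name": op_name, "loc": get_frame_str()}


with frame_guard("Python stack[-1]: <frame ...>; C API: <func matmul>"):
    from_python = build_op_conf("matmul-0")

# No guard is active here, as when an op is created without a Python call.
from_pass = build_op_conf("reshape-inserted-by-pass")
print(from_python["loc"], "|", repr(from_pass["loc"]))
```

The second op conf ends up with an empty loc, which matches the behaviour described earlier in this thread for ops inserted during a Job Pass.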

strint · 2021-12-29T02:47:23Z

oneflow/core/framework/op_expr.cpp

@@ -100,6 +101,7 @@ Maybe<void> BuiltinOpExprImpl<UserOpConf>::BuildOpConf(OperatorConf* op_conf,
                                                       const AttrMap& attrs) const {
  *(op_conf->mutable_name()) = op_name_;
  *(op_conf->mutable_user_conf()) = op_proto_;
+  *(op_conf->mutable_loc()) = DispatchFrame::get_str();


When building the op conf, get the stack frame string and record it in loc.

strint · 2021-12-29T02:48:32Z

python/oneflow/framework/graph_build_util.py

+    def __exit__(self, exc_type, exc_val, exc_tb):
+        if self._s == 0:
+            oneflow._oneflow_internal.SetFLAGS_logtostderr(self._prev_logtostderr)
+        oneflow._oneflow_internal.SetFLAGS_v(self._prev_v)


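The glog scope is controllable: glog can be turned on and off for individual sections of code.

It controls the info level of the output.

It controls whether output goes to the terminal.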
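A minimal sketch of such a scope as a Python context manager follows. The `__exit__` logic follows the hunk above; the `__enter__` half is an assumption, since it is not shown here, and the module-level `FLAGS` dict is a hypothetical stand-in for the real glog flags set through `oneflow._oneflow_internal`.

```python
FLAGS = {"logtostderr": 0, "v": 0}  # hypothetical stand-in for glog flags


class GLogScope:
    """Temporarily change log flags and restore the previous values on exit."""

    def __init__(self, s_level, v_level):
        self._s = s_level
        self._v = v_level

    def __enter__(self):
        # Assumed counterpart of __exit__: remember old values, then apply new ones.
        self._prev_logtostderr = FLAGS["logtostderr"]
        self._prev_v = FLAGS["v"]
        if self._s == 0:
            FLAGS["logtostderr"] = 1
        FLAGS["v"] = self._v
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        if self._s == 0:
            FLAGS["logtostderr"] = self._prev_logtostderr
        FLAGS["v"] = self._prev_v
        return False  # never swallow exceptions from the wrapped block


with GLogScope(0, 2):          # e.g. graph-level scope
    graph_flags = dict(FLAGS)
    with GLogScope(0, 3):      # e.g. a more verbose block-level scope
        block_flags = dict(FLAGS)
    after_block = dict(FLAGS)
print(graph_flags, block_flags, after_block, FLAGS)
```

Because each scope restores what it found on entry, a block-level scope can raise the verbosity inside a graph-level scope without leaking its setting to the rest of the build.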

strint · 2021-12-29T02:49:13Z

python/oneflow/nn/graph/graph.py

+            with graph_build_util.GLogScopeContext(
+                self._debug_min_s_level, self._debug_max_v_level - 1
+            ):
+                eager_outputs = self._build_graph(*args)


glog control at the graph level

strint · 2021-12-29T02:49:29Z

python/oneflow/nn/graph/block.py

+        with graph_build_util.GLogScopeContext(
+            self._debug_min_s_level, self._debug_max_v_level - 1
+        ):
+            result = self.__block_forward(*args)


glog control at the block level

strint · 2021-12-29T02:55:05Z

oneflow/core/framework/infer_util.h

@@ -59,6 +59,7 @@ class InferContext {
  virtual const std::string& op_name() const = 0;
  virtual const std::string& op_type_name() const = 0;
  virtual const std::string& device_tag() const = 0;
+  virtual const std::string& op_loc() const = 0;


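Every InferContext now implements op_loc, so that errors raised during op inference can report the code location.

This is one of the error-prone places; op_loc can later be added to other places where errors need to be traced back to a code location.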
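To illustrate how an inference error can carry the code location, here is a small Python sketch modelled loosely on the reshape check touched by this PR. `infer_reshape` is a hypothetical function, it only compares element counts, and it does not handle a -1 dimension.

```python
import math


def infer_reshape(in_shape, target_shape, op_loc=""):
    """Check that a reshape keeps the element count; report op_loc on failure."""
    if math.prod(in_shape) != math.prod(target_shape):
        raise ValueError(
            f"Reshape infer failed: input shape is {tuple(in_shape)}, "
            f"reshape shape conf is {tuple(target_shape)}, op_loc: {op_loc}"
        )
    return tuple(target_shape)


try:
    infer_reshape(
        (2, 3), (4, 2),
        op_loc="Python stack[-1]: <file 'model.py', line 7, code forward>",
    )
except ValueError as err:
    message = str(err)
    print(message)
```

The message now names the file and line that created the failing op, so the user does not have to map the op name back to source code by hand.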

github-actions · 2022-01-07T23:52:44Z

Speed stats:

GPU Name: GeForce GTX 1080 

OneFlow resnet50 time: 135.9ms (= 13588.2ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 137.7ms (= 13765.4ms / 100, input_shape=[16, 3, 224, 224])
✔️ Relative speed: 1.01 (= 137.7ms / 135.9ms)

OneFlow resnet50 time: 77.9ms (= 7791.0ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 82.2ms (= 8221.9ms / 100, input_shape=[8, 3, 224, 224])
✔️ Relative speed: 1.06 (= 82.2ms / 77.9ms)

OneFlow resnet50 time: 50.2ms (= 10031.1ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 56.7ms (= 11349.9ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 1.13 (= 56.7ms / 50.2ms)

OneFlow resnet50 time: 42.2ms (= 8434.1ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 50.3ms (= 10067.6ms / 200, input_shape=[2, 3, 224, 224])
✔️ Relative speed: 1.19 (= 50.3ms / 42.2ms)

OneFlow resnet50 time: 34.4ms (= 6870.6ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 38.8ms (= 7759.4ms / 200, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 1.13 (= 38.8ms / 34.4ms)

OneFlow resnet50 time: 150.7ms (= 15068.0ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 158.2ms (= 15816.9ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.05 (= 158.2ms / 150.7ms)

OneFlow resnet50 time: 94.5ms (= 9450.3ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 100.9ms (= 10089.4ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.07 (= 100.9ms / 94.5ms)

OneFlow resnet50 time: 69.5ms (= 13898.6ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 71.4ms (= 14288.8ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.03 (= 71.4ms / 69.5ms)

OneFlow resnet50 time: 62.6ms (= 12529.8ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 61.6ms (= 12318.2ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 0.98 (= 61.6ms / 62.6ms)

OneFlow resnet50 time: 62.9ms (= 12586.3ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 63.5ms (= 12703.4ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.01 (= 63.5ms / 62.9ms)

github-actions · 2022-01-08T01:14:48Z

CI failed when running job: cuda-speed-test. PR label automerge has been removed

github-actions · 2022-01-08T03:38:23Z

CI failed when running job: cuda-legacy-benchmark-experimental. PR label automerge has been removed

github-actions · 2022-01-08T04:50:08Z

Speed stats:

GPU Name: GeForce GTX 1080 

OneFlow resnet50 time: 136.9ms (= 13686.7ms / 100, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 137.6ms (= 13757.7ms / 100, input_shape=[16, 3, 224, 224])
✔️ Relative speed: 1.01 (= 137.6ms / 136.9ms)

OneFlow resnet50 time: 78.7ms (= 7868.0ms / 100, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 84.1ms (= 8405.6ms / 100, input_shape=[8, 3, 224, 224])
✔️ Relative speed: 1.07 (= 84.1ms / 78.7ms)

OneFlow resnet50 time: 49.4ms (= 9873.7ms / 200, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 57.6ms (= 11526.6ms / 200, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 1.17 (= 57.6ms / 49.4ms)

OneFlow resnet50 time: 38.8ms (= 7759.5ms / 200, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 45.7ms (= 9139.7ms / 200, input_shape=[2, 3, 224, 224])
✔️ Relative speed: 1.18 (= 45.7ms / 38.8ms)

OneFlow resnet50 time: 35.3ms (= 7070.0ms / 200, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 37.5ms (= 7493.4ms / 200, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 1.06 (= 37.5ms / 35.3ms)

OneFlow resnet50 time: 150.3ms (= 15033.6ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 157.6ms (= 15760.6ms / 100, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.05 (= 157.6ms / 150.3ms)

OneFlow resnet50 time: 95.1ms (= 9507.2ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 99.4ms (= 9944.4ms / 100, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.05 (= 99.4ms / 95.1ms)

OneFlow resnet50 time: 67.5ms (= 13497.3ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 73.9ms (= 14774.9ms / 200, input_shape=[4, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.09 (= 73.9ms / 67.5ms)

OneFlow resnet50 time: 68.3ms (= 13651.6ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 61.7ms (= 12349.6ms / 200, input_shape=[2, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 0.90 (= 61.7ms / 68.3ms)

OneFlow resnet50 time: 65.6ms (= 13123.2ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 57.0ms (= 11398.1ms / 200, input_shape=[1, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 0.87 (= 57.0ms / 65.6ms)

by CI * fix format error Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> * Add swapaxes op (#7179) * Add swapaxes op * Modify runtime * fix docstr * Modify functor Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix install cuda include (#7191) * Support uneven split in eager slice boxing (#7123) * fix Resource::DumpCudnnConf * add shape para in boxing check function * fix GetBoxingFunction para * asymmetric_x_to_b support cpu * forbid uneven split in cellective boxing * refine slice boxing kernel to support uneven split * add test case and fix balanced_splitter error * fix test case * fix op/kernel bug * fix bug in symmetric_s_to_p * revert boxing_dividor_util.cpp * use const Shape& Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Add stack kernel (#7152) * fix arange bug * build init kernel * add stack backward * remove annotation * reformat and fix sbp * fix ops td format * fix format * fix comment * add more test case in dim * fiux user ops td * fix to use size_t * fix annotation * fix less than * fix userop tabelgen * fix bug when num of inputs greater than 128 Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Add erfinv op (#7163) * Add erfinv op pre * fix * add erfinv op * Add test * fix comment * add inplace version of erfinv * add inplace version docs * fix inplace cpu version kernel and ops td * add test and docs * fix back * fix unittest * fix const & Co-authored-by: MARD1NO <359521840@qq.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * add doc for pybind type (#7193) * fix linspace bug (#7185) * fix linspace bug * auto format by CI * fix comment * annotation adaptive_avgpool3d Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Add floor inplace version (#7187) * add floor inplace version * add docs * fix comment * fix 
comment * fix comment * auto format by CI * fix comment Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: ZZK <42901638+MARD1NO@users.noreply.github.com> * remove is_lazy check in nn.Graph inplace output (#7190) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Fix test case about eye (#7194) * fix eye test case * add test case Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix(narrow): fix consistent narrow gradient bug (#7195) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Fused kernel with broadcast (#6977) * add broadcast for fused kernel * fix cuda memcpy ilegal access error * add broadcast for fused_softmax kernel * fix errors * add more test sample * reformat * add one_elif * reformat * use different dispatch logic * Use simplified dims * add simplified dims for fused_scale_mask_softmax_dropout * add simplified broadcast for fused_scale_mask_softmax_dropout * add simplified dims for fused_scale_mask_softmax * try to merge duplicate code * simpified kernel code * fix test case * fix check * remove annotation * add new line Co-authored-by: MARD1NO <359521840@qq.com> Co-authored-by: ZZK <42901638+MARD1NO@users.noreply.github.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * skip drop if drop rate is zero (#7186) * Dev inplace clamp (#7182) * add inplace for clamp * first commit * fix conflict * add clip alias and docs * fix bug and add test * add more test case * skip functional adaptive pool3d test Co-authored-by: Zhanghuihong <garfield.gzhh@gmail.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: Xiaoyu Zhang <35585791+BBuf@users.noreply.github.com> Co-authored-by: Li Xiang <54010254+lixiang007666@users.noreply.github.com> * Revert "Fused kernel with broadcast 
(#6977)" (#7207) This reverts commit 80099aa. * [BUG] Fixed graph autotest bug with sub op (#7142) * fixed Fixed graph autotest bug with sub op test * fixed 0size data graph autotest bug with randperm op Co-authored-by: Xiaoyu Xu <xiaoyulink@gmail.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * var kernel (#7024) * var forward kernel * variance backward * add var backward * refine * refine * refine * refine * add GetSbpFn * refine * refine * refine * refine * refine * add TODO * replace 'axis' str using 'dim' str * change the way of getting cuda stream * add comment * auto format by CI * fix ref bug * fix static check error * auto format by CI * fix build many linux error * format * fix static check error * fix mut dptr error when size is 0 * refine * support 0 shape and nan * auto format by CI * refine * fix doctest because of accuracy error * fix backward unsqueeze dim bug * fix bug backward * refine * fix out of order bug Co-authored-by: Yao Chi <later@usopp.net> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Add Python code frame, debug(2) and debug(3) in nn.Graph (#7110) * add frame * test pass * refine loc str * refine code * refine code * refine debug * add debug * block forward with glog scope * refine debug * glog to stderr when v 2 * refine py str api * refine and fix py obj repr * refine pystr; use GetOrThrow at pyfunc; use alsologtostderr * refine pystr * move str * fix test * log 2 alsolog Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * update readme 0.6.0 (#7202) * update readme * add Publication section * reorder * update default version * Fix check graph bug part1 (#7197) * support randperm graph test * add diagonal graph test * fix eye op check graph bug * refine * fix to bug * refine * fix * format * restruct nn.graph autotest * format * fix bug * auto format by CI * fix where test 
bug * comment diagonal op * fix comment * fix ci error Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: Li Xiang <54010254+lixiang007666@users.noreply.github.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> * Add documentation file for nn.init.xxxx (#7181) * Add documentation file for nn.init.xxxx (#7168) * Modify document index order (#7168) Co-authored-by: Yao Chi <later@usopp.net> * Refactor to numpy (#7097) * tensor numpy method * to numpy * delete useless file * replace CHECK_JUST with JUST * tensor cpu method return self if it is in cpu * delete tensor buffer * delete useless code * refine * Update python/oneflow/nn/modules/tensor_ops.py Co-authored-by: Yinggang Wang <wyg19970408@gmail.com> * refine * add docstr of cpu method * delete useless code * refine * add comment * refine * add 'assert' info * refine * do .cpu if tensor is not in cpu memory * revert format change * fix tensor buffer numpy * support tensor buffer to invoke numpy * fix bug * fix nd sbp numpy bug * fix bug about test case because of numpy sharing memory with tensor * auto format by CI Co-authored-by: Yinggang Wang <wyg19970408@gmail.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> * Fix eager boxing bug (#7196) * fix_eager_boxing_bug * remove EagerBoxingCall * minor fix * fix error * fix error * rename d to dim Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Feat eager consistent 2d sbp infer (#7143) * feat(EagerConsistent): support 2d sbp infer * feat(EagerConsistent): support compute copy cost * refine 2d sbp cannot find error message * refactor(EagerConsistent): move functions to sbp_infer_util * feat(EagerConsistent): add same sbp judgement * refine code * feat(EagerConsistent): update 1d to 1d copy cost * feat(EagerConsistent): try to get boxing from eager_consistent_boxing_mgr * feat(EagerConsistent): 
update copy cost function * remove useless code * refine code * fix merge bug * refine code and fix copy cost function * Revert "Fused kernel with broadcast (#6977)" This reverts commit 80099aa. * Add comment * refine code * fix JUST * Revert "Revert "Fused kernel with broadcast (#6977)"" This reverts commit e7e2990. * fix P->B copy cost * fix error message error Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: Houjiang Chen <chenhoujiangcug@gmail.com> * fix split default arg (#7222) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix no grad inplace clamp (#7220) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix readthedocs auto update (#7223) * fix docs (#7227) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * allow_file_schema_in_mirror_third_party (#7231) * support_symmetric_cyclic_nd_sbp_boxing (#7210) * support_symmetric_cyclic_nd_sbp_boxing * rename func * minor fix * solve comment * minor fix * fix typo Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Fix erfinv and swapaxes (#7217) * Fix erfinv and swapaxes * Fix * Fix bug and add test * Modify name * Fix arg * Modify pi * Fix Co-authored-by: ZZK <42901638+MARD1NO@users.noreply.github.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Support nd sbp dim reduce (#7230) * support_symmetric_cyclic_nd_sbp_boxing * rename func * minor fix * solve comment * minor fix * support_nd_sbp_dim_reduce * fix_typo * add test case * fix bug * fix bug * refine * fix dead loop error Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix comm test cases (#7021) * fix comm test cases * auto format by CI * refine * refine * refine Co-authored-by: ZZK <42901638+MARD1NO@users.noreply.github.com> Co-authored-by: oneflow-ci-bot 
<69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> * fix_backward_bug_in_1d_to_2d_boxing (#7224) * fix_backward_bug_in_1d_to_2d_boxing * refine * of_format Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Skip layernorm warp test (#7243) * fix arange bug * skip * fix comment Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Print warning for non localhost proxy (#7228) * print warning for non localhost proxy Signed-off-by: daquexian <daquexian566@gmail.com> * reformat Signed-off-by: daquexian <daquexian566@gmail.com> * add more check Signed-off-by: daquexian <daquexian566@gmail.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * add ddp return type (#7232) * add dpp return type * add comment * auto format by CI Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * Parameter support both inplace op and setter (#7249) * feat(Parameter): Parameter support both inplace op and setter * feat(Tensor): tensor support data's getter interface * test(Parameter): add getter test Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix(*): fix sbp filter function bug (#7229) Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * refine (#7240) * Eager boxing status (#7150) * add eager boxing status * refine MakeBoxingInterpreterStatus * add blank line * del EagerBoxingCall * refine BoxingInterpreterStatus * refine BoxingInterpreterStatus * add eager boxing log * minor fix * minor fix * revert removed file * add indent arg * rename indent to prefix * solve comment * refine eager_boxing_logger * use Global<const EagerBoxingLogger> * minor fix Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix empty bug (#7239) * fix empty bug * simplify empty 
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> * fix empty debug str of hob primitive (#7245) * fix empty debug str of hob primitive Signed-off-by: daquexian <daquexian566@gmail.com> * fix 'OF_PP_STRINGIZE(op)' Signed-off-by: daquexian <daquexian566@gmail.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: ZZK <42901638+MARD1NO@users.noreply.github.com> * Add VSCode dev container (#7233) * add dev container * use oneflow/devcontainer * add settings for new lines and trailing ws * refine docs * add eol setting to config * Add '"--gpus", "all"' if running a CUDA image * set BUILD_HWLOC off in fast cmake init cache * Skip send and recv if dst and src are same. (#7255) * Maxpool op nhwc (#7214) * maxpool2d_support_nhwc * refine * add test case * format * refine * refine * fix comments * Implement consistent tensor detach (#7265) * Feat/zero optimization in nn.Graph (#7165) * debug * modify graph.py * fix bug about graph debug interface * Fix nn graph variable bind (#6895) * fix(AutoParallel): nn.Graph support auto_parallel change sbp * fix(AutoParallel): use tensor.set_data interface and add print sbp info * add comment * hack check * add test * refine test * refine test * refine code * add and refine zero * fix test * refine code * rm debug log * refine min size set * add note * debug zero * fix cudnn config * refine test doc * add comment of check * eager mode in graph pass * format * rebuid parameter according to sbp in synced plan * auto format by CI * fix code check * fix test * try init session at graph init * refine and revert session init * rm useless code * add back print of sys conf Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: grybd <52237830+grybd@users.noreply.github.com> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: wyg1997 <wyg19970408@gmail.com> * fix linspace limit bug (#7236) * fix linspace 
limit bug * auto format by CI Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: Liang Depeng <liangdepeng@gmail.com> * fix merge bugs * fix(NNGraph): create tensor in jobpass after pulling plan * fix code bug Co-authored-by: Li Xinqi <lixinqi2010@gmail.com> Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org> Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com> Co-authored-by: ouyangyu <xuanjiuye@gmail.com> Co-authored-by: daquexian <daquexian566@gmail.com> Co-authored-by: leaves-zwx <kunta0932@gmail.com> Co-authored-by: Xiaoyu Xu <xiaoyulink@gmail.com> Co-authored-by: Shijie <821898965@qq.com> Co-authored-by: Li Xiang <54010254+lixiang007666@users.noreply.github.com> Co-authored-by: Xiaoyu Zhang <35585791+BBuf@users.noreply.github.com> Co-authored-by: binbinHan <han_binbin@163.com> Co-authored-by: Luyang <flowingsun007@163.com> Co-authored-by: cheng cheng <472491134@qq.com> Co-authored-by: ZZK <42901638+MARD1NO@users.noreply.github.com> Co-authored-by: Shenghang Tsai <jackalcooper@gmail.com> Co-authored-by: PragmaTwice <i@twice.moe> Co-authored-by: hjchen2 <chenhoujiangcug@gmail.com> Co-authored-by: luqiang guo <702572275@qq.com> Co-authored-by: Peihong Liu <mosout@qq.com> Co-authored-by: Juncheng <liujuncheng1022@gmail.com> Co-authored-by: ZeKai Zhou <30856589+zzk0@users.noreply.github.com> Co-authored-by: VertexC <bob2420083992@gmail.com> Co-authored-by: Yao Chi <later@usopp.net> Co-authored-by: liufengwei0103 <2472937968@qq.com> Co-authored-by: lichunyou <33850693+lcylcy@users.noreply.github.com> Co-authored-by: guo-ran <360112263@qq.com> Co-authored-by: lixinqi <lixinqi0703106@163.com> Co-authored-by: wyushun <wyushun@foxmail.com> Co-authored-by: fengdaozhuo <52237830+grybd@users.noreply.github.com> Co-authored-by: Zhenhua <huangzhenhua@zhejianglab.com> Co-authored-by: tangnana925 <85614052+tangnana925@users.noreply.github.com> 
Co-authored-by: tangnana <tnn_personal@163.com> Co-authored-by: Zhenhua <1209435+hengzi@users.noreply.github.com> Co-authored-by: lixiang <88304454@qq.com> Co-authored-by: MARD1NO <359521840@qq.com> Co-authored-by: DangKai <dangkai4u@outlook.com> Co-authored-by: Zhanghuihong <garfield.gzhh@gmail.com> Co-authored-by: Tao Lei <96455870+taoteo@users.noreply.github.com> Co-authored-by: Liang Depeng <liangdepeng@gmail.com>

strint · 2022-04-18T07:14:14Z

oneflow/core/job/job_build_and_infer_ctx.cpp

@@ -1375,7 +1375,7 @@ Maybe<std::string> JobBuildAndInferCtx::NewUniqueOpNameByFunctionalOpConf(
  } else {
    op_type_name = "SystemOp";
  }
-  std::string op_name = op_name_prefix + op_type_name + "_" + std::to_string(unique_op_name_index_);
+  std::string op_name = op_name_prefix + op_type_name + "-" + std::to_string(unique_op_name_index_);


There is also a change here.
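As a rough illustration of what the changed line produces, the sketch below reproduces only the concatenation from the diff. `MakeOpName`, its `sep` parameter, and the sample prefix and type name in the usage note are hypothetical and are not part of OneFlow; the surrounding logic of `NewUniqueOpNameByFunctionalOpConf` is not shown in the diff and is not modeled here.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper mirroring the concatenation in the diff above.
// `sep` stands for the separator before the index: "_" on the removed
// line and "-" on the added line.
std::string MakeOpName(const std::string& op_name_prefix,
                       const std::string& op_type_name,
                       const std::string& sep,
                       int64_t unique_op_name_index) {
  return op_name_prefix + op_type_name + sep + std::to_string(unique_op_name_index);
}
```

With an assumed prefix `"m.linear-"`, type name `"matmul"`, and index `3`, the removed line would yield `m.linear-matmul_3` and the added line `m.linear-matmul-3`.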

strint added 2 commits December 25, 2021 04:00

add frame

4361fcc

test pass

dba919f

strint requested review from chengtbf, daquexian and jackalcooper as code owners December 25, 2021 09:49

refine loc str

b29813d

strint commented Dec 25, 2021

refine code

bfd7df0

strint requested a review from hjchen2 December 27, 2021 04:17

strint added enhancement feature graph graph mode labels Dec 27, 2021

strint mentioned this pull request Dec 27, 2021

Support inplace for lazy consistent #7112

Merged

refine code

82771e8

strint changed the title ~~Add python code locatioin to op in graph~~ Add python code frame to op in graph Dec 27, 2021

strint requested a review from oneflow-ci-bot December 27, 2021 14:49

strint added 4 commits December 27, 2021 22:50

Merge branch 'master' into feat/op_loc

6d4c2f5

merge master

26c22df

refine debug

6f028f5

add debug

fd315b9

strint changed the title ~~Add python code frame to op in graph~~ Add Python code frame, debug(3) and debug(4) in nn.Graph Dec 28, 2021

strint added 4 commits December 29, 2021 10:33

block forward with glog scope

68d6412

Merge branch 'master' into feat/op_loc

bd128e2

Merge branch 'master' of https://github.com/Oneflow-Inc/oneflow into feat/op_loc

b7de190

Merge branch 'feat/op_loc' of https://github.com/Oneflow-Inc/oneflow into feat/op_loc

529f781

strint commented Dec 29, 2021

Merge branch 'master' into feat/op_loc

f3cf05a

strint requested a review from oneflow-ci-bot January 7, 2022 23:42

strint added the automerge label Jan 7, 2022

log 2 alsolog

9512a36

strint requested review from oneflow-ci-bot and removed request for oneflow-ci-bot January 7, 2022 23:55

Merge branch 'master' into feat/op_loc

6cb6bf9

oneflow-ci-bot requested review from oneflow-ci-bot and removed request for oneflow-ci-bot January 8, 2022 00:20

github-actions bot removed the automerge label Jan 8, 2022

oneflow-ci-bot removed their request for review January 8, 2022 01:16

strint requested a review from oneflow-ci-bot January 8, 2022 01:34

strint added the automerge label Jan 8, 2022

Merge branch 'master' into feat/op_loc

cb2020b

oneflow-ci-bot requested review from oneflow-ci-bot and removed request for oneflow-ci-bot January 8, 2022 02:18

github-actions bot removed the automerge label Jan 8, 2022

oneflow-ci-bot removed their request for review January 8, 2022 03:40

strint requested a review from oneflow-ci-bot January 8, 2022 04:36

strint added the automerge label Jan 8, 2022

oneflow-ci-bot removed their request for review January 8, 2022 05:05

strint requested a review from oneflow-ci-bot January 8, 2022 05:22

oneflow-ci-bot removed their request for review January 8, 2022 05:27

jackalcooper merged commit 5b09790 into master Jan 8, 2022

jackalcooper deleted the feat/op_loc branch January 8, 2022 05:29

strint commented Apr 18, 2022
