Fix python apis and xla implementation (#7183)
* Support save/load for lr_scheduler (#6948)

* feat(LrScheduler): support save/load for lr_scheduler

* refine document

* auto format by CI

* Refine test

* auto format by CI

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>

* Fix eye_op attr (#6973)

* fix

* add graph test

* Update python/oneflow/test/graph/test_graph_eye.py

Co-authored-by: daquexian <daquexian566@gmail.com>

* refine

* Update python/oneflow/test/graph/test_graph_eye.py

Co-authored-by: daquexian <daquexian566@gmail.com>

* auto format by CI

Co-authored-by: daquexian <daquexian566@gmail.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>

* softmax double use uncached impl to accelerate compile (#6992)

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* Add [[nodiscard]] for cpp api (#6997)

* add [[nodiscard]]

* refine

* reformat

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* Support Arange delta to decide dtype (#6998)

* support delta dtype to decide output dtype

* add more unittest

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* Add clang as CUDA FE compiler in CI (#6954)

* update action use

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* fix

* add 80 and 86

* refine

* refine

* add CUDA_NVCC_THREADS_NUMBER

* refine

* address review

* set CUDA_NVCC_THREADS_NUMBER 8

* fix

* fix clang in init cmake

* add script

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* add flags to skip zlib

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* Migrate chunk python layer to functor (#6983)

* Migrate chunk Python layer logic to functor

* fix runtime

* Fix splits bug and CI

* Modify push to emplace

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* Reduce memory usage when compiling oneflow dialect ops (#7000)

* CudaAllocator device reset before OOM (#6976)

* CudaAllocator device reset before OOM

* Add NOTE

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* Refactor vm stream desc (#6989)

* remove StreamDesc::num_machines

* Prepare one thread for one stream_type

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* Add Diagonal Op (#6016)

* format complete

* python to cpp

* py2cpp error

* rm

* auto format by CI

* revise

* auto format by CI

* license

* docstring

* docstring

* tensor

* tensor attribute

* auto format by CI

* docstring

* revise

* test

* revise

* revise

* rename

* half

* docs

* doc,test

* test times

* revise

* format

Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* add all to all op (#6283)

* add all to all op

* add barrier

* format

* add import

* fix test

* delete barrier

* delete barrier

* Revert "delete barrier"

This reverts commit aa397ea.

* Revert "delete barrier"

This reverts commit 7ddf79a.

* check tensor meta between ranks

* add more assert

* all_reduce operate in place

* all_reduce operate in place

* fix bug

* assert tensor.is_local

* fix bug in scatter

* add more assert

* delete meta check

* add pytorch comparison test

* add pytorch comparison test

* refine

* add ONEFLOW_TEST_CPU_ONLY

* fix bug from torch gloo

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* Dev ivalue for cpp api (#6890)

* add api tensor

* refine

* add nn.relu

* refine

* clean shape & refine relu test

* support void* for from_blob

* add multithreading relu test

* refine test

* refine

* refine

* add comment for __internal_tensor()

* convert to copy_util

* reformat

* refine

* add ivalue

* refine directory structure

* refine cpp api test

* refine test

* add ivalue

* refine ivalue

* refine ivalue

* refine

* refine

* refine

* refine

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* default use cpu generator (#7001)

* optimize reshape/slice/transpose functor (#6956)

* optimize reshape/slice/transpose functor

* update code according to reviewer's suggestion

* judge negative dimension number besides -1

* judge negative shape value in view::Reshape

* remove is_full_slice logic in SliceFunctor

* update code according to yinggang's advice

* move ordered permute judge to TransposeKernel

* remove print sentence

* abstract IsOrderedPermute func

* support negative permute value in TransposeFunctor

* delete tranpose_kernel optimization

* Revert "delete tranpose_kernel optimization"

This reverts commit e026434.

* not return original tensor when reshape do nothing

* simplify code

* correct spell error

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* fix IsContinuosSubspace error (#6968)

* fix IsContinuosSubspace error

* recover original IsContinuosSubspace code

* add test case

* auto format by CI

Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* add cpu group deconv impl (#6980)

* add cpu group deconv impl

* remove useless lines

* remove useless lines

* add deconv2d import

* add groups test

* remove check_allclose=False

* add tf_prelu

* add cpu group deconv impl

* remove useless lines

* remove useless lines

* add deconv2d

* add groups test

* remove check_allclose=False

* add tf_prelu

* auto format by CI

* add deconv2d impl

* add deconv2d impl

* remove useless lines

* add deconv2d in functional api

* auto format by CI

* auto format by CI

* Add variable initial

* Add variable initial

* auto format by CI

* add conv2d impl

* add conv2d impl

* auto format by CI

* remove useless lines

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>

* Migrate the python layer logic of broadcastlike to functor (#7007)

* Migrate the python layer logic of broadcastlike to functor

* add var name

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* Temporarily skip comm test cases (#7015)

* Temporarily skip comm test cases

* auto format by CI

Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* Fix nd_sbp attribute type and set nd_sbp in random functors (#7017)

* fix

* fix compile

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* Save Job to IR and load Job from IR (#6885)

* save to ir

* test

* fix bugs

* impl load and test

* rm useless code

* fix conflict

* fix issues

* JobOp

* fix issues

* fix test_fuse_tril_scale

* fix test jit-outline-func

* fix test_mlir_opt.py

* save

* fix ods gen for max and avg pool

* rename oneflow to oneflow_foundation

* fix files checks

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* auto format by CI

* check in changes

* refine

* Update oneflow/ir/test/OneFlow/test_mlir_opt.py

* Update oneflow/ir/include/OneFlow/OneFlowOps.td

* refine includes

* printer & parser & verifier

* code tidy

* tidy include

* address review

* rm duplicated GetDataTypeType

* TensorSource trait

Co-authored-by: jackalcooper <jackalcooper@gmail.com>
Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>

* Fix Simple CI linkage (#6986)

* fix-simple-ci-linkage

* refine

* refine

* fix

* refine

* refine

* refine

* refine

* refine

* refine

* revert

* refine

* auto format by CI

* refine

* revert

* refine

Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>

* fix sbp when weight is optional (#6984)

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* Feat from numpy (#7013)

* feat(Tensor): support share memory with ndarray

* test(FromNumpy): add test

* enhancement test and add document

* Fix merge error

* fix bug in numpy c api

* Fix(doctest): fix doctest error

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* Add custom ShapeAttr in ODS (#7023)

* add ShapeAttr

* refine

* fix doc

* refine

* fix (#7028)

* Add linspace op (#7006)

* add linspace op

* refine doc

* refine

* fix comments

* fix comment

* auto format by CI

* fix ci doc error

Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* fix fasterrcnn infer (#7014)

* fix fasterrcnn infer

* roi_align 0shape

* refine

* refine

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* separate kernel state and cache (#6655)

* support eager state except lazy dynamic

Signed-off-by: daquexian <daquexian566@gmail.com>

* modularize kernel contexts

Signed-off-by: daquexian <daquexian566@gmail.com>

* fix warning

Signed-off-by: daquexian <daquexian566@gmail.com>

* reformat

Signed-off-by: daquexian <daquexian566@gmail.com>

* remove duplicated license

Signed-off-by: daquexian <daquexian566@gmail.com>

* fix static check error

Signed-off-by: daquexian <daquexian566@gmail.com>

* make test gpu only

Signed-off-by: daquexian <daquexian566@gmail.com>

* temp

Signed-off-by: daquexian <daquexian566@gmail.com>

* revert opkernel context changes, align with master

Signed-off-by: daquexian <daquexian566@gmail.com>

* reformat

Signed-off-by: daquexian <daquexian566@gmail.com>

* refine cachecontext

Signed-off-by: daquexian <daquexian566@gmail.com>

* add separate cache context interface, remove out-dated files

Signed-off-by: daquexian <daquexian566@gmail.com>

* add init and cache context aliases

Signed-off-by: daquexian <daquexian566@gmail.com>

* update eager kernel

Signed-off-by: daquexian <daquexian566@gmail.com>

* fix wrong AttrMayChanged value

Signed-off-by: daquexian <daquexian566@gmail.com>

* rename and add comment

Signed-off-by: daquexian <daquexian566@gmail.com>

* auto format by CI

* fix combined_margin_loss_kernel.cpp

Signed-off-by: daquexian <daquexian566@gmail.com>

* rename op_kernel_state_wrapper.h to op_kernel_wrapper.h

Signed-off-by: daquexian <daquexian566@gmail.com>

* rename more classes, fix old cache in stateful op kernel

Signed-off-by: daquexian <daquexian566@gmail.com>

* rename more classes

Signed-off-by: daquexian <daquexian566@gmail.com>

* may changed -> not changed

Signed-off-by: daquexian <daquexian566@gmail.com>

* optimize away genrepeatedbn

Signed-off-by: daquexian <daquexian566@gmail.com>

* reformat

Signed-off-by: daquexian <daquexian566@gmail.com>

* refine

Signed-off-by: daquexian <daquexian566@gmail.com>

* update stateful local opkernel, use Cache** if possible

Signed-off-by: daquexian <daquexian566@gmail.com>

* remove TensorDesc4ArgNameAndIndex base method

Signed-off-by: daquexian <daquexian566@gmail.com>

* auto format by CI

* fix clang-tidy error

Signed-off-by: daquexian <daquexian566@gmail.com>

* auto format by CI

* fix conv kernel bug

Signed-off-by: daquexian <daquexian566@gmail.com>

* auto format by CI

* fix group conv bug and fix warning

Signed-off-by: daquexian <daquexian566@gmail.com>

* fix avgpool error

Signed-off-by: daquexian <daquexian566@gmail.com>

* fix maxpool error

Signed-off-by: daquexian <daquexian566@gmail.com>

* auto format by CI

* respect flag in deconv cpu kernel, rename cache to cache_ptr

Signed-off-by: daquexian <daquexian566@gmail.com>

* fix compile error

Signed-off-by: daquexian <daquexian566@gmail.com>

* auto format by CI

* fix deconv cache bug

Signed-off-by: daquexian <daquexian566@gmail.com>

Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>
Co-authored-by: Li Xinqi <lixinqi2010@gmail.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* Add full support for all datatypes (#7025)

* add full support for all datatypes

* Use max array size

* add clang-format off to maintain the matrix

* fix format

* remove redundant numpy dtype

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* Migrate split python layer to functor (#7030)

* Migrate split python layer to functor

* modify dim

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* Add add_sparse_optimizer for Graph (#6988)

* add_sparse_optimizer

* format

* fix bug

* refine new interface per discussion

* auto format by CI

* address review

* correct syntax

* correct error message

* rm debug print

* auto format by CI

* fix cpu-only test

Co-authored-by: XIE Xuan <xiexuanx2@gmail.com>
Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* Refine RUN_CUDA_KERNEL (#7003)

* Refine RUN_CUDA_KERNEL

* Added LaunchConfig

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* Support llvm in tree build (#6995)

* refine

* refine

* refine

* refine

* add61

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* rm

* revert

* refine

* refine

* refine

* refine

* return_self_in_to_consistent_if_necessary (#7004)

* return_self_in_to_consistent_if_necessary

* fix error and add test case

* skip cpu test

* fix error

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* Decouple ep and global (#7027)

* Decouple ep and global

* NOLINT

* fix

* fix import

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* arange doc fix (#7035)

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* add_consistency_check_in_consistent_tensor_set_data (#7002)

* add_consistency_check_in_consistent_tensor_set_data

* auto format by CI

* minor fix

* add just wrap

Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* [cmake] add liboneflow_cpp target (#7005)

* add cmake changes for liboneflow_cpp.so

Signed-off-by: daquexian <daquexian566@gmail.com>

* add separate target for cpp api test

Signed-off-by: daquexian <daquexian566@gmail.com>

* add cpp api test in ci

Signed-off-by: daquexian <daquexian566@gmail.com>

* reverse the order of cudnn and cuda library

Signed-off-by: daquexian <daquexian566@gmail.com>

* update logic of BUILD_MONOLITHIC_LIBONEFLOW

Signed-off-by: daquexian <daquexian566@gmail.com>

* rename BUILD_MONOLITHIC_LIBONEFLOW to BUILD_MONOLITHIC_LIBONEFLOW_CPP_SO

Signed-off-by: daquexian <daquexian566@gmail.com>

* share lib directory in test container

Signed-off-by: daquexian <daquexian566@gmail.com>

* add github actions debug

Signed-off-by: daquexian <daquexian566@gmail.com>

* Revert "add github actions debug"

This reverts commit 7d9aef6.

* add upterm debug after exe test

Signed-off-by: daquexian <daquexian566@gmail.com>

* sleep after fail

Signed-off-by: daquexian <daquexian566@gmail.com>

* set LD_LIBRARY_PATH in yml for cpp api test exe

Signed-off-by: daquexian <daquexian566@gmail.com>

* sleep

Signed-off-by: daquexian <daquexian566@gmail.com>

* upload liboneflow_cpp.so

Signed-off-by: daquexian <daquexian566@gmail.com>

* modify cmake to trigger compilation

Signed-off-by: daquexian <daquexian566@gmail.com>

* remove sleep

Signed-off-by: daquexian <daquexian566@gmail.com>

* build cpp api in cpu mode

Signed-off-by: daquexian <daquexian566@gmail.com>

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* Fix CUDA 52 and add it to CI (#7031)

* refine

* refine

* refine

* refine

* revert

* fix

* refine

* refine

* refine

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* Add check of placement constructor (#6991)

* add_check_of_placement_constructor

* move CheckDeviceIdsIsValid to runtime

* handle comment

* fix error

* fix error

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* Fix(FromNumpy): fix bug in stride (#7042)

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* add non virtual destructor back (#6999)

Signed-off-by: daquexian <daquexian566@gmail.com>

Co-authored-by: Houjiang Chen <chenhoujiangcug@gmail.com>

* move python code to cpp: eye (#7036)

* 80% Sbp signature left to finish

* refine functional_api.yaml

* 90% docstr left to update

* refine

* add sbp check

* refine docs

* auto format by CI

* refine

* refine docstr

* auto format by CI

Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* Fix l2norm block_size (#7044)

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* fix undefined symbol: cudaGetDeviceCount (#7052)

* fix_worker_orphan_process (#7048)

* fix_worker_orphan_process

* use SIGTERM instead

* broadcast elemwise binary (#6871)

* add

* broadcast elementwise binary

* fix

* refine

* fix

* refine

* refine

* for compile

* refine

* refine

* refine

* refine

* refine

* revert kernels

* revert kernel

* refine

* refine

* refine

* refine

* nvcc thread to 4

Co-authored-by: ZZK <42901638+MARD1NO@users.noreply.github.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* Source op per critical section (#6472)

* backup code

* EventRecord

* auto format by CI

* backup code

* remove deprecated binary test cases

* refactor volatile to atomic

* add StreamType::InitInstructionStatusIf/StreamType::DeleteInstructionStatusIf

* merge from branch profiling_nn_graph

* address comments

* EventRecordProvider

* more comments for XXXStatusQuerier::SetLaunched

* more comments for SharedEventRecord::Init

* wait source op per critical section

* rename a task_node.cpp

* minor fix

* backup code

* fix compiler complaints

* 1) remove AddCtrlEdgeBetweenSrcDstTickAndInputOutputInSameRank; 2) create CriticalSectionInstance buffers

* fix compiler complaints

* more profiler code

* refactor vm preschedule

* TryMoveFromWaitingToReady

* revert flying_instruction_cnt

* revert to single position to call DispatchInstruction

* revert several code

* reset instruction watermark

* remove is_xxx_hook_empty

* build with profiler

* merge master

* insert device ticks before and after critical sections

* refactor register_num of cs_wait/cs_callback from 2 to 128

* fix static analysis complaints

* fix compiler complaints about JobBuilder::ParallelConf4OpName

* Update oneflow/core/operator/critical_section_wait_tick_op.cpp

Co-authored-by: daquexian <daquexian566@gmail.com>

* address pr comments

* add job example for InstructionsBuilder::LaunchLazyJob

* address pr comments

Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Co-authored-by: ouyangyu <xuanjiuye@gmail.com>
Co-authored-by: daquexian <daquexian566@gmail.com>

* More details of error of getting op matched sbp signature (#7077)

* more details of error msg

* minor change

* address review comment

* avoid namesake iterator

* Module apply only once (#7055)

* add once apply of param

* apply once on buffer

* test reuse var on module to

* test reuse var

* rm useless test

* finish test

* refine test

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* distributed test bugfix (#7057)

* change spawn_shell to spawn_shell_and_check, sleep in script

Signed-off-by: daquexian <daquexian566@gmail.com>

* fix distributed test master addr

Signed-off-by: daquexian <daquexian566@gmail.com>

* remove sleep

Signed-off-by: daquexian <daquexian566@gmail.com>

* spawn_shell -> spawn_shell_ignoring_failure

Signed-off-by: daquexian <daquexian566@gmail.com>

* auto format by CI

* fix bug

Signed-off-by: daquexian <daquexian566@gmail.com>

* auto format by CI

* fix the reversed logic

Signed-off-by: daquexian <daquexian566@gmail.com>

* improve error msg

Signed-off-by: daquexian <daquexian566@gmail.com>

* resolve name conflict of MASTER_ADDR

Signed-off-by: daquexian <daquexian566@gmail.com>

Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* fix promote_type matrix (#7066)

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* fix chunk op dim=-1 bug (#7073)

* fix chunk op dim=-1 bug

* Update oneflow/core/functional/impl/array_functor.cpp

Co-authored-by: Xiaoyu Zhang <35585791+BBuf@users.noreply.github.com>

* Update oneflow/core/functional/impl/array_functor.cpp

Co-authored-by: Xiaoyu Zhang <35585791+BBuf@users.noreply.github.com>

Co-authored-by: Xiaoyu Zhang <35585791+BBuf@users.noreply.github.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* Fix resource desc dump cudnn conf bug (#7038)

* fix Resource::DumpCudnnConf

* fix typo and error msg

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* fix concat bug (#7075)

* fix

* support concat single input

* Clean TensorNameScope after graph build (#7076)

* Clear tensor name scope after graph build

* Add test case of 2 graph caught same free eager tensor

* auto format by CI

Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* fix_abnormal_printing (#7099)

* Fix bias add dropout fuse (#7081)

* fix bias_add dropout fuse when p=0.0

* remove redundant op

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* Support 1d to 2d eager boxing (#7083)

* fix Resource::DumpCudnnConf

* support_1d_to_2d_eager_boxing

* rename stack to unflatten

* add test case

* of format

* refine test case

* Revert "fix Resource::DumpCudnnConf"

This reverts commit f07278d.

* support nd to 1d

* add 2d to 1d test case

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* Implement all User Ops with Op Schema (#7032)

* add oneflow-tblgen: generate op schema (OpInterpCtx) from ods

* cmake: add inja

* tblgen: add oneflow_datatype

* tblgen: use option cat

* tblgen: fix error

* tblgen: put impl in .cpp

* tblgen: fix null attrs

* tblgen: fix null ops

* refine

* refine

* refine

* Refine op schema template and compilation

* add base OpInterpCtx to finish compilation

* fix

* refine

* fix

* add custom infer code

* generate op registrants automatically

* refine

* fix

* update user op ods and fix shape attr

* refine

* refine

* add custom code in op base

* refine comments

* add same_output_regst_num and infer

* support declare hasxx

* update op schema emitter

* refine

* emit output regist num

* refine

* refine

* migrate acc op

* migrate onerec_reader, ones_like, send, pack and padding ops

* add has_sbp_signature_infer_fn

* refine

* migrate pad, parallel_cast, partial_fc and pooling ops

* rm redundant has_device_infer_fn

* migrate prelu, quantization, randperm, reduce and repeat ops

* migrate reshape, reshape_like, roi_align, same_pad, selu and scalar related ops

* back port

* backport

* migrate ops

* refine

* refine

* refine

* refine

* add new op

* fix llvm not found

* fix mlir headers

* fix mlir headers

* fix llvm not found

* refine

* mark override

* fix merge

* fix

* fix

* set op schema as obj lib to speed up

* rewrite ops

* add addn

* add grid

* refine

* add more def (#7051)

* affine grid

* refine

* refine

* refine

* refine

* fix

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* move more ops

* fix math_binary_broadcast/elementwise_ops

* fix hardtanh

* add norm

* rename file and add CpuOnly no_grad

* fix ir & fix norm op

* fix oneflow-tblgen

* fix math_unary_elementwise_op

* fix norm

* fix bn

* fix op schema

* refine

* fix

* refine physical_tensor_desc_infer_fn

* refine

* add ScalarLogicalNotEqualOp & RecvOp

* refine

* auto format by CI

* fix fmt

* add cuda only trait

* delete unused inja

* del inja_copy_headers_to_destination

* delete unused inja

* del inja_copy_headers_to_destination

* add cuda only to tblgen

* fix json inja url and md5 not used

* fix json inja url and md5 not used

* refine

* revert

* add with cuda

* refine

* delete GenUserOpODS

* remove cuda only

* revert cuda only after meeting

* fix

Co-authored-by: PragmaTwice <i@twice.moe>
Co-authored-by: hjchen2 <chenhoujiangcug@gmail.com>
Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>

* Feat/debug pass (#7054)

* add pass debug

* debug pass

* refine comment of fuse add pass

* auto format by CI

Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* Fix error message (#6930)

* fix error message

* fix dot doc

* fix dot elem cnt

* auto format by CI

Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* fix simple ci: add of_op_schema target to tidy check (#7105)

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* Rename AnyType in .td (#7109)

* AnyType => Tensor

* refine

* refine

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* Feat graph reuse var (#7080)

* add once apply of param

* apply once on buffer

* test reuse var on module to

* test reuse var

* rm useless test

* finish test

* refine test

* Clear tensor name scope after graph build

* Add test case of 2 graph caught same free eager tensor

* auto format by CI

* refactor var build draft

* add full func; add check

* done

* add test of calling a parameter outside its module

* fix break test

Co-authored-by: chengtbf <472491134@qq.com>
Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* Fix l2_normalize & add nn.functional.normalize (#6940)

* fix l2_normalize

* add normalize

* add test for normalize

* refine

* clean l2_normalize and refine normalize

* simplify normalize test

* Fix l2norm block_size

* refine

Co-authored-by: Juncheng <liujuncheng1022@gmail.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* Align api in swin transformer (#7058)

* add linspace op

* fix align error in swintransformer

* add @ magic method

* fix conflict

* support tensor list

* fix meshgrid bug

* revert

Co-authored-by: hjchen2 <chenhoujiangcug@gmail.com>

* set CMAKE_LINK_DEPENDS_NO_SHARED to ON (#7063)

Signed-off-by: daquexian <daquexian566@gmail.com>

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* Add other api graph autotest (#7091)

* Clear tensor name scope after graph build

* Add test case of 2 graph caught same free eager tensor

* auto format by CI

* add other api graph autotest

* add more samples

* fix comments

* refine

* refine

* refine

* refine

* refine

* fix error

* fix test error

* fix bug

* fix flip bug

* fix bug

* fix bug

* fix ci bug

* fix ci error

* fix bug

* fix ci error

Co-authored-by: chengtbf <472491134@qq.com>
Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Co-authored-by: Li Xiang <54010254+lixiang007666@users.noreply.github.com>

* [serving] dev graph run (#7008)

* add cmake changes for liboneflow_cpp.so

Signed-off-by: daquexian <daquexian566@gmail.com>

* add separate target for cpp api test

Signed-off-by: daquexian <daquexian566@gmail.com>

* add cpp api test in ci

Signed-off-by: daquexian <daquexian566@gmail.com>

* graph run

* reverse the order of cudnn and cuda library

Signed-off-by: daquexian <daquexian566@gmail.com>

* update logic of BUILD_MONOLITHIC_LIBONEFLOW

Signed-off-by: daquexian <daquexian566@gmail.com>

* rename BUILD_MONOLITHIC_LIBONEFLOW to BUILD_MONOLITHIC_LIBONEFLOW_CPP_SO

Signed-off-by: daquexian <daquexian566@gmail.com>

* refine

* [draft] implement graph parameter load and save (#7010)

* implement parameter save (python) and load (c++)

Signed-off-by: daquexian <daquexian566@gmail.com>

* revert accident changes

Signed-off-by: daquexian <daquexian566@gmail.com>

* fix circular reference

Signed-off-by: daquexian <daquexian566@gmail.com>

* pimpl

* batching

* share lib directory in test container

Signed-off-by: daquexian <daquexian566@gmail.com>

* fix typo;

* add github actions debug

Signed-off-by: daquexian <daquexian566@gmail.com>

* Revert "add github actions debug"

This reverts commit 7d9aef6.

* add upterm debug after exe test

Signed-off-by: daquexian <daquexian566@gmail.com>

* sleep after fail

Signed-off-by: daquexian <daquexian566@gmail.com>

* set LD_LIBRARY_PATH in yml for cpp api test exe

Signed-off-by: daquexian <daquexian566@gmail.com>

* refine

* add test file && input order

* sleep

Signed-off-by: daquexian <daquexian566@gmail.com>

* upload liboneflow_cpp.so

Signed-off-by: daquexian <daquexian566@gmail.com>

* modify cmake to trigger compilation

Signed-off-by: daquexian <daquexian566@gmail.com>

* load job from ir && clean && add mlir model

* [remove useless python code]save to .pb

* add target of_common_obj to remove duplicate REGISTER_PASS  && run of_format

* remove openvino

* remove openvino test

* refine

* IValue

* Update oneflow/api/cpp/framework/graph.h

Co-authored-by: daquexian <daquexian566@gmail.com>

* refine

* refine

* refine

* refine

* refine

* refine

* rename in oneflow.cmake

* refine oneflow.cmake

* make of_api_common object library

* move device util function in api to core

* remove device check in New and ThreadLocalGetOrNew

* refine

* fix device test

* refine graph test

* refine GetExeDir()

* refine GetExeDir() again

* fix

* refine

* fix

Co-authored-by: daquexian <daquexian566@gmail.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Co-authored-by: mosout <mosout@qq.com>

* disable autograd in lazy mode (#7070)

* disable autograd in lazy mode

* refine

* Fix/rand source op in graph (#7092)

* add test

* fix rand consistent

* add test

* Fix powf (#7106)

* quick fix power

* add int scalar test case

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* Dispatch stateful ops using functional api (#7046)

* Dispatch functional stateful ops

* fix

* fix cmake

* fix

* disable attr check since it may not be given when creating op expr.

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* refine

Co-authored-by: VertexC <bob2420083992@gmail.com>

* Fix HWLoc memory affinity (#7115)

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* add_env_api_docs (#7100)

* add_env_api_docs

* minor fix

* fix grammatical errors

Co-authored-by: Yao Chi <later@usopp.net>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* tmp skip s0 print because of slice (#7065)

* tmp skip s0 print because of slice

* tmp skip s0 print in test case

* fix

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* indexing first version (#7012)

* indexing first version

* complete

* test

* out loop

* test skip

* revise

* revise

* shape

* docs

* formatted

* conflict1

* conflict2

* conflict2

* conflict

* revise

* auto format by CI

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>

* fix maybe: add Maybe(T&&) to allow constructing from rvalue T (#7125)

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* autotest_add_graph_log (#7126)

* Meta info consistency check (#7085)

* meta_info_consistency_check

* refine check function

* Update consistent_cast.cpp

* move check to opinterpreter

* refine

* add note

* refactor MetaInfoConsistencyCheck

* of_format

* refine

* NonRecursiveMetaInfoConsistencyCheck

* fix func name

* add IsMetaInfoConsistencyCheckDisable()

* minor fix

* refine

* minor fix

* format

* minor fix

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* cmake: use interface target instead of include_directories in pybind11 (#7128)

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* Import cmake dependence json and inja using FetchContent (#7124)

* import cmake dependence json and inja using FetchContent

* install-llvm: fix url hash

* fix inja config

* add cache var

* fix ninja build

* fix ninja build

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* Add environment variable to set GRPC_ARG_MAX_MESSAGE_LENGTH (#7130)

* env ONEFLOW_GRPC_MAX_MESSAGE_BYTE_SIZE

* set default to -1

* Fea/nhwc (#6811)

* legacy maxpool2d module

* add legacy avgpool2d

* add graph cudnn conv alg config

* add conv2d nhwc

* lazy create cuda_stream in CudaCopyD2HDeviceCtx CudaStreamHandleDeviceCtx

* refine

* conv bn pool nhwc for resnet perf

* one hot with float

* use BiasAddRowGpu

* rm l2 with 0

* reformat

* add nhwc env var

* legacy pool merged into new

* refine

* fix style

* fix and refine

* address review

* fix and refine

* fix doc test

Co-authored-by: luyang <flowingsun007@163.com>
Co-authored-by: guo-ran <360112263@qq.com>
Co-authored-by: lixinqi <lixinqi0703106@163.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* reduce memory usage caused by slice grad (#7144)

* cmake: fix THIRD_PARTY build (#7146)

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* fix fold op (#7156)

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* Support inplace for lazy consistent (#7112)

* Support inplace for lazy consistent

* fix single client sbp hint

* refine

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* fix prelu bug (#7118)

* support dtype and device in prelu

* optimize PreluFunctor

* fix prelu 1-dim error

* update

* update

* auto format by CI

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>

* use ibn2nd_sbp to get nd_sbp (#7155)

Co-authored-by: Houjiang Chen <chenhoujiangcug@gmail.com>

* fix copy bug (#7159)

* fix copy bug

* add to test case

* refine

* fix test case

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* Fix layernorm backward bug (#7164)

* fix layernorm backward index bug

* add layernorm test case

* auto format by CI

Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* [Fix] graph support 0-Size tensor (#6957)

* Add nn.functional.glu graph test

* add filter to modify functional autotest

* modify code

* add test example

* add test else

* add test judging condition for test_masked_fill.py, test_constant.py, test_tile.py, test_repeat.py, test_expand.py

* add test ok example

* Clear tensor name scope after graph build

* Add test case of 2 graph caught same free eager tensor

* auto format by CI

* Dev cc clean tensor name scope (#7082)

* Clear tensor name scope after graph build

* Add test case of 2 graph caught same free eager tensor

* auto format by CI

Co-authored-by: chengtbf <472491134@qq.com>
Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>

* submit test success example

* test success example

* submit test code

* fix a bug about relu module with 0 shape data

* fixed a bug about relu module with 0 shape data

* fix a bug about relu module with 0 shape data

* fix a bug about relu module with 0 shape data

* 0shape and 0d autotest

* fix a bug about relu module with 0 shape data

* 0shape changed to 0_size

* modify test_var.py

* modify test_eye.py

* modify test_reshape.py

* modify test_.py

* modify ReshapeFunctor

* modify some file

* Fixed graph autotest bug with reshape op test

* Fixed graph autotest bug with reshape op test

* fixed test_sub.py

* modify test_sub.py

* modify tensor_methods.cpp

* modify array_functor.cpp

* graph support 0-Size tensor

* rename 0shape to 0 size

* modified check_graph=True

* fix and refine

Co-authored-by: Zhenhua <huangzhenhua@zhejianglab.com>
Co-authored-by: tangnana925 <85614052+tangnana925@users.noreply.github.com>
Co-authored-by: tangnana <tnn_personal@163.com>
Co-authored-by: Zhenhua <1209435+hengzi@users.noreply.github.com>
Co-authored-by: chengtbf <472491134@qq.com>
Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>
Co-authored-by: Xiaoyu Xu <xiaoyulink@gmail.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* Cumsum op implementation (#7050)

* add cumsum op's forward definition

* add cumsum forward test case

* cumsum ver3

* remove calculating time

* add cumsum forward gpu implementation

* fix gpu forward error

* change var name

* remove annotation

* add cumsum cpu forward multi-thread support

* add multi-thread annotation

* add cumsum grad definition

* update

* add cumsum cpu backward

* add cumsum cpu backward functor

* add cumsum autograd

* update

* remove user interface

* use random method to test cumsum forward

* add cumsum gpu backward

* add cumsum gpu test

* fix gpu backward bug

* add a 3d cuda kernel try

* Revert "add cumsum gpu test"

This reverts commit 05c31556ba28ecb827b25e54c2f5fa38984e8096.

* Revert "Revert "add cumsum gpu test""

This reverts commit 918ee1569863b008c1d419c3528257416cffd840.

* change nele to ele_cnt

* add test_cumsum.py in oneflow/test/modules

* change original test_cumsum to autotest version

* optimize cumsum for special up_space and down_space

* add two special cu func

* add cumsum doc

* update doc

* update doc

* update code according to bbuf's review

* ditto

* change pin/pout to in_ptr/out_ptr

* remove multi-thread func

* update doc

* use tensor processor

* update by review

* update by review

* update

* update

* auto format by CI

* auto format by CI

* update doc

* update

Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>

* Logical slice in tensor str (#7116)

* using logical slice in tensor str

* add tensor str util file

* refine

* refine

* refine

* refine

* add logical slice docs

* fix bug

* fix comment

* auto format by CI

* fix doc test bug

* delete TODO

Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* Add install for oneflow py (#7107)

* Add install for oneflow py

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

* refine

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* fix bug: output key not exists when SavaJobToIR (#7139)

* fix bug: output key not exists when SavaJobToIR

* [test] makedirs when path not exists

* remove useless comment

Co-authored-by: Peihong Liu <mosout@qq.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* Add linalg 2d norm op for clip_grad (#7160)

* add linalg_2d_norm op for clip_grad

* code format

* revert sqrt

* fix comment

* refine

* fix comment

* fix ci error

* fix ci error

* fix docs bug

* fix ci error

* fix ci error

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>

* refine nn.graph autotest (#7111)

* add linspace op

* refine graph autotest

* revert

* add graph error trace

* fix bug

* fix autotest bug

* auto format by CI

* fix set_printoptions error

* auto format by CI

* CI test bug

* auto format by CI

* For CI

* auto format by CI

* For CI test

* fix ci error

* revert for ci

* fix bug

* fix ci error

* fix bug

* fix bug

Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>
Co-authored-by: Li Xiang <54010254+lixiang007666@users.noreply.github.com>
Co-authored-by: lixiang <88304454@qq.com>

* add oneflow/pytorch cudnn.deterministic (#7172)

* add cudnn.deterministic

* fix bug

* auto format by CI

* fix bug

* fix generate fake program input bug

* auto format by CI

Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>

* fix linalg vector norm scalar tensor print bug (#7178)

* fix linalg vector norm scalar tensor print bug

* auto format by CI

Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>

* format

* refine

* format

Co-authored-by: Yinggang Wang <wyg19970408@gmail.com>
Co-authored-by: oneflow-ci-bot <69100618+oneflow-ci-bot@users.noreply.github.com>
Co-authored-by: oneflow-ci-bot <ci-bot@oneflow.org>
Co-authored-by: liufengwei0103 <2472937968@qq.com>
Co-authored-by: daquexian <daquexian566@gmail.com>
Co-authored-by: guo ran <360112263@qq.com>
Co-authored-by: Peihong Liu <mosout@qq.com>
Co-authored-by: ZZK <42901638+MARD1NO@users.noreply.github.com>
Co-authored-by: Shenghang Tsai <jackalcooper@gmail.com>
Co-authored-by: Li Xiang <54010254+lixiang007666@users.noreply.github.com>
Co-authored-by: cheng cheng <472491134@qq.com>
Co-authored-by: Li Xinqi <lixinqi2010@gmail.com>
Co-authored-by: lichunyou <33850693+lcylcy@users.noreply.github.com>
Co-authored-by: Luyang <flowingsun007@163.com>
Co-authored-by: wyushun <wyushun@foxmail.com>
Co-authored-by: zhu wang <33675639+olojuwin@users.noreply.github.com>
Co-authored-by: leaves-zwx <kunta0932@gmail.com>
Co-authored-by: Xiaoyu Zhang <35585791+BBuf@users.noreply.github.com>
Co-authored-by: Shijie <821898965@qq.com>
Co-authored-by: XIE Xuan <xiexuanx2@gmail.com>
Co-authored-by: Juncheng <liujuncheng1022@gmail.com>
Co-authored-by: binbinHan <han_binbin@163.com>
Co-authored-by: CHI LIU <42956025+thinksoso@users.noreply.github.com>
Co-authored-by: Yao Chi <later@usopp.net>
Co-authored-by: ouyangyu <xuanjiuye@gmail.com>
Co-authored-by: Xiaoyu Xu <xiaoyulink@gmail.com>
Co-authored-by: PragmaTwice <i@twice.moe>
Co-authored-by: luqiang guo <702572275@qq.com>
Co-authored-by: ZeKai Zhou <30856589+zzk0@users.noreply.github.com>
Co-authored-by: VertexC <bob2420083992@gmail.com>
Co-authored-by: lixinqi <lixinqi0703106@163.com>
Co-authored-by: fengdaozhuo <52237830+grybd@users.noreply.github.com>
Co-authored-by: Zhenhua <huangzhenhua@zhejianglab.com>
Co-authored-by: tangnana925 <85614052+tangnana925@users.noreply.github.com>
Co-authored-by: tangnana <tnn_personal@163.com>
Co-authored-by: Zhenhua <1209435+hengzi@users.noreply.github.com>
Co-authored-by: lixiang <88304454@qq.com>
Showing 815 changed files with 38,707 additions and 18,043 deletions.
4 changes: 2 additions & 2 deletions .github/workflows/canary.yml
@@ -4,7 +4,7 @@ on:
push:
branches:
- master
- add-canary-release
- add-support-clang-12
workflow_dispatch:
inputs:
oneflow-ref:
@@ -43,7 +43,7 @@ jobs:
- name: Checkout Oneflow-Inc/oneflow
if: ${{ github.event.inputs.oneflow-ref == '' }}
uses: actions/checkout@v2
- uses: Oneflow-Inc/get-oneflow@canary-release
- uses: Oneflow-Inc/get-oneflow@support-clang-12
name: Build manylinux
id: build-cuda
with:
6 changes: 3 additions & 3 deletions .github/workflows/simple.yml
@@ -50,7 +50,7 @@ jobs:
cmake .. -C ../cmake/caches/international/cpu.cmake \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_TESTING=ON
cmake --build . -j$(nproc) --target oneflow_deps of_cfgobj of_protoobj of_functional_obj of_functional_tensor_obj
cmake --build . -j$(nproc) --target oneflow_deps of_cfgobj of_protoobj of_functional_obj of_functional_tensor_obj of_op_schema
- name: Run clang-tidy for all translation units
# use clang as compiler for correct compiler flags
run: |
@@ -247,7 +247,7 @@ jobs:
repository: Oneflow-Inc/conda-env
ref: 30a7f00eb48ee9009d85a848e720823e5054c66b
path: conda-env
- uses: Oneflow-Inc/get-oneflow@canary-release
- uses: Oneflow-Inc/get-oneflow@support-clang-12
name: Build with gcc7
if: ${{ matrix.build-type == 'gcc7'}}
with:
@@ -256,7 +256,7 @@ jobs:
oneflow-build-env: conda
conda-env-file: conda-env/dev/gcc7/environment-v2.yml
conda-env-name: oneflow-dev-gcc7-v2
- uses: Oneflow-Inc/get-oneflow@canary-release
- uses: Oneflow-Inc/get-oneflow@support-clang-12
name: Build with clang10
if: ${{ matrix.build-type == 'clang10'}}
with:
83 changes: 52 additions & 31 deletions .github/workflows/test.yml
@@ -182,7 +182,7 @@ jobs:
with:
ref: ${{ github.event.pull_request.head.sha }}
repository: ${{github.event.pull_request.head.repo.full_name}}
- uses: Oneflow-Inc/get-oneflow/cache-complete/matrix/build@canary-release
- uses: Oneflow-Inc/get-oneflow/cache-complete/matrix/build@support-clang-12
name: find cache
id: find-cache
timeout-minutes: 5
@@ -228,7 +228,7 @@ jobs:
with:
ref: ${{ github.event.pull_request.head.sha }}
repository: ${{github.event.pull_request.head.repo.full_name}}
- uses: Oneflow-Inc/get-oneflow/cache-complete@canary-release
- uses: Oneflow-Inc/get-oneflow/cache-complete@support-clang-12
name: Save cache if successful
id: save-cache
timeout-minutes: 5
@@ -242,7 +242,7 @@ jobs:
run: |
echo "::error file=test.yml,line=204,col=10::steps.save-cache.outputs.cache-hit != matrix.cache-hit"
exit 1
- uses: Oneflow-Inc/get-oneflow@canary-release
- uses: Oneflow-Inc/get-oneflow@support-clang-12
name: Build manylinux cpu only
id: build-cpu
if: ${{ matrix.entry =='cpu' && !matrix.cache-hit }}
@@ -263,7 +263,7 @@ jobs:
python-versions: |
3.6
3.7
- uses: Oneflow-Inc/get-oneflow@canary-release
- uses: Oneflow-Inc/get-oneflow@support-clang-12
name: Build manylinux cu102
id: build-cuda
if: ${{ matrix.entry =='cu102' && !matrix.cache-hit }}
@@ -284,7 +284,7 @@ jobs:
python-versions: |
3.6
3.7
- uses: Oneflow-Inc/get-oneflow@canary-release
- uses: Oneflow-Inc/get-oneflow@support-clang-12
name: Build manylinux cu101_xla
id: build-xla
if: ${{ matrix.entry =='cu101_xla' && !matrix.cache-hit && needs.changed_files.outputs.should_run_single_client_tests == '1' }}
@@ -306,7 +306,7 @@ jobs:
3.6
- name: Upload bin
if: ${{ !fromJson(matrix.cache-hit) && contains(matrix.runs-on, 'self-hosted') && (steps.build-cpu.outcome == 'success' || steps.build-cuda.outcome == 'success' || steps.build-xla.outcome == 'success') }}
uses: Oneflow-Inc/get-oneflow/digest/upload@canary-release
uses: Oneflow-Inc/get-oneflow/digest/upload@support-clang-12
timeout-minutes: 10
with:
digest: ${{ steps.save-cache.outputs.build-digest }}
@@ -315,9 +315,20 @@ jobs:
ssh-tank-path: ${{ env.SSH_TANK_PATH }}
src-dir: ${{ env.MANYLINUX_CACHE_DIR }}/build/bin
dst-dir: bin
- name: Upload liboneflow_cpp library
if: ${{ !fromJson(matrix.cache-hit) && contains(matrix.runs-on, 'self-hosted') && (steps.build-cpu.outcome == 'success' || steps.build-cuda.outcome == 'success') }}
uses: Oneflow-Inc/get-oneflow/digest/upload@support-clang-12
timeout-minutes: 10
with:
digest: ${{ steps.save-cache.outputs.build-digest }}
entry: ${{ matrix.entry }}
ssh-tank-host: ${{ env.SSH_TANK_HOST }}
ssh-tank-path: ${{ env.SSH_TANK_PATH }}
src-dir: ${{ env.MANYLINUX_CACHE_DIR }}/build/liboneflow_cpp/lib
dst-dir: liboneflow_cpp/lib
- name: Upload whl
if: ${{ !fromJson(matrix.cache-hit) && contains(matrix.runs-on, 'self-hosted') && (steps.build-cpu.outcome == 'success' || steps.build-cuda.outcome == 'success' || steps.build-xla.outcome == 'success') }}
uses: Oneflow-Inc/get-oneflow/digest/upload@canary-release
uses: Oneflow-Inc/get-oneflow/digest/upload@support-clang-12
timeout-minutes: 10
with:
digest: ${{ steps.save-cache.outputs.build-digest }}
@@ -331,14 +342,19 @@ jobs:
name: Build with clang
if: github.event.pull_request.draft == false && github.base_ref == 'master' && contains(github.event.pull_request.requested_reviewers.*.login, 'oneflow-ci-bot')
runs-on: [self-hosted, linux, build]
env:
ONEFLOW_SRC: .
MANYLINUX_CACHE_DIR: ~/manylinux-cache-dir/clang13
CUDA_VERSION: "10.1"
WHEELHOUSE_DIR: ./wheelhouse
steps:
- name: Fix permissions
run: |
set -x
docker run --rm -v $PWD:$PWD -w $PWD busybox rm -rf *
- name: Checkout Oneflow-Inc/oneflow
uses: actions/checkout@v2
- uses: Oneflow-Inc/get-oneflow/cache-complete@canary-release
- uses: Oneflow-Inc/get-oneflow/cache-complete@support-clang-12
name: Save cache if successful
id: save-cache
timeout-minutes: 5
@@ -347,25 +363,26 @@ jobs:
entry: build-with-clang
digest-type: build
mark-as-completed: ${{ github.event.pull_request.head.repo.full_name == github.repository }}
- name: Checkout Oneflow-Inc/conda-env
if: ${{ !fromJSON(steps.save-cache.outputs.cache-hit) }}
uses: actions/checkout@v2
with:
repository: Oneflow-Inc/conda-env
ref: 30a7f00eb48ee9009d85a848e720823e5054c66b
path: conda-env
- uses: Oneflow-Inc/get-oneflow@canary-release
- name: Build with Clang
uses: Oneflow-Inc/get-oneflow@support-clang-12
if: ${{ !fromJSON(steps.save-cache.outputs.cache-hit) }}
name: Build with clang10
with:
cmake-init-cache: cmake/caches/ci/gh-hosted/cpu-clang.cmake
oneflow-src: .
oneflow-build-env: conda
conda-env-file: conda-env/dev/clang10/environment-v2.yml
conda-env-name: oneflow-dev-clang10-v2
conda-installer-url: https://oneflow-static.oss-cn-beijing.aliyuncs.com/downloads/conda-installers/Miniconda3-py39_4.10.3-Linux-x86_64.sh
conda-prefix: ~/miniconda3-prefixes/py39_4.10.3
cmake-init-cache: ${{ env.ONEFLOW_SRC }}/cmake/caches/ci/llvm/cuda-75-clang.cmake
build-script: ${{ env.ONEFLOW_SRC }}/ci/clang/build-llvm.sh
oneflow-src: ${{ env.ONEFLOW_SRC }}
oneflow-build-env: llvm
wheelhouse-dir: ${{ env.WHEELHOUSE_DIR }}
clear-wheelhouse-dir: true
self-hosted: true
cuda-version: ${{ env.CUDA_VERSION }}
manylinux-cache-dir: ${{ env.MANYLINUX_CACHE_DIR }}
docker-run-use-system-http-proxy: false
docker-run-use-lld: false
retry-failed-build: true
clean-ccache: ${{ contains(github.event.pull_request.labels.*.name, 'need-clean-ccache') }}
wheel-audit: false
python-versions: |
3.8
find-test-cache:
name: "Find test cache"
@@ -382,7 +399,7 @@ jobs:
with:
ref: ${{ github.event.pull_request.head.sha }}
repository: ${{github.event.pull_request.head.repo.full_name}}
- uses: Oneflow-Inc/get-oneflow/cache-complete/matrix/test@canary-release
- uses: Oneflow-Inc/get-oneflow/cache-complete/matrix/test@support-clang-12
name: find cache
id: find-cache
timeout-minutes: 5
@@ -424,7 +441,7 @@ jobs:
if: ${{ contains(matrix.runs-on, 'self-hosted') }}
run: |
docker rm -f ${{ env.TEST_CONTAINER_NAME }} || true
- uses: Oneflow-Inc/get-oneflow/cache-complete@canary-release
- uses: Oneflow-Inc/get-oneflow/cache-complete@support-clang-12
name: Save cache if successful
id: save-cache
timeout-minutes: 5
@@ -438,9 +455,9 @@ jobs:
run: |
echo "::error file=test.yml,line=204,col=10::steps.save-cache.outputs.cache-hit != matrix.cache-hit"
exit 1
- name: Download wheel and binary
- name: Download wheel, binary and liboneflow_cpp lib
if: ${{ !fromJson(matrix.cache-hit) && contains(matrix.runs-on, 'self-hosted') && (!fromJson(matrix.is-xla) || (fromJson(matrix.is-xla) && needs.changed_files.outputs.should_run_single_client_tests == '1')) }}
uses: Oneflow-Inc/get-oneflow/digest/download@canary-release
uses: Oneflow-Inc/get-oneflow/digest/download@support-clang-12
id: download-digest
timeout-minutes: 10
with:
@@ -492,13 +509,15 @@ jobs:
working-directory: ${{ env.ONEFLOW_SRC }}
env:
ONEFLOW_BIN_PATH: ${{ steps.download-digest.outputs.entry-dir }}/bin
ONEFLOW_CPP_API_LIB_PATH: ${{ steps.download-digest.outputs.entry-dir }}/liboneflow_cpp/lib
run: |
docker run -d --rm --privileged --shm-size=8g \
--cap-add=SYS_PTRACE --security-opt seccomp=unconfined \
--runtime=nvidia \
-v /dataset:/dataset:ro -v /model_zoo:/model_zoo:ro \
-v ${ONEFLOW_WHEEL_PATH}:${ONEFLOW_WHEEL_PATH}:ro \
-v ${ONEFLOW_BIN_PATH}:${ONEFLOW_BIN_PATH}:ro \
-v ${ONEFLOW_CPP_API_LIB_PATH}:${ONEFLOW_CPP_API_LIB_PATH}:ro \
-v $HOME/test-container-cache/dot-local:/root/.local \
-v $HOME/test-container-cache/dot-cache:/root/.cache \
-e ONEFLOW_WHEEL_PATH=${ONEFLOW_WHEEL_PATH} \
@@ -527,11 +546,13 @@ jobs:
run: |
docker exec ${{ env.TEST_CONTAINER_NAME }} python3 -m oneflow --doctor
- name: Exe test
if: ${{ !fromJson(matrix.cache-hit) && matrix.test-type == 'misc' }}
if: ${{ !fromJson(matrix.cache-hit) && matrix.test-type == 'misc' && matrix.device == 'cpu' }}
timeout-minutes: 10
run: |
chmod +x ${{ steps.download-digest.outputs.entry-dir }}/bin/oneflow_testexe
docker exec ${{ env.TEST_CONTAINER_NAME }} ${{ steps.download-digest.outputs.entry-dir }}/bin/oneflow_testexe
chmod +x ${{ steps.download-digest.outputs.entry-dir }}/bin/oneflow_cpp_api_testexe
docker exec -e LD_LIBRARY_PATH=${{ steps.download-digest.outputs.entry-dir }}/liboneflow_cpp/lib ${{ env.TEST_CONTAINER_NAME }} ${{ steps.download-digest.outputs.entry-dir }}/bin/oneflow_cpp_api_testexe
- name: Build documentation
timeout-minutes: 10
if: ${{ !fromJson(matrix.cache-hit) && matrix.test-type == 'misc' && matrix.device == 'cpu' }}
@@ -744,7 +765,7 @@ jobs:
ref: ${{ github.event.pull_request.head.sha }}
repository: ${{github.event.pull_request.head.repo.full_name}}
fetch-depth: 0
- uses: Oneflow-Inc/get-oneflow/cache-complete@canary-release
- uses: Oneflow-Inc/get-oneflow/cache-complete@support-clang-12
name: Save cache if successful
id: save-cache
timeout-minutes: 5
@@ -785,7 +806,7 @@ jobs:
-DBUILD_TESTING=ON \
-DCMAKE_C_COMPILER_LAUNCHER=ccache \
-DCMAKE_CXX_COMPILER_LAUNCHER=ccache
cmake --build . -j$(nproc) --target oneflow_deps of_cfgobj of_protoobj of_functional_obj of_functional_tensor_obj
cmake --build . -j$(nproc) --target oneflow_deps of_cfgobj of_protoobj of_functional_obj of_functional_tensor_obj of_op_schema
- name: Fetch upstream
if: ${{ !fromJSON(steps.save-cache.outputs.cache-hit) && github.event.pull_request.head.repo.full_name != github.event.pull_request.base.repo.full_name }}
run: |
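The "Exe test" step in the test.yml diff above passes `LD_LIBRARY_PATH` through `docker exec -e` as a single directory, so the test binary sees only that directory in the variable. A minimal sketch of a variant that keeps any entries already present, assuming a helper named `prepend_ld_path` (hypothetical, not part of this commit):

```shell
# Hypothetical helper, not part of the commit.
# Prints a search-path value with the given directory first, followed by
# whatever LD_LIBRARY_PATH already holds in the current shell (if anything).
prepend_ld_path() {
  if [ -n "${LD_LIBRARY_PATH:-}" ]; then
    printf '%s:%s\n' "$1" "$LD_LIBRARY_PATH"
  else
    printf '%s\n' "$1"
  fi
}
```

To preserve the container's own value, the helper would have to run in the shell that launches `oneflow_cpp_api_testexe` inside the container rather than on the runner host; whether the CI image sets `LD_LIBRARY_PATH` at all is not shown in this commit.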
42 changes: 22 additions & 20 deletions CMakeLists.txt
@@ -1,6 +1,7 @@
# Minimum CMake required
cmake_minimum_required(VERSION 3.18.0)

set(CMAKE_INSTALL_MESSAGE LAZY CACHE STRING "")
if (NOT CMAKE_BUILD_TYPE)
message(STATUS "No build type selected, default to Release")
set(CMAKE_BUILD_TYPE "Release" CACHE STRING "Build type (default Release)" FORCE)
@@ -23,7 +24,8 @@ endif()
option(USE_CLANG_FORMAT "" OFF)
option(USE_CLANG_TIDY "" OFF)
option(BUILD_PYTHON "" ON)
option(BUILD_MONOLITHIC_LIBONEFLOW "" ON)
option(BUILD_CPP_API "Option to build OneFlow C++ API (beta)" OFF)
option(BUILD_MONOLITHIC_LIBONEFLOW_CPP_SO "Option to build a monolithic liboneflow_cpp.so (only meaningful when BUILD_CPP_API is ON)" ON)
option(BUILD_RDMA "" OFF)
option(BUILD_CUDA "" ON)
option(WITH_ONEDNN "" OFF)
@@ -33,7 +35,12 @@ option(WITH_TENSORRT "Option to build with TensorRT" OFF)
option(WITH_OPENVINO "Option to build with OpenVINO" OFF)
option(WITH_MLIR "" OFF)
option(WITH_MLIR_CUDA_CODEGEN "" OFF)
set(LLVM_PROVIDER "in-tree" CACHE STRING "in-tree, install")
if (NOT WITH_MLIR)
set(LLVM_PROVIDER "install" CACHE STRING "in-tree will build LLVM's ALL, not what we want when not building MLIR" FORCE)
endif(NOT WITH_MLIR)
option(WITH_COCOAPI "Option to build with COCO API" ON)
option(WITH_ZLIB "" ON)
option(BUILD_GIT_VERSION "" ON)
option(BUILD_PROFILER "" OFF)
option(OF_SOFTMAX_USE_FAST_MATH "" ON)
@@ -201,28 +208,22 @@ endif()

if(BUILD_PYTHON)
set(ONEFLOW_INCLUDE_DIR "${ONEFLOW_PYTHON_DIR}/oneflow/include")
else() # build_python
set(ONEFLOW_INCLUDE_DIR "${PROJECT_BINARY_DIR}/liboneflow/include/oneflow")
set(ONEFLOW_LIBRARY_DIR "${PROJECT_BINARY_DIR}/liboneflow/lib")
set(ONEFLOW_SHARE_DIR "${PROJECT_BINARY_DIR}/liboneflow/share")
make_directory(${ONEFLOW_INCLUDE_DIR})
make_directory(${ONEFLOW_LIBRARY_DIR})
make_directory(${ONEFLOW_SHARE_DIR})
endif(BUILD_PYTHON)

if(BUILD_CPP_API)
set(LIBONEFLOW_LIBRARY_DIR "${PROJECT_BINARY_DIR}/liboneflow_cpp/lib")
set(LIBONEFLOW_SHARE_DIR "${PROJECT_BINARY_DIR}/liboneflow_cpp/share")
make_directory(${LIBONEFLOW_LIBRARY_DIR})
make_directory(${LIBONEFLOW_SHARE_DIR})

if(BUILD_SHARED_LIBS)
if(BUILD_MONOLITHIC_LIBONEFLOW)
set(BUILD_SHARED_LIBS OFF)
if(BUILD_MONOLITHIC_LIBONEFLOW_CPP_SO)
message(FATAL_ERROR "BUILD_MONOLITHIC_LIBONEFLOW_CPP_SO is incompatible with BUILD_SHARED_LIBS. Please set either of them to OFF.")
else()
set(LIBRARY_OUTPUT_PATH ${ONEFLOW_LIBRARY_DIR})
endif(BUILD_MONOLITHIC_LIBONEFLOW)
set(BUILD_SHARED_LIBONEFLOW ON)
else()
if(BUILD_MONOLITHIC_LIBONEFLOW)
message(WARNING "BUILD_MONOLITHIC_LIBONEFLOW=ON is meaningless when BUILD_SHARED_LIBS=OFF")
endif()
set(BUILD_SHARED_LIBONEFLOW OFF)
set(LIBRARY_OUTPUT_PATH ${LIBONEFLOW_LIBRARY_DIR})
endif(BUILD_MONOLITHIC_LIBONEFLOW_CPP_SO)
endif(BUILD_SHARED_LIBS)
endif(BUILD_PYTHON)
endif(BUILD_CPP_API)

include(third_party)

@@ -261,7 +262,7 @@ if (BUILD_CUDA)

if ("${CMAKE_CUDA_COMPILER_ID}" STREQUAL "NVIDIA")
if(CMAKE_CUDA_COMPILER_VERSION VERSION_GREATER_EQUAL "11.2")
set(CUDA_NVCC_THREADS_NUMBER "1" CACHE STRING "")
set(CUDA_NVCC_THREADS_NUMBER "4" CACHE STRING "")
list(APPEND CUDA_NVCC_FLAGS -t ${CUDA_NVCC_THREADS_NUMBER})
endif()
message(STATUS "CUDA_NVCC_FLAGS: " ${CUDA_NVCC_FLAGS})
@@ -276,3 +277,4 @@ add_custom_target(oneflow_deps ALL DEPENDS prepare_oneflow_third_party)
if (ONEFLOW)
include(oneflow)
endif()
add_subdirectory(ci)
1 change: 1 addition & 0 deletions ci/CMakeLists.txt
@@ -0,0 +1 @@
add_subdirectory(test)
28 changes: 28 additions & 0 deletions ci/clang/build-llvm.sh
@@ -0,0 +1,28 @@
set -ex
export PATH=/usr/lib/llvm-12/bin:/usr/lib/llvm-13/bin:/usr/lib64/ccache:/root/.local/bin:$PATH

# clean python dir
cd ${ONEFLOW_CI_SRC_DIR}
${ONEFLOW_CI_PYTHON_EXE} -m pip install -i https://mirrors.aliyun.com/pypi/simple --user -r ci/fixed-dev-requirements.txt
cd python
git clean -nXd -e \!dist -e \!dist/**
git clean -fXd -e \!dist -e \!dist/**

# cmake config
mkdir -p ${ONEFLOW_CI_BUILD_DIR}
cd ${ONEFLOW_CI_BUILD_DIR}
find ${ONEFLOW_CI_BUILD_DIR} -name CMakeCache.txt
find ${ONEFLOW_CI_BUILD_DIR} -name CMakeCache.txt -delete
if [ ! -f "$ONEFLOW_CI_CMAKE_INIT_CACHE" ]; then
echo "$ONEFLOW_CI_CMAKE_INIT_CACHE does not exist."
exit 1
fi
cmake -S ${ONEFLOW_CI_SRC_DIR} -C ${ONEFLOW_CI_CMAKE_INIT_CACHE} -DPython3_EXECUTABLE=${ONEFLOW_CI_PYTHON_EXE}
# cmake build
cd ${ONEFLOW_CI_BUILD_DIR}
cmake --build . -j $(nproc)

# build pip
cd ${ONEFLOW_CI_SRC_DIR}
cd python
${ONEFLOW_CI_PYTHON_EXE} setup.py bdist_wheel
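ci/clang/build-llvm.sh reads `ONEFLOW_CI_SRC_DIR`, `ONEFLOW_CI_BUILD_DIR`, `ONEFLOW_CI_CMAKE_INIT_CACHE` and `ONEFLOW_CI_PYTHON_EXE`, but runs under `set -ex` without `-u`, so an unset variable expands to an empty string (for example, `cd` with no argument changes to `$HOME`). A minimal sketch of a fail-fast check, assuming a helper named `require_env` (hypothetical, not part of this commit):

```shell
# Hypothetical helper, not part of the commit.
# Returns non-zero and names the first variable that is unset or empty.
require_env() {
  for name in "$@"; do
    # Indirect lookup through eval keeps this compatible with POSIX sh.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      return 1
    fi
  done
}
```

Called near the top of the script as `require_env ONEFLOW_CI_SRC_DIR ONEFLOW_CI_BUILD_DIR ONEFLOW_CI_CMAKE_INIT_CACHE ONEFLOW_CI_PYTHON_EXE`, it would abort under `set -e` before the `git clean` and `find ... -delete` commands run. The helper only checks names passed by the caller, so the `eval` never sees external input.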
2 changes: 0 additions & 2 deletions ci/test/1node_op_test.sh
@@ -37,5 +37,3 @@ then
else
echo "deadlock unsolved, skipping multi-card eager"
fi

ONEFLOW_TEST_MULTI_PROCESS=1 python3 test/ops/test_multi_process.py --failfast --verbose
4 changes: 2 additions & 2 deletions ci/test/2node_op_test_multi_client.sh
@@ -17,7 +17,7 @@ cd ${test_tmp_dir}/$(basename $test_dir)

for device_num in 1 2 4
do
ONEFLOW_TEST_NODE_NUM=2 ONEFLOW_TEST_DEVICE_NUM=$device_num python3 -m oneflow.distributed.launch --nproc_per_node $device_num --nnodes=2 --node_rank=$NODE_RANK --master_addr 192.168.1.12 -m unittest discover ${PWD} --failfast --verbose
ONEFLOW_TEST_NODE_NUM=2 ONEFLOW_TEST_DEVICE_NUM=$device_num python3 -m oneflow.distributed.launch --nproc_per_node $device_num --nnodes=2 --node_rank=$NODE_RANK --master_addr $_MASTER_ADDR -m unittest discover ${PWD} --failfast --verbose
# use a invalid ibverbs lib to test if falling back to epoll works
ONEFLOW_TEST_NODE_NUM=2 ONEFLOW_TEST_DEVICE_NUM=$device_num ONEFLOW_LIBIBVERBS_PATH=invalid_lib python3 -m oneflow.distributed.launch --nproc_per_node $device_num --nnodes=2 --node_rank=$NODE_RANK --master_addr 192.168.1.12 -m unittest discover ${PWD} --failfast --verbose
ONEFLOW_TEST_NODE_NUM=2 ONEFLOW_TEST_DEVICE_NUM=$device_num ONEFLOW_LIBIBVERBS_PATH=invalid_lib python3 -m oneflow.distributed.launch --nproc_per_node $device_num --nnodes=2 --node_rank=$NODE_RANK --master_addr $_MASTER_ADDR -m unittest discover ${PWD} --failfast --verbose
done
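The diff above replaces the hard-coded `192.168.1.12` with `$_MASTER_ADDR`, but does not show where that variable is set. If it is empty, `--master_addr` would be followed directly by `-m`. A minimal sketch of a guard, assuming a wrapper named `master_addr` (hypothetical, not part of this commit):

```shell
# Hypothetical wrapper, not part of the commit.
# Prints _MASTER_ADDR, or fails with a message when it is unset or empty.
master_addr() {
  # ${var:?msg} aborts the enclosing shell when var is unset or empty;
  # the parentheses confine that abort to a subshell, so the function
  # simply returns a non-zero status instead of killing the caller.
  ( printf '%s\n' "${_MASTER_ADDR:?_MASTER_ADDR must be set}" )
}
```

A script running under `set -e` could resolve the address once before the loop, for example `addr=$(master_addr)`, and pass `--master_addr "$addr"` to each launch.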

0 comments on commit 322a36b
