[WIP] [CodeGen] Fix fp16 -> fp32 codegen on X86-64 #2925

Merged: 1 commit, Mar 31, 2019

Contributor

@ajtulloch ajtulloch commented Mar 30, 2019

LLVM fails to generate calls to AVX/AVX512 variants of vcvtph2ps by default, which is inconvenient.

This adds a pattern match (I'll generalize it to arbitrary vector lengths) that dispatches to the AVX/AVX512 intrinsic functions when available.

This speeds up the simple script below by roughly 5x on my Haswell Core i7.

import tvm
import numpy as np
# Build a simple op that does 8-wide vectorized Float16 to Float32 conversion.

N = tvm.var("N")
V = 8
X = tvm.placeholder((N, V), name="X", dtype="float16")

def tvm_from_fp16(x):
    return x.astype("float32")

Y = tvm.compute(X.shape, lambda *i: tvm_from_fp16(X(*i)))

s = tvm.create_schedule(Y.op)
s[Y].vectorize(s[Y].op.axis[1])

target = tvm.target.create("llvm -mcpu=core-avx2")

with target:
    func = tvm.build(s, [X, Y])
    # print(func.get_source("asm"))
    # print(func.get_source("ll"))
    te = func.time_evaluator(func.entry_name, ctx=tvm.cpu(), min_repeat_ms=1000)
    x_np = tvm.nd.array(np.random.randn(10000, 8).astype(np.float16))
    y_np = tvm.nd.array(np.random.randn(10000, 8).astype(np.float32))
    print("Before: {:.2f}us".format(te(x_np, y_np).mean * 10 ** 6))
    np.testing.assert_equal(x_np.asnumpy().astype(np.float32), y_np.asnumpy())
    
target = tvm.target.create("llvm -mcpu=core-avx2 -mattr=+avx2,+f16c")

with target:
    func = tvm.build(s, [X, Y])
    # print(func.get_source("asm"))
    # print(func.get_source("ll"))
    te = func.time_evaluator(func.entry_name, ctx=tvm.cpu(), min_repeat_ms=1000)
    x_np = tvm.nd.array(np.random.randn(10000, 8).astype(np.float16))
    y_np = tvm.nd.array(np.random.randn(10000, 8).astype(np.float32))
    print("After: {:.2f}us".format(te(x_np, y_np).mean * 10 ** 6))
    np.testing.assert_equal(x_np.asnumpy().astype(np.float32), y_np.asnumpy())


# Before: 74.09us
# After: 14.90us

@ajtulloch
Contributor Author

@tqchen, @yidawang, @yzhliu, this may be of interest to you folks.

One question: is it possible to add unit tests that execute only when the underlying machine supports specific architectural features? It would be convenient to add a unit test for this that runs on machines with AVX or AVX-512, but I don't know how to express that in the current testing infrastructure.
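For illustration, one way to gate a test on host CPU features is sketched below. It is not part of this PR or of TVM's test infrastructure: `parse_cpu_flags`, `host_cpu_flags`, and the test class are hypothetical names, and the sketch assumes a Linux host that exposes a `flags` line in `/proc/cpuinfo`. On other platforms it reports no flags, so the gated test is skipped.

```python
import platform
import unittest


def parse_cpu_flags(cpuinfo_text):
    """Return the set of CPU flags found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def host_cpu_flags(path="/proc/cpuinfo"):
    """Read the host's CPU flags; empty set if they cannot be determined."""
    if platform.system() != "Linux":
        return set()
    try:
        with open(path) as f:
            return parse_cpu_flags(f.read())
    except OSError:
        return set()


HOST_FLAGS = host_cpu_flags()


class TestFp16Conversion(unittest.TestCase):
    # Hypothetical test: runs only when the host can execute F16C code.
    @unittest.skipUnless("f16c" in HOST_FLAGS, "host CPU lacks F16C")
    def test_f16c_path(self):
        # A real test would build with an F16C-enabled target and run it here.
        pass
```

Note that this only decides whether the compiled code can be executed on the test machine. It says nothing about which target the code is generated for.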

@hlu1
Contributor

hlu1 commented Mar 30, 2019

Contributor

@yidawang yidawang left a comment

LGTM in general. Just have some questions, mostly for my own educational purpose :)

@@ -0,0 +1,68 @@
/*!
* Copyright (c) 2017 by Contributors
Contributor

how time flies


llvm::Value* CodeGenX86_64::VisitExpr_(const Cast* op) {
  // LLVM does not automatically generate the correct instruction sequences for
  // half -> float conversion (using AVX2/AVX512 variants of vcvtph2ps).
Contributor

AVX-512

target_machine_->getTargetFeatureString().find("avx512f") != llvm::StringRef::npos;

// TODO(tulloch): implement version generic over lanes.
if (from.lanes() == 8 && (has_f16c || has_avx512f)) {
Contributor

I don't get the logic here. If lanes==8, why could has_avx512f be true?

Contributor Author

lanes == 8 is a property of the Expr we are compiling; has_avx512f is a property of the TargetMachine we are generating code for. The two are independent, so both can hold at once.
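To make the distinction concrete, here is a minimal Python model of the two conditions quoted in the diff. The function name and the returned labels are hypothetical and chosen only for illustration. The actual code is C++ inside `CodeGenX86_64::VisitExpr_`, and the order of the checks here is an assumption.

```python
def pick_fp16_to_fp32_lowering(lanes, features):
    """Model of the dispatch for a float16 -> float32 cast.

    lanes    -- vector width of the expression being compiled
    features -- set of feature names enabled on the target machine
    """
    has_f16c = "f16c" in features
    has_avx512f = "avx512f" in features

    # 16 lanes need the wide form, which only AVX-512F provides.
    if lanes == 16 and has_avx512f:
        return "16-lane intrinsic"
    # 8 lanes can use the intrinsic when either feature is present.
    if lanes == 8 and (has_f16c or has_avx512f):
        return "8-lane intrinsic"
    # Anything else falls back to what LLVM generates by default.
    return "default lowering"
```

For example, an 8-lane cast compiled for a target with only `avx512f` still takes the 8-lane path, while a 16-lane cast on a target with only `f16c` falls back to the default.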

}

// TODO(tulloch): implement version generic over lanes.
if (from.lanes() == 16 && has_avx512f) {
Contributor

Does has_avx512f==true imply has_f16c==true? And, is there a case that lanes==16 but has_avx512f==false? I am not familiar with the F16C instruction set.

import ctypes

def test_fp16_to_fp32_with_f16c():
    target = 'llvm -mcpu=core-avx2 -mattr=+f16c'
Contributor

So the mattr flag needs to be set by the users?

Contributor Author

I've fixed this to query the TargetMachine directly, so no changes are needed (assuming users already pass the existing -mcpu=core-avx2, -mcpu=skylake-avx512, etc.).

target = 'llvm'
elements = 64
n = tvm.convert(elements)
A = tvm.placeholder((n, 8), dtype="float16", name='A')
Contributor

How is fp16 handled without F16C?
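As background for this question, the sketch below shows what a half-to-float conversion has to compute when no conversion instruction is available. It is a bit-level illustration of the IEEE 754 binary16 format only. It is not a description of the code LLVM emits, and `half_bits_to_float` is a hypothetical name.

```python
import math
import struct


def half_bits_to_float(h):
    """Convert a 16-bit IEEE 754 half-precision bit pattern to a Python float."""
    sign = -1.0 if (h >> 15) & 1 else 1.0
    exponent = (h >> 10) & 0x1F   # 5 exponent bits
    fraction = h & 0x3FF          # 10 fraction bits

    if exponent == 0:
        # Zero or subnormal: no implicit leading 1, value = fraction * 2**-24.
        return sign * math.ldexp(fraction, -24)
    if exponent == 0x1F:
        # All-ones exponent encodes infinity (fraction == 0) or NaN.
        return sign * math.inf if fraction == 0 else math.nan
    # Normal number: implicit leading 1 and an exponent bias of 15, so
    # value = (1024 + fraction) / 1024 * 2**(exponent - 15).
    return sign * math.ldexp(1024 + fraction, exponent - 25)


# Cross-check one value against the standard library's half-precision codec.
bits = 0x3555  # roughly 1/3 in half precision
reference = struct.unpack("<e", struct.pack("<H", bits))[0]
assert half_bits_to_float(bits) == reference
```

Doing this per element with integer shifts and branches is far more work than a single vector conversion instruction.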

Contributor Author

@ajtulloch ajtulloch force-pushed the vcvtph2ps branch 8 times, most recently from 50798e4 to c5f5719 Compare March 31, 2019 01:37
@ajtulloch
Contributor Author

@hlu1

> @ajtulloch, cpu-info can detect AVX2 or AVX512: https://github.com/pytorch/cpuinfo/blob/40c5f3695b053e5c3d642d9bc34113f3baa71ef2/include/cpuinfo.h#L1009

The problem is that this only works when we are generating code for the current host architecture, which isn't what we want. I think the right approach is to keep passing -mcpu (or -mattr) to LLVM when we want architecture-specific code, and then query the constructed llvm::TargetMachine instance to see whether it has the required feature. That is how LLVM already decides which instructions to select, so we're guaranteed to be consistent. Rust and others implement this kind of thing the same way (rust-lang/rust#31709). If we want to disable this optimization, we can pass -mattr=-f16c,-avx512f, etc.
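As a rough illustration of the `-mattr` convention mentioned above, the toy parser below collects the features a target string explicitly turns on or off. It is only a sketch: `explicit_target_features` is a hypothetical helper, it does not model the features implied by `-mcpu`, and in the real flow LLVM resolves all of this when it constructs the TargetMachine.

```python
def explicit_target_features(target):
    """Collect features toggled by -mattr in an LLVM-style target string.

    Returns (enabled, disabled) sets. Features implied by -mcpu are NOT modeled.
    """
    enabled, disabled = set(), set()
    prefix = "-mattr="
    for token in target.split():
        if not token.startswith(prefix):
            continue
        for item in token[len(prefix):].split(","):
            name = item[1:]
            if item.startswith("+"):
                # "+feature" switches the feature on.
                enabled.add(name)
                disabled.discard(name)
            elif item.startswith("-"):
                # "-feature" switches the feature off.
                disabled.add(name)
                enabled.discard(name)
    return enabled, disabled
```

With the target strings from this thread, `"llvm -mcpu=core-avx2 -mattr=+avx2,+f16c"` yields `avx2` and `f16c` as enabled, while a string containing `-mattr=-f16c,-avx512f` lists both as disabled.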

@ajtulloch ajtulloch force-pushed the vcvtph2ps branch 5 times, most recently from ce5bd60 to e88c1c8 Compare March 31, 2019 02:52
@ajtulloch ajtulloch force-pushed the vcvtph2ps branch 3 times, most recently from 215ae63 to a842f1b Compare March 31, 2019 14:08
@tqchen tqchen merged commit eb1ed11 into apache:master Mar 31, 2019
@tqchen
Member

tqchen commented Mar 31, 2019

Thanks @ajtulloch, @yidawang, @hlu1, this is now merged.

wweic pushed a commit to wweic/tvm that referenced this pull request Apr 7, 2019
wweic pushed a commit to wweic/tvm that referenced this pull request Apr 7, 2019
wweic pushed a commit to wweic/tvm that referenced this pull request Apr 8, 2019
MarisaKirisame pushed a commit to MarisaKirisame/tvm that referenced this pull request Apr 9, 2019
MarisaKirisame added a commit to MarisaKirisame/tvm that referenced this pull request Apr 9, 2019
lint

lint

save

save

add more case

save

error

lint

lint

commit

do

lint

save

fix lint

wrap it back as func

lint

save

remove dead comment

fix style

fix lint

Update src/relay/pass/partial_eval.cc

Co-Authored-By: MarisaKirisame <lolisa@marisa.moe>

Update src/relay/pass/partial_eval.cc

Co-Authored-By: MarisaKirisame <lolisa@marisa.moe>

Update src/relay/pass/partial_eval.cc

Co-Authored-By: MarisaKirisame <lolisa@marisa.moe>

Update src/relay/pass/partial_eval.cc

Co-Authored-By: MarisaKirisame <lolisa@marisa.moe>

Update src/relay/pass/partial_eval.cc

Co-Authored-By: MarisaKirisame <lolisa@marisa.moe>

Update src/relay/pass/partial_eval.cc

Co-Authored-By: MarisaKirisame <lolisa@marisa.moe>

address review feedback

pe now handle freevar. as a result preserving function is now trivial.

test

add basic test, implement pretty printing for generic function

test

lint

fix segfault

save

save

do

test

fix another error

address comment

commit

save

address review feedback

add test for invalidate, fix error in lookup

rename cont to boduy

fix error and add regression test

Update src/relay/pass/partial_eval.cc

Co-Authored-By: MarisaKirisame <lolisa@marisa.moe>

fix error, add test case

fix lint

remove extra line

fix some error

pe

commit

save

save

save

save

save (pe/dce broken)

[DOCKER] Pin flatbuffers checkout to the last release tag (apache#2823). (apache#2879)

[Relay][Text Format] Reverse CallNode Print Order (apache#2882)

[NNPACK] Modernize test (apache#2868)

[Relay] Add list update to prelude (apache#2866)

Add missing sgx includes (apache#2878)

Fix setting up hints for getaddrinfo (apache#2872)

[ARITH] RewriteSimplifier: improved cmp simplification (apache#2851)

do (apache#2883)

[RELAY][Frontend][TF] decompile tf control flow (apache#2830)

* decompile tf control flow

* Add docs

* remove import relay

* move tests under tensorflow frontend

* minor fix

Enhance upsample operator to adapt onnx opset version 9 (apache#2840)

Use version invariant rustfmt (apache#2886)

[Relay][Op] Add group conv2d dispatch to topi function (apache#2870)

* [Relay][Op] Add group conv2d dispatch to topi function

* Rerun tests

[Apps] [howto_deploy] fix cxx-flags order and build directory (apache#2888)

fix prelu, now can use on 2d input and add one test (apache#2875)

Add dense schedules to __init__ for cpu (apache#2855)

* Add dense schedules to __init__ for cpu

* Add documentation for topi::shape

* Add additional imports to topi CPU __init__.

[TESTS] Improve script robustness (apache#2893)

A number of test scripts use the '|| exit 1' idiom.  This has two
issues, first process exit codes are defined to be in the range 0-255.
Second, more importantly, the idiom is fragile because it requires
that every possible failure point be explicitly coded.  This patch
removes the idiom in favour of "set -e" as used in the docker scripts
as a more robust mechanism to ensure that script failures are always
caught and propagated by default.

[Relay] Fix name of bias in testing.mlp (apache#2892)

winograd_nnpack (apache#2721)

[Relay] Fix Relay ARM CPU depthwise spatial pack schedule alter op layout issue. (apache#2861)

* Fix Relay ARM CPU spatial pack depthwise alter op layout issue.

* Update tune_relay_arm.py

[TESTS] Import script robustness (set -u) (apache#2896)

Adopt the "set -u" idiom from the docker scripts as a mechanism to
improve future robustness.

[DOCKER] Upgrade ci-cpu to latest v0.50 (apache#2901)

Allow linking against MKLML (apache#2902)

[COMMUNITY] ASF mentors (apache#2906)

[Relay] Allow converting keras.layers.Sequential (apache#2842)

* Allow converting keras.layers.Sequential

* Use existing new_var function

* Only update expr when missing

* Add test

[Relay] clean up hd, change tl (apache#2917)

Turn on USE_SORT by default (apache#2916)

[TEST] Cache test data (apache#2921)

Unified error handling in NNVM and Relay frontends (apache#2828)

add support for mxnet smooth_l1 (apache#2905)

[Relay] Add support for TupleGetItem in op fusion (apache#2914)

[Relay, TOPI]  Deformable conv2d (apache#2908)

* [Relay, TOPI] Add deformable conv2d

* Moved to op level2

* Fix lint

* Moved to level2 & bug fix

* Update comments

* Disabled flaky test of conv2d

TVM debugresult dump to Chrome Tracing (apache#2922)

[Relay] add test for second order ad (apache#2754)

* do second order

* add comment

* better name

* use tvm assert all close

* refire ci

Revert "[Relay] add test for second order ad (apache#2754)" (apache#2926)

This reverts commit f5ca991.

[Tutorial] Cache the test data in tutorial (apache#2923)

[AUTOTVM] Refactor measure build func (apache#2927)

Fix intersect of modular set (apache#2904)

Fix comment bugs and code style

[Relay, OpFusion] Fix handling TupleGetItem for nested tuples (apache#2929)

Consistent result of DetectLinearEquation() when an empy vars is passed (apache#2860)

[FRONTEND][ONNX] Some bug fixes and Shape operator fixed for relay. (apache#2850)

* [FRONTEND][ONNX] Some bug fixes and Shape operator fixed for relay.

* 	* test cases

* 	* ci error

Outdated renaming for flatten in ONNX converter (apache#2843)

[FRONTEND][TENSORFLOW] bug fix for tensorflow official slim models. (apache#2864)

* [FRONTEND][TENSORFLOW] bug fix for tensorflow official slim models.

* 	* review comments

Fix vcvtph2ps codegen (apache#2925)

Port changes

More fixes

save

save

Changes to schedules and mxnet importer
MarisaKirisame added a commit to MarisaKirisame/tvm that referenced this pull request Apr 9, 2019
MarisaKirisame added a commit to MarisaKirisame/tvm that referenced this pull request Apr 9, 2019
wweic pushed a commit to wweic/tvm that referenced this pull request Apr 10, 2019
wweic pushed a commit to neo-ai/tvm that referenced this pull request Apr 11, 2019
MarisaKirisame added a commit to MarisaKirisame/tvm that referenced this pull request Apr 15, 2019
MarisaKirisame added a commit to MarisaKirisame/tvm that referenced this pull request Apr 16, 2019
lint

lint

save

save

add more case

save

error

lint

lint

commit

do

lint

save

fix lint

wrap it back as func

lint

save

remove dead comment

fix style

fix lint

Update src/relay/pass/partial_eval.cc

Co-Authored-By: MarisaKirisame <lolisa@marisa.moe>

Update src/relay/pass/partial_eval.cc

Co-Authored-By: MarisaKirisame <lolisa@marisa.moe>

Update src/relay/pass/partial_eval.cc

Co-Authored-By: MarisaKirisame <lolisa@marisa.moe>

Update src/relay/pass/partial_eval.cc

Co-Authored-By: MarisaKirisame <lolisa@marisa.moe>

Update src/relay/pass/partial_eval.cc

Co-Authored-By: MarisaKirisame <lolisa@marisa.moe>

Update src/relay/pass/partial_eval.cc

Co-Authored-By: MarisaKirisame <lolisa@marisa.moe>

address review feedback

pe now handle freevar. as a result preserving function is now trivial.

test

add basic test, implement pretty printing for generic function

test

lint

fix segfault

save

save

do

test

fix another error

address comment

commit

save

address review feedback

add test for invalidate, fix error in lookup

rename cont to boduy

fix error and add regression test

Update src/relay/pass/partial_eval.cc

Co-Authored-By: MarisaKirisame <lolisa@marisa.moe>

fix error, add test case

fix lint

remove extra line

fix some error

pe

commit

save

save

save

save

save (pe/dce broken)

[DOCKER] Pin flatbuffers checkout to the last release tag (apache#2823). (apache#2879)

[Relay][Text Format] Reverse CallNode Print Order (apache#2882)

[NNPACK] Modernize test (apache#2868)

[Relay] Add list update to prelude (apache#2866)

Add missing sgx includes (apache#2878)

Fix setting up hints for getaddrinfo (apache#2872)

[ARITH] RewriteSimplifier: improved cmp simplification (apache#2851)

do (apache#2883)

[RELAY][Frontend][TF] decompile tf control flow (apache#2830)

* decompile tf control flow

* Add docs

* remove import relay

* move tests under tensorflow frontend

* minor fix

Enhance upsample operator to adapt onnx opset version 9 (apache#2840)

Use version invariant rustfmt (apache#2886)

[Relay][Op] Add group conv2d dispatch to topi function (apache#2870)

* [Relay][Op] Add group conv2d dispatch to topi function

* Rerun tests

[Apps] [howto_deploy] fix cxx-flags order and build directory (apache#2888)

fix prelu, now can use on 2d input and add one test (apache#2875)

Add dense schedules to __init__ for cpu (apache#2855)

* Add dense schedules to __init__ for cpu

* Add documentation for topi::shape

* Add additional imports to topi CPU __init__.

[TESTS] Improve script robustness (apache#2893)

A number of test scripts use the '|| exit 1' idiom.  This has two
issues. First, process exit codes are defined to be in the range 0-255.
Second, more importantly, the idiom is fragile because it requires
that every possible failure point be explicitly coded.  This patch
removes the idiom in favour of "set -e" as used in the docker scripts
as a more robust mechanism to ensure that script failures are always
caught and propagated by default.

[Relay] Fix name of bias in testing.mlp (apache#2892)

winograd_nnpack (apache#2721)

[Relay] Fix Relay ARM CPU depthwise spatial pack schedule alter op layout issue. (apache#2861)

* Fix Relay ARM CPU spatial pack depthwise alter op layout issue.

* Update tune_relay_arm.py

[TESTS] Import script robustness (set -u) (apache#2896)

Adopt the "set -u" idiom from the docker scripts as a mechanism to
improve future robustness.

[DOCKER] Upgrade ci-cpu to latest v0.50 (apache#2901)

Allow linking against MKLML (apache#2902)

[COMMUNITY] ASF mentors (apache#2906)

[Relay] Allow converting keras.layers.Sequential (apache#2842)

* Allow converting keras.layers.Sequential

* Use existing new_var function

* Only update expr when missing

* Add test

[Relay] clean up hd, change tl (apache#2917)

Turn on USE_SORT by default (apache#2916)

[TEST] Cache test data (apache#2921)

Unified error handling in NNVM and Relay frontends (apache#2828)

add support for mxnet smooth_l1 (apache#2905)

[Relay] Add support for TupleGetItem in op fusion (apache#2914)

[Relay, TOPI]  Deformable conv2d (apache#2908)

* [Relay, TOPI] Add deformable conv2d

* Moved to op level2

* Fix lint

* Moved to level2 & bug fix

* Update comments

* Disabled flaky test of conv2d

TVM debugresult dump to Chrome Tracing (apache#2922)

[Relay] add test for second order ad (apache#2754)

* do second order

* add comment

* better name

* use tvm assert all close

* refire ci

Revert "[Relay] add test for second order ad (apache#2754)" (apache#2926)

This reverts commit f5ca991.

[Tutorial] Cache the test data in tutorial (apache#2923)

[AUTOTVM] Refactor measure build func (apache#2927)

Fix intersect of modular set (apache#2904)

Fix comment bugs and code style

[Relay, OpFusion] Fix handling TupleGetItem for nested tuples (apache#2929)

Consistent result of DetectLinearEquation() when an empty vars is passed (apache#2860)

[FRONTEND][ONNX] Some bug fixes and Shape operator fixed for relay. (apache#2850)

* [FRONTEND][ONNX] Some bug fixes and Shape operator fixed for relay.

* test cases

* ci error

Outdated renaming for flatten in ONNX converter (apache#2843)

[FRONTEND][TENSORFLOW] bug fix for tensorflow official slim models. (apache#2864)

* [FRONTEND][TENSORFLOW] bug fix for tensorflow official slim models.

* review comments

Fix vcvtph2ps codegen (apache#2925)

Port changes

More fixes

save

save

Changes to schedules and mxnet importer

save

save

save

save

save

remove

remove

save
MarisaKirisame added a commit to MarisaKirisame/tvm that referenced this pull request Apr 16, 2019
MarisaKirisame added a commit to MarisaKirisame/tvm that referenced this pull request Apr 16, 2019