[TOPI][x86] Injective schedule improvement #4786

anijain2305 · 2020-01-29T02:06:43Z

While working on quantized MobileNet V2, I saw that the pad operator was taking around 25% of total time on a Cascade Lake machine. This PR optimizes the injective schedule by applying vectorization.

For the following test:

Before PR: 80 us
After PR: 5 us

import numpy as np
import tvm
from tvm import relay
from tvm.relay.op import register_pattern, OpPattern
from tvm.contrib import graph_runtime
from tvm.contrib.debugger import debug_runtime

dtype='uint8'
dshape=(1, 6, 114, 114, 16)

x1 = relay.var("x1", shape=dshape, dtype=dtype)
x2 = relay.nn.pad(x1, pad_width=((0, 0), (0, 0), (1, 1), (1, 1), (0, 0)))

func = relay.Function([x1], x2)
mod = relay.Module.from_expr(func)

with relay.build_config(opt_level=3):
    graph, lib, params = relay.build(mod, target="llvm -mcpu=cascadelake")

ctx = tvm.cpu()
# The debug runtime reports per-operator timings; use graph_runtime for a plain run.
# module = graph_runtime.create(graph, lib, ctx)
module = debug_runtime.create(graph, lib, ctx)
module.run()
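As a quick sanity check on the test above, the output shape of the pad can be worked out by hand: each dimension grows by its "before" and "after" padding. This is a minimal sketch in plain Python; the helper name `padded_shape` is hypothetical and not a TVM function.

```python
def padded_shape(shape, pad_width):
    """Return the shape after padding each dimension by (before, after)."""
    return tuple(dim + before + after
                 for dim, (before, after) in zip(shape, pad_width))


# Values copied from the test script above.
dshape = (1, 6, 114, 114, 16)
pad_width = ((0, 0), (0, 0), (1, 1), (1, 1), (0, 0))

out_shape = padded_shape(dshape, pad_width)
print(out_shape)  # (1, 6, 116, 116, 16)
```

Only the two spatial dimensions grow; the innermost dimension stays at 16 uint8 elements.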

@yzhliu @vinx13 @shoubhik @yidawang please review

anijain2305 · 2020-01-29T04:49:13Z

Please do not merge yet; I am running a few more performance tests.

anijain2305 · 2020-01-29T07:00:50Z

Update: an interesting observation. Even though the single pad operator sees a large speedup with this PR, the operators that follow pad see a consistent slowdown in the original graph. I think the reason is that h and w are spread across cores, causing data transfer issues for the following operator.

I will try a few more options. If nothing works, I will close the PR.

anijain2305 · 2020-02-04T20:18:19Z

@yzhliu @tqchen

One test fails with this change:

E           tvm._ffi.base.TVMError: Traceback (most recent call last):
E             [bt] (8) /home/ubuntu/workplace/tvm/t1/tvm/build/libtvm.so(tvm::NodeFunctor<tvm::tir::Stmt (tvm::runtime::ObjectRef const&, tvm::tir::StmtFunctor<tvm::tir::Stmt (tvm::tir::Stmt const&)>*)>::operator()(tvm::runtime::ObjectRef const&, tvm::tir::StmtFunctor<tvm::tir::Stmt (tvm::tir::Stmt const&)>*) const+0xf3) [0x7f9849db3153]
E             [bt] (7) /home/ubuntu/workplace/tvm/t1/tvm/build/libtvm.so(tvm::tir::StmtFunctor<tvm::tir::Stmt (tvm::tir::Stmt const&)>::InitVTable()::{lambda(tvm::runtime::ObjectRef const&, tvm::tir::StmtFunctor<tvm::tir::Stmt (tvm::tir::Stmt const&)>*)#9}::__invoke(tvm::runtime::ObjectRef const&, tvm::tir::StmtFunctor<tvm::tir::Stmt (tvm::tir::Stmt const&)>*)+0x13) [0x7f9849db5283]
E             [bt] (6) /home/ubuntu/workplace/tvm/t1/tvm/build/libtvm.so(tvm::tir::StmtMutator::VisitStmt_(tvm::tir::ProducerConsumerNode const*)+0x25) [0x7f984a041eb5]
E             [bt] (5) /home/ubuntu/workplace/tvm/t1/tvm/build/libtvm.so(tvm::tir::StmtMutator::VisitStmt(tvm::tir::Stmt const&)+0x2b) [0x7f9849db2c6b]
E             [bt] (4) /home/ubuntu/workplace/tvm/t1/tvm/build/libtvm.so(tvm::tir::StmtFunctor<tvm::tir::Stmt (tvm::tir::Stmt const&)>::VisitStmt(tvm::tir::Stmt const&)+0x30) [0x7f9849db2eb0]
E             [bt] (3) /home/ubuntu/workplace/tvm/t1/tvm/build/libtvm.so(tvm::NodeFunctor<tvm::tir::Stmt (tvm::runtime::ObjectRef const&, tvm::tir::StmtFunctor<tvm::tir::Stmt (tvm::tir::Stmt const&)>*)>::operator()(tvm::runtime::ObjectRef const&, tvm::tir::StmtFunctor<tvm::tir::Stmt (tvm::tir::Stmt const&)>*) const+0xf3) [0x7f9849db3153]
E             [bt] (2) /home/ubuntu/workplace/tvm/t1/tvm/build/libtvm.so(tvm::tir::StmtFunctor<tvm::tir::Stmt (tvm::tir::Stmt const&)>::InitVTable()::{lambda(tvm::runtime::ObjectRef const&, tvm::tir::StmtFunctor<tvm::tir::Stmt (tvm::tir::Stmt const&)>*)#4}::__invoke(tvm::runtime::ObjectRef const&, tvm::tir::StmtFunctor<tvm::tir::Stmt (tvm::tir::Stmt const&)>*)+0x13) [0x7f9849db4db3]
E             [bt] (1) /home/ubuntu/workplace/tvm/t1/tvm/build/libtvm.so(tvm::tir::LoopVectorizer::VisitStmt_(tvm::tir::ForNode const*)+0x1c3) [0x7f9849fd1553]
E             [bt] (0) /home/ubuntu/workplace/tvm/t1/tvm/build/libtvm.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x32) [0x7f9849d4dd52]
E             File "/home/ubuntu/workplace/tvm/t1/tvm/src/tir/pass/vectorize_loop.cc", line 528
E           TVMError: Failed to vectorize loop with extent {n0|n0>=0}

Is this expected?

anijain2305 · 2020-02-04T20:56:35Z

Yizhi helped. Vectorize works only with constant extents, so I added a split to make it work.
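To illustrate the idea behind that fix: splitting a loop of unknown extent `n` by a constant factor yields an inner loop whose extent is known ahead of time, which is what a vectorizer needs. The sketch below models the transformation in plain Python rather than with the TVM schedule API. The factor of 16 and the function names are assumptions for illustration, not taken from the PR.

```python
def copy_plain(src):
    """Reference loop: a single loop whose extent n is only known at run time."""
    n = len(src)
    dst = [0] * n
    for i in range(n):
        dst[i] = src[i]
    return dst


def copy_split(src, factor=16):
    """Same copy after splitting the loop by a constant factor.

    The inner loop always has extent `factor` (a constant), so a compiler
    could vectorize it. The guard handles the tail when n is not a
    multiple of `factor`.
    """
    n = len(src)
    dst = [0] * n
    outer_extent = (n + factor - 1) // factor  # ceil(n / factor)
    for outer in range(outer_extent):
        for inner in range(factor):  # constant extent
            i = outer * factor + inner
            if i < n:  # bounds guard for the tail iterations
                dst[i] = src[i]
    return dst


data = list(range(37))  # 37 is deliberately not a multiple of 16
assert copy_split(data) == copy_plain(data)
```

In TVM schedule terms this would roughly correspond to calling `split` on the axis with a constant factor and then `vectorize` on the resulting inner axis.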

yzhliu · 2020-02-04T23:25:41Z

looks good to me. Thanks @anijain2305

* [TOPI][x86] Injective Schedule Improvement.
* Add tiling.
* Vectorize when there is an axis.

yzhliu approved these changes Jan 29, 2020

anijain2305 changed the title ~~[TOPI][x86] Pad schedule improvment.~~ [WIP] [TOPI][x86] Pad schedule improvement Jan 29, 2020

[TOPI][x86] Injective Schedule Improvement.

ab347eb

anijain2305 force-pushed the p3.6 branch from 5dd2f54 to ab347eb Compare February 4, 2020 18:48

anijain2305 changed the title ~~[WIP] [TOPI][x86] Pad schedule improvement~~ [WIP] [TOPI][x86] Injective schedule improvement Feb 4, 2020

Add tiling.

bc74ceb

Vectorize when there is an axis.

861ac14

anijain2305 changed the title ~~[WIP] [TOPI][x86] Injective schedule improvement~~ [TOPI][x86] Injective schedule improvement Feb 4, 2020

yzhliu merged commit 4a39e52 into apache:master Feb 4, 2020

yzhliu added the status: accepted label Feb 4, 2020

alexwong pushed a commit to alexwong/tvm that referenced this pull request Feb 26, 2020

[TOPI][x86] Injective schedule improvement (apache#4786)

289aa72

alexwong pushed a commit to alexwong/tvm that referenced this pull request Feb 28, 2020

[TOPI][x86] Injective schedule improvement (apache#4786)

67ba9c7

zhiics pushed a commit to neo-ai/tvm that referenced this pull request Mar 2, 2020

[TOPI][x86] Injective schedule improvement (apache#4786)

b47821a

ZihengJiang mentioned this pull request Sep 17, 2020

TVM v0.7 Release Note Candidate #6486

Closed
