[Frontend][Tensorflow] Sparse_Dense Op CSR scheduling issue resolved for Cuda & X86 #7148
Conversation
…for both cuda & x86
What was the issue holding up CSR scheduling?
We already talked about performance for this, right?
CUDA scheduling for the sparse_dense op is internally changed to sparse_dense_padded. But that only works when the size is a multiple of warp_size; if it is lower than that, there is no fallback scheduling for CSR, so I have resolved that part here. Roughly, the intended dispatch is sketched below. Please let me know in case I am not clear. Thanks!
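To illustrate, a minimal sketch of that dispatch decision (the function and constant names here are hypothetical, not the actual TVM scheduling entry points; only the op names come from this PR's test):
WARP_SIZE = 32  # typical CUDA warp size

def choose_sparse_dense_schedule(num_rows):
    # Sketch only: the padded variant applies when the size is a
    # multiple of the warp size; otherwise fall back to the plain
    # CSR sparse_dense schedule, which is the fallback this PR adds.
    if num_rows % WARP_SIZE == 0:
        return "nn.internal.sparse_dense_padded"
    return "nn.sparse_dense"  # CSR fallback path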
Gentle ping @tkonolige!
Looks good!
So I just hit the bug that this fixes. Can we add a test to make sure we don't hit it again in the future? Here is the test I wrote:
import numpy as np

import tvm
import tvm.testing
from tvm import relay

# `random_bsr_matrix` is the sparse-matrix helper already defined in the
# same test file.

@tvm.testing.requires_cuda
def test_sparse_dense_padded_alter_op():
    with tvm.target.Target("cuda"):
        M = 128
        N = 16
        K = 128
        X_np = np.random.randn(M, K).astype("float32")
        W_sp_np = random_bsr_matrix(N, K, 2, 2, density=0.01, dtype="float32")
        x = relay.var("x", relay.TensorType(X_np.shape, "float32"))
        mult = relay.op.nn.sparse_dense(
            x,
            (
                relay.Constant(tvm.nd.array(W_sp_np.data)),
                relay.Constant(tvm.nd.array(W_sp_np.indices)),
                relay.Constant(tvm.nd.array(W_sp_np.indptr)),
            ),
        )
        f = relay.Function([x], mult)
        f_ = relay.transform.InferType()(tvm.IRModule.from_expr(f))
        f_ = relay.transform.AlterOpLayout()(f_)
        assert f_["main"].body.op.name == "nn.internal.sparse_dense_padded"

        # Build with cuda and AlterOpLayout to ensure that
        # sparse_dense_padded has an implementation.
        with tvm.transform.PassContext(opt_level=3, required_pass="AlterOpLayout"):
            x = relay.build(tvm.IRModule.from_expr(f), target=tvm.target.Target("cuda"))
in tests/python/topi/python/test_topi_sparse.py
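For a local run, TVM's topi test files commonly invoke tests under a main guard; an illustrative snippet:
if __name__ == "__main__":
    test_sparse_dense_padded_alter_op()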
Thanks @tkonolige! The test case is added now.
Looks good! @comaniac @junrushao1994 I think this is ready to merge. (Assuming it passes CI).
Thanks @ANSHUMAN87 @tkonolige
…for Cuda & X86 (apache#7148) * [Frontend][Tensorflow] Sparse_Dense Op CSR scheduling issue resolved for both cuda & x86 * [1] Review comments handled * [2] Review comments handled * [3] Review comments handled
This is a follow-up PR.
cc @tkonolige, @FrozenGene!