[Frontend][Tensorflow] Sparse_Dense Op CSR scheduling issue resolved for Cuda & X86 #7148

Merged (4 commits) on Jan 15, 2021

@ANSHUMAN87 (Contributor) commented:

This is a follow-up PR.

  1. It resolves the CSR scheduling issue for both CUDA and x86.
  2. It also enables the corresponding test cases in the TensorFlow frontend.

cc @tkonolige , @FrozenGene !

@tkonolige (Contributor) left a comment:

What was the issue holding up CSR scheduling?

We already talked about performance for this, right?

Review threads (outdated, resolved):
python/tvm/topi/cuda/sparse.py
python/tvm/topi/x86/sparse.py

@ANSHUMAN87 (Contributor, Author) commented:

On CUDA, the Sparse_dense op is internally rewritten to Sparse_dense_padded. That path works only for sizes that are a multiple of warp_size; for smaller sizes there was no fallback schedule for CSR, so I have added one here. Please let me know if anything is unclear. Thanks!
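
For readers following along, the dispatch described above can be sketched roughly as below. This is only an illustration, not the code in this PR: `WARP_SIZE`, `choose_sparse_dense_impl`, and the returned strings are hypothetical names, and which dimension is actually checked is an assumption, since the comment does not say.

```python
# Hypothetical illustration only: not TVM's actual strategy code.
WARP_SIZE = 32  # assumed warp size (32 threads on NVIDIA GPUs)


def choose_sparse_dense_impl(size: int, warp_size: int = WARP_SIZE) -> str:
    """Pick the padded kernel only when `size` is a positive multiple of the warp size.

    Any other size goes to a generic fallback schedule instead of failing.
    """
    if size > 0 and size % warp_size == 0:
        return "sparse_dense_padded"
    return "sparse_dense_fallback"


if __name__ == "__main__":
    # Print the chosen implementation name for a few sample sizes.
    for size in (16, 32, 48, 128):
        print(size, choose_sparse_dense_impl(size))
```

The point of the sketch is only that every size maps to some schedule; before this change the non-multiple case had nothing to fall back to.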

@ANSHUMAN87 (Contributor, Author) commented:

Gentle ping @tkonolige !
cc @junrushao1994 , @comaniac too :)

@tkonolige (Contributor) left a comment:

Looks good!

@tkonolige (Contributor) left a comment:

So I just hit the bug that this fixes. Can we add a test to make sure we don't hit it again in the future? Here is the test I wrote:

@tvm.testing.requires_cuda
def test_sparse_dense_padded_alter_op():
    with tvm.target.Target("cuda"):
        M = 128
        N = 16
        K = 128
        X_np = np.random.randn(M, K).astype("float32")
        W_sp_np = random_bsr_matrix(N, K, 2, 2, density=0.01, dtype="float32")
        x = relay.var("x", relay.TensorType(X_np.shape, "float32"))
        mult = relay.op.nn.sparse_dense(
            x,
            (
                relay.Constant(tvm.nd.array(W_sp_np.data)),
                relay.Constant(tvm.nd.array(W_sp_np.indices)),
                relay.Constant(tvm.nd.array(W_sp_np.indptr)),
            ),
        )
        f = relay.Function([x], mult)
        f_ = relay.transform.InferType()(tvm.IRModule.from_expr(f))
        f_ = relay.transform.AlterOpLayout()(f_)
        assert f_["main"].body.op.name == "nn.internal.sparse_dense_padded"

        # build with cuda and AlterOpLayout to ensure that sparse_dense_padded has an implementation
        with tvm.transform.PassContext(opt_level=3, required_pass="AlterOpLayout"):
            x = relay.build(tvm.IRModule.from_expr(f), target=tvm.target.Target("cuda"))

in tests/python/topi/python/test_topi_sparse.py
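
As a reminder of what the three arrays passed to `sparse_dense` (`data`, `indices`, `indptr`) encode, here is a minimal pure-Python sketch for the plain CSR case. It assumes the op computes the dense input times the transposed sparse weight; the helpers `dense_to_csr` and `sparse_dense_csr` are hypothetical names for illustration, not TVM functions, and the test above uses a 2x2-blocked (BSR) weight rather than this scalar layout.

```python
# Hypothetical helpers for illustration; not part of TVM or this PR.

def dense_to_csr(w):
    """Convert a dense row-major matrix (list of lists) to CSR arrays."""
    data, indices, indptr = [], [], [0]
    for row in w:
        for col, value in enumerate(row):
            if value != 0:
                data.append(value)    # nonzero values, row by row
                indices.append(col)   # column index of each nonzero
        indptr.append(len(data))      # where each row ends in `data`
    return data, indices, indptr


def sparse_dense_csr(x, data, indices, indptr):
    """Compute x @ W^T where W (N x K) is given in CSR form and x is M x K."""
    n = len(indptr) - 1
    out = [[0.0] * n for _ in x]
    for i, x_row in enumerate(x):
        for j in range(n):
            acc = 0.0
            # Only the stored nonzeros of row j of W contribute.
            for p in range(indptr[j], indptr[j + 1]):
                acc += x_row[indices[p]] * data[p]
            out[i][j] = acc
    return out


if __name__ == "__main__":
    w = [[0, 2, 0], [1, 0, 3]]   # N=2, K=3
    x = [[1, 2, 3], [4, 5, 6]]   # M=2, K=3
    data, indices, indptr = dense_to_csr(w)
    print(data, indices, indptr)
    print(sparse_dense_csr(x, data, indices, indptr))
```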

@comaniac added the "status: need test case" label (need test cases to cover the change) on Jan 13, 2021
@ANSHUMAN87 (Contributor, Author) commented:

Thanks @tkonolige ! The test case is added now.

@tkonolige (Contributor) left a comment:

Looks good! @comaniac @junrushao1994 I think this is ready to merge. (Assuming it passes CI).

@comaniac merged commit c947463 into apache:main on Jan 15, 2021

@comaniac (Contributor) commented:

Thanks @ANSHUMAN87 @tkonolige

@comaniac added the "status: accepted" label and removed the "status: need test case" label on Jan 15, 2021
masahi pushed a commit to masahi/tvm that referenced this pull request Jan 18, 2021
…for Cuda & X86 (apache#7148)

* [Frontend][Tensorflow] Sparse_Dense Op CSR scheduling issue resolved for both cuda & x86

* [1] Review comments handled

* [2] Review comments handled

* [3] Review comments handled
TusharKanekiDey pushed a commit to TusharKanekiDey/tvm that referenced this pull request Jan 20, 2021
trevor-m pushed a commit to neo-ai/tvm that referenced this pull request Jan 21, 2021
electriclilies pushed a commit to electriclilies/tvm that referenced this pull request Feb 18, 2021