
[Frontend, pytorch] Vc/pytorch lstm #8447

Merged: 11 commits merged into apache:main from vc/pytorch_lstm on Jul 20, 2021
Conversation

@vvchernov (Contributor) commented Jul 12, 2021

  1. LSTM layer support was added to the PyTorch frontend (see [Torch] Support aten::lstm #6474).
  2. Tests for different types of LSTM (unidirectional, bidirectional, stacked, with projection, and their combinations) were implemented. The tests cover both the PyTorch model and an ONNX model converted from PyTorch.
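
For reference, a minimal sketch (not the PR's actual test code; sizes are illustrative) of how these LSTM variants can be instantiated through the standard torch.nn.LSTM arguments:

```python
import torch

# Illustrative sizes; the PR's tests use their own configurations.
input_size, hidden_size, seq_len, batch = 8, 16, 5, 2

variants = {
    "unidirectional": torch.nn.LSTM(input_size, hidden_size),
    "bidirectional": torch.nn.LSTM(input_size, hidden_size, bidirectional=True),
    "stacked": torch.nn.LSTM(input_size, hidden_size, num_layers=3),
    "projection": torch.nn.LSTM(input_size, hidden_size, proj_size=4),
    "no_bias": torch.nn.LSTM(input_size, hidden_size, bias=False),
}

inp = torch.randn(seq_len, batch, input_size)
for name, lstm in variants.items():
    out, (h, c) = lstm(inp)  # h/c are the final hidden and cell states
    print(name, tuple(out.shape))
```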

@jwfromm jwfromm requested a review from masahi July 12, 2021 16:10
@jwfromm (Contributor) commented Jul 12, 2021

@masahi can you help review and shepherd this PR?

@vvchernov vvchernov changed the title WIP: Vc/pytorch lstm Vc/pytorch lstm Jul 12, 2021
@vvchernov (Contributor Author) commented:

Hello @masahi, I have finished my work and am awaiting your review and response.

self.device = device
self.batch_first = batch_first

# Network defition

Reviewer: Network definition

Member: yeah let's fix this @vvchernov

Contributor Author: done

@masahi masahi self-assigned this Jul 12, 2021
@masahi (Member) commented Jul 13, 2021

Thanks @vvchernov, this is great. Please fix the lint issue.

@vvchernov vvchernov force-pushed the vc/pytorch_lstm branch 3 times, most recently from e7c5ea2 to 883c261 Compare July 13, 2021 07:47
@vvchernov (Contributor Author) commented:

@masahi @jwfromm Hello guys, I have fixed all the lint issues and resolved the merge conflicts, but tvm-ci failed. I then picked a main commit with a successful Jenkins status (https://ci.tlcpack.ai/blue/organizations/jenkins/tvm/detail/main/1292/pipeline) and merged it into my branch, but tvm-ci failed again. In both cases the failures do not seem related to my code changes: first Build:GPU failed, and currently python3: i386 fails. What do you think? Perhaps you could give me the hash of a stable commit on the main branch.

@masahi (Member) commented Jul 13, 2021

@vvchernov Both failures look new to me; they could be flaky tests. Can you try running CI again (rebase and push)?

…orch frontend for case without biases. exception in the test for conversion LSTM with projection from pytorch to ONNX
@vvchernov (Contributor Author) commented:

@masahi I followed your recommendation and tvm-ci passed successfully. However, I noticed that my test did not cover the case of an LSTM layer without bias weights. I updated the test to check these cases in combination with the other LSTM modifications, and fixed the bugs that surfaced (please review the last commit). It seems that all LSTM modifications supported by PyTorch are now supported on the TVM side.

@vvchernov vvchernov force-pushed the vc/pytorch_lstm branch 2 times, most recently from 65a8607 to fc4ac84 Compare July 16, 2021 11:22
# if err < 1e-6:
# print("SUCCESS: RESULTS ARE THE SAME WITH MAX ERROR {} AND EPSILON {}".format(err, 1e-6))
# else:
# print("WARNING: RESULTS ARE NOT THE SAME WITH ERROR {}".format(err))
Member: Remove these comments

Contributor Author: done

dev = tvm.cpu(0)
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target=target, params=params)
elif format == "onnx":
Member: Remove onnx test

Contributor Author:

My idea in checking LSTM modifications converted from PyTorch to ONNX relates to the fact that the PyTorch frontend supports fewer ML layers than the ONNX one. This means a complicated neural network containing LSTM may not be supported on the TVM side from the PyTorch model directly, while the ONNX model converted from PyTorch may be. Moreover, stacked LSTM support has its own specifics, and I do not think it is tested on the ONNX frontend side.

Member:

Ok. Then please address my other comments. Also, you can avoid creating a temp directory by writing the onnx file to an in-memory buffer; see https://github.com/facebookresearch/detr/blob/master/test_all.py#L136-L140

@vvchernov (Contributor Author) commented Jul 19, 2021:

That approach does not work for onnx model loading, but I found the correct way using io.BytesIO. done
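
For illustration, a minimal sketch of the io.BytesIO approach under discussion (the toy model and sizes are hypothetical, not the PR's test code):

```python
import io

import onnx
import torch
from tvm import relay

# Hypothetical toy model standing in for the PR's LSTM test models.
model = torch.nn.LSTM(8, 16)
inp = torch.randn(5, 2, 8)

# Export to an in-memory buffer instead of a temp directory.
buf = io.BytesIO()
torch.onnx.export(model, inp, buf)

# Rewind the buffer and load the model back directly from memory.
buf.seek(0)
onnx_model = onnx.load(buf)
onnx.checker.check_model(onnx_model)

# The loaded model can then go through TVM's ONNX frontend.
mod, params = relay.frontend.from_onnx(onnx_model)
```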

# Create output directory to keep temporary files
out_dir = Path.cwd().joinpath("output")
out_dir.mkdir(exist_ok=True, parents=True)
has_proj = "p" in lstm_type
Member: No need to create a temp directory if you remove the onnx test.

Contributor Author: done

compare(tvm_output, golden_output_batch)

# Remove output directory with tmp files
shutil.rmtree(out_dir)
Member: Remove it along with the onnx test.

Contributor Author: done

# Model compilation by tvm
target = tvm.target.Target("llvm", host="llvm")
dev = tvm.cpu(0)
with tvm.transform.PassContext(opt_level=3):
Member: Please test on cuda too.
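
A minimal sketch of how a test can cover both targets (the helper and the input name "input" are hypothetical, not the PR's exact code):

```python
import tvm
from tvm import relay
from tvm.contrib import graph_executor

# Hypothetical helper: `mod` and `params` are assumed to come from
# relay.frontend.from_pytorch, with a single input named "input".
def run_on_target(mod, params, target_str, input_data):
    target = tvm.target.Target(target_str)
    dev = tvm.device(target_str, 0)
    with tvm.transform.PassContext(opt_level=3):
        lib = relay.build(mod, target=target, params=params)
    m = graph_executor.GraphModule(lib["default"](dev))
    m.set_input("input", input_data)
    m.run()
    return m.get_output(0).numpy()

# Always run on CPU; also run on GPU when one is present.
targets = ["llvm"]
if tvm.cuda().exist:
    targets.append("cuda")
```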

Contributor Author: done

@masahi (Member) commented Jul 17, 2021

@vvchernov Don't forget #8447 (comment)

@vvchernov vvchernov changed the title Vc/pytorch lstm [Frontend, pytorch] Vc/pytorch lstm Jul 18, 2021
…ng tmp dir was removed. remove unnecessary comments
@masahi masahi merged commit d5818b4 into apache:main Jul 20, 2021
@masahi (Member) commented Jul 20, 2021

Thanks @vvchernov!

@masahi (Member) commented Jul 29, 2021

@vvchernov I was informed that the test added in this PR is taking more than 20 min of CI time. Can you reduce the number of test cases?

See https://ci.tlcpack.ai/job/tvm/job/main/1356/testReport/cython.tests.python.frontend.pytorch/
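
One common way to trim such a test matrix (illustrative only; the type codes below follow the test's `has_proj = "p" in lstm_type` convention but are otherwise hypothetical) is to parametrize over individual features rather than every combination:

```python
import pytest
import torch

# Hypothetical reduced matrix: exercise each LSTM feature once
# ("b" = bidirectional, "s" = stacked, "p" = projection) instead of
# all combinations, to cut CI time.
@pytest.mark.parametrize("lstm_type", ["uni", "b", "s", "p"])
@pytest.mark.parametrize("use_bias", [True, False])
def test_lstms(lstm_type, use_bias):
    kwargs = {"bias": use_bias}
    if "b" in lstm_type:
        kwargs["bidirectional"] = True
    if "s" in lstm_type:
        kwargs["num_layers"] = 2
    if "p" in lstm_type:
        kwargs["proj_size"] = 4
    lstm = torch.nn.LSTM(8, 16, **kwargs)
    out, _ = lstm(torch.randn(5, 2, 8))
    assert out.shape[0] == 5
```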

@cheimu commented Sep 16, 2021

Hi there. I found this error still exists; I just cloned this repo yesterday.
[screenshot of the error]

@vvchernov (Contributor Author) commented:

Hello @cheimu! Dropout is not supported on the TVM side, so LSTM with dropout does not work. In general this is because some people think dropout is used only for training, but I know that for some TTS models it can be used at inference time. Unfortunately, the current state is that the dropout op is a dummy: it has attributes but no compute implementation. I do not know when dropout will be supported on the TVM side.
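
For context, dropout in a PyTorch LSTM is only applied between stacked layers, so a model like the following (illustrative sizes) runs into the limitation described above at conversion time:

```python
import torch

# dropout only takes effect when num_layers > 1; converting a model
# like this does not work, per the dropout limitation described above.
lstm = torch.nn.LSTM(8, 16, num_layers=2, dropout=0.5)
```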

ylc pushed a commit to ylc/tvm that referenced this pull request Sep 29, 2021
* lstm layer conversion to relay from pytorch model (TorchScript) was supported

* bidirectional LSTM layer was supported for pytorch API

* lstm tests were implemented. fixes in pytorch lstm

* fix pytorch bidirectional lstm. update test comment

* black format and some small fixes

* LSTM with projection was supported for pytorch frontend. test was updated by new combination of LSTM types

* lint fixes

* add bias switcher for LSTM types test. fix LSTM implementation in pytorch frontend for case without biases. exception in the test for conversion LSTM with projection from pytorch to ONNX

* transfer test_lstms to pytest format

* onnx model saving was implemented through io.BytesIO. creating/removing tmp dir was removed. remove unnecessary comments

* gpu target was added to the test

Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>
ylc pushed a commit to ylc/tvm that referenced this pull request Jan 13, 2022 (same commit list as above)