
[Feature Request] Add cuDNN-accelerated LSTM and GRU to PyTorch #19177

Open
foxik opened this issue Feb 14, 2024 · 4 comments
Labels
stale · stat:contributions welcome (A pull request to fix this issue would be welcome.) · type:feature (The user is asking for a new feature.)

Comments

foxik (Contributor) commented Feb 14, 2024

Hi,

are there any plans to add cuDNN-accelerated versions of LSTM and GRU to the PyTorch backend? Without cuDNN acceleration, LSTM and GRU are considerably (several times) slower, even when running on a GPU; yet we still use RNNs heavily (for example, adding them after a Transformer encoder still helps in some cases).

The torch.nn.LSTM/torch.nn.GRU layers offer cuDNN acceleration, and wrapping them in a keras.layers.Layer works, but the resulting model is not backend-agnostic (so it cannot be used across frameworks); see the sketch below.
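
For reference, a minimal sketch of such a wrapper (the class name is made up and parameter-tracking details are glossed over); it runs only on the torch backend, which is exactly the portability problem:

```python
import keras
import torch

# Hypothetical wrapper; only valid when Keras runs on the torch backend,
# so a model using it cannot be loaded under TensorFlow or JAX.
class TorchBackendLSTM(keras.layers.Layer):
    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.units = units

    def build(self, input_shape):
        # torch.nn.LSTM dispatches to the fused cuDNN kernel automatically
        # when its inputs and weights live on a CUDA device.
        self.lstm = torch.nn.LSTM(
            input_size=input_shape[-1],
            hidden_size=self.units,
            batch_first=True,
        )

    def call(self, inputs):
        outputs, (h_n, c_n) = self.lstm(inputs)
        return outputs  # all time steps; h_n/c_n are the final states
```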

Thanks for considering this 🙏 and cheers!

PS: Relatedly, torch.nn.LSTM/GRU offer bidirectional computation in a single call (by passing bidirectional=True). I am not sure how much faster that is compared to two independent unidirectional computations, but if it is faster, keras.layers.Bidirectional would probably have to special-case keras.layers.LSTM and keras.layers.GRU to support it.
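
For illustration, the fused call next to its two-pass equivalent in plain torch (shapes are arbitrary examples; the actual speed difference would need benchmarking):

```python
import torch

x = torch.randn(8, 100, 64)  # (batch, time, features)

# One fused call: with bidirectional=True, cuDNN runs both directions
# inside a single invocation.
bi = torch.nn.LSTM(64, 128, batch_first=True, bidirectional=True)
out, _ = bi(x)  # out.shape == (8, 100, 256): both directions concatenated

# Two-pass equivalent with independent unidirectional layers:
fwd = torch.nn.LSTM(64, 128, batch_first=True)
bwd = torch.nn.LSTM(64, 128, batch_first=True)
out_fwd, _ = fwd(x)
out_bwd, _ = bwd(torch.flip(x, dims=[1]))  # run on the reversed time axis
out_two = torch.cat([out_fwd, torch.flip(out_bwd, dims=[1])], dim=-1)
```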

sachinprasadhs added the type:feature and keras-team-review-pending (Pending review by a Keras team member.) labels on Feb 14, 2024
haifeng-jin (Contributor) commented
@foxik Thanks for the issue!
Would you like to contribute this by modifying the following file?
https://github.com/keras-team/keras/blob/master/keras/backend/torch/rnn.py#L377C1-L382C30

sachinprasadhs added the stat:awaiting response from contributor label and removed the keras-team-review-pending label on Feb 15, 2024
foxik (Contributor, Author) commented Feb 16, 2024

@haifeng-jin I am not sure I can do it correctly. I assume that

  • cudnn_ok will probably also need to consider the current device (whether it is CUDA or not), in addition to verifying that the arguments are supported by the cuDNN implementation
    • that may require changes in the code, because cudnn_ok is currently called only for the TensorFlow backend; on the other hand, it is used only to set supports_jit to False, which is probably not needed for PyTorch, because the sources indicate that TorchScript can compile torch.nn.LSTM/GRU
  • torch.nn.LSTM/GRU is a whole layer including its own parameters, but we need to use the given ones. Therefore, we should probably call torch._VF.lstm/gru directly, but I am not sure whether that would be considered OK
  • nontrivial care must be taken to ensure that the results of the cuDNN branch match those of the usual branch
  • go_backwards has no direct analogue in the Torch API, so some manual reversing will be needed (see the sketch after this list)
  • on the other hand, a bidirectional run is supported by the Torch API, so
    • similarly to backend.lstm/gru, backend.lstm_bidirectional/gru_bidirectional should be introduced
    • the Bidirectional wrapper should try calling this lstm/gru_bidirectional to use the cuDNN-accelerated bidirectional call (only the PyTorch backend would implement this method)
    • with this support in place, go_backwards would rarely be used in PyTorch, so it would not matter much if its implementation were not great
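
For concreteness, a rough sketch of the device check and the manual reversing (the checks mirror what the TensorFlow backend's cudnn_ok verifies; function names and details are assumptions, not the actual Keras API):

```python
import keras
import torch

def cudnn_ok(activation, recurrent_activation, unroll, use_bias):
    # Assumed check: cuDNN's fused RNN kernels are hard-wired to tanh and
    # sigmoid activations and only exist on CUDA devices. A real
    # implementation would inspect the device of the actual inputs/weights
    # rather than mere CUDA availability.
    return (
        activation is keras.activations.tanh
        and recurrent_activation is keras.activations.sigmoid
        and not unroll
        and use_bias
        and torch.cuda.is_available()
    )

def maybe_reverse(inputs, go_backwards, time_axis=1):
    # go_backwards has no torch equivalent: reverse the time axis manually
    # before the fused call (and un-reverse the outputs afterwards).
    return torch.flip(inputs, dims=[time_axis]) if go_backwards else inputs
```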

In any case, for the time being I unfortunately do not have time to work on this.

haifeng-jin added the stat:contributions welcome (A pull request to fix this issue would be welcome.) label on Apr 16, 2024
park commented Jul 19, 2024

This feature would be great indeed. Hopefully someone highly capable will attend to it sometime soon.

This issue is stale because it has been open for 180 days with no activity. It will be closed if no further activity occurs. Thank you.
