RNN-style train and eval for S4/S4D #49

Closed
mingweima opened this issue Jun 24, 2022 · 3 comments

@mingweima

Excellent idea and great paper!
Could you please provide a concrete example of how to both train and evaluate using the stateful RNN version of S4/S4D? I have only found an evaluation example in the SaShiMi code, and no example for training.
Thank you!

@mingweima changed the title from "RNN-style train and evalusing S4/S4D" to "RNN-style train and eval for S4/S4D" on Jun 24, 2022

@albertfgu
Contributor

You can find this functionality in v1 of this codebase: pass an initial state into the forward pass of the S4 module. Unfortunately, it has since been discontinued, because it is a non-trivial technical addition that is difficult to maintain and has not been published yet. We are thinking of putting up a short technical report with the details and adding official support for it.

If you really need this functionality, you can adapt it from v1 of this codebase. Alternatively, you can modify the RNN mode (the step function), but this could be very slow.
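
For readers landing here, a minimal sketch of what "passing an initial state" and "RNN mode with a step function" mean in principle. It assumes a toy diagonal SSM whose parameters are already discretized, and it is pure Python rather than the repository's S4 module or its v1 interface. All names (`step`, `run_recurrent`, `run_convolutional`, `a_bar`, `b_bar`, `c`) are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Tuple

State = List[complex]


def step(a_bar: State, b_bar: State, c: State,
         state: State, u: float) -> Tuple[float, State]:
    """One recurrent update: x_k = a_bar * x_{k-1} + b_bar * u_k, y_k = Re(c . x_k)."""
    new_state = [a * x + b * u for a, b, x in zip(a_bar, b_bar, state)]
    y = sum((ci * xi).real for ci, xi in zip(c, new_state))
    return y, new_state


def run_recurrent(a_bar: State, b_bar: State, c: State,
                  inputs: List[float],
                  state: Optional[State] = None) -> Tuple[List[float], State]:
    """Process a chunk one step at a time and return the outputs plus the final state.

    Passing the returned state into the next call continues the sequence
    instead of restarting from zero.
    """
    if state is None:
        state = [0j] * len(a_bar)  # assumption: zero initial state
    outputs = []
    for u in inputs:
        y, state = step(a_bar, b_bar, c, state, u)
        outputs.append(y)
    return outputs, state


def run_convolutional(a_bar: State, b_bar: State, c: State,
                      inputs: List[float]) -> List[float]:
    """Same map computed as a causal convolution with K_j = Re(sum_i c_i a_i^j b_i).

    This form assumes a zero initial state, which is why carrying state
    across chunks needs extra handling in convolutional mode.
    """
    length = len(inputs)
    kernel = [
        sum((ci * ai ** j * bi).real for ai, bi, ci in zip(a_bar, b_bar, c))
        for j in range(length)
    ]
    return [
        sum(kernel[j] * inputs[k - j] for j in range(k + 1))
        for k in range(length)
    ]


if __name__ == "__main__":
    # Arbitrary toy parameters, not taken from any trained model.
    a_bar = [0.9 + 0.1j, 0.5 - 0.3j]
    b_bar = [1.0 + 0j, 0.5 + 0j]
    c = [0.3 - 0.2j, 1.0 + 0.5j]
    u = [1.0, -0.5, 2.0, 0.0, 1.5, -1.0]

    full, _ = run_recurrent(a_bar, b_bar, c, u)
    head, carried = run_recurrent(a_bar, b_bar, c, u[:3])
    tail, _ = run_recurrent(a_bar, b_bar, c, u[3:], carried)
    conv = run_convolutional(a_bar, b_bar, c, u)

    print("chunked vs full:", max(abs(x - y) for x, y in zip(full, head + tail)))
    print("conv vs recurrent:", max(abs(x - y) for x, y in zip(full, conv)))
```

The per-step loop is the part that could be very slow on long sequences. The convolutional form is cheaper to compute in bulk but, as written here, has no place to inject a carried-over state.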

@mingweima
Author

Thanks!

@albertfgu
Contributor

If you haven't seen it yet, this functionality has been re-introduced and improved: see the README.

As I mentioned previously, this functionality is non-trivial and unpublished, so we would appreciate it if you contacted us privately should it end up being important for a project.
