Hyperparameters #2

Open
JiaqiLi404 opened this issue Nov 6, 2024 · 1 comment

Comments

@JiaqiLi404

Hi

Thanks for your efforts! Could you share the final hyperparameters used for the model and for training? With the default settings, the performance I get is underwhelming. Also, do you know when the pretrained weights will be released?

Thanks!

@lsy0882
Owner

lsy0882 commented Nov 7, 2024

We are currently re-running our research using architectural insights gained from the Mamba-v2 network and RDFA-S4, so the pretrained weights file has been temporarily removed.

Until the results of this new research are released, it may help to apply the recurrent operations method described in the paper to the Mamba-base network for the TAL task.
