Feature request
Add support for converting Marian Transformer-RNN models.
Motivation
Firefox is doing great work training new models. Their teacher models can be converted to PyTorch models using the existing conversion tooling.
However, their student models, which are much smaller and more efficient, are structured as Transformer-RNN. The conversion tools do not currently support this architecture. Would it even be possible to add support for it?
Your contribution
At this point I'm just a parrot seeking help and information on this topic. What exactly this involves is far outside my current knowledge. Perhaps if someone could point me in the right direction, I could figure it out.
Hi @FricoRico, I don't know if the Marian modeling code in Transformers supports Transformer-RNN architectures either! This means that you'd need to either:
@Rocketknight1 Yeah, you are right, ideally the Marian tools would also need to be expanded to support Transformer-RNN for inference. But I guess step one would be to allow exporting the models in the first place. Perhaps I'm oversimplifying things, though.
In general, we need modeling code first in order to support conversion of model checkpoints, rather than the other way around! The model code provides the "architecture" that runs a particular set of weights.
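To make the point above concrete, here is a minimal sketch in plain Python of why conversion depends on the modeling code. Everything in it is an assumption for illustration: the parameter names, the shapes, and the `convert` helper are hypothetical and are not taken from Marian or from the Transformers conversion script. The idea is only that a converter renames checkpoint tensors into slots the architecture defines, so a tensor with no matching slot has nowhere to go.

```python
# Hypothetical sketch: none of these names come from Marian or Transformers.

# Slots defined by the (hypothetical) modeling code: parameter name -> shape.
ARCHITECTURE_SLOTS = {
    "decoder.layers.0.self_attn.q_proj.weight": (4, 4),
    "decoder.layers.0.self_attn.k_proj.weight": (4, 4),
}

# Rename rules a conversion script would apply: source name -> target slot.
RENAME_RULES = {
    "decoder_l1_self_Wq": "decoder.layers.0.self_attn.q_proj.weight",
    "decoder_l1_self_Wk": "decoder.layers.0.self_attn.k_proj.weight",
}


def shape_of(matrix):
    """Return (rows, cols) for a non-empty list-of-lists matrix."""
    return (len(matrix), len(matrix[0]))


def convert(checkpoint):
    """Map checkpoint tensors onto architecture slots.

    Returns (converted, unsupported): the renamed tensors, plus the source
    names that have no slot in the architecture.
    """
    converted, unsupported = {}, []
    for name, tensor in checkpoint.items():
        target = RENAME_RULES.get(name)
        if target is None or target not in ARCHITECTURE_SLOTS:
            # No modeling code knows how to run this weight.
            unsupported.append(name)
            continue
        if shape_of(tensor) != ARCHITECTURE_SLOTS[target]:
            raise ValueError(f"shape mismatch for {name} -> {target}")
        converted[target] = tensor
    return converted, unsupported


zeros = [[0.0] * 4 for _ in range(4)]

# A checkpoint with attention weights only converts cleanly.
teacher_checkpoint = {"decoder_l1_self_Wq": zeros, "decoder_l1_self_Wk": zeros}

# A checkpoint with a recurrent decoder weight has no slot to land in.
student_checkpoint = {"decoder_l1_rnn_W": zeros}

print(convert(teacher_checkpoint)[1])
print(convert(student_checkpoint)[1])
```

In this toy setup the recurrent weight is reported as unsupported rather than converted. Adding a rename rule alone would not help, because nothing in the architecture would compute with that tensor; the slot, and the code that uses it, has to exist first.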