
Is this code really based on what the paper says? #36

Open
basbaba opened this issue Jul 3, 2023 · 0 comments

Comments


basbaba commented Jul 3, 2023

We are studying TIIM and have found several problems. The two most critical points of confusion are:

  1. The code uses model/transformer/Transformer rather than model/transformer/TransformerMonotonic, even though the monotonic (MoChA) attention is presented as the main idea in the paper.
  2. In TransformerMonotonic, the image features are organized as HxNWxC, which means the features are scanned row by row rather than column by column, yet the paper emphasizes columns and explains why vertical features help the translation (see the sketch after this list for what the two scan orders look like).
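To make the distinction concrete, here is a minimal sketch of how the flattening order decides whether a transformer sees the image row by row or column by column. This is not code from the TIIM repo; the tensor names and shapes are made up for illustration, assuming PyTorch-style (N, C, H, W) features:

```python
import torch

# Toy feature map: batch N=1, channels C=2, height H=3, width W=4.
N, C, H, W = 1, 2, 3, 4
feats = torch.arange(N * C * H * W, dtype=torch.float32).view(N, C, H, W)

# Row-by-row scan: flatten so that consecutive tokens walk along each
# image row. Shape becomes (N, H*W, C).
row_major = feats.permute(0, 2, 3, 1).reshape(N, H * W, C)

# Column-by-column scan: swap H and W before flattening, so consecutive
# tokens walk down each image column. Shape becomes (N, W*H, C).
col_major = feats.permute(0, 3, 2, 1).reshape(N, W * H, C)

# First W tokens of row_major come from a single image row; first H
# tokens of col_major come from a single image column (the "vertical
# features" the paper emphasizes).
print(row_major[0, :W, 0])  # tensor([0., 1., 2., 3.])
print(col_major[0, :H, 0])  # tensor([0., 4., 8.])
```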

Did I misunderstand the paper and the code? Please correct me if I'm wrong.

Thanks!
