Sentence 4556 has more than 256 words. Can not handle such long sentence. Please cut it short first! #4

Open
ajesujoba opened this issue Jan 11, 2021 · 1 comment


@ajesujoba

I want to create a suffix array index of the source and target sides of my training bitext. But it appears I cannot process sentences with more than 256 words. Is there a way I can increase the maximum number of words per sentence to 512 or 1024?

@hieuhoang
Contributor

No idea, I'm afraid. No one has worked on the code for years. If you fix it, please create a pull request.
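
Until the limit is raised in the code itself, one possible workaround is to follow the error message's own suggestion and shorten the input before indexing. The sketch below drops every sentence pair in which either side has more than 256 words. It is not taken from this project: the function name `filter_bitext` is hypothetical, and it assumes that words are whitespace-separated tokens and that the bitext is available as aligned (source, target) string pairs.

```python
from typing import Iterable, Iterator, Tuple

# Limit quoted in the error message; sentences with MORE than this are rejected.
MAX_WORDS = 256


def filter_bitext(
    pairs: Iterable[Tuple[str, str]], max_words: int = MAX_WORDS
) -> Iterator[Tuple[str, str]]:
    """Yield only the pairs where both sides have at most max_words words.

    Assumption: words are whitespace-separated tokens. The pair is dropped as a
    whole so that the source and target sides stay aligned line by line.
    """
    for src, tgt in pairs:
        if len(src.split()) <= max_words and len(tgt.split()) <= max_words:
            yield src, tgt


if __name__ == "__main__":
    sample = [
        ("a short source sentence", "a short target sentence"),
        (" ".join(["w"] * 257), "target of an over-long source"),
    ]
    for src, tgt in filter_bitext(sample):
        print(src, "|||", tgt)
```

Dropping whole pairs keeps the two sides aligned, at the cost of losing the long sentences. Splitting them into shorter segments would keep the data but would need an alignment-aware splitter, which this sketch does not attempt.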
