Redundant sourceWordsBeingConsidered #2
Comments
Thanks for catching this; what you have done is what was originally intended. The alignments should still be the same, because of the two continue statements on lines 1293 and 1297. I will update the source soon.
Great! Thank you.
Hi, I'm also running into performance issues. Could you please provide your adjusted code? Many thanks.
@eoehri
In aligner.py, lines 1267 and 1268 may append each source/target word many times to the sourceWordsBeingConsidered/targetWordsBeingConsidered lists, which makes these lists very large because of redundant elements. I do not see the point of including word indices more than once, since it makes the next loop (line 1285) very time-consuming.
To speed up execution, I converted the sourceWordsBeingConsidered and targetWordsBeingConsidered lists to sets to remove duplicates. It is far faster now and I get the same alignment in testalign.py; however, I want to be sure that this does not degrade alignment quality in other cases. Can you please confirm that removing the redundancy is safe?
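For readers who want to try the same change, here is a minimal sketch of the deduplication idea. It is not the code from aligner.py: the sample index lists and the helper name dedupe_preserving_order are hypothetical, and it uses dict.fromkeys rather than a plain set so that first-seen order is kept, which may or may not matter for the real loop.

```python
from itertools import product


def dedupe_preserving_order(indices):
    """Return the indices with duplicates dropped, keeping first-seen order."""
    # dict keys are unique and remember insertion order (Python 3.7+).
    return list(dict.fromkeys(indices))


# Hypothetical data: word indices appended repeatedly, as described in the issue.
source_words_being_considered = [3, 3, 5, 3, 5, 7, 7, 3]
target_words_being_considered = [2, 4, 2, 2, 4, 4]

unique_source = dedupe_preserving_order(source_words_being_considered)
unique_target = dedupe_preserving_order(target_words_being_considered)

# A nested loop over both lists visits len(source) * len(target) pairs.
pairs_before = len(source_words_being_considered) * len(target_words_being_considered)
pairs_after = len(unique_source) * len(unique_target)

# The distinct (source, target) pairs visited are the same either way.
distinct_before = set(product(source_words_being_considered, target_words_being_considered))
distinct_after = set(product(unique_source, unique_target))

print(pairs_before, pairs_after, distinct_before == distinct_after)
```

The nested loop still sees every distinct pair, only without the repeats. Whether the final alignment is identical depends on the loop body treating a repeated pair as a no-op, which is what the maintainer's remark about the two continue statements suggests.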