Word segmentation of LatexEquation123 #39
Comments
This is expected behavior, since the input contains capital letters, it prevents If you pass in
Thanks for the clarification.
Currently there is no such option. The original author has suggested a possible solution but has not implemented it in the original code, and I am not sure how to implement it in the current code.
Alright, it seems like lowercasing the input beforehand and restoring the capitalization afterwards could be the workaround for now.
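For anyone landing here later, the lowercase-then-recapitalize workaround could be sketched roughly as below. This is only an illustration under stated assumptions, not code from the library: `segment_preserving_case` and `toy_segment` are hypothetical names, the `segment` argument stands in for whatever segmenter you actually call, and the sketch assumes that segmenter only inserts spaces (it does not correct spelling) and that the input is ASCII, so lowercasing does not change its length.

```python
from typing import Callable


def segment_preserving_case(text: str, segment: Callable[[str], str]) -> str:
    """Segment a lowercased copy of `text`, then restore the original casing.

    `segment` is a hypothetical stand-in for the real segmenter. It is
    assumed to only insert spaces and never change, add, or drop other
    characters.
    """
    segmented = segment(text.lower())
    # Original characters, in order, with any whitespace skipped.
    original = (ch for ch in text if not ch.isspace())
    restored = []
    for ch in segmented:
        if ch.isspace():
            # Keep the spaces that the segmenter inserted.
            restored.append(ch)
        else:
            # Swap in the original character; fall back to the segmented
            # one if the original is unexpectedly exhausted.
            restored.append(next(original, ch))
    return "".join(restored)


def toy_segment(lowered: str) -> str:
    # Toy stand-in, hard-coded for this single example. It is NOT the
    # library's algorithm and says nothing about what the library returns.
    table = {"latexequation123": "latex equation 123"}
    return table.get(lowered, lowered)


print(segment_preserving_case("LatexEquation123", toy_segment))
```

In real use, `toy_segment` would be replaced by a small wrapper around the actual segmentation call that returns the segmented string. If the segmenter also corrects spelling, the character-by-character mapping above no longer lines up and a different way of restoring the case would be needed.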
I recently found that some examples are not segmented as expected. For instance, the segmentation of `LatexEquation123` is `La tex Equ at ion 123`, but the expected output should be `Latex Equation 123`. I checked the frequency entries in `frequency_dictionary_en_82_765.txt` and found `latex` and `equation`.

Is this expected in terms of the algorithm?