The section "Choosing stemming table" does not clarify how exactly the evaluation was done.
It should differentiate between the data used for training and the data used for validation. Was the same data used for validation of both tables?
I would also like to repeat the evaluation using training data different from the validation data, just as Andrzej Białecki did in his original implementation, to account for the fact that a bigger training table may be overfitted and fail to handle new words (see my discussion in https://datascience.stackexchange.com/questions/84652/stemmer-or-dictionary).
In particular:
- split the data into training and test sets
- validate on both the test data alone and on the combined training+test data
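The protocol above can be sketched as follows. This is only an illustration of the split-and-evaluate idea, not the real Stempel pipeline: `train_stemmer` here is a toy stand-in that learns suffixes seen during training, whereas the actual experiment would build a Stempel table from the training pairs.

```python
import random

# Toy stand-in for training: collect the suffixes observed in the
# training pairs (in the real experiment this would build a Stempel table).
def train_stemmer(train_pairs):
    suffixes = set()
    for form, lemma in train_pairs:
        if form.startswith(lemma):
            suffixes.add(form[len(lemma):])
    return suffixes

# Toy stand-in for stemming: strip the longest known suffix.
def stem(suffixes, form):
    for suf in sorted(suffixes, key=len, reverse=True):
        if suf and form.endswith(suf):
            return form[:-len(suf)]
    return form

def evaluate(pairs, seed=0, test_fraction=0.2):
    """Split (form, lemma) pairs, train on the training part only,
    and report accuracy on the held-out test set and on all data."""
    rng = random.Random(seed)
    shuffled = list(pairs)
    rng.shuffle(shuffled)
    split = int(len(shuffled) * (1 - test_fraction))
    train, test = shuffled[:split], shuffled[split:]
    table = train_stemmer(train)

    def accuracy(dataset):
        return sum(stem(table, f) == l for f, l in dataset) / len(dataset)

    # Accuracy on unseen words vs. accuracy on the full training+test data:
    # a large gap would indicate the overfitting suspected above.
    return accuracy(test), accuracy(train + test)
```

Comparing the two returned numbers for both tables would make the overfitting effect visible directly.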
How well does it handle words unknown during training? This is easy to test for the stemmer trained on the original dictionary -- just use words from Polimorf that are not in the original dictionary. But how can it be tested for the stemmer trained on the Polimorf dictionary, given that Polimorf is the most complete dictionary I know of?
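The first half of that test is straightforward to express. A minimal sketch, with toy stand-ins for the real data (`original_forms` for the forms in the original training dictionary, `polimorf_pairs` for Polimorf entries):

```python
# Toy data standing in for the real dictionaries.
original_forms = {"kotem", "kota", "domem"}
polimorf_pairs = [("kotem", "kot"), ("psami", "pies"), ("domu", "dom")]

# Keep only Polimorf forms the original-dictionary stemmer never saw
# during training; these make up the unknown-word test set.
unknown = [(form, lemma) for form, lemma in polimorf_pairs
           if form not in original_forms]
```

The open question remains for the Polimorf-trained stemmer, since no larger dictionary is available to supply unseen words.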