Questions when constructing lexicon #18

Alan5279 · 2023-02-24T05:37:06Z

When constructing the n-gram lexicon for memory, did you use the test dataset? I noticed that in run.sh, --eval_data_path is set to the test dataset. From the code, however, I assumed that the lexicon is built only from the train and eval data, and that the test set is used only with the --do_test option.
Moreover, if the test set is used to construct the lexicon, some of its words and their statistical features are already known to the model before evaluation.
That does not seem right for a fair experiment.
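
To make the concern concrete, here is a minimal sketch of the kind of leakage I mean. It is not the repository's code: `ngrams`, `build_lexicon`, the `max_n` and `min_freq` parameters, and the toy data are all hypothetical. It only illustrates how adding test sentences to the lexicon source introduces n-grams that the training data alone would never provide.

```python
from collections import Counter


def ngrams(tokens, max_n=2):
    """Yield every n-gram (as a tuple) of length 1..max_n from a token list."""
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield tuple(tokens[i:i + n])


def build_lexicon(sentences, max_n=2, min_freq=1):
    """Collect n-grams that occur at least min_freq times (hypothetical criterion)."""
    counts = Counter(g for s in sentences for g in ngrams(s, max_n))
    return {g for g, c in counts.items() if c >= min_freq}


# Toy data standing in for the real splits.
train = [["a", "b", "c"], ["a", "b"]]
test = [["b", "c", "d"]]

clean = build_lexicon(train)          # lexicon from training data only
leaky = build_lexicon(train + test)   # lexicon that also sees the test set

# N-grams that enter the lexicon only because the test set was included.
test_only = leaky - clean
print(sorted(test_only))
```

If `test_only` is non-empty on the real data, the model has access to test-set n-grams through the lexicon, which is the situation I am asking about.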
