Put wer tag entity type in SBS output #32
src/wer.cpp
string tk_classLabel = tk_pair->classLabel;
string ref_tk = tk_pair->ref;
string hyp_tk = tk_pair->hyp;
/* while (visitor.NextTriple(tk_pair)) { */
TODO: delete unused code
Yeah, doesn't this mean topAlignment is no longer needed?
It still gets used below
For the ngrams? Goodness, that's yucky. I guess at least get rid of the tk_pair and visitor initialization then.
LGTM, just looks like that one function needs a refactor
NlpFstLoader::NlpFstLoader(std::vector<RawNlpRecord> &records, Json::Value normalization, bool processLabels)
NlpFstLoader::NlpFstLoader(std::vector<RawNlpRecord> &records, Json::Value normalization,
Json::Value wer_sidecar, bool processLabels)
Nit: put nWerSidecar(wer_sidecar) in initializer list instead of line 27
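To illustrate the nit, here is a minimal sketch of initializing a member in the constructor's initializer list rather than assigning it in the body. `LoaderSketch` and `SidecarStub` are hypothetical stand-ins (the real class takes a `Json::Value`, and its other members are not shown in this thread); only the `nWerSidecar(wer_sidecar)` pattern comes from the comment.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for Json::Value so the sketch compiles without jsoncpp.
using SidecarStub = std::map<std::string, std::string>;

// Hypothetical loader; only the nWerSidecar member name mirrors the review comment.
class LoaderSketch {
 public:
  // The member is constructed directly from the argument in the initializer
  // list, instead of being default-constructed and then assigned in the body.
  LoaderSketch(SidecarStub wer_sidecar, bool processLabels)
      : nWerSidecar(std::move(wer_sidecar)), processLabels_(processLabels) {}

  const SidecarStub &sidecar() const { return nWerSidecar; }
  bool processLabels() const { return processLabels_; }

 private:
  SidecarStub nWerSidecar;
  bool processLabels_;
};
```

Initializing in the list avoids a default construction followed by an assignment, which matters more for heavier types such as a JSON value than for plain flags.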
The WER tag sidecar file carries accompanying information for tokens in an NLP file. One such piece of information is an "entity_type", such as an NER tag. It is useful to see this in the output SBS file.
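As a rough sketch of the idea described above, the snippet below looks up an entity type for a token's WER tag and appends it to a side-by-side row. Everything here is an assumption for illustration: the sidecar is modeled as a plain map from tag ID to entity type, and `LookupEntityType`, `FormatSbsRow`, and the tab-separated layout are hypothetical, not the project's actual schema or output format.

```cpp
#include <map>
#include <string>

// Hypothetical in-memory form of the sidecar: WER tag ID -> entity type.
using EntityTypeByTag = std::map<std::string, std::string>;

// Returns the entity type for a tag, or an empty string when the tag has no entry.
std::string LookupEntityType(const EntityTypeByTag &sidecar, const std::string &tag_id) {
  auto it = sidecar.find(tag_id);
  return it == sidecar.end() ? std::string() : it->second;
}

// Builds one tab-separated row: ref token, hyp token, entity type (assumed layout).
std::string FormatSbsRow(const std::string &ref_tk, const std::string &hyp_tk,
                         const std::string &entity_type) {
  return ref_tk + "\t" + hyp_tk + "\t" + entity_type;
}
```

Returning an empty string for unknown tags keeps rows for untagged tokens well-formed, with an empty entity type column.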