wikiner datasets / NER training and add to model #13381
Unanswered
CarloPederiva asked this question in Help: Coding & Implementations
Replies: 0 comments
Dear all,

I have a question about training an NER model, or more precisely, about adding an NER component to de_dep_news_trf. I have a pre-annotated WikiNER dataset that I want to use for training, so I created the base_config file and ran the command to generate the full config file.

My problem is that the WikiNER data comes as a .bz2 archive which unpacks to a .txt file. Using spaCy 3.0, do I first have to convert this .txt file to JSON (or JSONL), since the spacy convert command does not support .txt as an input format? If a JSON file is required, it seems the data has to follow a specific structure rather than being an ordinary .json file (which is what I have by now), because my current file cannot be converted. Do you have any ideas on how to do that?

Also, is my overall approach correct: first train the NER on WikiNER, add it to the existing de_dep_news_trf pipeline, save it to disk as a new model, and then keep training/correcting it on other datasets? Thanks heaps!
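In case it clarifies what I mean, here is a rough, untested sketch of the conversion I had in mind. It assumes each non-empty line of the unpacked WikiNER file is one sentence with tokens written as word|POS|IOB-tag (e.g. `Die|ART|O Schweiz|NE|I-LOC`), and the file name is just a placeholder. The idea is to skip the JSON step entirely and write a binary .spacy DocBin directly:

```python
# Minimal sketch (untested): convert a WikiNER .txt dump into spaCy's
# binary .spacy format so it can be used with `spacy train`.
# Assumes one sentence per line, tokens formatted as "word|POS|IOB-tag".
import spacy
from spacy.tokens import Doc, DocBin
from spacy.training import iob_to_biluo, biluo_tags_to_spans

nlp = spacy.blank("de")
doc_bin = DocBin()

with open("aij-wikiner-de-wp2.txt", encoding="utf-8") as f:  # placeholder name
    for line in f:
        line = line.strip()
        if not line:
            continue
        words, iob_tags = [], []
        for token in line.split(" "):
            word, _pos, tag = token.rsplit("|", 2)
            words.append(word)
            iob_tags.append(tag)
        doc = Doc(nlp.vocab, words=words)
        # Convert the IOB tags to entity spans and attach them to the Doc
        doc.ents = biluo_tags_to_spans(doc, iob_to_biluo(iob_tags))
        doc_bin.add(doc)

doc_bin.to_disk("wikiner_train.spacy")
```

From there, training should just be a matter of pointing the config at the generated file, e.g. `python -m spacy train config.cfg --paths.train ./wikiner_train.spacy --paths.dev ./wikiner_dev.spacy --output ./wikiner_ner_model` (with a dev split carved out of the data first).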
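And for the second part of my question, this is roughly what I mean by adding the NER to de_dep_news_trf and saving it as a new model. Again untested, and `wikiner_ner_model/model-best` is just my hypothetical training output directory:

```python
# Rough sketch (untested): combine the freshly trained NER with the existing
# de_dep_news_trf pipeline and save the result to disk as a new model.
import spacy

dep_nlp = spacy.load("de_dep_news_trf")
ner_nlp = spacy.load("wikiner_ner_model/model-best")  # hypothetical path

# Source the trained NER component into the existing pipeline.
# If the NER was trained with a listener to a separate tok2vec/transformer
# component, that component would have to be sourced (and renamed) as well,
# since it cannot share the de_dep_news_trf transformer it was not trained with.
dep_nlp.add_pipe("ner", source=ner_nlp, last=True)

# Save the combined pipeline as a new model directory.
dep_nlp.to_disk("de_dep_news_trf_with_ner")
```

Is that the right direction, or is there a better-supported way to do this?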