Rules Improvement for French #38
Hi @richardpaulhudson
Hi @Pantalaymon, thank you very much for this and please accept my sincere apologies for taking so unacceptably long to get back to you. Coreferee is still being maintained and will still be maintained in the future; with me having changed employers I seem to have missed the original PR notification in December. I am currently doing experiments into ways of improving the accuracy specifically for English. The most likely outcome — although this is by no means set in stone — is that we will end up implementing a new library for English coreference. Coreferee will definitely still be supported for the other languages and it may well be that the results of the experiments point to some cross-language improvements that can be made to Coreferee as well. Your suggestion to implement rules to filter noun-noun coreference sounds like a very good idea and I shall definitely look into this further. Two questions about this PR:
Hi @richardpaulhudson, very interesting. So it would be a new library, independent from base spaCy? Regarding my suggestion, I think it partly exceeds the original focus of Coreferee, which was anaphora resolution, since noun-noun pairing operates mostly at a cross-language level and within a rule-based system. However, if you really plan to build on this project to implement a larger, multi-language coreference resolution solution for spaCy, I am 100% convinced that language-specific rules for noun-noun coreference would be worth designing. Regarding your questions:
By the way, regarding the evaluation of whole coreference chains, I have been able to evaluate the tool for French with more standard metrics here, using the CoNLL format. The results are not great, for the reasons explained below, but still acceptable.
Hello,
As I will be using Coreferee in a new project, I am still working on improving the rules.
I added a few more rules in lang/fr/language_rules.py, as well as a few tests in tests/fr to make sure they work as expected. There are also some edits to the lang/fr/data files, which are used by the rules.
Regarding the new rules, I don't know if you plan to use the same rules for the spaCy-native solution that you are developing, but I just wanted to share that, on top of the language-specific rules for noun/anaphora - anaphora pairs, the system would greatly benefit from language-specific rules for noun - noun coreferring pairs. For instance, such rules could prevent singular named entities (say, John Doe) from coreferring with plural nouns (say, the people) or with gender-incompatible nouns.
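To make the noun - noun suggestion concrete, here is a minimal sketch of such a compatibility filter. It is not Coreferee's API and not the code in this PR: the Mention class and the may_corefer function are hypothetical names invented for illustration, and the feature values ("Sing", "Plur", "Masc", "Fem") are assumed to follow the Universal Dependencies labels. Unknown features are treated as compatible so that the filter only rejects pairs that clearly disagree.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Mention:
    """Hypothetical stand-in for a noun mention (not a Coreferee or spaCy class)."""
    text: str
    number: Optional[str] = None  # "Sing", "Plur", or None when unknown
    gender: Optional[str] = None  # "Masc", "Fem", or None when unknown


def features_agree(a: Optional[str], b: Optional[str]) -> bool:
    # An unknown value on either side is treated as compatible,
    # so missing morphology never blocks a pair on its own.
    return a is None or b is None or a == b


def may_corefer(first: Mention, second: Mention) -> bool:
    """Return False only when number or gender clearly rules the pair out."""
    return (features_agree(first.number, second.number)
            and features_agree(first.gender, second.gender))


# Illustrative mentions; the French noun phrases are made-up examples.
john = Mention("John Doe", number="Sing", gender="Masc")
people = Mention("les gens", number="Plur", gender="Masc")
man = Mention("l'homme", number="Sing", gender="Masc")
director = Mention("la directrice", number="Sing", gender="Fem")
unknown = Mention("le responsable", number="Sing")  # gender left unknown

print(may_corefer(john, people))    # False: singular vs. plural
print(may_corefer(john, man))       # True: number and gender agree
print(may_corefer(john, director))  # False: masculine vs. feminine
print(may_corefer(john, unknown))   # True: unknown gender does not block
```

In a real pipeline the number and gender values would have to come from the parser's morphological analysis, and a language-specific rule set would likely need exceptions (collective nouns, epicene nouns) that this sketch ignores.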