Spanish flair NER model for judicial resolutions #3068

Closed
jedzill4 opened this issue Jan 25, 2023 · 4 comments
Labels
question Further information is requested wontfix This will not be worked on

Comments

@jedzill4

jedzill4 commented Jan 25, 2023

Hi! We are the Collective AI team, a group of machine learning engineers who work on several AI projects. In collaboration with DataGénero, we have developed a Spanish NER model for processing judicial resolutions. This model serves as the backbone of AymurAI (the website is only in Spanish for now), a software tool that will help criminal court officials in Argentina and Mexico generate and maintain anonymized datasets for understanding gender-based violence against women and LGBTIQ+ people in Latin America.

We are open-sourcing the code, and we would like to contribute to Flair by adding our model to your public hub. We believe this would be helpful for many other criminal court officials in Latin America and for NLP researchers who work with Spanish corpora.

What are the steps to add the model to the library?

If you want to know more about the project, you can read the paper published last year here (English and Spanish versions).

@jedzill4 jedzill4 added the question Further information is requested label Jan 25, 2023
@alanakbik
Collaborator

Hello @jedzill4, thanks for letting us know about the project, and great that you found Flair useful here! I could not access the paper. What does your model predict?

Contributing a model is quite easy, and you have several options for doing so:

  • If your model is a SequenceTagger, you can push it to the Hugging Face Hub, for instance like this model. The advantages are that you don't need to host the model yourself and that there is an automatic online demo. However, I think (not sure) that by using their hub you give Hugging Face some rights over your model, so if that is a problem, it is best to clarify it beforehand.
  • Alternatively, you can host the model on your own server and open a PR like this one. Essentially, you extend the fetch_model method of the model class with the URL where you host the model.
  • Alternatively, you can host the model on your server (or we host it on our university server) and someone on our team will open the PR for you.
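
The second option can be pictured roughly as follows. This is a minimal, self-contained sketch of the general idea only (a lookup from a short model name to a hosted file), not Flair's actual fetch_model code. The function name `resolve_model_name`, the short name `es-judicial-ner`, and the example.org URL are hypothetical placeholders.

```python
from typing import Dict

# Hypothetical registry mapping short model names to download URLs.
# In the PR-based option, a contributor would add one entry like this,
# pointing at the server where the model file is hosted.
MODEL_URLS: Dict[str, str] = {
    "es-judicial-ner": "https://example.org/models/es-judicial-ner.pt",  # placeholder URL
}


def resolve_model_name(name: str, registry: Dict[str, str] = MODEL_URLS) -> str:
    """Return the download URL for a registered short name.

    Names that are not registered are returned unchanged, so callers can
    still pass a local file path or another kind of identifier.
    """
    return registry.get(name, name)


if __name__ == "__main__":
    print(resolve_model_name("es-judicial-ner"))
    print(resolve_model_name("./my-local-model.pt"))
```

With a mapping like this in place, users only need the short name; the hosting location stays an implementation detail that can change in a later PR.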

@jedzill4
Author

Hi @alanakbik! Thanks for the fast answer!

I could not access the paper - what does your model predict?

Sorry, I updated the links (here is the English version of the paper). The model is built on top of BETO embeddings and fine-tuned on Argentinian court documents and resolutions. It predicts common entities such as names, locations, and important dates, as well as entities specific to our domain, such as the laws that were broken, the types of violence involved in the case, violent quotes, and the decisions or rulings made by the judge.

Thanks. I think we are going to go with the Hugging Face option.
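
For readers following the same route, here is a rough sketch of what publishing and then loading a SequenceTagger through the Hugging Face Hub could look like. It assumes the `huggingface_hub` and `flair` packages are installed and that you are already logged in to the Hub. The file path and repository id in the usage note are placeholders, and the `pytorch_model.bin` file name reflects my understanding of what Flair looks for in a Hub repository, so please verify it against the Flair version you use.

```python
from typing import List, Tuple

# Assumption: Flair looks for a file with this name when it is given a
# Hub repository id instead of a local path or a built-in model name.
HUB_MODEL_FILENAME = "pytorch_model.bin"


def upload_tagger(local_model_path: str, repo_id: str) -> None:
    """Upload a trained Flair model file to a Hugging Face Hub repository."""
    from huggingface_hub import HfApi  # imported lazily; requires huggingface_hub

    api = HfApi()
    # Create the repository if it does not exist yet.
    api.create_repo(repo_id=repo_id, exist_ok=True)
    api.upload_file(
        path_or_fileobj=local_model_path,
        path_in_repo=HUB_MODEL_FILENAME,
        repo_id=repo_id,
    )


def tag_text(repo_id: str, text: str) -> List[Tuple[str, str]]:
    """Load the tagger from the Hub and return (span text, label) pairs."""
    from flair.data import Sentence  # imported lazily; requires flair
    from flair.models import SequenceTagger

    tagger = SequenceTagger.load(repo_id)
    sentence = Sentence(text)
    tagger.predict(sentence)
    label_type = tagger.label_type
    return [
        (span.text, span.get_label(label_type).value)
        for span in sentence.get_spans(label_type)
    ]
```

A call such as `upload_tagger("final-model.pt", "your-org/your-model")` followed by `tag_text("your-org/your-model", "some Spanish sentence")` would exercise both steps; both names are placeholders, not the project's real repository.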

@alanakbik
Collaborator

Thanks for the info! Let me know when the model is posted on HF - I'd be interested to try it out.

Is the training data you used for the model public as well?

@stale

stale bot commented Jun 11, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix This will not be worked on label Jun 11, 2023
@stale stale bot closed this as completed Aug 12, 2023