[Feature]: Add support for MobIE NER Dataset #3348

Closed
stefan-it opened this issue Oct 23, 2023 · 1 comment

stefan-it (Member) commented Oct 23, 2023

Problem statement

Hey,

In my latest blog post, I used the MobIE NER Dataset to show how to fine-tune models with Flair.

For that, I wrote a custom dataset loader for the MobIE NER Dataset.

The German MobIE Dataset was introduced in the MobIE paper by Hennig, Truong and Gabryszak (2021).

It is a German-language dataset, human-annotated with 20 coarse- and fine-grained entity types, and it includes entity linking information for geographically linkable entities. The dataset comprises 3,232 social media texts and traffic reports, totaling 91K tokens, with 20.5K annotated entities, of which 13.1K are linked to a knowledge base.

Solution

Add MobIE support to Flair directly. Example dataset class:

https://github.com/stefan-it/autotrain-flair-mobie/blob/main/mobie_dataset.py

It also has some unit tests:

https://github.com/stefan-it/autotrain-flair-mobie/blob/main/script.py#L11-L19
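
As a rough illustration of what such a loader has to do, here is a minimal, library-free sketch. It assumes the data is available in a CoNLL-style column format (one token per line, blank line between sentences) with BIO tags. The column positions, the sample text, the placeholder tag names, and the helpers `read_column_file` and `count_entities` are all hypothetical; none of them are taken from the linked class or from Flair's API.

```python
from typing import Iterable, List, Tuple

Sentence = List[Tuple[str, str]]  # (token, BIO tag) pairs


def read_column_file(lines: Iterable[str], text_col: int = 0, ner_col: int = 1) -> List[Sentence]:
    """Group whitespace-separated token lines into sentences.

    The column indices are assumptions; adjust them to the real file layout.
    """
    sentences: List[Sentence] = []
    current: Sentence = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("-DOCSTART-"):
            # A blank line (or document marker) closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        fields = line.split()
        current.append((fields[text_col], fields[ner_col]))
    if current:
        sentences.append(current)
    return sentences


def count_entities(sentences: List[Sentence]) -> int:
    """Count entity spans from BIO tags.

    A span starts at every B- tag, and at an I- tag whose type differs
    from the type of the preceding tag.
    """
    total = 0
    for sentence in sentences:
        prev = "O"
        for _, tag in sentence:
            if tag.startswith("B-"):
                total += 1
            elif tag.startswith("I-") and prev[2:] != tag[2:]:
                total += 1
            prev = tag
    return total


# Made-up sample with placeholder tag names (not the real MobIE label set).
SAMPLE = """\
Stau B-EVENT
auf O
der O
A B-LOC
100 I-LOC

Die O
Deutsche B-ORG
Bahn I-ORG
meldet O
Verspätungen O
"""

sentences = read_column_file(SAMPLE.splitlines())
num_tokens = sum(len(s) for s in sentences)
print(len(sentences), num_tokens, count_entities(sentences))
```

Running the same counts over the real train/dev/test files would be one way to sanity-check a loader against the dataset statistics quoted above.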

Additional Context

No response

stefan-it added the feature label and self-assigned this on Oct 23, 2023

alanakbik (Collaborator) commented

Closed in #3351 - thanks @stefan-it!
