Inefficient Process for Adding New Entities in ReFinED #28

Open
Shoumik-Gandre opened this issue May 18, 2024 · 0 comments


Shoumik-Gandre commented May 18, 2024

Adding a dozen new entities by running preprocess_all.py requires downloading over 100GB of data, which is disproportionate for such a small addition.

This model cannot be considered to have zero-shot capabilities until there is a streamlined, bloat-free script for adding new entities into the system.

Steps to Reproduce:

  1. Clone the repository and set up the environment as per the documentation.
  2. Attempt to add a dozen new entities by running preprocess_all.py.
  3. Observe that the script requires downloading over 100GB of data, even though only a dozen entities are being added.
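
Before running the steps above on a constrained host such as Google Colab, it can help to check whether the download would even fit on disk. The following is a minimal sketch, not part of ReFinED: `free_gb` and `can_fit_download` are hypothetical helper names, and the 100 GB threshold is simply the figure reported in this issue, not an official requirement.

```python
import shutil

# Assumption: ~100 GB is the download size reported in this issue,
# not a number documented by the ReFinED maintainers.
REQUIRED_GB = 100


def free_gb(path: str = ".") -> float:
    """Return the free disk space on the volume holding `path`, in GB (10**9 bytes)."""
    return shutil.disk_usage(path).free / 1e9


def can_fit_download(path: str = ".", required_gb: float = REQUIRED_GB) -> bool:
    """Return True if the volume holding `path` has at least `required_gb` GB free."""
    return free_gb(path) >= required_gb


if __name__ == "__main__":
    # Report available space and whether the reported download size would fit.
    print(f"Free space: {free_gb():.1f} GB")
    print(f"Enough room for a ~{REQUIRED_GB} GB download: {can_fit_download()}")
```

Running this in the working directory before starting preprocess_all.py shows whether the host has room for the download at all.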

Expected Behavior:

There should be a lightweight and efficient process for adding new entities without requiring extensive data downloads.

Actual Behavior:

Adding new entities requires downloading over 100GB of data, making the process highly inefficient and cumbersome.

Environment:

Platform: Google Colab
Operating System: Linux
Python Version: 3.10

Severity:

High - This issue severely impacts the usability and efficiency of adding new entities to the system and needs immediate attention.

Shoumik-Gandre changed the title from "Horrendous Zero Shot Entity Linking (Zeshel) Capabilities" to "Inefficient Process for Adding New Entities in ReFinED" on May 18, 2024