Improved ingest flow #34

Closed
enjalot opened this issue Mar 8, 2024 · 1 comment
Labels
enhancement (New feature or request), python, web

Comments

Owner

enjalot commented Mar 8, 2024

Right now when you upload a file to start a new dataset we just blindly ingest it. There are a number of things we could do better:

  • Check if a dataset with the same name exists
    • if it does, append a numeric suffix, as we do for other data files: {dataset_name}_001
  • Check columns of the dataset to suggest options:
    • choose a text column
    • import embeddings (if a column contains only arrays of the same length)
      • give the option to choose the model the embeddings were generated with
      • this could be suggested in the embedding step based on columns identified as arrays of numbers
    • determine categorical and numeric columns
      • this would support filtering
      • and allow coloring by field
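
A rough sketch of the checks listed above, using only the standard library. This is not the project's actual code: the function names (`unique_dataset_name`, `suggest_columns`), the list-of-dicts row format, and the `max_categories` threshold are all assumptions made for illustration.

```python
from numbers import Number


def unique_dataset_name(name, existing):
    """Return `name` if it is free, otherwise `name_001`, `name_002`, ..."""
    if name not in existing:
        return name
    i = 1
    while f"{name}_{i:03d}" in existing:
        i += 1
    return f"{name}_{i:03d}"


def _is_number(value):
    # bool is a subclass of int, so exclude it explicitly
    return isinstance(value, Number) and not isinstance(value, bool)


def _is_vector(value):
    # a non-empty list or tuple made up entirely of numbers
    return (
        isinstance(value, (list, tuple))
        and len(value) > 0
        and all(_is_number(x) for x in value)
    )


def suggest_columns(rows, max_categories=20):
    """Guess a kind for each column of a list-of-dicts table.

    Kinds: "embedding", "numeric", "categorical", "text", "other".
    The categorical/text split is a heuristic (hypothetical threshold).
    """
    columns = list(rows[0].keys()) if rows else []
    kinds = {}
    for col in columns:
        values = [r[col] for r in rows if r.get(col) is not None]
        if not values:
            kinds[col] = "other"
        elif all(_is_vector(v) for v in values) and len({len(v) for v in values}) == 1:
            # every value is a numeric array of the same length
            kinds[col] = "embedding"
        elif all(_is_number(v) for v in values):
            kinds[col] = "numeric"
        elif all(isinstance(v, str) for v in values):
            distinct = len(set(values))
            # few distinct values, with at least one repeat: treat as categorical
            is_categorical = distinct <= max_categories and distinct < len(values)
            kinds[col] = "categorical" if is_categorical else "text"
        else:
            kinds[col] = "other"
    return kinds
```

Columns reported as "text" would be the candidates for the text column, "embedding" columns could be offered for import, and "categorical" and "numeric" columns could back filtering and color-by-field.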
enjalot added the enhancement, python, and web labels Mar 8, 2024
Owner Author

enjalot commented Mar 21, 2024

Implemented in 0.1.8
