🗺 Embedding Insights #160

Closed
40 tasks done
mikeldking opened this issue Jan 9, 2023 · 0 comments

Goals

  • Provide a way for users to visualize the embeddings of a dataset as a UMAP point cloud. The point cloud should give a visual representation of the embeddings' vector space. The points should also be colored by various heuristics so that the user can visually identify regions that perform badly or contain a high number of anomalies.

  • Provide embedding drift analysis using a UMAP projection of one or two datasets. When there are two datasets, a sample of the primary dataset is compared against the full (down-sampled) baseline dataset.
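
As a rough illustration of the second goal, the sketch below shows one way the input to the projection could be assembled: each dataset is down-sampled and every embedding is tagged with its source so the point cloud can later be colored by dataset. The function names, the `max_points` cap, and the uniform sampling strategy are all assumptions made for illustration and are not taken from the project.

```python
import random


def downsample(rows, max_points, seed=0):
    """Return at most max_points rows, chosen uniformly without replacement."""
    if len(rows) <= max_points:
        return list(rows)
    return random.Random(seed).sample(rows, max_points)


def build_projection_input(primary, baseline, max_points=500, seed=0):
    """Tag each sampled embedding with the dataset it came from.

    primary and baseline are lists of embedding vectors (lists of floats).
    The max_points cap is a hypothetical parameter, not a project setting.
    """
    points = []
    for name, rows in (("primary", primary), ("baseline", baseline)):
        for vector in downsample(rows, max_points, seed):
            points.append({"dataset": name, "vector": vector})
    return points
```

The `vector` values could then be handed to a UMAP implementation for dimensionality reduction, with the `dataset` tag (or any other heuristic) driving the point colors.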

Euclidean Distance

Sub-goal: display drift over time so that we can perform time-series analysis when there are two datasets.
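
A minimal sketch of what a Euclidean-distance drift time series could look like, assuming drift is measured as the distance between the centroid of the primary embeddings in each time window and the centroid of the baseline embeddings. The issue does not spell out the exact definition, so the centroid approach, the function names, and the fixed-width windowing are assumptions.

```python
import math
from collections import defaultdict


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]


def drift_over_time(primary, baseline_vectors, window_seconds):
    """Euclidean distance from each window's centroid to the baseline centroid.

    primary is a list of (timestamp_in_seconds, vector) pairs.
    Returns a list of (window_start, distance) pairs sorted by time.
    """
    reference = centroid(baseline_vectors)
    windows = defaultdict(list)
    for timestamp, vector in primary:
        # Assign each point to the fixed-width window that contains it.
        start = int(timestamp // window_seconds) * window_seconds
        windows[start].append(vector)
    return [
        (start, math.dist(centroid(vectors), reference))
        for start, vectors in sorted(windows.items())
    ]
```

Each returned pair is one point on the drift chart: the x value is the window start and the y value is the distance to the baseline.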

Backend

Metrics

Datasets

Components

UMAP and Clustering

Backend

App

mikeldking changed the title from "Embedding Insights" to "🗺 Embedding Insights" on Mar 17, 2023