
[CODE IMPROVEMENT] Add notice when the number of samples is cut in the Validation Insights tab #849

Open
pascal-pfeiffer opened this issue on Sep 11, 2024 · 1 comment
Labels: area/core (Core code related issue), type/good first issue (Good for newcomers)

Comments

@pascal-pfeiffer (Collaborator):

🔧 Proposed code refactoring

We limit the number of samples shown in the Validation Insights tab to 900 to prevent lag with very large datasets:

    elif len(df) > 900:
        df = df.sample(n=900, random_state=42).reset_index(drop=True)

We should show a notice when this subsampling happens on the Validation Insights tab and point the user towards the "Download predictions" option.
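A minimal sketch of how this could look, assuming a small helper around the existing sampling logic (the function name, constant, and message wording below are hypothetical, not taken from the repository):

    import pandas as pd

    MAX_INSIGHT_SAMPLES = 900  # hypothetical constant for the current hard-coded limit

    def subsample_for_insights(df: pd.DataFrame) -> tuple[pd.DataFrame, str | None]:
        """Subsample the validation dataframe and return an optional user notice."""
        notice = None
        if len(df) > MAX_INSIGHT_SAMPLES:
            total = len(df)
            df = df.sample(n=MAX_INSIGHT_SAMPLES, random_state=42).reset_index(drop=True)
            notice = (
                f"Showing a random sample of {MAX_INSIGHT_SAMPLES} out of {total} "
                "validation records. Use the 'Download predictions' option to "
                "retrieve all predictions."
            )
        return df, notice

The calling code on the Validation Insights tab could then render `notice` (for example as an info banner) whenever it is not `None`.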

Motivation

While users can still download all predictions, it may be puzzling to see a different number of samples here than in the training logs.

pascal-pfeiffer added the area/core (Core code related issue) and type/good first issue (Good for newcomers) labels on Sep 11, 2024
@us8945 (Collaborator) commented on Oct 16, 2024:

@pascal-pfeiffer, when we display the Validation Insights, the data is already sampled (if the number of validation records is above 900).
With this change, when we display the message to the user, should we indicate the exact number of validation records available, or can the message simply state that the number of validation records is above 900?
