According to https://docs.evidentlyai.com/support/f.a.q., it's recommended to use sampling for large datasets.
Can you please help me understand what counts as a "large" dataset that would require sampling?
For example, roughly how many rows * columns would start causing issues?
Evidently can evaluate hundreds of different metrics, each with its own computational footprint (e.g., there are metrics like "text content drift" that train a whole machine learning model on your data, vs. more straightforward metrics that just compute the mean value of a column). You can also combine multiple metrics in the same report.
The computation happens in memory, so the limitation will depend on your infrastructure.
So the simple answer is: if your computation takes longer than you want or fails to compute otherwise, you may consider sampling. Also, sampling often makes sense for metrics like data distribution drift.
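In practice, that usually means downsampling with pandas before passing the data to Evidently. Below is a minimal sketch assuming the `Report` / `DataDriftPreset` interface; the file paths and the 100,000-row threshold are made-up values for illustration, not an official recommendation:

```python
import pandas as pd
from evidently.report import Report
from evidently.metric_preset import DataDriftPreset

# Hypothetical datasets -- replace with your own reference and current data.
reference = pd.read_parquet("reference.parquet")
current = pd.read_parquet("current.parquet")

# Example threshold: downsample anything larger than 100k rows before
# computing the report, so the in-memory computation stays manageable.
SAMPLE_SIZE = 100_000

def maybe_sample(df: pd.DataFrame, n: int = SAMPLE_SIZE) -> pd.DataFrame:
    # Random sample without replacement; fixed seed for reproducibility.
    return df.sample(n=n, random_state=42) if len(df) > n else df

report = Report(metrics=[DataDriftPreset()])
report.run(
    reference_data=maybe_sample(reference),
    current_data=maybe_sample(current),
)
report.save_html("data_drift_report.html")
```

Whether a given sample size is "enough" depends on the metric: distribution-drift tests are typically stable on a random sample, while heavier metrics (like the model-based text drift check) mainly benefit from sampling as a way to cut runtime.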