This repository was archived by the owner on Mar 24, 2023. It is now read-only.

Sampling size #27

Closed
dzieciou opened this issue Jan 15, 2023 · 2 comments · Fixed by #41 · May be fixed by #40

Comments

@dzieciou
Owner

Currently, the sample size is always 10, but this makes no sense when more than 10 labels are available.

This is more about algorithm and less about UI.

Perhaps the sample size should be calculated automatically, based on the label count and similar factors.
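To make the idea concrete, here is a minimal sketch of one way the sample size could be derived from the label count. Everything in it is an assumption for illustration: the function name `auto_sample_size`, the square-root heuristic, and the `minimum`/`maximum` bounds are hypothetical and not part of the package.

```python
import math


def auto_sample_size(label_count: int, minimum: int = 10, maximum: int = 100) -> int:
    """Hypothetical heuristic: scale the sample with the square root of the label count."""
    if label_count <= 0:
        return 0
    # Grow slowly with the number of labels (assumed heuristic, not the package's rule).
    scaled = math.ceil(math.sqrt(label_count))
    # Keep the result within the assumed bounds.
    size = max(minimum, min(maximum, scaled))
    # Never ask for more items than there are labels.
    return min(size, label_count)


if __name__ == "__main__":
    for count in (5, 50, 400, 1_000_000):
        print(count, auto_sample_size(count))
```

The final `min(size, label_count)` keeps the request from exceeding the number of labels on very small taxonomies; the other constants would need tuning against real data.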

@pkubiak
Collaborator

pkubiak commented Jan 15, 2023

Also, because of the population selection method, the package generates single-item iterations on small datasets.

@dzieciou
Owner Author

dzieciou commented Jan 18, 2023

Right, single-item iterations will happen in two situations:

  • for very small taxonomies (when the number of leaf categories is smaller than the requested sample size)
  • in later iterations, when the number of categories left to disambiguate is relatively small.

Ideally, the number of items returned per iteration should converge to 0. If it does not (i.e. it stays above 0 for many iterations), then there is no way to disambiguate specific categories and create a complete mapping, and the final mapping and annotation will be suboptimal.
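A rough sketch of how such a stall could be detected from the per-iteration item counts. The function name `has_stalled` and the `patience` parameter are hypothetical, and the rule used here (the last few counts are all above 0 and have not dropped) is only one assumed reading of "stays above 0 for many iterations".

```python
def has_stalled(items_per_iteration: list[int], patience: int = 3) -> bool:
    """Return True when the last `patience` counts are all positive and did not shrink."""
    if len(items_per_iteration) < patience:
        return False  # not enough history to judge
    recent = items_per_iteration[-patience:]
    # Stalled: every recent iteration still returned items, and the latest
    # count is no smaller than the count at the start of the window.
    return min(recent) > 0 and recent[-1] >= recent[0]


if __name__ == "__main__":
    print(has_stalled([10, 6, 3, 1, 0]))  # counts reach 0
    print(has_stalled([10, 4, 1, 1, 1]))  # counts stay at 1
```

A caller could use a check like this to stop iterating and report which categories remain ambiguous, rather than looping indefinitely.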
