Sampling size #27

dzieciou · 2023-01-15T11:27:50Z

Currently, the sample size is always 10, but this makes no sense when more than 10 labels are available.

This is more about algorithm and less about UI.

Perhaps the sample size should be calculated automatically, depending on the label count, etc.
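As a rough illustration of what "calculated automatically" could mean, here is a minimal sketch that clamps the requested sample size to the number of available labels. The function name `adaptive_sample_size` and its parameters are hypothetical and not taken from the package; the rule itself is only an assumption, not necessarily what the linked PRs implement.

```python
def adaptive_sample_size(n_labels: int, requested: int = 10, minimum: int = 1) -> int:
    """Hypothetical helper: derive a sample size from the label count.

    Assumed rule: never ask for more items than there are labels,
    and never go below `minimum` while any labels remain.
    """
    if n_labels <= 0:
        # Nothing left to sample from.
        return 0
    return max(minimum, min(requested, n_labels))


if __name__ == "__main__":
    for n in (0, 4, 10, 25):
        print(n, adaptive_sample_size(n))
```

A different rule (for example, scaling with the label count instead of capping at a fixed default) would fit the same signature.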

pkubiak · 2023-01-15T11:29:48Z

Also, due to the population selection method, the package generates single-item iterations on small datasets.

dzieciou · 2023-01-18T16:01:13Z

Right, single-item iterations will happen in two situations:

- for very small taxonomies (when the number of leaf categories is smaller than the requested sample size)
- in later iterations, when the number of categories left to disambiguate is relatively small.

Ideally, the number of items returned per iteration should converge to 0. If it does not (i.e. it stays above 0 for many iterations), then there is no way to disambiguate specific categories and create a complete mapping. The final mapping and annotation will then be suboptimal.
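The convergence condition above could be checked with something like the following sketch. It assumes the caller records the number of items returned in each iteration; the name `is_stalled` and the `patience` window are hypothetical, chosen here only for illustration.

```python
from collections.abc import Sequence


def is_stalled(counts: Sequence[int], patience: int = 3) -> bool:
    """Hypothetical check: has the per-iteration item count stopped converging to 0?

    Returns True when the last `patience` iterations all returned at least
    one item and the count did not decrease over that window.
    """
    if len(counts) < patience:
        # Not enough history to judge.
        return False
    tail = counts[-patience:]
    all_positive = all(c > 0 for c in tail)
    no_progress = tail[-1] >= tail[0]
    return all_positive and no_progress


if __name__ == "__main__":
    print(is_stalled([5, 3, 1, 0]))  # counts reach 0
    print(is_stalled([5, 2, 2, 2]))  # counts stay above 0 without decreasing
```

When such a check fires, the remaining categories presumably cannot be disambiguated by further sampling, which matches the suboptimal-mapping case described above.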

dzieciou mentioned this issue Jan 28, 2023

[WIP] Calculate sample size automatically #40

Draft

dzieciou mentioned this issue Feb 22, 2023

Adapt sample size to the maximum number of categories #41

Merged

dzieciou closed this as completed in #41 Mar 11, 2023
