Method to find entity combinations "missing" from dataset #248

Open
pvandyken opened this issue Feb 16, 2023 · 1 comment
Labels
enhancement New feature or request

Comments

@pvandyken
Contributor

Most of the new filtering APIs (#209) focus on removing entity combinations that are missing from one or more components. For example, if component B is missing subject 2, subject 2 is excluded entirely from .expand().

It would be helpful for QC to have a method to print all of these missing entities. This would let developers and users quickly query missing parts of their datasets. Technically, this is equal to:

# pseudocode
product(*dataset.entities.values()) - dataset.zip_lists

In other words, the maximal zip list minus the actual zip list.
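To make the pseudocode concrete, here is a minimal sketch in plain Python. It assumes a zip list is a dict mapping each entity name to a list of values, with all lists aligned by position, and that the "maximal" zip list is the Cartesian product of the unique values of each entity. The function name `missing_combinations` is hypothetical and not part of any existing API.

```python
from itertools import product


def missing_combinations(zip_lists):
    """Return entity combinations absent from zip_lists, in zip-list form.

    zip_lists is assumed to be {entity: [values...]} with aligned lists.
    """
    entities = list(zip_lists)
    # Combinations actually present: one tuple per position in the lists.
    present = set(zip(*(zip_lists[e] for e in entities)))
    # Unique values of each entity, keeping first-seen order.
    unique = [list(dict.fromkeys(zip_lists[e])) for e in entities]
    # Maximal zip list minus the actual one.
    missing = [combo for combo in product(*unique) if combo not in present]
    # Convert the missing tuples back into a dict of aligned lists.
    return {e: [combo[i] for combo in missing] for i, e in enumerate(entities)}


zip_lists = {"subject": ["1", "1", "2"], "session": ["a", "b", "a"]}
result = missing_combinations(zip_lists)
print(result)  # {'subject': ['2'], 'session': ['b']}
```

Here subject 2 has no session b, so that single combination is reported as missing.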

My idea is to have a base method that returns a zip_list-like representation of all missing groupings, and possibly another convenience method to print the list in a nice table. I still need to think about the exact API, but if anyone has ideas, please share!

@pvandyken added the enhancement label Feb 16, 2023
@tkkuehn
Contributor

tkkuehn commented Feb 17, 2023

Signature proposal: Dataset.missing_entities() -> dict[str, dict[str, list[str]]]

  • Components with no missing entities should be present in the dictionary, with every entity in its dictionary having an empty list as its value.
    • e.g. {'component_a': {'sub': [], 'ses': []}}
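A rough sketch of one way the proposed return shape could be produced. The proposal does not spell out what "missing" means per component, so this assumes a value is missing from a component when it appears for that entity in at least one other component but not in this one. It is written as a free function taking a hypothetical `{component: {entity: [values]}}` mapping rather than as the proposed `Dataset` method.

```python
def missing_entities(components):
    """Map each component to the entity values it lacks.

    components is a hypothetical {component: {entity: [values...]}} mapping.
    """
    # Collect every value seen for each entity across all components,
    # using dicts as ordered sets to keep first-seen order.
    seen = {}
    for zip_lists in components.values():
        for entity, values in zip_lists.items():
            seen.setdefault(entity, {}).update(dict.fromkeys(values))

    result = {}
    for name, zip_lists in components.items():
        result[name] = {}
        for entity, values in zip_lists.items():
            own = set(values)
            # Empty list when the component has every value seen elsewhere.
            result[name][entity] = [v for v in seen[entity] if v not in own]
    return result


components = {
    "component_a": {"sub": ["1", "2"], "ses": ["x", "x"]},
    "component_b": {"sub": ["1"], "ses": ["x"]},
}
report = missing_entities(components)
print(report)
# {'component_a': {'sub': [], 'ses': []}, 'component_b': {'sub': ['2'], 'ses': []}}
```

Note that this per-entity view loses which combinations are missing; it only reports individual values, which may or may not be what the final API wants.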
