Support multiple DynamicTableRegion when converting to a hierarchical dataframe #649
Comments
I added the "good first issue" label mainly because it's an issue that is focused on a particular area (i.e., DynamicTable) and remains mostly on the surface (i.e., requires using the API and building new features on top). However, this is not necessarily a trivial issue, as it requires some tricky data wrangling with lots of edge cases. That being said, it's a good issue for someone who wants to dive deeper into DynamicTable logic.
Simplify interacting with DynamicTables that reference other tables via DynamicTableRegion, creating a collection of linked tables. In the ICEphys case this is a "simple" linear hierarchy of tables, but in principle a table may contain any number of DynamicTableRegion columns. This PR adds several functions to simplify introspection of linked DynamicTables and conversion to pandas DataFrames.

- [X] Fix #646 by adding ``AlignedDynamicTable.get``
- [X] Fix #651 by updating ``AlignedDynamicTable.get`` to support slicing with ``[int, (str, str)]``, ``[int, str, str]``, and ``[int, str]`` to select a single cell or row of a category table, respectively
- [X] Add ``AlignedDynamicTable.get_colnames(...)`` function to keep the ``colnames`` property compliant with ``DynamicTable`` while providing an easy way to get the full list of column names
- [X] Set name of DataFrame in ``DynamicTable.to_dataframe()`` and ``DynamicTable.get``
- [X] Add helper functions to ``DynamicTable`` to deal with foreign columns:
  - [X] ``DynamicTable.get_foreign_columns`` to identify which columns are ``DynamicTableRegion`` columns
  - [X] ``DynamicTable.has_foreign_columns`` to identify if the table contains ``DynamicTableRegion`` columns
  - [X] ``DynamicTable.get_linked_tables`` to retrieve all tables linked to either directly or indirectly from the current table via ``DynamicTableRegion``
- [x] Implement the same helper functions also for ``AlignedDynamicTable``
  - [x] ``DynamicTable.get_foreign_columns`` to identify which columns are ``DynamicTableRegion`` columns
  - [X] ``DynamicTable.has_foreign_columns`` to identify if the table contains ``DynamicTableRegion`` columns
  - [x] ``DynamicTable.get_linked_tables`` to retrieve all tables linked to either directly or indirectly from the current table via ``DynamicTableRegion``
- [X] Add new module ``hdmf.common.hierarchicaltable`` with helper functions to facilitate conversion of linked tables to a single pandas DataFrame
  - [X] ``to_hierarchical_dataframe`` to merge linked tables into a single consolidated pandas DataFrame
  - [X] ``drop_id_columns`` to remove "id" columns from a DataFrame
  - [X] ``flatten_column_index`` to replace a ``pandas.MultiIndex`` with a regular ``pandas.Index``
- [x] Add test for DynamicTableRegion pointing to AlignedDynamicTable to check that all columns are used
- [x] Add tests for the following functions in hierarchicaltable.py:
  - [X] to_hierarchical_dataframe
  - [x] drop_id_columns
  - [x] flatten_column_index
- [X] File issue tickets for open TODO items for future PRs
  - [X] ``to_hierarchical_dataframe`` should be updated to support resolution of more than one DynamicTableRegion column. See #649
  - [x] Add tutorial for DynamicTableRegion and how to use it for linking to tables and for creating linked tables. See #648
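To illustrate what the foreign-column helpers in the list above are meant to do, here is a minimal plain-Python sketch. It does not use HDMF: ``Table`` and ``Region`` are hypothetical stand-ins for ``DynamicTable`` and ``DynamicTableRegion``, and the three functions only borrow the helper names from the PR description. The actual HDMF methods may return richer information than this sketch does.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    """Hypothetical stand-in for a DynamicTableRegion column."""
    target: "Table"  # the table whose rows are referenced
    rows: list       # one list of target row indices per source row


@dataclass
class Table:
    """Hypothetical stand-in for a DynamicTable."""
    name: str
    columns: dict = field(default_factory=dict)  # column name -> list or Region


def get_foreign_columns(table):
    """Return the names of the columns that reference another table."""
    return [name for name, col in table.columns.items() if isinstance(col, Region)]


def has_foreign_columns(table):
    """Return True if the table has at least one region column."""
    return len(get_foreign_columns(table)) > 0


def get_linked_tables(table):
    """Collect every table reachable directly or indirectly via region columns."""
    found, seen, stack = [], {id(table)}, [table]
    while stack:
        current = stack.pop()
        for col in get_foreign_columns(current):
            target = current.columns[col].target
            if id(target) not in seen:  # visit each table once, even with cycles
                seen.add(id(target))
                found.append(target)
                stack.append(target)
    return found


# A small hierarchy: experiments -> runs -> sweeps, and experiments -> stimuli.
sweeps = Table("sweeps", {"gain": [1.0, 2.0, 1.5]})
stimuli = Table("stimuli", {"label": ["a", "b"]})
runs = Table("runs", {"sweeps": Region(sweeps, [[0, 1], [2]])})
experiments = Table("experiments", {
    "runs": Region(runs, [[0], [1]]),
    "stimuli": Region(stimuli, [[0], [0, 1]]),
})

linked_names = [t.name for t in get_linked_tables(experiments)]
```

Note that ``experiments`` has two region columns. This is exactly the case that #649 asks ``to_hierarchical_dataframe`` to handle.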
@oruebel Could you take this? If not, could you provide an example I can run? Then I will take it.
Let's see if I find time to work on this during the Dev Days. Otherwise, I'd leave it for |
Problem: The function ``hdmf.common.hierarchicaltable.to_hierarchical_dataframe`` currently supports resolution of only one ``DynamicTableRegion`` column per ``DynamicTable`` that is linked to in the table hierarchy. I.e., the function follows and resolves the first ``DynamicTableRegion`` found in a table, but if any given table contains additional ``DynamicTableRegion`` columns, then those will be converted as nested ``pandas.DataFrame`` objects.

Possible Solutions:

- ``hdmf.common.hierarchicaltable.to_hierarchical_dataframe`` should support resolution of multiple ``DynamicTableRegion`` columns for each given table.
- [...] ``DynamicTable`` would work on ``pandas.DataFrame`` objects and allow resolution of an arbitrary number of user-defined columns. There should then also be an option to automatically find columns that need resolution to allow resolution of all columns at once.

Possible Challenges
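To make the second option above more concrete, here is a minimal sketch of resolving an arbitrary, user-selected set of region columns, with an option to resolve all of them at once. It is written in plain Python on lists of dicts rather than on ``pandas.DataFrame`` or HDMF objects, and every name in it (``resolve_columns``, ``targets``, the ``column.field`` naming scheme) is a hypothetical choice for illustration, not part of HDMF. It assumes that a row referencing several rows in several target tables expands into the cross product of the referenced rows. That assumption is one plausible semantics, and it also shows one challenge: the number of output rows can grow quickly.

```python
from itertools import product


def resolve_columns(rows, targets, columns=None):
    """Expand region columns of ``rows`` into flat rows.

    rows:    list of dicts; region cells hold lists of target row indices
    targets: dict mapping region column name -> list of dicts (target table)
    columns: region columns to resolve; None resolves every column in ``targets``
    """
    columns = list(targets) if columns is None else list(columns)
    flat = []
    for row in rows:
        # Values that are not being resolved are copied through unchanged.
        base = {k: v for k, v in row.items() if k not in columns}
        # One list of referenced target rows per resolved column.
        choices = [[targets[c][i] for i in row[c]] for c in columns]
        # Cross product of the referenced rows. An empty region cell yields
        # no combinations, so such a source row is dropped in this sketch.
        for combo in product(*choices):
            out = dict(base)
            for col, target_row in zip(columns, combo):
                # Prefix target fields with the source column name to avoid clashes.
                out.update({f"{col}.{k}": v for k, v in target_row.items()})
            flat.append(out)
    return flat


stimuli = [{"label": "a"}, {"label": "b"}]
responses = [{"peak": 0.1}, {"peak": 0.4}, {"peak": 0.9}]
recordings = [
    {"id": 0, "stimuli": [0], "responses": [0, 1]},
    {"id": 1, "stimuli": [0, 1], "responses": [2]},
]
targets = {"stimuli": stimuli, "responses": responses}

# Resolve all region columns at once.
flat = resolve_columns(recordings, targets)

# Resolve only a user-defined subset; "responses" stays as a list of indices.
partial = resolve_columns(recordings, targets, columns=["stimuli"])
```

Resolving only ``stimuli`` leaves the ``responses`` cells as unresolved index lists, which roughly mirrors how the current function leaves additional region columns nested instead of flattening them.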
Checklist