Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add functionality for linked DynamicTables #645

Merged
merged 63 commits into from
Jul 21, 2021
Merged

Conversation

oruebel
Copy link
Contributor

@oruebel oruebel commented Jul 16, 2021

Motivation

For the icephys work with NWB we would like to simplify interacting with DynamicTables that reference other tables via DynamicTableRegion, creating a collection of linked tables. In the ICEphys case this is a "simple" linear hierarchy of tables, but in principle a table may contain any number of DynamicTableRegion columns. This PR adds several functions to simplify introspection of linked DynamicTables and conversion to pandas DataFrames.

TODO

  • Fix DyanmicTableRegion not working correctly with AlignedDynamicTable #646 by adding AlignedDynamicTable.get
  • Fix AlignedDyanmicTable does not support [int, str] type slicing #651 by updating AlignedDynamicTable.get to support slicing with [int, (str, str)], [int, str, str], and [int, str] to select a single cell or row of a category table, repectively
  • Add AlignedDynamicTable.get_colnames(...) functions to allow us to keep compliance of the colnames property with DynamicTable while providing an easy way to get the full list of column names.
  • Set name of DataFrame in DynamicTable.to_dataframe() and DynamicTable.get
  • Add helper functions to DynamicTable to deal with foreign columns:
    • DynamicTable.get_foreign_columns to identify if the table contains DynamicTableRegion columns
    • DynamicTable.has_foreign_columns to identify which columns areDynamicTableRegion columns
    • DynamicTable.get_linked_tables to retrieve all tables linked to either directly or indirectly from
      the current table via DynamicTableRegion
  • Implement the same helper functions also for AlignedDynamicTable
    • DynamicTable.get_foreign_columns to identify if the table contains DynamicTableRegion columns
    • DynamicTable.has_foreign_columns to identify which columns areDynamicTableRegion columns
    • DynamicTable.get_linked_tables to retrieve all tables linked to either directly or indirectly from
      the current table via DynamicTableRegion
  • Add new module hdmf.common.hierarchicaltable with helper functions to facilitate conversion of linked tables to a single Pandas dataframe.
    • to_hierarchical_dataframe to merge linked tables into a single consolidated pandas DataFrame.
    • drop_id_columns to remove "id" columns from a DataFrame.
    • flatten_column_index to replace a pandas.MultiIndex with a regular pandas.Index
  • Add test for DyanmicTableRegion pointing to AlignedDynamicTable to check that the all columns are used
  • Add tests for hierarchicaltable.py for
    • to_hierarchical_dataframe
    • drop_id_columns
    • flatten_column_index functions
  • File issue tickets for open TODO items for future PRs

Checklist

  • Did you update CHANGELOG.md with your changes?
  • Have you checked our Contributing document?
  • Have you ensured the PR clearly describes the problem and the solution?
  • Is your contribution compliant with our coding style? This can be checked running flake8 from the source directory.
  • Have you checked to ensure that there aren't other open Pull Requests for the same change?
  • Have you included the relevant issue number using "Fix #XXX" notation where XXX is the issue number? By including "Fix #XXX" you allow GitHub to close issue #XXX when the PR is merged.

@rly
Copy link
Contributor

rly commented Jul 20, 2021

I made a few more comments. Otherwise, looks good to me

@oruebel
Copy link
Contributor Author

oruebel commented Jul 20, 2021

@rly thanks for the helpful code review! Your comments have helped clean up to_hierarchical_dataframe and revealed some additional issues with the handling of AlignedDynamicTable. I appreciate the feedback.

@oruebel
Copy link
Contributor Author

oruebel commented Jul 20, 2021

@rly I have now resolved all comments. Can you please have another look and approve the PR if it looks good.
Should this be merged with dev or another release branch?

@oruebel
Copy link
Contributor Author

oruebel commented Jul 20, 2021

@rly while adding another test to ensure to_hierarchical_dataframe is working correctly with AlignedDynamicTable higher up in the table hierarchy (not just at the bottom) I discovered two issues:

  1. issue AlignedDyanmicTable does not support [int, str] type slicing #651 and
  2. there were also a few more places in to_hierarchical_dataframe where dataframe.colnames was used instead of dataframe.get_colnames.

Those issues are now fixed as well. I think these should be last changes on this PR 🤞

@oruebel
Copy link
Contributor Author

oruebel commented Jul 21, 2021

@rly thanks for the additional comments. I have updated the source accordingly.

rly
rly previously approved these changes Jul 21, 2021
@oruebel oruebel merged commit c5f1bf7 into dev Jul 21, 2021
@oruebel oruebel deleted the add/hierarchical_table_funcs branch July 21, 2021 18:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants