Add data exploration GUI #200

Merged
merged 14 commits into from
Aug 10, 2020

Conversation

aidanheerdegen
Collaborator

ipywidgets for data exploration

Closes #199

@aidanheerdegen aidanheerdegen marked this pull request as draft August 4, 2020 07:10
@aidanheerdegen
Collaborator Author

To have a play with the default DB:

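from cosima_cookbook import explore

# With no arguments the explorer uses the default DB
dbx = explore.DatabaseExplorer()
# As the last expression in a notebook cell, this displays the widget
dbx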

You can pass a session to DatabaseExplorer if you want to test other databases.
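
A minimal sketch of passing a session, for anyone wanting to try a non-default database. The helper name `explorer_for` and the `session=` keyword are assumptions made for illustration; `cosima_cookbook.database.create_session` is the usual way to open a session on a database file, but check the signature in your installed version.

```python
def explorer_for(db_path):
    """Open a DatabaseExplorer on the database at db_path.

    Hypothetical helper: db_path is whatever database file you want to test.
    """
    # Imported lazily so this sketch can be read without the package installed.
    import cosima_cookbook as cc
    from cosima_cookbook import explore

    # Create a session bound to the chosen database file.
    session = cc.database.create_session(db_path)

    # Assumption: the explorer accepts the session via a `session` keyword.
    return explore.DatabaseExplorer(session=session)
```

In a notebook, ending a cell with `explorer_for("my_test.db")` (the file name is a made-up example) should display the widget, in the same way as `dbx` above.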

@navidcy
Collaborator

navidcy commented Aug 4, 2020

This is great!!

OK, I loaded pbot_t for example but then how can I plot something? Or use this loaded data array?

It says "The loaded DataArray is accessible as the data attribute of the ExperimentExplorer object", but when I call

dbx.data

I get

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-13-e641cb6988d1> in <module>
----> 1 dbx.data

AttributeError: 'DatabaseExplorer' object has no attribute 'data'

@navidcy
Collaborator

navidcy commented Aug 4, 2020

Btw, this PR should definitely be accompanied by a Data Exploration Tutorial in cosima-recipes.

@aidanheerdegen
Collaborator Author

The child ExperimentExplorer object of the DatabaseExplorer is accessible as the .ee attribute. So in this case the data is accessible as dbx.ee.data.
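
The parent/child relationship can be illustrated with toy stand-in classes. This is not the cosima_cookbook implementation: the `Toy...` classes, the `load` method, and the placeholder string are assumptions for illustration only. Only the `.ee` and `.data` attribute names come from this thread.

```python
class ToyExperimentExplorer:
    """Stand-in for the child widget, which holds the loaded data."""

    def __init__(self):
        self.data = None  # set once a variable has been loaded

    def load(self, variable):
        # Placeholder for loading a DataArray: store a labelled string instead.
        self.data = f"<DataArray {variable}>"


class ToyDatabaseExplorer:
    """Stand-in for the parent widget, which owns the child as .ee."""

    def __init__(self):
        self.ee = ToyExperimentExplorer()


dbx = ToyDatabaseExplorer()
dbx.ee.load("pbot_t")

print(dbx.ee.data)           # the data lives on the child
print(hasattr(dbx, "data"))  # False: the parent has no .data of its own
```

Because the parent only holds a reference to the child, `dbx.data` raises `AttributeError` while `dbx.ee.data` works.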

It needs documentation for sure, and yes a tutorial would be on the To-Do list.

There is a plan to have a VariableExplorer which wraps hvplot, but that might have to wait.

Collaborator

@angus-g angus-g left a comment

Just a few thoughts as I read through. Looks great though, and definitely needed!

cosima_cookbook/explore.py
When first instantiated, or experiment changed, the variable
selector widget needs to be refreshed
"""
self.de = DatabaseExtension(self.session, experiments=experiment_name)
Collaborator

Is this what you meant by having to re-cache the database contents when experiments are changed? Maybe DatabaseExtension could cache the full thing once, and provide an interface to narrow its view to certain experiments (instead of recreation?)
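
One way the "cache once, narrow the view" idea could look, as a rough sketch. `CachedCatalogue`, `restrict`, and the sample data are hypothetical names invented here; this is not how `DatabaseExtension` is written.

```python
class CachedCatalogue:
    """Hypothetical cache mapping experiment name -> set of variable names."""

    def __init__(self, variables_by_experiment):
        # Built once, e.g. from a single expensive pass over the database.
        self._full = {
            exp: frozenset(names)
            for exp, names in variables_by_experiment.items()
        }

    def restrict(self, experiments):
        """Return a narrowed view of the cache without re-querying anything."""
        wanted = [experiments] if isinstance(experiments, str) else list(experiments)
        missing = [exp for exp in wanted if exp not in self._full]
        if missing:
            raise KeyError(f"unknown experiments: {missing}")
        return {exp: self._full[exp] for exp in wanted}


# Made-up contents standing in for the real database.
catalogue = CachedCatalogue({
    "exp_a": ["temp", "salt"],
    "exp_b": ["temp", "pbot_t"],
})

view = catalogue.restrict("exp_b")
print(sorted(view["exp_b"]))
```

The narrowed view contains only the requested experiments, so a widget built from it cannot pick up variables from other experiments.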

Collaborator Author

The whole extension thing is a horrible hack. I was hoping to re-implement most of the guts of it in the ORM layer if I had time, which I evidently do not, as this has taken WAY too long.

When I'm testing I make a single de object and pass it into the widgets, so I never have to recalculate. I wasn't sure about always generating the whole thing every time, as it is also possible to pass it a list of experiments, and this is what happens when you just make an ExperimentExplorer and pass it an experiment name. That is quick, as it only pulls in the information for the one experiment. It is important that it only has the info for a single experiment, as we don't want it picking up variables from other experiments.

The DatabaseExtension thing is doing too much work. Like I said, hacky. For DatabaseExplorer it makes the mapping between (name, long_name) and experiment for filtering. But we don't need that for the ExperimentExplorer.

I should rethink that, but am happy for any input, as I've been staring at this stuff for too long and am a bit over it TBH.

I'm trying to write tests ATM, which should be useful if there is some refactoring done.
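
For reference, the (name, long_name) to experiment mapping mentioned above could be sketched roughly as follows. The rows, `experiments_by_variable`, and `experiments_with` are made up for illustration and are not taken from `DatabaseExtension`.

```python
from collections import defaultdict

# Made-up rows of (experiment, variable name, long name).
rows = [
    ("exp_a", "temp", "Potential temperature"),
    ("exp_a", "salt", "Practical salinity"),
    ("exp_b", "temp", "Potential temperature"),
    ("exp_b", "pbot_t", "Bottom pressure"),
]

# Map each (name, long_name) pair to the experiments that contain it.
experiments_by_variable = defaultdict(set)
for experiment, name, long_name in rows:
    experiments_by_variable[(name, long_name)].add(experiment)


def experiments_with(selected):
    """Return the experiments containing every selected (name, long_name) pair.

    An empty selection returns an empty set in this sketch.
    """
    sets = [experiments_by_variable.get(key, set()) for key in selected]
    return set.intersection(*sets) if sets else set()


print(experiments_with([("temp", "Potential temperature")]))
```

Selecting more variables intersects their experiment sets, which is the filtering behaviour the database-level widget needs and the single-experiment widget does not.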

Collaborator Author

Thanks for all the feedback too. Very helpful.

Removed redundant return_value_or_empty functions.

Fixed bug with ExperimentExplorer __init__ when session not passed.
Changed to using a dict comprehension for formatting info output.
Modified test files for explore to ensure differences between two
experiments to make tests meaningful.
@aidanheerdegen
Collaborator Author

Turns out the issue with widgets popping up where they shouldn't is because I am storing them in a dict.

jupyter-widgets/ipywidgets#2944

Python is weird sometimes.
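
The linked issue is not reproduced here. As a general illustration only, one common way this kind of symptom arises in Python is a dict defined on the class rather than on the instance, so every instance shares the same object. Whether that is exactly what happened in this PR is an assumption; the classes below are toy stand-ins and do not use ipywidgets.

```python
class SharedPanel:
    widgets = {}  # class attribute: one dict shared by every instance

    def add(self, name, widget):
        self.widgets[name] = widget


class IsolatedPanel:
    def __init__(self):
        self.widgets = {}  # instance attribute: a fresh dict per instance

    def add(self, name, widget):
        self.widgets[name] = widget


a, b = SharedPanel(), SharedPanel()
a.add("selector", "widget-from-a")
print(b.widgets)  # {'selector': 'widget-from-a'}

c, d = IsolatedPanel(), IsolatedPanel()
c.add("selector", "widget-from-c")
print(d.widgets)  # {}
```

Creating the dict inside `__init__` gives each instance its own container, so a widget added to one panel cannot appear in another.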

…ubwidgets

were pointed to the same memory location in different instances. For more
information see:

jupyter-widgets/ipywidgets#2944

Removed unused imports.
@aidanheerdegen
Collaborator Author

Ok, random widgets turning up in weird places is now fixed.

Enforced consistent style across classes.
@aidanheerdegen aidanheerdegen marked this pull request as ready for review August 7, 2020 07:21
@aidanheerdegen
Collaborator Author

I would like to get this out so people can use it. It would be nice to have more elegant SQL queries to replace the DatabaseExplorer functionality, but I think it would take me too long to figure it out. If you're interested, I'm happy to take suggestions @angus-g, or we can go with this and improve it later, when more experiments are added to the DB and the time taken to loop over all the experiments becomes too large.

@angus-g
Collaborator

angus-g commented Aug 7, 2020

I'll have a quick play with getting the kind of data we want accessible at the ORM level, but I agree that it's better to have something usable rather than fuss over perfection!

@angus-g
Collaborator

angus-g commented Aug 10, 2020

I have some ideas of how to change things, but I'm going to merge this first. If they come to fruition, they can come in through a new PR.

@angus-g angus-g merged commit 4d404e4 into master Aug 10, 2020
@angus-g angus-g deleted the issue-199 branch August 10, 2020 03:53
Development

Successfully merging this pull request may close these issues.

GUI for data exploration
3 participants