Adding periodic boundary condition support to tobac #127

Merged
freemansw1 merged 94 commits into tobac-project:RC_v1.5.0 on Feb 8, 2023

Conversation

galexsky
Contributor

@galexsky galexsky commented May 3, 2022

Extensive update of tobac capabilities and procedures by @freemansw1 and me. Substantial code changes include:

  • Inclusion of the vertical dimension for 3D feature detection and tracking, now accounting for a potentially non-uniform dz
  • Modification to 3D segmentation allowing "box seeding" of user-set size in addition to default "column seeding" method
  • Feature detection, segmentation, and tracking now able to address datasets with periodic boundary conditions (PBCs)
  • Addition of several new utility functions and movement of some such functions from other modules to utils
  • Optimization, simplification, bug fixes, and other streamlining of overall code base
  • Improved code documentation
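
For readers unfamiliar with periodic boundary conditions, the core idea behind PBC-aware tracking is that the distance between two points is measured along the shorter of the direct path and the path that wraps around the domain edge. The snippet below is a minimal sketch of that idea only. The function name `pbc_distance` and its arguments are hypothetical and are not tobac's API, and the flag values `"none"`, `"hdim_1"`, `"hdim_2"` and `"both"` are assumed from the cases listed in the commit messages of this PR.

```python
import math


def pbc_distance(p1, p2, size_h1, size_h2, pbc_flag="both"):
    """Distance between two (hdim_1, hdim_2) points on a periodic domain.

    Hypothetical helper for illustration. Points are assumed to lie inside
    the domain, i.e. 0 <= coordinate < size along each axis.
    """
    d1 = abs(p1[0] - p2[0])
    d2 = abs(p1[1] - p2[1])
    # Along a periodic axis, the shorter of the direct and wrapped paths counts.
    if pbc_flag in ("hdim_1", "both"):
        d1 = min(d1, size_h1 - d1)
    if pbc_flag in ("hdim_2", "both"):
        d2 = min(d2, size_h2 - d2)
    return math.hypot(d1, d2)


# Two points close to opposite edges of a 100 x 100 domain
near_edges = pbc_distance((1, 5), (99, 5), size_h1=100, size_h2=100)
no_wrap = pbc_distance((1, 5), (99, 5), size_h1=100, size_h2=100, pbc_flag="none")
```

With periodic boundaries the two points are neighbours, whereas without them they sit at opposite ends of the domain, which is why a tracker that ignores PBCs would split a feature crossing the boundary into two tracks.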

galexsky and others added 30 commits September 14, 2021 11:54
Added 3D feature detect functions to module
I went in and updated most of the documentation, although not all of it. It now compiles locally on my machine and should be compatible with readthedocs.
Made a few tweaks for brevity/typos
Cleaned up a sentence and typo or two
* Updated documentation

I went in and updated most of the documentation, although not all of it. It now compiles locally on my machine and should be compatible with readthedocs.

* Update analysis.rst

Made a few tweaks for brevity/typos

* Update installation.rst

Cleaned up a sentence and typo or two

Co-authored-by: galexsky <90701223+galexsky@users.noreply.github.com>
)

* Updated the tracking function to the new, faster option

In addition to updating the tracking, I've added a Dockerfile in here for easier testing using pytest.

* Added pycache to the gitignore

* Update tracking.py

Added the 3D version of the updated tracking function. Looks and works essentially the same as the original, but the function name has a '_3D' appended to it and the function's guts now include the vertical dimension 'vdim' in the arguments for the trackpy link/link_df functions.

* Update tracking.py

Tweaked "my_linking_trackpy_3D" to change it to "linking_trackpy_3D" for consistency with the non-3D function name

* Update tracking.py

Removed 3D tracking function so we can merge with consistency.

Co-authored-by: galexsky <90701223+galexsky@users.noreply.github.com>
Updated the docs, including moving docstrings to numpy format
…-project#10)

* Improved speed from implementation of new indices get functions

* Updated feature detection to be faster

* Updated version

I updated the version number in the setup.py script so that users can update with pip.

* Added workflow to automatically run tests via github actions

* Switched around order of dockerfile to improve performance.

* Switched label around to use binary arrays, as that seems faster.

* Speeding up binary_erosion
* Update feature_detection.py

Added new optimized functions for 3D and periodic boundary condition (PBC) feature detection. Includes treatments for no PBCs (original tobac boundary approach), single-boundary PBC for either of hdim_1 or hdim_2, and doubly periodic boundaries

* Updated docstring on some of the feature detection functions

* Updated docstrings

Continuing to update docstrings to be more clear and to be in numpy format where needed.

* Added some new tests and updated dockerfile for testing

I added a test for get_label_props_in_dict as part of adding new tests for tobac. I also updated the dockerfile to improve execution order and build/test times.

* Updated test assertions for number of labels

* cleaned up new code and removed redundant functions

* Added 2D and 3D compatibility with get_indices_of_labels_from_dict.

I also added the associated tests to test_feature_detection.py.

* Renamed feature_position_3D for cleanup.

* Addition of 3D/PBC segmentation functions

Added 2 new versions of the segmentation and segmentation_timestep functions that improve the 3D segmentation and also include a first-pass doubly periodic boundary condition (which will need improvement before being pushed out to the world at large). Whereas the previous 3D segmentation approach simply seeded the entire vertical column at the integer x,y position of a feature, we now seed a 5x5x5 box around the position of the feature, which ought to reduce the chance of artificially segmenting unconnected fields (e.g. cirrus overlying a discrete convective cloud).

* Update segmentation.py

Fixed a call to an old test function "watershed_PBC" to just call the regular skimage.segmentation watershed function. Also updated the original segmentation function to use skimage.segmentation watershed rather than the deprecated skimage.morphology watershed

* Added linking_trackpy_3D function

Added linking_trackpy_3D to module, which is an updated, 3D version of the linking_trackpy function. This function is basically identical to the non-3D version, but includes vdim as an additional function argument and positional argument for the tp.link call that is really the guts of tobac's tracking code. Longer term, probably a good idea to integrate this with the 2D tracking function and use a user-set flag or something to determine whether to do 2D or 3D tracking - all this really comes down to is whether or not to include vdim in pos_columns.

* Added new functions, corrected error in old one

Corrected issue in 'feature_position_3D' where the coordinate re-transform at the end of the function still used the old integer PBC_flag values instead of the new string values. Also added 3D versions of feature_detection_multithreshold(_3D) and feature_detection_multithreshold_timestep(_3D) just for temporary continuity with feature_detection_threshold_3D so we can smoothly integrate 2D and 3D feature detection functions.

* Fixing a docker bug and continuing to combine feature detection

* Updated feature detection for 3D to call the new overarching feature_position function

* Updated 3D feature detection stuff

* Fixed some bugs with 3D feature detection.

* At a working state, but the new combined feature_detection_threshold does not exactly match what we had earlier, so fixing that is the next step.

* Fixed feature_detection_threshold to be in line with what Alex had (and ~2x faster)

* Removing a 3D function to clean things up

* Added 3D capability to feature_detection_multithreshold_timestep and it matches.

* Removed extraneous 3D code

* Continued consolidation of 2D and 3D functions.

* Combined the last of the 2D/3D feature detection functions.

* Fixed a bug with the 2D case label_bbox.

* Ignoring the .vscode files, whoops

* Deleted my personal vscode settings.

* first go at improving remove_parents, switching to numpy comparisons.

* trying out the v2 trunk from the tobac github

* Attempt at our own remove_parents function

* improving speeds

* Attempt at speeding up further

* fixing a bug

* fixing some bugs here?

* Fixing the edge case with no initial features.

* Combined 2D and 3D tracking

* Starting to add in more testing in anticipation of PBC tracking

* Starting to add into the PBC coordinates function

* Added in hdim_2 to PBC_flag both

* get_pbc_coordinates now working throughout

* Cleaned up make_feature_blob and added the docstring for a new generate_single_feature function

* New testing utilities, building to PBC tracking

* Added more tests for our new generate_single_feature function.

* Updated testing code to include times and starting updates to PBC code

* Tracking testing working well now.

* Now have a PBC test that correctly fails.

* Starting to clean up documentation and prep for 3D PBC tracking

* Added in new distance calculation functions for trackpy

* Added in PBC tracking and added scikit-learn as a requirement

To add in PBC tracking, we need to use the BTree neighbor_strategy, which requires scikit-learn.

* Fixed tracking to produce correct results, huzzah!

* Added new tests for PBC tracking with hdim_1 boundaries and hdim_2 boundaries

* Starting prep work to improve efficiency

* Switched Euclidean distance tests to use numpy arrays to match what trackpy gives it

* Revision to calculate_distance_coords_pbc to be faster

* Trying benchmarks with numba

* Making numba optional

* Cleaning up unused functions

* Updated comments

* Added fix for small proto-features case

Previously, the code would crash if there were one or more proto-features at a particular threshold but none exceeded the minimum point threshold. The variable 'column_names' was left undefined, causing an error when attempting to create a dataframe from 'list_features_threshold' with columns 'column_names'. Added a new if-else statement to fix this.

* Added some more testing utilities and more tests for the tests

* Fixed PBC labeling

Added 3 new elifs in feature_detection_threshold that corrected an issue where contiguous indices which crossed boundaries would not be relabeled properly if the cross-boundary label had already been overwritten.

* Removed field_in argument for linking_trackpy

Removed field_in as an argument in linking_trackpy. This argument did not serve any meaningful purpose in the original tracking code, if I remember correctly, and is completely unused in the present implementation.

* Revert "Removed field_in argument for linking_trackpy"

This reverts commit 893d09eaafb86aa3632e5dedc3dbe888ad2797e9.

Co-authored-by: galexsky <90701223+galexsky@users.noreply.github.com>
Co-authored-by: Sean Freeman <sfreeman@parhelic.atmos.colostate.edu>
Co-authored-by: Sean Freeman <sfreeman@solvarg.atmos.colostate.edu>
Co-authored-by: galexsky <galexsky94@gmail.com>
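
The "Fixed PBC labeling" commit above describes labels that cross a periodic boundary not being relabeled correctly once one of them had already been overwritten. One common way to make such merges independent of the order in which they are found is a union-find (disjoint-set) structure. The sketch below illustrates that alternative technique on a small 2D label grid. It is not the fix applied in this PR (which added `elif` branches to `feature_detection_threshold`), and the function name `merge_labels_across_pbc` is hypothetical.

```python
def merge_labels_across_pbc(labels, pbc_flag="both"):
    """Give regions that touch across a periodic edge the same label.

    labels: list of rows of ints, with 0 meaning background.
    pbc_flag: assumed to be "none", "hdim_1", "hdim_2" or "both".
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps chains short
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            # Keep the smaller label as the root so results are deterministic.
            parent[max(ra, rb)] = min(ra, rb)

    n_rows, n_cols = len(labels), len(labels[0])
    if pbc_flag in ("hdim_1", "both"):
        # Compare the first and last row, column by column.
        for j in range(n_cols):
            top, bottom = labels[0][j], labels[n_rows - 1][j]
            if top and bottom:
                union(top, bottom)
    if pbc_flag in ("hdim_2", "both"):
        # Compare the first and last column, row by row.
        for i in range(n_rows):
            left, right = labels[i][0], labels[i][n_cols - 1]
            if left and right:
                union(left, right)
    return [[find(v) if v else 0 for v in row] for row in labels]


grid = [
    [1, 0, 0, 2],
    [0, 0, 0, 0],
    [0, 3, 0, 0],
    [0, 0, 0, 4],
]
merged = merge_labels_across_pbc(grid, pbc_flag="both")
```

In this grid, label 4 is first merged into label 2, and label 2 is later merged into label 1. Because every lookup goes through `find`, label 4 still resolves to 1 even though its original partner was relabeled afterwards.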
Added a few new functions and many lines of code to enable a new 'box' 3D seeding approach and proper watershedding/adjustment of segmentation masks across periodic boundaries. The if-else split between the older 'column' seeding approach and the new 'box' seeding approach should work (though I have not tested it), but right now the PBC treatment assumes we are using the 'box' approach and will likely have to be adapted to work with the 'column' approach too. There is also currently a bug in the buddy box seeding: because we assume a 5x5x5 seeding box, the code will break if the buddy box is smaller than 5 in any dimension. Generalizing the box approach to a user-set size and adding a check for this in the buddy box routine should help fix this, but I wanted to port it over without breaking any current functionality before exploring that.
Removed older versions of segmentation functions which were nominally PBC/3D but had been superseded by new versions of the original functions including these capabilities and more. Also added 'seed_3D_flag' argument to all relevant functions for user choice of 3D feature seeding method
Added missing seed_3D_flag in segmentation function call for segmentation_3D function
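
The 'box' seeding described in the commit messages above, along with the proposed generalization to a user-set box size, can be sketched roughly as follows. This is an illustrative sketch on nested lists, not tobac's implementation: the names `box_seed_indices` and `seed_box` are hypothetical, box sizes are assumed to be odd, and the vertical axis is assumed to be non-periodic.

```python
def box_seed_indices(center, box_size, dim_size, periodic):
    """Indices along one axis covered by a seed box centred on `center`.

    Indices wrap around the axis when `periodic` is true and are clipped
    to the valid range otherwise. `box_size` is assumed to be odd.
    """
    half = box_size // 2
    raw = range(center - half, center + half + 1)
    if periodic:
        # dict.fromkeys drops duplicates (box wider than the axis), keeps order.
        return list(dict.fromkeys(i % dim_size for i in raw))
    return [i for i in raw if 0 <= i < dim_size]


def seed_box(markers, feature_id, position, box_size=(5, 5, 5),
             periodic=(False, True, True)):
    """Write `feature_id` into a box around `position` in a (z, y, x) array."""
    shape = (len(markers), len(markers[0]), len(markers[0][0]))
    zs, ys, xs = (
        box_seed_indices(c, b, n, p)
        for c, b, n, p in zip(position, box_size, shape, periodic)
    )
    for z in zs:
        for y in ys:
            for x in xs:
                markers[z][y][x] = feature_id
    return markers


# A 3 x 6 x 6 marker array with a feature sitting on the domain corner
markers = [[[0] * 6 for _ in range(6)] for _ in range(3)]
seed_box(markers, feature_id=7, position=(1, 0, 5))
```

Clipping or wrapping each axis separately means a box larger than the domain (here 5 levels requested, 3 available) shrinks to fit rather than raising an index error, which is the failure mode the commit message describes for boxes smaller than 5 in any dimension.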
@freemansw1
Member

I'm back to working on this now that we're very close to v1.4.0. I've got the latest changes from RC_v1.5.0 merged in, although there will be some more work to do once #190 is merged in.

I'm not ready for a re-review as there's a lot more to do, but I will ping when I am.

@JuliaKukulies
Member

Great @freemansw1 ! Thanks for the notification and for all your hard work on this one. I will have a look at #190 later this week.

@freemansw1
Member

Okay, I've updated this code for RC_V1.5.0. One thing that I'm struggling with: I cannot get our workaround for predictive tracking working with 3D data. For now, I've switched tests to use random, but that is clearly not ideal. @snilsn any ideas? I know that this will be fixed once soft-matter/trackpy#710 is merged, but without an ETA there, it would be much better to have a workaround.

@snilsn if you're able/willing/interested in helping, I've put the "base case" in a gist here: https://gist.github.com/freemansw1/a917e0d823524c6bb44eee458e218a02 .

@snilsn
Collaborator

snilsn commented Nov 10, 2022

I could get link_df_iter to produce an output by setting pos_columns to ['z', 'y', 'x'] manually, but then the prediction doesn't work. This could mean that guessing the position columns isn't working correctly for 3D inputs within link_df_iter, which would make our workaround useless. trackpy.utils.guess_pos_columns() is used for this, but it seems to work fine.

One idea I had was specifying a 3D initial_guess_vels in NearestVelocityPredict, but that isn't helping.

Did you find any examples for trackpy that include tracking 3D features using predictors @freemansw1?

@snilsn
Collaborator

snilsn commented Nov 11, 2022

I think there is a solution (at least theoretically). It's possible to set the values for the interpolator manually:

from scipy.interpolate import NearestNDInterpolator
import numpy as np
import pandas as pd
import trackpy

# One feature moving one grid point per frame along z, y and x
test_df = pd.DataFrame({'z': [0, 1, 2, 3], 'y': [0, 1, 2, 3],
                        'x': [0, 1, 2, 3], 'frame': [0, 1, 2, 3]})

# link_df_iter takes an iterable of per-frame DataFrames
features_list = [frame for i, frame in test_df.groupby("frame", sort=True)]

# Seed the predictor by hand: velocity (1, 1, 1) known at position (0, 0, 0)
initial_guess_pos = np.asarray([[0, 0, 0]])
initial_guess_vel = np.asarray([[1, 1, 1]])

pred = trackpy.predict.NearestVelocityPredict(span=1)
pred.interpolator = NearestNDInterpolator(initial_guess_pos, initial_guess_vel)

traj_iter = trackpy.link_df_iter(features_list,
                                 search_range=0.1,
                                 t_column='frame',
                                 predictor=pred.predict)
track = pd.concat(traj_iter)

This code links all features into one particle despite the small search range, i.e. the prediction is happening.

The problem is that the velocity isn't adjusted automatically. We would therefore have to loop through all the frames in pairs, interpolate the velocity from the last frame, set a new pred.interpolator at each step, do the linking for the two frames, and combine everything into one dataframe at the end.

It's probably possible, but a little ridiculous. Maybe we can iterate to something feasible from here.
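
The shape of that frame-by-frame loop can be sketched without trackpy. The example below is a conceptual stand-in only: it uses a greedy nearest-neighbour match in place of trackpy's linker, keeps one velocity per track rather than interpolating a velocity field with NearestNDInterpolator, and the function name `link_with_velocity_update` is hypothetical. It requires Python 3.8+ for `math.dist`.

```python
import math


def link_with_velocity_update(frames, initial_velocity, search_range):
    """Link 3D points frame by frame, refreshing velocities after each pair.

    frames: list of lists of (z, y, x) tuples, one list per time step.
    Returns one list of track ids per frame.
    """
    # Every point in the first frame starts its own track.
    ids = [list(range(len(frames[0])))]
    velocities = {tid: initial_velocity for tid in ids[0]}
    next_id = len(frames[0])

    for prev, curr in zip(frames[:-1], frames[1:]):
        curr_ids = [None] * len(curr)
        for p, tid in zip(prev, ids[-1]):
            # Predicted position = last position + last known velocity.
            guess = tuple(a + v for a, v in zip(p, velocities[tid]))
            free = [i for i in range(len(curr)) if curr_ids[i] is None]
            if not free:
                break
            best = min(free, key=lambda i: math.dist(guess, curr[i]))
            if math.dist(guess, curr[best]) <= search_range:
                curr_ids[best] = tid
                # Update the velocity from the displacement just observed.
                velocities[tid] = tuple(c - a for c, a in zip(curr[best], p))
        for i, tid in enumerate(curr_ids):
            if tid is None:  # unmatched detections start new tracks
                curr_ids[i] = next_id
                velocities[next_id] = initial_velocity
                next_id += 1
        ids.append(curr_ids)
    return ids


frames = [[(0, 0, 0)], [(1, 1, 1)], [(2, 2, 2)], [(3, 3, 3)]]
with_prediction = link_with_velocity_update(frames, (1, 1, 1), search_range=0.1)
without_prediction = link_with_velocity_update(frames, (0, 0, 0), search_range=0.1)
```

On the same four-frame test case as above, a correct initial velocity keeps the feature on a single track despite the small search range, while a zero velocity guess starts a new track in every frame.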

@freemansw1 freemansw1 changed the title from "Optimization, documentation, 3D, PBC improvements" to "Adding 3D and periodic boundary condition support to tobac" on Nov 17, 2022
Made several changes to the utils functions due to the merging in of the utils restructuring.
…into dev-3D-PBC-support

Resolved a merge conflict of my own making.
@freemansw1
Member

I've now merged in the latest v1.5.0 changes, most importantly the utils breakout (#191). I'm still working out how best to proceed with this PR. I'm becoming inclined to break out the 3D changes from the PBC changes, despite the pain that will likely cause me. As of now, this PR is extremely difficult to review due to its size and scope, with around 5K LOC changed.

I'm going to hold off asking for re-reviews until I resolve this.

@freemansw1 freemansw1 marked this pull request as draft December 2, 2022 16:14
@freemansw1 freemansw1 changed the title from "Adding 3D and periodic boundary condition support to tobac" to "Adding periodic boundary condition support to tobac" on Dec 20, 2022
@freemansw1 freemansw1 merged commit 4080fed into tobac-project:RC_v1.5.0 Feb 8, 2023
@freemansw1
Member

It appears that, because #209 was merged and contains this commit history, GitHub has tagged this as merged as well. I'll work on a new PR for the PBC changes.

@w-k-jones
Member

That's a slightly annoying feature of GitHub! Let us know when the new PR is ready and I'll review it ASAP.

Labels
enhancement Addition of new features, or improved functionality of existing features

5 participants