Top-down centered-instance pipeline #3

Closed
talmo opened this issue Jul 5, 2023 · 2 comments · Fixed by #16

talmo commented Jul 5, 2023

Add the core data pipeline for top-down centered-instance models.

Use TopdownConfmapsPipeline as reference.

Roughly:

  1. Read labels (Core data loader #1)
  2. Augmentation (Augmentation pipeline block #2)
  3. Find centroids/anchors (Add centroid finder block #7)
  4. Crop instances (Instance Cropping #13)
  5. Apply image normalization (Refactor datapipes #9)
  6. Generate confidence maps (Confidence Map Generation #11)
  7. Shuffling/Batching/prefetching

Some of these may need to occur in a different order than above or in the reference pipeline to work better with the augmentation backends and/or with the PyTorch Lightning data model.

Notes:

  • Be careful when handling the number of instances. 1 instance = 1 example, but we may have a variable number of instances per frame. In the reference implementation, we use tf.data.Dataset.unbatch() to go from frame-level examples to instance-level examples.
  • Shuffling should be done at the frame level if possible.
  • Do image normalization as late as possible: uint8 arrays are 4x smaller than float32, so copying them to the GPU is roughly 4x faster.
  • Make sure to implement the same behavior of centroid/anchor detection as in the reference pipeline. Basically: prefer to use the specified anchor node, but if not visible use the midpoint of the bounding box.
  • Potential performance optimization to explore (specific to this pipeline): applying augmentation to larger crops rather than full image. If we apply augmentation to the final crops, we'll get black patches at the corners if we rotate. If we apply augmentation to the full image, we don't get black edges, but it's much less efficient. The best solution would be to crop at sqrt(2) * box_size (or a bit more for rounding error) so that we never have black edges regardless of the rotation angle.
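
The anchor fallback and the padded-crop idea from the notes above can be sketched in plain Python. This is a minimal sketch, not the reference implementation: the names find_anchor, padded_crop_size and margin are hypothetical, and it assumes invisible nodes are stored as NaN coordinates.

```python
import math


def find_anchor(points, anchor_index=None):
    """Return the (x, y) crop center for one instance, or None.

    `points` is a list of (x, y) tuples. Invisible nodes are assumed to be
    stored as NaN (an assumption of this sketch).
    """
    visible = [(x, y) for x, y in points if not (math.isnan(x) or math.isnan(y))]
    if not visible:
        return None  # no visible nodes, nothing to center on

    # Prefer the specified anchor node when it is visible.
    if anchor_index is not None:
        ax, ay = points[anchor_index]
        if not (math.isnan(ax) or math.isnan(ay)):
            return (ax, ay)

    # Otherwise fall back to the midpoint of the bounding box.
    xs = [x for x, _ in visible]
    ys = [y for _, y in visible]
    return ((min(xs) + max(xs)) / 2, (min(ys) + max(ys)) / 2)


def padded_crop_size(box_size, margin=2):
    """Side length of a pre-augmentation crop that tolerates any rotation.

    A square of side box_size rotated by 45 degrees spans sqrt(2) * box_size,
    which is the worst case. `margin` is a hypothetical allowance for
    rounding error.
    """
    return math.ceil(math.sqrt(2) * box_size) + margin
```

For a 160-pixel box this gives a 229-pixel crop (227 from the sqrt(2) factor plus the margin). Augmentation would run on that larger crop, and the final 160-pixel box would then be cut from its center.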
@talmo talmo changed the title Top-down multi-instance pipeline Top-down centered-instance pipeline Jul 5, 2023

talmo commented Jul 27, 2023

In light of pytorch/data#1196, we should be mindful of using the existing patterns and functionality in core PyTorch's implementation of datapipes: https://github.com/pytorch/pytorch/tree/main/torch/utils/data/datapipes

Namely, we should consider adding decorators to register our DataPipes with the functional API (@functional_datapipe("method_name")), as well as making sure to reuse existing DataPipe blocks like shufflers, etc.
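
As a rough illustration of that pattern, the sketch below registers one custom block with the functional API and chains it with the built-in shuffle and batch datapipes from core PyTorch. The class name InstanceFlattener, the method name flatten_instances, and the toy frame dictionaries are assumptions made for this example, not part of any existing API. Shuffling happens at the frame level before frames are split into instance-level examples.

```python
from torch.utils.data import IterDataPipe, functional_datapipe
from torch.utils.data.datapipes.iter import IterableWrapper


@functional_datapipe("flatten_instances")  # hypothetical method name
class InstanceFlattener(IterDataPipe):
    """Turn frame-level examples into instance-level examples."""

    def __init__(self, source_dp):
        self.source_dp = source_dp

    def __iter__(self):
        for frame in self.source_dp:
            # One output example per instance in the frame.
            for instance in frame["instances"]:
                yield {"frame_idx": frame["frame_idx"], "instance": instance}


# Toy frames with a variable number of instances each (stand-in data).
frames = IterableWrapper(
    [
        {"frame_idx": 0, "instances": ["a", "b"]},
        {"frame_idx": 1, "instances": ["c"]},
        {"frame_idx": 2, "instances": ["d", "e", "f"]},
    ]
)

# Reuse the built-in shuffler and batcher; only the flattening step is custom.
pipeline = frames.shuffle(buffer_size=3).flatten_instances().batch(batch_size=2)
```

Iterating over pipeline yields batches of up to two instance-level examples. The order depends on the shuffle, but instances from the same frame stay adjacent because the flattening step runs after the shuffler.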

talmo commented Jul 27, 2023

For the high-level pipeline builder, we would like an API that is convenient and readable, like:

import sleap_io as sio
import sleap_nn as snn

labels = sio.load_slp("train.pkg.slp")
pipeline = snn.data.TopDownPipeline(labels=labels, anchor_node="thorax", crop_size=160, batch_size=4, rotation=180)

Maybe we'll want to use a @classmethod to construct it instead? It is not clear whether there is a use case for it or whether the __init__ constructor will suffice.
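
To make the two options concrete, here is a minimal sketch of how a @classmethod constructor could sit next to __init__. Everything in it is hypothetical: TopDownPipelineSketch only stores settings and builds no datapipes, from_config and its dict argument are invented for illustration, and the default values are arbitrary.

```python
from dataclasses import dataclass


@dataclass
class TopDownPipelineSketch:
    """Hypothetical stand-in that only stores settings."""

    labels: object
    anchor_node: str
    crop_size: int
    batch_size: int = 4  # arbitrary default for this sketch
    rotation: float = 0.0  # arbitrary default for this sketch

    @classmethod
    def from_config(cls, labels, config):
        """Alternate constructor: pull the settings from a plain dict."""
        return cls(labels=labels, **config)


# Both routes produce the same object.
direct = TopDownPipelineSketch(labels=None, anchor_node="thorax", crop_size=160)
from_dict = TopDownPipelineSketch.from_config(
    None, {"anchor_node": "thorax", "crop_size": 160}
)
```

One possible use case for the classmethod would be building the pipeline from a saved configuration, while __init__ stays the readable entry point for interactive use.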

@alckasoc alckasoc linked a pull request Aug 30, 2023 that will close this issue