Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Aadi/track local queues #20

Merged
merged 41 commits into from
Apr 22, 2024
Merged

Conversation

aaprasad
Copy link
Contributor

@aaprasad aaprasad commented Nov 10, 2023

  • add instance + frame data structures for easier data manipulation.
  • implement track local queues for tracking so that we track across individual instance gaps instead of ending trajectory every time it leaves the queue.
  • Add Miscellaneous formatting/doc string fixes

Summary by CodeRabbit

  • New Features

    • Enhanced instance and frame tracking with updated data structures.
    • Improved dataset handling for more efficient data processing.
  • Documentation

    • Detailed method docstrings for better code understanding.
    • Expanded class descriptions for improved documentation coverage.
  • Refactor

    • Introduced Frame and Instance classes for enhanced data management.
    • Implemented TrackQueue class for improved tracking functionality.
  • Style

    • Aligned codebase with consistent styles for better readability.
    • Enhanced visualization of training batches for improved monitoring.
  • Tests

    • Increased test coverage for new data structures and tracking features.
    • Updated unit tests for enhanced code reliability.
  • Bug Fixes

    • Resolved issues with bounding box calculations for accurate data representation.
    • Fixed data processing logic to ensure correct information handling.
  • Chores

    • Conducted codebase cleanup by removing unused imports for optimized performance.
    • Corrected minor typos in docstrings for improved code clarity.

@aaprasad aaprasad changed the base branch from main to aadi-sleap-anchors November 10, 2023 03:46
Copy link

codecov bot commented Nov 10, 2023

Codecov Report

Attention: Patch coverage is 76.45212% with 150 lines in your changes are missing coverage. Please review.

❗ No coverage uploaded for pull request base (aadi-sleap-anchors@cd6d61a). Click here to learn what that means.

❗ Current head 7c4c645 differs from pull request most recent head 33b97ad. Consider uploading reports for the commit 33b97ad to get more accurate results

Files Patch % Lines
biogtr/data_structures.py 82.39% 47 Missing ⚠️
biogtr/inference/tracker.py 44.44% 45 Missing ⚠️
biogtr/inference/track_queue.py 86.36% 15 Missing ⚠️
biogtr/datasets/eval_dataset.py 0.00% 14 Missing ⚠️
biogtr/inference/track.py 0.00% 13 Missing ⚠️
biogtr/inference/metrics.py 91.30% 4 Missing ⚠️
biogtr/config.py 0.00% 3 Missing ⚠️
biogtr/datasets/data_utils.py 77.77% 2 Missing ⚠️
biogtr/datasets/sleap_dataset.py 90.90% 2 Missing ⚠️
biogtr/models/gtr_runner.py 77.77% 2 Missing ⚠️
... and 3 more
Additional details and impacted files
@@                  Coverage Diff                  @@
##             aadi-sleap-anchors      #20   +/-   ##
=====================================================
  Coverage                      ?   68.24%           
=====================================================
  Files                         ?       28           
  Lines                         ?     2148           
  Branches                      ?        0           
=====================================================
  Hits                          ?     1466           
  Misses                        ?      682           
  Partials                      ?        0           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

coderabbitai bot commented Dec 4, 2023

Important

Auto Review Skipped

Auto reviews are disabled on base/target branches other than the default branch. Please add the base/target branch pattern to the list of additional branches to be reviewed in the settings.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

The changes across the codebase primarily involve the introduction and integration of Frame and Instance data structures, replacing previous uses of dictionaries for handling data. This shift towards object-oriented data management is evident in various modules, from datasets and models to inference and testing. The updates also include docstring improvements, type annotations, and minor logic adjustments to accommodate the new structures.

Changes

File Path Summary
.gitignore Added notebooks/ directory to target/.
biogtr/config.py, biogtr/datasets/..., biogtr/inference/..., biogtr/models/..., biogtr/training/..., tests/... Integrated Frame and Instance data structures; updated method signatures, imports, and logic; added type annotations; improved docstrings.
biogtr/visualize.py, tests/fixtures/configs.py Updated docstrings; modified function parameters and import statements.
biogtr/training/train.py Minor refactoring and documentation changes.
tests/fixtures/torch.py Expanded comment block and fixed typo.

"In the warren of code, where the data hops around,
🐇 A rabbit's delight, as new structures are found.
With Frames and Instances, now neatly in line,
The burrow's more orderly, the code more divine."


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share
Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai generate interesting stats about this repository and render them as a table.
    • @coderabbitai show all the console.log statements in this repository.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger a review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@aaprasad
Copy link
Contributor Author

aaprasad commented Dec 6, 2023

@coderabbitai review

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 27

Configuration used: CodeRabbit UI

Commits Files that changed from the base of the PR and between cd6d61a and 7c4c645.
Files ignored due to filter (2)
  • biogtr/training/configs/base.yaml
  • tests/configs/base.yaml
Files selected for processing (33)
  • .gitignore (1 hunks)
  • biogtr/config.py (5 hunks)
  • biogtr/data_structures.py (1 hunks)
  • biogtr/datasets/base_dataset.py (6 hunks)
  • biogtr/datasets/cell_tracking_dataset.py (5 hunks)
  • biogtr/datasets/data_utils.py (6 hunks)
  • biogtr/datasets/eval_dataset.py (1 hunks)
  • biogtr/datasets/microscopy_dataset.py (5 hunks)
  • biogtr/datasets/sleap_dataset.py (9 hunks)
  • biogtr/datasets/tracking_dataset.py (2 hunks)
  • biogtr/inference/init.py (1 hunks)
  • biogtr/inference/boxes.py (2 hunks)
  • biogtr/inference/metrics.py (9 hunks)
  • biogtr/inference/post_processing.py (1 hunks)
  • biogtr/inference/track.py (6 hunks)
  • biogtr/inference/track_queue.py (1 hunks)
  • biogtr/inference/tracker.py (6 hunks)
  • biogtr/models/attention_head.py (2 hunks)
  • biogtr/models/embedding.py (6 hunks)
  • biogtr/models/global_tracking_transformer.py (2 hunks)
  • biogtr/models/gtr_runner.py (11 hunks)
  • biogtr/models/model_utils.py (3 hunks)
  • biogtr/models/transformer.py (9 hunks)
  • biogtr/training/losses.py (2 hunks)
  • biogtr/training/train.py (3 hunks)
  • biogtr/visualize.py (5 hunks)
  • tests/fixtures/configs.py (1 hunks)
  • tests/fixtures/torch.py (1 hunks)
  • tests/test_data_structures.py (1 hunks)
  • tests/test_datasets.py (12 hunks)
  • tests/test_inference.py (4 hunks)
  • tests/test_models.py (6 hunks)
  • tests/test_training.py (4 hunks)
Files skipped from review due to trivial changes (8)
  • .gitignore
  • biogtr/datasets/tracking_dataset.py
  • biogtr/inference/init.py
  • biogtr/inference/track_queue.py
  • biogtr/models/attention_head.py
  • biogtr/models/embedding.py
  • tests/fixtures/configs.py
  • tests/fixtures/torch.py
Additional comments: 81
biogtr/config.py (5)
  • 45-47: The update to the __str__ method's docstring is a minor change that improves clarity.

  • 93-99: The variable assignments within the get_gtr_runner method appear unchanged, contrary to the summary which mentions reformatting.

  • 191-197: The addition of a trailing comma after **dataloader_params in the get_dataloader method aligns with PEP 8's recommendation for multi-line collections.

  • 283-287: The addition of a trailing comma after the accelerator parameter in the get_trainer method aligns with PEP 8's recommendation for function signatures.

  • 298-304: The get_trainer method has been updated with logic to set accelerator and devices in the trainer configuration if they are not present, which is not mentioned in the summary.

biogtr/datasets/cell_tracking_dataset.py (4)
  • 2-12: The import statements have been correctly updated to include Instance and Frame from biogtr.data_structures, which is in line with the PR objectives and summary.

  • 27-31: The __init__ method of CellTrackingDataset has been updated with new parameters and the super call has been updated accordingly, which is consistent with the PR objectives and summary.

  • 180-204: > Note: This review was outside the patches, so it was mapped to the patch with the greatest overlap. Original lines [111-204]

The get_instances method has been updated to return a list of Frame objects, and the logic within the method has been updated to create and return Frame objects. This change is consistent with the PR objectives and summary.

  • 111-111: The summary states that the input type of the get_instances method has been changed from a list of integers to a list of Frame objects. However, the code indicates that the input type remains a list of integers (label_idx: List[int], frame_idx: List[int]). This discrepancy should be clarified.
biogtr/datasets/data_utils.py (4)
  • 67-73: The summary mentions a modification in the calculation of bounding box coordinates, but the provided hunk shows no changes in the logic of the get_bbox function. The code within the function is identical to the previous version, and the function signature remains unchanged.

  • 119-140: The summary indicates a revision in the calculation of the bounding box around an instance pose in the pose_bbox function, but the provided hunk shows no changes in the logic. The code within the function is identical to the previous version, and the function signature remains unchanged.

  • 213-218: The summary mentions a correction of a variable name from xml_path to data_path in the parse_trackmate function, but the provided hunk shows that the variable data_path is already being used consistently. There is no indication of a previous variable named xml_path.

  • 447-453: > Note: This review was outside the patches, so it was mapped to the patch with the greatest overlap. Original lines [447-481]

The summary mentions a typo fix in the documentation string of the view_training_batch function, but the provided hunk does not show any changes in the documentation string. However, the if condition within the function has been corrected to properly check for the type of cmap.

- ax.imshow(data.T) if isinstance(cmap, None) else ax.imshow(data.T, cmap=cmap)
+ ax.imshow(data.T, cmap=cmap) if cmap is not None else ax.imshow(data.T)
biogtr/datasets/microscopy_dataset.py (6)
  • 2-8: The import changes are consistent with the PR objectives and the AI-generated summaries, which aim to improve tracking capabilities and data handling through the introduction of new data structures.

  • 25-28: The method signature changes in the __init__ method are consistent with the PR objectives and the AI-generated summaries.

  • 94-100: The logic to set self.frame_idx using torch.arange and Image.open(video).n_frames or len(video) is correct and supports different types of video input.

  • 104-112: The get_indices method is correctly implemented and does not present any issues.

  • 170-199: > Note: This review was outside the patches, so it was mapped to the patch with the greatest overlap. Original lines [114-199]

The get_instances method has been refactored to use the new Frame and Instance data structures, which aligns with the PR objectives and the AI-generated summaries.

  • 25-28: Please confirm if the commented-out line for setting the random seed (np.random.seed(self.seed)) is intentionally left in the code. If it's not needed, it should be removed to avoid confusion.
biogtr/datasets/sleap_dataset.py (6)
  • 5-12: The addition of imports for warnings and the Frame and Instance classes from biogtr.data_structures aligns with the PR's objective to enhance data structure usage for better tracking and data manipulation.

  • 29-33: The addition of the verbose argument to the __init__ method is consistent with the PR's objective to enhance tracking capabilities and provide more control over logging and output.

  • 205-270: > Note: This review was outside the patches, so it was mapped to the patch with the greatest overlap. Original lines [140-270]

The modifications to the data processing logic, including the handling of frames and instances using the new Frame and Instance data structures, align with the PR's objectives to improve object tracking across gaps in individual instances.

  • 102-106: The get_indices method remains unchanged, which is consistent with the summary and indicates that the PR's changes did not affect this part of the code.

  • 205-270: > Note: This review was outside the patches, so it was mapped to the patch with the greatest overlap. Original lines [140-270]

The changes to the get_instances method, including the use of the new Frame and Instance data structures, are significant and align with the PR's objectives to improve the tracking system's data handling capabilities.

  • 205-270: > Note: This review was outside the patches, so it was mapped to the patch with the greatest overlap. Original lines [167-270]

The changes to the augmentation logic, including the handling of keypoints and the application of augmentations to the images and poses, are consistent with the PR's objectives to improve data handling and tracking capabilities.

biogtr/inference/boxes.py (3)
  • 1-3: The update to the import statement is correct as Union is not used in the file.

  • 58-59: The docstring correction in the area method is a minor grammatical improvement.

  • 56-62: > Note: This review was outside the patches, so it was mapped to the patch with the greatest overlap. Original lines [5-62]

No further issues or changes are observed in the provided hunks.

biogtr/inference/track.py (3)
  • 21-47: The changes to the export_trajectories function reflect the updated usage of Frame objects instead of dictionaries, aligning with the PR objectives and the summary provided. The logic within the function has been correctly adjusted to handle the new data structures.

  • 84-98: > Note: This review was outside the patches, so it was mapped to the patch with the greatest overlap. Original lines [55-95]

The inference function has been updated to handle Frame objects, which is consistent with the PR objectives and the summary. The logic within the function has been adapted to work with the new data structures, and the changes are correctly implemented.

  • 107-112: The main function's logic has been updated to work with the new Frame objects and the updated inference function. The changes are consistent with the PR objectives and the summary, and the function appears to handle the new data structures correctly.
biogtr/inference/tracker.py (6)
  • 24-105: > Note: This review was outside the patches, so it was mapped to the patch with the greatest overlap. Original lines [15-56]

The addition of max_gap and verbose parameters to the Tracker class's __init__ method is correctly implemented and aligns with the PR objectives to enhance tracking capabilities.

  • 58-58: The __call__ method has been correctly updated to accept a list of Frame objects.

  • 84-102: The logic within the track method has been updated to handle instances of the Frame class and their associated attributes.

  • 121-187: The sliding_inference method has been updated to handle instances of the Frame class and their associated attributes.

  • 215-328: > Note: This review was outside the patches, so it was mapped to the patch with the greatest overlap. Original lines [189-392]

The _run_global_tracker method has been updated to handle instances of the Frame class and their associated attributes.

  • 1-12: The additional imports for warnings, Frame, and TrackQueue are correctly implemented.
biogtr/models/global_tracking_transformer.py (4)
  • 1-7: The import of Frame from biogtr.data_structures is correctly added to support the new data structure usage in the forward method.

  • 101-123: The forward method has been correctly updated to accept a list of Frame objects, and the logic within the method has been adapted to work with these new data structures.

  • 111-123: The logic to extract features from frames only if they do not already have them is a good use of the Frame object's methods and ensures that features are not recomputed unnecessarily.

  • 121-123: The return statement correctly provides the association predictions and embeddings, which is consistent with the method's documentation and the expected output of the forward method.

biogtr/models/gtr_runner.py (8)
  • 21-33: The default values for metrics and persistent_tracking have been updated to improve clarity and functionality. The documentation has been adjusted to reflect these changes.

  • 60-72: The forward method now checks for detected instances using the num_detected attribute before proceeding with the model prediction, which is a logical improvement.

  • 56-80: > Note: This review was outside the patches, so it was mapped to the patch with the greatest overlap. Original lines [74-90]

The training_step method has been updated with improved documentation and a more structured approach to logging metrics.

  • 86-98: > Note: This review was outside the patches, so it was mapped to the patch with the greatest overlap. Original lines [92-108]

The validation_step method has been updated with improved documentation and a more structured approach to logging metrics.

  • 102-114: > Note: This review was outside the patches, so it was mapped to the patch with the greatest overlap. Original lines [110-124]

The test_step method has been updated with improved documentation and a more structured approach to logging metrics.

  • 120-130: > Note: This review was outside the patches, so it was mapped to the patch with the greatest overlap. Original lines [126-141]

The predict_step method has been updated to always use persistent tracking during inference, which aligns with the objective of improving tracking across individual instance gaps.

  • 205-218: > Note: This review was outside the patches, so it was mapped to the patch with the greatest overlap. Original lines [178-207]

The configure_optimizers method has been structured to be overridden by config, with a default setup for the optimizer and scheduler provided. This allows for flexibility and customization.

  • 209-218: The log_metrics method has been enhanced with type annotations and improved documentation, which aids in code readability and maintainability.
biogtr/models/transformer.py (7)
  • 11-21: The documentation at the top of the file has been updated to reflect the addition of fixed embeddings over boxes, aligning with the PR's objective to enhance tracking capabilities.

  • 162-163: The _reset_parameters method has been updated to initialize model weights using the Xavier uniform distribution for tensors with more than one dimension. This is a standard practice for initializing neural network weights and should help with convergence during training.

  • 214-245: > Note: This review was outside the patches, so it was mapped to the patch with the greatest overlap. Original lines [165-272]

The forward method of the Transformer class has been significantly refactored to accommodate the new Frame objects and the query_frame parameter. The method now concatenates features from the Frame objects and handles positional embeddings based on the embedding_meta configuration. The logic for handling query indices and the attention mechanism has been updated accordingly. This aligns with the PR's objective to improve object tracking across individual instance gaps.

  • 294-298: The forward method of the TransformerEncoder class has been updated to optionally accept a positional embedding tensor. This change is consistent with the overall PR's objective of enhancing the tracking system's capabilities.

  • 339-345: The forward method of the TransformerDecoder class has been updated to accept positional embeddings for both the target sequence and the memory sequence. This is in line with the changes made to the Transformer class and supports the PR's goal of improving tracking.

  • 416-420: The forward method of the TransformerEncoderLayer class has been updated to include positional embeddings in the input sequence if provided. This change is consistent with the updates to the TransformerEncoder and TransformerDecoder classes and supports the PR's enhancements.

  • 490-494: The forward method of the TransformerDecoderLayer class has been updated to handle positional embeddings for both the target and memory sequences. This change is consistent with the updates to the TransformerDecoder class and supports the PR's enhancements.

biogtr/training/losses.py (4)
  • 2-2: The import change from dictionaries to Frame objects is consistent with the PR objectives and the summary provided.

  • 36-38: The update to the forward method signature to accept Frame objects aligns with the PR objectives and the summary.

  • 49-50: The attribute access changes to utilize Frame object methods are consistent with the PR objectives and the summary.

  • 53-53: The change in the function call to get_boxes_times to pass Frame objects is consistent with the PR objectives and the summary.

biogtr/training/train.py (1)
  • 41-52: The refactoring of the input prompt logic for obtaining the task index is a good fallback mechanism for when the POD_INDEX environment variable is not set. This ensures that the batch job can still proceed with manual input.
biogtr/visualize.py (4)
  • 2-13: The summary indicates that the import statement for matplotlib.pyplot was removed, but this change is not visible in the provided hunks. Please verify if this change occurred in a part of the file not shown here.

  • 18-19: The change in the docstring for fill_missing from "Fills" to "Fill" is a minor grammatical correction and is acceptable.

  • 67-71: The addition of the poses parameter to the annotate_video function may require updates to all calls to this function to accommodate the new parameter, especially if they rely on positional arguments.

  • 280-282: The refactoring of the main function's docstring to a one-liner is a stylistic change and is acceptable.

tests/test_datasets.py (5)
  • 54-58: The update from direct dictionary access to using the get_gt_track_ids() method aligns with the PR's objective to encapsulate attributes and provide methods for easier manipulation and access.

  • 65-71: The changes to the n_chunks parameter in the SleapDataset instantiation are not mentioned in the summary. This could be an oversight, as altering the chunking behavior can have significant effects on dataset partitioning and should be documented.

  • 54-58: The changes to the assertions in the test functions are not mentioned in the summary. These changes are significant as they validate the expected behavior of the dataset with the new data structures.

  • 229-229: The changes to the assertions comparing gt_track_ids_1 and gt_track_ids_2 are not mentioned in the summary. These changes are significant as they ensure the consistency of the dataset's behavior with and without the gt_list provided.

  • 384-390: The changes to the augmentation tests in the test_augmentations function are not mentioned in the summary. These changes are significant as they test the robustness of the dataset under transformations and should be documented.

tests/test_inference.py (4)
  • 5-7: The import of Frame and Instance from biogtr.data_structures is consistent with the PR objectives and the changes described in the summary.

  • 19-37: The refactoring to use Frame and Instance objects for tracking logic is consistent with the PR objectives and the summary.

  • 62-81: > Note: This review was outside the patches, so it was mapped to the patch with the greatest overlap. Original lines [78-152]

The post-processing tests appear to be unchanged and not directly related to the new data structures. Verify if the commented-out pytest.mark.parametrize is intended to be removed or if it's part of a refactor that should be completed.

  • 155-188: The creation of Frame and Instance objects for metrics testing is consistent with the PR objectives and the summary.
tests/test_models.py (3)
  • 208-225: The test test_transformer_basic has been updated to use the new Frame and Instance data structures. This change is consistent with the PR objectives and the summary provided.

  • 270-286: > Note: This review was outside the patches, so it was mapped to the patch with the greatest overlap. Original lines [270-307]

The test test_transformer_embedding has been updated to use the new Frame and Instance data structures and includes additional assertions to check the size of the output embeddings. This change is consistent with the PR objectives and the summary provided.

  • 314-331: > Note: This review was outside the patches, so it was mapped to the patch with the greatest overlap. Original lines [314-353]

The test test_tracking_transformer has been updated to use the new Frame and Instance data structures and includes additional assertions to check the size of the output embeddings. This change is consistent with the PR objectives and the summary provided.

tests/test_training.py (4)
  • 2-8: The imports and setup in the beginning of the file are consistent with the PR objectives and the summary provided.

  • 19-39: The changes in the test_asso_loss function correctly reflect the use of the new Frame and Instance data structures, aligning with the PR objectives and the summary.

  • 46-77: The setup of the test_basic_gtr_runner function with the new Frame and Instance data structures is consistent with the PR objectives and the summary.

  • 98-120: The calls to training_step, validation_step, and test_step within the test_basic_gtr_runner function are updated to pass a list containing batch instead of batch directly, which is in line with the PR objectives and the summary.


# for now set equal since detections are fixed
pred_box, pred_time = get_boxes_times(instances)
pred_box, pred_time = get_boxes_times(frames)
target_box, target_time = pred_box, pred_time

# todo: we should maybe reconsider how we label gt instances. The second
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider removing the commented-out code that references the old implementation using dictionaries to clean up the file and avoid confusion.

Comment on lines 1 to 13
"""Module containing model helper functions."""
from copy import deepcopy
from typing import Dict, List, Tuple, Iterable
from typing import List, Tuple, Iterable
from pytorch_lightning import loggers
from biogtr.data_structures import Frame
import torch


def get_boxes_times(instances: List[Dict]) -> Tuple[torch.Tensor, torch.Tensor]:
"""Extracts the bounding boxes and frame indices from the input list of instances.
def get_boxes_times(frames: List[Frame]) -> Tuple[torch.Tensor, torch.Tensor]:
"""Extract the bounding boxes and frame indices from the input list of instances.

Args:
instances (List[Dict]): List of instance dictionaries
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: This review was outside the patches, and no patch overlapping with it was found. Original lines [63-64]

Potential bug in error handling.

-        if optimizer_class is None:
+        optimizer_class = None

Note: This review was outside the patches, and no patch overlapping with it was found. Original lines [83-84]

Potential bug in error handling.

-        if scheduler_class is None:
+        scheduler_class = None

Comment on lines +7 to +61
def test_instance():
"""Test Instance object logic."""

gt_track_id = 0
pred_track_id = 0
bbox = torch.randn((1, 4))
crop = torch.randn((1, 3, 128, 128))
features = torch.randn((1, 64))

instance = Instance(
gt_track_id=gt_track_id,
pred_track_id=pred_track_id,
bbox=bbox,
crop=crop,
features=features,
)

assert instance.has_gt_track_id()
assert instance.gt_track_id.item() == gt_track_id
assert instance.has_pred_track_id()
assert instance.pred_track_id.item() == pred_track_id
assert instance.has_bbox()
assert torch.equal(instance.bbox, bbox)
assert instance.has_features()
assert torch.equal(instance.features, features)

instance.gt_track_id = 1
instance.pred_track_id = 1
instance.bbox = torch.randn((1, 4))
instance.crop = torch.randn((1, 3, 128, 128))
instance.features = torch.randn((1, 64))

assert instance.has_gt_track_id()
assert instance.gt_track_id.item() != gt_track_id
assert instance.has_pred_track_id()
assert instance.pred_track_id.item() != pred_track_id
assert instance.has_bbox()
assert not torch.equal(instance.bbox, bbox)
assert instance.has_features()
assert not torch.equal(instance.features, features)

instance.gt_track_id = None
instance.pred_track_id = -1
instance.bbox = None
instance.crop = None
instance.features = None

assert not instance.has_gt_track_id()
assert instance.gt_track_id.shape[0] == 0
assert not instance.has_pred_track_id()
assert instance.pred_track_id.item() != pred_track_id
assert not instance.has_bbox()
assert not torch.equal(instance.bbox, bbox)
assert not instance.has_features()
assert not torch.equal(instance.features, features)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests for the Instance class seem comprehensive, covering the creation of an instance, attribute checks, and the behavior when setting attributes to None. However, it would be beneficial to add comments explaining the purpose of each assertion block for better readability and maintainability.

Comment on lines +64 to +126
def test_frame():
n_detected = 2
n_traj = 3
video_id = 0
frame_id = 0
img_shape = torch.tensor([3, 1024, 1024])
asso_output = torch.randn(n_detected, 16)
traj_score = torch.randn(n_detected, n_traj)
matches = ([0, 1], [0, 1])

instances = []
for i in range(n_detected):
instances.append(
Instance(
gt_track_id=i,
pred_track_id=i,
bbox=torch.randn(1, 4),
crop=torch.randn(1, 3, 64, 64),
features=torch.randn(1, 64),
)
)
frame = Frame(
video_id=video_id, frame_id=frame_id, img_shape=img_shape, instances=instances
)

assert frame.video_id.item() == video_id
assert frame.frame_id.item() == frame_id
assert torch.equal(frame.img_shape, img_shape)
assert frame.num_detected == n_detected
assert frame.has_instances()
assert len(frame.instances) == n_detected
assert frame.has_gt_track_ids()
assert len(frame.get_gt_track_ids()) == n_detected
assert frame.has_pred_track_ids()
assert len(frame.get_pred_track_ids()) == n_detected
assert not frame.has_matches()
assert not frame.has_asso_output()
assert not frame.has_traj_score()

frame.asso_output = asso_output
frame.add_traj_score("initial", traj_score)
frame.matches = matches

assert frame.has_matches()
assert frame.matches == matches
assert frame.has_asso_output()
assert torch.equal(frame.asso_output, asso_output)
assert frame.has_traj_score()
assert torch.equal(frame.get_traj_score("initial"), traj_score)

frame.instances = []

assert frame.video_id.item() == video_id
assert frame.num_detected == 0
assert not frame.has_instances()
assert len(frame.instances) == 0
assert not frame.has_gt_track_ids()
assert not len(frame.get_gt_track_ids())
assert not frame.has_pred_track_ids()
assert len(frame.get_pred_track_ids()) == 0
assert frame.has_matches()
assert frame.has_asso_output()
assert frame.has_traj_score()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests for the Frame class cover a variety of scenarios, including the creation of a frame with instances, attribute checks, and the behavior when modifying the instances attribute. Similar to the Instance tests, adding comments to explain the purpose of each assertion block would improve readability.

Comment on lines +129 to +204
def test_track_queue():
window_size = 8
max_gap = 10
img_shape = (3, 1024, 1024)
n_instances_per_frame = [2] * window_size

frames = []
instances_per_frame = []

tq = TrackQueue(window_size, max_gap)
for i in range(window_size):
instances = []
for j in range(n_instances_per_frame[i]):
instances.append(Instance(gt_track_id=j, pred_track_id=j))
instances_per_frame.append(instances)
frame = Frame(video_id=0, frame_id=i, img_shape=img_shape, instances=instances)
frames.append(frame)

tq.add_frame(frame)

assert len(tq) == sum(n_instances_per_frame[1:])
assert tq.n_tracks == max(n_instances_per_frame)
assert tq.tracks == [i for i in range(max(n_instances_per_frame))]
assert len(tq.collate_tracks()) == window_size - 1
assert all([gap == 0 for gap in tq._curr_gap.values()])
assert tq.curr_track == max(n_instances_per_frame) - 1

tq.add_frame(
Frame(
video_id=0,
frame_id=window_size + 1,
img_shape=img_shape,
instances=[Instance(gt_track_id=0, pred_track_id=0)],
)
)

assert len(tq._queues[0]) == window_size - 1
assert tq._curr_gap[0] == 0
assert tq._curr_gap[max(n_instances_per_frame) - 1] == 1

tq.add_frame(
Frame(
video_id=0,
frame_id=window_size + 1,
img_shape=img_shape,
instances=[
Instance(gt_track_id=1, pred_track_id=1),
Instance(
gt_track_id=max(n_instances_per_frame),
pred_track_id=max(n_instances_per_frame),
),
],
)
)

assert len(tq._queues[max(n_instances_per_frame)]) == 1
assert tq._curr_gap[1] == 0
assert tq._curr_gap[0] == 1

for i in range(max_gap):
tq.add_frame(
Frame(
video_id=0,
frame_id=window_size + i + 1,
img_shape=img_shape,
instances=[Instance(gt_track_id=0, pred_track_id=0)],
)
)

assert tq.n_tracks == 1
assert tq.curr_track == max(n_instances_per_frame)
assert 0 in tq._queues.keys()

tq.end_tracks()

assert len(tq) == 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests for the TrackQueue class check the queue's behavior with different numbers of frames and instances, as well as the handling of gaps. It would be good to add assertions to check the correctness of the collate_tracks method's output and to ensure that the end_tracks method correctly finalizes the tracking. Additionally, comments explaining the test scenarios would enhance understanding.

Comment on lines 16 to 18
Args:
instances: a list of dicts where each dict corresponds to a frame and
contains the video_id, frame_id, gt labels and predicted labels
instances: a list of Frames containing the video_id, frame_id,
gt labels and predicted labels
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstring argument name instances should be updated to frames to match the function signature.

-        instances: a list of Frames containing the video_id, frame_id,
+        frames: a list of Frames containing the video_id, frame_id,

Committable suggestion

IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
Args:
instances: a list of dicts where each dict corresponds to a frame and
contains the video_id, frame_id, gt labels and predicted labels
instances: a list of Frames containing the video_id, frame_id,
gt labels and predicted labels
Args:
frames: a list of Frames containing the video_id, frame_id,
gt labels and predicted labels

Comment on lines 42 to 43
else:
warnings.warn("No instances detected!")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider providing more context in the warning message, such as the video_id or frame_id, to help with debugging.

Comment on lines 27 to 63
def __getitem__(self, idx: int) -> List[Frame]:
"""Get an element of the dataset.

Args:
idx: the index of the batch. Note this is not the index of the video
or the frame.

Returns:
A list of dicts where each dict corresponds a frame in the chunk and
each value is a `torch.Tensor`. Dict elements are the video id, frame id, and gt/pred track ids

A list of Frames where frames contain instances w gt and pred track ids + bboxes.
"""
labels = [{"video_id": gt_frame['video_id'],
"frame_id": gt_frame['video_id'],
"gt_track_ids": gt_frame['gt_track_ids'],
"pred_track_ids": pred_frame['gt_track_ids'],
"bboxes": pred_frame["bboxes"]
} for gt_frame, pred_frame in zip(self.gt_dataset[idx], self.pred_dataset[idx])]

return labels No newline at end of file
gt_batch = self.gt_dataset[idx]
pred_batch = self.pred_dataset[idx]

eval_frames = []
for gt_frame, pred_frame in zip(gt_batch, pred_batch):
eval_instances = []
for gt_instance, pred_instance in zip(
gt_frame.instances, pred_frame.instances
):
eval_instances.append(
Instance(
gt_track_id=gt_instance.gt_track_id,
pred_track_id=pred_instance.gt_track_id,
bbox=pred_instance.bbox,
)
)
eval_frames.append(
Frame(
video_id=gt_frame.video_id,
frame_id=gt_frame.frame_id,
img_shape=gt_frame.img_shape,
instances=eval_instances,
)
)

return eval_frames
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The __getitem__ method has been refactored to return a list of Frame objects. Ensure that the logic correctly creates Frame and Instance objects and that the ground truth and predicted track IDs are assigned to the correct attributes. Specifically, line 50 seems to incorrectly assign pred_instance.gt_track_id to pred_track_id instead of pred_instance.pred_track_id.

    eval_instances.append(
    Instance(
        gt_track_id=gt_instance.gt_track_id,
-       pred_track_id=pred_instance.gt_track_id,
+       pred_track_id=pred_instance.pred_track_id,
        bbox=pred_instance.bbox,
    )
)

Committable suggestion

IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
def __getitem__(self, idx: int) -> List[Frame]:
"""Get an element of the dataset.
Args:
idx: the index of the batch. Note this is not the index of the video
or the frame.
Returns:
A list of dicts where each dict corresponds a frame in the chunk and
each value is a `torch.Tensor`. Dict elements are the video id, frame id, and gt/pred track ids
A list of Frames where frames contain instances w gt and pred track ids + bboxes.
"""
labels = [{"video_id": gt_frame['video_id'],
"frame_id": gt_frame['video_id'],
"gt_track_ids": gt_frame['gt_track_ids'],
"pred_track_ids": pred_frame['gt_track_ids'],
"bboxes": pred_frame["bboxes"]
} for gt_frame, pred_frame in zip(self.gt_dataset[idx], self.pred_dataset[idx])]
return labels
\ No newline at end of file
gt_batch = self.gt_dataset[idx]
pred_batch = self.pred_dataset[idx]
eval_frames = []
for gt_frame, pred_frame in zip(gt_batch, pred_batch):
eval_instances = []
for gt_instance, pred_instance in zip(
gt_frame.instances, pred_frame.instances
):
eval_instances.append(
Instance(
gt_track_id=gt_instance.gt_track_id,
pred_track_id=pred_instance.gt_track_id,
bbox=pred_instance.bbox,
)
)
eval_frames.append(
Frame(
video_id=gt_frame.video_id,
frame_id=gt_frame.frame_id,
img_shape=gt_frame.img_shape,
instances=eval_instances,
)
)
return eval_frames
def __getitem__(self, idx: int) -> List[Frame]:
"""Get an element of the dataset.
Args:
idx: the index of the batch. Note this is not the index of the video
or the frame.
Returns:
A list of Frames where frames contain instances w gt and pred track ids + bboxes.
"""
gt_batch = self.gt_dataset[idx]
pred_batch = self.pred_dataset[idx]
eval_frames = []
for gt_frame, pred_frame in zip(gt_batch, pred_batch):
eval_instances = []
for gt_instance, pred_instance in zip(
gt_frame.instances, pred_frame.instances
):
eval_instances.append(
Instance(
gt_track_id=gt_instance.gt_track_id,
pred_track_id=pred_instance.pred_track_id,
bbox=pred_instance.bbox,
)
)
eval_frames.append(
Frame(
video_id=gt_frame.video_id,
frame_id=gt_frame.frame_id,
img_shape=gt_frame.img_shape,
instances=eval_instances,
)
)
return eval_frames

Comment on lines 12 to 13
gt_track_id: int = None,
pred_track_id: int = -1,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default values for gt_track_id and pred_track_id are inconsistent with their expected types. Consider using 0 or another placeholder that aligns with the expected integer type.

-        gt_track_id: int = None,
+        gt_track_id: int = 0,
-        pred_track_id: int = -1,
+        pred_track_id: int = 0,

Committable suggestion

IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
gt_track_id: int = None,
pred_track_id: int = -1,
gt_track_id: int = 0,
pred_track_id: int = 0,

Comment on lines 30 to 37
self._gt_track_id = torch.tensor([gt_track_id])
else:
self._gt_track_id = torch.tensor([])

if pred_track_id is not None:
self._pred_track_id = torch.tensor([pred_track_id])
else:
self._pred_track_id = torch.tensor([])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider whether it is necessary to store gt_track_id and pred_track_id as tensors since they represent single integer values.

save config to wandblogger
enable use of profiler
use mixed precision in training
use simple tracker ids to initialize tracks instead of random
use clone instead of deepcopy
save out all asso_matrices
don't track during training by default
save pose as dictionary
fix tests
check box coords to make sure x1x2 or y1y2 are not negative
save embeddings to model for gradient update
@aaprasad
Copy link
Contributor Author

@coderabbitai review

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 27

Actionable comments outside the diff hunks (8)
tests/test_training.py (1)

Line range hint 45-45: Address the unused local variable to clean up the code.

- feats = 128

The variable feats is assigned but never used. Consider removing it if it's not needed.

biogtr/inference/track.py (1)

Line range hint 137-137: Correct the f-string to include placeholders where necessary.

- print(f"Pod Index: {index}")
+ print(f"Pod Index: {index}")  # Ensure that `index` is defined and used correctly in the f-string.

This change ensures that the f-string is used correctly, displaying the value of index.

tests/test_inference.py (1)

Line range hint 91-91: Address the unused local variable to clean up the code.

- N = N_t * T

The variable N is assigned but never used. Consider removing it if it's not needed.

biogtr/models/model_utils.py (2)

Line range hint 63-64: Address potential bug in error handling.

-        if optimizer_class is None:
+        optimizer_class = None

This change corrects a potential bug where the optimizer_class variable might not be set correctly, leading to runtime errors.


Line range hint 83-84: Address potential bug in error handling.

-        if scheduler_class is None:
+        scheduler_class = None

This change corrects a potential bug where the scheduler_class variable might not be set correctly, leading to runtime errors.

biogtr/datasets/microscopy_dataset.py (1)

Line range hint 86-86: Avoid using lambda expressions for defining functions due to potential issues with serialization and readability. Consider defining a proper function instead.

- parser = lambda x: data_utils.parse_synthetic(x, source=source)
+ def parser(x):
+     return data_utils.parse_synthetic(x, source=source)
biogtr/datasets/data_utils.py (2)

Line range hint 66-66: Use isinstance() for type checking.

- if type(size) == int:
+ if isinstance(size, int):

Replace type comparison with isinstance() for more Pythonic and reliable type checking.


Line range hint 342-342: Avoid using bare except.

- except:
+ except Exception as e:

Specify the exception type to avoid catching unexpected exceptions, which can make debugging more difficult.

tests/test_models.py Show resolved Hide resolved

def track(self, model: GlobalTrackingTransformer, instances: list[dict], all_instances: list = None):
def track(self, model: GlobalTrackingTransformer, frames: list[dict]):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The method signature for track should accept a list of Frame objects instead of a list of dictionaries. This change aligns with the PR objectives and ensures type consistency.

- def track(self, model: GlobalTrackingTransformer, frames: list[dict]):
+ def track(self, model: GlobalTrackingTransformer, frames: list[Frame]):

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
def track(self, model: GlobalTrackingTransformer, frames: list[dict]):
def track(self, model: GlobalTrackingTransformer, frames: list[Frame]):

self.track_queue.clear()

if self.verbose:
warnings.warn(f"Clearing Queue after tracking")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The warning message in the verbose logging uses an f-string without placeholders. Consider revising the message or removing the f-string if not needed.

- warnings.warn(f"Clearing Queue after tracking")
+ warnings.warn("Clearing Queue after tracking")

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
warnings.warn(f"Clearing Queue after tracking")
warnings.warn("Clearing Queue after tracking")

from biogtr.inference.boxes import Boxes
import torch
from biogtr.data_structures import Frame
import warnings
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove unused import warnings.

- import warnings

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Suggested change
import warnings

biogtr/inference/metrics.py Show resolved Hide resolved
biogtr/visualize.py Show resolved Hide resolved
biogtr/visualize.py Show resolved Hide resolved
biogtr/visualize.py Show resolved Hide resolved
biogtr/datasets/sleap_dataset.py Show resolved Hide resolved
biogtr/datasets/microscopy_dataset.py Show resolved Hide resolved
@aaprasad aaprasad merged commit cfc5500 into aadi-sleap-anchors Apr 22, 2024
6 checks passed
@talmo talmo deleted the aadi/track-local-queues branch May 14, 2024 21:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant