Add trainer #29
Walkthrough
The changes involve enhancing the peak-finding algorithm in
Changes
Codecov Report
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #29      +/-   ##
==========================================
- Coverage   99.78%   99.35%   -0.44%
==========================================
  Files          19       20       +1
  Lines         949     1087     +138
==========================================
+ Hits          947     1080     +133
- Misses          2        7       +5
Review Status
Actionable comments generated: 18
Configuration used: CodeRabbit UI
Files ignored due to filter (3)
- environment.yml
- environment_cpu.yml
- environment_osx-arm64.yml
Files selected for processing (5)
- docs/config.md (1 hunks)
- sleap_nn/data/general.py (1 hunks)
- sleap_nn/model_trainer.py (1 hunks)
- tests/fixtures/datasets.py (2 hunks)
- tests/test_model_trainer.py (1 hunks)
Files skipped from review due to trivial changes (1)
- docs/config.md
Additional comments: 9
sleap_nn/data/general.py (1)
- 21-22: The documentation update in the `__iter__` method correctly reflects the changes in how `frame_idx` and `video_idx` are handled. This aligns with the PR's objective to modify the logic for accessing source video information.
sleap_nn/model_trainer.py (2)
- 3-19: The imports and the `ModelTrainer` class setup indicate the introduction of a new training system using PyTorch Lightning. The class is well-documented and includes a constructor that sets up the device and seed for training reproducibility.
- 160-272: The `TopDownCenteredInstanceModel` class is a subclass of `LightningModule` and is responsible for defining the training and validation steps, as well as configuring optimizers and learning rate schedulers. The class is well-documented and includes methods for the forward pass, training and validation steps, and optimizer configuration.
tests/fixtures/datasets.py (3)
- 3-3: The import of `OmegaConf` is correctly added to support the new `config` fixture function.
- 20-209: The `config` fixture function is well-defined and correctly utilizes the `sleap_data_dir` fixture to set up paths in the configuration. It's important to ensure that the keys and values within the configuration dictionary match the expected schema used elsewhere in the codebase.
- 22-206: The verification process has confirmed that the structure and keys of the configuration dictionary in the `config` fixture are consistent with the expected schema used throughout the codebase. Additionally, the file 'minimal_instance.pkg.slp' referenced in the configuration exists in the repository. No issues have been found.
Analysis chain
Verify that the structure and keys of the configuration dictionary match the expected schema throughout the codebase. This includes ensuring that the default values and types are consistent with how the configuration is consumed in other parts of the application.
Given the output, we need to adjust our approach to verify the configuration structure and also to check the existence of the file 'minimal_instance.pkg.slp'. Let's create a new script to address these points.
The output from the `rg` command did not provide the necessary context to verify the usage of the configuration keys, and the `find` command failed to check for the existence of the file 'minimal_instance.pkg.slp'. Let's correct these scripts to gather the required information.
Scripts Executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for the usage of the keys within the configuration to ensure they match the expected schema.
rg 'data_config|model_config|trainer_config' --files-with-matches
# Verify that the paths constructed within the configuration are valid and correspond to actual files.
fd 'minimal_instance.pkg.slp' --exec test -f
Length of output: 334
Script:
#!/bin/bash
# Deep dive into the usage of the configuration keys to ensure they match the expected schema.
# We will look for the definitions and usages of these keys in the identified files.
rg 'data_config|model_config|trainer_config' -C 5 --files-with-matches
# Correctly check if the file 'minimal_instance.pkg.slp' exists in the repository.
# Since the previous fd command was incorrect, we will use find instead.
find . -name "minimal_instance.pkg.slp" -exec test -f {} \; -print
Length of output: 368
Script:
#!/bin/bash
# Adjust the ripgrep command to include context lines around the matches for detailed analysis.
rg 'data_config|model_config|trainer_config' -C 5
# Correct the find command to simply print the file if it exists, without using the 'test' command.
find . -name "minimal_instance.pkg.slp" -print
Length of output: 40106
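The schema check that the scripts above perform with `rg` can also be expressed as a small runtime assertion. A minimal sketch, assuming the configuration is a plain nested mapping with the three top-level sections named in this review (`data_config`, `model_config`, `trainer_config`). The helper name `missing_sections` and the sample values are hypothetical and not part of sleap_nn:

```python
from typing import Any, List, Mapping

# Top-level sections named in the review; nothing below them is checked here.
REQUIRED_SECTIONS = ("data_config", "model_config", "trainer_config")


def missing_sections(config: Mapping[str, Any]) -> List[str]:
    """Return the required top-level sections that are absent from `config`."""
    return [name for name in REQUIRED_SECTIONS if name not in config]


# Hypothetical sample config; the nested keys are placeholders for illustration.
sample = {
    "data_config": {"labels_path": "tests/assets/minimal_instance.pkg.slp"},
    "model_config": {},
}
print(missing_sections(sample))
```

A check like this only confirms that the sections exist. Whether the nested keys match what the trainer consumes still needs the kind of usage search shown in the scripts.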
tests/test_model_trainer.py (3)
- 23-34: The assertions in the test `test_create_data_loader` seem to be correct, but it's important to ensure that the `config` fixture and `sleap_data_dir` are properly set up to provide meaningful test data.
- 95-153: Ensure that the paths used in `test_trainer` are valid and that the test data is set up correctly to provide meaningful results.
- 156-173: Ensure that the input data and model configuration in `test_topdown_centered_instance_model` are set up correctly to provide meaningful results.
Review Status
Actionable comments generated: 2
Configuration used: CodeRabbit UI
Files selected for processing (5)
- docs/config.md (1 hunks)
- sleap_nn/inference/peak_finding.py (1 hunks)
- sleap_nn/model_trainer.py (1 hunks)
- tests/fixtures/datasets.py (2 hunks)
- tests/test_model_trainer.py (1 hunks)
Files skipped from review as they are similar to previous changes (2)
- docs/config.md
- sleap_nn/model_trainer.py
Additional comments: 6
sleap_nn/inference/peak_finding.py (1)
- 96-122: Given the changes in the peak finding logic, it is crucial to verify that the unit tests for the `find_global_peaks_rough` function have been updated to reflect these changes and that they cover the new behavior, especially the handling of values below the threshold.
tests/fixtures/datasets.py (5)
- 20-208: The new `config` fixture function is well-structured and provides a comprehensive configuration for testing purposes. It's important to ensure that the values used, especially paths and hyperparameters, align with the expected test scenarios and that they are documented if necessary for clarity.
- 26-27: Verify that the files referenced in the `labels_path` configuration exist and are in the correct format to avoid test failures due to missing or malformed data. Also applies to: 76-76
Verification successful
The file `minimal_instance.pkg.slp` does exist within the `tests/assets` directory of the repository. This confirms that the file referenced in the `labels_path` configuration of the `config` fixture is present and should be accessible during testing.
Scripts Executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Verify the existence of the minimal_instance.pkg.slp file in the sleap_data_dir for both train and val configurations.
fd "minimal_instance.pkg.slp" "$(pytestconfig.rootdir)/tests/assets" --exec test -f
Length of output: 271
Script:
#!/bin/bash
# Search for the file 'minimal_instance.pkg.slp' within the entire repository.
fd "minimal_instance.pkg.slp"
Length of output: 67
- 48-48: Ensure that if data augmentation features are to be tested, there are test cases that specifically enable and test these configurations, as they are disabled by default in the fixture. Also applies to: 97-97
- 189-191: Confirm whether the Weights & Biases (`wandb`) configuration is intended for use in the test environment. Typically, external logging services are not used in unit tests to avoid polluting experiment tracking with test runs.
- 188-188: The `save_ckpt_path` is set to `f"{tmp_path}"`, which seems to be a placeholder. Ensure that this is replaced with an actual temporary directory path during test setup, or if `tmp_path` is a fixture, ensure that it is defined and available in the test environment.
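For the `find_global_peaks_rough` comment earlier in this review, a unit test for below-threshold handling could look roughly like the following. This is a sketch under stated assumptions: the real function's signature is not shown in this thread, so `global_peak` is a hypothetical pure-Python stand-in that handles a single 2D confidence map, and returning NaN coordinates for a sub-threshold peak is an assumed convention rather than confirmed behavior.

```python
import math
from typing import List, Tuple


def global_peak(
    cm: List[List[float]], threshold: float = 0.1
) -> Tuple[Tuple[float, float], float]:
    """Return ((x, y), value) for the maximum of one confidence map.

    Hypothetical stand-in: if the maximum is below `threshold`, the
    coordinates are replaced with NaN (an assumed convention).
    """
    best_val = -math.inf
    best_xy = (math.nan, math.nan)
    for y, row in enumerate(cm):
        for x, val in enumerate(row):
            if val > best_val:
                best_val, best_xy = val, (float(x), float(y))
    if best_val < threshold:
        return (math.nan, math.nan), best_val
    return best_xy, best_val


def test_peak_above_threshold():
    cm = [[0.0, 0.2, 0.0], [0.0, 0.0, 0.9]]
    (x, y), val = global_peak(cm, threshold=0.5)
    assert (x, y) == (2.0, 1.0)
    assert val == 0.9


def test_peak_below_threshold():
    cm = [[0.0, 0.2, 0.0], [0.0, 0.0, 0.3]]
    (x, y), val = global_peak(cm, threshold=0.5)
    # Coordinates are masked, but the raw peak value is still reported.
    assert math.isnan(x) and math.isnan(y)
    assert val == 0.3
```

The two tests differ only in the peak value relative to the threshold, which keeps the below-threshold branch covered explicitly.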
Review Status
Actionable comments generated: 0
Configuration used: CodeRabbit UI
Files selected for processing (2)
- sleap_nn/inference/peak_finding.py (1 hunks)
- tests/test_model_trainer.py (1 hunks)
Files skipped from review due to trivial changes (1)
- tests/test_model_trainer.py
Files skipped from review as they are similar to previous changes (1)
- sleap_nn/inference/peak_finding.py
Review Status
Actionable comments generated: 0
Configuration used: CodeRabbit UI
Files selected for processing (2)
- sleap_nn/model_trainer.py (1 hunks)
- tests/test_model_trainer.py (1 hunks)
Files skipped from review as they are similar to previous changes (2)
- sleap_nn/model_trainer.py
- tests/test_model_trainer.py
Review Status
Actionable comments generated: 9
Configuration used: CodeRabbit UI
Files selected for processing (2)
- sleap_nn/model_trainer.py (1 hunks)
- tests/test_model_trainer.py (1 hunks)
Files skipped from review as they are similar to previous changes (1)
- tests/test_model_trainer.py
Review Status
Actionable comments generated: 8
Configuration used: CodeRabbit UI
Files selected for processing (2)
- sleap_nn/model_trainer.py (1 hunks)
- tests/test_model_trainer.py (1 hunks)
Files skipped from review as they are similar to previous changes (1)
- tests/test_model_trainer.py
Review Status
Actionable comments generated: 4
Configuration used: CodeRabbit UI
Files selected for processing (4)
- docs/config.md (1 hunks)
- sleap_nn/model_trainer.py (1 hunks)
- tests/fixtures/datasets.py (2 hunks)
- tests/test_model_trainer.py (1 hunks)
Additional comments: 11
sleap_nn/model_trainer.py (6)
- 33-33: The typo in the argument description `trainer_cong` should be corrected to `trainer_config`.
    - (iii) trainer_cong: trainer configs like accelerator, optimiser params.
    + (iii) trainer_config: trainer configs like accelerator, optimiser params.
- 48-50: The assignment of `self.provider` to `LabelsReader` is still not dynamic. Consider implementing a factory pattern or a mapping to support extensibility.
- 115-116: Consider adding error handling around the directory creation to handle potential exceptions, such as permission issues.
- 155-159: The `xavier_init_weights` function should be extended to handle other types of layers and include checks to ensure that the layer has `weight` and `bias` attributes before attempting to initialize them.
- 189-190: The weight initialization is still hardcoded to use the Xavier method. Consider making this configurable to allow for different initialization methods based on the model or training requirements.
- 254-269: The optimizer and scheduler configuration should be made more flexible by allowing different types to be specified via the configuration file.
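Several of the comments above (the hardcoded `LabelsReader` provider, the fixed Xavier initialization, and the fixed optimizer and scheduler) ask for the same thing: select an implementation by name from the configuration. One common way to do that is a name-to-factory mapping. The sketch below is an illustration only. `PROVIDERS`, `get_provider`, and `VideoReader` are hypothetical names, and the stand-in classes do not reproduce the sleap_nn classes.

```python
from typing import Callable, Dict


class LabelsReader:
    """Stand-in for the provider named in the review (not the real class)."""

    def __init__(self, path: str) -> None:
        self.path = path


class VideoReader:
    """Hypothetical second provider, included only to show extensibility."""

    def __init__(self, path: str) -> None:
        self.path = path


# Name-to-factory mapping; supporting a new provider means adding one entry.
PROVIDERS: Dict[str, Callable[[str], object]] = {
    "LabelsReader": LabelsReader,
    "VideoReader": VideoReader,
}


def get_provider(name: str, path: str) -> object:
    """Build the provider registered under `name`, failing loudly if unknown."""
    try:
        factory = PROVIDERS[name]
    except KeyError:
        known = ", ".join(sorted(PROVIDERS))
        raise ValueError(
            f"Unknown provider {name!r}; expected one of: {known}"
        ) from None
    return factory(path)


provider = get_provider("LabelsReader", "tests/assets/minimal_instance.pkg.slp")
print(type(provider).__name__)
```

The same mapping shape would work for optimizers, schedulers, or weight initializers, with the configuration file supplying the key.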
tests/fixtures/datasets.py (1)
- 21-207: The `config` fixture is well-structured and provides a comprehensive configuration for testing. Ensure that the paths and values used here align with the actual test environment and intended test scenarios.
tests/test_model_trainer.py (4)
- 4-4: Duplicate import of `pytest` on line 6 and 15.
    - import pytest
- 13-26: The test `test_create_data_loader` appears to be well-structured and checks the creation of data loaders. Ensure that the assertions are valid and that the number of items in the data loaders matches the expected dataset size.
- 29-91: The test `test_trainer` seems to cover the training process, including the creation of checkpoint directories and files. Verify that the paths, configurations, and assertions are correct and that the test environment is properly isolated.
- 94-111: The test `test_topdown_centered_instance_model` checks the model's forward pass and training step. Confirm that the input shapes, loss calculation, and assertions are accurate and that the test covers the expected behavior of the model.
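Two points from this review, error handling around checkpoint directory creation and isolation of the test environment, can be illustrated together. A minimal sketch using only the standard library: `make_ckpt_dir` is a hypothetical helper, not the `ModelTrainer` code, and `tempfile.TemporaryDirectory` stands in for pytest's `tmp_path` fixture.

```python
import tempfile
from pathlib import Path


def make_ckpt_dir(save_ckpt_path: str) -> Path:
    """Create the checkpoint directory, surfacing filesystem errors clearly."""
    ckpt_dir = Path(save_ckpt_path)
    try:
        # parents=True creates missing parents; exist_ok=True tolerates reruns.
        ckpt_dir.mkdir(parents=True, exist_ok=True)
    except OSError as exc:  # includes PermissionError and FileExistsError
        raise RuntimeError(
            f"Cannot create checkpoint directory {ckpt_dir}: {exc}"
        ) from exc
    return ckpt_dir


# Everything is written under a temporary directory that is removed on exit,
# so the run leaves nothing behind in the repository.
with tempfile.TemporaryDirectory() as tmp:
    created = make_ckpt_dir(str(Path(tmp) / "run_1" / "checkpoints"))
    print(created.is_dir())
```

In a pytest test, passing `tmp_path` as the base directory gives the same isolation without the explicit context manager.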
Review Status
Actionable comments generated: 5
Configuration used: CodeRabbit UI
Files selected for processing (3)
- docs/config.md (1 hunks)
- sleap_nn/model_trainer.py (1 hunks)
- tests/test_model_trainer.py (1 hunks)
Files skipped from review as they are similar to previous changes (1)
- tests/test_model_trainer.py
Additional comments: 1
docs/config.md (1)
- 1-102: The documentation for the configuration file is clear and well-structured, providing detailed explanations for each configuration option. Ensure that the provided details match the actual implementation and usage within the codebase.
Verification successful
The initial search was too narrow, focusing only on Markdown files. To verify that the documented config keys match the keys used in the codebase, the search should include all file types, particularly Python files where these keys are likely to be used. Let's perform a broader search.
The search results confirm that the keys `data_config`, `model_config`, and `trainer_config` are indeed used within the codebase, specifically in the `model_trainer.py` and `pipelines.py` files. The references to these keys in the code align with the documentation provided in `config.md`, indicating that the documentation is consistent with the actual implementation.
Final response:
The documentation in `config.md` is consistent with the usage of configuration keys within the codebase, as verified by the search results from `model_trainer.py` and `pipelines.py`.
Scripts Executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Verify that the documented config keys match the keys used in the codebase
rg --type md "data_config|model_config|trainer_config" sleap_nn/
Length of output: 66
Script:
#!/bin/bash
# Broaden the search to include all file types for the documented config keys
rg "data_config|model_config|trainer_config" sleap_nn/
Length of output: 6765
Review Status
Actionable comments generated: 0
Configuration used: CodeRabbit UI
Files selected for processing (4)
- docs/config.md (1 hunks)
- sleap_nn/model_trainer.py (1 hunks)
- tests/fixtures/datasets.py (2 hunks)
- tests/test_model_trainer.py (1 hunks)
Files skipped from review as they are similar to previous changes (2)
- sleap_nn/model_trainer.py
- tests/test_model_trainer.py
Additional comments: 3
tests/fixtures/datasets.py (2)
- 3-3: The import of `OmegaConf` is correctly added to support the new `config` fixture.
- 21-210: The `config` fixture is well-constructed with a comprehensive set of parameters for testing. The use of f-strings for dynamic path construction in `labels_path` is noted and appreciated for its flexibility.
docs/config.md (1)
- 1-107: The documentation for the configuration file is clear and comprehensive, detailing the parameters for `data_config`, `model_config`, and `trainer_config`. It provides users with the necessary information to effectively configure model training.
Review Status
Actionable comments generated: 0
Configuration used: CodeRabbit UI
Files selected for processing (2)
- sleap_nn/model_trainer.py (1 hunks)
- tests/fixtures/datasets.py (2 hunks)
Files skipped from review as they are similar to previous changes (2)
- sleap_nn/model_trainer.py
- tests/fixtures/datasets.py
Review Status
Actionable comments generated: 0
Configuration used: CodeRabbit UI
Files selected for processing (1)
- sleap_nn/model_trainer.py (1 hunks)
Files skipped from review as they are similar to previous changes (1)
- sleap_nn/model_trainer.py
Issue #27
Summary by CodeRabbit
New Features
- `ModelTrainer` class for streamlined model training using PyTorch Lightning.
Bug Fixes
Documentation
- `config.md` file detailing the setup for model training configurations.
Tests
- `ModelTrainer` and `TopDownCenteredInstanceModel` functionalities.