-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SDK/CLI: Add ability to call auto-annotation functions with a custom threshold #8688
Conversation
Important Review skippedAuto incremental reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the WalkthroughA new Changes
Poem
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
2085f22
to
f4e131e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (16)
changelog.d/20241112_201034_roman_aa_threshold.md (1)
3-9
: Consider mentioning backward compatibilityThe changelog entry is well-written and comprehensive. However, it would be helpful to explicitly mention that this is a backward-compatible change, as this is important information for users considering the upgrade.
Consider adding a note about backward compatibility:
and use AA functions that support object filtering based on confidence levels. Updated the builtin functions in `cvat_sdk.auto_annotation.functions` - to support filtering via this parameter + to support filtering via this parameter. This change is backward-compatibletests/python/cli/threshold_function.py (2)
9-13
: Add docstring to explain the test specification.While the specification is valid for testing, adding a docstring would help explain:
- The purpose of this test specification
- Why the label ID is set to 0
- How this specification relates to threshold testing
spec = cvataa.DetectionFunctionSpec( + """Test specification for threshold validation. + Uses a single 'car' label with ID 0 to verify threshold-based detection behavior.""" labels=[ cvataa.label_spec("car", 0), ], )
16-21
: Document the test function and explain magic numbers.The detection function needs documentation to explain:
- Its purpose in testing threshold functionality
- The meaning of the hardcoded coordinates [1,1,1]
- How it uses the threshold value for testing
def detect( context: cvataa.DetectionFunctionContext, image: PIL.Image.Image ) -> list[models.LabeledShapeRequest]: + """Test detection function that validates threshold behavior. + + Returns a single rectangle with confidence score equal to the threshold value. + The rectangle coordinates [1,1,1] are fixed values used only for testing purposes. + + Args: + context: Detection context containing the threshold value + image: Input image (unused in this test) + + Returns: + List containing a single rectangle with the threshold as its confidence score + """ return [ cvataa.rectangle(0, [context.threshold, 1, 1, 1]), ]Also, consider using named constants for the hardcoded coordinates to improve readability:
+TEST_WIDTH = 1 +TEST_HEIGHT = 1 +TEST_POSITION = 1 def detect( context: cvataa.DetectionFunctionContext, image: PIL.Image.Image ) -> list[models.LabeledShapeRequest]: return [ - cvataa.rectangle(0, [context.threshold, 1, 1, 1]), + cvataa.rectangle(0, [context.threshold, TEST_POSITION, TEST_WIDTH, TEST_HEIGHT]), ]cvat-sdk/cvat_sdk/auto_annotation/functions/torchvision_detection.py (1)
33-33
: Consider adding threshold validation and documentationThe threshold implementation looks good and efficiently filters detection results. However, consider these improvements:
- Add validation for negative threshold values
- Add docstring explaining the threshold parameter and its valid range
Consider applying this enhancement:
def detect( self, context: cvataa.DetectionFunctionContext, image: PIL.Image.Image ) -> list[models.LabeledShapeRequest]: + """ + Detect objects in the image using the model. + + Args: + context: Detection context containing parameters like threshold + image: Input image to process + + The threshold parameter (context.threshold) filters detection results + by confidence score. Valid range is [0, 1]. Default is 0 (no filtering). + + Returns: + List of labeled shape requests for detected objects + """ threshold = context.threshold or 0 + if threshold < 0 or threshold > 1: + raise ValueError(f"Threshold must be between 0 and 1, got {threshold}")Also applies to: 39-40
cvat-cli/src/cvat_cli/_internal/parsers.py (1)
56-65
: Add docstring to document the function's purpose and behavior.The implementation looks correct, but adding a docstring would improve maintainability and help users understand the function's purpose, parameters, return value, and possible exceptions.
Consider adding this documentation:
def parse_threshold(s: str) -> float: + """Parse and validate a confidence threshold value. + + Args: + s: String representation of the threshold value + + Returns: + float: The parsed threshold value between 0 and 1 + + Raises: + ArgumentTypeError: If the input is not a valid number or outside the range [0,1] + """ try: value = float(s) except ValueError as e:cvat-sdk/cvat_sdk/auto_annotation/functions/torchvision_keypoint_detection.py (1)
41-41
: Consider adding parameter validation and documentationThe threshold implementation looks good, but could benefit from:
- Adding a docstring explaining the threshold parameter and its valid range
- Adding validation to ensure the threshold is within a valid range (typically 0.0 to 1.0 for confidence scores)
Consider adding validation and documentation like this:
def detect( self, context: cvataa.DetectionFunctionContext, image: PIL.Image.Image ) -> list[models.LabeledShapeRequest]: + """Detect keypoints in the image. + + Args: + context: Detection context containing parameters like threshold + image: Input image to process + + Returns: + List of labeled shapes with detected keypoints + + The threshold parameter (context.threshold) filters detections by confidence score. + Only detections with scores >= threshold are included. Defaults to 0 (no filtering). + Expected range is 0.0 to 1.0. + """ threshold = context.threshold or 0 + if threshold < 0 or threshold > 1: + raise ValueError(f"Threshold must be between 0 and 1, got {threshold}")Also applies to: 57-60
cvat-sdk/cvat_sdk/auto_annotation/interface.py (1)
54-69
: Consider adding runtime validation for threshold valuesThe threshold property is well-defined and documented. However, since the docstring specifies that non-None values must be between 0 and 1, consider adding runtime validation in implementing classes to ensure this constraint is met.
Example validation in implementing classes:
@property def threshold(self) -> Optional[float]: value = self._threshold # or however you store it if value is not None and not (0 <= value <= 1): raise ValueError("Threshold must be between 0 and 1") return valuesite/content/en/docs/api_sdk/sdk/auto-annotation.md (2)
Line range hint
71-88
: Enhance the example to demonstrate threshold usage.The example should be updated to show how to use the new threshold parameter when calling
annotate_task
. This would help users understand how to leverage this feature effectively.Add the threshold parameter to the example at the bottom of the file:
cvataa.annotate_task(client, 41617, TorchvisionDetectionFunction("fasterrcnn_resnet50_fpn_v2", "DEFAULT", box_score_thresh=0.5), + threshold=0.7, # Filter out detections with confidence below 0.7 )
208-210
: Enhance the threshold parameter documentation.While the documentation introduces the threshold parameter, it would be more helpful with additional details about:
- Valid threshold values (e.g., range between 0 and 1)
- A code example showing its usage
Consider expanding the documentation:
It's possible to pass a custom confidence threshold to the function via the -`threshold` parameter. +`threshold` parameter. The threshold should be a float between 0 and 1, where 1 +represents maximum confidence. For example: + +```python +# Only keep detections with confidence >= 0.7 +annotate_task(client, task_id, detection_function, threshold=0.7) +```cvat-sdk/cvat_sdk/auto_annotation/driver.py (2)
269-271
: Improve docstring formatting for better readability.Consider reformatting the threshold parameter documentation to match the style of other parameters:
- The threshold parameter must be None or a number between 0 and 1. It will be passed - to the function as the threshold attribute of the context object. + threshold: Optional number between 0 and 1. When provided, it will be passed to the + function as the threshold attribute of the context object.
277-278
: Add type hint for the threshold parameter.While the implementation is correct, consider adding a type hint to the function signature for better code maintainability:
def annotate_task( client: Client, task_id: int, function: DetectionFunction, *, pbar: Optional[ProgressReporter] = None, clear_existing: bool = False, allow_unmatched_labels: bool = False, - threshold: Optional[float] = None, + threshold: Optional[float] = None, # type: Optional[float] ) -> None:Also applies to: 296-296
tests/python/cli/test_cli.py (1)
351-363
: Test looks good, but could be more comprehensive.The test effectively verifies the basic threshold functionality, but consider these improvements:
- Add docstring explaining the test's purpose and expected behavior
- Add test cases for edge cases (e.g., invalid thresholds)
- Consider parameterizing the test to verify different threshold values
- Make the verification more implementation-agnostic instead of relying on specific point values
Example implementation with suggested improvements:
+ @pytest.mark.parametrize("threshold", [ + 0.0, # minimum + 0.5, # middle + 1.0, # maximum + ]) def test_auto_annotate_with_threshold(self, fxt_new_task: Task): + """ + Test auto-annotation with custom threshold parameter. + Verifies that: + 1. CLI accepts the threshold parameter + 2. Annotations are created with the specified threshold + """ annotations = fxt_new_task.get_annotations() assert not annotations.shapes self.run_cli( "auto-annotate", str(fxt_new_task.id), f"--function-module={__package__}.threshold_function", - "--threshold=0.75", + f"--threshold={threshold}", ) annotations = fxt_new_task.get_annotations() - assert annotations.shapes[0].points[0] == 0.75 + assert annotations.shapes # Verify annotations were created + # Verify threshold was applied (implementation-specific assertions) + assert all(shape.confidence >= threshold for shape in annotations.shapes) + def test_auto_annotate_with_invalid_threshold(self, fxt_new_task: Task): + """Test that invalid threshold values are properly handled.""" + with pytest.raises(Exception): # Replace with specific exception + self.run_cli( + "auto-annotate", + str(fxt_new_task.id), + f"--function-module={__package__}.threshold_function", + "--threshold=-0.5", # Invalid threshold + )tests/python/sdk/test_auto_annotation.py (4)
272-309
: Consider adding edge cases and improving assertion messages.The test coverage for the threshold parameter is good, but could be enhanced by:
- Testing edge cases (0.0 and 1.0)
- Adding descriptive messages to assertions for better test failure diagnostics
def test_threshold(self): spec = cvataa.DetectionFunctionSpec(labels=[]) received_threshold = None def detect( context: cvataa.DetectionFunctionContext, image: PIL.Image.Image ) -> list[models.LabeledShapeRequest]: nonlocal received_threshold received_threshold = context.threshold return [] cvataa.annotate_task( self.client, self.task.id, namespace(spec=spec, detect=detect), threshold=0.75, ) - assert received_threshold == 0.75 + assert received_threshold == 0.75, "Expected threshold value of 0.75 was not received" cvataa.annotate_task( self.client, self.task.id, namespace(spec=spec, detect=detect), ) - assert received_threshold is None + assert received_threshold is None, "Expected None threshold when not provided" - for bad_threshold in [-0.1, 1.1]: + for bad_threshold in [-0.1, 1.1, 0.0, 1.0]: with pytest.raises(ValueError): cvataa.annotate_task( self.client, self.task.id, namespace(spec=spec, detect=detect), threshold=bad_threshold, )
616-618
: Consider using more distinct confidence scores for better test coverage.The current scores (0.75 and 0.74) are very close to each other and to the test threshold (0.75). Using more distinct values would make the threshold filtering tests more robust.
- "scores": torch.tensor([0.75, 0.74]), + "scores": torch.tensor([0.9, 0.6]),
641-651
: Use more distinct confidence scores for keypoint detection tests.Similar to the object detection case, using more distinct confidence scores would make the threshold filtering tests more robust.
- "scores": torch.tensor([0.75, 0.74]), + "scores": torch.tensor([0.9, 0.6]),
716-716
: Consider testing different threshold values across test methods.Both test methods use the same threshold value (0.75). Testing different valid threshold values across different test methods would increase test coverage.
- threshold=0.75, + threshold=0.8, # Use a different threshold valueAlso applies to: 736-736
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (11)
changelog.d/20241112_201034_roman_aa_threshold.md
(1 hunks)cvat-cli/src/cvat_cli/_internal/commands.py
(4 hunks)cvat-cli/src/cvat_cli/_internal/parsers.py
(1 hunks)cvat-sdk/cvat_sdk/auto_annotation/driver.py
(4 hunks)cvat-sdk/cvat_sdk/auto_annotation/functions/torchvision_detection.py
(1 hunks)cvat-sdk/cvat_sdk/auto_annotation/functions/torchvision_keypoint_detection.py
(2 hunks)cvat-sdk/cvat_sdk/auto_annotation/interface.py
(2 hunks)site/content/en/docs/api_sdk/sdk/auto-annotation.md
(4 hunks)tests/python/cli/test_cli.py
(1 hunks)tests/python/cli/threshold_function.py
(1 hunks)tests/python/sdk/test_auto_annotation.py
(5 hunks)
🔇 Additional comments (14)
tests/python/cli/threshold_function.py (2)
1-8
: LGTM: License and imports are properly structured.
The copyright notice, license declaration, and imports are well-organized and contain only the necessary dependencies.
16-21
: Verify edge cases in threshold testing.
While the current implementation tests basic threshold functionality, consider adding test cases for:
- Edge case thresholds (0.0 and 1.0)
- Invalid threshold values (negative or >1.0)
- Multiple detections with varying confidence scores
✅ Verification successful
Edge Cases in Threshold Testing Verified
Edge cases for threshold values are comprehensively covered in existing test files.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check for other test files that might need similar threshold testing
rg -l "DetectionFunctionContext" "tests/"
# Look for existing threshold-related test cases
rg "threshold" "tests/" -A 5
Length of output: 201698
cvat-sdk/cvat_sdk/auto_annotation/functions/torchvision_detection.py (1)
30-32
: LGTM! Type annotation improvement
The addition of type annotation for the context
parameter enhances code clarity and IDE support.
cvat-cli/src/cvat_cli/_internal/parsers.py (2)
57-64
: LGTM! Robust error handling implementation.
The error handling is well-implemented with:
- Clear error messages for both invalid number format and range violations
- Proper exception chaining
- Appropriate use of inclusive bounds for the [0,1] range check
56-56
: Verify integration with CLI commands and auto-annotation.
Let's confirm that this parser is properly integrated with the CLI commands and auto-annotation functionality.
cvat-sdk/cvat_sdk/auto_annotation/functions/torchvision_keypoint_detection.py (1)
38-40
: LGTM! Improved type safety with explicit context typing
The explicit typing of the context parameter improves code clarity and helps catch potential type-related issues at development time.
cvat-sdk/cvat_sdk/auto_annotation/interface.py (1)
7-7
: LGTM: Import statement is correctly placed
The addition of Optional
from typing is appropriate for the new threshold property.
site/content/en/docs/api_sdk/sdk/auto-annotation.md (1)
121-127
: LGTM! Documentation is clear and comprehensive.
The documentation for the new threshold
parameter is well-written, accurately describing both its purpose and optional nature.
cvat-sdk/cvat_sdk/auto_annotation/driver.py (1)
223-226
: LGTM! Well-structured class modification.
The changes to _DetectionFunctionContextImpl
are well-implemented:
- Appropriate use of
@attrs.frozen(kw_only=True)
for immutability and keyword-only arguments - Correct type annotation for the threshold parameter
tests/python/cli/test_cli.py (1)
358-359
: Verify the threshold_function module exists and handles the threshold parameter correctly.
The test references a threshold_function
module that should exist in the same package. Let's verify its implementation and integration.
cvat-cli/src/cvat_cli/_internal/commands.py (4)
23-29
: LGTM!
The import of parse_threshold
is correctly placed with other parser imports and follows the alphabetical order.
472-478
: LGTM!
The --threshold
argument is well-configured with:
- Clear help text explaining its purpose
- Appropriate default value of None
- Input validation through parse_threshold
489-489
: LGTM!
The threshold
parameter is correctly added to the method signature with appropriate type annotation.
514-514
: Verify the threshold parameter handling in upstream function
The threshold parameter is correctly forwarded to cvataa.annotate_task
.
Let's verify the threshold parameter handling in the upstream function:
✅ Verification successful
Threshold parameter is correctly handled in annotate_task
The threshold
parameter is properly validated and utilized within annotate_task
.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check how the threshold parameter is handled in cvataa.annotate_task
# Search for annotate_task function definition and its usage
rg -A 10 "def annotate_task" --glob "cvat_sdk/**/*.py"
Length of output: 205
Script:
#!/bin/bash
# Description: Re-check how the threshold parameter is handled in cvataa.annotate_task without restrictive glob
# Search for annotate_task function definition and its usage across the entire codebase
rg "def annotate_task" -A 10
Length of output: 790
Script:
#!/bin/bash
# Description: Extract the full implementation of annotate_task to verify threshold handling
# Retrieve the annotate_task function definition and its body
rg -A 30 "def annotate_task" cvat-sdk/cvat_sdk/auto_annotation/driver.py
Length of output: 1256
Script:
#!/bin/bash
# Description: Extract the complete implementation of annotate_task to verify threshold handling
# Retrieve the annotate_task function definition and its entire body by increasing the number of lines fetched
rg -A 100 "def annotate_task" cvat-sdk/cvat_sdk/auto_annotation/driver.py
Length of output: 3400
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #8688 +/- ##
========================================
Coverage 74.19% 74.20%
========================================
Files 401 401
Lines 43502 43510 +8
Branches 3950 3950
========================================
+ Hits 32278 32288 +10
+ Misses 11224 11222 -2
|
…threshold My main motivation is to support a future feature, but I think this is a good thing in its own right. While it's already possible to create an AA function that lets you customize the threshold (by adding a creation parameter), confidence scoring is very common in detection models, so it makes sense to make this easier to support, both for the implementer of the function, and for its user.
06be1ca
to
f92da49
Compare
f92da49
to
13f4131
Compare
Quality Gate failedFailed conditions See analysis details on SonarQube Cloud Catch issues before they fail your Quality Gate with our IDE extension SonarQube for IDE |
Motivation and context
My main motivation is to support a future feature, but I think this is a good thing in its own right.
While it's already possible to create an AA function that lets you customize the threshold (by adding a creation parameter), confidence scoring is very common in detection models, so it makes sense to make this easier to support, both for the implementer of the function, and for its user.
How has this been tested?
Unit tests.
Checklist
develop
branch[ ] I have linked related issues (see GitHub docs)[ ] I have increased versions of npm packages if it is necessary(cvat-canvas,
cvat-core,
cvat-data and
cvat-ui)
License
Feel free to contact the maintainers if that's a concern.
Summary by CodeRabbit
Release Notes
New Features
threshold
parameter for automatic annotation, allowing users to filter object detections based on confidence levels.--threshold
option in the auto-annotation command.Bug Fixes
Tests
Documentation