[RetinaNet] Image Converter and ObjectDetector #1906

Open
wants to merge 10 commits into
base: master

Conversation

Collaborator

@sineeli sineeli commented Oct 3, 2024

This PR covers preprocessor for RetinaNet object detector and RetinaNet model itself. #1756

  1. ImageObjectDetector
  2. ImageObjectDetectorPreprocessor
  3. RetinaNetObjectDetector
  4. RetinaNetObjectDetectorPreprocessor
  5. RetinaNetImageConverter

Collaborator

@divyashreepathihalli divyashreepathihalli left a comment

Thanks for the PR @sineeli!! Looks generally good!! I have left a few comments.
Also, I want to make sure the code usage is updated to reflect the correct implementation: https://docs.google.com/document/d/15FUEP_vNehwLWJLragXhPFkYcbmmpbo0NYh5vL6q1xA/edit?tab=t.0

keras_hub/src/models/image_object_detector.py Outdated
metrics=None,
**kwargs,
):
"""Configures the `ImageSegmenter` task for training.
Collaborator

Update the docstring for object detection; it still refers to `ImageSegmenter`.

keras_hub/src/models/image_object_detector_preprocessor.py Outdated
object detection tasks. It is intended to be paired with a
`keras_hub.models.ImageObjectDetector` task.

All `ImageObjectDetectorPreprocessor` take inputs three inputs, `x`, `y`, and
Collaborator

"All ImageObjectDetectorPreprocessor take inputs three inputs" -> "The ImageObjectDetectorPreprocessor class accepts three inputs: x, y, and..."

Collaborator Author

Done

keras_hub/src/models/image_object_detector_preprocessor.py Outdated
data_format = standardize_data_format(data_format)
input_levels = [int(level[1]) for level in backbone.pyramid_outputs]
backbone_max_level = min(max(input_levels), max_level)
image_encoder = keras.Model(
Collaborator

Ooh, we have an image_encoder here, haha. Any suggestions for a name other than image_encoder?

Collaborator Author

for input args: image_encoder (Backbone)
for pyramid outputs extraction: feature_extractor (Resnet with feature pyramid outputs)
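
For readers following the snippet under review, the level arithmetic can be sketched in plain Python. This is only an illustration, assuming pyramid output names of the form `P3`, `P4`, `P5`; the helper name `resolve_max_level` and the sample names are hypothetical and not taken from the PR. One thing the sketch makes visible: `int(name[1])` reads a single character, so a two-digit level such as `P10` would be parsed as 1.

```python
def resolve_max_level(pyramid_output_names, max_level):
    """Mirror the two lines under review (hypothetical helper, not PR code)."""
    # Parse the single digit after the leading letter, e.g. "P3" -> 3.
    input_levels = [int(name[1]) for name in pyramid_output_names]
    # The backbone cannot supply levels above its highest output,
    # and the caller may cap the level further with max_level.
    backbone_max_level = min(max(input_levels), max_level)
    return input_levels, backbone_max_level


# Assumed example names; a real backbone may label its outputs differently.
levels, top = resolve_max_level(["P3", "P4", "P5"], max_level=7)
print(levels, top)
```

With these assumed inputs the backbone tops out at level 5, so any levels between 5 and `max_level` would have to come from somewhere other than the backbone outputs.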

keras_hub/src/models/retinanet/retinanet_label_encoder.py Outdated
keras_hub/src/tests/test_case.py Outdated
Collaborator Author

sineeli commented Oct 4, 2024

@divyashreepathihalli

I kept some layers (FeaturePyramid, RetinaNetLabelEncoder, BoxMatcher, and NonMaxSupression) unexposed as layer APIs. We can expose them once all the models are ported in, and move them to a centralized layers/modeling/ folder.

@sineeli sineeli requested a review from fchollet October 4, 2024 18:54

The `ImageObjectDetector` tasks wrap a `keras_hub.models.Backbone` and
a `keras_hub.models.Preprocessor` to create a model that can be used for
image classification. `ImageObjectDetector` tasks take an additional
Collaborator

This should say object detection, not image classification.

anchor_size: float. Scale of size of the base anchor relative to the
feature stride 2^level.
anchor_generator: A `keras_hub.layers.AnchorGenerator`.
bounding_box_format: str. TODO:
Collaborator

you can still add a line of description here for the arg.
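
The `anchor_size` description in the docstring above ("scale of size of the base anchor relative to the feature stride 2^level") can be read as a simple product. The sketch below is a minimal illustration under that reading; the function name `base_anchor_sizes`, the `min_level` parameter, and the value `anchor_size=4.0` are assumptions chosen for the example, not values confirmed by the PR.

```python
def base_anchor_sizes(min_level, max_level, anchor_size):
    """Return {level: base anchor side length in pixels} (hypothetical helper)."""
    sizes = {}
    for level in range(min_level, max_level + 1):
        stride = 2**level  # feature stride at this pyramid level
        sizes[level] = anchor_size * stride
    return sizes


# Illustrative values only.
sizes = base_anchor_sizes(min_level=3, max_level=7, anchor_size=4.0)
print(sizes)
# {3: 32.0, 4: 64.0, 5: 128.0, 6: 256.0, 7: 512.0}
```

Under this reading each level's base anchor doubles in size along with the stride, so coarser feature maps are responsible for larger objects.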

`RetinaNetObjectDetectorPreprocessor` class or a custom preprocessor.
activation: Optional. The activation function to be used in the
classification head.
head_dtype: Optional. The data type for the prediction heads.
Collaborator

The dtype will be the same for the model and the head, right? Can we just pass dtype here?

)

# === Functional Model ===
image_input = keras.layers.Input(backbone.image_shape, name="images")
Collaborator

move layers up under === Layers ===

@divyashreepathihalli divyashreepathihalli added the kokoro:force-run Runs Tests on GPU label Oct 5, 2024
@kokoro-team kokoro-team removed the kokoro:force-run Runs Tests on GPU label Oct 5, 2024
4 participants