Feature/sg 747 support predict video full pipeline master #829

Merged

Changes from 1 commit (of 83 commits)
43176f2
wip
Louis-Dupont Mar 26, 2023
5a0023b
move to imageprocessors
Louis-Dupont Mar 26, 2023
1aacdfa
Merge branch 'master' into feature/SG-747-add_image_processor
Louis-Dupont Mar 26, 2023
89c48a5
wip
Louis-Dupont Mar 27, 2023
6958813
add back changes
Louis-Dupont Mar 27, 2023
4ae57b1
making it work fully for yolox and almost for ppyoloe
Louis-Dupont Mar 27, 2023
2700b80
minor change
Louis-Dupont Mar 27, 2023
b48c596
working for det
Louis-Dupont Mar 28, 2023
e5366c5
Merge branch 'master' into feature/SG-747-add_preprocessing
Louis-Dupont Mar 28, 2023
0ac4fe8
cleaning
Louis-Dupont Mar 28, 2023
24c16c8
clean
Louis-Dupont Mar 28, 2023
2735cf8
undo
Louis-Dupont Mar 28, 2023
3587cee
replace empty with none
Louis-Dupont Mar 28, 2023
4a50611
Merge branch 'master' into feature/SG-747-add_preprocessing
Louis-Dupont Mar 28, 2023
6a4250e
add _get_shift_params
Louis-Dupont Mar 28, 2023
061aa5d
minor doc change
Louis-Dupont Mar 28, 2023
976d292
cleaning wip
Louis-Dupont Mar 28, 2023
23be205
working for multiple images
Louis-Dupont Mar 28, 2023
c30d95e
add ppyoloe
Louis-Dupont Mar 29, 2023
0031494
Merge branch 'master' into feature/SG-747-add_preprocessing
Louis-Dupont Mar 29, 2023
2464398
replace pydantic with dataclasses and fix typing
Louis-Dupont Mar 29, 2023
d4c0774
add docstrings
Louis-Dupont Mar 29, 2023
cf19765
doc improvment and use get_shift_params in transforms
Louis-Dupont Mar 29, 2023
7e8ad22
add tests
Louis-Dupont Mar 29, 2023
90f708e
improve comment
Louis-Dupont Mar 29, 2023
8830ba9
rename
Louis-Dupont Mar 29, 2023
efd58d4
Merge branch 'master' into feature/SG-747-add_preprocessing
Louis-Dupont Mar 29, 2023
4990938
Merge branch 'feature/SG-747-add_preprocessing' into feature/SG-747-a…
Louis-Dupont Mar 29, 2023
7638fbb
wip
Louis-Dupont Mar 29, 2023
74379c6
add option to keep ratio in rescale
Louis-Dupont Mar 29, 2023
efbde36
Merge branch 'master' into feature/SG-747-add_preprocessing
Louis-Dupont Mar 29, 2023
efd023e
make functions private
Louis-Dupont Mar 29, 2023
008b77b
remove DetectionPaddedRescale
Louis-Dupont Mar 29, 2023
77addfa
fix doc
Louis-Dupont Mar 29, 2023
13e686d
Merge branch 'feature/SG-747-add_preprocessing' into feature/SG-747-a…
Louis-Dupont Mar 29, 2023
239a0af
big commit with wrong things
Louis-Dupont Mar 29, 2023
90d1bf2
try undo bad change
Louis-Dupont Mar 29, 2023
e0fdae4
doc
Louis-Dupont Mar 29, 2023
0d9a101
minor doc
Louis-Dupont Mar 29, 2023
385ea57
add a lot of doc
Louis-Dupont Mar 29, 2023
eb5bd55
Merge branch 'master' into feature/SG-747-add_full_pipeline_with_prep…
Louis-Dupont Mar 29, 2023
d295f47
fix comment
Louis-Dupont Mar 29, 2023
c56b3c3
minor change
Louis-Dupont Mar 29, 2023
c2abaf8
first draft of load_video
Louis-Dupont Mar 30, 2023
a7837d5
adding save_video, some parts are still to be checked
Louis-Dupont Mar 30, 2023
a6e4c7a
wip
Louis-Dupont Mar 30, 2023
5628fe9
Merge branch 'master' into feature/SG-747-add_full_pipeline_with_prep…
Louis-Dupont Mar 30, 2023
8edc916
add __init__.py to pipelines
Louis-Dupont Mar 30, 2023
12eb78e
replace size with shape
Louis-Dupont Mar 30, 2023
875eaed
wip
Louis-Dupont Mar 30, 2023
974d7ca
Merge branch 'master' into feature/SG-747-support_predict_video
Louis-Dupont Apr 2, 2023
265a828
cleaning
Louis-Dupont Apr 2, 2023
d1188ef
Merge branch 'feature/SG-747-add_full_pipeline_with_preprocessing' in…
Louis-Dupont Apr 2, 2023
1dacc78
wip
Louis-Dupont Apr 2, 2023
d2be717
fix rgb to bgr and remove check
Louis-Dupont Apr 2, 2023
3ddf80b
Merge branch 'feature/SG-747-support_predict_video' into feature/SG-7…
Louis-Dupont Apr 2, 2023
e533022
almost working, missing batch
Louis-Dupont Apr 2, 2023
12c09de
Merge branch 'master' into feature/SG-747-support_predict_video_full_…
Louis-Dupont Apr 8, 2023
970526b
proposal of predict_video
Louis-Dupont Apr 9, 2023
d0f9daf
wip working on dete
Louis-Dupont Apr 9, 2023
06378dd
add yolox
Louis-Dupont Apr 9, 2023
1188997
add flag to visualize
Louis-Dupont Apr 9, 2023
c30ef16
update
Louis-Dupont Apr 10, 2023
edc8d78
add streaming
Louis-Dupont Apr 10, 2023
dca4a1d
improve streaming code
Louis-Dupont Apr 10, 2023
bee719b
docstring update
Louis-Dupont Apr 10, 2023
72c8d05
fix stream example
Louis-Dupont Apr 10, 2023
250cdc9
rename Results
Louis-Dupont Apr 10, 2023
00421ff
cleaning
Louis-Dupont Apr 10, 2023
71b45cf
rename stream to predict_webcam
Louis-Dupont Apr 11, 2023
6070c21
doc fixes
Louis-Dupont Apr 11, 2023
bcdeb0d
improve docstring and homogenize some names
Louis-Dupont Apr 11, 2023
d3ccfe8
rename _images_prediction_lst
Louis-Dupont Apr 11, 2023
ce6c089
improve doc
Louis-Dupont Apr 11, 2023
5d8e013
add doc
Louis-Dupont Apr 11, 2023
37e3e2b
minore change
Louis-Dupont Apr 11, 2023
9638ca0
Merge branch 'master' into feature/SG-747-support_predict_video_full_…
Louis-Dupont Apr 11, 2023
12d6a5d
fix image
Louis-Dupont Apr 11, 2023
1fc2974
Merge branch 'master' into feature/SG-747-support_predict_video_full_…
Louis-Dupont Apr 16, 2023
aa888f5
fix ci
Louis-Dupont Apr 16, 2023
86ed370
Merge branch 'master' into feature/SG-747-support_predict_video_full_…
Louis-Dupont Apr 16, 2023
c2e648c
fix merge
Louis-Dupont Apr 16, 2023
21b5534
reverse channel properly
Louis-Dupont Apr 16, 2023
add docstrings
Louis-Dupont committed Mar 29, 2023

Verified: this commit was signed with the committer's verified signature (Louis Dupont).
commit d4c0774cb9f26fe265933693da2854242e7deb44
30 changes: 15 additions & 15 deletions src/super_gradients/training/transforms/processing.py
@@ -5,13 +5,13 @@
 import numpy as np

 from super_gradients.training.transforms.utils import (
-    _rescale_image,
-    _rescale_bboxes,
-    _shift_image,
-    _shift_bboxes,
-    _rescale_and_pad_to_size,
-    _rescale_xyxy_bboxes,
-    _get_shift_params,
+    rescale_image,
+    rescale_bboxes,
+    shift_image,
+    shift_bboxes,
+    rescale_and_pad_to_size,
+    rescale_xyxy_bboxes,
+    get_shift_params,
 )


@@ -126,11 +126,11 @@ def __init__(self, output_size: Tuple[int, int], swap: Tuple[int, ...] = (2, 0,
         self.pad_value = pad_value

     def preprocess_image(self, image: np.ndarray) -> Tuple[np.ndarray, DetectionPaddedRescaleMetadata]:
-        rescaled_image, r = _rescale_and_pad_to_size(image=image, output_size=self.output_size, swap=self.swap, pad_val=self.pad_value)
+        rescaled_image, r = rescale_and_pad_to_size(image=image, output_size=self.output_size, swap=self.swap, pad_val=self.pad_value)
         return rescaled_image, DetectionPaddedRescaleMetadata(r=r)

     def postprocess_predictions(self, predictions: np.array, metadata=DetectionPaddedRescaleMetadata) -> np.array:
-        return _rescale_xyxy_bboxes(targets=predictions, r=1 / metadata.r)
+        return rescale_xyxy_bboxes(targets=predictions, r=1 / metadata.r)


 class DetectionPadToSize(Processing):
@@ -148,13 +148,13 @@ def __init__(self, output_size: Tuple[int, int], pad_value: int):
         self.pad_value = pad_value

     def preprocess_image(self, image: np.ndarray) -> Tuple[np.ndarray, DetectionPadToSizeMetadata]:
-        shift_h, shift_w, pad_h, pad_w = _get_shift_params(original_size=image.shape, output_size=self.output_size)
-        processed_image = _shift_image(image, pad_h, pad_w, self.pad_value)
+        shift_h, shift_w, pad_h, pad_w = get_shift_params(input_size=image.shape, output_size=self.output_size)
+        processed_image = shift_image(image, pad_h, pad_w, self.pad_value)

         return processed_image, DetectionPadToSizeMetadata(shift_h=shift_h, shift_w=shift_w)

     def postprocess_predictions(self, predictions: np.ndarray, metadata: DetectionPadToSizeMetadata) -> np.ndarray:
-        return _shift_bboxes(targets=predictions, shift_w=-metadata.shift_w, shift_h=-metadata.shift_h)
+        return shift_bboxes(targets=predictions, shift_w=-metadata.shift_w, shift_h=-metadata.shift_h)


 class _Rescale(Processing, ABC):
@@ -168,16 +168,16 @@ def __init__(self, output_shape: Tuple[int, int]):

     def preprocess_image(self, image: np.ndarray) -> Tuple[np.ndarray, RescaleMetadata]:
         sy, sx = self.output_shape[0] / image.shape[0], self.output_shape[1] / image.shape[1]
-        rescaled_image = _rescale_image(image, target_shape=self.output_shape)
+        rescaled_image = rescale_image(image, target_shape=self.output_shape)

         return rescaled_image, RescaleMetadata(original_size=image.shape[:2], sy=sy, sx=sx)


 class DetectionRescale(_Rescale):
     def postprocess_predictions(self, predictions: np.ndarray, metadata: RescaleMetadata) -> np.ndarray:
-        return _rescale_bboxes(targets=predictions, scale_factors=(1 / metadata.sy, 1 / metadata.sx))
+        return rescale_bboxes(targets=predictions, scale_factors=(1 / metadata.sy, 1 / metadata.sx))


 class SegmentationRescale(_Rescale):
     def postprocess_predictions(self, predictions: np.ndarray, metadata: RescaleMetadata) -> np.ndarray:
-        return _rescale_image(predictions, target_shape=metadata.original_size)
+        return rescale_image(predictions, target_shape=metadata.original_size)
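
For readers skimming the diff, here is a minimal sketch of the round trip these Processing classes implement, using only the constructor and method signatures visible above (sizes and box values are illustrative):

import numpy as np

from super_gradients.training.transforms.processing import DetectionPadToSize

# Pad-to-size preprocessing: center the image in a 640x640 canvas and remember the shift.
processing = DetectionPadToSize(output_size=(640, 640), pad_value=114)

image = np.zeros((480, 600, 3), dtype=np.uint8)  # H x W x C input frame
padded_image, metadata = processing.preprocess_image(image)

# Model predictions in [x1, y1, x2, y2, class_id, ...] format, in the padded image's coordinates.
predictions = np.array([[100.0, 160.0, 220.0, 300.0, 0.0]])

# Undo the padding shift so the boxes land back in the original image's coordinate frame.
restored = processing.postprocess_predictions(predictions, metadata)
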
30 changes: 15 additions & 15 deletions src/super_gradients/training/transforms/transforms.py
@@ -20,12 +20,12 @@
 from super_gradients.training.datasets.data_formats.formats import filter_on_bboxes, ConcatenatedTensorFormat
 from super_gradients.training.datasets.data_formats.default_formats import XYXY_LABEL, LABEL_CXCYWH
 from super_gradients.training.transforms.utils import (
-    _rescale_and_pad_to_size,
-    _rescale_image,
-    _rescale_bboxes,
-    _shift_image,
-    _shift_bboxes,
-    _rescale_xyxy_bboxes,
+    rescale_and_pad_to_size,
+    rescale_image,
+    rescale_bboxes,
+    shift_image,
+    shift_bboxes,
+    rescale_xyxy_bboxes,
 )

 IMAGE_RESAMPLE_MODE = Image.BILINEAR
@@ -741,9 +741,9 @@ def __call__(self, sample: dict) -> dict:
         img, targets, crowd_targets = sample["image"], sample["target"], sample.get("crowd_target")
         img, shift_w, shift_h = self._apply_to_image(img, final_shape=self.output_size, pad_value=self.pad_value)
         sample["image"] = img
-        sample["target"] = _shift_bboxes(targets=targets, shift_w=shift_w, shift_h=shift_h)
+        sample["target"] = shift_bboxes(targets=targets, shift_w=shift_w, shift_h=shift_h)
         if crowd_targets is not None:
-            sample["crowd_target"] = _shift_bboxes(targets=crowd_targets, shift_w=shift_w, shift_h=shift_h)
+            sample["crowd_target"] = shift_bboxes(targets=crowd_targets, shift_w=shift_w, shift_h=shift_h)
         return sample

     def _apply_to_image(self, image, final_shape: Tuple[int, int], pad_value: int):
@@ -759,7 +759,7 @@ def _apply_to_image(self, image, final_shape: Tuple[int, int], pad_value: int):
         pad_h = (shift_h, pad_h - shift_h)
         pad_w = (shift_w, pad_w - shift_w)

-        _shift_image(image, pad_h, pad_w, pad_value)
+        shift_image(image, pad_h, pad_w, pad_value)
         return image, shift_w, shift_h


@@ -785,12 +785,12 @@ def __init__(self, input_dim: Tuple, swap: Tuple[int, ...] = (2, 0, 1), max_targ

     def __call__(self, sample: dict) -> dict:
         img, targets, crowd_targets = sample["image"], sample["target"], sample.get("crowd_target")
-        img, r = _rescale_and_pad_to_size(img, self.input_dim, self.swap, self.pad_value)
+        img, r = rescale_and_pad_to_size(img, self.input_dim, self.swap, self.pad_value)

         sample["image"] = img
-        sample["target"] = _rescale_xyxy_bboxes(targets, r)
+        sample["target"] = rescale_xyxy_bboxes(targets, r)
         if crowd_targets is not None:
-            sample["crowd_target"] = _rescale_xyxy_bboxes(crowd_targets, r)
+            sample["crowd_target"] = rescale_xyxy_bboxes(crowd_targets, r)
         return sample


@@ -838,10 +838,10 @@ def __call__(self, sample: dict) -> dict:

         sy, sx = (self.output_shape[0] / image.shape[0], self.output_shape[1] / image.shape[1])

-        sample["image"] = _rescale_image(image=image, target_shape=self.output_shape)
-        sample["target"] = _rescale_bboxes(targets, scale_factors=(sy, sx))
+        sample["image"] = rescale_image(image=image, target_shape=self.output_shape)
+        sample["target"] = rescale_bboxes(targets, scale_factors=(sy, sx))
         if crowd_targets is not None:
-            sample["crowd_target"] = _rescale_bboxes(crowd_targets, scale_factors=(sy, sx))
+            sample["crowd_target"] = rescale_bboxes(crowd_targets, scale_factors=(sy, sx))
         return sample
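
The dataset-side transforms above keep the same sample-dict convention; the snippet below mimics the rescale branch directly with the renamed helpers (a hedged sketch of the pattern, not one of the transform classes themselves):

import numpy as np

from super_gradients.training.transforms.utils import rescale_image, rescale_bboxes

sample = {
    "image": np.zeros((480, 640, 3), dtype=np.uint8),
    "target": np.array([[10.0, 20.0, 110.0, 220.0, 1.0]]),  # [x1, y1, x2, y2, class_id]
}

output_shape = (320, 320)
sy, sx = output_shape[0] / sample["image"].shape[0], output_shape[1] / sample["image"].shape[1]

# Same pattern as the __call__ above: resize the image, then scale the boxes by the same factors.
sample["image"] = rescale_image(sample["image"], target_shape=output_shape)
sample["target"] = rescale_bboxes(sample["target"], scale_factors=(sy, sx))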


58 changes: 43 additions & 15 deletions src/super_gradients/training/transforms/utils.py
@@ -6,34 +6,62 @@
 from super_gradients.training.utils.detection_utils import xyxy2cxcywh, cxcywh2xyxy


-def _rescale_bboxes(targets: np.array, scale_factors: Tuple[float, float]) -> np.array:
-    """DetectionRescale targets to given scale factors."""
+def rescale_image(image: np.ndarray, target_shape: Tuple[float, float]) -> np.ndarray:
+    """Rescale image to target_shape, without preserving aspect ratio.

-    targets = targets.astype(np.float32, copy=True) if len(targets) > 0 else np.zeros((0, 5), dtype=np.float32)
+    :param image: Image to rescale.
+    :param target_shape: Target shape to rescale to.
+    :return: Rescaled image.
+    """
+    return cv2.resize(image, dsize=(int(target_shape[1]), int(target_shape[0])), interpolation=cv2.INTER_LINEAR).astype(np.uint8)
+
+
+def rescale_bboxes(targets: np.array, scale_factors: Tuple[float, float]) -> np.array:
+    """Rescale bboxes to given scale factors, without preserving aspect ratio.
+
+    :param targets: Targets to rescale (N, 4+), where target[:, :4] is the bounding box coordinates.
+    :param scale_factors: Tuple of (sy, sx) scale factors to rescale to.
+    :return: Rescaled targets.
+    """
+
+    targets = targets.astype(np.float32, copy=True)

     sy, sx = scale_factors
-    targets[:, 0:4] *= np.array([[sx, sy, sx, sy]], dtype=targets.dtype)
+    targets[:, :4] *= np.array([[sx, sy, sx, sy]], dtype=targets.dtype)
     return targets


-def _rescale_image(image: np.ndarray, target_shape: Tuple[float, float]) -> np.ndarray:
-    """DetectionRescale image to target_shape, without preserving aspect ratio."""
-    return cv2.resize(image, dsize=(int(target_shape[1]), int(target_shape[0])), interpolation=cv2.INTER_LINEAR).astype(np.uint8)
-
+def get_shift_params(input_size: Tuple[int, int], output_size: Tuple[int, int]) -> Tuple[int, int, Tuple[int, int], Tuple[int, int]]:
+    """Get shift parameters for resizing an image to given output size, while preserving aspect ratio using padding.

-def _get_shift_params(original_size: Tuple[int, int], output_size: Tuple[int, int]) -> Tuple[int, int, Tuple[int, int], Tuple[int, int]]:
-    pad_h, pad_w = output_size[0] - original_size[0], output_size[1] - original_size[1]
+    :param input_size: Size of the input image.
+    :param output_size: Size to resize to.
+    :return:
+        - shift_h: Horizontal shift.
+        - shift_w: Vertical shift.
+        - pad_h: Horizontal padding.
+        - pad_w: Vertical padding.
+    """
+    pad_h, pad_w = output_size[0] - input_size[0], output_size[1] - input_size[1]
     shift_h, shift_w = pad_h // 2, pad_w // 2
     pad_h = (shift_h, pad_h - shift_h)
     pad_w = (shift_w, pad_w - shift_w)
     return shift_h, shift_w, pad_h, pad_w


-def _shift_image(image: np.ndarray, pad_h: Tuple[int, int], pad_w: Tuple[int, int], pad_value: int) -> np.ndarray:
+def shift_image(image: np.ndarray, pad_h: Tuple[int, int], pad_w: Tuple[int, int], pad_value: int) -> np.ndarray:
+    """Shift bboxes with respect to padding coordinates.
+
+    :param image: Image to shift
+    :param pad_h: Padding to add to height
+    :param pad_w: Padding to add to width
+    :param pad_value: Padding value
+    :return: Image shifted according to padding coordinates.
+    """
     return np.pad(image, (pad_h, pad_w, (0, 0)), "constant", constant_values=pad_value)


-def _shift_bboxes(targets: np.array, shift_w: float, shift_h: float) -> np.array:
+def shift_bboxes(targets: np.array, shift_w: float, shift_h: float) -> np.array:
     """Shift bboxes with respect to padding values.

     :param targets: Bboxes to transform of shape (N, 5+), in format [x1, y1, x2, y2, class_id, ...]
@@ -48,7 +76,7 @@ def _shift_bboxes(targets: np.array, shift_w: float, shift_h: float) -> np.array
     return np.concatenate((boxes, labels), 1)


-def _rescale_xyxy_bboxes(targets: np.array, r: float) -> np.array:
+def rescale_xyxy_bboxes(targets: np.array, r: float) -> np.array:
     """Scale targets to given scale factors.

     :param targets: Bboxes to transform of shape (N, 5+), in format [x1, y1, x2, y2, class_id, ...]
@@ -63,7 +91,7 @@ def _rescale_xyxy_bboxes(targets: np.array, r: float) -> np.array:
     return np.concatenate((boxes, targets), 1)


-def _rescale_and_pad_to_size(image: np.ndarray, output_size: Tuple[int, int], swap: Tuple[int] = (2, 0, 1), pad_val: int = 114) -> Tuple[np.ndarray, float]:
+def rescale_and_pad_to_size(image: np.ndarray, output_size: Tuple[int, int], swap: Tuple[int] = (2, 0, 1), pad_val: int = 114) -> Tuple[np.ndarray, float]:
     """
     Rescales image according to minimum ratio input height/width and output height/width.
     and pads the image to the target size.
@@ -84,7 +112,7 @@ def _rescale_and_pad_to_size(image: np.ndarray, output_size: Tuple[int, int], sw
     r = min(output_size[0] / image.shape[0], output_size[1] / image.shape[1])

     target_shape = (int(image.shape[0] * r), int(image.shape[1] * r))
-    resized_image = _rescale_image(image=image, target_shape=target_shape)
+    resized_image = rescale_image(image=image, target_shape=target_shape)
     padded_image[: target_shape[0], : target_shape[1]] = resized_image

     padded_image = padded_image.transpose(swap)
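
To make the newly public helpers in utils.py concrete, here is a small usage sketch of the pad-and-shift pair and of the letterbox-style rescale; the values are illustrative and only the signatures shown in this diff are assumed:

import numpy as np

from super_gradients.training.transforms.utils import (
    get_shift_params,
    shift_image,
    shift_bboxes,
    rescale_and_pad_to_size,
)

image = np.zeros((480, 600, 3), dtype=np.uint8)
boxes = np.array([[50.0, 60.0, 150.0, 160.0, 2.0]])  # [x1, y1, x2, y2, class_id]

# Center the image inside a 640x640 canvas: shift_h/shift_w are the top/left offsets,
# pad_h/pad_w are (before, after) tuples consumed by np.pad inside shift_image.
shift_h, shift_w, pad_h, pad_w = get_shift_params(input_size=image.shape, output_size=(640, 640))
padded = shift_image(image, pad_h, pad_w, pad_value=114)  # -> (640, 640, 3)
shifted_boxes = shift_bboxes(boxes, shift_w=shift_w, shift_h=shift_h)  # boxes moved by the same offsets

# Letterbox path used by DetectionPaddedRescale: rescale by the limiting ratio r, pad to
# output_size, and swap to channel-first; boxes are later mapped back with 1 / r.
letterboxed, r = rescale_and_pad_to_size(image, output_size=(640, 640), pad_val=114)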