Add yolov5-youtube example #1201
Merged
98c8868 Implement first prototype for yolov5 example (dsuess)
313d408 Update predictor with flexible input size & better overlay (dsuess)
29977d7 Update readme (dsuess)
da5c4a2 Merge branch 'master' into example/yolo (RobertLucian)
b67e4eb Make lint (RobertLucian)
d0caa6c Rename to yolov5-youtube (dsuess)
9de3281 Update Readme (dsuess)
d81b0ec Fix ffmpeg bug with gmp dependency (RobertLucian)
4621c29 Use context manager & remove all videos from disk (RobertLucian)
8d65d4b Reorganize the example a bit (RobertLucian)
436bcd3 Polishing the docs & tuning line thickness (RobertLucian)
001f40f Use video/mp4 mime-type (RobertLucian)
df254bc Merge branch 'master' into example/yolo (RobertLucian)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,57 @@ | ||
# YoloV5 Detection model | ||
|
||
This example uses ONNX to deploy a detection model trained with [ultralytics' yolo repo](https://github.com/ultralytics/yolov5). | ||
We'll use the `yolov5s` model as an example here. | ||
It runs inference on YouTube videos and returns the annotated video with bounding boxes. | ||
|
||
The example can be run on both CPU and GPU hardware. | ||
|
||
## Exporting ONNX | ||
|
||
To export a custom model from the repo, use the [`models/export.py`](https://github.com/ultralytics/yolov5/blob/master/models/export.py) script. | ||
The only change needed is to replace the line | ||
|
||
``` | ||
model.model[-1].export = True # set Detect() layer export=True | ||
``` | ||
|
||
to | ||
|
||
``` | ||
model.model[-1].export = False | ||
``` | ||
|
||
Out of the box, the ultralytics repo does not export the model's postprocessing steps, e.g. the conversion from raw CNN outputs to bounding boxes. | ||
With newer ONNX versions, these steps can be exported as part of the model, which makes deployment much easier. | ||
|
||
With this modified script, the ONNX graph used for this example was exported using: | ||
``` | ||
python models/export.py --weights weights/yolov5s.pt --img 416 --batch 1 | ||
``` | ||
|
||
|
||
## Sample Prediction | ||
|
||
Deploy the model by running: | ||
|
||
```bash | ||
cortex deploy | ||
``` | ||
|
||
Then wait for it to become live by tracking its status with `cortex get --watch`. | ||
|
||
Once the API has been successfully deployed, export the API's endpoint for convenience. You can get the API's endpoint by running `cortex get youtube-yolov5`. | ||
|
||
```bash | ||
export ENDPOINT=your-api-endpoint | ||
``` | ||
|
||
When making a prediction with [sample.json](sample.json), [this](https://www.youtube.com/watch?v=aUdKzb4LGJI) YouTube video will be used. | ||
|
||
To make a request to the model: | ||
|
||
```bash | ||
curl "${ENDPOINT}" -X POST -H "Content-Type: application/json" -d @sample.json --output video.mp4 | ||
``` | ||
|
||
After a few seconds, `curl` will save the resulting video `video.mp4` in the current working directory. | ||
|
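The `curl` request above can also be issued from Python. The following is a minimal sketch using only the standard library; `build_request` and `save_prediction` are hypothetical helper names that are not part of this example, and the endpoint is assumed to accept the same JSON payload as `sample.json`.

```python
import json
import urllib.request


def build_request(endpoint: str, video_url: str) -> urllib.request.Request:
    """Build the POST request that the curl command sends (hypothetical helper)."""
    body = json.dumps({"url": video_url}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def save_prediction(endpoint: str, video_url: str, out_path: str = "video.mp4") -> None:
    """Send the request and write the returned video bytes to out_path."""
    with urllib.request.urlopen(build_request(endpoint, video_url)) as response:
        with open(out_path, "wb") as out_file:
            out_file.write(response.read())
```

Calling `save_prediction(os.environ["ENDPOINT"], "https://www.youtube.com/watch?v=aUdKzb4LGJI")` would then mirror the `curl` invocation, assuming `ENDPOINT` has been exported as described.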
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
conda-forge::ffmpeg | ||
conda-forge::youtube-dl | ||
conda-forge::matplotlib |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
# WARNING: you are on the master branch, please refer to the examples on the branch that matches your `cortex version` | ||
|
||
- name: youtube-yolov5 | ||
  kind: SyncAPI | ||
  predictor: | ||
    type: onnx | ||
    path: predictor.py | ||
    model_path: yolov5s.onnx | ||
    config: | ||
      iou_threshold: 0.5 | ||
      confidence_threshold: 0.3 | ||
  compute: | ||
    gpu: 1 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,82 @@ | ||
[ | ||
"person", | ||
"bicycle", | ||
"car", | ||
"motorcycle", | ||
"airplane", | ||
"bus", | ||
"train", | ||
"truck", | ||
"boat", | ||
"traffic light", | ||
"fire hydrant", | ||
"stop sign", | ||
"parking meter", | ||
"bench", | ||
"bird", | ||
"cat", | ||
"dog", | ||
"horse", | ||
"sheep", | ||
"cow", | ||
"elephant", | ||
"bear", | ||
"zebra", | ||
"giraffe", | ||
"backpack", | ||
"umbrella", | ||
"handbag", | ||
"tie", | ||
"suitcase", | ||
"frisbee", | ||
"skis", | ||
"snowboard", | ||
"sports ball", | ||
"kite", | ||
"baseball bat", | ||
"baseball glove", | ||
"skateboard", | ||
"surfboard", | ||
"tennis racket", | ||
"bottle", | ||
"wine glass", | ||
"cup", | ||
"fork", | ||
"knife", | ||
"spoon", | ||
"bowl", | ||
"banana", | ||
"apple", | ||
"sandwich", | ||
"orange", | ||
"broccoli", | ||
"carrot", | ||
"hot dog", | ||
"pizza", | ||
"donut", | ||
"cake", | ||
"chair", | ||
"couch", | ||
"potted plant", | ||
"bed", | ||
"dining table", | ||
"toilet", | ||
"tv", | ||
"laptop", | ||
"mouse", | ||
"remote", | ||
"keyboard", | ||
"cell phone", | ||
"microwave", | ||
"oven", | ||
"toaster", | ||
"sink", | ||
"refrigerator", | ||
"book", | ||
"clock", | ||
"vase", | ||
"scissors", | ||
"teddy bear", | ||
"hair drier", | ||
"toothbrush" | ||
] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,178 @@ | ||
# WARNING: you are on the master branch, please refer to the examples on the branch that matches your `cortex version` | ||
|
||
import json | ||
import os | ||
import uuid | ||
from pathlib import Path | ||
from typing import Iterable, Tuple | ||
|
||
import cv2 | ||
import ffmpeg | ||
import numpy as np | ||
import youtube_dl | ||
from matplotlib import pyplot as plt | ||
|
||
from starlette.responses import FileResponse | ||
|
||
|
||
def download_from_youtube(url: str, min_height: int) -> Path: | ||
    target = f"{uuid.uuid1()}.mp4" | ||
    ydl_opts = { | ||
        "outtmpl": target, | ||
        "format": f"worstvideo[vcodec=vp9][height>={min_height}]", | ||
    } | ||
    with youtube_dl.YoutubeDL(ydl_opts) as ydl: | ||
        ydl.download([url]) | ||
    # we need to glob in case youtube-dl adds suffix | ||
    (path,) = Path().absolute().glob(f"{target}*") | ||
    return path | ||
|
||
|
||
def frame_reader(path: Path, size: Tuple[int, int]) -> Iterable[np.ndarray]: | ||
    width, height = size | ||
    # letterbox frames to fixed size | ||
    process = ( | ||
        ffmpeg.input(path) | ||
        .filter("scale", size=f"{width}:{height}", force_original_aspect_ratio="decrease") | ||
        # Negative values for x and y center the padded video | ||
        .filter("pad", height=height, width=width, x=-1, y=-1) | ||
        .output("pipe:", format="rawvideo", pix_fmt="rgb24") | ||
        .run_async(pipe_stdout=True) | ||
    ) | ||
|
||
    while True: | ||
        in_bytes = process.stdout.read(height * width * 3) | ||
        if not in_bytes: | ||
            process.wait() | ||
            break | ||
        frame = np.frombuffer(in_bytes, np.uint8).reshape([height, width, 3]) | ||
        yield frame | ||
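To make the letterboxing in `frame_reader` concrete, the sketch below computes the scaled frame size and the padding offsets for a given source and target size. It is a pure-Python illustration, not part of the predictor; `letterbox_geometry` is a hypothetical name, and ffmpeg's own rounding may differ by a pixel for sizes that do not divide evenly.

```python
from typing import Tuple


def letterbox_geometry(
    src_size: Tuple[int, int], dst_size: Tuple[int, int]
) -> Tuple[int, int, int, int]:
    """Return (scaled_width, scaled_height, pad_x, pad_y) for a letterboxed frame."""
    src_w, src_h = src_size
    dst_w, dst_h = dst_size
    # Integer comparison of aspect ratios: is the source relatively wider than the target?
    if src_w * dst_h >= src_h * dst_w:
        new_w, new_h = dst_w, src_h * dst_w // src_w
    else:
        new_w, new_h = src_w * dst_h // src_h, dst_h
    # Split the leftover space evenly so the frame is centered.
    pad_x, pad_y = (dst_w - new_w) // 2, (dst_h - new_h) // 2
    return new_w, new_h, pad_x, pad_y


# A 1920x1080 video letterboxed into the 416x416 model input.
print(letterbox_geometry((1920, 1080), (416, 416)))  # (416, 234, 0, 91)
```

The frame is shrunk to 416x234 and padded with 91 rows above and below, so box coordinates returned by the model refer to the padded 416x416 frame.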
|
||
|
||
class FrameWriter: | ||
    def __init__(self, path: Path, size: Tuple[int, int]): | ||
        width, height = size | ||
        self.process = ( | ||
            ffmpeg.input("pipe:", format="rawvideo", pix_fmt="rgb24", s=f"{width}x{height}") | ||
            .output(path, pix_fmt="yuv420p") | ||
            .overwrite_output() | ||
            .run_async(pipe_stdin=True) | ||
        ) | ||
|
||
    def write(self, frame: np.ndarray): | ||
        self.process.stdin.write(frame.astype(np.uint8).tobytes()) | ||
|
||
    def __del__(self): | ||
        self.process.stdin.close() | ||
        self.process.wait() | ||
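`FrameWriter` relies on `__del__` to flush and close the ffmpeg process, which only works if the object is garbage-collected at the right moment. A context manager makes the cleanup explicit. The following is a minimal sketch of that pattern, not the code in this diff: `ManagedWriter` and `FakeProcess` are hypothetical names, and the process object is only assumed to expose `stdin` and `wait()` like `subprocess.Popen`.

```python
import io


class ManagedWriter:
    """Context manager that closes a writer process's stdin and waits for it to exit."""

    def __init__(self, process):
        # `process` is assumed to expose `stdin` and `wait()`, like subprocess.Popen.
        self.process = process

    def __enter__(self):
        return self

    def write(self, data: bytes) -> None:
        self.process.stdin.write(data)

    def __exit__(self, exc_type, exc, tb):
        # Runs on normal exit and when the body raises.
        self.process.stdin.close()
        self.process.wait()
        return False  # do not swallow exceptions


class FakeProcess:
    """Stand-in for the ffmpeg subprocess, used only to demonstrate the pattern."""

    def __init__(self):
        self.stdin = io.BytesIO()
        self.waited = False

    def wait(self):
        self.waited = True


process = FakeProcess()
with ManagedWriter(process) as writer:
    writer.write(b"\x00\x01\x02")
```

With this shape, the output file is guaranteed to be finalized when the `with` block ends, even if inference fails halfway through the video.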
|
||
|
||
def nms(dets: np.ndarray, scores: np.ndarray, thresh: float) -> np.ndarray: | ||
    x1 = dets[:, 0] | ||
    y1 = dets[:, 1] | ||
    x2 = dets[:, 2] | ||
    y2 = dets[:, 3] | ||
|
||
    areas = (x2 - x1 + 1) * (y2 - y1 + 1) | ||
    order = scores.argsort()[::-1]  # highest-scoring boxes first | ||
|
||
    keep = [] | ||
    while order.size > 0: | ||
        i = order[0]  # pick the highest-scoring remaining box | ||
        keep.append(i) | ||
        xx1 = np.maximum(x1[i], x1[order[1:]]) | ||
        yy1 = np.maximum(y1[i], y1[order[1:]]) | ||
        xx2 = np.minimum(x2[i], x2[order[1:]]) | ||
        yy2 = np.minimum(y2[i], y2[order[1:]]) | ||
|
||
        w = np.maximum(0.0, xx2 - xx1 + 1)  # overlap width | ||
        h = np.maximum(0.0, yy2 - yy1 + 1)  # overlap height | ||
        inter = w * h | ||
        ovr = inter / (areas[i] + areas[order[1:]] - inter) | ||
|
||
        inds = np.where(ovr <= thresh)[0]  # boxes overlapping box i by at most thresh | ||
        order = order[inds + 1] | ||
|
||
    return np.array(keep).astype(int) | ||
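The overlap measure computed inside `nms` is the intersection over union (IoU) of two boxes, using the inclusive-pixel `+ 1` convention. The sketch below works it through for a single pair of boxes in pure Python; `iou` is a hypothetical helper that is not part of the predictor.

```python
from typing import Sequence


def iou(box_a: Sequence[float], box_b: Sequence[float]) -> float:
    """IoU of two (x1, y1, x2, y2) boxes using the inclusive-pixel (+1) convention."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1) + 1)
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1) + 1)
    inter = inter_w * inter_h
    area_a = (ax2 - ax1 + 1) * (ay2 - ay1 + 1)
    area_b = (bx2 - bx1 + 1) * (by2 - by1 + 1)
    return inter / (area_a + area_b - inter)


# Two 10x10 boxes shifted by 5 pixels: overlap 50, union 150, IoU 1/3.
print(iou((0, 0, 9, 9), (5, 0, 14, 9)))
```

With the `iou_threshold` of 0.5 from the config, both of these boxes would survive suppression, since their IoU of about 0.33 does not exceed the threshold.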
|
||
|
||
def boxes_yolo_to_xyxy(boxes: np.ndarray): | ||
    boxes[:, 0] -= boxes[:, 2] / 2 | ||
    boxes[:, 1] -= boxes[:, 3] / 2 | ||
    boxes[:, 2] = boxes[:, 2] + boxes[:, 0] | ||
    boxes[:, 3] = boxes[:, 3] + boxes[:, 1] | ||
    return boxes | ||
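`boxes_yolo_to_xyxy` converts boxes from center format `(cx, cy, w, h)` to corner format `(x1, y1, x2, y2)`, modifying the input array in place. The same arithmetic for a single box, as a pure-Python sketch with the hypothetical name `cxcywh_to_xyxy`:

```python
from typing import Tuple


def cxcywh_to_xyxy(box: Tuple[float, float, float, float]) -> Tuple[float, float, float, float]:
    """Convert one (center_x, center_y, width, height) box to (x1, y1, x2, y2)."""
    cx, cy, w, h = box
    x1 = cx - w / 2  # left edge
    y1 = cy - h / 2  # top edge
    return (x1, y1, x1 + w, y1 + h)


# A 20x10 box centered at (50, 40).
print(cxcywh_to_xyxy((50, 40, 20, 10)))  # (40.0, 35.0, 60.0, 45.0)
```

Unlike the NumPy version, this sketch returns a new tuple and leaves its input untouched.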
|
||
|
||
def overlay_boxes(frame, boxes, class_ids, label_map, color_map, line_thickness=None): | ||
    tl = ( | ||
        line_thickness or round(0.002 * (frame.shape[0] + frame.shape[1]) / 2) + 1 | ||
    )  # line/font thickness | ||
|
||
    for class_id, (x1, y1, x2, y2) in zip(class_ids, boxes.astype(int)): | ||
        color = color_map[class_id] | ||
        label = label_map[class_id] | ||
        cv2.rectangle(frame, (x1, y1), (x2, y2), color, tl, cv2.LINE_AA) | ||
        tf = max(tl - 1, 1)  # font thickness | ||
        t_size = cv2.getTextSize(label, 0, fontScale=tl / 3, thickness=tf)[0] | ||
        x3, y3 = x1 + t_size[0], y1 - t_size[1] - 3 | ||
        cv2.rectangle(frame, (x1, y1), (x3, y3), color, -1, cv2.LINE_AA)  # filled | ||
        cv2.putText( | ||
            frame, | ||
            label, | ||
            (x1, y1 - 2), | ||
            0, | ||
            tl / 3, | ||
            [225, 255, 255], | ||
            thickness=tf, | ||
            lineType=cv2.LINE_AA, | ||
        ) | ||
|
||
|
||
class ONNXPredictor: | ||
    def __init__(self, onnx_client, config): | ||
        self.client = onnx_client | ||
        # Get the input shape from the ONNX runtime | ||
        (signature,) = onnx_client.input_signatures.values() | ||
        _, _, height, width = signature["images"]["shape"] | ||
        self.input_size = (width, height) | ||
        self.config = config | ||
        with open("labels.json") as buf: | ||
            self.labels = json.load(buf) | ||
        color_map = plt.cm.tab20(np.linspace(0, 1, len(self.labels)))  # colormap expects floats in [0, 1] | ||
        self.color_map = [tuple(map(int, colors)) for colors in 255 * color_map] | ||
|
||
    def postprocess(self, output): | ||
        boxes, obj_score, class_scores = np.split(output[0], [4, 5], axis=1) | ||
        boxes = boxes_yolo_to_xyxy(boxes) | ||
|
||
        # get the class-prediction & class confidences | ||
        class_id = class_scores.argmax(axis=1) | ||
        cls_score = class_scores[np.arange(len(class_scores)), class_id] | ||
|
||
        confidence = obj_score.squeeze(axis=1) * cls_score | ||
        sel = confidence > self.config["confidence_threshold"] | ||
        boxes, class_id, confidence = boxes[sel], class_id[sel], confidence[sel] | ||
        sel = nms(boxes, confidence, self.config["iou_threshold"]) | ||
        boxes, class_id, confidence = boxes[sel], class_id[sel], confidence[sel] | ||
        return boxes, class_id, confidence | ||
|
||
    def predict(self, payload): | ||
        in_path = download_from_youtube(payload["url"], self.input_size[1]) | ||
        out_path = f"{uuid.uuid1()}.mp4" | ||
        writer = FrameWriter(out_path, size=self.input_size) | ||
|
||
        for frame in frame_reader(in_path, size=self.input_size): | ||
            x = (frame.astype(np.float32) / 255).transpose(2, 0, 1) | ||
            # 4 output tensors, the last three are intermediate values and | ||
            # not necessary for detection | ||
            output, *_ = self.client.predict(x[None]) | ||
            boxes, class_ids, confidence = self.postprocess(output) | ||
            overlay_boxes(frame, boxes, class_ids, self.labels, self.color_map) | ||
            writer.write(frame) | ||
|
||
        del writer | ||
        os.remove(in_path) | ||
        # We can't remove out_path, so the deployment will run out of memory | ||
        # sooner or later | ||
        return FileResponse(out_path) |
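The scoring step in `postprocess` multiplies each candidate's objectness by its best class score and drops candidates below `confidence_threshold`. For a single output row this can be sketched in pure Python as follows; `score_detection` is a hypothetical helper, and the row layout `[cx, cy, w, h, objectness, class scores...]` is taken from the `np.split(output[0], [4, 5], axis=1)` call.

```python
from typing import Optional, Sequence, Tuple


def score_detection(
    row: Sequence[float], confidence_threshold: float
) -> Optional[Tuple[int, float]]:
    """Return (class_id, confidence) for one output row, or None if it is filtered out."""
    objectness, class_scores = row[4], row[5:]
    # Index of the highest class score (the argmax).
    class_id = max(range(len(class_scores)), key=lambda k: class_scores[k])
    confidence = objectness * class_scores[class_id]
    if confidence > confidence_threshold:
        return class_id, confidence
    return None


# Objectness 0.5 and best class score 0.8 give a confidence of 0.4.
kept = score_detection([208, 208, 50, 80, 0.5, 0.1, 0.8, 0.2], 0.3)
# Objectness 0.5 and best class score 0.4 give 0.2, below the 0.3 threshold.
dropped = score_detection([208, 208, 50, 80, 0.5, 0.1, 0.4, 0.2], 0.3)
```

Only the candidates that pass this filter are handed to `nms`, which keeps the number of pairwise IoU computations small.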
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
ffmpeg-python | ||
aiofiles | ||
opencv-python-headless |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
{ | ||
"url": "https://www.youtube.com/watch?v=aUdKzb4LGJI" | ||
} |