
implement kornia-dnn with RTDETR detector #129

Open · wants to merge 6 commits into base: main

Conversation
edgarriba (Member) commented Sep 8, 2024:
  • creates kornia-dnn based on the ort crate

  • defines RTDETRDetectorBuilder and RTDETRDetector to create the ort session

  • creates a basic Detection struct message

  • adds an rtdetr example that runs RT-DETR on the webcam

  • TODO: explore CUDA inference


/// Path to the ONNX Runtime dynamic library.
pub ort_dylib_path: PathBuf,
/// Number of threads to use for inference.
pub num_threads: usize,
decahedron1 commented Sep 8, 2024:

Execution providers can be registered through the environment, so this is super minor, but WDYT about an execution_providers: Vec<ExecutionProviderDispatch> field & with_execution_providers to configure EPs specifically for the RTDETR session?
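A rough sketch of what that field and builder method could look like on RTDETRDetectorBuilder (only ExecutionProviderDispatch comes from ort; the surrounding field names are assumptions based on this PR):

```rust
use std::path::PathBuf;
use ort::ExecutionProviderDispatch;

pub struct RTDETRDetectorBuilder {
    pub model_path: PathBuf,
    pub num_threads: usize,
    /// EPs registered for this session only; empty means ort's default (CPU).
    pub execution_providers: Vec<ExecutionProviderDispatch>,
}

impl RTDETRDetectorBuilder {
    /// Configure execution providers specifically for the RT-DETR session.
    pub fn with_execution_providers(
        mut self,
        eps: impl IntoIterator<Item = ExecutionProviderDispatch>,
    ) -> Self {
        self.execution_providers = eps.into_iter().collect();
        self
    }
}
```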

edgarriba (Member, Author) replied:

I wanted to make ort transparent to the user and avoid having them explicitly pass an ort::ExecutionProvider, using something custom instead:

enum ExecutionProvider {
    Cpu,
    Cuda,
    TensorRT,
}

let execution_providers = match execution_provider {
    ExecutionProvider::Cpu => vec![CPUExecutionProvider::default().build()],
    ExecutionProvider::Cuda => vec![CUDAExecutionProvider::default().build()],
    ExecutionProvider::TensorRT => vec![TensorRTExecutionProvider::default().build()],
};

let session = Session::builder()?
    .with_optimization_level(GraphOptimizationLevel::Level3)?
    .with_intra_threads(num_threads)?
    .with_execution_providers(execution_providers)?
    .commit_from_file(model_path)?;

I'm still experimenting with the execution providers. What is the point of defining multiple in a Vec, just to have a fallback provider?

I've played a bit with it and noticed that CUDA/TensorRT take a few seconds to run the first frames.

A couple of questions:

  • Is there any way to leave this to the constructor of the session, before I fetch the session? (It might otherwise suffer some queuing issues with cameras.)
  • For TensorRT, does that mean it is compiling the model at runtime? Could we somehow pass a precompiled model?
  • As for commit_from_file: the idea is that we will have a bunch of operators/models in our Kornia HF hub, which can use commit_from_url, but we also want to let the user give a local ONNX file. Any tips here?

decahedron1 commented Sep 8, 2024:

What about re-exporting EPs, like pub use ort::{CPUExecutionProvider, CUDAExecutionProvider, TensorRTExecutionProvider}, so users can still configure each EP's options instead of being limited to the defaults? That should still keep things neat.
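For example, the re-export plus a user-side call might look like this (RTDETRDetectorBuilder::new and with_execution_providers are hypothetical kornia-dnn API here; with_device_id is a CUDA EP option in ort):

```rust
// kornia-dnn: re-export the EP builders so users don't depend on ort directly.
pub use ort::{CPUExecutionProvider, CUDAExecutionProvider, TensorRTExecutionProvider};

// user code: configure a specific EP instead of relying on the defaults.
let detector = RTDETRDetectorBuilder::new(model_path)
    .with_execution_providers([
        kornia_dnn::CUDAExecutionProvider::default()
            .with_device_id(0)
            .build(),
    ])
    .build()?;
```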

> What is the point of defining multiple in a Vec, just to have a fallback provider?

Yes.

> Is there any way to leave this to the constructor of the session, before I fetch the session? (It might otherwise suffer some queuing issues with cameras.)

You could run it on one dummy frame inside the constructor; that should get the graph warmed up.
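A minimal sketch of that warmup, assuming the RT-DETR session takes a 1×3×640×640 float input named "input" (the input name matches the snippet later in this PR; the shape is an assumption):

```rust
// Inside RTDETRDetector::new, after building the session: run one dummy
// frame so CUDA kernel selection / TensorRT compilation happens before
// the first real camera frame arrives.
let dummy = ort::Tensor::from_array((
    [1usize, 3, 640, 640],
    vec![0.0f32; 3 * 640 * 640],
))?;
let _ = session.run(ort::inputs!["input" => dummy]?)?;
```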

> For TensorRT, does that mean it is compiling the model at runtime? Could we somehow pass a precompiled model?

Yes, and for CUDA it is determining the optimal cuDNN convolution kernels. By default it performs an exhaustive search, which gets the best performance at the cost of significant warmup time; this can be configured with CUDAExecutionProvider::with_conv_algorithm_search.

TensorRT graphs can theoretically be cached with TensorRTExecutionProvider::with_engine_cache, but some users in the pyke Discord have reported that ONNX Runtime sometimes doesn't respect this option, and session creation can still take a few seconds despite a cached engine. Your mileage may vary; I personally haven't been able to reproduce the issue, but it's something to keep in mind.
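A hedged sketch of enabling that cache (the cache directory is an assumption; the option names follow ort's TensorRT EP builder):

```rust
// Cache compiled TensorRT engines on disk so later sessions can skip
// recompilation. Whether ONNX Runtime honors this is the caveat above.
let trt = TensorRTExecutionProvider::default()
    .with_engine_cache(true)
    .with_engine_cache_path("./trt-engine-cache")
    .build();

let session = Session::builder()?
    .with_execution_providers([trt])?
    .commit_from_file(model_path)?;
```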

> As for commit_from_file: the idea is that we will have a bunch of operators/models in our Kornia HF hub, which can use commit_from_url, but we also want to let the user give a local ONNX file. Any tips here?

How about something like this? (very roughly)

pub trait ModelSource {
    fn commit_session(&self, builder: SessionBuilder) -> ort::Result<Session>;
}

pub trait SessionBuilderExt {
    fn commit_from_source<S: ModelSource>(self, source: S) -> ort::Result<Session>;
}

impl SessionBuilderExt for SessionBuilder {
    fn commit_from_source<S: ModelSource>(self, source: S) -> ort::Result<Session> {
        source.commit_session(self)
    }
}

// Implemented per path-like type rather than as a blanket
// `impl<P: AsRef<Path>> ModelSource for P`, which would conflict with the
// impls in `models` below under Rust's coherence rules.
impl ModelSource for &Path {
    fn commit_session(&self, builder: SessionBuilder) -> ort::Result<Session> {
        builder.commit_from_file(self)
    }
}

impl ModelSource for &str {
    fn commit_session(&self, builder: SessionBuilder) -> ort::Result<Session> {
        builder.commit_from_file(Path::new(self))
    }
}

pub mod models {
    use super::ModelSource;
    use ort::{Session, SessionBuilder};

    pub struct ExampleRTDETR;

    impl ModelSource for ExampleRTDETR {
        fn commit_session(&self, builder: SessionBuilder) -> ort::Result<Session> {
            builder.commit_from_url("https://kornia.rs/model/rtdetr.onnx")
        }
    }
}

// rtdetr.rs
pub struct RTDETRDetectorBuilder {
    pub source: Box<dyn ModelSource>,
}

impl RTDETRDetectorBuilder {
    pub fn with_source<S: ModelSource + 'static>(source: S) -> Self {
        Self { source: Box::new(source) }
    }
}

impl RTDETRDetector {
    pub fn new(...) -> Result<Self> {
        let session = Session::builder()?
            ...
            .commit_from_source(source)?;
        Ok(Self { session })
    }
}

// user code
let rtdetr = RTDETRDetectorBuilder::with_source(kornia_dnn::models::ExampleRTDETR).build()?;
// or from a local file
let rtdetr = RTDETRDetectorBuilder::with_source("./my-local-rtdetr.onnx").build()?;

edgarriba (Member, Author) replied:

> I wanted to make ort transparent to the user and avoid having them explicitly pass an ort::ExecutionProvider, using something custom instead.

Actually, one more idea: we could set this automatically based on feature flags?
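That could look roughly like this (the crate feature names `cuda` and `tensorrt` are assumptions for illustration):

```rust
/// Pick execution providers from compile-time features; ort tries them in
/// order and falls back down the list, so CPU is always the last resort.
fn default_execution_providers() -> Vec<ort::ExecutionProviderDispatch> {
    let mut eps = Vec::new();
    #[cfg(feature = "tensorrt")]
    eps.push(ort::TensorRTExecutionProvider::default().build());
    #[cfg(feature = "cuda")]
    eps.push(ort::CUDAExecutionProvider::default().build());
    eps.push(ort::CPUExecutionProvider::default().build());
    eps
}
```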

decahedron1 left a review comment:

Outside of the load-dynamic issue, everything looks great! 🔥

let ort_tensor = ort::Tensor::from_array((image_nchw.shape, image_nchw.into_vec()))?;

// run the model
let outputs = self.session.run(ort::inputs!["input" => ort_tensor]?)?;
edgarriba (Member, Author) replied:

@decahedron1 how could we pre-allocate the output tensor here?

decahedron1 replied:

This seems like the perfect use case for OutputSelector!
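A very rough sketch of that idea (OutputSelector and RunOptions are ort API; the output name "output" and its shape are assumptions for this model):

```rust
use ort::{OutputSelector, RunOptions, Tensor};

// Allocate the output buffer once, outside the per-frame loop, and ask ort
// to write results into it instead of allocating a fresh tensor each run.
let prealloc = Tensor::<f32>::new(&session.allocator(), [1usize, 300, 6])?;
let options = RunOptions::new()?.with_outputs(
    OutputSelector::no_default()
        .with("output")
        .preallocate("output", prealloc),
);
let outputs = session.run_with_options(ort::inputs!["input" => ort_tensor]?, &options)?;
```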
