C# inference is slower than Python #892
Comments
I cannot index the Span in VS2015, so I cannot use it instead of
I tried. Even when I ignore the actual outputs in C# (filling them in with null instead), there is a 40 to 50 ms difference between C# and Python (even though in Python I do fetch the outputs).
@dashesy Is this still a persistent issue with the latest runtime version? If not, please close the issue. If so, let's see how we can help troubleshoot it.
This is the same issue with 0.4.0. Is there a newer version I can try? I shared a small project with you that shows the issue.
The 40 to 50 ms performance gap between Python and C# requires investigation. It looks like there is a delta even without retrieving or iterating over the output results. This should be addressed as part of the quality milestones.
One thing I noticed is that this gap is almost constant. For an inference on the order of 150 ms, a 40 to 60 ms gap is more noticeable, but it is always there.
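A nearly constant gap, independent of inference time, usually points to fixed per-call overhead (marshalling, output copying) rather than slower compute. A minimal sketch of how such a fixed cost could be estimated: time two workload sizes and fit the linear model t(n) = fixed + k * n. The `run_once` workload below is a hypothetical stand-in, not onnxruntime, so the sketch runs anywhere.

```python
import time

def run_once(n):
    # Stand-in for one inference call: a fixed setup cost (simulated with
    # sleep) plus work that scales linearly with the workload size n.
    time.sleep(0.002)
    s = 0
    for i in range(n * 10000):
        s += i
    return s

def avg_ms(n, iters=20):
    # Average wall-clock time per call, in milliseconds.
    t0 = time.perf_counter()
    for _ in range(iters):
        run_once(n)
    return (time.perf_counter() - t0) / iters * 1000

# With t(n) = fixed + k * n, two measurements are enough to solve for the
# fixed per-call overhead: fixed = t(1) - (t(2) - t(1)).
t1, t2 = avg_ms(1), avg_ms(2)
k = t2 - t1
fixed = t1 - k
print(f"estimated fixed per-call overhead: {fixed:.1f} ms")
```

If the gap between the two runtimes stays the same as the model gets heavier, the difference is in this fixed term, not in the inference kernel itself.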
@dashesy, reviving this old issue as part of quality cleanup. The C# versus Python running time is close, at least in the current master branch. On the celeb model, over 100 iterations, C# averages 42.48 ms and Python averages 45.13 ms; if anything, C# is a bit faster for the same model. It's possible that recent changes to the C API (which C# uses directly) have closed the gap. I'm not aware of any C# API changes that would have had any impact. Please make sure you are running in release mode; debug builds are going to be slower. Closing this issue for now.

Python Results

```python
import time

import numpy as np
import onnxruntime as rt

im = np.random.rand(1, 3, 257, 257).astype('uint8')
# Raw string so the backslashes in the Windows path are not treated
# as escape sequences.
sess = rt.InferenceSession(r"D:\models\ehsan\celeb\celeb.onnx")

totaltime = 0
numiters = 100
t0 = time.time()
for i in range(numiters):
    output = sess.run(['prob'], {'data': im})[0]
    t1 = time.time()
    totaltime += t1 - t0
    t0 = t1
print("Average time to score 1 tensor in milliseconds =", totaltime / numiters * 1000)
```
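For benchmarks like the one above, `time.perf_counter()` is preferable to `time.time()`: it is monotonic and has the highest available resolution for interval timing. A warm-up call also keeps one-time costs (session setup, first-run optimizations) out of the average. A sketch of the same loop with those two changes, using a hypothetical `score` function as a stand-in so it does not depend on onnxruntime being installed:

```python
import time

def score(x):
    # Stand-in for sess.run(['prob'], {'data': im}); any callable works here.
    return sum(x)

data = list(range(1000))

score(data)  # warm-up: excludes one-time setup cost from the measured average

numiters = 100
t0 = time.perf_counter()
for _ in range(numiters):
    output = score(data)
elapsed = time.perf_counter() - t0
print("Average time per call in milliseconds =", elapsed / numiters * 1000)
```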
C# Results
Describe the bug
System information
To Reproduce
Load an ONNX file in onnxruntime in Python and in C#; the inference in C# is twice as slow. In Python I get around 0.09 s (90 ms), but when I load the model with the ORT NuGet package I get about 200 ms.
TensorArray is a wrapper class that accepts a float array and its dimensions. Is this behavior expected? Is it ToArray that has overhead in C# but not in Python?

Expected behavior
Minimal overhead
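One way to test the ToArray hypothesis above is to benchmark the same call with and without materializing the output, so the copy cost is isolated from the inference cost. A stdlib-only sketch of that comparison, with a plain list copy standing in for the tensor-to-array conversion (the `infer` function is hypothetical, not the onnxruntime API):

```python
import time

def infer():
    # Stand-in for an inference call that returns a large output buffer.
    return [0.0] * 200_000

def bench(copy_output, iters=50):
    # Average milliseconds per call, optionally materializing the output.
    t0 = time.perf_counter()
    for _ in range(iters):
        out = infer()
        if copy_output:
            out = list(out)  # stand-in for a ToArray-style copy
    return (time.perf_counter() - t0) / iters * 1000

no_copy = bench(False)
with_copy = bench(True)
print(f"without copy: {no_copy:.2f} ms, with copy: {with_copy:.2f} ms")
```

If the two numbers differ by roughly the observed gap, the copy is the culprit; if they are close, the overhead lies elsewhere (for example in input marshalling or session invocation).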
Additional context
Sent an email with a model file.