But the inference time I measured is quite different from the one reported in the paper: it takes more than 100 ms for a 4K image. The code I use is as follows.
'''
import numpy as np
import torch

# B_transformer is the model class defined in this repository.
model = B_transformer().cuda()
a = torch.randn(1, 3, 1024, 1024).cuda()
starter, ender = torch.cuda.Event(enable_timing=True), torch.cuda.Event(enable_timing=True)
repetitions = 100
timings = np.zeros((repetitions, 1))

# GPU warm-up
for _ in range(50):
    enhanced_image = model(a)

# Measure performance
with torch.no_grad():
    for rep in range(repetitions):
        torch.cuda.synchronize()
        starter.record()
        enhanced_image = model(a)
        ender.record()
        # Wait for the GPU to finish before reading the timer
        torch.cuda.synchronize()
        curr_time = starter.elapsed_time(ender)  # milliseconds
        timings[rep] = curr_time

mean_syn = np.sum(timings) / repetitions
std_syn = np.std(timings)
print(mean_syn, std_syn)
'''
Is this right?
Thank you for your reply, but my code already contains a GPU warm-up step.
If you don't mind, could you provide the correct code for measuring the inference time?
Our model is fast at inference; the slow timings may have something to do with a cold start on your side.
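For anyone who wants to rule the cold start out, here is a minimal host-side timing sketch to cross-check the CUDA-event numbers. It is not the repository's official benchmark script, and the B_transformer import path is an assumption to be adjusted to the repo's layout:
'''
import time

import torch

from model import B_transformer  # assumed import path; adjust to the repo's layout

model = B_transformer().cuda().eval()
x = torch.randn(1, 3, 1024, 1024).cuda()
reps = 100

with torch.no_grad():
    # Long warm-up so cuDNN autotuning and kernel caching finish before timing.
    for _ in range(reps):
        model(x)
    torch.cuda.synchronize()

    t0 = time.perf_counter()
    for _ in range(reps):
        model(x)
    torch.cuda.synchronize()  # wait for all queued GPU work before stopping the clock
    t1 = time.perf_counter()

print(f"mean latency: {(t1 - t0) / reps * 1000:.2f} ms")
'''
If this warmed-up loop still reports around 100 ms per forward pass, the gap to the paper's figure is not explained by a cold start alone.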
I have the same question. The running time on a Tesla V100 GPU for a 1024x1024 image comes to 100 ms, which is quite different from the figure in your paper (9 ms for a 4K image). Can you share the code for measuring the inference time?
Thanks for your contribution.
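One more thing to keep in mind when comparing against the paper: the 9 ms figure is quoted for 4K input, while the snippets above time a 1024x1024 tensor. Assuming 4K here means 3840x2160 (UHD), an apples-to-apples check would swap in a UHD-sized input and rerun the same measurement loop:
'''
# Assumption: 4K = 3840x2160 (UHD). Reuse the timing loop above with this input.
a_4k = torch.randn(1, 3, 2160, 3840).cuda()
'''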