High inference time using r1.0 and master #28

Open

harsh-agar opened this issue Jun 29, 2018 · 3 comments

@harsh-agar
Hi @gustavz
The model ran successfully on the Jetson TX2, but the inference time was quite slow. I tried both the r1.0 branch and the master branch; the inference times for 4 images were:

For master: 18.15, 2.39, 2.62, 2.53 seconds
For r1.0: 22.34, 0.27, 0.17, 0.13 seconds
Visualization was switched off.
Is there anything I'm missing that makes it this slow?

Thanks
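
For context, a minimal sketch of how per-image timings like the ones above can be collected; `run_inference` here is a hypothetical stand-in for the repo's actual detection call, not its real API.

```python
import time

def run_inference(image):
    """Hypothetical stand-in for the repo's per-image detection call."""
    time.sleep(0.1)  # placeholder work

for image in ["img0", "img1", "img2", "img3"]:
    start = time.time()
    run_inference(image)
    print("inference took %.2f s" % (time.time() - start))
```

In a real run the first measurement typically includes one-time graph loading and CUDA initialization, which is consistent with the much larger first value on both branches.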

@gustavz (Owner) commented Jul 1, 2018

@harsh-agar

  1. Are you using the current master?
  2. What does your config look like?
  3. Did you change the code?
  4. Which Python / OpenCV / JetPack versions are you using?

@harsh-agar (Author)

@gustavz

  1. I tried both master as well as r1.0; the results obtained are shown above.

  2. This is my config.yml for master (a sketch of loading it follows after this list):


```yaml
### Inference Config ###
VIDEO_INPUT: 0              # input must be OpenCV readable
VISUALIZE: True             # disable for performance increase
VIS_FPS: True               # draw current FPS in the top left image corner
CPU_ONLY: False             # CPU placement for speed test
USE_OPTIMIZED: False        # whether to use the optimized model (only possible if transformed with script)
DISCO_MODE: False           # secret disco visualization mode

### Testing ###
IMAGE_PATH: 'test_images'   # path for test_*.py test images
LIMIT_IMAGES: None          # if set to None, all images are used
WRITE_TIMELINE: True        # write json timeline file (slows inference)
SAVE_RESULT: False          # save detection results to disk
RESULT_PATH: 'test_results' # path to save detection results
SEQ_MODELS: []              # list of models to sequentially test (default: all models)

### Object_Detection ###
WIDTH: 600                  # OpenCV only supports 4:3 formats, others will be converted
HEIGHT: 600                 # 600x600 leads to 640x480
MAX_FRAMES: 5000            # only used if VISUALIZE is False
FPS_INTERVAL: 5             # interval [s] to print the FPS of the last interval to the console
PRINT_INTERVAL: 500         # interval [frames] to print detections to the console
PRINT_TH: 0.5               # detection threshold for the print interval

### speed hack ###
SPLIT_MODEL: True           # splits the model into a GPU and a CPU session (currently only works for ssd_mobilenets)
SSD_SHAPE: 300              # used for the split-model algorithm (currently only supports ssd networks trained on 300x300 and 600x600 input)

### Tracking ###
USE_TRACKER: False          # use a tracker (currently only works properly WITHOUT split_model)
TRACKER_FRAMES: 20          # number of tracked frames between detections
NUM_TRACKERS: 5             # max number of objects to track

### Model ###
OD_MODEL_NAME: 'ssd_mobilenet_v11_coco'
OD_MODEL_PATH: 'models/ssd_mobilenet_v11_coco/{}'
LABEL_PATH: 'rod/data/tf_coco_label_map.pbtxt'
NUM_CLASSES: 90

### DeepLab ###
ALPHA: 0.3                  # mask overlay factor (also for mask_rcnn)
BBOX: True                  # compute bounding box in postprocessing
MINAREA: 500                # min pixel area to apply bounding boxes (avoids noise)

### Model ###
DL_MODEL_NAME: 'deeplabv3_mnv2_pascal_train_aug_2018_01_29'
DL_MODEL_PATH: 'models/deeplabv3_mnv2_pascal_train_aug/{}'
```

  3. I did not change the code.

  4. Python 2.7 | OpenCV 3.3.1 | JetPack 3.1
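
As referenced in item 2 above, a minimal sketch of reading these values with PyYAML; the file name config.yml and the assumption that the repo parses it exactly this way are illustrative, not verified repo internals.

```python
import yaml  # PyYAML

# Load the config shown above (path assumed relative to the repo root).
with open("config.yml") as f:
    cfg = yaml.safe_load(f)

# Spot-check the switches most relevant to inference speed.
print(cfg["VISUALIZE"], cfg["SPLIT_MODEL"], cfg["WRITE_TIMELINE"])
```

Note that WRITE_TIMELINE: True is itself documented above as slowing inference, so it is worth toggling off when benchmarking.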

Thanks again

@harsh-agar (Author)

I ran the script test_objectdetection.py, and what I observed is that the GPU is used while loading the model, but during detection GPU usage is 0%.
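
One standard way to check whether TensorFlow can actually place ops on the GPU is TF 1.x's device-placement logging; this standalone snippet is independent of the repo's code.

```python
import tensorflow as tf  # TF 1.x, as used in JetPack-era setups

# Log the device each op is assigned to (printed to stderr).
config = tf.ConfigProto(log_device_placement=True)
config.gpu_options.allow_growth = True  # avoid grabbing all Jetson memory up front

with tf.Session(config=config) as sess:
    a = tf.constant([1.0, 2.0], name="a")
    b = tf.constant([3.0, 4.0], name="b")
    print(sess.run(a + b))
```

If the log shows ops placed on /cpu:0 rather than /gpu:0, the installed TensorFlow build has no working GPU support.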
