Low latency than expected #27

paduck86 · 2018-10-30T01:03:10Z

Hello Mr. Jung,
Thanks to you, I was able to test Faster R-CNN with TensorRT.

But with TensorRT, inference is slower than I expected on my machine.
The per-image response times are as follows:

architecture: faster rcnn resnet101
machine: V100
versions: tensorflow 1.10, tensorrt4-ga, cuda9.0, cudnn7.0
tensorflow native: 0.17s per image
tensorrt fp32: 0.52s per image
tensorrt fp16: 0.51s per image
tensorrt int8: 0.51s per image

In addition, the memory usage is not much different.

My code is as follows,

import tensorflow.contrib.tensorrt as trt

trt_graph = trt.create_inference_graph(
    input_graph_def=frozen_graph,
    outputs=output_names,
    max_batch_size=1,
    max_workspace_size_bytes=1 << 25,
    precision_mode='INT8',  # or 'FP32' / 'FP16'
    minimum_segment_size=50
)

Did I do something wrong?
I would really appreciate your answer.
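
For per-image timings like the ones above, one thing worth ruling out is warm-up cost: the first few calls of a session are typically much slower than steady state, so including them can distort the average. The following is a minimal sketch of a timing helper that excludes warm-up calls. The name `mean_latency` and its parameters are hypothetical, and the stand-in workload would be replaced by a call that wraps `tf_sess.run(...)`.

```python
import time


def mean_latency(run_once, n_warmup=10, n_runs=50):
    """Return mean wall-clock seconds per call of run_once().

    Warm-up calls run first and are excluded from the average.
    """
    for _ in range(n_warmup):
        run_once()
    start = time.perf_counter()
    for _ in range(n_runs):
        run_once()
    return (time.perf_counter() - start) / n_runs


# Stand-in workload (assumption): replace with a function that calls tf_sess.run(...)
latency = mean_latency(lambda: sum(range(10000)), n_warmup=3, n_runs=5)
print("%.6f s per call" % latency)
```

Running the same helper on the native graph and on each converted graph would make the four numbers directly comparable.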

jkjung-avt · 2018-10-30T03:07:26Z

I put my latest code in my own GitHub repository: https://github.com/jkjung-avt/tf_trt_models. Feel free to check it out.

Meanwhile, I'm not completely sure what your question is. Are you trying to say that TF-TRT fails to optimize 'faster_rcnn_resnet101' at all?
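
One way to check whether anything was converted, sketched here under assumptions: TF-TRT replaces each converted subgraph with a node of op type `TRTEngineOp`, so counting those nodes in the converted graph shows whether any segment was optimized. With `minimum_segment_size=50`, only subgraphs of at least 50 nodes are eligible, so a low or zero count is possible. `count_ops` and the toy graph below are illustrative names, not part of either repository; with TensorFlow available, `trt_graph` could be passed in place of the toy object.

```python
from collections import Counter
from types import SimpleNamespace


def count_ops(graph_def):
    """Count nodes per op type in a GraphDef-like object.

    Works on anything exposing .node, where each node has an .op string.
    """
    return Counter(node.op for node in graph_def.node)


# Toy stand-in for a converted graph (assumption); pass trt_graph in real use.
toy_graph = SimpleNamespace(node=[
    SimpleNamespace(op="Placeholder"),
    SimpleNamespace(op="TRTEngineOp"),
    SimpleNamespace(op="Conv2D"),
    SimpleNamespace(op="TRTEngineOp"),
])
ops = count_ops(toy_graph)
print("TRT engines:", ops["TRTEngineOp"], "of", sum(ops.values()), "nodes")
```

If the count is zero, the converted graph runs entirely in native TensorFlow and no speed-up should be expected.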

jkjung-avt added 30 commits September 12, 2018 18:22

Fix installation scripts (using python3)

2a538f1

Add more stuffs into .gitignore

00cb45f

Add logs/ into .gitignore

68244a0

Add camera_tf_trt.py script and the corresponding utils code

8d5b726

Replace NVIDIA's README.md with my own stuffs, including description about camera_tf_trt.py

4517702

Minor updates to README.md

ac699d0

Merge pull request #1 from jkjung-avt/dev

21eee23

Finished the first working implementation of the real-time object detection demo script: 'camera_tf_trt.py'

Fix typos

f1cf752

Update some comments in camera_tf_trt.py

1ec645e

Add utils/__init__.py so as to fix problems importing stuffs from that directory

7db4004

Create the BBoxVisualization class and implement a nicer bounding box drawing function

1070d9a

Fix bugs in the installation scripts

c7dace3

Make sure BBoxVisualization is working, and create test code for it

b56ecbf

Add reference link about setting max_batch_size to avoid 'cudnnFusedConvActLayer.cpp (61) - Cuda Error in createFilterTextur'

bff3967

Use the new BBoxVisualization class to draw nicer-looking bounding boxes; fix duplicated logging for tensorflow

cff7f4f

Merge pull request #2 from jkjung-avt/dev

9813e81

Refactor visualization code to the utils/ directory; fix duplicated logging for tensorflow

Fix the issue of CPU getting occupied by grab_img thread in use_image case

d9d9cc5

Set display window size based on actual input image size (not the width/height setting in command line arguments)

c585a44

Update the screenshot based on test result with JetPack-3.2 and tensorflow-1.8 (as specified in the original NVIDIA tf_trt_models/README.md)

15f0858

Add the tensorflow SSD model config files (don't download the original ones from tensorflow model repository), in which the 'score_threshold' has been modified from 1e-8 to 0.3

ff30619

Merge pull request #3 from jkjung-avt/dev

a119d7b

Update a working version, tested with JetPack-3.2 and tensorflow 1.8.0

Add link to tensorflow 1.8.0 wheel for JetPack-3.3 (built by myself)

e9281ab

Add highlights on the pip wheel download links

263ea8a

Minor updates on text formatting/alignments

8526e04

Merge pull request #4 from jkjung-avt/dev

8fd6429

Add download link to tensorflow 1.8.0 wheel for JetPack-3.3

Fix the bug of class 0 (output of TensorFlow Object Detection models are 0 based)

fbb14a3

Add support for ssd_mobilenet_v1_egohands

44f3f47

Merge pull request #5 from jkjung-avt/dev

e99a9ac

Add support for ssd_mobilenet_v1_egohands

Add data/egohands_label_map.pbtxt

e265da7

Merge pull request #6 from jkjung-avt/dev

c6e8e66

Add data/egohands_label_map.pbtxt

jkjung-avt added 14 commits September 28, 2018 16:37

Add support for 'faster_rcnn_resnet50_egohands' model

383c09a

Have a hacky working version of TF-TRT optimized 'faster_rcnn_inception_v2_egohands'

efee38e

Merge pull request #8 from jkjung-avt/dev

683c803

Add support for 'ssd_mobilenet_v2_egohands', 'ssdlite_mobilenet_v2_egohands', 'ssd_inception_v2_egohands', 'faster_rcnn_resnet50_egohands', 'faster_rcnn_resnet101_egohands' and 'faster_rcnn_inception_v2_egohands' models

Add description about applying TF-TRT on the hand detector models

27e25d7

Put faster_rcnn SecondStage computations (though not optimized by TF-TRT) onto GPU, so well as revert number of RPN proposals back to 300; add code to measure tf_sess.run() time

846c19f

Use tensor names that are coded in the original TensorFlow Object Detection API library: 'detection_boxes', 'detection_scores', 'detection_classes', and 'num_detections'

22ede28

Remove '--num-classes' option, which could be derived from the label map directly

aa5079b

Add code to deal with missing classes in the label map

5956941

Reduce number of region proposals to 32 for all faster_rcnn models

a9ab4b9

Add support for 'rfcn_resnet101_egohands' model

26561f0

Add code to handle rfcn models

8076a7a

Add code to handle rfcn models, as well as some minor optimizations

300437a

Merge pull request #9 from jkjung-avt/dev

af1fbf0

Add support for 'rfcn_resnet101_egohands', plus some other re-factoring

jkjung-avt added 15 commits December 13, 2018 14:32

Attempt to merge NVIDIA's latest changes into my own repository

868cdf5

Update tensorflow 'models' to a newer snapshot (hash: 6518c1c)

513f4f0

Remove coco model configs (will download the latest files from TensorFlow Detection Model Zoo instead)

a92bffa

Fix errors in the installation scripts

ea70984

Fix 'config_path' related code

9feea6b

Fix class dictionary (cls_dict) indices: it is 1-based with the newer snapshot tensorflow 'models'

1820836

Add 'force_2ndstage_cpu()' jack to make faster_rcnn_xxx and rfcn_xxx models work

b0d088c

Add a python3 related fix in object_detection/models/feature_map_generators.py

ff6cbeb

Merge pull request #10 from jkjung-avt/nvidia

030f655

Update my fork to match the latest code in NVIDIA's original repository

Add description about tensorflow version and the 'TF-TRT Revisited' post

f7f2fa5

Do single-threading when reading from image or video files

6d2294e

Remove unused code

48a111d

Fix a minor bug in visualization.py

321b349

Add support for 'nvarguscamerasrc'

0c811f4

Highlight the TF-TRT Revisited blog post

2f55c69
