Higher latency than expected #27

Open
wants to merge 61 commits into base: master

@paduck86 commented Oct 30, 2018

Hello Mr. Jung,
Thanks to you, I was able to test Faster R-CNN with TensorRT.

However, inference with TensorRT is slower than I expected on my machine: the latency is higher than with native TensorFlow.
The per-image response times are as follows:

  • architecture: Faster R-CNN ResNet-101
  • machine: V100
  • versions: TensorFlow 1.10, TensorRT 4 GA, CUDA 9.0, cuDNN 7.0
  • TensorFlow native: 0.17 s per image
  • TensorRT FP32: 0.52 s per image
  • TensorRT FP16: 0.51 s per image
  • TensorRT INT8: 0.51 s per image

In addition, memory usage is not much different between these configurations.
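
When comparing per-image numbers like these, it can help to exclude the first few calls from the measurement, since one-time setup cost may otherwise inflate the average. The following is a minimal sketch of such a timing helper, not the script used for the numbers above. The name `time_per_call` and its parameters are hypothetical, and whether warm-up explains any of the gap here is an assumption that would need to be tested.

```python
import time


def time_per_call(fn, n_warmup=5, n_runs=20):
    """Return the mean wall-clock seconds per call of fn.

    The first n_warmup calls are executed but not timed, so that
    one-time setup cost does not count toward the average.
    """
    for _ in range(n_warmup):
        fn()
    start = time.perf_counter()
    for _ in range(n_runs):
        fn()
    return (time.perf_counter() - start) / n_runs


if __name__ == "__main__":
    # Stand-in workload. In a real test this would wrap the session call,
    # e.g. lambda: tf_sess.run(output_tensors, feed_dict={input_tensor: image})
    # (those names are hypothetical and depend on your own script).
    mean_s = time_per_call(lambda: sum(range(100000)))
    print("mean seconds per call: %.6f" % mean_s)
```

Running the same helper against the native graph and each converted graph keeps the comparison on equal footing.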

My code is as follows:

import tensorflow.contrib.tensorrt as trt

trt_graph = trt.create_inference_graph(
    input_graph_def=frozen_graph,
    outputs=output_names,
    max_batch_size=1,
    max_workspace_size_bytes=1 << 25,  # 32 MiB
    precision_mode='INT8',  # also tried 'FP32' and 'FP16'
    minimum_segment_size=50
)

Did I do something wrong?
I would really appreciate your help.
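
For reference, the workspace value passed above is easier to reason about in MiB than as a bit shift. The sketch below only does the unit conversion; the helper name `to_mib` is hypothetical, and whether a larger workspace would change the timings in this case is an open assumption, not a diagnosis.

```python
def to_mib(n_bytes):
    """Convert a byte count to mebibytes (1 MiB = 2**20 bytes)."""
    return n_bytes / float(1 << 20)


# The value used for max_workspace_size_bytes in the call above.
workspace_bytes = 1 << 25

print(workspace_bytes)          # 33554432
print(to_mib(workspace_bytes))  # 32.0
```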

Finished the first working implementation of the real-time object detection demo script: 'camera_tf_trt.py'
…onvActLayer.cpp (61) - Cuda Error in createFilterTextur'
Refactor visualization code to the utils/ directory; fix duplicated logging for tensorflow
…th/height setting in command line arguments)
…rflow-1.8 (as specified in the original NVIDIA tf_trt_models/README.md)
…l ones from tensorflow model repository), in which the 'score_threshold' has been modified from 1e-8 to 0.3
Update a working version, tested with JetPack-3.2 and tensorflow 1.8.0
Add download link to tensorflow 1.8.0 wheel for JetPack-3.3
Add support for ssd_mobilenet_v1_egohands
Add data/egohands_label_map.pbtxt
…ohands', 'ssd_inception_v2_egohands', 'faster_rcnn_resnet50_egohands', 'faster_rcnn_resnet101_egohands' and 'faster_rcnn_inception_v2_egohands' models. However, the faster rcnn models are hacky and do not perform well on TX2 yet.
Add support for 'ssd_mobilenet_v2_egohands', 'ssdlite_mobilenet_v2_egohands', 'ssd_inception_v2_egohands', 'faster_rcnn_resnet50_egohands', 'faster_rcnn_resnet101_egohands' and 'faster_rcnn_inception_v2_egohands' models
…TRT) onto GPU, so well as revert number of RPN proposals back to 300; add code to measure tf_sess.run() time
…ection API library: 'detection_boxes', 'detection_scores', 'detection_classes', and 'num_detections'
Add support for 'rfcn_resnet101_egohands', plus some other re-factoring
@jkjung-avt commented

I put my latest code in my own GitHub repository: https://github.com/jkjung-avt/tf_trt_models. Feel free to check it out.

Meanwhile, I'm not completely sure what your question is. Are you trying to say that TF-TRT fails to optimize 'faster_rcnn_resnet101' at all?
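
One way to check whether the conversion replaced any part of the graph is to count the TensorRT engine nodes in the converted GraphDef. The following is a minimal sketch: the helper names are hypothetical, and it assumes the converted nodes carry the op type `TRTEngineOp`. It works on any object with a `node` list whose items have an `op` attribute, which is the shape of a TensorFlow `GraphDef`; the example at the bottom uses a stand-in object so it runs without TensorFlow.

```python
from collections import Counter
from types import SimpleNamespace


def count_ops(graph_def):
    """Count how many nodes of each op type appear in a GraphDef-like object."""
    return Counter(node.op for node in graph_def.node)


def trt_engine_count(graph_def, engine_op="TRTEngineOp"):
    """Return the number of nodes whose op type equals engine_op.

    Assumption: TF-TRT marks converted segments with the op type
    'TRTEngineOp'. Zero means nothing was converted.
    """
    return count_ops(graph_def)[engine_op]


if __name__ == "__main__":
    # Stand-in for a converted graph; with TensorFlow you would pass
    # the GraphDef returned by the conversion call instead.
    fake_graph = SimpleNamespace(node=[
        SimpleNamespace(op="Placeholder"),
        SimpleNamespace(op="TRTEngineOp"),
        SimpleNamespace(op="NonMaxSuppressionV2"),
    ])
    print("engine nodes:", trt_engine_count(fake_graph))
    print("all ops:", dict(count_ops(fake_graph)))
```

If the count is zero, or if most nodes remain outside the engine nodes, a large speedup over native TensorFlow would not be expected.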
