Perf_analyzer reported metrics for decoupled model #7203
Hi @ZhanqiuHu,
Please see here for the details of the metrics being calculated.
I believe if you run perf_analyzer with …
Thanks a lot for providing the details! I was more interested in what "Compute Input", "Compute Output", and "Network+Server Send/Recv" specifically measure. When I use … Thank you very much!
Yes, you are correct. The gRPC time reports are not supported for decoupled models.
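For what it's worth, decoupled models can only be profiled over gRPC with bidirectional streaming enabled; a minimal invocation might look like the sketch below (the model name is a placeholder, and flags beyond `-i grpc --streaming` depend on your setup):

```shell
# Decoupled models require the gRPC endpoint with streaming enabled;
# HTTP and non-streaming gRPC are not supported for them.
# "my_decoupled_model" is a hypothetical model name.
perf_analyzer -m my_decoupled_model -i grpc --streaming
```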
Thanks for the answer! However, the description in the docs is a little vague. What specific steps are involved in the preprocessing of inputs and outputs? For example, for inputs, copying/moving the data to the device is probably part of it? And I guess for a decoupled Python model, (de)serialization will be part of … Thanks!
I am trying to profile our decoupled models (Python backend) with perf_analyzer, and I'm curious how the following latency metrics are calculated:
Client Send, Network+Server Send/Recv, Server Queue, Server Compute Input, Server Compute Infer, Server Compute Output, and Client Recv
Also, when using the gRPC or HTTP endpoints, is it possible to measure the latency spent on network overhead and (un)marshalling protobuf?
Thanks!
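For context on how these metrics relate to each other: perf_analyzer's reported component latencies should sum to roughly the average end-to-end latency it prints. A small sketch of that arithmetic, using purely hypothetical microsecond values (real numbers depend entirely on the model and deployment):

```python
# Hypothetical component latencies in microseconds, named after the
# metrics perf_analyzer reports; the values are made up for illustration.
components = {
    "client_send": 20,
    "network_plus_server_send_recv": 150,
    "server_queue": 40,
    "server_compute_input": 30,
    "server_compute_infer": 900,
    "server_compute_output": 25,
    "client_recv": 15,
}

# The client-observed average latency is (approximately) the sum of
# the per-component averages.
total_us = sum(components.values())
print(f"approximate end-to-end latency: {total_us} usec")
```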