Allows for custom providers to be passed #369

Open
wants to merge 1 commit into base: main

Conversation

@Qfc9 commented Apr 27, 2024

Description

There is an existing issue within the current codebase, highlighted by warnings generated for numerous users. The default inference providers derived from ONNX are inaccurately configured. This issue predominantly affects the OnnxRoboflowInferenceModel, leading to a scenario where only the CPU is utilized for computations even when GPU or TensorRT packages are available.

Type of Change

Please remove options that are not applicable.

  • Bug fix (a non-breaking change that resolves an issue)
  • New feature (a non-breaking change that introduces new functionality)

Testing and Validation

This modification has been tested via local installation and execution of the package. Post-implementation, the previous warnings have been eliminated, and the package performs efficiently, provided that custom providers are specified. Apart from this adjustment, the functionality of the code remains consistent with prior outputs.
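As a rough illustration of the shape of this change (the function and argument names below are hypothetical sketches, not the exact code in this commit):

    # Hypothetical sketch: accept an optional providers list and fall back to the
    # existing defaults when none is supplied, instead of hard-coding the defaults.
    import onnxruntime as ort

    def create_session(model_path: str, providers=None) -> ort.InferenceSession:
        if providers is None:
            # existing behaviour: a fixed default provider list
            providers = ["CUDAExecutionProvider", "OpenVINOExecutionProvider", "CPUExecutionProvider"]
        return ort.InferenceSession(model_path, providers=providers)

    # A caller with TensorRT installed could then opt in explicitly, e.g.:
    # session = create_session("weights.onnx", providers=["TensorrtExecutionProvider", "CUDAExecutionProvider", "CPUExecutionProvider"])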

@CLAassistant commented Apr 27, 2024

CLA assistant check
All committers have signed the CLA.

@yeldarby (Contributor)

Hi @Qfc9, the default is to use the CUDA execution provider, then fall back to OpenVINO, then fall back to CPU if those are not available. This can be overridden by setting the ONNXRUNTIME_EXECUTION_PROVIDERS environment variable.

highlighted by warnings generated for numerous users

What warning are you seeing?

leading to a scenario where only the CPU is utilized for computations even when GPU or TensorRT packages are available

We purposely do not default to the TensorRT EP because it has an extremely slow compilation step (upwards of 20 minutes on some machines) before you can get your first inference results back. Trading a much worse cold-start time for somewhat faster inference is a non-starter for many real-world use-cases. But the default can be overridden using the environment variable.
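For reference, a minimal sketch of that override; the exact value format inference expects is an assumption here (the Docker quickstart shows a bracketed, comma-separated list):

    import os

    # Must be set before inference initialises its ONNX sessions.
    # Value format is assumed for illustration; check the Docker quickstart docs.
    os.environ["ONNXRUNTIME_EXECUTION_PROVIDERS"] = "[CUDAExecutionProvider,CPUExecutionProvider]"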

@Qfc9 (Author) commented Apr 30, 2024

TL;DR: the docs are lacking here, and it's not obvious how to change providers. You do have a method (env vars), but this PR can also help and doesn't break the old method. We can update the docs to make the env vars more obvious if y'all need help with that.

The following warning is usually shown to users because OpenVINO is part of the default execution provider list even when they don't have it installed.

C:\Users\elija\AppData\Local\Programs\Python\Python311\Lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py:65: UserWarning: Specified provider 'OpenVINOExecutionProvider' is not in available provider names.Available providers: 'TensorrtExecutionProvider, CUDAExecutionProvider, CPUExecutionProvider' warnings.warn(

Not an issue, but it's confusing to new users. Additionally, editing ONNXRUNTIME_EXECUTION_PROVIDERS is not really a feasible or understandable option for most people because there is little to no documentation on it. There is just the one reference on the Docker side of the docs (https://inference.roboflow.com/quickstart/docker/#advanced-build-a-docker-container-from-scratch). That seems to be the issue with all the env vars, though: none of them appear to be documented.

I didn't even realize that was a thing until now, to be honest. This potentially negates the change in the PR I submitted. The only argument I would still make for it is that the old env var method still works; this just also makes it more accessible and understandable to devs who are more used to ONNX and passing in providers via named args.
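For comparison, the pattern most ONNX Runtime users are used to, with providers passed directly as a named argument (plain onnxruntime shown here, not the inference API):

    import onnxruntime as ort

    # Providers are given in priority order at session creation time.
    session = ort.InferenceSession(
        "model.onnx",  # placeholder path
        providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
    )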

At a minimum, we could instead update the docs and send a PR for that, if that works for the repo.

We are also not in favor of having TensorRT as a default; you are 100% correct with the current setup. Sure, TensorRT takes 20 minutes to export on the first run, but a lot of our customers are fine with that since it gives a significant performance boost. It can turn a model from near real time to truly real time.

Additionally, those optimizations potentially allow customers to run more models at once on the same hardware.

@Qfc9 (Author) commented Apr 30, 2024

Below are supporting images for the current doc issue. Most people will not see an option; they may find the one function, but there isn't any immediately useful information on the page. I think it's unreasonable to expect most devs to do any more research than that, and we are an example of that.

Again, we would be more than happy to do some updates to the docs if y'all would prefer that PR over the one we just submitted.

[3 screenshots attached]

@yeldarby (Contributor) commented May 1, 2024

Definitely agree our docs are lacking here. We would gladly accept a PR to make this clearer.

Having two competing ways to set these values seems less than ideal. Environment variables are easier when configuring via Docker without having to maintain different code for different environments, but I can definitely see how, if one is using the package directly in Python, it'd be useful to be able to do it in code.

@PawelPeczek-Roboflow @grzegorz-roboflow what do you think?

@grzegorz-roboflow (Contributor) commented May 1, 2024

I'm a proponent of the verbose approach of passing all required args so that they are all visible in the top-level method signature; reading the signature should give the user everything they need to reason about the method. I would parse envs in the top layer and pass them as args rather than read envs somewhere deep down in the n-th layer.
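A small sketch of that layering, assuming a hypothetical top-level helper (the names and the env value format are illustrative, not the actual inference code):

    import os

    DEFAULT_PROVIDERS = ["CUDAExecutionProvider", "OpenVINOExecutionProvider", "CPUExecutionProvider"]

    def providers_from_env():
        # Env parsing happens once, in the top layer only.
        raw = os.environ.get("ONNXRUNTIME_EXECUTION_PROVIDERS")
        if not raw:
            return DEFAULT_PROVIDERS
        return [p.strip() for p in raw.strip("[]").split(",") if p.strip()]

    def load_model(model_id: str, providers=None):
        # An explicit argument wins; lower layers receive providers as an arg
        # and never read the environment themselves.
        resolved = providers if providers is not None else providers_from_env()
        # ... construct the ONNX session here, passing `resolved` down explicitly
        return resolved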
