Search before asking

I have searched the Inference issues and found no similar bug report.

Bug

If you install inference-gpu on a machine with CUDA 12, it will complain and fall back to CPU execution mode:

2024-03-10 21:00:56.403012156 [W:onnxruntime:Default, onnxruntime_pybind_state.cc:640 CreateExecutionProviderInstance] Failed to create CUDAExecutionProvider. Please reference https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#requirements to ensure all dependencies are met.

There's a special version of onnxruntime needed for CUDA 12: https://onnxruntime.ai/docs/install/

Ideally we'd detect this automatically and make it "just work" with CUDA 12. Alternatively, we could at least tell the user why they're not getting GPU acceleration.

Suggested resolution: since inference-gpu defaults to CPU execution on CUDA 12 systems because of this onnxruntime compatibility requirement, consider extending platform.py to detect the installed CUDA version. This can be done by invoking system commands to extract the CUDA version and including that value in the dictionary returned by retrieve_platform_specifics. With detection in place, check the version in benchmark_adapter.py before running benchmarks or initializing models that require GPU support; if CUDA 12 is identified, ensure the CUDA-12-compatible build of onnxruntime-gpu is used.
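The detection and warning suggested above could be sketched roughly as follows. This is a minimal sketch, not the project's implementation: the function names (get_cuda_version, parse_cuda_version, cuda12_onnxruntime_warning) are hypothetical, and wiring them into retrieve_platform_specifics / benchmark_adapter.py would need adapting to the actual code.

```python
import re
import subprocess


def parse_cuda_version(text):
    """Extract "X.Y" from `nvcc --version` output ("release 12.1")
    or `nvidia-smi` output ("CUDA Version: 12.1")."""
    match = re.search(r"(?:release|CUDA Version:)\s*(\d+\.\d+)", text)
    return match.group(1) if match else None


def get_cuda_version():
    """Best-effort CUDA version detection.

    Tries `nvcc --version` (toolkit) first, then `nvidia-smi` (driver).
    Returns a string such as "12.1", or None if neither tool is found.
    """
    for cmd in (["nvcc", "--version"], ["nvidia-smi"]):
        try:
            output = subprocess.run(
                cmd, capture_output=True, text=True, timeout=5
            ).stdout
        except (FileNotFoundError, subprocess.TimeoutExpired, OSError):
            continue
        version = parse_cuda_version(output)
        if version:
            return version
    return None


def cuda12_onnxruntime_warning(cuda_version, available_providers):
    """Return a warning string when GPU acceleration is silently
    unavailable, else None."""
    if cuda_version is None or "CUDAExecutionProvider" in available_providers:
        return None
    if int(cuda_version.split(".")[0]) >= 12:
        return (
            f"CUDA {cuda_version} detected, but onnxruntime's "
            "CUDAExecutionProvider failed to load. CUDA 12 needs a dedicated "
            "onnxruntime-gpu build: https://onnxruntime.ai/docs/install/"
        )
    return "CUDAExecutionProvider unavailable; falling back to CPU execution."
```

retrieve_platform_specifics could then include get_cuda_version()'s result in its return dictionary, and benchmark_adapter.py could surface the warning before initializing GPU-backed models.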
Environment
pytorch/pytorch:2.2.0-cuda12.1-cudnn8-devel
Docker container, after running pip install inference-gpu
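One quick way to confirm whether the CUDAExecutionProvider actually loaded inside this container is to query onnxruntime's available providers. A sketch (the helper name is mine; it degrades gracefully if onnxruntime isn't installed):

```python
def describe_execution_providers():
    """Report which onnxruntime execution providers are available, if any."""
    try:
        import onnxruntime as ort
    except ImportError:
        return "onnxruntime is not installed"
    providers = ort.get_available_providers()
    if "CUDAExecutionProvider" in providers:
        return f"GPU acceleration available: {providers}"
    return f"CPU only: {providers}"


print(describe_execution_providers())
```

On an affected CUDA 12 machine this reports CPU-only providers even though a GPU is present, matching the warning in the bug description.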
Minimal Reproducible Example
Additional
No response
Are you willing to submit a PR?