
TensorRT model failure of TensorRT 8.6.1 when running a model on an RTX 4000 GPU #3829

Open
sashank-tirumala opened this issue Apr 26, 2024 · 6 comments

@sashank-tirumala

Description

I compiled a model on a device with an NVIDIA RTX 4000 GPU (driver version 535.171.04).
I then tried to run the model on a device with an NVIDIA RTX 4000 GPU (driver version 535.161.04).
It gives me a warning:
[TRT] [W] Using an engine plan file across different models of devices is not recommended and is likely to affect performance or even cause errors.
and then fails to run.
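As a side note, a hypothetical helper (not part of the linked script) makes the version difference explicit: both reported drivers come from the same 535 release branch and differ only in their point-release components.

```python
# Hypothetical helper: split NVIDIA driver version strings into integer
# components so the two reported drivers can be compared field by field.
def parse_driver_version(version: str) -> tuple:
    """'535.171.04' -> (535, 171, 4)"""
    return tuple(int(part) for part in version.split("."))

build_host = parse_driver_version("535.171.04")  # engine was built here
run_host = parse_driver_version("535.161.04")    # engine fails to run here

# Same 535 release branch; only the later components differ.
same_branch = build_host[0] == run_host[0]
```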

Environment

Device on which my TensorRT model was compiled:
TensorRT Version: 8.6.1
GPU Type: NVIDIA RTX 4000
Nvidia Driver Version: 535.171.04
CUDA Version: 12.2
CUDNN Version: 8.5
Operating System + Version: Ubuntu Jammy 22.04.4
Python Version (if applicable): 3.9.18
TensorFlow Version (if applicable): NA
PyTorch Version (if applicable): NA
Baremetal or Container (if container which image + tag): NVIDIA Release 23.07 (build 63868013)
NVIDIA TensorRT Version 8.6.1 (This container was used to run trtexec)

Device on which I attempted to run the TensorRT model:

TensorRT Version: 8.6.1
GPU Type: NVIDIA RTX 4000
Nvidia Driver Version: 535.161.07
CUDA Version: 12.2
CUDNN Version: 8.5
Operating System + Version: Ubuntu Jammy 22.04.4
Python Version (if applicable): 3.9.19
TensorFlow Version (if applicable): NA
PyTorch Version (if applicable): NA
Baremetal or Container (if container which image + tag): NA

Relevant Files

Model link: https://drive.google.com/file/d/102dE8e_fLO2rtnalnkbdhosWWytxINDM/view?usp=drive_link
Script link: https://drive.google.com/file/d/10So5Gf9F-RXTI10A69eCBKUY5znnB5zI/view?usp=drive_link

Steps To Reproduce

Download the data from the Google Drive links above. The data contains a simple test model and a test script, test.py.
In a Python environment with TensorRT 8.6.1 installed, run test.py on the GPU and CUDA system described above; you should see the same error.
Commands or scripts:
python test.py
Have you tried the latest release?:
No
Can this model run on other frameworks? The TensorRT model works on the system it was built on; the error only appears when I move it to a different system.
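The linked test.py is not reproduced here, but a minimal sketch of how such a script typically deserializes an engine plan with the TensorRT Python API might look like the following ("model.engine" is a placeholder path, and the import is guarded so the sketch degrades gracefully when TensorRT is not installed):

```python
# Sketch of a minimal engine-loading script; assumes TensorRT's Python
# bindings. "model.engine" stands in for the serialized plan file.
try:
    import tensorrt as trt
except ImportError:  # TensorRT not installed in this environment
    trt = None

def load_engine(plan_path):
    """Deserialize a TensorRT engine plan; return None if TensorRT is absent."""
    if trt is None:
        return None
    logger = trt.Logger(trt.Logger.WARNING)  # surfaces the cross-device warning
    runtime = trt.Runtime(logger)
    with open(plan_path, "rb") as f:
        return runtime.deserialize_cuda_engine(f.read())
```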

lix19937 commented Apr 27, 2024

Using an engine plan file across different models of devices is not recommended and is likely to affect performance or even cause errors

Usually this is just a warning; can you upload the full log?

In any case, you should build the engine plan and run it on the same machine/device (same GPU architecture and the same NVIDIA driver/TensorRT/CUDA environment).
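Rebuilding on the deployment machine is a common way to avoid the cross-device warning. A sketch using trtexec (model.onnx and model.engine are placeholder paths; the command is guarded so it is a no-op where trtexec is not installed):

```shell
# Rebuild the plan on the target machine so it matches the local
# driver/TensorRT/CUDA stack; paths are placeholders.
if command -v trtexec >/dev/null 2>&1; then
  trtexec --onnx=model.onnx --saveEngine=model.engine
else
  echo "trtexec not found; run inside the TensorRT container"
fi
```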

@lix19937

Similar problem #2900

@sashank-tirumala
Author

It's the same GPU (RTX 4000) in both systems; only the driver version changed slightly!


@zerollzeng (Collaborator)

Also, it would be good if you could try TRT 10, because we no longer fix bugs in older releases.

@zerollzeng (Collaborator)

BTW, what does nvidia-smi report on the two machines?
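A quick way to collect that on each machine, using standard nvidia-smi query flags (guarded so the snippet is a no-op on a machine without an NVIDIA driver):

```shell
# Print GPU name and driver version in CSV form.
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi --query-gpu=name,driver_version --format=csv
else
  echo "nvidia-smi not found"
fi
```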

@zerollzeng zerollzeng self-assigned this Apr 28, 2024
@zerollzeng zerollzeng added the triaged Issue has been triaged by maintainers label Apr 28, 2024