
TensorRT 8.6.3.1 package on PyPI for Triton NVIDIA Inference Server version > 24.01 #3862

Open
aptmess opened this issue May 13, 2024 · 1 comment
Assignees: zerollzeng
Labels: triaged (Issue has been triaged by maintainers)

aptmess commented May 13, 2024

Description

For NVIDIA Triton Inference Server versions newer than 24.01 (starting with 24.02), the supported TensorRT version is 8.6.3.1. I use the tensorrt Python package in a script that converts ONNX weights to a TensorRT engine, but the latest version available on PyPI is 8.6.1.6, so I can't use the tensorrt_backend in Triton and get this error:

The engine plan file is not compatible with this version of TensorRT, 
       expecting library version 8.6.3.1 got 8.6.1.6, please rebuild.
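
The two versions in the message are the TensorRT runtime inside Triton (8.6.3.1) and the library that serialized the engine (8.6.1.6). A quick way to confirm what the build environment is actually using is to print the installed package version:

python -c "import tensorrt as trt; print(trt.__version__)"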

Is it possible to upload this package version (8.6.3.1) to PyPI? Or how can I rewrite this script using other tools?

import tensorrt as trt

# TensorRT 8.x networks must be created with an explicit batch dimension.
explicit_batch_flag = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
max_batch_size: int = 32
max_workspace_size: int = 1 << 30  # 1 GiB builder workspace

logger = trt.Logger(trt.Logger.INFO)

with (
    trt.Builder(logger) as builder,
    builder.create_network(explicit_batch_flag) as network,
    trt.OnnxParser(network, logger) as parser
):
    config = builder.create_builder_config()
    config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, max_workspace_size)
    ...  # parse the ONNX file, then build and serialize the engine
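
If a matching wheel never lands on PyPI, one workaround is to build the engine with trtexec inside the NGC TensorRT container whose tag matches the Triton release, so the builder and the Triton runtime share the same library version. A minimal sketch, assuming the 24.02 TensorRT container ships the expected 8.6.3 build (the container tag, file paths, and workspace size below are illustrative):

# Build the plan with the same TensorRT build that Triton 24.02 expects.
docker run --rm --gpus all -v "$(pwd):/workspace" \
    nvcr.io/nvidia/tensorrt:24.02-py3 \
    trtexec --onnx=/workspace/model.onnx \
            --saveEngine=/workspace/model.plan \
            --memPoolSize=workspace:1024

The same container also has the matching tensorrt Python package preinstalled, so a build script like the one above can be run inside it unchanged.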

Steps To Reproduce

  1. pip install tensorrt==8.6.1.6
  2. run the script above to compile the ONNX model to a TensorRT engine
  3. run Triton with a version > 24.01 (the engine works on 24.01, but not on 24.02)

zerollzeng (Collaborator) commented
Looks like we didn't release this version on PyPI. To work around (WAR) this, you can:

  1. download and install TensorRT from the deb package (see the sketch below for the equivalent tar-package route).
  2. build the Python bindings on your own.
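
For the first option, the TensorRT GA packages from the NVIDIA developer site bundle the Python wheels next to the C++ libraries. A minimal sketch using the tar package (the archive and wheel names are illustrative and depend on your Python version and platform):

# Unpack the TensorRT 8.6.3 GA tar package and install the bundled wheel.
tar -xzf TensorRT-8.6.3.*.tar.gz
cd TensorRT-8.6.3.*/python
# Pick the wheel that matches your interpreter, e.g. CPython 3.10:
pip install tensorrt-*-cp310-none-linux_x86_64.whl

Rebuilding the engine with this wheel should then produce a plan that the > 24.01 Triton releases accept.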
