
The accuracy of the .pt model decreases after being converted to a .engine model. #12996

Open
1 of 2 tasks
arkerman opened this issue May 10, 2024 · 5 comments
Labels
bug Something isn't working

Comments

@arkerman

Search before asking

  • I have searched the YOLOv5 issues and found no similar bug report.

YOLOv5 Component

Detection

Bug

The results I obtain when running inference with the .pt model and with the .engine model are different.

Environment

No response

Minimal Reproducible Example

No response

Additional

No response

Are you willing to submit a PR?

  • Yes I'd like to help by submitting a PR!
arkerman added the bug label on May 10, 2024
@glenn-jocher
Member

@arkerman hello!

It's quite common to observe slight discrepancies in model performance when converting from a .pt file to a .engine file due to differences in optimization and precision handling between the two formats. To minimize such discrepancies, ensure that the precision used during conversion matches (e.g., FP32 in both cases) and that all optimization settings are similar.

If the performance difference is significant and these adjustments don't help, consider reviewing the conversion logs for any warnings or error messages that could indicate what might be going wrong during the process.
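For example (a sketch, assuming you export with YOLOv5's export.py script and that your weights file is yolov5s.pt; substitute your own), keeping both sides in FP32 means omitting the --half flag:

# Export a TensorRT engine in FP32 so its precision matches the .pt model
# (add --half only if you also evaluate the .pt model in FP16)
python export.py --weights yolov5s.pt --include engine --device 0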

Happy coding! 😊

@arkerman
Author

A warning will be reported during the conversion process:
"Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32."
So how should I convert the model to ensure accurate alignment?

@glenn-jocher
Member

Hi @arkerman!

The warning you're seeing indicates a mismatch in data types during the conversion process from ONNX to TensorRT, where TensorRT does not support INT64 weights. To help ensure better precision and compatibility, you could manually cast the weights from INT64 to FP32 before converting to TensorRT. This is generally more aligned with TensorRT's capabilities than INT32 and helps minimize potential loss of information. Here's a quick code snippet to adjust the data types in the ONNX model:

import numpy as np
import onnx
from onnx import numpy_helper

# Load your ONNX model
model = onnx.load('model.onnx')

# Collect the INT64 initializers first so the list is not mutated while iterating
int64_initializers = [
    init for init in model.graph.initializer
    if numpy_helper.to_array(init).dtype == np.int64
]

for initializer in int64_initializers:
    # Cast INT64 to FP32
    data = numpy_helper.to_array(initializer).astype(np.float32)
    # Replace the initializer with the new data
    new_initializer = numpy_helper.from_array(data, initializer.name)
    model.graph.initializer.remove(initializer)
    model.graph.initializer.append(new_initializer)

# Save the modified model
onnx.save(model, 'modified_model.onnx')

This snippet converts INT64 weights to FP32, which might help with your conversion process! 😊 Happy coding!
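As a quick sanity check (a minimal sketch, assuming only the onnx package is installed), you can structurally validate the modified graph before building the engine:

import onnx

# Load and validate the modified model; check_model raises an exception
# if the graph became inconsistent after the dtype changes
model = onnx.load('modified_model.onnx')
onnx.checker.check_model(model)
print('Graph is structurally valid')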

@arkerman
Author

@glenn-jocher Thanks for your help!
But it seems that the code snippet does not work.
It raised an error: "onnxruntime.capi.onnxruntime_pybind11_state.InvalidGraph: [ONNXRuntimeError] : 10 : INVALID_GRAPH : Load model from modified_yolov5s.onnx failed:This is an invalid model. Type Error: Type 'tensor(float)' of input parameter (onnx::Reshape_468) of operator (Reshape) in node (Reshape_237) is invalid."

@glenn-jocher
Member

Hi @arkerman!

It looks like the model is expecting a different data type for certain operations. Instead of converting all weights to FP32 indiscriminately, you might need to specifically target only those tensors that are compatible with such a conversion. Here's a refined approach to ensure you only adjust the necessary tensors:

import numpy as np
import onnx
from onnx import numpy_helper

# Load your ONNX model
model = onnx.load('model.onnx')

# Collect only the scalar (0-dimensional) INT64 initializers before modifying the graph
targets = []
for initializer in model.graph.initializer:
    data = numpy_helper.to_array(initializer)
    if data.dtype == np.int64 and data.ndim == 0:  # target scalar INT64 weights only
        targets.append(initializer)

for initializer in targets:
    data = numpy_helper.to_array(initializer).astype(np.float32)
    new_initializer = numpy_helper.from_array(data, initializer.name)
    model.graph.initializer.remove(initializer)
    model.graph.initializer.append(new_initializer)

# Save the modified model
onnx.save(model, 'modified_model.onnx')

This code now checks if the tensor is a scalar (0-dimensional) before converting, which might help avoid the type error you encountered. Give it a try and let us know how it goes! 😊
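To verify that the edit did not change the network's numerics, here is a minimal sketch (assuming onnxruntime is installed and a typical YOLOv5 input shape of 1x3x640x640; adjust the shape and file names to your model) that compares the original and modified ONNX files on the same input:

import numpy as np
import onnxruntime as ort

# Random input in the typical YOLOv5 layout (batch, channels, height, width)
x = np.random.rand(1, 3, 640, 640).astype(np.float32)

sess_orig = ort.InferenceSession('model.onnx', providers=['CPUExecutionProvider'])
sess_mod = ort.InferenceSession('modified_model.onnx', providers=['CPUExecutionProvider'])

input_name = sess_orig.get_inputs()[0].name
out_orig = sess_orig.run(None, {input_name: x})[0]
out_mod = sess_mod.run(None, {input_name: x})[0]

# Outputs should agree closely if the cast only touched benign tensors
print('max abs diff:', np.abs(out_orig - out_mod).max())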
