Different predict result from preview and my app for same image #683

Open
ilanb opened this issue May 14, 2024 · 6 comments
Labels
question A HUB question that does not involve a bug

Comments

ilanb commented May 14, 2024


Question

Hi, I use a PyTorch model in a custom Python app, and for the same images I get no detection from the model but a good detection (confidence 0.95) in the Ultralytics preview.

I use:
results = yolo_model.predict(Image.open(buf), iou=0.45, imgsz=640, max_det=1, device="cpu", augment=True, agnostic_nms=True)

I also tried:
results = yolo_model.predict(Image.open(buf))

This does not happen for every class (I have 50 classes); for most classes, detection works fine in my app.

Thanks for the help

Additional

No response

ilanb added the question label May 14, 2024
pderrenger (Member)

@ilanb hi there! 🙌

It sounds like you're experiencing inconsistencies in detection results between the Ultralytics preview and your custom Python application. A possible reason for this could be differences in preprocessing or configuration settings applied to the model before running predictions.

Since you’ve experimented with some parameters already, you might want to verify the following:

  • Image preprocessing: Ensure that the images are preprocessed in the same way in both environments (your app and the Ultralytics preview) before they are fed into the model.
  • Model configuration: Double-check that the model configuration (e.g., iou, imgsz, max_det, device, augment, agnostic_nms) mirrors the setup used in the Ultralytics preview as closely as possible.

Additionally, please ensure that you are using the same model version and weights in both your custom app and the preview for a fair comparison.
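
For example, one quick sanity check (a minimal sketch; the weights path is a placeholder) is to print the class names stored in the weights your app loads and compare them with the model shown in the preview:

from ultralytics import YOLO

# Placeholder path: use the exact weights file your app loads
model = YOLO("path_to_your_model.pt")

# Class names are stored inside the weights, so a mismatch here would mean
# the app and the preview are not running the same model
print(len(model.names), model.names)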

For specific instructions on setting and matching configuration, please refer to the Ultralytics HUB Docs at https://docs.ultralytics.com/hub.

Hope this helps! If the issue persists, please provide more details about your image preprocessing steps and model setup in both environments for further diagnosis. 😊

ilanb (Author) commented May 14, 2024

Thanks for the reply :-)

What do you mean by "differences in preprocessing"? I only upload the image from Postman or an HTML form, just like in your Ultralytics HUB preview.

What image preprocessing do you apply in the preview?

As for the parameters, I have already tested with and without them.

The image used is the same image that served for training, validation, and testing.

Here is my simple app:

from ultralytics import YOLO
from flask import request, Flask, jsonify
from flask_cors import CORS
from PIL import Image
import json

app = Flask(__name__)
CORS(app)

# YOLO model
yolo_model = None


@app.route("/")
def root():
    """
    Site main page handler function.
    :return: Content of index.html file
    """
    with open("index.html") as file:
        return file.read()


@app.route("/detect", methods=["POST"])
def detect():
    buf = request.files["image_file"]
    try:
        if yolo_model is None:
            raise Exception("Model has not been loaded.")
        boxes = detect_objects_on_image(buf.stream)
        return jsonify(boxes)
    except Exception as e:
        error_message = str(e)
        return jsonify({"error": error_message}), 500


@app.route("/detecthtml", methods=["POST"])
def detecthtml():
    confidence_threshold = float(request.args.get("confidence_threshold", 0.2))
    buf = request.files["image_file"]
    try:
        if yolo_model is None:
            raise Exception("Model has not been loaded.")
        boxes = detect_objects_on_image_html(buf.stream, confidence_threshold)
        return jsonify(boxes)
    except Exception as e:
        error_message = str(e)
        return jsonify({"error": error_message}), 500


def detect_objects_on_image(buf):
    try:
        confidence_threshold = float(request.form.get("confidence_threshold", "0.2"))
        results = yolo_model.predict(Image.open(buf), iou=0.45, imgsz=640, max_det=1, device="cpu", augment=True, agnostic_nms=True)
        # results = yolo_model.predict(Image.open(buf))
        result = results[0]
        output = []
        for box in result.boxes:
            class_id = box.cls[0].item()
            prob = round(box.conf[0].item(), 2)
            if prob >= confidence_threshold:
                output.append([result.names[class_id], prob])
        return output
    except Exception as e:
        raise Exception("Error during object detection: " + str(e))


def detect_objects_on_image_html(buf, confidence_threshold=0.2):
    try:
        results = yolo_model.predict(Image.open(buf), iou=0.45, imgsz=640, max_det=1, device="cpu", augment=True, agnostic_nms=True)
        result = results[0]
        output = []
        for box in result.boxes:
            x1, y1, x2, y2 = [round(x) for x in box.xyxy[0].tolist()]
            class_id = box.cls[0].item()
            prob = round(box.conf[0].item(), 2)
            if prob >= confidence_threshold:
                output.append([x1, y1, x2, y2, result.names[class_id], prob])
        return output
    except Exception as e:
        raise Exception("Error during object detection: " + str(e))


if __name__ == '__main__':
    yolo_model = YOLO("ponantyolo8.pt")
    app.run(debug=True, host='0.0.0.0', port=5080)

Thanks

pderrenger (Member)

Hi @ilanb, thanks for providing more details! 😊

When I mention "image preprocessing," I'm referring to how the image is prepared before it's input into the model. This includes resizing, normalization, and possibly other transformations to ensure the image is in the correct format for the model to process.

In the Ultralytics preview, images are typically resized and normalized to match the input expectations of the model. It's crucial to ensure that the same preprocessing steps are applied in your app as well.

From your code, it looks like you're directly using Image.open(buf) without explicitly resizing or normalizing the image. Here's a quick suggestion to ensure the image is resized correctly:

from PIL import Image

def prepare_image(image_path):
    img = Image.open(image_path)
    img = img.resize((640, 640))  # Resize the image to the expected input size
    return img

# Then use this function to prepare your image before prediction
img = prepare_image(buf)
results = yolo_model.predict(img, iou=0.45, imgsz=640, max_det=1, device="cpu", augment=True, agnostic_nms=True)

Make sure that the image size (imgsz) and other parameters match those used during the model's training and in the Ultralytics preview. This consistency is key to achieving similar detection results.
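
If you are unsure which image size the model was trained with, here is a small sketch for checking it (this assumes the Ultralytics .pt checkpoint stores its training arguments under "train_args", which is the usual layout; the path is a placeholder):

from ultralytics import YOLO

model = YOLO("path_to_your_model.pt")
train_args = (model.ckpt or {}).get("train_args", {})
print(train_args.get("imgsz"))  # image size used during training, if stored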

Let me know if aligning these preprocessing steps helps or if there's anything else you'd like to explore! 🚀

ilanb (Author) commented May 20, 2024

Thank you,
I tried that too, but same problem: no detection.

I also tried simplifying it to results = yolo_model.predict(img), but got the same result.

With the same image, detection works in your preview but fails with my code...

from ultralytics import YOLO
from flask import request, Flask, jsonify
from flask_cors import CORS
from PIL import Image
import json

app = Flask(__name__)
CORS(app)


@app.route("/")
def root():
    with open("index.html") as file:
        return file.read()


@app.route("/detect", methods=["POST"])
def detect():
    buf = request.files["image_file"]
    try:
        if yolo_model is None:
            raise Exception("Model has not been loaded.")
        boxes = detect_objects_on_image(buf.stream)
        return jsonify(boxes)
    except Exception as e:
        error_message = str(e)
        return jsonify({"error": error_message}), 500


@app.route("/detecthtml", methods=["POST"])
def detecthtml():
    confidence_threshold = float(request.args.get("confidence_threshold", 0.2))
    buf = request.files["image_file"]
    try:
        if yolo_model is None:
            raise Exception("Model has not been loaded.")
        boxes = detect_objects_on_image_html(buf.stream, confidence_threshold)
        return jsonify(boxes)
    except Exception as e:
        error_message = str(e)
        return jsonify({"error": error_message}), 500


def prepare_image(image_path):
    img = Image.open(image_path)
    img = img.resize((640, 640))  # Resize the image to the expected input size
    return img


def detect_objects_on_image(buf):
    try:
        confidence_threshold = float(request.form.get("confidence_threshold", "0.2"))
        img = prepare_image(buf)
        results = yolo_model.predict(img)
        result = results[0]
        output = []
        for box in result.boxes:
            class_id = box.cls[0].item()
            prob = round(box.conf[0].item(), 2)
            output.append([result.names[class_id], prob])
        return output
    except Exception as e:
        raise Exception("Error during object detection: " + str(e))


def detect_objects_on_image_html(buf, confidence_threshold=0.2):
    try:
        img = prepare_image(buf)
        results = yolo_model.predict(img)
        result = results[0]
        output = []
        for box in result.boxes:
            x1, y1, x2, y2 = [round(x) for x in box.xyxy[0].tolist()]
            class_id = box.cls[0].item()
            prob = round(box.conf[0].item(), 2)
            output.append([x1, y1, x2, y2, result.names[class_id], prob])
        return output
    except Exception as e:
        raise Exception("Error during object detection: " + str(e))


if __name__ == '__main__':
    yolo_model = YOLO("ponantyolo8.pt")
    app.run(debug=True, host='0.0.0.0', port=5080)

ilanb (Author) commented May 20, 2024

The strange behaviour is that when I run the exported .onnx model in Unity Sentis (C#), all images are detected correctly as well.

I don't understand what causes the problem with Python and the .pt model.

using System.Collections.Generic;
using System.Collections;
using Unity.Sentis;
using UnityEngine;
using UnityEngine.UI;
using UnityEngine.Video;
using Lays = Unity.Sentis.Layers;
using MugHeadStudios;

public class RunYOLO8n : MonoBehaviour
{
const string modelName = "lastponant.sentis";
// Link the classes.txt here:
public TextAsset labelsAsset;
// Create a Raw Image in the scene and link it here:
public RawImage displayImage;
// Link to a bounding box texture here:
public Sprite boxTexture;
// Link to the font for the labels:
public Font font;

const BackendType backend = BackendType.CPU;

private Transform displayLocation;
private Model model;
private IWorker engine;
private string[] labels;
private RenderTexture targetRT;


//Image size for the model
private const int imageWidth = 640;
private const int imageHeight = 640;

//The number of classes in the model
private const int numClasses = 50;

private VideoPlayer video;

List<GameObject> boxPool = new List<GameObject>();

[SerializeField, Range(0, 1)] float iouThreshold = 0.5f;
[SerializeField, Range(0, 1)] float scoreThreshold = 0.5f;
int maxOutputBoxes = 64;

//For using tensor operators:
Ops ops;

//bounding box data
public struct BoundingBox
{
    public float centerX;
    public float centerY;
    public float width;
    public float height;
    public string label;
}

public Texture2D[] textures; // Assign this array in the Unity Editor.
public float delayInSeconds = 5f; // Time delay between textures.

void Start()
{
    Application.targetFrameRate = 60;
    Screen.orientation = ScreenOrientation.LandscapeLeft;

    ops = WorkerFactory.CreateOps(backend, null);

    //Parse neural net labels
    labels = labelsAsset.text.Split('\n');

    LoadModel();

    targetRT = new RenderTexture(imageWidth, imageHeight, 0);

    //Create image to display video
    displayLocation = displayImage.transform;

    //Create engine to run model
    engine = WorkerFactory.CreateWorker(backend, model);
    
    if (textures.Length > 0)
    {
    	SubRoutines.Repeat(10, textures.Length*delayInSeconds, r => { StartCoroutine(LoadTexturesOneByOne(delayInSeconds)); }, () => { Debug.Log("Restarted"); });
    }
}

void LoadModel()
{
    //Load model
    model = ModelLoader.Load(Application.streamingAssetsPath + "/" + modelName);

    //The classes are also stored here in JSON format:
    Debug.Log($"Class names: \n{model.Metadata["names"]}");

    //We need to add some layers to choose the best boxes with the NMSLayer
    
    //Set constants
    model.AddConstant(new Lays.Constant("0", new int[] { 0 }));
    model.AddConstant(new Lays.Constant("1", new int[] { 1 }));
    model.AddConstant(new Lays.Constant("4", new int[] { 4 }));


    model.AddConstant(new Lays.Constant("classes_plus_4", new int[] { numClasses + 4 }));
    model.AddConstant(new Lays.Constant("maxOutputBoxes", new int[] { maxOutputBoxes }));
    model.AddConstant(new Lays.Constant("iouThreshold", new float[] { iouThreshold }));
    model.AddConstant(new Lays.Constant("scoreThreshold", new float[] { scoreThreshold }));
   
    //Add layers
    model.AddLayer(new Lays.Slice("boxCoords0", "output0", "0", "4", "1")); 
    model.AddLayer(new Lays.Transpose("boxCoords", "boxCoords0", new int[] { 0, 2, 1 }));
    model.AddLayer(new Lays.Slice("scores0", "output0", "4", "classes_plus_4", "1")); 
    model.AddLayer(new Lays.ReduceMax("scores", new[] { "scores0", "1" }));
    model.AddLayer(new Lays.ArgMax("classIDs", "scores0", 1));

    model.AddLayer(new Lays.NonMaxSuppression("NMS", "boxCoords", "scores",
        "maxOutputBoxes", "iouThreshold", "scoreThreshold",
        centerPointBox: Lays.CenterPointBox.Center
    ));

    model.outputs.Clear();
    model.AddOutput("boxCoords");
    model.AddOutput("classIDs");
    model.AddOutput("NMS");
}

IEnumerator LoadTexturesOneByOne(float wait)
{
	foreach (var texture in textures)
	{
		displayImage.texture = texture;

		if (displayImage == null || texture == null)
		{
			Debug.LogError("Please assign the RawImage and testImage in the Inspector.");
		}

		// Set the test image as the texture of the Raw Image
		displayImage.texture = texture;

		// Calculate the aspect ratio of the test image
		float aspectRatio = (float)texture.width / texture.height;

		// Get the screen dimensions
		float screenWidth = Screen.width;
		float screenHeight = Screen.height;

		// Calculate the size of the RawImage to fit the screen while maintaining aspect ratio
		float imageWidth = screenWidth;
		float imageHeight = screenWidth / aspectRatio;

		if (imageHeight > screenHeight)
		{
			imageHeight = screenHeight;
			imageWidth = screenHeight * aspectRatio;
		}

		// Set the size of the RawImage
		RectTransform rawImageRect = displayImage.GetComponent<RectTransform>();
		rawImageRect.sizeDelta = new Vector2(imageWidth, imageHeight);
		// Perform ML inference on the test image
		ExecuteML(texture);
		
		yield return new WaitForSeconds(wait);		
	}
}

private void Update()
{
    if (Input.GetKeyDown(KeyCode.Escape))
    {
        Application.Quit();
    }
}

public void ExecuteML(Texture2D inputTexture)
{
    ClearAnnotations();

    // Process the input texture
    using var input = TextureConverter.ToTensor(inputTexture, imageWidth, imageHeight, 3);
    engine.Execute(input);

    var boxCoords = engine.PeekOutput("boxCoords") as TensorFloat;
    var NMS = engine.PeekOutput("NMS") as TensorInt;
    var classIDs = engine.PeekOutput("classIDs") as TensorInt;

    using var boxIDs = ops.Slice(NMS, new int[] { 2 }, new int[] { 3 }, new int[] { 1 }, new int[] { 1 });
    using var boxIDsFlat = boxIDs.ShallowReshape(new TensorShape(boxIDs.shape.length)) as TensorInt;
    using var output = ops.Gather(boxCoords, boxIDsFlat, 1);
    using var labelIDs = ops.Gather(classIDs, boxIDsFlat, 2);
    
    output.MakeReadable();
    labelIDs.MakeReadable();

    float displayWidth = displayImage.rectTransform.rect.width;
    float displayHeight = displayImage.rectTransform.rect.height;

    float scaleX = displayWidth / imageWidth;
    float scaleY = displayHeight / imageHeight;

    //Draw the bounding boxes
    for (int n = 0; n < output.shape[1]; n++)
    {
        var box = new BoundingBox
        {
            centerX = output[0, n, 0] * scaleX - displayWidth / 2,
            centerY = output[0, n, 1] * scaleY - displayHeight / 2,
            width = output[0, n, 2] * scaleX,
            height = output[0, n, 3] * scaleY,
            label = labels[labelIDs[0, 0,n]],
        };
        DrawBox(box, n);
    }
}

public void DrawBox(BoundingBox box , int id)
{
    //Create the bounding box graphic or get from pool
    GameObject panel;
    if (id < boxPool.Count)
    {
        panel = boxPool[id];
        panel.SetActive(true);
    }
    else
    {
        panel = CreateNewBox(Color.yellow);
    }
    //Set box position
    panel.transform.localPosition = new Vector3(box.centerX, -box.centerY);

    //Set box size
    RectTransform rt = panel.GetComponent<RectTransform>();
    rt.sizeDelta = new Vector2(box.width, box.height);
    
    //Set label text
    var label = panel.GetComponentInChildren<Text>();
    label.text = box.label;
}

public GameObject CreateNewBox(Color color)
{
    //Create the box and set image

    var panel = new GameObject("ObjectBox");
    panel.AddComponent<CanvasRenderer>();
    Image img = panel.AddComponent<Image>();
    img.color = color;
    img.sprite = boxTexture;
    img.type = Image.Type.Sliced;
    panel.transform.SetParent(displayLocation, false);

    //Create the label

    var text = new GameObject("ObjectLabel");
    text.AddComponent<CanvasRenderer>();
    text.transform.SetParent(panel.transform, false);
    Text txt = text.AddComponent<Text>();
    txt.font = font;
    txt.color = color;
    txt.fontSize = 40;
    txt.horizontalOverflow = HorizontalWrapMode.Overflow;

    RectTransform rt2 = text.GetComponent<RectTransform>();
    rt2.offsetMin = new Vector2(20, rt2.offsetMin.y);
    rt2.offsetMax = new Vector2(0, rt2.offsetMax.y);
    rt2.offsetMin = new Vector2(rt2.offsetMin.x, 0);
    rt2.offsetMax = new Vector2(rt2.offsetMax.x, 30);
    rt2.anchorMin = new Vector2(0, 0);
    rt2.anchorMax = new Vector2(1, 1);

    boxPool.Add(panel);
    return panel;
}

public void ClearAnnotations()
{
    foreach(var box in boxPool)
    {
        box.SetActive(false);
    }
}

private void OnDestroy()
{
    engine?.Dispose();
    ops?.Dispose();
}

}

pderrenger (Member)

Hi @ilanb,

It's intriguing that the .onnx model works well in Unity with Sentis but not the .pt model in Python. This could suggest a difference in how the models handle the input data or in the post-processing steps.

Here are a couple of things to consider:

  • Model Version Compatibility: Ensure that the .pt model and the .onnx model come from the same training run and have the same architecture (see the comparison sketch after this list).
  • Input Normalization: Check if there's a difference in how images are preprocessed and normalized in Unity vs. your Python setup. Sometimes, models expect inputs normalized in a specific way (e.g., scaled to [0,1] or mean-subtracted).
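
As a comparison sketch (paths are placeholders, and it assumes onnxruntime is installed), you could export a fresh ONNX copy of the .pt weights from Python and run the same image through both to confirm they produce the same detections:

from ultralytics import YOLO

# Load the .pt weights and export a fresh ONNX copy of the same weights
pt_model = YOLO("path_to_your_model.pt")
onnx_path = pt_model.export(format="onnx", imgsz=640)
onnx_model = YOLO(onnx_path)

# Run the same image through both and compare classes and confidences
for name, m in (("pt", pt_model), ("onnx", onnx_model)):
    r = m.predict("path_to_your_image.jpg", imgsz=640)[0]
    print(name, r.boxes.cls.tolist(), r.boxes.conf.tolist())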

If you haven't already, you might also want to try running inference with a very simple setup in Python to rule out any issues with Flask or image handling:

from ultralytics import YOLO
from PIL import Image

# Load model
model = YOLO("path_to_your_model.pt")

# Load image
img = Image.open("path_to_your_image.jpg")
img = img.resize((640, 640))

# Predict
results = model.predict(img)
print(results)

This minimal example can help isolate the problem by removing potential complications from web server code or image streaming.
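
As a follow-up check (the threshold value here is only illustrative), lowering the confidence threshold in that minimal script shows whether the model sees the objects at all, since predictions below the default threshold of 0.25 are otherwise filtered out:

results = model.predict(img, conf=0.05, verbose=True)
for box in results[0].boxes:
    print(results[0].names[int(box.cls[0])], float(box.conf[0]))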

Let me know how it goes! 🚀
