Losing the pretrained model weights when using new data to retrain the already trained model. #12666

SIDD-1991 · 2024-05-13T13:42:01Z

Search before asking

I have searched the YOLOv8 issues and discussions and found no similar questions.

Question

I have a model that I already trained to distinguish between two classes, and it works fine. However, when I try to train that model further using new data for just one class, it seems to lose all the metrics of the other class and is unable to recognize that former class.

from ultralytics import YOLO
model = YOLO("best.pt")
results = model.train(pretrained=True, data='train.yaml', epochs=10)

This new data contains only one class. For example, the pretrained model was trained on cats and dogs, and now I am trying to retrain it with more images of dogs.
When I do that, it is unable to recognize the cats.

Additional

No response

github-actions · 2024-05-13T13:42:27Z

👋 Hello @SIDD-1991, thank you for your interest in Ultralytics YOLOv8 🚀! We recommend a visit to the Docs for new users where you can find many Python and CLI usage examples and where many of the most common questions may already be answered.

If this is a 🐛 Bug Report, please provide a minimum reproducible example to help us debug it.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.

Join the vibrant Ultralytics Discord 🎧 community for real-time conversations and collaborations. This platform offers a perfect space to inquire, showcase your work, and connect with fellow Ultralytics users.

Install

Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.

pip install ultralytics

Environments

YOLOv8 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Notebooks with free GPU:
Google Cloud Deep Learning VM. See GCP Quickstart Guide
Amazon Deep Learning AMI. See AWS Quickstart Guide
Docker Image. See Docker Quickstart Guide

Status

If this badge is green, all Ultralytics CI tests are currently passing. CI tests verify correct operation of all YOLOv8 Modes and Tasks on macOS, Windows, and Ubuntu every 24 hours and on every commit.

glenn-jocher · 2024-05-13T19:29:16Z

It sounds like you're dealing with a mix-up in training tasks using the YOLOv8 models. Here’s how you can address this:

When you initially train a detection model with yolov8-seg.yaml but specify train='detect', the model is fine-tuned for detection tasks only, even if the YAML file is configured for segmentation.

To train for segmentation after initially training for detection, you need to ensure that the model configuration file (i.e., the YAML file) and your specified training task are aligned with segmentation. You should check that your YAML configuration aligns with segmentation requirements, especially the last layers and their configurations.

If your YAML configuration ends with a detect layer when it should be set up for segmentation, you'll need to adjust the architecture in the YAML to have segmentation-specific layers. Make sure these configurations reflect the need to output masks and not just bounding boxes.

Here's a brief example of how you might adjust your training call to switch tasks correctly:

model = YOLO('stf-yolov8.yaml')  # Ensure this YAML is set up for segmentation
results = model.train(data='your_segmentation_dataset.yaml', task='segment')

Make sure that 'your_segmentation_dataset.yaml' is set up correctly for segmentation tasks, including paths to images and their corresponding masks.

Changing the task dynamically without ensuring the underlying model architecture supports it will typically not provide the desired outcome.

SIDD-1991 · 2024-05-14T05:22:19Z

I have a model that I already trained to distinguish between two classes, and it works fine. However, when I try to train that model further using new data for just one class, it seems to lose all the metrics of the other class and is unable to recognize that former class.

from ultralytics import YOLO
model = YOLO("best.pt")
results = model.train(pretrained=True, data='train.yaml', epochs=10)

This new data contains only one class. For example, the pretrained model was trained on cats and dogs, and now I am trying to retrain it with more images of dogs.
When I do that, it is unable to recognize the cats.

Note: the image size of the original training set was 128x128 and the new images of dogs are 640x640. Does that make any difference?

SIDD-1991 · 2024-05-14T05:24:39Z

Let's say we call model.train(model='yolov8-seg.yaml', train='detect'). Which would be implemented, detection or segmentation?

We planned to train a backbone model. Since we didn't have mask segment annotations for the large dataset, we did detection training. Now, using the best model from that training, we are training it on another, smaller dataset that has mask segment annotations. But when I train, I get a detect layer in the last line of model.yaml, so how can I use it as a segment model?

I tried changing to task='segment', but I don't get instance segmentation, only box detection.

The backbone model is stf-yolov8. Now I want to train that trained model with mask segment annotations.

To be honest, I don't think you should be posting this as a comment. This is a separate question, so it's better to raise a new thread so that others with the same issue can easily find it in the search results.

SIDD-1991 · 2024-05-14T08:55:30Z

I read on a forum that if I retrain the model with only new data, it will forget the old data, so the correct way is to retrain with the new data added to the previous data.
Now my question is: if I have best.pt from the previous training, and I use

model = YOLO("best.pt")

or model = YOLO("yolov8m.pt")

would there be any difference in training accuracy? That is, would the already trained model, when retrained with the same data plus some additional data, be better in any way than just using the fresh pretrained model?

glenn-jocher · 2024-05-14T09:27:00Z

You're right in thinking that retraining a model on new data without the old data will cause it to forget the previously learned classes—a phenomenon known as "catastrophic forgetting." To preserve previous learning while incorporating new data, you should combine both old and new datasets in the retraining process.
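As a rough illustration of combining the datasets, the sketch below generates a dataset YAML whose train and val entries list both the old and the new image directories. The directory paths and the helper name `combined_dataset_yaml` are hypothetical, and it assumes the new dog labels use the same class ID for "dog" as the original dataset did.

```python
def combined_dataset_yaml(train_dirs, val_dirs, names):
    """Return dataset YAML text that lists several image directories per split."""
    lines = ["train:"]
    lines += [f"  - {d}" for d in train_dirs]
    lines.append("val:")
    lines += [f"  - {d}" for d in val_dirs]
    lines.append("names:")
    # Class IDs follow list order, so keep the original order of the classes.
    lines += [f"  {idx}: {name}" for idx, name in enumerate(names)]
    return "\n".join(lines) + "\n"


# Hypothetical paths: replace with your own old and new dataset directories.
yaml_text = combined_dataset_yaml(
    train_dirs=["datasets/old/images/train", "datasets/new_dogs/images/train"],
    val_dirs=["datasets/old/images/val", "datasets/new_dogs/images/val"],
    names=["cat", "dog"],
)
print(yaml_text)
```

Saving this text to a file (for example combined.yaml) and passing it as the `data` argument of `model.train` would let the model see cats and dogs in every epoch.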

Regarding your question on using best.pt vs. a fresh yolov8m.pt:

model = YOLO("best.pt"): This loads your model previously trained on your specific dataset. If you retrain this model with combined old and new data, it should help improve its accuracy for both old and new classes as it continues to learn from both.
model = YOLO("yolov8m.pt"): This would be starting from a pre-trained model but not one customized to your specific previous data. This could be helpful if your previous model was significantly overfitting or if you want to approach retraining with a less biased starting point.

In summary, using best.pt allows you to benefit from previous training, potentially leading to improved performance when retrained with an expanded dataset.
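Before retraining from either starting point, it may be worth confirming that the combined data really contains boxes for every class. The sketch below counts boxes per class ID, assuming YOLO-format label files (one `class x_center y_center width height` row per box). The helper name `count_label_classes` and the temporary demo files are made up for illustration.

```python
import tempfile
from collections import Counter
from pathlib import Path


def count_label_classes(label_dirs):
    """Count boxes per class ID across YOLO-format .txt label files."""
    counts = Counter()
    for label_dir in label_dirs:
        for label_file in Path(label_dir).glob("*.txt"):
            for line in label_file.read_text().splitlines():
                parts = line.split()
                if parts:  # skip blank lines
                    counts[int(parts[0])] += 1
    return counts


# Demo on throwaway label files; point label_dirs at your real label folders instead.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "img1.txt").write_text("0 0.5 0.5 0.2 0.2\n1 0.3 0.3 0.1 0.1\n")
    Path(tmp, "img2.txt").write_text("1 0.6 0.4 0.3 0.3\n")
    counts = count_label_classes([tmp])

print(dict(counts))
```

If a class ID from the original training is missing from the counts, or has very few boxes, the retrained model is likely to drift away from that class.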

SIDD-1991 · 2024-05-14T10:33:32Z

Will using best.pt be any faster? I mean, could I work with fewer epochs if I use best.pt rather than starting from scratch?

glenn-jocher · 2024-05-20T03:09:35Z

Absolutely! Using best.pt for further training can indeed save you time. Since best.pt is already trained on your data, it has learned some of the necessary features, allowing you to potentially achieve similar or better performance with fewer epochs compared to starting from scratch. Here’s a quick example:

from ultralytics import YOLO
model = YOLO("best.pt")
# new_train.yaml should reference the combined old and new data
results = model.train(data='new_train.yaml', epochs=5)  # Fewer epochs might be sufficient

Just make sure to monitor the performance to ensure the model is still improving and not overfitting. Happy training! 😊
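One possible way to monitor this is to compare a per-class score (for example, per-class mAP from validation) before and after the extra training and flag any class that dropped noticeably. The sketch below is only an illustration: the helper name `find_forgotten_classes`, the tolerance, and the numbers are made up and are not measurements from this thread.

```python
def find_forgotten_classes(before, after, tolerance=0.05):
    """Return {class: (old_score, new_score)} for classes that dropped by more than `tolerance`."""
    return {
        name: (before[name], after.get(name, 0.0))
        for name in before
        if before[name] - after.get(name, 0.0) > tolerance
    }


# Illustrative per-class scores only; substitute the values from your own validation runs.
before = {"cat": 0.80, "dog": 0.78}
after = {"cat": 0.10, "dog": 0.85}

forgotten = find_forgotten_classes(before, after)
print(forgotten)
```

A non-empty result suggests that the fine-tuning data under-represents those classes and that the old data should be mixed back in.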

SIDD-1991 added the "question" label (Further information is requested) on May 13, 2024
