
Significant Drop in Performance when Switching between YOLOv8n-seg Models #12656

AirdEliya opened this issue May 13, 2024 · 4 comments
Labels
bug (Something isn't working), non-reproducible (Bug is not reproducible)

@AirdEliya

Search before asking

  • I have searched the YOLOv8 issues and found no similar bug report.

YOLOv8 Component

Augmentation, Hyperparameter Tuning

Bug

Hello,

Today I encountered an issue while training the yolov8n-seg model on my custom dataset. After approximately 100 epochs, average precision (AP) reached around 0.75 for both the mask and bounding-box prediction tasks. However, when I tried different conditions with the yolov8n-seg-p6 model, I noticed a significant drop in performance. I then reverted to the yolov8n-seg model and wanted to test whether the copy_paste function had any effect. Unfortunately, in that particular experiment, the AP dropped drastically.

Subsequently, I restored the input parameters to the conditions that had achieved an AP of 0.75 before I switched to the p6 model. However, when I retrained the yolov8n-seg model under those previously successful conditions, the AP fell to only 0.469. I made no changes to the data itself, so I am unsure what caused such a substantial drop in performance after switching to the p6 model and subsequently using the copy_paste function.

Thank you.

Environment

No response

Minimal Reproducible Example

from ultralytics.models.yolo.segment import SegmentationTrainer

args = dict(
    model='yolov8n-seg.yaml',
    data='2dmaterial-0304-seg.yaml',
    epochs=100,
    batch=8,
    device=0,
    task='segment',
    name='tiny_e100_b8',
    project='tiny_yolov8_model',
    close_mosaic=10,
)
trainer = SegmentationTrainer(overrides=args)
trainer.train()
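To make the two runs directly comparable, it can help to build both override dicts from one shared base so that only the model YAML (and run name) differ. This is a minimal sketch; it assumes 'yolov8n-seg-p6.yaml' resolves in your ultralytics install, and the actual training calls are left commented out:

```python
# Shared settings for both runs; only 'model' and 'name' will differ.
base = dict(
    data='2dmaterial-0304-seg.yaml', epochs=100, batch=8, device=0,
    task='segment', project='tiny_yolov8_model', close_mosaic=10,
)

def diff_overrides(a, b):
    """Return the sorted keys whose values differ between two override dicts."""
    keys = set(a) | set(b)
    return sorted(k for k in keys if a.get(k) != b.get(k))

run_a = dict(base, model='yolov8n-seg.yaml', name='tiny_e100_b8')
run_b = dict(base, model='yolov8n-seg-p6.yaml', name='tiny_e100_b8_p6')

print(diff_overrides(run_a, run_b))  # only 'model' and 'name' should differ

# With ultralytics installed:
# from ultralytics.models.yolo.segment import SegmentationTrainer
# for args in (run_a, run_b):
#     SegmentationTrainer(overrides=args).train()
```

Checking the diff before launching makes it obvious when a run accidentally changes more than the model architecture.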

Additional

No response

Are you willing to submit a PR?

  • Yes I'd like to help by submitting a PR!
@AirdEliya AirdEliya added the bug Something isn't working label May 13, 2024

👋 Hello @AirdEliya, thank you for your interest in Ultralytics YOLOv8 🚀! We recommend a visit to the Docs for new users where you can find many Python and CLI usage examples and where many of the most common questions may already be answered.

If this is a 🐛 Bug Report, please provide a minimum reproducible example to help us debug it.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.

Join the vibrant Ultralytics Discord 🎧 community for real-time conversations and collaborations. This platform offers a perfect space to inquire, showcase your work, and connect with fellow Ultralytics users.

Install

Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.

pip install ultralytics

Environments

YOLOv8 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):


@glenn-jocher
Member

@AirdEliya hello,

Thanks for bringing this issue to our attention and for the detailed description of your experience with the YOLOv8n-seg models. It sounds like you've been very thorough in your approach. 🕵️‍♂️

Switching models can sometimes lead to performance variations due to differences in model architectures and how they handle feature learning, particularly when moving to a variant like yolov8n-seg-p6. Adjustments in hyperparameters, as you've attempted, are a good response, but the issue might also relate to other factors such as initialization states or minor changes in training conditions that might not be immediately apparent.

A potential first step to diagnose the issue could be to ensure consistency in the model's state before and after switching models or performing significant training operations like using the copy_paste function. You could also try monitoring the intermediate outputs and training dynamics to spot any abrupt changes or anomalies.

Here’s a simple code snippet to check for model parameter consistency:

import torch

def check_model_consistency(model1, model2):
    """Return True if the two models have identical parameter tensors."""
    for p1, p2 in zip(model1.parameters(), model2.parameters()):
        if not torch.allclose(p1, p2):
            return False
    return True

# Usage (assumes you kept a copy of the earlier model state):
is_consistent = check_model_consistency(trainer.model, previous_model_state)
print('Model consistency:', is_consistent)

This is simplistic and assumes you have access to both model states but could be a starting point to ensure that the models’ parameters aren’t altered unintentionally.

We're here to help and would appreciate your continued feedback or contributions, especially if you discover a solution that might benefit others encountering similar challenges!

@glenn-jocher glenn-jocher added the non-reproducible Bug is not reproducible label May 14, 2024
@AirdEliya
Author

AirdEliya commented May 15, 2024

Hello @glenn-jocher,

I wanted to let you know that I have resolved the issue I previously asked about. The cause turned out to be simple: when using the yolov8n-seg model, I had followed the Segmentation task instructions and passed yolov8n-seg.pt, so the pretrained .pt file was downloaded to the current path and used as initialization. I have not yet looked into how that pretrained file was produced.

Additionally, I had intended to pass yolov8n-seg-p6.yaml to train the p6 architecture from scratch, but when reverting I accidentally passed yolov8n-seg.yaml, which explains the difference between the two runs. I apologize for any inconvenience caused by these inquiries.

Moving forward, I have a few more questions. Once a good number of epochs is found using the pretrained model with the initial conditions, and a good mAP is achieved on the validation data, how should the other parameters be adjusted? And what strategy should be employed for data augmentation so that the parameters are gradually tuned to fit my dataset?

I would also like to know how the pretrained yolov8n-seg model itself was trained.

Thank you.
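For anyone hitting the same mix-up: the file extension of the `model` argument decides whether training starts from pretrained weights or from scratch. A minimal sketch of a guard helper (the `describe_model_arg` function is hypothetical, not part of the ultralytics API) that classifies the argument before launching a run:

```python
from pathlib import Path

def describe_model_arg(model: str) -> str:
    """Classify an ultralytics `model` argument by its suffix:
    '.pt'          -> pretrained weights (fine-tuning run),
    '.yaml'/'.yml' -> architecture definition only (training from scratch)."""
    suffix = Path(model).suffix
    if suffix == '.pt':
        return 'pretrained'
    if suffix in ('.yaml', '.yml'):
        return 'from-scratch'
    raise ValueError(f'unexpected model argument: {model}')

print(describe_model_arg('yolov8n-seg.pt'))    # pretrained
print(describe_model_arg('yolov8n-seg.yaml'))  # from-scratch

# With the actual API (requires ultralytics installed):
# from ultralytics import YOLO
# model = YOLO('yolov8n-seg.pt')    # downloads/loads pretrained weights
# model = YOLO('yolov8n-seg.yaml')  # builds an untrained model from the YAML
```

Logging this classification at the start of each run makes .pt vs .yaml mix-ups visible immediately.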

@glenn-jocher
Member

Hello @AirdEliya,

Great to hear that you've resolved the issue! It's easy to mix up file names, especially when switching between similar model configurations. 😅

For your questions on fine-tuning the model:

  1. Adjusting Parameters: After finding a good epoch, consider adjusting the learning rate and batch size to see if you can squeeze out more performance without overfitting. Also, experiment with different optimizers if you haven't settled on one yet.
  2. Data Augmentation: Start with basic augmentations like rotation, flipping, and scaling. Gradually introduce more complex augmentations like color adjustments and noise. Monitor the model's performance on validation data with each change to ensure improvements.
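The staged approach above can be sketched with the standard ultralytics augmentation hyperparameters (degrees, fliplr, scale, hsv_*, copy_paste). The specific values here are illustrative starting points, not tuned recommendations; retrain and compare validation mAP between stages before moving on:

```python
# Stage 1: geometric basics; each later stage adds one group of augmentations.
stage1 = dict(degrees=10.0, fliplr=0.5, scale=0.5)        # rotation, flip, scale
stage2 = dict(stage1, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4)  # + color jitter
stage3 = dict(stage2, copy_paste=0.3)                     # + copy-paste (segmentation)

for name, stage in [('stage1', stage1), ('stage2', stage2), ('stage3', stage3)]:
    print(name, sorted(stage))

# With ultralytics installed, pass a stage into training, e.g.:
# model.train(data='your_dataset.yaml', epochs=100, **stage3)
```

Keeping each stage as a dict that extends the previous one makes it easy to attribute any mAP change to the augmentation group that was just added.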

To train the pretrained yolov8n-seg model, you can use the following example command:

yolo segment train model=yolov8n-seg.pt data=your_dataset.yaml

Make sure your dataset is properly configured in the .yaml file.

Keep experimenting, and don't hesitate to reach out if you have more questions! 🚀
