training does not start #12644

Open
1 task done
faridamousa opened this issue May 13, 2024 · 1 comment
Labels
question Further information is requested

Comments

@faridamousa

Search before asking

Question

I am trying to train a model. This is my code:
from ultralytics import YOLO

model = YOLO("yolov8n.yaml")  # build a new, untrained model from the YAML definition

# Train the model
model.train(data="config.yaml", epochs=30, batch=4, imgsz=256)

# Validate the trained model and print the box metrics
metrics = model.val(data="config.yaml")
print(metrics.box)

When I run the file, epoch 1 reaches 4% and then training stops. Why is this happening? It has happened to me with both YOLOv8 and YOLOv9.

Additional

No response

@faridamousa faridamousa added the question Further information is requested label May 13, 2024
@glenn-jocher
Member

@faridamousa hello! It appears that the training process halts unexpectedly at 4% during the first epoch. Here are a few suggestions that might help to troubleshoot and resolve the issue:

  1. Check the Dataset: Verify that config.yaml is correctly set up with valid paths to your dataset folders and that the images and labels are accessible and properly formatted.

  2. Hardware Resources: Ensure that your hardware resources are not being maxed out. Monitor the CPU and memory usage, and if you are training on GPU, check for any potential issues with the CUDA environment or out-of-memory errors.

  3. Terminal Output/Logs: Look closely at any error messages or warnings in the terminal output or logs generated during the training process. These might give more context on why the training is stopping.

  4. Version Compatibility: Confirm that your YOLOv8 or YOLOv9 environment is set up with compatible versions of dependencies like PyTorch, CUDA, etc.

  5. Simplify Your Configuration: Try reducing batch or imgsz further to see whether training then progresses past the 4% mark.
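As a rough illustration of the dataset check in step 1, the sketch below scans an image folder and its label folder for missing or malformed label files. It assumes a detection dataset in the usual YOLO text format (one `class x_center y_center width height` line per object, with coordinates normalized to the 0 to 1 range) and parallel image and label folders. The function names `check_label_file` and `check_dataset` are hypothetical helpers, not part of the Ultralytics API, and the folder paths must be taken from your own config.yaml.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match your dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def check_label_file(label_path):
    """Return a list of problems found in one YOLO-format detection label file."""
    problems = []
    lines = label_path.read_text().splitlines()
    for line_no, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # blank lines are ignored
        parts = line.split()
        where = f"{label_path.name}:{line_no}"
        if len(parts) != 5:
            problems.append(f"{where}: expected 5 fields, got {len(parts)}")
            continue
        try:
            class_id = int(parts[0])
            coords = [float(value) for value in parts[1:]]
        except ValueError:
            problems.append(f"{where}: non-numeric value")
            continue
        if class_id < 0 or any(not 0.0 <= c <= 1.0 for c in coords):
            problems.append(f"{where}: class id or coordinate out of range")
    return problems


def check_dataset(images_dir, labels_dir):
    """Return a list of problems for one split (e.g. the train folders)."""
    images_dir, labels_dir = Path(images_dir), Path(labels_dir)
    problems = [f"missing folder: {d}" for d in (images_dir, labels_dir) if not d.is_dir()]
    if problems:
        return problems
    for image in sorted(images_dir.iterdir()):
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        label = labels_dir / (image.stem + ".txt")
        if not label.is_file():
            # An image without labels may be an intentional background image.
            problems.append(f"no label file for {image.name}")
        else:
            problems.extend(check_label_file(label))
    return problems
```

Calling `check_dataset("path/to/images/train", "path/to/labels/train")` with your own paths returns an empty list when nothing suspicious is found. Images without a label file are reported so that you can confirm they are meant to be background images.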

If none of these suggestions resolve the issue, it would be helpful to have more details such as terminal output/errors, hardware specifications, and the exact content of config.yaml. This information will help in diagnosing the problem more effectively.
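To make those details easier to share, a small script along the following lines could gather the environment information. This is only a sketch: `collect_environment` is a hypothetical helper name, the package list is an assumption about what is relevant here, and the CUDA fields are filled in only if `torch` can be imported.

```python
import platform
from importlib import metadata


def collect_environment(packages=("ultralytics", "torch", "torchvision")):
    """Gather Python, OS, package, and CUDA details into a dict of strings."""
    info = {
        "python": platform.python_version(),
        "platform": platform.platform(),
    }
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            info[name] = "not installed"
    try:
        import torch  # optional; only needed for the CUDA fields

        info["cuda_available"] = str(torch.cuda.is_available())
        info["cuda_version"] = str(torch.version.cuda)
    except Exception:
        # Covers a missing torch as well as a broken CUDA or driver setup.
        info["cuda_available"] = "unknown (torch could not be imported)"
    return info


if __name__ == "__main__":
    for key, value in collect_environment().items():
        print(f"{key}: {value}")
```

Pasting the printed lines into the issue, together with the full terminal output of the failed run and the content of config.yaml, should cover most of the information requested above.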
