Wildfire Smoke Detection With YOLOv5, Roboflow And Weights & Biases

Train and Debug YOLOv5 Models with Weights & Biases
In this Colab, we'll demonstrate how to use the W&B integration with version 5 of the "You Only Look Once" (aka YOLOv5) real-time object detection framework to track model metrics, inspect model outputs, and restart interrupted runs. We'll also make use of Roboflow's functionality for preprocessing and annotating our computer vision datasets.
Setup
Detect
YOLOv5 provides highly accurate, fast models that are pretrained on the Common Objects in Context (COCO) dataset.
If your object detection application involves only classes from the COCO dataset, like "Stop Sign" and "Pizza", then these pretrained models may be all you need!
The cell below runs a pretrained model on an example image
using detect.py from the YOLOv5 toolkit.
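As a sketch, the invocation can be assembled as an argument list like the one below; the weights file and sample image path are assumptions based on the defaults that ship with the YOLOv5 repository, so substitute your own.

```python
# Hypothetical detect.py invocation, assembled as an argument list.
# "yolov5s.pt" (the small pretrained checkpoint) and the sample image
# path are assumptions -- swap in your own weights and source.
detect_cmd = [
    "python", "detect.py",
    "--weights", "yolov5s.pt",             # pretrained COCO checkpoint
    "--source", "data/images/zidane.jpg",  # an image, folder, or video
]
print(" ".join(detect_cmd))
```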
Train
YOLOv5 comes with wandb already integrated,
so all you need to do is configure the logging
with command line arguments.
- --project sets the W&B project to which we're logging (akin to a GitHub repo).
- --upload_dataset tells wandb to upload the dataset as a dataset-visualization Table. At regular intervals set by --bbox_interval, the model's outputs on the validation set will also be logged to W&B.
- --save-period sets the number of epochs to wait in between logging the model checkpoints. If not set, only the final trained model is logged.
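Putting those flags together, a training invocation might look like the sketch below. The project name and the interval values are placeholders for illustration, not recommendations.

```python
# Hypothetical train.py invocation using the W&B flags described above.
# The project name and interval values are placeholders.
train_cmd = [
    "python", "train.py",
    "--data", "data.yaml",          # dataset config file
    "--weights", "yolov5s.pt",      # start from pretrained COCO weights
    "--project", "wildfire-smoke",  # W&B project to log to (placeholder)
    "--upload_dataset",             # upload the dataset as a W&B Table
    "--bbox_interval", "1",         # log validation predictions every epoch
    "--save-period", "1",           # log a model checkpoint every epoch
]
print(" ".join(train_cmd))
```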
Even without these arguments, basic model metrics and some model outputs will still be saved to W&B.
Note: to use this same training and logging setup on a different dataset, just create a data.yaml for that dataset and provide it to the --data argument.
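For illustration, here is a minimal data.yaml written from Python. The directory paths and the single "smoke" class are assumptions for a wildfire-smoke dataset; adjust them to match your own data.

```python
# Write a minimal, hypothetical data.yaml for a one-class dataset.
# Paths and the class name are placeholders -- edit for your dataset.
data_yaml = """\
train: ../datasets/wildfire/images/train
val: ../datasets/wildfire/images/val

nc: 1                # number of classes
names: ['smoke']     # class names, in index order
"""
with open("data.yaml", "w") as f:
    f.write(data_yaml)
```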
Here's where you can find the uploaded evaluation results in the W&B UI:
Resume Crashed Runs
In addition to making it easier to debug our models, the W&B integration can help rescue crashed or interrupted runs.
Two steps above helped set us up for this:
- By setting a --save-period, we regularly logged the model checkpoints to W&B, which means we can recreate our model and then resume the run on any device with the dataset available.
- By using --upload_dataset, we logged the data to W&B, which means we can recreate the dataset as well, and so resume runs on any device, whether the dataset is present on disk or not.
To resume a crashed or interrupted run:
- Go to that run's overview section on the W&B dashboard
- Copy the run path
- Pass the run path as the --resume argument, plus the prefix wandb-artifact://. This prefix tells YOLO that the files are located on W&B, rather than locally.
crashed_run_path = "entity/project/run-id" # your path here
!python train.py --resume wandb-artifact://{crashed_run_path}
End Notes
Distributed Data-Parallel Training
All YOLO+W&B features are DDP-aware and compatible. Train on as many GPUs as you can muster, and we'll keep logging!
Logging Large Datasets
For very large datasets, the initial dataset upload triggered by --upload_dataset might be prohibitively expensive. In that case, check out the log_dataset.py script included in YOLOv5.
Stripped Models
At the end of training, a "stripped" version of the model is saved to W&B. This version of the model file is much smaller, but it is missing the accumulated training state (such as optimizer data) required for resuming training. It's intended for use in downstream inference.