YOLO

YOLO (You Only Look Once) is a popular method for detecting objects in an image. Object detection means that the detector reports the coordinates of each object it finds in a photo, in addition to a label for each object.
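To make "coordinates plus label" concrete, here is a minimal sketch of what a single detection might look like once it is converted to pixels. It assumes the box arrives as a normalized center point plus width and height (values between 0 and 1), which is how the YOLOv3 Darknet model reports boxes through OpenCV's neural network module. The `Detection` class and `to_pixel_box` helper are hypothetical names made up for illustration, not part of YOLO or OpenCV.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One detected object: what it is, how sure the detector is, and where it is."""
    label: str
    confidence: float
    x: int       # left edge of the bounding box, in pixels
    y: int       # top edge of the bounding box, in pixels
    width: int   # box width, in pixels
    height: int  # box height, in pixels


def to_pixel_box(label, confidence, cx, cy, w, h, img_w, img_h):
    """Convert a normalized (center x, center y, width, height) box to pixels.

    Assumption: cx, cy, w and h are fractions of the image size (0.0 to 1.0).
    """
    box_w = w * img_w
    box_h = h * img_h
    left = cx * img_w - box_w / 2
    top = cy * img_h - box_h / 2
    return Detection(label, confidence,
                     int(round(left)), int(round(top)),
                     int(round(box_w)), int(round(box_h)))


# A dog centered in a 640x480 photo, a quarter as wide and half as tall as the image.
dog = to_pixel_box("dog", 0.92, 0.5, 0.5, 0.25, 0.5, 640, 480)
print(dog)
```

The top-left corner plus width and height is the form that drawing functions typically want when they render a bounding box on the photo.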

The photos below show how YOLO identified my dogs’ positions, marked with bounding boxes:

Maddie and Olivia

Aimée (the big puppy ;-)

I created a short video clip to demo how YOLO processes traffic on a city street. I took a video of Taylor St in downtown San Jose and ran it through YOLO to mark the objects’ locations.

video

Here are the steps to follow if you want to do this yourself:

  1. Prepare a video
    1. Capture a video or find a video you want to use for object detection.
    2. Extract frames from the video (e.g. using ffmpeg)
  2. Go to the YOLO developer’s main website and follow the instructions there:
    1. Clone the source code with git
    2. Download the pre-trained weights file
    3. Build using make
    4. Run an example program
  3. Go to pyimagesearch.com and download the example Python code to use YOLO with OpenCV’s neural network module. The site has detailed instructions for the example.
    1. Copy three files from your local YOLO installation to a directory under the example code
      1. coco.names
      2. yolov3.weights
      3. yolov3.cfg
    2. Make sure that the example code works.
    3. Tweak the code to read video frame files and output frame files.
    4. Run the script
  4. Combine the frames into a video (e.g. using ffmpeg again)
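The steps above can be sketched as one Python script. This is a rough outline under several assumptions, not the pyimagesearch example itself: the file names (`street.mp4`, `street_yolo.mp4`), the `frames`, `annotated` and `yolo-coco` directories, the 30 fps frame rate, the 416x416 input size, and the 0.5 and 0.3 thresholds are all placeholders I picked for illustration. It expects ffmpeg on your PATH, OpenCV and NumPy installed, and the three YOLO files copied into the model directory.

```python
import subprocess
from pathlib import Path


def extract_cmd(video, frames_dir):
    """Build the ffmpeg command that writes each frame as a numbered PNG."""
    return ["ffmpeg", "-i", str(video), str(Path(frames_dir) / "frame_%05d.png")]


def combine_cmd(frames_dir, video, fps=30):
    """Build the ffmpeg command that encodes numbered PNG frames into a video."""
    return ["ffmpeg", "-framerate", str(fps),
            "-i", str(Path(frames_dir) / "frame_%05d.png"),
            "-c:v", "libx264", "-pix_fmt", "yuv420p", str(video)]


def annotate_frames(in_dir, out_dir, model_dir, conf_min=0.5, nms_thresh=0.3):
    """Run YOLOv3 on every PNG in in_dir and write annotated copies to out_dir."""
    import cv2  # imported here so the command helpers above work without OpenCV
    import numpy as np

    model_dir, out_dir = Path(model_dir), Path(out_dir)
    labels = (model_dir / "coco.names").read_text().strip().split("\n")
    net = cv2.dnn.readNetFromDarknet(str(model_dir / "yolov3.cfg"),
                                     str(model_dir / "yolov3.weights"))
    out_names = net.getUnconnectedOutLayersNames()
    out_dir.mkdir(parents=True, exist_ok=True)

    for frame_path in sorted(Path(in_dir).glob("*.png")):
        image = cv2.imread(str(frame_path))
        img_h, img_w = image.shape[:2]
        blob = cv2.dnn.blobFromImage(image, 1 / 255.0, (416, 416),
                                     swapRB=True, crop=False)
        net.setInput(blob)

        boxes, scores, class_ids = [], [], []
        for output in net.forward(out_names):
            for det in output:
                # Each row: center x, center y, width, height, objectness, class scores
                class_scores = det[5:]
                class_id = int(np.argmax(class_scores))
                score = float(class_scores[class_id])
                if score < conf_min:
                    continue
                cx, cy, bw, bh = det[:4] * np.array([img_w, img_h, img_w, img_h])
                boxes.append([int(cx - bw / 2), int(cy - bh / 2), int(bw), int(bh)])
                scores.append(score)
                class_ids.append(class_id)

        # Non-maximum suppression drops overlapping boxes for the same object.
        keep = cv2.dnn.NMSBoxes(boxes, scores, conf_min, nms_thresh) if boxes else []
        for i in np.array(keep).flatten().astype(int):
            x, y, bw, bh = boxes[i]
            text = f"{labels[class_ids[i]]}: {scores[i]:.2f}"
            cv2.rectangle(image, (x, y), (x + bw, y + bh), (0, 255, 0), 2)
            cv2.putText(image, text, (x, y - 5),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)
        cv2.imwrite(str(out_dir / frame_path.name), image)


def process_video(src, dst, model_dir, fps=30):
    """Extract frames, annotate them, and stitch them back into a video."""
    Path("frames").mkdir(exist_ok=True)
    subprocess.run(extract_cmd(src, "frames"), check=True)
    annotate_frames("frames", "annotated", model_dir)
    subprocess.run(combine_cmd("annotated", dst, fps), check=True)
```

To try it, call something like `process_video("street.mp4", "street_yolo.mp4", "yolo-coco")`. Set `fps` to match your source video, otherwise the result will play too fast or too slow.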