You only look once (YOLO) is a state-of-the-art, real-time object detection system. It is a fully convolutional network. On a Pascal Titan X, it processes images at 30 FPS and has a mAP of 57.9% on COCO. It has 75 convolutional layers with skip connections and upsampling layers and no pooling.