Abstract: Although detectors based on deep learning can achieve high detection accuracy, most of them are too slow to meet real-time requirements. At present, popular real-time detectors such as the single shot multibox detector (SSD) and you only look once (YOLO) do not achieve high accuracy when detecting small objects. Therefore, we propose a detector based on visual-feature region proposal that balances detection accuracy and speed. The detector is divided into two parts: region proposal and network classification. In the region proposal stage, regions of interest (ROIs), also called candidate regions, are extracted according to the feature information of the objects. In the network classification stage, a convolutional neural network (CNN) processes each ROI and computes its class confidence, and the ROIs whose confidence is greater than a threshold are kept as the final detections. Experimental results show that the detection accuracy of the proposed detector is significantly higher than that of Faster R-CNN, SSD, and YOLO, and that its speed is close to that of SSD and YOLO.
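
The two-stage flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `detect`, `classify`, and `Box`, the `(x, y, width, height)` box format, and the default threshold of 0.5 are all assumptions made for illustration. The region proposal stage is assumed to have already produced the list of ROIs, and the CNN is represented by any callable that returns per-class confidences.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical box format: (x, y, width, height) in pixels.
Box = Tuple[int, int, int, int]


def detect(
    rois: Sequence[Box],
    classify: Callable[[Box], Sequence[float]],
    threshold: float = 0.5,  # assumed value; the abstract does not give one
) -> List[Tuple[Box, int, float]]:
    """Keep the ROIs whose best class confidence exceeds the threshold.

    `rois` stands in for the output of the region proposal stage and
    `classify` stands in for the CNN; both are placeholders.
    """
    detections = []
    for box in rois:
        scores = classify(box)
        # Index of the class with the highest confidence for this ROI.
        best = max(range(len(scores)), key=lambda i: scores[i])
        # Strictly greater than the threshold, as stated in the abstract.
        if scores[best] > threshold:
            detections.append((box, best, scores[best]))
    return detections


def stub_classifier(box: Box) -> Sequence[float]:
    """Toy stand-in for a CNN: confident only on 10-pixel-wide boxes."""
    return [0.1, 0.9] if box[2] == 10 else [0.3, 0.4]


if __name__ == "__main__":
    candidates = [(0, 0, 10, 10), (5, 5, 20, 20)]
    print(detect(candidates, stub_classifier))
```

Because the classifier only runs on the proposed regions, the cost of the second stage in this sketch grows with the number of ROIs rather than with the image size.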