
Object detection using YOLO: challenges, architectural successors, datasets and applications

MULTIMEDIA TOOLS AND APPLICATIONS (2023)

Journal

MULTIMEDIA TOOLS AND APPLICATIONS

Volume 82, Issue 6, Pages 9243-9275

Publisher

SPRINGER

DOI: 10.1007/s11042-022-13644-y

Keywords

Object detection; Convolutional neural networks; YOLO; Deep learning; Computer vision

Categories

Computer Science, Information Systems; Computer Science, Software Engineering; Computer Science, Theory & Methods; Engineering, Electrical & Electronic

Summary

Object detection is a significant problem in computer vision, and deep learning has greatly improved its performance. Object detectors can be categorized into two-stage and single-stage detectors, with two-stage detectors typically achieving higher accuracy and single-stage detectors having faster inference time. YOLO, a widely adopted single-stage object detection algorithm, has the advantage of faster inference speed. This paper provides a comprehensive review of single-stage object detectors, particularly YOLO, and compares them with two-stage detectors. It also summarizes different versions of YOLO and their applications, as well as future research directions.

Abstract

Object detection is one of the predominant and challenging problems in computer vision. Over the past decade, with the rapid evolution of deep learning, researchers have experimented extensively and contributed to improving the performance of object detection and related tasks such as object classification, localization, and segmentation using underlying deep models. Broadly, object detectors fall into two categories: two-stage and single-stage detectors. Two-stage detectors rely on a selective region proposal strategy and a complex architecture, whereas single-stage detectors consider all spatial regions for possible detections in one shot, using a relatively simpler architecture. The performance of an object detector is evaluated by its detection accuracy and inference time. Generally, two-stage detectors achieve higher detection accuracy than single-stage detectors, while single-stage detectors have better inference time. Moreover, with the advent of YOLO (You Only Look Once) and its architectural successors, detection accuracy has improved significantly and is sometimes better than that of two-stage detectors. YOLOs are adopted in various applications mainly because of their faster inference rather than their detection accuracy. As an example, detection accuracies are 63.4 and 70 for YOLO and Fast-RCNN respectively; however, inference is around 300 times faster for YOLO. In this paper, we present a comprehensive review of single-stage object detectors, especially YOLOs, covering their regression formulation, architectural advancements, and performance statistics. Moreover, we summarize comparisons between two-stage and single-stage object detectors and among different versions of YOLO, applications based on two-stage detectors and on different versions of YOLO, and future research directions.
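The abstract mentions the regression formulation of single-stage detectors without spelling it out. The following is a minimal sketch of that idea, assuming the grid-based formulation of the original YOLO detector: the image is divided into an S x S grid, each cell predicts B boxes as (x, y, w, h, confidence) plus C class probabilities, with x and y given as offsets inside the cell and w and h as fractions of the image size. The names `output_size`, `decode_cell`, and `Box` are hypothetical helpers for illustration only and are not taken from the reviewed paper.

```python
from dataclasses import dataclass


@dataclass
class Box:
    cx: float          # box center x in pixels
    cy: float          # box center y in pixels
    w: float           # box width in pixels
    h: float           # box height in pixels
    confidence: float  # predicted objectness score


def output_size(s: int, b: int, c: int) -> int:
    """Values regressed per image in one forward pass: S * S * (B * 5 + C)."""
    return s * s * (b * 5 + c)


def decode_cell(row: int, col: int, pred: tuple, s: int,
                img_w: float, img_h: float) -> Box:
    """Map one cell's (x, y, w, h, conf) prediction to pixel coordinates.

    Assumes x, y are offsets within the cell in [0, 1] and
    w, h are fractions of the full image size.
    """
    x, y, w, h, conf = pred
    cx = (col + x) / s * img_w
    cy = (row + y) / s * img_h
    return Box(cx, cy, w * img_w, h * img_h, conf)


if __name__ == "__main__":
    # Illustrative settings: 7 x 7 grid, 2 boxes per cell, 20 classes.
    print(output_size(7, 2, 20))
    print(decode_cell(3, 3, (0.5, 0.5, 0.25, 0.5, 0.9), 7, 448, 448))
```

Because every cell's boxes come out of a single forward pass, there is no separate region proposal stage, which is the structural reason the abstract gives for the faster inference of single-stage detectors.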
