FAS-YOLO: An Improved Algorithm for Small Target Detection of Unmanned Aerial Vehicles

Xinlong Shui; Wendi Chu

doi:10.6911/WSRJ.202510_11(10).0001

Keywords:

Small target detection, YOLOv8, Feature Pyramid, FPSharedConv, Attention mechanism, AFGCAttention, WIoUv3.

Abstract

Small target detection in imagery captured by unmanned aerial vehicles (UAVs) suffers from low accuracy, a high missed-detection rate, and a high false-detection rate. To address these issues, this paper proposes FAS-YOLO (Feature Attention Small Object Detection-YOLO), an improved small target detection model based on YOLOv8. First, the model replaces the traditional pooling operation with the FPSharedConv module, which extracts fine-grained features and retains the detailed information of small targets. Second, a feature pyramid structure named smallObjectEnhancePyramid is proposed as an improvement on PAFPN: rather than adding a P2 detection layer, it combines SPDConv applied to the P2 feature layer with CSP-OmniKernel to strengthen the feature representation of small targets. Third, the AFGCAttention mechanism is introduced after FPSharedConv to further focus the model on key small targets. Finally, the loss function is improved based on WIoUv3, using a more reasonable aspect ratio measure to improve detection and localization accuracy. Experimental results on the VisDrone2019 dataset show that the precision, recall, mAP50, and mAP50-95 of the improved model increase by 9.6%, 8.6%, 10.9%, and 7%, respectively. FAS-YOLO significantly improves small target detection performance and provides a new solution for efficient target detection in UAV scenarios.
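The abstract does not describe the internals of SPDConv as used in FAS-YOLO. As a rough illustration only, the sketch below shows the space-to-depth rearrangement that SPD-style convolutions are generally built on: spatial resolution is halved by moving each 2 x 2 neighbourhood into the channel dimension, so no pixel is discarded the way it would be with pooling or strided convolution. The function name `space_to_depth`, the `scale` parameter, and the channel ordering are assumptions made for this example, not taken from the paper.

```python
def space_to_depth(feature_map, scale=2):
    """Rearrange an H x W x C nested list into (H/scale) x (W/scale) x (C*scale^2).

    Hypothetical helper for illustration; not the paper's implementation.
    """
    height = len(feature_map)
    width = len(feature_map[0])
    if height % scale or width % scale:
        raise ValueError("height and width must be divisible by scale")

    output = []
    for row in range(0, height, scale):
        out_row = []
        for col in range(0, width, scale):
            # Stack the channels of every pixel in the scale x scale block.
            channels = []
            for dy in range(scale):
                for dx in range(scale):
                    channels.extend(feature_map[row + dy][col + dx])
            out_row.append(channels)
        output.append(out_row)
    return output


# A 4 x 4 single-channel map holding the values 0..15.
fmap = [[[4 * r + c] for c in range(4)] for r in range(4)]
packed = space_to_depth(fmap)

print(len(packed), len(packed[0]), len(packed[0][0]))  # 2 2 4
print(packed[0][0])  # [0, 1, 4, 5]
```

In an SPD-style layer this rearrangement would typically be followed by a non-strided convolution that mixes the stacked channels; that step is omitted here.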
