
林草资源研究 ›› 2024 ›› Issue (1): 34-40. doi: 10.13466/j.cnki.lczyyj.2024.01.005

• Scientific Research •

基于深度学习的野生动物视频目标检测

汪帅1, 卢楠1, 郑红1, 李晖2, 彭检贵1, 张同1, 魏彦华1

  1. 国家林业和草原局中南调查规划院,长沙 410014
  2. 国家林业和草原局林草调查规划院,北京 100714
  • 收稿日期:2023-11-17 修回日期:2024-01-30 出版日期:2024-02-28 发布日期:2024-03-22
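  • Corresponding author: ZHENG Hong, professor-level senior engineer, mainly engaged in forest resource management and forestry and grassland informatization. Email: 756874452@qq.com
  • About the author: WANG Shuai, engineer, mainly engaged in forestry and grassland informatization. Email: 928344085@qq.com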

Wildlife Video Object Detection Based on Deep Learning

WANG Shuai1, LU Nan1, ZHENG Hong1, LI Hui2, PENG Jiangui1, ZHANG Tong1, WEI Yanhua1

  1. Central South Inventory and Planning Institute of National Forestry and Grassland Administration, Changsha 410014, China
  2. Academy of Forestry Inventory and Planning, National Forestry and Grassland Administration, Beijing 100714, China
  • Received:2023-11-17 Revised:2024-01-30 Online:2024-02-28 Published:2024-03-22

摘要:

以红外相机为代表的生态感知终端为野生动物监测研究提供海量的图像和视频数据。为改善人工识别海量数据时效性低、处理能力有限等问题,解决目标检测模型在受到复杂背景、多目标、昼夜明暗等多重因素影响的实际场景中应用的不确定性,以金钱豹、成年雄性岩羊、非成年雄性岩羊为例,建立野生动物目标检测数据集,对比分析Faster R-CNN、SSD、YOLOv5和YOLOv8等4种经典目标检测模型在实际场景中检测精度、检测速度和检测效果。结果表明:1)YOLOv5与YOLOv8的检测效果和检测速度整体优于Faster R-CNN与SSD,YOLOv8在多重因素干扰下检测精度更高、鲁棒性更强,更适合追求检测效果的场景;2)4种模型均能满足生态感知终端实时视频检测需要,但YOLOv5模型更轻量、检测速度更快,更适合算力有限时追求检测速度的场景。YOLOv5和YOLOv8性能优越,适合在实际场景中开展野生动物视频目标检测。

关键词: 深度学习, 目标检测, YOLOv8, 野生动物视频

Abstract:

Ecological sensing terminals, represented by infrared cameras, provide massive amounts of image and video data for wildlife monitoring research. To address the low timeliness and limited processing capacity of manual recognition on such data, and to reduce the uncertainty of applying object detection models in real-world scenes affected by complex backgrounds, multiple targets, and day-night lighting changes, a wildlife object detection dataset was built using leopard, adult male bharal, and non-adult male bharal as examples. Four classic object detection models, Faster R-CNN, SSD, YOLOv5, and YOLOv8, were compared in terms of detection accuracy, detection speed, and detection quality in real-world scenes. The results show that: 1) YOLOv5 and YOLOv8 are overall better than Faster R-CNN and SSD in detection quality and speed, and YOLOv8 achieves higher detection accuracy and stronger robustness under multiple interfering factors, making it better suited to scenarios that prioritize detection quality; 2) all four models can meet the real-time video detection needs of ecological sensing terminals, but YOLOv5 is more lightweight and faster, making it better suited to scenarios that prioritize detection speed under limited computing power. With their superior performance, YOLOv5 and YOLOv8 are suitable for wildlife video object detection in real-world scenes.

Key words: deep learning, object detection, YOLOv8, wildlife video
