Journal of Jilin University (Information Science Edition), 2025, Vol. 43, Issue 6: 1430-1440.

融合深度学习技术的多模态安全管理应用

陈 冲a , 朱晓旭b , 万林葳c , 付凯宇d , 黄自彬, 王文雅d , 车浩源b   

  1. 吉林大学 a. 人工智能学院; b. 公共计算机教学与研究中心; c. 软件学院; d. 电子科学与工程学院, 长春 130012
  • 收稿日期:2025-03-02 出版日期:2025-12-08 发布日期:2025-12-08
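  • Corresponding author: ZHU Xiaoxu (born 1984), female, from Jilin City, Jilin Province; engineer at Jilin University; research focus: management of demonstration bases for undergraduate innovation and entrepreneurship practice. Tel: 86-15948734898; E-mail: xiaoxuz@jlu.edu.cn
  • About the author: CHEN Chong (born 2003), male, from Shangqiu, Henan Province; undergraduate at Jilin University; research focus: artificial intelligence. Tel: 86-13323601678; E-mail: 2622167196@qq.com
  • Funding:
    Supported by the Jilin University Undergraduate Innovation and Entrepreneurship Training Program (S202410183390)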

Application of Multimodal Security Management Integrating Deep Learning Technology

CHEN Chong (a), ZHU Xiaoxu (b), WAN Linwei (c), FU Kaiyu (d), HUANG Zibin (d), WANG Wenya (d), CHE Haoyuan (b)

  1. a. School of Artificial Intelligence; b. Public Computer Education and Research Center; c. College of Software; d. College of Electronic Science & Engineering, Jilin University, Changchun 130012, China
  • Received: 2025-03-02 Online: 2025-12-08 Published: 2025-12-08

摘要:

针对传统安全管理主要依赖人工监控与事后处理、效率低下且难以及时发现异常行为的问题, 设计了一套多模态智能安全管理系统。该系统主体由基于华为 Atlas 200I DK A2 开发套件运行的视觉识别算法、基于单片机的语音报警装置及配套软件构成。通过视觉处理算法与音频关键词检测实现行为智能识别, 当发生危险情况时, 可经软件自动将信息及时反馈给管理人员, 有效保障现场人身安全。针对视觉算法部分, 通过优化 YOLOv5(You Only Look Once version 5)网络结构, 引入 CA(Coordinate Attention)注意力机制以增强对小目标与复杂场景的检测能力, 并修改损失函数, 添加对 EIoU(Efficient IoU)损失函数的支持, 使模型更好地适应场景变化, 从而实现对打架与摔倒行为的高效识别。实验结果表明, 本方法在多种场景下的平均精度均值(mAP@0.5)均有明显提升, 检测速度满足实时性要求, 可为公共场所的安全管理提供智能化解决方案。

关键词:

Abstract:

To address the inefficiency and delayed response of traditional security management, which relies mainly on manual monitoring and after-the-fact handling and struggles to detect abnormal behavior in time, a multimodal intelligent security management system is designed. The system mainly comprises a visual recognition algorithm running on the Huawei Atlas 200I DK A2 development kit, a microcontroller-based voice alarm device, and supporting software. Intelligent behavior recognition is achieved through visual processing algorithms and audio keyword detection. When a dangerous situation occurs, the backend software automatically reports it to managers in time, helping to protect the personal safety of people on site. For the visual algorithm, the YOLOv5 (You Only Look Once version 5) network structure is optimized by incorporating a CA (Coordinate Attention) mechanism to enhance detection of small targets and complex scenes, and the loss function is modified to add support for the EIoU (Efficient IoU) loss, enabling the model to adapt better to scene changes and thereby recognize fights and falls efficiently. Experimental results show that the mean average precision (mAP@0.5) of the proposed method improves noticeably across multiple scenarios and that the detection speed meets real-time requirements, providing an intelligent solution for safety management in public places.
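For readers unfamiliar with the EIoU loss named in the abstract, the following is a minimal sketch of its commonly cited form: the IoU term plus three penalties (center distance, width difference, height difference), each normalized by the smallest box enclosing both inputs. The function name `eiou_loss`, the `(x1, y1, x2, y2)` box format, and the `eps` guard are assumptions made for illustration; the abstract does not show how the authors integrate the loss into YOLOv5, and this sketch does not attempt to reproduce that.

```python
def eiou_loss(pred, target, eps=1e-9):
    """EIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2).

    Hypothetical helper for illustration only; not the paper's code.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Widths and heights of the predicted and target boxes.
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Intersection over union.
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Squared distance between box centers, normalized by the squared
    # diagonal of the enclosing box.
    dx = (px1 + px2) / 2.0 - (tx1 + tx2) / 2.0
    dy = (py1 + py2) / 2.0 - (ty1 + ty2) / 2.0
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Width and height differences, normalized by the enclosing box sides.
    width_term = (pw - tw) ** 2 / (cw * cw + eps)
    height_term = (ph - th) ** 2 / (ch * ch + eps)

    return 1.0 - iou + center_term + width_term + height_term


if __name__ == "__main__":
    print(eiou_loss((0, 0, 2, 2), (0, 0, 2, 2)))  # identical boxes
    print(eiou_loss((0, 0, 2, 2), (1, 1, 3, 3)))  # partial overlap
    print(eiou_loss((0, 0, 1, 1), (2, 0, 3, 1)))  # no overlap
```

Unlike a plain IoU loss, the center, width, and height terms stay informative when the boxes do not overlap, so the loss still distinguishes a near miss from a distant one.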

Key words:

CLC Number:

  • TP391.41