Journal of Jilin University (Information Science Edition) ›› 2022, Vol. 40 ›› Issue (6): 1009-1016.


多层次融合与注意力机制的人群计数算法

李 萌, 孙艳歌, 郭华平, 吴 飞   

Multi-Level Fusion and Attention Mechanism Based Crowd Counting Algorithm

LI Meng, SUN Yange, GUO Huaping, WU Fei

  1. School of Computer and Information Technology, Xinyang Normal University, Xinyang, Henan 464000, China
  • Received: 2022-04-12 Online: 2022-12-09 Published: 2022-12-10
  • About the authors: LI Meng (b. 1997), male, from Xinyang, Henan, master's student at Xinyang Normal University, working mainly on deep learning, Tel: 86-17337658255, E-mail: limeng19971016@163.com; SUN Yange (b. 1982), female, from Pingdingshan, Henan, associate professor and master's supervisor at Xinyang Normal University, working mainly on deep learning, machine learning, and data mining, Tel: 86-18837691652, E-mail: ygsun1982@126.com.
  • Funding:
    National Natural Science Foundation of China (62062004; 61702550; 31900710); Natural Science Foundation of Henan Province (222300420274)

Abstract: Differences in crowd-image backgrounds and the crowd-scale variation caused by perspective effects seriously degrade the accuracy of crowd counting. To address these problems, a crowd counting algorithm based on multi-level fusion and an attention mechanism is proposed. It consists of two sub-networks: a scale attention extraction network and a multi-level fusion network. The scale attention extraction network adopts an encoder-decoder structure and is responsible for extracting scale attention, in order to cope with crowd-scale variation and crowd occlusion in complex crowd scenes. The multi-level fusion network adds a feature fusion operation before each convolution block, fusing the attention maps of different scales with the input of the corresponding convolution level to remove redundant image information and thereby generate a high-quality crowd density map. Compared with other strong crowd counting algorithms, the proposed algorithm improves (that is, lowers) the MAE (Mean Absolute Error) and MSE (Mean Squared Error) on Part_B of the ShanghaiTech dataset by 17% and 25%, respectively, improves the MAE on Part_A by 1.7%, and improves the MAE on the UCF_CC_50 dataset by 7%. The experimental results show that the proposed algorithm is accurate and robust in complex crowd scenes.

Key words: crowd counting, encoder-decoder, scale attention, feature fusion, crowd density map
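
The fusion step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which operator fuses an attention map with a block's input, so element-wise multiplication is assumed here; the function names `fuse` and `crowd_count` and all numbers are hypothetical. The only standard fact relied on is that, in density-map-based counting, the estimated count is the sum of the density map. For brevity the fused map is treated directly as a density map, whereas in a real network convolution blocks would sit between the fusion and the output.

```python
def fuse(features, attention):
    """Element-wise product of a 2-D feature map and a same-size attention map.

    Assumption: multiplication is used as the fusion operator; the source
    abstract does not specify the operator.
    """
    same_shape = len(features) == len(attention) and all(
        len(f_row) == len(a_row) for f_row, a_row in zip(features, attention)
    )
    if not same_shape:
        raise ValueError("feature map and attention map must have the same shape")
    return [
        [f * a for f, a in zip(f_row, a_row)]
        for f_row, a_row in zip(features, attention)
    ]


def crowd_count(density_map):
    """Estimated head count: the sum of all values in the density map."""
    return sum(sum(row) for row in density_map)


# Made-up 2x2 maps (illustrative only, not taken from the paper).
features = [[0.5, 0.25], [1.0, 0.75]]
attention = [[1.0, 0.0], [0.5, 1.0]]  # 0.0 suppresses a background cell

fused = fuse(features, attention)
count = crowd_count(fused)
```

An attention value near zero removes the contribution of a background region, which is one way to read the abstract's "remove redundant image information".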

CLC number:

  • TP391
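
The MAE and MSE figures quoted in the abstract are errors over per-image crowd counts, so a lower value is better and an "improvement" is a relative reduction. The sketch below shows how such metrics and a relative improvement are usually computed. The function names and all counts are hypothetical and not taken from the paper. Many crowd counting papers report the square root of the mean squared error under the name MSE; which convention this paper follows is not stated in the abstract, so both values are returned.

```python
import math


def count_errors(predicted, actual):
    """Return (MAE, MSE, RMSE) over per-image crowd counts."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    diffs = [p - a for p, a in zip(predicted, actual)]
    mae = sum(abs(d) for d in diffs) / len(diffs)
    mse = sum(d * d for d in diffs) / len(diffs)
    return mae, mse, math.sqrt(mse)


def relative_improvement(baseline_error, new_error):
    """Fraction by which new_error is lower than baseline_error."""
    return (baseline_error - new_error) / baseline_error


# Made-up counts for three images (illustrative only).
predicted = [102.0, 48.0, 250.0]
actual = [100.0, 50.0, 240.0]
mae, mse, rmse = count_errors(predicted, actual)

# A hypothetical baseline MAE of 10.0 lowered to 8.3 is a 17% improvement.
gain = relative_improvement(10.0, 8.3)
```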