Journal of Jilin University (Engineering and Technology Edition), 2024, Vol. 54, Issue (11): 3296-3301. doi: 10.13229/j.cnki.jdxbgxb.20230934

• Computer Science and Technology •

Optimization algorithm for video coding based on convolutional neural networks

Yu LU, Qian CHEN, Hai-bing YIN

  1. School of Communication Engineering, Hangzhou Dianzi University, Hangzhou 310018, China
  • Received: 2023-09-03 Online: 2024-11-01 Published: 2025-04-24
  • About the author: Yu LU (born 1977), male, associate professor, Ph.D. Research interest: intelligent video information processing methods. E-mail: luyu20230@163.com
  • Funding:
    Scientific Research Project of the Zhejiang Provincial Department of Education (Y202249588); National Natural Science Foundation of China (61972123)

Abstract:

To further improve the compression efficiency of High Efficiency Video Coding (HEVC) and make it better suited to compressing high-definition video, this paper exploits the strong ability of deep learning to mine video features. It proposes a multi-input, multi-scale residual convolutional neural network together with an iterative network training method, which significantly improves the performance of HEVC in-loop filtering. It also proposes a novel fractional-pixel interpolation filtering method that further improves compression efficiency. Experimental results show that the proposed algorithm reduces BD-rate by 7.47% on average in the random access (RA) coding mode. Compared with two existing coding optimization algorithms, the proposed algorithm effectively improves compression efficiency while also improving video quality.

Key words: convolutional neural network, in-loop filtering, High Efficiency Video Coding, fractional-pixel interpolation filtering
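The abstract names a multi-input, multi-scale residual network, but this page gives no layer details. As a rough illustration of the general multi-scale residual idea only (parallel 3x3 and 5x5 convolution branches that are fused and added back to the input through a skip connection, in the spirit of reference 9), a single-channel toy sketch in plain Python might look as follows. The function names, the fixed fusion weights, and the single-channel simplification are all assumptions made for illustration; this is not the authors' architecture.

```python
def conv2d_same(img, kernel):
    """2-D correlation with zero padding, so the output keeps the input size.

    img is a list of rows; kernel is an odd-sized square list of rows.
    """
    h, w = len(img), len(img[0])
    k = len(kernel)
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in range(k):
                for dx in range(k):
                    yy, xx = y + dy - r, x + dx - r
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[dy][dx] * img[yy][xx]
            out[y][x] = acc
    return out


def relu(img):
    """Element-wise rectified linear unit."""
    return [[max(0.0, v) for v in row] for row in img]


def multi_scale_residual_block(x, k3, k5, fuse=(0.5, 0.5)):
    """Toy multi-scale residual block (hypothetical, single channel).

    Two branches see the input at different receptive fields (3x3 and 5x5).
    Their activations are fused by a weighted sum, which stands in for a
    1x1 fusion convolution, and the result is added to the input (skip path).
    """
    b3 = relu(conv2d_same(x, k3))
    b5 = relu(conv2d_same(x, k5))
    h, w = len(x), len(x[0])
    return [
        [x[i][j] + fuse[0] * b3[i][j] + fuse[1] * b5[i][j] for j in range(w)]
        for i in range(h)
    ]


def zero_kernel(k):
    """Helper for the demo: a k x k kernel of zeros."""
    return [[0.0] * k for _ in range(k)]
```

With all-zero kernels both branches contribute nothing and the block reduces to the identity mapping, which is the property that lets a residual filter learn only the correction to a reconstructed frame rather than the frame itself.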

CLC Number:

  • TN919.81

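Fig. 1

Structure of the proposed convolutional neural network

Fig. 2

Structure of the multi-scale residual block

Fig. 3

Coding structures in all intra (AI) and random access (RA) modes

Fig. 5

HM encoder used in this paper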

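Table 1

Comparison of coding performance (BD-rate, %)

| Sequence | Method of Ref. [4] | Method of Ref. [5] | Proposed method |
|---|---|---|---|
| Traffic | -5.97 | -6.24 | -8.21 |
| PeopleOnStreet | -9.17 | -6.23 | -7.84 |
| Kimono1 | -3.68 | -5.73 | -7.19 |
| Cactus | -5.29 | -7.92 | -8.92 |
| BQTerrace | -5.08 | -13.80 | -11.43 |
| ParkScene | -3.85 | -1.26 | -4.36 |
| BasketballDrive | -5.99 | -7.34 | -6.91 |
| RaceHorses | - | -4.53 | -7.89 |
| BQMall | -7.37 | -4.81 | -8.76 |
| PartyScene | -9.28 | -1.68 | -3.23 |
| BasketballDrill | -10.65 | -3.55 | -4.99 |
| RaceHorses | -3.52 | -3.85 | -6.27 |
| BQSquare | -10.99 | -1.68 | -3.61 |
| BlowingBubbles | -7.54 | -0.89 | -3.09 |
| BasketballPass | -6.34 | -3.10 | -6.15 |
| FourPeople | -10.53 | -10.69 | -10.43 |
| Johnny | -11.22 | -14.55 | -14.52 |
| KristenAndSara | -9.00 | -11.91 | -10.73 |
| Average | -6.62 | -6.09 | -7.47 |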
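The figures in Table 1 are percentages, and the average for the proposed method matches the BD-rate reduction quoted in the abstract. For readers unfamiliar with the metric, the following is a minimal sketch of a Bjøntegaard-style BD-rate computation (reference 13), assuming exactly four rate-PSNR points per curve with distinct PSNR values. With four points, the cubic fit of log-bitrate against PSNR passes through every point, so the sketch uses Lagrange interpolation, and Simpson's rule on a single interval integrates a cubic exactly. The function names and the sample numbers are made up for illustration and are not data from the paper.

```python
import math


def _lagrange(xs, ys, x):
    """Evaluate at x the unique polynomial passing through the points (xs, ys)."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def _simpson(f, lo, hi):
    """Simpson's rule on one interval; exact for polynomials up to degree 3."""
    mid = 0.5 * (lo + hi)
    return (hi - lo) / 6.0 * (f(lo) + 4.0 * f(mid) + f(hi))


def bd_rate(anchor, test):
    """Average bitrate difference (%) of `test` relative to `anchor`.

    Each argument is a list of four (bitrate, psnr) pairs.
    A negative result means the test codec needs less bitrate for the same PSNR.
    """
    if len(anchor) != 4 or len(test) != 4:
        raise ValueError("this sketch expects four points per curve")
    psnr_a = [p for _, p in anchor]
    psnr_t = [p for _, p in test]
    log_a = [math.log(r) for r, _ in anchor]
    log_t = [math.log(r) for r, _ in test]

    # Integrate only over the PSNR range covered by both curves.
    lo = max(min(psnr_a), min(psnr_t))
    hi = min(max(psnr_a), max(psnr_t))
    if hi <= lo:
        raise ValueError("the two curves do not overlap in PSNR")

    int_a = _simpson(lambda p: _lagrange(psnr_a, log_a, p), lo, hi)
    int_t = _simpson(lambda p: _lagrange(psnr_t, log_t, p), lo, hi)
    avg_log_diff = (int_t - int_a) / (hi - lo)
    return (math.exp(avg_log_diff) - 1.0) * 100.0


# Made-up example: the test curve needs 10% less bitrate at every PSNR.
anchor_points = [(1000.0, 32.0), (2000.0, 35.0), (4000.0, 38.0), (8000.0, 40.5)]
test_points = [(0.9 * rate, psnr) for rate, psnr in anchor_points]
example = bd_rate(anchor_points, test_points)  # approximately -10.0
```

Read this way, an entry such as -7.47 means that, averaged over the overlapping quality range, the tested encoder needs about 7.47% less bitrate than the anchor to reach the same PSNR.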

Fig. 6

Subjective video quality comparison between the proposed method and other methods

References:

1 Han Li, Wang Hua-dong. Research on deformation feature reconstruction of dynamic video multi-frame continuous image[J]. Computer Simulation, 2022, 39(12): 245-248. (in Chinese)
2 Sullivan G J, Ohm J R, Han W J, et al. Overview of the high efficiency video coding standard[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2012, 22(12): 1649-1668.
3 Hui Chao, Jiang Lin, Zhu Jun, et al. Dynamic reconfigurable implementation of pixel interpolation algorithm in HEVC[J]. Computer Engineering and Design, 2022, 43(3): 764-770. (in Chinese)
4 Pan Z, Yi X, Zhang Y, et al. Efficient in-loop filtering based on enhanced deep convolutional neural networks for HEVC[J]. IEEE Transactions on Image Processing, 2020, 29: 5352-5366.
5 Sun W, He X, Chen H, et al. A nonlocal HEVC in-loop filter using CNN-based compression noise estimation[J]. Applied Intelligence, 2022, 52(15): 17810-17828.
6 Wang Gang, Chen He-xin, Chen Mian-shu. Adaptive interpolation filtering algorithm based on HEVC[J]. Journal of Jilin University (Science Edition), 2018, 56(2): 320-328. (in Chinese)
7 Lu Y, Huang X, Liu H, et al. Fast SHVC inter-coding based on Bayesian decision with coding depth estimation[J]. Journal of Real-Time Image Processing, 2021, 18(6): 2269-2285.
8 He K, Zhang X, Ren S, et al. Deep residual learning for image recognition[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 770-778.
9 Li J C, Fang F M, Mei K F, et al. Multi-scale residual network for image super-resolution[C]∥Proceedings of the European Conference on Computer Vision, Munich, Germany, 2018: 527-542.
10 Agustsson E, Timofte R. NTIRE 2017 challenge on single image super-resolution: dataset and study[C]∥IEEE Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, USA, 2017: 1122-1131.
11 Ma D, Zhang F, Bull D. BVI-DVC: a training database for deep video compression[J]. IEEE Transactions on Multimedia, 2021, 24: 3847-3858.
12 Kingma D P, Ba J. Adam: a method for stochastic optimization[J]. arXiv Preprint, 2014, arXiv:1412.6980.
13 Bjøntegaard G. Calculation of average PSNR differences between RD-curves[J]. ITU-T VCEG-M33, 2001, 4: 2-4.