Optimization algorithm for video coding based on convolutional neural networks

doi:10.13229/j.cnki.jdxbgxb.20230934

Abstract:

To further improve the compression efficiency of High Efficiency Video Coding (HEVC) and make it better suited to high-definition video compression, this paper exploits the ability of deep learning to extract video features and proposes a multi-input, multi-scale residual convolutional neural network together with an iterative network training method, which significantly improves the performance of HEVC in-loop filtering. A novel pixel-based interpolation filtering method is also proposed to further improve compression efficiency. Experimental results show that the proposed algorithm reduces BD-rate by 7.47% on average in the random access (RA) coding mode. Compared with two existing coding optimization algorithms, the proposed algorithm effectively improves compression efficiency while enhancing video quality.

Key words: convolutional neural network, in-loop filtering, high efficiency video coding, pixel-based interpolation filtering
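
The abstract reports its gain as a BD-rate reduction, the Bjøntegaard metric of reference 13. As a rough illustration of how such a figure is commonly obtained, the sketch below fits a cubic to log10(bitrate) as a function of PSNR for each codec and averages the gap over the shared PSNR range. It assumes exactly four rate-distortion points per curve, so the cubic is an exact interpolation and Simpson's rule integrates it exactly. The function names and the sample points are hypothetical and are not taken from the paper or from any reference implementation.

```python
import math


def lagrange_eval(xs, ys, x):
    """Evaluate at x the polynomial that interpolates the points (xs, ys)."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def integrate_cubic(xs, ys, lo, hi):
    """Integrate the interpolating cubic over [lo, hi].

    Simpson's rule is exact for polynomials up to degree three, so no
    explicit coefficients are needed when there are four points.
    """
    mid = 0.5 * (lo + hi)
    f = lambda x: lagrange_eval(xs, ys, x)
    return (hi - lo) / 6.0 * (f(lo) + 4.0 * f(mid) + f(hi))


def bd_rate(anchor, test):
    """Return the BD-rate in percent; negative values mean bitrate savings.

    anchor, test: four (bitrate, psnr_db) pairs each (assumed input format).
    """
    a_psnr = [p for _, p in anchor]
    a_logr = [math.log10(r) for r, _ in anchor]
    t_psnr = [p for _, p in test]
    t_logr = [math.log10(r) for r, _ in test]
    # Only the PSNR interval covered by both curves is compared.
    lo = max(min(a_psnr), min(t_psnr))
    hi = min(max(a_psnr), max(t_psnr))
    avg_log_diff = (integrate_cubic(t_psnr, t_logr, lo, hi)
                    - integrate_cubic(a_psnr, a_logr, lo, hi)) / (hi - lo)
    return (10.0 ** avg_log_diff - 1.0) * 100.0


# Hypothetical rate-distortion points, invented for illustration only.
anchor_curve = [(1000.0, 32.0), (2000.0, 35.0), (4000.0, 37.5), (8000.0, 39.5)]
# A codec that reaches the same PSNR at 90% of the anchor bitrate.
test_curve = [(0.9 * rate, psnr) for rate, psnr in anchor_curve]

print(f"BD-rate: {bd_rate(anchor_curve, test_curve):.2f}%")
```

With the synthetic curves above, the test codec needs 10% less bitrate at every quality level, so the sketch should report a BD-rate close to -10%.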

CLC Number: TN919.81

Yu LU, Qian CHEN, Hai-bing YIN. Optimization algorithm for video coding based on convolutional neural networks[J]. Journal of Jilin University (Engineering and Technology Edition), 2024, 54(11): 3296-3301.

References

1	Han Li, Wang Hua-dong. Research on deformation feature reconstruction of dynamic video multi-frame continuous image[J]. Computer Simulation, 2022, 39(12): 245-248.
2	Sullivan G J, Ohm J R, Han W J, et al. Overview of the high efficiency video coding standard[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2012, 22(12): 1649-1668.
3	Hui Chao, Jiang Lin, Zhu Jun, et al. Dynamic reconfigurable implementation of pixel interpolation algorithm in HEVC[J]. Computer Engineering and Design, 2022, 43(3): 764-770.
4	Pan Z, Yi X, Zhang Y, et al. Efficient in-loop filtering based on enhanced deep convolutional neural networks for HEVC[J]. IEEE Transactions on Image Processing, 2020, 29: 5352-5366.
5	Sun W, He X, Chen H, et al. A nonlocal HEVC in-loop filter using CNN-based compression noise estimation[J]. Applied Intelligence, 2022, 52(15): 17810-17828.
6	Wang Gang, Chen He-xin, Chen Mian-shu. Adaptive interpolation filtering algorithm based on HEVC[J]. Journal of Jilin University (Science Edition), 2018, 56(2): 320-328.
7	Lu Y, Huang X, Liu H, et al. Fast SHVC inter-coding based on Bayesian decision with coding depth estimation[J]. Journal of Real-Time Image Processing, 2021, 18(6): 2269-2285.
8	He K, Zhang X, Ren S, et al. Deep residual learning for image recognition[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 770-778.
9	Li J C, Fang F M, Mei K F, et al. Multi-scale residual network for image super-resolution[C]∥Proceedings of the European Conference on Computer Vision, Munich, Germany, 2018: 527-542.
10	Agustsson E, Timofte R. NTIRE 2017 challenge on single image super-resolution: dataset and study[C]∥IEEE Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, USA, 2017: 1122-1131.
11	Ma D, Zhang F, Bull D. BVI-DVC: a training database for deep video compression[J]. IEEE Transactions on Multimedia, 2021, 24: 3847-3858.
12	Kingma D P, Ba J. Adam: a method for stochastic optimization[J]. arXiv Preprint, 2014: arXiv:1412.6980.
13	Bjøntegaard G. Calculation of average PSNR differences between RD-curves[J]. ITU-T VCEG-M33, 2001, 4: 2-4.

Sequence	Ref. [4] BD-rate (%)	Ref. [5] BD-rate (%)	Proposed BD-rate (%)
Average	-6.62	-6.09	-7.47
Traffic	-5.97	-6.24	-8.21
PeopleOnStreet	-9.17	-6.23	-7.84
Kimono1	-3.68	-5.73	-7.19
Cactus	-5.29	-7.92	-8.92
BQTerrace	-5.08	-13.80	-11.43
ParkScene	-3.85	-1.26	-4.36
BasketballDrive	-5.99	-7.34	-6.91
RaceHorses	-	-4.53	-7.89
BQMall	-7.37	-4.81	-8.76
PartyScene	-9.28	-1.68	-3.23
BasketballDrill	-10.65	-3.55	-4.99
RaceHorses	-3.52	-3.85	-6.27
BQSquare	-10.99	-1.68	-3.61
BlowingBubbles	-7.54	-0.89	-3.09
BasketballPass	-6.34	-3.10	-6.15
FourPeople	-10.53	-10.69	-10.43
Johnny	-11.22	-14.55	-14.52
KristenAndSara	-9.00	-11.91	-10.73
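
As a quick consistency check, the average for the proposed method can be recomputed from the per-sequence rows. The sketch below copies the last column of the table, assumes the reported average is a plain unweighted mean over the 18 listed rows, and rounds to two decimals. The variable names are illustrative only.

```python
from statistics import mean

# BD-rate (%) of the proposed method, copied row by row from the table.
# RaceHorses appears twice in the table, so a list of pairs is used
# instead of a dict, which would silently drop one of the two rows.
proposed_bd_rate = [
    ("Traffic", -8.21), ("PeopleOnStreet", -7.84), ("Kimono1", -7.19),
    ("Cactus", -8.92), ("BQTerrace", -11.43), ("ParkScene", -4.36),
    ("BasketballDrive", -6.91), ("RaceHorses", -7.89), ("BQMall", -8.76),
    ("PartyScene", -3.23), ("BasketballDrill", -4.99), ("RaceHorses", -6.27),
    ("BQSquare", -3.61), ("BlowingBubbles", -3.09), ("BasketballPass", -6.15),
    ("FourPeople", -10.43), ("Johnny", -14.52), ("KristenAndSara", -10.73),
]

# Unweighted mean over all rows (an assumption about how the table's
# "Average" row was computed).
average = mean(value for _, value in proposed_bd_rate)
print(f"{len(proposed_bd_rate)} sequences, mean BD-rate = {average:.2f}%")
```

Under that assumption the recomputed mean rounds to -7.47%, which agrees with both the Average row and the figure quoted in the abstract.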