Journal of Jilin University (Information Science Edition) ›› 2025, Vol. 43 ›› Issue (5): 1128-1137.

Video Anomaly Detection Framework Based on Bidirectional Spatio-Temporal Feature Fusion GAN 

ZHAO Yugang1a, YANG Yujia1b, XIANG Ting2, JIN Honglin1a   

  1. 1a. School of Computer Science and Engineering; 1b. School of Intelligent Manufacturing and Electrical Engineering, Guangzhou Institute of Science and Technology, Guangzhou 510540, China; 2. Traditional Chinese Medicine Department, First Affiliated Hospital of Sun Yat-sen University, Guangzhou 510080, China
  • Received: 2023-10-07 Online: 2025-09-28 Published: 2025-11-20

Abstract: To improve the accuracy of video anomaly detection in complex scenes, a video anomaly detection framework based on an improved GAN (Generative Adversarial Network) is proposed. Two discriminators are used for adversarial training of the generator, and bidirectional prediction consistency is enhanced through a regression loss function. FusionNet and LSTM (Long Short-Term Memory) are combined to form a generator structure based on spatio-temporal feature fusion. Forward and backward video sequences are taken as the inputs of the generator, and predicted video frames and predicted video sequences are output, respectively. Both discriminators adopt the PatchGAN architecture: the frame discriminator distinguishes synthetic frames from real ones, and the sequence discriminator determines whether a frame sequence contains at least one synthetic frame. This maintains the temporal consistency of the predicted frames and improves the robustness and accuracy of the prediction network. Finally, the anomaly score is calculated from the normalized mean PSNR (Peak Signal-to-Noise Ratio). Experimental results show that the proposed framework effectively captures the bidirectional spatio-temporal features in video sequences and outperforms other state-of-the-art methods on the challenging public video anomaly detection datasets UCF-Crime (University of Central Florida Crime) and ShanghaiTech.
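
The PSNR-based scoring step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the following assumes the min-max normalization commonly used in prediction-based anomaly detection, and assumes that the "mean" is taken over several predictions of the same frame (for example, a forward and a backward prediction). The function names `psnr`, `mean_psnr`, and `anomaly_scores`, and the parameters `max_val` and `eps`, are hypothetical and do not come from the paper.

```python
import numpy as np


def psnr(pred, target, max_val=1.0, eps=1e-10):
    """PSNR in dB between a predicted frame and the ground-truth frame.

    max_val is the largest possible pixel value (1.0 for frames scaled to [0, 1]).
    The MSE is floored at eps so that identical frames give a finite value.
    """
    diff = np.asarray(pred, dtype=np.float64) - np.asarray(target, dtype=np.float64)
    mse = max(float(np.mean(diff ** 2)), eps)
    return 10.0 * np.log10(max_val ** 2 / mse)


def mean_psnr(preds, target, max_val=1.0):
    """Average PSNR over several predictions of the same frame.

    Assumption: preds holds, e.g., the forward and backward predictions.
    """
    return float(np.mean([psnr(p, target, max_val) for p in preds]))


def anomaly_scores(psnr_values, eps=1e-8):
    """Map per-frame PSNR values of one video to anomaly scores in [0, 1].

    Assumption: min-max normalization over the video, then inversion,
    so that a poorly predicted frame (low PSNR) receives a high score.
    """
    p = np.asarray(psnr_values, dtype=np.float64)
    normalized = (p - p.min()) / (p.max() - p.min() + eps)
    return 1.0 - normalized
```

Under these assumptions, a frame that the generator predicts poorly in both directions has a low mean PSNR and therefore a score close to 1, while well-predicted (normal) frames score close to 0.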

Key words: video anomaly detection, generative adversarial network (GAN), FusionNet model, long short-term memory (LSTM), spatio-temporal feature fusion

CLC Number: 

  • TP391