Journal of Jilin University (Information Science Edition) ›› 2025, Vol. 43 ›› Issue (4): 724-735.


ADCFA-MVSNet: Multi-View Stereo with Adaptive Depth Consistency and Cross-Frequency Attention 

ING Hang, WANG Gang, WANG Yan, HOU Minghui   

  1. College of Computer Science and Technology, Jilin University, Changchun 130012, China
  • Received: 2024-08-15 Online: 2025-08-15 Published: 2025-08-14

Abstract: Current deep-learning approaches to 3D reconstruction struggle to extract comprehensive scene information from images and give insufficient consideration to depth consistency between views. To address these problems, ADCFA-MVSNet, a multi-view stereo network with adaptive depth consistency and cross-frequency attention, is proposed. The CFA (Cross-Frequency Attention) module integrates high-frequency and low-frequency information within images with global scene information across views, enabling more comprehensive feature extraction. The AD (Adaptive Depth) consistency module precisely captures the geometric structure of the scene and dynamically accounts for the contribution of different views to depth consistency, enhancing that consistency across multiple scales. The innovation of the method lies in using comprehensive image information to ensure geometric consistency. On the DTU (Technical University of Denmark) dataset, it achieves an accuracy of 0.319, a completeness of 0.285, and an overall score of 0.302, surpassing the compared methods. On the BlendedMVS dataset, it achieves an EPE (End-Point Error) of 0.27, an e1 score of 5.28, and an e3 score of 1.84, outperforming the compared methods. These results demonstrate the effectiveness of ADCFA-MVSNet in improving the completeness and accuracy of multi-view 3D reconstruction.
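
The abstract does not define the reported metrics, so the following is only a minimal sketch under assumed definitions. It assumes the convention common in MVS evaluation: the DTU overall score is the mean of the accuracy and completeness errors (lower is better), EPE is the mean absolute depth error over valid pixels, and e1 and e3 are the percentages of valid pixels whose absolute depth error exceeds 1 and 3. The function names `dtu_overall` and `depth_errors` are hypothetical and do not come from the paper.

```python
from typing import Dict, Sequence


def dtu_overall(accuracy: float, completeness: float) -> float:
    """Assumed DTU convention: overall = mean of accuracy and completeness."""
    return (accuracy + completeness) / 2.0


def depth_errors(pred: Sequence[float], gt: Sequence[float]) -> Dict[str, float]:
    """EPE, e1 and e3 under the assumed definitions.

    Pixels with non-positive ground-truth depth are treated as invalid
    and skipped (an assumption, not something the abstract states).
    """
    errs = [abs(p - g) for p, g in zip(pred, gt) if g > 0]
    if not errs:
        raise ValueError("no valid ground-truth pixels")
    n = len(errs)
    return {
        "EPE": sum(errs) / n,                           # mean absolute error
        "e1": 100.0 * sum(e > 1.0 for e in errs) / n,   # % of pixels with error > 1
        "e3": 100.0 * sum(e > 3.0 for e in errs) / n,   # % of pixels with error > 3
    }


# The reported DTU figures are consistent with this convention.
print(round(dtu_overall(0.319, 0.285), 3))  # prints 0.302
```

Under these assumptions, all three DTU values and all three BlendedMVS values are errors, so lower numbers indicate better reconstructions.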

Key words: computer vision, multi-view stereo, deep learning, cross-frequency attention, adaptive depth consistency

CLC Number: 

  • TP391.4