Multi-scale context-aware and boundary-guided image manipulation detection method

doi:10.13229/j.cnki.jdxbgxb.20231027

Abstract

To address the blurred boundaries, single-scale features, and neglect of background information in traditional image manipulation detection methods, this paper proposes a multi-scale context-aware and boundary-guided image manipulation detection method. First, an improved pyramid vision transformer extracts spatial details and basic features from the manipulated image. Second, an edge-aware module explores information related to the edges of the forged region and generates an edge prediction map. Third, an edge guidance module highlights the key channels in the extracted features and reduces interference from redundant channels. Fourth, a multi-scale context-aware module learns rich contextual information about the forged region from multiple receptive fields. Finally, a feature fusion module alternately attends to the foreground and background of the manipulated image to segment the forged region precisely. The method is compared quantitatively and qualitatively with other methods on five commonly used public image manipulation detection datasets. The experimental results show that it detects forged regions effectively and outperforms the compared methods in F₁ score on all five datasets.

Key words: computer application, image manipulation detection, multi-scale context-aware, boundary guidance
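
The abstract states that an edge prediction map is generated for the forged region, but it does not say how the edge supervision is obtained. A common choice in boundary-guided segmentation is to derive a binary edge label from the ground-truth mask. The sketch below illustrates that idea under stated assumptions: the function name `mask_to_edge`, the 4-connected neighbourhood, and the one-pixel edge width are all hypothetical and are not taken from the paper.

```python
def mask_to_edge(mask):
    """Return a binary edge map for a binary mask (list of lists of 0/1).

    Assumption: a foreground pixel is an edge pixel when at least one of its
    4-connected neighbours inside the image is background. Neighbours that
    fall outside the image are ignored.
    """
    height, width = len(mask), len(mask[0])
    edge = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if mask[y][x] != 1:
                continue  # background pixels are never edge pixels
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < height and 0 <= nx < width and mask[ny][nx] == 0:
                    edge[y][x] = 1
                    break
    return edge


# Toy 5x5 mask with a 3x3 forged block in the centre (hypothetical data).
TOY_MASK = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
TOY_EDGE = mask_to_edge(TOY_MASK)
```

In this toy case only the ring of the 3x3 block is marked as edge, and the interior pixel stays 0. A wider edge band, if desired, would need a larger neighbourhood.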

CLC number: TP391

Hai-peng CHEN, Shi-bo ZHANG, Ying-da LYU. Multi-scale context-aware and boundary-guided image manipulation detection method[J]. Journal of Jilin University (Engineering and Technology Edition), 2025, 55(6): 2114-2121.

References (22)

[1]	Shi Z, Chen H, Zhang D. Transformer-auxiliary neural networks for image manipulation localization by operator inductions[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2023, 33(9): 4907-4920.
[2]	Zhong Hui, Kang Heng, Lyu Ying-da, et al. Image manipulation localization algorithm based on channel attention convolutional neural networks[J]. Journal of Jilin University (Engineering and Technology Edition), 2021, 51(5): 1838-1844.
[3]	Shi Ze-nan, Chen Hai-peng, Zhang Dong, et al. Pretraining-driven multimodal boundary-aware vision transformer[J]. Journal of Software, 2023, 34(5): 2051-2067.
[4]	Xu D, Shen X, Lyu Y, et al. MC-Net: Learning mutually complementary features for image manipulation localization[J]. International Journal of Intelligent Systems, 2022, 37(5): 3072-3089.
[5]	Mahdian B, Saic S. Using noise inconsistencies for blind image forensics[J]. Image and Vision Computing, 2009, 27(10): 1497-1503.
[6]	Lin Z, He J, Tang X, et al. Fast, automatic and fine-grained tampered JPEG image detection via DCT coefficient analysis[J]. Pattern Recognition, 2009, 42(11): 2492-2501.
[7]	Popescu A C, Farid H. Exposing digital forgeries in color filter array interpolated images[J]. IEEE Transactions on Signal Processing, 2005, 53(10): 3948-3959.
[8]	Zhou P, Chen B C, Han X, et al. Generate, segment, and refine: towards generic manipulation segmentation[C]∥Proceedings of the AAAI Conference on Artificial Intelligence, New York, USA, 2020: 13058-13065.
[9]	Wang J, Wu Z, Chen J, et al. Objectformer for image manipulation detection and localization[C]∥Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, USA, 2022: 2364-2373.
[10]	Lin X, Wang S, Deng J, et al. Image manipulation detection by multiple tampering traces and edge artifact enhancement[J]. Pattern Recognition, 2023, 133: 109026-109036.
[11]	Wang W, Xie E, Li X, et al. PVT v2: improved baselines with pyramid vision transformer[J]. Computational Visual Media, 2022, 8(3): 415-424.
[12]	Hu Lin-hui, Chen Bao-ying, Tan Shun-quan, et al. Convnext-Upernet based deep-learning model for image forgery detection and localization[J/OL]. [2023-09-10].
[13]	Dong J, Wang W, Tan T. CASIA image tampering detection evaluation database[C]∥2013 IEEE China Summit and International Conference on Signal and Information Processing, Beijing, China, 2013: 422-426.
[14]	Guan H, Kozak M, Robertson E, et al. MFC datasets: Large-scale benchmark datasets for media forensic challenge evaluation[C]∥2019 IEEE Winter Applications of Computer Vision Workshops (WACVW), Waikoloa, USA, 2019: 63-72.
[15]	Hsu Y F, Chang S F. Detecting image splicing using geometry invariants and camera characteristics consistency[C]∥2006 IEEE International Conference on Multimedia and Expo, Toronto, Canada, 2006: 549-552.
[16]	Wen B, Zhu Y, Subramanian R, et al. COVERAGE: a novel database for copy-move forgery detection[C]∥2016 IEEE International Conference on Image Processing (ICIP), Phoenix, USA, 2016: 161-165.
[17]	Novozamsky A, Mahdian B, Saic S. IMD2020: a large-scale annotated dataset tailored for detecting manipulated images[C]∥Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision Workshops, Snowmass Village, USA, 2020: 71-80.
[18]	Wu Y, AbdAlmageed W, Natarajan P. Mantra-net: Manipulation tracing network for detection and localization of image forgeries with anomalous features[C]∥Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, USA, 2019: 9543-9552.
[19]	Hu X, Zhang Z, Jiang Z, et al. SPAN: spatial pyramid attention network for image manipulation localization[C]∥Proceedings of the European Conference on Computer Vision (ECCV), Glasgow, UK, 2020: 312-328.
[20]	Chen X, Dong C, Ji J, et al. Image manipulation detection by multi-view multi-scale supervision[C]∥Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, Canada, 2021: 14165-14173.
[21]	Zhuang P, Li H, Tan S, et al. Image tampering localization using a dense fully convolutional network[J]. IEEE Transactions on Information Forensics and Security, 2021, 16: 2986-2999.
[22]	Zhuo L, Tan S, Li B, et al. Self-adversarial training incorporating forgery attention for image forgery localization[J]. IEEE Transactions on Information Forensics and Security, 2022, 17: 819-834.

Dataset	Training set size	Test set size	Splicing	Copy-move	Removal	Post-processing
CASIA [13]	5123	921	√		√	√
NIST [14]	404	160	√	√	√	√
Columbia [15]	—	180	√
COVER [16]	75	25		√		√
IMD2020 [17]	1610	400				√

Method	CASIA AUC	CASIA F₁	NIST AUC	NIST F₁	Columbia AUC	Columbia F₁	COVER AUC	COVER F₁	IMD2020 AUC	IMD2020 F₁	Mean AUC	Mean F₁
ManTra [18]	0.796	0.267	0.959	0.638	0.736	0.243	0.777	0.283	0.773	0.249	0.808	0.336
SPAN [19]	0.709	0.213	0.779	0.252	0.741	0.463	0.791	0.325	—	—	0.755	0.313
MVSS [20]	0.847	0.318	0.981	0.768	0.808	0.417	0.808	0.284	0.802	0.396	0.849	0.437
GSRNet [8]	0.836	0.340	0.967	0.640	0.900	0.433	0.788	0.218	—	—	0.873	0.408
DenseFCN [21]	0.809	0.203	0.979	0.812	0.761	0.257	0.754	0.185	0.715	0.272	0.804	0.346
LocateNet [22]	0.754	0.273	0.986	0.738	0.718	0.411	0.813	0.282	—	—	0.818	0.426
EMTNet [10]	0.856	0.459	0.987	0.825	0.832	0.561	0.812	0.353	—	—	0.872	0.550
Ours	0.848	0.647	0.965	0.898	0.770	0.609	0.776	0.372	0.801	0.507	0.832	0.607
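
The comparison above reports pixel-level AUC and F₁, and the Mean columns appear to average only over the datasets for which a score is reported (entries marked "—" are skipped). The following sketch shows one plausible way to compute these quantities. It is an illustration under assumptions, not the authors' evaluation code: the function names, the rank-based AUC formulation, and the 0.5 binarization threshold used in the toy example are all hypothetical choices.

```python
def f1_score(pred, truth):
    """Pixel-level F1 for two flat lists of 0/1 labels."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    if tp == 0:
        return 0.0  # no true positives: F1 is defined as 0 here
    return 2 * tp / (2 * tp + fp + fn)


def auc(scores, truth):
    """AUC as the probability that a random positive pixel scores higher
    than a random negative pixel (ties count as one half)."""
    pos = [s for s, t in zip(scores, truth) if t == 1]
    neg = [s for s, t in zip(scores, truth) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_reported(values):
    """Average over reported scores only; None marks a missing entry."""
    reported = [v for v in values if v is not None]
    return sum(reported) / len(reported)


# Toy example (hypothetical scores), binarized at an assumed threshold of 0.5.
TRUTH = [1, 1, 0, 0, 1, 0]
SCORES = [0.9, 0.4, 0.6, 0.2, 0.8, 0.1]
PRED = [1 if s >= 0.5 else 0 for s in SCORES]

# Rows copied from the comparison table; None stands for "—".
SPAN_AUC = [0.709, 0.779, 0.741, 0.791, None]
OURS_F1 = [0.647, 0.898, 0.609, 0.372, 0.507]
```

Applied to the table rows, `mean_reported` reproduces the Mean entries for SPAN (AUC over four datasets) and for the proposed method (F₁ over five datasets) after rounding to three decimals. The pairwise AUC is quadratic in the number of pixels, so it is only suitable for small illustrations.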

Method	EAM	EFM	MCAM	FFM	AUC	F₁
a. Baseline					0.800	0.595
b. Baseline+EAM	√				0.816	0.617
c. Baseline+EAM+EFM	√	√			0.835	0.634
d. Baseline+EAM+EFM+MCAM	√	√	√		0.841	0.639
e. Baseline+EAM+EFM+MCAM+FFM	√	√	√	√	0.848	0.647
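
To make the ablation easier to read, the gain contributed by each added component can be computed directly from the rows above. The sketch below only performs this arithmetic on the reported numbers; the helper name `incremental_gains` is hypothetical, and the gains depend on the order in which the components were added in the table.

```python
# (configuration, AUC, F1) copied from the ablation table.
ABLATION = [
    ("Baseline", 0.800, 0.595),
    ("+EAM", 0.816, 0.617),
    ("+EFM", 0.835, 0.634),
    ("+MCAM", 0.841, 0.639),
    ("+FFM", 0.848, 0.647),
]


def incremental_gains(rows):
    """Return (component, AUC gain, F1 gain) for each step relative to
    the preceding configuration, rounded to three decimals."""
    gains = []
    for (_, prev_auc, prev_f1), (name, cur_auc, cur_f1) in zip(rows, rows[1:]):
        gains.append((name,
                      round(cur_auc - prev_auc, 3),
                      round(cur_f1 - prev_f1, 3)))
    return gains


GAINS = incremental_gains(ABLATION)
TOTAL_AUC_GAIN = round(ABLATION[-1][1] - ABLATION[0][1], 3)
TOTAL_F1_GAIN = round(ABLATION[-1][2] - ABLATION[0][2], 3)
```

On these numbers, EAM and EFM account for the larger steps (0.016 and 0.019 in AUC, 0.022 and 0.017 in F₁), while MCAM and FFM add smaller increments. The full model gains 0.048 in AUC and 0.052 in F₁ over the baseline.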

Post-processing	Parameter	Ours	MVSS	SPAN	ManTra	GSRNet	DenseFCN	LocateNet	EMTNet
Gaussian noise	Standard deviation = 3	0.872	0.764	0.164	0.060	0.602	0.802	0.724	0.822
Gaussian noise	Standard deviation = 9	0.854	0.758	0.165	0.051	0.603	0.749	0.707	0.809
Gaussian noise	Standard deviation = 15	0.839	0.751	0.165	0.050	0.597	0.690	0.682	0.808
JPEG compression	Quality factor = 50	0.873	0.761	0.250	0.186	0.567	0.811	0.691	0.812
JPEG compression	Quality factor = 75	0.887	0.770	0.252	0.226	0.569	0.812	0.736	0.825
JPEG compression	Quality factor = 100	0.891	0.766	0.252	0.457	0.572	0.811	0.738	0.825
Gaussian blur	Kernel size = 3	0.880	0.768	0.249	0.198	0.602	0.802	0.737	0.824
Gaussian blur	Kernel size = 9	0.856	0.742	0.239	0.165	0.583	0.749	0.716	0.811
Gaussian blur	Kernel size = 15	0.815	0.711	0.231	0.161	0.573	0.690	0.645	0.782
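
The robustness results above are obtained on images degraded by Gaussian noise, JPEG compression, and Gaussian blur. As an illustration of two of these degradations, the sketch below applies additive Gaussian noise and a separable Gaussian blur to a small grayscale image stored as nested lists. It is a sketch under assumptions rather than the authors' protocol: the function names are hypothetical, the border is handled by replicating edge pixels, and the blur sigma is derived from the kernel size with the rule OpenCV uses when no sigma is given, which the source does not specify. JPEG compression is omitted because it needs an image codec.

```python
import math
import random


def add_gaussian_noise(image, std, seed=0):
    """Add zero-mean Gaussian noise and clip the result to [0, 255]."""
    rng = random.Random(seed)  # fixed seed keeps the example reproducible
    return [[min(255, max(0, round(px + rng.gauss(0.0, std)))) for px in row]
            for row in image]


def gaussian_kernel(size, sigma=None):
    """Normalized 1-D Gaussian kernel of odd length `size`."""
    if size < 1 or size % 2 == 0:
        raise ValueError("kernel size must be a positive odd integer")
    if sigma is None:
        # Assumed default: sigma rule used by OpenCV for a given kernel size.
        sigma = 0.3 * ((size - 1) * 0.5 - 1) + 0.8
    centre = size // 2
    weights = [math.exp(-((i - centre) ** 2) / (2 * sigma ** 2))
               for i in range(size)]
    total = sum(weights)
    return [w / total for w in weights]


def _blur_rows(image, kernel):
    """Convolve every row with the kernel, replicating border pixels."""
    centre = len(kernel) // 2
    out = []
    for row in image:
        width = len(row)
        out.append([
            sum(k * row[min(width - 1, max(0, x + i - centre))]
                for i, k in enumerate(kernel))
            for x in range(width)
        ])
    return out


def gaussian_blur(image, size):
    """Separable Gaussian blur: filter rows, then columns."""
    kernel = gaussian_kernel(size)
    horizontal = _blur_rows(image, kernel)
    transposed = [list(col) for col in zip(*horizontal)]
    vertical = _blur_rows(transposed, kernel)
    return [list(col) for col in zip(*vertical)]


# Toy 4x4 constant image (hypothetical data).
FLAT = [[100] * 4 for _ in range(4)]
NOISY = add_gaussian_noise(FLAT, std=9)
BLURRED = gaussian_blur(FLAT, size=3)
```

Because the kernel is normalized, blurring a constant image leaves it unchanged up to floating-point error, which gives a quick sanity check of the implementation.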