Chinese text watermarking algorithm based on chaotic AES and synonym expansion

doi:10.13229/j.cnki.jdxbgxb.20240248

Abstract

Existing modification-based and generation-based text watermarking algorithms generally suffer from pronounced semantic ambiguity, limited embedding capacity, and insufficient security. To address these problems, this paper proposes an improved algorithm based on chaotic AES and synonym expansion. A context matching mechanism built on Sentence-BERT ensures that the context after synonym replacement remains highly similar in meaning to the original text, and it also detects the semantic ambiguity introduced by synonym tampering. Variant characters are treated as an extension of the synonym set, so that more substitutions can be made without introducing semantic ambiguity, which raises the embedding capacity. A chaotic mapping mechanism is used to upgrade the traditional MD5 algorithm and to construct a chaotic AES algorithm that encrypts the hidden information, making it harder to crack. Experimental results show that, compared with similar algorithms, the proposed algorithm performs well in terms of semantic ambiguity, embedding capacity, and security.

Key words: text watermarking algorithm, chaotic MD5, chaotic AES, chaotic mapping mechanism, context matching mechanism, synonym expansion
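
The abstract does not spell out how the chaotic mapping is combined with MD5, so the following is only a minimal sketch under stated assumptions: it takes the logistic map as the chaotic map and prepends the resulting bytes to the message as a salt before hashing. The function names `logistic_bytes` and `chaotic_md5`, the parameter `r = 3.99`, and the burn-in length are hypothetical and are not taken from the paper.

```python
import hashlib


def logistic_bytes(x0: float, r: float, n: int, burn_in: int = 100) -> bytes:
    """Iterate the logistic map x -> r*x*(1-x) and quantize each state to one byte.

    Assumption: the logistic map stands in for the paper's chaotic map.
    """
    if not (0.0 < x0 < 1.0):
        raise ValueError("x0 must lie strictly between 0 and 1")
    x = x0
    for _ in range(burn_in):  # discard the initial transient
        x = r * x * (1.0 - x)
    out = bytearray()
    for _ in range(n):
        x = r * x * (1.0 - x)
        out.append(int(x * 256) % 256)  # map a state in (0, 1) to 0..255
    return bytes(out)


def chaotic_md5(message: bytes, x0: float, r: float = 3.99) -> str:
    """Hypothetical 'chaotic MD5': MD5 of a chaotic salt followed by the message."""
    salt = logistic_bytes(x0, r, 16)
    return hashlib.md5(salt + message).hexdigest()


key = b"watermark key"
digest_a = chaotic_md5(key, x0=0.3141)
digest_b = chaotic_md5(key, x0=0.3142)    # slightly different secret seed
plain = hashlib.md5(key).hexdigest()      # unsalted MD5 for comparison
print(digest_a, digest_b, plain, sep="\n")
```

Because the salt depends on the secret initial value `x0`, a lookup table of plain MD5 digests no longer matches the output. This illustrates the general idea only and says nothing about the strength of the paper's actual construction.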

CLC number: TP391

LI Shu-ming, LI Bing-nan, YANG Chao. Chinese text watermarking algorithm based on chaotic AES and synonym expansion[J]. Journal of Jilin University (Engineering and Technology Edition), 2025, 55(11): 3715-3726.

Figures/Tables: 14 (Figures 1-9, Tables 1-5)

References (17)

[1]	Zhu Qiang-qiang. Research on text steganography based on Chinese character stroke fine-tuning[D]. Hangzhou: College of Network and Information Security, Hangzhou Dianzi University, 2024.
[2]	Huang Yao, Pan Li-li, Xiong Si-yu, et al. Text steganography based on generative adversarial networks and multi-head attention[J]. Journal of Computer Engineering & Science, 2023, 45(10): 1789-1796.
[3]	Fei Wen-bin, Tang Xiang-hong, Wang Jing, et al. Reversible text watermarking algorithm based on prediction error expansion[J]. Journal of Chinese Information Processing, 2015, 29(1): 133-138.
[4]	Xiang L Y, Wu W S, Li X, et al. A linguistic steganography based on word indexing compression and candidate selection[J]. Multimedia Tools and Applications, 2018, 77(21): 28969-28989.
[5]	Yao Ye, Liu Shu-hui, Wang Hui, et al. Robust Chinese text watermarking method based on Chinese character glyph perturbation and font replacing[J]. Journal of Cryptologic Research, 2023, 10(4): 769-785.
[6]	Zhang Na, Zhang Kun, Zhang Xian-guo, et al. Text zero-watermarking algorithm based on keywords and information entropy encoding[J]. Journal of Computer & Digital Engineering, 2021, 49(8): 1612-1618.
[7]	Zheng X Y, Wu H Z. Autoregressive linguistic steganography based on BERT and consistency coding[J]. Security and Communication Networks, 2022: 1-11.
[8]	Yu L, Lu Y L, Yan X H, et al. MTS-Stega: linguistic steganography based on multi-time-step[J]. Entropy, 2022, 24(5): 585.
[9]	Jin Jia-li. Research on text information hiding technology based on automatic selection encoding[D]. Shenyang: School of Information Science and Engineering, Shenyang University of Technology, 2023.
[10]	Reimers N, Gurevych I. Sentence-BERT: sentence embeddings using Siamese BERT-networks[J]. arXiv Preprint, 2019: arXiv:1908.10084.
[11]	Wang X Y, Feng D G, Lai X J, et al. Collisions for hash functions MD4, MD5, HAVAL-128 and RIPEMD[J/OL]. IACR Cryptology ePrint Archive, [2004-08-17].
[12]	Klima V. Tunnels in hash functions: MD5 collisions within a minute[J/OL]. IACR Cryptology ePrint Archive, [2006-04-17].
[13]	Stevens M. Attacks on hash functions and applications[D]. Leiden: Mathematical Institute, Faculty of Science, Leiden University, 2012.
[14]	Rønjom S, Bardeh N G, Helleseth T. Yoyo tricks with AES[C]∥Advances in Cryptology-ASIACRYPT 2017: 23rd International Conference on the Theory and Applications of Cryptology and Information Security, Hong Kong, China, 2017: 217-243.
[15]	Bardeh N G, Rønjom S. The exchange attack: how to distinguish six rounds of AES with chosen plaintexts[C]∥International Conference on the Theory and Application of Cryptology and Information Security, Kobe, Japan, 2019: 347-370.
[16]	Dunkelman O, Keller N, Ronen E, et al. The retracing boomerang attack[C]∥Annual International Conference on the Theory and Applications of Cryptographic Techniques, Zagreb, Croatia, 2020: 280-309.
[17]	Bardeh N G, Rijmen V. New key recovery attack on reduced-round AES[J]. Cryptology ePrint Archive, 2022(2): 43-62.

Synonym extraction results	Variant-character extraction results
认为-以为：1，可以-能够：1，进行-开展：1	外-外：0，并—并：0
总结-总结：0，几个-多个：1，根据-基于：1	晚-晚：0，业—?：1
完成-完成：0，有关-相关：1，比如-例如：1	内-內：1，任—仼：1
几天-几天：0，可以-能够：1，认为-认为：0	值-値：1，并—幷：1
不在-不在：0，由此-由此：0，判断-判断：0	出-出：0，晚—晩：1
产生-生成：1，阈值-阙值：1，根据-根据：0	出-出：0，并—并：0
规则-规则：0，准确-正确：1，住校-住校：0	出-出：0，别—別：1
识别-识别：0，结合-结合：0，杜绝-杜绝：0	内-内：0，出—出：0
随意-随便：1，通过-借助：1，形成-建立：1	外-：1，出—岀：1
当天-当日：1，通过-通过：0，及时-实时：1	晚-晩：1，并—并：0
相应-对应：1，及时-及时：0，了解-洞悉：1
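
The bit labels in the table above are consistent with a simple rule: a pair whose second item equals the first carries bit 0, and a pair whose second item is a synonym or variant character carries bit 1. A minimal sketch of that reading follows, using a hypothetical helper `extract_bits`. The paper's actual extraction procedure may involve further steps, such as the context matching check, that are not shown here.

```python
def extract_bits(pairs):
    """Return one bit per (original, observed) pair.

    Assumed rule, inferred from the table: 0 if the word is unchanged,
    1 if it was replaced by a synonym or a variant character.
    """
    return [0 if original == observed else 1 for original, observed in pairs]


# A few rows copied from the table: six synonym pairs, then two variant-character pairs.
rows = [
    ("认为", "以为"), ("可以", "能够"), ("进行", "开展"),
    ("总结", "总结"), ("几个", "多个"), ("根据", "基于"),
    ("外", "外"), ("内", "內"),
]

bits = extract_bits(rows)
print(bits)  # [1, 1, 1, 0, 1, 1, 0, 1]
```

Comparing code points directly is what lets variant characters such as 内 and 內 act as carriers: they look nearly identical to a reader but are distinct characters to the extractor.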

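Attacker	Target	Attempts	Successful cracks
CMD5 website	Key encrypted with traditional MD5	1 000	613
CMD5 website	Key enhanced with chaotic MD5	1 000	0
MD5Crack tool	Key encrypted with traditional MD5	1 000	968
MD5Crack tool	Key enhanced with chaotic MD5	1 000	0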
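
For readers who prefer rates to raw counts, the table above can be converted as follows. This is plain arithmetic on the reported numbers; the dictionary layout and the short labels are illustrative choices, not the paper's.

```python
# (attacker, target) -> (attempts, successful cracks), copied from the table
results = {
    ("CMD5 website", "traditional MD5"): (1000, 613),
    ("CMD5 website", "chaotic MD5"): (1000, 0),
    ("MD5Crack tool", "traditional MD5"): (1000, 968),
    ("MD5Crack tool", "chaotic MD5"): (1000, 0),
}

# Fraction of attempts that succeeded, per attacker/target combination
rates = {name: cracked / attempts for name, (attempts, cracked) in results.items()}

for (attacker, target), rate in rates.items():
    print(f"{attacker:<14} {target:<16} {rate:.1%}")
```

The traditional MD5 keys were recovered in 61.3% (CMD5 website) and 96.8% (MD5Crack tool) of attempts, while none of the chaotic MD5 keys were recovered in 1 000 attempts by either attacker.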

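Algorithm performance	Ref. [3]	Ref. [6]	This paper
Resistance to semantic ambiguity	Medium	High	Fairly high
Embedding capacity	Medium	Fairly low	High
Detection of synonym tampering	Not supported	Fairly high	High
Crack resistance of the chaotic algorithm	Medium	Not supported	Fairly high
Crack resistance of the encryption algorithm	Not supported	Fairly high	High
Time complexity	Medium	High	Slightly above medium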