Journal of Jilin University (Information Science Edition) ›› 2024, Vol. 42 ›› Issue (5): 930-936.

Research on Artificial Intelligence Modeling Methods for Financial Transaction Anti-Fraud

 QIAN Lianghong1, WANG Fude2,3, SONG Hailong

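  1. Data Science Department, Yepdata Software Technology (Shanghai) Company Limited, Shanghai 200233, China; 2. Technology Department, Jilin Haicheng Technology Company Limited, Changchun 130119, China; 3. Smart Agriculture Research Institute, Jilin Agricultural University, Changchun 130118, China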
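  • Received: 2023-12-28 Online: 2024-10-21 Published: 2024-10-23
  • Corresponding author: WANG Fude (born 1990), male, from Da'an, Jilin, engineer at Jilin Haicheng Technology Company Limited, mainly engaged in research on computer education technology, (Tel) 86-18243044666, (E-mail) 562324919@qq.com.
  • About the author: QIAN Lianghong (born 1989), male, from Nanjing, engineer at Yepdata Software Technology (Shanghai) Company Limited, mainly engaged in artificial intelligence research, (Tel) 86-15900530128, (E-mail) 31311853@qq.com.
  • Supported by:
    Industrialization Cultivation Fund Project of the Education Department of Jilin Province (JJKH20240274CY)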

Research on AI Modeling Approaches of Financial Transactional Fraud Detection

QIAN Lianghong1, WANG Fude2, SONG Hailong2   

  1. Data Science Department, Yepdata Software Technology Company Limited, Shanghai 200233, China; 2. Technology Department, Jilin Haicheng Technology Company Limited, Changchun 130119, China; 3. Smart Agriculture Research Institute, Jilin Agricultural University, Changchun 130118, China
  • Received: 2023-12-28 Online: 2024-10-21 Published: 2024-10-23

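Abstract: To combat fraud in financial transactions and safeguard financial security, an end-to-end modeling workflow, methodology, and model architecture are proposed for financial transaction data, which is characterized by class imbalance and discrete categories. The workflow covers data preprocessing, model training, and prediction. The effectiveness and efficiency of different models with different numbers of features are compared and validated on a real-world dataset, providing a reference for financial institutions to choose models of different types and feature counts according to their own optimization goals and resource constraints. Tree-based models with a large number of features (more than 200) suit scenarios where resources are plentiful and the best possible model performance is sought; medium-scale neural network models (100 to 200 features) suit scenarios with moderate resources; and decision tree or logistic regression models with a small number of features suit long-tail scenarios where resources are limited and performance requirements are modest.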

Key words: financial transaction anti-fraud, artificial intelligence, model selection, machine learning, deep learning

Abstract: To detect transactional fraud in the financial services industry and maintain financial security, an end-to-end modeling framework, methodology, and model architecture are proposed for financial transaction data, which is characterized by class imbalance and discrete categories. The framework covers data preprocessing, model training, and model prediction. The performance and efficiency of different models with different numbers of features are compared and validated on a real-world dataset. The results demonstrate that the proposed approach can effectively improve the accuracy and efficiency of financial transactional fraud detection, providing a reference for financial institutions to select models of different types and feature counts according to their own optimization goals and resource constraints. Tree-based models with more than 200 features suit resource-rich settings that pursue the best possible model performance, while neural networks with medium-sized feature sets (100 to 200) suit settings with moderate resources. Decision trees or logistic regression with small feature sets suit resource-constrained, long-tail scenarios with modest performance requirements.

Key words: fraud detection, artificial intelligence, model selection, machine learning, deep learning 
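
The model-selection guidance stated in the abstract can be written down as a small lookup. The following Python sketch is illustrative only: the names `ResourceTier`, `Recommendation`, and `recommend` are hypothetical and do not come from the paper, and the three resource tiers are an assumed discretization of what the abstract describes informally. The abstract gives no numeric threshold for a "small" feature set, so none is encoded here.

```python
from dataclasses import dataclass
from enum import Enum


class ResourceTier(Enum):
    """Hypothetical resource tiers; the abstract describes them only informally."""
    AMPLE = "ample"
    MODERATE = "moderate"
    LIMITED = "limited"


@dataclass(frozen=True)
class Recommendation:
    model_family: str
    feature_count: str  # textual range, as worded in the abstract


# Mapping transcribed from the abstract's guidance; identifiers are illustrative.
_GUIDANCE = {
    ResourceTier.AMPLE: Recommendation("tree-based model", "more than 200"),
    ResourceTier.MODERATE: Recommendation("neural network", "100 to 200"),
    ResourceTier.LIMITED: Recommendation(
        "decision tree or logistic regression", "small (no threshold stated)"
    ),
}


def recommend(tier: ResourceTier) -> Recommendation:
    """Return the model family and feature-count range suggested for a tier."""
    return _GUIDANCE[tier]


if __name__ == "__main__":
    for tier in ResourceTier:
        rec = recommend(tier)
        print(f"{tier.value}: {rec.model_family}, features: {rec.feature_count}")
```

Keying the lookup on the resource tier alone mirrors how the abstract pairs each scenario with both a model family and a feature-count range; an institution with a different optimization goal would need its own mapping.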

CLC number: 

  • TP181