Journal of Jilin University (Information Science Edition) ›› 2024, Vol. 42 ›› Issue (2): 356-365.


BNN Pruning Method Based on Evolution from Ternary to Binary

XU Tu, ZHANG Bo, LI Zhen, CHEN Yining, SHEN Rensheng, XIONG Botao, CHANG Yuchun

  1. College of Microelectronics, Dalian University of Technology, Dalian 116000, China
  • Received: 2023-04-10 Online: 2024-04-10 Published: 2024-04-12
  • Corresponding author: CHANG Yuchun (1973— ), male, born in Changchun, professor at Dalian University of Technology; his research interests include analog/mixed-signal integrated circuit design and application-specific integrated circuit design. (Tel) 86-13180826212 (E-mail) cyc@dlut.edu.cn
  • About the first author: XU Tu (1998— ), male, born in Tieling, Liaoning, master's student at Dalian University of Technology; his research interests include digital integrated circuit design and deep learning algorithms. (Tel) 86-18018900069 (E-mail) dlutxutu@163.com
  • Supported by:
    Dalian Science and Technology Bureau Foundation (2020RT01); Special Fund for Industrial Foundation Reconstruction and High-Quality Development of the Manufacturing Industry (TC220A04A-49); Electronic Components Laboratory Science and Technology Foundation on Reliability Physics and Application Technology (6142806210302)

Abstract: BNNs (Binarized Neural Networks) are popular due to their extremely low memory requirements. While BNNs can be further compressed through pruning, existing BNN pruning methods suffer from low pruning ratios, significant accuracy degradation, and reliance on fine-tuning after training. To overcome these limitations, a filter-level BNN pruning method based on evolution from ternary to binary, named ETB (Evolution from Ternary to Binary), is proposed. ETB is learning-based: by introducing trainable quantization thresholds into the quantization function of BNNs, it makes the weights and activations gradually evolve from ternary to binary or zero, enabling the network to automatically identify unimportant structures during training. A pruning-ratio adjustment algorithm is also designed to regulate the pruning rate of the network. After training, all-zero filters and the corresponding output channels can be pruned directly, yielding a slim BNN without fine-tuning. To demonstrate the feasibility of the proposed method and its potential to improve BNN inference efficiency without sacrificing accuracy, experiments are conducted on CIFAR-10. ETB prunes the VGG-Small model by 46.3%, compressing the model size to 0.34 MB with an accuracy of 89.97%, and prunes the ResNet-18 model by 30.01%, compressing the model size to 1.33 MB with an accuracy of 90.79%. Compared with some existing BNN pruning methods, ETB shows advantages in terms of accuracy and parameter count.
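
The core mechanism in the abstract can be illustrated with a short sketch. The following PyTorch code is a minimal illustration, not the authors' implementation: the class name `TernaryToBinaryQuant`, the single trainable threshold `delta`, and the piecewise-linear surrogate used for the backward pass are all assumptions, since the abstract does not give ETB's exact quantization function or gradient rule.

```python
# Minimal sketch (assumed formulation): a ternary quantizer q(x) in {-1, 0, +1}
# with a trainable dead-zone threshold. If training drives the threshold toward
# zero, the output becomes purely binary; if it grows past all weight magnitudes
# of a filter, the whole filter quantizes to zero and can later be pruned.
import torch
import torch.nn as nn

class TernaryToBinaryQuant(nn.Module):
    def __init__(self, init_delta: float = 0.05):
        super().__init__()
        # A single threshold for brevity; a per-filter threshold vector is the
        # natural choice for filter-level pruning.
        self.delta = nn.Parameter(torch.tensor(init_delta))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        delta = self.delta.abs()
        # Hard ternary values used in the forward pass: sign(x) outside the
        # dead zone [-delta, delta], zero inside it.
        q = torch.sign(x) * (x.abs() > delta).float()
        # Piecewise-linear surrogate, differentiable in both x and delta,
        # used only for the backward pass (a straight-through-style estimator).
        s = torch.clamp(x - delta, min=0.0, max=1.0) \
            + torch.clamp(x + delta, min=-1.0, max=0.0)
        # Forward value equals q; gradients flow through s.
        return s + (q - s).detach()
```

In this sketch the latent weights and the threshold are both trained (e.g. `w_q = TernaryToBinaryQuant()(conv.weight)`); the paper's pruning-ratio adjustment algorithm would additionally steer how many thresholds grow large enough to zero out whole filters, which this sketch does not model.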
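
The post-training step described in the abstract, removing all-zero filters without fine-tuning, could then look like the following sketch. The helper name `prune_zero_filters` and its interface are hypothetical; `quantize` stands for a trained quantizer such as the one above.

```python
# Minimal sketch (assumed interface): drop filters whose quantized weights are
# entirely zero, together with their output channels.
import torch
import torch.nn as nn

def prune_zero_filters(conv: nn.Conv2d, quantize) -> nn.Conv2d:
    q_w = quantize(conv.weight)                    # (out_ch, in_ch, kH, kW)
    keep = q_w.flatten(1).abs().sum(dim=1) > 0     # True where a filter survives
    pruned = nn.Conv2d(conv.in_channels, int(keep.sum()),
                       conv.kernel_size, conv.stride, conv.padding,
                       bias=conv.bias is not None)
    with torch.no_grad():
        pruned.weight.copy_(conv.weight[keep])     # slice surviving filters
        if conv.bias is not None:
            pruned.bias.copy_(conv.bias[keep])
    return pruned
```

In a real network the same mask must also be applied to the next layer's input channels and to any BatchNorm or scaling parameters tied to the removed channels; because the surviving weights are unchanged, no fine-tuning is required, matching the abstract's claim.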

Key words: binarized neural network, pruning, trainable threshold, evolution

CLC number: 

  • TP399