吉林大学学报(工学版) ›› 2024, Vol. 54 ›› Issue (12): 3423-3432. doi: 10.13229/j.cnki.jdxbgxb.20230087

• Vehicle Engineering · Mechanical Engineering •

  • About the author: ZHAO Bin (1987-), male, senior engineer, Ph.D. Research interests: visual grasping, artificial intelligence. E-mail: tech_zhaobin@126.com
  • Funding:
    National Natural Science Foundation of China (U20A20197); Key Research and Development Program of Liaoning Province (2020JH2/10100040)

Target grasping network technology of robot manipulator based on attention mechanism

Bin ZHAO1,2,3(),Cheng-dong WU1,3,Xue-jiao ZHANG3,Ruo-huai SUN1,Yang JIANG3   

  1. College of Information Science and Engineering, Northeastern University, Shenyang 110819, China
    2. SIASUN Robot & Automation Co., Ltd., Shenyang 110168, China
    3. Faculty of Robot Science and Engineering, Northeastern University, Shenyang 110169, China
  • Received: 2023-01-31  Online: 2024-12-01  Published: 2025-01-24


Abstract:

To solve the single-object grasp detection problem for robotic manipulators, a SqueezeNet network model enhanced with the convolutional block attention module (CBAM) is proposed. First, the depth-vision grasping system is described and the hand-eye calibration of the system is completed. The grasp detection dataset is effectively expanded by random cropping, flipping, contrast adjustment, and noise injection. Second, the lightweight SqueezeNet model is introduced; a five-parameter representation characterizes the two-dimensional grasp rectangle, so targets can be grasped without complicating the network design. Third, a plug-and-play attention module that reweights the incoming feature maps along both the channel and spatial dimensions is inserted to optimize the SqueezeNet grasping network. Finally, the improved CBAM-SqueezeNet algorithm is validated on the public Cornell grasping dataset and Jacquard dataset, reaching grasp detection accuracies of 94.8% and 96.4% respectively, about 2% higher than the baseline SqueezeNet. The CBAM-SqueezeNet method has an inference time of 15 ms, balancing grasp accuracy and running speed. Experiments on Kinova and SIASUN manipulators achieve a grasp success rate of 93%, with higher speed and efficiency.
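The augmentation operations named in the abstract (random cropping, flipping, contrast adjustment, added noise) can be sketched on a toy grayscale image. The crop size, contrast factor, and noise scale below are illustrative assumptions, not the paper's settings:

```python
import random

# Augmentation ops from the abstract, applied to a grayscale image
# stored as a nested list. All parameter choices are illustrative.

def hflip(img):
    """Mirror the image left-right."""
    return [row[::-1] for row in img]

def adjust_contrast(img, factor):
    """Scale pixel deviations from the mean by `factor`."""
    mean = sum(sum(r) for r in img) / (len(img) * len(img[0]))
    return [[mean + (v - mean) * factor for v in row] for row in img]

def add_noise(img, scale, rng):
    """Add zero-mean Gaussian noise with std `scale`."""
    return [[v + rng.gauss(0.0, scale) for v in row] for row in img]

def random_crop(img, size, rng):
    """Cut a random size x size patch from the image."""
    top = rng.randrange(len(img) - size + 1)
    left = rng.randrange(len(img[0]) - size + 1)
    return [row[left:left + size] for row in img[top:top + size]]

rng = random.Random(0)
img = [[float(10 * r + c) for c in range(4)] for r in range(4)]
aug = add_noise(adjust_contrast(hflip(random_crop(img, 3, rng)), 1.2), 0.1, rng)
```

Chaining the four operations with fresh random parameters per sample is what multiplies the effective size of the grasp dataset.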

Key words: grasp detection, attention mechanism, SqueezeNet, single-target object detection, deep learning

CLC number: TP242.6

Figure 1  Kinova grasping system

Figure 2  Calibration with the Intel RealSense D435 depth camera

Figure 3  Schematic of hand-eye calibration
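After the hand-eye calibration of Fig. 3, a grasp point detected in the camera frame can be chained into the robot base frame through homogeneous transforms. A minimal sketch, assuming an eye-in-hand setup; all matrix values are placeholders, not calibrated results:

```python
# Chain homogeneous 4x4 transforms: base_T_cam = base_T_gripper @ gripper_T_cam,
# where gripper_T_cam is the hand-eye calibration output. Values illustrative.

def matmul4(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def apply(t, p):
    """Apply a homogeneous transform t to a 3D point p."""
    x, y, z = p
    v = [x, y, z, 1.0]
    return [sum(t[i][j] * v[j] for j in range(4)) for i in range(3)]

# Gripper pose in the base frame (from robot forward kinematics).
base_T_gripper = [[1, 0, 0, 0.5],
                  [0, 1, 0, 0.0],
                  [0, 0, 1, 0.8],
                  [0, 0, 0, 1]]

# Camera pose in the gripper frame (hand-eye calibration result).
gripper_T_cam = [[1, 0, 0, 0],
                 [0, -1, 0, 0],
                 [0, 0, -1, 0],
                 [0, 0, 0, 1]]

base_T_cam = matmul4(base_T_gripper, gripper_T_cam)

# A grasp point detected in the camera frame (metres).
p_base = apply(base_T_cam, (0.1, 0.2, 0.6))
```

The composed transform is what lets the network's image-space grasp prediction be executed by the arm in its own coordinate frame.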

Figure 4  Five-dimensional grasp parameters
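The five grasp parameters of Fig. 4 — centre (x, y), rotation angle θ, opening width w, and jaw height h — define an oriented rectangle in the image. A sketch of the conversion to corner points; the sample values are illustrative:

```python
import math

# Convert a five-parameter grasp (x, y, theta, w, h) into the four
# corner points of the oriented grasp rectangle. Values illustrative.

def grasp_to_corners(x, y, theta_deg, w, h):
    """Return the rectangle's four corners after rotating by theta."""
    t = math.radians(theta_deg)
    c, s = math.cos(t), math.sin(t)
    corners = []
    for dx, dy in ((-w / 2, -h / 2), (w / 2, -h / 2),
                   (w / 2, h / 2), (-w / 2, h / 2)):
        # Rotate the centred offset, then shift to the grasp centre.
        corners.append((x + dx * c - dy * s, y + dx * s + dy * c))
    return corners

corners = grasp_to_corners(160, 120, 0, 40, 20)
```

This compact parameterization is why the regression head stays simple: the network predicts five scalars instead of a full segmentation or pose.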

Figure 5  Channel attention and spatial attention mechanisms
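The channel branch of Fig. 5 pools each feature channel with average and max pooling, passes the descriptors through a shared MLP, and applies a sigmoid to get per-channel weights; the spatial branch reweights locations analogously. A pure-Python sketch of the channel branch, with the shared MLP reduced to an identity mapping so the example stays self-contained:

```python
import math

# CBAM-style channel attention on a toy feature map. The real module
# uses a shared two-layer MLP on the pooled descriptors; here that MLP
# is simplified to an identity mapping. Feature values illustrative.

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def channel_attention(fmap):
    """fmap: list of C channels, each an HxW nested list.
    Returns one attention weight per channel."""
    weights = []
    for ch in fmap:
        flat = [v for row in ch for v in row]
        avg = sum(flat) / len(flat)   # average-pooled descriptor
        mx = max(flat)                # max-pooled descriptor
        weights.append(sigmoid(avg + mx))  # shared MLP omitted
    return weights

def apply_channel_attention(fmap, weights):
    """Rescale every channel by its attention weight."""
    return [[[v * w for v in row] for row in ch]
            for ch, w in zip(fmap, weights)]

fmap = [[[1.0, 2.0], [3.0, 4.0]],   # informative channel
        [[0.0, 0.0], [0.0, 0.0]]]   # uninformative channel
w = channel_attention(fmap)
out = apply_channel_attention(fmap, w)
```

The plug-and-play property comes from the output having the same shape as the input, so the module can be inserted after any convolutional block of SqueezeNet.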

Figure 6  Overall structure of the CBAM-SqueezeNet network

Figure 7  Flowchart of the grasping system

Table 1  Description of the public grasping datasets

| Dataset | Type | Objects | RGB-D images | Grasps |
|---|---|---|---|---|
| Cornell | RGB-D | 240 | 1035 | 8019 |
| Jacquard | RGB-D | 11 000 | 54 000 | 110 000 |

Figure 8  Detection outputs of the proposed model on different datasets

Table 2  Parameters of SqueezeNet and CBAM-SqueezeNet

| Network | Total parameters | Parameter size/MB |
|---|---|---|
| SqueezeNet | 209 176 | 0.80 |
| CBAM-SqueezeNet | 226 332 | 0.86 |
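The small parameter counts in Table 2 come from SqueezeNet's fire modules, which squeeze channels with 1×1 convolutions before expanding through parallel 1×1 and 3×3 convolutions. A sketch of a fire module's parameter count; the channel sizes below are the standard SqueezeNet fire2 configuration, an assumption rather than the paper's exact setting:

```python
# Parameter count of a SqueezeNet "fire" module: a 1x1 squeeze layer
# followed by parallel 1x1 and 3x3 expand layers (weights + biases).
# Channel sizes are the standard fire2 configuration, used here as an
# illustrative assumption.

def fire_params(c_in, squeeze, expand1, expand3):
    p_squeeze = c_in * squeeze * 1 * 1 + squeeze      # 1x1 squeeze conv
    p_expand1 = squeeze * expand1 * 1 * 1 + expand1   # 1x1 expand conv
    p_expand3 = squeeze * expand3 * 3 * 3 + expand3   # 3x3 expand conv
    return p_squeeze + p_expand1 + p_expand3

total = fire_params(96, 16, 64, 64)
```

Because the 3×3 convolutions only ever see the squeezed channel count, the module stays an order of magnitude smaller than a plain 3×3 layer on the full input, which is what keeps the whole network under 1 MB.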

Table 3  Detection performance of the proposed model on the Cornell dataset

| Network | Accuracy/% | Frames/s | IoU |
|---|---|---|---|
| SqueezeNet | 92.6 | 97 | 0.79 |
| CBAM-SqueezeNet | 94.8 | 67 | 0.81 |

Table 4  Detection performance of the proposed model on the Jacquard dataset

| Network | Accuracy/% | Frames/s | IoU |
|---|---|---|---|
| SqueezeNet | 93.2 | 97 | 0.81 |
| CBAM-SqueezeNet | 96.4 | 67 | 0.84 |
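Accuracies like those in Tables 3 and 4 are commonly computed with the Cornell rectangle metric: a predicted grasp counts as correct if its angle lies within 30° of a ground-truth grasp and the rectangles' Jaccard index exceeds 0.25. Whether the paper uses exactly these thresholds is an assumption, and the axis-aligned IoU below simplifies the oriented-box overlap:

```python
# Rectangle metric sketch for grasp detection. Thresholds (30 degrees,
# IoU > 0.25) follow the common Cornell convention; treating boxes as
# axis-aligned is a simplification of the oriented-rectangle overlap.

def iou_axis_aligned(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def grasp_correct(pred_angle, gt_angle, pred_box, gt_box):
    """Angle within 30 degrees (mod 180) and IoU above 0.25."""
    d = abs(pred_angle - gt_angle) % 180.0
    angle_ok = min(d, 180.0 - d) <= 30.0
    return angle_ok and iou_axis_aligned(pred_box, gt_box) > 0.25

ok = grasp_correct(10, 25, (0, 0, 40, 20), (5, 0, 45, 20))
```

The angle comparison is taken modulo 180° because a parallel-jaw grasp is symmetric under a half-turn of the gripper.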

Figure 9  Loss and accuracy of the attention-based grasp detection network

Figure 10  Schematic of SIASUN UR grasping in a single-object scene
