Journal of Jilin University(Engineering and Technology Edition) ›› 2021, Vol. 51 ›› Issue (6): 2259-2267.doi: 10.13229/j.cnki.jdxbgxb20200577


Trajectory planning for unmanned aerial vehicle slung-payload aerial transportation system based on reinforcement learning

Bin XIAN1, Shi-jing ZHANG1, Xiao-wei HAN1, Jia-ming CAI1, Ling WANG2

  1. School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China
  2. Tianjin Navigation Instrument Research Institute, Tianjin 300131, China

Received: 2020-07-30  Online: 2021-11-01  Published: 2021-11-15

Abstract:

This paper presents an online trajectory planning method based on reinforcement learning for driving the quadrotor to its destination accurately while effectively suppressing the swing motion of the slung payload. To deal with unknown external disturbances, the desired trajectory of the Unmanned Aerial Vehicle (UAV) is divided into two parts: a positioning trajectory and a disturbance-rejection trajectory. The positioning trajectory is designed in advance to guide the UAV to the desired position, while the disturbance-rejection trajectory compensates for unknown external disturbances based on a reinforcement learning strategy and simultaneously suppresses the swing motion of the slung payload. Lyapunov-based stability analysis is employed to prove the stability of the closed-loop system and the convergence of the UAV's position and the slung payload's swing motion. Finally, real-time comparative experiments verify the effectiveness of the proposed trajectory generation method and its robustness against external disturbances and variations in the mass of the slung payload.
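The two-part trajectory structure described above can be illustrated with a minimal sketch: an S-shaped positioning component planned in advance, plus an online compensation term that stands in for the paper's reinforcement-learning disturbance-rejection trajectory. All function names, gains, and the simple gradient-based update below are illustrative assumptions, not the authors' actual actor-critic scheme.

```python
import numpy as np

def s_curve(t, target=1.0, k=2.0, t_mid=3.0):
    """S-shaped positioning trajectory toward `target` (logistic profile).

    This plays the role of the pre-designed positioning trajectory that
    guides the UAV to the desired position.
    """
    return target / (1.0 + np.exp(-k * (t - t_mid)))

class SwingCompensator:
    """Placeholder for the RL-based disturbance-rejection trajectory.

    The paper learns this component online with reinforcement learning;
    here a simple gradient step on a quadratic cost in swing angle and
    position error only mimics the interface.
    """
    def __init__(self, lr=0.05):
        self.delta = 0.0  # compensation term added to the desired trajectory
        self.lr = lr

    def update(self, swing_angle, pos_error):
        # Descend a toy cost that penalizes payload swing and position error.
        cost_grad = swing_angle + pos_error
        self.delta -= self.lr * cost_grad
        return self.delta

def desired_trajectory(t, swing_angle, pos_error, comp):
    """Composite desired trajectory: positioning part + compensation part."""
    return s_curve(t) + comp.update(swing_angle, pos_error)

comp = SwingCompensator()
y_d = desired_trajectory(4.0, swing_angle=0.02, pos_error=0.01, comp=comp)
```

In the paper, the compensation term is produced by a reinforcement-learning strategy with Lyapunov-based convergence guarantees; this sketch only shows how the two trajectory components would be superimposed at each planning step.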

Key words: automatic control technology, quadrotor unmanned aerial vehicle, slung-payload system, reinforcement learning, trajectory planning

CLC Number: TP273

Fig.1

Schematic diagram of the quadrotor slung-payload system

Fig.2

Experiment testbed

Fig.3

First group

Table 1

First group: experimental data analysis

| Parameter | State variable | Proposed desired trajectory | S-shaped positioning trajectory |
|---|---|---|---|
| Settling time/s | | 6.773 | 7.805 |
| Mean steady-state error | y direction/m | 0.0172 | 0.0179 |
| | z direction/m | 0.0055 | 0.0120 |
| | Payload swing angle/(°) | 0.7245 | 1.5941 |
| Std. dev. of steady-state error | y direction/m | 0.0216 | 0.0210 |
| | z direction/m | 0.0069 | 0.0065 |
| | Payload swing angle/(°) | 0.8246 | 1.8539 |
| Maximum steady-state deviation | y direction/m | 0.0610 | 0.0740 |
| | z direction/m | 0.0190 | 0.0330 |
| | Payload swing angle/(°) | 2.3491 | 4.2806 |

Fig.4

Second group

Table 2

Second group: experimental data analysis

| Parameter | State variable | Proposed desired trajectory | S-shaped positioning trajectory |
|---|---|---|---|
| Settling time/s | | 6.217 | 7.966 |
| Mean steady-state error | y direction/m | 0.0237 | 0.0309 |
| | z direction/m | 0.0153 | 0.0281 |
| | Payload swing angle/(°) | 1.1563 | 1.9496 |
| Std. dev. of steady-state error | y direction/m | 0.0214 | 0.0176 |
| | z direction/m | 0.0068 | 0.0051 |
| | Payload swing angle/(°) | 1.0892 | 2.1159 |
| Maximum steady-state deviation | y direction/m | 0.0690 | 0.0760 |
| | z direction/m | 0.0290 | 0.0380 |
| | Payload swing angle/(°) | 2.8962 | 5.5317 |