Journal of Jilin University (Engineering and Technology Edition), 2014, Vol. 44, Issue (5): 1375-1384. doi: 10.7964/jdxbgxb201405025


Decentralized reinforcement learning optimal control for time varying constrained reconfigurable modular robot

DONG Bo¹, LIU Ke-ping², LI Yuan-chun²

  1. Department of Control Science and Engineering, Jilin University, Changchun 130022, China
  2. Department of Control Engineering, Changchun University of Technology, Changchun 130012, China
  • Received: 2013-06-01 Online: 2014-09-01 Published: 2014-09-01

Abstract: A novel decentralized reinforcement learning optimal control method for time-varying constrained reconfigurable modular robots is presented, based on the Action-Critic-Identifier (ACI) structure and Radial Basis Function (RBF) neural networks. The method addresses the continuous-time nonlinear optimal control problem for strongly coupled, uncertain robotic systems. The robot dynamics are described as a synthesis of interconnected subsystems. Taking the performance index of continuous-time Markov decision processes (MDPs) as a precondition, the optimal value function, the optimal control policy, and the global uncertainty of each subsystem are estimated by combining ACI with RBF networks. The optimality conditions of the Hamilton-Jacobi-Bellman (HJB) equation for each subsystem are satisfied, so that the reconfigurable modular robot system tracks the desired trajectory in a short time and the estimation error converges to zero in finite time. System stability is proved using Lyapunov theory. Simulations illustrate the effectiveness of the proposed decentralized control scheme.
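
The abstract mentions that an RBF network is used to estimate the unknown terms of each subsystem. The following is only a minimal sketch of that general idea, not the authors' ACI scheme: a Gaussian RBF network whose output weights are adapted online to approximate an unknown scalar function. The function `unknown_term`, the center grid, the width, and the normalized gradient update are all assumptions made for illustration; the paper's own weight update laws are derived from its Lyapunov analysis and are not reproduced here.

```python
import math
import random


def rbf_features(x, centers, width):
    """Gaussian radial basis functions evaluated at a scalar input x."""
    return [math.exp(-((x - c) ** 2) / (2.0 * width ** 2)) for c in centers]


def rbf_predict(x, weights, centers, width):
    """Network output: weighted sum of the basis functions."""
    phi = rbf_features(x, centers, width)
    return sum(w * p for w, p in zip(weights, phi))


def train_rbf(target, centers, width, lr=0.2, steps=20000, seed=0):
    """Adapt the output weights with a normalized gradient (LMS-style) rule.

    This simple rule is an assumption for illustration only; it is not the
    update law used in the paper.
    """
    rng = random.Random(seed)
    weights = [0.0] * len(centers)
    lo, hi = centers[0], centers[-1]
    for _ in range(steps):
        x = rng.uniform(lo, hi)                      # sample an operating point
        phi = rbf_features(x, centers, width)
        err = target(x) - sum(w * p for w, p in zip(weights, phi))
        norm = sum(p * p for p in phi) + 1e-9        # avoid division by zero
        weights = [w + lr * err * p / norm for w, p in zip(weights, phi)]
    return weights


def unknown_term(x):
    """Hypothetical stand-in for an unmodeled subsystem nonlinearity."""
    return math.sin(x) + 0.5 * x


centers = [-3.0 + 0.5 * i for i in range(13)]        # 13 centers on [-3, 3]
width = 0.5                                          # equal to the center spacing
weights = train_rbf(unknown_term, centers, width)

# Evaluate the fit on an interior grid, away from the edge of the center range.
grid = [-2.5 + 0.1 * i for i in range(51)]
max_err = max(
    abs(unknown_term(x) - rbf_predict(x, weights, centers, width)) for x in grid
)
print(f"max abs approximation error on [-2.5, 2.5]: {max_err:.4f}")
```

In a decentralized setting, one such approximator would be kept per joint module, each fed only with local measurements. The sketch evaluates the fit on an interior grid because approximation quality typically degrades near the edge of the center range.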

Key words: automatic control technology, reconfigurable modular robot, reinforcement learning, nonlinear optimal control, decentralized control

CLC Number: TP273