Journal of Jilin University (Information Science Edition) ›› 2024, Vol. 42 ›› Issue (6): 1025-1030.


Research on UAV Route Planning Based on Reinforcement Learning

HE Qingxin1, TU Xiaobin1, YU Yinhui2

  1. College of Information Engineering, Minnan University of Science and Technology, Quanzhou 362242, China; 2. College of Communication Engineering, Jilin University, Changchun 130012, China
  • Received: 2024-06-26 Online: 2024-12-23 Published: 2024-12-23
  • Corresponding author: YU Yinhui (born 1964), female, from Tai'an, Shandong; professor and master's supervisor at Jilin University; research interests: energy-efficiency management of communication networks and wireless communication network theory. (Tel) 86-13604319589 (E-mail) yuyh@jlu.edu.cn
  • About the author: HE Qingxin (born 1980), male, from Songyuan, Jilin; associate professor at Minnan University of Science and Technology; research interest: big data analysis. (Tel) 86-13400818882 (E-mail) 34527336@qq.com
  • Supported by:
    Science and Technology Plan Fund Project of the Fujian Provincial Department of Science and Technology (2024H0038)

Abstract: The energy consumption of a UAV (Unmanned Aerial Vehicle) determines the length of its operational cycle. To address the low communication-to-energy-consumption ratio of UAVs, a reinforcement-learning-based route planning scheme is proposed that reduces energy consumption while maintaining high communication quality. The continuous flight space is divided into multiple layers of two-dimensional grids so that UAV state points can be generated, and a reward function is built from communication quality and energy consumption parameters. The Q-Learning algorithm then learns the route with the optimal communication-to-energy-consumption ratio. Experimental results show that routes planned by the learned model achieve a relatively high communication-to-energy-consumption ratio, which indicates that the scheme has some practical value.

Key words: route planning; Q-Learning algorithm; unmanned aerial vehicle

CLC Number: 

  • TN929.531
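
The approach summarized in the abstract (a layered 2D grid of states, a reward that combines communication quality with energy consumption, and tabular Q-Learning) can be illustrated with the minimal Python sketch below. It is not the authors' implementation: the grid size, start and goal cells, base station position, inverse-square link-quality proxy, per-move energy costs, goal bonus, and the use of a weighted difference rather than a ratio in the reward are all assumptions made for illustration, since the abstract does not give these details.

```python
import random

# All constants below are illustrative assumptions, not values from the paper.
LAYERS, N = 3, 5                      # 3 altitude layers, each a 5 x 5 grid
START, GOAL = (0, 0, 0), (0, 4, 4)    # states are (layer, row, col)
BASE_STATION = (2, 2)                 # assumed ground station position (row, col)
# Six actions: four moves within a layer, plus climb and descend.
ACTIONS = [(0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1), (1, 0, 0), (-1, 0, 0)]
ALPHA, GAMMA, EPSILON = 0.2, 0.9, 0.2
ENERGY_WEIGHT, GOAL_BONUS = 1.0, 20.0


def comm_quality(state):
    """Toy link-quality proxy: inverse squared distance to the base station."""
    layer, row, col = state
    height = layer + 1
    d2 = (row - BASE_STATION[0]) ** 2 + (col - BASE_STATION[1]) ** 2 + height ** 2
    return 1.0 / d2


def energy_cost(action):
    """Toy energy model: climbing costs twice as much as any other move."""
    return 2.0 if action[0] > 0 else 1.0


def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    nxt = tuple(s + a for s, a in zip(state, action))
    if not (0 <= nxt[0] < LAYERS and 0 <= nxt[1] < N and 0 <= nxt[2] < N):
        nxt = state  # a move that would leave the grid keeps the UAV in place
    reward = comm_quality(nxt) - ENERGY_WEIGHT * energy_cost(action)
    done = nxt == GOAL
    if done:
        reward += GOAL_BONUS
    return nxt, reward, done


def best_action(q, state):
    """Index of the action with the highest Q-value (unseen entries count as 0)."""
    return max(range(len(ACTIONS)), key=lambda i: q.get((state, i), 0.0))


def train(episodes=3000, max_steps=200, seed=0):
    """Tabular Q-Learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = {}  # maps (state, action index) -> estimated return
    for _ in range(episodes):
        state = START
        for _ in range(max_steps):
            if rng.random() < EPSILON:
                a = rng.randrange(len(ACTIONS))
            else:
                a = best_action(q, state)
            nxt, reward, done = step(state, ACTIONS[a])
            best_next = 0.0 if done else q.get((nxt, best_action(q, nxt)), 0.0)
            old = q.get((state, a), 0.0)
            # Standard Q-Learning update toward reward + discounted best next value.
            q[(state, a)] = old + ALPHA * (reward + GAMMA * best_next - old)
            state = nxt
            if done:
                break
    return q


def greedy_route(q, max_steps=50):
    """Follow the learned greedy policy from START and return the visited states."""
    route, state = [START], START
    for _ in range(max_steps):
        state, _, done = step(state, ACTIONS[best_action(q, state)])
        route.append(state)
        if done:
            break
    return route


if __name__ == "__main__":
    print(greedy_route(train()))
```

In this sketch, raising `ENERGY_WEIGHT` pushes the learned route toward fewer and cheaper moves, while lowering it favours cells with better link quality. A reward built on the paper's actual communication and energy parameters would replace `comm_quality` and `energy_cost`.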