Journal of Jilin University (Science Edition), 2023, Vol. 61, Issue (6): 1432-1440.


基于深度Q学习的无线传感器网络目标覆盖问题算法

高思华1,2, 顾晗1, 贺怀清1, 周钢3   

  1. 中国民航大学 计算机科学与技术学院, 天津 300300;
  2. 吉林大学 计算机科学与技术学院, 长春 130012;
  3. 中国民航信息网络股份有限公司 科技管理部, 北京 101300
  • 收稿日期:2022-12-20 出版日期:2023-11-26 发布日期:2023-11-26
  • Corresponding author: HE Huaiqing E-mail: hqhe@cauc.edu.cn

Algorithm for Target Coverage Problem Based on Deep Q Learning in Wireless Sensor Networks

GAO Sihua1,2, GU Han1, HE Huaiqing1, ZHOU Gang3   

  1. College of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300, China;
  2. College of Computer Science and Technology, Jilin University, Changchun 130012, China;
  3. Department of Science and Technology Management, TravelSky Technology Limited, Beijing 101300, China
  • Received: 2022-12-20 Online: 2023-11-26 Published: 2023-11-26

摘要: 针对求解无线传感器网络目标覆盖问题过程中存在的节点激活策略机理不明确、 可行解集存在冗余等问题, 提出一种基于深度Q学习的目标覆盖算法, 学习无线传感器网络中节点的调度策略. 首先, 算法将构建可行解集抽象成Markov决策过程, 智能体根据网络环境选择被激活的传感器节点作为离散动作; 其次, 奖励函数从激活节点的覆盖能力和自身剩余能量考虑, 评价智能体选择动作的优劣. 仿真实验结果表明, 该算法在不同规模的网络环境下均有效, 网络生命周期均优于3种贪婪算法、 最大寿命覆盖率算法和自适应学习自动机算法.

关键词: 目标覆盖问题, 深度Q学习, 无线传感器网络, 强化学习

Abstract: To address the unclear mechanism of node activation strategies and the redundancy of feasible solution sets when solving the target coverage problem in wireless sensor networks, we propose a target coverage algorithm based on deep Q learning that learns the scheduling strategy of nodes in wireless sensor networks. First, the algorithm abstracts the construction of feasible solution sets as a Markov decision process, in which the agent selects the sensor nodes to activate as discrete actions according to the network environment. Second, the reward function evaluates the quality of the agent's selected actions in terms of the coverage capacity and residual energy of the activated node. Simulation results show that the algorithm is effective in network environments of different scales, and that the network lifetime it achieves is better than that of three greedy algorithms, the maximum lifetime coverage algorithm, and the adaptive learning automaton algorithm.

Key words: target coverage problem, deep Q learning, wireless sensor networks, reinforcement learning
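
The Markov decision process outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The class name `CoverSetEnv`, the weights `w_cover` and `w_energy`, the linear form of the reward, and the penalty of -1.0 for a redundant activation are all assumptions made for illustration, since the abstract does not give the reward formula. A greedy choice over the immediate reward stands in for the trained deep Q network, which is not reproduced here.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class CoverSetEnv:
    """Toy MDP (hypothetical): build one cover set by activating sensors one at a time."""
    coverage: Dict[int, Set[int]]   # sensor id -> targets it can monitor
    energy: Dict[int, float]        # sensor id -> residual energy, scaled to [0, 1]
    n_targets: int
    w_cover: float = 0.7            # assumed weight on coverage capacity
    w_energy: float = 0.3           # assumed weight on residual energy
    active: Set[int] = field(default_factory=set)
    covered: Set[int] = field(default_factory=set)

    def actions(self):
        """Discrete actions: sensors that are alive and not yet activated."""
        return [s for s in self.coverage
                if s not in self.active and self.energy[s] > 0]

    def reward(self, sensor: int) -> float:
        """Score an activation by newly covered targets and residual energy."""
        newly = len(self.coverage[sensor] - self.covered)
        if newly == 0:
            return -1.0             # assumed penalty: the activation is redundant
        return (self.w_cover * newly / self.n_targets
                + self.w_energy * self.energy[sensor])

    def step(self, sensor: int):
        """Activate a sensor; return (reward, done)."""
        r = self.reward(sensor)
        self.active.add(sensor)
        self.covered |= self.coverage[sensor]
        return r, len(self.covered) == self.n_targets


def build_cover_set(env: CoverSetEnv) -> Optional[Set[int]]:
    """Roll out one episode; greedy-on-reward stands in for argmax over Q-values."""
    while len(env.covered) < env.n_targets:
        candidates = env.actions()
        if not candidates:
            return None             # no sensor left to activate
        best = max(candidates, key=env.reward)
        if env.reward(best) < 0:
            return None             # no remaining sensor adds coverage
        env.step(best)
    return set(env.active)


# Four sensors, three targets; sensor 3 covers everything but is nearly depleted.
env = CoverSetEnv(
    coverage={0: {0, 1}, 1: {1, 2}, 2: {2}, 3: {0, 1, 2}},
    energy={0: 0.9, 1: 0.8, 2: 1.0, 3: 0.05},
    n_targets=3,
)
print(sorted(build_cover_set(env)))  # [0, 2]
```

In this toy instance the nearly depleted sensor 3 is passed over even though it alone covers every target, which shows how an energy term in the reward can steer the choice of activated nodes. A deep Q learning agent would replace the greedy rule with a network trained on such rewards over many episodes.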

CLC number:

  • TP391