Journal of Jilin University (Engineering and Technology Edition) ›› 2025, Vol. 55 ›› Issue (7): 2418-2424. doi: 10.13229/j.cnki.jdxbgxb.20240743

• Communication and Control Engineering •

Network resource allocation method of aerospace edge computing based on deep Q network algorithm

Xin-chun LI (1,2), He-yuan SUN (1), Chi XU (3)

  1. School of Electronic and Information Engineering, Liaoning Technical University, Huludao 125105, China
  2. School of Electrical and Intelligent Control, Liaoning Institute of Science and Engineering, Jinzhou 121013, China
  3. Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China
  • Received: 2024-07-05 Online: 2025-07-01 Published: 2025-09-12
  • About the author: Xin-chun LI (born 1963), male, Ph.D., senior engineer. Research interests: wireless sensor networks, image processing. E-mail: s3268247250@outlook.com.
  • Funding:
    National Natural Science Foundation of China (92267108); National Natural Science Foundation of China (62173322); Science and Technology Program of Liaoning Province (2023JH3/10200004); Science and Technology Program of Liaoning Province (2022JH25/10100005)

Abstract:

Because the positions of satellites, UAVs, and ground stations change continuously, the links of a space-air-ground edge computing network are not fixed. The network must also respond quickly to user requests, which places high demands on throughput and real-time performance and makes network resource allocation more difficult. To address this, this paper proposes a resource allocation method for space-air-ground edge computing networks based on the deep Q-network algorithm. First, taking the dynamic network topology and the heterogeneity of resources into account, a communication model between resources is established to provide a basic framework for resource allocation. Then, a resource allocation objective function is designed based on maximum throughput and expressed with a Markov decision model, which turns the resource allocation problem into a sequential decision problem and facilitates decision-making in a dynamically changing network environment. Finally, the objective function is solved with the deep Q-network algorithm: through reinforcement learning, the algorithm learns the optimal resource allocation strategy by interacting with the environment, adapting to the real-time and dynamic nature of the network. The experimental results show that, after applying this method, the cumulative return of the network is higher and the average energy consumption of resource tasks is lower, indicating that the method is practically feasible.

Key words: space-air-ground integrated network, deep Q-network algorithm, edge computing, resource allocation, Markov decision model
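The sequential-decision formulation summarized in the abstract can be illustrated with a small sketch. This is not the authors' model or algorithm: it is a tabular Q-learning stand-in on a hypothetical three-link toy environment, where the reward is the throughput of the chosen link and the best link changes at random to mimic a shifting topology. A deep Q-network would replace the table `q` with a neural network. All names (`N_LINKS`, `step`, `train`, `greedy_policy`) and all numbers are assumptions made for illustration.

```python
import random

# Hypothetical toy environment (not from the paper): the state is the index of
# the link with the best channel in the current slot, and the action is the
# link that the agent assigns the task to. The reward is the throughput obtained.
N_LINKS = 3
GOOD_RATE, BAD_RATE = 10.0, 2.0  # assumed throughputs, arbitrary units


def step(state, action, rng):
    """Return (reward, next_state) for the toy link-selection MDP."""
    reward = GOOD_RATE if action == state else BAD_RATE
    next_state = rng.randrange(N_LINKS)  # the best link changes at random
    return reward, next_state


def train(steps=5000, alpha=0.1, gamma=0.5, epsilon=0.3, seed=0):
    """Tabular Q-learning; a DQN would replace `q` with a neural network."""
    rng = random.Random(seed)
    q = [[0.0] * N_LINKS for _ in range(N_LINKS)]
    state = rng.randrange(N_LINKS)
    for _ in range(steps):
        # Epsilon-greedy exploration over the allocation actions.
        if rng.random() < epsilon:
            action = rng.randrange(N_LINKS)
        else:
            action = max(range(N_LINKS), key=lambda a: q[state][a])
        reward, next_state = step(state, action, rng)
        # Bellman target: immediate throughput plus discounted best future value.
        target = reward + gamma * max(q[next_state])
        q[state][action] += alpha * (target - q[state][action])
        state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued action in every state."""
    return [max(range(N_LINKS), key=lambda a: q[s][a]) for s in range(N_LINKS)]
```

Calling `greedy_policy(train())` returns one allocation decision per state. In this toy setting the learned policy should assign each task to the link that currently has the best channel, which is the throughput-maximizing choice by construction.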

CLC number:

  • TP273

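Fig. 1 Comparison of the convergence curves of the Q-network and deep Q-network algorithms

Fig. 2 Cumulative return of the three methods

Fig. 3 Average task energy consumption of the three methods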
