2012, Issue (06): 1505-1509.

• Papers •

Reusable data predicting mechanism for shared last level Cache in chip multi-processor

HAN Li-min, GAO De-yuan, FAN Xiao-ya, SHI Li-wen, AN Jian-feng

  1. School of Computer Science, Northwestern Polytechnical University, Xi'an 710129, China
  • Received: 2011-10-09 Published: 2012-11-01
  • Corresponding author: AN Jian-feng (b. 1977), male, lecturer, Ph.D. Research interest: computer architecture. E-mail: anjf@nwpu.edu.cn
  • Funding:
    National Natural Science Foundation of China (60736012, 61003037, 61173047); National High Technology Research and Development Program of China ("863" Program) (2009AA01Z110); Basic Research Foundation of Northwestern Polytechnical University.

Abstract: To reduce interference in the shared last-level cache (LLC) of a chip multi-processor (CMP), we propose a replacement-policy-based reusable data predicting mechanism (RRDP) that selects data likely to be reused from the data being replaced, according to the memory access frequency characteristics of the applications. When data in the shared LLC are about to be replaced, their access history is checked first; data predicted to be reusable are filtered out on that basis and stored in a dedicated on-chip memory. Simulation results show that the proposed mechanism increases the IPC (instructions per clock) of the programs by 2.9% on average and reduces harmful replacements by 22.69% on average, effectively reducing cache thrashing.

Key words: computer architecture, multi-core processor, contention miss, cache thrashing, shared last-level cache
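
The abstract gives only the outline of the mechanism (check the access history of a line at replacement time, keep the lines predicted to be reused in a dedicated on-chip memory). The following is a minimal Python sketch of that outline under stated assumptions, not the authors' RRDP design: the fully associative LRU cache, the per-line access counter used as "history", the `reuse_threshold` value, and the LRU-managed side buffer are all hypothetical choices made for illustration.

```python
from collections import OrderedDict


class ReuseFilteredCache:
    """Toy fully associative LRU cache with a frequency-filtered side buffer.

    Illustrative only: the counter, threshold, and buffer policy are
    assumptions, not details taken from the paper.
    """

    def __init__(self, capacity, buffer_capacity, reuse_threshold=2):
        self.capacity = capacity
        self.buffer_capacity = buffer_capacity
        self.reuse_threshold = reuse_threshold  # hypothetical tuning knob
        self.lines = OrderedDict()   # addr -> access count, oldest first
        self.buffer = OrderedDict()  # evicted lines predicted to be reused
        self.hits = 0
        self.buffer_hits = 0
        self.misses = 0

    def access(self, addr):
        """Access one address and report where it was found."""
        if addr in self.lines:
            self.lines[addr] += 1
            self.lines.move_to_end(addr)  # mark as most recently used
            self.hits += 1
            return "hit"
        if addr in self.buffer:
            # The line was saved at replacement time: bring it back.
            count = self.buffer.pop(addr) + 1
            self.buffer_hits += 1
            self._fill(addr, count)
            return "buffer_hit"
        self.misses += 1
        self._fill(addr, 1)
        return "miss"

    def _fill(self, addr, count):
        """Insert a line, filtering the LRU victim by its access history."""
        if len(self.lines) >= self.capacity:
            victim, victim_count = self.lines.popitem(last=False)
            keep = victim_count >= self.reuse_threshold
            if keep and self.buffer_capacity > 0:
                if len(self.buffer) >= self.buffer_capacity:
                    self.buffer.popitem(last=False)  # drop oldest saved line
                self.buffer[victim] = victim_count
        self.lines[addr] = count


cache = ReuseFilteredCache(capacity=2, buffer_capacity=1, reuse_threshold=2)
trace = ["A", "A", "B", "C", "A", "B"]
print([cache.access(addr) for addr in trace])
```

In this trace, line A is accessed twice before it is evicted, so it passes the filter and its later access is served from the side buffer; line B is accessed only once before eviction, so it is discarded and its later access is an ordinary miss.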

CLC number:

  • TP302