Journal of Jilin University (Engineering and Technology Edition), 2013, Vol. 43, Issue (Suppl. 1): 29-33.


Achieve scattering distribution of GPU cache indices on frame buffer via XY coordinates

ZHANG Jun1,2

  1. College of Information Science and Engineering, Central South University, Changsha 410083, China;
  2. Hunan Engineering Laboratory for Advanced Control and Intelligent Automation, Central South University, Changsha 410083, China
  • Received: 2012-05-20 Published: 2013-06-01
  • About the author: ZHANG Jun (1973-), male, research fellow. Research interests: 3D graphics processors, system-on-chip (SoC) integration and verification. E-mail: junzhang@mail.csu.edu.cn
  • Funding: Hunan Provincial Science and Technology Program (2010ck3010).

Abstract:

Cache index mapping schemes designed for CPUs cause severe conflict misses when used in a GPU cache. To address this problem, this paper proposes a new XY-type cache index mapping scheme that computes the cache index from the pixel's XY coordinates. The scheme yields a well-scattered distribution of cache line indices over the frame buffer and completely avoids the adverse effects of different frame resolutions. Experimental results show that the scheme reduces the cache miss ratio by up to 82%. With this scheme, a direct-mapped or 2-way set-associative cache achieves a miss ratio close to that of a fully associative cache, which helps reduce cache design complexity and cache power consumption.

Key words: XY-type cache index mapping scheme, spatial locality in XY directions, graphics processing unit (GPU), conflict miss
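The abstract does not give the actual index function, so the following is only a minimal sketch of the general idea, not the paper's scheme. It assumes a direct-mapped cache with 64 lines, a cache line holding 16 horizontally adjacent pixels, and a hypothetical XY index built from the low bits of the line's X position and the pixel's Y coordinate. The names `linear_index`, `xy_index`, `sets_x`, and `sets_y` are all illustrative assumptions.

```python
LINE_PIXELS = 16  # assumed: one cache line holds 16 adjacent pixels of a row
NUM_SETS = 64     # assumed: direct-mapped cache with 64 lines


def linear_index(x, y, width):
    """CPU-style index: low bits of the linear line address."""
    line_address = (y * width + x) // LINE_PIXELS
    return line_address % NUM_SETS


def xy_index(x, y, sets_x=4, sets_y=16):
    """Hypothetical XY-type index: combine low bits of X and Y.

    sets_x * sets_y must equal NUM_SETS. The frame width is not an
    input, so the index pattern is the same at every resolution.
    """
    return (y % sets_y) * sets_x + (x // LINE_PIXELS) % sets_x


def distinct_sets(index_fn, w, h):
    """Count the cache sets touched by a w x h pixel window at the origin."""
    return len({index_fn(x, y) for y in range(h) for x in range(w)})


# A 64 x 16 pixel window covers exactly 64 cache lines (the whole cache).
linear_count = distinct_sets(lambda x, y: linear_index(x, y, width=1024), 64, 16)
xy_count = distinct_sets(xy_index, 64, 16)
print(linear_count, xy_count)  # prints "4 64"
```

With a 1024-pixel-wide frame, each row spans exactly 64 lines, so the linear index repeats on every row and the window's 64 lines collide in only 4 sets. The XY-style index spreads the same 64 lines over all 64 sets. This illustrates why vertical locality produces conflict misses under address-based indexing. It does not reproduce the paper's measurements.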

CLC number:

  • TP335

