吉林大学学报(工学版) ›› 2018, Vol. 48 ›› Issue (6): 1917-1924.doi: 10.13229/j.cnki.jdxbgxb20171161

• • 上一篇    下一篇

基于自适应释放策略的低开销确定性重放方法

应欢1(),刘松华2,唐博文3,4,韩丽芳1,周亮1   

  1. 1. 中国电力科学研究院有限公司,北京 100192
    2. 国网山东省电力公司东明县供电公司,山东 菏泽 274500
    3. 中国科学院 计算技术研究所,北京 100190
    4. 中国科学院大学,北京 100049
  • 收稿日期:2017-11-29 出版日期:2018-11-20 发布日期:2018-12-11
  • 作者简介:应欢(1988-),女,工程师,博士.研究方向:程序分析与代码安全,动态优化.
  • 基金资助:
    “863”国家高技术研究发展计划项目(2015AA050202)

Efficient deterministic replay technique based on adaptive release strategy

YING Huan1(),LIU Song-hua2,TANG Bo-wen3,4,HAN Li-fang1,ZHOU Liang1   

  1. 1. China Electric Power Research Institute,Beijing 100192,China
    2. State Grid Electric Power Company of ShanDong Province DongMing County Power Supply Company,Heze 274500,China
    3. Institute of Computing Technology,Chinese Academy of Sciences,Beijing 100190, China
    4. University of Chinese Academy of Sciences,Beijing 100049,China
  • Received:2017-11-29 Online:2018-11-20 Published:2018-12-11

摘要:

针对基于页保护机制的确定性重放方法虽然能够有效降低记录开销,但由于页保护异常仍会引入性能开销的问题,本文深入研究了共享页面访问权限释放同步点对并行程序记录性能的影响,提出了一种基于自适应释放策略的确定性重放方法。采用PARSEC测试集进行性能评估,实验结果表明,该方法能够更进一步降低记录开销。

关键词: 信息处理技术, 并行程序, 共享内存, 调试, 确定性重放, 页保护

Abstract:

Inherent non-determinism in parallel program brings huge difficulty and challenge to programmers to test and debug. The major source of non-determinism is the undetermined order of shared memory accesses. Deterministic replay provides the possibility of reproducing parallel program execution, and deterministic replay method based on page protection mechanism can reduce the recording overhead effectively. However, the exception of page protection is the main performance cost. This paper deeply analyzes runtime overhead incurred by release synchronization points in record-runs, and proposes a deterministic replay method based on adaptive release strategy. Performance evaluation shows that the record slowdown is much smaller than prior user-level replay system.

Key words: information processing technology, parallel program, shared memory, debugging, deterministic replay, page protection

中图分类号: 

  • TP368

表1

释放同步点对程序的性能影响"

benchmarks (trecord_2-trecord_1)/
trecord_1
tsegv/trecord tinit/tsegv tlibc/tsegv trequest/tsegv tpthreads/tseg thandler/tsegv tpreempted/tsegv
1 2 1 2 1 2 1 2 1 2 1 2 1 2
ferret 17.4 2.0 21.6 35.7 1.9 2.5 0.0 0.0 4.3 15.5 0.0 46.3 0.0 0.0 93.9
facesim -28.5 24.1 10.3 4.8 9.8 20.8 0.0 5.3 43.2 0.4 0.0 6.6 0.0 7.5 34.1
raytrace -22.6 39.2 23.5 5.3 16.6 0.0 0.0 40.0 74.1 0.0 0.0 54.7 0.0 0.0 0.0
bodytrack -5.3 -4.1 9.0 79.0 70.5 0.5 0.0 2.2 6.9 1.5 0.0 8.9 0.0 1.3 0.7
vips 3.6 5.5 4.9 63.1 58.0 0.0 0.0 0.4 12.2 0.2 0.0 16.2 0.0 20.1 29.8
x264 4.4 0.0 0.0 65.5 10.0 0.0 0.0 0.3 0.0 31.8 0.0 2.5 0.0 0.0 0.0
blackscholes 20.1 0.0 9.6 76.8 6.9 20.1 0.0 0.0 0.0 0.0 0.0 13.1 0.0 0.0 0.0
swaptions 42.6 5.2 36.8 59.2 33.4 2.7 0.0 0.0 35.1 0.0 0.0 0.5 0.0 0.0 0.0
canneal 2.3 80.0 81.0 3.6 2.3 5.9 0.0 42.1 96.6 0.0 0.0 47.2 0.0 0.4 0.3
dedup 3.9 21.5 27.4 51.3 46.8 0.0 0.0 9.0 42.8 9.5 0.0 21.3 0.0 8.1 9.4
streamcluster -14.5 32.7 31.6 1.5 1.3 0.0 0.0 4.4 50.8 18.2 0.0 67.6 0.0 6.6 12.8

图1

并行程序片段"

图2

释放同步点对并行程序的影响"

图3

程序的并行度"

图4

自适应释放算法"

图5

条件变量的使用"

图6

Scribe、UPLAY与APLAY的性能比较"

[1] 蒋炎岩, 许畅, 马晓星 , 等. 获取访存依赖:并发程序动态分析基础技术综述[J]. 软件学报, 2017,28(4):747-763.
Jiang Yan-yan, Xu Chang, Ma Xiao-xing , et al. Approaches to obtaining shared memory dependences for dynamic analysis of concurrent programs: a survey[J]. Journal of Software, 2017,28(4):747-763.
[2] Patil H, Pereira C, Stallcup M, et al. PinPlay:a framework for deterministic replay and reproducible analysis of parallel programs [C]//The 8th International Symposium on Code Generation and Optimization, Toronto, Ontario, Canada, 2010: 2-11.
[3] Huang J, Liu P, Zhang C. LEAP: lightweight deterministic multi-processor replay of concurrent Java programs [C]//Proceedings of the 8th ACM Sigsoft International Symposium on Foundations of Software Engineering,Santa Fe, NM, USA, 2010: 207-216.
[4] Lee D, Chen P M, Flinn J, et al. Chimera: hybrid program analysis for determinism [C]//Proceedings of the 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation, New York, NY, USA, 2012: 463-474.
[5] Bond M D, Kulkarni M, Cao M, et al. Efficient deterministic replay of multithreaded executions in a managed language virtual machine [C]//Proceedings of the Principles and Practices of Programming on the Java Platform, New York, NY, USA, 2015: 90-101.
[6] Honarmand N, Torrellas J. Replay debugging: leveraging record and replay for program debugging [C]//Proceeding of the 41st Annual International Symposium on Computer Architecuture, Minneapolis, MN, USA, 2014: 445-456.
[7] Pokam G, Danne K, Pereira C. QuickRec: prototyping an intel architecture extension for record and replay of multithreaded programs [C]//Proceedings of the 40th Annual International Symposium on Computer Architecture, Tel-Aviv,Israel, 2013: 643-654.
[8] Honarmand N, Dautenhahn N, Torrellas J, et al. Cyrus: unintrusive application-level record-replay for replay parallelism [C]//Proceedings of the Eighteenth International Conference on Architectural Support for Programming Languages and Operating Systems, Houston, TX, USA, 2013: 193-206.
[9] Honarmand N, Torrellas J. Replay debugging: leveraging record and replay for program debugging [C]//Proceeding of the 41st Annual International Symposium on Computer Architecuture, Minneapolis, MN, USA, 2014: 445-456.
[10] Pokam G, Pereira C, Hu S L, et al. CoreRacer: a practical memory race recorder for multicore x86 TSO processors [C]//Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture, Porto Alegre, Brazil, 2011: 216-225.
[11] Laadan O, Viennot N, Nieh J. Transparent, lightweight application execution replay on commodity multiprocessor operating systems [C]//Proceedings of the ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, New York,USA, 2010: 155-166.
[12] 应欢, 王东辉, 武成岗 , 等. 适用于商用系统环境的低开销确定性重放技术[J]. 吉林大学学报:工学版, 2017,47(1):208-217.
Ying Huan, Wang Dong-hui, Wu Cheng-gang , et al. Efficient deterministic replay technique on commodity system environment[J]. Journal of Jilin University(Engineering and Technology Edition), 2017,47(1):208-217.
[13] Wang Z, Li J, Wu C, et al. HSPT: practical implementation and efficient management of embedded shadow page tables for cross-ISA system virtual machines [C]//Proceedings of the 11th ACM Sigplan/Sigops International Conference on Virtual Execution Environments, Istanbul, 2015: 53-64.
[14] Kedia P . Efficient deterministic replay through dynamic binary translation[D]. Delhi:Indian Institute of Technology, 2015.
[15] Lee D . Holistic system design for deterministic replay[D]. Ann Arbor:University of Michigan, 2013.
[16] Honarmand N . record and deterministic replay of parallel programs on multiprocessors[D]. Urbana:University of Illinois, 2014.
[17] Bienia C, Kumar S, Singh J P, et al. The PARSEC benchmark suite: characterization and architectural implications [C]//Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques,Toronto, Ontario, Canada, 2008: 72-81.
[18] Yuan X, Wu C, Wang Z, et al. ReCBuLC: reproducing concurrency bugs using local clocks [C]//Proceedings of the 37th International Conference on Software Engineering,Florence, 2015: 824-834.
[19] 唐士斌 . 多线程程序的确定性调试方法[D]. 北京:中国科学院计算技术研究所, 2014.
Tang Shi-bin . Debugging muiltithreaded programs with determinism[D]. Beijing:The Institute of Computing Technology ,Chinese Academy of Sciences, 2014.
[20] 陈宇飞 . 可伸缩的确定性重放技术研究[D]. 上海:复旦大学计算机科学技术学院, 2014.
Chen Yu-fei . Improving the scalability of deterministic replay[D]. Shanghai:School of Computer Science, Fudan University, 2014.
[1] 托乎提努尔,张海龙,王杰,王娜,冶鑫晨,王万琼. 基于图形处理器的高速中值滤波算法[J]. 吉林大学学报(工学版), 2019, 49(3): 979-985.
[2] 付银娟,李勇,徐丽琴,张昆辉. NLFM⁃Costas射频隐身雷达信号设计及分析[J]. 吉林大学学报(工学版), 2019, 49(3): 994-999.
[3] 苏寒松,代志涛,刘高华,张倩芳. 结合吸收Markov链和流行排序的显著性区域检测[J]. 吉林大学学报(工学版), 2018, 48(6): 1887-1894.
[4] 徐岩,孙美双. 基于卷积神经网络的水下图像增强方法[J]. 吉林大学学报(工学版), 2018, 48(6): 1895-1903.
[5] 李居朋,张祖成,李墨羽,缪德芳. 基于Kalman滤波的电容屏触控轨迹平滑算法[J]. 吉林大学学报(工学版), 2018, 48(6): 1910-1916.
[6] 黄勇,杨德运,乔赛,慕振国. 高分辨合成孔径雷达图像的耦合传统恒虚警目标检测[J]. 吉林大学学报(工学版), 2018, 48(6): 1904-1909.
[7] 陆智俊,钟超,吴敬玉. 星载合成孔径雷达图像小特征的准确分割方法[J]. 吉林大学学报(工学版), 2018, 48(6): 1925-1930.
[8] 刘仲民,王阳,李战明,胡文瑾. 基于简单线性迭代聚类和快速最近邻区域合并的图像分割算法[J]. 吉林大学学报(工学版), 2018, 48(6): 1931-1937.
[9] 单泽彪,刘小松,史红伟,王春阳,石要武. 动态压缩感知波达方向跟踪算法[J]. 吉林大学学报(工学版), 2018, 48(6): 1938-1944.
[10] 姚海洋, 王海燕, 张之琛, 申晓红. 双Duffing振子逆向联合信号检测模型[J]. 吉林大学学报(工学版), 2018, 48(4): 1282-1290.
[11] 全薇, 郝晓明, 孙雅东, 柏葆华, 王禹亭. 基于实际眼结构的个性化投影式头盔物镜研制[J]. 吉林大学学报(工学版), 2018, 48(4): 1291-1297.
[12] 陈涛, 崔岳寒, 郭立民. 适用于单快拍的多重信号分类改进算法[J]. 吉林大学学报(工学版), 2018, 48(3): 952-956.
[13] 陈绵书, 苏越, 桑爱军, 李培鹏. 基于空间矢量模型的图像分类方法[J]. 吉林大学学报(工学版), 2018, 48(3): 943-951.
[14] 孟广伟, 李荣佳, 王欣, 周立明, 顾帅. 压电双材料界面裂纹的强度因子分析[J]. 吉林大学学报(工学版), 2018, 48(2): 500-506.
[15] 林金花, 王延杰, 孙宏海. 改进的自适应特征细分方法及其对Catmull-Clark曲面的实时绘制[J]. 吉林大学学报(工学版), 2018, 48(2): 625-632.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
[1] 李月英,刘勇兵,陈华 . 凸轮材料的表面强化及其摩擦学特性
[J]. 吉林大学学报(工学版), 2007, 37(05): 1064 -1068 .
[2] 聂建军,杜发荣,高峰 . 存在热漏的内燃机与斯特林联合循环的有限时间的热力学研究[J]. 吉林大学学报(工学版), 2007, 37(03): 518 -0523 .
[3] 王超,宋克柱,唐进 . 高性能水下地震数据采集系统设计与实现[J]. 吉林大学学报(工学版), 2007, 37(01): 168 -172 .
[4] 包铁,刘淑芬 . 基于通信顺序进程的网络故障管理形式化描述[J]. 吉林大学学报(工学版), 2007, 37(01): 117 -120 .
[5] 许思传,李荣庆,程钦,郭英男,张纪鹏,孙济美,王永富 . 边界条件对乙醇HCCI发动机燃烧的影响[J]. 吉林大学学报(工学版), 2007, 37(01): 74 -78 .
[6] 邵宝东,孙兆伟,王丽凤 . 微槽冷却热沉结构尺寸的优化设计[J]. 吉林大学学报(工学版), 2007, 37(02): 313 -0318 .
[7] 杨文,石永久,王元清,施刚 . 结构钢焊接残余应力三维有限元分析[J]. 吉林大学学报(工学版), 2007, 37(02): 347 -0352 .
[8] 程平,张海涛,高岩,李俊锋,王洪艳 . ANN在聚丙烯酸酯乳液性质预测中的应用
[J]. 吉林大学学报(工学版), 2007, 37(02): 362 -0366 .
[9] 肖献强,李欣欣,杨志刚,程光明 . 基于运动估计和图像匹配的视觉控制算法[J]. 吉林大学学报(工学版), 2007, 37(03): 655 -0659 .
[10] 李洪萍,裴玉龙,杨中良 .

快速路自由流速度及其影响因素

[J]. 吉林大学学报(工学版), 2007, 37(04): 772 -776 .