Journal of Jilin University (Engineering and Technology Edition) ›› 2017, Vol. 47 ›› Issue (1): 199-207. doi: 10.13229/j.cnki.jdxbgxb201701030

• Original Article •

Adaptive checkpoint mechanism supporting large-scale stream data processing

WEI Xiao-hui, LIU Zhi-liang, ZHUANG Yuan, LI Hong-liang, LI Xiang   

  1. College of Computer Science and Technology, Jilin University, Changchun 130012, China
  • Received: 2016-03-02 Online: 2017-01-20 Published: 2017-01-20

Abstract: A novel checkpoint mechanism is presented that supports stream data processing and adjusts the checkpoint period dynamically online. First, to handle bursts in the data flow, we propose a recovery time model that provides a guarantee on the recovery time. Second, we provide a real-time checkpoint cost model that follows the real-time variation of the workload. Finally, a peak traffic avoidance protocol periodically updates the real-time checkpoint cost and uses it to choose the best checkpoint time dynamically. Experiments show that, compared with existing methods, the self-adaptive mechanism is more flexible and more responsive in real time, and that it meets the requirements of high reliability and real-time fault tolerance in stream data processing.
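The abstract does not give the paper's models, so the following is only a minimal Python sketch of the general idea, under stated assumptions. The class `AdaptiveCheckpointer`, its parameters, the linear recovery time estimate (restore time plus replay of pending tuples), the cost estimate (tuples delayed during the checkpoint pause), and the moving-average rule for "cheap" moments are all hypothetical and are not taken from the paper. The sketch defers a checkpoint while the load is at a peak, takes it at the first cheap moment once enough data is pending, and forces it before the recovery time bound would be exceeded.

```python
from dataclasses import dataclass


@dataclass
class AdaptiveCheckpointer:
    """Hypothetical sketch; every model and parameter here is an assumption."""
    max_recovery_s: float        # recovery time bound to guarantee
    restore_s: float             # assumed fixed time to reload a checkpoint
    replay_rate: float           # assumed tuples/s replayed during recovery
    pause_s: float               # assumed processing pause per checkpoint
    soft_fraction: float = 0.5   # start looking for a cheap moment at this share of the bound
    smoothing: float = 0.2       # weight of the newest sample in the running average
    pending: float = 0.0         # tuples received since the last checkpoint
    avg_cost: float = 0.0        # running average of the checkpoint cost

    def recovery_time(self, pending: float) -> float:
        # Assumed model: reload the last checkpoint, then replay pending tuples.
        return self.restore_s + pending / self.replay_rate

    def cost(self, rate: float) -> float:
        # Assumed model: tuples that arrive (and are delayed) during the pause.
        return rate * self.pause_s

    def step(self, rate: float, dt: float = 1.0) -> bool:
        """Observe the input rate for one interval; return True if we checkpoint now."""
        self.pending += rate * dt
        cost_now = self.cost(rate)
        # Forced: one more interval at this rate would break the bound.
        must = self.recovery_time(self.pending + rate * dt) > self.max_recovery_s
        # Eligible: enough data is pending that a checkpoint is worthwhile.
        eligible = (self.recovery_time(self.pending)
                    >= self.soft_fraction * self.max_recovery_s)
        # Cheap: the current cost is no higher than the recent average (off-peak).
        cheap = cost_now <= self.avg_cost
        self.avg_cost += self.smoothing * (cost_now - self.avg_cost)
        if must or (eligible and cheap):
            self.pending = 0.0
            return True
        return False


# Usage: a short burst (400 tuples/s) between periods of light load (100 tuples/s).
ckpt = AdaptiveCheckpointer(max_recovery_s=15, restore_s=2,
                            replay_rate=100, pause_s=0.5)
decisions = [ckpt.step(rate) for rate in [100, 100, 400, 100, 100]]
print(decisions)  # the checkpoint is taken in the interval right after the burst
```

In this run the checkpoint is skipped during the burst and taken once the load drops again. Under a sustained high rate the same object falls back to the forced branch, which is what keeps the assumed recovery time estimate within `max_recovery_s`.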

Key words: computer system architecture, stream data processing, checkpoint, processing delay, recovery time

CLC Number: TP391