Big data availability: Selective partial checkpointing for in-memory database queries
conference contribution
posted on 2018-01-04, 09:50 authored by Daniel Playfair, Amitabh Trehan, Barry McLarnon, Dimitrios S. Nikolopoulos
Fault tolerance is an important challenge for supporting critical big data analytic operations. Most existing solutions only provide fault-tolerant data replication, requiring failed queries to be restarted. This approach is insufficient for long-running, time-sensitive analytic queries, because query progress is lost. Several solutions provide intra-query fault tolerance. However, these focus on distributed or row-oriented databases and are not suitable for use with the column-oriented in-memory databases increasingly used for high-performance workloads. We propose a new approach for intra-query checkpointing that produces an optimal checkpoint solution for a fixed checkpointing budget to minimise overhead on in-memory column-oriented database clusters. We describe a modified architecture for fault-tolerant query execution using this approach. We present a general model for the problem, in which an adversary is free to terminate the execution of the query, eliminating all unsaved work. We present an algorithm that represents a first step towards producing checkpoint plans by optimally placing a single checkpoint. Our analysis shows this approach allows reduced checkpoint overheads while providing resilience for long-running queries.
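The single-checkpoint idea in the abstract can be illustrated with a toy model. The sketch below is not the paper's algorithm; it assumes a query is a linear pipeline of operators, that a checkpoint may be written after exactly one operator, and that an adversary terminates the query once at the worst possible moment. The function name `best_single_checkpoint` and the inputs `op_costs` and `save_costs` are hypothetical, introduced only for illustration.

```python
from itertools import accumulate


def best_single_checkpoint(op_costs, save_costs):
    """Pick the operator after which one checkpoint minimises worst-case lost work.

    Assumed toy model (not taken from the paper):
      op_costs[i]   = execution cost of operator i in a linear pipeline
      save_costs[i] = cost of writing a checkpoint after operator i
    Returns (k, loss): k is the 1-based operator index, or None when no
    placement beats running without a checkpoint.
    """
    if len(op_costs) != len(save_costs):
        raise ValueError("op_costs and save_costs must have the same length")

    total = sum(op_costs)
    # Baseline: with no checkpoint, a kill just before completion loses everything.
    best_k, best_loss = None, total

    prefixes = accumulate(op_costs)  # running work completed after each operator
    for k, (prefix, save) in enumerate(zip(prefixes, save_costs), start=1):
        # Killed just before the checkpoint completes: lose the prefix and the save effort.
        # Killed after it completes: lose only the work done since the checkpoint.
        loss = max(prefix + save, total - prefix)
        if loss < best_loss:
            best_k, best_loss = k, loss
    return best_k, best_loss


if __name__ == "__main__":
    print(best_single_checkpoint([4, 6, 10, 5], [1, 8, 2, 1]))
```

In this toy model the best placement balances the work at risk before the checkpoint against the work at risk after it, which is why a cheap checkpoint very early or very late in the pipeline is not necessarily the best choice.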
School
Science
Department
Computer Science
Published in
2016 IEEE International Conference on Big Data (Big Data)
Proceedings - 2016 IEEE International Conference on Big Data, Big Data 2016
Pages
2785 - 2794
Citation
PLAYFAIR, D. ... et al, 2017. Big data availability: Selective partial checkpointing for in-memory database queries. Proceedings of the 2016 IEEE International Conference on Big Data (Big Data), Washington, DC, USA, 5th-8th December 2016, pp. 2785-2794.