The performance of a Database Management System (DBMS) is determined by the system configuration and the workloads it needs to process. To achieve instance optimality [13], database administrators and end-users need to choose the optimal configuration and allocate the most appropriate resources in accordance with the workloads of each database instance. However, the high complexity of time-varying workloads makes it extremely challenging to find the optimal configuration, especially for a cloud DBMS that may host millions of database instances with diverse workloads. There is no one-size-fits-all configuration that works for all workloads, since each workload imposes different configuration and resource requirements. If a configuration cannot adapt to dynamic workload changes, the overall performance of a DBMS can degrade significantly unless a sophisticated administrator continuously re-configures the system.
An ideal solution to address the above challenges is the autonomous or self-driving DBMS (e.g., Oracle Autonomous Database [12], Peloton [14], NoisePage [15], and openGauss [7]), which is expected to automatically and continuously configure, tune, and optimize itself in response to workload changes without any intervention from human experts. Since the optimal configuration setting strongly depends on the workload characteristics, the first and key step for an autonomous DBMS is to predict future workloads based on historical data. Firstly, the DBMS should be able to forecast when the workload will significantly change (i.e., workload shift), how many queries will arrive (i.e., arrival rate), and what query a user will execute next (i.e., next query). This predicted workload information enables an autonomous DBMS to decide when and how to re-configure itself proactively, before the workload changes occur. Secondly, an autonomous DBMS also needs to predict query performance by estimating essential runtime metrics before execution, such as how long a query will take to complete (i.e., execution time) and how many resources it will consume (i.e., resource utilization). Predicting the execution time and resource demand prior to execution is useful in many tasks, including admission control, query scheduling, progress monitoring, system sizing, and resource management [18].
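As a concrete illustration of arrival-rate forecasting, the sketch below applies simple exponential smoothing, one of the classical time-series techniques, to a short history of per-interval query counts. The function name, the smoothing factor, and the sample data are all illustrative, not taken from any specific system discussed in this tutorial.

```python
# Minimal sketch (illustrative only): one-step-ahead arrival-rate
# forecasting with simple exponential smoothing. The function name,
# alpha value, and sample data are hypothetical.

def forecast_arrival_rate(history, alpha=0.5):
    """Return a one-step-ahead forecast of queries per interval."""
    forecast = history[0]  # initialize with the first observation
    for observed in history[1:]:
        # blend the latest observation with the running forecast
        forecast = alpha * observed + (1 - alpha) * forecast
    return forecast

# e.g., queries per minute observed over the last six minutes
rates = [120, 130, 128, 150, 170, 165]
print(round(forecast_arrival_rate(rates), 1))  # → 159.6
```

A real forecaster would also model seasonality and trend (e.g., diurnal cycles in cloud workloads), but the same principle applies: weight recent observations more heavily than older ones.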
In this tutorial, we will focus on 1) how to forecast future workloads (e.g., workload shift detection, arrival rate prediction, and next query prediction), and 2) how to analyze the behaviors of workloads (e.g., execution time prediction and resource usage estimation). We will provide a comprehensive overview and detailed introduction to these two topics, ranging from state-of-the-art methods and real-world applications to open problems and future directions. Specifically, we will not only discuss traditional methods, such as time-series analysis [3, 16], Markov modeling [4, 5], analytical modeling [17, 18], and experiment-driven methods [1], but also cover state-of-the-art AI techniques, including machine learning [8], deep learning [10], reinforcement learning [11], and graph embedding [20]. Table 1 summarizes the major research topics that will be presented in this tutorial.
# | Workload Characteristics | Description | Information
1 | Workload Shift | When the workloads will change | Time
2 | Arrival Rate | How many workloads will arrive | Volume
3 | Next Query | What will be the next query or transaction | Change
4 | Execution Time | How long the workloads will take to run | Duration
5 | Resource Usage | How many resources will be consumed | Resource
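To make the Markov-modeling approach to next-query prediction concrete, the following sketch trains a first-order Markov model over a log of query templates and predicts the most likely successor of the current query. The class name, template labels, and log are hypothetical stand-ins for what a real DBMS would extract from its query history.

```python
# Illustrative sketch: first-order Markov model for next-query
# prediction. Template labels and the log are hypothetical.
from collections import defaultdict

class NextQueryPredictor:
    def __init__(self):
        # transition counts: counts[prev][next] = frequency
        self.counts = defaultdict(lambda: defaultdict(int))

    def train(self, query_log):
        # count transitions between consecutive query templates
        for prev, nxt in zip(query_log, query_log[1:]):
            self.counts[prev][nxt] += 1

    def predict(self, current):
        # return the most frequent successor of the current template
        successors = self.counts.get(current)
        if not successors:
            return None
        return max(successors, key=successors.get)

log = ["SELECT_ORDERS", "UPDATE_STOCK", "SELECT_ORDERS", "UPDATE_STOCK",
       "INSERT_PAYMENT", "SELECT_ORDERS", "UPDATE_STOCK", "INSERT_PAYMENT"]
p = NextQueryPredictor()
p.train(log)
print(p.predict("SELECT_ORDERS"))  # → UPDATE_STOCK
```

Higher-order Markov models and learned sequence models (e.g., recurrent networks) follow the same idea but condition on longer histories of queries rather than only the most recent one.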
Part I: Motivation and background (5 min)
Part II: Workload forecasting and performance prediction (70 min)
Part III: A case study of real-world applications (20 min)
Part IV: Open challenges and future directions (5 min)
Zhengtong Yan is a doctoral student at the University of Helsinki. His research topics include autonomous multi-model databases and cross-model query optimization.
Jiaheng Lu is a professor at the University of Helsinki. His main research interests lie in database systems, specifically in the challenge of efficient data processing over real-life, massive data repositories and the Web. He has written four books on Hadoop and NoSQL databases, and published more than 100 papers in SIGMOD, VLDB, TODS, TKDE, etc. He has given several tutorials on multi-model data management and autonomous databases at the VLDB, CIKM, and EDBT conferences. He frequently serves as a PC member for conferences including SIGMOD, VLDB, ICDE, EDBT, and CIKM.
Qingsong Guo is a postdoctoral researcher at the University of Helsinki. His research interests include multi-model databases and automatic management of big data with deep learning.
Gongsheng Yuan is a doctoral student at the University of Helsinki. His research topics lie at the intersection of databases and quantum theory or reinforcement learning.
Calvin Sun is the Chief Database Architect at Huawei Cloud. He has 20+ years of experience in developing database systems, ranging from embedded databases and large-scale distributed databases to cloud-native databases.
Steven Yuan is the Director of the Huawei Toronto Distributed Scheduling and Data Engine Lab. He leads a research team in the big data and cloud domains, focusing on distributed scheduling and distributed databases, from IaaS to PaaS.