IEEE ICDE 2021 Tutorial

Workload-Aware Performance Tuning for Autonomous DBMSs

Abstract

An optimal configuration is vital for a database management system (DBMS) to achieve high performance. No one-size-fits-all configuration works across different workloads, since each workload has its own patterns and resource requirements. System performance therefore depends jointly on the configuration and the workload. If the configuration cannot adapt to dynamic changes in the workload, the overall performance of the DBMS may degrade significantly unless a skilled administrator continuously reconfigures it. In this tutorial, we focus on autonomous workload-aware performance tuning, which is expected to tune the configuration automatically and continuously as the workload changes. We survey three research directions: 1) workload classification, 2) workload forecasting, and 3) workload-based tuning. The first two address the problem of obtaining accurate workload information, while the third tackles the problem of how to use that information properly to optimize performance. We also identify research challenges and open problems, and give real-world examples of leveraging workload information for database tuning in commercial products (e.g., Amazon Redshift). In the presentation, we will demonstrate workload-aware performance tuning in Amazon Redshift.

Outline:

Part A:

  • Motivation and Background
  • Workload Classification
  • Workload Forecasting (Prediction)

Part B:

  • Workload-Based Tuning
  • Amazon Redshift
  • Open Challenges and Discussion

Presenters:

Zhengtong Yan is a doctoral student at the University of Helsinki. His research topics include autonomous multi-model databases and cross-model query optimization.

Jiaheng Lu is a professor at the University of Helsinki. His main research interests lie in database systems, specifically the challenge of efficiently processing data from real-life, massive data repositories and the Web. He has written four books on Hadoop and NoSQL databases and has published more than 100 papers in venues such as SIGMOD, VLDB, TODS, and TKDE. He has given several tutorials on multi-model data management and autonomous databases at the VLDB, CIKM, and EDBT conferences. He frequently serves as a PC member for conferences including SIGMOD, VLDB, ICDE, EDBT, and CIKM.

Naresh Chainani is a senior software development manager at Amazon Web Services (AWS) working on query processing, query performance, distributed systems, and workload management. He is passionate about building high-performance databases that are easy to use and has multiple papers and patents.

Chunbin Lin is a software engineer at Amazon Web Services (AWS), where he works on Amazon Redshift. He completed his Ph.D. in computer science at the University of California, San Diego (UCSD). His research interests are distributed database management and big query analytics. He has published more than 30 papers in venues such as SIGMOD, VLDB, VLDB J, and TODS.