Jul 28, 2021
Seven years of OtterTune R&D: Part 2 – Learnings in the wild
In our previous blog post, we detailed the seven years of intensive research and lessons we learned that led to the commercialization of the OtterTune database tuning service. Since then, we’ve brought OtterTune into the real world at Société Générale (SocGen) and Booking.com. We keep learning about how to optimize database performance, and OtterTune keeps getting smarter. Here are a few details we’ve learned so far.
Lessons from our first real-world deployment
Our latest academic paper, An Inquiry into Machine Learning-based Automatic Configuration Tuning Services on Real-World Database Management Systems, focuses on our 10-month study of OtterTune running in a production environment at SocGen. OtterTune co-founder Dr. Dana Van Aken will be presenting these results at VLDB 2021 in August.
For this research, we deployed OtterTune on-site to tune a 1TB Oracle (v12.2) DBMS. We extended OtterTune to support multiple tuning algorithms. In addition to the original GPR algorithm from the SIGMOD 2017 paper, we added two additional algorithms:
Deep deterministic policy gradient (DDPG) from CDBTune and
Deep neural network (DNN) by OtterTune co-founder Bohan Zhang.
Our results show that OtterTune’s ML-generated configurations improved the DBMS’s performance by up to 45% over a configuration hand-tuned by the bank’s in-house expert DBAs. We found that there was no major difference in the quality of tuning among the three algorithms. But they all beat the human experts. GPR converges the fastest but DNN had slightly better results. DDPG might be better if we ran it for longer but a single trial already took 3-4 days (each bar in the chart is the average of nine trials). This was our first real-world implementation of OtterTune and we gained a lot of insights.
Key details and advancements
During the process of setting up and deploying OtterTune at SocGen for this study, several issues arose that we did not anticipate. Some of these were specific to SocGen’s operating environment and cloud infrastructure. Several issues, however, are broadly related to the field of automated DBMS tuning with machine learning.
Tune without accessing sensitive data – Because the bank has strict security requirements, the DBMS could not use locally attached storage. It had to store the database on a shared NAS for regulatory reasons. We also had to consider GDPR privacy regulations. This was not an issue because the metrics OtterTune gathers are benign. OtterTune’s algorithms only use knobs and metrics, it does not retrieve sensitive queries or user data.
Machine learning needs safeguards – Automated database tuning is more than just generating new configurations. The service needs additional safeguards to make sure it correctly identifies invalid configurations that cause the DBMS to fail to start and poor configurations that cause the DBMS to run much slower than before. If the ML algorithms do not have prior training data, they will inevitably select poor configurations to install on the DBMS at the beginning of a tuning session because they will explore the possible configurations to learn how different settings impact the target objective.
Limit configuration testing times – We had to overcome issues with handling long-running and failed configurations. For example, the execution time of the workload in this instance depended on how long it took to replay the queries in the 10-minute trace we used in this study. We implemented an early abort feature that would stop executing long-running traces after a specified period of time (45 minutes by default) since some bad configurations would cause the execution time to take several hours.
Discerning maintenance tasks from regular DB workload – In real-world deployments, DBMS maintenance tasks have to be considered. Some DBMSs invoke these tasks at scheduled intervals, while other tasks are in response to the workload (e.g., PostgreSQL’s autovacuum runs when a table is modified a certain number of times). It is best to be aware of these tasks in advance before starting a tuning session since they often cause performance to degrade.
Auto-tuning benefits vs. costs – There are unexpected cost considerations to consider as well. Our results showed that ML-based algorithms generated configurations that improved performance by up to 45% over the configurations hand-tuned by expert DBAs. Although these gains are noteworthy, there is a trade-off between the time it takes to deploy OtterTune versus the benefit. But there are several non-obvious factors that organizations need to consider when making the decision to tune a DBMS. Such factors include the economic significance of the application, the administrative cost of tuning the database, and whether the organization has the tooling and infrastructure to run the tuning sessions.