Sep 14, 2021
How will machine learning transform database administration?
Dana Van Aken
Autonomous database tuning relies on machine learning (ML) algorithms to automate tasks that have, to date, been performed manually by database administrators (DBAs). Today, you can find ML-powered automated tuning tools in products like OtterTune, Oracle Autonomous Database, and MySQL Autopilot.
Will machine learning make DBAs and devops obsolete?
I get that question from time to time. My answer is always, “No, ML will not make humans obsolete,” but the technology will change how humans administer databases and require some new skills to master. Here are three predictions about the impact that automatic database tuning will have on database administration:
Prediction #1: ML will yield better DB price-performance with less effort
This first prediction is obvious; it is why DBMS technology developers are investing so much R&D into the functionality. Optimizing DBMS performance is a complex process. Consider configuration tuning, which involves tweaking hundreds of knobs to optimize the DBMS performance for a specific workload on a given hardware and OS platform.
Finding the perfect settings is tedious–usually a trial and error process that can take days. It is so tedious that people do not do it frequently enough, or too often, they do not do it at all, which is inefficient and results in poor performance or higher DBMS hosting costs.
Autonomous database tuning solves this by implementing ML algorithms trained to determine the optimal config settings (or query plan, index selection) for a given workload. The approach is automatic, generates better performance, and saves developers time and effort. Where ML shines is with large deployments of several dozen or hundreds of databases.
No human has the time (or patience) to tune each database individually. But the ML-automated tuning process needs guidance, which leads me to my next prediction.
Prediction #2: Humans will guide autonomous database operations
When it comes to database tuning, ML algorithms do not know and do not care what they are tuning in the system, leading to a complex set of challenges. For example, what if an algorithm sets the size of a buffer pool to be larger than the amount of memory available on the system and then restarts the system?
Some database systems, like PostgreSQL, will refuse to start because the buffer pool is too large. Other systems, like MySQL, will gladly start the system even though the knob is erroneously set. It may then crash or get killed by the OS when the database tries to allocate more memory than is available.
For this reason, humans (e.g., devops, DBAs, etc.) will need to oversee and guide ML-driven automation to make sure databases are tuned correctly and safely. There are several facets to this:
What the ML algorithms should optimize in the database (e.g., query latency, cost).
Whether to sacrifice the data safety and correctness in exchange for better performance.
Confirming appropriate value ranges for different knobs.
Specifying which knobs, if any, are off-limits to ML (and should only be changed manually).
Defining tuning time windows to control when the ML algorithms databases perform tuning and potentially restart the database (for specific config changes to take effect).
OtterTune helps humans and ML work better together by exposing a customizable list of recommended knobs and values, and estimating which knobs will have the most impact on performance.
Prediction #3: ML will add a new dimension to database troubleshooting
Humans working with ML-based database tuning tools will require some familiarity with ML. Such understanding is necessary if the database has unexpected behavior: a human operator will need to identify whether the behavior is due to ML-based decision-making and if so, what metrics and data were fed into the ML algorithms. It will also be helpful to have human-decipherable explanations of why the algorithms are making certain changes.
DBAs need to be thoughtful about what data trains or signals ML models, especially with databases supporting cyclic workloads with complex patterns, because the system state is constantly changing.
Autonomous database tuning is a growing trend that database developers and administrators should be tracking. If you want to learn more about it, here are some helpful resources:
And one more thing…never use the default database configuration! The default configuration file (my.cnf, postgresql.conf) that you get when you first install a database or create one in Amazon RDS is always terrible and makes your database run slow.
Use human expertise and some type of automated tuning tools, whether that’s OtterTune, PGTune, MySQLTuner, or any of the various ones from commercial vendors. Any automated knob tuning can deliver significant improvement, without having to make drastic changes.