Sep 21, 2021
Benchmark: Using machine learning to optimize Amazon RDS MySQL performance
Ruogu Du | Bohan Zhang
Amazon RDS MySQL is a popular choice for running a MySQL database. When you provision an RDS MySQL instance, it comes pre-installed with Amazon’s default configuration settings for dozens of tunable knobs. These knobs control the DBMS’s runtime behavior (e.g., buffer pool policies, thread scheduling).
Amazon’s default settings use general-purpose values for the most common workloads. That is, they are not particularly bad for any application’s workload. But of course, this means that they are also not optimal for your database’s specific workload.
Correctly setting these knobs to optimize your database’s price-versus-performance trade-offs is non-trivial. It requires you to have both of the following:
Specialized knowledge of MySQL internals.
Time to make changes and test the impact of the change on performance until it is optimized (i.e., trial-and-error).
If you run Amazon RDS MySQL and you lack (1) or (2) above, or if your workload changes over time, your cloud costs are likely higher than they should be for your database.
Comparing MySQL Configurations for Amazon RDS
We wanted to see how poor Amazon’s RDS knob settings could be and understand how to fix them. We ran an experiment using MySQL (v8.0) on Amazon RDS and measured its performance using three database configurations:
Default Amazon RDS knob settings
Knob setting recommendations from MySQLTuner
Knob setting recommendations from OtterTune’s machine learning algorithms
We ran the tests on an Amazon db.m5.4xlarge RDS instance equipped with:
64 GB RAM
2000 provisioned IOPs
We used the BenchBase framework to run the industry-standard TPC-C benchmark. TPC-C models an order processing application similar to an online storefront, where customers purchase items and the database keeps track of inventory. We configured BenchBase to load a TPC-C database with 200 warehouses (~25 GB of table data) and simulate an application with 50 active connections.
To use MySQLTuner on an RDS instance, we downloaded the latest version of the script from Github and then ran it on an EC2 instance inside of our Amazon VPC. Since this EC2 instance is inside our firewall, we only have to provide MySQLTuner the username and password for the target MySQL instance. MySQLTuner is unaware that the database is running in Amazon RDS, so we have to manually edit the Parameter Group using the AWS console.
We applied the following configuration that MySQLTuner recommended. We note that MySQLTuner performs many other useful checks, but it only recommended that we modify two knobs for MySQL’s InnoDB storage engine:
We provided OtterTune the necessary information to connect to MySQL using Amazon’s AWS API and the database itself to retrieve MySQL’s runtime metrics via SQL.
We then instructed OtterTune’s ML algorithms to generate MySQL configurations that optimize for the total number of transactions committed (trx_rw_commits + trx_ro_commits + trx_nl_ro_commits).
We can confirm that MySQL is properly collecting these counters by checking the system’s internal INNODB_METRICS catalog table to make sure that the counters are enabled:
We allowed OtterTune to tune 16 MySQL knobs. The listing below is the best configuration that OtterTune generated for the db.m5.4xlarge RDS instance.
Some of these knobs do not require OtterTune to restart MySQL before they take effect. Such knobs are referred to as dynamic system variables in MySQL. Other knobs do not take affect until you restart the DBMS.
Result: OtterTune makes MySQL 4x faster
In the graph above, we see that, as expected, Amazon’s default configuration for MySQL RDS performs the worst with only 781 transactions per second (txn/sec). MySQLTuner’s recommendations improve MySQL’s throughput by 116% over the RDS default by changing two knobs.
This improvement is undoubtedly good, but OtterTune optimizes more MySQL knobs to improve its performance even further. OtterTune’s ML algorithm achieves the best performance with over 3500 txn/sec, 4x higher than the default RDS configuration from Amazon and 20% higher than the MySQLTuner outcome.
Better performance or lower cost?
The first test showed how OtterTune customizes MySQL’s knob settings for the application’s workload to get better performance than Amazon’s default settings and the popular MySQLTuner tool. Better performance means that the DBMS can execute more queries quickly even if your database grows over time or your application adds new features. Improving a database’s performance helps applications that may be running a bit slow.
For people who are satisfied with their database’s current speed, but would like to lower their monthly AWS bills, tuning database knobs can also help. To demonstrate that, we reran the TPC-C benchmark on a smaller db.m5.2xlarge RDS with fewer provisioned IOPs than the instance we tested above:
32 GB RAM
1000 provisioned IOPs
In the graph below, we see that, again, OtterTune can optimize MySQL’s configuration to get better performance than Amazon and MySQLTuner. On this smaller instance, using the OtterTune-generated configuration makes MySQL run 288% and 63% faster than Amazon and MySQLTuner, respectively.
But what is most impressive is that OtterTune’s optimized configuration for MySQL on the db.m5.2xlarge instance achieves almost 2x better performance than the default Amazon RDS configuration for the db.m5.4xlarge instance.
As of the time of this article in September 2021, the smaller db.m5.2xlarge instance costs $7,191/year (instance cost + provisioned IOPs), while the db.m5.4xlarge instance costs $14,383/year.
These results mean that for this workload, OtterTune gets double the performance at half the cost! It is important to remember that this result is specific to this workload. Other applications may experience a smaller improvement, while others may have an even more significant performance boost and cost reduction.
There are too many factors that affect MySQL’s runtime behavior. This complexity is why database tuning is so hard. But we are fairly certain that you should probably not use Amazon’s default RDS configuration.