Is Augmented Data Management Really New?

July 27, 2022
  • Andy Pavlo

There is a lot of noise in the early-stage tech startup game, which makes it hard to get people to pay attention to what you are trying to do. This is why we crave any recognition. But not everyone can be like Yo and get featured on The Colbert Report, or be like SingleStore and get a toast from Ashton Kutcher on Twitter.

We don’t want to be stunting too much, but we are excited that Gartner listed us as a Cool Vendor in 2022 for their Augmented Data Management category. We were mentioned alongside our ex-LogicBlox friends at RelationalAI (see their recent CMU-DB tech talk about what they are building). Gartner also included Sync Computing. We don’t know them, but their website looks nice.

You might ask yourself what “augmented data management” means and why OtterTune is lighting it up in this space. Read on to find out.

What is Augmented Data Management?

50 Year History of Database Automation

In 2020, Gartner defined augmented data management as the application of artificial intelligence (AI) and machine learning (ML) to optimize data-intensive systems. In other words, it means using AI/ML technologies to automate software/hardware optimizations. This broad definition could encompass many different tools, projects, and startups. At OtterTune, we are obsessed with databases (especially me) and the long history of automated ways to optimize them, so that’s what I will talk about here.

The term augmented data management is not widely used by researchers working on database automation. Like many industry analysts, Gartner likes to come up with their own words to define emerging technology categories. For example, in 2014, they came up with HTAP to represent hybrid OLTP and OLAP workloads, which I like better than alternatives like “OLxP.” Although Gartner puts OtterTune into their augmented data management category, I feel like what we do overlaps with Gartner’s 2016 term AIOps.

A key thing to be aware of is that database automation is nothing new. People have been researching this problem and trying to solve it for the last 50 years; they were just calling it by a different name. In the 1970s, people developed self-adaptive methods for database design (e.g., indexes, partitioning). This term persisted mostly in academic literature for about two decades. Then Microsoft’s AutoAdmin project pushed the field forward under the moniker of self-tuning databases. In the last decade, we’ve seen the rise of cloud automation, where service providers automate all of the back-end provisioning and management of large-scale database-as-a-service (DBaaS) fleets. Almost all of this previous work relied on human-derived rules and heuristics. What makes augmented data management different (and interesting!) is the reliance on AI/ML methods to automatically determine the right optimization strategies for databases.

The Rise of DBaaS and DevOps

If database automation has been around since the 1970s and is such a significant problem, why would Gartner single it out only in the last two years? The critical factor is the ubiquity of database-as-a-service offerings on the cloud and the rise of the DevOps role.

Google Trends: DBA vs. DevOps

Databases are super easy to deploy now. You can get a production-ready PostgreSQL or MySQL instance running on Amazon RDS in a few minutes. This means that there are more databases than ever, but people incorrectly think their cloud vendor will optimize their DBMS. Some DBaaS offerings do that, like Microsoft’s automatic indexing for Azure SQL Database or Oracle’s Autonomous Database. But most systems do not.

So, who is optimizing them? Traditionally, this has been the purview of your trusty database administrator (DBA). But with the rise of the “DevOps” role (and increasing gang violence among DBAs), the prevalence of DBAs at many companies has diminished in the last decade. One piece of evidence of this organizational shift is this Google Trends analysis. This means that more databases are being managed by people who would not describe themselves as database experts. Speaking for myself, I am participating in more calls with potential OtterTune customers who are developers looking for help with their databases than with DBAs.

The Problem: Getting a Grip on Cloud Spend

What do companies do without a database expert looking after systems to solve performance issues (e.g., slow queries, bloated storage, inefficient query plans)? We have found that they just throw money at their cloud vendors, hoping to make their problems magically disappear. The upshot is excessive cloud spending on databases, an issue that is becoming more critical as cloud costs continue to rise (see “You are Overpaying Jeff Bezos for Your Databases”). Optimizing cloud infrastructure for your database’s current usage can be vital for performance, but it’s complex and time-consuming. On the other hand, scaling vertically in the cloud is fast and easy, and many organizations will choose to scale up when they encounter performance issues. But if your MySQL or PostgreSQL database is horribly misconfigured, giving your cloud vendor more money only defers the performance problems instead of fixing them. Consequently, these organizations tend to underestimate the amount of money wasted on resources that wouldn’t be necessary if the system were optimized.

The Answer: Automated Database Optimization

Cost and infrastructure optimization in the cloud is a complex problem. In my experience of working in this area for over a decade (and based on the last 50 years of research), automated tools like OtterTune that rely on ML to optimize database systems are the only solution that handles the most challenging databases at scale. Gartner reaches a similar conclusion in their report and argues that augmented data management (i.e., OtterTune) is the crucial way to address this problem. We see it, and our customers see it. And Gartner sees it. We couldn’t be happier with the recognition.

If you have out-of-control cloud costs due to badassless database optimization, create a free OtterTune account and join us on Slack!
