product, engineering
January 30, 2024

Debunking the Database Monolith

Sam Kleinman, Engineering Lead

I would like to posit two truths:

  • within certain limits, it's always possible to build a database system that will be faster or more efficient for a specific workload than a general-purpose one

  • organizations (companies, teams, divisions) will never be able to select a single database or analytics engine to manage all of their data

Don't believe anyone who tells you otherwise.


The first truth is perhaps self-evident: optimizations -- within the limits of the underlying systems and physics -- are often possible if you're willing to make sacrifices:

  • you can compress things really well if you don't mind that it takes more CPU (and time) to read them

  • you can make most OLTP systems pretty fast if you can keep all of the "working data set" in RAM

  • you can get really fast write performance if you're willing to sacrifice some compactness (storage) and read performance

  • you can use indexes to reduce the effective working set size (increased memory efficiency, better read performance), but writes -- and updates in particular -- suffer

Building database systems within these constraints (what is engineering, after all?) is about compromise. General purpose database solutions are always trying to strike a balance that suits most use cases. But it's always a compromise, and if you're willing to sacrifice a use case, or can accept increased operational, hardware, or maintenance costs, you can get something that's faster and better.

The great general purpose databases of the last 40 years have done an excellent job selling the industry on the myth that a database can be made generic enough to be the system for all data.

It's a beautiful myth: big institutions love the myth because it means they buy one database product, and it will be the system for storing and sharing data of any kind. Fewer decisions, fewer products, less complexity, less friction for connecting different kinds of related data. Database vendors also love the myth because it means they can become the default database of record and own all of an organization's workloads. For decades.

It's never been possible to pick just one database, or even two. I think at a minimum, most applications or systems end up needing three (transactional, archive/analytics, and timeseries/metrics). Many applications also have search or streaming data needs, which adds another database system or two. And in each of these categories there are a few options with different characteristics, so a sufficiently large company can end up with a lot of databases.

This is actually a great thing.

Application developers should be able to spend most of their time writing application logic, not fine-tuning their data algorithms and pipelines to get better performance out of generic systems. Pushing work down to a purpose-built database engine is a great way for developers to focus on their applications and still get the most out of their data.


GlareDB is the engine born into this reality. There will never be a winning generic database monolith, and there shouldn't be, but developers, data analysts, and engineers should be able to work seamlessly with data across application boundaries or across workloads. Applications should be able to look at timeseries and operational data at the same time. Archival data and operational data should be accessible to applications without requiring a pipeline or an extra layer of complexity.

By providing a single interface -- just SQL! -- on top of all of the database systems that organizations (and people!) already use, GlareDB makes it possible to exist in this post-monolith world. We know that teams have data in JSON and CSV files, we know that you have data in S3 buckets and in Snowflake; we know that every organization runs on nests of Excel files. This is the reality, and we don't want to build the one true general purpose database engine of the future. We want to build a tool that solves the problems we have today (and will definitely have tomorrow), and to provide a framework that can grow with your workload and organization as your needs change.
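As a rough sketch of what that single SQL interface can look like, here's a hypothetical query that joins a local CSV export with a table living in an operational Postgres database. The paths, connection string, and table names are made up for illustration, and the exact function signatures may differ from GlareDB's current API -- check the documentation for the authoritative forms.

-- Join a local CSV file against a table in a running Postgres instance.
-- File path, connection string, and table/column names below are illustrative assumptions.
SELECT o.customer_id,
       c.segment,
       sum(o.amount) AS total_spend
FROM read_csv('./exports/orders_2023.csv') AS o
JOIN read_postgres('postgresql://user:pass@prod-db:5432/app', 'public', 'customers') AS c
  ON o.customer_id = c.id
GROUP BY o.customer_id, c.segment;

The same pattern extends to Parquet in S3, Snowflake tables, and the rest: query the data where it already lives, without standing up a pipeline first.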

If you want to learn more about GlareDB, or work with us on re-imagining your data systems and applications for the post-monolith reality, you can check out GlareDB Cloud, set up a chat, or send an email to hello@glaredb.com. We'd love to talk more!

Get started now

Ready to get the most out of your data? Get started with GlareDB locally, or spin up a deployment on GlareDB Cloud!

$ curl https://glaredb.com/install.sh | sh
Try GlareDB Cloud