A Meta Note About Developer Experience and Simplicity in Data Tooling
tl;dr: This is a meta post inspired by my first 3 months of building with GlareDB. After seeing how fun and simple it is to use in a landscape filled with so much choice and complexity, I've realized how essential the data tooling workflows unlocked by GlareDB are, and how much we need better tools that provide both features and simplicity.
I've been a data engineer and developer advocate in the data space for years, and one of my main responsibilities for much of that time has been to set up various integrations to move data around, transform them, check them, share them, and use them. There's been some recent conversation around the term "modern data stack" and whether it's still useful, but no matter your position, the fact is that data are messy and abundant and essential, and so there will likely continue to be improvements and innovations in the form of new tools and even new categories in the space. The quantity of these tools, and the sheer number of potential permutations, make testing and demonstrating these integrations one of the most fun and also potentially one of the most painful parts of the job.
What's fun? Connecting these new tools together can be really powerful and satisfying! It's amazing how newer data formats--Parquet, Lance, pandas and Polars dataframes--have made data more easily interoperable than ever before and enabled some really powerful workflows. I had a blast writing the Dashboards and Data Quality blog posts, and it felt great to see how contemporary tooling can take things that are often total drudgery, like formatting a dashboard or writing data quality checks, and streamline them.
So why the potential pain? Tasks like creating and provisioning accounts, finding data, loading the data into various databases, and getting the right connection strings can all be nitpicky, laborious, and error-prone. Even while the contemporary data tooling landscape makes it so easy to plumb data pipes together into pipelines, "drilling the data wells," as it were (sorry, this metaphor can really go far), can still be a frustrating task. I am excited about the end uses of the data, and I am excited about the way we process data. But setting up the infrastructure and configuring access are hard to get excited about. Even at a small company, infrastructure and account provisioning can involve 2 or 3 people and take a couple of days to get ironed out.
This is one of the things that has made me so excited to work on GlareDB. As I build examples and demos of GlareDB and all the different things that it can connect to and work with, I find that ~5-10% of my time is spent using GlareDB and actually building the hands-on examples, and ~90-95% of my time is spent provisioning, configuring, and loading new data. So: if setting up new stuff has a high activation cost and feels burdensome, then limiting the number of tools that need to be set up reduces complexity, cost, and frustration. Check out the blog post on query federation with dbt for a hands-on example of this!
I also think about this xkcd a lot, and understand the tension of saying "adopt just one more tool and all your problems will be solved." Here is what has made me really proud to work on GlareDB specifically:
- Managing lots of tools is frustrating, and GlareDB lets you hide that complexity from your users and applications.
- GlareDB is focused on providing a unified interface for data, but it'd be nice if the standards around authentication and access control were more widely adopted.
- With GlareDB you don't have to configure access for each application or engineer. Instead, GlareDB provides a single interface through which your applications talk to all your data, so each application, script, or user no longer needs its own credentials for many separate systems.
It's been a total pleasure to work on a tool that prioritizes the developer experience in this way.
Get started now
Ready to get the most out of your data? Get started with GlareDB locally, or spin up a deployment on GlareDB Cloud!