From ETL to ELT and Back Again: Transformations in AI Token Space

Everyone and their grandmother nowadays knows about ETL (extract-transform-load). Whoever ran that category creation push did a great job, because it’s become the household name for “any process that puts data into the data warehouse”. There’s even a concept of Reverse ETL, which has nothing to do with “load-transform-extract” but instead just means “taking data out of the warehouse and putting it back into your software systems”.

The problem is, ETL is mostly dead nowadays. People don’t talk about it that way, but it’s true. In reality, most teams now practice ELT (extract-load-transform). Gone are the days when transformations en route from source systems to the data warehouse were essential for high-performance analytics. Compute and storage are cheaper than ever, and modern data warehouses can more or less take all the transformations you want to throw at them without breaking the bank.

I think etymology and semantics matter. They teach important things about history and industry. It might seem like ETL and ELT are close enough that the distinction doesn’t matter, but the switch between the two can teach us a lot about designing effective systems within the constraints of our time.

Another semantic oddity: “Fivetran” is short for “five transformations”, the original idea being that you would only ever need five transformations to do anything you want with your data: filter, join, aggregate, project, and sort. In 2012, the “Five Transformations” company was pulling data from source systems and shoving it into warehouses that were expensive and limited, like Teradata. They entered a market where true “ETL” was the standard: you transformed your data before loading it into the data warehouse. Those pre-loading transformations were so important that the founders named their whole company after them.
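For the record, those five operations are still the bread and butter of analytics work today. Here is a minimal sketch in pandas, on invented toy data, purely to make the list concrete (the DataFrames and column names are illustrative, not anything Fivetran ships):

```python
# The five classic transformations on toy data: filter, join, aggregate,
# project, and sort. Invented example data, for illustration only.
import pandas as pd

orders = pd.DataFrame({"order_id": [1, 2, 3],
                       "customer_id": [10, 10, 20],
                       "amount": [120.0, 35.0, 60.0]})
customers = pd.DataFrame({"customer_id": [10, 20],
                          "region": ["EMEA", "AMER"]})

filtered = orders[orders["amount"] > 50]                                # filter
joined = filtered.merge(customers, on="customer_id")                    # join
aggregated = joined.groupby("region", as_index=False)["amount"].sum()   # aggregate
projected = aggregated[["region", "amount"]]                            # project
result = projected.sort_values("amount", ascending=False)               # sort
print(result)
```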

Nowadays, most people view Fivetran as a standard for moving data from point A to point B, with minimal transformation. It’s mostly a data integration product, not a data transformation product. Their name is a vestigial structure from a time long gone, when running transformations entirely in data warehouses was prohibitively expensive. Back then, it made no sense to dump raw data as-is into Teradata and run the rest of your pipelines in the warehouse. It would have been silly.

I think there’s a lot to learn from this story. We can apply these lessons to the newest and grooviest technology hitting modern data stacks: AI applications with MCP connectors.

Everyone is providing an MCP connector for their software systems. There are MCP connectors for HubSpot, Intercom, Stripe… and soon for everything you can imagine. Looking at Anthropic’s interface for connecting tools, you might think you can just go to your favorite model provider, connect directly to your systems via MCP, and be able to answer any question you’d want to answer. This is version 0 of AI enablement. With good, focused prompt engineering, it can work decently when your workflows are confined to a single system.
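To make that “version 0” concrete, here is a hedged sketch of connecting a single system’s MCP connector and seeing what tools it exposes, assuming the official MCP Python SDK; the server command and names below are hypothetical placeholders, not any particular vendor’s connector:

```python
# A minimal "version 0" sketch: talk to one MCP server over stdio and list the
# tools a model would be allowed to call. Assumes the `mcp` Python SDK; the
# server command below is a hypothetical placeholder.
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Hypothetical single-system connector (e.g. a CRM) launched as a subprocess.
server_params = StdioServerParameters(command="npx",
                                      args=["-y", "example-crm-mcp-server"])

async def list_single_system_tools() -> None:
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            # With one system, the model only has to choose among these tools;
            # no cross-system joins are happening yet.
            for tool in tools.tools:
                print(tool.name, "-", tool.description)

asyncio.run(list_single_system_tools())
```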

The problems arise when you must answer questions that span multiple systems. If your AI adoption is built on MCP connectors sitting directly on top of your operational systems, then every cross-system workflow relies on doing data integration and transformation entirely in token space. There is no centralized storage and computation layer for the underlying data, so you’re relying on a sequence of tool calls, strung together by token generation, to figure out how to pull the data together.
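Here is a hedged sketch of what that looks like in practice. The two fetch functions are stand-ins for MCP tool calls against two separate systems (the names, fields, and values are invented for illustration); the important part is that both raw payloads land in the model’s context, and the model has to reconcile keys, join, and aggregate in token space, on every question that needs this view:

```python
# Cross-system integration done in token space: raw payloads from two systems
# are serialized into the prompt and the model is asked to do the join itself.
# All data and function names here are invented stand-ins for MCP tool calls.
import json

def fetch_crm_accounts() -> list[dict]:
    # Stand-in for a CRM connector tool call returning raw records.
    return [{"account_id": "A1", "name": "Acme", "owner": "dana"},
            {"account_id": "A2", "name": "Globex", "owner": "lee"}]

def fetch_billing_invoices() -> list[dict]:
    # Stand-in for a billing connector tool call; note the mismatched key name.
    return [{"customer_ref": "A1", "amount_usd": 1200},
            {"customer_ref": "A1", "amount_usd": 300},
            {"customer_ref": "A2", "amount_usd": 450}]

prompt = (
    "Given these CRM accounts and billing invoices, report total invoiced "
    "amount per account owner.\n\n"
    f"CRM accounts:\n{json.dumps(fetch_crm_accounts(), indent=2)}\n\n"
    f"Invoices:\n{json.dumps(fetch_billing_invoices(), indent=2)}"
)
# Every row crosses the context window, and the "join" is re-derived by token
# generation each time someone asks a question that spans both systems.
print(prompt)
```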

The real issue is that this is terribly inefficient, and that’s not going to change soon. The AI token generation of 2025 is the Teradata computation of 2012: powerful, but prohibitively expensive for enterprise-scale integration and transformation work. It is significantly cheaper, and honestly more effective, to run your integrations and transformations in a central data warehouse, form an ontology or other semantic layer, and then expose that to your AI workloads. If the 1980s to early 2000s were the ETL era for data warehousing (transform before loading into the data warehouse), then we are currently in the ETL era of AI (transform before loading data into AI).
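For contrast, here is a minimal sketch of the warehouse-first version of the same workflow, using DuckDB as a stand-in for the central warehouse (the tables, columns, and values are invented for illustration): raw extracts are landed as-is, the cross-system join and aggregation run once in SQL, and only the small, already-integrated result ever needs to reach the model’s context:

```python
# ELT in a warehouse, then expose the integrated result to AI.
# DuckDB stands in for the warehouse; all tables and values are illustrative.
import duckdb

con = duckdb.connect()  # in-memory stand-in for the central warehouse

# "EL": land raw extracts from each source system without reshaping them.
con.execute("""
    CREATE TABLE raw_crm_accounts AS
    SELECT * FROM (VALUES ('A1', 'Acme', 'dana'),
                          ('A2', 'Globex', 'lee')) AS t(account_id, name, owner)
""")
con.execute("""
    CREATE TABLE raw_invoices AS
    SELECT * FROM (VALUES ('A1', 1200),
                          ('A1', 300),
                          ('A2', 450)) AS t(customer_ref, amount_usd)
""")

# "T": the cross-system join and aggregation happen once, in the warehouse.
revenue_by_owner = con.execute("""
    SELECT a.owner, SUM(i.amount_usd) AS total_invoiced_usd
    FROM raw_crm_accounts a
    JOIN raw_invoices i ON i.customer_ref = a.account_id
    GROUP BY a.owner
    ORDER BY total_invoiced_usd DESC
""").fetchall()

# Only this small, semantically clean result needs to enter the model's context,
# ideally labeled by an ontology or semantic layer on top of the warehouse.
print(revenue_by_owner)  # e.g. [('dana', 1500), ('lee', 450)]
```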

A lot of people don’t want to hear this, because it requires either piling another stack of work onto an already overworked data team, or standing up a data team for the first time when you’ve been able to scrape by without one in the past. These folks want to jump directly to the ELT era (load raw data into AI and let it figure things out). AI makes it easy to think this will work, because it does work on small-scale, narrowly scoped use cases. However, this tendency is a big reason so many enterprise AI projects fail to return real ROI: they perform poorly on real business use cases, and they’re expensive.

Our vision at AstroBee is to make the process of transforming data before it reaches your AI as efficient and affordable as possible from a human-labor standpoint. AstroBee is a platform that builds the centralized, integrated source of truth for you (the ontology). That way, with minimal investment, you can feed your AI projects the data they deserve. Out of the box, AstroBee comes with an analytics agent that works as your enterprise data analyst, but you can plug your data into any other AI workflows you’re building. If you’re interested in working with an early version of AstroBee, just let us know at hello@astrobee.ai!