Embedding pipelines are the new ETL
I’ve seen a lot of promising AI prototypes fall apart after launch, and it’s rarely because the model was bad. More often, the problem starts much earlier: teams treat the data layer as something they can figure out later. They’ll spend weeks fine-tuning prompts, testing models and debating evaluation scores, then throw together the retrieval…