The day infrastructure marched on the martech application layer

The martech infrastructure wars went hot last week as Databricks launched an agentic CDP, CustomerLake, at their Data + AI Summit in San Francisco.

That sure stirred the martech platform pot.

Up to now, I would have categorized Databricks as primarily an infrastructure company, one of the leading data layer platforms along with Snowflake, Google BigQuery, Amazon Redshift, and Microsoft Fabric.

Even though none of these platforms were martech products per se, I’ve been saying for several years that they were driving some of the greatest advances in our industry. Why? Because the biggest technical challenge in martech for forever has been the integration of data across the stack and the organization.

Let’s start there.

He who integrates the data integrates the universe

We’ve tried all sorts of ways to solve this integration problem. Initially by building point-to-point integrations between individual products. That quickly became unmanageable as the number of products in the stack grew. It created n² complexity, where each app needed a bespoke integration to each other app we wanted to work with it. (Spoiler: we simply left most of them unintegrated. Silo City.)

We then evolved to more hub-and-spoke platforms — center-of-gravity applications such as Salesforce and HubSpot, the whole CDP space, and integration-platform-as-a-service (iPaaS) providers such as Zapier and Workato. Integrate an app once to the platform, and it was connected to every other app integrated with that platform. Sort of. The platform and each integration often constrained what could flow between them to a subset of the possible data you might want shared. Some saw your data as their moat, which disincentivized them from being fully open. The architecture was less complex, with only n integration points (more or less). Yet integration pain remained.

With the rise of cloud data warehouses/lakehouses, more and more apps pushed data down into that shared data substrate, as customers demanded to run their own cross-app analytics and machine learning. As apps started to also pull data out — a process unsexily known as reverse ETL — these data platforms acted as a new kind of hub that could pass any data across the stack in a hub-and-spoke pattern.

The real “aha” moment came by asking: what if we didn’t pull the data out, but just worked with it in place, directly in the data warehouse/lakehouse?

That was the spark for the composable CDP, a term coined by Hightouch in 2022. Over the next several years, nearly every vendor in the CDP space adopted some variation of “composable” and “zero-copy” capabilities with data warehouses/lakehouses — often with vehement debate about the proper use of those terms. After all, there had to be egress of some data in order to activate it.

Overall though, this approach represented another leap forward in reducing integration complexity. I hand-wavily estimated it as log n complexity — the shallow blue curve on the graph above. Still some work to be done, but significantly less overhead per app.
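To put rough numbers on those three growth rates, here is a minimal Python sketch. The function names are hypothetical, and the formulas are illustrative assumptions rather than measurements: point-to-point is counted as one integration per pair of apps (n(n-1)/2, which grows as n²), hub-and-spoke as one integration per app, and the shared data layer as log base 2 of n, which simply mirrors the hand-wavy log n estimate above.

```python
import math


def point_to_point(n: int) -> int:
    """Pairwise integrations among n apps: n * (n - 1) / 2, which grows as n^2."""
    return n * (n - 1) // 2


def hub_and_spoke(n: int) -> int:
    """One integration per app to a central platform: grows as n."""
    return n


def shared_data_layer(n: int) -> float:
    """Illustrative stand-in for the log n estimate (base 2 is an assumption)."""
    return math.log2(n) if n > 1 else 0.0


def compare(sizes):
    """Return one row of (n, point-to-point, hub-and-spoke, shared layer) per stack size."""
    return [
        (n, point_to_point(n), hub_and_spoke(n), round(shared_data_layer(n), 1))
        for n in sizes
    ]


if __name__ == "__main__":
    # Print a small comparison table for a few hypothetical stack sizes.
    print(f"{'apps':>6} {'point-to-point':>15} {'hub-and-spoke':>14} {'shared layer':>13}")
    for n, p2p, hub, shared in compare([8, 32, 128]):
        print(f"{n:>6} {p2p:>15} {hub:>14} {shared:>13}")
```

Even at a modest 32 apps, the pairwise count is already in the hundreds while the hub-and-spoke count is just 32, which is why most point-to-point integrations were never built.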

What if the warehouse/lakehouse was the CDP?

Speaking of leaps, it’s not much of one to think you could further optimize the whole system by building a CDP that was fully embedded inside your warehouse/lakehouse platform. The only thing that might hold you back would be the conventional divide between application and infrastructure businesses — and the tensions that crossing that divide would create with the CDP partners in your ecosystem. But honestly, the border between applications and infrastructure was already being overrun.

So when Databricks announced CustomerLake, the reaction from most people was: surprised, not surprised.

From a first look, it is a full-blown CDP that can build customer 360 profiles, create and refine segments, resolve and enrich identities, and power real-time personalization.

What makes it “agentic” is a collection of (1) profile agents that work autonomously to clean and maintain golden customer records as raw data continuously flows through the system and (2) marketing campaign agents that build audiences and campaigns and then autonomously optimize their decisions for individual customers. Databricks calls these self-learning campaigns that can evolve on their own “infinity campaigns.”

(Infinity campaigns is one of the more catchy martech names I’ve heard in a while — sounds like the plot device for a Marvel martech movie.)

But CustomerLake’s superpower is that it’s a native part of the Databricks platform. Naturally, it can directly access data anywhere in the lakehouse, as could any composable CDP. But CustomerLake does so under the umbrella of Unity Catalog, which already manages the governance controls and business semantics for a company’s broader infrastructure. That will be a welcome simplification for organizations seeking to tame data and AI sprawl. One less layer to lasso.

And while I haven’t seen pricing yet, the hint has been that because it is embedded in Databricks and is inherently aligned with their existing data and compute consumption model, it could be offered at a lower incremental cost than a standalone CDP. We’ll see.

But there’s more to this story.

A platform that spans the entire composable canvas

Back in March, I published a research report on The New Martech “Stack” for the AI Age, in partnership with Databricks. Its thesis was that traditional martech stacks could be reimagined with far greater flexibility by intentionally building around the centralized data core of a warehouse/lakehouse platform.

I intentionally discarded the stack metaphor to encourage a more fluid mental model, and instead visualized a composable canvas with rings of responsibility that radiated outward to greater distributed capabilities.

A semantic layer, often an integral part of the data layer itself (e.g., Unity Catalog), wraps the data core. The next ring out is context-as-a-service (CaaS) platforms, which I’ve envisioned as the infrastructure-ish (re)incarnation of domain-specific SaaS platforms. A decisioning ring is likely bundled in with most CaaS platforms, although there are interesting possibilities for independent decisioning services.

In my view, most composable CDPs fit in this model as integrated CaaS and decisioning products, with CustomerLake now the reference example of an embedded one.

The outer ring is the wide Kuiper Belt of all apps and agents — custom and commercial — operating under the gravitational influence of the inner rings.

What struck me over two days of 3-hour keynotes — whew — at the Databricks summit was how their portfolio of data and AI products now spans this entire composable canvas as fully embedded solutions.

At the data core, they announced Lake Transactional/Analytical Processing (LTAP), an architecture that unifies transactions, analytics, streaming, and operational data on a single copy of storage in the lakehouse. This is a big deal. The operational/analytical split in data systems has been a persistent bottleneck for decades. Your operational systems knew what a customer just did. Your analytical systems knew what it meant. Reconciling the two meant data engineering pipelines, replicas, and ETL jobs — each one a place for data to drift, break, or go stale.

With LTAP, the data that applications and agents act on, the data that analytics and AI reason over, and the data that streams through the business in real time can all share the same governed foundation. That shared foundation becomes essential for swarms of agents to act intelligently at speed and scale.

Their Unity Catalog is the semantic layer doing that governing, which includes their Unity AI Gateway to control AI access, spend, and observability across agents, tools, models and MCPs.

And then in the outer ring of apps and agents, Databricks has something for everyone. Agent Bricks is their platform for professional developers. Genie App Builder is their entry into “vibe coding.” Genie Agents lets business users spin up their own agents without any kind of coding at all. All deploy in the same governed canvas, inheriting the semantics of Unity Catalog and standing on the single foundation of LTAP underneath. Apps and agents can multiply on the edge, while governance persists at the center.

This isn’t to say that Databricks will be the only vendor across a company’s composable canvas. Martech will remain a vibrantly heterogeneous environment, most of all in those outer rings. But one vendor spanning the entire canvas can be a powerful gravitational force, especially for the hypertail of custom apps and agents that cry for cohesion.

Closing the Golden Context gap

One other Databricks announcement that stood out for me was Genie Ontology, a self-improving context layer that continuously mines a company's data, docs, apps, and conversations to build a living model of how the business actually works. (I think of this as Glean-ish, albeit starting from the data platform’s center of gravity.)

In our State of Martech 2026 report last month, we spent a whole chapter explaining “context” for marketing and martech. Our overarching framework defined three kinds of context. The customer’s context: what is relevant to them. The company’s context: what matters to the business and how it operates. And systems context: the subset of customer and company context that is actually visible and actionable in our systems.

The perfectly aligned intersection of customer, company, and systems context is what we coined the “Golden Context” — a nod to the golden record, but much more dynamic.

Admittedly, Golden Context is aspirational for most businesses. Typically only a fraction of customer and company context is covered in systems context. A major mission for marketing operations today, in my opinion, is to increase the overlap of systems context in the pursuit of Golden Context.
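One way to make "increasing the overlap" concrete is to treat each kind of context as a set of signals and measure how much of the customer and company context also shows up in systems. The Python sketch below is only an illustration under that assumption: the set model, the function name, and the sample signals are all hypothetical, not definitions from the report.

```python
def systems_coverage(customer: set, company: set, systems: set) -> float:
    """Fraction of customer and company context that is also present in systems context."""
    relevant = customer | company  # everything that matters to either side
    if not relevant:
        return 0.0
    covered = systems & relevant  # the part our systems can actually see and act on
    return len(covered) / len(relevant)


if __name__ == "__main__":
    # Hypothetical signals, for illustration only.
    customer = {"recent purchase", "support ticket", "channel preference", "budget cycle"}
    company = {"pricing policy", "brand guidelines", "quarterly priorities", "approval workflow"}
    systems = {"recent purchase", "support ticket", "pricing policy"}

    print(f"Systems context covers {systems_coverage(customer, company, systems):.0%}")
```

In this toy model, Golden Context would correspond to a coverage of 100%, and the work of marketing operations is moving signals from the first two sets into the third.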

But it’s hard work. While much progress has been made on customer context over the years — the raison d'être of every CDP/CRM product — the very notion of capturing and leveraging company context in a programmable fashion is quite new. It was just back in December that Jaya Gupta and Ashu Garg wrote their call-to-arms for context graphs, a post that launched a thousand startups and incumbent roadmap revisions.

Genie Ontology brings a swath of company context into systems context, accelerating the realization of Golden Context. Indeed, Databricks leaders Tasso Argyros, Ali Ghodsi, and Reynold Xin framed their vision of an agentic CDP this way in their announcement post:

What does this mean for the martech ecosystem?

The death of the CDP has been ~~greatly~~ somewhat exaggerated. Databricks just breathed new life into the category. Some might say they just breathed fire into the category, as they are poised to be a fierce new competitor in the space. But competition drives innovation. In the big picture, I believe this next wave of CDP evolution will be great for marketers and martech.

Many CDP or CDP-ish vendors had already set their sights higher up, on decisioning loops and agentic experiences. Hightouch’s Agentic Marketing Platform. GrowthLoop’s Compound Marketing Engine. Treasure Data’s next chapter as Treasure AI. BlueConic’s acquisition of Blueshift into their Customer Growth Engine.

While Databricks offering their own CDP certainly complicates relationships with some of the partners in their ecosystem — hello, coopetition, my old friend — the Databricks platform as a whole remains wide open. Any CDP can still run on that infrastructure.

More so, I believe there will be many creative blends with other martech products. That Adobe, Bloomreach, Braze, Iterable, and Twilio were all CustomerLake launch partners — who also have their own compelling customer data capabilities — speaks to the multi-faceted reality of martech.

This is an inflection point in the curve, but nowhere near the end of the line.

The bigger disruption, I believe, will be that this opens the door for other infrastructure leaders to cross that app/infra boundary. I have no privileged knowledge, but I bet we’ll see other data and cloud platforms follow with CDP-like solutions of their own. And, I suspect, some will offer other solutions that further encroach on the territory that has been held by pure-play martech application platforms, which ironically have built on top of those infrastructure providers.

So grab your popcorn and keep your options open. It’s sure going to get interesting.

Never a dull moment in martech,

Scott

P.S. By the way, if you haven’t read The New Martech “Stack” for the AI Age yet, you can download a free copy here.