Databricks launches AiChemy multi-agent AI for drug discovery

Databricks has outlined a reference architecture for a multi-agent AI system, named AiChemy, that combines internal enterprise data on its platform with external scientific databases via the Model Context Protocol (MCP) to accelerate drug discovery tasks such as target identification and candidate evaluation.

These early-stage steps are critical in drug development because they help pharma companies determine which biological mechanisms to pursue and which compounds are worth advancing, directly influencing the cost, time, and likelihood of success in later clinical stages.

The multi-agent AI system is built on Databricks components: its Data Intelligence Platform, Delta Lake, and Mosaic AI with Agent Bricks. Together, these manage and govern enterprise data while enabling the creation and orchestration of domain-specific agents and “skills.”

These skills include instructions for querying and summarizing scientific literature, retrieving chemical and molecular data, performing similarity searches across compounds, and synthesizing evidence across sources.
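
As a rough illustration of what a compound similarity search involves, the sketch below ranks a small library against a query using Tanimoto similarity over sets of fingerprint bits. The choice of Tanimoto, the function names, and the toy fingerprints are assumptions made for illustration; the source does not say which similarity measure or fingerprinting method AiChemy uses.

```python
from typing import Dict, List, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto similarity: shared bits divided by total distinct bits."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)


def most_similar(
    query: Set[int], library: Dict[str, Set[int]], top_k: int = 2
) -> List[Tuple[str, float]]:
    """Score every compound against the query and return the top_k matches."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_k]


# Hypothetical fingerprints: each set holds the "on" bit positions of a compound.
LIBRARY: Dict[str, Set[int]] = {
    "compound_a": {1, 2, 3, 4},
    "compound_b": {1, 2, 5},
    "compound_c": {7, 8, 9},
}

if __name__ == "__main__":
    print(most_similar({1, 2, 3}, LIBRARY))
```

In practice the fingerprints would come from a cheminformatics toolkit and the search would run against an indexed store rather than an in-memory dictionary.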

The system combines these agents and skills with external data sources such as OpenTargets, PubMed, and PubChem, accessed via MCP, allowing agents to retrieve and reason over both proprietary and public scientific data.

In doing so, AiChemy brings data access, orchestration, and analysis together in a single, governed environment, which Databricks says will help researchers in pharma companies surface relevant insights from disparate datasets without losing context, in turn accelerating tasks like target identification and candidate evaluation.

Underpinning the entire system is a supervisor agent that coordinates how individual agents and skills are used to fulfill a query.

Databricks describes this supervisor agent not as a prepackaged component, but as a pattern that enterprise teams can implement using its Mosaic AI and Agent Bricks tooling.

Enterprise teams building such a supervisor agent, according to a Databricks blog post, would need to start by defining and implementing domain-specific skills, such as literature search, compound lookup, or data synthesis, and registering them so they can be programmatically invoked.

Developers would then need to configure the supervisor agent with instructions or policies that determine how it selects and sequences these skills in response to a query, including how tasks are decomposed and routed, the company wrote in the blog post.
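
The supervisor pattern described above can be sketched in plain Python: skills are registered under a name so they can be invoked programmatically, and a supervisor applies a routing policy to decide which skills to run and in what order. This is a minimal sketch and not Databricks' implementation; it does not use the Agent Bricks or Mosaic AI APIs, every name in it is hypothetical, and simple keyword matching stands in for the model-driven routing a real supervisor would use.

```python
from typing import Callable, Dict, List, Tuple

# Registry mapping a skill name to a callable (hypothetical, for illustration).
SKILLS: Dict[str, Callable[[str], str]] = {}


def skill(name: str) -> Callable[[Callable[[str], str]], Callable[[str], str]]:
    """Decorator that registers a function as a named skill."""
    def register(fn: Callable[[str], str]) -> Callable[[str], str]:
        SKILLS[name] = fn
        return fn
    return register


@skill("literature_search")
def literature_search(query: str) -> str:
    return f"[literature] placeholder results for: {query}"


@skill("compound_lookup")
def compound_lookup(query: str) -> str:
    return f"[compounds] placeholder records for: {query}"


@skill("data_synthesis")
def data_synthesis(query: str) -> str:
    return f"[synthesis] placeholder summary for: {query}"


# Routing policy: ordered (skill, trigger keywords) pairs. A real supervisor
# would likely let a language model make this decision instead.
ROUTING_POLICY: List[Tuple[str, Tuple[str, ...]]] = [
    ("literature_search", ("paper", "literature", "publication")),
    ("compound_lookup", ("compound", "molecule", "inhibitor")),
]


def plan(query: str) -> List[str]:
    """Decompose a query into an ordered list of skill names."""
    text = query.lower()
    steps = [name for name, words in ROUTING_POLICY if any(w in text for w in words)]
    if len(steps) > 1:
        steps.append("data_synthesis")  # combine evidence from several skills
    return steps or ["literature_search"]  # fallback when nothing matches


def supervise(query: str) -> List[str]:
    """Run the planned skills in sequence and collect their outputs."""
    return [SKILLS[name](query) for name in plan(query)]


if __name__ == "__main__":
    for line in supervise("Find papers on EGFR inhibitors and related compounds"):
        print(line)
```

Keeping the registry separate from the routing policy means new skills can be added without touching the supervisor logic, which is the main appeal of treating the supervisor as a pattern instead of a fixed component.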

This setup is typically tied to enterprise and external data sources via MCP, with access controls and governance applied through Databricks’ platform, it added.
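
For context on what an agent sends over MCP: the protocol is built on JSON-RPC 2.0, and a tool invocation uses the `tools/call` method with the tool's name and arguments. The sketch below builds such a request and applies a simple allowlist check before doing so. The tool name `search_articles`, its arguments, and the allowlist are hypothetical; the allowlist only gestures at access control and is not how Databricks' governance layer works.

```python
import json
from typing import Any, Dict

# Hypothetical allowlist standing in for platform-level access controls.
ALLOWED_TOOLS = {"search_articles"}


def build_tool_call(tool: str, arguments: Dict[str, Any], request_id: int = 1) -> str:
    """Return a JSON-RPC 2.0 request string for an MCP tools/call invocation."""
    if tool not in ALLOWED_TOOLS:
        raise PermissionError(f"tool not permitted: {tool}")
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(request)


if __name__ == "__main__":
    # Tool name and arguments are illustrative, not a real server's schema.
    print(build_tool_call("search_articles", {"query": "EGFR inhibitor resistance"}))
```

A real client would discover the available tools and their argument schemas from the MCP server first, then send the request over the server's transport.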

The AiChemy initiative builds on earlier Databricks efforts in healthcare and drug discovery.

In June 2025, the company partnered with Atropos Health to combine real-world clinical data with its Data Intelligence Platform to support evidence generation and accelerate research workflows.

A month later, in July 2025, it announced a partnership with TileDB focused on integrating multimodal scientific data, such as genomics, imaging, and clinical records, to enable AI-driven analysis for drug discovery and clinical insights.

The AiChemy reference architecture, Databricks said, has been made available through a web application and a GitHub repository, where developers can explore the system and adapt it to their own use cases using its Agent Bricks framework.
