VMware Private AI Agent Builder: Composing Models, Knowledge Bases and Prompts (Private AI Series, Part 15)

Agent Builder in VMware Private AI Services lets you compose a model endpoint, a knowledge base and prompt instructions into a grounded agent. Here is what it actually does, where it sits, and where the agentic hype gets ahead of reality.

by

Dr. Pranay Jha

June 15, 2026

No comments

8 minutes

Read Time

VMware Private AI Series · Part 15 of 24

Call something an "agent" in 2026 and half the room pictures software that books its own travel and rewrites its own goals. That is not what you get when you open Agent Builder in VMware Private AI Services. What you get is more useful and a lot less dramatic: one place to wire a model endpoint to a knowledge base, pin a prompt, and ship a grounded chat backend your users can actually trust.

What Agent Builder actually is

Agent Builder is the composition layer of Private AI Services. Think of it as a router with a save button. An incoming prompt goes first to a knowledge base for retrieval, then to a completions model endpoint that generates the answer from what was retrieved. You supply three ingredients and the service wires them together.

Those three ingredients are a completions model endpoint (served by the Model Runtime and reached through the ML API gateway), an indexed knowledge base (built by the Data Indexing and Retrieval Service and stored in pgvector on Data Services Manager), and your prompt instructions plus a handful of advanced retrieval settings. A playground in the UI gives you a fast loop to swap models, adjust the prompt, and watch the answers change. When the configuration behaves, you save it, and that saved agent becomes the backend your application calls. If you have followed the RAG pipeline and Model Store and Model Runtime parts of this series, Agent Builder is where those pieces finally meet.

Agent Builder composes a model, a knowledge base and a prompt into one saved backend.

Where it sits in the Private AI Services stack

Agent Builder is the last mile, not the whole platform. Underneath it sits the rest of Private AI Services, and every layer has to be in place before the agent on top is worth anything. Skip the indexing work and you get a confident model that knows nothing about your business.

Model Gallery and Model Governance handle importing, validating and sharing approved models. Model Runtime and the ML API gateway run those models as a scalable service so separate teams broker requests against shared GPU capacity. The Data Indexing and Retrieval Service pulls from sources like Confluence, SharePoint, Google Drive and S3, chunks and embeds the documents, and refreshes them on a schedule into a pgvector knowledge base on Data Services Manager. Agent Builder then composes a model and a knowledge base into an agent, and your front end (Open WebUI or your own application) consumes it.

In VCF 9.1 this whole flow moved into the VCF Private AI Services UI inside a namespace. Organization administrators enable Private AI Services through VCF Automation, and end users do the document upload, the indexing jobs, and the agent creation from one console. That matters operationally: the people building agents are no longer waiting on a platform team for every step.

Agent Builder is the top of the stack; everything below it has to exist first.

How a query flows through an agent

Here is what happens on a single question:

The user sends a prompt to the saved agent endpoint.
Agent Builder routes it to the attached knowledge base, where an embedding model and a similarity search in pgvector return the most relevant chunks.
Those chunks are injected as context alongside your prompt instructions.
The completions model generates an answer grounded in that retrieved context.
The response goes back to your application.

The grounding is the entire point. The model answers from your indexed corpus, not from whatever it absorbed during training. That is what makes the output defensible in an enterprise setting, and it is why the knowledge base, not the model, is usually where these projects quietly succeed or fail.

Prompt to retrieval to generation: the path every question takes.

Myth versus reality: this is not autonomous agentic AI

Time for the blunt part. Agent Builder today is a RAG composition and serving tool. It is not an autonomous, multi-step agent framework, and the word "agent" is doing a lot of marketing work across the industry right now.

What it does: route a prompt through retrieval and a completions model, with instructions and access controls attached. What it does not do out of the box: run long tool-calling loops, plan multi-step tasks, or take actions across your systems on its own the way frameworks such as LangGraph or CrewAI advertise. Richer tools and a deeper playground are on the roadmap, and that direction is real, but if you walk in expecting self-directed agents that go off and do things, you will be let down.

The gap between the agentic pitch and what ships in the box.

Dimension	Agent Builder	DIY agent framework
Primary job	Compose RAG chat backends	Orchestrate multi-step agents
Multi-step tool use	Limited today, roadmap item	Core feature
Governance and access	Built in, per knowledge base	You build it yourself
Data grounding	Native, via pgvector KBs	Wire it up yourself
Best fit	Governed chat over private data	Complex autonomous workflows

My take after enough of these conversations: that gap is fine, and arguably a feature. The value most enterprises can actually capture this year is governed, grounded chat over private data, with each business unit pinned to its own knowledge base and its own access scope. That is a solved, shippable problem. Autonomous agents acting across production systems are mostly still a demo, and a governance headache you do not want to inherit yet.

Where it fits, and where it doesn’t

Recommended when you want governed chat backends grounded in private knowledge bases, with the model and the data staying inside your VCF estate, and when you want different teams composing their own agents over their own slices of the corpus without data bleeding across domains. That last point, fine-grained knowledge-base access per agent, is the quietly valuable part and the reason to prefer Agent Builder over a bolted-on open-source stack.

Not the right tool when you need complex multi-agent orchestration, long tool-calling chains, or framework-level customization today. In that case, serve the model through NIM or the Model Runtime and build the agent layer yourself, then point it at the same knowledge bases. You keep the platform benefits without forcing a framework problem into a tool that was not built for it.

Validate three things before you promise anything to a business unit. First, that your indexing and refresh policy actually keep the knowledge base current, because a stale corpus is the silent failure mode here and nobody notices until an agent confidently cites last quarter’s policy. Second, that the completions endpoint is sized for real concurrency, not a single demo user. Third, that access controls map cleanly to your tenancy model so one team’s agent cannot read another team’s documents.

My Take

Agent Builder is the least flashy and most practical part of Private AI Services. It will not hand you an autonomous workforce. It will let a platform team stand up grounded, access-controlled AI chat over private data in an afternoon, which is what most organizations actually need first. Treat it as the composition and serving layer it is, keep the agentic ambitions on a separate roadmap, and you will ship something useful instead of something impressive in a slide. Are you building agents on Private AI Services yet, or still stuck arguing about what "agent" even means in your shop?

References

VMware Private AI Series · Part 15 of 30
« Previous: Part 14 | VMware Private AI Complete Guide | Next: Part 16 »

About The Author

Dr. Pranay Jha

Dr. Pranay Jha is a Cloud and AI Consultant with 18+ years of experience in hybrid cloud, virtualization, and enterprise infrastructure transformation. He specializes in VMware technologies, multi-cloud strategy, and Generative AI solutions. He holds a PhD in Computer Applications with research focused on Cloud and AI, has published multiple research papers, and has been a VMware vExpert since 2016 and a VMUG Community Leader.

See author's posts