What VMware Private AI Foundation with NVIDIA Actually Is (Private AI Series, Part 1)

Part 1 of the VMware Private AI Series: a clear, opinionated explainer of what PAIF really is, what it is not, the components that do the work, and when it earns its license on VCF 9.1.

by

Dr. Pranay Jha

June 15, 2026

No comments

8 minutes

Read Time

VMware Private AI Series · Part 1 of 24

TL;DR · Key Takeaways

PAIF is not a product you install and it is not an AI model. It is an add-on SKU layered on VMware Cloud Foundation that turns GPUs into a governed, self-service resource.
The value is the integration: vGPU lifecycle, model governance, and automated provisioning. The NVIDIA software underneath you could technically run yourself.
In VCF 9.1, VMs and Kubernetes nodes can get exclusive GPU access without an NVIDIA AI Enterprise license. That changes the cost math for plain inference.
PAIF earns its license when multiple teams compete for GPUs and you have a governance or compliance requirement, not when one team has two cards.

Ask ten VMware admins what VMware Private AI Foundation with NVIDIA is and you will get three answers that are all slightly wrong. Some think it is a chatbot Broadcom ships. Some think it is a model. Some think it is a separate appliance you rack next to vCenter. None of those are right, and the confusion is costing teams real money on licensing decisions they make before they understand what they are buying.

This is Part 1 of a 24-part series. Before we deploy anything, size anything, or argue about vGPU profiles, it is worth being precise about what this platform actually is. So let us be precise.

What PAIF actually is, in one sentence

VMware Private AI Foundation with NVIDIA (PAIF) is an add-on to VMware Cloud Foundation that bundles NVIDIA AI Enterprise with a set of VMware automation, governance, and lifecycle capabilities, so your existing private cloud can provision and operate GPU-accelerated AI workloads as a self-service platform. That is the whole thing. It is VCF, plus the NVIDIA software stack, plus the glue that makes GPUs behave like any other VCF resource.

Read that again and notice what is missing: there is no model in the definition. PAIF does not give you an LLM. It gives you the infrastructure to run the models you choose, on hardware you own, inside your own data center.

PAIF is the integration of three layers you can already reason about, not a new black box.

Where it sits in the stack

The fastest way to stop misunderstanding PAIF is to place every piece on a layer diagram. Nothing in PAIF floats free. Every capability lands on a layer you already operate, from the GPU in the host all the way up to the application a developer ships.

PAIF adds the top three layers and licenses them. The two grey layers are your responsibility going in.

The components, and which vendor does the work

When people say PAIF is confusing, what they usually mean is that the component names blur together. Here is the honest split of who owns what. Notice that the AI runtime is almost entirely NVIDIA, and VMware owns the parts that make it consumable on a shared platform.

Component	What it does	Whose code
Deep Learning VMs	Prebuilt GPU VM images for model development and serving	VMware image, NVIDIA stack
NIM microservices	Containerized, API-standard inference for optimized models	NVIDIA
GPU Operator and vGPU	Driver lifecycle and GPU sharing across VMs and pods	NVIDIA
Model Store and Runtime	Curate, govern, and serve approved models as endpoints	VMware (Private AI Services)
Data Services Manager	Managed Postgres with pgvector for RAG retrieval	VMware
VCF Automation and Operations	Self-service catalog, namespaces, GPU monitoring and cost	VMware

If you want the long-form question and answer treatment of these pieces, I covered a lot of it in the VCF 9 Private AI FAQ. This series goes deeper on each component in its own part.

What it is not (where the myths come from)

Most of the bad PAIF decisions I see trace back to one of four myths. Putting the myth next to the reality clears up a surprising amount.

Four myths that drive most of the wrong scoping calls, paired with what is actually true.

My take: the chatbot myth is the expensive one. Teams scope PAIF expecting a finished assistant, then act surprised when month two is still about driver versions and namespace quotas. PAIF removes the platform toil. It does not remove the work of building the actual AI application on top, and anyone selling it that way is overselling.

How a model actually gets served

The clearest way to feel what PAIF buys you is to follow one request from a developer asking for a model to a running, governed endpoint. Without PAIF, several of these steps are tickets to the virtualization team. With it, they are catalog actions.

The self-service path. Each step that used to be a ticket becomes a governed, monitored catalog action.

What changed in VCF 9.1

Broadcom announced VCF 9.1 on 5 May 2026, and it matters for this definition in two ways. First, the platform now supports the NVIDIA HGX platform with Blackwell GPUs, NVLink Switch, ConnectX-7 NICs, and BlueField-3 with Enhanced DirectPath I/O, which brings GPUDirect RDMA and GPUDirect Storage for multi-host training and high-throughput inference. That is a real expansion of what PAIF can host, not a cosmetic bump.

Second, and more interesting for cost, VMs and Kubernetes cluster nodes can now get high-performance, exclusive access to GPU resources without an NVIDIA AI Enterprise license. If your use case is straight inference with exclusive GPUs and you are not depending on the NVAIE software catalog, that quietly removes a line item. I would still validate it against your exact models, because the moment you want NIM containers and NVAIE support, the license is back in scope. Private AI Services 2.1, released 20 March 2026, also moved enablement and lifecycle of these services into the VCF Automation UI at the namespace level, which is what makes the self-service flow above real rather than aspirational.

When PAIF earns its license

Here is the part the brochures skip. PAIF is recommended when you have multiple teams competing for a shared GPU estate and a real governance or data-residency requirement. That is exactly the situation where self-service provisioning, model governance, and per-namespace GPU accounting pay for themselves, because the alternative is a queue of tickets and a spreadsheet nobody trusts.

It is not the right call when one team owns two GPUs and ships one model. At that scale the integration tax is higher than the toil it removes, and a Deep Learning VM or a plain GPU passthrough host with open-source serving will get you there faster and cheaper. The assumption to validate before you buy: do you genuinely have multi-tenant GPU contention and a governance mandate, or do you have one workload and a hope that more are coming? Buy for the first. Do not buy for the second. If you want the hands-on view of standing this up, the deployment walkthrough shows the actual effort involved, and my introduction to VMware Private AI covers the broader why.

The Bottom Line

VMware Private AI Foundation with NVIDIA is the integration layer that makes GPUs a first-class, governed, self-service resource on the private cloud you already run. It is not a model, not an app, and not a new appliance. Get that straight and every later decision in this series, from GPU selection to RAG pipelines, gets easier. What is the first thing you want PAIF to do in your environment: serve one model fast, or give five teams safe self-service? Your answer decides most of what follows.

References

VMware Private AI Series · Part 1 of 30
VMware Private AI Complete Guide | Next: Part 2 »

About The Author

Dr. Pranay Jha

Dr. Pranay Jha is a Cloud and AI Consultant with 18+ years of experience in hybrid cloud, virtualization, and enterprise infrastructure transformation. He specializes in VMware technologies, multi-cloud strategy, and Generative AI solutions. He holds a PhD in Computer Applications with research focused on Cloud and AI, has published multiple research papers, and has been a VMware vExpert since 2016 and a VMUG Community Leader.

Dr. Pranay Jha