Dr. Pranay Jha


What VMware Private AI Foundation with NVIDIA Actually Is (Private AI Series, Part 1)

Part 1 of the VMware Private AI Series: a clear, opinionated explainer of what PAIF really is, what it is not, the components that do the work, and when it earns its license on VCF 9.1.

VMware Private AI Series · Part 1 of 24

TL;DR · Key Takeaways

  • PAIF is not a product you install and it is not an AI model. It is an add-on SKU layered on VMware Cloud Foundation that turns GPUs into a governed, self-service resource.
  • The value is the integration: vGPU lifecycle, model governance, and automated provisioning. The NVIDIA software underneath you could technically run yourself.
  • In VCF 9.1, VMs and Kubernetes nodes can get exclusive GPU access without an NVIDIA AI Enterprise license. That changes the cost math for plain inference.
  • PAIF earns its license when multiple teams compete for GPUs and you have a governance or compliance requirement, not when one team has two cards.

Ask ten VMware admins what VMware Private AI Foundation with NVIDIA is and you will get three answers that are all slightly wrong. Some think it is a chatbot Broadcom ships. Some think it is a model. Some think it is a separate appliance you rack next to vCenter. None of those are right, and the confusion is costing teams real money on licensing decisions they make before they understand what they are buying.

This is Part 1 of a 24-part series. Before we deploy anything, size anything, or argue about vGPU profiles, it is worth being precise about what this platform actually is. So let us be precise.

What PAIF actually is, in one sentence

VMware Private AI Foundation with NVIDIA (PAIF) is an add-on to VMware Cloud Foundation that bundles NVIDIA AI Enterprise with a set of VMware automation, governance, and lifecycle capabilities, so your existing private cloud can provision and operate GPU-accelerated AI workloads as a self-service platform. That is the whole thing. It is VCF, plus the NVIDIA software stack, plus the glue that makes GPUs behave like any other VCF resource.

Read that again and notice what is missing: there is no model in the definition. PAIF does not give you an LLM. It gives you the infrastructure to run the models you choose, on hardware you own, inside your own data center.

[Figure: PAIF is a sum, not a product. Three things you already understand, integrated into one self-service platform: VMware Cloud Foundation (vSphere, vSAN, NSX, VKS) + NVIDIA AI Enterprise (NIM, GPU Operator, vGPU) + VMware glue (automation, governance, model store, monitoring) = PAIF, one platform. You already own VCF. PAIF licenses the NVIDIA AI Enterprise and VMware glue pieces, and wires them together.]
PAIF is the integration of three layers you can already reason about, not a new black box.

Where it sits in the stack

The fastest way to stop misunderstanding PAIF is to place every piece on a layer diagram. Nothing in PAIF floats free. Every capability lands on a layer you already operate, from the GPU in the host all the way up to the application a developer ships.

[Figure: Where each piece lives. Bottom is hardware you rack, top is what a developer consumes. Layers, top to bottom:]

  • AI apps (what the business actually sees): RAG assistants, agents, custom inference clients
  • AI services (Private AI Services 2.1): Model Store, Model Runtime, vector DB, Agent Builder
  • GPU enablement (NVIDIA AI Enterprise): NIM, GPU Operator, vGPU drivers, DLVMs
  • VCF platform (the private cloud you already run): vSphere, vSAN, NSX, VKS, Automation, Operations
  • Hardware (certified servers and NICs): GPU hosts, HGX with Blackwell, ConnectX-7, BlueField-3

In the figure, the top three layers are red (what the PAIF SKU adds on top of your VCF) and the bottom two are grey (prerequisites you own and operate already).
PAIF adds the top three layers and licenses them. The two grey layers are your responsibility going in.

The components, and which vendor does the work

When people say PAIF is confusing, what they usually mean is that the component names blur together. Here is the honest split of who owns what. Notice that the AI runtime is almost entirely NVIDIA, and VMware owns the parts that make it consumable on a shared platform.

Component | What it does | Whose code
Deep Learning VMs | Prebuilt GPU VM images for model development and serving | VMware image, NVIDIA stack
NIM microservices | Containerized, API-standard inference for optimized models | NVIDIA
GPU Operator and vGPU | Driver lifecycle and GPU sharing across VMs and pods | NVIDIA
Model Store and Runtime | Curate, govern, and serve approved models as endpoints | VMware (Private AI Services)
Data Services Manager | Managed Postgres with pgvector for RAG retrieval | VMware
VCF Automation and Operations | Self-service catalog, namespaces, GPU monitoring and cost | VMware
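
If it helps to keep the split straight, the table above can be held as a small lookup and grouped by owner. This is only a sketch: the dictionary is transcribed from the table, the owner labels are shortened for this example ("VMware + NVIDIA" stands in for "VMware image, NVIDIA stack"), and the function name is hypothetical rather than anything that ships with PAIF.

```python
from collections import defaultdict

# Component -> whose code, transcribed from the table above.
# Owner labels are shortened for this sketch; they are not official terms.
COMPONENT_OWNER = {
    "Deep Learning VMs": "VMware + NVIDIA",
    "NIM microservices": "NVIDIA",
    "GPU Operator and vGPU": "NVIDIA",
    "Model Store and Runtime": "VMware",
    "Data Services Manager": "VMware",
    "VCF Automation and Operations": "VMware",
}


def components_by_owner(owners):
    """Group component names under the vendor whose code does the work."""
    grouped = defaultdict(list)
    for component, owner in owners.items():
        grouped[owner].append(component)
    return dict(grouped)


if __name__ == "__main__":
    for owner, components in components_by_owner(COMPONENT_OWNER).items():
        print(f"{owner}: {', '.join(components)}")
```

Grouped this way, the point of the table is easy to see: the inference runtime sits under NVIDIA, and the pieces that make it consumable on a shared platform sit under VMware.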

If you want the long-form question and answer treatment of these pieces, I covered a lot of it in the VCF 9 Private AI FAQ. This series goes deeper on each component in its own part.


What it is not (where the myths come from)

Most of the bad PAIF decisions I see trace back to one of four myths. Putting the myth next to the reality clears up a surprising amount.

[Figure: Myth vs reality. The four assumptions that lead to wrong licensing and scoping calls:]

  • Myth: PAIF ships a ready-made chatbot. Reality: It is the platform to build and serve your own assistants. No app included.
  • Myth: It is a separate appliance to rack. Reality: It is an add-on to the VCF you run. No new control plane to rack.
  • Myth: It includes the models you need. Reality: You pick and import models. Some NIM and Nemotron come via NVAIE.
  • Myth: You always need an NVAIE license. Reality: In 9.1, exclusive GPU access can run without an NVAIE license.
Four myths that drive most of the wrong scoping calls, paired with what is actually true.

My take: the chatbot myth is the expensive one. Teams scope PAIF expecting a finished assistant, then act surprised when month two is still about driver versions and namespace quotas. PAIF removes the platform toil. It does not remove the work of building the actual AI application on top, and anyone selling it that way is overselling.

How a model actually gets served

The clearest way to feel what PAIF buys you is to follow one request from a developer asking for a model to a running, governed endpoint. Without PAIF, several of these steps are tickets to the virtualization team. With it, they are catalog actions.

[Figure: Request to endpoint, self-service. The path PAIF turns from a ticket queue into a catalog click:]

  1. Dev requests a model
  2. Catalog grants GPU namespace
  3. Approved model from Model Store
  4. NIM serves on vGPU or MIG
  5. Governed API endpoint live

VCF Operations watches GPU use and cost across all five steps. That visibility is part of the product.
The self-service path. Each step that used to be a ticket becomes a governed, monitored catalog action.
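
To make the five steps concrete, here is a minimal sketch of the flow as plain Python. Everything in it is hypothetical: the model name, the namespace, the quota table, and the endpoint URL are invented for illustration, and none of this is a real VCF Automation or Private AI Services API. It only shows where the two governance gates (GPU quota and model approval) sit in the path.

```python
from dataclasses import dataclass, field

# Hypothetical stand-ins, not real PAIF objects.
APPROVED_MODELS = {"example-llm-8b"}   # plays the role of the Model Store
NAMESPACE_GPU_QUOTA = {"team-a": 2}    # GPUs each namespace may hold


@dataclass
class ModelRequest:
    namespace: str
    model: str
    gpus: int
    log: list = field(default_factory=list)


def serve(req: ModelRequest) -> str:
    """Walk one request through the five steps and return its endpoint."""
    # Step 1: the developer asks for a model.
    req.log.append("requested")

    # Step 2: the catalog grants GPU capacity only within the namespace quota.
    if req.gpus > NAMESPACE_GPU_QUOTA.get(req.namespace, 0):
        raise PermissionError("GPU quota exceeded for namespace")
    req.log.append("namespace granted")

    # Step 3: only models approved in the store may be served.
    if req.model not in APPROVED_MODELS:
        raise LookupError("model is not approved")
    req.log.append("model approved")

    # Step 4: the runtime (NIM in the article) would start on the granted GPUs.
    req.log.append("runtime started")

    # Step 5: the governed endpoint goes live (URL format is invented).
    req.log.append("endpoint live")
    return f"https://{req.namespace}.example.internal/v1/models/{req.model}"


if __name__ == "__main__":
    request = ModelRequest(namespace="team-a", model="example-llm-8b", gpus=1)
    print(serve(request))
    print(request.log)
```

The design point is that both refusals happen before any GPU is touched. In a ticket-driven shop those checks are a human reading a queue; in the self-service path they are policy evaluated at request time.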

What changed in VCF 9.1

Broadcom announced VCF 9.1 on 5 May 2026, and it matters for this definition in two ways. First, the platform now supports the NVIDIA HGX platform with Blackwell GPUs, NVLink Switch, ConnectX-7 NICs, and BlueField-3 with Enhanced DirectPath I/O, which brings GPUDirect RDMA and GPUDirect Storage for multi-host training and high-throughput inference. That is a real expansion of what PAIF can host, not a cosmetic bump.

Second, and more interesting for cost, VMs and Kubernetes cluster nodes can now get high-performance, exclusive access to GPU resources without an NVIDIA AI Enterprise license. If your use case is straight inference with exclusive GPUs and you are not depending on the NVAIE software catalog, that quietly removes a line item. I would still validate it against your exact models, because the moment you want NIM containers and NVAIE support, the license is back in scope. Private AI Services 2.1, released 20 March 2026, also moved enablement and lifecycle of these services into the VCF Automation UI at the namespace level, which is what makes the self-service flow above real rather than aspirational.
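
The licensing logic in that paragraph is small enough to write down. The sketch below encodes only what is stated above, plus one assumption that is mine and not from Broadcom or NVIDIA documentation: any GPU access that is not exclusive is treated as keeping the NVAIE license in scope. The function and parameter names are hypothetical, and this is a scoping aid, not licensing advice.

```python
def nvaie_license_in_scope(
    exclusive_gpu_access: bool,
    uses_nim_containers: bool,
    needs_nvaie_support: bool,
) -> bool:
    """Return True if an NVIDIA AI Enterprise license is likely in scope."""
    # From the article: wanting NIM containers or NVAIE support
    # puts the license back in scope, whatever the access mode.
    if uses_nim_containers or needs_nvaie_support:
        return True

    # From the article: in VCF 9.1, exclusive GPU access for VMs and
    # Kubernetes nodes can run without an NVAIE license.
    # ASSUMPTION (not from the article): non-exclusive access is
    # conservatively treated as still needing the license.
    return not exclusive_gpu_access


if __name__ == "__main__":
    # Plain inference on exclusive GPUs with an open-source serving stack.
    print(nvaie_license_in_scope(True, False, False))
    # Same hardware, but the team wants NIM containers.
    print(nvaie_license_in_scope(True, True, False))
```

Run it against each workload you plan to host, not against the cluster as a whole. One workload that needs NIM is enough to bring the line item back.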

When PAIF earns its license

Here is the part the brochures skip. PAIF earns its license when you have multiple teams competing for a shared GPU estate and a real governance or data-residency requirement. That is exactly the situation where self-service provisioning, model governance, and per-namespace GPU accounting pay for themselves, because the alternative is a queue of tickets and a spreadsheet nobody trusts.

It is not the right call when one team owns two GPUs and ships one model. At that scale the integration tax is higher than the toil it removes, and a Deep Learning VM or a plain GPU passthrough host with open-source serving will get you there faster and cheaper. The assumption to validate before you buy: do you genuinely have multi-tenant GPU contention and a governance mandate, or do you have one workload and a hope that more are coming? Buy for the first. Do not buy for the second. If you want the hands-on view of standing this up, the deployment walkthrough shows the actual effort involved, and my introduction to VMware Private AI covers the broader why.
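
The buying rule above fits in a few lines. This is a sketch of my rule of thumb, not a sizing tool: the field names are hypothetical, and the threshold of two teams is simply the smallest number that counts as "multiple". Note that the hoped-for teams are recorded and then deliberately ignored.

```python
from dataclasses import dataclass


@dataclass
class GpuEstate:
    teams_using_gpus_today: int            # teams contending for GPUs right now
    teams_hoped_for: int                   # pipeline optimism, ignored on purpose
    governance_or_residency_mandate: bool  # a real requirement, not a preference


def paif_earns_its_license(estate: GpuEstate) -> bool:
    """Buy for contention plus a mandate that exist today, not for hope."""
    multi_tenant_contention = estate.teams_using_gpus_today >= 2
    return multi_tenant_contention and estate.governance_or_residency_mandate


if __name__ == "__main__":
    shared = GpuEstate(teams_using_gpus_today=5, teams_hoped_for=0,
                       governance_or_residency_mandate=True)
    hopeful = GpuEstate(teams_using_gpus_today=1, teams_hoped_for=6,
                        governance_or_residency_mandate=True)
    print(paif_earns_its_license(shared))
    print(paif_earns_its_license(hopeful))
```

The second case is the one to watch: one team today and six on a roadmap still evaluates to "do not buy yet".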

The Bottom Line

VMware Private AI Foundation with NVIDIA is the integration layer that makes GPUs a first-class, governed, self-service resource on the private cloud you already run. It is not a model, not an app, and not a new appliance. Get that straight and every later decision in this series, from GPU selection to RAG pipelines, gets easier. What is the first thing you want PAIF to do in your environment: serve one model fast, or give five teams safe self-service? Your answer decides most of what follows.


About the Author

Dr. Pranay Jha is a Cloud and AI Consultant with 18+ years of experience in hybrid cloud, virtualization, and enterprise infrastructure transformation. He specializes in VMware technologies, multi-cloud strategy, and Generative AI solutions. He holds a PhD in Computer Applications with research focused on Cloud and AI, has published multiple research papers, and has been a VMware vExpert since 2016 and a VMUG Community Leader.
