Category: VMware & Cloud
-
Building a RAG Pipeline on VMware Private AI: 7 Failures That Quietly Break Retrieval (Private AI Series, Part 14)
Most RAG failures on VMware Private AI Foundation are not the LLM's fault. Here are the seven pipeline failures that quietly wreck retrieval quality on PAIF 9, and how I fix each one in the field.
-
Vector Databases in VMware Private AI: Running pgvector on Data Services Manager (Private AI Series, Part 13)
A reference-architecture look at the retrieval tier of VMware Private AI: where DSM-managed PostgreSQL with pgvector sits, how to place and size it, and whether to index with HNSW or IVFFlat.
-
VMware Private AI Services: Deploying Models with the Model Store and Model Runtime (Private AI Series, Part 12)
A hands-on runbook for Private AI Services 2.1: stand up a Harbor model gallery, validate and push models with the vcf pais CLI, then serve them as endpoints through Model Runtime and the ML API Gateway.
-
NVIDIA NIM Microservices on VMware Private AI: The Model-Serving Layer Explained (Private AI Series, Part 11)
NVIDIA NIM is the model-serving layer of VMware Private AI. A reference-architecture look at the NIM Operator, NIMCache and NIMService, GPU placement, and the design choices that decide whether your endpoints survive production.
-
Deep Learning VMs in VMware Private AI Foundation: The Data Scientist Workbench (Private AI Series, Part 10)
What a Deep Learning VM in VMware Private AI Foundation actually is, how the image is built, the first-boot steps that quietly break deployments, and when to move off it to a VKS cluster.
-
Installing the NVIDIA GPU Operator and vGPU Drivers for VMware Private AI Foundation (Private AI Series, Part 9)
A practical runbook for installing the NVIDIA GPU Operator and matching vGPU host and guest drivers on VMware Private AI Foundation, with the validation checks and version-skew traps that decide whether GPUs actually schedule.
-
Prepare a GPU Workload Domain for VMware Private AI Foundation (Private AI Series, Part 8)
A field-tested, bottom-up procedure for standing up a GPU-accelerated workload domain on VCF 9.0 for Private AI Foundation: firmware, the vLCM vGPU driver, Shared Direct, a single-zone Supervisor, and the mistakes that actually bite.
-
VMware Private AI Reference Architecture and Sizing: A Practical Blueprint (Private AI Series, Part 7)
How to size a VMware Private AI Foundation build the right way: two-domain design, choosing the deployment model, and working from workload back to GPU hosts and BOM on VCF 9.1.
-
GPU Partitioning for VMware Private AI: Choosing Between vGPU, MIG and Passthrough (Private AI Series, Part 6)
Time-sliced vGPU, MIG-backed vGPU, GPU passthrough and the new ESXi 9 Update 1 hybrid mode each fit different Private AI workloads. Here is how to design the split, with a capability matrix and a reference topology.
-
Choosing the Right GPU for VMware Private AI: L40S vs H100 vs H200 vs Blackwell (Private AI Series, Part 5)
A field-tested guide to picking the GPU for VMware Private AI Foundation: how L40S, H100, H200, RTX PRO 6000 Blackwell and A100 compare, why form factor beats the model name, and a clear verdict on which to choose for RAG, inference or training.
-
VMware Private AI Foundation Planning and Prerequisites: GPU Hosts, Drivers and Readiness (Private AI Series, Part 4)
A practitioner’s planning guide for VMware Private AI Foundation with NVIDIA on VCF 9: GPU host selection, the vGPU driver and GPU Operator interoperability matrix, sharing-mode choices, and the readiness checks that decide whether your first deployment lands clean.
-
VMware Private AI Foundation Licensing: VCF Add-On vs NVIDIA AI Enterprise (Private AI Series, Part 3)
Private AI Foundation is three licenses, not one: VCF per core, the PAIF add-on per core, and NVIDIA AI Enterprise per GPU. Here is how they stack, what bundles with your GPUs, and the verdict on subscription vs perpetual.
About the Author

Dr Pranay Jha
Dr. Pranay Jha is a Cloud and AI Consultant with 18+ years of experience in hybrid cloud, virtualization, and enterprise infrastructure transformation. He specializes in VMware technologies, multi-cloud strategy, and Generative AI solutions. He holds a PhD in Computer Applications with research focused on Cloud and AI, has published multiple research papers, and has been a VMware vExpert since 2016 and a VMUG Community Leader.
