Category: VMware & Cloud
-
VMware Private AI Sizing and Cost: GPU Memory Math, Capacity Planning and TCO (Private AI Series, Part 18)
How to size a VMware Private AI platform from the workload up: GPU memory math, the KV cache trap, a model-to-card matrix, and the four-layer cost model that actually decides the business case.
-
GPU Monitoring with VCF Operations for VMware Private AI: The Signals That Actually Catch a Failing Workload (Private AI Series, Part 17)
VCF Operations gives you GPU dashboards out of the box, but the metric most teams trust is the one that lies. Here is what to watch on a Private AI Foundation estate, why GPU utilization misleads, and the hardware-health signals the default dashboards never surface.
-
Self-Service AI Catalog Items with VCF Automation for VMware Private AI (Private AI Series, Part 16)
How to publish self-service GPU catalog items for VMware Private AI Foundation with the VCF Automation Quickstart, plus the namespace, vGPU class and quota bindings that decide whether the catalog is safe to hand out.
-
VMware Private AI Agent Builder: Composing Models, Knowledge Bases and Prompts (Private AI Series, Part 15)
Agent Builder in VMware Private AI Services lets you compose a model endpoint, a knowledge base and prompt instructions into a grounded agent. Here is what it actually does, where it sits, and where the agentic hype gets ahead of reality.
-
Building a RAG Pipeline on VMware Private AI: 7 Failures That Quietly Break Retrieval (Private AI Series, Part 14)
Most RAG failures on VMware Private AI Foundation are not the LLM. Here are the seven pipeline failures that quietly wreck retrieval quality on PAIF 9, and how I fix each one in the field.
-
Vector Databases in VMware Private AI: Running pgvector on Data Services Manager (Private AI Series, Part 13)
A reference-architecture look at the retrieval tier of VMware Private AI: where DSM-managed PostgreSQL with pgvector sits, how to place and size it, and whether to index with HNSW or IVFFlat.
-
VMware Private AI Services: Deploying Models with the Model Store and Model Runtime (Private AI Series, Part 12)
A hands-on runbook for Private AI Services 2.1: stand up a Harbor model gallery, validate and push models with the vcf pais CLI, then serve them as endpoints through Model Runtime and the ML API Gateway.
-
NVIDIA NIM Microservices on VMware Private AI: The Model-Serving Layer Explained (Private AI Series, Part 11)
NVIDIA NIM is the model-serving layer of VMware Private AI. A reference-architecture look at the NIM Operator, NIMCache and NIMService, GPU placement, and the design choices that decide whether your endpoints survive production.
-
Deep Learning VMs in VMware Private AI Foundation: The Data Scientist Workbench (Private AI Series, Part 10)
What a Deep Learning VM in VMware Private AI Foundation actually is, how the image is built, the first-boot steps that quietly break deployments, and when to move off it to a VKS cluster.
-
Installing the NVIDIA GPU Operator and vGPU Drivers for VMware Private AI Foundation (Private AI Series, Part 9)
A practical runbook for installing the NVIDIA GPU Operator and matching vGPU host and guest drivers on VMware Private AI Foundation, with the validation checks and version-skew traps that decide whether GPUs actually schedule.
-
Prepare a GPU Workload Domain for VMware Private AI Foundation (Private AI Series, Part 8)
A field-tested, bottom-up procedure for standing up a GPU-accelerated workload domain on VCF 9.0 for Private AI Foundation: firmware, the vLCM vGPU driver, Shared Direct, a single-zone Supervisor, and the mistakes that actually bite.
-
VMware Private AI Reference Architecture and Sizing: A Practical Blueprint (Private AI Series, Part 7)
How to size a VMware Private AI Foundation build the right way: two-domain design, choosing the deployment model, and working from workload back to GPU hosts and BOM on VCF 9.1.
About the Author

Dr. Pranay Jha
Dr. Pranay Jha is a Cloud and AI Consultant with 18+ years of experience in hybrid cloud, virtualization, and enterprise infrastructure transformation. He specializes in VMware technologies, multi-cloud strategy, and Generative AI solutions. He holds a PhD in Computer Applications with research focused on Cloud and AI, has published multiple research papers, and has been a VMware vExpert since 2016 and a VMUG Community Leader.
