Dr. Pranay Jha

VMware • Cloud • AI • Enterprise Architecture

Formerly VMware Insight & Cloud Pathshala
What began over a decade ago as a passion for sharing knowledge has evolved into a unified platform for Enterprise AI, VMware, Cloud Architecture, Research, and Modern Infrastructure.

VMware Private AI Foundation with NVIDIA: The Complete Guide (Private AI Series)

Everything you need to plan, deploy, operate, secure and optimize VMware Private AI Foundation with NVIDIA on VCF 9, in one complete, sequential series. Thirty parts, from first principles to production day-2 and the field topics that come after. Start at Part 1, follow a reading path below, or jump to the part you need.

Series complete · 30 of 30 parts published
Reading paths: where to start by role

Architect

Design the platform end to end: concepts, reference architecture and the trade-offs that bite.

Start: Part 1 · Part 2 · Part 7 · Part 26 · Part 29

Operator (Day 2)

Run it in production: deploy, monitor, troubleshoot and upgrade without breaking the GPUs.

Start: Part 8 · Part 9 · Part 17 · Part 23 · Part 24

Data scientist / ML

Build on the platform: serve models, retrieve, customize and ship agents safely.

Start: Part 10 · Part 12 · Part 14 · Part 27 · Part 28

Decision map: jump to the call you need to make
vGPU, MIG or passthrough? Part 6: GPU partitioning
How do I serve and scale models? Part 25: NIM Operator reference architecture
Fine-tune or use RAG? Part 27: NeMo Customizer
Is Private AI even the right platform? Part 30: PAIF vs OpenShift AI vs hyperscaler
Core Path: the end-to-end spine
Phase 1 · Foundations
  1. 01 · What VMware Private AI Foundation with NVIDIA actually is
  2. 02 · Architecture & components, end to end
  3. 03 · Licensing & editions: PAIF add-on and NVIDIA AI Enterprise
Phase 2 · Plan & Design
  1. 04 · Planning & prerequisites before you touch a GPU
  2. 05 · GPU hardware selection for Private AI
  3. 06 · vGPU, MIG and passthrough: how to slice a GPU
  4. 07 · Reference architecture & sizing a first cluster
Phase 3 · Deploy
  1. 08 · Preparing the VCF workload domain for AI workloads
  2. 09 · Installing the NVIDIA GPU Operator & vGPU drivers
  3. 10 · Deep Learning VMs: provisioning & model development
  4. 11 · Deploying NIM microservices for inference
Phase 4 · AI Services & Apps
  1. 12 · Private AI Services: Model Store & Model Runtime
  2. 13 · Vector database with Data Services Manager (pgvector)
  3. 14 · Building a RAG pipeline on Private AI Foundation
  4. 15 · Agent Builder & agentic applications
  5. 16 · Self-service provisioning with VCF Automation
Deep Dives: specialized tracks
Phase 5 · Operate (Day 2)
  1. 17 · GPU monitoring & observability in VCF Operations
  2. 18 · Sizing, cost & capacity planning for inference
Phase 6 · Secure & Disconnected
  1. 19 · Air-gapped & disconnected deployments (AMT)
  2. 20 · Security, governance & data privacy for Private AI
Phase 7 · Optimize & Master
  1. 21 · Performance tuning & benchmarking inference
  2. 22 · Model lifecycle & MLOps on private cloud
  3. 23 · Troubleshooting VMware Private AI Foundation
  4. 24 · Upgrades, 9.0 to 9.1, and what’s next
Extended Field Topics: from the engagements
Phase 8 · Field Topics
  1. 25 · NVIDIA NIM Operator: declarative model serving by reference architecture
  2. 26 · Networking for AI workloads: segmentation, ingress and east-west
  3. 27 · Fine-tuning with NeMo Customizer: LoRA, full SFT and when to bother
  4. 28 · Guardrails & responsible AI: what NeMo Guardrails actually stops
  5. 29 · Disaster recovery & multi-tenancy: what to protect and how to share
  6. 30 · PAIF vs OpenShift AI vs hyperscaler managed AI: an honest verdict

Want the concepts first? For a vendor-neutral, ground-up explanation of how generative AI works, before building it on this stack, see my Generative AI: From Zero to Mastery series.

About the Author

Dr. Pranay Jha is a Cloud and AI Consultant with 18+ years of experience in hybrid cloud, virtualization, and enterprise infrastructure transformation. He specializes in VMware technologies, multi-cloud strategy, and Generative AI solutions. He holds a PhD in Computer Applications with research focused on Cloud and AI, has published multiple research papers, and has been a VMware vExpert since 2016 and a VMUG Community Leader.