Category: VMware & Cloud

AI Stack, VCF, VMware & Cloud

VMware Private AI Sizing and Cost: GPU Memory Math, Capacity Planning and TCO (Private AI Series, Part 18)

Dr. Pranay Jha

June 15, 2026

How to size a VMware Private AI platform from the workload up: GPU memory math, the KV cache trap, a model-to-card matrix, and the four-layer cost model that actually decides the business case.
AI Stack, VCF, VMware & Cloud

GPU Monitoring with VCF Operations for VMware Private AI: The Signals That Actually Catch a Failing Workload (Private AI Series, Part 17)

Dr. Pranay Jha

June 15, 2026

VCF Operations gives you GPU dashboards out of the box, but the metric most teams trust is the one that lies. Here is what to watch on a Private AI Foundation estate, why GPU utilization misleads, and the hardware-health signals the default dashboards never surface.
AI Stack, Automation, VMware & Cloud

Self-Service AI Catalog Items with VCF Automation for VMware Private AI (Private AI Series, Part 16)

Dr. Pranay Jha

June 15, 2026

How to publish self-service GPU catalog items for VMware Private AI Foundation with the VCF Automation Quickstart, plus the namespace, vGPU class and quota bindings that decide whether the catalog is safe to hand out.
AI Stack, AI/ML, VMware & Cloud

VMware Private AI Agent Builder: Composing Models, Knowledge Bases and Prompts (Private AI Series, Part 15)

Dr. Pranay Jha

June 15, 2026

Agent Builder in VMware Private AI Services lets you compose a model endpoint, a knowledge base and prompt instructions into a grounded agent. Here is what it actually does, where it sits, and where the agentic hype gets ahead of reality.
AI Stack, AI/ML, VMware & Cloud

Building a RAG Pipeline on VMware Private AI: 7 Failures That Quietly Break Retrieval (Private AI Series, Part 14)

Dr. Pranay Jha

June 15, 2026

Most RAG failures on VMware Private AI Foundation are not the LLM. Here are the seven pipeline failures that quietly wreck retrieval quality on PAIF 9, and how I fix each one in the field.
AI Stack, AI/ML, VMware & Cloud

Vector Databases in VMware Private AI: Running pgvector on Data Services Manager (Private AI Series, Part 13)

Dr. Pranay Jha

June 15, 2026

A reference-architecture look at the retrieval tier of VMware Private AI: where DSM-managed PostgreSQL with pgvector sits, how to place and size it, and whether to index with HNSW or IVFFlat.
AI Stack, AI/ML, VMware & Cloud

VMware Private AI Services: Deploying Models with the Model Store and Model Runtime (Private AI Series, Part 12)

Dr. Pranay Jha

June 15, 2026

A hands-on runbook for Private AI Services 2.1: stand up a Harbor model gallery, validate and push models with the vcf pais CLI, then serve them as endpoints through Model Runtime and the ML API Gateway.
AI Stack, AI/ML, VMware & Cloud

NVIDIA NIM Microservices on VMware Private AI: The Model-Serving Layer Explained (Private AI Series, Part 11)

Dr. Pranay Jha

June 15, 2026

NVIDIA NIM is the model-serving layer of VMware Private AI. A reference-architecture look at the NIM Operator, NIMCache and NIMService, GPU placement, and the design choices that decide whether your endpoints survive production.
AI Stack, AI/ML, VMware & Cloud

Deep Learning VMs in VMware Private AI Foundation: The Data Scientist Workbench (Private AI Series, Part 10)

Dr. Pranay Jha

June 15, 2026

What a Deep Learning VM in VMware Private AI Foundation actually is, how the image is built, the first-boot steps that quietly break deployments, and when to move off it to a VKS cluster.
AI Stack, VCF, VMware & Cloud

Installing the NVIDIA GPU Operator and vGPU Drivers for VMware Private AI Foundation (Private AI Series, Part 9)

Dr. Pranay Jha

June 15, 2026

A practical runbook for installing the NVIDIA GPU Operator and matching vGPU host and guest drivers on VMware Private AI Foundation, with the validation checks and version-skew traps that decide whether GPUs actually schedule.
AI Stack, VCF, VMware & Cloud

Prepare a GPU Workload Domain for VMware Private AI Foundation (Private AI Series, Part 8)

Dr. Pranay Jha

June 15, 2026

A field-tested, bottom-up procedure for standing up a GPU-accelerated workload domain on VCF 9.0 for Private AI Foundation: firmware, the vLCM vGPU driver, Shared Direct, a single-zone Supervisor, and the mistakes that actually bite.
AI Stack, VCF, VMware & Cloud

VMware Private AI Reference Architecture and Sizing: A Practical Blueprint (Private AI Series, Part 7)

Dr. Pranay Jha

June 15, 2026

How to size a VMware Private AI Foundation build the right way: two-domain design, choosing the deployment model, and working from workload back to GPU hosts and BOM on VCF 9.1.

About the Author

Dr. Pranay Jha

Dr. Pranay Jha is a Cloud and AI Consultant with 18+ years of experience in hybrid cloud, virtualization, and enterprise infrastructure transformation. He specializes in VMware technologies, multi-cloud strategy, and Generative AI solutions. He holds a PhD in Computer Applications with research focused on Cloud and AI, has published multiple research papers, and has been a VMware vExpert since 2016 and a VMUG Community Leader.

You May Have Missed

NVIDIA AI Enterprise Explained: What’s in the Suite, How It’s Licensed, and Whether It’s Worth It

VMware Private AI vs Red Hat OpenShift AI vs Hyperscaler Managed AI: An Honest Verdict (Private AI Series, Part 30)

Disaster Recovery and Multi-Tenancy for VMware Private AI: What to Protect and How to Share (Private AI Series, Part 29)

Guardrails and Responsible AI on VMware Private AI: What NeMo Guardrails Actually Stops (Private AI Series, Part 28)

Fine-Tuning Models on VMware Private AI with NeMo Customizer: LoRA, Full SFT and When to Bother (Private AI Series, Part 27)