Category: VMware & Cloud
-
Networking for VMware Private AI Workloads: Segmentation, Ingress and the East-West Path (Private AI Series, Part 26)
Model serving lives or dies on the network nobody designed. Here is how to segment AI namespaces with NSX, expose inference endpoints through the Gateway API and the load balancer, and keep RAG east-west traffic fast and private.
-
NVIDIA NIM Operator on VMware Private AI: The Reference Architecture for Declarative Model Serving (Private AI Series, Part 25)
The NIM Operator is the Kubernetes-native control plane for model serving on VMware Private AI. Here is how its CRDs, caching and autoscaling actually fit together, and the vGPU constraint that bites multi-GPU models.
-
VMware Private AI Foundation Upgrade: Moving from VCF 9.0 to 9.1 Without Breaking Your GPUs (Private AI Series, Part 24)
A practical 9.0 to 9.1 upgrade runbook for VMware Private AI Foundation, plus a closing verdict on the platform after 24 parts. The order of operations, the vGPU driver branch trap, and host-by-host GPU domain remediation.
-
Troubleshooting VMware Private AI Foundation: 7 Failures That Actually Bite (Private AI Series, Part 23)
The seven failures I hit most often on VMware Private AI Foundation with NVIDIA, from a dark GPU on the ESXi host to a NIM pod crashing on CUDA out of memory, with the real error strings and the checks that isolate each layer.
-
VMware Private AI MLOps: Built-In Model Lifecycle vs DIY MLflow and KServe (Private AI Series, Part 22)
Two ways to run model lifecycle on VMware Private AI: the built-in Model Store and Model Runtime, or a DIY MLflow and KServe stack on VKS. Here is when each one wins, and the verdict.
-
How to Benchmark LLM Inference on VMware Private AI with genai-perf (Private AI Series, Part 21)
A practical runbook for benchmarking NIM inference on VMware Private AI Foundation: the metrics that matter, the concurrency sweep that exposes the real latency-throughput curve, and how to pick an operating point you can defend.
-
Is VMware Private AI Actually Private? A Security and Data Privacy Reality Check (Private AI Series, Part 20)
On-prem Private AI keeps your data in the building, but the breach risk is inside the cluster. How vDefend microsegmentation, confidential computing and RBAC in VCF 9.1 actually secure a Private AI pipeline.
-
Air-Gapped VMware Private AI Foundation: Mirroring, AMT and the Bootstrap Problem (Private AI Series, Part 19)
Deploying VMware Private AI Foundation in a fully disconnected enclave: what to mirror, how the artifact mirroring tool (AMT) fits, the Harbor bootstrap problem, and how to validate offline NIM and GPU before handover.
-
VMware Private AI Sizing and Cost: GPU Memory Math, Capacity Planning and TCO (Private AI Series, Part 18)
How to size a VMware Private AI platform from the workload up: GPU memory math, the KV cache trap, a model-to-card matrix, and the four-layer cost model that actually decides the business case.
-
GPU Monitoring with VCF Operations for VMware Private AI: The Signals That Actually Catch a Failing Workload (Private AI Series, Part 17)
VCF Operations gives you GPU dashboards out of the box, but the metric most teams trust is the one that lies. Here is what to watch on a Private AI Foundation estate, why GPU utilization misleads, and the hardware-health signals the default dashboards never surface.
-
Self-Service AI Catalog Items with VCF Automation for VMware Private AI (Private AI Series, Part 16)
How to publish self-service GPU catalog items for VMware Private AI Foundation with the VCF Automation Quickstart, plus the namespace, vGPU class and quota bindings that decide whether the catalog is safe to hand out.
-
VMware Private AI Agent Builder: Composing Models, Knowledge Bases and Prompts (Private AI Series, Part 15)
Agent Builder in VMware Private AI Services lets you compose a model endpoint, a knowledge base and prompt instructions into a grounded agent. Here is what it actually does, where it sits, and where the agentic hype gets ahead of reality.
About the Author

Dr. Pranay Jha
Dr. Pranay Jha is a Cloud and AI Consultant with 18+ years of experience in hybrid cloud, virtualization, and enterprise infrastructure transformation. He specializes in VMware technologies, multi-cloud strategy, and Generative AI solutions. He holds a PhD in Computer Applications with research focused on Cloud and AI, has published multiple research papers, and has been a VMware vExpert since 2016 and a VMUG Community Leader.
