Dr. Pranay Jha

VMware • Cloud • AI • Enterprise Architecture

Formerly VMware Insight & Cloud Pathshala
What began over a decade ago as a passion for sharing knowledge has evolved into a unified platform for Enterprise AI, VMware, Cloud Architecture, Research, and Modern Infrastructure.

NVIDIA AI: The Complete Guide (NVIDIA AI Series)

The NVIDIA AI stack, end to end, for infrastructure architects and platform engineers: GPUs and fabrics, the AI Enterprise software platform, inference with NIM and Dynamo, customization with NeMo, models and agents, and the operations that keep it running, on-prem and on private cloud. A complete 30-part series. Where it meets VMware, it links to the Private AI series rather than repeating it.

Series complete · 30 of 30 published
Phase 1 · Foundations & the AI Factory
  01. What the NVIDIA AI Stack Actually Is, End to End
  02. NVIDIA AI Enterprise: What the Subscription Includes
  03. The GPU Lineup: Hopper vs Blackwell vs Rubin
  04. GPU Memory and Precision: HBM and FP8/FP4
  05. The AI Factory: DGX, HGX, MGX and the NVL72 Rack
Phase 2 · GPU Infrastructure
  06. GPU Partitioning: MIG, vGPU, Time-Slicing, Passthrough
  07. AI Networking I: NVLink and NVSwitch
  08. AI Networking II: InfiniBand vs Spectrum-X Ethernet
  09. Storage and the Data Path: GPUDirect Storage
  10. Power, Cooling and Density
Phase 3 · The Software Platform
  11. Drivers, CUDA and the Container Toolkit
  12. The NVIDIA GPU Operator on Kubernetes
  13. The Network Operator and Accelerated Fabric
  14. The NGC Catalog
  15. Air-Gapped Deployment, Lifecycle and CVE Patching
Phase 4 · Inference
  16. NIM Microservices: How a Model Gets Served
  17. Deploying and Autoscaling NIM
  18. TensorRT and TensorRT-LLM
  19. Triton Inference Server vs NIM
  20. NVIDIA Dynamo: Disaggregated Inference at Scale
  21. Inference Economics: Throughput, Latency, Cost per Token
Phase 5 · Customization & Training
  22. The NeMo Framework
  23. Customization: LoRA, SFT and RLHF
  24. NeMo Curator: Data Prep at Scale
  25. Multi-Node Training: Scheduling and Checkpointing
Phase 6 · Models, Agents & Apps
  26. The Nemotron Foundation Models
  27. RAG with NeMo Retriever and Guardrails
  28. NVIDIA Blueprints and Agentic AI (AI-Q)
Phase 7 · Operations & Verdict
  29. GPU Observability and Multi-Tenancy
  30. Running NVIDIA AI On-Prem and on VCF, and the Verdict

About the Author

Dr. Pranay Jha is a Cloud and AI Consultant with 18+ years of experience in hybrid cloud, virtualization, and enterprise infrastructure transformation. He specializes in VMware technologies, multi-cloud strategy, and Generative AI solutions. He holds a PhD in Computer Applications with research focused on Cloud and AI, has published multiple research papers, and has been a VMware vExpert since 2016 and a VMUG Community Leader.