Dr. Pranay Jha

VMware • Cloud • AI • Enterprise Architecture

Formerly VMware Insight & Cloud Pathshala
What began over a decade ago as a passion for sharing knowledge has evolved into a unified platform for Enterprise AI, VMware, Cloud Architecture, Research, and Modern Infrastructure.

Generative AI From Zero to Mastery: The Complete Guide (GenAI Series)

Everything you need to understand generative AI, from what a model is to how it runs in production, in one complete, sequential series. The series is written for a curious beginner, stays useful to an engineer or architect, and is vendor-neutral throughout. All 30 parts are now published. Start at Part 1, or jump to the part you need.

Series complete · 30 of 30 parts published
Phase 1 · Foundations
  01. What Is Generative AI? A Plain-English Guide
  02. The GenAI Words Everyone Uses, and What They Mean
  03. How We Got from If-Statements to ChatGPT
  04. What a Model Really Is
  05. What Generative AI Can and Cannot Do
Phase 2 · How It Works
  06. How Neural Networks Learn, Without the Math
  07. How Words Become Numbers: Tokens and Embeddings
  08. Attention, the Idea That Made Modern AI Work
  09. Training vs Inference: Why Using AI Is the Real Cost
  10. The Context Window, and Why Models Forget
  11. Why AI Models Make Things Up (and What Temperature Does)
Phase 3 · Using GenAI Well
  12. Prompt Engineering That Actually Works
  13. RAG: How to Stop Your AI Making Things Up
  14. Vector Databases: How Semantic Search Really Works
  15. Fine-Tuning vs RAG vs Prompting: Which One, and When
  16. AI Agents: What Actually Works, and What is Hype
  17. Multimodal AI: Text, Images, and Audio in One Model
  18. Why Looks Good Is Not Enough: Evaluating GenAI Output
Phase 4 · Under the Hood
  19. Why Data, Not Model Size, Usually Decides Quality
  20. Quantization: Running Big Models on Smaller GPUs
  21. Guardrails and Responsible AI: What They Catch, and Miss
  22. Where the Money Actually Goes in Generative AI
Phase 5 · Infrastructure and Serving
  23. Why GenAI Runs on GPUs, and the Memory Wall
  24. vLLM vs TensorRT-LLM vs SGLang: Which Inference Engine
  25. Scaling Inference: Latency vs Throughput (and GPU Ops)
  26. The Network and Storage Behind Large-Scale AI
  27. On-Prem vs Cloud vs Hybrid for GenAI: An Honest Verdict
Phase 6 · Frontier
  28. What It Takes to Train Across Thousands of GPUs
  29. Mixture-of-Experts and Where AI Is Heading
  30. The Economics and Future of Generative AI
Building this on your own infrastructure? See the companion VMware Private AI series for the production build on VCF.

About the Author

Dr. Pranay Jha is a Cloud and AI Consultant with 18+ years of experience in hybrid cloud, virtualization, and enterprise infrastructure transformation. He specializes in VMware technologies, multi-cloud strategy, and Generative AI solutions. He holds a PhD in Computer Applications with research focused on Cloud and AI, has published multiple research papers, and has been a VMware vExpert since 2016 and a VMUG Community Leader.