Dr. Pranay Jha

VMware • Cloud • AI • Enterprise Architecture

Tag: NVIDIA AI Series

AI Stack, AI/ML

GPU Partitioning on NVIDIA Data-Center GPUs: MIG vs vGPU vs Time-Slicing vs Passthrough (NVIDIA AI Series, Part 6)

Dr. Pranay Jha

June 22, 2026

There are four ways to partition or assign an NVIDIA H100, H200, or B200 GPU: MIG, vGPU, CUDA time-slicing, and full passthrough. This post covers the isolation guarantees, profile geometry, Kubernetes GPU Operator configuration, and a worked sizing example to help you pick the right mode for your cluster.
AI Stack, AI/ML

The NVIDIA AI Factory: DGX, HGX, MGX and the NVL72 Reference Systems (NVIDIA AI Series, Part 5)

Dr. Pranay Jha

June 22, 2026

DGX, HGX and MGX are not performance tiers; they are three ways to integrate the same NVIDIA GPUs. Here is how they differ and where the GB200 and GB300 NVL72 rack actually earns its 120 kW.
AI Stack, AI/ML, VMware & Cloud

GPU Memory and Precision: HBM3e, HBM4 and What Actually Fits (NVIDIA AI Series, Part 4)

Dr. Pranay Jha

June 22, 2026

A 70B-parameter model in FP16 needs 140 GB for weights alone (70 billion parameters at 2 bytes each), before a single token of context. Here is the GPU memory and precision math that decides what fits, why HBM (not FLOPS) is the real ceiling, and where FP8 and NVFP4 buy you headroom.
AI Stack, AI/ML, VMware & Cloud

The NVIDIA Data-Center GPU Lineup: Hopper vs Blackwell vs Rubin (NVIDIA AI Series, Part 3)

Dr. Pranay Jha

June 22, 2026

The NVIDIA data-center GPU lineup from Hopper to Blackwell to Rubin, compared for training and inference: memory, bandwidth, FP4 and rack-scale NVL72, with a clear way to choose.
AI Stack, AI/ML

NVIDIA AI Enterprise: What the Subscription Includes and What It Costs (NVIDIA AI Series, Part 2)

Dr. Pranay Jha

June 22, 2026

NVIDIA AI Enterprise is the supported, secured wrapper around the open-source NVIDIA stack, licensed per GPU. Part 2 covers what is in the box (NIM, NeMo, Run:ai, the operators), how it is licensed (subscription, consumption, perpetual), what it costs, and when it is worth it.
AI Stack, AI/ML

What the NVIDIA AI Stack Actually Is, End to End (NVIDIA AI Series, Part 1)

Dr. Pranay Jha

June 22, 2026

NVIDIA AI is not one product; it is a stack roughly nine layers deep, from silicon to agents. Part 1 maps the whole thing: GPUs, CUDA, the operators, TensorRT-LLM, Triton, Dynamo, NIM, NeMo, Nemotron, Blueprints and the AI Enterprise wrapper that supports it all.

About the Author

Dr. Pranay Jha

Dr. Pranay Jha is a Cloud and AI Consultant with 18+ years of experience in hybrid cloud, virtualization, and enterprise infrastructure transformation. He specializes in VMware technologies, multi-cloud strategy, and Generative AI solutions. He holds a PhD in Computer Applications with research focused on Cloud and AI, has published multiple research papers, and has been a VMware vExpert since 2016 and a VMUG Community Leader.
