Dr. Pranay Jha


NGC Catalog: Containers, Models, Helm Charts and How to Consume Them (NVIDIA AI Series, Part 14)

The NGC catalog is your upstream source for NVIDIA GPU-optimized containers, pretrained models, and Helm charts. Here is how the nvcr.io registry, org/team/API-key model, and NVAIE entitlement actually work, with a full operational pull-and-deploy walkthrough.

NVIDIA AI Series · Part 14 of 30
TL;DR — Key Takeaways
  • The NGC catalog is NVIDIA’s curated distribution point for containers, pretrained models, Helm charts, and resource bundles, all pulled from nvcr.io.
  • Content splits cleanly into a free public tier and an NVIDIA AI Enterprise-entitled tier. Entitlement changes what is supported, not just what you can pull.
  • The NGC CLI and API key model gives you scriptable access, but because of the org/team hierarchy, a wrongly scoped key or a wrong org name fails silently or with a cryptic auth error.
  • For CI pipelines and air-gapped prep, treat NGC as an upstream source, not a runtime dependency. Mirror selectively; do not depend on live pulls in production.
  • Helm charts on NGC use a non-standard fetch path compared to a standard Helm repo. Know this before wiring them into Argo CD or Flux.
Who this is for: Platform engineers and AI infrastructure architects who are deploying GPU workloads on Kubernetes on-prem or in a private cloud, and need to understand the NGC supply chain before they can safely build CI pipelines or plan an air-gapped deployment. You should already have the GPU Operator running (Part 12) and the Network Operator wired in (Part 13). If you are new to the series, start at the NVIDIA AI Guide.

You run docker pull nvcr.io/nvidia/pytorch:24.12-py3 in your CI job at 2 a.m. and it fails with a 401. You check the credentials, they look fine. You check the token expiry, it has not expired. The real problem is that the API key was scoped to a personal namespace and your CI agent is authenticating against the org namespace. NGC’s org/team/key model has a few sharp edges that catch everyone eventually. This part is about understanding the full NGC content supply chain before those edges catch you in production.

What Actually Lives in the NGC Catalog

The NGC catalog (catalog.ngc.nvidia.com) is a curated set of GPU-optimized software. Four content types live there, and they behave differently when you try to consume them programmatically.

Containers

These are OCI images stored in the nvcr.io container registry. NVIDIA publishes frameworks (PyTorch, TensorFlow, JAX), inference runtimes (TensorRT-LLM, Triton), NIM microservice images, CUDA development images, and HPC application containers. The registry space follows the pattern nvcr.io/<org>/<image>:<tag>, or nvcr.io/<org>/<team>/<image>:<tag> for team-scoped content. NVIDIA’s own published images live under nvcr.io/nvidia/. NIM images live under nvcr.io/nim/.
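To make the two path layouts concrete, here is a minimal POSIX shell sketch that splits a tagged nvcr.io reference into its parts. The function name parse_nvcr_ref is hypothetical and not part of any NVIDIA tooling; it assumes only the two layouts described above and does not handle digest references.

```shell
#!/bin/sh
# Hypothetical helper: split nvcr.io/<org>/[<team>/]<image>:<tag> into parts.
# Assumes a tagged reference; digest references (@sha256:...) are not handled.
parse_nvcr_ref() (
  ref=$1
  case $ref in
    nvcr.io/*:*) ;;
    *) echo "not a tagged nvcr.io reference: $ref" >&2; exit 1 ;;
  esac
  tag=${ref##*:}            # everything after the last colon
  path=${ref%:*}            # drop the tag
  path=${path#nvcr.io/}     # drop the registry host
  case $path in
    */*) ;;
    *) echo "missing org or image in: $ref" >&2; exit 1 ;;
  esac
  org=${path%%/*}           # first path segment
  rest=${path#*/}           # team/image or just image
  case $rest in
    */*) team=${rest%/*}; image=${rest##*/} ;;
    *)   team=; image=$rest ;;
  esac
  printf 'org=%s team=%s image=%s tag=%s\n' "$org" "$team" "$image" "$tag"
)

# Usage:
#   parse_nvcr_ref nvcr.io/nim/meta/llama-3.1-8b-instruct:1.8.0
```

An empty team field means the image sits directly under the org, which is the case for NVIDIA's own nvcr.io/nvidia/ images.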

Pretrained Models

Model artifacts are not OCI images. They are versioned file archives downloaded via the NGC CLI or the NGC API. A pretrained model might be a set of checkpoint files, ONNX weights, or TensorRT engine files. You cannot docker pull a model; you use ngc registry model download-version. This trips up engineers who assume NGC is just a container registry.

Helm Charts

Helm charts on NGC are stored as versioned artifacts, not in a standard Helm repo index. You fetch them with helm fetch pointed at a specific NGC chart URL, or pull them via the NGC CLI. The GPU Operator Helm chart itself is published to NGC, though NVIDIA also mirrors it to the standard helm.ngc.nvidia.com repository. For Argo CD or Flux, you typically reference the standard Helm repo endpoint rather than the artifact download path.

Resources

Resources is the catch-all bucket: Jupyter notebooks, deployment scripts, configuration templates, and SDK bundles. You download these via NGC CLI just like model artifacts.

NGC Content Taxonomy
Four artifact types, two access paths
[Diagram: catalog.ngc.nvidia.com and the nvcr.io registry expose four artifact types: containers (docker pull / OCI), models (ngc registry model), Helm charts (helm fetch / NGC CLI), and resources (ngc registry resource). Two access tiers: public (free tier, no entitlement needed) and entitled (NVAIE, API key plus active subscription).]
Fig 1. The four NGC artifact types and the two access tiers. Containers and models span both tiers; entitlement opens support SLAs, not just download access.

The nvcr.io Registry and How Authentication Works

The NGC container registry endpoint is nvcr.io. Authentication uses a special convention: the username is always the literal string $oauthtoken, and the password is your NGC API key. This is not a typo or a shell variable. The string $oauthtoken signals to the registry that you are authenticating with an API key rather than a user/password pair.

API keys are generated at org.ngc.nvidia.com. Each key is scoped to a specific NGC account (personal or org-level). If your CI system uses a key scoped to a personal account and the image lives in an org namespace, the pull will fail with a 401 even though the key is valid. The fix is to generate a key at the org level and store it in your secret store, not in a developer’s personal account.
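The same username convention applies when the credential is handed to Kubernetes as an image pull secret. The sketch below builds the Docker config JSON by hand so the literal $oauthtoken username is visible; the function names are hypothetical, and the key value is whatever your secret store supplies.

```shell
#!/bin/sh
# Hypothetical helpers: build the registry auth entry for nvcr.io.
# The "auth" field is base64("<username>:<password>") with the literal
# username $oauthtoken (single-quoted so the shell does not expand it).
ngc_auth_b64() (
  key=$1
  [ -n "$key" ] || { echo "empty API key" >&2; exit 1; }
  printf '%s:%s' '$oauthtoken' "$key" | base64 | tr -d '\n'
)

# Emit a Docker config.json body suitable for a
# kubernetes.io/dockerconfigjson secret.
ngc_dockerconfigjson() (
  auth=$(ngc_auth_b64 "$1") || exit 1
  printf '{"auths":{"nvcr.io":{"auth":"%s"}}}\n' "$auth"
)

# Usage:
#   ngc_dockerconfigjson "${NGC_API_KEY}" > dockerconfig.json
```

In most clusters it is simpler to let kubectl create secret docker-registry do this encoding for you; the hand-rolled version is mainly useful for checking what a broken secret actually contains.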

The Org/Team Hierarchy

NGC has a three-level namespace: org > team > content. An org is the top-level account, typically matching your company or department. Teams are sub-groups within an org; you push private images to a team space with paths like nvcr.io/<org>/<team>/<image>:<tag>. Every org has at least one default team, but teams must be created and users assigned before pushes can succeed.

The org name is not always obvious. After logging in to ngc.nvidia.com, your org slug appears in the top-right of the dashboard. It is usually an alphanumeric string that does not match your company display name. Put this slug in your runbook immediately; tracking it down at 2 a.m. is unpleasant.
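One way to keep the slug from going missing is to make CI fail fast when it is not set. This is a minimal sketch: NGC_API_KEY is the variable used elsewhere in this article, while NGC_ORG_SLUG and the function name ngc_preflight are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Hypothetical preflight: refuse to continue if the key or org slug is unset.
# NGC_ORG_SLUG is an assumed variable name, not one the NGC CLI reads.
ngc_preflight() (
  missing=0
  for name in NGC_API_KEY NGC_ORG_SLUG; do
    eval "value=\${$name:-}"      # indirect lookup of the variable named $name
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      missing=1
    fi
  done
  exit "$missing"
)

# Usage (first step of a CI job):
#   ngc_preflight || exit 1
```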

NGC Org / Team / API Key Model
How namespace scoping affects authentication and pull access
[Diagram: the org namespace nvcr.io/<org-slug>/ contains Team A and Team B sub-namespaces. A personal API key covers the personal namespace only (wrong for CI: will fail on org images). An org API key covers the org and team namespaces (right for CI).]
Fig 2. NGC org/team/key hierarchy. A personal API key cannot authenticate pulls from org or team namespaces. Use an org-scoped key in CI.

Operational Artifact: Pull-and-Deploy Flow

Below is the complete sequence: authenticate, list available images, pull a container, download a model artifact, and fetch a Helm chart via Helm or the NGC CLI. Each step includes the expected output and the common failure modes.

Step 1: Authenticate to nvcr.io

# Username is the LITERAL string $oauthtoken, not a variable
echo "${NGC_API_KEY}" | docker login nvcr.io \
  --username '$oauthtoken' \
  --password-stdin

# Expected output:
# WARNING! Your password will be stored unencrypted in /root/.docker/config.json
# Login Succeeded

# FAILURE MODE: if you see "unauthorized: authentication required"
# the API key is invalid, stale, or scoped to the wrong namespace (personal vs org).
# Generate an org-level key at org.ngc.nvidia.com and retry.

Step 2: List Available Images via NGC CLI

# Install NGC CLI first: https://org.ngc.nvidia.com/setup/installers/cli
# Configure with: ngc config set  (enter org slug, API key, output format)

ngc registry image list nvidia/pytorch --format_type=ascii

# Expected output (truncated):
# +----------------+-----------+---------+---------+
# | Name           | Tag       | Size    | Updated |
# +----------------+-----------+---------+---------+
# | nvidia/pytorch | 25.03-py3 | 18.4 GB | ...     |
# | nvidia/pytorch | 24.12-py3 | 17.9 GB | ...     |
# +----------------+-----------+---------+---------+

# List NIM images
ngc registry image list nim/ --format_type=ascii

# FAILURE MODE: "Invalid API key" or empty output with no error
# means the NGC CLI config has a stale key or mismatched org slug.
# Run: ngc config set  and re-enter credentials.

Step 3: Pull a Container

docker pull nvcr.io/nvidia/pytorch:25.03-py3

# Pull a NIM image (requires NVAIE entitlement for supported use)
docker pull nvcr.io/nim/meta/llama-3.1-8b-instruct:1.8.0

# Expected: standard docker pull progress bars, then:
# Status: Downloaded newer image for nvcr.io/nvidia/pytorch:25.03-py3

# FAILURE MODE: 403 Forbidden on NIM images without NVAIE entitlement
# The image may be listable but the pull will be denied.
# Verify entitlement at: org.ngc.nvidia.com/subscriptions

Step 4: Download a Pretrained Model

# Models are not OCI images - use NGC CLI
ngc registry model download-version \
  nvidia/riva/speechtotext_en_us_citrinet_1024_gamma_0_25:deployable_v2.0.0 \
  --dest /mnt/models/

# Expected output:
# Downloading ...
# Transfer id: ...  Download status: Completed
# Downloaded 2.1 GB to /mnt/models/

# FAILURE MODE: if the model is NVAIE-entitled and your key is personal,
# you get: "You do not have access to this resource."
# Org-level key + active NVAIE subscription required.

Step 5: Fetch a Helm Chart

# GPU Operator Helm chart via standard Helm repo (preferred for GitOps)
helm repo add nvidia https://helm.ngc.nvidia.com/nvidia
helm repo update
# Verify the chart version against the catalog before pinning it
helm fetch nvidia/gpu-operator --version 25.3.0

# Alternative: NGC CLI for charts stored only as NGC artifacts
# (charts use "pull"; "download-version" applies to models and resources)
ngc registry chart pull \
  nvidia/tao/tao-toolkit-api:5.5.0 \
  --dest ./charts/

# FAILURE MODE: helm fetch from helm.ngc.nvidia.com may require
# a repo token for entitled charts. Add credentials:
# helm repo add nvidia https://helm.ngc.nvidia.com/nvidia \
#   --username '$oauthtoken' --password "${NGC_API_KEY}"

Gotcha

The NGC CLI config file lives at ~/.ngc/config. In a CI environment running as root in a container, this resolves to /root/.ngc/config. If your runner mounts a read-only home directory or your Dockerfile drops privileges mid-build, the CLI will silently fail to read the config and fall back to unauthenticated mode. Point the CLI at a writable config location (NGC_CLI_HOME in recent CLI releases; confirm the variable name for your CLI version) and mount it as a secret volume, not a baked layer.
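A quick guard against this silent fallback is to check the config directory before the first ngc call. The sketch below is an assumption about what is worth checking (directory exists, config file readable, directory writable); the function name check_ngc_config_dir is hypothetical.

```shell
#!/bin/sh
# Hypothetical guard: verify the NGC CLI config directory before using the CLI.
# Exit codes: 1 = directory missing, 2 = config unreadable, 3 = not writable.
# Note: when running as root, the writable check passes even on paths that
# an unprivileged user could not write to.
check_ngc_config_dir() (
  dir=${1:-"$HOME/.ngc"}
  if [ ! -d "$dir" ]; then
    echo "config dir missing: $dir" >&2; exit 1
  fi
  if [ ! -r "$dir/config" ]; then
    echo "config file not readable: $dir/config" >&2; exit 2
  fi
  if [ ! -w "$dir" ]; then
    echo "config dir not writable: $dir" >&2; exit 3
  fi
  echo "ok: $dir"
)

# Usage:
#   check_ngc_config_dir /run/secrets/ngc || exit 1
```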

NVIDIA AI Enterprise Entitlement: What Actually Changes

This is the most misunderstood part of the NGC story. NVIDIA AI Enterprise (NVAIE) is an annual per-GPU subscription. Entitlement changes two things, not one: which content you can pull and what NVIDIA will support.

| Dimension | Free / Community Tier | NVAIE Entitled |
|---|---|---|
| Container pull access | Public catalog images free to pull | Entitled (locked) images accessible |
| Support SLA | Community forums only | NVIDIA Enterprise Support, CVE SLAs |
| CVE patching | Best-effort, no SLA | Tracked and patched per NVIDIA PSIRT SLA |
| NGC Private Registry | Not included | Included for DGX + NVAIE customers |
| NIM image access | build.nvidia.com API only (hosted) | Pull NIM images to self-hosted infra |
| GPU Operator / Network Operator Helm | Available, community support | Available, NVIDIA support included |

The Enterprise Catalog, which used to be a separate portal from the public NGC catalog, was merged into the unified NGC catalog. An NVAIE-entitled user logging in to catalog.ngc.nvidia.com sees the same interface as a community user, but the entitled items carry a lock icon and become accessible once the entitlement is linked to the org account.

The key operational point: you can often pull an entitled image with a community key if the image was public until recently, because NVIDIA does not always lock images the moment they move behind entitlement. Do not rely on this. Before you build a production deployment around those images, verify that your org account shows an active subscription under Subscriptions in the NGC dashboard.

In practice: When I onboard a new cluster to NGC, the first thing I do is log in to org.ngc.nvidia.com, generate an org-level API key, store it in the cluster’s secret manager, and then run a one-line smoke test: echo "${NGC_API_KEY}" | docker login nvcr.io --username '$oauthtoken' --password-stdin && docker pull nvcr.io/nvidia/cuda:12.6.0-base-ubuntu22.04. Passing the key on stdin keeps it out of the process list and shell history. If the smoke test fails, fix authentication before touching anything else. Everything downstream depends on it.
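When the smoke test does fail, the error text usually points at one of the failure modes listed in the walkthrough. The following sketch maps those messages to a first thing to check. It matches only the strings quoted in this article; real error text varies by Docker and CLI version, and the function name classify_ngc_error is hypothetical.

```shell
#!/bin/sh
# Hypothetical triage helper: map an NGC/Docker error message to a first check.
# Patterns cover only the messages quoted in this article; numeric matches
# such as *401* are deliberately loose and can produce false positives.
classify_ngc_error() (
  msg=$1
  case $msg in
    *"Invalid API key"*)
      echo "cli-config: stale key or wrong org slug, re-run ngc config set" ;;
    *"unauthorized: authentication required"*|*401*)
      echo "auth: check that the key is valid and org-scoped" ;;
    *"You do not have access to this resource"*|*403*)
      echo "access: check NVAIE entitlement or team membership" ;;
    *)
      echo "unknown: inspect the full error output" ;;
  esac
)

# Usage:
#   err=$(docker pull nvcr.io/nvidia/cuda:12.6.0-base-ubuntu22.04 2>&1) \
#     || classify_ngc_error "$err"
```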

Public vs Supported: The Two Lanes

There is a practical distinction between what NVIDIA publishes and what NVIDIA supports. The diagram below captures this. Most teams discover the hard way that "it’s on NGC" does not mean "NVIDIA will fix it if it breaks."

Supported vs Unsupported Lanes on NGC
What being on NGC does and does not guarantee
[Diagram: NVAIE entitled lane: NIM microservice images, entitled models and resources, NVIDIA support plus CVE SLA, NGC Private Registry included. Community lane: framework containers (PyTorch, TF), GPU Operator and Network Operator charts, public models and CUDA images, no CVE SLA, forum support only.]
Fig 3. The two NGC access lanes. Both lanes use nvcr.io. Entitlement determines support SLA and CVE response, not just download access.

NGC in CI Pipelines: What to Wire and What to Avoid

A CI pipeline that pulls directly from nvcr.io on every build is a reliability risk. NVIDIA’s registry has rate limits and occasional maintenance windows. Framework containers are large (12 to 20 GB). A cold pull in a pipeline adds minutes and creates a single external dependency on every build.

The Right Pattern: Pull Once, Mirror Internally

Mirror images to an internal registry (Harbor, GitLab Container Registry, JFrog Artifactory) on a scheduled job. Your CI pipeline pulls from the internal mirror. This decouples build speed and reliability from NGC availability, and is the first step toward the air-gapped posture covered in Part 15.
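As a rough illustration of the sync job, the sketch below rewrites an nvcr.io reference to its mirror location and prints the pull, tag, and push commands as a dry run instead of executing them. The host org-registry.internal is a placeholder, MIRROR_HOST and both function names are hypothetical, and mapping the NGC org directly to a mirror project is an assumption your registry layout may not share.

```shell
#!/bin/sh
# Hypothetical mirror planner. MIRROR_HOST is a placeholder for your registry.
MIRROR_HOST=${MIRROR_HOST:-org-registry.internal}

# Rewrite nvcr.io/<path> to <mirror-host>/<path>, keeping org, team, and tag.
mirror_ref() (
  src=$1
  case $src in
    nvcr.io/*) printf '%s/%s\n' "$MIRROR_HOST" "${src#nvcr.io/}" ;;
    *) echo "not an nvcr.io reference: $src" >&2; exit 1 ;;
  esac
)

# Print the commands a sync job would run, without running them.
plan_mirror() (
  src=$1
  dst=$(mirror_ref "$src") || exit 1
  printf 'docker pull %s\n' "$src"
  printf 'docker tag %s %s\n' "$src" "$dst"
  printf 'docker push %s\n' "$dst"
)

# Usage:
#   plan_mirror nvcr.io/nvidia/pytorch:25.03-py3        # review the plan
#   plan_mirror nvcr.io/nvidia/pytorch:25.03-py3 | sh   # execute it
```

Printing the plan first keeps the scheduled job auditable: the same output can be logged, reviewed, and then piped to a shell once the CVE scan step is in place.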

What to Mirror vs What to Pull Live

| Artifact | Mirror to Internal? | Rationale |
|---|---|---|
| Framework containers (PyTorch, TF, CUDA) | Yes | Large, stable; cold pull kills build time |
| NIM images | Yes | Required for air-gap; entitlement verified at mirror time |
| GPU Operator Helm chart | Yes (version-pin) | Pin a version; do not auto-pull latest in GitOps |
| Pretrained model checkpoints | Yes (object storage) | Multi-GB; store in S3 or MinIO, not in image layers |
| CUDA base images (tiny variants) | Optional | Smaller; live pull acceptable in non-air-gapped builds |
| Resources / Notebooks | No | Small; pull on-demand via NGC CLI in dev workflows |

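The version-pin rule in the table can be enforced mechanically. This sketch reads a list of image references on stdin, one per line, and rejects any entry that has no tag or uses latest. The list format and the function name check_pinned are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical lint: every reference on stdin must carry an explicit tag or
# digest. Blank lines and lines starting with # are ignored.
# Limitation: a registry host with a port (host:5000/image) is treated as
# tagged, so untagged references on such hosts are not caught.
check_pinned() (
  status=0
  while IFS= read -r line || [ -n "$line" ]; do
    case $line in
      ''|'#'*) continue ;;
    esac
    case $line in
      *@sha256:*) ;;                                        # digest pin
      *:latest) echo "floating tag: $line" >&2; status=1 ;;
      *:*) ;;                                               # explicit tag
      *) echo "no tag: $line" >&2; status=1 ;;
    esac
  done
  exit "$status"
)

# Usage:
#   check_pinned < mirror-list.txt || exit 1
```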
NGC Pull-and-Deploy Flow
From NGC upstream to production cluster
[Diagram: NGC upstream (catalog.ngc.nvidia.com, nvcr.io) feeds an internal mirror (Harbor / JFrog / GitLab, org-registry.internal, CVE-scanned on ingest) through a scheduled sync job. The CI pipeline (build / test / push) and the prod cluster (Kubernetes + GPU nodes) pull from the mirror. Models and charts go to an artifact store (S3 / MinIO).]
Fig 4. The recommended NGC consumption pattern. NGC is the upstream source; an internal mirror is the runtime dependency. Prod clusters never pull directly from nvcr.io.

The NGC Private Registry: When You Need It

The NGC Private Registry is the hosted namespace within nvcr.io where you store your own content: custom training containers, fine-tuned model checkpoints, internal Helm charts. It is included for DGX system customers and NVAIE subscribers. It is NOT a replacement for an on-prem container registry. It is a collaboration hub for sharing GPU-optimized content across a team without standing up your own registry infrastructure.

The push path is docker push nvcr.io/<org>/<team>/<image>:<tag>. The team must be created first in the NGC portal, and the user pushing must be a member of that team. If the push fails with a 403 and you are certain the API key is correct, check team membership before anything else.

For organizations with strict data residency requirements, storing custom containers in NGC Private Registry may not be acceptable even if the data is encrypted in transit. In those cases, the NGC Private Registry is useful only for public NVIDIA content consumption; custom artifacts stay on-prem in Harbor or a similar self-hosted registry.

My Take

I treat NGC as upstream-only in every production deployment I have built. No cluster in production ever pulls directly from nvcr.io at runtime. The operational model is: NGC is the vendor’s shipping dock. You receive the goods, inspect them (CVE scan on ingest), move them to your internal warehouse (Harbor or JFrog), and your infrastructure draws from the warehouse. This is the only posture that works at scale and survives the day NGC has a maintenance window at 3 a.m. your time. The air-gapped deployment guide in Part 15 goes deeper on the exact mirror scripts and CVE patch lifecycle. For VCF-specific deployment details, see the Private AI Series.

Verdict: NGC as Source of Truth vs Mirroring

Use NGC as your source of truth for what NVIDIA ships. Use an internal mirror as your runtime source of truth for what your clusters consume. These are different things and should be treated differently.

Treat NGC as authoritative for: the canonical image tag to pull for a given framework version, the supported version matrix for GPU Operator and driver, the Helm chart values schema, and the list of CVE-patched releases. Subscribe to NGC release notifications for the software you run. When NVIDIA releases a patched image, your sync job picks it up and your mirror updates.

Do NOT treat NGC as authoritative for: runtime availability (it is not a CDN with an SLA for your cluster), image retention (NVIDIA deprecates old tags; a tag you pulled six months ago may be gone), or model completeness (not all models are on NGC; Hugging Face and self-hosted model stores fill gaps).

When not to use NGC at all: if your security policy prohibits pulling from external registries, skip NGC entirely and work directly with NVIDIA’s enterprise team to get offline media. If you are running a model that is not NVAIE-entitled and you need CVE coverage, NGC does not provide it; evaluate whether a community framework container from NGC is appropriate for a workload with enterprise security requirements.

What to validate before going to production: confirm the org API key is org-scoped and stored in a rotation-capable secret store; confirm the image tag you plan to deploy is still present in NGC (pin it to a digest, not just a tag); confirm your NVAIE subscription is active and linked to the org account if you are running entitled software; confirm your internal mirror job ran successfully and the CVE scan passed before promoting an image to prod.
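Pinning to a digest is a small string operation once you have the digest in hand. The sketch below converts a tagged reference plus a sha256 digest into a digest-pinned reference and rejects malformed digests. The function name pin_to_digest is hypothetical, and obtaining the digest (for example from docker inspect on the mirrored image) is left to your sync job.

```shell
#!/bin/sh
# Hypothetical helper: turn <repo>:<tag> plus a digest into <repo>@<digest>.
pin_to_digest() (
  ref=$1
  digest=$2
  case $digest in
    sha256:*) ;;
    *) echo "expected a sha256: digest, got: $digest" >&2; exit 1 ;;
  esac
  hex=${digest#sha256:}
  # A sha256 digest is exactly 64 lowercase hex characters.
  if [ "${#hex}" -ne 64 ] || printf '%s' "$hex" | grep -q '[^0-9a-f]'; then
    echo "malformed digest: $digest" >&2; exit 1
  fi
  repo=${ref%@*}                  # drop an existing digest, if any
  case ${repo##*/} in             # only look for a tag in the last segment,
    *:*) repo=${repo%:*} ;;       # so a registry port (host:5000) survives
  esac
  printf '%s@%s\n' "$repo" "$digest"
)

# Usage:
#   pin_to_digest nvcr.io/nvidia/pytorch:25.03-py3 "sha256:<64 hex chars>"
```

Deploying by digest means a retagged or removed upstream tag cannot silently change what your cluster runs.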

The Bottom Line

NGC is the most complete curated source for NVIDIA GPU-optimized software. The org/team/key model has real sharp edges that burn teams who treat it as a simple public registry. The entitlement boundary matters for production: community images work for development but carry no CVE SLA. Mirror everything to an internal registry before it touches a production cluster. For the full air-gapped lifecycle and CVE patching workflow, continue to Part 15: Air-Gapped Deployment and CVE Patching. Have a question about your NGC org setup? Drop it in the comments.



About the Author

Dr. Pranay Jha is a Cloud and AI Consultant with 18+ years of experience in hybrid cloud, virtualization, and enterprise infrastructure transformation. He specializes in VMware technologies, multi-cloud strategy, and Generative AI solutions. He holds a PhD in Computer Applications with research focused on Cloud and AI, has published multiple research papers, and has been a VMware vExpert since 2016 and a VMUG Community Leader.
