vSAN ESA vs OSA in VCF 9: Storage Design and When to Choose Which (VCF 9 Series, Part 6)

vSAN ESA vs OSA in VCF 9: single-tier all-NVMe versus two-tier disk groups, the resilience math, the lowered hardware minimums, and a clear verdict on which to choose.

By Dr. Pranay Jha · June 13, 2026

VCF 9 Series · Part 6 of 37

TL;DR · Key Takeaways

ESA is the default for VCF 9. OSA is supported but is now a legacy-preservation choice, not a new-build choice.
ESA is single-tier all-NVMe with no disk groups. It gives RAID-5/6 space efficiency at near-mirror performance.
Compression is always on in ESA. VCF 9.1 adds Auto-RAID and global deduplication.
ESA needs ReadyNode-certified NVMe TLC hardware. The Nov 2025 revision dropped the minimums to 16 cores and 128 GB RAM per host.
Verdict: any greenfield build or hardware refresh should be ESA. Choose OSA only to keep existing non-ESA hardware in service.

The question I still get asked in storage design sessions is whether to start a new VCF 9 cluster on vSAN OSA or ESA. In 2025 that was a real debate. In 2026 it mostly is not, and this post is about why, plus the narrow case where OSA still earns its place.

Two architectures, one default

OSA, the Original Storage Architecture, is the two-tier model you know: disk groups with a dedicated cache device fronting capacity devices. ESA, the Express Storage Architecture, is single-tier and all-NVMe. Every device serves both performance and capacity, there is no disk-group construct, and a new log-structured filesystem underpins it. In VCF 9, ESA is the recommended default for all new vSAN deployments and hardware refreshes. OSA is not deprecated, and it remains fully supported, but the platform is built around ESA now.

ESA pools identical NVMe devices; OSA fronts capacity disks with a dedicated cache per disk group.

ESA versus OSA at a glance

Dimension	vSAN ESA	vSAN OSA
Tiering	Single-tier, all-NVMe, no disk groups	Two-tier, cache + capacity disk groups
Resilience vs performance	RAID-5/6 at near-mirror performance	Trade-off: RAID-1 fast, RAID-5/6 slower
Compression	Always on (cluster service)	Optional, per-policy
Hardware	ESA ReadyNode, NVMe TLC required	Broader HCL, SAS/SATA SSD or hybrid
Performance vs OSA	2x to 5x on the same hardware (Broadcom)	Baseline
9.1 additions	Auto-RAID, global dedup, cross-mount	Maintained, not the focus of new features
Best for	Any new build or refresh	Reusing existing non-ESA hardware

Why ESA changes the resilience math

The big idea in ESA is that it removes the old OSA trade-off between mirroring and erasure coding. In OSA you chose RAID-1 for speed or RAID-5/6 for space efficiency and accepted the performance hit. ESA gives erasure-coded space efficiency at near-mirror performance, so you stop paying for resilience with latency. The capacity math is the reason it matters. FTT=1 RAID-1 mirroring costs 2x the raw capacity. ESA RAID-5 costs 1.25x on the 4+1 layout used at six or more hosts and 1.5x on the 2+1 layout used at three to five hosts (for comparison, the OSA RAID-5 3+1 layout costs about 1.33x). RAID-6 4+2 costs 1.5x while tolerating two failures. You get the same or better resilience with far less capacity burned. RAID-5/6 erasure coding is all-flash only, which ESA always is.

Compression moved from a per-policy toggle to an always-on cluster service. VCF 9.1 layers on Auto-RAID, a system-managed resilience model: clusters of 6 or more hosts default to FTT=2 with RAID-6, clusters of 3 to 5 hosts use FTT=1 with the 2+1 RAID-5 scheme, and the policy rule is removed so you stop hand-tuning it.
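The capacity ratios quoted in this section follow from simple stripe arithmetic: a mirror stores FTT + 1 full copies, and an erasure-coded stripe costs (data + parity) / data. A minimal sketch of that arithmetic is below; the function names and the scheme labels are illustrative choices, not vSAN identifiers.

```python
from fractions import Fraction


def mirror_overhead(ftt: int) -> Fraction:
    """RAID-1 stores ftt + 1 full copies, so raw cost is (ftt + 1)x."""
    return Fraction(ftt + 1)


def erasure_overhead(data: int, parity: int) -> Fraction:
    """A data+parity stripe costs (data + parity) / data raw per unit usable."""
    return Fraction(data + parity, data)


# Labels are illustrative; the layouts are the ones discussed in the article.
SCHEMES = {
    "RAID-1, FTT=1": mirror_overhead(1),
    "RAID-5 (2+1)": erasure_overhead(2, 1),
    "RAID-5 (3+1)": erasure_overhead(3, 1),
    "RAID-5 (4+1)": erasure_overhead(4, 1),
    "RAID-6 (4+2)": erasure_overhead(4, 2),
}

for name, ratio in SCHEMES.items():
    print(f"{name:<14} {float(ratio):.2f}x raw per unit of usable data")
```

Using exact fractions keeps the 3+1 case at 4/3 rather than a rounded 1.33, which matters if you chain the ratio into a larger sizing calculation.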

Raw capacity needed per unit of usable data, by resilience scheme.

In 9.1, cluster size drives the default resilience scheme automatically.

Why the hardware objection is gone

The historical argument for OSA was cost: ESA needed certified NVMe ReadyNodes and felt expensive to start. The November 2025 ReadyNode revision changed that. The profiles were consolidated and the floor dropped to 16 cores and 128 GB RAM per host, with up to a 67% RAM reduction and 33% core reduction on storage-cluster profiles. Minimums can be as low as 2 NVMe devices per host, though 3 to 4 is sensible so the 2+1 RAID-5 secondary resilience works without buying more disks. RAM scales with device count, roughly 128 GB at 12 devices up to 256 GB at 24. The space efficiency angle also feeds licensing: vSAN bundles 1 TiB of raw capacity per VCF core, and ESA always-on compression plus 9.1 global deduplication directly stretch that included entitlement, as detailed in the licensing breakdown.
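To see how the bundled entitlement compares with a planned cluster, the arithmetic can be sketched as follows. The only figure taken from the article is 1 TiB of raw capacity per VCF core; the host count, core count, and raw capacity in the example are hypothetical, and the sketch says nothing about how capacity beyond the entitlement is priced.

```python
def bundled_vsan_tib(hosts: int, cores_per_host: int,
                     tib_per_core: float = 1.0) -> float:
    """Raw vSAN capacity included with the licensed cores.

    tib_per_core defaults to the 1 TiB per VCF core figure from the article.
    """
    return hosts * cores_per_host * tib_per_core


def raw_tib_beyond_entitlement(raw_tib: float, hosts: int,
                               cores_per_host: int) -> float:
    """Raw capacity that exceeds the bundled amount (0 if it all fits)."""
    return max(0.0, raw_tib - bundled_vsan_tib(hosts, cores_per_host))


# Hypothetical cluster: 6 hosts, 32 cores each, 240 TiB raw in total.
included = bundled_vsan_tib(hosts=6, cores_per_host=32)
excess = raw_tib_beyond_entitlement(raw_tib=240, hosts=6, cores_per_host=32)
print(f"included: {included:.0f} TiB, beyond entitlement: {excess:.0f} TiB")
```

Because the entitlement is counted against raw capacity, every point of compression or deduplication raises the amount of workload data that fits inside the same included TiB.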

When OSA still makes sense

Exactly one scenario: you are reusing existing two-tier, SAS-SSD, or hybrid hardware that is not ESA-ReadyNode-certified, and you want to keep that gear in service to end of life. OSA is the preserve-your-investment path. It is not the build-something-new path. If you are buying any new disks, you are buying NVMe TLC, and that means ESA. The placement of these clusters in the wider design is in the reference architecture deep-dive.

Sizing ESA hosts without overbuying

The ReadyNode profiles after the November 2025 revision collapse to three sizes each for the two families you will actually use: vSAN-HCI-SM, MED, and LRG for hyperconverged clusters, and vSAN-SC-SM, MED, and LRG for storage clusters. RAM scales with device count, roughly 128 GB at 12 devices, 192 GB at 18, and 256 GB at 24. A practical greenfield starting point is 4 NVMe devices per host on PCIe Gen5, or 6 on Gen4, which gives you enough devices for the 2+1 RAID-5 scheme to land its secondary resilience without a later disk purchase. The two-device floor exists, but I would not build production on it, because you lose the option to tolerate a device failure during a rebuild. Read-Intensive TLC is supported for most use cases, so you are not forced into the most expensive Mixed-Use drives unless your write profile genuinely demands them.

I signed off a small ESA cluster built on the two-device minimum once, purely to hit a budget line, and it bit us during the first drive failure. With only two devices per host there was no room for the 2+1 RAID-5 scheme to rebuild its secondary resilience locally, so a routine disk swap turned into a nervous few hours watching objects sit at reduced redundancy until the replacement resynced. We added a third and fourth device per host at the next maintenance window and the problem never came back. The two-device floor is real and supported, but I would not build production on it again. Start at four.
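The floors and starting points in this section can be turned into a quick pre-purchase check. This is a sketch under stated assumptions: the thresholds are the figures quoted in the article, the `HostSpec` type and `review` function are hypothetical names, and treating the three RAM figures as step tiers is my reading of "roughly", not a published rule. Validate real configurations against the current ReadyNode listings.

```python
from dataclasses import dataclass

MIN_CORES = 16            # per-host floor quoted in the article
MIN_RAM_GB = 128          # per-host floor quoted in the article
MIN_NVME_DEVICES = 2      # supported floor
SUGGESTED_NVME_DEVICES = 4  # the article's production starting point

# (device count, RAM in GB) points quoted in the article.
# Assumption: treated here as step tiers, not an official formula.
RAM_POINTS = [(12, 128), (18, 192), (24, 256)]


@dataclass
class HostSpec:
    cores: int
    ram_gb: int
    nvme_devices: int


def suggested_ram_gb(devices: int) -> int:
    """Smallest quoted RAM tier whose device count covers `devices`."""
    for max_devices, ram_gb in RAM_POINTS:
        if devices <= max_devices:
            return ram_gb
    raise ValueError("device count is beyond the points quoted in the article")


def review(spec: HostSpec) -> list[str]:
    """Return human-readable findings; an empty list means no concerns."""
    findings = []
    if spec.cores < MIN_CORES:
        findings.append(f"cores {spec.cores} below floor of {MIN_CORES}")
    needed_ram = MIN_RAM_GB
    if spec.nvme_devices <= RAM_POINTS[-1][0]:
        needed_ram = max(needed_ram, suggested_ram_gb(spec.nvme_devices))
    if spec.ram_gb < needed_ram:
        findings.append(f"RAM {spec.ram_gb} GB below suggested {needed_ram} GB")
    if spec.nvme_devices < MIN_NVME_DEVICES:
        findings.append(f"{spec.nvme_devices} NVMe device(s) is below the floor")
    elif spec.nvme_devices < SUGGESTED_NVME_DEVICES:
        findings.append("supported, but below the four-device starting point")
    return findings


print(review(HostSpec(cores=16, ram_gb=128, nvme_devices=2)))
print(review(HostSpec(cores=32, ram_gb=256, nvme_devices=4)))
```

A host at the exact minimums passes the hard floors but still draws the device-count warning, which is the distinction between "supported" and "sensible for production" that this section argues for.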

Disaggregation and cross-mounting in 9.1

The disaggregated model, once branded vSAN Max and now called vSAN storage clusters, separates storage hosts from compute and requires ESA. VCF 9.1 made it materially more flexible. A storage cluster can now be shared across vCenter boundaries the way a traditional array is, and a compute-only cluster can mount both OSA and ESA datastores. The old cross-mount limit that stopped an ESA datastore being mounted to an OSA cluster is gone in 9.1. This matters for design because it lets you stand up a dense ESA storage tier and feed it to multiple compute clusters without forcing every cluster onto the same architecture overnight, which is a far gentler migration story than a forklift.

A note on the vendor numbers

Broadcom quotes 2x to 5x performance over OSA on the same hardware, global deduplication up to 8x in 9.1, and lower TCO than external arrays. Treat the headline ratios as vendor claims rather than independent benchmarks, because real numbers depend on your block size, working-set locality, and how compressible your data actually is. The directional point holds regardless: ESA removes the resilience-versus-performance trade-off and the always-on data services stretch your bundled TiB. Just size from your own workload profile, not from a slide.

Storage policy and the usable capacity math

Choosing ESA is half the decision. The other half is the storage policy, because that is what turns raw capacity into usable capacity, and it is where teams either overbuy or quietly run out of space. On ESA you can run RAID-5 or RAID-6 with the performance that used to force you onto RAID-1, so the old reflex of mirroring everything for speed is no longer necessary.

Policy	FTT	Min hosts	Capacity overhead	Use for
RAID-1 mirror	1	3	2.0x (100 percent)	small clusters, latency-critical objects
RAID-5 (2+1)	1	3	1.5x (50 percent)	general, 3 to 5 hosts
RAID-5 (4+1)	1	6	1.25x (25 percent)	general, 6 or more hosts
RAID-6 (4+2)	2	6	1.5x (50 percent)	higher resilience, 6 or more hosts

Work an example. Take a six-host cluster with 40 TiB raw per host, so 240 TiB raw. Reserve one host of headroom for maintenance and rebuild, leaving 200 TiB. Apply RAID-5 (4+1) at 1.25x and you land at 160 TiB usable before slack. Keep 25 to 30 percent free for vSAN operations and you are budgeting around 112 to 120 TiB of real workload space. That gap between 240 raw and about 115 usable is the number people forget when they size on raw terabytes alone.
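The worked example above can be written as a small function so you can rerun it with your own host count and policy. This is a sketch of the article's arithmetic only; the function and parameter names are hypothetical, and the one-host reserve and 25 to 30 percent free space are the article's planning assumptions rather than values read from a cluster.

```python
def workload_capacity_tib(hosts: int, raw_per_host_tib: float,
                          overhead: float, free_fraction: float,
                          reserved_hosts: int = 1) -> dict:
    """Walk raw capacity down to budgeted workload space.

    overhead      -- raw per unit usable, e.g. 1.25 for RAID-5 (4+1)
    free_fraction -- share of usable space kept free for vSAN operations
    """
    raw = hosts * raw_per_host_tib
    after_reserve = (hosts - reserved_hosts) * raw_per_host_tib
    usable = after_reserve / overhead
    workload = usable * (1.0 - free_fraction)
    return {
        "raw": raw,
        "after_reserve": after_reserve,
        "usable": usable,
        "workload": workload,
    }


# The article's example: six hosts, 40 TiB raw each, RAID-5 (4+1).
for free in (0.25, 0.30):
    r = workload_capacity_tib(hosts=6, raw_per_host_tib=40,
                              overhead=1.25, free_fraction=free)
    print(f"free={free:.0%}: raw {r['raw']:.0f}, "
          f"usable {r['usable']:.0f}, workload {r['workload']:.0f} TiB")
```

Swapping `overhead` for 1.5 shows what the same hardware holds under RAID-6 (4+2) or RAID-5 (2+1), which is the comparison worth running before a purchase order goes out.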

In 9.1, Auto-RAID picks the scheme for you, RAID-6 at six hosts or more and RAID-5 (2+1) at three to five, so on newer builds this becomes a policy you review rather than one you set by hand. The math still matters, because it is what tells you whether the cluster you are about to buy actually holds what you think it does.
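The selection rule described here is simple enough to mirror in a planning script, so a sizing spreadsheet and the cluster agree on which overhead applies. The sketch below encodes only the host-count rule as stated in this article; `auto_raid_default` is a hypothetical helper, not a vSAN API, and the behaviour below three hosts is not covered by the article, so the function refuses to guess.

```python
def auto_raid_default(hosts: int) -> tuple[str, int, float]:
    """Return (scheme, FTT, capacity overhead) per the rule in the article.

    Six or more hosts: RAID-6 (4+2), FTT=2, 1.5x.
    Three to five hosts: RAID-5 (2+1), FTT=1, 1.5x.
    """
    if hosts >= 6:
        return ("RAID-6 (4+2)", 2, 1.5)
    if 3 <= hosts <= 5:
        return ("RAID-5 (2+1)", 1, 1.5)
    # Assumption: the article does not describe smaller clusters.
    raise ValueError("fewer than 3 hosts is outside the rule described here")


for n in (3, 5, 6, 12):
    scheme, ftt, overhead = auto_raid_default(n)
    print(f"{n} hosts -> {scheme}, FTT={ftt}, {overhead}x raw per usable")
```

Both defaults carry a 1.5x overhead, so under Auto-RAID the step from five to six hosts buys a second failure of tolerance without changing the capacity ratio.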

I let a hardware worry push a design to OSA once, and the cluster paid for it for two years. The team was nervous about ESA on newer drives and chose the familiar architecture, then spent the whole next refresh cycle wishing they had the write performance and space efficiency ESA would have delivered on the exact same hardware. Migrating an OSA cluster to ESA later is a rebuild, not a toggle. Unless you have a concrete reason ESA cannot run on your drives, choose it for any new build, because the architecture you pick on day one is the one you live with until the next hardware cycle.

Choose ESA for any new build

For any VCF 9 greenfield build or hardware refresh, choose vSAN ESA. Full stop. You get erasure-coded RAID-5/6 at mirror-class performance, always-on compression plus 9.1 global dedup that multiplies your bundled per-core TiB, and hardware minimums low enough that the old cost objection no longer holds. On 9.1, leave Auto-RAID enabled and let the cluster manage resilience. OSA is now strictly a legacy-preservation choice: pick it only to keep existing non-ESA-certified hardware running to end of life. Do not start a new OSA cluster in VCF 9. If you are repurposing an existing fleet, what share of your current vSAN hardware is ESA-ReadyNode-certified today?

About The Author

Dr. Pranay Jha

Dr. Pranay Jha is a Cloud and AI Consultant with 18+ years of experience in hybrid cloud, virtualization, and enterprise infrastructure transformation. He specializes in VMware technologies, multi-cloud strategy, and Generative AI solutions. He holds a PhD in Computer Applications with research focused on Cloud and AI, has published multiple research papers, and has been a VMware vExpert since 2016 and a VMUG Community Leader.
