Understanding Bare Metal vs Virtualization for Neoclouds
The Neocloud trade has evolved. We are past the initial phase where access to H100s was the sole determinant of value. As the market matures, the alpha has shifted to compute efficiency.
For investors, the lens through which to evaluate the long-term moat of a Neocloud provider is not its GPU count, but its architectural choice: Bare Metal versus Virtualization. This decision is one of the most important determinants of a provider’s gross margin potential and the ultimate ROIC of the asset.
In this analysis, I dissect the structural divergence between these two delivery models. I will argue that while legacy virtualization is a tax on performance, the market is not purely binary. The winner will be determined by “Direct Silicon Access”, whether achieved through raw metal or optimized HPC virtualization.
Disclosure: I hold a long position in Nebius ($NBIS). While Nebius utilizes a virtualized stack, the thesis below focuses on the general physics of AI infrastructure and the unit economics that govern the sector.
The Infrastructure Spectrum
AI Clouds typically utilize compute across a spectrum of abstraction, and the alpha lies in understanding the asymmetry between them.
Bare Metal (BM) Compute is the provision of a dedicated physical server to a single tenant. In the Neocloud context, the client receives exclusive access to the entire hardware stack: the host CPU, system memory, local storage, and the high-end accelerators with their interconnects. There is no hypervisor layer, no software abstraction, and no resource sharing. The client’s workload runs directly on the silicon. This model represents the maximum possible utilization of the hardware’s theoretical performance capacity and, crucially, grants the provider Asymmetric Optionality. A provider with a Bare Metal architecture can always choose to install a virtualization layer to service smaller, less demanding clients; however, a legacy provider built entirely on a virtualized orchestration stack cannot easily strip away that layer to offer true Bare Metal without significant re-architecture. Bare Metal is the foundational asset class.
Virtualization (VM/vGPU) Compute, conversely, is the multi-tenant model where a hypervisor abstracts the physical hardware. However, we must apply a nuance here that the general market often misses: not all virtualization is created equal. While legacy hyperscalers use heavy virtualization layers that incur significant performance penalties, top-tier Neoclouds like Nebius or CoreWeave have deployed Optimized Virtualization. By utilizing technologies such as Single Root I/O Virtualization (SR-IOV) and PCI passthrough, these providers allow the Virtual Machine to bypass the host OS and access the GPU directly. This dramatically reduces the “Hypervisor Tax,” allowing these specific virtualized environments to perform nearly on par with Bare Metal for many workloads. Therefore, the analysis is not simply “Bare Metal versus Virtualization,” but rather “Direct Silicon Access versus Abstraction.” The winner is not necessarily the one who eliminates the software layer entirely, but the one who reduces the distance between the instruction and the execution to zero, whether through physical isolation or advanced software engineering.
The Performance Imperative: Commodity vs. HPC Virtualization
The most compelling argument for the Bare Metal model is the elimination of the “Hypervisor Tax.” However, for the sake of intellectual honesty, we must distinguish between the Commodity Virtualization of legacy hyperscalers and the HPC Virtualization employed by top-tier Neoclouds.
In a Commodity Virtualization environment (typical of general-purpose clouds), the hypervisor intercepts and manages Direct Memory Access (DMA) and interrupts. When applied to massive GPU clusters, this abstraction layer acts as a drag coefficient on latency. Industry benchmarks have shown that for training large transformer models on these legacy stacks, the overhead can reduce effective throughput by up to 30% compared to bare metal. This 30% delta extends training runs, burns more electricity, and slows time-to-market.
However, the top-performing Neoclouds (including those utilizing VMs like Nebius) have largely mitigated this through HPC Virtualization. They leverage technologies like SR-IOV (Single Root I/O Virtualization) and PCIe Passthrough, which allow the GPU to bypass the hypervisor and communicate directly with the network interface cards (NICs). This “Passthrough” architecture reduces the overhead to 1-3%.
Therefore, the structural advantage of Bare Metal is that it is “correct by default,” whereas Virtualization requires elite engineering to not be a bottleneck. The market bifurcation, therefore, is not just BM vs. VM, but rather Optimized Interconnects vs. Legacy Abstraction.
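To visualize the bifurcation, here is a toy model of effective throughput under the three stacks. The peak-TFLOPS figure and the overhead percentages are illustrative approximations drawn from the discussion above, not vendor benchmarks:

```python
# Illustrative sketch: effective training throughput under the overhead
# figures discussed above. All numbers are rough estimates, not measurements.

BASELINE_TFLOPS = 990.0  # assumed peak FP8 throughput of a single H100, for scale

overheads = {
    "Bare Metal": 0.00,                # direct silicon access, no hypervisor
    "HPC Virtualization": 0.02,        # SR-IOV / PCIe passthrough, ~1-3% tax
    "Commodity Virtualization": 0.30,  # legacy hypervisor stack, up to ~30% tax
}

for stack, tax in overheads.items():
    effective = BASELINE_TFLOPS * (1 - tax)
    print(f"{stack:>25}: {effective:7.1f} effective TFLOPS ({tax:.0%} hypervisor tax)")
```

The point is not the absolute numbers but the spread: an elite virtualized stack sits within a rounding error of bare metal, while a commodity stack gives back nearly a third of the silicon.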
The Unit Economics: TCO, Egress, and the Cost of Idle Time
The performance advantage of Bare Metal can be analyzed through the Total Cost of Ownership (TCO) prism for clients with high utilization profiles.
Hyperscalers, operating under a high-overhead virtualization model, typically price on-demand H100 access in the range of four to five dollars per GPU-hour. Specialized Bare Metal Neoclouds, leveraging a leaner operational model and a focus on high-density, purpose-built infrastructure, are often observed pricing the same hardware in the two to three-and-a-half dollar per GPU-hour range (again, this is an average). This represents a direct reduction of 30% to 50% on the primary compute component.
However, the true economic leverage for the client is derived from the Utilization Multiplier and the Cost of Idle Time. As established, the Bare Metal environment can complete the same training job in 30% less time due to the elimination of the Hypervisor Tax. When this performance gain is combined with the lower base hourly rate, the effective cost advantage for a high-utilization client compounds. While Neoclouds are managed services, they capture a significant portion of this TCO benefit for their clients. Consider a client running a model training job that takes 1,000 hours on a virtualized instance at $4.50/hour: the total cost is $4,500. The same job might take only 700 hours on a Bare Metal instance at $3.00/hour, resulting in a total cost of $2,100. This represents a 53% direct cost reduction on the compute alone, before factoring in the time-to-market advantage of the faster completion. This is a structural cost advantage that cannot be ignored by any serious AI enterprise.
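The arithmetic above can be made explicit. A minimal sketch, using the illustrative prices from the text (not actual quotes from any provider):

```python
# Worked version of the TCO comparison: same job, two delivery models.

def job_cost(hours: float, rate_per_gpu_hour: float) -> float:
    """Total compute cost for a single training job on one GPU instance."""
    return hours * rate_per_gpu_hour

virtualized = job_cost(1_000, 4.50)  # legacy virtualized instance
bare_metal = job_cost(700, 3.00)     # same job, ~30% faster, lower rate

savings = 1 - bare_metal / virtualized
print(f"Virtualized: ${virtualized:,.0f}")   # $4,500
print(f"Bare Metal:  ${bare_metal:,.0f}")    # $2,100
print(f"Direct cost reduction: {savings:.0%}")
```

Note how the two levers multiply: a 30% time saving times a 33% rate saving yields a 53% total reduction, which is why the advantage is larger than either lever alone.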
Nebius’s whitepaper on the economics of AI clusters provides a crucial framework for understanding the Cost of Idle Time. In a large, virtualized cluster, the compounding latency and jitter from the hypervisor and shared network resources mean that a significant portion of the total allocated GPU time is spent waiting for data synchronization, not computing. This idle time is still billed to the client at the full hourly rate. The paper suggests that for large-scale, multi-node training, the effective utilization of a virtualized cluster can drop below 60%, meaning over 40% of the client’s spend is effectively wasted on non-compute time. Bare Metal, by contrast, with its deterministic performance and dedicated interconnects, can push effective utilization rates above 90%. The economic thesis for Bare Metal is therefore simple: it converts billed time into productive compute time, dramatically reducing the effective cost per useful computation.
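The idle-time argument reduces to a single ratio: effective cost per productive GPU-hour. A minimal sketch, using the rough utilization and price figures from this section:

```python
# The billed hourly rate understates the true cost once idle time is included.

def effective_rate(billed_rate: float, utilization: float) -> float:
    """Cost per *productive* GPU-hour, given the fraction of billed time
    actually spent computing rather than waiting on synchronization."""
    return billed_rate / utilization

virtualized = effective_rate(4.50, 0.60)  # ~60% effective utilization
bare_metal = effective_rate(3.00, 0.90)   # ~90% effective utilization

print(f"Virtualized: ${virtualized:.2f} per useful GPU-hour")  # $7.50
print(f"Bare Metal:  ${bare_metal:.2f} per useful GPU-hour")
```

On these assumptions the headline 1.5x price gap widens to more than 2x once wasted hours are counted, which is the whole thesis in one division.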
Beyond the compute cost, the second, often hidden, economic factor is the Egress Fee Arbitrage. For data-intensive AI workloads, the cost of moving data out of the cloud—the egress fee—can become a crippling expense. Training foundational models requires constant movement of massive datasets, often measured in petabytes, for initial training, fine-tuning, and deployment. Hyperscalers charge substantial, complex, and tiered fees for data leaving their network, which can inflate the total infrastructure bill by 15% to 25% or more for data-heavy users. Neoclouds, by design, often offer simpler, lower, or even non-existent egress fees as a core part of their value proposition. Their business model is predicated on selling high-utilization compute, not on monetizing data transfer. For a client whose business model depends on the constant flow of proprietary data, this egress fee arbitrage alone can provide sufficient economic justification to migrate to a Bare Metal Neocloud provider, creating a powerful, sticky relationship.
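A rough sketch of the egress arbitrage follows. The per-GB rate and transfer volume are hypothetical, chosen only to show how transfer fees compound on top of compute:

```python
# Illustrative egress math: how data-transfer fees inflate a hyperscaler bill.
# All inputs are hypothetical round numbers, not published price sheets.

compute_bill = 100_000.0        # monthly compute spend, USD
egress_tb = 500                 # monthly data moved out of the cloud, TB
hyperscaler_rate_per_gb = 0.05  # assumed blended egress rate, USD/GB
neocloud_rate_per_gb = 0.0      # many Neoclouds waive egress entirely

hyperscaler_egress = egress_tb * 1_000 * hyperscaler_rate_per_gb
neocloud_egress = egress_tb * 1_000 * neocloud_rate_per_gb

print(f"Hyperscaler egress: ${hyperscaler_egress:,.0f} "
      f"({hyperscaler_egress / compute_bill:.0%} on top of compute)")
print(f"Neocloud egress:    ${neocloud_egress:,.0f}")
```

With these assumed inputs, egress alone adds a quarter to the bill, consistent with the 15-25% inflation range cited above.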
The TCO model for a large-scale AI deployment must be viewed through the lens of a financial services firm analyzing a capital investment. The key variables are not just the hourly rate, but the total cost of the hardware over its useful life, the operational expenses (OpEx) for power and cooling, and the network architecture costs. A hyperscaler’s virtualized environment bundles these costs into a single, high hourly rate, obscuring the underlying inefficiencies. A Bare Metal Neocloud, by contrast, operates on a model where the client effectively leases the hardware at a rate that more closely reflects the true depreciation and operational cost, plus a reasonable margin. The Neocloud’s ability to achieve higher power usage effectiveness (PUE) and operational efficiency in a purpose-built data center further widens this TCO gap, a margin that is ultimately passed on to the client in the form of lower effective pricing. The whitepaper further emphasizes that the CapEx-to-OpEx ratio is fundamentally different: the Neocloud model is CapEx-heavy for the provider but OpEx-efficient for the client, a structure that rewards long-term, high-volume consumption.
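To make the CapEx-to-OpEx point concrete, here is a back-of-envelope break-even model for the provider side. Every input (server cost, depreciation horizon, hourly OpEx, utilization) is a hypothetical round number, not disclosed provider data:

```python
# Back-of-envelope provider economics for the CapEx-heavy / OpEx-efficient
# structure described above. All inputs are illustrative assumptions.

gpu_capex = 30_000.0             # assumed all-in cost per H100 slot, USD
useful_life_hours = 4 * 365 * 24  # 4-year straight-line depreciation horizon
opex_per_hour = 0.40             # assumed power + cooling + ops, USD/GPU-hour
utilization = 0.85               # fraction of lifetime hours actually billed

# Break-even hourly rate the provider must clear on billed hours:
# (hourly depreciation + hourly OpEx) spread over billed hours only.
breakeven = (gpu_capex / useful_life_hours + opex_per_hour) / utilization
print(f"Break-even rate: ${breakeven:.2f}/GPU-hour")
```

Under these assumptions break-even lands near $1.50/GPU-hour, which is why a Neocloud selling at $2.00-$3.50 can still earn a healthy gross margin, and why utilization, not list price, is the variable to underwrite.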
Solving the “Usability Gap”
While the economic and performance arguments for Bare Metal are irrefutable, they introduce a distinct operational challenge that the seasoned investor must not overlook: The Usability Gap.
Historically, the trade-off for Bare Metal’s performance was operational pain. In a raw Bare Metal environment, the client is often responsible for the operating system, the drivers, the network configuration, and the handling of node failures. For a fast-moving AI startup, this engineering overhead is a distraction from their core mission of model training, and often beyond their capacity.
Tier-1 Neoclouds have solved this paradox by building a Bare Metal Orchestration Layer (typically Managed Kubernetes).
This software layer allows a client to provision a thousand bare metal nodes with the same ease and speed as spinning up a virtual instance on a hyperscaler, but without the performance-killing hypervisor abstraction. The orchestration software handles the physical hardware (failed drives, bad cables, non-responsive nodes) and presents a clean API to the user.
This brings us back to the Execution Caveat regarding optionality, where the divergence in the asset class becomes visible. On one side, you have “Rack Renters”, i.e. providers who simply lease HGX boxes with an IP address. While they theoretically could build a virtualization or orchestration layer, they lack the software DNA to do so. They are effectively trapped at the hardware layer, unable to move up the value chain. On the other side, you have Platform Neoclouds (like Nebius or CoreWeave). These firms have invested heavily in the software stack that wraps the bare metal. They have successfully executed on the optionality, bridging the gap between raw silicon performance and cloud-like usability.
For investors, this distinction is vital: they are not solely underwriting GPU purchases but also the software capabilities that make those GPUs consumable. A Neocloud that relies on virtualization (like Nebius) but couples it with a world-class, purpose-built orchestration stack often outperforms a “pure” Bare Metal provider that lacks the software engineering talent to manage the cluster effectively. The winner is not the one with the most metal, but the one who makes the metal easiest to wield.
Market Segmentation and Client Preference
The choice between Bare Metal and Virtualization is not a matter of one being universally superior; rather, it is a clear segmentation of the AI infrastructure market based on workload requirements and client maturity.
Bare Metal is usually the compute of choice for large enterprises and frontier AI labs in the most performance-critical segments, because they have the capacity to develop their own software layer. These are the companies engaged in multi-billion-parameter model training where every percentage point of performance gain translates into millions of dollars saved. Their workloads are predictable and demand the maximum throughput from the underlying hardware. For these clients, the risk of performance degradation from the hypervisor layer is simply not acceptable as a business risk. They are willing to trade the hyperscaler’s granular flexibility for the performance and TCO advantage of a dedicated environment. This segment includes the major independent AI research firms and the internal AI divisions of large technology companies focused on building proprietary foundation models. For instance, OpenAI has signed compute agreements with CoreWeave ($CRWV) for precisely this reason.
Furthermore, latency-sensitive inference and real-time systems also gravitate toward Bare Metal. This includes high-frequency trading platforms, where microseconds translate directly into millions gained or lost; real-time computer vision systems used in manufacturing or defense; and autonomous vehicle control systems. These applications require low-latency response times where the jitter introduced by a hypervisor is again not acceptable. The direct hardware access afforded by Bare Metal is a hard requirement for meeting stringent Service Level Agreements (SLAs) in these sectors. This has been particularly the case for the financial services industry.
A third client segment for Bare Metal services comprises High-Utilization Workloads, which are economically optimized for the model. Clients with long-running jobs (who can commit to utilization rates above 70%) are the ideal economic fit. The Neocloud provider can offer a lower hourly rate because they have higher confidence in their asset utilization, creating a virtuous cycle of lower cost for the client and higher gross margin for the provider. This is the segment where the Neoclouds build their defensible revenue base through long-term contracts and reserved instances.
Conversely, Virtualization remains the dominant and appropriate choice for other market segments.
First, General-Purpose Compute and Microservices will continue to rely on virtualization. Web servers, databases, and general application backends do not require specialized accelerators or the ultra-low latency of a dedicated interconnect fabric. For these workloads, the flexibility, rapid provisioning, and ease of scaling offered by the traditional VM model are perfectly adequate and often preferred. The overhead of a hypervisor is negligible for these CPU-bound tasks.
Second, R&D and Burstable Workloads are best suited for virtualization. Projects in the early stages of development or workloads with highly variable demand profiles benefit most from the instant scalability and pay-as-you-go granularity of the VM model. The performance penalty is irrelevant when the primary requirement is rapid iteration and cost control for intermittent usage. This is the entry point for many new AI startups and the sandbox environment for established enterprises, allowing them to experiment and fail fast without committing to dedicated resources. The ability to spin up and tear down hundreds of virtual instances in minutes is a powerful operational advantage for these use cases, one that Bare Metal, with its inherent provisioning latency, cannot easily match.
Third, legacy enterprises often default to virtualization. Companies with existing IT infrastructure and a preference for the familiar software layer of the traditional hyperscaler model will find the VM environment easier to integrate and manage, even if it is not the most economically efficient choice for their emerging AI workloads. For these firms, the complexity of managing a Bare Metal environment often outweighs the TCO benefits, at least until their AI spend reaches a critical mass that forces a re-evaluation of the infrastructure strategy. This segment represents a massive, slow-moving pool of capital that will eventually be forced to confront the economic reality of the Hypervisor Tax as their AI deployments scale. The Neoclouds are not targeting this segment today, but they are positioning themselves to capture the moment this segment’s AI spend matures and demands performance.
The Structural Moat of Neoclouds
The traditional hyperscaler model is optimized for millions of small and variable web workloads, and it is fundamentally sub-optimal for the intensity of AI training.
The Neocloud’s competitive moat is built on three pillars of what I call “Architectural Specialization”:
1. The Density Advantage (The Thermodynamics Moat): Legacy cloud data centers were designed for standard rack densities (8–12kW). Modern AI clusters require densities of 40kW to 100kW+ per rack. Neoclouds build “AI-Native” facilities from the floor up, optimized for liquid cooling and extreme power density. This results in a structurally lower Power Usage Effectiveness (PUE) and lower OpEx. A hyperscaler attempting to retrofit a 2015-era data center for 2025-era AI clusters faces massive inefficiency and capital cost.
2. Network Topology: In AI, the network is the computer. Hyperscalers rely on generalized Ethernet topologies designed for multi-tenancy and redundancy. Top-tier Neoclouds deploy Non-Blocking Network Fabrics (using InfiniBand or optimized RoCE) that ensure all GPUs can communicate at full line rate. Whether delivered via Bare Metal or Optimized Virtualization, the physical network topology of a Neocloud is flatter and faster, reducing the “tail latency” that kills training runs.
3. The “Unbundling” of the Software Stack: Hyperscalers are trapped by their own “Golden Handcuffs”, i.e. complex proprietary software stacks that justify their high margins. They cannot offer low-level hardware access without cannibalizing their core business model. Neoclouds, conversely, offer “Hardware Fidelity.” They give the client control over the kernel and the drivers. For a Platform Neocloud this means offering a software layer that empowers the user rather than restricting them. They don’t force you into a proprietary ecosystem; they give you a clean, Kubernetes-native environment that feels like the cloud but performs like a supercomputer.
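The density advantage in pillar 1 is ultimately PUE arithmetic: the same IT load costs materially more to run in a legacy facility. A rough sketch, with illustrative PUE values and an assumed power price:

```python
# PUE arithmetic behind the "Thermodynamics Moat". PUE values and the
# electricity price are illustrative assumptions, not facility data.

it_load_kw = 1_000        # IT (GPU) power draw for a cluster segment, kW
power_price = 0.08        # assumed electricity cost, USD per kWh
hours_per_year = 8_760

for label, pue in [("AI-native (liquid-cooled)", 1.1), ("Legacy retrofit", 1.5)]:
    # Total facility draw = IT load * PUE (cooling, conversion losses, etc.)
    annual_cost = it_load_kw * pue * power_price * hours_per_year
    print(f"{label:>25}: PUE {pue} -> ${annual_cost:,.0f}/year")
```

On these assumptions, the same megawatt of GPUs costs roughly $280k more per year to power in the retrofit facility, a gap that flows straight to gross margin.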
Hyperscalers are not blind to this; they are responding with “Dedicated Hosts.” However, these are often compromises (features layered onto a VM-centric architecture rather than commitments). A hyperscaler’s “Bare Metal” offering is frequently constrained by legacy network topology and general-purpose cooling. The Neoclouds maintain a significant first-mover advantage not just in deploying GPUs, but in operating the specialized physical plant required to run them efficiently. They are not competing on the same curve; they are building a different machine entirely.
Conclusion
The rise of the Neocloud is a direct response to the physics of modern AI, which rejects the abstraction layers of the past decade. The market is clearly bifurcating into two distinct infrastructure classes.
Virtualization will continue to dominate general-purpose compute. The market for web services, microservices, and variable workloads is mature, highly competitive, and characterized by commoditization. However, mission-critical AI workloads will gravitate toward Direct Silicon Access. This is the high-growth, high-margin segment. But for the investor, the “Alpha” is not found by simply buying into Bare Metal, but into the platform layer (PaaS), which may also include VMs.
Providers who offer raw metal with no software differentiation are to be avoided. While they possess the architectural advantage of Bare Metal, they lack the software DNA to move up the value chain. They are destined for margin compression, trapped as commodity hardware lessors.
My high-conviction lies with the Platform Neoclouds, the providers who combine the raw performance of Bare Metal with the elite software engineering required to build orchestration layers and high-performance virtualization. They do not just own the silicon; they make it consumable.
The long-term winners in this infrastructure race will be those who understand a dual truth:
Physics: The shortest path between the workload and the silicon is the most profitable.
Operations: The hardware is useless without the software to wield it.
Disclaimer: This report is for informational purposes only and does not constitute investment advice. Sources and arguments are based on market data and industry analysis as of December 2025.

