From Elastic Compute to Specialised Compute

Written by Klecha & Co. | Jan 19, 2026 10:14:22 AM

When cloud computing emerged in the late 2000s, its economic value derived from elasticity: virtualised resources abstracted hardware, allowing enterprises to rent compute as needed.

This model thrived on homogeneity: standard CPUs, commoditised storage, and vast datacentres managed at global scale. Artificial intelligence dismantles this assumption.

Model training requires tightly coupled clusters of accelerators; inference demands low-latency distribution close to users. The cloud’s historic abstraction, “compute as a service”, becomes a bottleneck because performance now depends on physical topology.
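
To see why training clusters must be tightly coupled, consider gradient synchronisation. The minimal sketch below (in Python, with illustrative figures of our own choosing, not vendor specifications) estimates the per-step cost of a ring all-reduce, a collective operation commonly used to average gradients across accelerators:

    # Illustrative sketch: why training clusters must be tightly coupled.
    # In a ring all-reduce, each of N workers transfers roughly
    # 2 * (N - 1) / N times the gradient payload over its link.
    # All figures below are assumptions for illustration, not vendor specs.

    def allreduce_seconds(params_billions: float, bytes_per_param: int,
                          workers: int, link_gbps: float) -> float:
        payload_bytes = params_billions * 1e9 * bytes_per_param
        traffic_bytes = 2 * (workers - 1) / workers * payload_bytes
        link_bytes_per_s = link_gbps * 1e9 / 8
        return traffic_bytes / link_bytes_per_s

    # A hypothetical 70B-parameter model, fp16 gradients, 64 workers:
    for gbps, fabric in [(400, "tightly coupled GPU fabric"),
                         (10, "general-purpose cloud network")]:
        secs = allreduce_seconds(70, 2, 64, gbps)
        print(f"{fabric}: ~{secs:.0f} s per gradient synchronisation")

Under these assumptions, synchronisation takes a few seconds over a 400 Gbps fabric but several minutes over a commodity 10 Gbps network, dwarfing the computation itself. That is why placement on the physical network, not just access to raw compute, determines performance.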

AI workloads are spatially sensitive: moving data hundreds of kilometres adds milliseconds that degrade real-time inference. Consequently, hyperscalers are redesigning datacentres around zoned specialisation: GPU superpods for training, near-edge nodes for inference, and storage tiers optimised for high-bandwidth data pipelines.
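
A back-of-envelope calculation makes the distance penalty concrete (a minimal sketch assuming signal propagation in optical fibre at roughly 200,000 km/s, about two-thirds of the speed of light; real routes add switching and queueing on top of this floor):

    # Propagation-delay floor for moving data between distant sites.
    # Assumes ~200,000 km/s in optical fibre; routing, switching and
    # queueing add further delay on top of this physical lower bound.

    FIBRE_KM_PER_MS = 200.0  # 200,000 km/s = 200 km per millisecond

    def round_trip_ms(distance_km: float) -> float:
        return 2 * distance_km / FIBRE_KM_PER_MS

    for km in (50, 300, 1000):
        print(f"{km:>5} km: >= {round_trip_ms(km):.1f} ms round trip")

A 1,000 km round trip costs at least 10 ms before any processing happens, enough to erode an interactive inference budget, hence the push towards near-edge placement.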

In 2025, over 40 percent of new datacentre investment by hyperscalers was allocated to GPU-dense configurations [9]. Microsoft’s Arizona West region, for example, dedicates 60 percent of its rack capacity to AI accelerators, while Amazon’s Nova infrastructure integrates custom Trainium 2 and Inferentia 3 chips to reduce token-processing costs by 70 percent.

These shifts represent not incremental upgrades but a wholesale re-architecture of the cloud’s physical design.