Compute & Moore's Law
From 2,300 transistors in the 1971 Intel 4004 to more than 100 billion in a modern AI accelerator, transistor counts have doubled roughly every 2–3 years. FLOPS per dollar of supercomputing has improved ~100× per decade for 70 years. AI training compute has been growing ~4–5× per year since 2010, far faster than Moore's Law alone can deliver.
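As a sanity check on the figures above, the implied average doubling time can be computed directly. A quick sketch, taking the round numbers quoted above (2,300 in 1971, ~100 billion by 2024) at face value:

```python
import math

# Implied average doubling time for transistor counts,
# using the endpoints quoted in the text.
start, end = 2_300, 100e9
years = 2024 - 1971

doublings = math.log2(end / start)   # ~25 doublings in 53 years
doubling_time = years / doublings    # average years per doubling

print(f"{doublings:.1f} doublings, one every {doubling_time:.1f} years")
```

The result lands near the bottom of the "every 2–3 years" range, which is consistent with density doubling having been faster early on and slower recently.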
Key insights
Moore's Law is slowing, not stopping
From 1965 to ~2010, transistor density doubled every ~24 months (Moore's Law). Density growth has slowed to ~3 years per doubling since 2015 as feature sizes approach atomic limits. But chip performance hasn't slowed proportionally — gains now come from architectural changes (chiplets, 3D stacking, specialised accelerators), not pure scaling. Cost per transistor stopped falling around 2014; performance per dollar continues to improve but more slowly.
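The per-decade effect of moving from a 24-month to a ~3-year doubling period can be made concrete with a couple of lines of illustrative arithmetic:

```python
def annual_growth(doubling_years: float) -> float:
    # Annual multiplier implied by a given doubling period.
    return 2 ** (1 / doubling_years)

classic = annual_growth(2.0)   # ~1.41x/year (24-month doubling)
slowed = annual_growth(3.0)    # ~1.26x/year (~3-year doubling)

# Compounded over a decade, the slowdown is large:
print(f"classic: {classic**10:.0f}x per decade")  # 32x
print(f"slowed:  {slowed**10:.0f}x per decade")   # 10x
```

A one-year change in doubling period costs roughly a factor of three per decade of compounded density growth, which is why architecture now carries so much of the load.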
AI compute is the new exponent
From 2010 to 2024, the compute used to train frontier AI models grew ~4–5× per year — far faster than Moore's Law (~1.4×/year). The gains came from parallel scaling (more GPUs in clusters) and architectural improvements (Transformer, mixed-precision training). GPT-4 training is estimated at ~2.1×10²⁵ FLOPs; Llama 3.1 405B at ~3.8×10²⁵; rumoured GPT-5/Llama-4-scale training at 10²⁶ and above.
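The 4–5×/year figure can be roughly cross-checked from two endpoints: the GPT-4 estimate above and a commonly cited AlexNet estimate of ~4.7×10¹⁷ FLOPs in 2012. Note the AlexNet figure and the 2023 endpoint year are outside assumptions, not from this document:

```python
# Implied annual growth factor between two training runs.
alexnet_flops, alexnet_year = 4.7e17, 2012  # assumption: common external estimate
gpt4_flops, gpt4_year = 2.1e25, 2023        # FLOPs figure from the text; year assumed

ratio = gpt4_flops / alexnet_flops
growth = ratio ** (1 / (gpt4_year - alexnet_year))
print(f"~{growth:.1f}x per year")
```

The implied rate comes out near 5×/year, at the top of the quoted range.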
Capex and energy are the new constraints
Top-end fabs (TSMC 3nm/2nm, Samsung 3nm) cost $20–30 billion each. EUV lithography machines from ASML cost roughly $200M per unit. Data-centre electricity demand from AI is doubling every 2 years. The constraint on compute scaling is shifting from chip-density physics to fab construction, machine availability, and grid interconnection, all of which scale slowly even with capital.
Transistor count per chip — landmark processors 1971–2024
Transistors per chip (log scale)
Key Finding: Roughly a 40,000,000× increase over five decades. Recent flagship AI accelerators reach 100+ billion transistors.
Training compute of notable AI models 2012–2024
Estimated training FLOPs (log scale)
Key Finding: Frontier training compute has grown roughly 4–5× per year for 12 years — vastly outpacing Moore's Law alone.
Methodology & caveats
Moore's Law in its three meanings
(1) The original 1965 statement: transistor counts double every year. (2) The 1975 revision: every two years. (3) The marketing meaning: 'computers get exponentially better'. Meanings (1) and (2) refer to transistor counts at minimum cost per component; (3) bundles density, clock frequency, parallelism and architecture. The first two have slowed; the third remains broadly true, but at a slower rate.
FLOPS — peak vs sustained
Theoretical peak FLOPS = number of execution units × frequency × ops/cycle. Sustained FLOPS on real workloads is typically 30–70% of peak. AI training uses mixed precision (FP16/BF16/FP8), and quoted FLOPS often refer to the lowest-precision rate, which can be 4–8× higher than the FP32 peak. Compare AI compute numbers carefully; apples-to-oranges comparisons are common.
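The peak-versus-sustained distinction can be sketched numerically. The chip below is hypothetical, with made-up unit counts, clocks, and a 45% utilisation figure chosen only to illustrate the formula:

```python
def peak_flops(units: int, freq_hz: float, flops_per_cycle: int) -> float:
    # Theoretical peak: execution units x clock frequency x FLOPs per unit per cycle.
    return units * freq_hz * flops_per_cycle

# Hypothetical accelerator -- illustrative numbers, not any real chip.
fp32_peak = peak_flops(units=10_000, freq_hz=1.5e9, flops_per_cycle=2)
fp16_peak = fp32_peak * 4      # low-precision rate, often 4-8x the FP32 peak
sustained = fp16_peak * 0.45   # real workloads: typically 30-70% of peak

print(f"FP32 peak: {fp32_peak / 1e12:.0f} TFLOPS")  # 30 TFLOPS
print(f"FP16 peak: {fp16_peak / 1e12:.0f} TFLOPS")  # 120 TFLOPS
print(f"sustained: {sustained / 1e12:.0f} TFLOPS")  # 54 TFLOPS
```

The same physical chip thus supports at least three defensible FLOPS figures spanning a 4× range, which is why precision and utilisation must be stated when comparing numbers.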
Cost per FLOPS
FLOPS per dollar has improved roughly 4× per decade since 1950 — slower than transistor density because of fixed costs (packaging, system integration, cooling). For AI-relevant FP16/BF16 compute, the rate is faster: ~10× per decade. But total spend on AI compute is rising ~3× per year, far outpacing per-unit cost decline.
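Combining the two trends in this paragraph gives a rough estimate of how fast total AI compute purchased can grow. A back-of-envelope sketch using only the figures above:

```python
# Total compute purchased grows as (spend growth) x (FLOPS-per-dollar growth).
spend_growth = 3.0             # ~3x per year total AI compute spend
cost_decline = 10 ** (1 / 10)  # ~10x per decade for FP16 -> ~1.26x per year

compute_growth = spend_growth * cost_decline
print(f"~{compute_growth:.1f}x per year")
```

The result, a bit under 4×/year, is broadly consistent with the 4–5×/year frontier training trend: most of the growth is being bought, not delivered by cheaper FLOPS.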