The question of when artificial general intelligence will arrive has migrated from the margins of academic speculation to the center of global economic and security planning. Investment decisions involving hundreds of billions of dollars, national security strategies affecting nuclear-armed powers, and workforce planning for entire economies now hinge on timeline estimates for a technology that remains, by any rigorous definition, unrealized. The confidence with which various actors assert AGI timelines ranges from the casually apocalyptic to the dismissively skeptical, but the underlying uncertainty has not meaningfully decreased even as AI capabilities have advanced dramatically. Understanding what the evidence actually supports — and where confident assertions outrun that evidence — is essential for anyone attempting to plan rationally for the period through 2028 and beyond.
Defining the Target: What AGI Actually Means
The first and most fundamental challenge in AGI timeline prediction is definitional. There is no consensus definition of artificial general intelligence, and different definitions produce dramatically different timeline estimates. This is not merely an academic quibble. The definition determines what counts as evidence of progress, what benchmarks are relevant, and what capabilities would constitute “arrival.”
The most commonly cited informal definition — an AI system that can perform any intellectual task that a human can — is operationally useless. Human intellectual capabilities span an enormous range, from basic sensory processing through scientific discovery and artistic creation to social reasoning and emotional intelligence. A system that exceeds human performance on standardized tests but cannot navigate a novel social situation is general in one sense but not another.
More operationally useful definitions have emerged from the research community. The concept of a “Level 5 AI” capable of performing any cognitive task at or above the level of a skilled human professional within any domain, without task-specific training, represents one common benchmark. Others focus on economic impact — defining AGI as the point at which AI systems can substitute for human labor across a majority of economically productive tasks. Still others define AGI in terms of autonomous scientific research capability — the ability to formulate hypotheses, design experiments, execute them, and produce novel scientific knowledge without human guidance.
These definitions matter enormously for timeline predictions. If AGI means “a system that can pass any standardized professional examination,” then timelines are relatively short — current frontier models already pass most such examinations, and incremental improvements may achieve universality within a few years. If AGI means “a system that can conduct autonomous open-ended scientific research producing Nobel Prize-worthy discoveries,” then timelines extend considerably, as this capability requires not just broad knowledge but genuine novelty generation and physical-world experimental interaction.
The Scaling Hypothesis Under Scrutiny
The most influential argument for short AGI timelines is the scaling hypothesis — the proposition that continued increases in model size, training data, and compute will produce continued capability improvements that eventually cross the threshold into general intelligence. This hypothesis has been remarkably successful as an engineering heuristic. From GPT-2 through GPT-4 and beyond, increases in scale have reliably produced capability improvements across a wide range of tasks, often in ways that surprised even the researchers involved.
The scaling laws documented by Kaplan et al. and subsequent researchers demonstrated power-law relationships between training compute (along with model size and dataset size) and cross-entropy loss that held across several orders of magnitude. The consistency of these scaling laws provided the empirical foundation for the massive compute investments that have characterized AI development since 2022, with major laboratories spending billions of dollars on training runs predicated on the assumption that scaling would continue to deliver capability improvements.
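As a concrete illustration: the headline compute fit reported by Kaplan et al. takes the form L(C) ≈ (C_c / C)^α, with α ≈ 0.050 and C_c ≈ 3.1 × 10^8 petaflop/s-days. The sketch below encodes that relationship; the constants are approximate values from the paper, and the function names are our own.

    # Illustrative sketch of the compute-only scaling law from Kaplan et al.
    # (2020): L(C) ~= (C_c / C) ** alpha. Constants are approximate values
    # from the paper; function names are our own.

    ALPHA = 0.050    # fitted compute exponent
    C_C = 3.1e8      # fitted constant, in petaflop/s-days

    def predicted_loss(compute_pf_days: float) -> float:
        """Cross-entropy loss predicted from training compute alone."""
        return (C_C / compute_pf_days) ** ALPHA

    def loss_ratio(compute_multiplier: float) -> float:
        """Relative loss after multiplying compute by a given factor.
        The constant cancels: ratio = multiplier ** -alpha."""
        return compute_multiplier ** -ALPHA

One useful consequence of the power-law form: loss_ratio(10) ≈ 0.89, so a tenfold compute increase buys only about an 11% reduction in loss under this fit, even though the capability gains that historically accompanied such increases were large.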
However, the evidence for continued scaling returns has become substantially more complex since 2024. Several developments have complicated the simple narrative of “more compute equals smarter AI.”
First, benchmark saturation has become a persistent challenge. As frontier models have achieved near-human or superhuman performance on established benchmarks, researchers have been forced to develop increasingly sophisticated evaluation criteria. The pattern has been consistent: a new benchmark is introduced, frontier models initially score modestly, rapid improvement follows, and the benchmark saturates — often within months rather than years. This pattern makes it difficult to assess whether scaling is producing genuine capability gains or merely improving performance on tasks that are easier than they initially appeared.
Second, the relationship between benchmark performance and real-world capability has proven weaker than many assumed. Models that achieve impressive scores on reasoning benchmarks may fail at straightforward practical tasks. Models that generate fluent, expert-sounding text may embed subtle errors that require human expertise to detect. The gap between benchmark performance and operational reliability remains significant and does not obviously close with additional scale alone.
Third, the economics of scaling have reached a point where continued exponential growth in compute investment faces practical constraints. Training runs for frontier models now cost hundreds of millions of dollars and occupy dedicated clusters of tens of thousands of specialized accelerators for months at a time. The construction of new data centers to support these runs takes years and faces supply chain constraints for critical components, particularly high-bandwidth memory and advanced networking equipment. While investment continues to flow, the pace of compute scaling has slowed relative to the exponential trajectories projected in 2023-2024.
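For a rough sense of the magnitudes involved, a back-of-the-envelope cost calculation is shown below. Every constant is an illustrative assumption rather than a figure for any specific training run.

    # Back-of-the-envelope frontier training cost. All constants are
    # illustrative assumptions, not figures for any actual run.
    num_accelerators = 25_000   # accelerators in the training cluster
    run_days = 100              # wall-clock duration of the run
    usd_per_accel_hour = 4.00   # assumed all-in hourly cost per accelerator

    total_cost_usd = num_accelerators * run_days * 24 * usd_per_accel_hour
    print(f"${total_cost_usd / 1e6:,.0f}M")   # -> $240M

At these assumed rates a single run lands in the hundreds of millions of dollars, before counting the capital cost of the data center itself.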
Expert Surveys and Their Limitations
Surveys of AI researchers have been a primary tool for generating AGI timeline estimates, and their results have shifted notably over the past several years. The most comprehensive ongoing survey effort, which polls researchers who have published at top AI conferences, has consistently shown a trend toward shorter timelines. The median respondent’s estimate of the year by which AGI is 50% likely to have arrived has moved from approximately 2060 in surveys conducted in 2018 to approximately 2040 in the most recent rounds. A significant minority of respondents, roughly 20%, now estimate that AGI is more likely than not to arrive before 2030.
These survey results deserve careful interpretation. The shift toward shorter timelines coincides with a period of dramatic capability improvements that may have anchored respondents’ expectations on recent trends. Survey responses are also influenced by definitional ambiguity: researchers interpreting “AGI” differently will naturally produce different timeline estimates.
More fundamentally, there is no strong evidence that AI researchers are well-calibrated predictors of AI development timelines. Past surveys show poor calibration: researchers have consistently overestimated the difficulty of tasks that proved relatively easy (like playing Go at superhuman level or passing professional examinations) while underestimating the difficulty of tasks that proved surprisingly hard (like robust autonomous driving or dexterous physical manipulation).
The Compute Trajectory Through 2028
What can be said with reasonable confidence about the compute landscape through 2028? Several trends are supported by committed capital expenditure and physical infrastructure currently under construction.
Global AI compute capacity is on track to approximately triple between 2026 and 2028, driven primarily by data center construction projects that are already funded and in progress. NVIDIA’s next-generation GPU architectures, AMD’s competitive products, and custom silicon from Google, Amazon, and Microsoft will collectively deliver roughly an order of magnitude improvement in price-performance for AI training and inference workloads.
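Two years of tripling implies an annualized growth rate of roughly 73%, as the arithmetic below shows (a simple compounding calculation, assuming a constant rate).

    # Annualized growth implied by a 3x capacity increase over two years,
    # assuming constant compounding (an illustrative simplification).
    total_multiplier = 3.0
    years = 2
    annual_growth = total_multiplier ** (1 / years) - 1
    print(f"{annual_growth:.0%}")   # -> 73%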
The United States’ share of global AI compute capacity is projected to remain above 50% through 2028, though China’s domestic compute buildout — accelerated by export control restrictions that have motivated massive domestic semiconductor investment — will narrow the gap. The strategic implications of compute distribution are increasingly recognized by policymakers, with AI compute capacity being explicitly discussed in national security terms.
These compute trajectory projections suggest that frontier AI training runs in 2028 will have access to roughly 5-10x more compute than current state-of-the-art efforts. If scaling laws continue to hold with their historical parameters, this would produce models substantially more capable than current systems — but whether that improvement crosses any meaningful “AGI threshold” depends entirely on where that threshold actually lies, which returns us to the definitional challenge.
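Plugging that 5-10x range into the illustrative power law sketched earlier gives a sense of what the improvement looks like in loss terms; the same caveats apply, since the exponent is a historical fit and loss is an imperfect proxy for capability.

    # Predicted loss reduction from a 5-10x compute increase, reusing the
    # illustrative exponent (alpha ~= 0.050) from the earlier sketch.
    ALPHA = 0.050
    for multiplier in (5, 10):
        ratio = multiplier ** -ALPHA
        print(f"{multiplier}x compute -> loss falls to {ratio:.1%} of baseline")
    # -> 92.3% and 89.1% under this fit

Whether a loss reduction of 8-11% translates into a qualitative capability jump is precisely the question the definitional debate leaves open.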
What the Evidence Actually Supports
Synthesizing the available evidence, several conclusions seem justified for the period through 2028.
First, AI systems will continue to improve substantially in capability. The combination of architectural innovation, scaling, improved training methodology, and new data modalities will produce systems that outperform current models across virtually every measurable dimension. Specific capabilities that currently seem impressive — code generation, mathematical reasoning, multimodal understanding — will become commoditized.
Second, the boundary between narrow AI and general AI will continue to blur. Systems that operate competently across dozens or hundreds of distinct task categories — while still failing at others — will complicate binary assessments of whether “AGI has arrived.” The most likely scenario for the period through 2028 is not a dramatic threshold crossing but a gradual expansion of the task space in which AI systems perform at or above professional human level.
Third, genuine uncertainty about the path to full generality will persist. The hard problems (robust common-sense reasoning, genuine novelty generation, physical-world understanding beyond textual description, long-horizon planning in complex environments) may yield to continued scaling, may require fundamentally new approaches, or may turn out to be either much closer to or much further from solution than current performance suggests. The honest answer is that we do not know, and anyone claiming certainty about AGI timelines is expressing confidence that the evidence does not support.
Fourth, the economic and social impacts of AI will intensify dramatically regardless of whether anything properly called AGI arrives by 2028. The policy-relevant question is not “when will AGI arrive?” but “how do we govern AI systems that are increasingly capable, increasingly autonomous, and increasingly integrated into critical systems?” That question demands answers that cannot wait for definitional debates about general intelligence to be resolved.
Implications for Planning
For policymakers, investors, and organizational leaders planning through 2028, the practical implications of this analysis are clear. Planning should be capability-based rather than threshold-based. The question is not whether AGI arrives by a particular date but what specific AI capabilities will exist, how they will affect specific domains, and what governance structures are needed to manage their deployment.
Scenario planning should encompass a wide range of AI capability trajectories rather than betting on a single forecast. The uncertainty is genuine and will not resolve before the planning decisions must be made. Organizations that prepare for a range of outcomes — from continued incremental improvement to rapid capability breakthroughs — will be better positioned than those that bet on a single timeline.
The 2028 U.S. election will occur in an AI capability environment substantially more advanced than today’s, regardless of where one stands on AGI timeline debates. Preparing for that environment, in terms of workforce adaptation, regulatory readiness, and institutional capacity, is more productive than debating whether the systems of 2028 will deserve the label “general intelligence.”
Probability estimates in this analysis are based on our ensemble model incorporating expert surveys, compute trajectory data, benchmark progression rates, and historical analogy calibration. These estimates are updated quarterly. Last model update: Q1 2026.
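For readers interested in the mechanics, the sketch below illustrates one simple way such an ensemble can be structured: as a weighted mixture of component forecast distributions, each contributing a cumulative probability of arrival by a given year. The component names, weights, and parameters are hypothetical placeholders, not the values used in the actual model.

    # Hypothetical sketch of an ensemble timeline forecast: a weighted
    # mixture of component CDFs, each giving P(arrival by year t). All
    # weights and parameters are placeholders, not the model's real values.
    import math

    def logistic_cdf(t: float, midpoint: float, scale: float) -> float:
        """P(arrival by year t) for one component forecast."""
        return 1.0 / (1.0 + math.exp(-(t - midpoint) / scale))

    # (component, midpoint_year, scale_years, weight) -- placeholders only
    COMPONENTS = [
        ("expert_surveys",     2040.0,  8.0, 0.4),
        ("compute_trends",     2034.0,  5.0, 0.3),
        ("benchmark_rates",    2031.0,  4.0, 0.2),
        ("historical_analogy", 2050.0, 12.0, 0.1),
    ]

    def ensemble_prob_by(year: float) -> float:
        """Weighted mixture: sum of w_i * CDF_i(year); weights sum to 1."""
        return sum(w * logistic_cdf(year, m, s) for _, m, s, w in COMPONENTS)

    print(f"P(by 2030) = {ensemble_prob_by(2030):.0%}")   # ~29% with these inputs
    print(f"P(by 2040) = {ensemble_prob_by(2040):.0%}")   # ~64% with these inputs

A mixture like this inherits the uncertainty of its widest components, which is one reason ensemble forecasts tend to spread probability mass across decades rather than concentrating it on a single year.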