AI Capabilities Forecasts
Part 1: Towards P(doom)
"If what I say now seems to be very reasonable, then I will have failed completely. Only if what I tell you appears absolutely unreasonable have we any chance of visualising the future as it really will happen." - Arthur C. Clarke, 1964
My primary occupation during this sabbatical has been reading, thinking, and talking with people about AI risk. I will have less time to dedicate specifically to this interest as I start actively looking for a job, which makes this a good time to reflect on all the things I’ve learned over the last few months. This is the first of a series of posts dealing with what I’ve gained from that process and what I currently think about these problems.
One of the issues around forecasting outcomes in AI is that there are actually (at least) two linked predictions that need to be made. First, you need to forecast how powerful AI systems will be (capabilities). Then, you need to forecast what the likely outcomes are conditional on those capabilities (outcomes). Focusing on just one of these things (and often just one potential scenario) is the error behind some of the most maddening takes from otherwise intelligent people.
To deal with this I’m going to split the two components and deal with each individually. I’ll start with the predictions for capabilities in this post. In the next, I’ll go through the specific outcomes I see conditional on those capabilities.
There are many, many, many examples of people or organizations doing this sort of breakdown, most of whom have thought very deeply about these problems. 80,000 Hours, an organization that aims to direct people towards spending their careers on the most impactful problems of our time, considers catastrophic risk from advanced AI the most critical cause area. The Future of Life Institute has a beautiful site that walks through some of the most concerning negative (and some positive) worlds, and the Center for AI Safety also has a clear and informative breakdown of some risks they consider plausible and critical to address. There are also a couple of pessimistic, high-profile breakdowns like AI 2027 (currently 91% accurate for predictions in 2025) or If Anyone Builds It, Everyone Dies, where the titular ‘It’ is superintelligent AI and ‘Everyone Dies’ means everyone dies.
This is not to say that everyone is convinced that AI poses risks that should be taken seriously. Perhaps most famously, Yann LeCun, who won the 2018 Turing Award for his work on deep learning and is known as one of the ‘Godfathers of AI’, has made very strong statements that current AI systems will not pose existential risks.[1] There are many others who oppose any form of regulation on AI at all, presumably because they don’t take these risks seriously (and/or because they really, really like money). The default case seems to be a sort of ambivalent, uninterested belief that nothing ever happens and that AI is overhyped and therefore not dangerous.
I don’t expect my writing to convince these people, so what is the point of writing about this at all? For one, it’s personally useful to move from a general vibe about AI risks to something more concrete that forces me to carefully examine my assumptions. For another, doing this publicly acts as an excellent accountability mechanism. It’s easy to get trapped in confirmation bias, especially when it comes to misremembering your opinions from the past, so publicly documenting these opinions as predictions is one way to help calibrate myself better in the future. This particular set of forecasts is unusual in that I really hope I’m wildly incorrect, because otherwise the future looks pretty bleak.
Forecasting Tiers of Capability
There are no bright lines when it comes to delimiting tiers of AI capabilities, so I am going to define 4 relatively broad categories. These categories implicitly combine capability (what can it do) with autonomy/agency (can it do that without a human in the loop). While those may be different axes, in practice I think they correlate very strongly and we should generally expect autonomy to increase in step with capabilities. The categories I’m using are loosely based on Google DeepMind’s Levels of Artificial General Intelligence (AGI).
For each tier (other than tier 0) I’ll give probabilities for reaching that level in 2, 5, and 10 year timeframes. Given the timing of this post, this corresponds nicely with the end of 2027, 2030, and 2035 respectively. I chose these timeframes mostly for comparability with other forecasts, but I also think that the most relevant advances are likely to either happen within this 10 year timeframe or become far harder to predict and involve totally unforeseen circumstances.
For each tier I also give a brief rationale, and then discuss some general sources of uncertainty applicable across tiers at the end of the post. I did not write these to be a full defense of my views on each tier, as doing this for even a single level of AGI would be worthy of an entire post.
Tier 0 - Current Level Systems (not AGI)
Capabilities match or modestly exceed some humans in some tasks, including productive non-physical work like programming, but with serious limitations in the majority of tasks. We are here currently, and are still coming to grips with where and how AI can be made practically useful. This is made more difficult by the fact that capabilities change rapidly, so a functionality that is impossible now may be trivial in six months. There are implications of this tier in the ‘outcomes’ domain, but as far as a capabilities forecast goes, this tier has already been achieved.
Tier 1 - AGI-ish
At this stage capabilities are generally better than at least half of humans across a substantial fraction (>20%) of economically meaningful tasks, but limitations in capability and autonomy require humans to be constantly in the loop. AI systems are a valuable tool that multiplies human efforts.
Forecasts to reach tier 1 - 2 years: 30%, 5 years: 50%, 10 years: 75%
Rationale: It does not seem like we have very far to go to achieve this milestone. Existing systems are already exceeding this threshold in some limited cases, but 20% of tasks is a big number and will take time to reach. The major obstacles here seem to be reliability and agency, more than capabilities per se. There are also major interface-level issues to address, as chat box or API integration is not sufficient for widespread adoption. I feel strongly that these obstacles are primarily engineering challenges rather than problems requiring field-shaping breakthroughs. Because of this I expect progress to be relatively linear and predictable.
Tier 2 - Replacement Level AGI
Capabilities better than most humans (>90%) at most tasks (>90%), including nearly all non-physical tasks and many physical tasks via robotics. Humans in the loop usually do more harm than good. This is somewhat weaker than a typical definition of AGI (strictly, can do anything a human can do), but I think for practical purposes this is a more useful distinction. An AGI that is literally exactly as good as the best human at exactly all tasks will exist for approximately 1 millisecond before qualifying as ASI, so I don’t see that distinction as useful.
Forecasts to reach tier 2 - 2 years: 15%, 5 years: 30%, 10 years: 40%
Rationale: Unlike the AGI-ish scenario, I think there is a real possibility that this level cannot be reached with current architectures and training approaches. Many things, especially things that require physical world modeling, do not have a clear transfer from a model based purely on text, images, and video. There are also many features of human thinking (such as learning from experience, often in one shot) which are not currently incorporated in LLM based architectures but seem critical for many important tasks. These are active areas of research, but research breakthroughs are notoriously hard to predict and may be necessary to reach this level.
In addition, most technology improvement follows an S-shaped curve, meaning that progress is initially slow, then very rapid, then slows down dramatically toward an asymptote as most of the easy advances are incorporated and only the most challenging problems remain. If AI development follows this pattern, I expect the asymptote to arrive somewhere between 20% and 90% of human capabilities, and likely closer to the 20% level. In other words, I expect the challenges in going from AGI-ish to true AGI to be more significant than the challenges going from AGI to ASI.
I still give it close to even odds (40%) that we reach replacement level AGI within 10 years purely through predictable engineering improvement of current systems, as in the AGI-ish case. If this level is not achieved within 5 years, I expect that means we’ve hit a fundamental asymptote that can only be overcome through breakthroughs, which I anticipate will take much longer. So while the probability rises from 0 to 30% over the next 5 years, it only increases by another 10 percentage points in the following 5 years.
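The front-loaded shape of this forecast is easier to see when the cumulative numbers are split into increments. The following is a minimal Python sketch using only the Tier 2 forecasts stated above; the function name and the per-year averaging are illustrative assumptions, not part of the forecast itself.

```python
# Cumulative Tier 2 forecasts from the text: horizon in years -> probability.
tier2_cumulative = {2: 0.15, 5: 0.30, 10: 0.40}

def incremental_probability(cumulative):
    """Split cumulative forecasts into the probability added in each window.

    Returns a list of (start_year, end_year, added_probability, added_per_year).
    The per-year figure is a simple average over the window (an assumption made
    for illustration; the forecast does not say how probability is spread).
    """
    rows = []
    prev_year, prev_p = 0, 0.0
    for year in sorted(cumulative):
        p = cumulative[year]
        added = p - prev_p
        rows.append((prev_year, year, added, added / (year - prev_year)))
        prev_year, prev_p = year, p
    return rows

tier2_increments = incremental_probability(tier2_cumulative)

for start, end, added, per_year in tier2_increments:
    print(f"years {start}-{end}: +{added:.1%} total, about {per_year:.1%} per year")
```

Read this way, the average yearly increment falls from about 7.5 percentage points in the first window to about 5 in the second and about 2 in the third, which is the numerical form of the claim that a miss in the first 5 years points to a much longer wait.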
Tier 3+ - Artificial Superintelligence (ASI)
Capabilities exceed all humans at all tasks, including all physical tasks and tasks which humans are currently incapable of accomplishing. Humans in the loop are strictly worse than purely independent ASI systems.
Forecasts for Tier 3 - 2 years: 5%, 5 years: 20%, 10 years: 25%
Rationale: If replacement level AGI is achieved, it is highly likely (>50%) that ASI is achieved shortly after. Replacement level AGI is very nearly ASI, if only because an arbitrarily large number of AGIs could cooperate at a superhuman level. One of the things AGI could and would do would be to keep improving itself. I am skeptical of this happening on a 2 year timeframe, but think this takeoff could happen very rapidly if AGI approaches human capabilities.
I don’t consider the AGI and ASI timelines totally equivalent because:
1) An AGI, if achieved within 5-10 years, will likely be trained largely on human data that was painstakingly accumulated over millennia, and exceeding that capability level could be much slower (e.g., the models need to run lots of slow, long-running experiments to learn). In this case ASI would still be on the horizon but would take longer to arrive.
2) As a society, we may wake up to the existential risk posed by ASI and decide to prevent its development after seeing true AGI, or impose a control mechanism that prevents systems from reaching superhuman capabilities (though we currently have no idea how to do this, perhaps AGI can help).
3) There may be a natural intelligence cap or diminishing returns from intelligence that is right around human level (I consider this unlikely, but it is possible).
Point 1 is the primary reason the 5 year estimate is not higher, and by 10 years point 2 seems more promising to me.
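The Tier 2 and Tier 3 forecasts together imply a conditional probability of ASI given AGI at each horizon. The sketch below works that out in Python under one labeled assumption: that any world with ASI by a given year also has replacement level AGI by that year, so P(ASI | AGI) = P(ASI) / P(AGI). This is a same-horizon ratio, which is only a rough proxy for "shortly after".

```python
# Forecasts from the text: horizon in years -> cumulative probability.
tier2_agi = {2: 0.15, 5: 0.30, 10: 0.40}
tier3_asi = {2: 0.05, 5: 0.20, 10: 0.25}

def implied_conditional(p_asi, p_agi):
    """P(ASI by year | AGI by year) for each horizon.

    Assumes ASI by a given year implies AGI by that year, so the joint
    probability equals P(ASI) and the conditional is a simple ratio.
    """
    return {year: p_asi[year] / p_agi[year] for year in p_agi}

asi_given_agi = implied_conditional(tier3_asi, tier2_agi)

for year, p in asi_given_agi.items():
    print(f"{year:>2} years: P(ASI | AGI) is about {p:.1%}")
```

Under that assumption the implied conditional is about one third at 2 years and roughly two thirds and five eighths at 5 and 10 years, consistent with both the skepticism about a 2 year takeoff and the ">50%" statement at longer horizons.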
Comparing to Expert Forecasts
I think these categories provide a useful intuition for the kinds of outcomes we should be worried about, rather than being linked to any specific technical advancements. However, this does make it a bit hard to forecast exactly when each will be achieved. With that in mind, the timeframe estimates here should be considered extremely broad. For instance, while I estimate 40% probability of replacement level AGI within 10 years, I would not be surprised at all to find that this happened within 5 years or that it requires an entirely new AI paradigm be developed and does not occur for 20 years or more (though I would be very surprised if it took 2 years or 50 years).
These are my own estimates, but they agree pretty well with aggregated forecasts from several prediction markets (AGI in 2031), and are within the distribution of what some of the high-profile field leaders have said:[2]

All of these predictions are taken from a single time point, so they aren’t directly comparable. But, at a rough approximation, I am more pessimistic than most about a 2 year AGI timeline and roughly in line with Ray Kurzweil (futurist) at 5 years or Sam Altman (CEO of OpenAI) at 10 years. As a side note, I feel that the sigmoidal fits shown here should be ignored, because they imply that AGI is inevitable given enough time. I don’t believe this is true, and I highly doubt that Demis Hassabis would say that his 75% chance by 2030 is equivalent to saying 100% chance by 2035. If AGI is not achieved within ~10 years I expect it to take much longer if achieved at all.
Key Sources of Uncertainty
All of the numbers I’ve provided are highly uncertain, but there are some specific things that could happen (or fail to happen) that would make me much more confident in these outcomes.
Capabilities Accumulate
An important consideration of these different levels is that they build on one another. According to the CEOs of multiple leading AI labs, current systems already accelerate the work being done within those labs and write a substantial fraction of their code. Each level provides support that makes the subsequent level more achievable. Because one of the things humans do is build AI systems, AIs that amplify or replace human work will also speed up AI capabilities progress.
This leads to lots of weird implications for which we lack good historical parallels. New technologies provide new capabilities, but those capabilities are generally separate from the capabilities used to create the technology. The invention of steam power was key for unlocking the industrial revolution, but this was used to enable many other technologies (trains/steamships, new manufacturing approaches, etc.) and did not lead directly to ever more potent power generation.
The most comparable innovation is probably the internet. As a tool the internet has many uses, but one thing it does well is to make it easier to write software, which is then used to improve the internet. This is a kind of self-improvement - the internet we have today is far more robust, powerful, and useful than the internet we had in 1991, and this improvement has been very rapid in historical terms. Part of that progress has come from a sort of self-improvement loop.
AGI is unique. The limiting inputs are intellectual labor, data, and compute. Intellectual labor is implicitly solved by AGI, which can improve its own code. There are suggestions that either compute or data may create bottlenecks, which I discuss further in the next section. But if AGI is sufficiently capable, it can solve either of these problems itself by creating its own data (through synthesis or experiments) or substituting efficiency improvement for computational power. This is what is known as the ‘software only’ singularity, a plausible path towards self-improving AI.
This is the primary reason people seriously worry about creating AGI. Once you unleash a self-improving technology absent any other clear limiting factors, you quickly lose control over the progress of that technology. This leads to the sort of risks I’ll discuss in the next post.
Diminishing Returns

Both technology advancement and natural processes often follow a roughly sigmoidal process. This process starts slow, goes through a period of rapid exponential change, and then settles to a new equilibrium level. This is the default expectation we should have for most processes - unlimited exponential growth is unsustainable in the real world. It’s quite possible that we will encounter a sigmoidal trend in AI capabilities that causes them to level off somewhere between now and ASI levels. I mention two possibilities (data and compute limitations) below which are specific potential causes of this leveling off, but there are many ‘unknown unknowns’ which could impact this to either shorten or lengthen timelines.
Knowing exactly when this will occur is extremely difficult, and I have wide error bars on that estimate. A sigmoid and an exponential look nearly identical until the former starts to level out. I am not at all convinced by theoretical approaches like comparisons between human brain FLOPs and compute FLOPs, which I think are incomparable for a host of reasons. Both data and compute limitations are forecast to start biting around 2028, and this offers one potential timepoint to anchor on. But, as of right now, there is exactly 0 evidence of capabilities falling off the exponential growth curve, at least for software engineering tasks (shown in the METR plot below), so I do not expect to reach the transition in this potential sigmoid any time soon.
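The claim that a sigmoid and an exponential are hard to tell apart early on can be checked with a toy comparison. The sketch below uses a standard logistic curve and the pure exponential that matches it at early times. The ceiling, rate, midpoint, and time units are all hypothetical values chosen for illustration and are not fitted to any AI capability data.

```python
import math

def logistic(t, ceiling=1.0, rate=1.0, midpoint=0.0):
    """Sigmoidal (logistic) curve that levels off at `ceiling`."""
    return ceiling / (1.0 + math.exp(-rate * (t - midpoint)))

def matched_exponential(t, ceiling=1.0, rate=1.0, midpoint=0.0):
    """Pure exponential that the logistic curve approaches for t far below midpoint."""
    return ceiling * math.exp(rate * (t - midpoint))

def relative_gap(t):
    """Fraction by which the exponential overshoots the logistic at time t."""
    exp_value = matched_exponential(t)
    return (exp_value - logistic(t)) / exp_value

# Times are in arbitrary units relative to the sigmoid's midpoint (t = 0).
for t in (-6, -4, -2, 0, 2):
    print(f"t = {t:+d}: exponential overshoots the sigmoid by {relative_gap(t):.1%}")
```

Well before the midpoint the two curves differ by under one percent, which noisy measurements would not resolve; at the midpoint the exponential is already double the sigmoid. In this toy model, the data only start to discriminate between the two hypotheses once the slowdown is close.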

Compute Limitations
While there have been many algorithmic and training efficiency gains contributing to capabilities improvement, it’s fair to say that the lion’s share has come from simply scaling up existing systems to use more compute and data. The leading companies in particular seem to be all in on the scaling hypothesis: that throwing more compute at the problem will be sufficient to reach AGI.
However, compute exists in the physical world and takes time and resources to build. If exponential growth in compute is required to achieve exponential growth in capabilities, we expect this to decay at some point because the physical world abhors unlimited exponentials. Over the past several years compute has actually grown at an exponential rate, but some forecasts expect this to level off relatively soon.
A recent paper from a collaboration between MIT and METR estimates the growth in compute specifically for OpenAI based on already announced data center contracts and compares it directly with the METR capabilities graph I showed before. Based on their projections, while compute continues to grow over the coming decade, the rate of this growth falls off the exponential around 2028, resulting in a slowdown in capabilities growth. This makes 2028 a reasonable timeframe to expect a capabilities slowdown from this factor.
Conversely, another paper from economists Parker Whitfill and Cheryl Wu demonstrates that this conclusion depends entirely on the ability of labor (in the form of algorithmic progress) to substitute for compute. If labor and compute can be exchanged to achieve progress then the compute limitations become a non-issue, while if they act as complements to one another then the compute limitations remain a factor. They find that the complements scenario is more likely for ‘frontier research,’ but this may change in the future, and this factor makes me less confident that compute will become a major limiter by 2028.
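One common way economists formalize the substitutes-versus-complements question is a constant elasticity of substitution (CES) function. The Python sketch below is illustrative only: the functional form, the 0.5 labor share, and the two rho values are assumptions chosen to show the qualitative difference, and are not taken from the paper's specification or estimates.

```python
def ces_output(labor, compute, rho, labor_share=0.5):
    """CES aggregate of research labor and compute.

    rho near 1 means the inputs are close substitutes; strongly negative rho
    means they are complements, so the scarcer input limits output.
    The elasticity of substitution is 1 / (1 - rho). rho must be nonzero.
    """
    if rho == 0:
        raise ValueError("rho = 0 is the Cobb-Douglas limit and is not handled here")
    total = labor_share * labor**rho + (1 - labor_share) * compute**rho
    return total ** (1 / rho)

compute = 1.0                      # compute held fixed (hypothetical units)
labor_levels = (1.0, 10.0, 100.0)  # effective research labor scaled up 100x

# rho = 0.5 (elasticity 2): labor can stand in for compute.
substitutes = [ces_output(labor, compute, rho=0.5) for labor in labor_levels]
# rho = -2 (elasticity 1/3): labor and compute are complements.
complements = [ces_output(labor, compute, rho=-2.0) for labor in labor_levels]

for labor, s, c in zip(labor_levels, substitutes, complements):
    print(f"labor x{labor:>5.0f}: substitutes -> {s:6.2f}, complements -> {c:5.3f}")
```

With compute fixed, scaling labor 100-fold raises output about 30-fold in the substitutes case, while in the complements case output stalls just below the square root of 2 no matter how much labor is added. That ceiling is the toy-model version of a compute bottleneck.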
Data Limitations
LLMs build their repertoire of behavior entirely on human generated data, and then refine those behaviors and capabilities through various types of reinforcement learning and fine tuning. There are automated components to these last pieces, but largely they still rely on signals from humans. As we rapidly approach using approximately 100% of human generated data ever digitized, we are forced to rely on synthetic data (which can lead to all kinds of interesting failures collectively known as model collapse) or to manually generate new data which is slow and expensive. It’s possible that this imposes a fundamental limit on capabilities at or below human level, and that this would lead to the sigmoid leveling off.
Epoch AI has done some really good work estimating the timeframe for this issue, with a median estimate that we’ll run out of data around 2028. Because of this I wouldn’t expect this to bite until a few years from now. If it is not solved by that point, it may push timelines out significantly. However, many other approaches (especially self-driving or self-play approaches like AlphaZero) rely very heavily on synthetic data to conduct training. So I do not consider this obstacle to be a guaranteed hard stop.
Architecture Breakthroughs
Current LLM systems are, at their very root, prediction algorithms that emulate human writing. It’s frankly astonishing that this is sufficient to produce the capabilities we see in current systems. Very, very few people would have predicted the impact of the transformer architecture even in 2021, though the original paper on the underlying technology was published in 2017.
LLMs are very different from other recent breakthroughs in AI, such as AlphaZero, which learns through self-play and requires essentially no human data other than the rules of the game. AlphaZero and other pure reinforcement approaches have achieved capabilities that far exceed human levels - but only in narrow domains where data can be simulated in unlimited quantities. This approach does not translate well to AGI, because most tasks in the real world cannot be simulated effectively (yet).
If AGI cannot be achieved with current approaches (plausible) then a breakthrough of a completely different sort may be required. There have been surprisingly reliable breakthroughs in AI over the past 16 years of the neural network era, with effective neural networks in the form of LSTM (2009), the launch of deep learning with AlexNet (2012), AlphaGo beating Lee Sedol (2016), AlphaZero (2017), attention networks that underlie current LLM systems (2017), AlphaFold unlocking protein folding (2020), and finally consumer AI via ChatGPT (2022). Unfortunately, predicting these breakthroughs is practically impossible.
If a breakthrough is required to achieve AGI, I generally expect the timeline to extend significantly. Almost all other types of AI research have ground to a halt in favor of following this promising LLM pathway, which I’d expect to suck up a lot of the effort and funding that would otherwise go to different approaches. On the other hand, there has been an astounding amount of investment (both in regular capital and human capital) in AI over the past 5 years, and this could easily increase the likelihood of relevant breakthroughs.
I’m far from certain that current approaches are fundamentally incapable of achieving AGI without breakthroughs. But if a breakthrough is required, I’m even less certain what that will look like or how long it will take.
Summary
Timelines are difficult to forecast, so all of the above should be taken with a large grain of salt. This is particularly true when exponentials are involved. Time is linear,[3] which means small errors in exponential estimates can result in massive errors in time-based predictions. Because many concerning aspects of AI capabilities growth involve exponentials, AI capabilities are especially hard to forecast.
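A short worked example shows how sensitive such forecasts are to the assumed growth rate. The Python sketch below compares two hypothetical capability doubling times; the 4 and 7 month values, the 1024x target, and the 60 month horizon are made-up illustrative numbers, not estimates from METR or any other source.

```python
import math

def months_to_reach(multiple, doubling_months):
    """Months until capability grows by `multiple`, given a fixed doubling time."""
    return doubling_months * math.log2(multiple)

def multiple_after(months, doubling_months):
    """Capability multiple reached after `months` of steady doubling."""
    return 2.0 ** (months / doubling_months)

fast, slow = 4.0, 7.0  # hypothetical doubling times, in months
target = 1024          # hypothetical target: ten doublings
horizon = 60           # hypothetical fixed horizon, in months

print(f"time to {target}x: {months_to_reach(target, fast):.0f} vs "
      f"{months_to_reach(target, slow):.0f} months")
print(f"growth after {horizon} months: {multiple_after(horizon, fast):,.0f}x vs "
      f"{multiple_after(horizon, slow):,.0f}x")
```

With these numbers, a three month disagreement about the doubling time moves the arrival date of the same milestone by two and a half years (40 versus 70 months), and at a fixed five year horizon the two assumptions differ by nearly a factor of 100 in capability.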
Still, I (and many others) consider the likelihood of reaching each of these capability levels to be very high, comparable to a coin flip that our world is dramatically changed within 10 years. Even setting aside ASI, replacement level AGI would be a cataclysmic shift in day to day life for essentially all humans. What this world might look like, and what the world would look like under different tiers of capability, is the topic of the next post.
[1] Importantly, the other two ‘Godfathers of AI’, Yoshua Bengio and Geoffrey Hinton, strongly disagree with LeCun.
[2] I made my forecasts without referencing these specific sources. However, I consume a lot of AI-related content and my views are no doubt influenced by these opinions either directly or indirectly.
[3] Unless you’re a physicist.



