Ilya Sutskever declares the era of pre-training over, citing “peak data” and a plateau in AI model scaling. This video explores his predictions for AI’s future, including a shift toward agentic behavior, synthetic data, and inference-time computing. As foundational models evolve, the discussion expands to new pathways for advancing AI capabilities.

Brought to you by:
Vanta – Simplify compliance – https://vanta.com/nlw


Learn how to use AI with the world’s biggest library of fun and useful tutorials: Use code ‘youtube’ for 50% off your first month.

The AI Daily Brief helps you understand the most important news and discussions in AI.
Subscribe to the podcast version of The AI Daily Brief wherever you listen:
Subscribe to the newsletter:
Join our Discord:


One of the giants of the AI field has officially called it: peak data. We are, he says, at the end of pre-training. Welcome back to the AI Daily Brief. One of the really interesting conversations we've been paying attention to for the last month or so has to do with the idea of whether we've hit a plateau in LLM performance based on the current methods for training AI models. Former OpenAI co-founder Ilya Sutskever, now founder of Safe Superintelligence (SSI), made a rare appearance in Vancouver on Friday to make some fairly ground-shaking predictions about the future of AI. Speaking at the NeurIPS conference, Ilya claimed that pre-training as we know it will unquestionably end.

Let's go back and get a little bit of context for these comments before we dig into exactly what Ilya had to say. All current foundation models rely on scaling up pre-training to make progress; basically, they throw more data and more compute at the problem to achieve the next paradigm shift in model capability. A few months ago, however, sources inside a frontier lab started to express concerns that pre-training had hit a scaling wall. Training runs were starting to show diminishing returns from adding more compute to training clusters and more data to training sets, and what had originally just been reporting from The Information started to get credence from big CEO appearances.
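
Neither the talk nor the reporting puts a formula on this, but the standard pre-training scaling-law form from the Chinchilla paper (Hoffmann et al., 2022), borrowed here purely as an illustration rather than anything Ilya presented, makes the "diminishing returns with capped data" argument concrete:

    L(N, D) ≈ E + A / N^α + B / D^β

where L is pre-training loss, N is parameter count, D is the number of training tokens, and E, A, B, α, β are empirically fitted constants. If there is "only one internet," then D is effectively capped, and even as compute and N grow without bound, loss can never fall below E + B / D^β. That floor is one way to read "peak data."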

At the Microsoft Ignite conference last month, CEO Satya Nadella said, "We're seeing the emergence of a new scaling law." He was, of course, referring to scaling test-time compute, which is the technology that underpins OpenAI's o1 model. Google CEO Sundar Pichai, at the New York Times DealBook Summit, said, "I think that progress is going to get harder. When I look at 2025, the low-hanging fruit is gone. The hill is steeper." Now, for OpenAI's part, they think that the new opportunity of reasoning models and test-time compute means that, as Sam Altman put it, "there is no wall." But it's clear that even they have shifted their strategy: instead of simply scaling computing power and adding additional data, new approaches that allow the models to "think longer" are a viable, if alternative, scaling strategy (a toy sketch of what that trade looks like follows below). Ilya himself weighed in on the debate and really took it to a new level, given that he had been such a long-term proponent of just throwing more compute and data at it. Ilya told Reuters, "The 2010s were the age of scaling. Now we're back in the age of wonder and discovery once again. Everyone is looking for the next thing. Scaling the right thing matters now more than ever." So what is the right thing?
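
To make "thinking longer" concrete, here is a minimal, hypothetical sketch of one simple form of inference-time scaling: best-of-N sampling against a scorer. The function names and stub logic are invented for illustration; this is not how o1 works internally, just the general shape of trading extra inference compute for better answers.

```python
# Minimal illustration of "test-time compute" scaling via best-of-N sampling.
# generate_candidate and score_candidate are hypothetical stand-ins for a
# model call and a verifier/reward model; neither is a real API.
import random

def generate_candidate(prompt: str, rng: random.Random) -> str:
    # Stand-in for sampling one reasoning chain from a language model.
    return f"candidate #{rng.randint(0, 9999)} for: {prompt}"

def score_candidate(candidate: str, rng: random.Random) -> float:
    # Stand-in for a verifier or reward model scoring a candidate.
    return rng.random()

def best_of_n(prompt: str, n: int, seed: int = 0) -> str:
    # More inference compute (larger n) raises the expected best score.
    rng = random.Random(seed)
    candidates = [generate_candidate(prompt, rng) for _ in range(n)]
    return max(candidates, key=lambda c: score_candidate(c, rng))

if __name__ == "__main__":
    # Spend eight samples' worth of inference compute on one question.
    print(best_of_n("What is 17 * 24?", n=8))
```

The knob here is n: doubling it roughly doubles inference cost, and quality improves only as fast as the scorer can tell good candidates from bad ones. That is the basic sense in which compute spent at inference time can substitute for compute spent in pre-training.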

Well, let's go back to those comments from last week at the NeurIPS conference. Ilya seems to believe that the end of the pre-training era is coming for more fundamental reasons: he believes the industry has reached the practical limit for scaling, stating, "While compute is growing, we've achieved peak data and there'll be no more. We have to deal with the data that we have. There's only one internet."

Instead, Ilya is proposing a very different pathway to achieving the next generation of AI models. He mentioned agents, synthetic data, and inference-time compute as experiments that are already being run. When it comes to agents, Ilya's belief is that the current crop of so-called agents is extremely limited and won't necessarily evolve much further using current methods. While this current crop is an impressive first stage, these systems are still prone to becoming confused and require human supervision to carry out tasks correctly. Ilya said, "Right now, the systems are not agents in any meaningful sense; they're just beginning." He says that in the future, models will be able to reason more. Once again, he said, we're in the early stages, claiming that current models only replicate human intuition rather than coming up with their own novel strings of logic. Ilya gave chess-playing AI as an example, noting that the leading models were completely unpredictable to human grandmasters. He said the more a system reasons, the more unpredictable it becomes.

Zooming all the way out, Ilya gave an example from nature which, to him, suggests fundamental breakthroughs in AI sophistication are possible. He noted that most mammals display the same predictable relationship between body weight and brain size. Non-human primates are slightly above this curve but scale in the same manner. Hominids, humans and their ancestors, show a completely different relationship between body mass and brain size: as humans evolved, brain size skyrocketed in a way that was unpredictable based on comparisons with other species. Ilya claimed, "This means there is a precedent for biology figuring out some kind of different scaling."

Ultimately, Ilya believes that the path to superintelligence will yield drastically different capabilities from those of the pre-training era of AI. He said he expects to see superintelligent models that are fundamentally agentic, meaning they will be natively capable of carrying out tasks in the same way that a human can. He also believes that they will be necessarily unpredictable, saying, "We will have to be dealing with AI systems that are incredibly unpredictable. They will understand things from limited data. They will not get confused." In other words, all of the things that are really big limitations today.

Importantly, these were all very generalized predictions of how AI will evolve. Ilya said, "I'm not saying how, and I'm not saying when. I'm saying that it will." When all of those things come together, we will have systems with radically different qualities and properties than exist today. And this is sort of the big point: whatever happens next likely looks very different from what we have today.

One of the most important points is this observation that we've reached peak data. Current models have been trained on the entire internet at this stage, and a lot of folks jumped in to say that maybe there are sources of data as yet untapped. Entrepreneur Ibrahim Ahmed wrote, "The one point I somewhat disagree with is that we've tapped all data. There's immense private data that's completely untapped." Mike Neonic writes, "While to me he's convinced we're out of data, maybe he means public, scrapable data? For example, there's a huge difference between the text on a Wikipedia page and a screenshot of a Wikipedia page. So much contextual data is locked away in perception; it would be a huge benefit to pre-training." But Ilya's point is somewhat different. It seems to me like he's saying that while private and synthetic data could theoretically expand the size of the dataset, they're unlikely to contain any novel concepts or ideas. Put another way, once you've memorized the entire catalog of human thought, what more is there to learn? Yovo summed it up this way: "Learning to complete partial observations is not sufficient to get intelligence." He added, "I think this was kind of obvious to many, but maybe noteworthy that a true scale believer said it."

Some of the other commentary was frustration around what was not said. Demitro Eran from Google DeepMind said, "The sad part about the talk is what he didn't say. Ten years ago, Ilya would have told us what he thinks we should do. Yesterday, he just alluded to ideas from others. That's what happens when you run a company and are more interested in secrecy than benefiting science." Nate Sanders took it a step further, saying, "Seems clear to me that the data drought pessimism over the last 90 days is because Ilya and the SSI team were out fundraising, and this was their core thesis." I will say that even if that's true, it's not necessarily just a fundraising thing, or at least the direction of the correlation isn't clear. In other words, is Ilya pushing this narrative because it's helpful for fundraising, or is it something that he believes and is just capitalizing on?

We also don't even have any confirmed reports that he actually is fundraising; this is all just speculation. Perhaps most interesting is the part of the conversation where it's almost like Ilya has unlocked people to think more broadly about what might come next. John Rush wrote, "Ilya finally confirmed scaling LLMs at the pre-training stage has plateaued. The compute is scaling, but data isn't, and new or synthetic data isn't moving the needle. What's next? Same as how the human brain stopped growing in size but humanity kept advancing: the agents and tools on top of LLMs will fuel the progress. Sequence-to-sequence learning, agentic behavior, teaching self-awareness. Think of it as the iPhone, which kept getting bigger and more useful from a hardware point of view but plateaued, and focus shifted to applications." I don't know if that's how it plays out, but I think it's great that we're starting to have that conversation.

Really, really interesting stuff from Ilya. Glad he gave that talk, and excited to see how this conversation proceeds into the new year. For now, though, that is going to do it for today's AI Daily Brief. Until next time, peace.