OpenAI’s upcoming o3 model has sparked widespread speculation about its capabilities and potential impact. From hints at advanced reasoning to its implications for AGI development, the excitement is palpable. Meanwhile, rivals like DeepSeek challenge the playing field with cost-effective, high-performance alternatives. This episode unpacks the facts, dispels the hype, and explores the broader implications for AI innovation and policy.

Brought to you by:
Vanta – Simplify compliance – https://vanta.com/nlw

The AI Daily Brief helps you understand the most important news and discussions in AI.

Learn how to use AI with the world’s biggest library of fun and useful tutorials: Use code ‘youtube’ for 50% off your first month.

Subscribe to the podcast version of The AI Daily Brief wherever you listen:
Subscribe to the newsletter:
Join our Discord:


Welcome back to the AI Daily Brief. Is OpenAI about to ship artificial general intelligence? The conversation that we're having today got started on Friday afternoon, when Sam Altman announced that OpenAI's o3 reasoning model is close to release. He posted: "Thank you to the external safety researchers who tested o3-mini. We have now finalized a version and are beginning the release process, planning to ship in a couple of weeks. Also, we heard the feedback: will launch API and ChatGPT at the same time. It's very good."

The hype cycle began immediately. Santi Jenes shats writes, "o3 is coming. Brace for the AGI." In fact, there was so much of this type of discussion that Altman dove in, participating in a long discussion in the replies to level-set expectations. After McKay Wrigley asked, "Are you able to speak of how capable o3-mini is compared to o1 pro?" Altman said, "Worse than o1 pro at most things, but fast." When Terce Bob wrote, "Sad. I want a model even smarter than o1 pro. Willing to pay," Altman said, "o3 is much smarter. We're turning our attention to that now. And o3 pro," followed by a mind-blown emoji.

In terms of who has access to this, the new model will be available to at least OpenAI Pro subscribers, in other words, the folks who are paying $200 per month. Overall, after the weekend, Sam Altman came back to Twitter to say, "Twitter hype is out of control again. We are not going to deploy AGI next month, nor have we built it. We have some very cool stuff for you, but please chill and cut your expectations 100x."

Now, of course, when OpenAI first previewed o3 at the end of December, to many it was the first model that looked a little bit like AGI. It was the first to score 75% on the ARC-AGI benchmark, maybe the best yardstick we have right now for testing AGI-style performance. However, that testing was done on the full model and used an incredible amount of compute. ARC-AGI tests allow for a budget of $10,000 for inference for official ranking. Unofficially, OpenAI also completed a run using over $100,000 of inference and performed much higher. But that level of compute isn't feasible to deliver to the public, so we're getting something much smaller and consequently less powerful.

Still, that doesn't mean this model won't be a paradigm shift in its own right. Chubby, for example, wrote: "To explain again why o3 is so important: we get a reasoning model that is better than full o1 and costs only a fraction of it. At medium compute, o3-mini is still cheaper, at least a tiny bit, than o1-mini, but outperforms full o1 on Codeforces by more than 100 Elo. That means better reasoning for more applications and more users. Wider application leads to more insights and more breakthroughs. That's why o3-mini is so important."

Henry Mao, the founder of Jenni AI, got specific: "If o3-mini is cheap enough, it might just supplant 4o and Sonnet 3.5 for daily coding tasks." Blake C, an app developer, wrote, "o1 pro will take 5 minutes sometimes when you ask it to, say, fix some code, but it's like 2 to 3x better than Sonnet most of the time. If o3-mini is 2x Sonnet and the same speed, that will be nuts."

TDM suggested that this isn't really about releasing a more performant model, but rather a step towards making OpenAI's reasoning models more cost-effective. They posted: "So o3-mini is basically just faster o1. I think the primary reason they are releasing this is that the o1 cost can't be reduced enough to sustain scale while not losing money on it. Another would be for API devs to start using o3-mini more instead of Sonnet, since it would be faster and smarter."

And so, taking cues from Sam Altman, this really doesn't sound like consumer-grade AGI. And yet there are other hints that OpenAI is approaching some very big things. Axios reported over the weekend that Sam Altman has been invited to brief the Trump White House next week. The article stated that, quote, "A top company, possibly OpenAI, in coming weeks will announce a next-level breakthrough that unleashes PhD-level super agents to do complex human tasks." OpenAI sources said that they are, quote, "both jazzed and spooked by recent progress."

Interestingly, there haven't really been any public rumblings about OpenAI launching agents, but it does seem to many that this is an area where the company has been lagging behind. And yet it seems like this might not be the case for long. Tibor Blaho, for example, found references to agents in OpenAI's code. He tweeted: "Confirmed: the ChatGPT macOS desktop app has hidden options to define shortcuts for the desktop launcher to toggle Operator and force quit Operator." Operator is the name of OpenAI's forthcoming general-purpose agent. The Information previously reported that January was the intended launch month for Operator.

Chubby once again also noted that OpenAI already has a comparison page on their website showing Operator's performance contrasted against Anthropic's computer use mode and Google's Mariner agent. They wrote, "Looks like release is imminent." The benchmarks in this leaked graphic, which we don't know if it's real, show a substantial step up from Anthropic's model and a slight improvement from Google's dedicated web-browsing agent in that domain. Still, it doesn't seem as though OpenAI have perfected computer use mode. For example, the leaked testing showed the agent could only successfully sign up for a cloud services account and launch a virtual machine 60% of the time.

Responding to some of the hype, Kumar Aangi, the head of automation at Cognizant, tried to tamp down expectations of what these agents can do. He posted: "No, this is not going to get us ASI or AGI. These are agents. Real-time, yes, and can be useful too, uniquely in narrow cases, expensively in others, but agents nevertheless, which means they call the models. The models need to provide the AGI and ASI, and they're not doing that anytime soon. Not even DeepSeek R1, although it is 27x cheaper than o1."

Speaking of which, while these release rumors from OpenAI set imaginations racing, a rival Chinese lab sucked a lot of the oxygen out of the room with their latest model. Over the weekend, DeepSeek released their full version of the R1 reasoning model. Now, you might remember that we've talked about DeepSeek a number of times. Economist Tyler Cowen used it as his example of why Trump should think differently about Biden's chip export policies. And in terms of what was released, the model performs in line with o1 on most benchmarks, in particular SWE-bench Verified, which focuses on programming tasks. R1 is now fully available as an open source model for commercial use and is capable of serving outputs via API at less than 5% of the cost of o1. Hobbyists are also able to run the model at home, with several demonstrating that it runs on a cluster of Mac minis.

Accompanying the full release of R1 was a technical paper describing the post-training process, which develops reasoning capability on top of a foundation model. DeepSeek said they tried multiple forms of post-training before landing on a relatively simple reinforcement learning process. Max Winga, a research engineer at Conjecture AI, posted: "It's wild to me that they did this with no fine-tuning prior to the RL stage. R1 learns to reason on its own, like AlphaZero. During training, they observe the model learning to use advanced reasoning techniques, an 'aha moment.' We are playing with alien minds, not just tools." AI entrepreneur Elvis Saravia writes, "The DeepSeek R1 paper is a gem. It's clear that LLM reasoning capabilities can be learned in different ways. Reinforcement learning, if applied correctly and at scale, can lead to some really powerful and interesting scaling and emergent properties."

Now, all of this has some people thinking ahead to future possibilities. The AI for Success account, for example, tweets: "In a few years, China will create AGI and open source it for all. DeepSeek R1 costs 96% less compared to OpenAI o1, and it's almost as good as o1. Intelligence too cheap to meter. 2025 is going to be crazy. I can feel it."

Indeed, the rapid development going on in China has major implications for AI policy. In announcing the latest round of export controls, the Biden administration made it clear that international competitiveness was a key issue. The policy statement set an explicit goal to ensure that US models are dominant across the world, especially in the Global South. Dean W. Ball, a research fellow at George Mason University, posted: "DeepSeek R1 takeaways for policy. One: Chinese labs will likely continue to be fast followers in terms of reaching similar benchmark performance to US models. Two: the impressive performance of DeepSeek's distilled models, smaller versions of R1, means that very capable reasoners will continue to proliferate widely and be runnable on local hardware, far from the eyes of any top-down control regime, including the US diffusion rule. Three: open models are going to have strategic value for the US, and we need to figure out ways to get more frontier open models out to the world. We rely exclusively on Meta for this right now, which, while great, is just one firm. Why do OpenAI and Anthropic not open source their older models? What would be the harm?"

Mostly, where people's minds are is just feeling the acceleration. Perplexity CEO Aravind Srinivas writes, "It's kind of wild to see reasoning get commoditized this fast. We should fully expect an o3-level model that's open source by the end of the year, probably even midyear."

So friends, lots going on as we dig deeper into January. That, however, is going to do it for today's AI Daily Brief. Until next time, peace.