OpenAI’s new Operator agent is making waves. In this episode, explore seven real-world ways users are testing its capabilities. From handling routine tasks like grocery shopping and bill payments to more ambitious applications like sales outreach and app development, Operator is setting a new standard for automation. Discover its potential, limitations, and the innovative ideas shaping its use.

Welcome back to the AI Daily Brief. Yesterday I had a classic thing happen. This show is a daily show, right? Six out of the seven days of the week there is an AI Daily Brief talking to you about the latest AI news and discourse, and you would think that daily is a frequent enough cadence to actually capture and be up to date with all the news. Alas, sometimes even that isn't enough, and yesterday we had one of those situations where the headlines part of the episode talked about how it appeared that Operator would be coming this week, and between the time that I finished recording and when it was actually published, Operator had come out. I had a feeling as I was recording that that was going to happen, but in any case, that means that today we get to actually look at Operator, which is of course OpenAI's first true, or at least advertised to be true, agent project. They call it an agent that can use its own browser to perform tasks for you. So let's find out what it is, and then we're going to talk through seven ways that people are using it already.

Operator has been long in the making. Indeed, even as recently as a couple of weeks ago there were news articles coming out that were exploring things like why OpenAI hadn't released an agent yet. Their announcement post describes Operator as

an agent that can go to the web to perform tasks for you. Interestingly, it uses its own browser, and with that browser it can look at a web page and interact with it through typing, clicking, or scrolling. OpenAI is to some extent planting a flag here around what an agent is, referring to them as AI capable of doing work for you independently: you give it a task and it will execute. They suggest that this research preview version of Operator is good at repetitive browser tasks such as filling out forms, ordering groceries, and creating memes.

Now, in terms of how it actually works, there is some similarity to the way that Anthropic's computer use mode is designed. The agent takes constant screenshots to see what it's doing in the web browser and can take control using the mouse and keyboard. Unlike Anthropic, though, OpenAI has implemented this as a fully remote setup. After receiving instructions, Operator opens its own virtual browser window in a cloud instance. You can watch it carry out its task, or you can click away and get on with other work while Operator works in the background. Users retain full control of their computer, with Operator running in its own fully contained browser. This of course limits the specific things that it can do, but it also makes it more

usable. At the same time, OpenAI has worked with specific major websites like StubHub, DoorDash, and OpenTable to try to improve and smooth out the integration, but theoretically Operator can access any website that it needs to carry out its task.

There is a lot of human in the loop here as well. OpenAI writes, "If Operator encounters challenges or makes mistakes, it can leverage its reasoning capabilities to self-correct. When it gets stuck and needs assistance, it simply hands control back to the user, ensuring a smooth and collaborative experience." Indeed, in addition to helping Operator deal with certain types of issues, taking over is also required to finalize certain tasks. For example, this version of Operator does not have access to credit card details, so if that's part of completing the task, it hands the system back over to the user to complete that particular step. Operator also asks for feedback at critical moments within its tasks.

Under the hood, OpenAI has fine-tuned a version of GPT-4o to drive Operator, which they're calling their Computer-Using Agent, or CUA. As far as benchmarks go, CUA achieved an 87% success rate on WebVoyager, which is a live website navigation test, and a 58.1% success rate on WebArena, which simulates e-commerce and content management situations. Much better than

vanilla GPT-4o, but certainly not necessarily the level of reliability we'd want before these types of experiences become endemic.

Speaking of which, as VentureBeat points out, TikTok parent ByteDance also launched its own AI agent for controlling web browsers yesterday, called UI-TARS. They write that it's totally open source and boasts similarly impressive benchmark performance, which makes them wonder if people will be willing to pay for ChatGPT Pro's $200 a month, which is the only way that you can get access to Operator at the moment. As has been the custom with OpenAI releases lately, the feature is only available to US Pro users, with Sam Altman saying that Europe will unfortunately take a while.

So let's talk now about some of the ways that people are actually using Operator. Keep in mind these are all very nascent, first-test kinds of use cases, and it always takes some time to really figure out the best ways to use any new capabilities like Operator offers. Certainly when it comes to how OpenAI was positioning this, it's a lot of the very basic assistant tasks that I've often said on the show I don't think are going to be the real drivers of agent behavior when it comes to consumers. Ultimately, whether I'm right and these aren't the long-term drivers of agentic behavior, or I'm wrong and

this is exactly what people end up wanting to use agents for, it's clear that they're valuable as a test case and as a way to start training and giving agents capabilities.

The first use case that many people shared was some version of grocery shopping. This was in fact one of the examples that the OpenAI team used to demonstrate Operator's capabilities. They gave it a shopping list written down on a piece of paper and said, "Can you buy these for me, please?" Operator goes, brings the list to Instacart, and, after it's found the items and added them to the cart, asks whether it should finalize the order.

In a week when crypto has been booming, it's appropriate that another experimental use case, this one from Rowan Cheung, who of course runs The Rundown, is crypto investment research based on tokens that are actually worth looking into. Obviously you could generalize this use case as research. The reason that I thought this example was interesting to share was that it demonstrates one part of the human-agent interface. At one point Operator got hit with an "are you human" CAPTCHA and pinged Rowan to take control again to confirm and move forward.

Number three, another very common demonstration use case, and once again one that I've railed on before, is travel planning. Y Combinator president Garry

Tan writes, "OpenAI Operator is very impressive. Planning an impromptu trip to Vegas. It's able to navigate JSX website and unusual cases and basically figured out sold-out scenarios, change dates and times, and now it's figuring out where to eat for Friday night for two." I will say that when it comes to this type of assistant use case, the more complex the travel is, in other words the more details that need to be solved, the more I can see this particular type of interface, which just chatters at you to get the information it needs to execute, being an actually useful update.

A fourth use case, this one once again from Rowan: researching a good birthday gift for my mom based on what she likes. A couple of things were interesting about this experiment. First of all, there were certain times and websites that it couldn't access, and it was capable of switching gears and finding another site that would do something similar. In addition to looking for specific items, it also took it a step farther and actually helped compare and find the best price across the web.

Number five, staying on the theme of rote, regular tasks, a16z partner Olivia Moore says: I just gave Operator a picture of a paper bill I got in the mail. From only the bill picture, it navigated to the website, pulled up my account, entered my info, and

asked for my credit card number to complete payment.

Once again, you see here that it's not going to take that final step of actually inputting the credit card number without human approval, although presumably in the long run that might be something that people get more comfortable with actually allowing, and that various agent assistants actually enable as well.

The sixth use case, and this is I think where it gets a little bit more interesting from a business standpoint, is actually using the tool for sales. This comes from Pocket Flow AI's Helena Jang, and let's just listen to the 30 seconds of what she did:

"Hi, here's a list of powerful women at companies we would love to work with, and I want to reach out to their head of AI with such a message. So I have prompted Operator, and talking to the Operator, this is just so cool."

So basically what Operator did here was take a list of names, find their LinkedIn profiles, and add a message to connection requests, effectively doing prospecting.

Lastly, our seventh use case, and again I saw a number of different examples of this, was using the agent to build apps. BabyAGI creator and VC Yohei writes, "I used OpenAI Operator to build, deploy, and open source a tool on GitHub using Replit Agent. Took about 30 minutes."

He also gave some feedback, writing, "While working with Replit Agent, it actually deployed the app, tested it, and described the error back to Replit Agent for me. Operator asked me a few more questions than I wanted, but it was mostly for safety (e.g., filling forms), so I guess okay with it. It had trouble with a few things around UI, like knowing it needs to scroll a page to see the rest of it, and it needed pointers to find the Git feature in Replit. Once it found the Git feature, it didn't need my assistance to create a repo and open source after having the agent write a README. While a bit slower, this was even more automated than Replit Agent, especially testing features and working through errors, which is impressive."

The app that Yohei built, by the way, was, quote, "the classic to-do app with a twist: it's for agents. API for agents to create, read, update, delete tasks; user web UI for manually managing tasks; test UI for testing endpoints; and API performance metrics."

Kishan also made an app, sharing a video and tweeting, "Used ChatGPT Operator to use Bolt to create a project management app. A general agent using a coding agent, and it worked pretty well. I even deployed the app. This is insane." So basically we had here, exactly as he described, this general agent, which is Operator, using the specific Bolt agent,

which is a web coding agent, to create something, and it worked.

When you see things like this, which open up fundamentally new possibilities and things that were never possible before, that's why I'm more skeptical of the very basic, superficial, do-my-grocery-shopping-for-me type of tasks. Sure, it could be that assistants get so good at those things that it's not even worth the tiny handful of minutes that it used to take to do them. But certainly what gets me excited, and what I think is going to drive more uptake, are these never-before-possible things like building complete applications in this way.

Ultimately, the way that I would describe people's general attitudes towards this is that while it isn't a lightning-bolt, ChatGPT-style moment, Operator is just good. It's not great at everything yet, and it has some challenges, but it's definitely a preview of the future and where we're headed. I anticipate that over the next few weeks we are going to see a ton of different use cases thrown at this thing, and probably some that start to take off as really and regularly valuable. I will of course be back here to share those with you as they happen, but for now, that is going to do it for today's AI Daily Brief. Until next time, peace.