We Taught AI to Play Games—Now It’s a $3.6 Million Company
This episode is a little different from our usual fare: It’s a conversation with our head of AI training, Alex Duffy, about Good Start Labs, a company he incubated inside Every. Today, Good Start Labs is spinning out of Every as a separate company with $3.6 million in funding from General Catalyst, Inovia, Every, and a group of angel investors from top-tier AI labs like DeepMind. We get into how Alex learned some of his biggest lessons about the real world from games, starting with RuneScape, which taught him how markets work and how not to get scammed. He explains why the static benchmarks we use to evaluate LLMs today are breaking down, and how games like Diplomacy offer a richer, more dynamic way to test and train large language models. Finally, Alex shares where he sees the most promise in AI (software, life sciences, and education) and why he believes games can make the models we use smarter, while helping people understand and use AI more effectively.

If you found this episode interesting, please like, subscribe, comment, and share.

Want even more? Sign up for Every to unlock our ultimate guide to prompting ChatGPT here: https://every.ck.page/ultimate-guide-to-prompting-chatgpt. It’s usually only for paying subscribers, but you can get it here for free.

To hear more from Dan Shipper:
Subscribe to Every: https://every.to/subscribe
Follow him on X: https://twitter.com/danshipper

Timestamps:
00:00:00 - Start
00:01:48 - Introduction
00:04:14 - Why evals and benchmarks are broken
00:07:13 - The sneakiest LLMs in the market
00:13:00 - A competition that turns prompting into a sport
00:15:49 - Building a business around using games to make AI better
00:22:39 - Can language models learn how to be funny
00:25:31 - Why games are a great way to evaluate and train new models
00:26:58 - What child psychology tells us about games and AI
00:30:10 - Using games to unlock continual learning in AI
00:36:42 - Why Alex cares deeply about games
00:44:37 - Where Alex sees the most promise in AI
00:50:54 - Rethinking how young people start their careers in the age of AI

Links to resources mentioned in the episode:
Alex Duffy: alex duffy (@alxai_)
Good Start Labs: https://goodstartlabs.com/, good start (@goodstartlabs)
The book Alex is reading about the importance of games: Playing with Reality: How Games Shape Our World
The book Dan recommends by the psychoanalyst D.W. Winnicott: Playing and Reality
--------
58:23
Box CEO Aaron Levie on Why AI Agents Won’t Take Your Job
Aaron Levie is AI-pilled, but he’s one of the few CEOs who sees a future where AI agents work for us instead of replacing us, helping us to do more than we could before.

Aaron’s been the CEO of Box for 20 years, long enough to see a few tech revolutions up close, and taking the company AI-first gave him a glimpse of what the next one means for us. We get into why jobs aren’t going away, the new shape of work, and what it takes to build an AI-first company from the inside.

If you found this episode interesting, please like, subscribe, comment, and share.

Want even more? Sign up for Every to unlock our ultimate guide to prompting ChatGPT here: https://every.ck.page/ultimate-guide-to-prompting-chatgpt. It’s usually only for paying subscribers, but you can get it here for free.

To hear more from Dan Shipper:
Subscribe to Every: https://every.to/subscribe
Follow him on X: https://twitter.com/danshipper

Meet NotebookLM, the AI research tool and thinking partner that can analyze your sources, turn complexity into clarity and transform your content: https://notebooklm.google.com/

Timestamps:
00:00:00 - Start
00:01:30 – Introduction
00:02:36 – Why AI won’t take your job
00:06:42 – Jevons Paradox and the future of work
00:10:40 – How Aaron’s experience with the cloud era shapes his view of AI
00:19:44 – Why every knowledge worker is becoming a manager of AI agents
00:25:21 – What Aaron’s learned from bringing AI into every corner of Box
00:33:57 – What’s overhyped in AI today
00:43:31 – How Aaron balances everyday execution with innovation

Links to resources mentioned in the episode:
Aaron Levie: Aaron Levie (@levie)
Box: https://www.box.com/
Dan’s essay on the shift toward the allocation economy: "The Knowledge Economy Is Over. Welcome to the Allocation Economy"
Dwarkesh’s podcast with Richard Sutton: https://www.dwarkesh.com/p/richard-sutton
--------
52:56
MCP Servers: Teaching AI to Use the Internet Like Humans
If your MCP server has dozens of tools, it’s probably built wrong.

You need tools that are specific and clear for each use case, but you also can’t have too many. This creates an almost impossible tradeoff that most companies don’t know how to solve.

That’s why we interviewed Alex Rattray, the founder and CEO of Stainless. Stainless builds APIs, SDKs, and MCP servers for companies like OpenAI and Anthropic. Alex has spent years mastering how to make software talk to software, and he came on the show to share what he knows. We get into MCP and the future of the AI-native internet.

If you found this episode interesting, please like, subscribe, comment, and share.

Want even more? Sign up for Every to unlock our ultimate guide to prompting ChatGPT here: https://every.ck.page/ultimate-guide-to-prompting-chatgpt. It’s usually only for paying subscribers, but you can get it here for free.

To hear more from Dan Shipper:
Subscribe to Every: https://every.to/subscribe
Follow him on X: https://twitter.com/danshipper

Ready to build a site that looks hand-coded, without hiring a developer? Launch your site for free at Framer.com, and use code DAN to get your first month of Pro on the house.

Timestamps:
00:00:00 - Start
00:01:14 - Introduction
00:02:54 - Why Alex likes running barefoot
00:05:09 - APIs and MCP, the connectors of the new internet
00:10:53 - Why MCP servers are hard to get right
00:20:07 - Design principles for reliable MCP servers
00:23:50 - Scaling MCP servers for large APIs
00:25:14 - Using MCP for business ops at Stainless
00:28:12 - Building a company brain with Claude Code
00:33:59 - Where MCP goes from here
00:41:10 - Alex’s take on the security model for MCP

Links to resources mentioned in the episode:
Alex Rattray: Alex Rattray (@RattrayAlex), Alex Rattray
Stainless: https://www.stainless.com/
--------
51:40
Cognition’s CEO on What Comes After Code
The future has a way of showing up early to some places. In software engineering, one of those places is Cognition, the startup that made headlines in early 2024 with Devin, the world’s first autonomous coding agent, and more recently with its acquisition of the AI code editor Windsurf.

Scott Wu, Cognition’s cofounder and CEO, has a front-row seat to what comes next. In this episode of AI & I, we talk with Wu about why the fundamentals of computer science still matter in an AI-first world, the direction he sees for the short- and long-term future of programming, and why he believes we may already be living with AGI.

Timestamps:
00:00:00 – Start
00:02:02 – Introduction
00:02:32 – Why Scott thinks AGI is here
00:09:27 – Scott’s personal journey as a founder
00:16:55 – Why the fundamentals of computer science still matter
00:22:30 – How the future of programming will evolve
00:26:50 – A new workflow for the AI-first software engineer
00:29:33 – How Devin stacks up against Claude Code
00:40:05 – Reinforcement learning to build better coding agents
00:50:05 – What excites Scott about AI beyond Cognition

If you found this episode interesting, please like, subscribe, comment, and share!

Want even more? Sign up for Every to unlock our ultimate guide to prompting ChatGPT here: https://every.ck.page/ultimate-guide-to-prompting-chatgpt. It’s usually only for paying subscribers, but you can get it here for free.

To hear more from Dan Shipper:
Subscribe to Every: https://every.to/subscribe
Follow him on X: https://twitter.com/danshipper

Links to resources mentioned in the episode:
Scott Wu: Scott Wu (@ScottWu46)
Learn more about Cognition: https://cognition.ai/
Try the world’s first autonomous coding agent: https://devin.ai/
--------
53:22
One Developer Got Thousands of Users Before His App Launched
Naveen Naidu built an app that found product-market fit backwards.

Most apps launch first and then try to find users. Monologue, Naveen’s AI voice dictation app that came out of beta yesterday, did the opposite. It built a following of thousands of users during its incubation period at Every, many of them switching over from venture capital-backed competitors, all while the app barely had a landing page.

The growth has continued in the 24 hours since launch, with an average of 1 million words being transcribed weekly, and in this episode of AI & I, we sit down with Naveen to talk about his journey as the single engineer behind a viral app. We get into the false starts and side projects that taught Naveen how to ship fast, the brutal feedback that kept Monologue honest, why Every decided to build in a crowded category, and the AI coding tools that let one developer do the work of a team.

Get free early access to Amazon's Alexa Plus: https://www.amazon.com/dp/B0DCCNHWV5?ref_=aucc_us_dis_everyalexa_q3_25

Timestamps:
00:01:27 – Introduction
00:03:51 – A live demo of Monologue
00:06:27 – Hard lessons from Naveen’s years in the wilderness
00:12:29 – Building a muscle to ship fast
00:21:11 – The spark that became Monologue
00:26:09 – Dogfooding your way to a killer feature
00:29:45 – Why the harshest product feedback is the most valuable
00:31:47 – Every’s strategy for launching an app in a crowded space
00:40:08 – Giving Monologue the Every “smell”
00:45:09 – Naveen’s one-person AI stack to build beautiful apps

If you found this episode interesting, please like, subscribe, comment, and share!

Want even more? Sign up for Every to unlock our ultimate guide to prompting ChatGPT here: https://every.ck.page/ultimate-guide-to-prompting-chatgpt. It’s usually only for paying subscribers, but you can get it here for free.

To hear more from Dan Shipper:
Subscribe to Every: https://every.to/subscribe
Follow him on X: https://twitter.com/danshipper

Links to resources mentioned in the episode: https://www.monologue.to/
Learn how the smartest people in the world are using AI to think, create, and relate. Each week I interview founders, filmmakers, writers, investors, and others about how they use AI tools like ChatGPT, Claude, and Midjourney in their work and in their lives. We screen-share through their historical chats and then experiment with AI live on the show. Join us to discover how AI is changing how we think about our world—and ourselves.
For more essays, interviews, and experiments at the forefront of AI: https://every.to/chain-of-thought?sort=newest.